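This article shows how to run an aggregation performance comparison between ClickHouse and Elasticsearch.
Thanks to its distributed architecture and full-text search engine, Elasticsearch is widely used for storing and analyzing machine data. As data volume grows, however, its aggregation performance can no longer meet business needs. ClickHouse, a high-performance column-oriented OLAP database management system, is a promising way to address this pain point.
The test below is a simple comparison of ClickHouse and Elasticsearch aggregation performance. It looks only at query response time; resource usage is not considered for now.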
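The article does not say how the response times were collected. One possible way to measure them, sketched here under that assumption, is to time a callable that sends a single query and to repeat it a few times so that one cold run does not dominate. The names `measure_ms`, `summarize`, and `run_query` are hypothetical, and the workload in the demo is a stand-in, not a real database call.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_ms(run_query: Callable[[], object], repeats: int = 5) -> List[float]:
    """Return the wall-clock duration of each call to run_query, in milliseconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()  # hypothetical callable: sends one query and waits for the reply
        durations.append((time.perf_counter() - start) * 1000.0)
    return durations


def summarize(durations_ms: List[float]) -> Dict[str, float]:
    """Summarize repeated timings; the median is less sensitive to a cold first run."""
    return {
        "min_ms": min(durations_ms),
        "median_ms": statistics.median(durations_ms),
        "max_ms": max(durations_ms),
    }


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without a database.
    timings = measure_ms(lambda: sum(range(100_000)), repeats=3)
    print(summarize(timings))
```

Timing from the client side like this includes network and serialization overhead for both systems, which seems a fair match for a test that only cares about response time.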
Component | Version | CPU | Memory |
---|---|---|---|
ClickHouse | 20.11.4.13 | 4 cores | 8 GB |
Elasticsearch | 7.9.0 | 4 cores | 8 GB |
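The test uses the official test dataset provided by ClickHouse: 67 GB in total, about 600 million rows.
In ClickHouse, LO_ORDERDATE is the partition key, and (LO_ORDERDATE, LO_ORDERKEY) is the sorting key.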
# ClickHouse
SELECT LO_SHIPMODE,COUNT() FROM lineorder GROUP BY LO_SHIPMODE ORDER BY COUNT() DESC LIMIT 10
# Elasticsearch
GET lineorder/_search
{
"aggs": {
"1": {
"terms": {
"field": "LO_SHIPMODE.keyword",
"order": {
"_count": "desc"
},
"size": 10
}
}
},
"size": 0
}
# ClickHouse
SELECT toYear(LO_ORDERDATE),COUNT() FROM lineorder GROUP BY toYear(LO_ORDERDATE) FORMAT PrettyCompactMonoBlock
# Elasticsearch
GET lineorder/_search
{
"aggs": {
"2": {
"date_histogram": {
"field": "LO_ORDERDATE",
"calendar_interval":"1y",
"format":"yyyy-MM-dd"
}
}
},
"size": 0
}
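Both queries above count rows per calendar year of LO_ORDERDATE: `toYear` plus `GROUP BY` on the ClickHouse side, and a `date_histogram` with a one-year `calendar_interval` on the Elasticsearch side. As a rough illustration of that bucketing on in-memory data (the sample dates are made up and the function name `count_by_year` is hypothetical):

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable


def count_by_year(order_dates: Iterable[date]) -> Dict[int, int]:
    """Count rows per calendar year, like GROUP BY toYear(LO_ORDERDATE)."""
    counts = Counter(d.year for d in order_dates)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Made-up sample rows, not taken from the test dataset.
    sample = [date(1992, 1, 1), date(1992, 12, 31), date(1993, 6, 15)]
    print(count_by_year(sample))  # {1992: 2, 1993: 1}
```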
# ClickHouse
SELECT LO_ORDERDATE,LO_ORDERKEY,LO_SHIPMODE,LO_ORDERPRIORITY,LO_COMMITDATE FROM lineorder WHERE LO_ORDERDATE >= '1992-01-01' AND LO_ORDERDATE < '1993-01-01' ORDER BY LO_ORDERDATE LIMIT 500
# Elasticsearch
GET lineorder/_search
{
"size": 500,
"sort": [
{
"LO_ORDERDATE": {
"order": "asc"
}
}
],
"query": {
"bool": {
"must": [],
"filter": [
{
"match_all": {}
},
{
"match_all": {}
},
{
"range": {
"LO_ORDERDATE": {
"gte": "1992-01-01",
"lt": "1993-01-01",
"format": "strict_date_optional_time"
}
}
}
],
"should": [],
"must_not": []
}
}
}
# ClickHouse
SELECT toYear(LO_ORDERDATE),LO_SHIPMODE,COUNT() FROM lineorder GROUP BY toYear(LO_ORDERDATE),LO_SHIPMODE ORDER BY toYear(LO_ORDERDATE) FORMAT PrettyCompactMonoBlock
# Elasticsearch
GET lineorder/_search
{
"aggs": {
"3": {
"terms": {
"field": "LO_SHIPMODE.keyword",
"order": {
"_count": "desc"
},
"size": 10
},
"aggs": {
"2": {
"date_histogram": {
"field": "LO_ORDERDATE",
"calendar_interval": "1y",
"time_zone": "Asia/Shanghai",
"min_doc_count": 1
}
}
}
}
},
"size": 0
}
# ClickHouse
SELECT toYear(LO_ORDERDATE),LO_SHIPMODE,COUNT() FROM lineorder GROUP BY toYear(LO_ORDERDATE),LO_SHIPMODE ORDER BY toYear(LO_ORDERDATE) FORMAT PrettyCompactMonoBlock
# Elasticsearch
GET lineorder/_search
{
"aggs": {
"3": {
"terms": {
"field": "LO_SHIPMODE.keyword",
"order": {
"_count": "desc"
},
"size": 10
},
"aggs": {
"2": {
"date_histogram": {
"field": "LO_ORDERDATE",
"calendar_interval": "1y",
"time_zone": "Asia/Shanghai",
"min_doc_count": 1
}
}
}
}
},
"size": 0
}
# ClickHouse
SELECT LO_SHIPMODE,COUNT(LO_SHIPMODE),LO_ORDERPRIORITY,COUNT(LO_ORDERPRIORITY) FROM lineorder GROUP BY LO_SHIPMODE,LO_ORDERPRIORITY ORDER BY COUNT(LO_SHIPMODE),COUNT(LO_ORDERPRIORITY) LIMIT 5 BY LO_SHIPMODE,LO_ORDERPRIORITY
# Elasticsearch
GET lineorder/_search
{
"aggs": {
"2": {
"terms": {
"field": "LO_SHIPMODE.keyword",
"order": {
"_count": "desc"
},
"size": 5
},
"aggs": {
"3": {
"terms": {
"field": "LO_ORDERPRIORITY.keyword",
"order": {
"_count": "desc"
},
"size": 5
}
}
}
}
},
"size": 0
}
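The Elasticsearch request above nests one `terms` aggregation inside another: it takes the 5 most frequent LO_SHIPMODE values and, within each, the 5 most frequent LO_ORDERPRIORITY values. A minimal sketch of that computation on in-memory rows follows; the function name `nested_top_n` and the sample rows are assumptions for illustration, not part of the test.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def nested_top_n(rows: Iterable[Tuple[str, str]], n: int = 5) -> Dict[str, List[Tuple[str, int]]]:
    """Top-n outer values by row count, each with its own top-n inner values."""
    outer_counts: Counter = Counter()
    inner_counts: Dict[str, Counter] = defaultdict(Counter)
    for outer, inner in rows:
        outer_counts[outer] += 1
        inner_counts[outer][inner] += 1
    return {
        outer: inner_counts[outer].most_common(n)
        for outer, _ in outer_counts.most_common(n)
    }


if __name__ == "__main__":
    # Made-up (ship mode, order priority) pairs.
    rows = [
        ("AIR", "1-URGENT"),
        ("AIR", "1-URGENT"),
        ("AIR", "2-HIGH"),
        ("SHIP", "2-HIGH"),
    ]
    print(nested_top_n(rows, n=1))  # {'AIR': [('1-URGENT', 2)]}
```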
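Aggregation scenario | ClickHouse (ms) | Elasticsearch (ms) | Speedup |
---|---|---|---|
Time-based multi-field aggregation | 5506 | 15599 | nearly 3x |
Count of multiple fields by year (data table) | 381 | 6267 | over 16x |
Top 10 occurrences of a field (pie chart) | 4048 | 7317 | nearly 2x |
Count of a field by year (time trend chart) | 901 | 23257 | over 25x |
Nested aggregation (non-time fields) | 6937 | 15767 | over 2x |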
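The speedup column can be checked directly from the measured times by dividing the Elasticsearch time by the ClickHouse time. The short sketch below does that arithmetic; the timings are copied from the table, while the `speedup` helper name is only illustrative.

```python
# (scenario, ClickHouse ms, Elasticsearch ms), copied from the results table
RESULTS = [
    ("Time-based multi-field aggregation", 5506, 15599),
    ("Count of multiple fields by year (data table)", 381, 6267),
    ("Top 10 occurrences of a field (pie chart)", 4048, 7317),
    ("Count of a field by year (time trend chart)", 901, 23257),
    ("Nested aggregation (non-time fields)", 6937, 15767),
]


def speedup(ck_ms: int, es_ms: int) -> float:
    """How many times longer Elasticsearch took than ClickHouse."""
    return es_ms / ck_ms


if __name__ == "__main__":
    for scenario, ck_ms, es_ms in RESULTS:
        print(f"{scenario}: {speedup(ck_ms, es_ms):.2f}x")
```

The ratios come out to 2.83x, 16.45x, 1.81x, 25.81x, and 2.27x, in table order, which agrees with the rounded figures in the last column.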
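On the same data volume, ClickHouse outperformed Elasticsearch in every aggregation scenario tested, by margins ranging from nearly 2x to over 25x. Aggregations based on the sorting key perform better still, running many times faster than ES.
In addition, ClickHouse's SummingMergeTree and AggregatingMergeTree table engines aggregate data automatically in the background, so in some scenarios its aggregation performance can be even better.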
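The idea behind this background aggregation can be shown with a toy model: rows that share the same sorting key are collapsed into a single row whose numeric column holds the sum, so later queries scan fewer rows. The sketch below only illustrates that idea and is not how ClickHouse implements merges; the key layout, the `merge_parts` name, and the sample parts are all assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Key = Tuple[int, str]  # (order year, ship mode): a hypothetical sorting key


def merge_parts(rows: Iterable[Tuple[Key, int]]) -> Dict[Key, int]:
    """Collapse rows that share a key into one row holding the summed count."""
    merged: Dict[Key, int] = defaultdict(int)
    for key, count in rows:
        merged[key] += count
    return dict(merged)


if __name__ == "__main__":
    # Two made-up "parts" written by separate inserts.
    part_a = [((1992, "AIR"), 3), ((1992, "SHIP"), 1)]
    part_b = [((1992, "AIR"), 2), ((1993, "AIR"), 4)]
    print(merge_parts(part_a + part_b))
    # {(1992, 'AIR'): 5, (1992, 'SHIP'): 1, (1993, 'AIR'): 4}
```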