实施HAVING https://en.wikipedia.org/wiki/Having_(SQL)类似行为
您可以使用其中之一pipeline aggregations https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline.html,即桶选择器聚合 https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline-bucket-selector-aggregation.html。查询如下所示:
POST my_index/tdrs/_search
{
"aggregations": {
"reseller_sale_sum": {
"aggregations": {
"sales": {
"sum": {
"field": "requestAmountValue"
}
},
"max_sales": {
"bucket_selector": {
"buckets_path": {
"var1": "sales"
},
"script": "params.var1 > 10000"
}
}
},
"terms": {
"field": "senderResellerId",
"order": {
"sales": "desc"
},
"size": 5
}
}
},
"size": 0
}
将以下文档放入索引后:
"hits": [
{
"_index": "my_index",
"_type": "tdrs",
"_id": "AV9Yh5F-dSw48Z0DWDys",
"_score": 1,
"_source": {
"requestAmountValue": 7000,
"senderResellerId": "ID_1"
}
},
{
"_index": "my_index",
"_type": "tdrs",
"_id": "AV9Yh684dSw48Z0DWDyt",
"_score": 1,
"_source": {
"requestAmountValue": 5000,
"senderResellerId": "ID_1"
}
},
{
"_index": "my_index",
"_type": "tdrs",
"_id": "AV9Yh8TBdSw48Z0DWDyu",
"_score": 1,
"_source": {
"requestAmountValue": 1000,
"senderResellerId": "ID_2"
}
}
]
查询的结果是:
"aggregations": {
"reseller_sale_sum": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "ID_1",
"doc_count": 2,
"sales": {
"value": 12000
}
}
]
}
}
IE。只有那些senderResellerId
其累计销售额为>10000
.
数桶数
实现相当于SELECT COUNT(*) FROM (... HAVING)
可以使用以下组合桶脚本聚合 https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline-bucket-script-aggregation.html with 总和桶聚合 https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline-sum-bucket-aggregation.html。虽然似乎没有直接的方法来计算有多少个桶bucket_selector
实际上选择,我们可以定义一个bucket_script
产生0
or 1
取决于条件,并且sum_bucket
产生它的sum
:
POST my_index/tdrs/_search
{
"aggregations": {
"reseller_sale_sum": {
"aggregations": {
"sales": {
"sum": {
"field": "requestAmountValue"
}
},
"max_sales": {
"bucket_script": {
"buckets_path": {
"var1": "sales"
},
"script": "if (params.var1 > 10000) { 1 } else { 0 }"
}
}
},
"terms": {
"field": "senderResellerId",
"order": {
"sales": "desc"
}
}
},
"max_sales_stats": {
"sum_bucket": {
"buckets_path": "reseller_sale_sum>max_sales"
}
}
},
"size": 0
}
输出将是:
"aggregations": {
"reseller_sale_sum": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
...
]
},
"max_sales_stats": {
"value": 1
}
}
所需的桶数位于max_sales_stats.value
.
重要考虑因素
我必须指出两件事:
- 该功能是实验性的(从 ES 5.6 开始,它仍然是实验性的,尽管它是在2.0.0-beta1 https://www.elastic.co/guide/en/elasticsearch/reference/2.0/search-aggregations-pipeline.html.)
- 管道聚合应用于先前聚合的结果:
管道聚合作用于其他聚合产生的输出,而不是
从文档集中,将信息添加到输出树中。
这意味着bucket_selector
聚合将在以下结果之后应用terms
聚合于senderResellerId
。例如,如果有更多senderResellerId
than size
of terms
聚合定义,你不会得到all集合中的 idssum(sales) > 10000
,但仅限于那些出现在输出中的terms
聚合。考虑使用排序和/或设置足够的size
范围。
这也适用于第二种情况,COUNT() (... HAVING)
,它只会计算聚合输出中实际存在的那些存储桶。
如果这个查询太重或者桶的数量太大,请考虑非规范化 https://www.elastic.co/guide/en/elasticsearch/guide/current/relations.html您的数据或直接将此总和存储在文档中,因此您可以使用普通range https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html查询以实现您的目标。