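One way to model this is with parent/child documents. Room documents are the parents and availability documents are their children. For each room, there is one availability document per date on which the room is available. At query time, we can then search for parent rooms that have an availability child document for every date in the searched interval (even for disjoint dates).
Note that you'll need to make sure that, as soon as a room is booked, you delete the corresponding child document for each booked date.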
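As a rough illustration of that cleanup step, the following Python sketch builds the body of a bulk request that deletes one availability child per booked night. The helper name booking_delete_actions is hypothetical, and it assumes the child document IDs are the date in yyyyMMdd form and that the children live in an availability type whose parent is the room ID, as in the sample data in this answer.

```python
import json
from datetime import date, timedelta


def booking_delete_actions(room_id, check_in, nights):
    """Build bulk `delete` action lines, one per booked night of a room.

    Hypothetical helper: assumes child IDs are the date as yyyyMMdd.
    """
    lines = []
    for offset in range(nights):
        day = check_in + timedelta(days=offset)
        action = {
            "delete": {
                "_type": "availability",
                "_id": day.strftime("%Y%m%d"),
                # The parent ID is needed so the request is routed
                # to the shard that holds the room and its children.
                "_parent": room_id,
            }
        }
        lines.append(json.dumps(action))
    # The bulk API expects newline-delimited JSON ending with a newline.
    return "\n".join(lines) + "\n"


# Room 233 booked for the nights of 2016-07-01 and 2016-07-02.
body = booking_delete_actions(233, date(2016, 7, 1), 2)
print(body)
```

The resulting string would be sent as the body of POST /rooms/_bulk; how you send it (HTTP client, official client library) is left open here.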
Let's try it out. First, create the index:
PUT /rooms
{
"mappings": {
"room": {
"properties": {
"room_num": {
"type": "integer"
}
}
},
"availability": {
"_parent": {
"type": "room"
},
"properties": {
"date": {
"type": "date",
"format": "date"
},
"available": {
"type": "boolean"
}
}
}
}
}
Then add some data (note that the bulk action name is index, without a leading underscore):
POST /rooms/_bulk
{"index": { "_type": "room", "_id": 233}}
{"room_num": 233}
{"index": { "_type": "availability", "_id": "20160701", "_parent": 233}}
{"date": "2016-07-01"}
{"index": { "_type": "availability", "_id": "20160702", "_parent": 233}}
{"date": "2016-07-02"}
{"index": { "_type": "availability", "_id": "20160704", "_parent": 233}}
{"date": "2016-07-04"}
{"index": { "_type": "availability", "_id": "20160705", "_parent": 233}}
{"date": "2016-07-05"}
{"index": { "_type": "availability", "_id": "20160707", "_parent": 233}}
{"date": "2016-07-07"}
{"index": { "_type": "availability", "_id": "20160708", "_parent": 233}}
{"date": "2016-07-08"}
Finally, we can start querying. First, let's say we want to find a room that is available on 2016-07-01:
POST /rooms/room/_search
{
"query": {
"has_child": {
"type": "availability",
"query": {
"term": {
"date": "2016-07-01"
}
}
}
}
}
=> Result: Room 233
Then, let's try searching for a room that is available from 2016-07-01 to 2016-07-03:
POST /rooms/room/_search
{
"query": {
"bool": {
"minimum_should_match": 3,
"should": [
{
"has_child": {
"type": "availability",
"query": {
"term": {
"date": "2016-07-01"
}
}
}
},
{
"has_child": {
"type": "availability",
"query": {
"term": {
"date": "2016-07-02"
}
}
}
},
{
"has_child": {
"type": "availability",
"query": {
"term": {
"date": "2016-07-03"
}
}
}
}
]
}
}
}
=> Result: No rooms
However, searching for a room that is available from 2016-07-01 to 2016-07-02 does yield room 233:
POST /rooms/room/_search
{
"query": {
"bool": {
"minimum_should_match": 2,
"should": [
{
"has_child": {
"type": "availability",
"query": {
"term": {
"date": "2016-07-01"
}
}
}
},
{
"has_child": {
"type": "availability",
"query": {
"term": {
"date": "2016-07-02"
}
}
}
}
]
}
}
}
=> Result: Room 233
We can also search for disjoint intervals, say from 2016-07-01 to 2016-07-02 plus from 2016-07-04 to 2016-07-05:
POST /rooms/room/_search
{
"query": {
"bool": {
"minimum_should_match": 4,
"should": [
{
"has_child": {
"type": "availability",
"query": {
"term": {
"date": "2016-07-01"
}
}
}
},
{
"has_child": {
"type": "availability",
"query": {
"term": {
"date": "2016-07-02"
}
}
}
},
{
"has_child": {
"type": "availability",
"query": {
"term": {
"date": "2016-07-04"
}
}
}
},
{
"has_child": {
"type": "availability",
"query": {
"term": {
"date": "2016-07-05"
}
}
}
}
]
}
}
}
=> Result: Room 233
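And so on. The key point is to add one has_child query for each date you need to check availability for, and to set minimum_should_match to the number of dates you're checking.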
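Since these queries all follow the same pattern, they are easy to generate. Here is a minimal Python sketch that builds the request body for an arbitrary list of dates; the function name availability_query is an assumption made for illustration, and duplicates are removed first so that minimum_should_match always equals the number of distinct clauses.

```python
import json


def availability_query(dates):
    """Build a bool query requiring an availability child for every date.

    Hypothetical helper mirroring the hand-written queries above.
    """
    unique_dates = sorted(set(dates))  # avoid counting a date twice
    clauses = [
        {
            "has_child": {
                "type": "availability",
                "query": {"term": {"date": day}},
            }
        }
        for day in unique_dates
    ]
    return {
        "query": {
            "bool": {
                # Every clause must match, i.e. every date must be free.
                "minimum_should_match": len(clauses),
                "should": clauses,
            }
        }
    }


# Two disjoint intervals: 07-01 to 07-02 plus 07-04 to 07-05.
query = availability_query(
    ["2016-07-01", "2016-07-02", "2016-07-04", "2016-07-05"]
)
print(json.dumps(query, indent=2))
```

The printed JSON can be used as the body of POST /rooms/room/_search.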
UPDATE
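Another option is to use a script filter (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-script-query.html), but with 100 million documents I'm not sure it would scale well.
In that case, you can keep your original design (preferably the second one, since with the first one you would create too many unnecessary fields in the mapping), and the query would look like this: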
POST /rooms/room/_search
{
"query": {
"bool": {
"filter": {
"script": {
"script": {
"inline": "def dates = doc.availability.sort(false); from = Date.parse('yyyy-MM-dd', from); to = Date.parse('yyyy-MM-dd', to); def days = to - from; def fromIndex = doc.availability.values.indexOf(from.time); def toIndex = doc.availability.values.indexOf(to.time); return days == (toIndex - fromIndex)",
"params": {
"from": "2016-07-01",
"to": "2016-07-04"
}
}
}
}
}
}
}
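The idea behind the script is that, in a sorted list of available dates, a range is fully available exactly when the number of days between its endpoints equals the distance between their positions in the list. The Python sketch below illustrates that check outside Elasticsearch; the function name is hypothetical, and it adds an explicit test that both endpoints are present, which an indexOf-based comparison needs because a missing value yields -1.

```python
from datetime import date


def is_available_throughout(available, start, end):
    """Return True if every day from start to end (inclusive) is available."""
    days = sorted(set(available))
    if start not in days or end not in days:
        return False
    # With no gaps, the positional distance equals the calendar distance.
    return (end - start).days == days.index(end) - days.index(start)


# Same availability as the sample data for room 233.
available = [date(2016, 7, d) for d in (1, 2, 4, 5, 7, 8)]

print(is_available_throughout(available, date(2016, 7, 1), date(2016, 7, 2)))
print(is_available_throughout(available, date(2016, 7, 1), date(2016, 7, 4)))
```

The first call covers two consecutive available days, while the second spans 2016-07-03, which is missing from the list.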