I have the following documents:
south africa
north africa
I want to retrieve my "south africa" document with any of the following queries:
- s africa (a)
- southafrica (b)
- safrica (c)
I have defined the following filters and analyzers:
POST test_index
{
  "settings": {
    "analysis": {
      "filter": {
        "synonym_filter": {
          "type": "synonym",
          "synonyms": [
            "south,s",
            "north,n"
          ]
        },
        "shingle_filter": {
          "type": "shingle",
          "min_shingle_size": 2,
          "max_shingle_size": 3,
          "token_separator": ""
        }
      },
      "analyzer": {
        "my_shingle": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["shingle_filter"]
        },
        "my_shingle_synonym": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["shingle_filter", "synonym_filter"]
        },
        "my_synonym_shingle": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["synonym_filter", "shingle_filter"]
        }
      }
    }
  },
  "mappings": {}
}
1) With my_shingle, south africa is indexed as: south, southafrica, africa
2) With my_shingle_synonym, south africa is indexed as: south, s, southafrica, africa
3) With my_synonym_shingle, south africa is indexed as: south, souths, southsafrica, s, safrica, africa
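To make the effect of filter order easier to see, here is a toy Python model of the three analyzers. It is only a sketch, not Lucene's actual implementation: `synonym_filter`, `shingle_filter` and `analyze` are hypothetical stand-ins, the standard tokenizer is approximated by splitting on whitespace, and the shingle step is assumed to treat whatever stream it receives as a flat sequence. Under those assumptions it reproduces the three token lists above.

```python
# Toy model of the two token filters; NOT Lucene's real implementation.
# The synonym table mirrors "south,s" and "north,n" from the index settings.
SYNONYMS = {"south": ["s"], "s": ["south"], "north": ["n"], "n": ["north"]}


def synonym_filter(tokens):
    """Emit each token followed by its synonyms, as one flat stream."""
    out = []
    for tok in tokens:
        out.append(tok)
        out.extend(SYNONYMS.get(tok, []))
    return out


def shingle_filter(tokens, min_size=2, max_size=3):
    """Emit each unigram, then the shingles of min_size..max_size consecutive
    tokens that start at it, joined with an empty separator."""
    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        for size in range(min_size, max_size + 1):
            if i + size <= len(tokens):
                out.append("".join(tokens[i:i + size]))
    return out


def analyze(text, filters):
    """Split on whitespace (stand-in for the standard tokenizer), then apply
    the filters in the given order."""
    tokens = text.split()
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens


ANALYZERS = {
    "my_shingle": [shingle_filter],
    "my_shingle_synonym": [shingle_filter, synonym_filter],
    "my_synonym_shingle": [synonym_filter, shingle_filter],
}

for name, filters in ANALYZERS.items():
    print(name, analyze("south africa", filters))
```

In this model, the unwanted souths and southsafrica tokens appear because the shingle step sees south and s as two consecutive tokens rather than as alternatives at the same position.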
So with:
(1) I find b
(2) I find a and b
(3) I find a and c
What I want is for south africa to be indexed as: south, s, southafrica, safrica, africa, so that (a), (b) and (c) all find it.
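The desired token list can be stated precisely as "expand synonyms per position, then build shingles by picking one alternative from each position". The Python sketch below only specifies that behaviour; it does not show how to configure Elasticsearch to produce it. The names `expand_positions` and `graph_shingles` are hypothetical, and the synonym table is assumed to mirror synonym_filter.

```python
from itertools import product

# Hypothetical synonym table mirroring synonym_filter in the index settings.
SYNONYMS = {"south": ["s"], "s": ["south"], "north": ["n"], "n": ["north"]}


def expand_positions(tokens):
    """Return one list of alternatives (token plus synonyms) per position."""
    return [[tok] + SYNONYMS.get(tok, []) for tok in tokens]


def graph_shingles(positions, min_size=2, max_size=3):
    """For each start position emit its alternatives, then every shingle that
    takes exactly one alternative from each of the following positions."""
    out = []
    for i, alternatives in enumerate(positions):
        out.extend(alternatives)
        for size in range(min_size, max_size + 1):
            if i + size <= len(positions):
                for combo in product(*positions[i:i + size]):
                    out.append("".join(combo))
    return out


tokens = "south africa".split()
print(graph_shingles(expand_positions(tokens)))
# prints ['south', 's', 'southafrica', 'safrica', 'africa']
```

Because synonyms are treated as alternatives at one position, south and s are never concatenated with each other, which is the difference from the my_synonym_shingle output.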