您可以将其他停用词添加到标准英语停用词集的副本中,或者仅添加另一个 StopFilter。喜欢:
TokenStream tokenStream = new StandardTokenizer(Version.LUCENE_36, new StringReader(string));
CharArraySet stopSet = CharArraySet.copy(Version.LUCENE_36, StandardAnalyzer.STOP_WORD_SET);
stopSet.add("add");
stopSet.add("your");
stopSet.add("stop");
stopSet.add("words");
tokenStream = new StopFilter(Version.LUCENE_36, tokenStream, stopSet);
//Or, if you just need the added stopwords in a standardanalyzer, you could just pass this stopfilter into the StandardAnalyzer...
//analyzer = new StandardAnalyzer(Version.LUCENE_36, stopSet);
or:
TokenStream tokenStream = new StandardTokenizer(Version.LUCENE_36, new StringReader(string));
tokenStream = new StopFilter(Version.LUCENE_36, tokenStream, StandardAnalyzer.STOP_WORDS_SET);
List<String> stopWords = //your list of stop words.....
tokenStream = new StopFilter(Version.LUCENE_36, tokenStream, StopFilter.makeStopSet(Version.LUCENE_36, stopWords));
如果您尝试创建自己的分析器,那么遵循类似于示例中的模式可能会更好。分析仪文档 http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/analysis/Analyzer.html?is-external=true.