Recently, part of the pywsd code has been ported into the bleeding-edge version of NLTK, in the wsd.py module. Try:
>>> from nltk.wsd import lesk
>>> sent = 'I went to the bank to deposit my money'
>>> ambiguous = 'bank'
>>> lesk(sent, ambiguous)
Synset('bank.v.04')
>>> lesk(sent, ambiguous).definition()
u'act as the banker in a game or in gambling'
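For better WSD performance, use the pywsd library instead of the NLTK module. In general, simple_lesk() from pywsd does better than lesk from NLTK. I'll try to update the NLTK module as much as possible when I'm free.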
In response to Chris Spencer's comment, please note the limitations of Lesk algorithms. I'm simply giving an accurate implementation of the algorithms. It's not a silver bullet: http://en.wikipedia.org/wiki/Lesk_algorithm
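To make that limitation concrete, here is a minimal sketch of the gloss-overlap idea behind simplified Lesk. It is not the NLTK or pywsd code: the glosses, the sense labels, the stopword list and the names `TOY_GLOSSES`, `tokens` and `toy_lesk` are all invented for illustration, and a real implementation would read its glosses from WordNet.

```python
# Toy glosses, invented for this sketch; real Lesk reads them from WordNet.
TOY_GLOSSES = {
    "bank.n.01": "sloping land beside a body of water such as a river",
    "bank.n.02": "a financial institution that accepts deposits and lends money",
    "bank.v.04": "act as the banker in a game or in gambling",
}

# A tiny hand-picked stopword list (an assumption, not a standard one).
STOPWORDS = {"a", "an", "the", "in", "of", "to", "as", "or", "and",
             "that", "my", "i", "such", "on"}


def tokens(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def toy_lesk(context, glosses):
    """Return (best_sense, overlaps); overlaps maps sense -> shared-word count."""
    context_words = tokens(context)
    overlaps = {sense: len(context_words & tokens(gloss))
                for sense, gloss in glosses.items()}
    # On a tie, max() keeps the first sense in dictionary order.
    best = max(overlaps, key=overlaps.get)
    return best, overlaps


best, overlaps = toy_lesk("I went to the bank to deposit my money", TOY_GLOSSES)
print(best, overlaps)

tied_best, tied_overlaps = toy_lesk("I sat on the bank", TOY_GLOSSES)
print(tied_best, tied_overlaps)
```

In the first sentence only "money" is shared with a gloss ("deposit" does not even match "deposits" without stemming), so a single word decides the sense. In the second sentence no gloss shares any word with the context, and the choice falls back to whichever sense happens to come first. Short contexts and short glosses make this kind of near-empty overlap common, which is why the result can look arbitrary.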
Also note that although:
lesk("My cat likes to eat mice.", "cat", "n")
doesn't give you the right answer, you can use the pywsd implementation of max_similarity():
>>> from pywsd.similarity import max_similarity
>>> max_similarity('my cat likes to eat mice', 'cat', 'wup', pos='n').definition()
'feline mammal usually having thick soft fur and no ability to roar: domestic cats; wildcats'
>>> max_similarity('my cat likes to eat mice', 'cat', 'lin', pos='n').definition()
'feline mammal usually having thick soft fur and no ability to roar: domestic cats; wildcats'
@Chris, if you want a python setup.py, just make a polite request and I'll write it...