I get the following error with this code:
model = lda.LDA(n_topics=15, n_iter=50, random_state=1)
model.fit(X)
topic_word = model.topic_word_
print("type(topic_word): {}".format(type(topic_word)))
print("shape: {}".format(topic_word.shape))
print("\n")
n = 15
doc_topic = model.doc_topic_
for i in range(15):
    print("{} (top topic: {})".format(titles[i], doc_topic[0][i].argmax()))
    topic_csharp = np.zeros(shape=[1, n])
    np.copyto(topic_csharp, doc_topic[0][i])
for i, topic_dist in enumerate(topic_word):
    topic_words = np.array(vocab)[np.argsort(topic_dist)][:-(n+1):-1]
    print('*Topic {}\n- {}'.format(i, ' '.join(topic_words)))
The error is:
Traceback (most recent call last):
File "C:\Users\csharp.py", line 56, in <module>
topic_words = np.array(vocab)[np.argsort(topic_dist)][:-(n+1):-1]
MemoryError
The documents I am fitting the model on contain about 150,000 lines of text.
Vocabulary size: 558270
n_words: 13075390 (after preprocessing)
How can I fix this error?
If your array is too large to fit in RAM, use numpy.memmap.
See http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.memmap.html
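A minimal sketch of the memmap approach: the matrix lives in a file on disk and pages are loaded into RAM only as they are accessed, so sorting one topic row at a time never needs the whole matrix in memory. The file name, the smaller shape, and the random fill here are illustrative assumptions, not from the question (the real topic_word matrix would be (15, 558270)):

```python
import numpy as np

# Illustrative sizes only -- small enough to run quickly.
n_topics, vocab_size = 15, 10000

# Create a disk-backed array instead of an in-memory one.
tw = np.memmap("topic_word.dat", dtype="float64", mode="w+",
               shape=(n_topics, vocab_size))

# Fill it row by row (in practice: tw[i] = model.topic_word_[i]).
rng = np.random.default_rng(1)
for i in range(n_topics):
    tw[i] = rng.random(vocab_size)
tw.flush()  # make sure the data reaches the file

# Later, reopen read-only without loading the whole matrix at once.
tw_ro = np.memmap("topic_word.dat", dtype="float64", mode="r",
                  shape=(n_topics, vocab_size))

# Rank one topic row at a time, as in the original loop.
top = np.argsort(tw_ro[0])[:-6:-1]  # indices of the 5 largest entries
```

Only one row of the memmap is touched per iteration, so peak memory stays proportional to the vocabulary size rather than to the full topics-by-vocabulary matrix.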