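This is a pretty basic concept: I have multiple features to train on. My data is all text, and each sample has three separate fields. Every example I can find has the text data set up like this: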
data = ['text1','text2',...]
Mine looks like this:
data = [['text1','text2','text3'],[...],...]
But when I try to fit the data, I get the following traceback:
ValueError Traceback (most recent call last)
<ipython-input-25-e3356a0f62f8> in <module>()
----> 1 classifier.fit(X,y)
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sklearn/svm/base.pyc in fit(self, X, y, sample_weight)
140 "by not using the ``sparse`` parameter")
141
--> 142 X = atleast2d_or_csr(X, dtype=np.float64, order='C')
143
144 if self.impl in ['c_svc', 'nu_svc']:
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sklearn/utils/validation.pyc in atleast2d_or_csr(X, dtype, order, copy)
114 """
115 return _atleast2d_or_sparse(X, dtype, order, copy, sparse.csr_matrix,
--> 116 "tocsr")
117
118
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sklearn/utils/validation.pyc in _atleast2d_or_sparse(X, dtype, order, copy, sparse_class, convmethod)
94 _assert_all_finite(X.data)
95 else:
---> 96 X = array2d(X, dtype=dtype, order=order, copy=copy)
97 _assert_all_finite(X)
98 return X
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sklearn/utils/validation.pyc in array2d(X, dtype, order, copy)
78 raise TypeError('A sparse matrix was passed, but dense data '
79 'is required. Use X.toarray() to convert to dense.')
---> 80 X_2d = np.asarray(np.atleast_2d(X), dtype=dtype, order=order)
81 _assert_all_finite(X_2d)
82 if X is X_2d and copy:
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/numeric.pyc in asarray(a, dtype, order)
318
319 """
--> 320 return array(a, dtype, copy=False, order=order)
321
322 def asanyarray(a, dtype=None, order=None):
ValueError: setting an array element with a sequence.
Is there a specific way I can fix this? Thank you!
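NOTES:
All of the text data I am using is vectorized by HashingVectorizer.
I call clf.fit(X,y), where X is a list of lists, each containing 3 vectorized texts, and y is a list of the classes that the elements of X belong to.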
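For reference, here is a minimal sketch of one direction that might avoid the error, assuming the problem is that a nested list of per-field sparse matrices cannot be converted into a single 2-D array. It vectorizes each field separately and joins the results column-wise with scipy.sparse.hstack. The toy data, the n_features value, and the choice of SVC with a linear kernel are assumptions made for illustration, not my actual setup.

```python
from scipy.sparse import hstack
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.svm import SVC

# Hypothetical toy data: each row holds three text fields.
data = [
    ["good movie", "great acting", "loved it"],
    ["bad movie", "poor acting", "hated it"],
    ["good plot", "great cast", "loved every scene"],
    ["bad plot", "poor cast", "hated every scene"],
]
y = [1, 0, 1, 0]

# Assumed feature count per field; HashingVectorizer is stateless,
# so transform() can be called without a prior fit().
n_features = 2 ** 10
vectorizer = HashingVectorizer(n_features=n_features)

# Regroup the rows into three columns, one per text field.
columns = list(zip(*data))

# Vectorize each field on its own, then stack the sparse matrices
# side by side so every sample becomes a single feature row.
X = hstack([vectorizer.transform(col) for col in columns]).tocsr()

classifier = SVC(kernel="linear")
classifier.fit(X, y)
```

With this layout X is one sparse matrix of shape (n_samples, 3 * n_features) instead of a list of lists, which is the 2-D input that fit expects.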