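I want to join a list of tuples stored in a DataFrame column.
I have tried several ways of doing this inside the DataFrame, using `join` with a `lambda`:
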
    import pandas as pd
    from nltk import word_tokenize, pos_tag, pos_tag_sents

    data = {'Categories': ['animal', 'plant', 'object'],
            'Type': ['tree', 'dog', 'rock'],
            'Comment': ['The NYC tree is very big', 'NY The cat from the UK is small',
                        'The rock was found in LA.']}

    def posTag(data):
        data = pd.DataFrame(data)
        comments = data['Comment'].tolist()
        # Each entry of taggedComments is a list of (word, tag) tuples
        taggedComments = pos_tag_sents(map(word_tokenize, comments))
        data['taggedComment'] = taggedComments
        print(data['taggedComment'])
        # This is the line that raises the TypeError
        data['taggedComment'].apply(lambda x: (' '.join(x)))
        return data

    taggedData = posTag(data)
    print(data)

Some other ways I have tried to join the tuples are:

    (' '.join(['_'.join(x) for x in data['taggedComment']]))
    [''.join(x) for x in data['taggedComment']]
    ['_'.join(str(x)) for x in data['taggedComment']]

No matter what I do, I get the same error:

    TypeError: sequence item 0: expected string, tuple found
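
The error seems to be reproducible without NLTK or pandas at all. Below is a minimal sketch, assuming each cell of the taggedComment column holds a list of (word, tag) tuples; the sample words and tags are made up for illustration and are not actual tagger output.

```python
# Hypothetical stand-in for one cell of data['taggedComment']:
# a list of (word, tag) tuples (values invented for illustration).
tagged = [('The', 'DT'), ('tree', 'NN')]

try:
    # str.join requires every item to be a string,
    # but here every item is a tuple.
    ' '.join(tagged)
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__)  # TypeError
```

So the outer join appears to be handed tuples rather than strings, which matches the message about sequence item 0.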
For each list in the DataFrame, such as

    [('A', 'B'), ('B', 'C'), ('C', 'B')]

the result I want written to outPutFile is

    'A_B B_C C_B'
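
To make the target concrete, here is a minimal sketch of that transformation on a plain Python list, with no DataFrame involved. The helper name `join_tagged` and its `inner` and `outer` parameters are hypothetical, introduced only for this illustration.

```python
def join_tagged(pairs, inner='_', outer=' '):
    """Join each tuple with `inner`, then join those strings with `outer`."""
    # The inner join turns ('A', 'B') into 'A_B'; the outer join then
    # receives only strings, which is what str.join requires.
    return outer.join(inner.join(pair) for pair in pairs)


pairs = [('A', 'B'), ('B', 'C'), ('C', 'B')]
result = join_tagged(pairs)
print(result)  # A_B B_C C_B
```

If this is the right shape, it could presumably be applied per row with `data['taggedComment'].apply(join_tagged)`, with the result assigned back to a column, since `apply` returns a new Series rather than changing the column in place.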

Any suggestions as to what is going wrong?