Use str.findall (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.findall.html) to collect every match, then str.join (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.join.html) to combine the matches into one comma-separated string:
my_list=["is", "foot"]
df['my_list'] = df['Value'].str.findall('(' + '|'.join(my_list) + ')').str.join(',')
print (df)
                       name        Value  my_list
0        Sri is a cricketer       Sri,is       is
1                Ram player          Ram
2          Ravi is a singer           is       is
3  cricket and foot is ball  and,is,foot  is,foot
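For anyone who wants to run this end to end, here is a minimal sketch. The DataFrame is reconstructed from the printed output, so its exact contents are an assumption. The pattern also wraps each word in re.escape and \b, which the one-liner above does not do, so that "is" would not match inside a longer word such as "this".

```python
import re
import pandas as pd

# Sample frame reconstructed from the printed output (an assumption).
df = pd.DataFrame({
    'name': ['Sri is a cricketer', 'Ram player',
             'Ravi is a singer', 'cricket and foot is ball'],
    'Value': ['Sri,is', 'Ram', 'is', 'and,is,foot'],
})
my_list = ['is', 'foot']

# Non-capturing group with word boundaries; re.escape guards against
# regex metacharacters in the search words.
pattern = r'\b(?:' + '|'.join(map(re.escape, my_list)) + r')\b'

# findall returns a list of matches per row; join turns it into a string.
df['my_list'] = df['Value'].str.findall(pattern).str.join(',')
print(df)
```

Rows without any match end up with an empty string, because joining an empty list gives ''.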
Another solution is to split the values with str.split (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.split.html) and take the intersection of sets:
my_list=["is", "foot"]
df['my_list']=df['Value'].str.split(',').apply(lambda x: set(x) & set(my_list)).str.join(',')
print (df)
                       name        Value  my_list
0        Sri is a cricketer       Sri,is       is
1                Ram player          Ram
2          Ravi is a singer           is       is
3  cricket and foot is ball  and,is,foot  is,foot
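One caveat with the set approach: sets have no guaranteed order, so the joined string could come out as "foot,is" rather than "is,foot". If the order matters, a possible variant is to filter the split parts with a list comprehension instead. This is only a sketch; keep_listed is a hypothetical helper name, and the sample values are copied from the output above.

```python
import pandas as pd

my_list = ['is', 'foot']
wanted = set(my_list)  # set only used for fast membership tests

# Sample values copied from the printed output (an assumption).
values = pd.Series(['Sri,is', 'Ram', 'is', 'and,is,foot'])

def keep_listed(cell):
    # Keep only the parts found in my_list, in their original order.
    return ','.join(part for part in cell.split(',') if part in wanted)

result = values.apply(keep_listed)
print(result.tolist())
```

Unlike a set intersection, this also keeps duplicates if a word appears more than once in a cell.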
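Finally, to remove the matched words from the Value column:
# regex=True is required on pandas 2.0+, where str.replace defaults to literal matching
df['Value'] = (df['Value'].str.replace('(' + '|'.join(my_list) + ')', '', regex=True)
                          .str.replace('[,]{2,}', ',', regex=True)  # collapse repeated commas
                          .str.strip(','))  # drop leading/trailing commas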
print (df)
                       name  Value  my_list
0        Sri is a cricketer    Sri       is
1                Ram player    Ram
2          Ravi is a singer              is
3  cricket and foot is ball    and  is,foot
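To see what each step of that chain does, here is a sketch that applies the same three operations to a single string with the standard re module. It joins the words with a plain '|' so that every word in my_list is removed wherever it appears; the sample string is taken from row 3.

```python
import re

my_list = ['is', 'foot']
value = 'and,is,foot'

# Step 1: delete every listed word, leaving stray commas behind.
step1 = re.sub('(' + '|'.join(my_list) + ')', '', value)
# Step 2: collapse runs of two or more commas into one.
step2 = re.sub('[,]{2,}', ',', step1)
# Step 3: trim commas left at either end.
step3 = step2.strip(',')

print(repr(step1), repr(step2), repr(step3))
```

Note that this pattern matches substrings, so a value such as "this" would also lose its "is".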
Or:
my_list=["is", "foot"]
s1 = df['Value'].str.split(',')
df['my_list'] = s1.apply(lambda x: set(x) & set(my_list)).str.join(',')
df['Value'] = s1.apply(lambda x: set(x) - set(my_list)).str.join(',')
print (df)
                       name  Value  my_list
0        Sri is a cricketer    Sri       is
1                Ram player    Ram
2          Ravi is a singer              is
3  cricket and foot is ball    and  is,foot