How to split the values in a DataFrame column and move those matching a condition into a new column with pandas

2024-03-05

I have a df:

name                        Value
Sri is a cricketer          Sri,is
Ram player                  Ram
Ravi is a singer            is
cricket and foot is ball    and,is,foot

and a list:

my_list=["is", "foot"]

I am trying to split df["Value"] on the comma (,) and, if a resulting value is present in my_list, move that value to a new column. My expected output is:

name                      Value        my_list
Sri is a cricketer        Sri           is      
Ram player                Ram 
Ravi is a singer                        is     
cricket and foot is ball  and          is,foot

Please help me achieve this. Thanks in advance.
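
To reproduce the example, the frame and the list from the question can be built as in the sketch below. The column names are taken from the table above; everything else is plain pandas.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "name": [
        "Sri is a cricketer",
        "Ram player",
        "Ravi is a singer",
        "cricket and foot is ball",
    ],
    "Value": ["Sri,is", "Ram", "is", "and,is,foot"],
})

# Values that should be moved from "Value" into the new column
my_list = ["is", "foot"]

print(df)
```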


Use str.findall (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.findall.html) with str.join (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.join.html):

my_list=["is", "foot"]
df['my_list'] = df['Value'].str.findall('(' + '|'.join(my_list) + ')').str.join(',')
print (df)
                       name        Value  my_list
0        Sri is a cricketer       Sri,is       is
1                Ram player          Ram         
2          Ravi is a singer           is       is
3  cricket and foot is ball  and,is,foot  is,foot
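
Note that the pattern above matches substrings, so "is" would also be found inside a value such as "this". The following is a minimal sketch of a stricter variant, assuming every value is a plain comma-separated token: it escapes the list entries with re.escape and uses lookarounds so that only whole tokens match. The extra row "this,is" is a hypothetical value added for illustration.

```python
import re
import pandas as pd

df = pd.DataFrame({"Value": ["Sri,is", "Ram", "is", "and,is,foot", "this,is"]})
my_list = ["is", "foot"]

# (?<![^,]) : the match must start at the beginning of the string or right after a comma
# (?![^,])  : the match must end at the end of the string or right before a comma
alternatives = "|".join(map(re.escape, my_list))
pattern = r"(?<![^,])(?:" + alternatives + r")(?![^,])"

df["my_list"] = df["Value"].str.findall(pattern).str.join(",")
print(df)
```

With the original pattern the last row would yield "is,is", because the "is" inside "this" also matches.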

Another solution is to split (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.split.html) and take the intersection of sets:

my_list=["is", "foot"]
df['my_list']=df['Value'].str.split(',').apply(lambda x: set(x) & set(my_list)).str.join(',')
print (df)
                       name        Value  my_list
0        Sri is a cricketer       Sri,is       is
1                Ram player          Ram         
2          Ravi is a singer           is       is
3  cricket and foot is ball  and,is,foot  is,foot
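
Python sets are unordered, so the intersection is not guaranteed to return "is,foot" in the order the values appear in the original string. If the order matters, one possible variant (a sketch, not the answer's original method) filters the split tokens with a generator expression instead:

```python
import pandas as pd

df = pd.DataFrame({"Value": ["Sri,is", "Ram", "is", "and,is,foot"]})
my_list = ["is", "foot"]

# A set gives fast membership tests; the token order comes from the split list
wanted = set(my_list)
tokens = df["Value"].str.split(",")

# Keep only the tokens found in my_list, in their original order
df["my_list"] = tokens.apply(lambda xs: ",".join(x for x in xs if x in wanted))
print(df)
```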

Finally, remove the matched values from the Value column:

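# regex=True is needed on pandas >= 2.0, where str.replace treats the pattern as a literal by default
df['Value'] = (df['Value'].str.replace('(' + '|,'.join(my_list) + ')', '', regex=True)
                          .str.replace('[,]{2,}', ',', regex=True)
                          .str.strip(','))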
print (df)
                       name Value  my_list
0        Sri is a cricketer   Sri       is
1                Ram player   Ram         
2          Ravi is a singer             is
3  cricket and foot is ball   and  is,foot
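
To see what the three chained calls do, they can be traced on single strings with the standard re module. This is only an illustrative sketch; clean is a hypothetical helper name, not part of pandas.

```python
import re

my_list = ["is", "foot"]

# Same pattern the answer builds: "(is|,foot)"
pattern = "(" + "|,".join(my_list) + ")"

def clean(value):
    """Mirror the three chained .str calls on a single string."""
    step1 = re.sub(pattern, "", value)      # drop "is" and ",foot"
    step2 = re.sub("[,]{2,}", ",", step1)   # collapse runs of commas
    return step2.strip(",")                 # trim leading/trailing commas

print(clean("and,is,foot"))  # and
print(clean("Sri,is"))       # Sri
print(clean("foot,is"))      # foot
```

The last call shows a limitation of joining with '|,': only the first list entry is matched without a leading comma, so "foot" at the very start of a string is left in place. The pattern also matches substrings, so a value containing "is" inside a longer word would be altered.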

Or:

my_list=["is", "foot"]

s1 = df['Value'].str.split(',')

df['my_list'] = s1.apply(lambda x: set(x) & set(my_list)).str.join(',')
df['Value'] = s1.apply(lambda x: set(x) - set(my_list)).str.join(',')
print (df)

                       name Value  my_list
0        Sri is a cricketer   Sri       is
1                Ram player   Ram         
2          Ravi is a singer             is
3  cricket and foot is ball   and  is,foot
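
Because set operations do not preserve order (and drop duplicates), rows with several remaining values may come back in a different order than they were written. A minimal order-preserving sketch of the same idea follows; partition is a hypothetical helper introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Sri is a cricketer", "Ram player",
             "Ravi is a singer", "cricket and foot is ball"],
    "Value": ["Sri,is", "Ram", "is", "and,is,foot"],
})
my_list = ["is", "foot"]
wanted = set(my_list)

def partition(value):
    """Return (kept, moved) comma-joined strings, preserving token order."""
    tokens = value.split(",")
    moved = [t for t in tokens if t in wanted]
    kept = [t for t in tokens if t not in wanted]
    return ",".join(kept), ",".join(moved)

# Apply once per row, then unzip the pairs into the two columns
kept, moved = zip(*df["Value"].map(partition))
df["Value"] = list(kept)
df["my_list"] = list(moved)
print(df)
```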