Python pandas: Filtering one DataFrame by another using contains and join

2023-12-20

I have a DataFrame that looks like this:

df:

Noun    Thumb_count  
ability     19.0
account     3.0
accuracy    155.0
accurate    151.0
activity    163.0
adapt       3.0
app         15.0
gps         13.0

I have another DataFrame that looks like this:

df1:

Review Text                                         Noun          Thumbups  Rating  Review Date
This app is not working properly. GPS is showi...   app           34.0      2       August 3, 2020
This app is not working properly. GPS is showi...   gps           34.0      2       August 3, 2020
This app is not working properly. GPS is showi...   network       34.0      2       August 3, 2020
This app is not working properly. GPS is showi...   connectivity  34.0      2       August 3, 2020
This app is not working properly. GPS is showi...   signal        34.0      2       August 3, 2020

Now I want to keep only the rows of df1 whose Noun value also appears in the Noun column of df. This is my filtering code:

df1[df1.Noun.str.contains(('|').join(df.Noun.values.tolist()))]

When I run the command above, it throws the following error:

error: nothing to repeat at position 2

I'm not sure where I went wrong. Can anyone point out my mistake?


I think you added an extra pair of parentheses; try:

df1[df1.Noun.str.contains('|'.join(df.Noun.tolist()))]
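Note that ('|') is just the string '|', so dropping the parentheses does not change the pattern. "nothing to repeat" is raised by Python's re module when a quantifier such as *, + or ? has nothing before it to repeat, so a more likely cause is that some value in df.Noun contains a regex metacharacter. The sketch below reproduces the error and escapes each value with re.escape before joining. The noun list is hypothetical, with "+" standing in for whatever value triggers the error in the real data.

```python
import re

# Hypothetical noun list; the "+" entry stands in for whatever value
# in df.Noun contains a regex metacharacter.
nouns = ["app", "+", "gps"]

# Joining the raw values yields "app|+|gps", where "+" has nothing to repeat.
try:
    re.compile("|".join(nouns))
    raw_error = None
except re.error as exc:
    raw_error = str(exc)

# Escaping each value first makes every noun match literally.
safe_pattern = "|".join(re.escape(n) for n in nouns)
compiled = re.compile(safe_pattern)
```

With the escaped pattern, df1[df1.Noun.str.contains(safe_pattern)] should no longer raise. Keep in mind that str.contains still matches substrings, so "app" would also match a noun such as "apple".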

You can use the isin method (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.isin.html), which compares whole values instead of building a regex:

df1[df1.Noun.isin(df.Noun)]
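
As a minimal, self-contained sketch of the isin approach, the frames below are cut-down stand-ins for df and df1 from the question (only a few columns and rows are reproduced, as an assumption for illustration).

```python
import pandas as pd

# Small stand-ins for the frames in the question (only the relevant columns).
df = pd.DataFrame({"Noun": ["ability", "account", "app", "gps"]})
df1 = pd.DataFrame({
    "Noun": ["app", "gps", "network", "connectivity", "signal"],
    "Thumbups": [34.0] * 5,
})

# isin compares whole values, so no regex is built and nothing needs escaping.
filtered = df1[df1["Noun"].isin(df["Noun"])]
print(filtered["Noun"].tolist())  # ['app', 'gps']
```

Because the match is exact, a noun like "application" in df1 would not be kept just because "app" appears in df, which is usually what a column-to-column filter needs.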