I have 2 dataframes, shown below.
df_1
Index Fruit
1 Apple
2 Banana
3 Peach
df_2
Fruit Taste
Apple Tasty
Banana Tasty
Banana Rotten
Peach Rotten
Peach Tasty
Peach Tasty
I want to merge the two dataframes on Fruit, but keep only the first occurrence of Apple, Banana, and Peach in the second dataframe. The final result should be:
df_output
Index Fruit Taste
1 Apple Tasty
2 Banana Tasty
3 Peach Rotten
Here Fruit, Index, and Taste are the column headers. I tried something like df1.merge(df2,how='left',on='Fruit'), but it created extra rows according to the length of df_2.
Thanks.
Use drop_duplicates (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.drop_duplicates.html) to keep only the first row for each Fruit in df_2 before merging:
df = df_1.merge(df_2.drop_duplicates('Fruit'),how='left',on='Fruit')
print (df)
Index Fruit Taste
0 1 Apple Tasty
1 2 Banana Tasty
2 3 Peach Rotten
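For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. The frames are rebuilt by hand from the tables in the question, and the names plain, first_per_fruit, and merged are only illustrative. The validate="many_to_one" argument is an optional extra that is not part of the original answer; it asks pandas to raise if the right-hand side still contains duplicate keys.

```python
import pandas as pd

# Rebuild the two frames from the question (values copied from the tables).
df_1 = pd.DataFrame({"Index": [1, 2, 3],
                     "Fruit": ["Apple", "Banana", "Peach"]})
df_2 = pd.DataFrame({
    "Fruit": ["Apple", "Banana", "Banana", "Peach", "Peach", "Peach"],
    "Taste": ["Tasty", "Tasty", "Rotten", "Rotten", "Tasty", "Tasty"],
})

# A plain left merge emits one row per matching row in df_2,
# which is where the extra rows come from.
plain = df_1.merge(df_2, how="left", on="Fruit")

# Keep only the first row per Fruit (keep="first" is the default), then merge.
first_per_fruit = df_2.drop_duplicates(subset="Fruit")
merged = df_1.merge(first_per_fruit, how="left", on="Fruit",
                    validate="many_to_one")  # optional safety check

print(len(plain), len(merged))
print(merged)
```

With this data the plain merge has 6 rows (1 for Apple, 2 for Banana, 3 for Peach), while the deduplicated merge has 3, one per row of df_1.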
If you only need to add a single column, map (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.map.html) is faster:
s = df_2.drop_duplicates('Fruit').set_index('Fruit')['Taste']
df_1['Taste'] = df_1['Fruit'].map(s)
print (df_1)
Index Fruit Taste
0 1 Apple Tasty
1 2 Banana Tasty
2 3 Peach Rotten
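The map approach can also be sketched as a standalone script. The frames are again rebuilt by hand from the question, and lookup and result are illustrative names. This variant uses assign so that df_1 itself is left unchanged, which is a design choice of this sketch rather than something the answer requires.

```python
import pandas as pd

# Rebuild the two frames from the question (values copied from the tables).
df_1 = pd.DataFrame({"Index": [1, 2, 3],
                     "Fruit": ["Apple", "Banana", "Peach"]})
df_2 = pd.DataFrame({
    "Fruit": ["Apple", "Banana", "Banana", "Peach", "Peach", "Peach"],
    "Taste": ["Tasty", "Tasty", "Rotten", "Rotten", "Tasty", "Tasty"],
})

# Build a Fruit -> Taste lookup Series from the first row of each Fruit.
lookup = df_2.drop_duplicates(subset="Fruit").set_index("Fruit")["Taste"]

# Map each Fruit in df_1 through the lookup; assign returns a new frame.
result = df_1.assign(Taste=df_1["Fruit"].map(lookup))

print(result)
```

Any Fruit in df_1 that has no entry in the lookup would get NaN in Taste, which matches the behaviour of the left merge.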