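I am trying to merge two pandas DataFrames, although a merge may not be what I actually want.
The two frames have two columns in common: one holds unique values that can be used for the join; the other is empty in one frame and populated in the other.
Where the unique field matches, I want to overwrite the empty field and keep only that overwritten column; I do not want the remaining columns from the second DataFrame.
Hopefully the example below explains this further: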
>>> import pandas as pd
>>> animals = [
...     {"animal": "dog", "name": "freddy", "food": ""},
...     {"animal": "cat", "name": "dexter", "food": ""},
...     {"animal": "dog", "name": "lou lou", "food": ""},
... ]
>>> foods = [
...     {"name": "freddy", "food": "dog mix", "brand": "doggys dog"},
...     {"name": "dexter", "food": "fussy cat mix", "brand": "fish fishy"},
...     {"name": "lou lou", "food": "bones", "brand": "i was a cow"},
... ]
>>> a_pd = pd.DataFrame(animals)
>>> a_pd
   animal  food     name
0     dog         freddy
1     cat         dexter
2     dog        lou lou
>>> f_pd = pd.DataFrame(foods)
>>> f_pd
         brand           food     name
0   doggys dog        dog mix   freddy
1   fish fishy  fussy cat mix   dexter
2  i was a cow          bones  lou lou
>>>
>>>
>>> animal_data = a_pd.merge(f_pd, on='name', how='left')
>>> animal_data
   animal  food_x     name        brand         food_y
0     dog           freddy   doggys dog        dog mix
1     cat           dexter   fish fishy  fussy cat mix
2     dog          lou lou  i was a cow          bones
>>>
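I should end up with only food; I do not want brand (note also that this is sample data, and the live data has many more columns).
Desired result: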
>>> animal_data
   animal     name           food
0     dog   freddy        dog mix
1     cat   dexter  fussy cat mix
2     dog  lou lou          bones
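
For reference, here is a minimal sketch of two ways this result might be produced, using the same sample data as above. It assumes that `name` is unique in `f_pd` and that every `food` value in `a_pd` is empty and safe to overwrite; the variable names `merged`, `food_by_name` and `mapped` are only illustrative. Neither option has been checked against the live data.

```python
import pandas as pd

animals = [
    {"animal": "dog", "name": "freddy", "food": ""},
    {"animal": "cat", "name": "dexter", "food": ""},
    {"animal": "dog", "name": "lou lou", "food": ""},
]
foods = [
    {"name": "freddy", "food": "dog mix", "brand": "doggys dog"},
    {"name": "dexter", "food": "fussy cat mix", "brand": "fish fishy"},
    {"name": "lou lou", "food": "bones", "brand": "i was a cow"},
]
a_pd = pd.DataFrame(animals)
f_pd = pd.DataFrame(foods)

# Option 1: drop the empty column first, then merge in only the join key
# and the one column that is wanted, so no food_x/food_y suffixes appear
# and brand never enters the result.
merged = a_pd.drop(columns="food").merge(
    f_pd[["name", "food"]], on="name", how="left"
)

# Option 2: build a name -> food lookup Series and map it onto a_pd.
# Assumption: names are unique in f_pd; unmatched names would become NaN.
food_by_name = f_pd.set_index("name")["food"]
mapped = a_pd.assign(food=a_pd["name"].map(food_by_name))

print(merged)
print(mapped)
```

Selecting `f_pd[["name", "food"]]` before merging is what keeps the other columns of the second DataFrame out, which should carry over to the live data with more columns. If some `food` values in `a_pd` were already populated and had to be preserved, both options would need adjusting, since they replace the whole column.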