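I'd like to replace the values in several columns at once with the corresponding values from other columns, based on the values in the first group of columns (specifically, wherever the first of those columns is blank). Here's an example of what I'm trying to do: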
import pandas as pd
df = pd.DataFrame({'a1': ['m', 'n', 'o', 'p'],
                   'a2': ['q', 'r', 's', 't'],
                   'b1': ['', '', 'a', ''],
                   'b2': ['', '', 'b', '']})
df
#   a1 a2 b1 b2
# 0  m  q
# 1  n  r
# 2  o  s  a  b
# 3  p  t
I'd like to replace the '' values in b1 and b2 with the corresponding values from a1 and a2, wherever b1 is blank:
#   a1 a2 b1 b2
# 0  m  q  m  q
# 1  n  r  n  r
# 2  o  s  a  b
# 3  p  t  p  t
Here's my thought process (I'm relatively new to pandas, so I'm probably speaking with a thick R accent here):
missing = (df.b1 == '')
# First thought:
df[missing, ['b1', 'b2']] = df[missing, ['a1', 'a2']]
# TypeError: 'Series' objects are mutable, thus they cannot be hashed
# Fair enough
df[tuple(missing), ('b1', 'b2')] = df[tuple(missing), ('a1', 'a2')]
# KeyError: ((True, True, False, True), ('a1', 'a2'))
# Obviously I'm going about this wrong. Maybe I need to use indexing?
df[['b1', 'b2']].ix[missing,:]
# b1 b2
# 0
# 1
# 3
# That looks right
df[['b1', 'b2']][missing, :] = df[['a1', 'a2']].ix[missing, :]
# TypeError: 'Series' objects are mutable, thus they cannot be hashed
# Deja vu
df[['b1', 'b2']].ix[tuple(missing), :] = df[['a1', 'a2']].ix[tuple(missing), :]
# ValueError: could not convert string to float:
# Uhh...
I can do it column by column:
df['b1'].ix[missing] = df['a1'].ix[missing]
df['b2'].ix[missing] = df['a2'].ix[missing]
...but I suspect there's a more idiomatic way to do this. Thoughts?
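For anyone reading this on a recent pandas release: `.ix` was removed in pandas 1.0, so the column-by-column version above can be sketched with `.loc` instead. This is only a restatement of the same workaround, not the simultaneous update I'm asking about:

```python
import pandas as pd

df = pd.DataFrame({'a1': ['m', 'n', 'o', 'p'],
                   'a2': ['q', 'r', 's', 't'],
                   'b1': ['', '', 'a', ''],
                   'b2': ['', '', 'b', '']})

# Boolean mask of the rows where b1 is blank
missing = (df['b1'] == '')

# One column at a time; each right-hand Series is aligned on the row index
df.loc[missing, 'b1'] = df.loc[missing, 'a1']
df.loc[missing, 'b2'] = df.loc[missing, 'a2']
```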
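Update: To clarify, I'm specifically wondering whether it's possible to update all of the columns at once. For example, a hypothetical modification of Primer's answer (this doesn't work and produces NaNs, although I'm not sure why):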
df.loc[missing, ['b1', 'b2']] = df.loc[missing, ['a1', 'a2']]
#   a1 a2   b1   b2
# 0  m  q  NaN  NaN
# 1  n  r  NaN  NaN
# 2  o  s    a    b
# 3  p  t  NaN  NaN
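A likely explanation for the NaNs is that pandas aligns on labels when a DataFrame is assigned through `.loc`: the right-hand side has columns a1 and a2, the target has b1 and b2, no labels match, and so nothing is filled in. A minimal sketch of one way around this, assuming it's acceptable to assign by position rather than by label, is to strip the labels from the right-hand side with `.to_numpy()` (available since pandas 0.24; `.values` plays the same role in older versions):

```python
import pandas as pd

df = pd.DataFrame({'a1': ['m', 'n', 'o', 'p'],
                   'a2': ['q', 'r', 's', 't'],
                   'b1': ['', '', 'a', ''],
                   'b2': ['', '', 'b', '']})

# Boolean mask of the rows where b1 is blank
missing = (df['b1'] == '')

# .to_numpy() drops the a1/a2 column labels, so the values are written
# into b1/b2 by position instead of being aligned on column names
df.loc[missing, ['b1', 'b2']] = df.loc[missing, ['a1', 'a2']].to_numpy()

# b1 is now ['m', 'n', 'a', 'p'] and b2 is ['q', 'r', 'b', 't']
```

Because the assignment is positional, the order of the columns in the two lists matters: the first source column goes into the first target column, and so on.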