Concatenating a list of Series with axis=1 returns a DataFrame. Thus, serie_4 is a DataFrame, while serie_3 is a Series. Concatenating a DataFrame with a Series raises an exception.
You could use
import pandas as pd
serie_5 = pd.concat([serie_1, serie_2, serie_3], join='outer', axis=1)
instead.
For example,
import functools
import numpy as np
import pandas as pd
s1 = pd.Series([0,1], index=list('AB'))
s2 = pd.Series([2,3], index=list('AC'))
result = pd.concat([s1, s2], join='outer', axis=1, sort=False)
print(result)
yields
     0    1
A  0.0  2.0
B  1.0  NaN
C  NaN  3.0
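To check the point about return types, here is a minimal sketch that inspects what pd.concat gives back for a list of Series. The variable names combined and stacked are illustrative and do not come from the original question; the sample data repeats the example above.

```python
import pandas as pd

# Sample data repeated from the example above.
s1 = pd.Series([0, 1], index=list('AB'))
s2 = pd.Series([2, 3], index=list('AC'))

# axis=1 places the Series side by side, one column each.
combined = pd.concat([s1, s2], axis=1)

# axis=0 (the default) places them end to end instead.
stacked = pd.concat([s1, s2], axis=0)

print(type(combined).__name__)  # DataFrame
print(type(stacked).__name__)   # Series
```

So it is the axis=1 call that turns a list of Series into a DataFrame; with the default axis the result is still a Series.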
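Note that you will get a ValueError if you try to concatenate Series with a non-unique index (the exact exception type and message can vary across pandas versions).
For example,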
s3 = pd.Series([0,1], index=list('AB'), name='s3')
s4 = pd.Series([2,3], index=list('AA'), name='s4') # <-- non-unique index
result = pd.concat([s3, s4], join='outer', axis=1, sort=False)
raises
ValueError: cannot reindex from a duplicate axis
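Before concatenating, it can help to find out which Series has the duplicate labels. The following is only a sketch of one way to do that, using Index.is_unique and Index.duplicated; the loop and the dupes variable are illustrative and not part of the original answer.

```python
import pandas as pd

s3 = pd.Series([0, 1], index=list('AB'), name='s3')
s4 = pd.Series([2, 3], index=list('AA'), name='s4')  # non-unique index

for s in (s3, s4):
    # Labels that appear more than once in this Series' index.
    dupes = s.index[s.index.duplicated()].unique().tolist()
    print(s.name, s.index.is_unique, dupes)
# s3 True []
# s4 False ['A']
```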
To work around this, reset the index and merge the DataFrames (see https://stackoverflow.com/a/30512931/190597) instead:
import functools
s3 = pd.Series([0,1], index=list('AB'), name='s3')
s4 = pd.Series([2,3], index=list('AA'), name='s4') # <-- non-unique index
result = functools.reduce(
    lambda left, right: pd.merge(left, right, on='index', how='outer'),
    [s.reset_index() for s in [s3, s4]])
print(result)
yields
  index  s3   s4
0     A   0  2.0
1     A   0  3.0
2     B   1  NaN
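If you need this in more than one place, the reduce/merge pattern above could be wrapped in a small helper. This is just a sketch: merge_series_on_index is a hypothetical name, and it assumes every Series has a distinct, non-None name. It also moves the labels back into the index at the end, which the snippet above does not do.

```python
import functools
import pandas as pd

def merge_series_on_index(series_list):
    """Outer-merge named Series on their index labels (hypothetical helper)."""
    # Turn each Series into a two-column frame: 'index' plus the Series name.
    frames = [s.rename_axis('index').reset_index() for s in series_list]
    merged = functools.reduce(
        lambda left, right: pd.merge(left, right, on='index', how='outer'),
        frames)
    # Restore the labels as the index of the result.
    return merged.set_index('index')

s3 = pd.Series([0, 1], index=list('AB'), name='s3')
s4 = pd.Series([2, 3], index=list('AA'), name='s4')  # non-unique index
merged = merge_series_on_index([s3, s4])
print(merged)
```

Note that a merge pairs every matching label on the left with every matching label on the right, so a label duplicated in several Series multiplies the number of rows in the result.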