You can use crosstab http://pandas.pydata.org/pandas-docs/stable/generated/pandas.crosstab.html for this:
In [14]: pd.crosstab(index=df['values'], columns=[df['convert_me'], df['age_col']])
Out[14]:
convert_me  Convert1  Convert2  Convert3
age_col           23        33        44
values
21.71502           1         0         0
58.35506           0         1         0
60.41639           0         0         1
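To try this without the original data, here is a minimal self-contained sketch. The input frame is an assumption: it is reconstructed from the output above, and the real df may have more rows or columns.

```python
import pandas as pd

# Hypothetical input frame, reconstructed from the output shown above.
df = pd.DataFrame({
    "values": [21.71502, 58.35506, 60.41639],
    "convert_me": ["Convert1", "Convert2", "Convert3"],
    "age_col": [23, 33, 44],
})

# Count occurrences of each (convert_me, age_col) pair per value.
# Only combinations that actually occur in the data become columns.
counts = pd.crosstab(index=df["values"],
                     columns=[df["convert_me"], df["age_col"]])
print(counts)
```

crosstab counts by default, so no aggregation function or fillna is needed here.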
or pivot_table http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.pivot_table.html (with len as the aggregation function, but here you have to fill the NaNs with zeros manually using fillna):
In [18]: df.pivot_table(index=['values'], columns=['age_col', 'convert_me'], aggfunc=len).fillna(0)
Out[18]:
age_col           23        33        44
convert_me  Convert1  Convert2  Convert3
values
21.71502           1         0         0
58.35506           0         1         0
60.41639           0         0         1
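One caveat: if df contains only these three columns, pivot_table has no column left over to aggregate, and what you get back may depend on your pandas version. A sketch that sidesteps this, assuming a hypothetical helper column n (not part of the original data) that gives len something to count:

```python
import pandas as pd

# Hypothetical input frame, reconstructed from the output shown above.
df = pd.DataFrame({
    "values": [21.71502, 58.35506, 60.41639],
    "convert_me": ["Convert1", "Convert2", "Convert3"],
    "age_col": [23, 33, 44],
})

table = (
    df.assign(n=1)  # hypothetical helper column, only there to be counted
      .pivot_table(index="values",
                   columns=["age_col", "convert_me"],
                   values="n",
                   aggfunc=len)
      .fillna(0)    # combinations that never occur come back as NaN
      .astype(int)  # NaN forced the counts to float; convert them back
)
print(table)
```

The astype(int) step is optional; it just makes the result match the integer counts that crosstab returns.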
See the documentation on this here: http://pandas.pydata.org/pandas-docs/stable/reshaping.html#pivot-tables-and-cross-tabulations
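Most of these functions in pandas return a multi-level (hierarchical) index, in this case for the columns. If you want to "melt" this into a single level, as in R, you can do: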
In [15]: df_cross = pd.crosstab(index=df['values'], columns=[df['convert_me'], df['age_col']])
In [16]: df_cross.columns = ["{0}_{1}".format(l1, l2) for l1, l2 in df_cross.columns]
In [17]: df_cross
Out[17]:
          Convert1_23  Convert2_33  Convert3_44
values
21.71502            1            0            0
58.35506            0            1            0
60.41639            0            0            1
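The list comprehension above unpacks exactly two levels. If you have more (or an unknown number of) column levels, a variant along these lines should work; the input frame is again a hypothetical one reconstructed from the output:

```python
import pandas as pd

# Hypothetical input frame, reconstructed from the output shown above.
df = pd.DataFrame({
    "values": [21.71502, 58.35506, 60.41639],
    "convert_me": ["Convert1", "Convert2", "Convert3"],
    "age_col": [23, 33, 44],
})

df_cross = pd.crosstab(index=df["values"],
                       columns=[df["convert_me"], df["age_col"]])

# Each column label of a MultiIndex is a tuple with one entry per level.
# Convert every entry to str and join with "_", whatever the number of levels.
df_cross.columns = ["_".join(map(str, col)) for col in df_cross.columns]
print(df_cross)
```

The str conversion matters because age_col holds integers, which "_".join would otherwise reject.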