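I am discretizing a column in a DataFrame using pandas.cut, with bins created by IntervalIndex.from_tuples.
The cut works as expected, but the categories are displayed as the intervals I specified in the IntervalIndex. Is there any way to rename the categories to different labels, e.g. (small, medium, large)?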
Example:
import pandas as pd
bins = pd.IntervalIndex.from_tuples([(0, 1), (2, 3), (4, 5)])
pd.cut([0, 0.5, 1.5, 2.5, 4.5], bins)
The resulting categories will be:
[NaN, (0, 1], NaN, (2, 3], (4, 5]]
Categories (3, interval[int64]): [(0, 1] < (2, 3] < (4, 5]]
I am trying to change [(0, 1] < (2, 3] < (4, 5]] into something like 1, 2, 3 or small, medium, large.
Unfortunately, the labels parameter of pd.cut is ignored when the bins are an IntervalIndex.
Thanks!
UPDATE:
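Thanks to @SergeyBushmanov, I noticed that this problem only exists when trying to change the category labels inside a DataFrame (which is what I am trying to do). Updated example: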
In [1]: df = pd.DataFrame([0, 0.5, 1.5, 2.5, 4.5], columns = ['col1'])
In [2]: bins = pd.IntervalIndex.from_tuples([(0, 1), (2, 3), (4, 5)])
In [3]: df['col1'] = pd.cut(df['col1'], bins)
In [4]: df['col1'].categories = ['small','med','large']
In [5]: df['col1']
Out [5]:
0 NaN
1 (0, 1]
2 NaN
3 (2, 3]
4 (4, 5]
Name: col1, dtype: category
Categories (3, interval[int64]): [(0, 1] < (2, 3] < (4, 5]]
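For comparison, here is a minimal sketch of one way the rename might be done inside the DataFrame. It assumes the Series `.cat` accessor and its `rename_categories` method, which returns a new categorical instead of modifying the column in place. The variable name `binned` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"col1": [0, 0.5, 1.5, 2.5, 4.5]})
bins = pd.IntervalIndex.from_tuples([(0, 1), (2, 3), (4, 5)])

# pd.cut returns a categorical Series whose categories are the intervals.
binned = pd.cut(df["col1"], bins)

# Rename through the .cat accessor. The new names are matched to the
# existing categories by position, and the result must be assigned back
# because rename_categories does not change the Series in place.
df["col1"] = binned.cat.rename_categories(["small", "med", "large"])

print(df["col1"])
```

Values that fall outside every interval (0 and 1.5 here) stay NaN after the rename, and the order small < med < large is carried over from the intervals.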