LabelEncoder: TypeError: '>' not supported between instances of 'float' and 'str'

2024-02-12

I am getting this error for several variables, even after handling missing values. For example:

le = preprocessing.LabelEncoder()
categorical = list(df.select_dtypes(include=['object']).columns.values)
for cat in categorical:
    print(cat)
    df[cat].fillna('UNK', inplace=True)
    df[cat] = le.fit_transform(df[cat])
#     print(le.classes_)
#     print(le.transform(le.classes_))


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-24-424a0952f9d0> in <module>()
      4     print(cat)
      5     df[cat].fillna('UNK', inplace=True)
----> 6     df[cat] = le.fit_transform(df[cat].fillna('UNK'))
      7 #     print(le.classes_)
      8 #     print(le.transform(le.classes_))

C:\Users\paula.ceccon.ribeiro\AppData\Local\Continuum\Anaconda3\lib\site-packages\sklearn\preprocessing\label.py in fit_transform(self, y)
    129         y = column_or_1d(y, warn=True)
    130         _check_numpy_unicode_bug(y)
--> 131         self.classes_, y = np.unique(y, return_inverse=True)
    132         return y
    133 

C:\Users\paula.ceccon.ribeiro\AppData\Local\Continuum\Anaconda3\lib\site-packages\numpy\lib\arraysetops.py in unique(ar, return_index, return_inverse, return_counts)
    209 
    210     if optional_indices:
--> 211         perm = ar.argsort(kind='mergesort' if return_index else 'quicksort')
    212         aux = ar[perm]
    213     else:

TypeError: '>' not supported between instances of 'float' and 'str'

Checking the variable that raises the error gives:

df['CRM do Médico'].isnull().sum()
0

Besides NaN values, what else could cause this error?

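This happens because the series df[cat] contains elements of different data types, e.g. strings and floats. LabelEncoder sorts the values to build its classes, and Python cannot compare a float with a str. The mix can come from the way the data was read (numbers read as float, text read as str), or the column was originally float and became mixed after the fillna operation inserted the string 'UNK'.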
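To confirm that a column really is mixed, it can help to count the Python types of its values. The following is a minimal sketch; the series `s` and its contents are made up for illustration and stand in for `df[cat]`.

```python
import pandas as pd

# Hypothetical column standing in for df[cat]: object dtype, but the
# values are a mix of Python str and float.
s = pd.Series(["12345", 67890.0, "UNK", 4.5], name="col")

# Map every value to the name of its Python type and count occurrences.
type_counts = s.map(lambda v: type(v).__name__).value_counts()

print(s.dtype)
print(type_counts)
```

If more than one type name shows up in the counts, the column is mixed and sorting it will fail.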

In other words:

The pandas dtype 'object' indicates mixed types rather than the str type.

So using the following line:

df[cat] = le.fit_transform(df[cat].astype(str))

should help.
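As a rough end-to-end illustration, the sketch below reproduces the problem on a small made-up DataFrame (the column name `code` and its values are assumptions, not the asker's data) and then applies the `astype(str)` fix. The exact wording of the TypeError depends on the scikit-learn and NumPy versions installed.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical data: one object column mixing strings, floats and a missing value.
df = pd.DataFrame({"code": ["A1", 2.0, None, "B7", 2.0]})

# After fillna there are no missing values, but str and float are still mixed.
col = df["code"].fillna("UNK")

le = LabelEncoder()
try:
    le.fit_transform(col)
except TypeError as exc:
    # Mixed str/float values cannot be sorted, so the encoder fails here.
    print("mixed column failed:", exc)

# Casting every value to str gives the encoder a uniform type to sort.
encoded = le.fit_transform(col.astype(str))
print(list(le.classes_))
print(encoded.tolist())
```

Note that the cast turns the float 2.0 into the string "2.0", so it would be treated as a different class from a value stored as "2". If that matters, the values may need to be normalized before encoding.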


