Highlight dataframe cells based on multiple conditions in Python

2023-12-02

Given a small dataset as follows:

   id    room   area           situation
0   1   A-102  world  under construction
1   2     NaN     24  under construction
2   3    B309    NaN                 NaN
3   4   C·102     25    under decoration
4   5  E_1089  hello    under decoration
5   6      27    NaN          under plan
6   7      27    NaN                 NaN
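
For anyone who wants to reproduce this, the frame above could be built roughly as follows. This is only a sketch: it assumes every non-missing value in room, area and situation is stored as a string (so '27' and '24' are text, not integers), which is consistent with the output shown further down but is not stated explicitly.

```python
import numpy as np
import pandas as pd

# Assumption: all non-missing values are strings, so the .str accessor
# treats '27' and '24' as text rather than as integers.
df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5, 6, 7],
    'room': ['A-102', np.nan, 'B309', 'C·102', 'E_1089', '27', '27'],
    'area': ['world', '24', np.nan, '25', 'hello', np.nan, np.nan],
    'situation': ['under construction', 'under construction', np.nan,
                  'under decoration', 'under decoration', 'under plan',
                  np.nan],
})
print(df)
```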

Thanks to @jezrael's code at this link, I was able to get the result I need:

import numpy as np
import pandas as pd

# Each array holds an error message where the check fails, otherwise None.
a = np.where(df.room.str.match(r'^[a-zA-Z\d\-]*$', na=False), None,
             'incorrect room name')
b = np.where(df.area.str.contains(r'^\d+$', na=True), None,
             'area is not a numbers')
c = np.where(df.situation.str.contains('under decoration', na=False),
             'decoration is in the content', None)

f = (lambda x: '; '.join(y for y in x if pd.notna(y)) 
                if any(pd.notna(np.array(x))) else np.nan )
df['check'] = [f(x) for x in zip(a,b,c)]
print(df)

   id    room   area           situation  \
0   1   A-102  world  under construction   
1   2     NaN     24  under construction   
2   3    B309    NaN                 NaN   
3   4   C·102     25    under decoration   
4   5  E_1089  hello    under decoration   
5   6      27    NaN          under plan   
6   7      27    NaN                 NaN   

                                               check  
0                              area is not a numbers  
1                                incorrect room name  
2                                                NaN  
3   incorrect room name;decoration is in the content  
4  incorrect room name;area is not a numbers;deco...  
5                                                NaN  
6                                                NaN  

But now I'd like to go a step further and highlight the problematic cells in the room, area and situation columns, then save the dataframe as an Excel file.

How can I do that in Pandas (preferred) or with another Python package?

Thanks in advance.


The idea is to create a custom function that returns a DataFrame of styles, reusing the boolean masks m1, m2, m3:

# Boolean masks: m1/m2 are True for valid values, m3 is True for a match.
m1 = df.room.str.match(r'^[a-zA-Z\d\-]*$', na=False)
m2 = df.area.str.contains(r'^\d+$', na=True)
m3 = df.situation.str.contains('under decoration', na=False)
a = np.where(m1, None, 'incorrect room name')
b = np.where(m2, None, 'area is not a numbers')
c = np.where(m3, 'decoration is in the content', None)


f = (lambda x: '; '.join(y for y in x if pd.notna(y)) 
                if any(pd.notna(np.array(x))) else np.nan )
df['check'] = [f(x) for x in zip(a, b, c)]
print(df)


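def highlight(x):
    # CSS applied to every cell that should be flagged
    c1 = 'background-color: yellow'

    # Start from an all-empty style frame shaped like the data
    df1 = pd.DataFrame('', index=x.index, columns=x.columns)
    # room and area are highlighted where the mask is False (invalid value)
    df1['room'] = np.where(m1, '', c1)
    df1['area'] = np.where(m2, '', c1)
    # situation is highlighted where the mask is True
    df1['situation'] = np.where(m3, c1, '')
    return df1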

df.style.apply(highlight, axis = None).to_excel('test.xlsx', index = False)
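
If you would rather not depend on the global masks, one possible variant is to compute them inside the styling function from the frame it receives. The following is a self-contained sketch of that variant, not the code above: `build_masks` and `export` are hypothetical helper names introduced here, and the sample frame assumes all non-missing values are strings.

```python
import numpy as np
import pandas as pd


def build_masks(df):
    """Return the three boolean masks used by the checks."""
    m1 = df['room'].str.match(r'^[a-zA-Z\d\-]*$', na=False)  # True = valid room
    m2 = df['area'].str.contains(r'^\d+$', na=True)          # True = valid area
    m3 = df['situation'].str.contains('under decoration', na=False)
    return m1, m2, m3


def highlight(x):
    """Return a frame of CSS strings with the same index/columns as x."""
    c1 = 'background-color: yellow'
    m1, m2, m3 = build_masks(x)
    styles = pd.DataFrame('', index=x.index, columns=x.columns)
    styles['room'] = np.where(m1, '', c1)       # flag invalid rooms
    styles['area'] = np.where(m2, '', c1)       # flag invalid areas
    styles['situation'] = np.where(m3, c1, '')  # flag matching situations
    return styles


def export(df, path='test.xlsx'):
    """Write the styled frame to Excel (hypothetical helper)."""
    # Needs jinja2 for Styler and an Excel engine such as openpyxl.
    df.style.apply(highlight, axis=None).to_excel(path, index=False)


# Sample data; assumption: non-missing values are stored as strings.
df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5, 6, 7],
    'room': ['A-102', np.nan, 'B309', 'C·102', 'E_1089', '27', '27'],
    'area': ['world', '24', np.nan, '25', 'hello', np.nan, np.nan],
    'situation': ['under construction', 'under construction', np.nan,
                  'under decoration', 'under decoration', 'under plan',
                  np.nan],
})

# Inspect the style frame directly, without writing any file.
styles = highlight(df)
print(styles)
```

Calling `export(df)` would then write the highlighted sheet. Printing the style frame first is a cheap way to check which cells will be coloured before any Excel file is produced.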