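This is a follow-up to my previous question: Finding the index of rows based on a sequence of values in a column of pandas DataFrame https://stackoverflow.com/questions/61735585/finding-the-index-of-rows-based-on-a-sequence-of-values-in-a-column-of-pandas-da
I want to get a list of tuples, each holding the index of a 'very bad' row followed by the index of the first 'bad' row that comes after it: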
import random
import pandas as pd
df = pd.DataFrame({
    'measure': [random.randint(0,10) for _ in range(0,20)],
})
# > 4 is 'good', < 2 is 'very bad', everything else is 'bad'
df['status'] = df.apply(
    lambda x: 'good' if x['measure'] > 4 else 'very bad' if x['measure'] < 2 else 'bad',
    axis=1)
Here is the DataFrame:
    measure    status
0         8      good
1         8      good
2         0  very bad
3         5      good
4         2       bad
5         3       bad
6         9      good
7         9      good
8        10      good
9         5      good
10        1  very bad
11        7      good
12        7      good
13        6      good
14        5      good
15       10      good
16        3       bad
17        0  very bad
18        3       bad
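Because the snippet above draws unseeded random numbers, it will not reproduce this exact frame. A minimal sketch for rebuilding the 19 rows shown, assuming the measure values are copied from the table and using `np.select` (not part of the original question) in place of the row-wise `apply`:

```python
import numpy as np
import pandas as pd

# Measure values copied from the table above (rows 0 through 18).
measures = [8, 8, 0, 5, 2, 3, 9, 9, 10, 5, 1, 7, 7, 6, 5, 10, 3, 0, 3]
df = pd.DataFrame({'measure': measures})

# Same thresholds as the lambda: > 4 is 'good', < 2 is 'very bad', else 'bad'.
df['status'] = np.select(
    [df['measure'] > 4, df['measure'] < 2],
    ['good', 'very bad'],
    default='bad',
)
print(df)
```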
How can I get tuples of these combinations, like this?
[(2,4), (10,16), (17,18)]
IIUC, you can try:
# keep only the rows whose status is 'bad' or 'very bad'
m = df[df['status'].isin(['bad','very bad'])]
# check if the current row is 'very bad' and the next row is 'bad'
c = m['status'].eq('very bad') & m['status'].shift(-1).eq('bad')
# also flag the row right after each match, then keep only the index values
idx = m[c|c.shift(fill_value=False)].index
# pair up every 2 items into a tuple
res = [*zip(idx[::2],idx[1::2])]
[(2, 4), (10, 16), (17, 18)]
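The same steps can be wrapped in a small self-contained function. This is only a sketch: the function name `very_bad_then_bad_pairs` is hypothetical, and the status values are copied from the table in the question so the result can be checked without random data.

```python
import pandas as pd

def very_bad_then_bad_pairs(status):
    """Return (very_bad_index, next_bad_index) tuples for a status Series."""
    # Drop 'good' rows so that 'very bad' and 'bad' rows become neighbours.
    m = status[status.isin(['bad', 'very bad'])]
    # True where a 'very bad' row is directly followed by a 'bad' row.
    starts = m.eq('very bad') & m.shift(-1).eq('bad')
    # The matching 'bad' row is the one right after each start.
    ends = starts.shift(fill_value=False)
    first = m.index[starts.to_numpy()].tolist()
    second = m.index[ends.to_numpy()].tolist()
    return list(zip(first, second))

status = pd.Series([
    'good', 'good', 'very bad', 'good', 'bad', 'bad', 'good', 'good',
    'good', 'good', 'very bad', 'good', 'good', 'good', 'good', 'good',
    'bad', 'very bad', 'bad',
])
print(very_bad_then_bad_pairs(status))
```

Building the start and end masks separately avoids relying on the even/odd slicing of a combined index, and a 'very bad' row that is followed by another 'very bad' row is simply skipped.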