Splitting one column into multiple columns by multiple delimiters in Pandas

2023-12-07

Given a DataFrame like the following:

                                     player     score
0   Sergio Agüero Forward — Manchester City    209.98
1            Eden Hazard Midfield — Chelsea    274.04
2          Alexis Sánchez Forward — Arsenal    223.86
3     Yaya Touré Midfield — Manchester City    197.91
4  Angel María Midfield — Manchester United    132.23

How can I split player into three new columns: name, position and team?

                                     player     score   name    position  team
0   Sergio Agüero Forward — Manchester City    209.98   Sergio  Forward   Manchester City
1            Eden Hazard Midfield — Chelsea    274.04   Eden    Midfield  Chelsea
2          Alexis Sánchez Forward — Arsenal    223.86   Alexis  Forward   Arsenal
3     Yaya Touré Midfield — Manchester City    197.91   Yaya    Midfield  Manchester City
4  Angel María Midfield — Manchester United    132.23   Angel   Midfield  Manchester United

I have considered first splitting it into two columns with df[['name_position', 'team']] = df['player'].str.split(pat=' — ', expand=True), and then splitting name_position into name and position. Is there a better solution?
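For reference, the two-step approach described above might look roughly like the sketch below. It assumes the position is always the last word before the " — " delimiter and that name should hold only the first word, as in the desired output; the intermediate column name_position is just a scratch name and is dropped at the end.

```python
import pandas as pd

df = pd.DataFrame({
    "player": [
        "Sergio Agüero Forward — Manchester City",
        "Eden Hazard Midfield — Chelsea",
        "Alexis Sánchez Forward — Arsenal",
        "Yaya Touré Midfield — Manchester City",
        "Angel María Midfield — Manchester United",
    ],
    "score": [209.98, 274.04, 223.86, 197.91, 132.23],
})

# Step 1: split on the " — " delimiter into "name + position" and team.
df[["name_position", "team"]] = df["player"].str.split(pat=" — ", expand=True)

# Step 2: split off the last word (the position) from the right.
df[["name", "position"]] = df["name_position"].str.rsplit(pat=" ", n=1, expand=True)

# Keep only the first word of the name, matching the desired output
# (assumption: the surname is not wanted in this column).
df["name"] = df["name"].str.split().str[0]

# Drop the scratch column and put the new columns in the requested order.
df = df.drop(columns="name_position")
df = df[["player", "score", "name", "position", "team"]]
print(df)
```

This works, but it needs three string operations and a temporary column, which is why a single-pass alternative is attractive.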

Many thanks.


If you want to do it in one step, you can use str.extract with named groups:

print(df["player"].str.extract(r"(?P<name>.*?)\s.*?\s(?P<position>[A-Za-z]+)\s—\s(?P<team>.*)"))

     name  position               team
0  Sergio   Forward    Manchester City
1    Eden  Midfield            Chelsea
2  Alexis   Forward            Arsenal
3    Yaya  Midfield    Manchester City
4   Angel  Midfield  Manchester United
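
To attach the extracted columns to the original frame, as the question asks, one option is to join the result back on the index. The sketch below rebuilds the sample data and reuses the pattern above. The second pattern (full_pattern) is a hypothetical variant, not part of the original answer, that keeps the full name instead of only the first word.

```python
import pandas as pd

df = pd.DataFrame({
    "player": [
        "Sergio Agüero Forward — Manchester City",
        "Eden Hazard Midfield — Chelsea",
        "Alexis Sánchez Forward — Arsenal",
        "Yaya Touré Midfield — Manchester City",
        "Angel María Midfield — Manchester United",
    ],
    "score": [209.98, 274.04, 223.86, 197.91, 132.23],
})

# Same pattern as above: the named groups become the new column names.
pattern = r"(?P<name>.*?)\s.*?\s(?P<position>[A-Za-z]+)\s—\s(?P<team>.*)"
parts = df["player"].str.extract(pattern)

# str.extract keeps the original index, so a plain join lines the rows up.
result = df.join(parts)
print(result)

# Hypothetical variant: a greedy name group captures everything before the
# last word that precedes " — ", i.e. the full name.
full_pattern = r"(?P<name>.+)\s(?P<position>\S+)\s—\s(?P<team>.+)"
full = df["player"].str.extract(full_pattern)
print(full)
```

Rows that do not match the pattern come back as NaN in every extracted column, so it may be worth checking for missing values after the join if the real data is less regular than this sample.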