给定一个数据框如下:
player score
0 Sergio Agüero Forward — Manchester City 209.98
1 Eden Hazard Midfield — Chelsea 274.04
2 Alexis Sánchez Forward — Arsenal 223.86
3 Yaya Touré Midfield — Manchester City 197.91
4 Angel María Midfield — Manchester United 132.23
怎么会分裂player
分成三个新列name
, position
and team
?
player score name position team
0 Sergio Agüero Forward — Manchester City 209.98 Sergio Forward Manchester City
1 Eden Hazard Midfield — Chelsea 274.04 Eden Midfield Chelsea
2 Alexis Sánchez Forward — Arsenal 223.86 Alexis Forward Arsenal
3 Yaya Touré Midfield — Manchester City 197.91 Yaya Midfield Manchester City
4 Angel María Midfield — Manchester United 132.23 Angel Midfield Manchester United
我考虑过将其分成两列df[['name_position', 'team']] = df['player'].str.split(pat= ' — ', expand=True)
,然后分割name_position
to name
and position
。但还有更好的解决方案吗?
非常感谢。
您可以使用str.extract
如果您想一次性完成,也可以:
print(df["player"].str.extract(r"(?P<name>.*?)\s.*?\s(?P<position>[A-Za-z]+)\s—\s(?P<team>.*)"))
name position team
0 Sergio Forward Manchester City
1 Eden Midfield Chelsea
2 Alexis Forward Arsenal
3 Yaya Midfield Manchester City
4 Angel Midfield Manchester United
本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系:hwhale#tublm.com(使用前将#替换为@)