Use the csv module (https://docs.python.org/3/library/csv.html) to split your data into columns. Use a csv.DictReader()
object to make it easier to select columns by their header:
import csv
source = r'C:\Users\jk\Desktop\helloworld.csv'
dest = 'test.csv'
with open(source, newline='') as inf, open(dest, 'w', newline='') as outf:
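    reader = csv.DictReader(inf)
    writer = csv.DictWriter(outf, fieldnames=reader.fieldnames)
    for row in reader:
        # Split the Keyword column on whitespace
        words = row['Keyword'].split()
        # The first word stays on the original row with the other columns
        row['Keyword'] = words[0]
        writer.writerow(row)
        # Every remaining word gets its own row; other columns are left empty
        writer.writerows({'Keyword': w} for w in words[1:])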
The DictReader() reads the first line of the file and uses it as the keys for the dictionary it produces for each row, so a row looks like:
{'Keyword': 'Lions Tigers Bears', 'Source': 'US', 'Number': '3'}
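Now you can address each column individually, and update the dictionary with just the first word of the Keyword
column before producing additional rows for the remaining words.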
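To look at that per-row step in isolation, here is a minimal sketch. The helper name expand_keyword is hypothetical (it is not part of the csv module), and passing a row with an empty Keyword through unchanged is my assumption; the loop above would raise an IndexError on words[0] in that case.

```python
def expand_keyword(row, column='Keyword'):
    """Yield one dict per word in row[column] (hypothetical helper)."""
    words = row[column].split()
    if not words:
        # Assumption: rows with an empty keyword are passed through unchanged.
        yield row
        return
    # The first word keeps the values of all the other columns.
    yield {**row, column: words[0]}
    # Each remaining word gets a row of its own. When these are passed to
    # DictWriter, the missing columns are filled with its restval,
    # which is an empty string by default.
    for word in words[1:]:
        yield {column: word}


row = {'Keyword': 'Lions Tigers Bears', 'Source': 'US', 'Number': '3'}
rows = list(expand_keyword(row))
for r in rows:
    print(r)
```

Each dictionary yielded here could be handed to writer.writerow() as-is.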
I've assumed that your file is comma-separated. If you need a different delimiter, set the delimiter
argument to that character:
reader = csv.DictReader(inf, delimiter='\t')
for a tab-separated format. See the module documentation for the various options, including what are called dialects.
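As a rough illustration of the dialect option, the sketch below reads tab-separated data with the 'excel-tab' dialect that ships with the csv module, instead of passing delimiter directly. The sample data is made up for the example.

```python
import csv
from io import StringIO

# Made-up tab-separated sample data.
sample = StringIO('Keyword\tSource\tNumber\nLions Tigers Bears\tUS\t3\n')

# 'excel-tab' is one of the dialects the csv module registers by default.
reader = csv.DictReader(sample, dialect='excel-tab')
rows = list(reader)
print(rows[0]['Keyword'])

# List the dialect names currently registered.
print(csv.list_dialects())
```

For a one-off file, delimiter='\t' is probably simpler; a dialect is mostly useful when several formatting options need to travel together.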
Demo:
>>> import sys
>>> import csv
>>> from io import StringIO
>>> sample = StringIO('''\
... Keyword,Source,Number
... Lions Tigers Bears,US,3
... Dogs Zebra,Canada,5
... Sharks Guppies,US,2
... ''')
>>> output = StringIO()
>>> reader = csv.DictReader(sample)
>>> writer = csv.DictWriter(output, fieldnames=reader.fieldnames)
>>> for row in reader:
...     words = row['Keyword'].split()
...     row['Keyword'] = words[0]
...     writer.writerow(row)
...     writer.writerows({'Keyword': w} for w in words[1:])
...
12
15
13
>>> print(output.getvalue())
Lions,US,3
Tigers,,
Bears,,
Dogs,Canada,5
Zebra,,
Sharks,US,2
Guppies,,
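
Note that the demo output has no header row, because the code never writes one. If you want to keep the header, one option (a sketch, assuming the output should use the same column names as the input) is to call writer.writeheader() before the loop:

```python
import csv
from io import StringIO

# Small in-memory sample standing in for the real input file.
sample = StringIO('Keyword,Source,Number\nDogs Zebra,Canada,5\n')
output = StringIO()

reader = csv.DictReader(sample)
writer = csv.DictWriter(output, fieldnames=reader.fieldnames)
writer.writeheader()  # write the column names as the first row
for row in reader:
    words = row['Keyword'].split()
    row['Keyword'] = words[0]
    writer.writerow(row)
    writer.writerows({'Keyword': w} for w in words[1:])

result = output.getvalue()
print(result)
```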