for row in data:
    row = row.strip().split(';')
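Calling split(';') on a row (you should really name it line) always gives a non-empty list, even for an empty line and even after strip() has been applied: ''.split(';') gives [''].
So the test if row: that follows in your code is useless.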
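A quick check of this behaviour, as a minimal sketch (the variable names are only illustrative):

```python
# An empty line, as it would come out of a file.
empty_line = "\n"

# strip() leaves '', and ''.split(';') still returns a one-element list.
row = empty_line.strip().split(';')
print(row)        # ['']

# A list containing one empty string is truthy, so `if row:` never filters anything.
print(bool(row))  # True
```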
This means that your code is equivalent to:
for row in data:
    row = row.strip().split(';')
    for subrow in row:
        subrow = subrow.split()
        if subrow:
            out.writerow(subrow)
and then to:
for row in data:
    for subrow in row.strip().split(';'):
        subrow = subrow.split()
        if subrow:
            out.writerow(subrow)
.
Moreover, calling split() on each subrow of the list row.strip().split(';') already removes all the whitespace before and after every word in subrow. So the initial strip() in row.strip().split(';') is useless too.
Your code is therefore equivalent to:
for row in data:
    for subrow in row.split(';'):
        subrow = subrow.split()
        if subrow:
            out.writerow(subrow)
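That dropping strip() changes nothing can be checked on a padded line. This is only a sketch: the two helper functions and the sample string are made up for the comparison and are not part of your code.

```python
def rows_with_strip(line):
    # Variant that strips the line before splitting on ';'.
    return [sub.split() for sub in line.strip().split(';') if sub.split()]

def rows_without_strip(line):
    # Variant that relies on the argument-less split() to discard the padding.
    return [sub.split() for sub in line.split(';') if sub.split()]

sample = "  Pear;Chestnut;   Lemon Lime, Grapefruit  \n"

# Both variants produce exactly the same rows.
same = rows_with_strip(sample) == rows_without_strip(sample)
print(same)  # True
```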
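Now, subrow.split() can produce an empty list when subrow contains only whitespace, because split() without an argument follows a special algorithm: runs of whitespace count as a single separator, and leading and trailing whitespace is ignored. So the test if subrow is useful.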
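The two forms of split can be compared directly. A small sketch, with illustrative variable names:

```python
blank = "   "

# Without an argument, a whitespace-only string yields an empty list ...
no_arg = blank.split()
print(no_arg)     # []

# ... whereas with an explicit separator the result is never empty.
with_sep = blank.split(';')
print(with_sep)   # ['   ']

# Padding around and between the words is discarded as well.
padded = "  Lemon   Lime  ".split()
print(padded)     # ['Lemon', 'Lime']
```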
.
What your code actually does, after reading the content of a file like this one:
Blackcurrant, Redcurrant ; Orange ; Blueberry
Pear;Chestnut; Lemon Lime, Grapefruit
Apple;Apricot ; Pineapple, Fig; Mulberry, Hedge Apple
is to write another file like this:
Blackcurrant
Redcurrant
Orange
Blueberry
Pear
Chestnut
Lemon Lime
Grapefruit
Apple
Apricot
Pineapple
Fig
Mulberry
Hedge
Apple
I prefer the following code to do this:
import csv

filename = raw_input("Enter name of file to be written row wise:") + '.txt'
filepath = 'I:\\' + filename
with open(filepath) as handler, open("myfile.csv", "wb") as outfile:
    out = csv.writer(outfile)
    for row in handler:
        # one list of words per ';'-separated chunk of the line
        gen = (subrow.split() for subrow in row.split(';'))
        # keep the non-empty lists; each x is a list, so csv writes its str() form as one cell
        out.writerow([x for x in gen if x])
    del out
.
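This code will always run, even on an extremely large file whose content does not fit in memory, because the lines of the file are read one at a time.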
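For readers on Python 3, the same line-by-line idea could be sketched as below. This is an assumption-laden illustration, not the code above: it writes one CSV row per ';'-separated chunk, like the loop shown earlier, and the function name write_subrows and the in-memory io.StringIO objects standing in for real files are hypothetical.

```python
import csv
import io

def write_subrows(lines, outfile):
    """Write one CSV row per non-empty ';'-separated chunk of each line."""
    out = csv.writer(outfile)
    for line in lines:                 # lines are consumed one at a time
        for chunk in line.split(';'):
            words = chunk.split()
            if words:                  # skip chunks that hold only whitespace
                out.writerow(words)

# In-memory stand-ins for the input and output files.
source = io.StringIO("Pear;Chestnut; Lemon Lime\n\nApple ; Fig\n")
target = io.StringIO()
write_subrows(source, target)
result = target.getvalue()
```

With real files in Python 3, the csv documentation recommends opening them with newline='' rather than in binary mode.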
If the file is not that big, you can proceed as you did, using readlines():
with open(filepath) as handler:
    data = handler.readlines()
with open("myfile.csv", "wb") as outfile:
    out = csv.writer(outfile)
    for row in data:
        gen = (subrow.split() for subrow in row.split(';'))
        out.writerow([x for x in gen if x])
    del out
But there is no particular benefit in doing so; you can just as well write for row in handler.
.
Personally, I think it would be better to use writerows():
import csv

filename = raw_input("Enter name of file to be written row wise:") + '.txt'
filepath = 'I:\\' + filename
with open(filepath) as handler, open("myfile.csv", "wb") as outfile:
    out = csv.writer(outfile)
    # flatten: one list of words per ';'-separated chunk, over all lines of the file
    gen = (x for row in handler for x in (subrow.split() for subrow in row.split(';')))
    # write every non-empty list of words as its own CSV row
    out.writerows([x for x in gen if x])
    del out
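Since writerows() accepts any iterable of rows, the intermediate list is not strictly needed: a generator can be passed directly, so the rows are produced lazily. A minimal Python 3 sketch under that assumption (iter_rows and the in-memory buffer are illustrative names, not part of the code above):

```python
import csv
import io

def iter_rows(lines):
    """Yield one list of words per non-empty ';'-separated chunk."""
    for line in lines:
        for chunk in line.split(';'):
            words = chunk.split()
            if words:
                yield words

buf = io.StringIO()
# The generator is consumed row by row; no list of all rows is ever built.
csv.writer(buf).writerows(iter_rows(["Apple;Apricot ; Hedge Apple\n", " ; \n"]))
written = buf.getvalue()
```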
.
I will end this answer by pointing out that code using a regular expression would be more efficient:
import csv, re

regx = re.compile('[ ;\r\n]+')
filename = raw_input("Enter name of file to be written row wise:") + '.txt'
filepath = 'I:\\' + filename
with open(filepath) as handler, open("myfile.txt", "w") as outfile:
    outfile.write('\n'.join(x for x in regx.split(handler.read()) if x))
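What this pattern does to a piece of text can be seen on a short sample. This is only a sketch with a made-up sample string. Note that the pattern splits on spaces as well as on ';' and line ends, so every word ends up as a separate token, and that commas stay attached to the word before them.

```python
import re

regx = re.compile('[ ;\r\n]+')
text = "Pear;Chestnut; Lemon Lime, Grapefruit\nApple;Apricot ; Fig\n"

# Split on runs of spaces, semicolons and line ends, dropping empty pieces.
tokens = [x for x in regx.split(text) if x]
print(tokens)
# ['Pear', 'Chestnut', 'Lemon', 'Lime,', 'Grapefruit', 'Apple', 'Apricot', 'Fig']
```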
Edit 1
handler = open(filepath)
outfile = open("myfile.txt", "wb")
out = csv.writer(outfile)
for row in handler:
    gen = (subrow.split() for subrow in row.split(';'))
    out.writerow([x for x in gen if x])
del out
outfile.close()
handler.close()
or
import csv, re
regx = re.compile('[ ;\r\n]+')
filename = raw_input("Enter name of file to be written row wise:") + '.txt'
filepath = 'I:\\' + filename
handler = open(filepath)
outfile = open("myfile.txt","w")
outfile.write('\n'.join(x for x in regx.split(handler.read()) if x))
outfile.close()
handler.close()