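I think going line by line, one word at a time, is the better approach, because you don't have to bother with separators or stripping.
Given a file like this: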
king
sing
ping
cling
booked
looked
cooked
packed
code like this, which uses re.sub (https://docs.python.org/3/library/re.html#re.sub) to replace the patterns:
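import re

# Read abcd.txt line by line and write the rewritten words to new_abcd.txt.
with open("new_abcd.txt", "w") as new, open("abcd.txt") as original:
    for word in original:
        # Each line keeps its trailing newline; "$" still matches just before it.
        new_word = re.sub("ing$", "xyz", word)
        new_word = re.sub("ed$", "abcd", new_word)
        new.write(new_word)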
produces this output file:
kxyz
sxyz
pxyz
clxyz
bookabcd
lookabcd
cookabcd
packabcd
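As for why no stripping is needed: each line read from the file still ends with "\n", and in Python's re module "$" matches at the end of the string and also just before a trailing newline. Here is a small check of that behaviour, using an in-memory string instead of a file (the variable names are only illustrative):

```python
import re

# A line as it would come out of the file, trailing newline included.
line = "king\n"

# "$" matches just before the final newline, so the suffix is replaced
# and the newline itself is left in place.
result = re.sub("ing$", "xyz", line)
print(repr(result))  # 'kxyz\n'
```

Because the newline survives the substitution, the result can be passed to write() as is.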
I tried it with the diacritic you gave us, and it seems to work fine:
>>> print(re.sub("ा$", "ing", "का"))
कing
Edit: added multiple replacements. You can put the replacements in a list and iterate over it, calling re.sub for each one, as follows.
import re

# Each tuple is (pattern, replacement string).
replacements = [("ing$", "xyz"), ("ed$", "abcd")]
with open("new_abcd.txt", "w") as new, open("abcd.txt") as original:
    for word in original:
        new_word = word
        for pattern, replacement in replacements:
            new_word = re.sub(pattern, replacement, word)
            # Stop at the first replacement that actually changes the word.
            if new_word != word:
                break
        new.write(new_word)
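This limits each word to a single modification: only the first replacement in the list that actually changes the word is applied.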
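If you want to try the "first matching replacement wins" rule without touching any files, the same loop can be pulled out into a small function. This is only a sketch: the name replace_first_match is made up for illustration, and the pattern list is the same one as above.

```python
import re

# Same (pattern, replacement) pairs as in the list above.
REPLACEMENTS = [("ing$", "xyz"), ("ed$", "abcd")]

def replace_first_match(word, replacements=REPLACEMENTS):
    """Apply only the first replacement that changes the word (hypothetical helper)."""
    for pattern, replacement in replacements:
        new_word = re.sub(pattern, replacement, word)
        if new_word != word:
            return new_word
    # No pattern matched, so the word is returned unchanged.
    return word

words = ["king\n", "booked\n", "cat\n"]
print([replace_first_match(w) for w in words])
# ['kxyz\n', 'bookabcd\n', 'cat\n']
```

Words that match none of the patterns, like "cat" here, are written out unchanged.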