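Apologies again for the newbie question. I am trying the code below to read multiple search strings from a keywords file, look for them in f, and print each matching line.
It works if I have only one keyword, but not if I have more than one.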
keywords = input("Please Enter keywords path as c:/example/ \n :")
keys = open((keywords), "r").readline()
with open("c:/saad/saad.txt") as f:
    for line in f:
        if (keys) in line:
            print(line)
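One of the challenges of searching for keywords is defining what a keyword is and how the file contents should be parsed to find the full set of keywords. If "aa" is a keyword, should it match "aaa" or "aa()"? Can a keyword contain digits?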
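To make that ambiguity concrete, here is a minimal sketch comparing two possible definitions of a match. The helper names `substring_match` and `whole_word_match` are made up for illustration, and treating a "word" as a run of `\w` characters is an assumption, not the only reasonable choice.

```python
import re

def substring_match(key, text):
    """True if key appears anywhere in text, even inside a longer word."""
    return key in text

def whole_word_match(key, text):
    """True only if key equals a complete run of word characters in text."""
    return key in re.findall(r'\w+', text)

# Print both verdicts for a few sample lines
for text in ("aa is good", "aaa is not good", "call aa() here"):
    print(repr(text), substring_match("aa", text), whole_word_match("aa", text))
```

Under these definitions "aaa" counts as a substring match but not as a whole-word match, while "aa()" counts as both, because the parentheses are not word characters.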
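A simple solution is to say that keywords are alphabetic only and must exactly match a contiguous run of alphabetic characters, ignoring case. In addition, matching is done line by line, not sentence by sentence. We can use a regular expression to find the alphabetic sequences and sets to check for containment, like so: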
keys.txt
aa bb
test.txt
aa is good
AA is good
bb is good
cc is not good
aaa is not good
test.py
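import re

keyfile = "keys.txt"
testfile = "test.txt"

# Keywords: every run of word characters on the first line of the key file, lowercased.
# Note that \w also matches digits and underscores; use [A-Za-z]+ for letters only.
with open(keyfile) as kf:
    keys = set(key.lower() for key in re.findall(r'\w+', kf.readline()))

with open(testfile) as f:
    for line in f:
        # Split the line into lowercased words so the comparison ignores case
        words = set(word.lower() for word in re.findall(r'\w+', line))
        if keys & words:  # non-empty intersection: the line contains a keyword
            print(line, end='')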
Result:
aa is good
AA is good
bb is good
Add more rules for what counts as a match and things get more complicated.
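As one illustration of how extra rules change the approach, suppose keywords may also be multi-word phrases. A set of single words can no longer express that, so a rough sketch might build one regular expression instead. The function name `build_matcher` and the sample keywords are hypothetical, and the `\b` anchors only behave as intended when every keyword is non-empty and starts and ends with a word character.

```python
import re

def build_matcher(keywords):
    """Compile one case-insensitive pattern matching any keyword as a whole word or phrase."""
    # Longest first, so a longer keyword is tried before a shorter one it contains
    ordered = sorted(keywords, key=len, reverse=True)
    alternatives = "|".join(re.escape(k) for k in ordered)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)

matcher = build_matcher(["aa", "not good"])

lines = ["aa is good", "AA is good", "aaa is good", "cc is not good", "cc is notgood"]
matching = [line for line in lines if matcher.search(line)]
print(matching)
```

Here `re.escape` keeps any punctuation in a keyword from being interpreted as regex syntax, and `re.IGNORECASE` replaces the explicit `lower()` calls used in test.py.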
EDIT
Suppose you have one keyword per line and you only want a substring match (that is, "aa" matches "aaa") rather than a keyword search. Then you could do
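keyfile = "keys.txt"
testfile = "test.txt"

# One keyword per line; strip whitespace and skip blank lines
with open(keyfile) as kf:
    keys = [key for key in (line.strip() for line in kf) if key]

with open(testfile) as f:
    for line in f:
        for key in keys:
            if key in line:
                print(line, end='')
                break  # print each line once, even if several keys match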
But I am only guessing at what your criteria are.
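For what it is worth, the code in the question breaks down with several keywords because `readline()` returns only the first line of the keywords file, trailing newline included, and `keys in line` then looks for that entire string as a single substring. A minimal sketch of the substring variant, written with `any()` and with in-memory lists standing in for the two files, could look like this. The names `load_keys` and `matching_lines` are made up for illustration.

```python
def load_keys(lines):
    """Return the non-empty, stripped keywords from an iterable of lines."""
    return [key for key in (line.strip() for line in lines) if key]

def matching_lines(keys, lines):
    """Yield each line that contains at least one keyword as a substring."""
    for line in lines:
        if any(key in line for key in keys):
            yield line

# Lists of strings stand in for the keyword file and the text file
keys = load_keys(["aa\n", "\n", "bb\n"])
text = ["aa is good\n", "bb is good\n", "cc is not good\n", "aaa is not good\n"]

for line in matching_lines(keys, text):
    print(line, end='')
```

Because an open file is itself an iterable of lines, both helpers can be given file objects in place of the lists.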