I have a set of strings, and I want a regex that collects everything that starts with http:
href="http://www.test.com/cat/1-one_piece_episodes/"href="http://www.test.com/cat/2-movies_english_subbed/"href="http://www.test .com/cat/3-english_dubbed/"href="http://www.exclude.com"
This is my regex pattern:
href="(.*?)[^#]"
and it returns this:
href="http://www.test.com/cat/1-one_piece_episodes/"
href="http://www.test.com/cat/2-movies_english_subbed/"
href="http://www.xxxx.com/cat/3-english_dubbed/"
href="http://www.exclude.com"
What pattern would exclude the last match, or more generally exclude any match whose domain contains exclude, for example href="http://www.exclude.com"?
EDIT: for multiple exclusions:
href="((?:(?!"|\bexclude\b|\bxxxx\b).)*)[^#]"
@ridgerunner and I would change the regex to:
href="((?:(?!\bexclude\b)[^"])*)[^#]"
It matches all href attributes as long as they don't end in # and don't contain the word exclude.
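As a quick check, the pattern can be run against the sample string from the question. This is only a minimal Python sketch: the variable names are illustrative, the third URL uses the xxxx domain shown in the question's output, and the trailing href="#" entry is an extra added here to exercise the [^#] part.

```python
import re

# Sample input adapted from the question; the last entry (href="#") is an
# assumption added for illustration only.
html = (
    'href="http://www.test.com/cat/1-one_piece_episodes/"'
    'href="http://www.test.com/cat/2-movies_english_subbed/"'
    'href="http://www.xxxx.com/cat/3-english_dubbed/"'
    'href="http://www.exclude.com"'
    'href="#"'
)

pattern = re.compile(r'href="((?:(?!\bexclude\b)[^"])*)[^#]"')

# group(0) is the whole attribute; group(1) stops one character short of the
# closing quote because the final [^#] sits outside the capturing group.
matches = [m.group(0) for m in pattern.finditer(html)]
first_capture = pattern.search(html).group(1)

for attribute in matches:
    print(attribute)
```

The exclude.com link and the bare # link are both skipped. If the full URL is needed, take it from group(0) or move the final [^#] inside the capturing group.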
Explanation:
href=" # Match href="
( # Capture...
(?: # the following group:
(?! # Look ahead to check that the next part of the string isn't...
\b # the entire word
exclude # exclude
\b # (\b are word boundary anchors)
) # End of lookahead
[^"] # If successful, match any character except for a quote
)* # Repeat as often as possible
) # End of capturing group 1
[^#]" # Match a non-# character and the closing quote.
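The same commented layout can be kept in real code. The sketch below assumes Python and its re.VERBOSE flag, which ignores whitespace and # comments outside character classes; the name HREF_RE is illustrative.

```python
import re

# The # inside [^#] is safe in verbose mode because it is in a character class.
HREF_RE = re.compile(r'''
    href="                # literal href="
    (                     # capture group 1:
      (?:                 #   repeated unit:
        (?!\bexclude\b)   #     fail if the whole word exclude starts here
        [^"]              #     otherwise consume one non-quote character
      )*                  #   repeat as often as possible
    )                     # end of group 1
    [^#]"                 # one non-# character, then the closing quote
''', re.VERBOSE)

allowed = HREF_RE.search('href="http://www.test.com/a/"')
blocked = HREF_RE.search('href="http://www.exclude.com"')
anchor_only = HREF_RE.search('href="#"')
```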
To allow multiple "forbidden words":
href="((?:(?!\b(?:exclude|this|too)\b)[^"])*)[^#]"
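If the list of forbidden words changes often, the alternation can be generated instead of typed by hand. This is a hedged Python sketch, not part of the original answer: build_href_pattern is a hypothetical helper, and re.escape is used so that a word containing regex metacharacters is treated literally.

```python
import re

def build_href_pattern(forbidden_words):
    """Return a compiled pattern that skips hrefs containing any forbidden word.

    Hypothetical helper for illustration; assumes forbidden_words is non-empty.
    """
    alternation = "|".join(re.escape(word) for word in forbidden_words)
    return re.compile(
        r'href="((?:(?!\b(?:' + alternation + r')\b)[^"])*)[^#]"'
    )

html = (
    'href="http://www.test.com/cat/1-one_piece_episodes/"'
    'href="http://www.test.com/cat/2-movies_english_subbed/"'
    'href="http://www.xxxx.com/cat/3-english_dubbed/"'
    'href="http://www.exclude.com"'
)

pattern = build_href_pattern(["exclude", "xxxx"])
kept = [m.group(0) for m in pattern.finditer(html)]
```

With exclude and xxxx as the forbidden words, only the two test.com links remain in kept.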