Shouldn't the sub-region be processed by a separate matcher? For example:
    // Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
    public static void main(String[] args) {
        String inputText = "dog1 start dog2a dog2b end dog3 start dog4a dog4b end dog5";
        System.out.println("Input = " + inputText);
        StringBuffer result = new StringBuffer();
        // Group 1 is the whole start..end block, group 2 is its contents.
        Pattern pattern = Pattern.compile("(start(.*?)end)");
        Matcher matcher = pattern.matcher(inputText);
        while (matcher.find()) {
            int s = matcher.start();
            int e = matcher.end();
            System.out.printf("(%d .. %d) -> \"%s\"\n", s, e, matcher.group(1));
            // quoteReplacement keeps any '$' or '\' in the processed text literal.
            matcher.appendReplacement(result,
                    Matcher.quoteReplacement(processSubGroup(matcher.group(1), matcher.group(2))));
        }
        matcher.appendTail(result);
        System.out.println("Final result = " + result);
    }

    // Runs a second, independent matcher over the sub-region only.
    // The contents parameter (group 2) is not used in this example.
    static String processSubGroup(String subGroup, String contents) {
        StringBuffer result = new StringBuffer();
        Pattern pattern = Pattern.compile("dog");
        Matcher matcher = pattern.matcher(subGroup);
        while (matcher.find()) {
            matcher.appendReplacement(result, "cat");
        }
        matcher.appendTail(result);
        return result.toString();
    }
Or, without the logging and more simply:
    public static void main(String[] args) {
        String inputText = "dog1 start dog2a dog2b end dog3 start dog4a dog4b end dog5";
        StringBuffer result = new StringBuffer();
        Pattern pattern = Pattern.compile("(start(.*?)end)");
        Matcher matcher = pattern.matcher(inputText);
        while (matcher.find()) {
            // quoteReplacement keeps any '$' or '\' in the processed text literal.
            matcher.appendReplacement(result,
                    Matcher.quoteReplacement(processSubGroup(matcher.group(1), matcher.group(2))));
        }
        matcher.appendTail(result);
        System.out.println("Final result = " + result);
    }

    // The inner replacement uses its own matcher on the sub-region text.
    static String processSubGroup(String subGroup, String contents) {
        return Pattern.compile("dog").matcher(subGroup).replaceAll("cat");
    }
Result (from the first version, with logging):
Input = dog1 start dog2a dog2b end dog3 start dog4a dog4b end dog5
(5 .. 26) -> "start dog2a dog2b end"
(32 .. 53) -> "start dog4a dog4b end"
Final result = dog1 start cat2a cat2b end dog3 start cat4a cat4b end dog5
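For comparison, here is a minimal sketch of the same nested replacement on Java 9 or later, where Matcher.replaceAll accepts a function. It is not part of the original approach; the class name NestedReplace and the method replaceInBlocks are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class NestedReplace {
    private static final Pattern BLOCK = Pattern.compile("start.*?end");
    private static final Pattern DOG = Pattern.compile("dog");

    // Replaces "dog" with "cat", but only inside start..end blocks.
    static String replaceInBlocks(String input) {
        return BLOCK.matcher(input).replaceAll(
                // A separate matcher processes each block; quoteReplacement
                // keeps '$' and '\' in the processed block literal.
                block -> Matcher.quoteReplacement(
                        DOG.matcher(block.group()).replaceAll("cat")));
    }

    public static void main(String[] args) {
        System.out.println(replaceInBlocks(
                "dog1 start dog2a dog2b end dog3 start dog4a dog4b end dog5"));
    }
}
```

The outer matcher still only moves forward; each block gets its own inner matcher, just as in the versions above.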
Or a more abstract approach:
    interface GroupProcessor {
        String process(String group);
    }

    public static void main(String[] args) {
        String inputText = "dog1 dogs dog2a dog2b enddogs cow1 dog3 cows cow2a cow2b endcows dog4 dogs dog5a dog5b enddogs cow3";
        String result = inputText;
        // Each pass rewrites only the groups delimited by the given markers.
        result = processGroup(result, "dogs*enddogs",
                group -> Pattern.compile("dog").matcher(group).replaceAll("cat"));
        result = processGroup(result, "cows*endcows",
                group -> Pattern.compile("cow").matcher(group).replaceAll("sheep"));
        System.out.println("Input = " + inputText);
        System.out.println("Final result = " + result);
    }

    static String processGroup(String input, String regex, GroupProcessor processor) {
        StringBuffer result = new StringBuffer();
        // The "*" placeholder becomes a reluctant "(.*?)", and the whole
        // expression is wrapped in group 1.
        Pattern pattern = Pattern.compile(String.format("(%s)", regex.replace("*", "(.*?)")));
        Matcher matcher = pattern.matcher(input);
        while (matcher.find()) {
            // quoteReplacement keeps any '$' or '\' in the processed text literal.
            matcher.appendReplacement(result,
                    Matcher.quoteReplacement(processor.process(matcher.group(1))));
        }
        matcher.appendTail(result);
        return result.toString();
    }
This gives us:
Input = dog1 dogs dog2a dog2b enddogs cow1 dog3 cows cow2a cow2b endcows dog4 dogs dog5a dog5b enddogs cow3
Final result = dog1 cats cat2a cat2b endcats cow1 dog3 sheeps sheep2a sheep2b endsheeps dog4 cats cat5a cat5b endcats cow3
Upd.
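The reason: Matcher.region() resets the matcher's implicit state and, with it, lastAppendPosition. appendReplacement and appendTail are, in a sense, a forward-only mechanism, whereas .region() carries no such ordering guarantee: a region can be set to any range, in any order.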
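The reset is easy to observe. The following is a small sketch (the class name RegionResetDemo is made up for illustration) that sets a region in the middle of an append loop: the next appendReplacement copies the input again from position 0, so the output ends up with duplicated text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
final class RegionResetDemo {
    static String demo() {
        String input = "dog1 dog2 dog3";
        Matcher m = Pattern.compile("dog").matcher(input);
        StringBuffer out = new StringBuffer();

        m.find();                        // matches "dog" at 0..3
        m.appendReplacement(out, "cat"); // out = "cat", append position = 3

        m.region(5, 9);                  // resets the matcher: append position = 0
        m.find();                        // matches "dog" at 5..8
        m.appendReplacement(out, "cat"); // appends input[0..5) and then "cat"
        m.appendTail(out);               // appends input[8..end)
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "catdog1 cat2 dog3"
    }
}
```

The leading "cat" followed by "dog1 " is the already-processed prefix being appended a second time, which is exactly the kind of inconsistency a freely movable region produces.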
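Suppose the following: for a 100-character string, you apply the region 0..20 and run the find()/appendReplacement() loop, then move the region to, say, 30..60 and run the replacement loop again. You now have a source string covering 0..100 and a StringBuffer holding the replacement result for 0..60.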
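Next, you apply the region 10..40 to the source string... and then what? If that region of the source contains no match, fine, there is nothing to do. But what if it does contain a match? Where should appendReplacement append or insert the replacement? The result string has already moved past the 10..40 region, and appendReplacement only appends; it does not replace parts of the string already in the output buffer.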
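If there were some constraint that limited region settings to something like MAX(start, lastAppendPosition)..MIN(end, sourceLength), then fine, the append mechanism would keep working. But .region() has no such restriction, and adding one would make .region() useless for searching (which is the main purpose of .region()).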
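That is why .region() resets the matcher's implicit state, including the part that appendReplacement() relies on. If you need different behavior, wrap Matcher in an encapsulating class of your own (Matcher is final, so it cannot be subclassed).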
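As a rough sketch of what such a wrapper could look like, the class below keeps its own append position and clamps every requested region to MAX(start, appendPos)..MIN(end, length), so the output can only grow forward. Everything here (the name ForwardRegionReplacer, its methods, and the clamping policy) is an assumption made for illustration, not an existing API.

```java
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper, for illustration only.
final class ForwardRegionReplacer {
    private final CharSequence input;
    private final Matcher matcher;
    private final StringBuilder out = new StringBuilder();
    private int appendPos = 0; // our own append position; survives region()

    ForwardRegionReplacer(Pattern pattern, CharSequence input) {
        this.input = input;
        this.matcher = pattern.matcher(input);
    }

    // Replaces matches inside start..end, never going back before appendPos.
    ForwardRegionReplacer replaceIn(int start, int end, UnaryOperator<String> fn) {
        int from = Math.max(start, appendPos);
        int to = Math.min(end, input.length());
        if (from >= to) {
            return this; // region lies entirely in already-processed text
        }
        matcher.region(from, to); // resets the Matcher, but not our appendPos
        while (matcher.find()) {
            out.append(input, appendPos, matcher.start()); // untouched gap
            out.append(fn.apply(matcher.group()));         // replacement
            appendPos = matcher.end();
        }
        return this;
    }

    // Appends the remaining input and returns the result.
    String finish() {
        out.append(input, appendPos, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String result = new ForwardRegionReplacer(Pattern.compile("dog"), "dog1 dog2 dog3 dog4")
                .replaceIn(0, 9, s -> "cat")
                .replaceIn(5, 14, s -> "bird") // clamped to 8..14
                .finish();
        System.out.println(result); // prints "cat1 cat2 bird3 dog4"
    }
}
```

The second call asks for 5..14, but the region actually searched starts at 8, because everything before that has already been written to the output. Whether silently clamping is the right policy (as opposed to throwing) is a design choice left open here.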