我正在努力理解 Java 中正则表达式的行为,并且遇到了一些看起来很奇怪的事情。在下面的代码中,由于我在测试时不明白的原因,测试突然失败,消息标签为“6 个字母匹配,否定”(随后的两个测试也失败)。是我盯着这个看得太久了,还是确实发生了一些奇怪的事情?我不认为这与可变长度负前瞻断言(?!X)有关,但我很高兴听到任何理论,甚至确认其他人正在经历同样的问题,并且它并不特定于我的 JVM。抱歉,正则表达式太做作了,但你不想看到真实的东西:)
// $ java -version
// java version "1.7.0_10"
// Java(TM) SE Runtime Environment (build 1.7.0_10-b18)
// Java HotSpot(TM) 64-Bit Server VM (build 23.6-b04, mixed mode)
// test of word without agreement
String test = "plusieurs personne sont";
// match the pattern with curly braces
assertTrue("no letters matched", Pattern.compile("plusieurs personne\\b").matcher(test).find());
assertTrue("1 letters matched", Pattern.compile("plusieurs personn\\p{Alpha}{1,100}\\b").matcher(test).find());
assertTrue("2 letters matched", Pattern.compile("plusieurs person\\p{Alpha}{1,100}\\b").matcher(test).find());
assertTrue("3 letters matched", Pattern.compile("plusieurs perso\\p{Alpha}{1,100}\\b").matcher(test).find());
assertTrue("4 letters matched", Pattern.compile("plusieurs pers\\p{Alpha}{1,100}\\b").matcher(test).find());
assertTrue("5 letters matched", Pattern.compile("plusieurs per\\p{Alpha}{1,100}\\b").matcher(test).find());
assertTrue("6 letters matched", Pattern.compile("plusieurs pe\\p{Alpha}{1,100}\\b").matcher(test).find());
assertTrue("7 letters matched", Pattern.compile("plusieurs p\\p{Alpha}{1,100}\\b").matcher(test).find());
assertTrue("8 letters matched", Pattern.compile("plusieurs \\p{Alpha}{1,100}\\b").matcher(test).find());
// match the negative pattern (without s or x) with curly braces
assertTrue("no letters matched, negative", Pattern.compile("plusieurs (?!personne[sx])\\w+").matcher(test).find());
assertTrue("1 letters matched, negative", Pattern.compile("plusieurs (?!personn\\p{Alpha}{1,100}[sx])\\w+").matcher(test).find());
assertTrue("2 letters matched, negative", Pattern.compile("plusieurs (?!person\\p{Alpha}{1,100}[sx])\\w+").matcher(test).find());
assertTrue("3 letters matched, negative", Pattern.compile("plusieurs (?!perso\\p{Alpha}{1,100}[sx])\\w+").matcher(test).find());
assertTrue("4 letters matched, negative", Pattern.compile("plusieurs (?!pers\\p{Alpha}{1,100}[sx])\\w+").matcher(test).find());
assertTrue("5 letters matched, negative", Pattern.compile("plusieurs (?!per\\p{Alpha}{1,100}[sx])\\w+").matcher(test).find());
// the assertion below fails (is false) for reasons unknown
assertTrue("6 letters matched, negative", Pattern.compile("plusieurs (?!pe\\p{Alpha}{1,100}[sx])\\w+").matcher(test).find());
assertTrue("7 letters matched, negative", Pattern.compile("plusieurs (?!p\\p{Alpha}{1,100}[sx])\\w+").matcher(test).find());
assertTrue("8 letters matched, negative", Pattern.compile("plusieurs (?!\\p{Alpha}{1,100}[sx])\\w+").matcher(test).find());