This is a follow-up to this question: Concatenate previous and latter words to a word that match a condition in R https://stackoverflow.com/questions/58551389/concatenate-previous-and-latter-words-to-a-word-that-match-a-condition-in-r
I am looking for a regex that splits a string at the second space after a comma. See the example below:
vector <- c("Paulsen", "Kehr,", "Diego",
"Schalper", "Sepúlveda,", "Alejandro",
"Von Housen", "Kush,", "Terry")
X <- paste(vector, collapse = " ")
X
## this is the string I am looking to split:
"Paulsen Kehr, Diego Schalper Sepúlveda, Alejandro Von Housen Kush, Terry"
The second space after each comma is the criterion for my regex. So my output would be:
"Paulsen Kehr, Diego"
"Schalper Sepúlveda, Alejandro"
"Von Housen Kush, Terry"
I came up with a pattern, but it doesn't quite work:
[^ ]+ [^ ]+, [^ ]+( )
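Using it with strsplit removes all the matched words instead of splitting only at group 1 (i.e. [^ ]+ [^ ]+, [^ ]+(group-1)). I think I just need to exclude the full match and match only the space that follows it.
Regex demo: https://regex101.com/r/z383ig/17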
strsplit(X, "[^ ]+ [^ ]+, [^ ]+( )")
# [1] "" [2] "" [3] "Von Housen Kush, Terry"
Can anyone think of a regex that finds the second space after each comma?
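For reference, here is a minimal sketch of the idea described above (keep the preceding text out of the match and consume only the trailing space). It assumes strsplit is run with perl = TRUE so that the PCRE match-reset operator \K is available; the variable name parts is only illustrative, and this is one possible direction rather than a definitive answer.

```r
# Rebuild the example string from the question.
vector <- c("Paulsen", "Kehr,", "Diego",
            "Schalper", "Sepúlveda,", "Alejandro",
            "Von Housen", "Kush,", "Terry")
X <- paste(vector, collapse = " ")

# ",\\s\\S+" matches the comma, one whitespace character, and the next word.
# "\\K" drops everything matched so far from the reported match, so
# strsplit() only consumes the single whitespace character that follows.
# Assumption: perl = TRUE (PCRE) is required for \\K.
parts <- strsplit(X, ",\\s\\S+\\K\\s", perl = TRUE)[[1]]
print(parts)
```

On the example string this is intended to yield the three pieces listed above. Because the pattern anchors on the comma rather than on a fixed number of words before it, a two-word surname such as "Von Housen" should not need special handling.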