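Your grok pattern does not match everything in your log line, which is why it is not working. For example, %{WORD} matches only Mozilla, not /5.0. You can create a custom pattern that matches the whole browser/version token, like this: (?<browser>%{WORD}(/%{NUMBER})?).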
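To see the difference outside Logstash, here is a minimal Python sketch. It assumes grok's WORD is \b\w+\b and uses a simplified stand-in for NUMBER. Python's re module writes named groups as (?P<name>...) rather than grok's (?<name>...), so this approximates the grok behaviour and is not Logstash itself.

```python
import re

# Assumed equivalents of the grok building blocks (NUMBER is simplified).
WORD = r"\b\w+\b"
NUMBER = r"[+-]?(?:\d+(?:\.\d+)?|\.\d+)"

# Python spelling of (?<browser>%{WORD}(/%{NUMBER})?)
BROWSER = rf"(?P<browser>{WORD}(?:/{NUMBER})?)"

token = "Mozilla/5.0"

# WORD alone stops at the slash.
word_only = re.match(WORD, token).group(0)
# The custom pattern also consumes the optional "/version" suffix.
browser = re.match(BROWSER, token).group("browser")

print(word_only)  # Mozilla
print(browser)    # Mozilla/5.0
```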
You can skip INFO (6): by matching it with an unnamed .* so that it is consumed but left out of the output.
For the whitespace between fields, use the predefined grok pattern %{SPACE}.
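The effect of the unnamed .* and of %{SPACE} can be sketched in Python. The line fragment below is hypothetical (the original log line is not shown here), and the timestamp, IP and word expressions are simplified stand-ins for the grok patterns, not their real definitions. The only definition used literally is SPACE, which is \s*.

```python
import re

# Hypothetical fragment of a log line, for illustration only.
line = "2016-09-01T10:58:41+02:00 INFO (6): 165.225.76.76   entreprise1"

pattern = re.compile(
    r"(?P<timestamp>\S+)"                         # rough stand-in for %{TIMESTAMP_ISO8601:timestamp}
    r".*"                                         # unnamed, so "INFO (6):" is matched but not captured
    r"(?<![0-9])(?P<ip>\d{1,3}(?:\.\d{1,3}){3})"  # simplified stand-in for %{IP:ip}
    r"\s*"                                        # %{SPACE} is defined as \s*
    r"(?P<company_name>\w+)"                      # stand-in for %{WORD:company_name}
)

fields = pattern.match(line).groupdict()
print(fields)
```

The lookbehind before the IP matters in this sketch: without it the greedy .* would swallow the leading digits of the address. Only the named groups appear in fields, so the INFO (6): part is dropped.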
The code at the end can be made optional with a custom pattern, i.e. (?<optional_code>%{WORD}?).
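As a rough check of the optional trailing code, the tail of the pattern can be approximated in Python. The two input strings are assumptions built from the json and optional_code values in the output, and \b\w+\b is the assumed expansion of %{WORD}.

```python
import re

# Approximation of \{%{GREEDYDATA:json}\}%{SPACE}(?<optional_code>%{WORD}?)
TAIL = re.compile(r"\{(?P<json>.*)\}\s*(?P<optional_code>(?:\b\w+\b)?)")

# Assumed line endings: one with the trailing code, one without it.
with_code = TAIL.search('{"getid":"1"} 86rkt2dqsdze5if1bqldfl1').groupdict()
without_code = TAIL.search('{"getid":"1"}').groupdict()

print(with_code["optional_code"])          # 86rkt2dqsdze5if1bqldfl1
print(repr(without_code["optional_code"])) # ''
```

Because the ? sits inside the named group, the group still takes part in the match when the code is missing and simply captures an empty string.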
Your whole grok pattern then becomes,
%{TIMESTAMP_ISO8601:timestamp}.*%{IP:ip}%{SPACE}%{WORD:company_name}%{SPACE}%{EMAILADDRESS:email}%{SPACE}%{URIPROTO:method}%{SPACE}%{URIPATH:page}%{SPACE}(?<browser>%{WORD}(/%{NUMBER})?)%{SPACE}\(%{GREEDYDATA:content}\).*\{%{GREEDYDATA:json}\}%{SPACE}(?<optional_code>%{WORD}?)
It outputs,
{
"timestamp": [
[
"2016-09-01T10:58:41+02:00"
]
],
"YEAR": [
[
"2016"
]
],
"MONTHNUM": [
[
"09"
]
],
"MONTHDAY": [
[
"01"
]
],
"HOUR": [
[
"10",
"02"
]
],
"MINUTE": [
[
"58",
"00"
]
],
"SECOND": [
[
"41"
]
],
"ISO8601_TIMEZONE": [
[
"+02:00"
]
],
"ip": [
[
"165.225.76.76"
]
],
"IPV6": [
[
null
]
],
"IPV4": [
[
"165.225.76.76"
]
],
"SPACE": [
[
" ",
" ",
" ",
" ",
" ",
" ",
" "
]
],
"company_name": [
[
"entreprise1"
]
],
"email": [
[
"email1@gmail.com"
]
],
"EMAILLOCALPART": [
[
"email1"
]
],
"HOSTNAME": [
[
"gmail.com"
]
],
"method": [
[
"POST"
]
],
"page": [
[
"/application/controller/action"
]
],
"browser": [
[
"Mozilla/5.0"
]
],
"WORD": [
[
"Mozilla",
"86rkt2dqsdze5if1bqldfl1"
]
],
"NUMBER": [
[
"5.0"
]
],
"BASE10NUM": [
[
"5.0"
]
],
"content": [
[
"Windows NT 6.1; Trident/7.0; rv:11.0"
]
],
"json": [
[
"\"getid\":\"1\""
]
],
"optional_code": [
[
"86rkt2dqsdze5if1bqldfl1"
]
]
}
When testing online at https://grokdebug.herokuapp.com/, add the custom patterns for email, because they are not supported there at the moment,
EMAILLOCALPART [a-zA-Z][a-zA-Z0-9_.+-=:]+
EMAILADDRESS %{EMAILLOCALPART}@%{HOSTNAME}
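If you want to sanity-check these two definitions without the debugger, a small Python sketch can help. EMAILLOCALPART is copied from the line above. The HOSTNAME expression is an assumption: a simplified approximation of the stock grok pattern, not its exact definition. The test address is put together from the EMAILLOCALPART and HOSTNAME values in the output.

```python
import re

# Copied from the custom pattern above; "+-=" inside the class is a character range.
EMAILLOCALPART = r"[a-zA-Z][a-zA-Z0-9_.+-=:]+"
# Assumption: simplified approximation of grok's stock HOSTNAME pattern.
HOSTNAME = r"\b[0-9A-Za-z][0-9A-Za-z-]{0,62}(?:\.[0-9A-Za-z][0-9A-Za-z-]{0,62})*"

# Python spelling of %{EMAILLOCALPART}@%{HOSTNAME}
EMAILADDRESS = re.compile(rf"(?P<local>{EMAILLOCALPART})@(?P<host>{HOSTNAME})")

match = EMAILADDRESS.fullmatch("email1@gmail.com")
local_part = match.group("local")
hostname = match.group("host")

print(local_part)  # email1
print(hostname)    # gmail.com
```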