splunkninja

The dojo of Splunk. Learn, share, teach, mentor.

I've tried and failed to extract the IP address field so that it includes only four sets of numbers separated by periods. The built-in Splunk regex pattern generator always seems to tag additional text or punctuation that makes the pattern too specific.

 

For instance, the pattern generator tells me to use this:

(?i) accepted: (?P<FIELDNAME>.*)

 

That works to find 172.25.97.121 in the line below:

2010-03-16 09:46:57.288/[NioTCPListener, swiftlet=sys$jms, port=4001]/INFORMATION/connection accepted: 172.25.97.121

 

But the same Regex doesn't find the same IP address in this line:

2010-03-16 09:45:15.986/sys$jms/INFORMATION/JMSConnection v630/172.25.97.121:2355/connection closed

 

Any ideas?

Thanks,

Swack
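
To illustrate why the generated pattern is too specific, here is a small check using Python's `re` module as a stand-in for Splunk's regex engine (an assumption: Splunk uses PCRE, but this particular pattern should behave the same way in both). The variable names are made up for the illustration. The pattern depends on the literal text " accepted: ", which the second line does not contain.

```python
import re

# The pattern suggested by the generator, copied from the post.
generated = re.compile(r"(?i) accepted: (?P<FIELDNAME>.*)")

line_accepted = ("2010-03-16 09:46:57.288/[NioTCPListener, swiftlet=sys$jms, "
                 "port=4001]/INFORMATION/connection accepted: 172.25.97.121")
line_closed = ("2010-03-16 09:45:15.986/sys$jms/INFORMATION/"
               "JMSConnection v630/172.25.97.121:2355/connection closed")

first = generated.search(line_accepted)   # matches: the line contains " accepted: "
second = generated.search(line_closed)    # no match: that literal prefix is absent

print(first.group("FIELDNAME"))
print(second)
```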

Replies to This Discussion

FYI, I've tried
\d+\.\d+\.\d+\.\d+
but it doesn't find anything in the sample lines above.

Hi Patrick,

I guess you have to look at the prefix and postfix parts:

(?i) accepted: (?P<FIELDNAME>.*) means: search for " accepted: " and put everything after it (.*) into FIELDNAME.
This will not work for the other example:

2010-03-16 09:45:15.986/sys$jms/INFORMATION/JMSConnection v630/172.25.97.121:2355/connection closed

Based on this, you need something for the prefix:

v630\/

and the part you want to capture (the postfix):

\d+\.\d+\.\d+\.\d+

So, for instance:

(?i) v630\/ (?P<FIELDNAME>\d+\.\d+\.\d+\.\d+)


Cheers
Ferry Leirissa

Oops, paste errors....

* | rex "v630\/(?P<FIELDNAME>\d+\.\d+\.\d+\.\d+)"

Then you get the IP as a field. Hope this helps!

Cheers
Ferry

Thanks Ferry! I was able to get it working using this:
| rex "(?<FIELDNAME>\d+\.\d+\.\d+\.\d+)"
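
As a rough check of this approach outside Splunk, the sketch below applies a prefix-free named group to both sample lines from the original question. It uses Python's `re` module, which only accepts the `(?P<name>...)` spelling for named groups; the field name `ip_address` is hypothetical and chosen just for this example.

```python
import re

# Hypothetical field name "ip_address". There is no literal prefix,
# so the pattern is not tied to one particular log format.
ip_pattern = re.compile(r"(?P<ip_address>\d+\.\d+\.\d+\.\d+)")

samples = [
    "2010-03-16 09:46:57.288/[NioTCPListener, swiftlet=sys$jms, "
    "port=4001]/INFORMATION/connection accepted: 172.25.97.121",
    "2010-03-16 09:45:15.986/sys$jms/INFORMATION/"
    "JMSConnection v630/172.25.97.121:2355/connection closed",
]

# search() returns the leftmost match; the timestamps contain only one dot,
# so they cannot satisfy the four-part pattern.
extracted = [ip_pattern.search(line).group("ip_address") for line in samples]
print(extracted)
```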

Would you consider something like ...
\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}

meaning:
(1 to 3 digits), (then a dot), (1 to 3 digits), (dot), (1 to 3 digits), (dot), (1 to 3 digits).

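It is very similar to the one you refer to; the only small advantage is that it limits the number of digits in each part to 3.
The string 1.12.123.1234 would not be matched in full as an IP address (without a \b at each end, though, the pattern can still match the leading 1.12.123.123).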


HTH,
Marcelo
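
A minimal sketch of the difference, again using Python's `re` module as a stand-in for Splunk's engine (an assumption). The names `loose`, `limited` and `bounded` are invented for this example, and the `\b` variant is an addition for comparison, not something proposed in the thread.

```python
import re

loose = re.compile(r"\d+\.\d+\.\d+\.\d+")
limited = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")
# Same as "limited", but with word boundaries so a longer run of digits
# on either side prevents a match.
bounded = re.compile(r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b")

text = "1.12.123.1234"

loose_hit = loose.search(text)      # matches the whole string
limited_hit = limited.search(text)  # matches only the first 12 characters
bounded_hit = bounded.search(text)  # no match at all

# A real address followed by a port still matches with the boundaries in place.
port_hit = bounded.search("v630/172.25.97.121:2355/connection closed")

print(loose_hit.group(0), limited_hit.group(0), bounded_hit, port_hit.group(0))
```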

Great, Marcelo! Thanks! I learn something new every day! (Of course, it's not hard for me to learn something new about regular expressions.)

pffff ... show me someone who says they "know everything" about regex and i will show you a liar.

glad to help :-)
Marcelo

That's pretty sweet... I like that idea of limiting it to between one and three characters. I've seen some other ones that limit it to the actual possible digits in an IP... I'm still trying to understand the cryptic nature of them.

(([2]([0-4][0-9]|[5][0-5])|[0-1]?[0-9]?[0-9])[.]){3}(([2]([0-4][0-9]|[5][0-5])|[0-1]?[0-9]?[0-9]))

I found this on Regexr.com. I have no idea how to read it... I'll have to break it down tomorrow morning... yikes.
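
One possible reading of that pattern, sketched in Python's `re` module (a stand-in for Splunk's engine; the pattern uses only syntax common to both). The comments are an interpretation, and the test strings are made up for the illustration.

```python
import re

# The pattern from Regexr, copied verbatim. Reading it piece by piece:
#   [2]([0-4][0-9]|[5][0-5])  -> "2" followed by 00-49 or 50-55, i.e. 200-255
#   [0-1]?[0-9]?[0-9]         -> one to three digits, the first of three
#                                being 0 or 1, i.e. 0-199
#   (...)[.]                  -> one such octet followed by a literal dot
#   {3}                       -> that "octet + dot" group, three times
#   final (...)               -> a fourth octet with no trailing dot
strict = re.compile(
    r"(([2]([0-4][0-9]|[5][0-5])|[0-1]?[0-9]?[0-9])[.]){3}"
    r"(([2]([0-4][0-9]|[5][0-5])|[0-1]?[0-9]?[0-9]))"
)

candidates = ["172.25.97.121", "255.255.255.255", "256.1.1.1", "1.12.123.1234"]

# fullmatch() requires the whole string to be a valid address.
valid = [c for c in candidates if strict.fullmatch(c)]
print(valid)

# Unanchored, the pattern can still match part of an invalid string.
partial = strict.search("256.1.1.1").group(0)
print(partial)
```

Note that the range check only rejects bad input when the pattern has to cover the whole value; in a free-text search it would need boundaries around it to avoid partial matches.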

just a humble note.
I *ALWAYS* write a comment beside any regex I prepare, translating it step by step into "plain human readable".
The more verbose the documentation, the better.

I find regexes very risky; this comes together with their great power, i suppose.

HTH
Marcelo
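
One way to keep that kind of documentation next to the pattern itself is a "verbose" regex, where whitespace is ignored and comments are allowed inside the pattern. The sketch below uses Python's `re.VERBOSE` flag to show the idea; PCRE has a comparable `(?x)` option, but whether it fits a given Splunk search is left as an assumption. The sample text is taken from the first log line in this thread.

```python
import re

ip_regex = re.compile(r"""
    \d{1,3}   # first part: one, two or three digits
    \.        # a literal dot (escaped, since a bare dot means any single char)
    \d{1,3}   # second part
    \.        # dot
    \d{1,3}   # third part
    \.        # dot
    \d{1,3}   # fourth part
""", re.VERBOSE)

found = ip_regex.search("connection accepted: 172.25.97.121").group(0)
print(found)
```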

Guys,

I found your thread and hope that you can help me with a similar extraction problem. Here is a single syslog entry from which I'm trying to extract a field:

Apr 3 15:04:55 adsl-068-153-219-120.sip.bct.bellsouth.net 6807: Router-1969: 006804: Apr 3 15:04:54: %SEC-6-IPACCESSLOGP: list FromInternet permitted udp 69.173.64.15(15) -> 68.153.219.120(123), 1 packet

I'm trying to create a Regex that extracts the source ip address (69.173.64.15) as a new field when the access list "FromInternet" appears in the syslog entry.

As well, I'd love to be able to extract the destination ip address (68.153.219.120) as a separate field.

The built-in pattern generator couldn't help me at all. I'm hoping that you guys can!

Thanks very much for your Regex experience!

James E

Try this one:

(?<src_ip>\d+\.\d+\.\d+\.\d+)\(\d+\) \-> (?<dest_ip>\d+\.\d+\.\d+\.\d+)\(\d+\)

it should extract the first ip as "src_ip" and the second one as "dest_ip"

Cheers, Siegfried
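
A quick way to try this pattern against James's sample event, sketched in Python. Two things here are assumptions for the sake of the example: the named groups are respelled as `(?P<name>...)` because Python's `re` module does not accept `(?<name>...)`, and the check for "list FromInternet" is a plain substring test standing in for whatever filter the Splunk search would use.

```python
import re

# Siegfried's pattern, with (?<name>...) rewritten as (?P<name>...) for Python.
acl_pattern = re.compile(
    r"(?P<src_ip>\d+\.\d+\.\d+\.\d+)\(\d+\) \-> (?P<dest_ip>\d+\.\d+\.\d+\.\d+)\(\d+\)"
)

event = ("Apr 3 15:04:55 adsl-068-153-219-120.sip.bct.bellsouth.net 6807: "
         "Router-1969: 006804: Apr 3 15:04:54: %SEC-6-IPACCESSLOGP: "
         "list FromInternet permitted udp 69.173.64.15(15) -> 68.153.219.120(123), 1 packet")

fields = {}
# Only extract when the access list name appears in the event.
if "list FromInternet" in event:
    match = acl_pattern.search(event)
    if match:
        fields = match.groupdict()

print(fields)
```

The hostname at the start of the event uses hyphens rather than dots, so it does not get picked up as an address.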

what about this?

index=* frominternet | rex field=_raw ".*?\s(?<IP1>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}).*\s(?<IP2>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}).*" | fields + IP1, IP2 | head 3

Note that this search keeps only the lines that contain "frominternet",
then applies a regex to the _raw field (the event line),
which matches:
.*?\s := any chars, non-greedy, up to a whitespace character, followed by
(?<IP1>...) := your field, which will be named "IP1"
.*\s := then more chars up to a whitespace character
(?<IP2>...) := then the second field (IP2), and then
.* := the rest of the line

as for the content of IP1 and IP2 they both match
\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}
\d{1,3} := one, two or three digits
\. := a dot (escaped with a backslash because it means exactly a dot, not "any single char")
\d{1,3} := one, two or three digits
\. := a dot
\d{1,3} := one, two or three digits
\. := a dot
\d{1,3} := one, two or three digits


makes sense ?
Marcelo Finkielsztein
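
The regex part of that search can be tried on the sample syslog entry outside Splunk. The sketch below uses Python's `re` module, so the groups are written as `(?P<IP1>...)` and `(?P<IP2>...)`; the `frominternet` filter, `fields` and `head` steps are Splunk-specific and are not reproduced here.

```python
import re

two_ips = re.compile(
    r".*?\s(?P<IP1>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"   # lazy: first address after whitespace
    r".*\s(?P<IP2>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}).*"  # greedy: last address after whitespace
)

event = ("Apr 3 15:04:55 adsl-068-153-219-120.sip.bct.bellsouth.net 6807: "
         "Router-1969: 006804: Apr 3 15:04:54: %SEC-6-IPACCESSLOGP: "
         "list FromInternet permitted udp 69.173.64.15(15) -> 68.153.219.120(123), 1 packet")

m = two_ips.search(event)
ip1, ip2 = m.group("IP1"), m.group("IP2")
print(ip1, ip2)
```

Because the second `.*` is greedy, IP2 ends up being the last whitespace-preceded address in the line. With exactly two addresses per event, as in this sample, that is the destination.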

© 2010   Created by Michael Wilde.   Powered by .
