More than once I’ve spent far too long searching for a working IPv6 regular expression. There’s a lot of crap out there that doesn’t take certain cases into account. Of course, you only find out which one works best by testing them all. I’ve tried A LOT of them and finally found the right one.
As a little reminder for myself, and perhaps a helping hand for somebody else: I found this page useful and working fine.
The regex itself is:
\s*((([0-9A-Fa-f]{1,4}:){7}(([0-9A-Fa-f]{1,4})|:))|
(([0-9A-Fa-f]{1,4}:){6}(:|((25[0-5]|2[0-4]\d|[01]?\d{1,2})
(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})|(:[0-9A-Fa-f]{1,4})))|
(([0-9A-Fa-f]{1,4}:){5}((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})
(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|((:[0-9A-Fa-f]{1,4}){1,2})))|
(([0-9A-Fa-f]{1,4}:){4}(:[0-9A-Fa-f]{1,4}){0,1}((:((25[0-5]|2[0-4]\d|
[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|
((:[0-9A-Fa-f]{1,4}){1,2})))|(([0-9A-Fa-f]{1,4}:){3}(:[0-9A-Fa-f]{1,4})
{0,2}((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|
[01]?\d{1,2})){3})?)|((:[0-9A-Fa-f]{1,4}){1,2})))|
(([0-9A-Fa-f]{1,4}:){2}(:[0-9A-Fa-f]{1,4}){0,3}
((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})
(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|((:[0-9A-Fa-f]{1,4}){1,2})))|
(([0-9A-Fa-f]{1,4}:)(:[0-9A-Fa-f]{1,4}){0,4}((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})
(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|((:[0-9A-Fa-f]{1,4}){1,2})))|
(:(:[0-9A-Fa-f]{1,4}){0,5}((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})
(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|((:[0-9A-Fa-f]{1,4}){1,2})))|
(((25[0-5]|2[0-4]\d|[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|
[01]?\d{1,2})){3})))(%.+)?\s*
In PHP, for example, this becomes:
define('IPV6_REGEX', "/^\s*((([0-9A-Fa-f]{1,4}:){7}
(([0-9A-Fa-f]{1,4})|:))|(([0-9A-Fa-f]{1,4}:){6}
(:|((25[0-5]|2[0-4]\d|[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|
[01]?\d{1,2})){3})|(:[0-9A-Fa-f]{1,4})))|
(([0-9A-Fa-f]{1,4}:){5}((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})
(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|
((:[0-9A-Fa-f]{1,4}){1,2})))|(([0-9A-Fa-f]{1,4}:){4}
(:[0-9A-Fa-f]{1,4}){0,1}((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})
(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|
((:[0-9A-Fa-f]{1,4}){1,2})))|(([0-9A-Fa-f]{1,4}:){3}
(:[0-9A-Fa-f]{1,4}){0,2}((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})
(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|((:[0-9A-Fa-f]{1,4}){1,2})))|
(([0-9A-Fa-f]{1,4}:){2}(:[0-9A-Fa-f]{1,4}){0,3}((:((25[0-5]|2[0-4]\d|
[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|
((:[0-9A-Fa-f]{1,4}){1,2})))|(([0-9A-Fa-f]{1,4}:)(:[0-9A-Fa-f]{1,4}){0,4}
((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|
((:[0-9A-Fa-f]{1,4}){1,2})))|(:(:[0-9A-Fa-f]{1,4}){0,5}((:((25[0-5]|2[0-4]\d|
[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|
((:[0-9A-Fa-f]{1,4}){1,2})))|(((25[0-5]|2[0-4]\d|[01]?\d{1,2})
(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})))(%.+)?\s*$/x"); // x modifier: PCRE ignores the line breaks inside the pattern
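The pattern is printed here across several lines, and most regex engines treat a line break inside a pattern as a literal character unless told otherwise. The sketch below illustrates that in Python with a short stand-in pattern (only the IPv4 part of the regex, wrapped by hand; the name WRAPPED is made up for this example). It is meant as an illustration of the wrapping issue, not as a port of the full IPv6 regex.

```python
import re

# Hypothetical stand-in: the IPv4 part of the regex, wrapped across lines
# the same way the full pattern is wrapped on this page.
WRAPPED = r"""^\s*(25[0-5]|2[0-4]\d|
[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|
[01]?\d{1,2})){3}\s*$"""

# Compiled as-is, the line breaks become part of the pattern.
literal = re.compile(WRAPPED)

# Option 1: verbose mode ignores whitespace outside character classes.
verbose = re.compile(WRAPPED, re.VERBOSE)

# Option 2: join the lines back together before compiling.
joined = re.compile("".join(WRAPPED.splitlines()))

for name, regex in (("literal", literal), ("verbose", verbose), ("joined", joined)):
    print(name, bool(regex.match("192.168.0.1")))
```

Either option should work for the full pattern as well, since it contains no spaces or `#` characters that verbose mode would swallow.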
The regex at http://forums.dartware.com/viewtopic.php?t=452 accepts the following addresses:
1111:2222:3333:4444::5555:
1111:2222:3333::5555:
1111:2222::5555:
1111::5555:
::5555:
These are all invalid.
I’ve emailed the author.
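One way to double-check the five addresses listed above is to run them through an independent parser. The sketch below uses Python's standard `ipaddress` module as that second opinion; the helper name `is_valid_ipv6` is made up for this example, and this is a cross-check rather than the method used to find the problem.

```python
import ipaddress

# The five addresses from the comment, each ending in a stray colon.
candidates = [
    "1111:2222:3333:4444::5555:",
    "1111:2222:3333::5555:",
    "1111:2222::5555:",
    "1111::5555:",
    "::5555:",
]

def is_valid_ipv6(text):
    """Return True if the standard library accepts text as an IPv6 address."""
    try:
        ipaddress.IPv6Address(text)
        return True
    except ipaddress.AddressValueError:
        return False

results = {addr: is_valid_ipv6(addr) for addr in candidates}
for addr, ok in results.items():
    print(f"{addr!r}: {'valid' if ok else 'invalid'}")
```

The same addresses without the trailing colon (for example `1111::5555`) are accepted, so the colon at the end is what makes them invalid.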
Nice work Aeron!
It’s a shame it doesn’t work in 100% of the cases, but I must say that it’s the best I’ve seen so far. If the author could correct the error, that would be fantastic 🙂
The regex allows leading zeros in the IPv4 parts.
Some Unix and Mac distros interpret segments with a leading zero as octal.
I suggest using 25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d as an IPv4 segment.
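To see what the suggested segment changes, here is a small Python sketch that builds a dotted quad out of it and tries a few inputs. The names `OCTET`, `IPV4` and `is_strict_ipv4` are made up for this example, and the non-capturing groups are my addition so the alternation stays contained.

```python
import re

# The suggested segment: 0-255 with no leading zeros.
OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"

# Four segments separated by dots.
IPV4 = re.compile(rf"{OCTET}(?:\.{OCTET}){{3}}")

def is_strict_ipv4(text):
    """Return True if text is a dotted quad without leading zeros."""
    return IPV4.fullmatch(text) is not None

for sample in ("254.157.241.86", "254.157.241.086", "0.0.0.0", "256.1.1.1"):
    print(sample, is_strict_ipv4(sample))
```

With this segment, `254.157.241.086` is rejected because of the leading zero in `086`, while `254.157.241.86` still passes.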
I’m working on an updated version; I’ve replied to Aeron and if my updates work for him, I’ll post it at the original Dartware site and send it to you as well.
Cool!
Thanks very much Stephen!
I am using the regexp for IPv6 (copied from Dartware on 13 January) in my Erlang programming.
But I have a problem with two of the examples. Maybe you can help me.
Since I am using the Erlang re module rather than Perl, I have removed the /^\s* from the beginning and the (%.+)?\s*$/ from the end.
I got nomatch when trying these two:
fe80:0000:0000:0000:0204:61ff:254.157.241.086 // IPv4 dotted quad at the end
fe80:0:0:0:0204:61ff:254.157.241.86 // drop leading zeroes, IPv4 dotted quad at the end
I also still get a match for 1111:2222:3333:4444::5555: (with the trailing colon), which is invalid according to post 1.
Do you see any explanation or suggestion?
Many thanks for a reply
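One possible explanation for the unexpected match, not confirmed anywhere in this thread: with the ^ and $ anchors removed, a regex engine is free to match just a part of the input, and everything before the trailing colon is a valid address. The Python sketch below shows that effect with a deliberately simplified pattern (`SIMPLE` is a made-up stand-in, not the Dartware regex).

```python
import re

HEXTET = r"[0-9A-Fa-f]{1,4}"

# Simplified stand-in: one to six "hextet:" groups, then ":hextet".
SIMPLE = re.compile(rf"(?:{HEXTET}:){{1,6}}:{HEXTET}")

text = "1111:2222:3333:4444::5555:"

# Unanchored: the engine matches the valid part and ignores the last colon.
partial = SIMPLE.search(text)
print(partial.group(0))

# Anchored to the whole string: the trailing colon makes the match fail.
whole = SIMPLE.fullmatch(text)
print(whole)
```

If that is the cause, keeping the ^ and $ in the pattern (or asking the re module for an anchored match) should make the invalid address fail again.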
Hi Jonas,
I’ve tested the regex on RegexTester with the preg dialect.
When doing that, only the first one you mention (fe80:0000:0000:0000:0204:61ff:254.157.241.086 // IPv4 dotted quad at the end) fails. I should check whether the update Stephen Ryan (see above) sent has already been made available.
Folks,
We have done some maintenance on that Regular Expression knowledgebase article at InterMapper, and as of early February 2010, it should handle all those cases properly. (http://forums.dartware.com/viewtopic.php?t=452)
Note that there are additional articles that link to Perl, JavaScript, Ruby, and Java implementations.
The JavaScript on the IPv6 Address Validator page (http://intermapper.com/ipv6validator) also converts the address to its “best representation”.
Best regards,
Rich Brown
Dartware, LLC
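For readers wondering what a “best representation” might look like: the validator's own conversion is not shown here, so the sketch below uses the compressed form from Python's standard `ipaddress` module as a stand-in. It takes one of the addresses from the earlier comment and prints its shortest and fully expanded forms.

```python
import ipaddress

# Address from the comment above, with an IPv4 dotted quad at the end.
addr = ipaddress.IPv6Address("fe80:0:0:0:0204:61ff:254.157.241.86")

# Shortest form: leading zeros dropped, longest zero run replaced by "::".
compressed = addr.compressed

# Fully expanded form: eight groups of four hex digits.
exploded = addr.exploded

print(compressed)  # fe80::204:61ff:fe9d:f156
print(exploded)    # fe80:0000:0000:0000:0204:61ff:fe9d:f156
```

The dotted quad 254.157.241.86 turns into the last two groups, fe9d:f156, because each pair of decimal bytes becomes one 16-bit hex group.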
The Dartware forums are now hidden behind a login. You can update the main article to point to the same data in a Bitbucket repo: https://bitbucket.org/intermapper/ipv6-validator
Thanks for the feedback Rich, I’ve updated the article!