[nycphp-talk] regexp for URLs (is this correct?)
Chris Hubbard
chubbard at next-online.net
Mon May 3 17:32:38 EDT 2004
Jay,
I've played with this one a lot and I've got a regex that I'm using.
It has one bug that I know about; maybe someone on this list can
suggest a fix:
$regx = "^((http|https|ftp)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z0-9]{2,5}(:[a-zA-Z0-9]*)?\/?([a-zA-Z0-9\-\._\?\,\'\/\\\+&%\$#\=~])*)$";
The bug is that domain names containing a hyphen fail. So, www.big-boots.com
fails, whereas www.bigboots.com passes.
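
One possible cause, assuming the pattern is passed to ereg() (the post does not say which function is used): in POSIX bracket expressions a backslash is a literal character, so "\-\." may be read as a range from backslash to backslash plus a dot, which leaves the hyphen out of the class. A minimal sketch of a workaround, using preg_match() and placing the hyphen last in each class so it is always literal. The function name is_valid_url and the "!" delimiter are illustrative choices, not part of the original code:

```php
<?php
// Hypothetical helper (not from the original post): same structure as the
// regex above, rewritten for preg_match(). The hyphen is the last character
// of each class, so it cannot be interpreted as a range operator.
function is_valid_url(string $url): bool
{
    $pattern = '!^(?:https?|ftp)://'                   // scheme
             . '[a-zA-Z0-9.-]+\.[a-zA-Z0-9]{2,5}'      // host and TLD
             . '(?::[a-zA-Z0-9]*)?'                    // optional port, as in the original
             . '/?[a-zA-Z0-9._?,\'/\\\\+&%$#=~-]*$!D'; // optional path and query
    return preg_match($pattern, $url) === 1;
}

var_dump(is_valid_url('http://www.big-boots.com')); // bool(true)
var_dump(is_valid_url('http://www.bigboots.com'));  // bool(true)
var_dump(is_valid_url('www.big-boots.com'));        // bool(false): no scheme
```

The D modifier keeps "$" from matching before a trailing newline. The character classes are otherwise carried over unchanged, so this sketch is only meant to address the hyphen case.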
Chris
On May 3, 2004, at 11:30 AM, Jayesh Sheth wrote:
> Hello all,
>
> I came up with this Perl-style regular expression to validate URLs.
>
> For example, if I want to match "http://www.google.com"
> then I would test it against the following pattern:
>
> #^([a-z]{3,}://)(([0-9a-z-]+\.)+[0-9a-z]{2,4})$#i
>
> maybe I want to do something like this though:
>
> #^([a-z]{3,5}://)(([0-9a-z-]+\.)+[0-9a-z]{2,4})$#i
>
> so that
> http://www.google.com
>
> and
> ftp://ftp.mozilla.org
>
> and
> https://www.amazon.com
>
> are matched
>
> but
> httpsabc://www.somewhere.com
>
> is not matched.
>
> I primarily want to validate http:// links, though.
>
> Another thought: I want to be flexible and allow the user to enter
> something like:
>
> www.google.com
>
> Should I first try to match against:
> #^([a-z]{3,5}://)(([0-9a-z-]+\.)+[0-9a-z]{2,4})$#i
>
> and then, if that fails against:
> #^(([0-9a-z-]+\.)+[0-9a-z]{2,4})$#i
>
> I also realize that something like
> http://www.google.com/
>
> will not be matched.
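
The two-pass idea and the trailing-slash case above can probably be folded into a single pattern by making the scheme group optional and allowing one optional "/" before the end anchor. A rough sketch under that assumption; the function name matches_host_url and the sample list are illustrative only:

```php
<?php
// Hypothetical helper (not from the original thread): Jay's pattern with an
// optional scheme group and an optional trailing slash, so one preg_match()
// call covers both "http://www.google.com/" and "www.google.com".
function matches_host_url(string $input): bool
{
    $pattern = '#^([a-z]{3,5}://)?(([0-9a-z-]+\.)+[0-9a-z]{2,4})/?$#iD';
    return preg_match($pattern, $input) === 1;
}

$samples = [
    'http://www.google.com',
    'http://www.google.com/',
    'ftp://ftp.mozilla.org',
    'https://www.amazon.com',
    'www.google.com',
    'httpsabc://www.somewhere.com',
];
foreach ($samples as $s) {
    printf("%-30s %s\n", $s, matches_host_url($s) ? 'match' : 'no match');
}
```

Only the last sample should report "no match". Note that {3,5} still accepts any scheme of three to five letters; if only specific schemes are wanted, an explicit alternation such as (https?|ftp) would be stricter.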
>
> Any suggestions or corrections?
>
> Thanks in advance!
>
> Best Regards,
>
> - Jay