[nycphp-talk] Squashing accented characters
Andrew Yochum
andrew at plexpod.com
Fri Oct 22 14:57:53 EDT 2010
Hi Paul,
You can achieve that with Unicode transliteration:
http://cldr.unicode.org/index/cldr-spec/transliteration-guidelines
Check out the PHP iconv extension, whose //TRANSLIT conversion option does this:
http://us.php.net/manual/en/intro.iconv.php
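A minimal sketch of how that might look, assuming PHP 7 or later. The function name squash_accents and the small lookup table are hypothetical and only cover a handful of characters; a real table would need the rest of Latin-1 and Latin Extended-A. The explicit table runs first because what iconv's //TRANSLIT produces depends on the platform's iconv library and the current locale, so I wouldn't rely on it alone for characters like the Polish dark L:

```php
<?php
// Hypothetical helper: squash accented Latin characters to plain ASCII.
// Assumes $text is valid UTF-8 and this source file is saved as UTF-8.
function squash_accents(string $text): string
{
    // Illustrative sample only, NOT a complete table.
    static $map = [
        'ä' => 'a', 'Ä' => 'A', 'ö' => 'o', 'Ö' => 'O',
        'ü' => 'u', 'Ü' => 'U', 'é' => 'e', 'è' => 'e',
        'ó' => 'o', 'Ó' => 'O', 'ł' => 'l', 'Ł' => 'L',
        'ś' => 's', 'ź' => 'z', 'ż' => 'z', 'ć' => 'c',
        'ń' => 'n', 'ą' => 'a', 'ę' => 'e',
    ];

    // strtr() with an array matches whole keys, so multi-byte UTF-8
    // characters are replaced as units.
    $text = strtr($text, $map);

    // Let iconv try to transliterate anything the table missed.
    // Results here vary by platform and locale, so treat them as best effort.
    if (function_exists('iconv')) {
        $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $text);
        if ($ascii !== false) {
            return $ascii;
        }
    }

    // iconv unavailable or failed: return the table-mapped string as is.
    return $text;
}
```

You would run the same function over the names at index time and over each incoming query, so that "Düsseldorf" and "Dusseldorf" end up as the same key.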
Hope that helps!
Regards,
Andrew
On 10/22/10 2:50 PM, Paul A Houle wrote:
> For my site at
>
> http://ookaboo.com/
>
> I'm running into the problem that people are searching for
> "Dusseldorf" but the name of the place is "Düsseldorf", so they don't
> find it.
>
> It seems to me a good answer to this is to have some function that
> squashes accented characters down to unaccented forms. I'd index the
> unaccented forms and also squash down queries so they'd always match
> up. I definitely need to cover both ISO Latin-1 and Latin
> Extended-A, because fate has given me a lot of place names
> that have the Polish dark L in them (ł
> <http://fileformat.info/info/unicode/char/0142/>). It also seems like
> there are a lot of characters in Latin Extended-B that would also map
> plausibly to unaccented characters.
>
> I can see how to write something like this: I'd need to parse out the
> Unicode code points from UTF-8 and run them through a lookup table,
> but it's a lot of details and I wonder if anybody has written a PHP
> function to do this already.
--
Andrew Yochum
Plexpod
andrew at plexpod.com
office: 718-360-0879
mobile: 347-688-4699
fax: 718-504-6289