[nycphp-talk] question about utf-8 or unicode?

csnyder chsnyder at gmail.com
Mon Sep 6 13:22:06 EDT 2004


Unicode isn't out of the question -- it's just encoded in a way
different from how JavaScript wants to do it in this case. Unicode
characters are represented in URIs by their octet sequences -- you
just need more octets per character than you do with ASCII.
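
To make the octet counts concrete, here is a small sketch in Python (not
PHP or JavaScript; chosen only for illustration). It assumes the octets
are UTF-8, which is the default for urllib.parse.quote; the variable
names are made up for this example.

```python
from urllib.parse import quote

# quote() percent-encodes the UTF-8 octets of each non-ASCII character.
plain = quote("e", safe="")          # ASCII letter: one octet, left as-is
accented = quote("\u00e9", safe="")  # U+00E9 (e-acute): two UTF-8 octets
euro = quote("\u20ac", safe="")      # U+20AC (euro sign): three UTF-8 octets

print(plain)     # e
print(accented)  # %C3%A9
print(euro)      # %E2%82%AC
```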

The trick when you get an octet sequence like that is figuring out
what character set to translate it into, a problem mentioned but not
resolved by RFC 2396 (which supersedes 1738).
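
A rough illustration of that ambiguity, again as a Python sketch: the
same escaped octets give different text depending on which character set
the receiver assumes. The two candidate encodings here (UTF-8 and
Latin-1) are my choice for the example, not something the RFC prescribes.

```python
from urllib.parse import unquote, unquote_to_bytes

escaped = "%C3%A9"

# The URI itself only carries octets; nothing says how to interpret them.
raw = unquote_to_bytes(escaped)

# Same two octets, two different readings.
as_utf8 = unquote(escaped, encoding="utf-8")      # one character, U+00E9
as_latin1 = unquote(escaped, encoding="latin-1")  # two characters, U+00C3 U+00A9

print(raw)
print(len(as_utf8), len(as_latin1))
```

Unless both ends agree on the character set out of band, the receiver
has no way to tell which reading was intended.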

My guess is that the %uNNNN notation is a (misguided?) attempt to
encode around this problem by identifying the sequence as Unicode up
front.
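
If you do receive %uNNNN sequences on the server, one way to handle them
is to turn each one back into the character it names. A minimal sketch in
Python, assuming every sequence is exactly four hex digits and names a
single character (it does not recombine surrogate pairs and leaves plain
%XX escapes alone); decode_percent_u is a hypothetical helper name, not a
library function.

```python
import re

# Matches %u followed by exactly four hex digits, e.g. %u20AC.
_PERCENT_U = re.compile(r"%u([0-9A-Fa-f]{4})")

def decode_percent_u(text):
    """Replace each %uNNNN sequence with the character at code point NNNN."""
    return _PERCENT_U.sub(lambda m: chr(int(m.group(1), 16)), text)

print(decode_percent_u("caf%u00E9"))  # café
print(decode_percent_u("%u20AC5"))    # €5
```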


