Just in case

Joe Clark is such a stickler for technical correctness that I’m surprised he let a simple error like this get through: “URLs are case-insensitive by spec.” (He’s referring to the (partial) HTTP URI cbc.ca/thering.) For the record, straight from the specification, RFC 2616 (I’ve chosen to link to the W3C’s HTML version, but the text is equivalent to the official text version provided by the IETF):

3.2.3 URI Comparison

When comparing two URIs to decide if they match or not, a client SHOULD use a case-sensitive octet-by-octet comparison of the entire URIs, with these exceptions:

      - A port that is empty or not given is equivalent to the default
        port for that URI-reference;
      - Comparisons of host names MUST be case-insensitive;
      - Comparisons of scheme names MUST be case-insensitive;
      - An empty abs_path is equivalent to an abs_path of “/”.

In fact, case sensitivity is the default for all URIs, as defined by RFC 3986: “The other generic syntax components are assumed to be case-sensitive unless specifically defined otherwise by the scheme.”
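For the code-minded, the comparison rules quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the table of default ports are mine, not anything from the RFC, and the sketch deliberately leaves out the percent-encoding equivalence that section 3.2.3 also describes.

```python
from urllib.parse import urlsplit

# Assumption: only these schemes get a default port in this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}

def _comparison_key(uri):
    """Reduce a URI to the pieces that matter for comparison."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()                # scheme: case-insensitive
    host = (parts.hostname or "").lower()        # host: case-insensitive
    port = parts.port or DEFAULT_PORTS.get(scheme)  # missing port = default
    path = parts.path or "/"                     # empty abs_path = "/"
    # Path and query are kept exactly as given: they are case-sensitive.
    return (scheme, host, port, path, parts.query)

def uris_match(a, b):
    """Return True if two HTTP URIs are equivalent under the quoted rules."""
    return _comparison_key(a) == _comparison_key(b)

print(uris_match("HTTP://CBC.ca/thering", "http://cbc.ca/thering"))  # host and scheme differ only in case
print(uris_match("http://cbc.ca/thering", "http://cbc.ca/TheRing"))  # path differs in case
```

The first comparison matches and the second does not, which is the whole point: only the scheme and host are forgiving about case.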

As the man himself says, Don’t nitpick angry!

Aggressive canonicalization

Herewith, a simple demonstration of what aggressive canonicalization can produce. […] The cache is simply files in Atom 1.0 format, with all textual content normalized to XHTML.

More importantly for my purposes, Sam’s Venus branch of Planet also normalizes URLs, which means I can use it to generate a feed so Gregarius (based on MagpieRSS) will no longer mung up relative links in Atom feeds like his and Tim Bray’s. (Neither does my private SimplePie-based branch of Gregarius, but that’s another story.)
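To show what normalizing relative links amounts to, here is a minimal Python sketch that resolves each link against a base URL. It is not how Venus, Gregarius, MagpieRSS, or SimplePie actually do it; the function name and the example.org URLs are hypothetical, chosen only for illustration.

```python
from urllib.parse import urljoin

def absolutize_links(base_url, hrefs):
    """Resolve each href against base_url; absolute hrefs pass through unchanged."""
    return [urljoin(base_url, href) for href in hrefs]

# Hypothetical entry URL and links, for illustration only.
base = "http://example.org/blog/2006/entry"
links = ["comments", "../images/pic.png", "/about", "http://example.com/x"]

for resolved in absolutize_links(base, links):
    print(resolved)
```

An aggregator that skips this step ends up resolving relative links against its own address rather than the feed’s, which is exactly how they get munged.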