Robot blocking microformat

How about I rename my robot-blocking profile as a microformat and see if that gets any discussion going?

And while we’re at it, can someone please explain for the record why microformats aren’t encouraged to specify HTML profiles (think namespaces for metadata) to formalize the idea that “User agents may… perform some activity based on known conventions for that profile”? “Because a profile only applies to the document’s HEAD element” is not a valid argument.

Nonsense

[London Knights hockey team] governor Trevor Whiffen… said fear of cold weather prompted organizers to cancel the May 19 opening parade through downtown. It has been replaced with a paid ticket opening ceremony….

Let me get this straight. The parade—a free, public event—for a hockey tournament—which, for those unaware, is a winter sport, played on ice, both outdoors in cold weather and indoors in refrigerated buildings—that was to occur in mid-May—when the outdoor temperature averages 13 degrees Celsius and a low of 7 degrees—in other words, temperatures suited to a light jacket—is being cancelled in favour of a paid-ticket event because it might be too cold‽ Is it possible, just maybe, that the organizers saw a way to squeeze more money out of the people for whom they’re purportedly putting on this event and are trying to spin things to their benefit?

No, of course not: we’re assured that there are enough activities that will involve the community. And, of course, the event is injecting an estimated $10 million into the city’s economy.

Or not.

In its bid, the Knights outlined plans to create a virtual beer garden involving 15 downtown bars. The beer garden will [now] take place in the parking lot of the John Labatt Centre…. We want to keep the crowds centralized in the immediate John Labatt Centre area, he said. We thought it would be better on balance to create the festival area in the immediate vicinity of the JLC and not lose that opportunity.

Wouldn’t it be a shame to lose that opportunity to isolate people so they won’t wander off, as far as 5 blocks away, and forget about the hockey that they’ve come from across Canada to see? To lose that opportunity for the event organizers to keep as much profit for themselves as they can at the expense of a downtown that’s already suffering financially? To lose that opportunity to show that the CHL and its teams are in any way different from the NHL in their approach to the supporters of the product they produce?

Yup, that would be a real shame. Why, it would be almost as bad as lying to the community—which overwhelmingly supported the competition for the tournament in the first place—about the purely avaricious reasons for which the events in that community are being cancelled.

Apparently I’m not the only one who had these thoughts.

Backlog

I’ve got almost 20 partially-finished posts cluttering up my admin page that date back almost two and a half years; in several cases I’ve started a post and then lost the train of thought that initiated it, but mostly I’ve just wanted to add another paragraph or two before hitting Publish.

Well, no more. Starting with Comedy Jazz, I’m going to try to clear out the old posts, either by finishing them off or by deleting them. As a bit of self-prodding, here are some of the titles of items I want to finish soon: Enterprise blogging, Promised the Moon, Reach for the Top, Last Tango in Nia, and Oh! The places you’ll go.

Comedy jazz

It’s an accepted, and expected, tradition in jazz that musicians play songs written by and for those that inspired them. In the jazz lexicon, the most influential of those songs are called standards. Everybody knows them. Every jazz musician has played a standard at some point in his or her career; some reinterpret them to such an extent that they’re unrecognizable, others use them as jumping-off points for their own work, and still others play them as they were played the very first time and always will be played. It’s a sign of respect, of knowing the history of what’s come before.

The philosophy of comedy, on the other hand, seems to be exactly the opposite: a comedian who performs another’s jokes is looked upon as a cad and a thief. You don’t hear up-and-comers doing George Burns jokes unless they’re impressionists. Sure, the homages are there to some small extent: even a casual listener knows Abbott and Costello’s Who’s On First? (I don’t know. Third base!) and has heard more bad imitations of it than they’d care to. By and large, though, comedy routines are performed only by the individual or group who did them in the first place. Bob Newhart is The Driving Instructor; Bill Cosby is Noah; there’s only one wild and crazy guy, and that’s Steve Martin.

It’s too bad this is the case; I think it would be hilarious to hear comedians performing others’ jokes, especially established ones. They can be similar or different in their own styles: Woody Allen doing Bob Newhart would be just as funny, I think, as Robin Williams doing Tom Lehrer. Picture Jerry Seinfeld doing Steven Wright’s material and vice versa; how about George Carlin and Denis Leary in the Pythons’ Nudge Nudge?

I’ve considered and rejected quite a few theories, and I just don’t get it. Anyone else care to wager a guess?

It was twenty years ago today

The smalltown weekly newspaper at home regularly prints a set of excerpts from items it published 5, 10, 20 and 50 years ago. My parents forwarded me an item from this week’s list that reminded me just how old I’m getting, how long it’s been since I was in touch with people from that period, and (surprisingly) how little certain aspects of my life have changed:

Twenty years ago

Warwick Central School won the Super Quiz competition in Lambton County. Members of the team were Peter Agocz, Susan Hoeksema, Scott Burchill, Jonathan Craig, Bob Bork, Peter Janes and Jeff Frayne. Their coach was Mrs. Frances O’Neil.

My sister was also mentioned in the same set of excerpts, as the winner of a public speaking competition.

Those citations pale in comparison to my mother’s, though: she beat us both by being named on the front page no fewer than four times.

Google off

Google on! Google off! Google on, Google off… the Googler!

But seriously.

ckaminski of the Web Standards Project recently highlighted the redesign of AT&T’s home page and developer Joe D’Andrea’s informative discussion of the project. One thing D’Andrea doesn’t mention, however, is the presence of the two comment-style directives <!--googleoff: index--> and <!--googleon: index-->.

Question answered: “Those directives are not used by Google proper but rather our own slice-o-Google*, a mighty spiffy search appliance.” Thanks Joe! I wonder why this is only an option on the Google Appliance and isn’t turned on web-wide?

I’ve seen googleoff/on on the aforementioned AT&T site and on several CBC news items (which also use the argument all). It’s also visible on a number of disparate sites because they have improperly-formatted comments; however, I believe this indicates that there are many more sites that use it properly.

I wrote about the benefits of a content-level robot exclusion scheme in August. googleoff/on is quite similar in concept, although it’s outside of the markup and thus requires a separate parser for XML-based content. (On the tag-soup web, though, it’s just another special case.) There’s also some comparison to be made to rel="nofollow"; it’s interesting to note that googleoff/on: follow could be used to accomplish exactly the same thing.

Whether it works or not is another question. I’m inclined at the moment to say it doesn’t: a search for the CBC article above including a term that is only present in the googleoff: index or googleoff: all blocks still finds the page, while a control request that includes a word not in the page at all returns no hits.

Still, it’s worth an experiment, so in the hopes that this will someday make its way to the web at large, I’ve added the directives to the comment forms, navigation links, and other non-content sections of my weblog pages. Currently a search for occurrences of petroglyphs wordpress on this domain returns only 3 results, and it’s my hope that that number won’t increase as Google respiders all the pages that now announce the presence of WordPress.

If by some chance the content exclusion works, or is ever turned on, it might be interesting to try several other tests. The index argument is a keyword from the robots meta tag, so perhaps follow would also work. (I’m presuming that all takes the place of index,follow and that the mere presence of googleoff implies the no prefix to those arguments.) Also, because the comments are external to the markup, it should be possible to nest or otherwise intertwine them.
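
For what it’s worth, here is a minimal Python sketch of how an indexer might honour the directives. Everything in it is my own guesswork rather than Google’s documented behaviour: the function name visible_text is made up, each argument is tracked independently (which is what lets blocks intertwine), and all is assumed to switch every argument off or on at once.

```python
import re

# Matches comment-style directives such as <!--googleoff: index-->.
DIRECTIVE = re.compile(r"<!--\s*google(off|on):\s*(\w+)\s*-->")


def visible_text(html, argument="index"):
    """Return html with the spans excluded for `argument` removed.

    Assumptions (not confirmed by any Google documentation):
    each argument is tracked on its own, `googleoff: all` hides
    text for every argument, and `googleon: all` re-enables all.
    """
    off = set()   # arguments currently switched off
    kept = []     # pieces of text that remain visible
    pos = 0
    for match in DIRECTIVE.finditer(html):
        # Keep the text before this directive if nothing hides it.
        if argument not in off and "all" not in off:
            kept.append(html[pos:match.start()])
        state, name = match.group(1), match.group(2)
        if state == "off":
            off.add(name)
        elif name == "all":
            off.clear()
        else:
            off.discard(name)
        pos = match.end()
    # Trailing text after the last directive.
    if argument not in off and "all" not in off:
        kept.append(html[pos:])
    return "".join(kept)
```

Because the state is per argument, a googleoff: follow block that starts inside a googleoff: index block and ends outside it is handled without any special casing.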

rel="nofollow" broken

A few months ago, Google and several other search and aggregation companies introduced rel="nofollow" tagging. Rather than rehash the arguments over that, I’ll simply point to Lachlan Hunt’s cogent analysis of nofollow and add one point: the nofollow relationship should have been defined in a metadata profile as an additional link type.

Despite the implementation, at least Google et al are well-intentioned: comment spam is harmful and needs to be stopped. Only a few attempts have ever made it through my various blocking mechanisms and appeared on my weblog, but the bandwidth the spammers eat up trying to find pages they can exploit is double or triple that of the legitimate users of this site. Perhaps if there were a way to prevent them from finding comment forms in the first place….

Blocking WordPress wp-content listings

This Google search for wp-content directory listings should be of interest and concern to all of those folks who have recently set up a WordPress weblog. In short, it shows that everything in those directories—themes, plugins, images, whatever—is accessible from a single common point of reference. If this worries you, you might want to limit access: for those with Apache just add:

Options -Indexes

to a .htaccess file in that directory. Other directories from a standard install that may be of similar concern are wp-images and wp-includes; both may be restricted in the same fashion. The standard wp-admin directory includes an index.php file that will generally be used in place of a generated index, but any subdirectories will be open to the public so it might not be a bad idea to block it too. If you don’t want to worry about every individual subdirectory that might appear, add the line above to a .htaccess file in your main WP directory.
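
If there are a lot of directories to cover, the same change can be scripted. This is only a sketch under assumptions: the helper name ensure_no_indexes is hypothetical, it assumes your Apache configuration allows Options to be set from .htaccess files (AllowOverride), and it appends the line only if it isn’t already there.

```python
from pathlib import Path

# The line recommended above for switching off generated listings.
NO_INDEXES = "Options -Indexes"


def ensure_no_indexes(directory):
    """Add NO_INDEXES to directory/.htaccess unless already present.

    Returns True if the file was created or changed, False otherwise.
    """
    htaccess = Path(directory) / ".htaccess"
    existing = htaccess.read_text() if htaccess.exists() else ""
    if any(line.strip() == NO_INDEXES for line in existing.splitlines()):
        return False
    # Preserve whatever was in the file and end it with a newline.
    if existing and not existing.endswith("\n"):
        existing += "\n"
    htaccess.write_text(existing + NO_INDEXES + "\n")
    return True
```

Calling ensure_no_indexes("wp-content") from the main WP directory would cover that directory and, since Apache applies .htaccess settings to subdirectories, everything beneath it.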