Google on! Google off! Google on, Google off… the Googler!
But seriously.
ckaminski of the Web Standards Project recently highlighted the redesign of AT&T’s home page and developer Joe D’Andrea’s informative discussion of the project. One thing D’Andrea doesn’t mention, however, is the presence of the two comment-style directives <!--googleoff: index--> and <!--googleon: index-->.
Question answered: Those directives are not used by Google proper but rather our own slice-o-Google*, a mighty spiffy search appliance.
Thanks Joe! Wonder why this is only an option on the Google Appliance and isn’t turned on web-wide?
I’ve seen googleoff/on on the aforementioned AT&T site and on several CBC news items (which also use the argument all). It’s also visible on a number of disparate sites because their comments are improperly formatted; I take this as a sign that many more sites use it properly.
I wrote about the benefits of a content-level robot exclusion scheme in August. googleoff/on is quite similar in concept, although it’s outside of the markup and thus requires a separate parser for XML-based content. (On the tag-soup web, though, it’s just another special case.) There’s also some comparison to be made to rel="nofollow"; it’s interesting to note that googleoff/on: follow could be used to accomplish exactly the same thing.
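To make the "separate parser" point concrete, here is a minimal Python sketch of how an indexer might drop the text between the two comments before indexing a page. The helper name strip_unindexed and the regular expression are my own assumptions for illustration; nothing here reflects how Google's appliance actually does it.

```python
import re

# Matches a googleoff: index comment and everything up to the next
# googleon: index comment, or to the end of input if the region is never closed.
_OFF_REGION = re.compile(
    r"<!--\s*googleoff:\s*index\s*-->.*?(?:<!--\s*googleon:\s*index\s*-->|\Z)",
    re.DOTALL | re.IGNORECASE,
)


def strip_unindexed(html: str) -> str:
    """Return html with googleoff/googleon: index regions removed (hypothetical helper)."""
    return _OFF_REGION.sub("", html)


page = (
    "<p>Article text</p>"
    "<!--googleoff: index--><ul><li>Powered by WordPress</li></ul><!--googleon: index-->"
    "<p>More article text</p>"
)
print(strip_unindexed(page))
```

Because the directives are plain comments, the filter works on the raw text and never needs to understand the markup around them, which is exactly why an XML toolchain would need this extra pass.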
Whether it works or not is another question. I’m inclined at the moment to say it doesn’t: a search for the CBC article above including a term that is only present in the googleoff: index or googleoff: all blocks still finds the page, while a control request that includes a word not in the page at all returns no hits.
Still, it’s worth an experiment, so in the hopes that this will someday make its way to the web at large, I’ve added the directives to the comment forms, navigation links, and other non-content sections of my weblog pages. Currently a search for occurrences of petroglyphs wordpress on this domain returns only 3 results, and it’s my hope that that number won’t increase as Google respiders all the pages that now announce the presence of WordPress.
If by some chance the content exclusion is ever turned on, it might be interesting to try several other tests. The index argument is a keyword from the robots meta tag, so perhaps follow would also work. (I’m presuming that all takes the place of index,follow and that the mere presence of googleoff implies the no prefix to those arguments.) Also, because the comments are external to the markup, it should be possible to nest or otherwise intertwine them.
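The intertwining idea can be sketched as a small state machine. The Python below is only a guess at the semantics presumed above: each argument (index, follow) gets its own on/off flag, all flips both, and because the flags are independent the regions can overlap freely. The function annotate and its output format are hypothetical, made up for this illustration.

```python
import re

# Finds googleoff/googleon comments, capturing the switch and its argument.
_DIRECTIVE = re.compile(
    r"<!--\s*google(off|on):\s*(index|follow|all)\s*-->", re.IGNORECASE
)


def annotate(html: str) -> list[tuple[str, bool, bool]]:
    """Return (text, index_allowed, follow_allowed) for each run of text (hypothetical)."""
    state = {"index": True, "follow": True}
    runs = []
    pos = 0
    for match in _DIRECTIVE.finditer(html):
        text = html[pos:match.start()]
        if text:
            runs.append((text, state["index"], state["follow"]))
        switch, arg = match.group(1).lower(), match.group(2).lower()
        # Assumption: "all" stands for both index and follow.
        targets = ("index", "follow") if arg == "all" else (arg,)
        for name in targets:
            state[name] = switch == "on"
        pos = match.end()
    if html[pos:]:
        runs.append((html[pos:], state["index"], state["follow"]))
    return runs


sample = (
    "A<!--googleoff: index-->B<!--googleoff: follow-->C"
    "<!--googleon: index-->D<!--googleon: follow-->E"
)
for run in annotate(sample):
    print(run)
```

In the sample, the index region and the follow region overlap without either enclosing the other, something element-based markup could not express.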
As a matter of fact, that’s where I first saw it used – on CBC’s site! Their GSA wrangler-in-residence, Blake Crosby, has published some very useful scripts (for folks with a Google Search Appliance, at any rate).
Why not support this web-wide? Good question. Perhaps Google’s looking for a better way to eliminate site-wide redundancies automagically, without requiring changes to the markup (or any other type of file for that matter).