Robot blocking microformat

How about I rename my robot-blocking profile as a microformat and see if that gets any discussion going?

And while we’re at it, can someone please explain for the record why microformats aren’t encouraged to specify HTML profiles (think namespaces for metadata) to formalize the idea that “User agents may… perform some activity based on known conventions for that profile”? “Because a profile only applies to the document’s HEAD element” is not a valid argument.

5 thoughts on “Robot blocking microformat”

  1. Hi Peter,

    Thanks for bringing up your robot-blocking profile again. I’ve added it to the section on “microformats experiments and first thoughts” on the microformats wiki page. I think that is a great approach to the problem of sectioning off parts of a page for robots to exclude. One possible improvement to consider is to base the names and semantics of the new class names on an already established schema, like the Robots Exclusion standard. This would be in line with the microformats principle of “reusing the schema (names, objects, properties, values, types, hierarchies, constraints) as much as possible from pre-existing, established, well-supported standards by reference.”

    E.g. instead of “ignore-content”, how about: “robots-noindex” which reuses both the literal meta name “robots” and the ‘content’ value of “noindex” from the Robots Exclusion standard. You could even point to the Robots Exclusion standard for the semantics, which in this case would apply to the element with the class name rather than the whole document. By leveraging the same processing model, you would probably also make it easier for current Robots Exclusion implementations to adopt your microformat.

    Similarly “ignore-links” could be “robots-nofollow”

    And additionally, if you wanted to carve out a robots-safe section inside a “robots-noindex” section or “robots-nofollow” section, you could define the respective opposites “robots-index” and “robots-follow”, again, leveraging the semantics defined in the well established Robots Exclusion standard.

    Finally, you’re totally right about microformats and HTML profiles. All microformats should specify respective HTML profiles that define the class names, rel values, and meta names that are introduced. Some (most?) microformats already have XMDP profiles, and folks are working hard on writing such profiles for the other microformats as well.

    Thanks again for your efforts, and I’m looking forward to seeing what you do with your robot blocking microformat.

    Tantek

  2. Thanks for the comments, Tantek. You’re absolutely right (of course) about reusing the Robots Exclusion names, and I like the idea about nesting safe (or unsafe) areas. I’ll update the page and make it a little more spec-like.

    Regarding XMDP profiles for microformats, I’ve seen mention in the text of various working specs—for example, the profile of hCalendar that is used/referred to in the ‘profile’ attribute of the <head> element—but very few actually show the use of the profile in their examples or give any sort of indication of what the profile URL is. Perhaps the PURL service would be appropriate for those that don’t have homes of their own?

    Finally (for now) presuming this does turn into something relatively well-specified, what is the next step?

  3. Rereading the comments above, I think I misspoke: instead of profile I mean profile URL, i.e. a URL that defines the location of the XMDP profile. That allows pages to specify exactly which profiles are to be used. (Perhaps more importantly, the omission of a profile indicates which are not to be used for one reason or another. For example, the summary class in hCalendar is very likely to overlap a user-defined class.)
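
To make the thread's proposal concrete, here is a minimal sketch of how a robot might honor the class names suggested above (“robots-noindex”, “robots-nofollow”, and their opposites “robots-index” and “robots-follow”), including the nested carve-outs and the idea that the conventions only apply when the page declares the profile URL. It is an illustration under stated assumptions, not a spec: the profile URL, the `RobotsClassParser` class, and the `extract` helper are all hypothetical names, and the parser assumes well-formed markup in which every non-void element is explicitly closed.

```python
from html.parser import HTMLParser

# Hypothetical profile URL; the discussion above does not name one.
PROFILE_URL = "http://example.org/profiles/robots-blocking"

# Elements that never have an end tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class RobotsClassParser(HTMLParser):
    """Collects indexable text and followable links, honoring the class names."""

    def __init__(self, profile_url=None):
        super().__init__()
        self.profile_url = profile_url
        # With no required profile, the class conventions always apply.
        self.enabled = profile_url is None
        # One (index_allowed, follow_allowed) pair per open element.
        self.stack = [(True, True)]
        self.text = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "head" and self.profile_url is not None:
            # The HTML 4 profile attribute is a whitespace-separated URI list.
            declared = (attrs.get("profile") or "").split()
            self.enabled = self.profile_url in declared

        index, follow = self.stack[-1]  # inherit from the parent element
        if self.enabled:
            classes = (attrs.get("class") or "").split()
            if "robots-noindex" in classes:
                index = False
            if "robots-index" in classes:
                index = True
            if "robots-nofollow" in classes:
                follow = False
            if "robots-follow" in classes:
                follow = True

        if tag == "a" and follow and attrs.get("href"):
            self.links.append(attrs["href"])
        if tag not in VOID_TAGS:
            self.stack.append((index, follow))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack[-1][0] and data.strip():
            self.text.append(data.strip())


def extract(html, profile_url=None):
    """Return (indexable_text_chunks, followable_links) for an HTML string."""
    parser = RobotsClassParser(profile_url)
    parser.feed(html)
    parser.close()
    return parser.text, parser.links


BODY = ('<div class="robots-noindex robots-nofollow">secret <a href="/a">A</a>'
        '<span class="robots-index robots-follow">public <a href="/b">B</a></span>'
        '</div><p>open <a href="/c">C</a></p>')

text, links = extract(BODY)
print(text, links)
```

Calling `extract(BODY, PROFILE_URL)` on the same markup would treat the class names as ordinary user-defined classes, because the page declares no matching `profile` on its `<head>`. That is the overlap problem raised in the last comment: a robot that requires the profile URL does not misread a page that happens to use the same class names for something else.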
