Petroglyphs – Page 46 – ('petrə-, properly combining form of proper name Peter; -glIf, to cut out, carve.)

Silly result set 1

The first results of the experiment are in, and they’d seem to indicate that Google places no importance at all on heading elements relative to any other text in a page.

In order, the current results (with duplicate hits included) are:

embedded h2
full content of a normal paragraph
embedded h1
normal text embedded in a paragraph
split between h1 and h2

Note, however, that the missed h1 test–probably the most relevant of any of them–is not yet included. I don’t hold high hopes that it will fare any better, though.
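
As a rough way to put a number on how far that list is from the predicted order, here is a small Python sketch that counts how many pairs of pages came out in the order the heading hypothesis would suggest. The short labels are shorthand invented for this example, the predicted order assumes headings outrank paragraph text, and the unlinked top-of-page h1 test is left out because there is no result for it yet.

```python
from itertools import combinations

# Predicted order if headings carried extra weight (labels are illustrative shorthand).
PREDICTED = ["h1 within", "h2 within", "split h1/h2", "full paragraph", "text in paragraph"]

# Order reported in the results list above.
OBSERVED = ["h2 within", "full paragraph", "h1 within", "text in paragraph", "split h1/h2"]


def pair_agreement(predicted, observed):
    """Return (concordant, discordant) pair counts between two orderings."""
    rank = {name: i for i, name in enumerate(predicted)}
    concordant = discordant = 0
    for first, second in combinations(observed, 2):
        if rank[first] < rank[second]:
            concordant += 1
        else:
            discordant += 1
    return concordant, discordant


def kendall_tau(predicted, observed):
    """Kendall's tau for two tie-free orderings: 1 is identical, -1 is reversed."""
    concordant, discordant = pair_agreement(predicted, observed)
    return (concordant - discordant) / (concordant + discordant)


if __name__ == "__main__":
    print(pair_agreement(PREDICTED, OBSERVED))
    print(kendall_tau(PREDICTED, OBSERVED))
```

With only five pages this is weak evidence either way, but 6 of the 10 pairs landing in the predicted order is not far from what a shuffle would give.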

I’ll leave this to age for a few more days and see if things change at all, then try a new iteration. Suggestions are welcome, as always.

Via Eric Meyer I see that XFN 1.1 has been released. I’ve updated my stylesheet accordingly; it’s quite a bit bigger because I chose to duplicate the 1.0 selectors rather than switch to CSS3 *= syntax. As noted in the original post, it’s largely theoretical: most people can’t use the CSS3 rules, and both CSS2 and CSS3 are made redundant by the last rule that overrides everything above.
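
For anyone wondering why the choice of selector matters, here is a rough Python sketch of the difference between the CSS2 ~= match (a whole whitespace-separated word) and the CSS3 *= match (any substring). The function names and the sample rel values are made up for illustration; they are not lifted from the stylesheet.

```python
def matches_word(attr_value: str, word: str) -> bool:
    """Rough equivalent of the CSS2 selector [rel~="word"]."""
    # An empty word, or one containing whitespace, can never equal a single token.
    return word in attr_value.split()


def matches_substring(attr_value: str, fragment: str) -> bool:
    """Rough equivalent of the CSS3 selector [rel*="fragment"]."""
    # An empty fragment matches nothing.
    return fragment != "" and fragment in attr_value


if __name__ == "__main__":
    for rel in ("friend met colleague", "me"):
        print(rel, matches_word(rel, "me"), matches_substring(rel, "me"))
```

The substring form is the looser of the two: a rule aimed at "me" would also catch a link marked "met", which is one reason to prefer word matching for rel values.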

Anyway, that’s not what this is about. What this is about is a bit of geek humour using the rel attribute.

All of the above are legal… that is, with respect to the specification! That’s not to say there aren’t some illegal combinations that you wouldn’t necessarily want to come across either.

Stuart Smalley
I’m my own grandpa
Eve
Also Eve
a master debater, if you know what I mean
Shirley Maclaine

Comments are open… you know what to do!

Another one bites the dust

Hot on the heels of flaming Mac death, my trusty rusty firewall/mailserver died after a power outage this morning. (Yes, it was hooked up to the UPS. Yes, it shut down cleanly. Yes, when the power had been out for almost an hour I had a flashback to last August 14.) I’m sure it’s just another dead power supply, but again it’s hardly worth replacing, this time due to the age of the machine (hint: it’s a Pentium 120). Fortunately I was able to salvage the hard drive and network cards and install them in one of my other boxes, so the overall downtime was fairly minimal.

Ironically, I’ve been meaning to upgrade the dead box to a recent Linux kernel so I could use some of the more advanced firewalling features; the replacement already had 95% of what I needed, so I actually managed to save myself a few hours.

Silly expert experiment

Can anyone confirm that engines like Google actually make use of heading elements in determining page rank? I’m looking for a link to actual results demonstrating the effect of headings on Google’s ranking of a page; if you have one handy, kindly drop it to me via e-mail. I’d just like to know one way or the other.

No existing results here, but it sounds like a fun experiment along the lines of nigritude ultramarine. So here’s what I’ve done: I’ve created a three-word term that’s not currently found in Google (and which I won’t include here so as to not skew the results). It’s embedded in different ways in six randomly-named and titled XHTML 1.0 Strict files that contain only Lipsum text:

as an <h1> at the top of a page
as an <h1> within a page
as an <h2> within a page
split between <h1> and <h2> elements
as the entire contents of a <p>
embedded in the middle of a paragraph as normal text

The pages are linked in random order from a single page. If heading elements do help to determine page rank, one would expect the pages to be ranked in the order I’ve listed them above.

Although I’ve tried to reduce bias as much as possible in the test, it’s not exhaustive and hardly scientific. I’m open to any suggestions on how to improve the method.

(In case you’re wondering, the phrase was chosen by taking words from webpages I had open at the time and adding a descriptive adjective. J. is a fellow Lenni Jabour fan and the host of a radio program in Toronto, and M. is the name of a show being put on by former Lenni cohort Andrew Downing.)

Bleah. While checking to see if Google had crawled the pages–it has–I discovered that I forgot to link the very first item in the list above. I’ve added it now and re-requested a crawl, but it may skew the results. (The term doesn’t appear in search results as of yet, so I have some hope it may not matter.)

More on attributes

Not much to say, just a follow-up to my earlier post about comma-separated values to point out that Tantek found that even the W3C itself has trouble with them: the CSS Validator incorrectly labels [a Technorati page] as invalid due to the validator’s failure to parse the perfectly valid media attribute value screen,projection.
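
For the record, here is roughly how a consumer might handle such a value, sketched in Python. It follows one reading of the HTML 4.01 rule for media descriptors (split on commas, then cut each entry off at the first character that isn’t an ASCII letter, digit or hyphen). The function name is hypothetical, and this is not a description of how the CSS Validator is actually implemented.

```python
import re

# Leading run of ASCII letters, digits and hyphens; anything after it is dropped.
_DESCRIPTOR = re.compile(r"[A-Za-z0-9-]*")


def parse_media(value: str) -> list[str]:
    """Split a media attribute value into its media descriptors."""
    descriptors = []
    for entry in value.split(","):
        # The pattern can match the empty string, so match() never returns None.
        name = _DESCRIPTOR.match(entry.strip()).group(0)
        if name:
            descriptors.append(name)
    return descriptors


if __name__ == "__main__":
    print(parse_media("screen,projection"))
```

Under those assumptions, screen,projection comes out as two descriptors with or without a space after the comma.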

Blockquote citations

One of the (apparently) little-known attributes of HTML’s blockquote element is the cite attribute:

cite = uri [CT]

The value of this attribute is a URI that designates a source document or message. This attribute is intended to give information about the source from which the quotation was borrowed.

I’ve used it for quite a while here and recently added an XHTML-compatible version of Dunstan Orchard’s citation extractor that makes the information stored there actually useful. Unfortunately, few of the blog crawlers (notably Technorati, if only for the connotation of its name) recognize that the cite attributes are links just as much as href attributes are. I’d always wondered why my links never showed up in the cosmos for articles I’d commented on, and now I guess I know.
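
To show how little it would take for a crawler to treat cite as a link, here is a minimal Python sketch using the standard library’s html.parser. The class name and the sample markup are invented for the example, and it says nothing about how Technorati or Dunstan’s extractor actually work.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect URIs from <a href> and from the cite attribute of <blockquote> and <q>."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands over lowercased tag and attribute names.
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self.links.append(("href", attributes["href"]))
        elif tag in ("blockquote", "q") and attributes.get("cite"):
            self.links.append(("cite", attributes["cite"]))


if __name__ == "__main__":
    sample = (
        '<p>See <a href="http://example.com/post">this post</a>.</p>'
        '<blockquote cite="http://example.org/source"><p>Quoted text.</p></blockquote>'
    )
    collector = LinkCollector()
    collector.feed(sample)
    print(collector.links)
```

Tagging each URI with the attribute it came from lets whatever consumes the list decide whether a citation should count the same as an ordinary link.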

Irony

The Java Technology Concept Map 1.0 is an interactive diagram, a web of linked terms, to show the relationships among and uses of the Java technologies.

…that’s written in Flash.

Why can’t Jonny spell?

IBM and HP both hopped onto the social movement called linux.

…the world changes: linux enters the product portfolio…

…the vast majority of enterprise datacenter deployments are now occurring on Red Hat’s linux.

…SuSe has added in an application server to their linux distribution…

…IBM [needs] to defend its increasingly curious linux strategy.

…the GNU linux kernel…

etc. Why won’t jonathan schwartz capitalize Linux? Is he worried about the trademark? He doesn’t seem to have a problem capitalizing Red Hat, IBM, GNU, SuSe, Microsoft, Sun (obviously) and even the name of the Linux World conference and SuSe’s Enterprise Linux, but for some reason he’s morbidly averse to using the appropriate form of the name of the kernel that’s one of the biggest competitors to Solaris (another correctly-capitalized name).

From his writings on his weblog, I don’t think schwartz is a particularly petty individual, but I can’t determine any motivation for his continual, apparently deliberate misuse of the name.

I’m waiting for an explanation, jonathan. (Not that I have any expectation that I’ll get one… I’m just [counting on fingers] one person–who happens to be a Java proponent, by the way–with a question.)

The awful truth

George W. Bush and his administration have taken “normal” mendacity to a startling new level far beyond lies of convenience. On top of the usual massaging of public perception, they traffic in big lies, indulge in any number of symptomatic small lies, and, ultimately, have come to embody dishonesty itself. They are a lie. And people, finally, have started catching on.

None of this, needless to say, guarantees Bush a one-term presidency.

(via Tim Bray)

Extensions

I’ve set up a category for software extensions that I’ve written or want to comment on. It’s currently targeted at Firefox and Thunderbird. Share and enjoy.