The Grub Robots.txt problem

A number of accesses were made to a script called "display-person" on the counter. This script displays information about a person, including his or her email address.
Because this is sensitive data (spammers NOT welcome), it is protected in various ways. But these accesses seemed to be coming from all over the place, not just from one site or subnet (that has happened before).
A frantic hour of log investigation found the culprit: the Grub project, a distributed indexing program modelled on that earliest of spiders, the Harvest project - allowing a multitude of spiders to collect data and share the resulting index.
Unfortunately, the project did not respect robots.txt, which plainly warned all such spidering efforts against accessing this particular URL. Investigation of the Grub site revealed that the project is quite aware that there is a problem, but is waffling, claiming that it's already solved, or doesn't need to be solved.
Nonsense.
The Grub project is breaking the rules, and they know it.
So far, there have been more than 20,000 Grub accesses to the counter overall, and more than 3,000 of them definitely violated the robots.txt rules in the 15 hours since I started counting the rule-breaking accesses.
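For readers who want to run the same kind of count on their own logs, here is a minimal sketch in Python. It is not the script used on the counter. The robots.txt rule, the URL path, the user-agent strings and the log records are all made-up placeholders. The only real API it relies on is the standard library's urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the counter's real rules are not quoted here.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/display-person
"""

# Hypothetical, already-parsed log records: (user agent, requested path).
LOG = [
    ("grub-client", "/cgi-bin/display-person?id=1"),
    ("grub-client", "/index.html"),
    ("grub-client", "/cgi-bin/display-person?id=2"),
    ("Mozilla/4.0", "/cgi-bin/display-person?id=3"),
]


def count_violations(robots_txt, log, agent_substring):
    """Return (total accesses, robots.txt violations) for matching agents."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    total = 0
    violations = 0
    for agent, path in log:
        # Only look at requests whose user agent contains the substring.
        if agent_substring.lower() not in agent.lower():
            continue
        total += 1
        # can_fetch() is False when the rules forbid this agent the path.
        if not parser.can_fetch(agent, path):
            violations += 1
    return total, violations


total, violations = count_violations(ROBOTS_TXT, LOG, "grub")
print(f"{total} accesses, {violations} in violation of robots.txt")
```

Matching on a substring of the user-agent header is a simplification. A real count would also have to parse the server's actual log format, which this sketch skips.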
Unusually for an open source project, the web site offers zero (none, nada) email addresses to complain to - and I have no desire to become a participant in their forum system; I have enough of those.
You can follow the progress of the various attempts to scan the counter at the Malus status page. When the Grub count stops increasing, and the date of last access stops advancing, I'll believe that they've fixed the bug.
Until then - don't expect me to believe them.
UPDATE: on April 24, at 17:03 GMT, the grubbing seems to have stopped.
UPDATE: the respite was temporary. On May 3, at 3 in the morning, the grub returned to its erroneous ways.
UPDATE (Jan 2004): After May, the problem mostly disappeared. The last hit from Grub was seen on September 9, 2003. Apparently, the problem has now been fixed permanently.
Harald Alvestrand, for the Linux Counter Project