Spam and bots

Two days ago I wrote about spam (my first post in a long time).

This morning, less than 48 hours after posting the above, I have 100 spam comments in my filter. Woohoo! The odd thing is that practically all of the spam is comments on old or very old posts. It is always like that. Somehow posting twice after a hiatus of more than two months got the spammers working. On old posts. Hmmm... This made me check the statistics of this site. I use awstats provided by my host. Old school, eh? 😉

Most days there are bot visits. The past two days there has been an increase, including Googlebot early this morning, its first visit since March 31. Does Google have a sort of update bot that just checks for updates? An hour before Googlebot's visit I had a call from Feedfetcher-Google. It seems someone (Google?) uses Google as a feed reader. Looking back over the past few months, there is always a bunch of bots on the last day of the month. Not all identified bots favor the last day of the month, but Google, Yahoo, WordPress(!), Baidu and Yandex do. I presume they do not just do random botting (following links) but check whether there have been updates to the sites on their (gigantic) "lists". There also seems to be a burst just the past two days, from search engines and from "unknown bots", which are about equal in number. The dates of visits seem even less random for unidentified bots than for identified bots, if I count the unknown ones as a collective. Some of them might be very random, of course. I wonder if any of the unknown bots, i.e. those that do not identify themselves, have "lists", too? The sudden increase in spam seems to indicate that. Unless they somehow have access to the "lists" of Google et al.?

I am mostly guessing here. I do not know very much about web spiders/bots/crawlers, and definitely not how spammers do their data collection. They do have huge collections of email addresses, but those cannot be crawled as easily, and are mostly collected by criminal means, I assume.

I have gleaned the frequency and types of bots from the monthly statistics for this year. My interpretation of the tendencies above comes from reading those statistics. I have not done any statistical analysis correlating dates, bots, types of bots etc. But the stats are quite easy to read, and I do not think it is all in my head :p
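For anyone who wants to go beyond eyeballing, here is a rough sketch of what such a tally could look like in Python. It rests on assumptions that are not from this site: that the raw access log is available in the common "combined" format, and that bots can be spotted by a substring of their user agent. The names `KNOWN_BOTS`, `classify`, `visits_by_day` and `is_last_day_of_month` are made up for illustration, and the bot list is far from complete. Bots that do not identify themselves would not be caught this way at all.

```python
import calendar
import re
from collections import Counter
from datetime import datetime

# Assumption: the log is in "combined" format, e.g.
# 1.2.3.4 - - [30/Apr/2010:05:12:01 +0000] "GET / HTTP/1.1" 200 512 "-" "agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Illustrative user-agent substrings only, not a complete list of crawlers.
KNOWN_BOTS = ("Googlebot", "Feedfetcher-Google", "Baiduspider", "YandexBot")


def classify(agent):
    """Return the first known bot name found in the user agent, else None."""
    lowered = agent.lower()
    for name in KNOWN_BOTS:
        if name.lower() in lowered:
            return name
    return None


def is_last_day_of_month(day):
    """True if the given date is the final day of its month."""
    return day.day == calendar.monthrange(day.year, day.month)[1]


def visits_by_day(lines):
    """Count visits per (bot name, date) for self-identified bots."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        bot = classify(match.group("agent"))
        if bot is None:
            continue  # ordinary visitor or unidentified bot
        when = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        counts[(bot, when.date())] += 1
    return counts
```

Feeding the lines of a log file to `visits_by_day` and then checking each date with `is_last_day_of_month` would show whether the end-of-month pattern holds up in numbers.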

Comments from knowledgeable people, or those less logically challenged than me, are welcome 🙂