Server Stats Reveal Data Mining
It’s amazing what you can learn by poking around in your web site’s server stats. I use AWStats to analyze server log statistics for ForumPoint.com. I like to peek at the “keyphrases used on search engines” statistics to see how people are getting to the site. AWStats shows you the top phrases that people searched for on Google, Yahoo, MSN, etc. This snapshot tells a story. Notice the phrase at the bottom:
contact or mail or email or phone or fax or tel site www.forumpoint.com
It looks as if someone has a massive database of domain names and is running automated search engine queries against each one to mine contact information. I would imagine that this type of data could have value, since it would be more reliable, at least for marketing purposes, than the oft-spammed whois record data. Once this info is mined from thousands of sites, what will be done with it? Dump it on a scraper site? Organize it in an AdSense-laden yellow pages directory? Sell it? This is not a very comforting discovery, and it is probably a good reason for the paranoid to hide contact data intended only for site visitors. Showing it as an image of text or writing it into the page with JavaScript should do the trick.
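If you want to check your own keyphrase report for queries like the one above, a rough filter can do it. The Python sketch below is only a heuristic and not part of AWStats: the function names, the list of contact words, and the threshold of three matching words are all assumptions made for illustration. It flags a phrase when it contains several contact words joined by "or", plus a "site" term and something that looks like a domain name.

```python
import re

# Contact-related words taken from the query shown above (an assumed, extendable list).
CONTACT_TERMS = {"contact", "mail", "email", "phone", "fax", "tel"}

# Loose pattern for a host name such as www.example.com, optionally prefixed with "site:".
DOMAIN_RE = re.compile(r"(site:)?([a-z0-9-]+\.)+[a-z]{2,}")


def looks_like_contact_mining(phrase, min_terms=3):
    """Heuristic check: does this search phrase resemble an automated contact-harvesting query?"""
    words = phrase.lower().split()
    contact_hits = sum(1 for w in words if w in CONTACT_TERMS)
    has_or = "or" in words
    has_site = any(w == "site" or w.startswith("site:") for w in words)
    has_domain = any(DOMAIN_RE.fullmatch(w) for w in words)
    return contact_hits >= min_terms and has_or and has_site and has_domain


def flag_phrases(phrases):
    """Return only the phrases that match the heuristic, in their original order."""
    return [p for p in phrases if looks_like_contact_mining(p)]
```

You would feed it the phrases copied out of the keyphrases report, for example `flag_phrases(["free forum hosting", "contact or mail or email or phone or fax or tel site www.forumpoint.com"])`. Ordinary searches that happen to mention one contact word are left alone because of the three-word threshold.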





