AOL Releases Query Data From 20 Million Searches
AOL recently released a document on the Internet containing query data for about 20 million searches from 658,000 different users.
This has been seen as a huge mistake on AOL's part and raises some serious privacy concerns. The AOL user IDs in the data had been replaced by random numbers, but the search queries themselves still contained quite a bit of private data, such as Social Security numbers and credit card numbers. See Elliot Back’s post about the privacy issues.
The reason this is of interest to publishers is that it is a very extensive source of information about keywords and search behavior. Mining the data can reveal which keyword phrases are popular and which alternate phrases people try when one phrase does not show the desired results. This data is much more detailed than anything obtained from sources such as Wordtracker or Overture keyword listings.
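As a rough illustration of that kind of mining, here is a minimal Python sketch. It assumes the data is tab-separated with a header row naming the columns AnonID, Query, QueryTime, ItemRank and ClickURL; check that layout against the actual files before relying on it. The function names and the sample rows are made up for this example. It counts the most frequent queries and, for each query, tallies what the same user searched for next when the earlier row recorded no click, which is only a crude stand-in for "the phrase did not show the desired results".

```python
import csv
import io
from collections import Counter, defaultdict


def load_rows(text):
    """Parse tab-separated log text (assumed columns: see SAMPLE header)."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    return list(reader)


def normalize(query):
    """Lowercase and trim a query so trivial variants count together."""
    return (query or "").strip().lower()


def top_queries(rows, n=10):
    """Return the n most frequent queries as (query, count) pairs."""
    counts = Counter(normalize(r["Query"]) for r in rows)
    counts.pop("", None)  # ignore empty queries
    return counts.most_common(n)


def reformulations(rows):
    """Map each query to a Counter of the next queries by the same user,
    counted only when the earlier row has no clicked URL."""
    by_user = defaultdict(list)
    for r in rows:
        by_user[r["AnonID"]].append(r)

    follow = defaultdict(Counter)
    for events in by_user.values():
        # Timestamps like "2006-03-01 10:00:00" sort correctly as strings.
        events.sort(key=lambda r: r["QueryTime"])
        for prev, nxt in zip(events, events[1:]):
            a, b = normalize(prev["Query"]), normalize(nxt["Query"])
            if a and b and a != b and not prev["ClickURL"]:
                follow[a][b] += 1
    return follow


# Made-up sample rows, not taken from the real data.
SAMPLE = (
    "AnonID\tQuery\tQueryTime\tItemRank\tClickURL\n"
    "1\tcheap flights\t2006-03-01 10:00:00\t\t\n"
    "1\tdiscount airfare\t2006-03-01 10:01:00\t1\thttp://example.com\n"
    "2\tcheap flights\t2006-03-02 09:00:00\t\t\n"
    "2\tdiscount airfare\t2006-03-02 09:02:00\t\t\n"
    "3\tcheap flights\t2006-03-03 08:00:00\t2\thttp://example.org\n"
)

if __name__ == "__main__":
    rows = load_rows(SAMPLE)
    print(top_queries(rows, 5))
    print(dict(reformulations(rows)))
```

One caveat with this sketch: if a single search appears as several rows (for example, one row per clicked result), counting rows will overstate how often that phrase was searched, so you may want to de-duplicate on user, query and timestamp first.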
I expect to see people’s analyses of this data popping up soon.
See this Digital Point thread for more discussion.
Update: Here are some websites which you can use to access the data:
http://www.aolsearchdatabase.com/ – Users can query by user id, keyword, date or clicked website
http://fuaol.com/ – A list of the top 5000 keywords
http://simplifiedsec.com/KeywordDigger.html – Search by keyword
Update 2: http://dontdelete.com/ This looks like a good, quick interface to the data.