By analysing the customised results of online searches, it’s now possible to reconstruct the user information that has been collected by a search website.

Research Reveals How Search Engines Build User Profiles

Would you like to know just what Google, Yahoo and Bing know about you? Two researchers from the University of Bangalore, India, appear to have found a way to find out. They started from the premise that most search engines offer their users information – especially product and service advertising – which is tailored to their profiles, and came up with a simple idea. Since such web services need to build up a personal profile of each searcher as a basis for customising the results, all you have to do is trace the thread back to its starting point in order to get at this user profile. In other words, analysing the personalised information which the search engine retrieves for the user makes it possible to reconstruct, little by little, the personal data which the user is inadvertently providing to the search site. To test their approach, the researchers worked on the web history of ten individual Google users.

30% of searches generate personalised information

The Mountain View-based web search specialist does in fact construct user profiles from its
users' web search history. The Bangalore researchers call the results of their research ‘black boxes’.
By comparing the results shown to these ten users with those shown to a ‘control’ user – for whom
no personal information and no web history are available – the researchers were able to pinpoint
the differences in the information that Google offered. The research showed that about 30%
of searches give rise to customised information. More importantly, with the algorithm they
had created, they managed to reconstruct a basic user profile for each of the Internet users
they were working with.
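
The article does not give the researchers' algorithm, so the following is only a minimal sketch of the comparison idea described above: for each query, the results shown to a profiled user are set against those shown to the control user, and any difference is counted as customisation. The function names, the log format (a dictionary mapping each query to an ordered list of result URLs) and the sample data are all assumptions made for illustration.

```python
def personalised_results(user_results, control_results):
    """Return the results shown to the user that the control user did not get."""
    control = set(control_results)
    return [r for r in user_results if r not in control]


def personalisation_rate(user_log, control_log):
    """Fraction of shared queries whose result lists differ from the control's.

    Both logs are hypothetical: dicts mapping a query string to an ordered
    list of result URLs. Queries missing from either log are ignored.
    """
    shared = [q for q in user_log if q in control_log]
    if not shared:
        return 0.0
    differing = sum(1 for q in shared if user_log[q] != control_log[q])
    return differing / len(shared)


# Made-up sample data: only the first query is customised for this user.
user_log = {
    "jaguar": ["a", "b", "c"],
    "python": ["x", "y"],
    "news": ["n1", "n2"],
}
control_log = {
    "jaguar": ["a", "b", "d"],
    "python": ["x", "y"],
    "news": ["n1", "n2"],
}

rate = personalisation_rate(user_log, control_log)
extra = personalised_results(user_log["jaguar"], control_log["jaguar"])
```

With this sample data one query in three differs from the control, and the result `"c"` is the one that appears only for the profiled user. A real measurement would also have to account for differences that have nothing to do with the profile, such as location or the time of the search.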

A prototype at the development stage

For a given search, the Bangalore system is able to identify, with a probability of over 80%,
those topics which have given rise to customised search results. Placed end to end, these
topics form a list of the Internet user’s focuses of interest, reconstructed from what the
search engine returns. In the long term, the researchers hope to go further. They think that “the tool
could be used to reconstruct customisation based on searches of a more sensitive nature -
financial, medical, etc.” They believe that this information can then be used to create a tool
which will enable Internet users to keep better control of their own user profiles, and thus
safeguard their privacy. At the moment they are at work on a prototype.
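
As a rough illustration of how topics identified search by search might be placed end to end to form a list of interests, the sketch below counts the topic of every query that produced customised results and ranks the topics by frequency. It is not the researchers' tool: the `topic_of` lookup, the input format and the sample data are hypothetical, and assigning a topic to a query is treated here as a given.

```python
from collections import Counter


def interest_profile(personalised_by_query, topic_of):
    """Rank topics by how many queries on that topic showed customisation.

    personalised_by_query: hypothetical dict mapping a query to the list of
        results that differed from a control user's results (empty if none).
    topic_of: hypothetical function mapping a query string to a topic label.
    """
    counts = Counter()
    for query, results in personalised_by_query.items():
        if results:  # only queries with customised results count
            counts[topic_of(query)] += 1
    return [topic for topic, _ in counts.most_common()]


# Made-up sample data for illustration only.
topics = {
    "cheap flights goa": "travel",
    "goa hotels": "travel",
    "cricket score": "sport",
    "weather": "weather",
}
personalised = {
    "cheap flights goa": ["u1"],
    "goa hotels": ["u2"],
    "cricket score": ["u3"],
    "weather": [],
}

profile = interest_profile(personalised, topics.get)
```

Here `profile` lists "travel" ahead of "sport", and "weather" is left out because that query showed no customisation.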