Two Indian researchers have developed a system which enables them to exploit information on Twitter to search for a Facebook identity.

A Facebook account, a Twitter profile, another for Google Plus, LinkedIn, Viadeo, etc. – many Internet users possess multiple online identities. Short of calling on the skills of a private detective to scrutinise profile photos for a vague resemblance, linking the various profiles to one individual Internet user can be a tough job. Now two Indian researchers from the Institute of Information Technology in Delhi have written a paper on their system, dubbed – after the famous animated film – Finding Nemo, which does the work for you. “The system could be useful to identify the various identities of a spammer,” state the authors. For the purposes of their research, they focused on the two most widely used social networks, Twitter and Facebook. Starting out from the known identity of the Internet user on one network, the system searches for an associated account on the other. In tracking down the eponymous Nemo, whom they take to be a female Internet user, the researchers based their work on three types of information: her profile name, her posts, and her network.

Identifying a list of potential ‘Nemos’

An important aspect of the study is that the researchers used only publicly accessible information, i.e. data freely available without obtaining any prior authorisation from the social network user. They carried out four types of search, using an algorithm for each: first a Profile Search, second a Content Search, third a ‘Self-mention’ Search, and fourth a Network Search. For the first, they start out from Nemo's username on Twitter and query the Twitter API to extract her name, username, location, profile image and URL. The system then searches Facebook to find a list of candidates who might be Nemo. Sometimes the answer is obvious. As the researchers say in their paper, “a user referring to her blog on Twitter URL when her blog has her Facebook identity” makes the work a lot easier. This isn’t always enough, however, so as a second step they look at publicly available content. “The identity of a user on a social network also includes the content that she creates or is shared with her. Owing to the popularity of social aggregation sites and ways to link multiple networks together, a user pushes the same content on multiple networks simultaneously,” the paper’s authors point out.
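To give a feel for how the first two searches might fit together, here is a minimal Python sketch. It is not the researchers' implementation: the `Profile` record, the helper functions, the 0.8 similarity threshold and the sample data are all assumptions made for illustration, and the code works on in-memory records rather than querying Twitter or Facebook.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class Profile:
    """Hypothetical record holding publicly visible profile data."""
    username: str
    name: str
    location: str = ""
    posts: list = field(default_factory=list)


def name_similarity(a: str, b: str) -> float:
    """Return a similarity score between 0 and 1 for two display names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def profile_search(nemo, candidates, threshold=0.8):
    """Keep candidates whose display name is close to Nemo's (threshold is assumed)."""
    return [c for c in candidates if name_similarity(nemo.name, c.name) >= threshold]


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace so identical cross-posts compare equal."""
    return " ".join(text.lower().split())


def shared_posts(nemo, candidate) -> int:
    """Count posts that appear on both accounts after normalisation."""
    nemo_posts = {normalise(p) for p in nemo.posts}
    return len(nemo_posts & {normalise(p) for p in candidate.posts})


def content_search(nemo, candidates):
    """Order candidates by the number of posts they share with Nemo, most first."""
    return sorted(candidates, key=lambda c: shared_posts(nemo, c), reverse=True)


# Invented sample data: one Twitter profile and three Facebook candidates.
nemo = Profile("nemo_tw", "Nemo Clownfish", "Delhi",
               ["Off to the reef today!", "New blog post is up"])
candidates = [
    Profile("nemo.c", "Nemo Clownfish", "Delhi", ["off to the reef  today!"]),
    Profile("nemo.clown", "Nemo Clownfish", "Sydney", []),
    Profile("dory.b", "Dory Bluetang", "Delhi", ["New blog post is up"]),
]

shortlist = profile_search(nemo, candidates)   # name match narrows the field
ranked = content_search(nemo, shortlist)       # shared posts break the tie
print([c.username for c in ranked])
```

In this toy data the name match alone leaves two equally plausible candidates, and the cross-posted status update is what separates them, which mirrors the way the Content Search is described as a refinement of the Profile Search.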

40% accuracy from a sample of 543

The Content Search helps refine the information gathered from the Profile Search. The third step is the ‘Self-mention’ Search, which assumes that “if Nemo has accounts on two or more networks, she might cross refer to her other account in a few of her tweets.” The fourth step is the Network Search. The researchers look at her three Twitter networks: the ‘follower’ network, the ‘followee’ network and the two-way ‘friends’ network. A user's network is an important aspect of her social media identity, as it is a shared identity built by the user with the involvement of other users, and so this type of search is quite likely to reveal her Facebook identity. Making the assumption that “Nemo connects to the same subset of users on both social networks,” the Network Search algorithm explores the possibility of an unintentional identity leak via her network attribute, thus providing a way to find Nemo on Facebook without any surreptitious hacking. Combining all this information, the authors observe that “correct Facebook identities were identified” for 40% of the 543 Twitter users in their dataset, and they are now working on improving their system further.
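The last two searches can be pictured in the same spirit. The Python sketch below is only an illustration under stated assumptions: the link pattern, the idea of matching contacts by name, the Jaccard overlap score and the sample data are hypothetical choices, not details taken from the paper.

```python
import re

# Assumed pattern for a Facebook profile link; real links take more forms than this.
FACEBOOK_LINK = re.compile(
    r"(?:https?://)?(?:www\.)?facebook\.com/([A-Za-z0-9.]+)", re.IGNORECASE
)


def self_mentions(tweets):
    """Return Facebook usernames linked from the tweets, in order of first appearance."""
    found = []
    for tweet in tweets:
        for username in FACEBOOK_LINK.findall(tweet):
            if username.lower() not in found:
                found.append(username.lower())
    return found


def network_overlap(twitter_contacts, facebook_friends):
    """Jaccard overlap between two sets of contact names (0 = disjoint, 1 = identical)."""
    a = {name.lower() for name in twitter_contacts}
    b = {name.lower() for name in facebook_friends}
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def network_search(twitter_contacts, candidates):
    """Pick the candidate whose public friend list overlaps most with Nemo's contacts."""
    return max(candidates,
               key=lambda name: network_overlap(twitter_contacts, candidates[name]))


# Invented sample data.
tweets = ["Find me at facebook.com/nemo.c for photos", "Lovely day on the reef"]
contacts = {"Marlin", "Dory", "Gill"}
candidates = {
    "nemo.c": {"marlin", "dory", "bruce"},
    "nemo.clown": {"crush"},
}

print(self_mentions(tweets))                 # usernames Nemo links to herself
print(network_search(contacts, candidates))  # candidate with the closest network
```

Here the self-mention and the network overlap point to the same candidate. In practice display names differ across networks, so matching contacts by name is a simplification chosen to keep the example short.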