1. A user correlation system, comprising:
- one or more processors configured to:
- receive a unique identifier association request for unique identifiers;
- identify a first website containing user activity information associated with a first one of the unique identifiers;
- extract a first set of data from the user activity information on the first website associated with the first one of the unique identifiers;
- identify a second website containing user activity information associated with a second one of the unique identifiers;
- extract a second set of data from the user activity information on the second website associated with the second one of the unique identifiers;
- search for similarities between the first set of data extracted from the user activity information on the first website and the second set of data extracted from the user activity information on the second website;
- compare the first set of data extracted from the first website with the second set of data extracted from the second website; and
- generate an association score that indicates a confidence factor that the first one of the unique identifiers and the second one of the unique identifiers are associated with a same user according to the comparison between the first set of data and the second set of data.
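By way of illustration only, the compare and generate steps of claim 1 could be sketched as follows. The Jaccard measure, the per-field averaging, the field names, and the function name `association_score` are assumptions of this sketch and are not recited in the claim.

```python
def association_score(first_data, second_data):
    """Return the mean Jaccard similarity over fields present in both data sets.

    Each argument maps a field name (e.g. "friends") to a set of string values.
    The averaging scheme is an assumption of this sketch, not the claimed method.
    """
    shared_fields = first_data.keys() & second_data.keys()
    if not shared_fields:
        return 0.0  # nothing comparable, so no evidence of a common user
    total = 0.0
    for field in shared_fields:
        a, b = first_data[field], second_data[field]
        union = a | b
        total += len(a & b) / len(union) if union else 0.0
    return total / len(shared_fields)


# Hypothetical data extracted from two websites for two unique identifiers.
site_a = {"friends": {"ann", "bob", "cy"}, "city": {"austin"}}
site_b = {"friends": {"ann", "bob", "dee"}, "city": {"austin"}}
print(association_score(site_a, site_b))  # 0.75
```

In this sketch a score near 1.0 would suggest the two unique identifiers belong to the same user, and a score near 0.0 would suggest they do not.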
2. The user correlation system according to claim 1 wherein one or more of the processors are further configured to:
- determine if a profile or account information exists on the first and second website associated with the unique identifiers;
- extract data from the profile or account information on the first and second website; and
- generate the association score according to the similarities in the data extracted from the profile or account information.
3. The user correlation system according to claim 2 wherein one or more of the processors are configured to operate as an information filter that extracts correlation information from the profile or account information.
4. The user correlation system according to claim 1 wherein one or more of the processors are further configured to return the association score to a client in response to receiving the unique identifier association request.
5. The user correlation system according to claim 1 wherein the first and second set of data each include a list of friends.
6. The user correlation system according to claim 1 wherein one or more of the processors are configured to operate as a user submission platform that links different unique identifiers with the same user.
7. A method comprising:
- receiving, by a computing device, a unique identifier (UID) request for one or more user UIDs;
- identifying, by the computing device, a website location containing online user information associated with the one or more user UIDs;
- obtaining, by the computing device, the online user information from the website location associated with the one or more user UIDs;
- comparing, by the computing device, the online user information from the website location with personal information associated with the one or more user UIDs;
- searching, by the computing device, for similarities between the online user information from the website location and the personal information associated with the one or more user UIDs;
- generating, by the computing device, a UID association score based on similarities between the online user information obtained from the website location and the personal information associated with the one or more user UIDs, wherein the UID association score indicates a confidence level that the one or more user UIDs are associated with a particular person; and
- returning the UID association score to a client in response to the client sending the UID request.
8. The method according to claim 7 wherein the one or more user UIDs include a first and second UID, and wherein the method further comprises:
- identifying user names and domain names in the first and second UID; and
- deriving the UID association score according to similarities in the user names and similarities in the domain names in the first and second UID.
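As a non-limiting sketch of claim 8, assuming the UIDs are email addresses, the user name and domain name could be separated and compared with a generic string-similarity ratio. The helper names, the 0.7/0.3 weights, and the use of `difflib.SequenceMatcher` are assumptions of this sketch only.

```python
from difflib import SequenceMatcher


def split_uid(uid):
    """Split an email-style UID into (user name, domain name), lowercased."""
    user, _, domain = uid.lower().partition("@")
    return user, domain


def name_domain_score(first_uid, second_uid, user_weight=0.7, domain_weight=0.3):
    """Combine user-name and domain-name similarity into one score in [0, 1].

    The weights are illustrative assumptions; they sum to 1.0.
    """
    user_a, domain_a = split_uid(first_uid)
    user_b, domain_b = split_uid(second_uid)
    user_sim = SequenceMatcher(None, user_a, user_b).ratio()
    domain_sim = SequenceMatcher(None, domain_a, domain_b).ratio()
    return user_weight * user_sim + domain_weight * domain_sim


# Hypothetical UIDs: same user name, domains differing only in extension.
close_pair = name_domain_score("jsmith@example.com", "jsmith@example.org")
far_pair = name_domain_score("jsmith@example.com", "qwerty@other.net")
```

Under these assumptions `close_pair` comes out higher than `far_pair`, reflecting the matching user names and near-matching domain names.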
9. The method according to claim 8 further comprising:
- varying the UID association score according to variations in domain name extensions in the first and second UID;
- varying the UID association score according to domain names for the first and second UID; or
- varying the UID association score according to domain extensions in the first and second UID that point to a same email account.
10. The method according to claim 7 further comprising adjusting the UID association score according to rarity of names in the one or more user UIDs.
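One possible reading of the rarity adjustment in claim 10 is sketched below. The frequency table, the threshold, the maximum boost, and the function name are all hypothetical values chosen for illustration; the claim does not specify how rarity is measured.

```python
def rarity_adjusted_score(base_score, name, name_frequency,
                          common_threshold=0.01, max_boost=0.2):
    """Raise base_score more for rarer names, capped at 1.0.

    name_frequency maps a lowercased name to its assumed share of the
    population. Names missing from the table leave the score unchanged.
    """
    frequency = name_frequency.get(name.lower())
    if frequency is None:
        return base_score  # unknown name: no adjustment in this sketch
    rarity = 1.0 - min(frequency / common_threshold, 1.0)
    return min(1.0, base_score + max_boost * rarity)


# Hypothetical frequency table (fractions of a population).
NAME_FREQUENCY = {"smith": 0.01, "zelenkovic": 0.0001}

common = rarity_adjusted_score(0.5, "Smith", NAME_FREQUENCY)
rare = rarity_adjusted_score(0.5, "Zelenkovic", NAME_FREQUENCY)
```

Here a match on a common name leaves the score where it was, while a match on a rare name raises it, since two UIDs sharing a rare name are less likely to do so by coincidence.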
11. The method according to claim 7 further comprising:
- identifying email signatures associated with the one or more user UIDs, wherein the email signatures identify a name of a user sending the emails and are located in a footer of the emails;
- comparing the email signatures; and
- generating the UID association score according to the comparison of the email signatures.
12. The method according to claim 7 further comprising:
- identifying “from”, “reply to”, or “on behalf of” email addresses for emails associated with a first and second one of the one or more user UIDs;
- comparing the “from”, “reply to”, or “on behalf of” email addresses; and
- generating the UID association score according to the similarities of the “from”, “reply to”, or “on behalf of” email addresses associated with the first and second one of the one or more user UIDs.
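The header comparison of claim 12 might be sketched as follows using Python's standard `email` package. Treating the `Sender` header as the "on behalf of" address is an assumption of this sketch, as are the function names and the Jaccard overlap used as the score.

```python
from email import message_from_string
from email.utils import getaddresses

# "Sender" stands in for "on behalf of" here; that mapping is an assumption.
ADDRESS_HEADERS = ("From", "Reply-To", "Sender")


def header_addresses(raw_email):
    """Collect the lowercased addresses found in the headers of one raw email."""
    message = message_from_string(raw_email)
    found = set()
    for header in ADDRESS_HEADERS:
        for _name, address in getaddresses(message.get_all(header, [])):
            if address:
                found.add(address.lower())
    return found


def header_score(first_raw, second_raw):
    """Jaccard overlap of the address sets taken from two raw emails."""
    a, b = header_addresses(first_raw), header_addresses(second_raw)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical emails associated with a first and a second UID.
email_a = "From: Jane <jane@example.com>\nReply-To: jane.s@example.org\n\nHi"
email_b = "From: J. Smith <jane.s@example.org>\n\nHello"
print(header_score(email_a, email_b))  # 0.5
```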
13. The method according to claim 7 further comprising:
- identifying user profiles or accounts on the website location and on an additional different website location;
- comparing information identified in the user profiles or accounts; and
- linking the user profiles or accounts on the website location and on the additional website location together when the information indicates the user profiles or accounts are associated with the particular person.
14. The method according to claim 13 further comprising:
- identifying different UIDs in the user profiles or accounts; and
- linking the different UIDs to the particular person according to the different UIDs identified in the user profiles or accounts.
15. The method according to claim 7 further comprising:
- receiving user information from users manually identifying different UIDs associated with the particular person; and
- generating the UID association score according to the user information.
16. A method comprising:
- receiving, by a computing device, an association request for an association indicator between a first unique identifier (UID) and a second UID;
- identifying, by the computing device, a first website associated with the first UID;
- identifying, by the computing device, a second website associated with the second UID;
- extracting, by the computing device, a first set of information from a first user profile on the first website associated with the first UID;
- extracting, by the computing device, a second set of information from a second user profile on the second website associated with the second UID;
- comparing, by the computing device, the first set of information with the second set of information;
- searching, by the computing device, for similarities between the first set of information extracted from the first website and the second set of information extracted from the second website; and
- generating, by the computing device, an association score to indicate a confidence factor that the first UID and the second UID are associated with a same person according to the comparison of the first set of information with the second set of information.
17. The method according to claim 16 further comprising:
- searching through an additional different website for additional information associated with the first or second UID;
- comparing the information identified on the first and second website with the additional information identified on the additional website; and
- adjusting the association score according to similarities between the information identified on the first and second website and the additional information identified on the additional website.
18. The method according to claim 16 wherein the first and second sets of information include photos, a circle of friends, usernames, and address information.
19. A computer device containing instructions that when executed by a computer result in:
- receiving a unique identifier association request to determine a probability that two different unique identifiers (UIDs) are associated with a same person;
- identifying data for the different UIDs from different online locations that associate the two different UIDs with actual persons;
- searching for similarities between the data for the different UIDs from the different online locations;
- comparing the data for the two different UIDs from the different online locations;
- identifying similarities and differences between the data for the two different UIDs from the different online locations;
- generating an association score indicating the probability that the two different UIDs are associated with the same person according to the identified similarities and differences; and
- returning the association score in response to the unique identifier association request indicating a confidence level that the two different UIDs are associated with the same person.
20. The computer device according to claim 19 further comprising instructions that when executed by the computer result in:
- searching different profiles or accounts at the different online locations for the data associated with the two different UIDs;
- filtering the data for personal information that may be associated with the same person; and
- using the filtered data to generate the association score.
21. The computer device according to claim 20 further comprising instructions that when executed by the computer result in:
- searching a database for a previously derived association score for the two different UIDs; and
- searching the different online locations for the personal information when the database does not contain a previously derived association score.
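The lookup order of claim 21 could be sketched as a cache check followed by a fallback search. The in-memory dictionary standing in for the database, the `search_online` callback, and the step of storing the new score are assumptions of this sketch.

```python
def get_association_score(uid_a, uid_b, score_db, search_online):
    """Return a previously derived score if present, else search and derive one.

    score_db is a dict standing in for the database; search_online is a
    caller-supplied function (hypothetical) that searches the online locations.
    """
    key = frozenset((uid_a, uid_b))  # order-independent key for the UID pair
    if key in score_db:
        return score_db[key]
    score = search_online(uid_a, uid_b)
    score_db[key] = score  # storing the result is an assumption of this sketch
    return score


# Usage with a stub search that records how often it is called.
calls = []


def stub_search(uid_a, uid_b):
    calls.append((uid_a, uid_b))
    return 0.8


db = {}
first = get_association_score("a@example.com", "b@example.org", db, stub_search)
second = get_association_score("b@example.org", "a@example.com", db, stub_search)
```

The second call is served from `db` without invoking `stub_search` again, regardless of the order in which the two UIDs are given.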
22. The computer device according to claim 20 further comprising instructions that when executed by the computer result in:
- identifying different types of personal information in the profiles or accounts that have previously been associated with high association scores; and
- increasing weighting for the different types of personal information when deriving future association scores.
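The weighting step of claim 22 might be illustrated as below. The additive step size, the renormalization, and the information-type names are assumptions made for this sketch; the claim requires only that weighting be increased for types previously associated with high association scores.

```python
def increase_weights(weights, high_score_types, step=0.1):
    """Return new weights with the given information types boosted.

    weights maps an information type (e.g. "photo") to its current weight.
    After boosting, weights are renormalized to sum to 1.0 (an assumption).
    """
    updated = dict(weights)  # leave the caller's mapping untouched
    for info_type in high_score_types:
        updated[info_type] = updated.get(info_type, 0.0) + step
    total = sum(updated.values())
    return {info_type: value / total for info_type, value in updated.items()}


# Hypothetical starting weights and a type seen in past high-scoring matches.
current = {"photo": 0.5, "city": 0.5}
future = increase_weights(current, ["photo"])
```

With these inputs, `future["photo"]` exceeds `future["city"]`, so photo matches would count for more when deriving later association scores.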