How to Find Facebook Users on Match.com by Using Face Recognition Tools

One day I was drinking coffee with my friend and he told me a story of how he got in trouble with his girlfriend because she found his old profile on a popular dating site, match.com. Allegedly one of their mutual friends bumped into it and sent her a “friendly notification”. He was unable to prove to her that it was just ‘an old thing’ that he forgot to delete, and their story ended there: they broke up pretty fast after this incident. Probably their relationship wasn’t that strong anyway, but this story left me with a couple of thoughts. What are the odds of bumping into the profile of someone you know on a dating website, and how easily could such privacy be violated if someone set out to do it?

I remembered that there were some open-source face detection and recognition libraries available and thought that it’s probably possible to write a tool that would crawl photos on dating sites and try to recognize a particular person in them. Then I ran into face.com, a platform that provides a RESTful API for detecting, tagging and recognizing faces in pictures. I recalled that story again and told it to my friends; we all laughed and agreed that such a tool would be creepy, but I did put it on my list of ideas for hacking.

So, guess what, let’s go creepy and run a small experiment to see how easy that would be. To do that we’ll write a tool that will take tagged photos of a Facebook user and try to find his/her profile on match.com.

Let’s split the task into smaller problems.

  • How to send authorized search requests to match.com
  • How to get URLs of profile images
  • How to make parsing fast by running requests asynchronously
  • How to use face.com API

How to parse match.com

Sending authorized requests

To get profile pictures we need the search output. If you go to match.com you’ll find that it doesn’t allow you to browse search results without registration. So the very first step is to create an account on match.com. After that go to the search form, choose the parameters, like sex, age and zip code, and click “Search now”. Now we see the results with the profile pictures, but let’s try to get them from our tool. Let’s copy the URL of the request from the browser and write a simple Ruby script.

require 'rubygems'
require 'open-uri'

# paste the search URL you copied from the browser
response = URI.open(YOUR_URL_OF_REQUEST)

p response.read

Run this and you won’t see the results, but that’s what we expected, right? The Ruby script sends the request without a session cookie and therefore hits the sign-in form. So we need to pass the session cookie to send requests as a signed-in user. If you’re already logged in, open the list of cookies for match.com in your browser and you’ll see the bunch of shit it stores. Save yourself some time, because I already figured out that their session cookie is called SECU. Copy the value of this cookie and update the script.

response = URI.open(YOUR_URL_OF_REQUEST, "cookie" => "SECU=VALUE_OF_THE_COOKIE")

If you run it, you’ll see a different response. Search through it and you can find something like Welcome, your_name and that means we sent a request as an authorized user.
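Instead of searching the response by eye, the script can check for that greeting itself. This is only a minimal sketch: the helper name signed_in? is made up for illustration, and it assumes a signed-in page contains the text “Welcome, ” followed by a name, as described above.

```ruby
# Hypothetical helper (not part of any library): returns true when the page
# body contains the "Welcome, <name>" greeting shown to signed-in users.
def signed_in?(html)
  !!(html =~ /Welcome,\s+\S+/)
end

# Two tiny stand-in page bodies, written by hand for this example.
puts signed_in?("<div>Welcome, your_name</div>")
puts signed_in?("<form>Sign in</form>")
```

If the check comes back false for a real response, the cookie value has most likely expired or was copied incorrectly.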

Getting URL of profile images

Now, how do we get the URLs of profile images? Let’s analyze the HTML structure of the search results page. Use Web Inspector in Chrome or Safari, or Firebug if you use Firefox. Point to a profile image and you’ll see that the HTML code for it looks something like this:

<img class="profilePic" src="http://sthumbnails.match.com/sthumbnails/03/06/95230303242.jpeg" style="border-width:0px;">

All profile pictures on the page have the class “profilePic”. Awesome. But those are very small images which would be hard to use for recognition. We need the bigger ones. Let’s click on someone’s profile, find the big version of the thumbnail from the search results and see what its link looks like:

http://pictures.match.com/pictures/03/06/95230303242.jpeg

Boom! Looks like we’ve found a pattern. The image has the same name; the only difference is in part of the path: we should replace sthumbnails.match.com/sthumbnails with pictures.match.com/pictures to get the big image for a thumbnail. That way, by parsing only the search results pages, we get URLs for big profile images without additionally requesting each profile page. Ok, let’s do it.

require 'rubygems'
require 'nokogiri'
require 'open-uri'

# the search URL you copied from the browser
response = URI.open(YOUR_URL_OF_REQUEST, "cookie" => "SECU=VALUE_OF_THE_COOKIE")

doc = Nokogiri::HTML(response.read)

# every profile picture sits inside a link to the profile
doc.xpath("//img[@class='profilePic']/..").each do |link|
  img_src = link.xpath("img/@src").to_s
  # turn the thumbnail URL into the URL of the big picture
  img_src.gsub!('sthumbnails.match.com/sthumbnails', 'pictures.match.com/pictures')
  puts img_src
end

That will print the URLs of the big profile pictures from the first results page. For parsing I’m using the Nokogiri gem and a little XPath. Easy.

Getting images from the first page is not enough; we gotta get them from all pages, so let’s run a loop and change the page number param in the URL. Examine the URL structure of the search request and you’ll see a parameter called ‘pn’ where we pass the page number.

require 'rubygems'
require 'nokogiri'
require 'open-uri'

# can be different for your specific search
PAGES = 140

PAGES.times do |page_num|

  # adjacent string literals are joined, so the URL contains no stray whitespace
  url = "http://www.match.com/search/searchSubmit.aspx?" \
        "by=radius&lid=226&cl=1&gc=2&tr=1&lage=27&uage=29&ua=29&pc=94121&" \
        "dist=10&po=1&oln=0&do=2&q=woman,men,27,29,1915822078&st=quicksearch&" \
        "pn=#{page_num}&rn=4"

  response = URI.open(url, "cookie" => "SECU=VALUE_OF_THE_COOKIE")

  doc = Nokogiri::HTML(response.read)

  # every profile picture sits inside a link to the profile
  doc.xpath("//img[@class='profilePic']/..").each do |link|
    img_src = link.xpath("img/@src").to_s
    # turn the thumbnail URL into the URL of the big picture
    img_src.gsub!('sthumbnails.match.com/sthumbnails', 'pictures.match.com/pictures')
    puts img_src
  end

end

Note the &pn=#{page_num} part. The rest of the URL should be your own, as you copied it from your browser.
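If you’d rather not edit the copied URL by hand, the ‘pn’ parameter can be swapped with Ruby’s standard uri library. Treat this as a sketch under assumptions: page_url is a name I made up for the example, and the base URL below is a shortened stand-in for whatever you copied from your browser.

```ruby
require 'uri'

# Hypothetical helper: returns base_url with its 'pn' query parameter set to
# page_num, keeping every other parameter as it was copied from the browser.
def page_url(base_url, page_num)
  uri = URI.parse(base_url)
  params = URI.decode_www_form(uri.query || '')
  params.reject! { |key, _| key == 'pn' }   # drop the old page number, if any
  params << ['pn', page_num.to_s]           # append the new one
  uri.query = URI.encode_www_form(params)
  uri.to_s
end

# Shortened stand-in for the search URL; use your own here.
base = "http://www.match.com/search/searchSubmit.aspx?by=radius&pc=94121&pn=0&rn=4"
3.times { |page_num| puts page_url(base, page_num) }
```

Note that ‘pn’ ends up at the end of the query string and that encode_www_form percent-encodes characters such as commas, so the result is not byte-for-byte the URL you copied; I’d expect the server to treat it the same, but I haven’t verified that.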

Making parser work fast

Although that would get part of the job done, that solution doesn’t scale. The HTTP requests here run sequentially and take too much time to complete if you want to crawl a lot of pages. What we need to do is run the HTTP requests asynchronously. There are a number of ways to achieve that (no, using threads is not a way), but I’d suggest using Eventmachine. Eventmachine gives you the ability to run IO operations asynchronously without blocking the process. There’s em-http-request, an asynchronous HTTP client that works on top of Eventmachine and is an ideal fit for our purpose. So let’s rewrite our small program using Eventmachine and see what happens.

require 'nokogiri'
require 'eventmachine'
require 'em-http-request'
require 'em-redis'

PAGES = 140

EM.run {

  @redis = EM::Protocols::Redis.connect

  PAGES.times do |page_num|

    # adjacent string literals are joined, so the URL contains no stray whitespace
    url = "http://www.match.com/search/searchSubmit.aspx?" \
          "by=radius&lid=226&cl=1&gc=2&tr=1&lage=27&uage=29&ua=29&pc=94121&" \
          "dist=10&po=1&oln=0&do=2&q=woman,men,27,29,1915822078&st=quicksearch&" \
          "pn=#{page_num}&rn=4"

    # URI.escape was removed in Ruby 3.0; this URL has nothing to escape,
    # so on newer Rubies you can pass url directly
    http = EM::HttpRequest.new(URI.escape(url)).get :head => {'cookie' => "SECU=YOUR_SESSION_COOKIE;"}

    # called by Eventmachine once the response for this page has arrived
    http.callback {

      p "parsing page #{page_num + 1}"
      doc = Nokogiri::HTML(http.response)

      doc.xpath("//img[@class='profilePic']/..").each do |link|
        img_src = link.xpath("img/@src").to_s
        img_src.gsub!('sthumbnails.match.com/sthumbnails', 'pictures.match.com/pictures')
        # picture URL => profile URL
        @redis.hset("people", img_src, link['href'])
      end

    }

  end
}

There are a few things I have to explain here. Eventmachine runs an event loop, and everything should be passed to it as a block. Then, when you call get on an EM::HttpRequest, it returns immediately instead of blocking the process and waiting for the response; when the response is received, Eventmachine calls the http.callback block where I put our parsing logic. Also, you may have noticed that instead of printing the URLs on the screen, I save them in a Redis hash where each profile image is associated with its profile URL. We’ll be accessing this hash from the recognition tool later. Note that I’m using the asynchronous version of the Redis client for Eventmachine here: em-redis.

If you run this script you’ll see that it works way, way faster than the synchronous version. Now that we have profile pictures, let’s get to using the face.com API to recognize faces in them.

How to use Face.com API

First we’ll need to register at face.com and get some credentials to be able to use their API.

Go there and sign up. Save the API key and API secret that you’ll be given after registration is complete.

To recognize faces you first have to “train” their app by feeding it some images of the people you want to find. There are a number of ways to do that. You can pass URLs of images, detect faces in them, tag them and then pass URLs of other images for recognition. Or you can pass a Twitter user or a Facebook user, and they will automatically take the available tagged photos of this user to train on and “remember” them.

Getting fb_oauth_token

So let’s go the Facebook way. To be able to access tagged photos of your friends we need an fb_oauth_token for any Facebook app that has permission to access your profile and your friends’ images. You can either register your own Facebook app or (that would be faster) grant permission to face.com’s app. In any case we just need the value of fb_oauth_token, which we’re gonna use later in our script.

Doing that is a bit tricky. Go to http://developers.face.com/tools/. Choose the faces.recognize method and a Facebook connect button will show up. Click it and grant their app the requested permissions. Then click the “call method” button, ignore whatever appears in the response body, but check out the REST URL at the top of the response body. You’ll see an fb_oauth_token parameter at the end of the URL. Copy and save its value.

Finding profiles!

The good news is there’s a Ruby gem for face.com that works. I assume that you were following this post and have the URLs of profile images stored in a hash named ‘people’ in Redis. Here’s a script that’ll do the rest of the job:

require 'redis'
require 'face'

# fb user_id of the user that needs to be found on match.com
FB_UID = '11111@facebook.com'

# show all profiles recognized with more than this confidence
ACCURACY = 30

redis = Redis.new
# hash of picture URL => profile URL saved by the crawler
pictures = redis.hgetall("people")

# recognize user
client = Face.get_client(:api_key => FACECOM_APIKEY,
                         :api_secret => FACECOM_APISECRET)

# note that in fb_user you pass YOUR fb user id here, not the id of the user you are looking for
client.facebook_credentials = { :fb_user => YOUR_FB_USER_ID,
                                :fb_oauth_token => YOUR_FB_OAUTH_TOKEN }

# train on the tagged photos of the user
response = client.faces_train(:uids => [FB_UID])

# send the picture URLs for recognition in chunks of 20
pictures.keys.each_slice(20) do |pictures_chunk|

  response = client.faces_recognize(:urls => pictures_chunk, :uids => [FB_UID])

  response["photos"].each do |photo|
    photo["tags"].each do |tag|
      # skip faces that were not matched to anyone
      next if tag.nil? || tag['uids'].empty?
      if tag['uids'][0]['confidence'].to_i > ACCURACY
        p "Profile found, #{pictures[photo['url']]}, confidence #{tag['uids'][0]['confidence']}"
      end
    end
  end

end

That’s it. It will display a list of profiles where the confidence of recognition was more than 30%. Raise this number to keep only the more confident matches.
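The threshold logic is easy to try in isolation, without calling the API. The sketch below pulls it into a function and runs it on hand-written data. The function name confident_matches and the sample hash are made up for illustration; the hash only mimics the shape the script above reads (photos, tags, uids, confidence) and is not real API output.

```ruby
# Hypothetical helper: returns [photo_url, confidence] pairs for every tag
# whose first uid entry has a confidence above the threshold.
def confident_matches(response, threshold)
  matches = []
  (response["photos"] || []).each do |photo|
    (photo["tags"] || []).each do |tag|
      # skip faces that were not matched to anyone
      next if tag.nil? || tag["uids"].nil? || tag["uids"].empty?
      confidence = tag["uids"][0]["confidence"].to_i
      matches << [photo["url"], confidence] if confidence > threshold
    end
  end
  matches
end

# Made-up sample data in the assumed response shape.
sample = {
  "photos" => [
    { "url" => "http://example.com/a.jpeg",
      "tags" => [{ "uids" => [{ "uid" => "11111@facebook.com", "confidence" => 72 }] }] },
    { "url" => "http://example.com/b.jpeg",
      "tags" => [{ "uids" => [{ "uid" => "11111@facebook.com", "confidence" => 12 }] }] },
    { "url" => "http://example.com/c.jpeg", "tags" => [{ "uids" => [] }] }
  ]
}

confident_matches(sample, 30).each do |url, confidence|
  puts "#{url} (confidence #{confidence})"
end
```

With a threshold of 30 only the first sample photo passes; lowering it to 10 lets the second one through as well, which is the trade-off between missing a profile and wading through look-alikes.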

It was a lot of fun to run this test against my own photos, get results with a high confidence number and see people who allegedly look like me (well, and sometimes they do). I ran this test for some friends (with their consent!) and was able to find a couple of them too. No details or pictures can be revealed here, for obvious reasons :)

Conclusion about privacy: if you have photos associated with your profiles on different websites, and especially photos with your face tagged, there is possibly a way to find these profiles using just your photos.
