Hi, my name is Keith Adams, and I worked on Facebook's typeahead. The system has evolved a bit since we launched, but I talked about first_degree.php in the tech talk we did about the typeahead back-end last year:
Briefly, first_degree.php returns objects you're directly connected to in the graph, and if there's space a few machine-generated guesses at other good results. We preload these as soon as you focus the "Search" box at the top of the page, in the hopes of having some decent results to show when you start typing. The index field does, as the article inferred, represent our best guess at a ranking function on these first degree objects. The inputs to this ranking function explicitly do not include other users' behavior on the site. I talked a bit about our ranking function in this quora question:
Keith, I would like a feature to clear my facebook search history/profile view history. I consider it a privacy issue.
The knowledge that anyone who stumbles across my logged-in profile can see who I'm interested in by experimenting in my friend search bar has a chilling effect on the profiles I view. Google and most browsers have a 'clear search history' feature- it would be nice if facebook had the same.
That's not a privacy issue, it's a security issue. If you can't be bothered to log out of your account on shared computers, people are going to get access to your account, and a malicious person will be able to do anything they please. Search type-aheads are the smallest of your problems.
Being able to uncover what profiles have been viewed is a significant problem for a lot of people. And they aren't worried about malicious users - they are worried about their partners/parents/children seeing which profiles they've viewed. Who looks at what profile is one of the most sensitive pieces of information facebook has, and I'm a little surprised that they do this.
I'd be more worried about people accessing my Internet banking or my email than who I look at on Facebook. Log out of your bloody accounts before you leave your PC!
Just to be clear - I don't think people are especially worried about strangers seeing which profiles they've viewed (it would be quite unlikely it would be meaningful) - they are worried about the people they trust seeing which profiles they've viewed.
Facebook doesn't show which profiles you've viewed explicitly for a good reason - I think they were just hoping that this would be obfuscated enough not to get much notice.
they are worried about the people they trust seeing which profiles they've viewed
In other words, you don't trust the people you trust. Perhaps you need to rethink your behavior, based on your revelation that you don't actually trust them.
It's not always as simple as that. There are times where it's better to keep certain information from certain people no matter how much you trust them or like them. It's just better that way.
Some of us have had significant others with jealousy issues. It's not that we don't trust them, it's just better if they don't worry about silly issues like whose Facebook profiles we've been perusing. Not because it's a secret or because there is anything to hide, just because it helps the SO control their counter-productive impulses if they aren't informed on the matter.
That same kind of avoidance can be applicable in all other sorts of relationships. It's not that the people aren't trusted, it's just that there's no reason to know, and it will only cause damage if they find out -- not because it's bad or trust-breaking, but because the person's reaction may be problematic for completely different reasons (compulsions, or potential taint of future circumstance).
this is the old "you don't have anything to hide" mantra dressed up in more personal terms. people aren't perfect and they have the right to distrust their closest friends if they want to.
Of course they do. That's why modern operating systems have security.
It just doesn't make sense to insist that both (a) I want to leave everything unlocked and open; and (b) I don't want anyone to be able to see what's there.
You've got every right to protect your privacy from prying eyes. But if you want to do so, do it.
I often hand my laptop or iPad to my wife so that she can look at something/use it for a few minutes while sat on the sofa (her computer is a desktop in the other room).
I normally don't bother logging out of whatever sites I'm logged in to.
I'm not too bothered what she sees if she opens up the Facebook tab with me logged in - but I can see that it may be a problem for some.
And I don't think that "hand it over for five minutes to check something" is that uncommon a use-case.
"here - look at this interesting article on Wikipedia"
... she starts reading ...
... I get bored and go to make cup of tea, forgetting that I hadn't logged out of Facebook ...
... she finishes article, closes tab and finds herself on my Facebook page ...
It is my fault, but a "clear history" (or at least make my history invisible) function would mean that mistakes like the above will have much less impact
I disagree, it most certainly is a privacy issue. You may share a computer with friends, room mates, and significant others. If someone had access to your facebook account they could read all your messages, but this is something we are all aware of. Up until now I was _not_ aware of the fact that someone gaining access to my account could actually discover which profiles I frequent, which is like having a browsing history I did not know about, which I cannot even clear/remove after I learn about it. This is what makes it problematic.
If someone you don't trust has access to your logged-in profile, you have much worse problems. For example, they can read all of your private mail and send mail as you.
There's more to it than just logging out. What if you don't want a subpoena to reflect your most visited profiles? There should be a way to purge any data kept longer than the routine access log cycle, including information on frequently viewed profiles.
The fact that someone can "stumble across" your logged profile is less of a concern for you than the fact that they might see your search history? Seriously?
If you read his discussion of how it works, though, that's very much not what this is. What you're asking for doesn't even make sense, to be honest.
It's not about who you search or view. It's all just a big guess at who you might be interested in. From what I've gathered, it tries to find probable relationships based on all of these factors:
Comments, Likes, Tags, Events, Applications, Friends, and Work and Education data
Note that those are both your own comments, etc, and those of the people it's trying to relate you to. Who knows exactly how it combines all those to come up with its guesses, but I'm quite positive it has fairly little to do with who you're actually viewing (though I'll admit, maybe that's a small factor in there, too).
I guess this implies that Facebook does not really remove the imported information, but that it keeps this information and just hides those contacts from you in the friend finder interface.
I think you're leaping to a bad conclusion. Keith's post mentions "computer-generated guesses" to fill in the list. Looking at the guesses in my list, they appear to be friends of friends. (The prefill list is probably using logic similar or identical to the logic that presents "people you may know.")
Find somebody on your mystery list who isn't a friend of your friends, and then you can get paranoid.
I see plenty of people I know of but have never interacted with online (in fact I just found the previously-unfound profile of a coworker with a very common name). Facebook is /smart/. Though I wouldn't doubt for a second that they've used your friend finder data to strengthen their edgerank data, I would not be surprised to learn their seeming precognition comes from more esoteric sources.
Interesting point - the "friend ranking" numbers are different when I checked Mozilla vs. Chrome.
In Mozilla, the first person ranked was at -6.2650374; on Chrome, it was -7.2581474 (I go back and forth with this person quite a bit). Also, the ranking of some of the people were different.
My guess is something to do with browser cookies or caching - any ideas why this might be the case?
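One way to check this is to save the first_degree.php output from each browser and diff the two sets of scores. The sketch below is only an illustration: it assumes each dump has already been parsed into the list of entries with the `text` and `index` fields mentioned elsewhere in this thread, and `compare_rankings` is a made-up helper name. It keys on the display name, so two friends with the same name would collide.

```python
def compare_rankings(entries_a, entries_b):
    """Return (name, index_a, index_b) for names whose index differs."""
    scores_a = {e["text"]: e["index"] for e in entries_a}
    scores_b = {e["text"]: e["index"] for e in entries_b}
    # Only names present in both dumps can be compared.
    shared = scores_a.keys() & scores_b.keys()
    return sorted(
        (name, scores_a[name], scores_b[name])
        for name in shared
        if scores_a[name] != scores_b[name]
    )


# Hypothetical sample data using the two values quoted above.
firefox = [{"text": "Friend One", "index": -6.2650374},
           {"text": "Friend Two", "index": 0.5}]
chrome = [{"text": "Friend One", "index": -7.2581474},
          {"text": "Friend Two", "index": 0.5}]

for name, a, b in compare_rankings(firefox, chrome):
    print(name, a, b)
```

If the dumps were taken at different times, differing values may just reflect the scores being recomputed rather than anything browser-specific.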
OT, but for the second time this year, someone is newly (just as of today) using one of my secondary Gmail accounts as their FB user ID. They don't appear to have access to the Gmail account -- and I've newly killed any concurrent sessions and then changed its password and security Q/A to be sure. So, I don't know how they're accomplishing this. But that Gmail account received the sign up confirmation messages and is now filling up with friend confirmations.
Any chance you could plug me into someone relevant at FB, as this appears to be a recurring problem without a ready explanation? Email to pasbesoin at that gmail place.
P.S. As best I can determine, my systems are clean, and I've no other problems/compromises that are apparent. There appears to be something borked with the Facebook account creation confirmation process.
Indeed, it is not wise to run third-party bookmarklets while logged into Facebook. This one may be benign, but the next one may not be. If you want to see the JSON we're talking about, just load
Nevermind, I misinterpreted. I just realized it's supposed to be "viewer=[ID]", I was interpreting it as "viewer=[ID]&userid=[ID]". I'm still getting errors, but presumably that's another matter.
It's javascript. Read the source before running it. The obfuscated gook that's 80% of the script has 0 diffs from jQuery as downloaded at jQuery.com; and the rest of the script is easy to verify in 5 minutes.
It's also a great habit to get into. Reading source code is invaluable in understanding and learning. And it is a skill that can be cultivated just like others. For example, the Prey project is an invaluable piece of software, except it's potentially extremely sensitive. Probably worth reading the source first.
You also start to get very well versed in the "usual way of doing things", especially if it's a language/paradigm you're not programming daily in.
Those who don't want to run the script can visit the facebook first_degree page[1], search for "path" in the output. Note that you need to replace your facebook id in the link which can be obtained via graph api [2].
Would it be possible for somebody to create a virus that would grab this file and publish it to people's profiles? I think I'd crawl into a hole and die if my ex girlfriend discovered how highly she ranked...
What Anonymous should do is claim they are ready to do this and give everybody a couple days notice to remove their accounts before the hit. lulz would ensue if they could get the media worked up, which I bet wouldn't be that hard to do.
Well, this script/data now has the increased effect of being very socially detrimental. It would be absolutely awkward as hell if data like this came out to friends.
I think because the request is asking for content type JSON but the Facebook response comes back as javascript with a for (;;); at the beginning. The content types don't match so jQuery invokes the error callback.
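For anyone parsing the response outside the browser, the same prefix has to be removed before a JSON parser will accept it. A minimal sketch, assuming the body is the `for (;;);` guard followed by ordinary JSON (`parse_guarded_json` is a hypothetical helper name, and the sample body is invented):

```python
import json

PREFIX = "for (;;);"


def parse_guarded_json(body):
    """Strip a leading 'for (;;);' guard, if present, then parse as JSON."""
    text = body.lstrip()
    if text.startswith(PREFIX):
        text = text[len(PREFIX):]
    return json.loads(text)


# Invented sample body in the shape described in this thread.
sample = 'for (;;);{"payload": {"entries": [{"text": "Example Person", "index": -1.5}]}}'
data = parse_guarded_json(sample)
print(data["payload"]["entries"][0]["text"])
```

Checking for the prefix only at the start avoids touching the same characters if they happen to appear inside a string value.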
My list is all random. The first entry is my girlfriend, the next though is a random girl from college whose profile I have hardly visited more than say 4-5 times ever on Facebook.
Same for others down the list.
So is it a case where people assume it is onto something because they can see the 2 names they do not want anyone to know about among the top 10, or is the result actually correct for everyone?
In the case of the latter how is it so random for me? I have Michael Arrington on the top 15 and I swear I do not stalk his profile.
Facebook's reps haven't said the list is based just on profiles you view, they said it's based on "people you interact with" -- there may be other stuff in there, like messaging activity, or commenting, or liking. Even if you don't go directly to Arrington's profile, you can be interacting with him in the newsfeed.
I would think that clicking on links that a person posts is a good signal and thus would increase your engagement number. Other signals that probably feed into this are pictures you look at, likes, comments, groups you are in common with, etc. i.e. anything that ties you to another person, though I bet they are all weighted differently.
Why? I don't use Facebook, but visiting people's profiles on social sites doesn't mean much on the other ones. Most of the people I look at on Twitter or Google+ have done something so brazenly spammish that I have to go to their profile to find the "report spam" link or whatever. These people aren't my friends, they're just people that spam me that I want to go away.
Well, imagine if you will you were stalking some person you kind of know but not very well, or someone who is an ex-whatever, etc...and that info gets out, and they know. It could spell disaster. Sure, it's entirely the person's fault for stalking in the first place, but the internet enables that so...
Here's how the top 209 people in the list rank for me (the whole list was too long for me to bother cleaning up at 1:30am). 22 people had rankings of less than zero. http://i.imgur.com/WqcKF.png
Tried browsing through the javascript to see if you're screwing with me somehow ... got too lazy 1/3 way through and decided to trust that as a member of HN community you wouldn't (probably a terrible idea). But yea, the script works ... all too well.
You can always just use the inspector in your browser to view the original JSON and run it through JSLINT to make it readable. No external script needed. ;)
I might be one of the few here who has a facebook account with 0 friends and never really use it. So the results are very interesting as they contain people who I know but do not interact with. Some of them must be from profile searches I have performed but I cannot explain the others. I am inclined to believe they must be incorporating 'other users behavior on the site'.
If you have folks with non-ASCII names, here's a short Python 3 snippet to convert the output of first_degree.php to a text file:

    import json
    with open("first_degree.php.json", encoding="utf-8") as src:
        raw = src.read().replace("for (;;);", "", 1)  # drop the non-JSON prefix
    with open("first_degree.php.txt", "w", encoding="utf-8") as f:
        for e in json.loads(raw)["payload"]["entries"]:
            f.write("%s %r\n" % (e["text"], -e["index"]))  # name, negated index
I'm curious whether these numbers factor in people who are looking for you/interacting with your profile. My list has some people I don't recognize, as well as people whose stuff I definitely have not clicked on recently, which is why I think it might.
Well I changed my privacy settings to visible to all, made a dummy account and click raped my profile and my dummy account didn't show up. I'll add the accounts as friends and then make note of the number, then do clicking from the dummy and see if it changes. There could be some privacy implications behind the initial coolness of this.
Also, if you make an account and do nothing your value for yourself is 0.939565, which I guess is some sort of baseline of 0 interaction? Although I don't understand how they are modeling your interaction with yourself, tbh.
The bottom half of my list consists entirely of people who I didn't recognize at all. All but one of the ones I looked at have at least one mutual friend with me. (Now I'm curious whether Facebook generates some of these entries just by crawling my social graph?)
Also, can anyone share what range of numbers they're seeing? At the very top of my list is one negative number. Beyond that, the top half of my list ranges between 0.1 and 1.0. The last half of the list ranges from 1.0 to 1.2.
Top of my list is -3.7312181, bottom is 1.237559. 11 negative numbers, 108 positive. A friend's was skewed more to negative numbers, so I'm guessing that maybe you just aren't a heavy/frequent user?
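To make ranges like these easier to compare, here is a rough sketch that summarizes a list of entries. It assumes the entries are already parsed and carry the `index` field discussed in this thread, with lower values ranking higher as the numbers above suggest. `summarize` and the sample data are made up for illustration.

```python
def summarize(entries):
    """Report the extremes and the sign split of the index values."""
    scores = sorted(e["index"] for e in entries)
    return {
        "top": scores[0],        # most negative value, i.e. ranked first
        "bottom": scores[-1],    # largest value, i.e. ranked last
        "negative": sum(1 for s in scores if s < 0),
        "positive": sum(1 for s in scores if s > 0),
    }


# Invented sample; real lists in this thread run to hundreds of entries.
sample = [
    {"text": "A", "index": -3.7312181},
    {"text": "B", "index": 0.4},
    {"text": "C", "index": 1.237559},
]
print(summarize(sample))
```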
I also thought that other people's interactions with me are a part of the algorithm for this same reason. However, the reply by Keith Adams seems to suggest that is not the case, and that they are just "machine generated guesses" which weigh into the algorithm.
Part of the algorithm may involve them trying to predict who you will want to add as a friend. People that attended the same events as you, people who have recently added your friends as friends, stuff like that.
I am wondering this too, there are many people in that list which I am not friends with and have only visited their profile once, yet they have a higher score than some of my really close friends.
I'm not a frequent Facebook user but I log in at least once a day. The first person on my list is a girl I am dating who is about a -2. I am the second person on my list also at about -2. About 30 more people have a negative number. The remaining people (about 500) are all a positive number. I don't even know who many of the people are at the bottom of my list.
It is interesting to note however that my brother is a frequent Facebook user and his first person on his list ranks at about a -26, and EVERY SINGLE person in his graph is assigned a negative number. Despite this, he still doesn't know who a lot of the people are on the bottom of his list.
Other things I noticed:
- Some people that I barely even know but just became friends with are ranking highly on my list. I imagine this is because I probably viewed a bunch of their pictures after being friends, combined with the fact that I don't often view profiles of people I am friends with.
- Some of my more recently added friends have the same exact value.
I wonder how much information could be extracted about the algorithm by creating a dummy community of people and connecting them together, then recording the results. I imagine much of the algorithm could be reverse engineered by this if anybody were up to the task. I will probably work on it if nobody else does and publish my findings.
EDIT: Explained what I was trying to say better, fixed grammar.
I think it's a bit funny that with a score of -0.3 I am #2 on my own list (#1 being a girl whose profile I check out a bit too often), I'm stalking myself??
That said, I've always thought it was strange how FB comes up with the people not only in search but also the ten friends list on the left in profiles, the chat list on the right, and 'people you may know'. I've done a few tests w/ friends regarding the topic, and pretty much what everyone here has guessed seems to be true. There's a very good chance FB factors in people who are looking for you/interacting with your profile (unreciprocated), but of course they won't admit that..
I don't think it's such a big deal in either case. The top Chinese Facebook-like SNS site, http://renren.com/, shows everyone the most recent 9 visitors on a page, and it's often used (by would-be stalkers) as a way to overtly show interest "hey, i'm checking you out" kinda deal, or for couples to let each other know they are attentive. Otherwise, it's just friends keeping tabs on each other. I'll admit, at first I was weirded out by it, but now it seems almost normal.
I'm probably way out of the norm here, but I honestly wouldn't mind having this in FB.
Pretty much exactly what I was going to say about this (I'm #3 on my list). I just posted the article and my top 10 on my wall, tagging everyone in it. Curious as to what the non HN crowd thinks.
Great. You're probably going to be Ground Zero for next week's version of the "run this program to see who looks at your profile" rumor, and the week-after-next's bogus virus warning about running the script.
That is really strange to me. Maybe I don't stalk enough or something, but often when I look up a friend (someone who is actually in my friend list and I communicate often with) it seems to be impossible for Facebook to find this guy and I have to go directly to the friendlist search to get this person's profile. Always (or so often that I don't remember the exceptions, which is in usability terms "always") when I use the normal search bar I get people I don't know, have not even indirect relations with, and am not in any way interested in. Sometimes the names don't even contain the words I put in the search bar. That's why I really wonder why people can think of facebook's people search as something cool or "stealable".
As some have mentioned, doing a couple of tests with dummy accounts seems to indicate that a friend visiting your page can influence your first_degree.php, which would explain why there are a few people who you never stalk who happen to be on your list.
That's a possibility. The post that Keith Adams made however refutes this, but it is still possible that he is wrong/mistaken or that the first_degree.php was changed after he left Facebook.
I should have been more clear. Keith seems to indicate that views from other users don't affect your first_degree.php. I disagree, here was my process:
I built a dummy account in incognito that is friends with my regularly used profile (A) and a profile owned by a friend of mine (B). All friend requests were made from A & B TO the dummy. I then checked the first_degree.php of the dummy; as expected, me and my friend were first. Perhaps not surprisingly, the next people on the list were the intersection of A & B's friend lists.
Now I logged back in to A, knowing that B wasn't searching for the dummy, and searched for the dummy a few times a day for a couple days (I learned about first_degree a couple weeks ago).
Checking the dummy's first_degree showed that the entries past A & B now included more of A's friends.
My guess is that Keith is right in general. I might not be able to move myself up the ladder of the dummy, but I can change the Facebook Social Graph by changing my viewing behavior. The machine generated list uses that graph to determine first_degree. Even if it's indirect, it means first_degree can be influenced by searches of others. Or I'm wrong!
Thanks for all the comments and feedback on the bookmarklet. Glad it was entertaining. As mentioned by other commenters, the popularity of something like this does raise many interesting security issues.
Facebook rape (posting embarrassing statements when using a friend's account) is common. What happens when facebook rape becomes "post who your friend has a crush on"?
is facebook censoring this? i tried to post the link to the thekeesh.com page to a friends 'wall' and everything in my comment after and including the link was elided. Or maybe facebook post entries are always that broken.
I don't think they're censoring it -- The HTML on this page makes heavy use of the TABLE and SPAN elements for formatting. That seems to be something that makes it difficult for Facebook's link-grabber to read pages. I've encountered the same problem posting links to other sites with idiosyncratic HTML.
It's not that fast, Facebook actually starts the search query from the second letter (about 174 ms), before that, it uses the prefetched cache of first degree friends and apps.
I get why the parent was downvoted; it's not relevant to the topic. That said, when you open a page and all you see is the header and the article title? I think it's perfectly valid to comment on the design, or lack thereof, of the page.
Seriously, the article was a worthwhile read, but I don't see how you can NOT comment on the design.
https://www.facebook.com/video/video.php?v=432864835468
http://www.quora.com/How-does-the-new-Facebook-search-engine...
Edit: To clarify what a lot of people seem to be wondering, visiting someone's profile does not affect the search results of anyone but yourself.