Referrer URLs and Privacy Risks

The Wall Street Journal’s recent article in the "What They Know" series discussed the problem of Facebook IDs being passed to ad networks.  This is a serious potential privacy risk – and most Facebook applications are impacted by this issue.

The underlying issue is with a piece of the HTTP header called the referrer URL.  We recognize that referrer URLs are a major industry-wide problem with the structure of internet security, so Rapleaf has taken extra steps to strip out identifying information from referrer URLs.

When we discovered that Facebook IDs were being passed to ad networks by applications that we work with, we immediately researched the cause and implemented a solution to stop the transmissions.  As of last week, no Facebook IDs are being transmitted to ad networks in conjunction with the use of any Rapleaf service.  The transmissions, when they occurred, were not the result of any purposefully engineered process by Rapleaf. Instead, they were due to broader issues -- as discussed in the article -- concerning site referrer URLs, which are managed by the sites themselves and by ad networks.

We are committed to working with the industry to fix these issues, and all issues that may emerge in the future from this complex ecosystem.  Our mission is that everyone can have a personalized experience on the web that is safe and anonymous, and we will continue to work hard to make this a reality.

Below are more details about referrer URLs and steps the industry should take to eliminate the privacy risk.



Referrer URLs have been core to the web since its creation in the early 1990s. They have a number of useful functions, such as helping a website administrator understand which other websites link to her site and how her visitors are finding her. But referrer URLs come with a cost to user privacy, a cost that is not widely acknowledged and is generally underestimated.

What are "Referrer URLs"?

When you visit the new frozen yogurt shop in town, the owner might ask you how you heard about the store, and you might reply that you read about it in Bob Friedman's column in the County Chronicle. This referral information is valuable to the yogurt shop owner because it helps her understand which of her marketing efforts are successful (in this case, soliciting Bob's column).

A similar process is at work on the web. If you click a link on the countychronicle.com website that takes you to tastyfroyo.com, your browser will typically tell the tastyfroyo.com web server that you are coming from a particular page at countychronicle.com. More specifically, your browser makes an HTTP request for a page on the tastyfroyo.com server, and within this HTTP request is a field named "Referer" (yes, that is how it is spelled, or misspelled) that contains the URL to the linking countychronicle page.
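As a rough illustration of the request described above, the Python sketch below builds the text of a minimal HTTP/1.1 request and then reads the Referer field back out, as the receiving server might. The column path on countychronicle.com and the helper names `build_request` and `read_referer` are assumptions made up for this example; a real browser sends many more headers.

```python
from typing import Optional

# Hypothetical page on countychronicle.com that contains the link.
LINKING_PAGE = "http://countychronicle.com/columns/bob-friedman"


def build_request(host: str, path: str, referer: str) -> str:
    """Return the text of a minimal HTTP/1.1 GET request with a Referer header."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        f"Referer: {referer}",  # the header name keeps its historical misspelling
        "",
        "",
    ]
    return "\r\n".join(lines)


def read_referer(raw_request: str) -> Optional[str]:
    """Return the Referer value found in a raw request, or None if absent."""
    header_block = raw_request.split("\r\n\r\n", 1)[0]
    for line in header_block.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "referer":
            return value.strip()
    return None


request_text = build_request("tastyfroyo.com", "/", LINKING_PAGE)
print(request_text)
print(read_referer(request_text))
```

The point of the sketch is that the tastyfroyo.com server learns the full URL of the linking page, not just the name of the site.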

What's the problem?

If you didn't want to tell the frozen yogurt owner that you learned about her business from Bob Friedman's column, you could have chosen not to tell her (you might not want her knowing that you're the kind of person who reads the County Chronicle). But on the web, your browser will automatically send the referrer URL. (Some browsers allow you to disable the transmission of the referrer URL, but this is rarely done.) This automatic referrer transmission leads to two types of privacy problems.

One problem is that if you visit a site anonymously (that is, you do not provide your identity), the site could potentially discover your identity based on information passed along in the referrer URL, thus breaching the principle of presumed anonymity. This problem has been discussed in a number of places. The HTTP/1.1 specification acknowledges it as well, stating, "Although [the Referer URL] can be very useful, its power can be abused if user details are not separated from the information contained in the Referer."
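To make the first problem concrete, here is a minimal Python sketch of how a receiving site could pull an identifier out of a referrer URL. The URL, the parameter names in `ID_PARAMS`, and the function `identifiers_in_referer` are all hypothetical, chosen only to illustrate the idea; they do not describe any particular site's URL format.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical query parameter names that might carry a user identifier.
ID_PARAMS = {"user_id", "uid", "id"}


def identifiers_in_referer(referer: str) -> dict:
    """Return any query parameters in the referrer URL that look like identifiers."""
    query = parse_qs(urlsplit(referer).query)
    return {name: values for name, values in query.items() if name.lower() in ID_PARAMS}


# Assumed example: the linking page's own URL contains the visitor's ID.
referer = "http://apps.example.com/play?user_id=12345&level=3"
print(identifiers_in_referer(referer))  # {'user_id': ['12345']}
```

The visitor never typed the identifier into the second site; it arrived only because it was part of the address of the page they came from.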

A second problem is less discussed. If you are visiting a site that knows your identity (i.e. any site you're logged into), then this site may receive referrer URLs of other pages on the web that you have visited. For example, you may visit a web page about a particular medical condition, click a link on that page to a site that knows your identity, and now that site can associate your identity with having visited that particular medical webpage.

These problems are compounded by the fact that in actuality most web pages are composed not of a single HTTP request, but many dozens of HTTP requests (every image on a web page, for example, is a separate HTTP request). Factoring in iframes and redirects, the prescribed behavior of referrer URLs isn't even always clear.

What should be done?

First, websites that link to external websites must take care not to include personally identifying information in page URLs that may end up being sent as referrer URLs. Many major sites do a good job with this, though it is not always easy to be comprehensive.
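One possible way to approach this, sketched in Python under stated assumptions, is to remove identifying query parameters from a page's URL before that page shows any external links, for example by redirecting the visitor to the cleaned URL. The parameter names in `SENSITIVE_PARAMS` and the function `strip_identifying_params` are hypothetical; a real site would need its own list, and identifiers embedded in the path would need separate handling.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names a site might treat as identifying.
SENSITIVE_PARAMS = {"user_id", "uid", "email", "session"}


def strip_identifying_params(url: str) -> str:
    """Return the URL with sensitive query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in SENSITIVE_PARAMS
    ]
    # The fragment is dropped as well; browsers do not send it in the Referer field.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(strip_identifying_params("http://apps.example.com/play?user_id=12345&level=3"))
# http://apps.example.com/play?level=3
```

A blocklist like this is only as good as the list itself, which is part of why being comprehensive is hard.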

Second, we need to give deeper thought to whether the privacy risks associated with referrer URLs can be adequately managed. Referrer URLs are used by most websites for constructive purposes (e.g. link statistics, or preventing hotlink bandwidth theft). If the privacy risks cannot be managed, then privacy-centric browsers may decide to turn off referrer URLs entirely.

10 Responses to “Referrer URLs and Privacy Risks”

  1. Stephen October 17, 2010 at 10:31 pm #

    As much as you talk about anonymity and the importance of privacy, the point of your company is to try to track people using information that they didn't intend to give you. Hence your name, similar to "Rap Sheet." You're not fooling anybody.

    I also remember when you were explicitly developing a service that people could use to look up information on other people: Rapleaf.com 3 years ago

    So I can see why somebody wouldn't trust you now.

  2. Tolman Marks October 18, 2010 at 8:37 am #

    Great piece on referrer URLs. This helps explain many things that I worry about. While you guys are pushing the envelope on personalization, you're also thoughtful on privacy and I appreciate your candor.

  3. Jennifer Duxfra October 19, 2010 at 10:18 am #

    I really appreciate that you gave me an easy way to see my data and opt-out. My husband chose to opt-out while I chose to delete just one thing about me that was personally sensitive. It was very easy to do which is more than I can say for 95% of data companies.

  4. Paul Chaney October 19, 2010 at 2:57 pm #

    MailChimp, among others, depends on the data you gather for association with list members using Social Pro. In light of the recent Facebook privacy breach, is gathering user data for use by a third-party, in this case MailChimp, also a breach of privacy? The WSJ seems to think so as it named you in particular.

    My intention is not to take a cheap shot at you or throw stones (I don't perform those kinds of stunts). I do want to spark a conversation about the issue of privacy and whether or not you see what you're doing as possibly being culpable. And, if not, what the difference is.

    In fairness, I asked the same question to the folks at Flowtown too.

  5. PhilGo20 October 19, 2010 at 4:36 pm #

    Great piece. Too many people only think about referrer URLs as part of the analytics data of their website when it can be much more.

  6. Michael Chang October 20, 2010 at 10:26 am #

    This article was extremely helpful to me in understanding referrers and how data is passed in HTTP requests. I have asked people in my company to read this thoroughly.

    One thought: every time I do a search on Google (or Bing, Yahoo, whatever), my search terms are sent to the site I am going to. I guess that could be a good thing as they can better customize my experience. But it is also worrying that my data is sent without my permission. Is there an easy way for browsers to stop this from happening?

  7. Waste Collection December 4, 2010 at 12:12 am #

    Nice post! i like it , so interesting...Thank you for sharing it.

  8. Trophy Engraving December 4, 2010 at 12:13 am #

    Great post!. It's clear, helpful and so interesting.

  9. geoge July 28, 2011 at 8:49 am #

    oQgkHf http://fnYwlOpd2n9t4Vx6A3lbk.com

  10. Waste Collection December 6, 2011 at 2:57 am #

    This is a well thought out post on the privacy risks of referrer URLs. I particularly like the comparison you make between online and real life and how in real life we are given the chance to make the choice. We lose that ability online...
