February 2, 2018

A Mystery: Firefox and User Privacy

I’m mystified by some Firefox browser privacy policies, and I wonder if anyone can help me understand them better.

I hadn’t been following browser HTTP referrer policy closely. I knew that referrals were sent, and that had always vaguely puzzled me from a privacy perspective, but I assumed that Smart People Were Working On It, and that there were probably reasons why things are the way they are. After reading this post on the Mozilla Security Blog yesterday, I suddenly wished I’d been following things more closely. The post is meant to tell us about how Firefox is getting better about privacy. But after reading it, I feel worse about privacy than I did before reading it. Here’s a summary of what the post says:

When you follow a link on web page X to go to web page Y, your browser sends Y’s server an indication that you were referred to Y by X. (This information is sent in the “HTTP Referer” [sic] header, for those keeping score at home; yes, it is probably the most famous misspelling in all of Web standards.) The referral information typically includes the entire URL of the page you’re coming from, that is, the site address and path of X. For example, for this post the site address is “www.rants.org” and the path is “/2018/02/a-mystery-firefox-and-user-privacy/”.
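To make the split between site address and path concrete, here is a minimal Python sketch using the standard library's urllib.parse. The https scheme and the helper name split_referrer are assumptions for illustration only; the description above gives just the site address and the path.

```python
from urllib.parse import urlsplit


def split_referrer(url):
    """Return (origin, path) for a URL: roughly the two pieces of a Referer value."""
    parts = urlsplit(url)
    origin = f"{parts.scheme}://{parts.netloc}"
    return origin, parts.path


# The https scheme is assumed here; the text only names the host and the path.
origin, path = split_referrer(
    "https://www.rants.org/2018/02/a-mystery-firefox-and-user-privacy/"
)
print(origin)  # the "site address" part, with its scheme
print(path)    # the "path" part
```

A full-URL referrer hands the destination both pieces; an origin-only referrer hands over just the first.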

Okay, pausing for a moment to ask the obvious first question:

Can I turn this off in my browser settings? Because maybe I consider that information private and don’t want to tell one web site what other web site I’m coming from.

Answer: not unless you have a Ph.D. in Firefox Studies. At least, in the “Preferences → Privacy Settings” menu of Firefox 52.5, there is no identifiable option for controlling this. You can do it via about:config, by setting network.http.sendRefererHeader to 0 instead of the default 2, but that way of setting preferences won’t fly for the majority of users. There really should be a way to do it from Firefox’s normal preferences dashboard.

Continuing with the post:

As of Firefox 59, when you’re browsing in Private Mode, Firefox will not send the path portion of the referrer information.

Well, uh, okay, that’s an improvement, I guess. But then why even send the origin site at all, even without the path? Shouldn’t “Private” mean private? In Private Browsing Mode, I would expect no referral information to be sent at all. Then, to make matters worse, a bit later the post says:

In Firefox Regular and Private Browsing Mode, if a site specifically sets a more restrictive or more liberal Referrer Policy than the browser default, the browser will honor the websites [sic] request since the site author is intentionally changing the value.

Now I’m even more confused. Why would the site author get to decide what the value should be? At all, I mean, but especially in Private Browsing Mode! I thought the whole point of Private Browsing Mode was that the browser user would decide that. Browser users are often in an adversarial relationship with site authors. The browser should take the user’s side in that relationship, every time.
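For readers who want to see the two behaviors side by side, here is a rough Python sketch of the decision as I understand it from the post: a browser default applies unless the linking page sets its own policy, in which case the page's policy wins. The function names are hypothetical, only three of the standard Referrer Policy values are modeled, and treating "strict-origin-when-cross-origin" as the Private Browsing default is my reading of the Mozilla post, not something verified against Firefox's source.

```python
from urllib.parse import urlsplit


def origin_of(url):
    """Scheme plus host (and port, if given) of a URL."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def referrer_to_send(source_url, target_url, browser_default, site_policy=None):
    """Model the Referer value for a navigation from source_url to target_url.

    Simplified: fragments and credentials in source_url are not stripped,
    and only three policy names are handled.
    """
    # A policy set by the linking site overrides the browser's default.
    policy = site_policy or browser_default
    if policy == "no-referrer":
        return None
    if policy == "unsafe-url":
        return source_url  # full URL, path included
    if policy == "strict-origin-when-cross-origin":
        src, dst = urlsplit(source_url), urlsplit(target_url)
        if src.scheme == "https" and dst.scheme != "https":
            return None  # never leak from HTTPS to a less secure target
        if origin_of(source_url) == origin_of(target_url):
            return source_url
        return origin_of(source_url) + "/"  # cross-origin: path is dropped
    raise ValueError(f"policy not modeled in this sketch: {policy}")


source = "https://www.rants.org/2018/02/a-mystery-firefox-and-user-privacy/"
target = "https://example.com/landing"  # hypothetical destination

private_default = "strict-origin-when-cross-origin"
print(referrer_to_send(source, target, private_default))
# → https://www.rants.org/
print(referrer_to_send(source, target, private_default, site_policy="unsafe-url"))
# → https://www.rants.org/2018/02/a-mystery-firefox-and-user-privacy/
```

The second call is the case that bothers me: the user chose the stricter mode, and the page's more liberal policy still brings the path back.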

I must be missing something here. Education welcome. (The answer might be somewhere under this post, but I haven’t found it yet.)

3 comments

David Glasser says:

February 3, 2018 at 5:00 pm

I’m not an expert but: The referral policy header is put on the page with the link, not the page receiving the extra information. If the page with the link wants the link target to have information about that page, it can always get that information in (say, by adding an extra query parameter to the URL). The main privacy concern is about the link target getting extra information from the link source, not about the link source explicitly including extra information.

I.e., if you don’t want a site to know you got there from Google search but Google wants the site to know that, it can get the information through no matter whether there is a Referer header or not.

Of course in practice maybe many people will just cargo cult in an overly sharing referral policy anyway. But I can see how in theory there’s a difference.

Karl Fogel says:

February 4, 2018 at 2:50 am

Thanks, David. That’s an interesting point, that the referring site could add an extra query parameter anyway. However, that would require a lot more cooperation and standardization between sites than is currently the case… Now, one could view the “HTTP Referer” header as an example of precisely such standardization, but that’s begging the question: the point is, once it’s standardized, the browser can identify it and turn it off, either by not sending the header or, in some counterfactual universe, by appropriately modifying the query parameters on the URL the user clicked on.

Furthermore, URLs and HTTP headers are very different beasts. If I go to a URL, then the destination site certainly knows I have requested that URL. But users are much more able to see and control the URLs they visit than they are able to tamper with the headers being sent with an HTTP request — I mean, headers aren’t even visible, whereas at least URLs are. When an “HTTP Referer” header comes in, the receiving site can be pretty sure that it accurately tells where the user is coming from. From just the URL, the site can’t be so sure; it’s strong evidence, maybe, if there’s some special query parameter in there, but it’s not definitive. After all, someone else might have copied the URL and handed it to me. Now when I go there, all the query parameters are preserved, sure, but the referral information is lost (as it should be). Also, a page can make up whatever parameters it wants to a link, but a page can’t make my browser randomly lie about a Referer header value, right? The page might be able to influence the value within certain parameters, but it can’t change it to an arbitrary string the way it can with query parameters on a link’s URL… Or can it?

If that’s correct, then in (say) a court of law, I would probably accept an “HTTP Referer” value as evidence that someone came from a certain site, but I wouldn’t accept a URL alone, because URLs are easily copied and pasted and edited.

(And yes, I am arguing that I, as the browser’s user, should be able to change the Referer value to any arbitrary string I want, whereas some page sent to my browser by a server should not have that ability. “It’s my computer, dammit!” and all that.)

In private email, another reader said:

“I think what you are missing is that Referer: has been around since the dawn of the web and turning off may break stuff? It was a huge mistake and we should get rid of it, but I’m not sure Mozilla can do it all at once, or unilaterally.”

To which my response is:

Turning it off, at least for origin-crossings, can’t really break anything, right? Because any link you click on in a web page is a link you could have just visited directly. I mean, the destination site *might* in some cases vary its behavior depending on the “HTTP Referer” value, but it couldn’t be a very reliable or usable site if it truly *depended* on that header, since then URLs would break in unpredictable, user-mystifying ways.

So I remain puzzled as to why browsers continue to leak private information by sending the header at all, and especially about why Mozilla Firefox in particular would ever send the header when in Private Browsing Mode.

In the meantime, I’ve turned it off completely in about:config, as described in the post. We’ll see if the Web breaks for me :-).

Karl Fogel says:

February 7, 2018 at 12:11 pm

Update: Well, turning off “HTTP Referer” completely doesn’t lead to good things. For example, we use Zulip chat at our company, and Zulip basically doesn’t work in Firefox if you turn off referral-sending.

So much for that experiment. There are some Firefox extensions (add-ons? plugins? I can never keep that terminology straight.) for controlling referrer information; maybe one of those will help. I don’t mind same-origin referrer information being sent. It’s the cross-site referrals that really bother me, from a privacy perspective.
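What I'm after amounts to a simple rule: send the referrer when source and destination share an origin, and send nothing otherwise. Here is a minimal Python sketch of that rule. The function names and the chat.example.com host are hypothetical, and this is a model of the behavior I want, not a description of how Firefox or any extension implements it.

```python
from urllib.parse import urlsplit

# Default ports, so that https://host and https://host:443 compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_tuple(url):
    """Return (scheme, host, port) for a URL, filling in the default port."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def referrer_same_origin_only(source_url, target_url):
    """Send the full referrer within one origin, and nothing across origins."""
    if origin_tuple(source_url) == origin_tuple(target_url):
        return source_url
    return None


# Within one (hypothetical) site the referrer is kept, so the app keeps working.
print(referrer_same_origin_only(
    "https://chat.example.com/login", "https://chat.example.com/home"))
# Across sites nothing is sent.
print(referrer_same_origin_only(
    "https://www.rants.org/2018/02/a-mystery-firefox-and-user-privacy/",
    "https://chat.example.com/home"))
```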
