
Referrer leaks in self-hosted web apps

Jul 2015

Referrer headers are a browser mechanism that websites use to track where their visitors come from. When you follow a link from one site to another, your browser will often tell the new site which URL you were previously looking at. The same thing happens when one site contains images, CSS stylesheets or fonts loaded from an external domain.

Sensitive information can be leaked via the Referrer header, and the leaks are subtle and unexpected because the information is sent invisibly and without the user’s consent. In the past, high-profile sites including Facebook, Dropbox, Google and HealthCare.gov have inadvertently leaked information via the Referrer header. I did some testing with a handful of self-hosted web apps to see if they also contained referrer leaks.

Test process

  1. Open the browser’s network inspector to begin recording HTTP requests.
  2. Navigate to a self-hosted web app and use it normally.
  3. Check for any requests to third-party sites and make note of URLs contained within their referrer headers.
  4. If the app doesn’t make any third-party requests on its own, try inserting and clicking on external links. Insert and view externally-hosted images/videos where possible. Referrer leaks are caused by linking or embedding external content, so the goal is to jam it into any place where it gets displayed back to the user.
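Step 3 can be partly automated. Most browser network inspectors can export a recording as a HAR file (JSON), and the sketch below walks such a capture and lists every request to a different host that carried a referrer. It is only an illustration: the function name `find_referrer_leaks`, the sample hosts and the hand-written capture are made up, and the host comparison is deliberately naive (any other hostname, including a subdomain, counts as third-party).

```python
import json
from urllib.parse import urlsplit


def find_referrer_leaks(har, app_host):
    """Return (third-party host, referrer URL) pairs found in a HAR capture."""
    leaks = []
    for entry in har["log"]["entries"]:
        request = entry["request"]
        target_host = urlsplit(request["url"]).hostname
        if target_host is None or target_host == app_host:
            continue  # first-party request, nothing leaves the app
        for header in request["headers"]:
            # On the wire the header is spelled "Referer" (one r).
            if header["name"].lower() == "referer":
                leaks.append((target_host, header["value"]))
    return leaks


# Tiny hand-written capture in HAR shape; the hosts are invented for illustration.
sample_har = {
    "log": {
        "entries": [
            {"request": {"url": "https://pad.example.org/p/secret-plans",
                         "headers": []}},
            {"request": {"url": "https://fonts.example.net/font.woff",
                         "headers": [{"name": "Referer",
                                      "value": "https://pad.example.org/p/secret-plans"}]}},
        ]
    }
}

for host, referrer in find_referrer_leaks(sample_har, "pad.example.org"):
    print(f"{host} received referrer {referrer}")
```

For a real capture, load the exported file with `json.load(open("capture.har"))` (the filename is a placeholder) and pass the app's hostname as the second argument.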

Results

Of the 24 apps tested, 21 could be made to send at least some information to third-party sites via Referrer headers.

| App | Leaks referrers | Notes | Full notes |
| --- | --- | --- | --- |
| Cozy Contacts | Yes | Only leaks app URL. | * |
| Cozy Emails | No | Emails are contained within a sourceless iframe. | * |
| Diaspora | Yes | Referrers can contain profile IDs, post IDs, tag names. | * |
| EtherCalc | Yes | Sheet name/ID is revealed. Knowing the sheet ID allows editing. | * |
| Etherpad | Yes | Pad name/ID is revealed. Knowing the pad ID allows editing. | * |
| Feedbin | Yes | Also sends referrers to Instagram, Stripe, SubToMe and Twitter during setup. | * |
| GitLab | Yes | Referrers can contain usernames, group names, repo names, branches, filenames, wiki pages, commit IDs. Also sends referrers to Gravatar by default. | * |
| Gogs | Yes | Referrers can contain usernames, repo names, branches, filenames, commit IDs. Also sends referrers to Bootstrap CDN, jQuery, Gravatar. | * |
| IPython Notebook | Yes | Referrer contains notebook filename and folder names, which also get sent to MathJax. | * |
| Mailpile | Yes | Referrer contains thread ID. | * |
| MediaGoblin | Yes | Referrers can contain usernames, media titles, collection titles. | * |
| Miniflux | Yes | Uses rel="noreferrer" on links. Can proxy images. Embedded videos still leak referrers. Fix merged for next release. | * |
| ownCloud Bookmarks | No | Uses rel="noreferrer". | * |
| ownCloud Contacts | Yes | Only leaks app URL. | * |
| ownCloud Mail | No | Uses rel="noreferrer", redirect pages and image proxying. | * |
| Roundcube | Yes | Nothing leaked unless user clicks “display images”. Referrer contains mailbox name, internal message ID. | * |
| selfoss | Yes | Only leaks app URL. | * |
| Shaarli | Yes | Referrers can contain tag names. Optionally uses an online redirection service to mask referrers. | * |
| ShareLaTeX | Yes | Referrer contains project ID. Also sends referrers to Bootstrap CDN, Google Fonts. | * |
| Shout | Yes | Also sends referrers to Google Fonts. | * |
| SquirrelMail | Yes | Referrer contains mailbox name, message ID. | * |
| Tiny Tiny RSS | Yes | Uses rel="noreferrer" on some links. Image proxying can be enabled, off by default. | * |
| wallabag | Yes | Referrers can contain link ID, tag ID, search terms. | * |
| YaCy | Yes | Referrer contains search terms, page number. Proxies images on the results page. | * |

Paper cuts

Referrer leaks are widespread and vary greatly in severity. Some of the ones shown here are minor and require jumping through hoops to trigger (Cozy Contacts), while others result in potentially sensitive information being sent to third parties automatically (GitLab, Gogs).

Leaking user data is obviously a problem, but even if the referrer URL doesn’t contain personal information it can still be used to track users. If you visit a page and your browser reveals that you arrived via the Google search page, that’s not a big issue. Lots of people use Google, so it’s not enough information to identify a single person. Arriving from a smaller site narrows down the pool of people, with the extreme case being self-hosted web apps where there is possibly only a single person using the site.

The referrer mechanism has existed in browsers for ages. It is reasonably well understood among web developers and finding leaks in the wild is neither particularly interesting nor technically challenging. I can see why it’s not a high-priority issue, though it would be great if browsers defaulted to protecting their users by not sending referrers at all.

Mozilla has plans to send shorter referrers by default. This is a good first step and would eliminate some of the worst leaks. However, for self-hosted web apps, revealing the domain is enough to track users so shortened referrers alone wouldn’t fully protect them.
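To make the limitation concrete, here is a small sketch of what "shortening" means: trimming the referrer to its origin. The function name and the example URL are invented, and this is not how any browser implements it; it only shows that the path disappears while the hostname, which is the identifying part for a single-user install, survives.

```python
from urllib.parse import urlsplit


def shorten_referrer(url):
    """Trim a referrer URL down to its origin (scheme, host, optional port)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if ":" in host:
        host = f"[{host}]"  # restore the brackets around IPv6 literals
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return f"{parts.scheme}://{host}/"


# Hypothetical self-hosted Git server with a single user.
full = "https://git.alice.example/alice/private-repo/blob/master/notes.txt"
print(shorten_referrer(full))  # → https://git.alice.example/
```

The repo name, branch and filename are gone, but `git.alice.example` still tells the third party exactly whose server the visitor came from.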

Patching the leaks

I think it’s still worthwhile fixing the leaky apps where possible, and recently that has become a lot easier. Limiting referrer headers previously involved ugly hacks such as redirect pages, image proxying and iframe trickery. Now there are saner methods, as described in the upcoming Referrer Policy spec. Here’s a test suite to find out which mechanisms work in browsers today.
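As one illustration of a "saner method", the sketch below wraps a Python WSGI app so that every response carries a `Referrer-Policy: no-referrer` header. Some assumptions to flag: the header name is the one used in the Referrer Policy spec as it was later published, so browser support should be checked against the test suite; the app being WSGI is an assumption, and `no_referrer_middleware` and `demo_app` are hypothetical names rather than code from any of the projects above. The same policy can usually be expressed in HTML with a `<meta name="referrer">` element or per link with rel="noreferrer".

```python
def no_referrer_middleware(app):
    """Wrap a WSGI app so each response sends 'Referrer-Policy: no-referrer'."""
    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Drop any policy the app already set so only one value is sent.
            headers = [(k, v) for k, v in headers
                       if k.lower() != "referrer-policy"]
            headers.append(("Referrer-Policy", "no-referrer"))
            return start_response(status, headers, exc_info)
        return app(environ, patched_start_response)
    return wrapped


def demo_app(environ, start_response):
    """Stand-in app that serves a page containing an external link."""
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [b'<a href="https://example.com/">external link</a>']


application = no_referrer_middleware(demo_app)
```

Any WSGI server can serve `application`; for a quick local check, `wsgiref.simple_server.make_server("localhost", 8000, application).serve_forever()` from the standard library is enough.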

So far I’ve had patches accepted into EtherCalc, Mailpile, Gogs, MediaGoblin and Shout. There’s so much left to do! If you use self-hosted web apps then I would encourage you to check for leaks. The patches are often single-line code changes and are an easy way to begin contributing to a project.

If all else fails, Firefox and Chrome both have internal settings to disable referrers across all sites, all the time. In Firefox it’s named network.http.sendRefererHeader (set it to 0) and in Chrome it’s the --no-referrers command-line flag. Revisit the test suite after making those changes and you should see all green.