Paper: Quantitative analysis of privacy compromising mechanisms on websites

Timothy Libert, Exposing the Hidden Web: An Analysis of Third-Party HTTP Requests on One Million Websites (pdf)
Abstract: This article provides a quantitative analysis of privacy compromising mechanisms on one million popular websites. Findings indicate that nearly nine in ten websites leak user data to parties of which the user is likely unaware; over six in ten websites spawn third-party cookies; and over eight in ten websites load Javascript code from external parties onto users’ computers. Sites which leak user data contact an average of nine external domains, indicating users may be tracked by multiple entities in tandem. By tracing the unintended disclosure of personal browsing histories on the web, it is revealed that a handful of American companies receive the vast bulk of user data. Finally, roughly one in five websites are potentially vulnerable to known NSA spying techniques at the time of analysis.

Update 06.11.2015: Also intriguing: "[...] We found that many mobile apps transmitted potentially sensitive user data to third-party domains, especially a user’s current location, email, and name. [...]" From: Who Knows What About Me? A Survey of Behind the Scenes Personal Data Sharing to Third Parties by Mobile Apps, by Jinyan Zang, Krysta Dummit, James Graves, Paul Lisker, and Latanya Sweeney