Rewriting History: Manipulating the Archived Web from the Present

October 30, 2017

Web archives such as the Internet Archive’s Wayback Machine serve a variety of important purposes today, including as sources of citations and evidence in journalism, scientific articles, and legal proceedings. In a new paper, Security Lab PhD alumna Ada Lerner (now an assistant professor at Wellesley College) and Lab co-directors Yoshi Kohno and Franzi Roesner show how a malicious actor might be able to manipulate what users see when they view archived pages. The image on the right shows a proof-of-concept example in which a 2011 snapshot of a website has been temporarily modified to show 2017 content.

For more details about how these attacks work and how to defend against them, see the Rewriting History project website or read the full conference paper. Dr. Lerner will be presenting this work this week at the ACM Conference on Computer and Communications Security (CCS) 2017.

We disclosed our results to the Wayback Machine before publication, and we are extremely grateful to Mark Graham and his team at the Internet Archive for their prompt and thoughtful responses in taking action to mitigate these attacks! They have already implemented Content-Security-Policy headers, which instruct client browsers not to load content from outside the Archive, closing off many of the vulnerabilities to one of our attacks. Additionally, they launched a new feature, described in this blog post, which shows users of the Archive how the timestamps of a page's subresources relate to the timestamp of the snapshot currently being viewed. This information can help expert users better interpret archival snapshots and catch “anachronistic” requests, which may result in benign or malicious modifications to the view of a page.
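To make the idea of an “anachronistic” request concrete, here is a minimal Python sketch that compares the capture timestamp embedded in each subresource URL against the timestamp of the page snapshot. It assumes Wayback-style URLs of the form `https://web.archive.org/web/<14-digit timestamp>/<original URL>`. The function names and the 30-day threshold are our own illustrative choices, not part of the Archive's feature or of the paper's tooling.

```python
import re
from datetime import datetime, timedelta

# Assumed URL shape: /web/<YYYYMMDDhhmmss><optional two-letter modifier + "_">/<original URL>
WAYBACK_RE = re.compile(
    r"^https?://web\.archive\.org/web/(\d{14})(?:[a-z]{2}_)?/(.+)$"
)


def snapshot_time(url):
    """Return the capture time embedded in a Wayback-style URL, or None."""
    match = WAYBACK_RE.match(url)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")


def find_anachronisms(page_url, subresource_urls, threshold=timedelta(days=30)):
    """Flag subresources whose capture time is far from the page's capture time.

    Returns a list of (url, offset) pairs. The offset is a timedelta
    (subresource time minus page time), or None when the URL does not
    point into the archive at all. The threshold is a hypothetical default.
    """
    page_time = snapshot_time(page_url)
    if page_time is None:
        raise ValueError("not a Wayback-style URL: " + page_url)

    flagged = []
    for url in subresource_urls:
        sub_time = snapshot_time(url)
        if sub_time is None:
            # Request falls outside the archive, e.g. to the live web.
            flagged.append((url, None))
        elif abs(sub_time - page_time) > threshold:
            flagged.append((url, sub_time - page_time))
    return flagged
```

For a 2011 page snapshot, a script captured the next day would pass unflagged, while an image captured in 2017 or a request to a live, unarchived URL would be reported. A large offset is only a hint that a closer look is warranted, since benign pages also mix capture times.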

With this paper, we are also releasing Tracking Excavator, a tool for measuring web tracking in the Archive. Tracking Excavator is described in more detail in our paper from USENIX Security 2016.