Amazing Databases: The Wayback Machine

A permanent archive of the Web: some 200 million sites' worth


The purpose of the Wayback Machine is to copy and store the Internet. Since the San Francisco–based nonprofit Internet Archive created the database 15 years ago, browsing software known as crawlers has captured 180 billion Web page snapshots from more than 200 million sites.

Now, at four petabytes, with another 35 to 40 terabytes added every month, the Wayback Machine is the largest accessible Web archive in existence. Plug in the URL of, say, a shuttered blog, and you’ll get a timeline of crawl dates, most of which link to a working version of the site as it appeared on that day. The Wayback Machine is free, so any curious browser can use the data for historical research or to study the evolution of the Web. Researchers at the Library of Congress, for example, used the Wayback Machine to assemble a gallery of websites as they appeared on September 11, 2001, and in the following three months.
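For readers who would rather script a lookup than type a URL into the search box, here is a minimal Python sketch. It only builds the addresses and makes no network request. It assumes the archive's present-day public URL conventions, which the article itself does not describe: a replay address of the form web.archive.org/web/<timestamp>/<url> and an availability endpoint at archive.org/wayback/available. The function names and the example.com target are illustrative, not part of any official client.

```python
from datetime import date
from urllib.parse import urlencode

# Assumed public URL conventions of the Wayback Machine (not taken from the article).
WAYBACK_REPLAY = "https://web.archive.org/web"
WAYBACK_AVAILABILITY = "https://archive.org/wayback/available"


def wayback_timestamp(day: date) -> str:
    """Format a date as the YYYYMMDD prefix used in snapshot timestamps."""
    return day.strftime("%Y%m%d")


def replay_url(page_url: str, day: date) -> str:
    """Build an address that asks for the capture closest to the given day."""
    return f"{WAYBACK_REPLAY}/{wayback_timestamp(day)}/{page_url}"


def availability_query(page_url: str, day: date) -> str:
    """Build a query asking whether a capture exists near the given day."""
    params = urlencode({"url": page_url, "timestamp": wayback_timestamp(day)})
    return f"{WAYBACK_AVAILABILITY}?{params}"


if __name__ == "__main__":
    # Hypothetical example: how did example.com look on September 11, 2001?
    day = date(2001, 9, 11)
    print(replay_url("example.com", day))
    print(availability_query("example.com", day))
```

Pasting the first printed address into a browser should land on the capture nearest that date, if one exists; the second is meant to be fetched by a program, which can then read the response to find the closest snapshot.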
