06/14/2010 05:12 am ET | Updated May 25, 2011

Library Of Congress Twitter Archive: All Tweets Since 2006 To Be Acquired

The US Library of Congress announced a major new acquisition: it will be obtaining all public tweets dating back to March 2006.

Appropriately, the library spilled the news on Twitter via the official Library of Congress account (@LibraryCongress). The tweet read, "Library to acquire ENTIRE Twitter archive -- ALL public tweets, ever, since March 2006! Details to follow."

The Library of Congress directed users to its blog, which explained, "important tweets in the past few years include the first-ever tweet from Twitter co-founder Jack Dorsey, President Obama's tweet about winning the 2008 election, and a set of two tweets from a photojournalist who was arrested in Egypt and then freed because of a series of events set into motion by his use of Twitter."

"Expect to see an emphasis on the scholarly and research implications of the acquisition," wrote Library of Congress blogger Matt Raymond. "I’m no Ph.D., but it boggles my mind to think what we might be able to learn about ourselves and the world around us from this wealth of data. And I’m certain we’ll learn things that none of us now can even possibly conceive."

While there's no doubt tweets have made both news and history, including some that shook the world in recent years, the Library of Congress will also be archiving a good number of mundane tweets about our jogs and jobs, friends and family, check-ins and meals (people write about 55 million tweets a day). And how might knowing that their tweets will be archived by a federal institution change what users post?

ReadWriteWeb says of the acquisition, "It's hard to imagine a more significant milepost in social media's early march toward becoming an essential component of our social experience."

The Library of Congress has already started a collection of digital data. Ars Technica explains,

While archiving the entire Web and all its changes is simply impossible, the Library of Congress has collected a curated, limited subset of Web content "since it began harvesting congressional and presidential campaign websites in 2000." Today, it has 167TB of Web data.