Yahoo! Groups and the Legacy of Internet Content

by Nathan Kiesman (nkizz)

Yahoo! announced last year that they would be shutting down "Yahoo! Groups" on December 14th, 2019.

This service contains over 20 years of mailing list messages, photos, and other content that will be deleted after that date.  This is just the latest of several rushed shutdowns that Yahoo! has become known for, the most infamous of which was GeoCities.

GeoCities was the largest website host in the 1990s and early 2000s, and provided many people's first experiences with posting content on the Internet.  However, it had stopped making Yahoo! money, so it was unceremoniously shut down.  Other particularly large sites that have been shut down include Megaupload, parts of MySpace and Tumblr, and Google+, along with hundreds of smaller services, sites, and web forums that make up a significant portion of the culture and history of the web.

Senator Ted Stevens is famous for saying, "The Internet is a series of tubes."

Although this quote is often mocked, it's actually accurate.

All the Internet does is move bits from one place to another.  To retrieve a web page, there has to be a server on the other end with enough disk space, electricity, and bandwidth to run it.  All of these things cost money and effort to maintain.  This is fine when the web page makes enough money to justify its upkeep, or it's maintained by an entity who's interested in its continued existence.

However, when sites run out of money, this upkeep becomes an issue.  Many will say that nothing is ever truly deleted from the Internet, but that's only true if someone is around to copy it.  If a server running in someone's basement gets turned off or crashes, or the company maintaining it decides to shut it down, all the data is lost.

Barring situations like catastrophic hardware failure, web services are usually shut down because no one uses them anymore.

Hosting costs outweigh the revenue from advertising, and the operator stops paying the hosting bill.  So what if we can't access a bunch of web pages that haven't been updated since 2001?  After all, if they're not making enough money to support themselves, then clearly not many people care about them.

However, the erasure of these sites eliminates parts of the greatest trove of primary source documents that has ever existed.  User content online, especially the exact kind of un-updated content that is on these legacy services, provides unprecedented snapshots of life in the late nineties and at the turn of the century that don't exist for any other time period.

There are millions of web pages and posts created by regular people chronicling their lives, their loves, and their experiences.  This may not seem like history, but considering that many people alive today, myself included, were born after that time period, it already is.

Additionally, there's a lot of knowledge stored on the "old Internet" that is still directly useful today.  Every hobby imaginable most likely has 25 years of web forums, message groups, and websites with truly awful graphic design filled with advice and information that isn't available anywhere else.

However, I can't help but be optimistic.

The same democratization of content creation that allowed all this content to exist also applies to preserving it.  Most of the time, if content is still accessible, it's downloadable, and many people have dedicated themselves to doing so.

The biggest player in this space is the aptly named Internet Archive.  They maintain an archive of a staggering 330 billion web pages, and millions of videos, books, and audio recordings.  They also digitize older media, like books, tapes, and records.  Textfiles.com and Bitsavers.org are also archival projects with technical focuses like BBS and software archives.  Archive Team is a group of volunteers who archive "at risk" content on online services.

As I write this, Yahoo! Groups has officially shut down.  Even though Yahoo! actively attempted to prevent archivists from accessing the site, Archive Team was able to save over 90 percent of its content.

Even large entities are getting into the game, like Google maintaining a USENET archive and the Library of Congress keeping their own web archive.

Many of these projects, like so many other online projects, rely on a combination of volunteers and employees, and are funded by donations from individuals and interested corporations.  Also, like many other Internet projects, there are ways for individuals to help.  You can upload media to the Internet Archive, nominate web pages to be archived, run "warrior" programs that download web pages, and donate to the various archive organizations.
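Nominating a page can be as simple as requesting a specially formed URL.  What follows is a minimal sketch in Python, not an official client: it assumes the Wayback Machine's public "Save Page Now" address (https://web.archive.org/save/ followed by the page URL) and its availability lookup (https://archive.org/wayback/available), both of which may change or be rate limited.  The function names are my own and purely illustrative.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumed public Wayback Machine endpoints; they may change or be rate limited.
WAYBACK_SAVE = "https://web.archive.org/save/"
WAYBACK_AVAILABLE = "https://archive.org/wayback/available"


def save_url(page_url):
    """Build the URL that asks the Wayback Machine to capture page_url."""
    # Leave URL punctuation alone, escape anything unusual (spaces, etc.).
    return WAYBACK_SAVE + quote(page_url, safe=":/?&=%")


def availability_url(page_url):
    """Build the URL that asks whether a snapshot of page_url already exists."""
    return WAYBACK_AVAILABLE + "?" + urlencode({"url": page_url})


def request_capture(page_url, timeout=30):
    """Hypothetical helper: send the capture request, return the HTTP status."""
    with urlopen(save_url(page_url), timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Only prints the URLs; call request_capture() yourself to hit the network.
    print(save_url("http://example.com/page"))
    print(availability_url("example.com"))
```

Checking the availability address first avoids asking for a capture of something that has already been saved, which is kinder to a service that runs on donations.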

Additionally, for personal data, the GDPR requires sites to allow users to export their data, so people can back up their data before a site becomes defunct.

Like any other form of media, online media requires maintenance to preserve it.

Digital preservation presents its own set of problems and challenges to archivists of the 21st century that we are still learning how to overcome.

But as long as there are people creating, there will be people dedicated to preserving those creations too.
