After I posted my troubles on the Newsgator Support Forum, it was determined that there was a bug in the “Delete all posts on this page” function. In less than three days, they fixed the bug, got the code into production, and restored all my lost clips. My 300 lost clips were joined by about 400 other clippings that I had saved and subsequently deleted over the past year. While I’m very happy to have my clippings back, the episode got me thinking about how Web 2.0 companies retain data.
Now, Newsgator mostly has my attention data. I consider this data to be fairly public, but some folks might disagree. The fact that they were able to restore my data means that they are retaining it for some period of time. Every company that stores user data on the web faces a choice. What do they do with data that is “deleted” by users? There’s an obvious value in keeping it, both for the customer and for the business. The customer might want the data restored. The business might want it for historical analysis.
Now, I haven’t researched any of these companies, so I don’t know what the answers are. Just food for thought. If you remove a photo from Flickr, is it really gone? What about the email you delete from Hotmail or Gmail? The draft blog post on Blogger that you decided not to publish?
We’re used to data retention questions coming up in a work context, but more and more of our personal data is living in data farms operated by companies like Yahoo!, Google, and Microsoft. I’m sure if you dig, you can find most of the data retention policies, probably in the long legalese usage agreements that normal users click past without reading. This probably won’t come to the public’s attention until some high-profile criminal prosecution pulls out all the stops and subpoenas all this retained data.