Wednesday, May 25, 2005

Real-Time Backup

I started prototyping an application for real-time backup. You fire it up and point it at the directories you would like to track. The app takes a snapshot of the directory contents and then begins tracking file changes. When it detects a change, it makes a note and then stores the difference between the snapshot and the new file. I have not written the actual differencing part yet, for good reason.
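To make the snapshot-and-track idea concrete, here is a minimal sketch in Python. It is not the code from my prototype: it polls by comparing two snapshots instead of listening for file system notifications, and the names `take_snapshot` and `compare_snapshots` are made up for illustration. It only reports which files changed; it does not compute the actual difference.

```python
import hashlib
import os


def take_snapshot(root):
    """Map each file's path (relative to root) to a SHA-256 digest of its contents."""
    snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                with open(full, "rb") as fh:
                    digest = hashlib.sha256(fh.read()).hexdigest()
            except OSError:
                continue  # file vanished or is locked; skip it this round
            snapshot[os.path.relpath(full, root)] = digest
    return snapshot


def compare_snapshots(old, new):
    """Return sorted lists of (added, removed, modified) relative paths."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    modified = sorted(p for p in new if p in old and old[p] != new[p])
    return added, removed, modified
```

Calling `take_snapshot` once at startup and again on a timer, then feeding both results to `compare_snapshots`, gives a crude change log. Hashing every file on every pass is slow for large directories, so a real tool would more likely react to operating system change notifications.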

Why do I need real-time backup? The first reason is trivial. My desktop frequently becomes cluttered with documents I no longer need, but which I have to retain just in case management needs something in the future. I would like the ability to simply delete every file and be secure in the knowledge that I have lost no data. The second is more important. At my job I frequently must work with large Excel files. Sometimes I spend tens of hours working with a single spreadsheet (cleaning data, performing analysis, coding, etc.). Every once in a while I take the time to make a backup copy of my work. But that’s not enough. Several times I have lost a whole day of work because Excel has lame undo capabilities. The moment you save a file, your undo history is gone. I have this habit of pressing Ctrl+S every few seconds. Hence, I never have more than a couple of seconds of undo. So I would like an application that will take every single save I make and store the complete history, transparently. There is no reason a user should have to actively store content changes, ever. None. Nada. Zilch. Except...

Real-time backup of Office documents is nearly impossible. I wrote my application all the way up to the point where it makes a note that a file has changed. I pointed the app at a sample Excel file and... changing a single character results in 20 or 30 file changes. And none of the changes occur in the file you are working in. It looks something like this:

1) Your original file is untouched
2) Make a change and save
3) Excel makes the change in any number of temporary files
4) Excel builds a new file with the change
5) Excel deletes your original file
6) Excel renames the file from step 4 to your original file name
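The steps above could be collapsed into a single "the user saved this file" event with something like the following Python sketch. Everything here is an assumption for illustration: the event tuples are a made-up format, not the output of any particular file-watching API, and `find_logical_saves` only recognizes the delete-then-rename pattern described in the list. An editor that saves differently would slip right past it.

```python
def find_logical_saves(events, tracked):
    """Collapse a burst of raw file events into logical saves.

    events:  list of tuples in a hypothetical format, either
             ("created" | "modified" | "deleted", path) or
             ("renamed", source_path, destination_path).
    tracked: set of file names the user actually cares about.

    A tracked file counts as saved when it is deleted and some other
    file is later renamed onto its name (steps 5 and 6 above).
    """
    deleted = set()
    saves = []
    for event in events:
        kind = event[0]
        if kind == "deleted":
            deleted.add(event[1])
        elif kind == "renamed":
            _source, destination = event[1], event[2]
            if destination in tracked and destination in deleted:
                saves.append(destination)
                deleted.discard(destination)
        # "created" and "modified" events on temporary files are ignored
    return saves


# Example burst, using invented temporary file names.
burst = [
    ("created", "tmp1"),
    ("modified", "tmp1"),
    ("created", "tmp2"),
    ("modified", "tmp2"),
    ("deleted", "report.xls"),
    ("renamed", "tmp2", "report.xls"),
    ("deleted", "tmp1"),
]
saved = find_logical_saves(burst, {"report.xls"})
```

Once a logical save is identified, the backup tool would compare the new contents of the tracked file against the previous version, ignoring all the temporary-file noise in between.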

Basically, in order to solve the real-time backup problem, I would have to build a bunch of code to identify the pattern and store the real change. Word works pretty much the same way, but it raises issues not present with Excel. There are multiple editors for .doc files. What if the user edits the file in WordPad instead of Word? What if WordPad has a different pattern for saving the file than Word? Well, then maybe I could start tracking which application has which file handles open... maybe... but as you can tell, it all becomes a complicated mess.

1 Comment:

Anonymous said...

I just slapped my huge e-comment all across your e-blog...

5/27/2005 10:47 AM  
