File formats

jay

I have a question concerning file formats. As a small business manager, I
find that using personal computers in the workplace is essential. Over the
course of any fiscal year, documents and files are produced that are
important to retain from year to year.

Often these important files are generated in Microsoft Word and Excel in my
work. I recently referred to one such file that was 5 years old to see the
frightening words: “File Error: data may have been lost.”

My question relates to which file formats are best for file retention. I am
not interested in paying a company to provide me archival services. What I
am interested in is learning some features of file formats which will provide
me with a better way of doing my work.

If a document is generated in Word as FILE.DOCX or Excel as FILE.XLSX, would
it better survive the tortures of storage on an active, operational hard
drive in a PDF format? Are graphic formats more resilient to time than
others? Surely some formats have fewer problems than I’ve been experiencing.
What is the best process for retaining important files? Should I convert
all retained files to a TIFF format?

Surely this is not a new subject of investigation. Many other industries
have a greater need to maintain files than I do. I would expect this to be
commonly held knowledge in the computer science community.
 
Steve Rindsberg

> If a document is generated in Word as FILE.DOCX or Excel as FILE.XLSX, would
> it better survive the tortures of storage on an active, operational hard
> drive in a PDF format?

You're really looking at several problems here:

- Protecting the data against computer/mechanical loss. It doesn't much matter
what format you choose for the files if the hard drive toasts itself and all of
the files are gone.

- Ensuring that the media on which the data is stored will be readable in the
future. For example, data stored on 8" and 5.25" floppies is gone for all
practical purposes. The same is, or soon will be, true for 3.5" floppies.
Even if you happen to have a drive that can read 'em, the magnetic media has
either flaked away or lost its magnetism.

- Ensuring that data created in earlier software is readable in tomorrow's
software.

Some formats may be more "self-repairable" than others in the event that a few
bytes are trashed, but I suspect that you'd want to be more concerned with
losing the whole hard drive or major chunks of it, in which case the data's just
gone (or recoverable only by very expensive specialists).
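
To make the "few bytes are trashed" case a bit more concrete: DOCX and XLSX files are ZIP containers, so one cheap way to spot damage early is to ask a ZIP library to re-read every member and check its CRC. The sketch below is only an illustration, not a repair tool; the function names `check_office_file` and `scan_folder` are made up for this example, and a file that passes this check can still fail to open in Word or Excel for other reasons.

```python
import zipfile
import zlib
from pathlib import Path

# Extensions of Office formats that are ZIP containers.
OFFICE_SUFFIXES = {".docx", ".xlsx", ".pptx"}


def check_office_file(path):
    """Return None if the ZIP container reads cleanly, else a short problem description."""
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() reads every member and returns the first one whose CRC fails.
            bad_member = archive.testzip()
    except (zipfile.BadZipFile, zlib.error, OSError) as exc:
        return f"unreadable container: {exc}"
    if bad_member is not None:
        return f"corrupt member: {bad_member}"
    return None


def scan_folder(folder):
    """Check every Office file under `folder`; return {path: problem} for the bad ones."""
    problems = {}
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file() and path.suffix.lower() in OFFICE_SUFFIXES:
            problem = check_office_file(path)
            if problem is not None:
                problems[str(path)] = problem
    return problems
```

Running `scan_folder` over an archive folder once or twice a year would at least tell you which files to restore from a backup while a good copy still exists.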

That said, I'd put my faith in backups, not file formats: backups kept in
several places and on different types of media (portable hard drives, DVD, CD,
USB stick).

If the data is really critical over long time periods, I'd duplicate the backups
every couple of years (and possibly move a copy onto The Next Greatest Thing In
Storage so that when CD drives go out of support, you've got your data on
whatever's current). This also protects against data slowly fading away because
of media deterioration.
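
One way to notice a backup slowly going bad before you need it is to record a checksum for every file when the backup is made and compare against that record each time you duplicate it. Here is a minimal sketch of the idea, assuming SHA-256 checksums and a JSON manifest kept outside the backup folder; the helper names (`build_manifest`, `compare_to_manifest`, and so on) are invented for this example.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to `folder`) to its SHA-256 checksum."""
    root = Path(folder)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def save_manifest(manifest, manifest_path):
    """Write the manifest as JSON; keep this file outside the backup folder."""
    with open(manifest_path, "w", encoding="utf-8") as handle:
        json.dump(manifest, handle, indent=2, sort_keys=True)


def load_manifest(manifest_path):
    with open(manifest_path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def compare_to_manifest(folder, manifest):
    """Return (missing, changed): files that vanished or whose contents differ."""
    current = build_manifest(folder)
    missing = sorted(name for name in manifest if name not in current)
    changed = sorted(
        name for name in manifest
        if name in current and current[name] != manifest[name]
    )
    return missing, changed
```

If `compare_to_manifest` reports anything for one copy, you still have the other copies to restore from, which is the whole point of keeping several.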

> Are graphic formats more resilient to time than
> others? Surely some formats have fewer problems than I’ve been experiencing.
> What is the best process for retaining important files? Should I convert
> all retained files to a TIFF format?

Advantages of converting to raster graphics formats like TIFF or simpler BMP
files:

- Easier to read and potentially to print and more likely to be supported by
other software over time.

Disadvantages:

- Will usually take up more space
- Cost/effort of conversion
- Slower to print (usually)
- You no longer have data; you have a PICTURE of data. You can no longer work
with the data itself (other than by looking at the picture and manually
re-entering the data).
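
To put a rough number on the "more space" point, here is a back-of-the-envelope calculation for one page stored as an uncompressed raster image. The page size (US Letter) and the 300 dpi resolution are assumptions picked for illustration, and the figures ignore file headers, row padding, and compression, all of which change the real size.

```python
def uncompressed_page_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Approximate size in bytes of the raw pixel data for one scanned page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# US Letter page at 300 dpi (assumed values, for illustration only).
letter_color = uncompressed_page_bytes(8.5, 11, 300, 24)    # 24-bit color
letter_bilevel = uncompressed_page_bytes(8.5, 11, 300, 1)   # 1-bit black and white

print(letter_color)    # 25245000
print(letter_bilevel)  # 1051875
```

So a single color page comes to roughly 25 MB of raw pixels, and even a pure black-and-white page is about 1 MB before compression. TIFF can store compressed data and BMP usually does not, so actual files may be a good deal smaller than this for TIFF.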
 
