Jeff Wiseman said:
I hadn't thought of the fast saves and versions issue. That's the
kind of information that I'm looking for.
I understand about data being hidden out of view. When you have
to take a file off of a classified PC, before the file is
declassified it must be converted to an ordinary ASCII text file
in order to avoid this very issue. Word files cannot be
declassified. The process is so onerous that it is just easier to
print the file, declassify the printout, and then pull it in on
an unclassified machine using OCR.
So I guess that my question now is:
- If Track Changes is off and all previous changes have been
accepted or rejected
- And if Fast Saves is disabled
- And if versioning is disabled
- And I then manually delete text in the document that I can see,
is there any other caching that occurs that might retain that
text I just deleted after I've done my Save and quit (and if
there is, will a Save-As avoid this)?
Yep. There is still all the user and other metadata, and then there is
the crud between the end of file and the end of the last logical disk
block of the file.
The crud could include sensitive stuff that was on the author's
computer at some time in the past with nothing to do with Word or the
doc being transported.
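
To put a rough number on that crud, here is a minimal sketch of the
arithmetic. The 4096-byte default block size is only an assumption; the
real allocation unit depends on the filesystem and how the disk was
formatted, and slack_bytes is just a name made up for this example.

```python
def slack_bytes(file_size: int, block_size: int = 4096) -> int:
    """Bytes between the logical end of file and the end of its last block.

    block_size is an assumed allocation unit, not something read from disk.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    # Distance from file_size up to the next multiple of block_size
    # (zero when the file ends exactly on a block boundary).
    return (-file_size) % block_size


if __name__ == "__main__":
    for size in (1, 10_000, 8_192):
        print(size, "byte file leaves", slack_bytes(size), "bytes of slack")
```

A 10,000 byte document on 4096-byte blocks leaves 2,288 bytes of room
for whatever happened to be lying around.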
Check the archives; there was a recent flurry of activity on this very
topic.
The only thing I can think of is if there was a style set up as a
boilerplate or something. Deleting the visible text wouldn't
remove it from the style in the template. But it is extremely
unlikely that the nature of such text would ever be a problem.
You can have daft things left behind like the path to the file you have
been working on. The folder and computer name could be quite
significant to some adversary.
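
If you want to see that sort of leftover for yourself, a crude scan of
the raw bytes is often enough. The sketch below is not a declassification
tool, only an illustration: it pulls out printable runs in both plain
ASCII and UTF-16LE and keeps the ones that look like a drive path, a UNC
path or a home directory. The function name, the minimum run length and
the "looks like a path" pattern are all my own assumptions.

```python
import re

# Heuristic only: drive-letter paths, UNC paths, and common home directories.
PATH_HINT = re.compile(r"[A-Za-z]:\\|\\\\[^\\]+\\|/home/|/Users/")


def find_path_like_strings(data: bytes, min_len: int = 6) -> list[str]:
    """Return printable runs in data that look like file paths."""
    n = str(min_len).encode()
    # Runs of printable ASCII bytes.
    ascii_pat = re.compile(rb"[\x20-\x7e]{" + n + rb",}")
    # Runs of printable characters stored as UTF-16LE (char, 0x00 pairs).
    utf16_pat = re.compile(rb"(?:[\x20-\x7e]\x00){" + n + rb",}")

    found = [m.decode("ascii") for m in ascii_pat.findall(data)]
    found += [m.decode("utf-16-le") for m in utf16_pat.findall(data)]
    return [s for s in found if PATH_HINT.search(s)]


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as fh:
        for hit in find_path_like_strings(fh.read()):
            print(hit)
```

Finding nothing with a scan like this proves nothing, of course. It can
only show that something is there, never that nothing is.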
I can see why Word files should *never* be declassified.
Print, declassify the paper, then OCR on an unclassified machine looks
like a pretty efficient mechanism to me.
As long as Microsoft keeps the file format a secret there is no way you
can prove there is nothing hidden in a Word document you don't want
sent. Even if there were a proper specification of the format, the
proof would be quite onerous.
The .docx format is a step in the right direction. It is XML in
intention. Of course a single undocumented binary blob inside would
kick the whole declassification proof into touch. (Word's OOXML
standard candidate has a ton of those, I believe.)
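
Since a .docx is a ZIP package of parts, you can at least list the parts
that are not XML and so cannot be read by eye. A rough sketch using only
the standard zipfile module follows; the function name is made up, and
treating "ends in .xml or .rels" as "inspectable" is my own simplifying
assumption, not anything from the format specification.

```python
import zipfile


def opaque_members(source) -> list[str]:
    """List package parts that are not XML text.

    source may be a path or a binary file-like object holding a .docx.
    """
    with zipfile.ZipFile(source) as package:
        return [
            name
            for name in package.namelist()
            if not name.endswith("/")  # skip directory entries
            and not name.lower().endswith((".xml", ".rels"))
        ]


if __name__ == "__main__":
    import sys

    for name in opaque_members(sys.argv[1]):
        print("cannot inspect as XML:", name)
```

An empty list does not make the file clean. XML parts can carry encoded
binary data too, so this only finds the obvious blobs.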
In a less stringent situation, printing to PDF might be acceptable. PDF
is well enough documented to permit a proof in principle.
That only leaves you with the possibility of steganographic images.
How can you be sure there is not a hidden message in the colour table
of a JPG or GIF, or perhaps a message in the least significant bits of
an audio file attachment?
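
For anyone who has not seen the least-significant-bit trick, here is a
toy sketch of it. It assumes the carrier is a run of raw 8-bit samples
(uncompressed audio or pixel values); real image and audio formats need
more care, and the function names are invented for the example.

```python
def embed_lsb(carrier: bytes, message: bytes) -> bytes:
    """Hide message in the lowest bit of successive carrier bytes."""
    # Message bits, most significant bit of each byte first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier too small for message")
    out = bytearray(carrier)
    for index, bit in enumerate(bits):
        out[index] = (out[index] & 0xFE) | bit  # overwrite only the low bit
    return bytes(out)


def extract_lsb(carrier: bytes, n_bytes: int) -> bytes:
    """Recover n_bytes previously hidden by embed_lsb."""
    out = bytearray()
    for i in range(n_bytes):
        value = 0
        for sample in carrier[i * 8:(i + 1) * 8]:
            value = (value << 1) | (sample & 1)
        out.append(value)
    return bytes(out)


if __name__ == "__main__":
    samples = bytes(range(256))
    stego = embed_lsb(samples, b"hi")
    print(extract_lsb(stego, 2))
```

No sample changes by more than one step, which is why the result looks
and sounds the same as the original.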
Digital security is a *very* slippery topic.