Don't leave your edit history lying about

Elliott Roper

Several times in the past, I have been able to use the edit history of
Word docs sent to me for commercial advantage.
Here is a story of a bigger fish being, if not caught, jagged with the
hook.
http://www.theinquirer.net/?article=10273
Hit the computerbytesman.com link for the details.

I smugly noted that Tony/Alistair resorted to my method in the end
(distribute as PDF, *never* Word).

Without doubt the prime minister's office was inept, but there remains
a large sense of "unfit for purpose" in their choice of software.

When will we see proper tools for exporting documents in Office?
 
Elliott Roper

John McGhie said:
Hi Elliott:

I think that blaming Microsoft for the errors the users of its products make
is not so productive. I would be utterly outraged if Microsoft decided that
it was going to take over and dictate to me how I should work and what I
should do. I suspect I may not be alone in that view.

A long and serious reply requires a serious rebuttal. I'll see what I
can do.
I was told by my woodworking instructor in 1963 "If you want to use power
tools, first learn to do the job properly. Power tools are a great asset in
the hands of a properly trained user: in the hands of a beginner, they
simply enable him to make a bigger mistake faster." I have never forgotten
that lesson. Word is a power tool: it *will* do damage if you use it
carelessly. But I certainly would not want Microsoft to fix that by
reducing its power -- by removing the professional features that many of us
rely on to get our work done :)

It is more subtle than that. My complaint was that there is no proper
*export* aka 'send the finished work off to the customer' facility in
Word. It is actually extremely difficult to deliver a Word document to
a third party and have them see it exactly as you want and with no
hidden extras.

It is extremely difficult to remove the last traces of your preparation
work from a Word file. You must go to the length of saving it to an
innocuously named directory on an innocuously named disk to remove the
last traces.

And don't get me started on fonts. Or paper sizes. Or printers.
Any person who becomes Prime Minister of a nation should understand that
they must choose staff who understand that they need to learn how to use the
provided tools properly, and to accept responsibility for speaking up when
they don't know.

Yeah, I know. I voted for the muppet. What does that make me?

We are talking here about some of the highest profile, highest paid
staff in the British Government, and they are not alone in making
mistakes like that. I was approached by a very expensive consultant for
some software design work. It was super-secret stuff. He and his staff
had spent ages deleting all clues to the ultimate client's name in
several hundred pages of specification. Thirty seconds with BBEDIT
after he left the building, and the directory names in his Word docs
told me all I needed to know. Once when negotiating a contract, the
other side's lawyers actually handed me a document where the
accepted/rejected tracked changes were still visible to a text editor.
In there it was clear they were prepared to accept a much harder
negotiating position from me than I was about to offer. Naturally I
obliged them.
Word processors have been around a year or so now: I think most people
working in offices have come to accept that they need to be as skilled with
their word processor as we expect an electrician to be with his tools of
trade. It is, after all, how they earn their living. I would be a little
disconcerted if my electricity started going places it was not meant to be!

Nope. The electric saw is supposed to have a guard. It should not be so
difficult to export clean material.
Sure, there are a lot of people out there who have not reached a
professional level of skill yet. This newsgroup and others like it exist
precisely to help them get there if they choose to. That's why I, in
particular, so value questions from people just starting out, and why I see
the people just starting out as being the very reason I am here. I am an
old fart at the end of my career: I have to pass the baton to someone: so
new people are in one sense my very reason for being.

But perhaps the Prime Minister's Office of any nation is not the place for
such people to be working. I suspect that Serena Williams understood that
she should first learn to play tennis before going to Wimbledon. Similarly,
people handling confidential information need to accept that the
responsibility for knowing what their word-processor DOES is THEIRS.
Similarly, in places such as the Prime Minister's Office, the Information
Technology Staff there must accept some responsibility for making this
information available to users, and for providing the simple tools required
to ensure that no breaches of security occur.

Clean export should be part of the product. There is no such facility
in Word that I know of.
This stuff is not rocket-science: any user of Word who has been hanging
around here for a few months will be able to ensure that they do not
embarrass anyone with what they do in Word: it's simple stuff.

It is not that simple to get rid of every bit. Reading that Inquirer
article taught me a couple more ways of hacking a doc file for hidden
gems.
Come to think of it, I know that you know all this stuff, so I am surprised
that you didn't give others the benefit of your knowledge. It occurs to me
that your post may perhaps have been motivated by a desire to do something
else.

In this case, it was a warning to the unwary. How many Word users can
or do drive BBEDIT or emacs to inspect their output? How many of those
could change their Word document in emacs to obfuscate directory names
without wrecking it?
PDF is not the answer. If you send me a PDF, the warmest reception you are
likely to get is an email asking you for a useable file format (preferably,
a properly-constructed Word document or an XML file).

You are bucking a trend. The entire printing and pre-press industry is
moving to PDF workflow. I have no other way of delivering finished work
where the page breaks stay where I want them, where the font variants
are mine and where the bullet points arrive in the same shape as I left
them. XML is an acceptable answer I'll concede, and I discern a trend
toward it in Microsoft products. But I'll want to see the full DTDs and
DOMs and have all of it fully auditable for finished work delivery
purposes.
Yes, PDF may look pretty. But it's not useable. I can't take the
information from it and use it anywhere else without re-typing it. I can't
use information that sits around in my head: reading stuff doesn't do it for
me: I work by managing information, and that means transporting it from one
place to another. A PDF is just a blob of binary to me, I don't clutter my
computer with it :)

That point is well made. You are describing a different task from
delivering finished product to a hostile world though.

That's the point of PDF. Part of the workflow is controlling the audit
trail, and ensuring that unintended information is not leaked in the
process. Word does not have such a facility.

If I were the intelligence service delivering a document to the prime
minister's office this week, they'd be getting a signed timestamped
encrypted PDF and I'd be expecting a countersigned receipt pretty
smartish. If I really had to send them a Word document, I would have
gone to extreme lengths - probably OCR and re-edit on a specially
isolated word processor. Because there are no simple tools, inside or
outside of Word to eliminate all traces of the way the work was done.
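
The "signed, timestamped, countersigned" idea can be roughed out in Python. This is only a sketch under stated assumptions: it uses an HMAC over a shared secret key rather than a true public-key signature, it does no encryption at all, and the names `make_manifest` and `verify_manifest` are invented for illustration. The sender ships the manifest alongside the file; the recipient recomputes the digest and the MAC before acknowledging receipt.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone


def make_manifest(document: bytes, key: bytes, when: datetime) -> dict:
    """Build a digest-plus-timestamp record and authenticate it with HMAC."""
    body = {
        "sha256": hashlib.sha256(document).hexdigest(),
        "timestamp": when.astimezone(timezone.utc).isoformat(),
    }
    # Serialise with sorted keys so both sides MAC exactly the same bytes.
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    body["mac"] = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return body


def verify_manifest(document: bytes, key: bytes, manifest: dict) -> bool:
    """Check that the manifest is authentic and matches the document bytes."""
    body = {k: manifest[k] for k in ("sha256", "timestamp")}
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    same_mac = hmac.compare_digest(expected, manifest["mac"])
    actual_digest = hashlib.sha256(document).hexdigest()
    same_doc = hmac.compare_digest(body["sha256"], actual_digest)
    return same_mac and same_doc
```

A change of even one byte in the delivered file, or a manifest made with a different key, makes `verify_manifest` return `False`. A real deployment would want proper digital signatures and a trusted time source, which this sketch does not attempt.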

What the world's press gets from the prime minister's office should be
at least as well protected. In the first case the parties were batting
on the same side. If I were Alistair's IT man, the press could come and
get hard copy until I had done a massive audit of PDF conversion; only
then would I permit even that. The press sure would not be getting Word
docs past me.
Not until there is a facility in Word to perform clean auditable
document export with preserved fonts and formatting and signing. I
really honestly believe that Word is currently not fit for this
purpose. I really would like to see it in a future version.
 
John McGhie [MVP - Word]

Hi Elliott:

This responds to Elliott Roper, who in microsoft.public.mac.office.word on
Thu, 03 Jul 2003 09:15:01 +0100 said:
It is more subtle than that. My complaint was that there is no proper
*export* aka 'send the finished work off to the customer' facility in
Word.

Well, there *is* :) Save the thing to HTML (Filtered) and have a look.
What you see is actually what you get :)
It is actually extremely difficult to deliver a Word document to
a third party and have them see it exactly as you want and with no
hidden extras.

Well, not in my experience. It took me around 15 seconds to clean up Tony
Blair's nonsense :)

What they were complaining about is the Edit Log. If you open the document
as "Recover Text From Any File" you see:

cic22JC:\DOCUME~1\phamill\LOCALS~1\Temp\AutoRecovery save of Iraq -
security.asd
cic22JC:\DOCUME~1\phamill\LOCALS~1\Temp\AutoRecovery save of Iraq -
security.asd
cic22JC:\DOCUME~1\phamill\LOCALS~1\Temp\AutoRecovery save of Iraq -
security.asd
cic22JC:\DOCUME~1\phamill\LOCALS~1\Temp\AutoRecovery save of Iraq -
security.asd
cic22JC:\DOCUME~1\phamill\LOCALS~1\Temp\AutoRecovery save of Iraq -
security.asd
cic22JC:\DOCUME~1\phamill\LOCALS~1\Temp\AutoRecovery save of Iraq -
security.asd
JPratt
JPratt
C:\TEMP\Iraq - security.doc
C:\TEMP\Iraq - security.doc
JPratt
JPratt
A:\Iraq - security.doc
ablackshaw!C:\ABlackshaw\Iraq - security.doc
ablackshaw#C:\ABlackshaw\A;Iraq - security.doc
ablackshaw
A:\Iraq - security.doc
ablackshaw!C:\ABlackshaw\Iraq - security.doc
ablackshaw#C:\ABlackshaw\A;Iraq - security.doc
ablackshaw
A:\Iraq - security.doc
A:\Iraq - security.doc
C:\TEMP\Iraq - security.doc
C:\TEMP\Iraq - security.doc
MKhan(C:\WINNT\Profiles\mkhan\Desktop\Iraq.doc
MKhan(C:\WINNT\Profiles\mkhan\Desktop\Iraq.doc
nR"zLl_ÿ


They made no attempt to cloak the document at all.
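
For anyone without BBEdit or emacs to hand, much the same inspection can be roughed out in a few lines of Python. This is a minimal sketch, not a parser for Word's file format: it simply pulls runs of printable characters (stored one byte each, or as UTF-16LE) out of the raw bytes and keeps whatever looks like a drive-letter path. The function names and the four-character minimum run length are assumptions made for illustration.

```python
import re

# Runs of 4+ printable ASCII characters stored one byte per character.
ASCII_RUN = re.compile(rb"[\x20-\x7e]{4,}")
# The same characters stored as UTF-16LE (each followed by a zero byte).
UTF16_RUN = re.compile(rb"(?:[\x20-\x7e]\x00){4,}")
# Anything that looks like a drive-letter path, e.g. C:\TEMP\file.doc
PATH_HINT = re.compile(r"[A-Za-z]:\\[^\x00-\x1f]+")


def printable_runs(data: bytes) -> list[str]:
    """Return every run of printable text found in the raw bytes."""
    runs = [m.group().decode("ascii") for m in ASCII_RUN.finditer(data)]
    runs += [m.group().decode("utf-16-le") for m in UTF16_RUN.finditer(data)]
    return runs


def leaked_paths(data: bytes) -> list[str]:
    """Return the distinct path-like strings, in the order they were found."""
    found: list[str] = []
    for run in printable_runs(data):
        for match in PATH_HINT.finditer(run):
            if match.group() not in found:
                found.append(match.group())
    return found
```

Calling `leaked_paths(open("report.doc", "rb").read())` on a hypothetical `report.doc` would list any such paths still sitting in the file. An empty list only means this crude scan found nothing; it is not proof that the file is clean.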

Performing a "Save As" in the latest (Word 2003) version of Word, with
"Remove personal information" set to ON, turns that into this:

§?§?©?©?ª?ª??­?¯?°?²?³?Æ?Ç?8?:?~?
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3

If you don't have a copy of Word 2003, then you simply Save As "Web Page
(Filtered)". You can then manually eyeball the code (but there will be
nothing left).
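
Eyeballing the saved HTML can be given a head start with Python's standard `html.parser` module. The sketch below makes no claim about what Word actually writes into a filtered web page; it just gathers every `<meta>` tag and every comment it finds so that a human can review them in one place. The class name `MetadataAudit` and the function `audit_html` are made up for this example.

```python
from html.parser import HTMLParser


class MetadataAudit(HTMLParser):
    """Collects <meta> name/content pairs and comments for manual review."""

    def __init__(self) -> None:
        super().__init__()
        self.meta: list[tuple[str, str]] = []
        self.comments: list[str] = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names before calling this.
        if tag == "meta":
            fields = dict(attrs)
            name = fields.get("name") or fields.get("http-equiv") or ""
            self.meta.append((name, fields.get("content") or ""))

    def handle_comment(self, data):
        self.comments.append(data.strip())


def audit_html(source: str) -> MetadataAudit:
    """Parse the HTML text and return the collected metadata and comments."""
    audit = MetadataAudit()
    audit.feed(source)
    audit.close()
    return audit
```

After `audit = audit_html(text)`, the lists `audit.meta` and `audit.comments` hold whatever survived the export. Identifying details could still sit in the body text or in attributes this sketch does not look at, so it supplements reading the file rather than replacing it.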
It is extremely difficult to remove the last traces of your preparation
work from a Word file. You must go to the length of saving it to an
innocuously named directory on an innocuously named disk to remove the
last traces.

Well, it's not that difficult, compared to, say, finding a parking space in
London :) Yes, you have to do that stuff, but that's what all these
highly-paid spooks are there for, to know all this stuff.
And don't get me started on fonts. Or paper sizes. Or printers.

Well, I don't have these difficulties, so maybe we could discuss this
off-line? Yes, Word will re-flow the document to fit the locally-available
fonts, paper-size and printer. That's what word-processors do. No, I
wouldn't choose Word as a delivery format for a commercial printing job
either. Ummm.... Actually, I might: if the presentation and layout
mattered that much to me, I probably *would* turn the thing over to a
commercial pre-press shop as a Word document and let a professional get it
"right" :) If I was determined to show the world how rough my layout and
design is, I would deliver the thing as a PostScript file.
We are talking here about some of the highest profile, highest paid
staff in the British Government, and they are not alone in making
mistakes like that.

Strewth! You should see their Australian counterparts :) No wonder Al
Qaeda has not been found: these chumps can't even drive a word-processor.

It seems to me pretty simple to explain to them: You must make a choice
between information control and ease of use. For non-critical information,
Word is fine, and will be a lot easier for the recipient to use. But if the
nature of your document is critical, then you must send in valid SGML, or,
if that is not available, as plain Unicode text.

Make this choice before creating the document. You either want to make
formatting and audit-control easy for the recipient, or you want to exactly
control the information you are sending. There is no middle ground. You
can't be "half-pregnant" or "half secure". Choose your software tools
appropriately.
I was approached by a very expensive consultant for
some software design work. It was super-secret stuff. He and his staff
had spent ages deleting all clues to the ultimate client's name in
several hundred pages of specification. Thirty seconds with BBEDIT
after he left the building, and the directory names in his Word docs
told me all I needed to know. Once when negotiating a contract, the
other side's lawyers actually handed me a document where the
accepted/rejected tracked changes were still visible to a text editor.
In there it was clear they were prepared to accept a much harder
negotiating position from me than I was about to offer. Naturally I
obliged them.

Yep :) People do screw up, and their competitors in business do take
advantage of that. This is why I lose patience with the "Just tell me what
to do, I don't have time to learn this stuff" crowd. OK, so you don't have
time to learn to use the tools of your trade? Are you sure that does not
mean your time-management skills are not good enough to actually do your
job? :)
Nope. The electric saw is supposed to have a guard. It should not be so
difficult to export clean material.

Word has a guard too. It's called the Help File. People have been known to
remove safety guards because "it's easier to use the saw that way". You
hear that quite often in the Emergency Department of your local hospital.
Seriously, producing an untraceable copy of a document is a 30-second job
if you know how, and if you don't, you should not be working with
confidential information.
Clean export should be part of the product. There is no such facility
in Word that I know of.

You haven't looked :)
It is not that simple to get rid of every bit.

Save As is too hard?
In this case, it was a warning to the unwary. How many Word users can
or do drive BBEDIT or emacs to inspect their output?

Anyone who handles confidential information needs to know how to force Word
to display a document as TEXT. So you can know exactly what's in your
document even if you do not have Word installed.
How many of those
could change their Word document in emacs to obfuscate directory names
without wrecking it?

Granted, but then, I didn't use emacs to do it. If you do not have Word
2003, you simply save the thing as HTML.
You are bucking a trend. The entire printing and pre-press industry is
moving to PDF workflow.

Well, that's not a trend where I work. Practically NONE of my work is ever
printed these days: it is not useable in printed form. The last manual I
worked on was part of a 35,000-odd page opus. Nobody's going to use a book
that needs a truck to carry it. Printing is a very specialised part of the
industry, and not one that is relevant to what I do these days.

Yes, I started on a newspaper with ink in my veins, but that was a long time
ago :)
XML is an acceptable answer I'll concede, and I discern a trend
toward it in Microsoft products. But I'll want to see the full DTDs and
DOMs and have all of it fully auditable for finished work delivery
purposes.

So get yourself a preview copy of Word 2003 and take a look. BTW: XML
documents are likely to be using SCHEMAS going forward. The tools to
produce your own DTDs are not likely to be shipped with Microsoft Office.
You can attach your own DTDs if you have got them, but most people will
prefer to use Schemas, because they enable more powerful control and
constraint.
That point is well made. You are describing a different task from
delivering finished product to a hostile world though.

Not wishing to be argumentative, but I am not describing a "different
task". I am still delivering finished product to a hostile world :)
If I were the intelligence service delivering a document to the prime
minister's office this week, they'd be getting a signed timestamped
encrypted PDF and I'd be expecting a countersigned receipt pretty
smartish. If I really had to send them a Word document, I would have
gone to extreme lengths - probably OCR and re-edit on a specially
isolated word processor. Because there are no simple tools, inside or
outside of Word to eliminate all traces of the way the work was done.

Just a Save As Web Page Filtered, then re-open it on a secure copy of Word
will do. "File 0001, Edited by 'SecureUser' from C:\TEMP" won't tell even
Al Qaeda much. Take care that the system clock is set to UTC :)
What the world's press gets from the prime minister's office should be
at least as well protected.

Of course it should. I trust someone is asking a few questions in
Parliament about this goof.
Not until there is a facility in Word to perform clean auditable
document export with preserved fonts and formatting and signing.

We have all this on the PC. There are some issues with getting Signing to
work on the Mac: Word doesn't provide it, but there are plenty of
third-party tools that do. Everything else you can do on the Mac. But if
you are going to handle information so confidential that it could send
nations to war, you really do need to know how, or hire someone who does. I
suspect that I might have done that for a document that was to be sent to
the American President :)

If anyone is interested in exactly how to "Anonymize" your document, contact
me via www.keen.com and I will send you all that you need to know (for a
fee!). Or hunt around the help file and Microsoft website and you will get
it all for free!

Cheers

Please post all comments to the newsgroup to maintain the thread.

John McGhie, Consultant Technical Writer
McGhie Information Engineering Pty Ltd
Sydney, Australia. GMT + 10 Hrs
+61 4 1209 1410
 
John McGhie [MVP - Word]

Correction: I just remembered -- Word 2002 has the "Remove personal
information" setting also. I suspect that Word 11 on the Mac will also have
it.

 