Hey Beth:
That's a pretty impressive summary of a 90-minute phone call! (Those of you
who "can't" ring Australia for three bucks an hour should speak to your
phone company: they're holding out on you!!)
Since I know Beth loves to get these things "exact" I will make a couple of
detail-level updates:
Documents created in Word Document Format 8 (WDF 8) have the file extension
.doc. Documents created in earlier WDFs do not.
It would be better to express this as:
Word has always used the ".doc" file extension. Unfortunately the "content
type" of files with a .doc extension has changed over the years, as Beth
points out. That was seriously poor practice, which presumably came from
the idea that Microsoft was going to dispense with the use of file
extensions altogether. Well, I think the computer world has moved back to
extensions: those of us who are used to them automatically turn display of
them on, because it's just so much easier to operate a computer when you can
see what's really going on.
So as Beth points out there have been three major "kinds" of Word binary
file format, but we need to realise that they have all used .doc as their
extension on the PC.
Let me tell you a story. Of course, it isn't true... Some years ago,
WordPerfect discovered that if you added a ".doc" extension to a file that
contained WordPerfect, it would crash Word. So they enabled WordPerfect to
use the .doc extension. Microsoft retaliated by adding the .doc extension
to a file that contained both RTF and Word format, which they gleefully
discovered would crash WordPerfect. When WordPro started writing .doc
extensions, everyone realised how silly this was getting, and began changing
their products to do the right thing. Now, of course that story couldn't be
true, now could it?
The bottom line is that the file extension, content creator type, and file
type mechanisms all rely on people to tell the truth as to what is in the
file. There have been times in the past when they haven't done that.
Currently, there is nothing to stop you changing the file extension on a
file if you want to. If you do, well you may create difficulties for
someone a few years down the track. And that "someone" may be yourself!!!
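If you would rather check than trust the extension, here is a minimal Python
sketch that looks at the first few bytes of a file instead of its name. The
function names and labels are made up for illustration. It recognises only
two signatures: the compound-file (OLE2) container header that Word 97 and
later binary documents are stored in, and the "{\rtf" that opens an RTF
stream. Note that the container signature is shared by other applications,
so a match tells you what kind of container it is, not that Word wrote it.

```python
# Hypothetical helpers: guess what a ".doc" file really contains by looking
# at its leading bytes rather than believing its name.

OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # compound-file container header
RTF_MAGIC = b"{\\rtf"                           # an RTF stream opens with {\rtf


def sniff_kind(head: bytes) -> str:
    """Return a rough label for content that begins with `head`."""
    if head.startswith(OLE2_MAGIC):
        return "ole2"      # container used by Word 97+ (and other programs)
    if head.startswith(RTF_MAGIC):
        return "rtf"
    return "unknown"       # older Word formats and everything else land here


def sniff_file(path: str) -> str:
    """Read just enough of the file at `path` to classify it."""
    with open(path, "rb") as f:
        return sniff_kind(f.read(8))
```

Calling sniff_file() on a few files named .doc will soon show whether the
extension has been telling the truth.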
Furthermore, when I went back to the G3 running OS 9/Word 2001, opened the
"Word 98" document in Word 2001 and did a Save As, I found that when I checked
"Append File Extension", no file extension got appended. That's because the
Word 5.1 format preceded the use of file extensions (at least in Word; not
sure I've got that exactly right).
Well, ummm... Something went wrong there: Word 5.1 format is "one" of the
formats that "should" have been given a file extension of ".doc". I expect
this was actually Word 2001 handling the condition of being asked to create
a same-named file rather badly.
On a modern computer, the entire name string including the extension is
assessed to determine whether the name is unique or not. On older Mac
software, I think some programs truncate the extension if it is present
before comparing file names. That's probably the easy way to do it, but not
necessarily the "correct" way to achieve the result.
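To make the difference concrete, here is a small Python sketch of the two
ways of deciding whether a name is already taken. It illustrates my guess
only; it is not how Word 2001 or any particular Mac program actually does
it. The function names are invented, and ignoring case is an assumption on
my part.

```python
import os


def clashes_full_name(new_name, existing):
    """Compare the whole name string, extension included."""
    return new_name.lower() in {n.lower() for n in existing}


def clashes_stem_only(new_name, existing):
    """Drop any extension before comparing (the guessed older behaviour)."""
    stem = os.path.splitext(new_name)[0].lower()
    return stem in {os.path.splitext(n)[0].lower() for n in existing}
```

With an existing file called "Report", saving as "Report.doc" is a new name
under the first rule but a same-named file under the second, which is the
sort of situation I suspect Word 2001 was tripping over.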
The best you can do is save it in Word 4.0-6.0/95
Compatible (RTF) which will allow a computer running the earlier versions of
Word to read it. John believes this setting saves both a Word Document and an
RTF within the same file so that the file can be read as either type of
document. He also believes that this setting will play havoc if the file is
ported to a machine running Windows Word and that you're better off just
saving in regular Rich Text Format (RTF).
What I actually believe is that this format is extremely corruption-prone
and is best avoided on either platform. It is also twice the size of the
.doc format, because it contains both the .doc and the .rtf file kinds in
the same file.
Older versions of Word will see only the RTF version, saving and editing to
that version. Newer versions will see both versions. They check to see
which one is newer and update the RTF on Save. The problems occur if the
user makes a deletion in an older version of Word. This may delete
something that is referred to elsewhere in the document. But the older
version of Word can't see that, because it can't read the latest version of
RTF. So it goes ahead and performs the deletion. When the latest version
of Word gets the file back, it finds that the RTF is "corrupt", ignores it
and goes for the .doc version. Now, the entire document is corrupt, and
weighed down by a vast lump of RTF that is now stranded. The new copy of
Word thinks it's corrupt and ignores it; the old version of Word thinks it's
OK and keeps using it. The two users see different documents in the same
file!!
And that's it! There are some complicated issues here but hopefully this
makes sense to others besides me.
I tell you, it took us a while to work out what was going on here. I think
you have done a brilliant summary!
Cheers
--
Please reply to the newsgroup to maintain the thread. Please do not email
me unless I ask you to.
John McGhie <[email protected]>
Microsoft MVP, Word and Word for Macintosh. Consultant Technical Writer
Sydney, Australia +61 4 1209 1410