Hi Jim:
That's a great post! Thank you for spending the time.
As you know, I have a different view of some of the points you raise...
You are correct that you are not in a position to explain to
them how to maintain their stuff. They certainly don't listen to
security experts who urge them to update, so I doubt they would listen
to you, either, no matter how right you would be.
Yeah, we have retired from the battlefield also, at work. We just say
"Well, .docx is the format we use. Sorry. If you have a problem with it,
call your Help Desk!" But then, the outfit I am contracting to is a very
large company.
As to the reason why Microsoft changed from .doc to .docx - it wasn't
because Microsoft thought that .docx is a better file format. They know
that in many ways it is not.
Well! That's certainly an alternative view.
I would have thought
"Smaller, more robust, more powerful, more flexible, and non-proprietary"
might have registered on somebody's meter.
The .docx file format is a response to
pressure from the Open Source community, which was whining that .doc is
a proprietary, non-human readable (binary) format.
Naaahhhh! You're just saying that to wind me up.
I was talking to
Microsoft about moving Word to SGML in 1989, before the "Open Source"
movement was even a gleam in its daddy's eye.
Microsoft actually did put out an SGML extension to Word, in around
1990-ish. That was in response to whining from the US/UK/Australian
Military/Government (and Boeing!).
What we have now, is simply the optimised version of that.
SGML was an intellectually elegant idea. But I said for years "If SGML is
the answer, it was a silly question!" Of course, IBM loved it, because a)
they invented it, and b) it required a computer the size of a house, and c)
it required an army of consultants to make it work...
WordPerfect did some good work in the area at that time, and everyone
thought they had Microsoft beaten. Microsoft was worried for a minute! But
as they discovered, "compliant", "well-formed" and "valid" together form a
lexicon that describes almost all the sins there are.
As the various interpretations of "Web Browser" have shown on a small scale,
it is easy to get an application that is "compliant" with any given
specification of SGML. Making one that displays the exact same result is a
bridge too far.
And that's just for a simple browser application to display HTML -- a
simplistic cut-down of SGML.
XML provides a perfect circuit-breaker to the whole problem. AND it is
small, fast, and rugged.
In response to the whining (which had gotten the attention of government
agencies that started to switch to OpenOffice, hurting Microsoft's
sales and reputation), Microsoft came up with an open format based on
existing XML standards.
I am not sure that I would agree with the phrasing there -- OOXML is not
"based on" existing XML standards, it IS standard XML. So is ODF. Both are
exactly compliant with XML.
OOXML and ODF are effectively different DTDs and FOSIs (or if you like,
XSLTs) implemented in the same standard syntax. ODF is just the "no-frills
version". It was purposely cut back to enable the relatively feeble
machines it is designed for to parse it easily. It's a bit like Esperanto.
Esperanto (remember that?) was designed as a world-wide natural language
that would be easy to use, easy to learn, and standard throughout the world.
Problem is, Esperanto is an abridged version, and like any "lite" version of
anything, it does not have the full power of, say, English. ODF is like a
subcompact car: small, cheap, and it will get you many places with
reasonable fuel consumption. But you probably wouldn't buy one, because
your family needs something a bit more capable in one or more areas.
Microsoft submitted their open XML format
proposals to international standards committees, which found the
Microsoft standard to be more open and superior to the OpenDoc format,
which the OpenOffice anti-Microsoft fans were betting on.
Now, that bit I agree with. I do not think Microsoft had any intention of
making their XML format into an international "standard" originally. They
didn't need to: XML was already a standard, there was no need to do any more
than that. Anyone who knew XML could read a Microsoft Word document, and
extract the DTD and the FOSI. They don't NEED a "standard".
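To illustrate the point that anyone who knows XML can read a Word document, here is a minimal Python sketch. It assumes the usual .docx layout, a ZIP archive whose main body lives in word/document.xml, and the standard WordprocessingML namespace. The function name docx_paragraphs is made up for this example, and the code pulls out paragraph text only; it ignores styles, tables, headers and everything else.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used inside word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(docx_file):
    """Return the plain text of each paragraph in a .docx file.

    docx_file may be a path or a binary file-like object.
    (Hypothetical helper for illustration only.)
    """
    # A .docx is an ordinary ZIP archive; the body is one XML part.
    with zipfile.ZipFile(docx_file) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    # <w:p> is a paragraph; <w:t> holds the text of each run inside it.
    for para in root.iter(f"{{{W_NS}}}p"):
        runs = [t.text or "" for t in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return paragraphs
```

Calling docx_paragraphs("report.docx") on a real file (the file name is just an example) should return a list with one string per paragraph. Nothing here depends on Word being installed, only on a ZIP reader and an XML parser.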
However, the "Anyone But Microsoft" crowd were busily trying to get ODF
ratified as "the" standard. Microsoft didn't want that -- seriously didn't
want that. Because ODF is not powerful enough to describe the content of an
Office document. And the ABM camp knew it. Had they succeeded in making
ODF the new COBOL, they would have been able to force the American courts to
put Microsoft out of business, something that their legitimate efforts were
conspicuously failing to do.
Thwarted in this ambition, these people next turned their attention to Wall
Street. And we can all see the result. The ABM camp didn't care what they
wrecked, provided they made money when they pumped it on the way up, then
made even more money short-selling on the way down.
I wonder how the ex-employees of Bear Stearns, Washington Mutual, AIG, etc.
feel now? Well: that's what these people were trying to do to Microsoft.
Perhaps they were forgetting that there are millions and millions of people
around the world who also need to feed their families, who owe at least part
of their business income to the products of Microsoft and its partners.
IMHO the switch to .docx is based purely on politics and preserving
market share. It has nothing at all to do with introducing a better
technology. It cost Microsoft millions of dollars so their products can
do something in a different way that they were already doing perfectly:
saving files.
Well, I quite strongly disagree with that. The binary formats do not
support "data warehousing". And that's the holy grail. Making every piece
of knowledge in any of a corporation's documents instantly useable in any
other document. The nascent "Content Controls" that are beginning to appear
in PC Office, and can be found in Mac Office 2008 if you search real hard,
are part of the beginning of making this work. XML is designed for this
functionality; the .doc binary can't do it reliably.
.doc format has only one drawback - it's proprietary. The .doc format
has been reverse engineered and hundreds of programs can read and write
to the format. It is binary and straightforward, meaning it has compact
files that don't take up a lot of drive space.
It has four other major drawbacks that I can think of, just off the top of
my head...
1) It is an unholy rat's nest of linked lists and pointers. Which means a
single one-bit error in a disc read will potentially break the entire file
(i.e. "document corruption" -- that's what it is!)
2) It's not extensible. At all. You cannot add new "kinds of things" to
the .doc format. You have to wait for Microsoft to do so. Any user can add
anything they like to XML.
3) It is not fault-tolerant. You can either read and understand all of the
binary format, or you cannot open the file. XML has built-in resiliency --
an application will read and process what it can understand, and ignore the
rest. That gives an infinite number of levels of graceful fall-back.
4) It's huge: up to four times the disk space of XML.
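The "process what you understand, ignore the rest" behaviour in point 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tag names (doc, title, para, hologram) and the helper read_known are invented for the example and are not part of any Office format. Note also that the fall-back works only while the file is still well-formed XML; a standard parser will reject a file whose markup itself is damaged.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary that this (older) reader understands.
KNOWN_TAGS = {"title", "para"}


def read_known(xml_text):
    """Collect (tag, text) for elements we understand; skip everything else."""
    root = ET.fromstring(xml_text)
    found = []
    skipped = []
    for elem in root:
        if elem.tag in KNOWN_TAGS:
            found.append((elem.tag, (elem.text or "").strip()))
        else:
            # Unknown "kind of thing": note it and carry on reading.
            skipped.append(elem.tag)
    return found, skipped


# A document written by a newer application that added a <hologram> element.
sample = """<doc>
  <title>Report</title>
  <para>First paragraph.</para>
  <hologram depth="3">Something a newer application added</hologram>
  <para>Second paragraph.</para>
</doc>"""

found, skipped = read_known(sample)
```

Here the reader recovers the title and both paragraphs and simply steps over the element it does not recognise, where a reader of a binary format would typically have to understand every record to get past it.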
IMHO from a technical perspective the new file format is a huge,
wasteful expense for all concerned. But I think Microsoft had little
choice. The pressure to do it was external, and therefore the change
was necessary.
My opinion is not quite so harsh.
I agree there was huge pressure on
Microsoft to change. Some of it came from users like me who were thoroughly
sick of losing data to Word document corruption. Another lot came from the
military, government, and industry customers, who wanted a smaller, more
rugged file that they could data-mine.
I happen to think that the "real cost" of switching to XML for nearly all
users is almost negligible. It's out there. It costs nothing. It works --
download the converter and move on!
The most immediate benefit, and the one most users will notice first, is
that the amount of disk space occupied by Office files will begin to shrink
dramatically. A little later the more professional companies will begin to
notice that the number of calls logged to the help desk about broken
documents will fall off sharply.
There are huge competitive advantages available to businesses who truly
understand and utilise XML. But those benefits require thought and effort
to realise. Large corporations will have a team on this out in the back
room right now. Customers who believe the FUD being spread by the opposing
forces will miss out. There's a name for that: "Darwinian Evolution".
Cheers
--
Don't wait for your answer, click here:
http://www.word.mvps.org/
Please reply in the group. Please do NOT email me unless I ask you to.
John McGhie, Microsoft MVP, Word and Word:Mac
Nhulunbuy, NT, Australia.