The massive size of Word 2004 files

Rob Schneider

They aren't, normally, unless the file has become corrupted somehow, it
isn't actually simple plain text, or there is indeed a massive number of
characters.

You need to say a bit more about your situation/problem to get help.

--rms

www.rmschneider.com
 
John Wolf

I have a file of only 21 words (81 characters), and it's 32K in size. That's
way too large for a Word 2004 document with such a small amount of text.
 
John McGhie

Hi John:

That's about right for a Word .doc binary. A completely blank .doc is about
20 KB. One that has had some formatting in it will grow a little.

There's about 20 k of metadata in a Word document before you get to the
first character of text. All of the formatting descriptions are in there,
in case they get used. All of the bullets and numbering list definitions.
All of the spelling overrides. The saved/opened/editing times. The
location of each edit.

There's a lot of stuff in there that tells Word "how this document is laid
out and where the components are". A Word document is not a stream of text
with formatting embedded: it's a large table of formatting properties, with
pointers that indicate where in the text they apply.
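
To make the two layouts concrete, here is a minimal sketch in Python. It is only an illustration of the idea described above, not Word's or WordPerfect's actual file layout: the `Run` class, the `styles` table, and the bracketed inline codes are all hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Run:
    start: int  # offset of the first character the formatting applies to
    end: int    # offset one past the last character
    style: str  # key into the style table


# The text is stored once, with no markup inside it.
text = "Hello bold world"

# Hypothetical style table: every entry is stored, used or not.
styles = {
    "normal": {"bold": False, "italic": False},
    "strong": {"bold": True, "italic": False},
    "emphasis": {"bold": False, "italic": True},  # defined but never used
}

# Pointers that say where in the text each style applies.
runs = [Run(0, 6, "normal"), Run(6, 10, "strong"), Run(10, 16, "normal")]


def style_at(offset: int) -> dict:
    """Look up the formatting for one character via the run table."""
    for run in runs:
        if run.start <= offset < run.end:
            return styles[run.style]
    raise IndexError(offset)


def to_inline(text: str, runs: list) -> str:
    """Render the same content the 'other' way: codes embedded in the text."""
    return "".join(
        f"[{r.style}]{text[r.start:r.end]}[/{r.style}]" for r in runs
    )


print(style_at(7))
print(to_inline(text, runs))
```

In the table form, the unused "emphasis" entry still takes up space, which is the fixed cost discussed in this post; in the inline form, nothing is stored until it is used, but finding the formatting at a given position means scanning the text.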

The downside of this method is that for very short documents, you store all
this information whether any of it is being used or not (well, some of it is
always used).

But the size of this "metadata" is fixed: there's 20 k of it in a half-page
document, and 20 k of it in a 1,000-page document.
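
A rough back-of-the-envelope sketch of what a fixed overhead means for small and large files. The 20 k figure comes from this post; the one-byte-per-character and 3,000-characters-per-page figures are assumptions made only for the arithmetic, not properties of the .doc format.

```python
OVERHEAD_BYTES = 20 * 1024  # "about 20 k" of metadata, per the post above

# Assumptions for illustration only: 1 byte per character, 3,000 characters per page.
BYTES_PER_CHAR = 1
CHARS_PER_PAGE = 3_000


def overhead_share(text_bytes: int, overhead: int = OVERHEAD_BYTES) -> float:
    """Fraction of the total file size taken up by the fixed overhead."""
    return overhead / (overhead + text_bytes)


short_doc = 81 * BYTES_PER_CHAR                     # the 81-character file
long_doc = 1_000 * CHARS_PER_PAGE * BYTES_PER_CHAR  # a 1,000-page file

print(f"short document: {overhead_share(short_doc):.1%} overhead")
print(f"long document:  {overhead_share(long_doc):.1%} overhead")
```

Under these assumptions the overhead is nearly the whole file for the short document and under one percent of the long one, which is why the fixed cost only looks alarming on very small files.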

It also enables Microsoft to dramatically speed up Word for larger files.
WordPerfect internally constructs its documents the "other" way: all the
formatting is embedded in the text. If you compare Word and WordPerfect on
large files, you discover that WordPerfect really starts to grind as the
file size exceeds 100 pages.

And Rob is quite right: we can give you much better answers if you give us
more information :)

Hope this helps

This email is my business email -- Please do not email me about forum
matters unless you intend to pay!

--

John McGhie, Microsoft MVP (Word, Mac Word), Consultant Technical Writer,
McGhie Information Engineering Pty Ltd
Sydney, Australia. | Ph: +61 (0)4 1209 1410
+61 4 1209 1410, mailto:[email protected]
 
John Wolf

I believe they fixed things in the docx format, as those files are smaller.
Thanks!
 
John McGhie

Hi John:

They added a compression function. The metadata is still there: it's a bit
more compact, and now the entire file is compressed.
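
A .docx file is a ZIP package of XML parts, so the effect of the compression can be sketched with Python's standard `zipfile` module. The part name and the repetitive XML content below are hypothetical stand-ins made up for the example, not the contents of a real Word file.

```python
import io
import zipfile


def zip_sizes(parts: dict) -> tuple:
    """Return (raw_bytes, archive_bytes) for parts stored in a deflated ZIP."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in parts.items():
            zf.writestr(name, data)
    raw = sum(len(data) for data in parts.values())
    return raw, len(buf.getvalue())


# Hypothetical, highly repetitive stand-in for a block of style definitions.
parts = {
    "word/styles.xml": b"<style name='Normal' bold='0' italic='0'/>\n" * 500,
}

raw, packed = zip_sizes(parts)
print(f"uncompressed: {raw} bytes, zipped: {packed} bytes")
```

To look inside a real file, `zipfile.ZipFile("example.docx").namelist()` lists the parts it contains (the file name here is a placeholder). Repetitive definitions compress very well, which fits the point above: the metadata is still there, it just takes less room on disk.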

Cheers

 
John Wolf

I don't like the docx format; it messes things up for the entire world, as
MILLIONS have been using the old formats. Remember, MS controls the world.
Too bad Access never made it to the Mac, but they have FMP instead.
 
John McGhie

Hi John:

Relax! You don't have to like it, or even use it :)

The old .doc format simply doesn't have the power or strength to do the job
in the modern world, so I am very pleased to be rid of it.

But you are more than welcome to use it if you like.

My only suggestion is that if you don't want to move up to .docx, I would
consider putting Office 2004 back. In my experience, each version of Office
is more stable running in its own native format. Running 2008 in the old
.doc format can lead to trouble: the .doc format is not strong enough to do
some of the things Office 2008 can do.

Hope this helps

 
Phillip Jones, C.E.T.

DocX may be short-lived. There is a lawsuit against MS for using a
particular version of XML; it seems that another company owns the patent
on that particular version or method, and MS has won a 30-day stay. It
sounds like MS has lost so far, but they won a stay so they would have
time to explain things.

Of course MS is so large they don't have to worry, and no one will ever
win a lawsuit against them.
 
Michel Bintener

Hi Phillip,

I believe this lawsuit is not about the Office Open XML file format, but
about the XML format that was introduced in Word 2003 and can also be found
in Word 2007. Incidentally, Apple's TextEdit can also save in that specific
file format.
 
Phillip Jones, C.E.T.

I guess the company with the lawsuit figured MS had more bucks in the
coffers than Apple, or just went after them first.
 
John Wolf

You know Mickey, always up to his same old tricks! Regardless, he designs very
well-done software in MS Office and Windows XP (Vista is garbage).


John
 