Matt's a Microsoft Staffer: we are not his colleagues, we're volunteers.
Cross-references and captions are the same thing, internally. A caption is
essentially an automatically-created cross-reference.
However: A caption can be more complex than a cross-reference, because a
caption can be a floating object anchored to a floating object in the
graphics layer. A Word document can be thought of as a "sandwich". The
text is the "meat", and the two graphics layers above and below the text are
the "bread". The reality is a lot more complex, but that's the principle.
Within these three layers, objects are located by "pointers" which are
simply labels that identify which object we are talking about. The actual
objects are not stored in either the text or the graphics layers, but in an
"Object Store" at the end of the document. The pointer indirectly indicates
which object is to be displayed at the current point in the text.
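To make the pointer idea concrete, here is a toy model in Python. It is only a sketch of the principle described above: the class and method names are invented for illustration and do not reflect Word's real file structures.

```python
# Toy model only: Document, add_object, and render are hypothetical names,
# not anything from Word itself.
from dataclasses import dataclass, field


@dataclass
class Document:
    # The text layer holds plain text runs and pointer labels, in order.
    text_layer: list = field(default_factory=list)
    # The object store holds the actual objects, keyed by label.
    object_store: dict = field(default_factory=dict)

    def add_text(self, text):
        self.text_layer.append(("text", text))

    def add_object(self, label, obj):
        # The object goes in the store; only its label goes in the text.
        self.object_store[label] = obj
        self.text_layer.append(("ptr", label))

    def render(self):
        parts = []
        for kind, value in self.text_layer:
            if kind == "text":
                parts.append(value)
            else:
                # Follow the pointer; a dangling pointer is one kind of damage.
                missing = "<missing object: %s>" % value
                parts.append(self.object_store.get(value, missing))
        return "".join(parts)
```

If the label survives in the text but the object disappears from the store, render() shows a "missing object" placeholder, which is roughly what a dangling pointer looks like in this model.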
If you understand HTML, a "bookmark" is essentially an "Anchor" tag. That
tag has an opening and a closing terminator (normally, bracket characters).
If one end of the bookmark is sitting in a text string that is affected by a
tracked change, and that tracked change is affected by a tracked change, and
the source of the cross-reference to the bookmark is a floating object, and
the target is an independently-floating object, you've built a real rat's
nest of code.
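For the curious: in the .docx format a bookmark is stored as a pair of markers, w:bookmarkStart and w:bookmarkEnd, tied together by an id. The sketch below looks for bookmarks that have lost one end. It assumes you have already extracted the text of word/document.xml from the .docx package; the function name and the sample XML are made up for illustration, and this is not a repair tool.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by .docx files.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def unpaired_bookmarks(document_xml):
    """Return (names of starts with no end, ids of ends with no start)."""
    root = ET.fromstring(document_xml)
    starts = {}   # bookmark id -> bookmark name
    ends = set()  # bookmark ids that have an end marker
    for el in root.iter():
        if el.tag == f"{{{W}}}bookmarkStart":
            starts[el.get(f"{{{W}}}id")] = el.get(f"{{{W}}}name") or ""
        elif el.tag == f"{{{W}}}bookmarkEnd":
            ends.add(el.get(f"{{{W}}}id"))
    missing_end = sorted(name for id_, name in starts.items() if id_ not in ends)
    missing_start = sorted(id_ for id_ in ends if id_ not in starts)
    return missing_end, missing_start


# Hypothetical fragment: "Fig1" is properly paired, "Fig2" has lost its end.
SAMPLE = f"""<w:document xmlns:w="{W}"><w:body><w:p>
<w:bookmarkStart w:id="0" w:name="Fig1"/><w:r><w:t>Figure 1</w:t></w:r>
<w:bookmarkEnd w:id="0"/>
<w:bookmarkStart w:id="1" w:name="Fig2"/><w:r><w:t>Figure 2</w:t></w:r>
</w:p></w:body></w:document>"""

print(unpaired_bookmarks(SAMPLE))  # (['Fig2'], [])
```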
Even Word 2003 will fall over if you do too much of this.
The answer to any question you may ask that contains the word "Why" is
always "Because that's the way Microsoft made it". We usually don't know
the reason, but it always contains some elements of "cost, time, and
customer requirements". Usually when it doesn't work right the reason is
that it would cost too much and take too long to fix it.
In the case of the stability around bookmarks and cross-references, the
"fix" was the new .docx file format. The old file format simply cannot
handle the complexity involved, and is susceptible to minor errors that
produce crashes and freezes and document corruption.
However, the new file format meant a lot of new code throughout the
application, which inevitably leads to lots of bugs; which in turn cause
crashes and freezes and document corruptions. So: the fix is here, but it
isn't working right in this version either. We can expect an improvement
in the next version. Some of those improvements might be back-ported to the
2008 version, but that's less likely, because the fixes may require a
different internal structure in the software, so the fixed bits won't plug
into the old design.
The DOs are "Edit carefully and neatly and NEVER work with your non-printing
characters hidden. ALWAYS run with your paragraph marks and spaces and
bookmarks displayed so that you can see what you are doing. Do save
whenever you stop to think. Do enable 'Always make backup'. Do adopt an
aggressive manual backup strategy (Make a new copy of the document every
morning at least, preferably twice a day). Keep ALL the previous copies
until you have finished the project."
The DON'Ts are "Don't use .doc format. Don't use Tracked Changes (use
Compare Documents instead). Don't hide tracked changes if you do use them.
Don't stack bookmarks within bookmarks upon changes within changes. Don't
edit at or close to the boundary of a bookmark or tracked change."
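The manual backup strategy in the DOs is easy to automate. Here is a minimal sketch in Python, assuming you are happy with timestamped copies in a "backups" folder next to the document; the function name, folder name, and file-name pattern are my own choices, not anything Word provides.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_copy(doc_path, backup_dir=None, now=None):
    """Copy the document to a timestamped file and return the new path."""
    src = Path(doc_path)
    # Default location: a "backups" folder beside the original (an assumption).
    dest_dir = Path(backup_dir) if backup_dir else src.parent / "backups"
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    dest = dest_dir / f"{src.stem}-{stamp}{src.suffix}"
    # copy2 also preserves the modification time of the original file.
    shutil.copy2(src, dest)
    return dest
```

Close the document in Word before running it, and call it every morning (or twice a day). Two calls within the same second would write to the same file name, which should not matter for a manual routine.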
Word does do housekeeping in the background. This is a lot more effective
in a .docx than in .doc format (because the errors are easier to detect, and
because the document code is less likely to suffer errors). Word cleans up
the document every time you close it. Saving as HTML achieves much the
same thing, but it re-expresses the document in a simpler form of coding.
In the process, it discards some of the really complex structures in a Word
document that cannot be described in HTML, and that often removes the
corruption.
Of course you lose some formatting, and your pictures convert to bitmaps.
But that's often the lesser of two evils.
The .docx format is a more powerful form of the same kind of coding as HTML.
HTML is an application of the Standard Generalized Markup Language (SGML).
The Extensible Markup Language (XML) that is the basis of the new native
format for Word 2008 is a streamlined descendant of the same SGML.
So "No", HTML is not a good way to keep Word documents, because you lose
content.
If you change a Word document to HTML you lose the things that HTML can't
describe. You may also lose the corruptions. A "corruption" is often "a
segment of code that is too complex to read". It's not "wrong", it's just
too difficult (like Algebra as far as I am concerned...) The "rules" for an
export filter translating to a different encoding are "Convert what you can
understand, and discard the rest". So the HTML export filter discards the
"things it can't understand" because it knows it can't express them in the
HTML language. Some of those things are corruptions. And some are complex
modern graphics. The export filter can't tell the difference: it just
ignores what it can't understand.
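Here is the "convert what you can understand, and discard the rest" rule as a toy filter in Python. The element kinds and the dictionary layout are invented for this sketch; the real HTML export filter is far more involved and I have no knowledge of its code.

```python
import html


# Hypothetical element kinds this toy filter "understands".
def render_paragraph(e):
    return "<p>%s</p>" % html.escape(e["text"])


def render_heading(e):
    return "<h1>%s</h1>" % html.escape(e["text"])


RENDERERS = {"paragraph": render_paragraph, "heading": render_heading}


def export_to_html(elements):
    """Convert known elements; collect everything else as discarded."""
    kept, dropped = [], []
    for e in elements:
        render = RENDERERS.get(e.get("kind"))
        if render is None:
            # Could be a corruption, could be a complex graphic:
            # the filter cannot tell the difference, so it drops both.
            dropped.append(e)
        else:
            kept.append(render(e))
    return "\n".join(kept), dropped
```

Feed it a paragraph, a floating text box, and a lump of damaged data, and only the paragraph comes out the other side; the other two land in the discard pile together.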
Hope this helps
Dear Bob,
thank you for your help.
Please find my comments below.
I discussed this topic with a colleague of yours, Matt Elggren of
Microsoft, some months ago (Nov-Dec 2008), but then I suddenly lost contact
with him. I don't know what he found.
But I still have a copy of a Word document that produces this problem and that
I sent to him.
I would be glad to send it to you if you would kindly find a few minutes to
look at it.
The most common causes of such behavior while the program is in use are: 1)
Need of repairing disk permissions and
OK, I tried this and it does not solve the problem. Please note that I am
usually logged in as a normal user (with no administrator privileges).
But the problem also happens when I am logged in as an administrator.
This morning I installed the most recent update, 12.1.7. The problem remains.
Although you
refer to the latter you make no mention of the former
As for the cross-references themselves I've not seen any indication of an
inherent problem in the program. I'm a little unclear on what you mean by
"when the object they refer to is moved throughout the document". If you're
referring to normal repagination there should not be any such problem. If,
however, you're physically moving such content [cut/paste or drag 'n' drop]
then YES, that's one of the best ways to induce cause #3: Corruption in the
document.
Sometimes (but rarely), when I need to create a drawing that's similar to
another one already present in the document, I copy and paste the existing
drawing and then apply the needed modifications to the copy.
But I also experience stability problems without doing so.
By the way, when doing the same manipulations with WinWord 2003, that software
NEVER crashes.
Mac Word 2008 crashes or freezes too often (many times a day)!
I already uninstalled it and reinstalled it from scratch, cleared all the
preference files, etc. But stability of the program is still far from being
acceptable.
You might try the techniques suggested here in an effort to resolve
corruption issues:
<http://word.mvps.org/mac/DocumentCorruption.html>
Thanks: here I found the suggestion to save the document as HTML in order to
"uncorrupt" it. That worked on my sample document.
Would this mean that always saving Word documents as HTML is a good way to
keep corruptions away?
Why is Word not doing some housekeeping in the background to check and resolve
corruptions on an open document, as soon as they happen?
What exactly are the "DOs" and "DON'Ts" when working with captions and
cross-references?
Thank you and best regards.
--
HTH |:>)
Bob Jones
Office:Mac MVP
--
Don't wait for your answer, click here:
http://www.word.mvps.org/
Please reply in the group. Please do NOT email me unless I ask you to.
John McGhie, Microsoft MVP, Word and Word:Mac
Sydney, Australia.