Why the odd characters in a Word document

3253535

A friend and I are writing a book together, using Word, and are having
trouble getting the text to not have strange characters in some of the
words, like ones with apostrophes, etc.

Because I wanted to avoid this, I went through the following process:

I have a fresh copy of Word 2004 on my G5 (OS10.3.5), and she has a
fresh copy of Word X on her iBook (OS 10.3.5).

To eliminate any problems, before we started sharing the file, I
copied the text of the existing 5,000-word doc into TextEdit. I then
had us both trash the Normal template, along with the Word .plist
preferences. I then opened Word and pasted the 5000 words from the
TextEdit doc. We decided that we want to keep a very Zen document, so
we only needed/wanted the following Styles: Normal, Heading 1, 2, 3,
and 4. So, I set up each of these Styles, keeping to standard fonts:
Times New Roman for the Normal Style, and Helvetica and Geneva for the
Heading Styles. Very simple, nothing fancy.

I then zipped both the Normal template, and our Book doc to my friend.
She installed the Normal template, and then opened the Book doc. In
virtually every sentence of the Normal Style, she sees strange
characters, generally where there are apostrophes, colons, stuff like
that.

I had her check to make sure her Mac has Times New Roman. She has
never added any additional fonts, other than the ones that Word
installed.

My second question: today, while troubleshooting this problem, I saw
that my clean Zen document now has over 15 Styles in it. I had only
put the aforementioned 4 in it. Where did these come from? I thought
that by building a new Normal template, there would not be any Styles
in there that I didn't actually place in it. How do I get rid of them,
and once they're out, will they stay out?

Many thanks for any help on these issues.
 
Dayo Mitchell

Times New Roman in Word 2004 is Unicode, and has many more characters than
Times New Roman in Word X, which does not support Unicode. So you think you
are using the same font, but you aren't.

But you should not be having a problem with apostrophes and colons, which
don't require Unicode. Certain accents might produce problems. What are the
strange characters your friend is seeing? Empty boxes, or different
characters?

All that trashing really shouldn't be necessary to exchange docs. You might
want to send her a different file, that you haven't done anything special
to, and see if she gets the same problems. I'm also not entirely sure that
Word X would like a Normal created in Word 2004, so that might have caused
difficulties. Tell her to move it to the desktop and let Word generate a
new one, before you run the second test. Also, since the location for
Normal is different in Word X and Word 2004, are you sure your friend
installed the Normal you sent her in the right place?

Tell your friend to make sure she is updated:
http://word.mvps.org/MacWordNew/Update.htm
(hit refresh a few times in Safari, or use a different browser)

Re styles--I think Normal templates are generated with a ton of styles, but
I also think you can just ignore extra styles that show up in the drop-down
menu. However, while it is a good idea to make sure shared documents are
using the same template, for this purpose you probably should have created a
dedicated document template and sent that to your friend, not used Normal.
It would be much easier to control, for one thing, and to keep separate from
your other work.
http://word.mvps.org/faqs/customization/CreateATemplatePart1.htm
(hit refresh a few times in Safari, or use a different browser)

Side note: did you two not have customizations in the Normal template that
you were unwilling to trash? I fear this action suggests you may not be
taking full advantage of Word's power, and if you are writing a book, it's
worth putting the time into learning to customize Word.

Side note 2: *three* fonts sounds like a lot for a Zen doc.... ? :)
 
Alan Wood

3253535 said:
A friend and I are writing a book together, using Word, and are having
trouble getting the text to not have strange characters in some of the
words, like ones with apostrophes, etc.

I have a fresh copy of Word 2004 on my G5 (OS10.3.5), and she has a
fresh copy of Word X on her iBook (OS 10.3.5).

Word 2004 uses Unicode, but Word v.X has very limited Unicode support.

Your files probably use character 8217 for apostrophe, and this will be
present in the Unicode font that Word 2004 is using.

Your friend's Word v.X probably uses 213 for apostrophe, and is probably
using a font that does not have a character at 8217.

If you try to use files created on Word v.X, you will probably find that
where you expect an apostrophe, you get a capital O with tilde, which is 213
in Unicode.

You are almost certain to have problems unless you both use the same version
of Word.
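
For anyone who wants to see the two numbering schemes side by side, here is a minimal Python sketch. It is not anything Word itself runs; it assumes Word v.X text is effectively MacRoman, which is where the 213 figure above comes from, and the helper names are made up for illustration.

```python
import unicodedata


def macroman_char(byte_value: int) -> str:
    """Character a MacRoman-based application shows for one byte value."""
    return bytes([byte_value]).decode("mac_roman")


def unicode_char(code_point: int) -> str:
    """Character a Unicode-based application shows for one code point."""
    return chr(code_point)


# The same number, 213, read under each scheme, plus Unicode 8217.
samples = [
    ("MacRoman byte 213", macroman_char(213)),
    ("Unicode code point 213", unicode_char(213)),
    ("Unicode code point 8217", unicode_char(8217)),
]

for label, ch in samples:
    # Print code point and official name only, so the output stays ASCII.
    print(f"{label}: U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Under these assumptions, MacRoman 213 and Unicode 8217 name the same curly apostrophe, while Unicode 213 is the capital O with tilde.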
 
3253535

Many thanks, Dayo, for the detailed response. Very helpful. We have
gone ahead and replaced the Times New Roman font with plain old Times.
I'll report back to see if this fixes the problem.

The odd characters were not the boxes, but things like Ø and other
letters with accents.

I did make sure that she placed her Normal template where Word X likes
it to be.

Dayo said:
Side note: did you two not have customizations in the Normal template that
you were unwilling to trash? I fear this action suggests you may not be
taking full advantage of Word's power, and if you are writing a book, it's
worth putting the time into learning to customize Word.

I lost track of the number of times I would format the Normal template
to perfection, along with other personalized Word templates, going
back before Word 6 (ah yes, 90 second start up times... the good old
days), spending hours tweaking that sacred Normal with 12 different
styles, and colors and indents. I would dutifully back up this perfect
Normal template, and still, weirdness would happen. I'd trash the
Normal template, try the backup, try removing prefs, extensions (back
in 9), and then reinstall the entire Word application. At some point
along the way, a fellow author said, "Stop. Stop the insanity." He
showed me his method of writing his books: he used only Outline view;
he used only one template (Normal); he used only five Styles: Normal
(for all the text), and 4 Heading styles. He'd tweak these five to
make it visually helpful (it only took a few minutes), but that was
it. Once the book was done, it all went into InDesign (formerly Quark)
anyway.

I know, I'm only using a fraction of Word's power. And man, I cranked
out some groovy looking templates! But I certainly don't miss those
hours Tweaking Word To Perfection.

Thanks again for the response. I may try to look around for another
font that is pleasing to write with; I'll miss TNRoman, and plain
Times is hard on my eyeballs.
 
Dayo Mitchell

I assume you saw Alan Wood's post as well? He's the Unicode expert (I was
really just brainstorming). Interestingly, it does sound like if you *both*
were using non-Unicode fonts, there wouldn't be a problem. I believe Word
2004 only has a few Unicode fonts (Times New Roman, Verdana, Trebuchet MS)
so you should be able to find something you like.

Re customizing Word, I was thinking more in terms of custom toolbars than
styles, when I said not taking advantage of Word, but if you feel like
giving it another try at any point, save this link:
http://word.mvps.org/MacWordNew/GlobalTemplate.htm
(hit refresh a few times in Safari, or use a different browser)
Which gives several methods to protect yourself from Normal weirdness by
keeping your customizations elsewhere.
 
Klaus Linke

Hi Alan, 3253535,

Strange that the Word *.doc format should fail to be backwards compatible
between Word2004 and Word X :-(
But your analysis sounds very convincing.

Perhaps HTML could be used as an exchange format?
There, you have a lot of control over the encoding and can choose one that
both versions can deal with:

File > Save as web page > Web options > Encoding > Western European
(Macintosh)
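
To illustrate what choosing an encoding means here, a small Python sketch follows. It does not reproduce what Word's Save as Web Page does internally; it only assumes that "Western European (Macintosh)" corresponds to Python's mac_roman codec, and the function name is hypothetical. It handles the character-encoding step only, not HTML escaping of & or <.

```python
def to_macroman_html(text: str) -> bytes:
    """Encode text as MacRoman bytes for an HTML body.

    Characters with no MacRoman equivalent become numeric character
    references (for example &#257;), so nothing is silently lost.
    """
    return text.encode("mac_roman", errors="xmlcharrefreplace")


# A curly apostrophe exists in MacRoman; a letter with a macron does not.
sample = "It\u2019s a macron: \u0101"
print(to_macroman_html(sample))
```

The curly apostrophe comes out as the single MacRoman byte 0xD5, and the macron letter falls back to a numeric reference that any browser can resolve.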

If it's just a problem with the ambiguous font name (Unicode Times New
Roman vs. non-Unicode TNR), using another font might also do the trick.

And maybe MS might be persuaded to supply an update with an improved import
filter for Word X?

Regards,
Klaus
 
Paul Berkowitz

Klaus said:
Hi Alan, 3253535,

Strange that the Word *.doc format should fail to be backwards compatible
between Word2004 and Word X :-(

The .doc format is perfectly compatible. It's the Unicode characters of 2004
which aren't, since Unicode was only just implemented for 2004. If you're
careful, and restrict yourself to non-Unicode fonts or (trickier, but
possible) non-Unicode characters even in Unicode fonts, you'll be OK. If
you restrict yourself to Mac-approved ways of entering accented characters -
i.e. using the option and option-shift modifiers, you should be OK.
Inserting from Symbol in 2004 may be a problem, since that's what now uses
Unicode characters. (As well, you have a zillion more such characters to
choose from using the system's Character Palette, so avoid that.) You're not
necessarily safe in "non-Unicode" fonts inserting symbols: if the symbol is
not available in your current font, I think Word may do a font substitution.
So be careful with symbols - don't use them if you can avoid them, and make
sure there has not been a font substitution after you insert one.
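
One way to check a passage before sending it to a Word X user is sketched below in Python. It assumes, as a simplification, that "non-Unicode characters" means characters that exist in MacRoman; the function name is invented for this example, and it works on plain text, not on .doc files.

```python
import unicodedata


def non_macroman_chars(text: str) -> list:
    """Return (index, character) pairs for characters outside MacRoman."""
    problems = []
    for index, ch in enumerate(text):
        try:
            ch.encode("mac_roman")
        except UnicodeEncodeError:
            # No MacRoman equivalent: likely to display wrongly in Word X.
            problems.append((index, ch))
    return problems


# e-acute and the curly apostrophe are in MacRoman; o-macron is not.
sample = "caf\u00e9, it\u2019s T\u014dky\u014d"
for index, ch in non_macroman_chars(sample):
    print(index, f"U+{ord(ch):04X}", unicodedata.name(ch))
```

An empty result suggests the text stays within the older character set; anything listed is worth replacing or double-checking on the other machine.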

Klaus said:
But your analysis sounds very convincing.

Perhaps HTML could be used as an exchange format?
There, you have a lot of control over the encoding and can choose one that
both versions can deal with:

File > Save as web page > Web options > Encoding > Western European
(Macintosh)

That's a good idea. I think that should work.

--
Paul Berkowitz
MVP MacOffice
Entourage FAQ Page: <http://www.entourage.mvps.org/faq/index.html>
AppleScripts for Entourage: <http://macscripter.net/scriptbuilders/>

Please "Reply To Newsgroup" to reply to this message. Emails will be
ignored.

PLEASE always state which version of Microsoft Office you are using -
**2004**, X or 2001. It's often impossible to answer your questions
otherwise.
 
Ronald Florence

Paul said:
If you're
careful, and restrict yourself to non-Unicode fonts or (trickier, but
possible) non-Unicode characters even in Unicode fonts, you'll be OK. If
you restrict yourself to Mac-approved ways of entering accented characters -
i.e. using the option and option-shift modifiers, you should be OK.

You might want to add the qualifier that if you're using the U.S.
keyboard (and not the "U.S. Extended" keyboard), entering accented
characters with the Option and Option-Shift modifiers should be OK on
files sent to a non-Unicode capable version of MS-Word. The U.S.
Extended keyboard includes many Option and Option-Shift modified
combinations which are Unicode characters, such as macrons.
 
Klaus Linke

Hi Paul,
Paul said:
The .doc format is perfectly compatible. It's the Unicode characters
of 2004 which aren't, since Unicode was only just implemented for
2004.

To clarify:
Microsoft was completely free to encode those "Unicode characters" any way
they wanted to in the default *.doc format.
It seems they chose to do so in a way that's not compatible with WordX, at
least in some circumstances.

Regards,
Klaus
 
Paul Berkowitz

Klaus said:
To clarify:
Microsoft was completely free to encode those "Unicode characters" any way
they wanted to in the default *.doc format.
It seems they chose to do so in a way that's not compatible with WordX, at
least in some circumstances.

That's because Word X (and earlier versions) used MacRoman-only
"international style" characters for some non-ASCII characters. These were
never compatible with WordWin Unicode characters. MacBU chose to make Word
2004's new Unicode compatible with Word Win, so that eventually there will
be just one standard version rather than perpetuate two. Apple is doing the
same thing, as far as I can see.

 
Matt Centurión [MSFT]

Hi Paul,


Klaus said:
To clarify:
Microsoft was completely free to encode those "Unicode characters" any way
they wanted to in the default *.doc format.
It seems they chose to do so in a way that's not compatible with WordX, at
least in some circumstances.

Regards,
Klaus

The document format in Word X and Word 2004 is the same with regard to
Unicode characters. The only issue with Word vX, 2001 & 98 is that they
won't be able to *display* the Unicode characters; however, they will still
allow you to edit them and preserve them on save. Unicode characters appear
as "_" (underscores) for the most part in past versions of Word.

Word 2004 finally uses ATSUI to display the Unicode characters, and OS X has
made it easier to insert Unicode characters.

To summarize, all versions of Word starting at 98 allow you to open & save
documents with Unicode text in them. However, it is only as of 2004 that
these Unicode characters are visible and thus useful.

Past posters do bring up good points regarding how entering characters may
affect how they display in other apps. For example, entering symbol fonts
using the "Insert | Symbol" dialog will insert characters that are
visible/compatible in pre-2004 versions of Word (for the most part). Using
Apple's character palette will insert Unicode only versions of these symbols
which would only be visible in Word 2004.

Hope that helps,

Matt
MacWord Testing
Macintosh Business Unit

--
This posting is provided "AS IS" with no warranties, and confers no rights.
Please do not send email directly to this e-mail address. It is for
newsgroup purposes only.

Find out everything about Microsoft Mac Newsgroups at:
[http://www.microsoft.com/mac/community/community.aspx?pid=newsgroups]
Check out product updates and news & info at:
[http://www.microsoft.com/mac]
 
Matt Centurión [MSFT]

Before I start spreading confusion, I'd like to clarify that Word vX, 2001,
& 98 will not let you ENTER Unicode characters; that was added as part of
the Unicode changes to Word 2004. So when I talk about these previous
versions being able to open, edit, and save Unicode characters, I'm talking
about Unicode characters that were entered using Word 2004 (or versions of
Windows Word that allow Unicode entry as well).

Apologies for any confusion previous statements may have caused...

Matt
MacWord Testing
Macintosh Business Unit

 
