Chinese characters

MT666

Version: 2008
Operating System: Mac OS X 10.4 (Tiger)
Processor: Power PC

I have always been able to use Chinese characters in the English version of Word 2004 (Mac) and Word 2003, 2007 (Windows). Now with Word 2008, no-go. I can type fresh characters fine, but if I paste in characters, it shows those ugly repeating rectangles only. The suggestion of using the "Microsoft Language Register" to set it to Japanese also does not help. Just an aside, there is no "Microsoft Language Register" for any of these other versions of Word. Both TextEdit and iWork > Pages can open a Word document that has Chinese characters, or paste Chinese characters into a new document and it looks fine. What's the big problem with Word 2008 and how can it be converted to use more than one character set? Don't just give me a wild guess or tell me to reinstall. I would like to see some Microsoft documentation that explains why Word 2008 has reverted to being lame with Chinese characters.
 
John_McGhie_[MVP]

There's no big problem with Word 2008 — Chinese works fine here, I should
know, my girlfriend is from the People's Republic of China :)

Here: This was pasted out of Word 2008:

Ø 构件的å—力情况分为基本å—力(或基本å˜å½¢ï¼‰å½¢å¼ï¼ˆå¦‚中心å—拉或å—压,扭转,平é¢å¼¯æ›²,
剪切)和组åˆå—力(或组åˆå˜å½¢ï¼‰å½¢å¼ã€‚
Ø 组åˆå˜å½¢ç”±ä¸¤ç§ä»¥ä¸ŠåŸºæœ¬å˜å½¢å½¢å¼ç»„æˆã€‚

You are correct: the Microsoft Language Register has nothing to do with it,
that's for Japanese only.

Whatever the problem is, it's either on your system or in the document you
are looking at. If you did an "upgrade" from an old version of OS X or
Word, you may have the old Mac Roman fonts on your system, and they may not
have been replaced.

I certainly won't tell you to re-install: on a Mac, that's almost guaranteed
to make whatever problem you have worse :) And Word 2008 (in my
experience, anyway...) "just works" in Chinese.

I will, however, ask you to ensure you are fully updated. Go to the Word
menu and choose the "About Word" item. If you do not see "Version 12.2.4"
or higher, then you are missing essential updates, and many things will be
broken. Apply the latest updates and your problem may go away.

The text you are pasting may be in an old Mac Roman font that does not have
Unicode encoding. If the font of the pasted text is too old, it may have
the old Mac character set encoding: if it has, it prevents Word from finding
the modern equivalent font. Hence you get "Non-existent characters" when
Word tries to use that old font.

Try setting that text in SimSun font: chances are, the characters will
appear.

In FontBook, check that the three fonts supplied by Beijing ZhongYi
Electronics Co., China: SimSun, SimHei and SimSun ExtB are all installed,
enabled in all applications, and not duplicated.

Then in FontBook choose Select Duplicated Fonts, and then Resolve
Duplicates. Then shut down the computer and wait until the power goes off.
Then restart (doing a power-off restart runs the Unix clean-up tasks...).

It should be fine now; if not, post back and we'll work out what's wrong.

Hope this helps

--

The email below is my business email -- Please do not email me about forum
matters unless I ask you to; or unless you intend to pay!

John McGhie, Microsoft MVP (Word, Mac Word), Consultant Technical Writer,
McGhie Information Engineering Pty Ltd
Sydney, Australia. | Ph: +61 (0)4 1209 1410 | mailto:[email protected]
 
MT666

The problem has solved itself, but I can't be sure what caused the fix. All I did was open Word 2004, open the same troubling documents, and then quit both versions (yes, you can run both simultaneously). When I relaunched Word 2008, all was well. I suppose for a brief period, it failed to substitute fonts. You know, the Windows Chinese version of Office installs some fonts that are not in the Mac version of Office and are not installed by OS 10.4, so maybe I should try to get those fonts. I notice that Office 2004 had a fonts folder at:
Applications/Microsoft Office 2004/Office/Fonts
but I'm not seeing that for Office 2008.
 
MT666

I found one font, "SimSun", in the old Office 2004 area that seems to be a Chinese font... odd that they use a name I don't think is even correct Pinyin, but whatever. I installed it by double-click through FontBook. We'll see if it helps.

I have had this issue off and on, so I can't imagine that my relaunch was a real fix. I am betting on a font fix.
 
MT666

I spoke too soon. Perhaps I made some edit to the other document while it was open in Word 2004 and that made it look fine in Word 2008. I tried another iffy document (call it "hello.doc") and found the troubling rectangles. Next I opened "hello.doc" in TextEdit, selected the text and identified the Chinese font file as "华文楷体.tiff", which is shown as "STKaiti" in application menus. I then changed the font to "宋体.tiff" (STSong), saved a copy as "hello-song.doc" and opened it in Word 2008. All is well. It seems to me this is an issue of Word not being able to use the "华文楷体.tiff / STKaiti" font.
 
MT666

I can type in STKaiti, but cannot paste it in, even if I have already typed ¶´±± in STKaiti; the paste between characters yields: ¶´o‡¡Ü‚¡ª¨Cæø¿‡±±.
 
MT666

I have more insight:
Virtually all the documents I deal with are created by Chinese teachers who type only for info purposes, never to make a newsletter, brochure or similar document that uses a variety of fonts. They NEVER change the font from the default Times New Roman. They simply alternate their input method by tapping the space bar to go into or out of Chinese ABC input (three input methods for Chinese, but 99% of folks use ABC). This works great for viewing and printing in the Chinese version of Office, but not in the English version. The Times New Roman of the English version cannot understand the Chinese characters.

As far as John McGhie's GF, I am guessing he is reading a document that is typed using a Chinese font. I hope he can have her type in Times New Roman and switch input from Chinese to English and back again without changing the font to anything other than TNR. That may prove that this is just a matter of how Word varies in the Chinese version as opposed to the English version.
 
John_McGhie_[MVP]

Well, my Girlfriend is a professor, so she has had a little practice driving
Word in a mixed-English environment by now :) I just had a look in one of
her documents: I am not sure whether it was created by her or one of her
students, but there are a range of fonts in there, none of them Times NR :)

But yes, you're right: if you have an old version of Times on the computer
and it does not contain Unicode coding, you will get these problems.

Just to reiterate: a font will either specify its characters as 16-bit
integers between 0 and 65,535; or as eight-bit integers between 0 and 255
with a "Font Name". If it uses the first method, everything just works; in
the case of a missing character the OS can switch in whatever font does
contain the character. The old method leaves the OS with no place to go, so
you see the square boxes which mean "The font specified does not contain the
character requested."
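
To make the difference between the two numbering schemes concrete, here is a minimal Python sketch. It is only an illustration: Python's built-in codecs stand in for a font's character map, and nothing here is how Word or OS X actually looks up glyphs.

```python
# Illustration only: Python's built-in codecs stand in for a font's character map.

def describe(ch: str) -> str:
    """Report a character's Unicode code point and whether it fits in 16 bits."""
    cp = ord(ch)
    return f"U+{cp:04X} ({cp}), fits in 16 bits: {cp <= 0xFFFF}"

# A Chinese character has one unambiguous Unicode number.
zhong = describe("中")

# An 8-bit value has no fixed meaning: it depends on the character set in use.
byte = bytes([0xD6])
as_mac_roman = byte.decode("mac_roman")
as_latin1 = byte.decode("latin-1")

# A 256-slot character set simply has no slot for a Chinese character.
try:
    "中".encode("mac_roman")
    fits_mac_roman = True
except UnicodeEncodeError:
    fits_mac_roman = False

print(zhong)
print(repr(as_mac_roman), repr(as_latin1), fits_mac_roman)
```

The same byte decodes to two different characters depending on the assumed character set, which is why text that carries only 8-bit numbers is meaningless without the font or character set it was written in.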

In your Fonts folder you should find a "Microsoft" folder that contains the
fonts you need: these are the ones 2008 installs, you should make sure these
are enabled.

Cheers

MT666

You said "In your Fonts folder you should find a "Microsoft" folder that contains the fonts you need: these are the ones 2008 installs, you should make sure these are enabled."

In my mind, enabling a font involved double-clicking, which opens the font in Font Book. If the font has never been added to the system, I would see the button "Install font". Otherwise, I see all the faces of the font, and the drop arrow shows the option is "Disable" rather than "Enable". This makes me believe this indicates that particular font is enabled.

Your GF obviously doesn't often use the default font (used to be Times New Roman; not sure if her Office is newer and uses Cambria / Calibri). That is not how most folks operate. I'm willing to bet half the people who use Word use the default font unless they have something in mind that requires a very different look. That has been the reason why MS is so mindful of the default font. I would guess no other word processor in the world uses Cambria or Calibri as the default font. But I digress...

Background: A few weeks ago, I repartitioned my Mac hard drive and installed OS 10.5 and OS 10.4 fresh, and did all the updates from Software Update (except 10.5.8 which interferes with permissions repair). I installed Office 2004 and 2008 fresh. My PC has a fairly fresh install of Windows XP SP3, did all the updates, and Office 2007.

If I have a serious font issue, how is it that I have this issue on three systems, three different versions of Office, and the issue also existed before any of these fresh installs? Do all the people running Win XP, OS 10.4, and OS 10.5 have this issue or has this issue only blessed me?
 
MT666

A couple of other observations:
My PC has both English Windows XP (partition F) and Chinese XP installed (partition C). Office 2007 is installed on C. A search for "fonts" on C finds only the C:\Windows\Fonts. In that, among others, are these fonts:
times.ttf, timesbd.ttf, timesbi.ttf, and timesi.ttf
At F:\Windows\Fonts are none of these (by name). But if I drag any of those from the C Fonts folder to the F Fonts folder, I get "The Times New Roman Bold [or whatever] is already installed...". That makes me think these Chinese fonts, although thought by the Windows system to be identical to the Times New Roman family installed by the English Windows installer, may actually have some different properties, and maybe I want those properties in the English versions of Windows, OS 10.4, and OS 10.5. Any thoughts on that?
 
John_McGhie_[MVP]

Well, you're getting a little beyond my area of expertise here :)

On a Unicode system, for this issue, it really does not matter which fonts
you do or do not have, provided that at least one font contains the
character you are looking for.
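
As a rough sketch of that idea (not how Word or the operating system actually implements it), font fallback can be pictured like this in Python. The `FONTS` table and the `pick_font` helper are hypothetical, and the coverage sets are tiny made-up samples:

```python
# Hypothetical coverage table: font name -> set of code points it has glyphs for.
# Real fonts cover far more; these tiny sets are for illustration only.
FONTS = {
    "Times New Roman": {ord(c) for c in "ABCabc"},
    "SimSun": {ord(c) for c in "ABCabc中文"},
}

def pick_font(ch, requested, fonts=FONTS):
    """Return the font to draw `ch` with, or None if no font has it."""
    cp = ord(ch)
    # Use the requested font when it actually contains the character.
    if cp in fonts.get(requested, set()):
        return requested
    # Otherwise fall back to the first font that does contain it.
    for name, coverage in fonts.items():
        if cp in coverage:
            return name
    return None  # nothing has the glyph: an empty box is drawn

print(pick_font("a", "Times New Roman"))
print(pick_font("中", "Times New Roman"))
print(pick_font("あ", "Times New Roman"))
```

The lookup only works because the character arrives as a Unicode number; if the number in the document is wrong, no amount of installed fonts will find the right glyph.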

On a Windows system, most folks install Arial Unicode MS, which acts as a
"lender of last resort" to all the other fonts. It was an attempt by
Microsoft to create one font that would contain ALL the characters possible.

That effort was going well while Unicode was small: the font covers Unicode
2.1, which had fewer than 40,000 characters. We're now up to Unicode
5-something, with roughly 100,000 characters, and a single TrueType font can
hold at most 65,535 glyphs.

Arial Unicode MS is 24 Megabytes, and the number of compromises to character
shape needed to get it down to that size is considerable. So it has not
grown any recently.

I always drop a copy on my Mac, just so I never get caught wanting a
character I don't have. Given that all my Windows OSes are also on this
Mac, I don't have to move it "far"...

I would be surprised if "anyone" uses the default font any more. Since
about Word 2003, there has not actually BEEN such a concept as a default
font in Microsoft Word. In versions of Word later than 2003, the "Normal"
style is completely "empty", and the suggestion is that we leave it that
way, so it can inherit its settings from the document Theme.

In Word 2010 I notice there are now a set of "Document Defaults" you can
set, but I haven't investigated yet to see what is in them, or what they do.

Times and Times New Roman, as I am sure you know, were commissioned by the
London Times as a way to cram the most possible words on a piece of paper,
back when newsprint was very expensive and readers had plenty of time to
peer at a newspaper with a magnifying glass. It was an ugly font when it
was invented, and time has not been kind to it.

Most corporate documents these days are never printed (or at least, are
never "consumed" on paper). Times New Roman looks really disgusting on
screen.

Thus Microsoft commissioned two new fonts, one to replace Arial (Calibri)
and one to replace Times New Roman (Cambria). Both have the benefit of the
learning that has continued since the London Times invented TNR. Those who
know much more about fonts than I do suggest that both are elegant pieces of
craftsmanship (which, given the Adobe-centric leanings of most typographers,
surprised me...).

However, they both contain advanced typography such as proper ligatures and
hanging punctuation. Both contain the WGL4 glyph set, which is much wider
than Times New Roman (more characters). And both have much better support
for non-English characters.

So they look better, they work better, and everyone has them. So I use
them, and I recommend them.

I suspect that your issue may have more to do with the document itself than
the fonts on your system. If the document was constructed with a
non-Unicode encoding system, then you have to have the fonts specified in
the document on the system, or you don't get the character. It is also
quite possible to run into documents coded in Unicode that make extensive
use of the "Private Use Area". No rules in there: the font vendor can do
whatever they like. The "Apple" symbol is a case in point: it's a Private
Use Area character: it does appear in some Windows fonts, but whether you
see it or not in a document depends rather on which Apple font it was
inserted from.

Anyway, I can't help you. I suspect that if you switch the font in those
documents to SimSun, you'll get your characters. If you have the Microsoft
fonts offered by Office 2008 installed, I think you'll get your characters.

More than that, I can't really say :)

Hope this helps


MT666

I already know I can change the font to get the characters to appear (not to SimSun, as it makes the English look like Courier). I can change to any font as far as I can tell, such as Times, TNR, or Helvetica-- any of these make the characters readable. The issue still remains that on all three of my computers, a PC and two Macs, using three different English operating systems (Windows XP, OS 10.4, and OS 10.5), when the document is first opened, the characters are not readable. The same documents look fine when first opened in the Chinese Windows. The same documents look fine when first opened with any other application, such as TextEdit, Pages, or NeoOffice. What I want is a solution to this issue for the English versions of Word. If it involves trashing a few fonts or installing some new fonts, I am willing to do it, even though it seems implausible that all three computers are at fault.
 
John_McGhie_[MVP]

Yeah, I know what you want. All I can say is "use a different font". That
one is not going to work :)

There's nothing in Word to say "interpret the fonts differently from other
applications".

Three things have to match:
* The encoding standard of the document
* The encoding standard of the font
* The font installed on the computer.

It may help to disable "Match font with keyboard" in the Word preferences,
but I doubt it: I believe the damage has been done in the document before
you get it.

Characters are specified in the document as numbers. If the numbers were
correct, Word would display the character from whatever font on the computer
contains the characters: that's how Unicode works. The numbers are wrong.
Possibly, the document has been encoded in Shift JIS, Big5, or some such,
using a non-Unicode font that you don't have.
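
For what such a mismatch looks like in practice, here is a small Python sketch. GB2312 is my assumption for the example (it is a common legacy encoding for Simplified Chinese); whether these particular documents use it is not established anywhere in this thread.

```python
name = "华文楷体"  # sample Chinese text

# Bytes as a GB2312 system would store them (assumed encoding, see above).
raw = name.encode("gb2312")

# The same bytes misread as a Western 8-bit character set.
garbled = raw.decode("latin-1")

# Nothing is lost: reversing the wrong step recovers the original text.
recovered = garbled.encode("latin-1").decode("gb2312")

print(garbled)
print(recovered == name)
```

The bytes never change; only the assumption about what the numbers mean does. That is why the same file can look fine in one program and like garbage in another.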

Sorry, but that's all I can tell you.

Cheers


MT666

You said "Three things have to match: The encoding standard of the document The encoding standard of the font The font installed on the computer."

As to the first of these, it seems the English version of Word needs a different encoding standard. By that I mean no Chinese version of Word ever has any problem like this -- EVER.
As to the second, do you mean the encoding of the font on the author's computer or mine?
As to the third, my version of Word shows the font of the selected "garbage" to be Times New Roman. That entire family of fonts are not in my /System/Library/Fonts or my /Library/Fonts. How does Word choose a font that isn't a part of the system at all? Yes, I have restarted since removing all of the Times New Roman fonts.
 
MT666

Is it possible that Word labels any font it can't identify as "Times New Roman"?
 
MT666

WOW! A revelation! I noticed another font in the Word list that is not installed, so I used Spotlight to locate it: It is ON A BACKUP DRIVE AND NOWHERE ELSE! Word has found a way to use any font on any drive attached to the computer. How can I point Word's little head at the fonts that are actually installed and keep it from ferreting out fonts from the far reaches of the universe?
 
John_McGhie_[MVP]

You said "Three things have to match: The encoding standard of the document
The encoding standard of the font
The font installed on the computer."

"As to the first of these, it seems the English version of Word needs a
different encoding standard. By that I mean no Chinese version of Word ever
has any problem like this -- EVER."

My English versions of Word 2000, 2003, 2004, 2007, 2008, 2010 and one other
don't have the problem here, either...

"As to the second, do you mean the encoding of the font on the author's
computer or mine?"

On the author's computer. A document comprises a stream of 16-bit numbers
encoded from the character numbers assigned in the font on the author's
computer.

"As to the third, my version of Word shows the font of the selected 'garbage'
to be Times New Roman."

Probably is, but it's a different version to the one on your computer. It's
probably a very old non-Unicode version of Times New Roman. The character
numbers come from the author's computer. Word is looking for the Unicode
numbers, and not finding them.

"That entire family of fonts are not in my /System/Library/Fonts or my
/Library/Fonts. How does Word choose a font that isn't a part of the system
at all? Yes, I have restarted since removing all of the Times New Roman
fonts."

Word will continue to show the font the document was encoded with, whether
the fonts exist on the local system or not. If you examine the display
closely, you will see that the character outlines appearing are not
"actually" Times New Roman.

When the font in the text is not available on a Unicode system, Word will
switch in the next closest equivalent font. This is often so seamless that
only a skilled typographer would notice the switch.

If you then take the document back to a computer that does have the required
font, Word will again use the encoded font for its outlines.

Cheers

John_McGhie_[MVP]

Word will use only the fonts handed to it by your operating system. Word
doesn't know or care where the font is, it simply asks the operating system
for the font.

The system will either hand Word the requested font, or its closest
available equivalent. Word doesn't care what it is or where it came from.

Word also asks the operating system printing subsystem for the metrics with
which to render the font. Again, it doesn't care what they are or where
they came from: to Word, they're all just "numbers".

Cheers

