How does MS Word determine the inter-word space?

LeroyLi

I am working on extracting some statistical features of the inter-word
space in MS Word documents, so I need to know the algorithm or policy that
Word uses to determine the width of the space between two words. So far I
have not found any article or manual that describes it.
I hope someone can give me a hint or a pointer on this. I really
need it. I've done my best, but have found little.

Thanks!

Leroy
 
PeterMcC

The width of the space is defined in the font.
Worth a look:
http://www.microsoft.com/typography/developers/fdsspec/spaces.htm


Or did you mean the algo for inserting additional space for justification?
 
LeroyLi

Thanks for your reply.
I mean that the word spaces differ from line to line. For example, some
lines have wide spaces because the last word was too long to fit on the
line, while in other lines the spaces seem narrow.
Even within the same line, the space seems to differ depending on the
preceding and following letters. This is very different from TeX, which
renders nearly uniform spaces.
I need to know how Word determines this difference.
Is it defined in the font?
Thanks again


Leroy
 
PeterMcC

You're welcome.

It sounds as though you are talking about the spacing inserted for
justification. The simple (simplistic?) answer is to turn off the
justification. TeX certainly adds white space to generate justified
text, though you can set it to violate the justification for words
that exceed the required line length but where you don't want the word to
break.

Having said that, typesetters will tell you that the TeX justification algo
is a proper and robust method for handling the additional white space,
whilst Word's algo is less robust but considered, by Microsoft, to be good
enough for office use.

You can get Word to justify with greater finesse by choosing Tools > Options >
Compatibility > Do full justification like WordPerfect 6.x for Windows.
And you could speculate on why an improved justification algo should be an
option rather than the default.

If it's a matter of wanting to make a specific line or two look better
because there are obvious insertions of white space, you can often make a
significant difference to the justification, without it being obvious to the
reader, by changing the character spacing for the text on that line.

Format > Font > Character spacing - I wouldn't go to more than plus or minus
10%

That's OK for the odd line but you wouldn't want to tidy up a long document
that way.

HTH
 
Suzanne S. Barnhill

And don't overlook the importance of judicious hyphenation to reduce excess
white space.
 
LeroyLi

I think I haven't made my problem clear enough. What I need is
not how to make MS Word work better but the mechanism that MS Word
uses to justify the spaces in a line: the layout model that Word
uses. Just as TeX uses its glue mechanism to adjust the space between
words, what does Word use?
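
For readers unfamiliar with the term, here is a minimal Python sketch of the glue idea: each inter-word space has a natural width plus an amount it may stretch and an amount it may shrink, and a line is set by applying one common ratio to every glue item. The names `Glue` and `set_glue` are made up for illustration, and this leaves out TeX's paragraph-wide choice of break points entirely.

```python
from dataclasses import dataclass


@dataclass
class Glue:
    natural: float  # normal width of the space
    stretch: float  # how much it is allowed to grow
    shrink: float   # how much it is allowed to shrink


def set_glue(box_widths, glues, line_width):
    """Return the adjustment ratio and the final width of each glue item."""
    natural = sum(box_widths) + sum(g.natural for g in glues)
    excess = line_width - natural
    if excess >= 0:
        # Line is too short: distribute the excess in proportion to stretch.
        total = sum(g.stretch for g in glues)
        ratio = excess / total if total else 0.0
        widths = [g.natural + ratio * g.stretch for g in glues]
    else:
        # Line is too long: take the deficit out in proportion to shrink.
        total = sum(g.shrink for g in glues)
        ratio = excess / total if total else 0.0
        widths = [g.natural + ratio * g.shrink for g in glues]
    return ratio, widths


# Three words and two spaces, set to a line width of 100.
ratio, widths = set_glue([30, 20, 40], [Glue(4, 2, 1), Glue(4, 2, 1)], 100)
print(ratio, widths)
```

Because every space on a line shares the same ratio, spaces with equal stretch come out equally wide, which matches the "nearly uniform" look described earlier in the thread.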
 
Robert M. Franz (RMF)

Hello Leroy

Nah, your description was fine.

It's just that nobody in here has come upon a clear _authorized_
description (that only MSFT could give). Doesn't mean there isn't one,
but it's a strong indicator. microsoft.com is a huge place, though.
Maybe you find something (and hopefully share it with us if you do :)).

The default justification algorithm's workings are pretty simple,
though: it adds words one after another, with whatever standard space
width the font defines, until a word is too long to fit on the line. Word
then moves the whole word (or part of it, if it can break at a hyphen or
through hyphenation) to the next line, and expands the remaining spaces on
the line equally.
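
The behaviour described above can be sketched in a few lines of Python. This is only an illustration of the description in this post, not Microsoft's actual code: the function names are made up, hyphenation is ignored, and widths are plain numbers in any consistent unit.

```python
def break_greedy(word_widths, space_width, line_width):
    """First-fit line breaking: return a list of lines (lists of word widths)."""
    lines, current, used = [], [], 0.0
    for w in word_widths:
        # Width the line would have if this word joined it with a normal space.
        needed = w if not current else used + space_width + w
        if current and needed > line_width:
            # The word does not fit: close the line and start a new one.
            lines.append(current)
            current, used = [w], w
        else:
            current, used = current + [w], needed
    if current:
        lines.append(current)
    return lines


def justified_space(line, line_width):
    """Width of each gap once the leftover room is shared out equally."""
    gaps = len(line) - 1
    if gaps == 0:
        return 0.0
    return (line_width - sum(line)) / gaps


lines = break_greedy([30, 20, 40, 25, 10], space_width=5, line_width=110)
print(lines, justified_space(lines[0], 110))
```

Note that each line is decided on its own, without looking at the lines before or after it, and that the last line of a paragraph would normally keep the normal space width rather than being stretched.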

Very simple. And brings pretty bad results if you don't hyphenate a lot!

The compatibility option Peter mentions [BTW: I'm not sure it's needed
anymore in Word 2007] does it a lot better, IMHO. If you activate it,
then upon noting that a word no longer fits on the line, Word starts to
_reduce_ the space width across the whole line to make it fit. I have no
idea what the threshold is (how thin it will make the spaces; I suspect
something around 2/3 of a normal space width, but that's a really wild
guess). If the word fits this way, the next word starts the
new line. If the threshold is reached first, the whole word jumps to the
next line, and word spacing expands as before.
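
As a rough Python sketch of that shrink-first rule: the function name `fits_with_shrink` and the `min_ratio` parameter are made up, and the 2/3 default is only the wild guess mentioned above, not a documented value.

```python
def fits_with_shrink(line, next_word, space_width, line_width, min_ratio=2 / 3):
    """Return the per-gap space width that lets next_word join the line, or None.

    min_ratio is the assumed lower bound on the space width, as a fraction
    of the font's normal space (a guess, not a known Word setting).
    """
    words = line + [next_word]
    gaps = len(words) - 1
    if gaps == 0:
        return space_width  # a lone word stays on its line
    available = (line_width - sum(words)) / gaps
    if available >= space_width:
        return space_width  # fits without shrinking
    if available >= min_ratio * space_width:
        return available  # fits only with narrowed spaces
    return None  # threshold reached: the word moves to the next line


print(fits_with_shrink([30, 20, 40], 5, space_width=6, line_width=110))
print(fits_with_shrink([30, 20, 40], 12, space_width=6, line_width=110))
```

When the function returns None, the line would then be expanded as in the default case.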

Not much more intelligent, but the results are far better IMHO.

Now, it's still a very "dumb" algorithm, sort of, since it doesn't look
back and forth in the paragraph. And the automatic hyphenation approach
is no cleverer.

I have no idea how LaTeX does it, but it usually looks better.

That's why, in Word, I prefer to manually hyphenate texts where I can
justify the time for a proper pagination. [This includes texts w/o
horizontal justification, btw.]

2cents
Robert
 
LeroyLi

Thanks for your reply.
I agree that Word expands the remaining spaces equally. Yet there
remains a problem: how does Word determine the original space between
two words when there is no need to expand the spaces?
I have worked on this for the past few days, and my guess is that these
original spaces are defined in the TrueType font that is Word's default font.
Now I need to read the Times New Roman font file to find the original
space width. Oh, God! I need to find some introduction or specification to
help me. This is really heavy work.
 
Jonathan West

LeroyLi wrote:
I agree that Word expands the remaining spaces equally. Yet there
remains a problem: how does Word determine the original space between
two words when there is no need to expand the spaces?
I have worked on this for the past few days, and my guess is that these
original spaces are defined in the TrueType font that is Word's default font.

Not quite. It is defined by the space character of whatever font is
presently in use. So the unexpanded space varies according to the font face
and size.

LeroyLi wrote:
Now I need to read the Times New Roman font file to find the original
space width. Oh, God! I need to find some introduction or specification to
help me. This is really heavy work.

Not just TNR. Whichever font is in use.

You can get the width of a space relatively easily in Word. You can use the
Selection.Information(wdHorizontalPositionRelativeToPage) property to get
the position of the cursor when positioned to the left of the space, and
again with the cursor positioned to the right of the space. The difference
between the two values is the width of the space in points. You will find
that the space width does differ depending on which font the text is
formatted with.


You could build a library of space widths for various fonts and use that
rather than calculating every time. You could also get the width of a space
at a large font size (e.g. 72pt) and divide down for smaller sizes of the
same font.
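
A minimal Python sketch of such a library, assuming the two cursor positions have already been read out of Word (for instance with the Selection.Information call above) and that widths scale linearly with point size. The function names and the sample readings are placeholders for illustration, not real measurements, and rounding at small sizes may make actual widths deviate slightly.

```python
REFERENCE_SIZE_PT = 72.0

# Space widths in points, as measured at 72 pt, keyed by font name.
space_width_at_72pt = {}


def record_measurement(font_name, left_pt, right_pt):
    """Store the space width as the difference of the two cursor positions."""
    space_width_at_72pt[font_name] = right_pt - left_pt


def space_width(font_name, size_pt):
    """Estimate the unexpanded space width by scaling the 72 pt reading."""
    return space_width_at_72pt[font_name] * size_pt / REFERENCE_SIZE_PT


# Placeholder readings for illustration only, not real measurements.
record_measurement("Times New Roman", 100.0, 118.0)
print(space_width("Times New Roman", 12))
```

Measuring once at a large size keeps the reading error small relative to the width, which is the point of dividing down rather than measuring at 10 or 12 pt directly.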
 
macropod

Hi all,

FWIW, I posted some code recently in microsoft.public.word.printingfonts to extract the (standard) character widths for the ASCII
character set.

Cheers
 
