How can something so seemingly simple be so difficult to fix?!
I certainly wish it were simpler, but the "how" boils down to
"different software has different target groups, has been designed
and implemented by different people at different times, sometimes
with no good standards to work to"
I've posted a shortened version of this message in reply to your most
recent posts...
However, you may find the following helpful (and Graham - if you
think it is worth incorporating any of this in your web pages, please
do so: I'll probably also put it into my t0003 page about
Excel-related problems. However, the following results from a fairly
quick look, not years of experience)
Generally speaking, switching to the DDE approach that Graham has
mentioned will sort out a lot of problems using Excel data sources.
It doesn't work in all cases (e.g. if you do not actually have Excel
on your system, the data is not in the first sheet of the workbook,
or you need to use "unusual" Unicode characters) but in many cases
there is a simple way of dealing with that (e.g. if the sheet isn't
the first in the book, copy the workbook and delete all the sheets
before the one you want).
1. In Excel, you can enter, store and format, ZIP codes in a number of
different ways, including:
a. If you enter 5-digit ZIPs by entering the 5 digits, Excel will
save the data as a number with "General" formatting and display the
number, right justified, with no leading zeroes. If it is possible to
have ZIP codes with leading zeroes, you can ensure that you always
see all 5 digits by applying the ZIP format, which is a numeric
format: 00000.
b. If you enter 9-digit ZIPs by entering the 9 digits without the
hyphen, Excel will save the data as a number with "General"
formatting and display the number, left justified, with no leading
zeroes. If you want to display them with the hyphen, you can apply
the ZIP+4 format which is a numeric format: 00000\-0000 (i.e.
applying the format does not change the stored number into text)
c. If you enter 9-digit ZIPs by entering 5 digits, a hyphen, then 4
digits, Excel will save the data as text with "General" formatting
and display what you entered, right justified.
d. If you format your cells as text, what happens depends on when and
how you do it:
i) Once you have entered a number in a cell that is not formatted
as text, Excel stores it as a number, even if you apply Text
formatting to the cell.
ii) If you format a cell as text, /then/ enter a number, Excel
stores the number as text (actually, I am by no means sure that this
is always the case). But it also flags a warning using one of its
red(?) corner markers. iii) If you already have numbers in cells in
a column and you want to turn them into text format, you can do it,
but not by applying text formatting - you can
- select the column
- select the Data|Text to Columns... menu option
- click Next through the wizard until you reach Step 3 of 3,
then select Text as the Column Data Format.
As you can imagine, it is not all that easy to tell the difference
between the different things merely by looking at them. Typically,
the 9-digit number in (b) will be right-aligned even when it is
displayed with the "-", and the text in (c) will be left-aligned, as
long as you have not explicitly aligned the result.
Also, applying either of the ZIP or ZIP+4 formats to the entire
column is not much help because
e. if you apply the ZIP format to 9-digit ZIPs, you just see a 9-digit
number with no hyphen
f. if you apply the ZIP+4 format to 5-digit ZIPs, the ZIP12345 will
appear as 00001-2345. Pretty useless, in fact.
2. When Word gets the data from Excel,
a. DDE will retrieve the data /as you see it/ It doesn't matter
whether you see 12345-6789 because you entered 12345-6789 (as in (c)
or applied a ZIP+4 format to 123456789 (as in (b)).
b. the default method Word 2002 and later use to get data from Excel
(an "OLE DB provider") gets the data /as it is stored/, but with a
twist. i.e., if you entered 123456789, Word will retrieve 123456789
whether or not it is formatted as a ZIP+4. If you entered 12345-6789,
Word will retrieve "12345-6789".
The twist is that when it gets the data, the OLE DB provider tries to
assign a data type to the entire column. In other words, it sees the
whole column as numeric, or as text. Roughly speaking,
c. if you entered all your ZIPs as numbers and applied ZIP and ZIP+4
formatting, the provider will decide that the column is numeric
d. if all your ZIPs are 9-digit ZIPs that you entered as per 1(c), the
provider will decide that the column is text
e. if there is a mix of types 1(a), 1(b) and 1(c), the provider will
decide on the contents of the first 8 cells in the ZIP code column.
If they are /all/ numbers (types 1(a) and 1(b), the provider will
decide that the whole column is numeric. If /any/ of them are texts,
the provider will decide that the whole column is text.
Then the provider has to decide what to do about numeric values in a
text columne, and text values in a numeric column
f. If the provider decides that the column is numeric, and there are
no type 1(c) ZIPs, then we are OK in the sense that all the ZIP data
gets through to Word as 5- or 9-digit numbers
g. If the provider decides that the column is text, we are OK because
the provider then appears to use the display text for all the cells.
As long as they all look like 5 or 9-digit ZIPs, that's what Word
will see i. However, if the provider decides that the column is numeric,
any
1(c) type ZIPs will be lost as they are passed to Word as the numeric
value 0.
3. If you got this far, it is perhaps worth asking
a. How could we avoid situation 2(i)?
b. If we manage to avoid situation 2(i), how can we display all the
ZIP codes in Word correctly?
4. I /think/ you can fix 3(b) using the fields I suggested elsewhere,
i.e.
{ IF { QUOTE "{ MERGEFIELD zip }99999" \#0 } < 0
"{ MERGEFIELD zip }"
"{ IF { MERGEFIELD ZIP } > 99999
"{ MERGEFIELD ZIP \#"00000'-'0000" }"
"{ MERGEFIELD ZIP \# "00000" }" }" }
There could well be a simpler formulation but at the moment I am only
trying to think of /a/ way to do it.
5. As for 3(a), it might be better to ask an Excel expert (although I
am not convinced that they have to cope with this either unless they
want to use their data in a merge). A lot depends on how you are
acquiring your data (is it in spreadsheets prepared elsewhere, or by
you?), whether you prefer trying to attain consistency during data
entry or whether you prefer to post-process any column that contains
ZIPs, and whether you prefer enforcement (which probably requires a
lot of code that you could do without) or encouragement to conform to
standards.
Because of the problem that the OLE DB provider can have with mixed
data types, I think that any solution should aim to end up with an
Excel ZIP column having either /all numeric/ ZIPs or /all text/ ZIPs.
Once the data has been entered and there is potentially a mix of
numeric and text data, I think the only way to achieve this is to use
approach 1(d)iii above - i.e. change the entire column to text.
5-digit numeric ZIPs will become 5-digit texts, 9-digit numeric ZIPs
formatted as ZIP+4 will become 9-digit texts, and 5-4 text ZIPs will
remain as 5-4 texts. You would need the complex Word field code in
(4) to deal with that column correctly.
If you prefer to stick to numeric codes, a good approach might be to
try to encourage people to enter 5 digits or 9 digits and avoid
entering 5-4 texts. There are a couple of ways you could consider
doing that, e.g. a. Select the ZIP column, go to Format|Cells|Number,
select Custom,
and enter something like the following (or select it if it is already
there): [<100000]00000;[>99999]00000-0000;[Red]"wrong" @
b. Format the column header as General or Text
Numeric 5 and 9-digit ZIP codes should appear correctly, and any Text
ZIPs should appear in Red with "wrong " in front. Or...
c. use Excel conditional formatting (which lets you apply colour etc.
to the cell).
Or I suppose you could decide to enter all ZIPs in two columns - 5
digits in the first, and 4 digits in the second, blank if it's a
5-digit zip. Then piece the ZIP code back together in Word, e.g.
{ MERGEFIELD zip5 \#00000 }{ MERGEFIELD zip4 \#"'-'0000" }
And so on...
MrsMac said:
Thanks -- it worked, sort of. I get five-digit zips instead of
four-digit ones. But it doesn't merge the nine-digit ones. I get
five zeros instead. Is
this because I have the field formatted as Zip Code instead of Zip
Code + Four? When I change the format to Zip + Four, all my
five-digit zips turn into five zeroes, a hyphen, then FOUR numbers
of the five-digit zip code. My
database is over 5000 entries -- too many to go through and change
it all by
hand -- there's got to be a better way!
How can something so seemingly simple be so difficult to fix?!