Getting Product Titles Out

rothrock42

I'm working on a catalog and have several people writing descriptions
using Word vX on the Mac. I need to get a list into Excel of which ones
have been done. They are using style sheets to mark the titles of the
entries in a format like this:

Code-number-rating The Name of the Item is Here

I'd like to create a new document or go directly into Excel with the
info. Ideally it would end up formatted like this:

Code[tab]number[tab]rating[tab]The Name of the Item is Here[paragraph]

Do I need a macro for this? If so, is it something I (a regular person
with a little bit of ActionScript in Flash programming and some Pascal
way, way, way back in college) am capable of learning in a couple of
hours? Any guidance would be great. Thanks.
 
Daiya Mitchell

I'm not much of a coder, so someone else may come along with something more
sophisticated, but here's one approach:

Use Find to select all text in a certain style. (I assume the title of each
description is formatted in an individual style, e.g. ItemTitle?) To do
this, in the Find dialog, click on the blue button to expand it. Use the
Format menu to format the empty Find box in a particular style. Check the
"highlight all items found" to select all the found text.

Copy and Paste that selection into a new document.

If the Code, Number and Rating have some standard format, you could use
wildcards to insert the tabs--e.g., you tell Find and Replace to Find all
sets of 6 digits followed by a space, and replace that space with a tab.
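As a rough sketch of the wildcard idea above, assuming (hypothetically) that every code is exactly six digits followed by a single space, the replace could be written along these lines. The macro name is made up for illustration, and the pattern would need adjusting to the real code format; as written it would also match the last six digits of a longer number.

```vba
' Hypothetical example: put a tab after every run of six digits
' that is followed by a space, throughout the active document.
Sub TabAfterSixDigits()
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = "([0-9]{6}) "        ' six digits (group 1), then a space
        .Replacement.Text = "\1^t"   ' keep the digits, swap the space for a tab
        .Forward = True
        .Wrap = wdFindStop           ' stop at the end instead of asking
        .Format = False
        .MatchWildcards = True
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

It is worth running this on a copy of the temporary document first, since Replace All cannot tell a six-digit code from any other six digits in the text.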

Once the tabs and paragraphs are in place, I think it will copy straight
into Excel in columns and rows.

If this workflow works for you, then the two Find/Replace operations can
easily be saved as macros for one-click access. It should also be possible
to code the copy and paste steps, and all that could be combined into one
macro. However, you should test the workflow manually to make sure it meets
your needs before turning it into a one-click macro.

For actual code, you'll need to state exact OS and Office version numbers,
and give more information about the setup of the document. As I said, this
approach depends on the existing format of the title and the
code/number/rating information.


rothrock42

Yes, the style is applied consistently to only those items. I was
thinking something similar for the tabs, but the number of characters
is variable. But they are always separated by hyphens, so I should be
able to do something like "select up to the first space and within that
selection replace each hyphen with a tab."
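The "up to the first space" rule can be sketched as a small VBA function that works on the text of one title line. This is only an illustration: the function name is hypothetical, it assumes the code, number and rating never contain a space, and it sticks to basic string functions (InStr, Mid) rather than anything version-specific.

```vba
' Hypothetical helper: turns "Code-number-rating The Name" into
' "Code<tab>number<tab>rating<tab>The Name". Hyphens after the
' first space (that is, inside the item name) are left alone.
Function HyphensToTabs(ByVal lineText As String) As String
    Dim spacePos As Long
    Dim i As Long
    Dim ch As String
    Dim result As String

    spacePos = InStr(lineText, " ")
    If spacePos = 0 Then
        ' No space found: return the line unchanged.
        HyphensToTabs = lineText
        Exit Function
    End If

    ' Copy the part before the first space, swapping hyphens for tabs.
    For i = 1 To spacePos - 1
        ch = Mid(lineText, i, 1)
        If ch = "-" Then ch = vbTab
        result = result & ch
    Next i

    ' Replace the first space itself with a tab, then append the name.
    HyphensToTabs = result & vbTab & Mid(lineText, spacePos + 1)
End Function
```

For example, HyphensToTabs("AB-12-3 Blue-Green Vase") gives AB, 12, 3 and "Blue-Green Vase" separated by tabs, with the hyphen in the name untouched.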

These documents will be getting longer and longer so I was hoping to
find a solution that allows me to just do the whole document at once.
And I'm not sure how to do that part with the macro recorder. Won't it
record that I make a new document and do that each time?

I'm on Mac OS 10.3.9 using Word X for Mac Service Release 1.
 
Daiya Mitchell

Yes, the style is applied consistently to only those items. I was
thinking something similar for the tabs, but the number of characters
is variable. But they are always separated by hyphens, so I should be
able to do something like "select up to the first space and within that
selection replace each hyphen with a tab."

It sounds like the hyphens and the Code/Number/Rating are also in the title
style. You could easily tell Word to find hyphens only in that style, and
replace those hyphens with a tab. However, if the item title uses a hyphen,
it could cause problems.

These documents will be getting longer and longer so I was hoping to
find a solution that allows me to just do the whole document at once.

I am not sure what you mean. This approach will do the entire document at
once, in 4 pretty quick steps. At least 3 of those steps can easily be
combined, but there is no point in combining them into a single-click macro
if they haven't been tested by doing it manually. If you only need to run
this on a document once, it is probably not worth coding it, really. It is
only worth turning into a macro if you need to do it over and over again.

However, now I'm a little confused. I was picturing you getting new catalog
material from several people, at repeated intervals, and collating the
titles from these several documents into one Excel file. Is that not the
case? If it's a single document, and it is changing, and you want to only
extract the product titles that have been added since you last extracted
titles, that could be tricky. Explain more, please.

And I'm not sure how to do that part with the record macro. Won't it
record that I make a new document and do that each time?

Since the new document is only a transition step between the original
document and Excel, I don't see why it matters if it creates a new doc each
time. That doc can be thrown away once the paste to Excel is done. By
using a transition document to temporarily hold all the titles, you can
replace all the hyphens without worrying about catching a hyphen in the
catalog description text. You can also then clear all the formatting before
putting the titles into Excel.
 
rothrock42

Sorry if I'm making it less clear. Everything you are saying makes
sense. I have recorded the macro. Here is my reply to what you have
said above. But each of my problems is inherent in the macro as well.
First my response, then the problems.

There will be titles that have hyphens in the names, so I can't replace
all hyphens with tabs.

Your original idea was correct. Twice a week I will get new lists. They
will not be appending to the old lists; it is just that as they get better
at it I will get lists with hundreds of items instead of the 60 or so I have
for my first list.

I also intended the new document to be a temporary step.

So here is the Macro as recorded:

Sub Macro1()
'
' Macro1 Macro
' Macro recorded 5/15/06 by Barbara Rothrock
'
    Selection.Find.ClearFormatting
    Selection.Find.Style = ActiveDocument.Styles("LotTitle")
    With Selection.Find
        .Text = ""
        .Replacement.Text = "^t"
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute
    Selection.Copy
    Windows("Document2").Activate
    Selection.Paste
    Selection.MoveUp Unit:=wdLine, Count:=1
    Selection.MoveRight Unit:=wdWord, Count:=5, Extend:=wdExtend
    Selection.Find.ClearFormatting
    Selection.Find.Style = ActiveDocument.Styles("LotTitle")
    Selection.Find.Replacement.ClearFormatting
    With Selection.Find
        .Text = "-"
        .Replacement.Text = "^t"
        .Forward = True
        .Wrap = wdFindAsk
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    Selection.MoveRight Unit:=wdCharacter, Count:=1
    Selection.TypeBackspace
    Selection.TypeText Text:=vbTab
    Selection.MoveDown Unit:=wdLine, Count:=1
    Windows("INCat.doc").Activate
End Sub

Problems, in order of concern:

1. I have to execute this over and over again to pull each line. It
would be great to have it just keep going until it had exhausted every
occurrence of the "LotTitle" style.

2. After I've pasted the text into the new file I select the first part
of the text and do a find/replace of hyphen to tab. After it does the
find/replace it pops up a dialog: "Word has finished searching the
selection. 2 replacements were made. Do you want to search the remainder
of the document?" I had recorded myself saying "No", but it didn't record.
I don't want to have to answer this for each entry. How do I make it go
away?

3. What if the secondary document isn't called "Document2"?

4. What if the original document isn't called "INCat.doc"? Okay, okay, 3
and 4 are really the same thing!
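One possible way around problems 1, 3 and 4 is to stop driving the Selection and instead loop over the paragraphs of whichever document is in front. The sketch below is hedged accordingly: the macro name and variable names are invented for illustration, it assumes the style is named exactly "LotTitle", and it only gathers the title text into a new document, leaving the hyphens as they are. As for problem 2, the prompt in the recorded macro comes from the ".Wrap = wdFindAsk" line; wdFindStop is the setting that stops at the end of the selection without asking.

```vba
' Hypothetical sketch: copy the text of every LotTitle paragraph
' from the front document into a new, temporary document.
Sub CollectLotTitles()
    Dim src As Document
    Dim dest As Document
    Dim para As Paragraph
    Dim titleCount As Long

    ' Hold on to the source first; no window names are hard-coded.
    Set src = ActiveDocument
    Set dest = Documents.Add

    For Each para In src.Paragraphs
        If para.Style = "LotTitle" Then
            ' Range.Text includes the paragraph mark, so each
            ' title lands on its own line in the new document.
            dest.Content.InsertAfter para.Range.Text
            titleCount = titleCount + 1
        End If
    Next para

    MsgBox titleCount & " LotTitle paragraphs copied."
End Sub
```

Because only the text is carried over, the new document holds unformatted titles, which suits a throwaway step on the way to Excel.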
 
rothrock42

Anybody have any solutions for the four issues? Or can direct me to
where I can learn to fix it myself? Especially part 1. I've got over a
hundred of these to pull out from one document and that takes a long
time.

John McGhie [MVP - Word and Word Macintosh]

Yeah, OK... There are people here who have solutions. I'm one...

The hassle we're having is that *learning* to do this stuff in code is quite
a project; we would not really suggest that unless you are already a
competent coder (no, you won't learn it in a couple of WEEKS). On the other
hand, doing it *for* you goes beyond the bounds of what we would do for
someone for "free" (unless you're very good looking ...)

First, let's simplify the problem. I assume that "LotTitle" is the only
style you are interested in, and that there is intervening text in the
document between the title paragraphs. So you want the content of each
paragraph formatted in LotTitle in a spreadsheet?

OK, so create a document and insert a table of contents that has as its
entry only one style: LotTitle. Look up "Field codes: TOC (Table of
Contents) field" in the Help. You need a TOC field like this:

{ TOC \o "LotTitle,1" \n }

The \n switch takes the page numbers off for you, because you don't want
them, you just want a "list".

Now, call in all of your other documents with RD fields. Look up "Field
codes: RD (Referenced Document) field" in the Help:

{ RD " INCat.doc" }{ RD " INCat2.doc" }{ RD " INCat3.doc" }{ RD " INCat4.doc" }

The other source documents must all be saved and in the same folder as your
master file. If they're not, you need to specify the full path name in each
of those RD fields.

Generate your Table of contents. Copy just the TOC and paste it into a new
document. Unlink the field (see "Prevent changes to information inserted
by a field"). You should have just a list of the paragraphs formatted with
LotTitle style and no page numbers.

Save this as a plain text file. Use the Excel Text Import Wizard to bring
it into a spreadsheet as "delimited" text (see "Use a text file as a data
source" in the Excel help). Specify a hyphen as the delimiter.

Every now and again, you will find a hyphen in the title which will split
the title over two columns. I would simply eye-ball to find those and merge
the cells where it happens.

I think that's done the grunt work for you, without a macro.

Your problem is that you are using the same character, the hyphen, both as
a delimiter and as ordinary punctuation.

If you hit the users hard enough to persuade them to use real tabs when they
mean a delimiter, or to use an em-dash when they want a hyphen in a title,
then you won't have to fiddle around. However, if you have a LOT of
hyphenated titles, then you'll have to sort it out in Excel.

(You should be able to do this in Word, but unfortunately Mac Word has a bug
in it that hangs it if you attempt a replace in a vertical column
selection!)

The column to the right of the Title column should be blank. If there's a
hyphen in the title, Excel will place the remainder of the title in the
column to the right of the title.

Make a formula in Excel that concatenates the two if that happens. For
example, let's assume you have the Title in column D. If there's a hyphen
in the title, the rest of that title will spill into column E. Insert a new
column to the left of D (the Title is now in E and any spill-over in F), and
paste into the new column D a formula such as =IF(F1<>"",E1 & "-" & F1,E1)
The import used up the hyphen as a delimiter, so the formula puts it back
between the two halves.

Fill down, and Excel will concatenate the cell content from the cells that
have spilled across for you.

Of course, you could extend the formula to allow for the presence of more
than one hyphen in the title. Or hit users who typed that twice as hard...

There we go, job done without a line of code :)

Cheers


--

Please reply to the newsgroup to maintain the thread. Please do not email
me unless I ask you to.

John McGhie
Microsoft MVP, Word and Word for Macintosh. Consultant Technical Writer
Sydney, Australia +61 (0) 4 1209 1410
 
rothrock42

Amazing. Simple. Beautiful.

Thank you. This makes perfect sense and will make what I need to do
quite simple.
 
rothrock42

Okay, poked around a bit and figured out how to get the TOC and RD
fields. After reading the help file for TOC I think I should use the \t
switch instead of the \o. Is that right?

I'm getting an error message. "Error! Cannot open file referenced on
page 1." The document referenced in the RD is saved in the same
location as the TOC document.

I tried the name in the RD with both the .doc ending and without.

Also will it just mysteriously expand to create the TOC or do I need to
do something? The directions seem quite vague on the generating part.
 
John McGhie [MVP - Word and Word Macintosh]

Hi:

Yeah, the Help could be more expansive at times...

Yes, you are correct: if you are using non-built-in style names you need the
\t switch, not \o. My bad..

To update a TOC click in the TOC and hit F9 to regenerate it. If you do not
see the entries, right-click it and Toggle Field Codes.

However, if you are seeing that error message, it "has" regenerated and it
can't find the external document.

If you do not wish to fiddle around with RD fields, there is no reason you
can't build this TOC directly in the document they send you, of course :)
Just "Cut" it out instead of "Copying". Before you Cut, make sure you
"unlink" the field, otherwise when you "Paste", it may try to regenerate in
the blank document and you'll lose everything.

Check your RD field carefully: they're a nasty old mechanism and you need
character-for-character accuracy to get them working. The good news is that
once you get them working, they stay that way :)

* The presence or absence of "spaces" is critical. Check the beginning and
end of the string.

* If you have spaces within your names, make sure you quote the entire
string, including the path name.

* Mac file names can be case-specific, depending on how the disk is
formatted: check the capitalisation of your file name.

* If you HAVE an extension on your file name, you MUST specify it in the RD
field. If you do NOT, you must NOT.
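Stray spaces are hard to see in a field code on screen, so one way to run the checks above is to have a macro display each RD field's code between square brackets. This is only a sketch under assumptions: the macro name is hypothetical, and it reads the RD fields of whichever document is in front.

```vba
' Hypothetical sketch: show the exact code of every RD field so
' that leading or trailing spaces inside the quotes are visible.
Sub ShowRDFieldCodes()
    Dim fld As Field
    Dim report As String

    For Each fld In ActiveDocument.Fields
        If fld.Type = wdFieldRefDoc Then
            ' The brackets mark exactly where the field code starts and ends.
            report = report & "[" & fld.Code.Text & "]" & vbCr
        End If
    Next fld

    If report = "" Then report = "No RD fields found."
    MsgBox report
End Sub
```

A space between the opening quote and the file name stands out straight away in the message box, and the spelling and capitalisation can be compared against the real file name.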

The way Windows and Mac work with file name extensions is a bit weird: a
hold-over from earlier days. Neither operating system *requires* extensions
now. But UNIX does! So while you can get away with running without
extensions on Windows, it's not safe on a Mac: there will be times when it
will unpredictably fail to work right :) Actually, I can think of an
instance where it will sometimes fail on Windows, too...

On both operating systems these days, the file name is taken to be
"everything in the string, including the dot and anything that follows it."
So the file name and extension are not two different things any more:
there's only the "file name" and the extension is a significant part of it.

Same with the "case" of letters in file names. Windows is case-agnostic: it
preserves the case of letters, but ignores it when retrieving files. UNIX
is NOT, it is case-specific always.

Hope this helps

John McGhie [MVP - Word and Word Macintosh]

Just looked at the example I sent you...

I see I sent ... { RD " INCat.doc" }

That's wrong: It should have been { RD "INCat.doc" } (no leading space...)

It's a pain trying to copy out of Microsoft Help on the Mac. You always get
a leading space, and I need to remember to check for it...

Sorry about that...

