Malcolm Patterson
I am running Word 2003.
I had a single file comprising about 60 copies of a multi-page form. I
converted the file from .pdf to Word, but the result is still cumbersome (and
about 90% double-spaced). Much of the bulk is created by repetition of the
form's questions; I need only each question number and the associated answers
(single-spaced). I have software (Datawatch Monarch) that will parse the
resulting mess into a spreadsheet that is actually useful.
I can do this manually, but it's a wasteful and tedious process.
Knowing that I would have more files of a similar nature, I worked through
the document once using find-and-replace (Ctrl-H). The Find box limited the
amount of text I could remove in one pass (about 250 characters, I'm guessing).
There was also a risk of removing something important if too small a piece
of text was used in the search argument. For that reason, and because
anything that started a new line would be preceded by a paragraph mark
(remember, this started life as a .pdf), many "^p" codes are found in the
"find" and "replace" arguments.
I recorded about 125 pairs to be applied each time such a document must be
processed (I'm looking at 6 more in my immediate future--hours, not days).
I saved the pairs in a .txt file (alternating _text to find_ with _text to
replace_).
I attempted to record a macro to be edited for this task, but pasting "^p"
in the Find box is recorded as the end of the string, so my arguments were
truncated in the macro code.
I would really like to write a macro that can loop through this list of
changes. I would like to be able to store the list in a Word table or an
Excel spreadsheet, but the need to deal with ^p and ^# codes may complicate
that.
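
For illustration only, here is a rough sketch of the loop described above,
written in Python rather than as a Word macro. It assumes the .txt list
alternates one "find" line with one "replace" line, that the document has been
saved as plain text, that ^p stands for a line break, and that ^# stands for
any single digit. The function names are hypothetical, and the sample question
text is invented for the example.

```python
import re


def word_find_to_regex(find_text):
    """Translate a Word-style find string into a regular expression.

    Only the two codes mentioned in the post are handled (an assumption):
    ^p becomes a newline and ^# becomes "any single digit".
    """
    pieces = re.split(r"(\^p|\^#)", find_text)
    out = []
    for piece in pieces:
        if piece == "^p":
            out.append("\n")
        elif piece == "^#":
            out.append(r"\d")
        else:
            out.append(re.escape(piece))  # everything else is literal text
    return "".join(out)


def load_pairs(list_text):
    """Split the alternating find/replace list into (find, replace) pairs."""
    lines = list_text.splitlines()
    if len(lines) % 2:
        # Assumption: a missing final line means "replace with nothing".
        lines.append("")
    return list(zip(lines[0::2], lines[1::2]))


def apply_pairs(text, pairs):
    """Apply every pair in order, like running Replace All once per pair."""
    for find_text, replace_text in pairs:
        if not find_text:
            continue  # skip blank find strings
        replacement = replace_text.replace("^p", "\n")
        # A function replacement keeps backslashes in the text literal.
        text = re.sub(word_find_to_regex(find_text),
                      lambda match, r=replacement: r, text)
    return text


# Hypothetical sample data: three pairs, the last one deleting its match.
list_text = "1. What is your name?^p\n1. \n^p^p\n^p\nPage ^#^p\n"
doc = "1. What is your name?\nJane Doe\n\nPage 3\n"
result = apply_pairs(doc, load_pairs(list_text))
```

Because ^p and ^# are stored as ordinary two-character text, the same list
could presumably be kept in a Word table or a spreadsheet column without
special handling; only the step that applies each pair needs to interpret
the codes.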