removing paragraph marks in form fields converted to text

Richmond

I have hundreds of forms that contain data in the form fields that I need to
convert into a database. I have figured out how to save the contents of a
form field to a text file, but haven't figured out how to remove the hard
carriage returns present within many of the fields.

Once I can get these hard returns removed so that I can open the file cleanly
in Excel, what I ultimately need to do is figure out how to get the data from
these hundreds of separate Word documents into ONE Excel file. I'm assuming
macros are going to be the method, but any guidance with that would be most
appreciated as well!

TIA!
Kristin
 
Elliott Roper

Richmond said:
I have hundreds of forms that contain data in the form fields that I need to
convert into a database. I have figured out how to save the contents of a
form field to a text file, but haven't figured out how to remove the hard
carriage returns present within many of the fields.

It depends a bit on finding a way of distinguishing the returns you
want from the ones you don't.
I have used Find and Replace to fix text that had a return at the end
of every line and two returns at the end of each paragraph:
1. Find ^p^p (that's two paragraph marks in succession) and replace
it with /\ or some other string that does not occur naturally in the
document.
2. Replace all the remaining ^p with a space.
3. Replace all the /\ with ^p.

Some variation of that might work for your problem.
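
For anyone who would rather script it, the three steps above can be
sketched in Python. This is only a sketch under assumptions: the function
name unwrap_hard_returns is made up for illustration, the text has already
been saved as plain text (so every paragraph mark is a line break), and /\
stands in for whatever placeholder you chose.

```python
def unwrap_hard_returns(text: str, placeholder: str = "/\\") -> str:
    """Join hard-wrapped lines while keeping real paragraph breaks."""
    # The placeholder must not already occur in the text, or step 3
    # would turn it into a spurious paragraph break.
    if placeholder in text:
        raise ValueError("placeholder already occurs in the text")
    # Normalise Windows (CRLF) and old Mac (CR) line endings to LF first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    text = text.replace("\n\n", placeholder)  # step 1: protect double returns
    text = text.replace("\n", " ")            # step 2: other returns become spaces
    return text.replace(placeholder, "\n")    # step 3: restore one return each


sample = "first line\nsecond line\n\nnew paragraph\nstill going"
print(unwrap_hard_returns(sample))
# first line second line
# new paragraph still going
```

The check at the top is the scripted version of picking a string that
does not occur naturally in the document.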

Richmond said:
Once I can get these hard returns removed so that I can open the file cleanly
in Excel, what I ultimately need to do is figure out how to get the data from
these hundreds of separate Word documents into ONE Excel file. I'm assuming
macros are going to be the method, but any guidance with that would be most
appreciated as well!

While the hundreds of docs are still plain text, why not use cat in the
terminal to glue them all together before you hand the result to Excel?
OK, it is a bit unixy, and you could do the whole job in vi or emacs,
but...
 
Richmond

The first part worked beautifully (after I unprotected the form), thanks!
I'm afraid I'm not technical enough to understand the second part. It sounds
like you are suggesting some kind of concatenation, but I wouldn't have the
slightest idea where to access or how to use this kind of functionality... :)
 
Elliott Roper

Richmond said:
The first part worked beautifully (after I unprotected the form), thanks!
I'm afraid I'm not technical enough to understand the second part. It sounds
like you are suggesting some kind of concatenation, but I wouldn't have the
slightest idea where to access or how to use this kind of functionality... :)

Indeed I am ;-)
Well, I'm far from being a unix giant, but this should work:
1. Collect copies of all your text files in a single directory, let's
say ~/Documents/experiments/
2. In Terminal, type
cd ~/Documents
and then
cat experiments/* > biggy.txt

That should make a new file called biggy.txt of all your text files
joined together in your Documents folder.

I'm not sure whether that unadorned command will do the right thing
with returns at the end of each file, but it seemed to be OK in a quick
test here.
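
If the returns at the end of each file turn out to matter, or cat is not
available, the same joining step can be sketched in Python. Treat this as
a rough sketch, not a tested recipe for your files: the function name
join_text_files is invented here, and it assumes the files end in .txt and
are UTF-8 encoded (change the encoding argument if Word saved them
differently).

```python
from pathlib import Path


def join_text_files(folder: Path, output: Path) -> int:
    """Append every .txt file in folder to output; return the file count."""
    # Sort by name so the order is predictable, and skip the output file
    # itself in case it lives in the same folder as the inputs.
    sources = sorted(
        p for p in folder.glob("*.txt") if p.resolve() != output.resolve()
    )
    with output.open("w", encoding="utf-8", newline="\n") as out:
        for src in sources:
            text = src.read_text(encoding="utf-8")
            out.write(text)
            # Unlike plain cat, add a return when a file does not end in
            # one, so its last line never runs into the next file.
            if text and not text.endswith("\n"):
                out.write("\n")
    return len(sources)
```

To mirror the cat example, you would call it with the experiments folder
and the biggy.txt path, for instance
join_text_files(Path.home() / "Documents" / "experiments",
Path.home() / "Documents" / "biggy.txt").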

There will be howls of derision, and better advice from one of the
proper unix mavens who frequent this list.
 
John McGhie

If he doesn't want to use cat he could simply use Insert>File in Word to
insert the content of each file onto the end of a document.


--
Don't wait for your answer, click here: http://www.word.mvps.org/

Please reply in the group. Please do NOT email me unless I ask you to.

John McGhie, Consultant Technical Writer
McGhie Information Engineering Pty Ltd
http://jgmcghie.fastmail.com.au/
Sydney, Australia. S33°53'34.20 E151°14'54.50
+61 4 1209 1410
 
Elliott Roper

John McGhie said:
If he doesn't want to use cat he could simply use Insert>File in Word to
insert the content of each file onto the end of a document.

Indeed he could, but I thought he had hundreds of the little snivellers
;-)

As a PS, there are several inscrutable ways to rename the files to get
them in the proper order before catting them together. Too scary for
me. I'd try GraphicConverter's rename and index features before typing
"man mumble" at the terminal.
I'm still listening for the howls.
And any advice on how to derive a filename from an easy-to-find string in
the content of each?

Oh ok, I'll go embarrass myself on a unix ng.
 
Richmond

I tried the cat command at the command prompt (is that what you mean by
terminal, I hope?) and got this message: 'cat' is not recognized as an
internal or external command, operable program or batch file. This was in
response to the command:

C:\Documents and Settings\krichmond\Desktop\JD text files>cat JD text files/* > biggy.txt

Any thoughts?
Kristin

Also, since you've been so helpful, I thought I would impose on you with one
more question. I was able to record a macro in Word that does the steps
necessary to create the text file, but I need a way to have it give the new
.txt file the same name it has as a Word file. Instead, it is overwriting
the file I used to create the macro. This is the code:
ActiveDocument.SaveAs FileName:="ADMMRCML001JS.txt"

Do you know what syntax I could use to have it name the new file as
itself.txt?
 
Elliott Roper

Richmond said:
I tried the cat command at the command prompt (is that what you mean by
terminal, I hope?) and got this message: 'cat' is not recognized as an
internal or external command, operable program or batch file. This was in
response to the command:

C:\Documents and Settings\krichmond\Desktop\JD text files>cat JD text files/* > biggy.txt

Any thoughts?

I thought you were using a *real* operating system. All that stuff is
unix. Windows isn't.
You *did* post to a Word for Macintosh group.
I should have asked. Microsoft's support pages were designed by the
million monkeys after they finished with Shakespeare but not before
wear and tear damaged their typewriters beyond repair. Lots of PC users
wander in here. Try finding a Word for PC group starting at
http://www.microsoft.com/office/community/en-us/FlyoutOverview.mspx

Richmond said:
Also, since you've been so helpful, I thought I would impose on you with one
more question. I was able to record a macro in Word that does the steps
necessary to create the text file, but I need a way to have it give the new
.txt file the same name it has as a Word file. Instead, it is overwriting
the file I used to create the macro. This is the code:
ActiveDocument.SaveAs FileName:="ADMMRCML001JS.txt"

Do you know what syntax I could use to have it name the new file as
itself.txt?

Not off the top of my head. You need to set a variable to the name of
the current input file, then use part of it for the output file name.
I'd rummage in the VBA help for an example.
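
The idea of taking the input file's name and reusing part of it can be
illustrated outside Word. The sketch below is Python, not VBA, and is
only meant to show the technique: text_name_for is a hypothetical helper,
and the .doc extension on the sample name is an assumption. In a Word
macro the current document's name should be available as
ActiveDocument.Name, which would play the role of the input here.

```python
from pathlib import Path


def text_name_for(doc_path: str) -> Path:
    """Return the same path with its extension swapped for .txt."""
    # with_suffix replaces only the final extension, so the folder and
    # the base name are kept exactly as they were.
    return Path(doc_path).with_suffix(".txt")


print(text_name_for("ADMMRCML001JS.doc"))  # ADMMRCML001JS.txt
```

Because the output name is computed from each input name, every document
gets its own text file instead of overwriting a single hard-coded one.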
 
Richmond

Oops! My screen is just titled Word. I'll try to find a Windows forum.
Thanks again for getting me this far!
 
