Modifying a Word document without using Word Automation

  • Thread starter Michael G. Schneider

Michael G. Schneider

I know that using Word Automation in a server-side process, for example
inside an ASP page, is not a good idea.

All I want to do in the current project is: open a document, change some
text, save and close the document. Basically, replace some variables, each
consisting of a name enclosed in special characters, with some value. For
example: change "[FirstName]" to "Michael".

Does anybody know whether there is a way of achieving this with basic "file
input / output"? Can I treat a Word document as binary data, perform
the replacement, and save the data, without destroying Word's internal
structure?

Michael G. Schneider
 
Jonathan West

Basic File I/O is probably a non-starter for this. The Word binary file
format is exceedingly complex. Microsoft does not publish the format, though
I believe it is released to selected developers who demonstrate a need for
it.

Two possibilities spring to mind that might be feasible.

1. Use RTF or HTML format
If you save the document either in RTF or HTML format, you can work out
where the places are which you can modify. This has the disadvantage of not
using the native file format for Word, but has the advantage that HTML and
RTF are published formats, both of which can be read and manipulated as text
files.

2. Modify document properties using dsofile.dll
In the Word document, you can create DOCPROPERTY fields for the places you
want to be able to modify, and create matching custom document properties.
Then you can use dsofile.dll from VB or ASP to read and write the
properties. The overall idea is that if you modify the value of the
appropriate property, then the text in the matching DOCPROPERTY field also
changes next time the document is opened.

dsofile.dll is a free download from Microsoft. There is some sample Word
code which uses dsofile to read standard properties in the following
article. It shouldn't be too hard to work out how to do something similar
from ASP.

Getting access to the Document Properties of a Word file
http://www.mvps.org/word/FAQs/MacrosVBA/DSOFile.htm

The article also contains a link to where dsofile can be downloaded.
Although the code in the article only reads built-in properties, it should
be fairly clear how to extend it to cover writing properties and using
custom properties. There is a sample VB application that comes with the
download from Microsoft as well.
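
To illustrate option 1 above: once the documents are saved as RTF, the replacement becomes ordinary text processing. Below is a minimal Python sketch, not a tested production routine. The names `rtf_escape`, `fill_placeholders` and `PLACEHOLDER` are made up for this example, and it assumes that each placeholder such as [FirstName] appears as one unbroken run of characters in the RTF text.

```python
import re

# Assumed placeholder syntax: a name in square brackets, e.g. [FirstName].
PLACEHOLDER = re.compile(r"\[([A-Za-z][A-Za-z0-9_]*)\]")


def rtf_escape(value):
    """Escape a plain-text value so it can be placed in RTF body text."""
    out = []
    for ch in value:
        if ch in "\\{}":
            # Backslash and braces are RTF syntax and must be escaped.
            out.append("\\" + ch)
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Non-ASCII: emit \uN? per UTF-16 code unit, where N is a
            # signed 16-bit decimal and '?' is the fallback character.
            data = ch.encode("utf-16-le")
            for i in range(0, len(data), 2):
                unit = int.from_bytes(data[i:i + 2], "little")
                if unit > 32767:
                    unit -= 65536
                out.append("\\u%d?" % unit)
    return "".join(out)


def fill_placeholders(rtf_text, values):
    """Replace [Name] placeholders with escaped values from a dict."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            return match.group(0)  # leave unknown placeholders untouched
        return rtf_escape(values[name])

    return PLACEHOLDER.sub(substitute, rtf_text)
```

Placeholders with no matching entry in `values` are left as they are, so a missing value shows up visibly in the output instead of silently disappearing. When reading and writing the file itself, decoding and re-encoding with latin-1 round-trips every byte unchanged, which is one way to avoid damaging the rest of the RTF.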
 
Michael G. Schneider

Thanks a lot for the quick answer.

1. Use RTF or HTML format

In a former project I used RTF files for this purpose. I may be wrong, but I
think saving as RTF might mean losing some capabilities in Word.

The real trouble, though, was the following: I did not find an ActiveX DLL
for reading / writing RTF files, so I did the I/O myself. As I did not want
to write a full RTF parser, I simply tried to find my variables inside the
RTF stream. However, there were situations where a variable name was cut
into pieces by matching curly braces. So if the variable name was "[XYZ]"
inside the Word document, something like "[X}{YZ]" could be found inside
the RTF.

2. Modify document properties using dsofile.dll

This is a good idea. If the project were starting from scratch, this could
probably be used. However, in this project there are about 300 existing Word
documents. All variables are already marked by visible text (for example
"[FirstName]"). All documents would have to be modified, which is something
I would like to avoid.
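
Coming back to the RTF problem under point 1, where "[XYZ]" turns up as "[X}{YZ]": one way to cope without a full RTF parser is to search with a pattern that tolerates formatting "noise" between the characters of the variable name. The Python sketch below is only an illustration under assumptions. The names `NOISE`, `split_tolerant_pattern` and `replace_split_placeholder` are invented here, the noise is assumed to consist only of group braces and plain control words such as `\b0` or `\insrsid123`, and `value` is assumed to be RTF-escaped already. Other things Word may insert (hex escapes like `\'e4`, bookmarks, and so on) are not handled.

```python
import re

# Noise tolerated between two characters of a placeholder: any run of
# group braces and control words (letters, optional numeric parameter,
# optional single delimiting space). The lookaheads stop the regex from
# backtracking into the middle of a control word or its parameter.
NOISE = r"(?:[{}]|\\[a-zA-Z]+(?![a-zA-Z])(?:-?\d+(?!\d))?[ ]?)*"


def split_tolerant_pattern(name):
    """Build a regex for [name] that allows noise between characters."""
    chars = "[" + name + "]"
    parts = [re.escape(c) for c in chars]
    # One capturing group of noise between each pair of characters.
    return re.compile(("(" + NOISE + ")").join(parts))


def replace_split_placeholder(rtf_text, name, value):
    """Replace [name] with value, even if RTF groups split the name."""
    pattern = split_tolerant_pattern(name)

    def substitute(match):
        # Re-emit the braces and control words that were inside the
        # placeholder so that the RTF groups stay balanced.
        noise = "".join(match.groups())
        if noise and noise[-1] not in "{} ":
            # Noise ends in a control word: add its delimiting space so
            # the following text cannot be read as part of it.
            noise += " "
        return value + noise

    return pattern.sub(substitute, rtf_text)
```

The replacement text goes in front of the collected noise, so it inherits the formatting of the first fragment of the placeholder. That is a design choice made for this sketch; putting it after the noise would work equally well in most cases.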

Michael G. Schneider
 
Jonathan West

If the layout of the marked text is consistent (e.g. always enclosed in
square brackets), then it wouldn't be too big a job to write a VBA macro
that runs through the batch, searching for the appropriate text, and for
each item it finds, creates a suitable document property using the same name
as the found text (e.g. FirstName for the [FirstName] text) and replaces the
text with a matching DOCPROPERTY field.

Once you have written the macro, 300 documents could probably be run through
in an hour or so, depending on their size and the speed of your machine.
 
nzjrs

Jonathan West said:
The article also contains a link to where dsofile can be downloaded.
Although the code in the article only reads built-in properties, it should
be fairly clear how to extend it to cover writing properties and using
custom properties. There is a sample VB application that comes with the
download from Microsoft as well.

I really need to be able to read other document variables, not just the
built-in Word ones. Does anyone know how to accomplish this using the
DLL?
 
