Tim Mavers
I am trying to find the best way to programmatically replace text in a
Word document (using Word 2003). Right now we allow users to enter
specific 'content tags' directly in Word by typing things like @TAGNAME@.
Later in the process we programmatically run through the document via Word
(using COM), which is very slow and error-prone. We search for all these
content tags and replace them with the real values. Some of these tags are
more complex than this example (and support hierarchies), which is why we
came up with content tags.
I know Word has much better XML support these days, but I have not really
used it. Is there a way to save these Word documents as
XML and then parse and transform them using XSLT? Right now our
process is very slow because we have to instantiate Word and use the old
Word COM object model (which is not very friendly with .NET).
I tried a simple export of a Word document to XML, and it contained a lot of
extra XML that I wasn't expecting. More importantly, depending on how
things were entered, our custom tags (e.g. @TAGNAME@) were sometimes split up
over several XML nodes. In other words (I am paraphrasing), the XML
would sometimes look like this:
<a:blah>@</a:blah><a:blah2>TAGNAME</a:blah2><a:blah>@</a:blah>
rather than having @TAGNAME@ be contiguous:
<a:blah>@TAGNAME@</a:blah>
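
To make the split concrete, here is a minimal Python sketch of why a node-by-node search misses the tag, and how joining the text of sibling nodes first would find it. Everything in it is an assumption for illustration: the `urn:example:a` namespace, the `a:para` wrapper element, the tag pattern, and the helper names are hypothetical stand-ins for the paraphrased XML above, not what Word actually exports.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical namespace and element names standing in for the real export.
NS = "urn:example:a"
# Assumed tag syntax: @NAME@ with upper-case letters, digits and underscores.
TAG_PATTERN = re.compile(r"@([A-Z0-9_]+)@")

SPLIT_XML = (
    f'<a:para xmlns:a="{NS}">'
    "<a:blah>@</a:blah><a:blah2>TAGNAME</a:blah2><a:blah>@</a:blah>"
    "</a:para>"
)


def tags_per_node(parent):
    """Search each child node's text on its own (the naive approach)."""
    found = []
    for child in parent:
        found.extend(TAG_PATTERN.findall(child.text or ""))
    return found


def tags_joined(parent):
    """Concatenate the text of all child nodes first, then search."""
    joined = "".join(child.text or "" for child in parent)
    return TAG_PATTERN.findall(joined)


def replace_joined(parent, values):
    """Replace tags in the joined text, store the result in the first child
    and empty the others. This throws away per-node formatting differences."""
    children = list(parent)
    if not children:
        return
    joined = "".join(child.text or "" for child in children)
    replaced = TAG_PATTERN.sub(
        lambda m: values.get(m.group(1), m.group(0)), joined
    )
    children[0].text = replaced
    for child in children[1:]:
        child.text = ""


root = ET.fromstring(SPLIT_XML)
print(tags_per_node(root))  # []
print(tags_joined(root))    # ['TAGNAME']
replace_joined(root, {"TAGNAME": "real value"})
print("".join(child.text or "" for child in root))  # real value
```

Collapsing everything into the first node is the crudest possible merge; it would lose any formatting that differs between the nodes, so it is only meant to show the shape of the problem.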
I couldn't figure out where the a:blah stuff came from. Any ideas?
Thanks!