Word inserts unnecessary XML tags when you edit a Word XML doc

Keith Howard

Hello.

I have migrated most of my Word documents from .docx format to .xml (2007)
format so that I can run bulk search and replace tools over all of the files
without trashing them (which is what would happen if they were in a binary
format rather than the XML text file format).

The problem I am having is that Word (intelligently?!) inserts unnecessary
xml tags whenever I do anything, e.g. when inserting new text or deleting
existing text. Note the following series of 4 totally unnecessary tags that
Word inserts:
</w:t></w:r><w:r><w:t>

The result of this is that Word has neutralised what I perceive as the major
benefit of storing Word documents in XML, namely grep'ing (i.e. searching and
replacing) over them.

Does anyone have any clues as to how to get Word to stop inserting these
extra tags?

Thanks.

Keith Howard
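
The four-tag sequence quoted above marks the boundary between two adjacent runs, so a phrase that Word has split across runs no longer appears as one contiguous string in the file. The following is a minimal Python sketch of one possible workaround for searching only: it drops run boundaries from an in-memory copy of the XML before looking for the phrase. The names `RUN_BOUNDARY`, `join_split_runs`, and `contains_phrase` are hypothetical, and the pattern assumes the second run carries no `w:rPr` formatting block. Writing the joined text back to disk is not recommended without further checks, because merging runs can change which text a run's formatting applies to.

```python
import re

# Boundary between two adjacent runs: </w:t></w:r><w:r><w:t>
# Attributes on the opening tags (for example xml:space="preserve" or
# revision ids) and whitespace between the tags are tolerated.
# A second run that starts with <w:rPr> is deliberately NOT matched.
RUN_BOUNDARY = re.compile(
    r"</w:t>\s*</w:r>\s*<w:r(?:\s[^>]*)?>\s*<w:t(?:\s[^>]*)?>"
)


def join_split_runs(xml_text: str) -> str:
    """Return a copy of the XML with matching run boundaries removed."""
    return RUN_BOUNDARY.sub("", xml_text)


def contains_phrase(xml_text: str, phrase: str) -> bool:
    """Check for a phrase even when Word has split it across runs."""
    return phrase in join_split_runs(xml_text)


# Illustrative fragment: "Hello world" split across two runs.
sample = (
    "<w:p><w:r><w:t>Hello wo</w:t></w:r>"
    "<w:r><w:t>rld</w:t></w:r></w:p>"
)
found = contains_phrase(sample, "Hello world")
```

A plain substring search on `sample` misses the phrase, while `contains_phrase` finds it because the boundary tags are stripped first.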
 
Doug Robbins - Word MVP

What type of things are you replacing, and with what?

Your statement that the files would get "trashed" by such an operation is
news to me.

See the following page of fellow MVP Greg Maxey's website:


http://gregmaxey.mvps.org/Process_Batch_Folder.htm



--
Hope this helps.

Please reply to the newsgroup unless you wish to avail yourself of my
services on a paid consulting basis.

Doug Robbins - Word MVP
 
Keith Howard

Doug,
I am using a grep-style tool called Funduc Search and Replace to do bulk
search and replace operations over hundreds or thousands of closed (not open)
Word documents. My understanding is that if you do this over a docx file, you
will trash the file and make it unusable, because the binary (and
compressed?) structure of the file prevents you from treating it like a text
file, over which you can do search and replaces without destroying the
integrity of the file.
Does that clarify things at all?
Thanks for your help.
Keith
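
On the "compressed?" point: a .docx file is a ZIP package, and the main document text is stored as XML in a part named `word/document.xml`. A text-mode search and replace over the zipped file itself would indeed be expected to damage it, but the XML inside can be read without converting the document. The following is a minimal Python sketch using only the standard library; `read_document_xml` is a hypothetical helper name, UTF-8 encoding is assumed, and the in-memory package is a stand-in for a real file, which contains more parts than this.

```python
import io
import zipfile


def read_document_xml(docx) -> str:
    """Return the main document part of a .docx package as text.

    `docx` may be a file path or a binary file-like object.
    UTF-8 is assumed for the document part.
    """
    with zipfile.ZipFile(docx) as package:
        return package.read("word/document.xml").decode("utf-8")


# Build a tiny in-memory stand-in for a .docx package so the sketch runs
# without a real file; it holds only the main document part.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
    package.writestr(
        "word/document.xml",
        "<w:document><w:t>Hello</w:t></w:document>",
    )
buffer.seek(0)

xml_text = read_document_xml(buffer)
```

With a real document, passing the file path (for example `read_document_xml("report.docx")`, a hypothetical file name) returns the same kind of XML text that the .xml (2007) format exposes, including the split runs described earlier in the thread.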
 
