Retrieving HTML source from Word

bducci

I am trying to get the HTML source of an opened Word document (which
is an .htm file).
I can get the content of the document minus the html tags by using the
following:
wordApp.ActiveDocument.Content.Text

Is there a way to get the full HTML source using the Word.Application object?

thanks
George
 
Cindy Meister

Hi George

No, there's no provision for this in Word. XML, in Word 2003, yes. But not
HTML, not in any version. You'd have to save the file, then look at the file
on disk.

-- Cindy
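
A rough sketch of the save-then-read approach described above, written in Python rather than VBA. It assumes `doc` is a Word Document object (for example one reached over COM with pywin32) and relies on its `Saved`, `Save` and `FullName` members. The helper name `read_saved_html` is made up for illustration. It returns raw bytes because the encoding Word used for the file is not known here.

```python
from pathlib import Path


def read_saved_html(doc) -> bytes:
    """Return the raw bytes of the file behind a Word Document object.

    `doc` is assumed to expose Word's Document.Saved, Document.Save and
    Document.FullName members (hypothetical caller-supplied object).
    """
    # Unsaved edits live only in memory, so the copy on disk would be stale.
    if not doc.Saved:
        doc.Save()
    # FullName is the document's path including the file name.
    return Path(doc.FullName).read_bytes()
```

Decoding the bytes is left to the caller, who can check the charset declared in the HTML itself.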
 
bducci

Hi Cindy,
thanks for your quick reply.

I dug around in the Word object model and found the following:
wordApplicationObject.ActiveDocument.HTMLProject.HTMLProjectItems(1).Text

The above seems to give me the HTML source of the opened Word HTML document.
It appears to work properly, although the HTML is unfiltered (which is not an
issue in my situation). I have only tried it in Word 2003, though.

thanks
George
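
For anyone driving Word from outside VBA, here is a minimal sketch of the same expression in Python. It assumes Windows with pywin32 installed and a Word version that still exposes `HTMLProject` (the post above only confirms Word 2003). The function names `get_html_source` and `connect_to_running_word` are made up for illustration.

```python
def get_html_source(word_app, item_index: int = 1) -> str:
    """Return the HTML source Word holds for the active document.

    Mirrors the expression from the post:
    ActiveDocument.HTMLProject.HTMLProjectItems(1).Text
    """
    project = word_app.ActiveDocument.HTMLProject
    # Item 1 is the one read in the post; other indexes are untested.
    return project.HTMLProjectItems(item_index).Text


def connect_to_running_word():
    """Attach to an already running Word instance (Windows + pywin32 only)."""
    # Imported lazily so get_html_source has no hard dependency on pywin32.
    import win32com.client
    return win32com.client.GetActiveObject("Word.Application")
```

Typical use would be `get_html_source(connect_to_running_word())` with the .htm document open and active in Word.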
 
Cindy Meister

Hah! OK, thanks for pointing this one out to me :) And glad you're up and
running!

-- Cindy
 
