"HTML to Word" formatting/macro

GoodLevitation

I have an ASP/C# .NET application. In that application, I'm using an
advanced Word control. Whatever the user types in, I save the data along
with its formatting info as an HTML string in an Oracle CLOB column. We use
this control in many places (and save into many CLOB columns). Now I want
to pull that HTML data into Word document bookmarks through a macro that
converts the HTML tags into the associated Word formatting.

For example, <strong>Testing 1,2,3</strong> should become a bold "Testing
1,2,3". Do you have a macro example similar to this? If not, can you give me
some ideas? One idea I have is to save the data from the Oracle CLOB column as
an HTML file, open that file, copy the data, and paste it into Word. But I
would have to repeat this process one CLOB column at a time, which would be
quite inefficient. Can you advise a better way?

Thanks in advance.
 
Cindy M.

Hi GoodLevitation,

This newsgroup is actually targeted at end users. The word.programming or one of
the office.developer groups would be the better place to ask. When you ask,
always specify the version of Word involved. Someone there might have some of
the functions you're looking for and be willing to share them.

FWIW the Word HTML file format is basically undocumented and is not supported in
the object model. The better approach, if Word 2003 or 2007 is involved, would
be to work with Word's XML file format. You wouldn't need to do any
conversion...

As far as I know, the Clipboard method and inserting a file into the document
are the only ways to get Word to do the work for you.

Cindy Meister
INTER-Solutions, Switzerland
http://homepage.swissonline.ch/cindymeister (last update Jun 17 2005)
http://www.word.mvps.org

This reply is posted in the Newsgroup; please post any follow-up question or
reply in the newsgroup and not by e-mail :)
 
GoodLevitation

Thanks for the reply, Cindy. We have Office 2003 and will be moving to 2007
very shortly. I was able to write a test macro in VBA that converts some basic
HTML tags to their Word equivalents. I thought of the Clipboard method, as you
advised, but for my needs it is way too slow and inefficient.

However, the HTML generated by Word contains a lot of other things. By any
chance, do you know of a mapping from all the HTML tags generated by Word to
the equivalent VBA? MS must have that list somewhere, though I'm not sure
they would share it with us. Any advice on this list?

Thanks again Cindy....

Sincerely,

Soe Naing
 
Don

Surely such information is contained within the Office HTML Filter?
Extracting the information would be an entirely different issue:

http://www.google.com/search?hl=en&q=microsoft+office+html+filter+2.0&btnG=Google+Search
 
Cindy M.

Hi Soe Naing,
No, I know of no such list mapping Word's HTML tags to VBA. Some developers may
have worked out (partial) lists, based on research and trial and error.

If I were you, since this is Office 2003 onwards, I'd extract and save the
WordprocessingML, not the HTML. If the users are being asked to save as HTML,
change the request to Word's native XML. Not only is it documented, you can use
the InsertXML method to put content directly back into an open Word document.
It's everything you're looking to do with HTML, but in contrast to the HTML it's
documented and supported.
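
To illustrate what the bold example from the original question might look like
as Word 2003 XML, here is a minimal sketch in Python (chosen only for brevity;
the thread itself is about VBA and .NET). The function name
`bold_paragraph_wordml` is hypothetical, and the markup is simply the smallest
Word 2003 XML document I would expect Word to accept, not a tested recipe for
every Word version.

```python
from xml.sax.saxutils import escape

# Namespace used by the Word 2003 XML format (WordprocessingML 2003).
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"


def bold_paragraph_wordml(text):
    """Wrap plain text in a one-paragraph Word 2003 XML document, in bold.

    Hypothetical helper for illustration only.
    """
    # A run (w:r) carries its formatting in w:rPr; w:b switches bold on.
    run = f"<w:r><w:rPr><w:b/></w:rPr><w:t>{escape(text)}</w:t></w:r>"
    return (
        f'<w:wordDocument xmlns:w="{W_NS}">'
        f"<w:body><w:p>{run}</w:p></w:body>"
        "</w:wordDocument>"
    )


xml_text = bold_paragraph_wordml("Testing 1,2,3")
print(xml_text)
```

On the Word side, a string like this could then be handed to the InsertXML
method of a Range, for example the range of a bookmark; that step is not shown
here.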

Cindy Meister
INTER-Solutions, Switzerland
http://homepage.swissonline.ch/cindymeister (last update Jun 17 2005)
http://www.word.mvps.org

 
Bob Buckland ?:-\)

Hi Cindy,

FWIW, Word's HTML/XML formatting was documented when the 'big change' to
incorporate things was made with Office 2000, and the HTML part hasn't changed
much since this set of documentation:
http://msdn.microsoft.com/en-us/library/aa155477(office.10).aspx
I don't know, though, whether what's documented would be helpful when it comes
to working with it via VBA. I did find it more useful to convert the .CHM file
to another format <g>.

As you mentioned, working with the .docx packages allows more possibilities and
keeps things 'together' better than working with the older format :)

--

Bob Buckland ?:)
MS Office System Products MVP

*Courtesy is not expensive and can pay big dividends*
 
GoodLevitation

Thanks for the suggestions, y'all...

The problem is that I'm building Word documents for my client/server
PowerBuilder application, using a combination of HTML and XHTML data pieces
that come from web applications. So I have no say in how that data gets
saved.

I saw a posting by Brian Jones that was reposted at
http://openxmldeveloper.org/forums/thread/62.aspx, but that solution may not
work for me.

Is there any Microsoft DLL that converts XHTML to WordML and is available with
WinXP SP2 or Office 2003? Or another free tool? Something that I can pass an
XHTML or HTML string to and get a WordML string back?

If not, I guess I'll have to do a LOT of "file open, file save, file close,
copy/paste to Word" operations...

Have a great evening!!!
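
For what it's worth, the kind of "pass in an HTML string, get a WordML string
back" function asked about above could be sketched roughly as follows. This is
only an illustration in Python using its standard library parser, not a
Microsoft tool. The names `HtmlToWordML`, `html_to_wordml` and `TAG_TO_PROP`
are made up for this sketch, and it only handles a handful of tags (`strong`,
`b`, `em`, `i`, `u`, `p`, `br`). Everything else Word or a web editor emits
(styles, lists, tables, fonts) is ignored, so real data would need a much
larger mapping.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape

# Namespace used by the Word 2003 XML format (WordprocessingML 2003).
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"

# Hypothetical, deliberately tiny mapping from HTML tags to run properties.
TAG_TO_PROP = {
    "strong": "<w:b/>", "b": "<w:b/>",
    "em": "<w:i/>", "i": "<w:i/>",
    "u": '<w:u w:val="single"/>',
}
# Properties are always emitted in this fixed order.
PROP_ORDER = ["<w:b/>", "<w:i/>", '<w:u w:val="single"/>']


class HtmlToWordML(HTMLParser):
    """Collects text runs per paragraph while tracking open formatting tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.active = []        # stack of currently open formatting tags
        self.paragraphs = [[]]  # each paragraph is a list of run strings

    def handle_starttag(self, tag, attrs):
        if tag in TAG_TO_PROP:
            self.active.append(tag)
        elif tag == "p" and self.paragraphs[-1]:
            self.paragraphs.append([])  # start a new paragraph
        elif tag == "br":
            self.paragraphs[-1].append("<w:r><w:br/></w:r>")

    def handle_endtag(self, tag):
        if tag in TAG_TO_PROP:
            # Close the most recently opened matching tag, if any.
            for i in range(len(self.active) - 1, -1, -1):
                if self.active[i] == tag:
                    del self.active[i]
                    break
        elif tag == "p" and self.paragraphs[-1]:
            self.paragraphs.append([])

    def handle_data(self, data):
        # Skip whitespace that only separates block-level tags.
        if not data.strip() and not self.paragraphs[-1]:
            return
        wanted = {TAG_TO_PROP[t] for t in self.active}
        props = "".join(p for p in PROP_ORDER if p in wanted)
        rpr = f"<w:rPr>{props}</w:rPr>" if props else ""
        self.paragraphs[-1].append(f"<w:r>{rpr}<w:t>{escape(data)}</w:t></w:r>")

    def to_wordml(self):
        body = "".join(
            "<w:p>" + "".join(runs) + "</w:p>"
            for runs in self.paragraphs if runs
        )
        return (f'<w:wordDocument xmlns:w="{W_NS}">'
                f"<w:body>{body}</w:body></w:wordDocument>")


def html_to_wordml(html):
    """Convert a small HTML fragment to a Word 2003 XML document string."""
    parser = HtmlToWordML()
    parser.feed(html)
    parser.close()
    return parser.to_wordml()


print(html_to_wordml("<strong>Testing 1,2,3</strong>"))
```

The same idea (a tag-to-property table plus a stack of open tags) should carry
over to VBA or C#, and the resulting string is the sort of thing Cindy's
InsertXML suggestion expects, which would avoid the file open/save/copy/paste
round trips.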
 
