VBA to convert Styles to HTML tags

A

Avi Z

All-

I'm trying to convert a sizeable batch of documents using a few
standardized styles (Document Title, Subtitle, Heading, Subtitle,
Bullet, Bullet 2, Emphasis) to tagged HTML.

I know a simple find/replace can insert <P> tags, but I can't figure out
how to convert paragraph, character and spanning styles to their
appropriate <H1> and the like.

Is there something already out there, or can someone walk me through this?

Thanks,

a
 
P

Peter Hewett

Hi Avi

Are you or have you tried saving the file as an HTML file? Something like:

ActiveDocument.SaveAs FileName:="HTML Save As Test.htm", _
FileFormat:= wdFormatHTML, LockComments:=False, Password:="", _
AddToRecentFiles:=True, WritePassword:="", _
ReadOnlyRecommended:=False, EmbedTrueTypeFonts:=False, _
SaveNativePictureFormat:=False, SaveFormsData:=False, _
SaveAsAOCELetter:= False

HTH + Cheers - Peter
 
A

Avi Z

Peter said:
Hi Avi

Are you or have you tried saving the file as an HTML file? Something like:

ActiveDocument.SaveAs FileName:="HTML Save As Test.htm", _
FileFormat:= wdFormatHTML, LockComments:=False, Password:="", _
AddToRecentFiles:=True, WritePassword:="", _
ReadOnlyRecommended:=False, EmbedTrueTypeFonts:=False, _
SaveNativePictureFormat:=False, SaveFormsData:=False, _
SaveAsAOCELetter:= False


Thanks Peter... This works for automating the Save As process, but
doesn't actively insert the tags into the body of the document.

We're beginning to publish with a CMS that takes tagged HTML input into
a simple text form, and the "Save As HTML" generated HTML doesn't
produce what I need.

Basically, if there's a way to insert an <H1> tag at the front of all
text styled as "Title" and a </H1> at the end of said text, I'd love to
see it. I can probably duplicate it from there for the appropriate
style->HTML Tag mapping.

Thanks,

a

(also, is there a quick and dirty solution to convert em-dash,
typesetter's [curly] quotes, and the like to the appropriate &xxx;
combination?)
 
J

Jay Freedman

Hi, Avi,

Using the Replace dialog, click More and check the Use Wildcards
option. In the Find What box, enter

(*)^13

then click the Format button, select Style, and select Heading 1. In
the Replace With box, enter

<H1>\1</H1>^p

and click the Replace All button.

In a macro, the equivalent code (for H1 and H2, plus the special
characters) is

Dim oRg As Range
Set oRg = ActiveDocument.Range
With oRg.Find
.ClearFormatting
.Replacement.ClearFormatting
.Format = True
.Forward = True
.Wrap = wdFindContinue
.MatchWildcards = True

.Text = "(*)^13"
.Style = "Heading 1"
.Replacement.Text = "<H1>\1</H1>^p"
.Execute Replace:=wdReplaceAll

.Style = "Heading 2"
.Replacement.Text = "<H2>\1</H2>^p"
.Execute Replace:=wdReplaceAll

.ClearFormatting
.Format = False
.MatchWildcards = False
.Text = Chr(147) ' left curly quote
.Replacement.Text = "&147"
.Execute Replace:=wdReplaceAll

.Text = Chr(148) ' right curly quote
.Replacement.Text = "&148"
.Execute Replace:=wdReplaceAll

.Text = Chr(151) ' em dash
.Replacement.Text = "&151"
.Execute Replace:=wdReplaceAll
End With

Avi Z said:
Peter said:
Hi Avi

Are you or have you tried saving the file as an HTML file? Something like:

ActiveDocument.SaveAs FileName:="HTML Save As Test.htm", _
FileFormat:= wdFormatHTML, LockComments:=False, Password:="", _
AddToRecentFiles:=True, WritePassword:="", _
ReadOnlyRecommended:=False, EmbedTrueTypeFonts:=False, _
SaveNativePictureFormat:=False, SaveFormsData:=False, _
SaveAsAOCELetter:= False


Thanks Peter... This works for automating the Save As process, but
doesn't actively insert the tags into the body of the document.

We're beginning to publish with a CMS that takes tagged HTML input into
a simple text form, and the "Save As HTML" generated HTML doesn't
produce what I need.

Basically, if there's a way to insert an <H1> tag at the front of all
text styled as "Title" and a </H1> at the end of said text, I'd love to
see it. I can probably duplicate it from there for the appropriate
style->HTML Tag mapping.

Thanks,

a

(also, is there a quick and dirty solution to convert em-dash,
typesetter's [curly] quotes, and the like to the appropriate &xxx;
combination?)
 
A

Avi Z

This was perfect!

Thanks,

a

Jay said:
Hi, Avi,

Using the Replace dialog, click More and check the Use Wildcards
option. In the Find What box, enter

(*)^13

then click the Format button, select Style, and select Heading 1. In
the Replace With box, enter

<H1>\1</H1>^p

and click the Replace All button.

In a macro, the equivalent code (for H1 and H2, plus the special
characters) is

Dim oRg As Range
Set oRg = ActiveDocument.Range
With oRg.Find
.ClearFormatting
.Replacement.ClearFormatting
.Format = True
.Forward = True
.Wrap = wdFindContinue
.MatchWildcards = True

.Text = "(*)^13"
.Style = "Heading 1"
.Replacement.Text = "<H1>\1</H1>^p"
.Execute Replace:=wdReplaceAll

.Style = "Heading 2"
.Replacement.Text = "<H2>\1</H2>^p"
.Execute Replace:=wdReplaceAll

.ClearFormatting
.Format = False
.MatchWildcards = False
.Text = Chr(147) ' left curly quote
.Replacement.Text = "&147"
.Execute Replace:=wdReplaceAll

.Text = Chr(148) ' right curly quote
.Replacement.Text = "&148"
.Execute Replace:=wdReplaceAll

.Text = Chr(151) ' em dash
.Replacement.Text = "&151"
.Execute Replace:=wdReplaceAll
End With

Avi Z <[email protected]> wrote:
 
J

Jay Freedman

Just a minor oops -- I forgot to include the semicolons at the end of the
special-character replacements. I guess you caught that already.

--
Regards,
Jay Freedman
Microsoft Word MVP

Avi said:
This was perfect!

Thanks,

a

Jay said:
Hi, Avi,

Using the Replace dialog, click More and check the Use Wildcards
option. In the Find What box, enter

(*)^13

then click the Format button, select Style, and select Heading 1. In
the Replace With box, enter

<H1>\1</H1>^p

and click the Replace All button.

In a macro, the equivalent code (for H1 and H2, plus the special
characters) is

Dim oRg As Range
Set oRg = ActiveDocument.Range
With oRg.Find
.ClearFormatting
.Replacement.ClearFormatting
.Format = True
.Forward = True
.Wrap = wdFindContinue
.MatchWildcards = True

.Text = "(*)^13"
.Style = "Heading 1"
.Replacement.Text = "<H1>\1</H1>^p"
.Execute Replace:=wdReplaceAll

.Style = "Heading 2"
.Replacement.Text = "<H2>\1</H2>^p"
.Execute Replace:=wdReplaceAll

.ClearFormatting
.Format = False
.MatchWildcards = False
.Text = Chr(147) ' left curly quote
.Replacement.Text = "&147"
.Execute Replace:=wdReplaceAll

.Text = Chr(148) ' right curly quote
.Replacement.Text = "&148"
.Execute Replace:=wdReplaceAll

.Text = Chr(151) ' em dash
.Replacement.Text = "&151"
.Execute Replace:=wdReplaceAll
End With

Avi Z <[email protected]> wrote:
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top