E
Edgar E. Cayce
I am writing a character counting application which needs to go
through hundreds of Word documents, counting the characters in each
one. I need to do things like count double for bold/italic/underline
and count font transitions, etc., so just asking Word for a char count
does not do it for me.
When the bit of code that does the actual counting executes, it is
awfully slow - like 10-20 characters per second.
The app that calls this is VBA from Access 2003.
What I do is get a range for each "story" in the document (body,
headers, footers, etc.) and run the following code on the range (note
that Counter is a structure I use to hold the various counts). I find
that it takes several seconds between each debug printing that it has
counted 100 chars:
Dim BoldState As Boolean
Dim UnderlineState As Boolean
Dim ItalicState As Boolean
Dim LastFontName As String
Dim LastFontSize As Long
Dim LastFontColor As Long
Dim CharCounter As Long
Dim TheChar As Object
BoldState = False
UnderlineState = False
ItalicState = False
LastFontName = CurrentRange.Characters(1).Font.Name
LastFontSize = CurrentRange.Characters(1).Font.Size
LastFontColor = CurrentRange.Characters(1).Font.Color
For CharCounter = 1 To CurrentRange.Characters.Count
If CharCounter Mod 100 = 0 Then Debug.Print CharCounter
With CurrentRange.Characters(CharCounter)
If .Text = " " Then
Counter.Spaces = Counter.Spaces + 1
ElseIf .Text = vbTab Then
Counter.Tabs = Counter.Tabs + 1
ElseIf .Text = vbCr Then
Counter.Returns = Counter.Returns + 1
Else ' else it is a char, the only one where we care about
bold, font, etc.
Counter.Chars = Counter.Chars + 1
If .Case = wdUpperCase Then
Counter.CapitalChars = Counter.CapitalChars + 1
End If
' note for these, we check transition as well as
presence.
If .Font.Bold Then
Counter.BoldChars = Counter.BoldChars + 1
If BoldState = False Then
Counter.BoldTransitions =
Counter.BoldTransitions + 1
End If
BoldState = True
Else
If BoldState = True Then
Counter.BoldTransitions =
Counter.BoldTransitions + 1
End If
BoldState = False
End If
If .Font.Underline Then
Counter.UnderlineChars = Counter.UnderlineChars +
1
If UnderlineState = False Then
Counter.UnderlineTransitions =
Counter.UnderlineTransitions + 1
End If
UnderlineState = True
Else
If UnderlineState = True Then
Counter.UnderlineTransitions =
Counter.UnderlineTransitions + 1
End If
UnderlineState = False
End If
If .Font.Italic Then
Counter.ItalicChars = Counter.ItalicChars + 1
If ItalicState = False Then
Counter.ItalicTransitions =
Counter.ItalicTransitions + 1
End If
ItalicState = True
Else
If ItalicState = True Then
Counter.ItalicTransitions =
Counter.ItalicTransitions + 1
End If
ItalicState = False
End If
If .Font.Name <> LastFontName Then
Counter.FontTransitions = Counter.FontTransitions
+ 1
LastFontName = .Font.Name
End If
If .Font.Size <> LastFontSize Then
Counter.FontTransitions = Counter.FontTransitions
+ 1
LastFontSize = .Font.Size
End If
If .Font.Color <> LastFontColor Then
Counter.FontTransitions = Counter.FontTransitions
+ 1
LastFontColor = .Font.Color
End If
End If
End With
Next CharCounter
Can anyone help me figure out how to speed this up? I saw something
posted about how using Range.Characters(CharCounter) to index the
chars would be slow, so I tried a For Each [CharObj] in
Range.Characters instead, but it did not seem to be any faster - and
am I getting them in order when I do that?
Ed
through hundreds of Word documents, counting the characters in each
one. I need to do things like count double for bold/italic/underline
and count font transitions, etc., so just asking Word for a char count
does not do it for me.
When the bit of code that does the actual counting executes, it is
awfully slow - like 10-20 characters per second.
The app that calls this is VBA from Access 2003.
What I do is get a range for each "story" in the document (body,
headers, footers, etc.) and run the following code on the range (note
that Counter is a structure I use to hold the various counts). I find
that it takes several seconds between each debug printing that it has
counted 100 chars:
Dim BoldState As Boolean
Dim UnderlineState As Boolean
Dim ItalicState As Boolean
Dim LastFontName As String
Dim LastFontSize As Long
Dim LastFontColor As Long
Dim CharCounter As Long
Dim TheChar As Object
BoldState = False
UnderlineState = False
ItalicState = False
LastFontName = CurrentRange.Characters(1).Font.Name
LastFontSize = CurrentRange.Characters(1).Font.Size
LastFontColor = CurrentRange.Characters(1).Font.Color
For CharCounter = 1 To CurrentRange.Characters.Count
If CharCounter Mod 100 = 0 Then Debug.Print CharCounter
With CurrentRange.Characters(CharCounter)
If .Text = " " Then
Counter.Spaces = Counter.Spaces + 1
ElseIf .Text = vbTab Then
Counter.Tabs = Counter.Tabs + 1
ElseIf .Text = vbCr Then
Counter.Returns = Counter.Returns + 1
Else ' else it is a char, the only one where we care about
bold, font, etc.
Counter.Chars = Counter.Chars + 1
If .Case = wdUpperCase Then
Counter.CapitalChars = Counter.CapitalChars + 1
End If
' note for these, we check transition as well as
presence.
If .Font.Bold Then
Counter.BoldChars = Counter.BoldChars + 1
If BoldState = False Then
Counter.BoldTransitions =
Counter.BoldTransitions + 1
End If
BoldState = True
Else
If BoldState = True Then
Counter.BoldTransitions =
Counter.BoldTransitions + 1
End If
BoldState = False
End If
If .Font.Underline Then
Counter.UnderlineChars = Counter.UnderlineChars +
1
If UnderlineState = False Then
Counter.UnderlineTransitions =
Counter.UnderlineTransitions + 1
End If
UnderlineState = True
Else
If UnderlineState = True Then
Counter.UnderlineTransitions =
Counter.UnderlineTransitions + 1
End If
UnderlineState = False
End If
If .Font.Italic Then
Counter.ItalicChars = Counter.ItalicChars + 1
If ItalicState = False Then
Counter.ItalicTransitions =
Counter.ItalicTransitions + 1
End If
ItalicState = True
Else
If ItalicState = True Then
Counter.ItalicTransitions =
Counter.ItalicTransitions + 1
End If
ItalicState = False
End If
If .Font.Name <> LastFontName Then
Counter.FontTransitions = Counter.FontTransitions
+ 1
LastFontName = .Font.Name
End If
If .Font.Size <> LastFontSize Then
Counter.FontTransitions = Counter.FontTransitions
+ 1
LastFontSize = .Font.Size
End If
If .Font.Color <> LastFontColor Then
Counter.FontTransitions = Counter.FontTransitions
+ 1
LastFontColor = .Font.Color
End If
End If
End With
Next CharCounter
Can anyone help me figure out how to speed this up? I saw something
posted about how using Range.Characters(CharCounter) to index the
chars would be slow, so I tried a For Each [CharObj] in
Range.Characters instead, but it did not seem to be any faster - and
am I getting them in order when I do that?
Ed