A
Ancient Brit
I have what seemed at first to be a trivial task for a Word 2003 SP2 macro
(VB 6.3).
Given a body of text containing a range of characters (letters (upper and
lower case), digits, punctuation, spaces), all but the letters A-Z need to be
removed, then the resulting text needs to be sorted, and finally, the count
of each letter should replace each block of sorted letters.
So: “I wandered lonely as a cloud, that floats on high o’er vales and hills,
etc., etc., with a few 12345 thrown in for good measure!†becomes
penultimately:
“AAAAAAAAACCCDDDDDEEEEEEEEEEFFFGGHHHHHHIIIIILLLLLLLMNNNNNNOOOOOOOOORRRRRSSSSSTTTTTTTUUVWWWWYâ€
and then finally: 9 0 3 5 10 3 2 6 5 0 0 7 1 6 9 0 0 5 5 7 2 1 4 0 1 0. (The
zeros appear where a letter in the sequence is missing and so the count for
that letter is zero).
I did some searching around and found very useful information on the use of
Search/Replace with wildcards from Graham Mayor and Klaus Linke at
word.mvps.org (excellent job – thank you!. I’ve been using MS Word for
probably 15 years and I still find something to learn )
My initial code worked OK – my approach was to first select the entire body
of text and render it upper case, then use Search/Replace with the FIND
wildcard sequence [!A-Z] and the REPLACE sequence null to reduce the text to
solely A-Z.
A subsequent Search/Replace on the text added a carriage return after every
character, the result was sorted, followed by another Search/Replace to
remove all the carriage returns. (There may be a quicker/simpler way but I’m
not aware of it.)
When it came to replacing each block of the same letter with its count, I
hit a snag. Try as I might, I cannot find a simple programmatic way to do
what I want.
I thought I had a solution when I tested a manual approach, using FIND with
Highlight checked (so the count is returned, but more importantly the block
of matching text is selected on exiting FIND, so that – I thought – I could
just replace the selection with the contents of Selection.Characters.Count
(and add a space as a separator).
Not so. What works manually doesn’t appear to work in a macro.
If I create a macro (even if I record one) that uses FIND to locate and
select all matching characters, upon completion only the first character in
the group is selected, whether I use Selection or Range.
I haven’t found a bug report that describes the FIND problem – yet – and
there are clearly more complex workarounds that I could devise, but I’d
prefer to keep the solution minimal and simple if I can. I’d be very grateful
for some guidance, even if it’s to say: “Use a workaround; FIND is bugged.â€
Best,
Peter
Example code:
Sub M8()
'
' Macro M8 created 10/15/2007 by Peter GQ Brooks
'
' Sort the text. Simplest way is to begin by making everything upper case
(A-Z), then use
' Search/Replace to remove everything that is NOT in the range A-Z (use
wildcards and the
' expression [!A-Z] for the FIND and null for the REPLACE).
' Then replace every character with itself plus a carriage return, making
each character a
' line on its own, then sort, then delete all carriage returns (replace
every carriage return
' with a null).
' Select the entire document.
' Change case to upper.
Selection.WholeStory
Selection.Range.Case = wdUpperCase
' Ensure Find/Replace boxes have no prior formatting to impede process
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
' Dump everything that isn't in the range A to Z.
With Selection.Find
.Text = "[!A-Z]"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = True
.MatchWholeWord = False
.MatchWildcards = True
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
' Now look for "any single character" and replace it with the same character
and a carriage return
With Selection.Find
.Text = "^?"
.Replacement.Text = "^&^p"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
' Sort entire document by paragraphs
Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric,
SortOrder:=wdSortOrderAscending, _
FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric,
SortOrder2:= _
wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _
wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending,
Separator:= _
wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False,
LanguageID _
:=wdEnglishUS, SubFieldNumber:="Paragraphs", SubFieldNumber2:= _
"Paragraphs", SubFieldNumber3:="Paragraphs"
' Remove all carriage returns after sorting.
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "^p"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
' Loop from A to Z and replace with count of character
For asciipointer = 65 To 90
Selection.Find.ClearFormatting
With Selection.Find
.Text = Chr(asciipointer)
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
' Experimental section, trying various solutions:
' Set myRange = ActiveDocument.Content
' myRange.Find.Execute FindText:=Chr(asciipointer), Forward:=True
' Stop
' If myRange.Find.Found = True Then myRange.Text =
Selection.Characters.Count
' Stop
' Selection.Find.Execute
' 'Selection.Delete
' Stop
' MsgBox (Selection.Characters.Count)
' Selection.InsertBefore Selection.Characters.Count
' Stop
' 'Selection.Replace
Next asciipointer
End Sub
(VB 6.3).
Given a body of text containing a range of characters (letters (upper and
lower case), digits, punctuation, spaces), all but the letters A-Z need to be
removed, then the resulting text needs to be sorted, and finally, the count
of each letter should replace each block of sorted letters.
So: “I wandered lonely as a cloud, that floats on high o’er vales and hills,
etc., etc., with a few 12345 thrown in for good measure!†becomes
penultimately:
“AAAAAAAAACCCDDDDDEEEEEEEEEEFFFGGHHHHHHIIIIILLLLLLLMNNNNNNOOOOOOOOORRRRRSSSSSTTTTTTTUUVWWWWYâ€
and then finally: 9 0 3 5 10 3 2 6 5 0 0 7 1 6 9 0 0 5 5 7 2 1 4 0 1 0. (The
zeros appear where a letter in the sequence is missing and so the count for
that letter is zero).
I did some searching around and found very useful information on the use of
Search/Replace with wildcards from Graham Mayor and Klaus Linke at
word.mvps.org (excellent job – thank you!. I’ve been using MS Word for
probably 15 years and I still find something to learn )
My initial code worked OK – my approach was to first select the entire body
of text and render it upper case, then use Search/Replace with the FIND
wildcard sequence [!A-Z] and the REPLACE sequence null to reduce the text to
solely A-Z.
A subsequent Search/Replace on the text added a carriage return after every
character, the result was sorted, followed by another Search/Replace to
remove all the carriage returns. (There may be a quicker/simpler way but I’m
not aware of it.)
When it came to replacing each block of the same letter with its count, I
hit a snag. Try as I might, I cannot find a simple programmatic way to do
what I want.
I thought I had a solution when I tested a manual approach, using FIND with
Highlight checked (so the count is returned, but more importantly the block
of matching text is selected on exiting FIND, so that – I thought – I could
just replace the selection with the contents of Selection.Characters.Count
(and add a space as a separator).
Not so. What works manually doesn’t appear to work in a macro.
If I create a macro (even if I record one) that uses FIND to locate and
select all matching characters, upon completion only the first character in
the group is selected, whether I use Selection or Range.
I haven’t found a bug report that describes the FIND problem – yet – and
there are clearly more complex workarounds that I could devise, but I’d
prefer to keep the solution minimal and simple if I can. I’d be very grateful
for some guidance, even if it’s to say: “Use a workaround; FIND is bugged.â€
Best,
Peter
Example code:
Sub M8()
'
' Macro M8 created 10/15/2007 by Peter GQ Brooks
'
' Sort the text. Simplest way is to begin by making everything upper case
(A-Z), then use
' Search/Replace to remove everything that is NOT in the range A-Z (use
wildcards and the
' expression [!A-Z] for the FIND and null for the REPLACE).
' Then replace every character with itself plus a carriage return, making
each character a
' line on its own, then sort, then delete all carriage returns (replace
every carriage return
' with a null).
' Select the entire document.
' Change case to upper.
Selection.WholeStory
Selection.Range.Case = wdUpperCase
' Ensure Find/Replace boxes have no prior formatting to impede process
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
' Dump everything that isn't in the range A to Z.
With Selection.Find
.Text = "[!A-Z]"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = True
.MatchWholeWord = False
.MatchWildcards = True
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
' Now look for "any single character" and replace it with the same character
and a carriage return
With Selection.Find
.Text = "^?"
.Replacement.Text = "^&^p"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
' Sort entire document by paragraphs
Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric,
SortOrder:=wdSortOrderAscending, _
FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric,
SortOrder2:= _
wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _
wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending,
Separator:= _
wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False,
LanguageID _
:=wdEnglishUS, SubFieldNumber:="Paragraphs", SubFieldNumber2:= _
"Paragraphs", SubFieldNumber3:="Paragraphs"
' Remove all carriage returns after sorting.
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "^p"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
' Loop from A to Z and replace with count of character
For asciipointer = 65 To 90
Selection.Find.ClearFormatting
With Selection.Find
.Text = Chr(asciipointer)
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
' Experimental section, trying various solutions:
' Set myRange = ActiveDocument.Content
' myRange.Find.Execute FindText:=Chr(asciipointer), Forward:=True
' Stop
' If myRange.Find.Found = True Then myRange.Text =
Selection.Characters.Count
' Stop
' Selection.Find.Execute
' 'Selection.Delete
' Stop
' MsgBox (Selection.Characters.Count)
' Selection.InsertBefore Selection.Characters.Count
' Stop
' 'Selection.Replace
Next asciipointer
End Sub