Wildcard search help


Jason L

Hey, I have a few (lengthy) questions about wildcard searches in VBA.

First question, what is missing from this macro that is causing an error
that says something like the following, "...contains a group number which is
out of range"?

Selection.HomeKey Unit:=wdStory

Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting

With Selection.Find
    .Text = "[0-9]{1,}.[^32^s^t]@"
    .Replacement.Text = "\1^+"
    .Wrap = wdFindContinue
    .Format = True
    .MatchCase = False
    .MatchWholeWord = False
    .MatchAllWordForms = False
    .MatchSoundsLike = False
    .MatchWildcards = True
End With

Selection.Find.Execute Replace:=ReplaceAll

Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
Selection.Find.Replacement.Style = ActiveDocument.Styles("List Number 2")

Second question. Right now the above macro gets all unformatted number
paragraphs (1. text, 2. text, etc) and reformats them using a list number
template. There are times when the unformatted lists contain a space before
the number. Right now the macro isn't picking these instances up. What
changes do I need to make to catch them as well?

Final question. The text in this document contains sentences like the
following: Total Electrical Cost: $1,000,000.00. All of the text in this
document is merged from another source, so all formatting is lost. There are
several variations on this line. For example, some variations contain the
word "work" instead of "cost." Other times there are several other words
between Total and Cost. Also, there are times when the text is all caps, and
other times when the text is not. There are also times when the original
client forgets to insert a colon after "cost" or "work." Basically, I need
the macro to find any of these variations, format the text bold, align it
flush right, capitalize all text, and, if possible, add a colon before the
numerical amount. I've had some help on this before, but the client keeps
coming up with new variations on this paragraph while still expecting the
same thing: an automated process.

Here is the code I have thus far (I know I shouldn't use asterisks, but I
was at a loss):

Selection.HomeKey Unit:=wdStory

With Selection.Find
    .Text = "Total*^13"
    .Replacement.Text = "^&"
    .Replacement.Font.Bold = True
    .Forward = True
    .Wrap = wdFindContinue
    .Format = True
    .MatchCase = False
    .MatchWholeWord = False
    .MatchAllWordForms = False
    .MatchSoundsLike = False
    .MatchWildcards = True
End With

Selection.Find.Execute Replace:=wdReplaceAll
'
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting

With Selection.Find
    .Text = "Total*Cost*^13"
    With .ParagraphFormat
        .Alignment = wdAlignParagraphJustify
    End With

    .Forward = True
    .Wrap = wdFindContinue
    .Format = True
    .Replacement.Text = "^&"
    With .Replacement
        With .ParagraphFormat
            .Alignment = wdAlignParagraphRight
        End With
    End With

    .Execute Replace:=wdReplaceAll
End With


TIA,
Jason
 

Helmut Weber

Hi Jason,
I am not the grandmaster of wildcard search,
Maybe Graham Mayor or Klaus Linke will drop by,
but IMHO:
First question
A group is something between parentheses in the search string.
As there is no group in your search string,
you get the mentioned error.
Second question
Wildcard search does not provide for zero (!) or more
occurrences of a single character, only one (!) or more, unfortunately.
I'd suggest first getting rid of all blanks that follow paragraph marks,
as they do no good anyway, and of all sequences of blanks, too.
In general, remove all blanks next to paragraph marks and next to tabs.
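
A rough sketch of that cleanup, in case it helps. This is not tested code from anyone in the thread: the Sub name CleanUpBlanks is made up, it treats only ordinary spaces and tabs as blanks, and it does not deal with blanks next to tabs. Note that it also strips tabs used as paragraph indents, which may or may not be wanted.

```vba
Sub CleanUpBlanks()
    ' Hypothetical helper: runs three wildcard replacements in turn.
    Dim pairs As Variant
    Dim i As Long

    ' Each pair is: wildcard pattern, replacement text.
    pairs = Array( _
        " {2,}", " ", _
        "[^32^t]{1,}([^13])", "\1", _
        "([^13])[^32^t]{1,}", "\1")

    For i = LBound(pairs) To UBound(pairs) - 1 Step 2
        With ActiveDocument.Content.Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Text = pairs(i)
            .Replacement.Text = pairs(i + 1)
            .MatchWildcards = True
            .Forward = True
            .Wrap = wdFindStop
            .Execute Replace:=wdReplaceAll
        End With
    Next i
End Sub
```

The first pair collapses runs of spaces to a single space; the second and third delete blanks immediately before and after a paragraph mark, keeping the mark itself through the \1 group.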
Final question
Here, IMHO, you are at a loss. One of the things I learned
at university is:
"Where everything is possible, nothing is possible."
(Prof. Dr. Hans Günter Tillmann)
Your client will keep coming up with new variations forever.
---
Greetings from Bavaria, Germany
Helmut Weber, MVP
"red.sys" & chr(64) & "t-online.de"
Word XP, Win 98
http://word.mvps.org/
 

Jay Freedman

Hi Jason,

Answer to question 1:
In a wildcard replacement, when you use the code \1, it refers to a group
of characters from the Find.Text expression that is surrounded by
parentheses. Your Find.Text expression doesn't contain any parentheses, so
there is no group for \1 to refer to, hence the error message. I assume you
want to replace the "one or more of space, nonbreaking space or tab" with
the em dash, while keeping the number and period. For that, the correct
Find.Text expression would be

"([0-9]{1,}.)[^32^s^t]@"

If you don't want to keep the period, move it outside the right parenthesis.

Answer to question 2:
You can't do it all in one search, because Word's wildcard expressions
(unlike industry-standard regular expressions) don't have any construct
for "zero or more occurrences". You have to use two searches, the first one
to remove any leading spaces and the second to do the replacement you
already have. This is further complicated because you need to search for
spaces that occur between a paragraph mark and a number, since you don't
want to zap an occurrence in the middle of a paragraph (as could happen if a
sentence ends with a date, a period, and a space). The code I show below
won't catch a space at the very beginning of the document because it doesn't
have a preceding paragraph mark -- you'll have to handle that manually, or
include yet another bit of code to look at just the start of the first
paragraph.
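
For the start of the first paragraph, something along these lines might serve as that extra bit of code. It is only a sketch under a few assumptions: the Sub name is made up, "blanks" means ordinary spaces, nonbreaking spaces (character 160) and tabs, and they are deleted only when a digit follows them.

```vba
Sub TrimLeadingBlanksAtDocStart()
    ' Hypothetical helper, not part of the macro in this thread.
    Dim rng As Range
    Dim moved As Long

    ' Start with an empty range at the very beginning of the document.
    Set rng = ActiveDocument.Range(0, 0)

    ' Extend the end of the range over any run of blanks.
    moved = rng.MoveEndWhile(Cset:=" " & ChrW(160) & vbTab)

    If moved > 0 Then
        ' Delete the blanks only if the next character is a digit,
        ' mirroring the "blanks before a number" rule used elsewhere.
        If ActiveDocument.Range(rng.End, rng.End + 1).Text Like "#" Then
            rng.Delete
        End If
    End If
End Sub
```

Run it once before the two wildcard replacements so that a numbered first paragraph is picked up like all the others.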

Answer to question 3:
Instead of using * (which can grab too much text), try using the
following expression that means "any run of characters that don't include a
paragraph mark":

[!^13]@

So you can search for "Total[!^13]@[^13]" and, assuming the word "Total"
identifies all the right lines and only those lines, you should get the
result you want. Also notice that .MatchCase = False has no effect here:
when .MatchWildcards = True the search is always case-sensitive, so "Total"
will not find "TOTAL"; use something like [Tt][Oo][Tt][Aa][Ll] if you need
both. Also, you should set .Format = False, because that controls only
whether you're using formatting to find things, not whether formatting is
being applied to the replacement.
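
A possible starting point for the third question, offered as a sketch rather than a finished macro. The Sub name is made up. It loops over the matches instead of using Replace All, so the text can be converted to upper case with Range.Case. Because wildcard searches are case-sensitive as far as I know, the pattern spells out both cases of "total". Adding the missing colon is not attempted.

```vba
Sub FormatTotalLines()
    ' Hypothetical helper: formats everything from "Total" to the
    ' end of its paragraph, in any mix of upper and lower case.
    Dim rng As Range

    Set rng = ActiveDocument.Content
    With rng.Find
        .ClearFormatting
        .Text = "[Tt][Oo][Tt][Aa][Ll][!^13]@[^13]"
        .MatchWildcards = True
        .Forward = True
        .Wrap = wdFindStop

        Do While .Execute
            ' rng now covers the match, including the paragraph mark.
            rng.Font.Bold = True
            rng.Case = wdUpperCase
            rng.ParagraphFormat.Alignment = wdAlignParagraphRight
            ' Continue searching after this match.
            rng.Collapse wdCollapseEnd
        Loop
    End With
End Sub
```

As with the expression above, this assumes the word "Total" identifies all the right lines and only those lines; any paragraph that merely mentions "total" in running text would be reformatted too.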

Here's some code for the first two questions:

Selection.HomeKey Unit:=wdStory

Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting

' Pass 1: delete blanks between a paragraph mark and a number.
With Selection.Find
    .Text = "([^13])[^32^s^t]@([0-9]{1,}.)"
    .Replacement.Text = "\1\2"
    .Wrap = wdFindContinue
    .Format = False
    .MatchCase = False
    .MatchWholeWord = False
    .MatchAllWordForms = False
    .MatchSoundsLike = False
    .MatchWildcards = True
    .Execute Replace:=wdReplaceAll
End With

' Pass 2: replace the blanks after the number with an em dash and
' apply the list style. The other settings carry over from pass 1.
With Selection.Find
    .Text = "([0-9]{1,}.)[^32^s^t]@"
    .Replacement.Text = "\1^+"
    .Replacement.Style = _
        ActiveDocument.Styles("List Number 2")
    .Execute Replace:=wdReplaceAll
End With

--
Regards,
Jay Freedman
Microsoft Word MVP


Helmut Weber

Hi Jay,
you wrote: "You can't do it all in one search, because Word's wildcard
expressions (unlike industry-standard regular expressions) don't have any
construct for 'zero or more occurrences'."
Of course you are right, and I am glad we agree in principle.
On second thought, however, isn't there a bit more to it
that should be mentioned? Shouldn't that be a search for "zero or
more occurrences" _followed_ by something else, since searching for "zero
or more occurrences" of something _alone_ would be meaningless?
---
Greetings from Bavaria, Germany
Helmut Weber, MVP
"red.sys" & chr(64) & "t-online.de"
Word XP, Win 98
http://word.mvps.org/
 

Jay Freedman

Hi Helmut,

I'm not sure searching for "zero or more occurrences" of something
alone is quite meaningless, but it isn't terribly useful. :) Not
having such an expression in Word's wildcards is extremely irritating,
though.
 
