Reg Ex Challenge

dejones923

I am stumped at how I can extract a subset of data enclosed in parentheses
that itself contains data enclosed in parentheses. For example, the
paragraph below:

Now is the time for all to aid their country. (Note: This passage has been
quoted for years in the United States (US), but it has never been more true
than it is today, the (21st century). We should do all we can to make it a
reality.)

Objective: I would like to extract all data inside the outer parentheses,
including any data within the nested parentheses, e.g. (US). I use the
expression below, but it only gets from (Note:...(US). It does not pick up
the rest.

my expression used is: mySearch = \(*\)

Any help would be most appreciated.
 
Russ

DJ,
You can't really do it automatically with Word's limited wildcards,
logically, because you aren't being specific on which closing parenthesis to
end the search. If you manually went through the document, first, and
somehow marked or changed the closing parentheses that you wanted the
searches to end on, then you would have something more unique to end each
search on.
Other than that, the best you might be able to do now is to use the
expression you have shown, stop after each find, and throw up a MsgBox in a
loop to ask whether you want to extend the selection to another closing
parenthesis.
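
A rough sketch of such a prompt-and-extend loop is below. It assumes a find has already left some text selected, and the macro name ExtendToNextClosingParen is made up for illustration; treat it as a starting point rather than tested production code.

```vba
Sub ExtendToNextClosingParen()
    'Hypothetical helper: grows the current selection one ")" at a time,
    'asking before each extension.
    Dim rngRest As Word.Range

    Do While MsgBox("Extend the selection to the next closing parenthesis?", _
                    vbYesNo + vbQuestion) = vbYes
        'Search only the text that follows the current selection.
        Set rngRest = ActiveDocument.Range(Selection.End, _
                                           ActiveDocument.Content.End)
        With rngRest.Find
            .ClearFormatting
            .Text = ")"
            .MatchWildcards = False
            .Forward = True
            .Wrap = wdFindStop
            If Not .Execute Then
                MsgBox "No further closing parenthesis was found."
                Exit Do
            End If
        End With
        'After a successful find, rngRest covers the ")" itself,
        'so move the end of the selection out to it.
        Selection.End = rngRest.End
    Loop
End Sub
```

Answering No at any prompt leaves the selection where it is, ready to be copied or processed.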

Also see this on how to use the F8 key to manually extend the selection.
http://www.logicaltips.com/LPMArticle.asp?ID=189
What the article doesn't point out is that the F8 key does the same thing
as toggling the EXT box (you can also click on it) at the bottom of every
document window. Turning that on manually puts you into extend mode, and
you could, for instance, repeatedly press the close parenthesis key to
extend the selection to another closing parenthesis.
 
Klaus Linke

Russ said:
You can't really do it automatically with Word's limited wildcards,
logically, because you aren't being specific on which closing parenthesis
to end the search.

Not with one replacement, anyway.

You could put in tags for matching pairs of braces first.

Say,

Find what: \(([!\(]@)\)
Replace with: <b1>\1</b1>

Find what: \(([!\(]@)\)
Replace with: <b2>\1</b2>

.... rinse and repeat until no more braces are found.

Then you could find matching pairs:
Find what: \<(b[1-9]\>)*\</\1

When done, you could put in real braces for the tags again.
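
The tagging passes could be automated along the following lines. This is only a sketch: the macro name TagNestedParens and the cap of nine levels are assumptions, chosen to line up with the b[1-9] pattern, and it is worth trying on a copy of the document first.

```vba
Sub TagNestedParens()
    'Hypothetical helper: repeats the wildcard replace, innermost pairs
    'first, numbering each pass b1, b2, ... until no plain pair is left.
    Dim rng As Word.Range
    Dim level As Long
    Dim foundAny As Boolean

    For level = 1 To 9                      'cap matches the b[1-9] pattern
        Set rng = ActiveDocument.Content    'fresh range for every pass
        With rng.Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .MatchWildcards = True
            .Forward = True
            .Wrap = wdFindStop
            .Text = "\(([!\(]@)\)"          'a pair with no "(" inside it
            .Replacement.Text = "<b" & level & ">\1</b" & level & ">"
            foundAny = .Execute(Replace:=wdReplaceAll)
        End With
        If Not foundAny Then Exit For       'nothing left to tag
    Next level
End Sub
```

After it runs, the outermost pairs carry the highest tag numbers, so the matching-pair search can be applied to them.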

Regards,
Klaus
 
Russ

DJ,
Because of your question, I wrote this code to help find nested parentheses,
and enhanced it to be a little more versatile in that you can easily change
what it is searching for. It counts the number of opening text finds and
closing text finds until the counts match. The macro could easily be
initiated by a menu choice, toolbar button click, or a shortcut keystroke
combination. So it could be adapted to find nested and not-nested
parentheses, brackets, braces, tags, etc.
Add code where indicated in order to apply actions to found text.

=======================
Public Sub NestedMatchingStringPairs()
    'If no text is pre-selected, work on whole main body of document.
    'Finds an opening string, then searches for a closing string.
    'If the count of the opening and closing strings (nested strings)
    'are not the same, then search for another closing string until the
    'count is the same.
    'Insert more code where indicated below to do other actions on found text.

    Dim aRange As Word.Range
    Dim aRange2 As Word.Range
    Dim aRange3 As Word.Range
    Dim aRange4 As Word.Range
    Dim strOpeningString As String
    Dim strClosingString As String

    'Set pairs to match here.
    strOpeningString = "(" 'or [ or { or e.g. <head> tag
    strClosingString = ")" 'or ] or } or e.g. </head> tag

    If Selection.Type = wdSelectionIP Then 'No text selected?
        Set aRange = ActiveDocument.Content 'then work on whole main body
    ElseIf Selection.Characters.Count > 1 Then
        Set aRange = Selection.Range 'Otherwise work on selected text.
    Else
        MsgBox "If anything is selected, it must be more than one character."
        Exit Sub
    End If
    Set aRange2 = aRange.Duplicate
    Set aRange3 = ActiveDocument.Range(0, 0)
    Set aRange4 = aRange.Duplicate
    With aRange.Find
        .Text = strOpeningString
        While .Execute
            aRange2.SetRange Start:=aRange.Start, End:=aRange4.End
            'UBound and Split functions are used for counting found strings.
            Do While Not UBound(Split(aRange3.Text, strOpeningString)) _
                    = UBound(Split(aRange3.Text, strClosingString)) Or _
                    aRange3.Start = aRange3.End
                With aRange2.Find
                    .Text = strClosingString
                    If .Execute And aRange2.End <= aRange4.End Then
                        aRange3.SetRange Start:=aRange.Start, End:=aRange2.End
                        aRange2.Collapse direction:=wdCollapseEnd
                    Else
                        aRange.Select
                        MsgBox "This is an unmatched: " & strOpeningString & _
                            " " & vbCr & "See selection."
                        Exit Sub
                    End If
                End With
            Loop

            '****************
            MsgBox aRange3.Text 'Do stuff with successfully found aRange3 text here.
            '****************

            aRange3.Collapse direction:=wdCollapseEnd
            aRange.SetRange Start:=aRange3.End, End:=aRange4.End
        Wend
        If aRange3.End = 0 Then
            MsgBox "Did not find any opening strings like : " & _
                strOpeningString
        End If
    End With
End Sub
=======================
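
The counting idea can also be tried on a plain string, away from the Find object. The function below is a simplified sketch and not the macro itself: it keeps a running depth instead of re-splitting the text, and the name BalancedSpan and its parameters are invented for the example. It assumes the opening and closing strings are not empty.

```vba
Function BalancedSpan(ByVal sText As String, ByVal sOpen As String, _
                      ByVal sClose As String) As String
    'Hypothetical helper: returns the first outermost sOpen...sClose span
    'in sText, nested pairs included, or "" if there is no opening string
    'or the span never closes. sOpen and sClose must not be empty.
    Dim lStart As Long, lPos As Long
    Dim lNextOpen As Long, lNextClose As Long
    Dim lDepth As Long

    lStart = InStr(1, sText, sOpen)
    If lStart = 0 Then Exit Function

    lDepth = 1
    lPos = lStart + Len(sOpen)
    Do While lDepth > 0
        lNextClose = InStr(lPos, sText, sClose)
        If lNextClose = 0 Then Exit Function    'unmatched opening string
        lNextOpen = InStr(lPos, sText, sOpen)
        If lNextOpen > 0 And lNextOpen < lNextClose Then
            lDepth = lDepth + 1                 'another level opens first
            lPos = lNextOpen + Len(sOpen)
        Else
            lDepth = lDepth - 1                 'one level closes
            lPos = lNextClose + Len(sClose)
        End If
    Loop
    BalancedSpan = Mid$(sText, lStart, lPos - lStart)
End Function
```

For instance, BalancedSpan("a (b (c) d) e", "(", ")") returns "(b (c) d)", and multi-character pairs such as "<b>" and "</b>" are handled the same way.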


Earlier versions of Word based on VB5, or MacWord 2004, lack the built-in
VB6 Split() function and need a substitute such as the found code below.
Code for other VB6 built-in functions can be found on the net (InStrRev(),
Join(), Replace(), StrReverse(), etc.).

Public Function Split(ByVal sString As String, sDelimiter As String, _
        Optional iCompare As Integer = vbBinaryCompare) As Variant
    'Use vbTextCompare for a case-insensitive match.
    'Counters are Long so that long texts do not overflow an Integer.
    Dim sArray() As String, iArrayUpper As Long, iPosition As Long
    iArrayUpper = 0
    iPosition = InStr(1, sString, sDelimiter, iCompare)
    Do While iPosition > 0
        ReDim Preserve sArray(iArrayUpper)
        sArray(iArrayUpper) = Left$(sString, iPosition - 1)
        'Skip the whole delimiter, which may be longer than one character.
        sString = Mid$(sString, iPosition + Len(sDelimiter))
        iPosition = InStr(1, sString, sDelimiter, iCompare)
        iArrayUpper = iArrayUpper + 1
    Loop
    ReDim Preserve sArray(iArrayUpper)
    sArray(iArrayUpper) = sString
    Split = sArray
End Function
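
Since the macro only uses Split to count how many times a string occurs, the same number can be obtained directly from InStr where Split is unavailable. A minimal sketch follows; CountOccurrences is a made-up name, not something the macro already calls.

```vba
Function CountOccurrences(ByVal sText As String, ByVal sToken As String) As Long
    'Hypothetical helper: counts non-overlapping occurrences of sToken in
    'sText. For non-empty text this matches UBound(Split(sText, sToken)).
    Dim lPos As Long

    If Len(sToken) = 0 Then Exit Function   'nothing to look for
    lPos = InStr(1, sText, sToken)
    Do While lPos > 0
        CountOccurrences = CountOccurrences + 1
        'Continue searching just past the occurrence that was found.
        lPos = InStr(lPos + Len(sToken), sText, sToken)
    Loop
End Function
```

A span is balanced when CountOccurrences gives the same result for the opening and the closing string.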
 
Russ

DJ,
If you want to convert the subroutine to use arguments you could copy and
paste this starting code below over the first several lines, up to and
including the beginning of the first "if" statement :...

<snip>
Public Sub NestedMatchingStringPairs(strOpeningString As String, _
strClosingString As String)
'If no text is pre-selected, work on whole main body of document.
'Finds an opening string, then searches for a closing string.
'If the count of the opening and closing strings (nested strings)
'are not the same, then search for another closing string until the
'count is the same.
'Insert more code where indicated below to do other actions on found text.

Dim aRange As Word.Range
Dim aRange2 As Word.Range
Dim aRange3 As Word.Range
Dim aRange4 As Word.Range

If Selection.Type = wdSelectionIP Then 'No text selected?
<end of snip>




...then use a Call statement to call the macro with string arguments:
Sub TestNestedMatchingStringPairs()
Call NestedMatchingStringPairs("(", ")")
End Sub