searching text for date?

C

Co

Hi All,

I want to search word documents for a date in the text. Since the
documents all have different formats I can't search for a dedicated
word like: " Date: ".
So I created a loop that searches for all months something like:

Dim sMS as String
Dim aM() as String
sMS = "JAN,FEB,MAR,APR,MAY,JUN,JUL,AUG,SEP,OCT,NOV,DEC"
aM = Split(sMS,",")

For i = 0 To 11
search in Word document for aM(i)
if found select whole date
try te resolve string as real date
Next

Normally the date I want is somewhere at the top of the document.
However when this is for example AUG
and somewhere in the text it finds FEB it will take that date instead
of the date at the top I really want.
because it first searches for FEB and later on for AUG.

How can I change this so that the right date will be found.

Regards
Marco
 
D

Doug Robbins - Word MVP

You could compare the Range of the dates that are found and make use of the
one with the Range nearest the beginning of the Range of the document
itself.

--
Hope this helps.

Please reply to the newsgroup unless you wish to avail yourself of my
services on a paid consulting basis.

Doug Robbins - Word MVP
 
C

Co

You could compare the Range of the dates that are found and make use of the
one with the Range nearest the beginning of the Range of the document
itself.

--
Hope this helps.

Please reply to the newsgroup unless you wish to avail yourself of my
services on a paid consulting basis.

Doug Robbins - Word MVP












- Tekst uit oorspronkelijk bericht weergeven -

Sound cool Doug,

how does that look in code?

Marco
 
G

Graham Mayor

You could restrict the range e.g. the following will search each of the
first 3 paragraphs in turn for the date using your pattern

Dim oRng As Range
Dim sMS As String
Dim aM() As String
sMS = "JAN,FEB,MAR,APR,MAY,JUN,JUL,AUG,SEP,OCT,NOV,DEC"
aM = Split(sMS, ",")
For i = 1 To 3
Set oRng = ActiveDocument.Paragraphs(i).Range
For j = 0 To 11
If InStr(1, oRng, aM(j)) Then
MsgBox aM(j) & " Found"
End If
Next j
Next i


--
<>>< ><<> ><<> <>>< ><<> <>>< <>><<>
Graham Mayor - Word MVP


<>>< ><<> ><<> <>>< ><<> <>>< <>><<>
 
H

Helmut Weber

Hi Marco,

how about that one:

Sub Test8888()
Dim sTmp As String
sTmp = "JAN,FEB,MAR,APR,MAY,JUN,JUL,AUG,SEP,OCT,NOV,DEC"
FindFirstOf (sTmp)
End Sub

Sub FindFirstOf(sTmp As String)
Dim bFound As Boolean
Dim rTmp As Range
Dim iLng As Long
Set rTmp = ActiveDocument.Range
Dim aM() As String
aM = Split(sTmp, ",")
For iLng = 0 To UBound(aM)
With rTmp.Find
.Text = aM(iLng)
.MatchCase = True
.MatchWholeWord = True
If .Execute Then
bFound = True
rTmp.Collapse direction:=wdCollapseStart
rTmp.start = 0
End If
End With
Next
If bFound Then
rTmp.Collapse direction:=wdCollapseEnd
rTmp.End = rTmp.start + 3
rTmp.Select
End If
End Sub

Sub FindFirstOf(sTmp As String)
Dim rTmp As Range
Dim iLng As Long
Set rTmp = ActiveDocument.Range
Dim aM() As String
aM = Split(sTmp, ",")
For iLng = 0 To UBound(aM)
With rTmp.Find
.Text = aM(iLng)
.MatchCase = True
.MatchWholeWord = True
If .Execute Then
rTmp.Select
rTmp.Collapse direction:=wdCollapseStart
rTmp.start = 0
End If
End With
Next
End Sub
 
H

Helmut Weber

Hi Marco,

try the first example of FindFirstOf(sTmp As String),
the second one was my first attempt and stayed there
beause I pasted without selecting what should be overwritten.

--

Greetings from Bavaria, Germany

Helmut Weber, MVP WordVBA

Vista Small Business, Office XP
 
C

Co

Hi Marco,

try the first example of FindFirstOf(sTmp As String),
the second one was my first attempt and stayed  there
beause I pasted without selecting what should be overwritten.

--

Greetings from Bavaria, Germany

Helmut Weber, MVP WordVBA

Vista Small Business, Office XP

Helmut,

I find your code very helpful.
The only thing I think should be added is the code to search through
all possible Stories.
We use word files which have been converted from pdf. As you possibly
know this is normally a messy thing.
So parts of the text are located in textboxes somewhere on the page.
To make sure that the Search looks in all
these textboxes all the Stories should be searched.

Marco
 
C

Co

Hi Marco,

you may try to incorporate my code in:

http://word.mvps.org/faqs/customization/ReplaceAnywhere.htm

not easy, though. :-(

--

Greetings from Bavaria, Germany

Helmut Weber, MVP WordVBA

Vista Small Business, Office XP

Helmut,

I have trying to change the code a little:

Function FindFirstOf(sTmp As String) As String

Dim bFound As Boolean
Dim rTmp As Word.Range
Dim iLng As Long
Dim iLngStart As Long
Set rTmp = ActiveDocument.Range
Dim aM() As String
aM = Split(sTmp, ",")

For iLng = 0 To UBound(aM)
With rTmp.Find
.Text = aM(iLng)
.MatchCase = True
.MatchWholeWord = True
If .Execute Then
bFound = True
rTmp.Collapse Direction:=wdCollapseStart
iLngStart = rTmp.Start - 3
End If
End With
Next

If bFound Then
rTmp.Collapse Direction:=wdCollapseEnd
rTmp.Start = iLngStart
rTmp.Expand Unit:=wdSentence
rTmp.Select
Else
rTmp.Text = Date
End If
Debug.Print rTmp.Text
FindFirstOf = rTmp.Text

End Function

For example on one line I have: Date: 01 AUG 08.
I find AUG and then want to extend that to 01 AUG 08.
With the code as it is he will select "Date: 01 AUG 08"
How can I make it so that just the date gets selected?

Second problem is the MatchWholeWord = True.
Sometimes a date is written like 01AUG08
He will not find that.

Marco
 
H

Helmut Weber

Hi Marco,

this one will search for the first occurence
of an abbreviation of a month in capital letters.
Remember the spot, where it is found (rTmp).

Then define a new range (rTmp1) from that spot's start
to the document's start.
Then search in that range backwards for a string
consisting of one or two digits,
whereby it is assumed, that such a string exists.
Set rtmp.start to the start of that spot.

Then define a new range (rTmp2) from the end of that spot
to the document's end.
Then search in that range forward for a string
consisting of one or two digits,
whereby it is assumed, that such a string exists.
Set rtmp.end to the end of that spot.

Sub Test8888()
Dim sTmp As String
sTmp = "JAN,FEB,MAR,APR,MAY,JUN,JUL,AUG,SEP,OCT,NOV,DEC"
MsgBox FindFirstOf(sTmp)
End Sub

Function FindFirstOf(sTmp As String) As String
Dim bFound As Boolean
Dim rTmp As Range
Dim rTmp1 As Range
Dim rTmp2 As Range
Dim iLng As Long
Set rTmp = ActiveDocument.Range
Set rTmp1 = Selection.Range
Set rTmp2 = Selection.Range
Dim aM() As String
aM = Split(sTmp, ",")
For iLng = 0 To UBound(aM)
With rTmp.Find
.Text = aM(iLng)
.MatchCase = True
If .Execute Then
bFound = True
rTmp.Collapse Direction:=wdCollapseStart
rTmp.start = 0
End If
End With
Next
If bFound Then
rTmp.Collapse Direction:=wdCollapseEnd
rTmp.End = rTmp.start + 3
rTmp.Select
rTmp1.start = 0
rTmp1.End = rTmp.start
rTmp2.start = rTmp.End
rTmp2.End = ActiveDocument.Range.End
With rTmp1.Find
.Text = "[0-9]{1;2}"
.MatchWildcards = True
.Forward = False
.Execute
rTmp.start = rTmp1.start
End With
With rTmp2.Find
.Text = "[0-9]{1;2}"
.MatchWildcards = True
.Execute
rTmp.End = rTmp2.End
End With
End If
FindFirstOf = rTmp.Text
End Function

But it all depends on the consistency
of the text in your document.

In the end, whether a string represents a date
is undecidable programmatically.
Not whether it could(!) represent a date.

--

Gruß

Helmut Weber, MVP WordVBA

Vista Small Business, Office XP
 
C

Co

Hi Marco,

this one will search for the first occurence
of an abbreviation of a month in capital letters.
Remember the spot, where it is found (rTmp).

Then define a new range (rTmp1) from that spot's start
to the document's start.
Then search in that range backwards for a string
consisting of one or two digits,
whereby it is assumed, that such a string exists.
Set rtmp.start to the start of that spot.

Then define a new range (rTmp2) from the end of that spot
to the document's end.
Then search in that range forward for a string
consisting of one or two digits,
whereby it is assumed, that such a string exists.
Set rtmp.end to the end of that spot.

Sub Test8888()
Dim sTmp As String
sTmp = "JAN,FEB,MAR,APR,MAY,JUN,JUL,AUG,SEP,OCT,NOV,DEC"
MsgBox FindFirstOf(sTmp)
End Sub

Function FindFirstOf(sTmp As String) As String
Dim bFound As Boolean
Dim rTmp As Range
Dim rTmp1 As Range
Dim rTmp2 As Range
Dim iLng As Long
Set rTmp = ActiveDocument.Range
Set rTmp1 = Selection.Range
Set rTmp2 = Selection.Range
Dim aM() As String
aM = Split(sTmp, ",")
For iLng = 0 To UBound(aM)
   With rTmp.Find
      .Text = aM(iLng)
      .MatchCase = True
      If .Execute Then
         bFound = True
         rTmp.Collapse Direction:=wdCollapseStart
         rTmp.start = 0
      End If
   End With
Next
If bFound Then
   rTmp.Collapse Direction:=wdCollapseEnd
   rTmp.End = rTmp.start + 3
   rTmp.Select
   rTmp1.start = 0
   rTmp1.End = rTmp.start
   rTmp2.start = rTmp.End
   rTmp2.End = ActiveDocument.Range.End
   With rTmp1.Find
      .Text = "[0-9]{1;2}"
      .MatchWildcards = True
      .Forward = False
      .Execute
      rTmp.start = rTmp1.start
   End With
   With rTmp2.Find
      .Text = "[0-9]{1;2}"
      .MatchWildcards = True
      .Execute
      rTmp.End = rTmp2.End
   End With
End If
FindFirstOf = rTmp.Text
End Function

But it all depends on the consistency
of the text in your document.

In the end, whether a string represents a date
is undecidable programmatically.
Not whether it could(!) represent a date.

--

Gruß

Helmut Weber, MVP WordVBA

Vista Small Business, Office XP

Helmut,

thanks very much. This is very nice.
I changed the code a little bit.

rTmp.End = rTmp.Start + 3 is now:
rTmp.End = rTmp.Start + Len(rTmp.Text)

Now I can also find like AUGUST or SEPTEMBER
Also I changed the following:
With rTmp2.Find
.Text = "[0-9]{1;4}"
.MatchWildcards = True
.Execute
rTmp.End = rTmp2.End
End With

So it will find both 01 AUG 08 and 01 AUG 2008.

Regards
Marco
 
H

Helmut Weber

Hi Marco,

note that
..Text = "[0-9]{1;4}"
is a version for Germany and Scandinavia,

for most other countries it would be
..Text = "[0-9]{1,4}"

also
sTmp = "JAN,FEB,MAR,APR,MAY,JUN,JUL,AUG,SEP,OCT,NOV,DEC,JANUARY,
FEBRUARY..."
would be an option.

Glad I could assist.
--

Greetings from Bavaria, Germany

Helmut Weber, MVP WordVBA

Vista Small Business, Office XP
 
C

Co

Hi Marco,

note that
.Text = "[0-9]{1;4}"
is a version for Germany and Scandinavia,

for most other countries it would be
.Text = "[0-9]{1,4}"
What do you mean here?
Does that have to do with the Word version or the date version?
also
sTmp = "JAN,FEB,MAR,APR,MAY,JUN,JUL,AUG,SEP,OCT,NOV,DEC,JANUARY,
FEBRUARY..."
would be an option.
That's what I put in also.
 
H

Helmut Weber

..Text = "[0-9]{1;4}" ' Germany and Scandinavia,
..Text = "[0-9]{1,4}" ' most other countries

MsgBox Application.International(wdListSeparator)
Does that have to do with the Word version or the date version?
The Word-version.
 
C

Co

.Text = "[0-9]{1;4}"  ' Germany and Scandinavia,
.Text = "[0-9]{1,4}"  ' most other countries

MsgBox Application.International(wdListSeparator)
Does that have to do with the Word version or the date version?

The Word-version.

Is there a problem when I use this code from inside an Access
database?

MArco
 
H

Helmut Weber

Hi Marco,
Is there a problem when I use this code from inside an Access
database?

Shouldn't be a problem,
but I don't know much about access.

--

Greetings from Bavaria, Germany

Helmut Weber, MVP WordVBA

Vista Small Business, Office XP
 
C

Co

Hi Marco,


Shouldn't be a problem,
but I don't know much about access.

--

Greetings from Bavaria, Germany

Helmut Weber, MVP WordVBA

Vista Small Business, Office XP

Helmut,

I have been testing for some days and there is one problem that pops
up.
When I use the .MatchCase = True option it will not find date strings
like:
151300ZOCT08 but when I make the option False then it will sometimes
finds MAR in the word SUMMARY.
Is there a workaround so I can get both date strings?

Regards
Marco
 
C

Co

Helmut,

I have been testing for some days and there is one problem that pops
up.
When I use the .MatchCase = True option it will not find date strings
like:
151300ZOCT08 but when I make the option False then it will sometimes
finds MAR in the word SUMMARY.
Is there a workaround so I can get both date strings?

Regards
Marco

Sorry Helmut,

..MatchCase = True should be .MatchWholeWord = True.

Marco
 
C

Co

Hi Marco,


Shouldn't be a problem,
but I don't know much about access.

--

Greetings from Bavaria, Germany

Helmut Weber, MVP WordVBA

Vista Small Business, Office XP

Helmut,

I also have been trying to combine your code with the Search in all
ranges code you pointed out to me on the net.
It does however not find everything. Am I doing something wrong?
I used a document containing first this string: " 25 JUN 08" and later
on this string "241300ZJUN08" and it found only the last.

Sub FindReplaceAnywhere()

Dim rngStory As Range
Dim pFindTxt As String
Dim lJunk As Long
Dim oShp As Shape
Dim sTmp As String
sTmp = "JAN,FEB,MAR,APR,MAY,JUN,JUL,AUG,SEP,OCT,NOV,DEC"
',JANUARY,FEBRUARY,MARCH,APRIL,JUNE,JULY,AUGUST,SEPTEMBER,OCTOBER,NOVEMBER,DECEMBER
Dim aM() As String
aM = Split(sTmp, ",")
Dim iLng As Long

For iLng = 0 To UBound(aM)
pFindTxt = aM(iLng)

lJunk = ActiveDocument.Sections(1).Headers(1).Range.StoryType
For Each rngStory In ActiveDocument.StoryRanges

Do
SearchAndReplaceInStory rngStory, pFindTxt, pReplaceTxt
On Error Resume Next
Select Case rngStory.StoryType
Case 6, 7, 8, 9, 10, 11
If rngStory.ShapeRange.Count > 0 Then
For Each oShp In rngStory.ShapeRange
If oShp.TextFrame.HasText Then
SearchAndReplaceInStory
oShp.TextFrame.TextRange, _
pFindTxt, pReplaceTxt
End If
Next
End If
Case Else
'do nothing
End Select
On Error GoTo 0
'get next linked story
Set rngStory = rngStory.NextStoryRange
Loop Until rngStory Is Nothing
Next
Next

End Sub

Sub SearchAndReplaceInStory(ByVal rngStory As Word.Range, _
ByVal strSearch As String)

Dim bFound As Boolean
Dim rTmp1 As Range
Dim rTmp2 As Range
Set rTmp1 = Selection.Range
Set rTmp2 = Selection.Range

Selection.HomeKey Unit:=wdStory
With rngStory.Find
.MatchCase = False
.Text = strSearch
If .Execute Then
If rngStory.Information(wdActiveEndPageNumber) = 1 Then
bFound = True
rngStory.Collapse Direction:=wdCollapseStart
rngStory.Start = 0
End If
End If
End With

If bFound Then
rngStory.Collapse Direction:=wdCollapseEnd
rngStory.End = rngStory.Start + 3
rngStory.Select
rTmp1.Start = 0
rTmp1.End = rngStory.Start
rTmp2.Start = rngStory.End
rTmp2.End = ActiveDocument.Range.End
With rTmp1.Find
.Text = "[0-9]{1,7}"
.MatchWildcards = True
.Forward = False
.Execute
rngStory.Start = rTmp1.Start
End With
With rTmp2.Find
.Text = "[0-9]{1,4}"
.MatchWildcards = True
.Execute
rngStory.End = rTmp2.End
End With
Debug.Print rngStory.Text
End If


End Sub

regards
Marco
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top