retrieving email address surrounded by tabs, carriage returns, etc

R

Ray C

I'm using Word VBA to retrieve email addresses inside a Word document. The
problem I encounter is that sometimes there are tabs, carriage returns, ie
special characters before and after the email address that also get pulled
and appear as small squares in the text that I retrieve. My logic is like
this:

1) Find occurence of @ sign.
2) Pull the sentence that contains the @ sign and use the Split function to
create an array of words in the sentence (one word will eventually contain
the full email address).

Problem: When I look at the array item that contains the email address,
there are also tabs, carriage returns, etc that get interpreted as small
squares in my output.

Here is my code:

For Each rngStory In objDocument.StoryRanges
With rngStory.Find
.ClearFormatting
.Text = "@"
.Wrap = wdFindStop
.Forward = True
End With
Do Until rngStory.Find.Execute = False
With rngStory.Duplicate
.Expand Unit:=wdSentence
myArray = Split(.Text, " ", -1, vbTextCompare)
For i = 0 To UBound(myArray)
If InStr(1, myArray(i), "@", vbTextCompare) <> 0 Then
If numEmailsFound < 3 Then
'/// Copy email address to excel.
End If
End If
Next i
End With
Loop
Next rngStory
 
G

Graham Mayor

Try using Word to determine what is an e-mail address by using autoformat eg

Sub CopyEMailAddressesToOtherDoc()
Dim Source As Document
Dim Target As Document
Dim myRange As Range
Dim sView As String

Set Source = ActiveDocument
sView = ActiveWindow.View.ShowFieldCodes
Set Target = Documents.Add
Application.ScreenUpdating = False
Source.Activate
Options.AutoFormatReplaceHyperlinks = True
With Selection
.Range.AutoFormat
ActiveWindow.View.ShowFieldCodes = True
.HomeKey unit:=wdStory
With .Find
.ClearFormatting
.Replacement.ClearFormatting
Do While .Execute(findText:="^d HYPERLINK ""Mailto", _
MatchWildcards:=False, Wrap:=wdFindStop, _
Forward:=True) = True
Set myRange = Selection.Range
Target.Range.InsertAfter myRange & vbCr
Loop
End With
.HomeKey unit:=wdStory
End With
ActiveWindow.View.ShowFieldCodes = sView
Source.Close Savechanges:=wdDoNotSaveChanges
Target.Activate
With Selection.Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "^019 HYPERLINK ""mailto:(*)"" ^21"
.Replacement.Text = "\1"
.MatchWildcards = True
.Execute Replace:=wdReplaceAll
End With
Selection.Sort FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric, _
SortOrder:=wdSortOrderAscending
End Sub

--
<>>< ><<> ><<> <>>< ><<> <>>< <>><<>
Graham Mayor - Word MVP


<>>< ><<> ><<> <>>< ><<> <>>< <>><<>
 
T

Tony Jollans

Much easier just to walk the hyperlinks ...

Dim HLink As Hyperlink
Dim Source As Document
Dim Target As Document
Dim E_Mail() As String

Set Source = ActiveDocument
Set Target = Documents.Add

For Each HLink In Source.Hyperlinks
E_Mail = Split(HLink.Address, ":", 2)
If E_Mail(0) = "mailto" Then
Target.Content.InsertAfter E_Mail(1)
Target.Range.InsertParagraphAfter
End If
Next

You can tidy it up to get exactly what you want
 
R

Ray C

Tony, unfortunately your code doesn't work for all my documents because Word
does not recognize all of the email addresses as hyperlinks. I need to go
into each document, put my cursor at the end of the email address and press
Enter. At that point Word changes it into a hyperlink (blue color).

But I have over 5000 documents to process so your code won't work. I need a
more solid approach, i.e. search for @ sign and work from there.

Thanks
 
R

Ray C

Hi Tony,

I got it to work...
I added two lines from Graham's reply to the beginning of your code and it
worked.

objWord.Options.AutoFormatReplaceHyperlinks = True
objDocument.Range.AutoFormat

Thanks!
 
T

Tony Jollans

Well done! Making all the addresses hyperlinks first does help in finding
them within the hyperlinks collection!
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top