need to extract email addresses from excel files

D

dunndealpr

Hey there. I have many Excel files which each have four columns of dat
(A, B, C, and D) and tens of thousands of rows. I need any emai
addresses present in column D extracted to column E. What's the bes
solution here? The text surrounding the emails is always different (i
the emails are not surrounded by <> or any other kind of constan
characters that can simplify this).

I've tried DigDB but it seems to have a run-time error 6 and/or 100
issue no matter what I do. DigDB also misses more than a few emails tha
I can see in plan view. I've tried a couple other programs but they sav
the emails in a separate document, and I need the emails saved in th
original file alongside the data it was extracted from, so that's n
help to me.

Any ideas? Thanks in advance
 
C

Claus Busch

Hi,

Am Thu, 6 Jun 2013 21:33:31 +0100 schrieb dunndealpr:
Hey there. I have many Excel files which each have four columns of data
(A, B, C, and D) and tens of thousands of rows. I need any email
addresses present in column D extracted to column E. What's the best
solution here? The text surrounding the emails is always different (ie
the emails are not surrounded by <> or any other kind of constant
characters that can simplify this).

please post some example of your strings


Regards
Claus Busch
 
C

Claus Busch

Hi,

Am Fri, 7 Jun 2013 09:09:44 +0200 schrieb Claus Busch:
please post some examples of your strings

if there is a space in front and behind the email-address, try:

Sub EMail()
Dim LRow As Long
Dim intAt As Integer
Dim intStart As Integer
Dim intEnd As Integer
Dim rngC As Range
Dim mystr As String

Application.ScreenUpdating = False
With ActiveSheet
LRow = .Cells(.Rows.Count, 4).End(xlUp).Row
For Each rngC In .Range("D1:D" & LRow)
intAt = InStr(rngC, "@")
If intAt > 0 Then
intStart = InStrRev(rngC, " ", intAt)
intEnd = InStr(intAt, rngC, " ")
mystr = Mid(rngC, intStart + 1, intEnd - intStart)
.Hyperlinks.Add _
anchor:=rngC.Offset(0, 1), _
Address:="mailto:" & mystr, _
TextToDisplay:=mystr
End If
Next
End With


Regards
Claus Busch
 
R

Ron Rosenfeld

Hey there. I have many Excel files which each have four columns of data
(A, B, C, and D) and tens of thousands of rows. I need any email
addresses present in column D extracted to column E. What's the best
solution here? The text surrounding the emails is always different (ie
the emails are not surrounded by <> or any other kind of constant
characters that can simplify this).

I've tried DigDB but it seems to have a run-time error 6 and/or 1004
issue no matter what I do. DigDB also misses more than a few emails that
I can see in plan view. I've tried a couple other programs but they save
the emails in a separate document, and I need the emails saved in the
original file alongside the data it was extracted from, so that's no
help to me.

Any ideas? Thanks in advance.

I would use a macro which would examine Column D; extract the email address from the contents, and write the result into Column E.

Depending on how many files you have, you may also want to have a macro that will open each file individually and save the results back, but without more information, I cannot help you much with that portion.

For the macro, the following will match legitimate email addresses including all country code top level domains, as well as specific common TLD's (e.g. com, net, org, gove, museum, biz, info, etc). You can add other TLD's to the pipe-separated list you will see in the macro. (You could install the macro in each of the workbooks with the lists to be extracted, but that seems too tedious).

I would install this macro either as an add-in, or install it into personal.xlsb (or personal.xlsm or personal.xls depending on how your version of Excel is set up) and then run it with the active worksheet being the worksheet you wish to process.

To enter this Macro (Sub), <alt-F11> opens the Visual Basic Editor.
Ensure your project for installation (e.g: personal.xlsb) is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.

To use this Macro (Sub), <alt-F8> opens the macro dialog box. Select the macro by name, and <RUN>.

=================================
Option Explicit
Sub ExtrEmailAddr()
Dim vSrc As Variant
Dim s As String
Dim i As Long
Dim rDest As Range
Dim re As Object, mc As Object

vSrc = Range("D1", Cells(Rows.Count, "D").End(xlUp))
Set rDest = Range("E1")
Set re = CreateObject("vbscript.regexp")
With re
.Global = False
.ignorecase = True
.Pattern = "[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*" _
& "@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+(?:[A-Z]{2}|" _
& "asia|com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum|travel)\b"
End With

rDest.EntireColumn.ClearContents

If VarType(vSrc) >= 8192 Then
For i = LBound(vSrc) To UBound(vSrc)
s = vSrc(i, 1)
If re.test(s) Then
Set mc = re.Execute(s)
vSrc(i, 1) = mc(0)
End If
Next i
rDest.Resize(rowsize:=UBound(vSrc)) = vSrc

Else
s = vSrc
If re.test(s) Then
Set mc = re.Execute(s)
rDest = mc(0)
End If
End If

End Sub
===================================
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top