Hey there. I have many Excel files which each have four columns of data
(A, B, C, and D) and tens of thousands of rows. I need any email
addresses present in column D extracted to column E. What's the best
solution here? The text surrounding the emails is always different (ie
the emails are not surrounded by <> or any other kind of constant
characters that can simplify this).
I've tried DigDB but it seems to have a run-time error 6 and/or 1004
issue no matter what I do. DigDB also misses more than a few emails that
I can see in plan view. I've tried a couple other programs but they save
the emails in a separate document, and I need the emails saved in the
original file alongside the data it was extracted from, so that's no
help to me.
Any ideas? Thanks in advance.
I would use a macro which would examine Column D; extract the email address from the contents, and write the result into Column E.
Depending on how many files you have, you may also want to have a macro that will open each file individually and save the results back, but without more information, I cannot help you much with that portion.
For the macro, the following will match legitimate email addresses including all country code top level domains, as well as specific common TLD's (e.g. com, net, org, gove, museum, biz, info, etc). You can add other TLD's to the pipe-separated list you will see in the macro. (You could install the macro in each of the workbooks with the lists to be extracted, but that seems too tedious).
I would install this macro either as an add-in, or install it into personal.xlsb (or personal.xlsm or personal.xls depending on how your version of Excel is set up) and then run it with the active worksheet being the worksheet you wish to process.
To enter this Macro (Sub), <alt-F11> opens the Visual Basic Editor.
Ensure your project for installation (e.g: personal.xlsb) is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.
To use this Macro (Sub), <alt-F8> opens the macro dialog box. Select the macro by name, and <RUN>.
=================================
Option Explicit
Sub ExtrEmailAddr()
Dim vSrc As Variant
Dim s As String
Dim i As Long
Dim rDest As Range
Dim re As Object, mc As Object
vSrc = Range("D1", Cells(Rows.Count, "D").End(xlUp))
Set rDest = Range("E1")
Set re = CreateObject("vbscript.regexp")
With re
.Global = False
.ignorecase = True
.Pattern = "[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*" _
& "@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+(?:[A-Z]{2}|" _
& "asia|com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum|travel)\b"
End With
rDest.EntireColumn.ClearContents
If VarType(vSrc) >= 8192 Then
For i = LBound(vSrc) To UBound(vSrc)
s = vSrc(i, 1)
If re.test(s) Then
Set mc = re.Execute(s)
vSrc(i, 1) = mc(0)
End If
Next i
rDest.Resize(rowsize:=UBound(vSrc)) = vSrc
Else
s = vSrc
If re.test(s) Then
Set mc = re.Execute(s)
rDest = mc(0)
End If
End If
End Sub
===================================