B
bxc2739
I have two HUGE customer lists. The first one (DATA Sheet) is about 6000
rows
And contains varies information (zip, address, name, etc) the second
sheet (WORK)
Is about 35000+ long and is similar to DATA sheet. I am to compare EACH
of the 35000+ customer of sheet WORK individually to see if there is a
match with ANY customer(s) in sheet DATA. Sheet DATA is simply a
comparison sheet and WORK is the only that I highlight (the row of the
respective customer) if I get a match.
I could do this manually but it would take a very long time. The good
news is that I know (empirically) that over 95% of all the customers in
WORK will NOT match to ANY of the customers in DATA. So if I can filter
out the junk, then I can quickly narrow the number of matches I need to
do by hand and thus save a lot of time.
My first attempt was to do a search/filter by zip code. I have a macro
that goes to each row in WORK and searches for any matching zips in
DATA. It makes sense that if a certain row/customer in WORK does not
match with ANY of the customers in DATA then there is no match (and
thus no need to check name, address, other fields etc)
A macro basically eliminated all the customers that do not match by
zip.
Those that do match by zip in sheet WORK have entire row highlighted a
certain color.
I successfully finished this first step of filtering/narrowing the work
to be done by hand searching …(see below for macro code)…
However I am still left with PLENTY of customers! And of these I also
know (empirically) that MOST (90%) of the matched by zip will NOT match
by name.
So if I can get a macro that can search each highlighted row (already
matched by zip)
And within this set see which ones match by name (1st char of name, see
below..), then I can further narrow down my search significantly to the
point where I can search the rest by hand very quickly.
For example, John Doe in 76016 is NOT Brian Dawson in 76016. So even
though they match by zip there do not match by name and are not the
same customer. So a further filter (if someone can help me with writing
a search by name macro) will eliminate this because it will see that
John and Brian are not the same name.
However many customers names are not consistent. Sometimes they are in
Acronyms in one sheet and fully spelled out in another sheet. So I
would actually want a search by first character of the customer name
field. To do this I would have to access the substring manipulation of
Excel Macro. Again I am not familiar with this and a quick search of
the board has not lead to any case similar to mine. I want a
search/match by first character of the customer names field in order to
be on the safe side and avoid false negatives.
For example, James Doe in 75063 is NOT John Doe in 75063, but this
second search by name (1st char of name) will not pick up on that, BUT
that’s OK, because even a filter out by A-Z beginning character will
have reduced the number to search by hand by a factor of 26.
There are several ways to accomplish this. One is to search within the
set of customers already matched by zip code. Another is to rewrite the
zip code match macro (see below..) and alter it so that when it goes
around to match each individual row/customer in WORK to customers in
DATA by zip, it ALSO at the same time simultaneously matches them by
first characters of the customer names field.
Ex: a pseudo code would be …. If(currentWorkRow.zip = current
DataRow.zip && currentWorkRow.Name(substring(firstchar)) =
currentDataRow.Name(substring(firstchar)) THEN
MATCH
ELSE
NO MATCH AND GO TO NEXT ROW IN WORK UNTIL REACH END
I’m not sure how to implement the substring char in Excel Macro. I
asked around and got some hints but it doesn’t really help me because
I’m not fluent in writing macros and its above my head (the hints are
listed below)
I think in pseudo code it would work as something like this:
for each row in sheet2
{
if currentrow is white/nonhighlighted, leave alone and go to next row
if currentrow is highlighted
{
for each row in sheet1
{
if (sheet1.currentrow.zipcode = sheet2.currentrow.zipcode &&
sheet1.currentrow.firstcharname = sheet2.currentrow.firstcharname )
then
currentrow is highlighted light yellow
}
}
}
Not sure if the logical is intact...
I realize this is a lot, but I don’t even know where to start. If
someone could let me in the right direction (ie give links, threads,
websites that I could read up on and learn how to implement this search
that I want to code) that would be Great!
Thanks,
Bo
rows
And contains varies information (zip, address, name, etc) the second
sheet (WORK)
Is about 35000+ long and is similar to DATA sheet. I am to compare EACH
of the 35000+ customer of sheet WORK individually to see if there is a
match with ANY customer(s) in sheet DATA. Sheet DATA is simply a
comparison sheet and WORK is the only that I highlight (the row of the
respective customer) if I get a match.
I could do this manually but it would take a very long time. The good
news is that I know (empirically) that over 95% of all the customers in
WORK will NOT match to ANY of the customers in DATA. So if I can filter
out the junk, then I can quickly narrow the number of matches I need to
do by hand and thus save a lot of time.
My first attempt was to do a search/filter by zip code. I have a macro
that goes to each row in WORK and searches for any matching zips in
DATA. It makes sense that if a certain row/customer in WORK does not
match with ANY of the customers in DATA then there is no match (and
thus no need to check name, address, other fields etc)
A macro basically eliminated all the customers that do not match by
zip.
Those that do match by zip in sheet WORK have entire row highlighted a
certain color.
I successfully finished this first step of filtering/narrowing the work
to be done by hand searching …(see below for macro code)…
However I am still left with PLENTY of customers! And of these I also
know (empirically) that MOST (90%) of the matched by zip will NOT match
by name.
So if I can get a macro that can search each highlighted row (already
matched by zip)
And within this set see which ones match by name (1st char of name, see
below..), then I can further narrow down my search significantly to the
point where I can search the rest by hand very quickly.
For example, John Doe in 76016 is NOT Brian Dawson in 76016. So even
though they match by zip there do not match by name and are not the
same customer. So a further filter (if someone can help me with writing
a search by name macro) will eliminate this because it will see that
John and Brian are not the same name.
However many customers names are not consistent. Sometimes they are in
Acronyms in one sheet and fully spelled out in another sheet. So I
would actually want a search by first character of the customer name
field. To do this I would have to access the substring manipulation of
Excel Macro. Again I am not familiar with this and a quick search of
the board has not lead to any case similar to mine. I want a
search/match by first character of the customer names field in order to
be on the safe side and avoid false negatives.
For example, James Doe in 75063 is NOT John Doe in 75063, but this
second search by name (1st char of name) will not pick up on that, BUT
that’s OK, because even a filter out by A-Z beginning character will
have reduced the number to search by hand by a factor of 26.
There are several ways to accomplish this. One is to search within the
set of customers already matched by zip code. Another is to rewrite the
zip code match macro (see below..) and alter it so that when it goes
around to match each individual row/customer in WORK to customers in
DATA by zip, it ALSO at the same time simultaneously matches them by
first characters of the customer names field.
Ex: a pseudo code would be …. If(currentWorkRow.zip = current
DataRow.zip && currentWorkRow.Name(substring(firstchar)) =
currentDataRow.Name(substring(firstchar)) THEN
MATCH
ELSE
NO MATCH AND GO TO NEXT ROW IN WORK UNTIL REACH END
I’m not sure how to implement the substring char in Excel Macro. I
asked around and got some hints but it doesn’t really help me because
I’m not fluent in writing macros and its above my head (the hints are
listed below)
I think in pseudo code it would work as something like this:
for each row in sheet2
{
if currentrow is white/nonhighlighted, leave alone and go to next row
if currentrow is highlighted
{
for each row in sheet1
{
if (sheet1.currentrow.zipcode = sheet2.currentrow.zipcode &&
sheet1.currentrow.firstcharname = sheet2.currentrow.firstcharname )
then
currentrow is highlighted light yellow
}
}
}
Not sure if the logical is intact...
I realize this is a lot, but I don’t even know where to start. If
someone could let me in the right direction (ie give links, threads,
websites that I could read up on and learn how to implement this search
that I want to code) that would be Great!
Thanks,
Bo