A
AlanR
Dear all
I have been tasked with a very manual process comparing thousands of
rows of data from 2 data dumps. I have to locate a 'best match'
Product Code from one data dump and copy its value to another data
dump. One data dump has the following columns: PO No, PO line No,
Product Code, Value. The second data dump has only PO No and Product
Code. The idea is on the second data dump to look up the relevant PO
number and most similar Product code and return the value from the
first data dump. The trouble is the product codes in both lists are
slightly different. Sometimes the beginning of the code is different,
sometimes the middle and sometimes the end! However, the characters in
both types of product codes are around 90-95% similar and easily
spottable by manual comparison. To automate this, I have tried a
number of methods so far to do a closest match, for example a vlookup
using TRUE, and I have also tried the 'FuzzyMatch' user-defined-
function that I saw on Mr Excel. Unfortunately neither is providing me
with a workable solution. Therefore I thought I'd ask the experts to
see if what I want is possible via a VBA macro. If so, I would be very
grateful of your help as doing this exercise manually is very
disheartening! Thank you, AlanR.
Example of the first data dump:
PO No Line No Product Code Value
45001 1 AB-ZZZ-HHH-45T 100
45001 2 TY-55555-99-ZA 110
45001 3 CCC-MODEL-XX-YYY 120
45001 4 YYYYY_35-KLMN 130
45002 1 TY-55555-99-ZA 140
45002 2 CCC-MODEL-XX-YYY 150
45002 3 AB-ZZZ-HHH-45T 160
45003 1 YYYYY_35-KLMN 170
45003 2 AB-ZZZ-HHH-45T 180
45004 1 CCC-MODEL-XX-YYY 190
45004 2 KIT-MODEL678 200
45004 3 HYT-JJJ-TOP10 210
45004 4 AB-ZZZ-HHH-45T 220
45004 5 GTO-GTOP-25L 230
45004 6 YYYYY_35-KLMN 240
45004 7 TY-55555-99-ZA 250
Example of the second data dump (including the 4 columns I would like
to auto-populate using VBA or formulae)
PO No. Product Code Closest Match Line No. Closest
Match Product Code Similarity% Value
45001 CCC-MODEL-XX-YYY-35
45001 45-AB-ZZZ-HHH-45T
45001 YYYYY_35-MODEL-KLMN
45001 TY-55555-99C-ZA
45002 CCC-MODEL-XX-YYY-35
45002 45-AB-ZZZ-HHH-45T
45002 TY-55555-99C-ZA
45003 45-AB-ZZZ-HHH-45T
45003 YYYYY_35-MODEL-KLMN
45004 KIT-MODEL678_A
45004 CCC-MODEL-XX-YYY-35
45004 45-AB-ZZZ-HHH-45T
45004 HYT-JJJ-TOP10
45004 YYYYY_35-MODEL-KLMN
45004 GTO-GTOP-25L
45004 TY-55555-99C-ZA
I have been tasked with a very manual process comparing thousands of
rows of data from 2 data dumps. I have to locate a 'best match'
Product Code from one data dump and copy its value to another data
dump. One data dump has the following columns: PO No, PO line No,
Product Code, Value. The second data dump has only PO No and Product
Code. The idea is on the second data dump to look up the relevant PO
number and most similar Product code and return the value from the
first data dump. The trouble is the product codes in both lists are
slightly different. Sometimes the beginning of the code is different,
sometimes the middle and sometimes the end! However, the characters in
both types of product codes are around 90-95% similar and easily
spottable by manual comparison. To automate this, I have tried a
number of methods so far to do a closest match, for example a vlookup
using TRUE, and I have also tried the 'FuzzyMatch' user-defined-
function that I saw on Mr Excel. Unfortunately neither is providing me
with a workable solution. Therefore I thought I'd ask the experts to
see if what I want is possible via a VBA macro. If so, I would be very
grateful of your help as doing this exercise manually is very
disheartening! Thank you, AlanR.
Example of the first data dump:
PO No Line No Product Code Value
45001 1 AB-ZZZ-HHH-45T 100
45001 2 TY-55555-99-ZA 110
45001 3 CCC-MODEL-XX-YYY 120
45001 4 YYYYY_35-KLMN 130
45002 1 TY-55555-99-ZA 140
45002 2 CCC-MODEL-XX-YYY 150
45002 3 AB-ZZZ-HHH-45T 160
45003 1 YYYYY_35-KLMN 170
45003 2 AB-ZZZ-HHH-45T 180
45004 1 CCC-MODEL-XX-YYY 190
45004 2 KIT-MODEL678 200
45004 3 HYT-JJJ-TOP10 210
45004 4 AB-ZZZ-HHH-45T 220
45004 5 GTO-GTOP-25L 230
45004 6 YYYYY_35-KLMN 240
45004 7 TY-55555-99-ZA 250
Example of the second data dump (including the 4 columns I would like
to auto-populate using VBA or formulae)
PO No. Product Code Closest Match Line No. Closest
Match Product Code Similarity% Value
45001 CCC-MODEL-XX-YYY-35
45001 45-AB-ZZZ-HHH-45T
45001 YYYYY_35-MODEL-KLMN
45001 TY-55555-99C-ZA
45002 CCC-MODEL-XX-YYY-35
45002 45-AB-ZZZ-HHH-45T
45002 TY-55555-99C-ZA
45003 45-AB-ZZZ-HHH-45T
45003 YYYYY_35-MODEL-KLMN
45004 KIT-MODEL678_A
45004 CCC-MODEL-XX-YYY-35
45004 45-AB-ZZZ-HHH-45T
45004 HYT-JJJ-TOP10
45004 YYYYY_35-MODEL-KLMN
45004 GTO-GTOP-25L
45004 TY-55555-99C-ZA