finding duplicates and deleting based on another column

tpeter

I have a spreadsheet of data that I have compiled from 6 different workbooks.
I have used a TRUE/FALSE formula to identify the duplicates and now I need
to delete them based on which spreadsheet they came from. The consolidated
spreadsheet currently has 28,000 records. I am deleting them manually, but
this will take me until I am 100 to get through. Any help on
this would be great.

In column A I have numbers:
56088769
57499354
60175071
60175071
60175071
5608437X
5608437X
5608437X

As you can see, there could be 2 to 6 duplicate numbers. I need to find the
duplicates in column A, then evaluate column J to see which source
the data came from. The choices are:


Raw 02-06
PG &E Composite Data
PG&E Data 08

This is also the order of preference: if there are 4 duplicates and Raw 02-06 is
an option, then delete the rest of the duplicates, leaving only that one. If
there is a duplicate and Raw 02-06 isn't available, then pick option 2, and so on.

Thank you for your help, it is greatly appreciated.

Tim Peter
 
p45cal

If this is a one-off exercise, then because the data source names don't
sort naturally into your order of preference, I would do a find and replace
3 times on column J to put a numeral in front, to get:
1Raw 02-06
2PG &E Composite Data
3PG&E Data 08
(you'll reverse that later)
then sort your consolidated sheet, primarily on column A but
secondarily on column J, both ascending. Now for each block of duplicates, the
1Raw 02-06 one(s) should be at the top.
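
The prefix-and-sort step described above could also be scripted. The following is only a minimal sketch, not part of the original reply: it assumes the data is on the active sheet, that row 1 is a header row, and that the data occupies columns A:K. The macro name PrefixAndSort is made up for illustration.

```vba
Option Explicit

' Sketch only. Assumptions (not from the thread): data is on the active
' sheet, row 1 holds headers, and the data spans columns A:K.
Sub PrefixAndSort()
    Dim lastRow As Long
    Dim data As Range

    ' Last used row, judged by column A
    lastRow = Cells(Rows.Count, "A").End(xlUp).Row
    Set data = Range("A1:K" & lastRow)

    ' Put a rank digit in front of each source name in column J.
    ' LookAt:=xlWhole replaces only cells whose entire content matches.
    With Range("J2:J" & lastRow)
        .Replace What:="Raw 02-06", Replacement:="1Raw 02-06", LookAt:=xlWhole
        .Replace What:="PG &E Composite Data", Replacement:="2PG &E Composite Data", LookAt:=xlWhole
        .Replace What:="PG&E Data 08", Replacement:="3PG&E Data 08", LookAt:=xlWhole
    End With

    ' Sort on column A first, then on column J within each value of A
    data.Sort Key1:=Range("A1"), Order1:=xlAscending, _
              Key2:=Range("J1"), Order2:=xlAscending, _
              Header:=xlYes
End Sub
```

If the sheet has no header row, Header:=xlNo and a range starting at J1 would be the corresponding change.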

Now it's just a case of running this macro, which deletes all the lower
duplicates, after selecting the entire block row-wise, but only column
A:

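Sub blah()
    ' Select the block of data in column A only, then run.
    toprow = Selection.Row
    bottomrow = Selection.Rows.Count + toprow - 1
    ' Work from the bottom up so a deleted row does not shift rows still to be checked.
    For i = bottomrow To toprow + 1 Step -1
        ' Delete a row when its column A value matches the row above it.
        If Cells(i, "A").Value = Cells(i - 1, "A").Value Then Rows(i).Delete
    Next i
End Sub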

Now do a find and replace (well, 3 actually) on column J to restore the
original data source names.
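
Restoring the names can be scripted the same way. This is a hedged sketch under the same assumptions as before (active sheet, source names in column J); the macro name RestoreSourceNames is hypothetical.

```vba
Option Explicit

' Sketch only: strips the rank digit that was added in front of the
' three source names in column J. Assumes the data is on the active sheet.
Sub RestoreSourceNames()
    Dim lastRow As Long

    ' Last used row, judged by column J
    lastRow = Cells(Rows.Count, "J").End(xlUp).Row

    With Range("J1:J" & lastRow)
        .Replace What:="1Raw 02-06", Replacement:="Raw 02-06", LookAt:=xlWhole
        .Replace What:="2PG &E Composite Data", Replacement:="PG &E Composite Data", LookAt:=xlWhole
        .Replace What:="3PG&E Data 08", Replacement:="PG&E Data 08", LookAt:=xlWhole
    End With
End Sub
```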
 
tpeter

I have renamed the 3 criteria and sorted in ascending order by column A and
secondarily ascending by column B. When I run the macro it says variable not
defined.
a b c d e f
25123670 AC250 LIVERMORE 0.85 4/27/2004
25151368 AC250 BENICIA 0.62 6/7/2004
25168891 AC250 SAN FRANCISCO 0.05 12/23/2002
g h i j k
8/17/2004 American Test 5-20-07 FALSE
3/29/2007 9/29/2004 American Test 5-20-07 FALSE
8/6/2003 American Test 5-20-07 FALSE
 
tpeter

I have added numbers to the other 3 options and sorted by columns A and J,
but I still get "variable not defined"; the macro stops as soon as it
starts.
 
p45cal

EITHER remove
Option Explicit
from the top of the module where you have the macro,
OR add the following line at the top of the macro:

Dim toprow, bottomrow, i
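
For reference, here is how the macro might look with the declarations in place so that it compiles with Option Explicit left on. This is a sketch rather than the original poster's exact code: the name DeleteLowerDuplicates, the Long types, and the ScreenUpdating toggle are additions made for illustration.

```vba
Option Explicit

' Select the block of data in column A only, then run.
' The sheet must already be sorted so that the row to keep is the
' first row of each run of duplicates.
Sub DeleteLowerDuplicates()
    Dim topRow As Long, bottomRow As Long, i As Long

    topRow = Selection.Row
    bottomRow = Selection.Rows.Count + topRow - 1

    ' Optional: suppress redrawing while many rows are deleted
    Application.ScreenUpdating = False

    ' Loop from the bottom up so deletions do not shift unchecked rows
    For i = bottomRow To topRow + 1 Step -1
        If Cells(i, "A").Value = Cells(i - 1, "A").Value Then Rows(i).Delete
    Next i

    Application.ScreenUpdating = True
End Sub
```

Declaring the variables without a type, as in the one-line Dim above, makes them Variants, which also works; Long is used here only because they hold row numbers.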
 
tpeter

Thanks for all your help. I sorted by columns A and J. I was not selecting
everything I wanted sorted, so it was breaking. Went from 23,650 records to
12,303. When I ran my check for duplicates, =COUNTIF(A:A,A1)>1, everything came
back FALSE. Thanks again for all of your help; it saved me a lot of sanity.
Another interesting note: I brought this file home to work on, where I have
Excel 2007, and there is a Remove Duplicates button that gave the exact same results.
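
The Remove Duplicates button in Excel 2007 and later has a VBA counterpart, Range.RemoveDuplicates. A minimal sketch follows, assuming the data is on the active sheet in columns A:K with a header row and has already been sorted so that the preferred source is first within each run of duplicates. The macro name is hypothetical.

```vba
Option Explicit

' Sketch only. Assumes Excel 2007 or later, data in A:K on the active
' sheet, and a header in row 1. Remove Duplicates keeps the first row
' of each set of duplicates, so the sort order decides which one survives.
Sub RemoveDuplicatesOnColumnA()
    Dim lastRow As Long

    lastRow = Cells(Rows.Count, "A").End(xlUp).Row

    ' Columns:=1 means compare on the first column of the range (column A)
    Range("A1:K" & lastRow).RemoveDuplicates Columns:=1, Header:=xlYes
End Sub
```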

Tim Peter
 
