Delete rows with duplicate values

S

Smohrman

Hi Team!

I have a spreadsheet with three colums of data. The first column contains
records which have occasional phone number duplication- see blelow:

(555) 000-0000 DataA1 ValueA1
(555) 000-0000 DataA2 ValueA2
(555) 555-9770 DataA3 ValueA3
(555) 555-4464 DataA4 ValueA4
(555) 555-4464 DataA5 ValueA5
(555) 555-4720 DataA6 ValueA6
(555) 555-8823 DataA7 ValueA7
(555) 555-3834 DataA8 ValueA8
(555) 555-4125 DataA9 ValueA9

What I need to do is (somehwhat) automate the process of filtering or
deleting out all rows which have duplicate data in the first column, but not
second or third columns. I'm sure it's been done...I tried the Excel
out-of-the-box help suggestions and I've had no real luck. Any ideas?
 
M

Max

One play, using non-array formulas ..

Assume source data in sheet: X, cols A to C, from row1 down
The key col is col A

In another sheet: Y (say),

Put in A1:
=IF(ISERROR(SMALL($D:$D,ROW(A1))),"",INDEX(X!A:A,MATCH(SMALL($D:$D,ROW(A1)),$D:$D,0)))
Copy A1 to C1

Put in D1: =IF(X!A1="","",IF(COUNTIF(X!$A$1:A1,X!A1)>1,"",ROW()))

Select A1:D1, fill down to say D50 ?
to cover the max expected extent of data in X

Cols A to C in Y will auto-return only the unique* lines from X,
all lines neatly bunched at the top
*unique items in the key col A in X

Note: Refresh the data in X by clearing it with the Delete key
(do not delete the cols), then paste/paste special the new data

For the posted sample data, the results we'd get would be:

(555) 000-0000 DataA1 ValueA1
(555) 555-9770 DataA3 ValueA3
(555) 555-4464 DataA4 ValueA4
(555) 555-4720 DataA6 ValueA6
(555) 555-8823 DataA7 ValueA7
(555) 555-3834 DataA8 ValueA8
(555) 555-4125 DataA9 ValueA9
 
S

Smohrman

Max,

I tried to follow but am still experiencing problems.

I am assuming that you want me to add another sheet to the work book:
"(In another sheet: Y (say)"

I added a sheet to the workbook, then put in A1 the formula you gave:


=IF(ISERROR(SMALL($D:$D,ROW(A1))),"",INDEX(X!A:A,MATCH(SMALL($D:$D,ROW(A1)),$D:$D,0)))

Immediately I get a file browse dialog box that opens up titled "Update
Values: X"

I get the same thing with each step ("Copy A1 to C1") etc. I must be
missing some part of your instructions. I don't see how the formula in the
new sheet in A1 would be tied to the data in another sheet. Can you clarify?
 
V

vezerid

Smohrman,

in the formula you are using replace the X with the sheet name you are
reading from. If the sheet name contains spaces, enclose it in single
quotes. Otherwise you can include them but they are not necessary.

=IF(ISERROR(SMALL($D:$D,ROW(A1))),"",INDEX('My
Data'!A:A,MATCH(SMALL($D:$D,ROW(A1)),$D:$D,0)))

Does this help?

Kostis Vezerides
 
M

Max

Kostis: Thanks for the help !

Smohrman:
As explained in my 1st response / by Kostis,
the suggested set-up presumes your source data is in a sheet: X

Anyway, here's a working sample construct
to illustrate the implementation:
http://cjoint.com/?fceQb3ocN1
Extracting Unique Lines based on key col A.xls
 
M

macropod

Hi Smohrman,

Here's a macro solution:

Sub DeleteDuplicateRows()
Dim lLastRow As Long
Dim lLastCol As Long
Dim I As Long
Dim J As Long
Dim K As Long
lLastRow = ActiveSheet.UsedRange.Rows.Count - 1
lLastCol = ActiveSheet.UsedRange.Columns.Count - 1
For I = 0 To lLastRow - 1
For J = lLastRow To I + 1 Step -1
For K = 0 To lLastCol
If ActiveSheet.Range("A1").Offset(I, K).Value <>
ActiveSheet.Range("A1").Offset(J, K).Value Then
Exit For
End If
Next K
If K > lLastCol Then
ActiveSheet.Range("A1").Offset(J, 0).EntireRow.Delete
End If
Next J
Next I
End Sub

Cheers
 
S

Smohrman

Thank you Max! I had trouble with the syntax at first, but simply renamed my
sheets to X and Y and copied the formula you had in A1 through D1 in the
online example, copied it down and it worked.

I then had to autofilter col D for non-blanks, copy and paste the results to
another sheet to see the actual number of non-duplicated entries from column
A. Is that step necessary or am I missing something?

Thanks- I'm light-years closer than I was...
 
S

Smohrman

Thanks macropod...man I wish I knew how to use that macro!

I've tried and tried but haven't had any success implementing macros. No
one seems to have the time to explain it fully.

Nor (I should say) have I found the time to dedicate myself to learning how
to use macros...I know, I know- I'm going to have to do it someday...after I
write the novel inside me, take that trip to Hawaii, etc.

If you're not screaming busy and you could educate me with a series of clear
steps on how to use the macro you've generated I'd certainly be willing to
try again!

Thanks,
Smohrman
 
M

macropod

Hi Smohrman,

For the most part, using the macro is as simple as:
.. opening up you Excel workbook at the worksheet you want to work on
.. pressing Alt-F11 to access the vba editor
.. clicking Insert|Module
.. copying & pasting the code I gave you into that module
.. either:
. pressing F5 to run the macro
. pressing Alt-F11 again to return to the worksheet and:
. pressing Alt-F8 to open the macro listing dialogue box
. selecting the macro and pressing 'Run'

Learning how to code them is somewhat more involved. {:-o}

Cheers
 
S

Smohrman

Hi Macropod,

When running the macro, i get an error:

"Compile Error: Syntax Error"

The error line highlighted in the VB editor is:

"If ActiveSheet.Range("A1").Offset(I, K).Value <>"

Is there anything I should customize, or anything you can see wrong here?
I'm excited about the chance of getting this working!

Thanks in advance
 
M

Max

Smohrman said:
Thank you Max! I had trouble with the syntax at first, but simply renamed my
sheets to X and Y and copied the formula you had in A1 through D1 in the
online example, copied it down and it worked.

You're welcome ! Yes, it's easier to get it up and working first.
Then if needed, we can change** the sheetnames to taste,
and Excel would handle the sheetname change in the formulas automatically*
*except for some formulas, eg: INDIRECT
**albeit it's better to use short sheetnames whenever possible
I then had to autofilter col D for non-blanks, copy and paste the results to
another sheet to see the actual number of non-duplicated entries from column
A. Is that step necessary or am I missing something?

No, the above shouldn't be necessary <g>.
The results (only unique lines) should already appear in cols A to C,
bunched at the top. Col D is a helper col (can be hidden away).

And if we want to copy and paste the results elsewhere,
we could use paste special > check "values" > ok
 
M

Max

Smohrman said:
.. "Compile Error: Syntax Error"
The error line highlighted in the VB editor is:
"If ActiveSheet.Range("A1").Offset(I, K).Value <>"

Think you were hit with a line break when you copied and pasted the code.
Place the cursor on the leftmost part of the next line below the line above,
then do a backspace to remove the line break (bring the next line up, so as
to speak)

The entire line should read as:
If ActiveSheet.Range("A1").Offset(I, K).Value <>
ActiveSheet.Range("A1").Offset(J, K).Value Then

... but in my tests here, admit I wasn't able to get macropod's code to work
on the sample data. Do hang around for further responses from macropod /
others versed in vba.
 
D

Dave Peterson

This should be all one line in your code window:

If ActiveSheet.Range("A1").Offset(I, K).Value <>
ActiveSheet.Range("A1").Offset(J, K).Value Then

Or you can add a continuation character (space underscore) to the top line:

If ActiveSheet.Range("A1").Offset(I, K).Value <> _
ActiveSheet.Range("A1").Offset(J, K).Value Then
 
S

Smohrman

OK, I resolved the syntax error- it no longer errors out. I run the macro,
the hourglass turns and turns and turns...for about 2 minutes...then nothing.
The column still has duplicate values in it. Is it time to punt or what?
 
M

macropod

Hi Smohrman,

As coded, the macro tests all cells on each row to see whether thev're got
the same values as the corresponding cells on any other row.

From the data you'be posted, it's not clear where the column breaks might
be. If the data are all in one column, but parts of the data differ, you'll
need to use the Text-To Columns function to separate the different elements
into their own columns.

If you only want to test the cells in a limited number of columns, change
the line:
lLastCol = ActiveSheet.UsedRange.Columns.Count - 1
to 1 less than the column-number of the last column you want to test. For
example, if you want to test columns A-C, change this line to
lLastCol = 2
Similalrly, if you only want to start testing a column B, change the line:
For K = 0 To lLastCol
to
For K = 1 To lLastCol

Cheers
 
Top