Search for duplicates using MS Professional 2003 Excel

B

bill_jba

I have a row of data containing 2 inseparable and related columns that I want
to compare with all other data rows.

The first column contains an order #, which could contain text and numbers
and in the same row, is the second related column, which contains an amount
in $.

I want to create a formula to search the data file, which contains more than
5,000 rows, to determine if the same order # and the same amount together,
appear in the data more than once.

If the result finds a duplicate, I want the entire row to be highlighted and
an separated from the data if possible into a new sheet, with out cut and
pasting.

Thanks in advance for the assistance
 
B

Bob Greenblatt

I have a row of data containing 2 inseparable and related columns that I want
to compare with all other data rows.

The first column contains an order #, which could contain text and numbers
and in the same row, is the second related column, which contains an amount
in $.

I want to create a formula to search the data file, which contains more than
5,000 rows, to determine if the same order # and the same amount together,
appear in the data more than once.

If the result finds a duplicate, I want the entire row to be highlighted and
an separated from the data if possible into a new sheet, with out cut and
pasting.

Thanks in advance for the assistance
You need a macro to do exactly what you want. However, you can use several
manual steps to locate the data. I would create 2 additional columns to the
right. In the first column place the formula "=a1&b1 assuming the order
number is in column A and the price in column B. then in the next Colum, put
in sequential numbers representing the row. Then sort the whole mess on the
column with the combined order numbers and row number. You could then enter
a formula in the next column to the right beginning in row 2 for example
that looks like =x2=x1. Then search for TRUE and you'll have your
duplicates.
 
B

bill_jba

Bob:

Thanks so much for your help. Sounds straight forward, but unfortunately I
don't get it.

I added the 2 new columns and placed the formula as you suggested in the 1st
new column. I assume I should copy/paste to fill the entire 1st new column
with that formula.

In the next column, can you help me understand what "sequential numbers"
representing the row should be?

Then it sounds like you suggest using a 3rd column for the =x2=x1 formula???

How would I search for TRUE...by typing in the fx box right?

Here's an example of the data columns which I have previously sorted by
amounts.

Order # $Amount
515 $157.09
516 $157.09
0191009957001*32301 $2,090.53
0191009957001*40601 $2,090.53
60672599*22300 $227.43
60672599*40600 $227.43
60672599-30600 $227.43
11 $428.99
6 $428.99
61921 $485.43
61921 $485.43
32117 $249.55
32147 $249.55
31590 $267.20
31590 $267.20
34099 $278.64
35015 $278.64
35345 $278.64
10075629 $1,156.68
8420029 $1,156.68
9305429 $1,156.68
1999-2000 $300.00
2000/2001 $300.00
 
B

Bob Greenblatt

Bob:

Thanks so much for your help. Sounds straight forward, but unfortunately I
don't get it.

I added the 2 new columns and placed the formula as you suggested in the 1st
new column. I assume I should copy/paste to fill the entire 1st new column
with that formula. Yes
In the next column, can you help me understand what "sequential numbers"
representing the row should be?
The sequential numbers were to represent the original rows, so you could
sort it back if needed. Just enter a row number in one cell, then the next
row number in the cell under it, and then drag to the bottom of the range.
Then it sounds like you suggest using a 3rd column for the =x2=x1 formula???
Yes, after the data is sorted, this will find the duplicates.
How would I search for TRUE...by typing in the fx box right?
Yes, or highlight the column and then use edit find.
 
G

Geoff Lilley

Bob:

Thanks so much for your help. Sounds straight forward, but unfortunately I
don't get it.

I added the 2 new columns and placed the formula as you suggested in the 1st
new column. I assume I should copy/paste to fill the entire 1st new column
with that formula.

In the next column, can you help me understand what "sequential numbers"
representing the row should be?

Then it sounds like you suggest using a 3rd column for the =x2=x1 formula???

How would I search for TRUE...by typing in the fx box right?

Here's an example of the data columns which I have previously sorted by
amounts.

Order # $Amount
515 $157.09
516 $157.09
0191009957001*32301 $2,090.53
0191009957001*40601 $2,090.53
60672599*22300 $227.43
60672599*40600 $227.43
60672599-30600 $227.43
11 $428.99
6 $428.99
61921 $485.43
61921 $485.43
32117 $249.55
32147 $249.55
31590 $267.20
31590 $267.20
34099 $278.64
35015 $278.64
35345 $278.64
10075629 $1,156.68
8420029 $1,156.68
9305429 $1,156.68
1999-2000 $300.00
2000/2001 $300.00

Bob Greenblatt said:
On 5/6/08 2:21 PM, in article
(e-mail address removed), "bill_jba"
You need a macro to do exactly what you want. However, you can use several
manual steps to locate the data. I would create 2 additional columns to the
right. In the first column place the formula "=a1&b1 assuming the order
number is in column A and the price in column B. then in the next Colum, put
in sequential numbers representing the row. Then sort the whole mess on the
column with the combined order numbers and row number. You could then enter
a formula in the next column to the right beginning in row 2 for example
that looks like =x2=x1. Then search for TRUE and you'll have your
duplicates.

Pardon me for jumping in, hope this helps?

The =a1&B1 formula would be in column C. That would just put the two
together.

In D1, put a formula like this:=COUNTIF(C$1:C$21,C1)>1

If it's a duplicate, the cell will say TRUE. If it's not, it'll say
FALSE.

To highlight which rows are duplicates, I would do the following:
* Select the cells in column D
*Go to Format, Conditional Formatting.
* For "Cell Value Is" , choose "Equal To," and type the word TRUE.
(no quotes)
* Click on "Format," then choose "Patterns." I'd recommend using a
yellow background color (or something else that really grabs your
attention).

Then, all the cells that are duplicates will be yellow.

But, Bob's original point is well-taken. A macro is needed to do all
this automatically.

HTH
Cheers
Geoff
 
B

bill_jba

Geoff Lilley said:
Bob:

Thanks so much for your help. Sounds straight forward, but unfortunately I
don't get it.

I added the 2 new columns and placed the formula as you suggested in the 1st
new column. I assume I should copy/paste to fill the entire 1st new column
with that formula.

In the next column, can you help me understand what "sequential numbers"
representing the row should be?

Then it sounds like you suggest using a 3rd column for the =x2=x1 formula???

How would I search for TRUE...by typing in the fx box right?

Here's an example of the data columns which I have previously sorted by
amounts.

Order # $Amount
515 $157.09
516 $157.09
0191009957001*32301 $2,090.53
0191009957001*40601 $2,090.53
60672599*22300 $227.43
60672599*40600 $227.43
60672599-30600 $227.43
11 $428.99
6 $428.99
61921 $485.43
61921 $485.43
32117 $249.55
32147 $249.55
31590 $267.20
31590 $267.20
34099 $278.64
35015 $278.64
35345 $278.64
10075629 $1,156.68
8420029 $1,156.68
9305429 $1,156.68
1999-2000 $300.00
2000/2001 $300.00

Bob Greenblatt said:
On 5/6/08 2:21 PM, in article
(e-mail address removed), "bill_jba"
<[email protected]> wrote:
I have a row of data containing 2 inseparable and related columns that I want
to compare with all other data rows.
The first column contains an order #, which could contain text and numbers
and in the same row, is the second related column, which contains an amount
in $.
I want to create a formula to search the data file, which contains more than
5,000 rows, to determine if the same order # and the same amount together,
appear in the data more than once.
If the result finds a duplicate, I want the entire row to be highlighted and
an separated from the data if possible into a new sheet, with out cut and
pasting.
Thanks in advance for the assistance
You need a macro to do exactly what you want. However, you can use several
manual steps to locate the data. I would create 2 additional columns to the
right. In the first column place the formula "=a1&b1 assuming the order
number is in column A and the price in column B. then in the next Colum, put
in sequential numbers representing the row. Then sort the whole mess on the
column with the combined order numbers and row number. You could then enter
a formula in the next column to the right beginning in row 2 for example
that looks like =x2=x1. Then search for TRUE and you'll have your
duplicates.

Pardon me for jumping in, hope this helps?

The =a1&B1 formula would be in column C. That would just put the two
together.

In D1, put a formula like this:=COUNTIF(C$1:C$21,C1)>1

If it's a duplicate, the cell will say TRUE. If it's not, it'll say
FALSE.

To highlight which rows are duplicates, I would do the following:
* Select the cells in column D
*Go to Format, Conditional Formatting.
* For "Cell Value Is" , choose "Equal To," and type the word TRUE.
(no quotes)
* Click on "Format," then choose "Patterns." I'd recommend using a
yellow background color (or something else that really grabs your
attention).

Then, all the cells that are duplicates will be yellow.

But, Bob's original point is well-taken. A macro is needed to do all
this automatically.

HTH
Cheers
Geoff

To Bob & Geoff:

I have followed both Bob's and Geoff's advice, but am not finding my
solution just yet.

Heres' an example of the results:

515 $157.09 "a1&b1 TRUE
516 $157.09 "a1&b1 TRUE
0191009957001*32301 $2,090.53 "a1&b1 TRUE
0191009957001*40601 $2,090.53 "a1&b1 TRUE
60672599*22300 $227.43 "a1&b1 TRUE
60672599*40600 $227.43 "a1&b1 TRUE
60672599-30600 $227.43 "a1&b1 TRUE
11 $428.99 "a1&b1 TRUE
6 $428.99 "a1&b1 TRUE
61921 $485.43 "a1&b1 TRUE
61921 $485.43 "a1&b1 TRUE
32117 $249.55 "a1&b1 TRUE
32147 $249.55 "a1&b1 TRUE
31590 $267.20 "a1&b1 TRUE
31590 $267.20 "a1&b1 TRUE
34099 $278.64 "a1&b1 TRUE
35015 $278.64 "a1&b1 TRUE
35345 $278.64 "a1&b1 TRUE
10075629 $1,156.68 "a1&b1 TRUE
8420029 $1,156.68 "a1&b1 TRUE
9305429 $1,156.68 "a1&b1 TRUE
1999-2000 $300.00 "a1&b1 TRUE
2000/2001 $300.00 "a1&b1 TRUE
DONATION. $300.00 "a1&b1 TRUE
01020224-1 $2,400.00 "a1&b1 TRUE
01020224-3 $2,400.00 "a1&b1 TRUE
10152-1 $6,310.50 "a1&b1 TRUE
10152-1. $6,310.50 "a1&b1 TRUE

In the first row, cell containing Order #515 for $157.90 is not a duplicate
with row cell #516 even though the amount is the same.

On the other hand, entry row 10152-1 for $6,310.59 is a duplicate with the
last entry row 10152-1.1 for the same amount, the difference being the period.

What is throwing the formula off?

Many thanks, Bill
 
J

JE McGimpsey

bill_jba said:
I have a row of data containing 2 inseparable and related columns that I want
to compare with all other data rows.

The first column contains an order #, which could contain text and numbers
and in the same row, is the second related column, which contains an amount
in $.

I want to create a formula to search the data file, which contains more than
5,000 rows, to determine if the same order # and the same amount together,
appear in the data more than once.

If the result finds a duplicate, I want the entire row to be highlighted and
an separated from the data if possible into a new sheet, with out cut and
pasting.

One way:

Select the relevant cells, with the active cell in Row 1

Choose Format/Conditional Formatting...

Change the dropdowns and input box to read

Formula is =SUMPRODUCT(-($A$1:$A$10000=$A1),-($B$1:$B$10000=$B1))>1

Click Format, Patterns, and choose your highlight color. Click OK, OK.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top