Joe,
JLatham, one of the top professionals that helps people on this board has
done the best that Excel formulas allow to provide exactly what you asked
for, i.e. a formulaic way to get rid of duplicate data in a column. He
deserves KUDOS and a check mark for the effort, but the function-based
solution that you asked for still potentially leaves you with a considerable
amount of work to do. This is the nature of the beast, Excel formulas and
functions are not designed to do what you were asking for. Latham's solution
is as close as I have seen and to see how dificult it is to get Excel to even
pretend to do what you asked, consider that most questions to this board are
answered within an hour or so, some with 4 or more replies. Latham's response
came over 7 1/2 hours later.
I have had to solve the duplicate data problem a number of times; I gave up
on formulaic solutions long ago. So while Latham gave you the closest
formulaic answer to exactly what you specified in your question, I will share
the method I have worked out in the past.
You need a macro to do what you really need to do. NOTE: I needed to
re-create the macro and while I was researching the exact syntax for a term I
found a better example of the code I was writing. The macro below is a
fairly-well modified version (adjusted to fit this problem) of what I found
in the Microsoft Excel VBA
Help files; to see the original code, search for "Delete Method" in Help
from inside the Visual Basic Editor.
Warning: to start, make a copy of your workbook and work on the copy until
you understand and are well practiced in the use of any macro that delets
data. It may take a couple of practice runs to get the whole picture of what
you are doing. When you do understand, you will wonder how you lived without
VBA.
In your copy workbook, with a compiled data worksheet showing (such as
"OVERALL 1965-1967", you should go to Tools|Macros|Visual Basic Editor
and insert a module in your workbook;
then paste the following code into the module:
********************
Sub RidDups()
Dim r1 As Range 'next row
Dim r2 As Range 'current row
Dim r3 As Range 'three cells to delete because of duplicate driver
With ActiveSheet
Set r2 = Selection.Range("A1")
Do While Not IsEmpty(r2)
Set r1 = r2.Offset(1, 0)
If r1.Value = r2.Value Then
Set r3 = r2.Offset(0, -1)
r3.Resize(1, 3).Select
Set r3 = Selection
r3.Cells.Delete Shift:=xlShiftUp
End If
Set r2 = r1
Loop
End With
End Sub
*******************
Don't bother to paste the stars.
NEXT:
Then go to the data compiling worksheet (in the example you posted this is:
"OVERALL 1965-1967".
Then go to a column of drivers where you have pasted new data. Work with one
race type at a time.
The RidDups macro needs the drivers sorted, in the example workbook they
were not perfectly sorted.
Sort the 3 adjacent columns (rank, driver, wins) according to only the
driver column, decending. To do so, just select (click on) the word 'Rank' at
the top of the chosen list, then go to Data|Sort|"Driver"|Assending.
Excel will select the three columns for you. Note: This may temporarily
discombobulate your formulas, they will be quickly fixed at the end.
After sorting, NEXT:
Select just the drivers in the middle column, all the way down. Click on the
driver's name in the top cell of the list, then press the control/shift/down
keys.
With that selection still darkened, go to Tools|Macros|This Workbook
Only|RidDups|Run.
That will compare each name with the next and delete the duplicates, VERY
quickly.
NOTE: If you get tired of the screen flashing while the macro is running,
modify the code by putting two new
lines in it:
Right after the three 'Dim' statements, type this new line:
Application.ScreenUpdating = False
Then right after the 'End With' statement, type this new line:
Application.ScreenUpdating = True
These lines will make the screen stop flickering and flashing while the code
is running and will speed up excecution. I should wait until you are
comfortable with the operation of the macro first.
After deleting the duplicates using the Macro; NEXT:
Check your list for the few remaining data problems, for instance, there
might or might not be both of each
of these:
BILL DENNIGER
BILL DENNINGER
BILL HEENEY
BILL HENNEY
DON QUACKENBUSH
DON QUAKENBUSH
HURAM LOCKWOOD
HURAN LOCKWOOD
JIM GREASLEY
JIM GRESLEY
JOHN LOGAN
JOHNNY LOGAN
LEON WEISKE
LEON WIESKE
These may all be legitimate, or some may be misspellings/nicknames.
As you can see, the amount of work to fix the list is much less after
running the macro; these are the only potential duplicates left in the list
that I tested it on, as far as i can tell.
Also NOTE: Jim Gresely's name has a space in front it. If it doesn't appear
exactly like that in each winners list on each data-feeding worksheet, then
your 'wins' formula will not be able to find him and he will always show zero
wins.
NEXT:
Fix the formulas in the wins column by finding a good copy of the formula
left over (there will be a bunch of good ones left in the list, just a few
may be messed up or missing) and then drag-copying the good formula up and
down the remaining list.
NEXT:
Data|Sort the entire list again, this time according to Wins|Decending
/Drivers|Acending. This puts the winningest drivers back on top and the tied
drivers in alphabetical order.
NEXT:
Your rank numbers will now be somewhat messed up, no worries, it was already
somewhat messed up to begin with. You may want to give them a true ranking,
taking into account ties. To do this, go to the first driver.
In the example sheet I worked on the Pre-LIM Wins set of data, so I went to
BOB SENNEKER with 38 wins. Select the cell for BOB's rank, cell I6 (EYE SIX,
not Sixteen).
In i6, type the formula:
=IF(K6<>K5,COUNTA($J$6:J6),I5)
Read this formula as, {IF This driver's wins are not equal to the previous
driver's wins (both in column K),
THEN Count all drivers to this point (In column J),
ELSE This driver's rank is tied (equal to) the previous driver's rank (In
column eye).}
Note the Dollar-signs in the first term inside the COUNTA function, this
locks all of the formulas in the column to the top cell while the second term
is adjusted by Excel to match where you copy it.
Copy that forumla all the way down the Rank column.
This formula will make the drivers with the same amount of wins tied in
rank, and the next driver with less wins will pick up the correct next number
(in the example worksheet in the PRE-LIMS WINS data table, there are three
drivers with 16 wins who are tied at fourth, the next two drivers have 10
wins and are tied for seventh,
towards the bottom of the list a whole bunch with 2 wins are all tied at
44th, and the 1-win "wild" bunch are tied at 74th.)
Please let us know if any of this helped. Certainly Latham deserves a check
in the was this helpful/did this answer your question check box because he
really did answer your question.
SongBear