Ok, your description of columns used didn't match the example data, so I
wrote the code so that you can adapt it to your actual workbook/worksheet
layout. Just change the Const values (right side of = sign) to match your
reality.
I also suggest testing this on a copy of your workbook first, rather than
taking the chance that I've made a boo-boo that will destroy what you have.
To test the code:
Make a copy of your workbook. Close the original.
Press [Alt]+[F11] to open the VB Editor (VBE).
In the VBE choose Insert --> Module from its menu toolbar.
Copy the code below and paste it into the empty module presented to you.
Make changes to the Const values as required to properly define them.
Close the VBE.
Use Tools --> Macro --> Macros and select the RemoveExtraEntries macro in
the list and click the [Run] button. How long it takes depends on how many
entries you have. It actually does the work in 2 stages. First it finds the
hour values to be removed and sets them to zero. After testing all entries,
it goes back and deletes all rows with a 0 entry for the hours on a row.
Works correctly with the test data you provided.
Sub RemoveExtraEntries()
'by using these Const values
'you can change the code to
'work with a different layout
'on your sheet if you ever need to
Const mySheetName = "Sheet1" ' payroll data sheet name
Const ENumColumn = "B"
Const JCodeColumn = "C"
Const PCodeColumn = "E"
Const HrsColumn = "H"
'end of user redefinable Const values
Dim pSheet As Worksheet
Dim lastRow As Long
Dim offset2JCode As Integer
Dim offset2PCode As Integer
Dim offset2Hrs As Integer
Dim rOffset As Long
Dim maxHours As Single
Dim OLC As Long ' outer loop counter
Dim ILC As Long ' inner loop counter
Dim baseCell As Range
'initialize some values
Set pSheet = ThisWorkbook.Worksheets(mySheetName)
offset2JCode = Range(JCodeColumn & 1).Column - _
Range(ENumColumn & 1).Column
offset2PCode = Range(PCodeColumn & 1).Column - _
Range(ENumColumn & 1).Column
offset2Hrs = Range(HrsColumn & 1).Column - _
Range(ENumColumn & 1).Column
lastRow = pSheet.Range(ENumColumn & Rows.Count). _
End(xlUp).Row
Application.ScreenUpdating = False ' improve performance
For OLC = lastRow To 2 Step -1
Set baseCell = pSheet.Range(ENumColumn & OLC)
rOffset = -1
maxHours = baseCell.Offset(0, offset2Hrs)
For ILC = OLC - 1 To 2 Step -1
If baseCell.Offset(rOffset, 0) = baseCell And _
baseCell.Offset(rOffset, offset2JCode) = _
baseCell.Offset(0, offset2JCode) And _
baseCell.Offset(rOffset, offset2PCode) = _
baseCell.Offset(0, offset2PCode) Then
If baseCell.Offset(rOffset, offset2Hrs) > _
maxHours Then
maxHours = baseCell.Offset(rOffset, offset2Hrs)
baseCell.Offset(0, offset2Hrs) = 0 ' mark for delete
Else
baseCell.Offset(rOffset, offset2Hrs) = 0 ' mark for delete
End If
End If
rOffset = rOffset - 1
Next
Next
'now delete rows with 0 in the Hours column
For OLC = lastRow To 2 Step -1
If pSheet.Range(HrsColumn & OLC) = 0 Then
pSheet.Range(HrsColumn & OLC).EntireRow.Delete
End If
Next
Set baseCell = Nothing
Set pSheet = Nothing
End Sub
ShagNasty said:
Sorry if in wrong Discussion group...
Spreadsheet with columns A thru K -- columns A, E, G, & J have employee data
(name, pay code, pay period, and hours respectively). Frequent pay
adjustments are made to many employee’s time during a given pay period. I
need to delete all records (rows), but the highest hour total, for the
employee, pay code, and pay period. MS Office SP3, Win XP, Approx 20k rows.
EmpName ENum JCode JDesc PayCode PDesc PayPeriod Hours
Emp A 0000A ABC ABC 055 OT 01/25/09 5
Emp A 0000A ABC ABC 055 OT 01/25/09 7.5
Emp A 0000A ABC ABC 065 ST 01/25/09 8
Emp A 0000A ABC ABC 065 ST 01/25/09 4
Emp B 0000B NBC NBC 055 OT 02/25/09 3
Emp B 0000B NBCN NBC 055 OT 02/25/09 5
Emp B 0000B NBC NBC 055 OT 02/25/09 16
I need to retain rows 3 (7.5), 4 (8), and 8 (16)
Thanks.. ShagNasty