Hi Joel
Thanks very much for your help. My answers are as follows:
1) Is it acceptable to sort the worksheet in the macro?
Yes.
2) For macro ONE what is the last column?
This will be different as I want to be able to run the macro on various
spreadsheets. Although I suppose you could set the last column as IV (using
Excel 2003) and therefore it wouldn't matter as some cells may be blank.
3) Will the last column always be the same when you run the macro?
No. See answer to 2 above.
4) Does every Row have data filed to the last column.
Not necessarily. Some cells could be blank.
At the risk of providing too much info, here is some context for you:
BACKGROUND
Basically I am trying to remove duplicate rows from worksheets which import
data from various online forms submitted by users. Unfortunately, sometimes
users submit the exact same data multiple times (for whatever reason) and
this causes problems when we need to do an analysis of the collated data. So
I want to be able to remove duplicate rows where every cell in the row is
exactly the same as every corresponding cell within another row, and some of
these cells could be blank.
As for Macro TWO, I need this for another reason altogether. Last column is
not constant and yes there may be blank cells within the row.
JUST A THOUGHT
I suppose you could have the one macro do both jobs by designing it so that
the user can specify a range to interrogate - the user could select a single
column or a group of columns - but I don't know whether that would complicate
things as I don't want the user to specify WHAT the macro is actually looking
for, only WHERE to look for duplicates.
EXAMPLE
I may have a worksheet with a data range of A1:AG25000. Some cells within
this range may be blank for whatever reason. Now, I may want to remove any
duplicate rows where data in columns A to AG are an exact match for the
corresponding cells.
That is, A123 may be exactly the same as A237 and A767, B123 may be exactly
the same as B237 and B767, and so on for every column in those rows,
regardless of whether the match happens to contain data or be blank - hence
why you could go to column IV to do your interrogation. In this scenario I
want to only be left with unique rows after the macro is finished, so only
row 123 would remain while rows 237 and 767 are deleted.
Now, I may have another spreadsheet where I want to achieve the same thing,
but only want to check a single column for duplicates. For instance, I may
have 25000 rows in a spreadsheet containing employee data, but there may only
be 18000 employees in total, so I should only have 18000 rows not 25000. So,
if Column E contained a unique identifier such as employee numbers, then I
would want the macro to only interrogate that column and remove rows based
only on the duplicated data in row E regardless of whether the other cells in
the row were duplicated. In this scenario I only want to keep the first row
conatining that unique identifier.
Hope this makes sense and that I haven't confused you!
Once again, I appreciate your help and time with this.
Joe.