Domenic wrote...
Yes. I believe that any solution for a Top N list must show any and all
values tied for Nth place. Otherwise, which of the tied values should
be returned?
Fair point.
Personally, for a Top N list, I prefer the solution I offered. It takes
into consideration ties for Nth place, it doesn't use array formulas,
and is very efficient.
By the way, the solution I offered is courtesy of Aladin Akyurek.
Source is irrelevant. Very little in these ngs is original.
OK, I hadn't considered ties, but your solution still uses one column
too many and gains no efficiency by doing so. Given the original data
in A2:A501,
B2:
=COUNTIF(A$2:A$501,A$2:A$501)+(ROWS(A2:A$501)-1)/ROWS(A$2:A$501)
B3:
=IF(ISNA(MATCH(A3,A$2:A2,0)),COUNTIF(A3:A$501,A3)
+(ROWS(A3:A$501)-1)/ROWS(A$2:A$501))
Fill B3 down into B4:B501. Compare my B3 formula to yours.
Yours (B3):
=IF((A3<>"")*(ISNA(MATCH(A3,$A$1:A2,0))),COUNTIF($A$2:$A$501,A3),"")
You're only including non-empty cells in col A. You're also matching
against a cell, A1, that isn't specified just so you can use the same
formula in B2 as in B3. There are arguments either way, but I tend to
avoid needing to leave some cells around my data ranges blank. Next, if
the ISNA call returns TRUE, then the current col A cell's value doesn't
appear in previous rows in col A, so there's NO POINT to including
those rows in the COUNTIF call unless your goal is an INEFFICIENT
solution. Again, the motivation seems to be formula uniformity, but if
partly and fully relative references are OK inside the MATCH call and
as the 2nd arg to COUNTIF, why not as the 1st arg to COUNTIF?
As for my (ROWS(x)-1)/ROWS(y) terms, they should be more or less
directly calculated from the references without having to access data.
Next, I'd enter the desired number of results in C1, then use a formula
in B1 to calculate the effective number of results.
B1:
=COUNTIF($B$2:$B$501,">="&INT(LARGE($B$2:$B$501,C1)))
This compares to your E1 formula
Yours (E1) [array formula]:
=MAX(IF(B2:B501=INDEX(B2:B501,MATCH(D1,C2:C501,0)),C2:C501))-D1
The LARGE call may make this somewhat less efficient, but it's a
mystery why you subtract D1 in your E1 formula when all your col F
formulas add $D$1 and $E$1. Seems inefficient to me. Also note your use
of an array formula. One or two unavoidable array formulas OK, but many
not OK even if they can be filled into the needed range from a single
template formula?
My revised results would be in C2:C501.
C2:
=IF(B$1,INDEX(A$2:A$501,MATCH(MAX(B$2:B$501),B$2:B$501,0)),"")
C3 [array formula]:
=IF(ROWS(B$2:B2)<B$1,INDEX(A$2:A$501,
MATCH(MAX(IF(B$2:B$501<VLOOKUP(C2,A$2:B$501,2,0),B$2:B$501)),B$2:B$501,0)),
"")
Compare my C3 formula to your C3 and F3 formulas.
Yours (C3):
=IF(N(B2),RANK(B2,$B$2:$B$501)+COUNTIF($B$2:B2,B2)-1,"")
Yours (F3):
=IF(ROW()-ROW($F$2)+1<=$D$1+$E$1,INDEX($A$2:$A$501,
MATCH(ROW()-ROW($F$2)+1,$C$2:$C$501,0)),"")
Your C3 COUNTIF call corresponds to my B3 (ROWS(x)-1)/ROWS(y) term.
It's a tie breaker. The difference is that your COUNTIF call has to
work with values in col B while my ratio or ROWS calls can be
calculated straight from the references without having to access range
data.
Your C3 RANK call correspond to my C3 MAX(IF(.<VLOOKUP(..),.)) term.
Mine is probably a bit slower in calculation speed, but it uses less
memory. You could tighten up your formulas a bit.
B2:
=IF((A2<>"")*(ISNA(MATCH(A2,$A$1:A1,0))),COUNTIF($A2:$A$501,A2),"")
E1 [array formula]:
=MAX(IF(B2:B501=INDEX(B2:B501,MATCH(D1,C2:C501,0)),C2:C501))
F2:
=IF(ROWS(F$2:F2)<=$E$1,INDEX($A$2:$A$501,
MATCH(ROWS(F$2:F2),$C$2:$C$501,0)),"")