Algorithm for computing Excel PercentRank

K

Kam Mistry

I am current trying to determine how Excel calculates PercentRank.

The project I am working on requires the use of this function - and we
cannot use COM automation to call Excel.

The only documentation I could find was the following:

Excel Function: PERCENTRANK(x, array)

- If x matches one of the values in the array, this function is
equivalent to the Excel formula = (RANK(x)-1)/(N-1) where N is the
number of data points.

- If x does not match one of the values, then the PERCENTRANK function
interpolates.

Rank(x) is easy to compute. The second statement is where I'm having
trouble with, how does Excel "interpolates" the percentage rank? What
method does it use?

Thanks in advance,
Kam Mistry
Microsoft .NET Developer (C#)
 
H

Harlan Grove

Kam Mistry wrote...
I am current trying to determine how Excel calculates PercentRank.

The project I am working on requires the use of this function - and we
cannot use COM automation to call Excel.
....

But you need to duplicate Excel's PERCENTRANK sematics exactly? If so,
PERCENTRANK(Rng,x) needs to return #N/A or some other error indicator
when x > MAX(Rng).

Also, PERCENTRANK rounds differently than the formula you mention. For
example, with 1..8 in G1:G8 and 5 in G10,

PERCENTRANK(G1:G8,G10,6) returns 0.571428
ROUND((RANK(G10,G1:G8,1)-1)/7,6) returns 0.571429

Rank(x) is easy to compute. The second statement is where I'm having
trouble with, how does Excel "interpolates" the percentage rank? What
method does it use?

What ever happened to exploratory data analysis? It's simple linear
interpolation between the points in the range bracketting your second
argument value. That is,

=TREND(PERCENTRANK(Rng,
{1;0}*MAX(IF(Rng<=x,Rng))+{0;1}*MIN(IF(Rng>=x,Rng)),6),
{1;0}*MAX(IF(Rng<=x,Rng))+{0;1}*MIN(IF(Rng>=x,Rng)),x)

or, since you're going to be programming this,


if (YourErrorCheckFcnHere(pr = YourPercentRankFcnHere(rng, x)) {
for (i = 0, xlo = xhi = x; i < sizeof(rng) / sizeof(*rng); ++i) {
if (xlo < rng && rng < x) xlo = rng;
if (xhi > rng && rng > x) xhi = rng;
}

ylo = YourPercentRankFcnHere(rng, xlo);
yhi = YourPercentRankFcnHere(rng, xhi);

pr = ((xhi - x) * ylo + (x - xlo) * yhi) / (xhi - xlo);
}


If rng is sorted, use binary search rather than simple iteration to
locate the points in rng bracketting x. I assumed rng was a simple
array. Adapt for more sophisticated data structures.
 
K

Kam Mistry

Thanks for you help! Our test cases against Excel PercentRank work.

Cheers,
Kam


Harlan Grove said:
Kam Mistry wrote...
I am current trying to determine how Excel calculates PercentRank.

The project I am working on requires the use of this function - and we
cannot use COM automation to call Excel.
...

But you need to duplicate Excel's PERCENTRANK sematics exactly? If so,
PERCENTRANK(Rng,x) needs to return #N/A or some other error indicator
when x > MAX(Rng).

Also, PERCENTRANK rounds differently than the formula you mention. For
example, with 1..8 in G1:G8 and 5 in G10,

PERCENTRANK(G1:G8,G10,6) returns 0.571428
ROUND((RANK(G10,G1:G8,1)-1)/7,6) returns 0.571429

Rank(x) is easy to compute. The second statement is where I'm having
trouble with, how does Excel "interpolates" the percentage rank? What
method does it use?

What ever happened to exploratory data analysis? It's simple linear
interpolation between the points in the range bracketting your second
argument value. That is,

=TREND(PERCENTRANK(Rng,
{1;0}*MAX(IF(Rng<=x,Rng))+{0;1}*MIN(IF(Rng>=x,Rng)),6),
{1;0}*MAX(IF(Rng<=x,Rng))+{0;1}*MIN(IF(Rng>=x,Rng)),x)

or, since you're going to be programming this,


if (YourErrorCheckFcnHere(pr = YourPercentRankFcnHere(rng, x)) {
for (i = 0, xlo = xhi = x; i < sizeof(rng) / sizeof(*rng); ++i) {
if (xlo < rng && rng < x) xlo = rng;
if (xhi > rng && rng > x) xhi = rng;
}

ylo = YourPercentRankFcnHere(rng, xlo);
yhi = YourPercentRankFcnHere(rng, xhi);

pr = ((xhi - x) * ylo + (x - xlo) * yhi) / (xhi - xlo);
}


If rng is sorted, use binary search rather than simple iteration to
locate the points in rng bracketting x. I assumed rng was a simple
array. Adapt for more sophisticated data structures.
 
K

Kam Mistry

Here's the C# function I wrote where "listOfRankItems" is an array of
struct:


private float PercentRank(rankItem[] listOfRankItems, float valueX)
{

/*--------------------------------------------------------------------------
---------

Calculate the PERCENTRANK(array, x)

If X matches one of the values in the array, this function is
equivalent to
the Excel formula =(RANK(x)-1)/(N-1) where N is the number of data
points.

If X does not match one of the values, then the PERCENTRANK function
interpolates.

----------------------------------------------------------------------
-------------*/

int numberOfItems = listOfRankItems.Length;
float resultPR = valueX;
bool foundX = false;

for (int index = 0; index < numberOfItems; index++)
{
if (listOfRankItems[index].WealthAmount == valueX)
{
resultPR = ((float)index)/((float)(numberOfItems - 1));
foundX = true;
break;
}
}

// calculate value using linear interpolation

if (!foundX)
{
float x1, x2, y1, y2;

x1 = x2 = valueX;

foundX = false;

for (int index = 0; index < numberOfItems - 1; index++)
{
if ((listOfRankItems[index].WealthAmount < valueX && valueX <
listOfRankItems[index + 1].WealthAmount) ||
(listOfRankItems[index].WealthAmount > valueX && valueX >
listOfRankItems[index + 1].WealthAmount))
{
x1 = listOfRankItems[index].WealthAmount;
x2 = listOfRankItems[index + 1].WealthAmount;
foundX = true;
break;
}
}

if (foundX)
{
y1 = PercentRank(listOfRankItems, x1);
y2 = PercentRank(listOfRankItems, x2);

resultPR = (((x2 - valueX)*y1 + (valueX - x1)*y2)) / (x2 - x1);
}
else
{
// use the smallest or largest value in the set which ever is
closer to valueX

if (Math.Abs(listOfRankItems[0].WealthAmount - valueX) <
Math.Abs(valueX - listOfRankItems[numberOfItems-1].WealthAmount))
x1 = listOfRankItems[0].WealthAmount;
else
x1 = listOfRankItems[numberOfItems-1].WealthAmount;

resultPR = PercentRank(listOfRankItems, x1);
}
}

return resultPR;
}



Harlan Grove said:
Kam Mistry wrote...
I am current trying to determine how Excel calculates PercentRank.

The project I am working on requires the use of this function - and we
cannot use COM automation to call Excel.
...

But you need to duplicate Excel's PERCENTRANK sematics exactly? If so,
PERCENTRANK(Rng,x) needs to return #N/A or some other error indicator
when x > MAX(Rng).

Also, PERCENTRANK rounds differently than the formula you mention. For
example, with 1..8 in G1:G8 and 5 in G10,

PERCENTRANK(G1:G8,G10,6) returns 0.571428
ROUND((RANK(G10,G1:G8,1)-1)/7,6) returns 0.571429

Rank(x) is easy to compute. The second statement is where I'm having
trouble with, how does Excel "interpolates" the percentage rank? What
method does it use?

What ever happened to exploratory data analysis? It's simple linear
interpolation between the points in the range bracketting your second
argument value. That is,

=TREND(PERCENTRANK(Rng,
{1;0}*MAX(IF(Rng<=x,Rng))+{0;1}*MIN(IF(Rng>=x,Rng)),6),
{1;0}*MAX(IF(Rng<=x,Rng))+{0;1}*MIN(IF(Rng>=x,Rng)),x)

or, since you're going to be programming this,


if (YourErrorCheckFcnHere(pr = YourPercentRankFcnHere(rng, x)) {
for (i = 0, xlo = xhi = x; i < sizeof(rng) / sizeof(*rng); ++i) {
if (xlo < rng && rng < x) xlo = rng;
if (xhi > rng && rng > x) xhi = rng;
}

ylo = YourPercentRankFcnHere(rng, xlo);
yhi = YourPercentRankFcnHere(rng, xhi);

pr = ((xhi - x) * ylo + (x - xlo) * yhi) / (xhi - xlo);
}


If rng is sorted, use binary search rather than simple iteration to
locate the points in rng bracketting x. I assumed rng was a simple
array. Adapt for more sophisticated data structures.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top