Function to count unique values?

R

Richard Buttrey

I have a column of data in the form

a
a
a
b
b
c
c

etc.,

I know I can run an Advanced unique filter to obtain a list of unique
values and then use a COUNTA function to count them, but is there a
single function that would return the result, i.e. 3 (a, b & c) given
the above data?

(apologies if this appears twice - I thought I had posted it directly
to the GoogleNews (Deja) site but it doesn't seem to have turned up.

Rgds
__
Richard Buttrey
Grappenhall, Cheshire, UK
__________________________
 
K

KL

Hi Richard,

I can see you previous post (as of yesterday) in my OE and two replies to
it.

Regards,
KL
 
R

Richard Buttrey

=SUMPRODUCT((A1:A20<>"")/COUNTIF(A1:A20,A1:A20&""))


Great. Thanks Bob.

Just as a matter of interest, can you explain in words what the bits
of the function do, and what the bit at the end - A20&"" does?

Kind regards,

Richard
__
Richard Buttrey
Grappenhall, Cheshire, UK
__________________________
 
B

Bob Phillips

Okay, I will give it a shot.

Let's start by defining the range A1:A20 to talk specifics.

Bob,John,Bob,Bob,John,John,Bob,Bill,Bill,Max

or data in just A1:A10

The basic formula to count unique items is
=SUMPRODUCT(1/COUNTIF($A$1:$A$10,$A$1:$A$10))
The COUNTIF($A$1:$A$10,$A$1:$A$10) part of the formula builds an array of
the number of occurrences of each item, in this case{4;3;4;4;3;3;4;2;2;1}.
As can be seen, each occurrence of the repeated value is counted, so there
are four occurrences of Bob in the array. There will always be the same
number of occurrences of value as the count of that value, unless two or
more items are repeated the same number of times, in which case it will be
some multiple of that count.
Thus the item that is repeated 4 times has 4 instances of that count,
dividing 1 by the count of 4, gives 0.25 4 times. The full array of values
is
{0.25;0.333333333333333;0.25;0.25;0.333333333333333;0.333333333333333;0.25;0
..5;0.5;1}.
The item that repeats 4 times sums to 1. The item that repeats 3 times also
sums to 1. It should be clear from this that every value works in the same
way and sums to 1. In other words, 1 is returned for every unique item. The
sum of these values becomes the count of unique items.
As our test range is A1:A20, and some of the items in A1:A20 are blank,
extending this formula to A1:A20 would return a #DIV/0! Error.
The reason for the error is blank cells in the full range A1:A20. Each blank
cell returns a 0 value from the COUNTIF formula, which gives the #DIV/0!
Error when divided into 1.
The solution to this is to force it to count the empty cells as well, and
not return a zero. Adding &"" to the end of the COUNTIF formula forces a
count of the blanks.
This addition on its own removes the #DIV/0! error, but will cause the
blanks to be counted as a unique item. A further addition to the formula
resolves this by testing for those blanks. Instead of dividing the array of
counts into 1 each time, adding the test creates an array of TRUE/FALSE
values to be divided by the equivalent element in the counts array. Each
blank will resolve to FALSE in the dividend array, and the count of the
blanks in the divisor array. The result of this will be 0, so the blanks do
not get counted.
 
R

Richard Buttrey

Okay, I will give it a shot.

Let's start by defining the range A1:A20 to talk specifics.

Bob,John,Bob,Bob,John,John,Bob,Bill,Bill,Max

or data in just A1:A10

The basic formula to count unique items is
=SUMPRODUCT(1/COUNTIF($A$1:$A$10,$A$1:$A$10))
The COUNTIF($A$1:$A$10,$A$1:$A$10) part of the formula builds an array of
the number of occurrences of each item, in this case{4;3;4;4;3;3;4;2;2;1}.
As can be seen, each occurrence of the repeated value is counted, so there
are four occurrences of Bob in the array. There will always be the same
number of occurrences of value as the count of that value, unless two or
more items are repeated the same number of times, in which case it will be
some multiple of that count.
Thus the item that is repeated 4 times has 4 instances of that count,
dividing 1 by the count of 4, gives 0.25 4 times. The full array of values
is
{0.25;0.333333333333333;0.25;0.25;0.333333333333333;0.333333333333333;0.25;0
.5;0.5;1}.
The item that repeats 4 times sums to 1. The item that repeats 3 times also
sums to 1. It should be clear from this that every value works in the same
way and sums to 1. In other words, 1 is returned for every unique item. The
sum of these values becomes the count of unique items.
As our test range is A1:A20, and some of the items in A1:A20 are blank,
extending this formula to A1:A20 would return a #DIV/0! Error.
The reason for the error is blank cells in the full range A1:A20. Each blank
cell returns a 0 value from the COUNTIF formula, which gives the #DIV/0!
Error when divided into 1.
The solution to this is to force it to count the empty cells as well, and
not return a zero. Adding &"" to the end of the COUNTIF formula forces a
count of the blanks.
This addition on its own removes the #DIV/0! error, but will cause the
blanks to be counted as a unique item. A further addition to the formula
resolves this by testing for those blanks. Instead of dividing the array of
counts into 1 each time, adding the test creates an array of TRUE/FALSE
values to be divided by the equivalent element in the counts array. Each
blank will resolve to FALSE in the dividend array, and the count of the
blanks in the divisor array. The result of this will be 0, so the blanks do
not get counted.


Terrific.

Thanks Bob, that's one to keep - and so well explained.

Rgds

__
Richard Buttrey
Grappenhall, Cheshire, UK
__________________________
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top