PivotTable - Where have the units/records gone?

E

Epinn

I am quite lost and welcome some logical explanation. This is what I have done.

Click http://www.contextures.com/xlPivot07.html and download the tiny sample file.

Select the SalesPivot sheet (the one one screen when the file is opened). Drag all the fields including page field away from the PT until the only thing left is the outline. Now, drag units from the field list to the row area. (Note: I won't do this for real life; I just do this for this experiment. Bear with me and hear me out.) Then drag the same units field to the data area. Copy both columns to column G and H respectively. Sort column G in ascending order and do AutoSum. I have got 805 as total. I do the same for column H - sort in ascending order and AutoSum. I have got 1208 which is identical to the grand total in the PivotTable. I am missing 403 units from column G. Where have they gone? You will note that the largest number in column H is 196 but it is nowhere to be found in column G. Where has it gone?

Does anyone have an explanation? Roger, do you still want to bet with me? <bg>

Epinn
 
D

Debra Dalgleish

The units in the Row area represent the unique unit entries in the
source data. If you sort the source table by Units, you'll see two
entries where 3 units were ordered. Those two orders are summarized in
the 3 row of the pivot table, for a total of 6 units.

Other unit rows, e.g. 9 and 28, also summarize multiple orders in the
data area.
 
R

Roger Govier

Epinn
Roger, do you still want to bet with me? <bg>

My money is on Deb, especially when it comes to Pivot Tables <vbg

--
Regards

Roger Govier


I am quite lost and welcome some logical explanation. This is what I
have done.

Click http://www.contextures.com/xlPivot07.html and download the tiny
sample file.

Select the SalesPivot sheet (the one one screen when the file is
opened). Drag all the fields including page field away from the PT
until the only thing left is the outline. Now, drag units from the
field list to the row area. (Note: I won't do this for real life; I
just do this for this experiment. Bear with me and hear me out.) Then
drag the same units field to the data area. Copy both columns to column
G and H respectively. Sort column G in ascending order and do AutoSum.
I have got 805 as total. I do the same for column H - sort in ascending
order and AutoSum. I have got 1208 which is identical to the grand
total in the PivotTable. I am missing 403 units from column G. Where
have they gone? You will note that the largest number in column H is
196 but it is nowhere to be found in column G. Where has it gone?

Does anyone have an explanation? Roger, do you still want to bet with
me? <bg>

Epinn
 
E

Epinn

I am not sure if my question/concern is *TRULY* understood.

I understand that 3 showing up twice is equivalent to 6 showing up once i.e. 3+3 = 6. If the units are summarized properly, the total will be the same. This is only true when the units field is dragged to the data area. However, when the units field is dragged to the row area 403 units are missing from the total in the row area.

The total units from the source data (column E) is 1208. The total units in the data area is also 1208. However, the total in the row area is 805. In other words, 403 units are not reported. My original question is why aren't the 403 units reported?

If you look at the source data, you will see 49 four times for *different* items, orders etc. 49*4=196 I do see 196 in the data area, but I only see 49 once in the row area. As Debra explained, in the row area, only unique value is picked up. In other words, even though 49 shows up four times in the source, only one 49 will be picked up. As a result, the total in the row area is distorted and units are missed. I am not sure why PT does this in the row area but not in the data area.

Roger, my very first statement under another thread is the following:
If I just drag qty to the row area and nothing else, I lose a couple of
records.

My experiment has proven my point. However, I am not sure if I am understood.

Epinn

The units in the Row area represent the unique unit entries in the
source data. If you sort the source table by Units, you'll see two
entries where 3 units were ordered. Those two orders are summarized in
the 3 row of the pivot table, for a total of 6 units.

Other unit rows, e.g. 9 and 28, also summarize multiple orders in the
data area.
 
R

Roger Govier

Epinn
I am not sure if my question/concern is *TRULY* understood.
It is.
Roger, my very first statement under another thread is the following:
If I just drag qty to the row area and nothing else, I lose a couple of
records.
My experiment has proven my point.
It hasn't.

The Qty field in this PT is actually called Units.
As per our original discussion, when dragging Units to the Row area,
each occurrence of Units will show quite correctly just once. They are
LABELS in this scenario, not DATA.
You are trying to add Labels (which happen to be numeric and therefore
will sum) and expecting the result to be the same as adding the Data,
which includes the Frequency with which those Labels appear.
Adding Labels, is like adding Apples and Pears - the result is
meaningless.

Try dragging Units to the Data area a second time, but use Count of
Units rather than Sum of Units
Drag the Data field heading to the Total, and you will have the Sum and
Count side by side

Column A will now show 29 different numbers starting at 3 and rising to
50. Not every number in the series is present.
Column B will show the Sum of each of those numbers i.e the value of the
number * the number of occurrences
Column C will show the Count of those numbers i.e the Frequency with
which they occur in the raw data.

In cell E5, enter the following formula
=(C5-1)*A5
and copy down to E33
Enter in cell E34
=SUM(E5:E33)
and what do you see?
Your "missing" 403 which is the sum of the number of times each number
appeared more than once * that number.

As per our original discussion, there is nothing preventing you from
taking a field which contains the values (quantities) from a data table
to both the Row and Data area of a PT, but unless you were wanting to
create a Frequency Distribution, I cannot see why you would do so.

No matter, the PT does it all correctly and does not "lose" any data!!!
Roger, do you still want to bet with me? <bg>
Yep <vbg>
--
Regards

Roger Govier


I am not sure if my question/concern is *TRULY* understood.

I understand that 3 showing up twice is equivalent to 6 showing up once
i.e. 3+3 = 6. If the units are summarized properly, the total will be
the same. This is only true when the units field is dragged to the data
area. However, when the units field is dragged to the row area 403
units are missing from the total in the row area.

The total units from the source data (column E) is 1208. The total
units in the data area is also 1208. However, the total in the row area
is 805. In other words, 403 units are not reported. My original
question is why aren't the 403 units reported?

If you look at the source data, you will see 49 four times for
*different* items, orders etc. 49*4=196 I do see 196 in the data
area, but I only see 49 once in the row area. As Debra explained, in
the row area, only unique value is picked up. In other words, even
though 49 shows up four times in the source, only one 49 will be picked
up. As a result, the total in the row area is distorted and units are
missed. I am not sure why PT does this in the row area but not in the
data area.

Roger, my very first statement under another thread is the following:
If I just drag qty to the row area and nothing else, I lose a couple of
records.

My experiment has proven my point. However, I am not sure if I am
understood.

Epinn

The units in the Row area represent the unique unit entries in the
source data. If you sort the source table by Units, you'll see two
entries where 3 units were ordered. Those two orders are summarized in
the 3 row of the pivot table, for a total of 6 units.

Other unit rows, e.g. 9 and 28, also summarize multiple orders in the
data area.
 
E

Epinn

Roger,

Thank you for being patient with me. I will digest your post some more later on. Now, I feel my question is answered. The following phrase is what I have been looking for.

"They are LABELS ......"

I think we both win the bet. :) Why? Our definition of "lost records/data" is different. When the values are not picked up and I don't see them on the table, I define this as "lost ....." (meaning lost from the report) even though I know they are still there in the database. For you, if it is not reported, it is not lost because it is still in the database.

I promise you I won't drag data fields to the row area. Now, that's a true win-win situation. Once again, thank you for putting up with me. You are one of the most patient people who crosses my path.

I should thank Debra for her help as well. Yes, when it comes to PT, Debra is the guru. I appreciate her writeups on PT and I constantly check them out.

Epinn

Roger Govier said:
I am not sure if my question/concern is *TRULY* understood.
It is.
Roger, my very first statement under another thread is the following:
If I just drag qty to the row area and nothing else, I lose a couple of
records.
My experiment has proven my point.
It hasn't.

The Qty field in this PT is actually called Units.
As per our original discussion, when dragging Units to the Row area,
each occurrence of Units will show quite correctly just once. They are
LABELS in this scenario, not DATA.
You are trying to add Labels (which happen to be numeric and therefore
will sum) and expecting the result to be the same as adding the Data,
which includes the Frequency with which those Labels appear.
Adding Labels, is like adding Apples and Pears - the result is
meaningless.

Try dragging Units to the Data area a second time, but use Count of
Units rather than Sum of Units
Drag the Data field heading to the Total, and you will have the Sum and
Count side by side

Column A will now show 29 different numbers starting at 3 and rising to
50. Not every number in the series is present.
Column B will show the Sum of each of those numbers i.e the value of the
number * the number of occurrences
Column C will show the Count of those numbers i.e the Frequency with
which they occur in the raw data.

In cell E5, enter the following formula
=(C5-1)*A5
and copy down to E33
Enter in cell E34
=SUM(E5:E33)
and what do you see?
Your "missing" 403 which is the sum of the number of times each number
appeared more than once * that number.

As per our original discussion, there is nothing preventing you from
taking a field which contains the values (quantities) from a data table
to both the Row and Data area of a PT, but unless you were wanting to
create a Frequency Distribution, I cannot see why you would do so.

No matter, the PT does it all correctly and does not "lose" any data!!!
Roger, do you still want to bet with me? <bg>
Yep <vbg>
--
Regards

Roger Govier


I am not sure if my question/concern is *TRULY* understood.

I understand that 3 showing up twice is equivalent to 6 showing up once
i.e. 3+3 = 6. If the units are summarized properly, the total will be
the same. This is only true when the units field is dragged to the data
area. However, when the units field is dragged to the row area 403
units are missing from the total in the row area.

The total units from the source data (column E) is 1208. The total
units in the data area is also 1208. However, the total in the row area
is 805. In other words, 403 units are not reported. My original
question is why aren't the 403 units reported?

If you look at the source data, you will see 49 four times for
*different* items, orders etc. 49*4=196 I do see 196 in the data
area, but I only see 49 once in the row area. As Debra explained, in
the row area, only unique value is picked up. In other words, even
though 49 shows up four times in the source, only one 49 will be picked
up. As a result, the total in the row area is distorted and units are
missed. I am not sure why PT does this in the row area but not in the
data area.

Roger, my very first statement under another thread is the following:
If I just drag qty to the row area and nothing else, I lose a couple of
records.

My experiment has proven my point. However, I am not sure if I am
understood.

Epinn

The units in the Row area represent the unique unit entries in the
source data. If you sort the source table by Units, you'll see two
entries where 3 units were ordered. Those two orders are summarized in
the 3 row of the pivot table, for a total of 6 units.

Other unit rows, e.g. 9 and 28, also summarize multiple orders in the
data area.
 
R

Roger Govier

Epinn
I think we both win the bet. :)
Whilst I would say I am normally a gracious loser, I cannot accept a
draw.

By any definition, nothing is "lost".
If the PT included every instance of the value, you would be looking at
exactly the same information as the raw data table. Where is the point
in that?
A Report is a summary of the data, otherwise, don't bother with reports,
just look at the complete data table.
So please, stop referring to anything as lost.

The beers are very definitely on you <VBG>

--
Regards

Roger Govier


Roger,

Thank you for being patient with me. I will digest your post some more
later on. Now, I feel my question is answered. The following phrase is
what I have been looking for.

"They are LABELS ......"

I think we both win the bet. :) Why? Our definition of "lost
records/data" is different. When the values are not picked up and I
don't see them on the table, I define this as "lost ....." (meaning lost
from the report) even though I know they are still there in the
database. For you, if it is not reported, it is not lost because it is
still in the database.

I promise you I won't drag data fields to the row area. Now, that's a
true win-win situation. Once again, thank you for putting up with me.
You are one of the most patient people who crosses my path.

I should thank Debra for her help as well. Yes, when it comes to PT,
Debra is the guru. I appreciate her writeups on PT and I constantly
check them out.

Epinn

Epinn
I am not sure if my question/concern is *TRULY* understood.
It is.
Roger, my very first statement under another thread is the following:
If I just drag qty to the row area and nothing else, I lose a couple of
records.
My experiment has proven my point.
It hasn't.

The Qty field in this PT is actually called Units.
As per our original discussion, when dragging Units to the Row area,
each occurrence of Units will show quite correctly just once. They are
LABELS in this scenario, not DATA.
You are trying to add Labels (which happen to be numeric and therefore
will sum) and expecting the result to be the same as adding the Data,
which includes the Frequency with which those Labels appear.
Adding Labels, is like adding Apples and Pears - the result is
meaningless.

Try dragging Units to the Data area a second time, but use Count of
Units rather than Sum of Units
Drag the Data field heading to the Total, and you will have the Sum and
Count side by side

Column A will now show 29 different numbers starting at 3 and rising to
50. Not every number in the series is present.
Column B will show the Sum of each of those numbers i.e the value of the
number * the number of occurrences
Column C will show the Count of those numbers i.e the Frequency with
which they occur in the raw data.

In cell E5, enter the following formula
=(C5-1)*A5
and copy down to E33
Enter in cell E34
=SUM(E5:E33)
and what do you see?
Your "missing" 403 which is the sum of the number of times each number
appeared more than once * that number.

As per our original discussion, there is nothing preventing you from
taking a field which contains the values (quantities) from a data table
to both the Row and Data area of a PT, but unless you were wanting to
create a Frequency Distribution, I cannot see why you would do so.

No matter, the PT does it all correctly and does not "lose" any data!!!
Roger, do you still want to bet with me? <bg>
Yep <vbg>
--
Regards

Roger Govier


I am not sure if my question/concern is *TRULY* understood.

I understand that 3 showing up twice is equivalent to 6 showing up once
i.e. 3+3 = 6. If the units are summarized properly, the total will be
the same. This is only true when the units field is dragged to the data
area. However, when the units field is dragged to the row area 403
units are missing from the total in the row area.

The total units from the source data (column E) is 1208. The total
units in the data area is also 1208. However, the total in the row area
is 805. In other words, 403 units are not reported. My original
question is why aren't the 403 units reported?

If you look at the source data, you will see 49 four times for
*different* items, orders etc. 49*4=196 I do see 196 in the data
area, but I only see 49 once in the row area. As Debra explained, in
the row area, only unique value is picked up. In other words, even
though 49 shows up four times in the source, only one 49 will be picked
up. As a result, the total in the row area is distorted and units are
missed. I am not sure why PT does this in the row area but not in the
data area.

Roger, my very first statement under another thread is the following:
If I just drag qty to the row area and nothing else, I lose a couple of
records.

My experiment has proven my point. However, I am not sure if I am
understood.

Epinn

The units in the Row area represent the unique unit entries in the
source data. If you sort the source table by Units, you'll see two
entries where 3 units were ordered. Those two orders are summarized in
the 3 row of the pivot table, for a total of 6 units.

Other unit rows, e.g. 9 and 28, also summarize multiple orders in the
data area.
 
E

Epinn

The beers are very definitely on you <VBG>

Okay! If I put on my IT hat, I am on your side and won't argue. Are you close to London? I may be able to get you some coupons .......

Epinn

I think we both win the bet. :)
Whilst I would say I am normally a gracious loser, I cannot accept a
draw.

By any definition, nothing is "lost".
If the PT included every instance of the value, you would be looking at
exactly the same information as the raw data table. Where is the point
in that?
A Report is a summary of the data, otherwise, don't bother with reports,
just look at the complete data table.
So please, stop referring to anything as lost.

The beers are very definitely on you <VBG>

--
Regards

Roger Govier


Roger,

Thank you for being patient with me. I will digest your post some more
later on. Now, I feel my question is answered. The following phrase is
what I have been looking for.

"They are LABELS ......"

I think we both win the bet. :) Why? Our definition of "lost
records/data" is different. When the values are not picked up and I
don't see them on the table, I define this as "lost ....." (meaning lost
from the report) even though I know they are still there in the
database. For you, if it is not reported, it is not lost because it is
still in the database.

I promise you I won't drag data fields to the row area. Now, that's a
true win-win situation. Once again, thank you for putting up with me.
You are one of the most patient people who crosses my path.

I should thank Debra for her help as well. Yes, when it comes to PT,
Debra is the guru. I appreciate her writeups on PT and I constantly
check them out.

Epinn

Epinn
I am not sure if my question/concern is *TRULY* understood.
It is.
Roger, my very first statement under another thread is the following:
If I just drag qty to the row area and nothing else, I lose a couple of
records.
My experiment has proven my point.
It hasn't.

The Qty field in this PT is actually called Units.
As per our original discussion, when dragging Units to the Row area,
each occurrence of Units will show quite correctly just once. They are
LABELS in this scenario, not DATA.
You are trying to add Labels (which happen to be numeric and therefore
will sum) and expecting the result to be the same as adding the Data,
which includes the Frequency with which those Labels appear.
Adding Labels, is like adding Apples and Pears - the result is
meaningless.

Try dragging Units to the Data area a second time, but use Count of
Units rather than Sum of Units
Drag the Data field heading to the Total, and you will have the Sum and
Count side by side

Column A will now show 29 different numbers starting at 3 and rising to
50. Not every number in the series is present.
Column B will show the Sum of each of those numbers i.e the value of the
number * the number of occurrences
Column C will show the Count of those numbers i.e the Frequency with
which they occur in the raw data.

In cell E5, enter the following formula
=(C5-1)*A5
and copy down to E33
Enter in cell E34
=SUM(E5:E33)
and what do you see?
Your "missing" 403 which is the sum of the number of times each number
appeared more than once * that number.

As per our original discussion, there is nothing preventing you from
taking a field which contains the values (quantities) from a data table
to both the Row and Data area of a PT, but unless you were wanting to
create a Frequency Distribution, I cannot see why you would do so.

No matter, the PT does it all correctly and does not "lose" any data!!!
Roger, do you still want to bet with me? <bg>
Yep <vbg>
--
Regards

Roger Govier


I am not sure if my question/concern is *TRULY* understood.

I understand that 3 showing up twice is equivalent to 6 showing up once
i.e. 3+3 = 6. If the units are summarized properly, the total will be
the same. This is only true when the units field is dragged to the data
area. However, when the units field is dragged to the row area 403
units are missing from the total in the row area.

The total units from the source data (column E) is 1208. The total
units in the data area is also 1208. However, the total in the row area
is 805. In other words, 403 units are not reported. My original
question is why aren't the 403 units reported?

If you look at the source data, you will see 49 four times for
*different* items, orders etc. 49*4=196 I do see 196 in the data
area, but I only see 49 once in the row area. As Debra explained, in
the row area, only unique value is picked up. In other words, even
though 49 shows up four times in the source, only one 49 will be picked
up. As a result, the total in the row area is distorted and units are
missed. I am not sure why PT does this in the row area but not in the
data area.

Roger, my very first statement under another thread is the following:
If I just drag qty to the row area and nothing else, I lose a couple of
records.

My experiment has proven my point. However, I am not sure if I am
understood.

Epinn

The units in the Row area represent the unique unit entries in the
source data. If you sort the source table by Units, you'll see two
entries where 3 units were ordered. Those two orders are summarized in
the 3 row of the pivot table, for a total of 6 units.

Other unit rows, e.g. 9 and 28, also summarize multiple orders in the
data area.
 
E

Epinn

Roger,

I forgot to say that you are demonstrative on top of being patient.

You wrote: ......unless you were wanting to create a Frequency Distribution......
I wrote: ......I promise you I won't drag data fields to the row area.......

Sorry, I have changed my mind. I am inspired by this experiment; and I have this idea of using PivotTable to check for duplicate numbers.

Using just column A (row area) and B (data area), I can tell if there are duplicates. Throwing in column C (data area), the counts can highlight the duplicates.

I read a user's request http://tinyurl.com/ycflgv. If I understand it correctly, there is a need to check for duplicates in non-contiguous ranges as one big picture i.e. a1:a50, c1:c50, e1:e50, ......k1:k50.

If I can find a way to make the six columns into one long column, then I can drag the number field (for the long column) into row area and data area respectively. If the total for the column in the row area is the same as that of the data area, then there are no duplicates. Do a count in the data area will pinpoint the duplicated number(s). I know this is not efficient but I am trying to apply my new idea to a real life example. This way I don't need any formula for the non-contiguous ranges.

Now, what is an easy way to append the six columns into one long column? ;) Anything better than copy and paste?

What do you think about the idea of using PT to check for duplicates?

Thanks for reading.

Epinn

Roger Govier said:
I am not sure if my question/concern is *TRULY* understood.
It is.
Roger, my very first statement under another thread is the following:
If I just drag qty to the row area and nothing else, I lose a couple of
records.
My experiment has proven my point.
It hasn't.

The Qty field in this PT is actually called Units.
As per our original discussion, when dragging Units to the Row area,
each occurrence of Units will show quite correctly just once. They are
LABELS in this scenario, not DATA.
You are trying to add Labels (which happen to be numeric and therefore
will sum) and expecting the result to be the same as adding the Data,
which includes the Frequency with which those Labels appear.
Adding Labels, is like adding Apples and Pears - the result is
meaningless.

Try dragging Units to the Data area a second time, but use Count of
Units rather than Sum of Units
Drag the Data field heading to the Total, and you will have the Sum and
Count side by side

Column A will now show 29 different numbers starting at 3 and rising to
50. Not every number in the series is present.
Column B will show the Sum of each of those numbers i.e the value of the
number * the number of occurrences
Column C will show the Count of those numbers i.e the Frequency with
which they occur in the raw data.

In cell E5, enter the following formula
=(C5-1)*A5
and copy down to E33
Enter in cell E34
=SUM(E5:E33)
and what do you see?
Your "missing" 403 which is the sum of the number of times each number
appeared more than once * that number.

As per our original discussion, there is nothing preventing you from
taking a field which contains the values (quantities) from a data table
to both the Row and Data area of a PT, but unless you were wanting to
create a Frequency Distribution, I cannot see why you would do so.

No matter, the PT does it all correctly and does not "lose" any data!!!
Roger, do you still want to bet with me? <bg>
Yep <vbg>
--
Regards

Roger Govier


I am not sure if my question/concern is *TRULY* understood.

I understand that 3 showing up twice is equivalent to 6 showing up once
i.e. 3+3 = 6. If the units are summarized properly, the total will be
the same. This is only true when the units field is dragged to the data
area. However, when the units field is dragged to the row area 403
units are missing from the total in the row area.

The total units from the source data (column E) is 1208. The total
units in the data area is also 1208. However, the total in the row area
is 805. In other words, 403 units are not reported. My original
question is why aren't the 403 units reported?

If you look at the source data, you will see 49 four times for
*different* items, orders etc. 49*4=196 I do see 196 in the data
area, but I only see 49 once in the row area. As Debra explained, in
the row area, only unique value is picked up. In other words, even
though 49 shows up four times in the source, only one 49 will be picked
up. As a result, the total in the row area is distorted and units are
missed. I am not sure why PT does this in the row area but not in the
data area.

Roger, my very first statement under another thread is the following:
If I just drag qty to the row area and nothing else, I lose a couple of
records.

My experiment has proven my point. However, I am not sure if I am
understood.

Epinn

The units in the Row area represent the unique unit entries in the
source data. If you sort the source table by Units, you'll see two
entries where 3 units were ordered. Those two orders are summarized in
the 3 row of the pivot table, for a total of 6 units.

Other unit rows, e.g. 9 and 28, also summarize multiple orders in the
data area.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top