Compare "like neighbourhoods"? Think outside the box?

H

HotRod

I have a really intersting request here and I'm wondering if there is an
excel answer before I go for a VBA one. I have a list of 100+ locations and
12 data variables that define demographic information. Is there a way to
sort the towns or query them so that I can find a town that most closely
resembles town #1 or has the closest maching variables?

e.g.
35320085 0.80 0.00 0.00 12.00 0.02 0.02 9.00 0.02 0.02 9.00 -0.15 0.15
8.00
35320087 0.82 -0.02 0.02 10.00 0.00 0.00 12.00 0.00 0.00 11.00 -0.17
0.17 6.00
35320088 0.82 -0.02 0.02 11.00 0.00 0.00 11.00 0.00 0.00 12.00 -0.17
0.17 7.00
35320090 0.65 0.15 0.15 5.00 0.17 0.17 5.00 0.17 0.17 5.00 0.00 0.00
12.00
 
B

Bernard Liengme

I would take each town and for every other town commute
SUM(ABS(diff_in_parm)) for all parms. Then for find which other town has the
lowest value for this statistic.
Unfortunately you would have 100! pairs to look at. This is about 10^158
which is possibly more than all the grains of sand on earth. If you compared
1 million in a second it would take about 8 universe-life-times.
Happy hunting and best wishes
 
R

Roger Govier

Hi

If you have a row above your data, you could, mark the row and apply
Data>Filter>Autofilter

Using the dropdowns, go to each variable in order of importance for your
selection, and select the same value as for town 1.
You will gradually filter down through the list until you get to none
selected at all. If so, go back 1 selection.
Rather than selecting the value for town 1 for any variable, you could
select Custom and give values Greater than or Equal and Less then or
equal values which are slightly below and slightly greater than the
target town's values.
 
H

HotRod

I was afraid these would be the answers. I'm trying to work with the user to
decide if all of the variables are weight equal or if some of them are more
important? I'm hoping that may help me assign a score to the towns.

When I remember my other question I'll need to ask that too...
 
H

HotRod

This is turning out to be a lot harder than I expected. Time for me and the
user to have a talk. I think I may need to determine a single score for each
town and then assign it a colour on the map so that user can visualize the
relationship. Thanks for the ideas though.
 
H

Harlan Grove

Bernard Liengme wrote...
I would take each town and for every other town commute
SUM(ABS(diff_in_parm)) for all parms. Then for find which other town has the
lowest value for this statistic.
Unfortunately you would have 100! pairs to look at. This is about 10^158
which is possibly more than all the grains of sand on earth. If you compared
1 million in a second it would take about 8 universe-life-times.
....

So don't compare them pairwise. If the 12 variables were denoted by
letters A to L, and the order of importance were G, E, C, A, B, D, F,
H, J, L, K, I, create a sort key by translating each variable to a
numeric 0-1 scale (e.g., x' = (x-MIN(xRng))/(MAX(xRng)-MIN(xRng))),
then concatenate their text representations as

=TEXT(g',"0.000")&TEXT(e',"0.000")&...&TEXT(i',"0.000")

This assumes that differences in variables are hierarchical, so that 2
records would be close if their G varaibles were close even if all
their other varaibles were far apart.

As for the OP's data, the OP is looking for the town closest to Town1,
and if there were N towns in the list, that requires only N-1
comparisons, and the entire list could be sorted by closeness to Town1.
And if the varaibles for Town1 and Town99 were in C2:N2 and C100:N100,
respectively, it may be more appropriate to use

=SUMXMY2(C$2:N$2,C$99:N$99)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Similar Threads


Top