ALTERNATIVE TO SUMPRODUCT NEEDED

Rog

I need to use two columns and search for two things and count how often the
two occur together. In col X will occur the word "warranty" and in column AD
will occur, for example, the word "switch". The problem is that the word
"switch" will occur in a paragraph, so I need a wildcard and SUMPRODUCT does
not support this. I looked at the solutions provided for this, but none will
find the word "switch" in a paragraph. Is there another function or set of
functions that will do this? DCOUNT will not work because its criteria
cannot be set to words with a wildcard character. Thanks in advance!! Needed
asap by the way. I have 40,000 records to search!
 
CLR

Take a look at the Autofilter. You can select "warranty" in the one
column, and select "custom, contains switch" in the other. If you
then need to count the filtered rows, you could use the SUBTOTAL formula.

hth
Vaya con Dios,
Chuck, CABGx3
 
JE McGimpsey

One way:

=SUMPRODUCT(--(X1:X1000="warranty"),
--(ISNUMBER(SEARCH("switch",AD1:AD1000))))
 
Harlan Grove

Rog said:
I need to use two columns and search for two things and count
how often the two occur together. In col X will occur the word
"warranty" and in column AD will occur, for example the word
"switch". The problem is that the word switch will occur in a
paragraph, so I need a wild card and SUMPRODUCT does not support
this. . . .

FWIW, only SUMIF, COUNTIF, SEARCH, MATCH and {V|H}LOOKUP support
wildcards, and the last 3 only for exact matching. However, if you're
looking for a particular word that would be separated from other text
by spaces, you don't need wildcards.

=SUMPRODUCT(COUNTIF(Range,{"test *","* test *","* test"}))

and

=SUMPRODUCT(--ISNUMBER(SEARCH(" test "," "&Range&" ")))

return the same result. The array argument to COUNTIF in the first
formula is necessary to capture "test" appearing at the start or end
of each cell value in Range as well as appearing in the middle of the
string. Eliminating the spaces would mean you could match "test" as a
substring of other words, e.g., "detested". So for more rigorous
matching, SUMPRODUCT/ISNUMBER/SEARCH is actually simpler to use. And
as an added bonus, SEARCH allows you to use wildcards if you have to.
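
The difference between plain substring matching and space-delimited matching can be sketched outside Excel. The following is a minimal Python model, not an Excel formula: the function names and the sample cells are hypothetical, and it assumes each value is padded with a space on both sides so that a word at the very start or end of a cell is still caught.

```python
def count_substring(values, word):
    """Count cells containing `word` anywhere, ignoring case (as SEARCH does)."""
    return sum(word.lower() in v.lower() for v in values)


def count_whole_word(values, word):
    """Count cells containing `word` delimited by spaces or the cell edges."""
    needle = f" {word.lower()} "
    # Padding the cell text lets a word at the start or end still match.
    return sum(needle in f" {v.lower()} " for v in values)


# Hypothetical sample cells, not taken from the thread.
cells = ["test run", "a test here", "final test", "I detested it", "no match"]

print(count_substring(cells, "test"))   # 4 ("detested" is counted)
print(count_whole_word(cells, "test"))  # 3 ("detested" is skipped)
```

Note that this model treats only spaces as word separators, so "test." or "test," would not count as a whole word.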
 
Peo Sjoblom

If you have 40000 records to search then I doubt SUMPRODUCT is the right
tool, but you can use it to find strings that are part of other strings, a de
facto wildcard:

=SUMPRODUCT(--(ISNUMBER(SEARCH("warranty",A2:A40000))),--(AD2:AD40000="Switch"))
 
Rog

Thanks, but this value needs to go into a report and this approach is
cumbersome to do that. I also will need to do this for many other
combinations and that would take some time.

Is there anything else out there? This can't be THAT hard for MS!!

Thanks very much, though!
 
Peo Sjoblom

The easiest way would be to use a filter and VBA; it is much more cumbersome
to use formulas on 40000 rows of data.
 
Rog

I think you're missing the point here. I need to count the rows that have
both "warranty" and "switch".
col X       col AD
warranty    the switch broke
customer    light
warranty    the valve is bad
customer    the switch failed
warranty    it was the switch that broke

The result desired here is 2 because two rows have "warranty" AND the word
"switch". Hope that clarifies it and thanks again!
 
Harlan Grove

Rog said:
I think you're missing the point here.
....

No, Peo only got the string order wrong. Change his formula to

=SUMPRODUCT(--(X2:X40000="warranty"),
--ISNUMBER(SEARCH("switch",AD2:AD40000)))

and it will produce the result you claim to be seeking. The
ISNUMBER(SEARCH(..)) idiom is the STANDARD approach to indicating
whether a substring exists in a longer string, though, FTHOI, this
could also be done with (SUBSTITUTE(string,substring,"")=string) less
efficiently (sometimes only one level of function calls is necessary).
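
For readers who want to check the logic outside Excel, here is a minimal Python sketch of what the two-condition SUMPRODUCT counts. It is a model of the formula, not Excel itself; the function name `count_both` is hypothetical, and the rows are the sample rows from Rog's post.

```python
# Sample rows from the thread: (col X, col AD).
rows = [
    ("warranty", "the switch broke"),
    ("customer", "light"),
    ("warranty", "the valve is bad"),
    ("customer", "the switch failed"),
    ("warranty", "it was the switch that broke"),
]


def count_both(rows, category, keyword):
    """Count rows whose first column equals `category` and whose second
    column contains `keyword` anywhere, both compared without regard to case."""
    return sum(
        cat.lower() == category.lower() and keyword.lower() in text.lower()
        for cat, text in rows
    )


print(count_both(rows, "warranty", "switch"))  # 2
```

The first test corresponds to the equality comparison on col X and the second to the ISNUMBER(SEARCH(..)) idiom on col AD; a row is counted only when both are true.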
 
Rog

WOW! Thanks to you all!!! I have not completely tested it, but so far it
seems to work beautifully!
You were right... there was no misunderstanding!
Thank you so much!!
Roger
 
Alan Beban

Rog said:
I think you're missing the point here. I need to count the rows that have
both "warranty" and "switch".
col X       col AD
warranty    the switch broke
customer    light
warranty    the valve is bad
customer    the switch failed
warranty    it was the switch that broke

The result desired here is 2 because two rows have "warranty" AND the word
"switch". Hope that clarifies it and thanks again!

If the functions in the freely downloadable file at
http://home.pacbell.net/beban are available to your workbook, you might
consider something like the following:

Assuming your lookup values are in X1:X5, Array enter into AE1:AE5

=SEARCH("switch",VLOOKUPs("warranty",X1:AD5,7),1)

and enter into AF1

=COUNTIF(AE1:AE5,"<>#VALUE!")

The result should be in AF1

Alan Beban
 
Harlan Grove

Alan Beban said:
If the functions in the freely downloadable file at
http://home.pacbell.net/beban are available to your workbook,
you might consider something like the following:

Assuming your lookup values are in X1:X5, Array enter into AE1:AE5
=SEARCH("switch",VLOOKUPs("warranty",X1:AD5,7),1)
and enter into AF1
=COUNTIF(AE1:AE5,"<>#VALUE!")

The result should be in AF1

The OP did mention that his data spans nearly 40K rows. The good news
is that there'd be only one udf VLOOKUPS call, and since its result
would presumably have far fewer than 40K entries, there'd effectively
be fewer SEARCH calls.

But why bother with entering an array formula in AE1:AE#? The SEARCH
will return an array of numbers or error values. All that'd be needed
is the SINGLE array formula

=COUNT(SEARCH("switch",VLOOKUPS("warranty",X1:AD#,7)))

More efficient array formulas could be used that don't require udfs.

=COUNT(IF(X1:X#="warranty",SEARCH("switch",AD1:AD#)))

This does less work than the VLOOKUPS formula and runs much more quickly,
since it avoids the Excel/VBA interface.

Note: replace # with the actual ranges' bottom row number.
 
Rog

WAIT! THERE'S A PROBLEM! Why will it not "see" additions to the database? If
I add the word "switch" to one of the records it will not update to show the
count plus one. Please advise. What is happening here? I have put the word at
the beginning of the record and in the middle of it. I am using "*" before and
after the word. Is it because I have 40k records to update?

Thanks
 
Peo Sjoblom

Do you have your calculation set to automatic under
Tools > Options > Calculation?
What is the formula you are using?
Be aware that any array formula will be slow calculating 40000 rows.
 
Rog

Here is the formula. I do have the auto calc set. I tried going manual and
using F9, but that didn't change it either.

=SUMPRODUCT(--(Portfolio_Review!$X$2:$X$45001="warranty*"),--ISNUMBER(SEARCH("BELLOW*",Portfolio_Review!$AD$2:$AD$45001)))

Thanks.
 
Rog

Correction: "warranty" does not have the "*". BTW, "BELLOW*" is supposed to
pick up all forms of "BELLLOWS" (Sometimes they spell it wrong) but count it
only once in any record.

=SUMPRODUCT(--(Portfolio_Review!$X$2:$X$45001="warranty"),--ISNUMBER(SEARCH("BELLOW*",Portfolio_Review!$AD$2:$AD$45001)))
 
Harlan Grove

Rog said:
Here is the formula. I do have the auto calc set. I tried going manual
and using F9, but that didn't change it either.

=SUMPRODUCT(--(Portfolio_Review!$X$2:$X$45001="warranty*"),
--ISNUMBER(SEARCH("BELLOW*",Portfolio_Review!$AD$2:$AD$45001)))
....

No, you can't use wildcards in simple equality tests - "warranty*" would
only match cells whose entire contents are "warranty" immediately followed
by a literal asterisk. If you want to match "warranty" at the beginning of
the col X cells, only check the first 8 chars of each of those cells. And the
SEARCH string "BELLOW*" could be replaced with "BELLOW" because there the *
is superfluous.

=SUMPRODUCT(--(LEFT(Portfolio_Review!$X$2:$X$45001,8)="warranty"),
--ISNUMBER(SEARCH("BELLOW",Portfolio_Review!$AD$2:$AD$45001)))
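
The contrast between a whole-cell equality test and a test on the leading characters can be sketched in Python. This is only a model of the two comparisons; the helper names and the sample cells are hypothetical.

```python
def equals(cell, target):
    """Model of a text equality test: whole cell, ignoring case, no wildcards."""
    return cell.lower() == target.lower()


def left_equals(cell, target):
    """Model of LEFT(cell, LEN(target)) = target: compare only the leading chars."""
    return cell[:len(target)].lower() == target.lower()


# Hypothetical col X values.
cells = ["warranty", "Warranty claim", "warranty*", "out of warranty"]

# The asterisk is treated as a literal character, so only "warranty*" matches.
print([equals(c, "warranty*") for c in cells])      # [False, False, True, False]

# Checking the first 8 characters matches every cell that starts with "warranty".
print([left_equals(c, "warranty") for c in cells])  # [True, True, True, False]
```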
 
Harlan Grove

Rog said:
Correction: "warranty" does not have the "*". BTW, "BELLOW*" is supposed
to pick up all forms of "BELLLOWS" (Sometimes they spell it wrong) but
count it only once in any record.
....

Is it BELLLOW, BELLOW or BELOW? Doesn't really matter. The * is superfluous.
SEARCH("xyz*",Range) and SEARCH("xyz",Range) always return the same result.
Wildcards in SEARCH are only useful between literal text, e.g.,

SEARCH("a*z",Range)

which could match the alphabet, "Anzania", "a long time ago in Zimbabwe",
but not "Zounds! Another 'a'!".
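
The wildcard behaviour described here can be approximated with a regular expression. The sketch below is a rough Python model, not Excel's implementation: `excel_search` is a hypothetical name, it handles only `*` and `?`, and it ignores the `~` escape character.

```python
import re


def excel_search(pattern, text):
    """Return the 1-based position of the first match, or None if there is none.

    `*` stands for any run of characters and `?` for any single character;
    everything else is matched literally, ignoring case.
    """
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    match = re.search(regex, text, re.IGNORECASE | re.DOTALL)
    return match.start() + 1 if match else None


# A trailing * changes nothing: the match starts in the same place.
print(excel_search("xyz*", "abcxyzdef"), excel_search("xyz", "abcxyzdef"))  # 4 4

# A * between literals needs an "a" followed somewhere later by a "z".
print(excel_search("a*z", "a long time ago in Zimbabwe"))  # 1
print(excel_search("a*z", "Zounds! Another 'a'!"))         # None
```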

Anyway, there's no hope for matching misspelled words unless you use
approximate patterns that could match a lot of other text or unless you test
all the allowed misspellings, e.g., test BELOW, BEELOW, BELLOW, BELLLOW,
etc. Approximate text matching requires VBA/udfs.
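
Testing a fixed list of allowed spellings, and counting each record at most once, can be sketched as follows. This is a Python model rather than an Excel formula; the variant list comes from the examples named above, while `mentions_any` and the sample records are hypothetical.

```python
# Spellings to accept, taken from the examples in the post.
VARIANTS = ("BELOW", "BEELOW", "BELLOW", "BELLLOW")


def mentions_any(text, variants=VARIANTS):
    """True if any variant occurs in `text`, ignoring case.

    A record counts once no matter how many variants or repeats it contains.
    """
    upper = text.upper()
    return any(v in upper for v in variants)


# Hypothetical records: (col X, col AD).
records = [
    ("warranty", "bellows cracked, second bellow replaced"),
    ("warranty", "belllows leaking"),
    ("customer", "bellows worn"),
    ("warranty", "valve below spec"),
    ("warranty", "switch failed"),
]

count = sum(
    cat.lower() == "warranty" and mentions_any(text) for cat, text in records
)
print(count)  # 3
```

The fourth record is counted only because "BELOW" is in the variant list, which shows how a loose set of spellings can pull in unrelated text.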
 
