SumProduct, Match in a UDF

B

BillW

SumProduct & Match in a UDF

I have a w-sheet of queried data that will max at about 60 columns and
27,000 rows. Am experimenting w/ SumProduct formulas as a means to sum data
based on multiple criteria, but am having trouble translating them into VBA.
(Want to go VBA route to simplify for users). My questions:
1) Have read in the postings that SumProduct may be a memory hog, end up
being slow for users. Is that true?? If yes, blow the below questions off,
and kindly suggest a different VBA solution than below, pls. If the answer
is no, this is what I’m up against:
2) Here’s an example of a worksheet formula I’m currently (successfully)
using, but am having trouble getting into VBA
=SUMPRODUCT(--(ISNUMBER(MATCH(Org,zh6A,0))),--(ISNUMBER(MATCH(Acct,SelectedAccts,0))),Actual12)
Org = named range of approx 27,000 rows, 1 column on the main data sheet
talked about above. Data can be either numeric or string, includes
duplicates.
zh6A = named range of approx 200 rows, 1 column on a separate sheet.
Numeric or string data, no duplicates.
Acct = another named range of approx 27,000 rows, 1 column on the main data
sheet. Data can be either numeric or string, includes duplicates.
SelectedAccts = named range of approx 50 rows, 1 column on a separate sheet.
Numeric or string data, no duplicates.
Actual12 = another named range of approx 27,000 rows, 1 column on the main
data sheet. Data is numeric, no dups.
3) Here’s the UDF I’ve come up with so far (returns the #Value error):
Function ARES(ROrgs, RAccts, DataType_Mo)
Dim Org
Org = Worksheets("Data").Range("Org")
Dim Acct
Acct = Worksheets("Data").Range("Acct")
ARES = Evaluate( "SumProduct(--(IsNumber(Match(" & Org & "," & ROrgs & _
",0))),--(IsNumber(Match(" & Acct & "," & RAccts & ",0)))," & DataType_Mo &
")â€)

My plan is for the users to be able to provide named ranges of data for the
UDF arguments (e.g., the zh6A named range for the ROrgs argument), and then
the Org and Acct variables are named ranges that will already be defined in
the w-book and always used in the UDF. Thanks much for anyone’s help.
 
B

Bob Phillips

SUMPRODUCT, like array formulae, can be quite resource hungry when you have
large ranges of data, and multiple instances of the formula.

However, embedding them as evaluated statements in a UDF is adding two extra
layers of work, which can only slow them down even more.

--
---
HTH

Bob

(there's no email, no snail mail, but somewhere should be gmail in my addy)
 
C

Charles Williams

And your UDF wont work properly if you refer to ranges which are not in the
UDF parameter list.

A UDF is usually faster than SUMPRODUCT/Array formulae if you get all the
data from ranges into variant arrays using
vArr=Rng
and then process the data in the arrays.

Charles
______________________
Decision Models
FastExcel 2.3 now available
Name Manager 4.0 now available
www.DecisionModels.com
 
B

BillW

Thanks for your input, appreciate it. Was glad to find SumProduct as
potential answer, but need to keep things skinny cuz of potential size of
data.
I definitely need help w/ arrays & VBA, just learning. How do I accomplish
the below?
Main Datasheet consists of
A B C D
1 10000 3000 10 20
2 10000 9EXP 20 30
3 12510A 10000 30 40
4 12510 3000 40 50
A1:A4 is named range Org. B1:B4 = named range Acct. C1:C4 = named range
Actl11 and D1:D4 = named range Bgt11, both columns with data to be summed.
All these named ranges will always be available for formula, VBA purposes.
Users via dialog boxes will (a) provide a named range called ROrgs with
comparison criteria to match in Org, (b) provide a named range called RAccts
with comparison criteria to match in Acct, and (c) designate which column,
Actl11 or Bgt11, to sum records for. So say ROrgs has data of 10000 and
12510A in it, RAccts has 3000 and 10000 in it, and they want Bgt11 summed.
Answer is 60, but not sure how to get there from here using arrays in VBA.
(This is much-simplified example, actually have 5 different criteria columns
so far, with the different combinations increasing the decision points
exponentially, but will nest things if you can get me started here.)
Thanks again for all help.
 
C

Charles Williams

Sounds like you should be using a proper database rather than Excel (or
maybe SUMIFS in Excel 2007)

anyway here is some code to get you started

public function Bill(theData as range, OrgList as range, AcctList as range,
....
dim vData as variant
dim j as long
dim k as long
dim oOrg as range
dim oAcct as range

vData=theData
'
' vdata now contains a 2-dimensional 1-based array of the data from theRange
'
for j=1 to ubound(vData,1)
for each oOrg in OrgList
if oorg.value=vdata(j,1) then
for each oAcct.value in AcctList
if oAcct.value=vdata(j,2) then Bill=Bill+vdata(j,4)
next oAcct
endif
next oOrg
next j
end function

Charles
______________________
Decision Models
FastExcel 2.3 now available
Name Manager 4.0 now available
www.DecisionModels.com

BillW said:
Thanks for your input, appreciate it. Was glad to find SumProduct as
potential answer, but need to keep things skinny cuz of potential size of
data.
I definitely need help w/ arrays & VBA, just learning. How do I
accomplish
the below?
Main Datasheet consists of
A B C D
1 10000 3000 10 20
2 10000 9EXP 20 30
3 12510A 10000 30 40
4 12510 3000 40 50
A1:A4 is named range Org. B1:B4 = named range Acct. C1:C4 = named range
Actl11 and D1:D4 = named range Bgt11, both columns with data to be summed.
All these named ranges will always be available for formula, VBA purposes.
Users via dialog boxes will (a) provide a named range called ROrgs with
comparison criteria to match in Org, (b) provide a named range called
RAccts
with comparison criteria to match in Acct, and (c) designate which column,
Actl11 or Bgt11, to sum records for. So say ROrgs has data of 10000 and
12510A in it, RAccts has 3000 and 10000 in it, and they want Bgt11 summed.
Answer is 60, but not sure how to get there from here using arrays in VBA.
(This is much-simplified example, actually have 5 different criteria
columns
so far, with the different combinations increasing the decision points
exponentially, but will nest things if you can get me started here.)
Thanks again for all help.
 
B

BillW

Works like a charm, thank you!!!
And agreed re: the database tool. Have been working in Crystal for a long
time, but users always complain it's too hard, so am trying to squeeze a more
user-friendly reporting tool out of Excel. Many thanks again.
 
B

BillW

Wellllll..... I built the UDF function for my monster database, and it works
fine, but it turns out that SumProduct formulas are much faster. It's not a
fair comparison, though, so would still like to test SumProduct in a UDF to
compare speed. Anyone care to take a stab at converting this spreadsheet
formula:
=SUMPRODUCT(--(ISNUMBER(MATCH(Org,zh6A,0))),--(ISNUMBER(MATCH(Acct,SelectedAccts,0))),Actual12)
into VBA language?? I still can't get it.
Thanks in advance.
 
C

Charles Williams

You could speed up the UDF if you put orglist and acctlist into variant
arrays as well (just have to handle the case where there is only 1 value in
the list so the result is not an array).

But I am surprised that the UDF is slower: can you post the code you are
using?

Charles
______________________
Decision Models
FastExcel 2.3 now available
Name Manager 4.0 now available
www.DecisionModels.com
 
B

Bob Phillips

If you MUST put the formula in code it would simply be

myVar =
Activesheet.Evaluate("SUMPRODUCT(--(ISNUMBER(MATCH(Org,zh6A,0))),--(ISNUMBER(MATCH(Acct,SelectedAccts,0))),Actual12)")

but like Charles, I cannot belive this could ever be the most efficient way
to do it.


--
---
HTH

Bob

(there's no email, no snail mail, but somewhere should be gmail in my addy)
 
B

BillW

Bob & Charles, sorry for being very unclear on my last post. I'd already
done the SumProduct formula in the UDF as Bob wrote below, and changed my
test UDF so all the named ranges were arguments to the UDF. That's where my
comparison of the speed between the loop code and the sumproduct is showing
sumproduct is so much faster - sumproduct is fraction of a second to calc 1
formula, the loop code takes 3 seconds.
What I really meant to ask re: the sumproduct formula below and vba is how
do I put it into vba so I can reference the named ranges that i don't want to
use as arguments in the udf? E.g., if I didn't use the named ranges 'Org' or
'Acct' from the sumproduct formula below as arguments in the udf? Or if
that's not possible in a udf, then in just a sub in vba? I have followed the
posts from other discussions of this nature, and think that what's stopping
me is how to get the IsNumber and Match functions into vba.
Charles - below is the code I used for test purposes. I held it to just the
2 criteria (Org and Acct) for now as I wanted to compare apples to apples re:
speed.
Thx a million for your time and help gentlemen.
Public Function ARes4(theData As Range, OrgList As Range, AcctList As Range)
Dim vData As Variant
Dim j As Long
Dim k As Long
Dim Org As Range
Dim Acct As Range
vData = theData ' vData now contains a 2-dimensional 1-based array of the
data from theData
For j = 1 To UBound(vData, 1)
For Each Org In OrgList
If Org.Value = vData(j, 1) Then
For Each Acct In AcctList
If Acct.Value = vData(j, 3) Then ARes4 = ARes4 + vData(j, 51)
Next Acct
End If
Next Org
Next j
End Function
 
B

BillW

Bob & Charles, I think I solved it. I noticed Charles' note re: declaring
the OrgList & AcctList variables as arrays, and in experimenting with that
discovered that that was my problem with the SumProduct-into-vba issue, too.
So I have an equal test now, and...the results, please.... the loop code and
sumproduct-in-vba are equally slow & resource-hungry! Back to the drawing
board, gotta figure out a way to limit the beginning data to begin with, but
in the meantime I've learned a ton - thanks again for your help!!
 
C

Charles Williams

OK, this one is a bit more optimised, takes about 480 milliseconds on my
machine for 200 orgs and 200 accts on 20K rows.
a SUMPRODUCT formula in a cell takes about 290 millisecs to get the same
results (the UDF will close the gap/win with more conditions).
evaluating the sumproduct as text inside a UDF takes about 590 millisecs, I
have not programmed it using Ranges rather than text but it would be slower.

Option Explicit
Option Base 1
Public Function ARes4(theData As Range, OrgList As Range, AcctList As Range)
Dim vOrgData As Variant
Dim vAcctData As Variant
Dim vSumData As Variant
Dim vOrgs() As Variant
Dim vAccts() As Variant
Dim Org As Variant
Dim OrgData As Variant
Dim Acct As Variant
Dim j As Long

vOrgData = theData.Columns(1).Value2
If OrgList.Count = 1 Then
ReDim vOrgs(1)
vOrgs(1) = OrgList.Value2
Else
vOrgs = OrgList.Value2
End If

vAcctData = theData.Columns(3).Value2
If AcctList.Count = 1 Then
ReDim vAccts(1)
vAccts(1) = AcctList.Value2
Else
vAccts = AcctList.Value2
End If

vSumData = theData.Columns(51).Value2

For Each OrgData In vOrgData
j = j + 1
For Each Org In vOrgs
If Org = OrgData Then
For Each Acct In vAccts
If Acct = vAcctData(j, 1) Then ARes4 = ARes4 +
vSumData(j, 1)
Next Acct
End If
Next Org
Next OrgData
End Function


--
Charles
______________________
Decision Models
FastExcel 2.3 now available
Name Manager 4.0 now available
www.DecisionModels.com
 
C

Charles Williams

If you need to make it really fast then sort the data on the first test
(Org) and use binary search rather than linear search.

Charles
______________________
Decision Models
FastExcel 2.3 now available
Name Manager 4.0 now available
www.DecisionModels.com
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top