Kiffar said:
Is this because it would be inaccurate, or because there's
a more efficient method?
The latter.
Kiffar said:
I'd prefer to limit the number of simulations, so that each
time I change one of the non-random variables, it doesn't
take very long to update the calculations, as my actual
datasets have billions of possibilities.
The "best" implementation approach will depend on the exact nature of the
simulation.
I would encourage you to share all the details with us. You can upload an
example Excel file (devoid of any private data) that demonstrates the
problem to a file-sharing website.
Then post the "shared", "public" or "view-only" link (aka URL;
http://...)
in a response here. The following is a list of some free file-sharing
websites; or use your own.
Box.Net:
http://www.box.net/files
Windows Live Skydrive:
http://skydrive.live.com
MediaFire:
http://www.mediafire.com
FileFactory:
http://www.filefactory.com
FileSavr:
http://www.filesavr.com
RapidShare:
http://www.rapidshare.com
Alternatively, send email to me at joeu2004 "at" hotmail.com, ideally with a
full description and an Excel file attachment.
(Some forums censor real email addresses.)
Kiffar said:
I've never used macros or anything fancy like that before,
though, so I might need handholding implementing such
things. To give you an idea of my familiarity with Excel
functions, I just learned how if statements work this week.
Generally, VBA (macros) is the best way to implement simulations, first
because the logic is sometimes difficult to write in Excel, and second
because we can avoid recalculations at unexpected times. The use of
RANDBETWEEN and RAND usually causes recalculations every time any cell in
any worksheet in the workbook is edited.
Be that as it may, here is a quick-and-dirty way to do the simulation that
you offered as an example. You will note that the cell usage is very
different from what you asked for. Is that really a deal-breaker?
In Sheet1:
A1: =RANDBETWEEN(1,10)
B1: =RANDBETWEEN(1,10)
C1: =RANDBETWEEN(1,11)
D1: =RANDBETWEEN(1,11)
E1: =A1+B1+5
F1: =C1+D1+6
G1: =(C1-C2)*2/(C1+C2)
Copy A1:G1 down through A10000:F10000. One way to do that easily:
1. Copy A1:G1.
2. Type A2:G10000 into the Name Box, then press Enter to select that range.
3. Paste by pressing ctrl+V.
Note that columns E, F and G correspond to your C1, C2 and C3. There are
just 10,000 of them -- 10,000 simulations.
For now, copy G1:G10000 and paste-special-value into H1:H10000. That
"freezes" the simulation. Note that G1:G10000 et al will continue to
change. We don't care.
I put the random data into a separate worksheet so that we can use a macro
later, if you wish, to "freeze" the simulation instead of using
copy-and-paste-special-value.
In Sheet2:
D1: =COUNTIF(Sheet1!H1:H10000,">=0.4")
D2: =COUNTIF(Sheet1!H1:H10000,">=0.2") - D1
D3: =COUNTIF(Sheet1!H1:H10000,">-0.2") - D1 - D2
D4: =COUNTIF(Sheet1!H1:H10000,">-0.4") - SUM(D1
3)
D5: =COUNTIF(Sheet1!H1:H10000,"<=-0.4")
D7: =SUM(D1
5) ' should be 10000, of course
For D2
4, we could use COUNTIFS in Excel 2007 and later instead. For
example:
=COUNTIFS(Sheet1!H1:H10000,">=0.2",Sheet1!H1:H10000,"<0.4")
E1: =D1/$D$7
Copy E1 and paste into E2:E5. Format E1:E5 as Percentage. Those are your
probabilities.
With your example data, RANDBETWEEN(1,10) and RANDBETWEEN(1,11), the
probability distribution should be close to the following:
8.98% C3>=0.4
10.96% 0.2<=C3<0.4
38.69% -0.2<C3<0.2
18.19% -0.4<C3<=-0.2
23.18% C3<=-0.4
For your example, the expected probability distribution can be computed
using a macro. Let me know if you are interested.
Is that an approach you can work with?