Sorting in VBA

G

GB

I have a *large* amount of data that I need to sort and have available on
disk for easy access. It's about 20 million items, and about a gigabyte. I
thought that probably a tree is the way to go. I imagine this has already
been programmed many, many times in VBA, so can somebody kindly point me to
some code please?
 
T

Tony Jollans

When you have your data sorted, how do you plan to access it?

I'm sure that Sorts have been programmed many times, but whether Word is a
sensible tool for the job, I somehow doubt. Why do you want to do it in VBA?
 
G

GB

Tony said:
When you have your data sorted, how do you plan to access it?

I'm sure that Sorts have been programmed many times, but whether Word
is a sensible tool for the job, I somehow doubt. Why do you want to
do it in VBA?

Well, I'll need to access the data from within word, so I thought that
setting up the database within word made good sense, so I do the job once
not twice.

Why do you think that VBA/Word is not up to the job? I'd rather learn that
now than later.
 
T

Tony Jollans

Word is not a database tool. Small amounts of data can be dealt with withn
it, but 1GB will exceed the maximum document size, and, even if it didn't,
performance would likely be dreadful, so, even if you want to access it from
Word, you will have to hold it somewhere else.

Strictly, how and where you sort your data, and where you ultimately hold it
are not necessarily related but if, for example, you were going to set up an
Access database, it would be better to do the work in Access.

As for Word VBA not being up to the job (of sorting), I think memory would
likely be the biggest concern - dedicated sort software would handle that
kind of thing automatically, in VBA you'd (probably) have to code up lots of
logic to have overflow storage somewhere on disk. Mechanically, WordBasic
Sort _may_ be able to do it but, again, the volume of data and the memory
required would be a concern.

At a guess you would need a minimum of 2GB (and very probably more, although
I'm not sure VBA can address more than that) of available RAM, which
probably means you would need to be running a 64-bit system - maybe, if you
have the hardware, you could give it a shot. Just try reading your data into
an array - if you can get that far, the sort might be worth a shot, but you
still have the issue of where you are going to save the sorted data, and
what way you are going to access it when it has been sorted. Using the
database as an example, again, you may not need to do the sorting yourself -
a database load function might be able to do it all behind the scenes.
 
G

GB

Tony said:
Word is not a database tool. Small amounts of data can be dealt with
withn it, but 1GB will exceed the maximum document size, and, even if
it didn't, performance would likely be dreadful, so, even if you want
to access it from Word, you will have to hold it somewhere else.

The way a tree sort works is that you set up a large random access file on
disk. You add the records one at a time, with pointers (ie an additional
field with the record number) to the next record. As you add more records,
you just change the pointers in the 'adjacent' records accordingly. So, Word
only needs to have enough space for 3 records in memory at a time.

I just thought that, rather than programming this myself, I would see if
somebody else has already done this and tested it?

When you said it wouldn't work, I was worried that you meant there are
restrictions on random access file sizes that vba can handle - something
like that. There may be of course, but you didn't mention this.
 
T

Tony Jollans

I think we are misunderstanding each other here. I was thinking of sorting
as a logical process, whilst you were concerned about possible constraints
that VBA might impose on a physical process.

I don't know what size random access file might realistically be needed for
your data, but until Office 2010 is released (next month, I believe) there
is only a 32-bit version of VBA, and I'm pretty sure you will trip over a
2-gigabyte limit somewhere or other

I have never come across this kind of code in VBA myself, so can't help on
that score, I'm afraid.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top