Database Size Problem

V

Vilem Sova

I've been asked to look at speeding up a monthly data import procedure which
imports data from a linked text file into an Access database (it's done by
an append query). The database just has the one table in it (which is
cleared and compacted before each import procedure). There are about 6
million records imported, and the database is 1.1 Gb in size after the
import is done and is re-compacted.

There are about 35 fields in the table, mostly text fields set to 255
characters in length. I went into the table design and changed the field
sizes to the maximum likely for each field - about 30 to 50 characters.

Therefore I expected to have a much reduced import time, and a much reduced
(eg by 75%) final file size.

However, there is no perceptable reduction in time (about 2 hours), and very
little (less than 1%) reduction in file size (still 1.1 Gb).

Does anyone know why it's not reducing the file size (and also the import
time) ?

Also, according to the Microsoft help documentation, setting the "Use
Transaction" property to "No" on an Action Query is supposed to increase
performance on large transactions. But this has resulted in no change
either. Any ideas why?

Thanks
Vilem Sova
 
J

Jerry Whittle

It's rather simple actually. Access does not waste space in text field. If
you have a text field set to 255 characters and only put the letter "A" in
it, Access only uses 1 character. So unless your text field is importing
trailing spaces, reducing the field size will not make things smaller unless
you are truncating data. For example if you have a text field set to 5 and
try to put ABCDEFGHI, Access will only let in ABCDE.

If it is importing trailing, or even leading spaces, and you do not need
these spaces, wrapping the field name with the Trim() function in the append
query could save a lot of space. I doubt if it would be faster and maybe even
slower.

About the only thing that might help on the import would be to remove any
indexes in the receiving table. After the import in completed, create any
needed indexes.

It might also go faster if both the database and text file are on a local
drive of the PC used for the import. The network could be slowing things down.
 
T

Tim Ferguson

There are about 35 fields in the table, mostly text fields set to 255
characters in length. I went into the table design and changed the
field sizes to the maximum likely for each field - about 30 to 50
characters.

First of all, with 35 fields in a table I'd be suspicious of whether it's
normalised...

Second, it's worth warning you of a nasty bug-in-the-making with this
design. Access has several rules like "text fields have a max of 255
characters" and "cannot have more than 255 fields in a record" and "an
individual record cannot have more than 1024* characters". Note that the
last one applies despite the first two... you can _define_ 35 fields with
50 chars each, but the first time you try to put more than 30 in all of
them you'll get a nasty DB Engine error about failing to save the record.
This may well be some years down the road, after you have left the
project along with anybody who knows anything about how it was put
together.

For that reason if for no other, I'd go back and visit point number one!

All the best


Tim F
 
V

Vilem Sova

Thanks Jerry for the useful info.

However, I don't think this explains everything as to the file size.

The text file from which the data is being imported is about half a Gb in
size, and the final database size is 1.1 Gb. The text file is a fixed-width
format, so there is a lot of wasted space in it, which isn't reduced when it
comes across to the Access 2002 DB - most of the fields for most of the
records are blank (just spaces). A rough guess would be about 10 - 20% of
the file contains non-blank data. Therefore I would expect that if Access
only stores 1 char for every char of data, then the final DB size should be
about 50 - 100 Mb, plus a bit more for various overheads to do with keeping
track of record and field information.

So the DB file still seems to me to be about 5 to 10 times bigger than it
should be.

Can you or anyone else offer any other comments or suggestions?

NB The table doesn't have any indexes.

Thanks
Vilem Sova
 
V

Vilem Sova

Thanks Tim, I didn't know about the 1024 character limit for a record. It
does surprise me, as I've seen some hideously un-normalised tables with what
I would have guessed contained much more data than that per record, but I'll
take your word on it for now.

Re-structuring the thing isn't really a viable option now, so I'll just have
to keep my fingers crossed.....

Regards
Vilem Sova
 
T

Tim Ferguson

Thanks Tim, I didn't know about the 1024 character limit for a record.

I forgot to mention that it's now 2048, but since that's in Unicode it
doesn't make a whole lot of difference.

Best of luck

Tim F
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top