Maxed file size

T

Tom Ellison

Dear friends:

A client reports a problem with an Access MDB file. It contains one table
and nothing else.

"its size is 2.095GB."

The problem is that when he tries to Compact and Repair it, he gets an error
message:

'Invalid argument'.

If he tries to delete one row from the table, he gets the same error
message.

What I propose to do is to query the table and import it in pieces into a
new SQL Express table. Using more efficient datatypes (smallmoney instead
of doubles) and given the Express 4 GB limit, this should work well. To
keep within the 5 minute Access timeouts for SQL queries, I have used
filters to break the process into multiple parts. Probably due to
fragmentation of the files on his drive, this hasn't been effective. (It
works well in tests here, with no single query taking more than 2 1/2
minutes. But my hard drive is severly defragmented!)

What he wants to do is to manually separate the file into smaller pieces,
but this isn't working out. I want to do that inside the queries that
import the data, but have advised him to defragment his hard drive using
PerfectDisk first. He's going to need a clean import anyway, and this is
easier to provide before importing (in my opinion). What I'm asking for is
a "second opinion" so everybody chime in, please.

Performance of the database in queries afterward is the actual issue of the
project. The optimal column on which to perform a physical sorting of the
rows (as in a primary key for Jet or a clustered index for Express) is not
unique, which rules out a PK in Jet. The clustered index approach works
superbly, with an increase in performance beyond your wildest dreams. I
propose that such a solution makes proper defragmenting all the more
essential, hence my recommendation for PerfectDisk.

Please offer any critical opinions of my approach and any alternatives you
think are viable. I will direct him to read this thread, so pick my ideas
apart.

Thanks,
Tom Ellison
 
A

Albert D.Kallal

The problem is that when he tries to Compact and Repair it, he gets an
error message:

Try creating a blank database, and then import. It is possible that the
actual data file is only a few 100 meg in size. So, a LOT could be learned
and gained by having compact work. Since you can't compact, then simply
importing to a blank new mdb file would be the next step.

If the import fails, it possible the file is damaged...
. I propose that such a solution makes proper defragmenting all the more
essential, hence my recommendation for PerfectDisk.

That is not really going to help a lot here. While you might wind up
defragging the actual hard disk files, that huge mdb file is still NOT
defragged. (records can be strewn all over the place *inside* this new
defragged file. You still might have records in order that are ALL over this
file, so defragging at the system/windows level will NOT help. Defragging
the file ensures that the sectors for the actual file are in order on the
hard disk, and that does help performance here. However, I don't think it is
a make or break issue. I would not worry about this step...

Only a compact and repair will actually move records together by index
order, so, that is what we should try.

On the other hand, perhaps you don't have to take the piece by piece
approach anyway....

I would first try importing into a new blank mdb. The result might be a
significantly smaller file...(of which you can then compact and repair which
DOES put data in index order).

It is not clear if the sql express import tools are currently timing out
(or, even you are even using what comes with sql express). I am not sure,
but do the enterprise tools work with sql express? (those import tools are
quite well battle hardened, and I don't think they time out after 5 minutes.

Also, consider exporting to a text file (tab, or comma delimited), and then
import that to sql express....as again you eliminate any 5 minute
timeout....
So, explore the possibility of ONE big import that you don't have to write
code...test, and baby-sit...

At the end of the day you may VERY well have to import piece by piece, but I
would try some one shot imports that does the whole thing first, as then you
don't care about index order....you just import....

I just don't know if the EM tools can work with sql express, but that is a
possible try also....
 
T

Tom Ellison

Dear Albert:

Here are some interesting and important facts in response to your
suggestions.

The database has, as I said before, only 1 table. Nothing else. This table
was laboriously imported from Excel spreadsheets. Prior to these recent
efforts, its rows of data have not been edited or deleted. It is a clean
set of over 3 million rows, with very numerous columns. Its sole purpose as
a database is for the performance of searches. I have compacted and
repaired some of these tables here myself, and they don't shrink!

My puropose in recommending defragmenting the hard drive was primarily so
the new table being created and sorted by SQL Express would be unfragmented.
This table will physically sort the data by the clustered index as it goes.
I could have assigned this to be done later, after the table was imported.
But, in testing this here on a defragmented hard drive, it took longer to do
that than just importing with the clustered index already assigned.

The compact and repair in Access cannot sort the rows as desired. There is
no unique key to the data, and the desired sort order is specifically not
unique. Because the Express clustered index does not require uniqueness,
and because it is ALWAYS sorted in that order, that is a great advantage.

Do you agree that having a very well defragmented hard drive will increase
the speed of insertion to the new database, given that it is sorting the
rows by the clustered index? Can you see why this is the likely problem,
where it is taking more than twice as long to import on his system?

Do you agree that the compact and repair will NOT put the rows in order,
since there is no PK for the table? I want to show your response to the
client. Do not do me any favors personally. Just give the true answers, as
I know you are doing anyway.

Your suggestion of exporting to a delimited text file is a good one. There
are imports to Express from that that avoid any time limit. I did not do
this initially because my testing indicated it should not be necessary.

From the start, before I even delivered the product, I made defragmenting a
strong suggestion. It is essential so the database performs at a peak.
Even the SELECT queries it performs on a finished 10 million + rows may time
out without that. Since that is essential anyway, my plan was always to do
this on an optimally defragmented hard drive. I did not initially know the
Windows defragmenter did not provide that. But it is available and
inexpensive. I've already availed myself of that for testing.

Thanks so much for your insight, Albert. Could you please let me know your
thoughts again with this additional information?

Tom Ellison
 
A

Albert D.Kallal

It is a clean set of over 3 million rows, with very numerous columns. Its
sole purpose as a database is for the performance of searches. I have
compacted and repaired some of these tables here myself, and they don't
shrink!

I suspected this, given your experience and knowledge with ms-access.

However, I was JUST making
sure that client had not miss communicated to you! 3 million is large, but
*usually* that does not max out the file size. It seems in your case, you
are very well maxed out. I still think that a compact and repair would
give you some elbow room here!! (remove a few indexes..that would
free up some serous space for the mdb file anyway).
My puropose in recommending defragmenting the hard drive was primarily so
the new table being created and sorted by SQL Express would be
unfragmented.

I can't agree more with the above. That is a large file, and "your goal" to
trying
to have this data in order on the new database is going to help a lot.
Without
question, some huge performance gains will be made by you shooting for
this goal.

I kind of taken the view is get the data in there!!..and THEN do the above
step
by using the sql export tools to re-org the data into a new (sorted)
database...
However, I not played with express (it is installed on a machine beside
me!!..
but that is it!!!).
The compact and repair in Access cannot sort the rows as desired.

ok, but do be aware that the data is sorted by the first index
in a compact and repair if the database does NOT have
a primarily key.
no unique key to the data, and the desired sort order is specifically not
unique. Because the Express clustered index does not require uniqueness,
and because it is ALWAYS sorted in that order, that is a great advantage.

Do you agree that having a very well defragmented hard drive will increase
the speed of insertion to the new database, given that it is sorting the
rows by the clustered index? Can you see why this is the likely problem,
where it is taking more than twice as long to import on his system?

I totally agree here. In fact, the clustered index is MORE valuable then
having the actual hard drive defragged, but ultimate, you want both, and I
have no issues at all with this goal of yours. Both will result in
substantial performance gains.

Do you agree that the compact and repair will NOT put the rows in order,
since there is no PK for the table?

actually, as mentioned, you do not need a primary key (in fact, you might
remove it!!!).
So, a compact and repair will put the file in order by the FIRST index
encountered in the table design. I suppose, you could still have more the
one index, but the resulting record order still is the FIRST index for the
given table (when you "view" the index list for a given table in design
mode).

On the other hand...your file is so large that you can't compact it
anyway.....so, I am not sure if I am helping here!! But, to answer the
question...yes...compact sets the data to the first index...even if it is
NOT a primary key..and duplicates ARE allowed.

The above means if you can compact, then your data file would be in ORDER.
And, that likely means a straight import. And, even if you did procesing in
chunks..you could process in table order....

You just need to get the compact/repair working....
 
T

Tom Ellison

Dear Albert:

Thanks again, greatly.

A final detail brought out by your post, and not mentioned before, is that
the table in this form has NO INDEX WHATSOEVER. Since there's no room to
add one now, this cannot help either. What a mess.

Tom Ellison
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top