Efficient Matrix Database Design

ricm · Sep 19, 2004

The requirement is for a database that holds boolean data for a large 2-d
matrix. It seems to me there are two approaches:
First: create a table with enough fields for all the columns, add rows
corresponding to each row in the matrix. This is easy but wasteful because
the "true" data entries will be relatively sparse.
Second: Create two tables, one for all the colum names [Table_columns], the
other for all the rows [Table_rows]. Store the data in a third table
[Table_data] in the form [Data_Id,Row_Index,Column_Index] where the index
values point to the row and colum tables respectively. This has the advantage
of being able to store only the 'true' elements.
A further requirement is that the number of rows and columns, together with
their names is not fixed. This requirement alone seems to push the design in
favour of option 2, above.
The problem (for me) is the form design which allows both reads/writes: how
to display such a matrix without having pre-defined fields hidden/unhidden on
the form (a somewhat inelegant if not impossible solution given the above
requirements). A datasheet would seem ideal.
Two questions: First: do you think (as I do) the 2nd option is the best way
forward for the table design? Second: Can anyone give me pointers as to how I
should start to think about ways of developing a suitable form? My thoughts
so far are to create a temp table from Table_rows, Table_columns and
Table_data and use this as the recordsource for the form when it opens, then
iterate the temp table, writing values back to Table_data when the form
closes (or some other user-driven event).
Hope this makes sense. Thanks, Ric

Tim Ferguson · Sep 19, 2004

Two questions: First: do you think (as I do) the 2nd option is the
best way forward for the table design?

Yes: no brainer. Check out At Your Survey for a working example.

Second: Can anyone give me
pointers as to how I should start to think about ways of developing a
suitable form? My

To be honest, I'd think hard about using Excel for the UI for this. It's
dead easy to read the tables to put in the headers for columns and rows;
and then fill in the cells in the middle from the Answers table. Intercept
the Worksheet_Change event to write changes back to the database.

You could use Access for the analysis/ decision support part -- using the
crosstab query and the excellent report designer.

Just a thought

Tim F

John Nurick · Sep 20, 2004

Hi Ric,

A further requirement is that the number of rows and columns, together with
their names is not fixed. This requirement alone seems to push the design in
favour of option 2, above.

Without a doubt.

The problem (for me) is the form design which allows both reads/writes: how
to display such a matrix without having pre-defined fields hidden/unhidden on
the form (a somewhat inelegant if not impossible solution given the above
requirements). A datasheet would seem ideal.

It depends how large and sparse your matrix is. A datasheet - using, as
Tim suggests, Excel for the actual UI - is fine if you just have a few
dozen rows and columns but gets unwieldy if there are hundreds or
thousands, let alone millions (especially as it will be hard to devise
an elegant way to disguise Excel's limit of 255 columns per worksheet).
In that case it might be best to do data entry by having the user select
or enter the coordinates of each "true" point, and concentrating on
making this process as easy as possible.

If you use Excel, don't forget to set up data validation so the user can
only enter boolean values.

ricm · Sep 20, 2004

....

Yes: no brainer. Check out At Your Survey for a working example.

Thank you : agreed. I have created a small model for this part of the design
and it seems to work well. What is already clear is that I shall have to wrap
the changes in a transaction to make sure all the changes are written
consistently to the database.

To be honest, I'd think hard about using Excel for the UI for this. It's
dead easy to read the tables to put in the headers for columns and rows;
and then fill in the cells in the middle from the Answers table. Intercept
the Worksheet_Change event to write changes back to the database.

When the question of using Access was first put to me some months ago it was
evident that there were huge knowledge gaps both in the extent of the data
set and in the way data would be manipulated. I had the team create a paper
design so that the entire process could be made transparent. The team (of
educationalists) decided by themselves to use Excel to create and print the
grids. There are some 20 worsheets, some of which are already up to the
maximum column count. Practically, the worksheets are difficult to manipulate
- hence the request to look at the feasibility of using Access. Fortunately
the columns break down into logical groups - so an Access form need only
present a subset of data at any one time. The groups are numbered 1.0.0,
1.0.1, 3.2.3, etc, so at least column headings will not be too wide. The
"Columns" definition table has the column descriptors so a text box can
display the friendly name when a particular column has focus. The data grid
would have to be a subform.

I can still see no way round creating temporary tables though for the form's
underlying recordsource in order to give read/write functionality - which in
itself implies quite a lot of coding both to populate the grid for user entry
and subsequently write the changes back to the compact, indexed data table.
My guess is I shall have to think hard and long about the point at which data
is saved: either save at the completion of each subset or allow the user to
move "next" or "back" between subsets to review changes before finally
committing an update.

If I think in terms of resilience, then the temporary table(s) would
effectively hold all the deltas to the data, allowing a "back" and "next"
functionality.. A final commit/rollback completes the process to transfer the
data back to the normalised data structure. Some load testing will indicate
how efficient this is.

You could use Access for the analysis/ decision support part -- using the
crosstab query and the excellent report designer.

Yes, here is where the real benefits will accrue: the 'false' values will be
just as important as the 'true' ones in highlighting gaps in provision and
indicating tasks/areas still to be completed. See my reply to John Nurick
below why Access would seem the optimum application.

Many thanks,

Ric

ricm · Sep 20, 2004

John Nurick said:
Hi Ric,

Without a doubt.

It depends how large and sparse your matrix is. A datasheet - using, as
Tim suggests, Excel for the actual UI - is fine if you just have a few
dozen rows and columns but gets unwieldy if there are hundreds or
thousands, let alone millions (especially as it will be hard to devise
an elegant way to disguise Excel's limit of 255 columns per worksheet).
In that case it might be best to do data entry by having the user select
or enter the coordinates of each "true" point, and concentrating on
making this process as easy as possible.

If you use Excel, don't forget to set up data validation so the user can
only enter boolean values.

John, thanks for your reply.

Yes, the grids are quite large! The final application must also hold data
about providers, trainees, etc... so Access would seem ideal. The data grid
is only 1 aspect of the User Requirement and is about the mapping between
skills and qualifications - it is this aspect that causes me the greatest
concern because of the overhead in presenting data to the user (dynamically
create and populate tables?? dynamically create forms??). My reason for
posting is that whilst we are still at the stage of scoping out the problem,
ideas, pointers and 'best practice' inputs from those who've already
travelled this or similar routes might save days of trial-and-error work!

Thanks once again,

Ric

Tim Ferguson · Sep 20, 2004

The team (of
educationalists) decided by themselves to use Excel to create and
print the grids. There are some 20 worsheets, some of which are
already up to the maximum column count. Practically, the worksheets
are difficult to manipulate

I started to think about this aspect later on -- if you start to get close
to a maximum of 250 columns in Excel, they you simply have a major UI
design problem. I agree you cannot just offer users a 250 * several thou
grid!

Therefore, analysing their processes a bit more might be in order, eg
separate forms or grids for different groups of columns/ rows?
Unfortunately Access is not good at making "dynamic" forms in the way that
proper VB does with control arrays. Have you thought of looking at that as
a solution? Or, as a more lateral-thinking approach, what about capturing
the data earlier on as a text file and parsing that into the database?

I don't think you have a trivial problem!!
All the best

Tim F

John Nurick · Sep 20, 2004

Yes, the grids are quite large! The final application must also hold data
about providers, trainees, etc... so Access would seem ideal. The data grid
is only 1 aspect of the User Requirement and is about the mapping between
skills and qualifications - it is this aspect that causes me the greatest
concern because of the overhead in presenting data to the user (dynamically
create and populate tables?? dynamically create forms??). My reason for
posting is that whilst we are still at the stage of scoping out the problem,
ideas, pointers and 'best practice' inputs from those who've already
travelled this or similar routes might save days of trial-and-error work!

I feel that I don't understand enough about the data domain or the
users' requirements to say anything very sensible. But you've got the
basic table design sorted, and now it's a UI issue. Thinking of
implementation details like temporary tables and transactions is a
distraction at this stage, IMO; it's now time to think about a UI to
meet the users' needs, and only after getting a handle on that decide
whether it's practical to implement it with the available resources, or
whether it's necessary to fall back to something easier to implement but
less easy to use.

So: how do the users need to see (and enter or edit) the data?

Adrian Jansen · Sep 20, 2004

I did a design for a contacts management database for an educational
institution a couple of years ago, which had some similarities to what
you seem to want. Could track courses offered, venues, course
presenters, students enrolled, and their educational history, amongst
other things. If you want some professional advice, maybe I could help,
for a fee.
Contact me at the address below if you are interested.

--
Regards,

Adrian Jansen adrianjansen at internode dot on dot net
Design Engineer J & K Micro Systems
Microcomputer solutions for industrial control
Note reply address is invalid, convert address above to machine form.

ricm · Sep 21, 2004

I feel that I don't understand enough about the data domain or the
users' requirements to say anything very sensible. But you've got the
basic table design sorted, and now it's a UI issue. Thinking of
implementation details like temporary tables and transactions is a
distraction at this stage, IMO; it's now time to think about a UI to
meet the users' needs, and only after getting a handle on that decide
whether it's practical to implement it with the available resources, or
whether it's necessary to fall back to something easier to implement but
less easy to use.

I have spent some time since my first post hacking out some sample forms. A
user-friendly form may, of itself, require a different data structure to the
one envisaged. If the UI isn't easy for users to use then then it's a no-go.
Period.

So: how do the users need to see (and enter or edit) the data?

This is my question to the group also. I think they believe we are somehow
going to deliver a database and users will magically enter data and, presto,
the necessary reports/analyses will pop out... Back on Planet Earth: I have
scheduled a full coast-to-coast walk through of the paper-based & spreadsheet
processes tomorrow and the day after. I shall have better handle on the UI
requirements then.

Just like to say thanks to you & Tim for your helpful replies. Much
appreciated!

Ric

table design feedback request for plating shop bath testing	2	Apr 4, 2012
Real Estate Contract Database Design	1	May 11, 2011
How to handle indirectly related data	1	Apr 24, 2008
need band-aid for poorly constructed db	4	Dec 15, 2009
Comments/help on database design	5	Jul 23, 2009
Database design problem	0	Sep 11, 2007
Heirarchal Database Design	3	Jun 16, 2008
Project Online Timesheets Actuals Mismatches - data not publishing to projects	0	Sep 12, 2017

Efficient Matrix Database Design

ricm

Tim Ferguson

John Nurick

ricm

ricm

Tim Ferguson

John Nurick

Adrian Jansen

ricm

Ask a Question

Similar Threads