Design Question

K

Kat

I am attempting to write/develop a new database from scratch and I want to
get it right from the begining. Thanks for any help in advance.

The database will record entries into a warehouse enviorment of stock and
the process off our main inventory system. Each delivery will have a record
in tblDelivery with a unique DeliveryID and related information to that
delivery. When it is first entered it will have a status of "new" set
against it and the date/time needs to be recorded. Seperate stock items are
not part of this database only the delivery as a whole.

This is where I have run into my problem. I had originally intended to have
a seperate tblStatus where I would keep this information. Each time the
delivery moved to a different status, it would then have a new record added
into tblStatus.

I am not sure this is the best set up however because part of the reporting
I will need to do is how long something was "in process" and the current
stage of any or all deliveries. In general I am only interested in broad
information such as on Monday we had 160 deliveries and each delivery took an
average of 2hrs to complete. However I will need the ability to drill down to
a by delivery analysis of something is flagged up as inconsistant.

The current set up of the database is

tblDeliveries
DeliveryID (PK)
Delivery feilds (general information)

tblStatus
DeliveryID(FK)
StatusID(FK - codes used from a lookup table)
DateChanged (date field)
TimeChanged(time field)

tblLocation - this will need to be set up in the same was as the status
table to do much the same thing only handle the locations the shipment has
been moved to. Not actually set this one up until I resolve the current issue.

There is a one to many relationship between tblDelivery and tblStatus.

Will this table stucture allow for the kind of reporting I need? and If so,
could you give an example of the "high" level reporting such as how long on
average the deliveries for a day have taken to go from status new to status
complete considering there could be up to 5 different statuses inbetween?

Hopefully this is enough information, and thank you for any help in advance.

Kat
 
J

Jason Lepack

The best way to handle the Delivery_Status table is like this:

Devlivery_Status:
delivery_id (FK - references Deliveries)
status_id (FK - references Statuses)
status_changed_dt (datetime variable)

Based on your description I get the hint that each delivery will only
be in a given status once, therefore you may have a primary key on
(delivery_id, status_id).
 
K

Kat

Hi Jason,

Thanks for your reply. Indeed each delivery will only have a status once
(or so I am building this to belive) and it will move through the statuses in
a linear manner.

If I use a single date/time field as you've suggested is it easy to
determine how long a particular delivery has taken from status new to status
completed? I'm worried that it will be difficult to determine actual time
spans, which I am sure the client will want in the future, even if they don't
know it now.

Any chance I could bother you further for an example of the query to do
this? I am a bit stumped on how to get it to look for the first status (call
it "new") and find the difference(date/time wise) to the last status (call it
"complete") without dlookups in the query which will effect performace after
a few months and a few hundred thousand records.

Thanks again for your help so far.

Kat
 
J

Jason Lepack

Create this query and then base your other queries off of it.

SELECT
A.delivery_id,
A.status_id AS start_status,
A.status_changed_dt AS start_status_dt,
B.status_id AS end_status,
B.status_changed_dt AS end_status_dt,
DateDiff("s",.[status_changed_dt],[A].[status_changed_dt]) AS
time_elapsed
FROM
Delivery_Status AS A
INNER JOIN Delivery_Status AS B
ON A.delivery_id = B.delivery_id

If you find for certain that statuses will travel 1,2,3... you could
add:
WHERE
A.status_id < B.status_id

The query is a list for all deliveries, each possible pair of statuses
and their times at which they entered that status, as well as the
difference between those two times.

Cheers,
Jason Lepack
 
C

Chris2

Kat said:
I am attempting to write/develop a new database from scratch and I want to
get it right from the begining. Thanks for any help in advance.

The database will record entries into a warehouse enviorment of stock and
the process off our main inventory system. Each delivery will have a record
in tblDelivery with a unique DeliveryID and related information to that
delivery. When it is first entered it will have a status of "new" set
against it and the date/time needs to be recorded. Seperate stock items are
not part of this database only the delivery as a whole.

This is where I have run into my problem. I had originally intended to have
a seperate tblStatus where I would keep this information. Each time the
delivery moved to a different status, it would then have a new record added
into tblStatus.

I am not sure this is the best set up however because part of the reporting
I will need to do is how long something was "in process" and the current
stage of any or all deliveries. In general I am only interested in broad
information such as on Monday we had 160 deliveries and each delivery took an
average of 2hrs to complete. However I will need the ability to drill down to
a by delivery analysis of something is flagged up as inconsistant.

The current set up of the database is

tblDeliveries
DeliveryID (PK)
Delivery feilds (general information)

tblStatus
DeliveryID(FK)
StatusID(FK - codes used from a lookup table)
DateChanged (date field)
TimeChanged(time field)

tblLocation - this will need to be set up in the same was as the status
table to do much the same thing only handle the locations the shipment has
been moved to. Not actually set this one up until I resolve the current issue.

There is a one to many relationship between tblDelivery and tblStatus.

Will this table stucture allow for the kind of reporting I need? and If so,
could you give an example of the "high" level reporting such as how long on
average the deliveries for a day have taken to go from status new to status
complete considering there could be up to 5 different statuses inbetween?

Hopefully this is enough information, and thank you for any help in advance.

Kat


Kat,

Comments:

You should probably not use two DATETIME columns, one for date, and one for time. In
order to work with this, you will find yourself hopping through endless function
manipulation of those columns.

The SQL DATETIME data type represents a point in time, not an interval. Recording just
one point in time per row (whether with one DATETIME column, or two as you proposed) means
that you must use points in time from different rows to somehow find a duration between
two points in time. This is quite difficult (but not impossible) in SQL.

By keeping two points in time in a row, you have the whole duration involved right there
were you need it.

When a new status is ready to be entered, your Forms in MS Access will take the current
Now() value, and then UPDATE that value into StatusEnd for the status that is about to
end, and then INSERT a new row with a StatusStart equal to that same value (and the new
StatusID, etc.).


Tables:

Create a blank MS Access database. You can copy and paste these DDL SQL queries each into
an MS Access Query, executing each one in order to create the tables and the
relationships.

Query: Create_Deliveries

CREATE TABLE Deliveries
(DeliveryID AUTOINCREMENT
,TruckID INTEGER
,CONSTRAINT pk_Deliveries
PRIMARY KEY (DeliveryID)
)

(TruckID is a column I added into the table so that you can manually enter the sample data
given below. If there is only an autonumber column in a table, you can't manually enter
data. It is also reasonable to assume that each delivery arrives on a truck, although I
realize your system may not keep track of this. If it did, TruckID would likely be a
foreign key to a Trucks table, and that key is also omitted in this example.)


Query: Create_Statuses

CREATE TABLE Statuses
(StatusID AUTOINCREMENT
,StatusName TEXT(255)
,CONSTRAINT pk_Statuses
PRIMARY KEY (StatusID)
,CONSTRAINT un_Statues_StatusName
UNIQUE (StatusName)
)

Query: Create_DeliveryStatuses

CREATE TABLE DeliveryStatuses
(DeliveryStatusID AUTOINCREMENT
,DeliveryID INTEGER NOT NULL
,StatusID INTEGER NOT NULL
,StatusStart DATETIME NOT NULL
,StatusEnd DATETIME
,CONSTRAINT pk_DeliveryStatuses
PRIMARY KEY (DeliveryStatusID)
,CONSTRAINT fk_DeliveryStatuses_Deliveries
FOREIGN KEY (DeliveryID)
REFERENCES Deliveries (DeliveryID)
,CONSTRAINT fk_DeliveryStatuses_Statuses
FOREIGN KEY (StatusID)
REFERENCES Statuses (StatusID)
,CONSTRAINT un_DeliveryStatuses_DeliveryID
UNIQUE (DeliveryID
,StatusID
,StatusStart)
)


Sample Data:

You will need to manually load the following data into the tables above.

Deliveries
DeliveryID
1, 1
2, 1
3, 1


Statuses
StatusID, StatusName
1, New
2, Early Middle
3, Middle
4, Late Middle
5, Completed


DeliveryStatuses
DeliveryStatusID, DeliveryID, StatusID, StatusStart, StatusEnd
1, 1, 1, 07/13/2007 8:00 AM, 07/13/2007 09:00 AM
2, 1, 2, 07/13/2007 9:00 AM, 07/13/2007 10:00 AM
3, 1, 3, 07/13/2007 10:00 AM, 07/13/2007 11:00 AM
4, 1, 4, 07/13/2007 11:00 AM, 07/13/2007 12:00 PM
5, 1, 5, 07/13/2007 12:00 PM, 07/13/2007 01:00 PM
6, 2, 1, 07/13/2007 3:00 PM, 07/13/2007 04:00 PM
7, 2, 2, 07/13/2007 4:00 PM, 07/13/2007 05:00 PM
8, 2, 3, 07/14/2007 8:00 AM, 07/14/2007 09:00 AM
9, 2, 4, 07/14/2007 9:00 AM, 07/14/2007 10:00 AM
10, 2, 5, 07/14/2007 10:00 AM, 07/14/2007 11:00 AM
11, 3, 1, 07/14/2007 08:00 AM


Queries:

You may create and execute the following queries to test this.

Query: ElapsedStatusTimes

(Watch out for line-wrap.)

SELECT DS1.DeliveryID
,Format(SUM(DateDiff("s", DS1.StatusStart, DS1.StatusEnd)) \ 3600, "00")
& ":" &
Format(SUM(DateDiff("s", DS1.StatusStart, DS1.StatusEnd)) \ 60 MOD 60, "00")
& ":" &
Format(SUM(DateDiff("s", DS1.StatusStart, DS1.StatusEnd)) MOD 60, "00")
As ElapsedStatusTime
FROM DeliveryStatuses AS DS1
WHERE DS1.StatusEnd IS NOT NULL
GROUP BY DS1.DeliveryID


Results:

DeliveryID, As TotalElapsedStatusTime
1, 05:00:00
2, 05:00:00



Sincerely,

Chriis O.
 
K

Kat

Thanks Jason, thats just what I needed.

Kat

Jason Lepack said:
Create this query and then base your other queries off of it.

SELECT
A.delivery_id,
A.status_id AS start_status,
A.status_changed_dt AS start_status_dt,
B.status_id AS end_status,
B.status_changed_dt AS end_status_dt,
DateDiff("s",.[status_changed_dt],[A].[status_changed_dt]) AS
time_elapsed
FROM
Delivery_Status AS A
INNER JOIN Delivery_Status AS B
ON A.delivery_id = B.delivery_id

If you find for certain that statuses will travel 1,2,3... you could
add:
WHERE
A.status_id < B.status_id

The query is a list for all deliveries, each possible pair of statuses
and their times at which they entered that status, as well as the
difference between those two times.

Cheers,
Jason Lepack

Hi Jason,

Thanks for your reply. Indeed each delivery will only have a status once
(or so I am building this to belive) and it will move through the statuses in
a linear manner.

If I use a single date/time field as you've suggested is it easy to
determine how long a particular delivery has taken from status new to status
completed? I'm worried that it will be difficult to determine actual time
spans, which I am sure the client will want in the future, even if they don't
know it now.

Any chance I could bother you further for an example of the query to do
this? I am a bit stumped on how to get it to look for the first status (call
it "new") and find the difference(date/time wise) to the last status (call it
"complete") without dlookups in the query which will effect performace after
a few months and a few hundred thousand records.

Thanks again for your help so far.

Kat



















- Show quoted text -
 
K

Kat

Thanks Chris, I will give this idea a try. I suppose I was thinking too
linearly. It didn't even occur to me to have both a start and end date/time
and a new record for each status. For some strange reason my brain got stuck
on either having a new record for each status with only one date, or having
some horribly badly designed table with a date/time field for each status.
Its amazing how simple things can be when someone suggests them.

Thanks a bunch again!

Kat
 
J

Jamie Collins

CREATE TABLE DeliveryStatuses
(DeliveryStatusID AUTOINCREMENT
,DeliveryID INTEGER NOT NULL
,StatusID INTEGER NOT NULL
,StatusStart DATETIMENOT NULL
,StatusEnd DATETIME
,CONSTRAINT pk_DeliveryStatuses
PRIMARY KEY (DeliveryStatusID)
,CONSTRAINT fk_DeliveryStatuses_Deliveries
FOREIGN KEY (DeliveryID)
REFERENCES Deliveries (DeliveryID)
,CONSTRAINT fk_DeliveryStatuses_Statuses
FOREIGN KEY (StatusID)
REFERENCES Statuses (StatusID)
,CONSTRAINT un_DeliveryStatuses_DeliveryID
UNIQUE (DeliveryID
,StatusID
,StatusStart)
)

Sample Data:

DeliveryStatuses
DeliveryStatusID, DeliveryID, StatusID, StatusStart, StatusEnd
1, 1, 1, 07/13/2007 8:00 AM, 07/13/2007 09:00 AM
2, 1, 2, 07/13/2007 9:00 AM, 07/13/2007 10:00 AM
3, 1, 3, 07/13/2007 10:00 AM, 07/13/2007 11:00 AM
4, 1, 4, 07/13/2007 11:00 AM, 07/13/2007 12:00 PM
5, 1, 5, 07/13/2007 12:00 PM, 07/13/2007 01:00 PM
6, 2, 1, 07/13/2007 3:00 PM, 07/13/2007 04:00 PM
7, 2, 2, 07/13/2007 4:00 PM, 07/13/2007 05:00 PM
8, 2, 3, 07/14/2007 8:00 AM, 07/14/2007 09:00 AM
9, 2, 4, 07/14/2007 9:00 AM, 07/14/2007 10:00 AM
10, 2, 5, 07/14/2007 10:00 AM, 07/14/2007 11:00 AM
11, 3, 1, 07/14/2007 08:00 AM

Your proposed table wants a sequenced 'primary key' e.g.

12, 2, 4, 07/14/2007 10:30 AM, 07/14/2007 10:45 AM

creates an overlapping period i.e. duplication.

You probably also need to pay regard to transitions: the above row
represents a 'regression' from status 5 to status 4, presumably a
business rule violation, as I assume would this:

11, 3, 1, 07/14/2007 08:00 AM, 07/14/2007 09:00 AM -- UPDATE
13, 3, 5, 07/14/2007 09:00 AM, 07/14/2007 10:00 AM -- INSERT

which jumps straight from 'New' to 'Complete' while omitting the
intermediate steps.

Jamie.

--
 
J

Jamie Collins

The best way to handle the Delivery_Status table is like this:

Devlivery_Status:
delivery_id (FK - references Deliveries)
status_id (FK - references Statuses)
status_changed_dt (datetimevariable)

Based on your description I get the hint that each delivery will only
be in a given status once, therefore you may have a primary key on
(delivery_id, status_id).

Not only that, they have to be 'given' statuses in sequence (1, 2,
3...) and in chronological order. Care to post an amendment to your
proposal to enforce this business rule?

I acknowledge this may be a little unfair may be because you table is
fundamentally flawed e.g. if I knew the transition dates for statuses
1,2,4,5, (i.e. had a missing date for status 3) I would not be able to
model the missing data using your proposal, it would erroneously show
the end date for status 2 as being the start date for status 4.

Jamie.

--
 
J

Jason Lepack

Not only that, they have to be 'given' statuses in sequence (1, 2,
3...) and in chronological order. Care to post an amendment to your
proposal to enforce this business rule?

Not really.

In SQL Server I would use triggers to implement this. However, in MS
Access, I don't know how to do this, other than through VBA, but that
doesn't guard against someone entering data through the back-end.

Also, my solution cares not about when a package leaves a status, but
uses the difference in time between the "start" date/times of each
status.

If you have some advice that would be nice, but based on the
environment, I don't see any problem with my proposal.

Cheers,
Jason Lepack
 
J

Jamie Collins

Not really.

Apathy is a real killer for design-by-email said:
my solution cares not about when a package leaves a status, but
uses the difference in time between the "start" date/times of each
status.
I don't see any problem with my proposal.

That *is* the problem with your proposal i.e. remove a row and another
row gets an entirely new and erroneous end date. IIRC the jargon is
'delete anomaly'.
If you have some advice that would be nice

In brief, add an end_date to the table. Add a CHECK constraint or
validation rule to test that start_date occurs before end_date. Add
candidate keys (delivery_id, status_id, start_date), (delivery_id,
status_id, end_date), (delivery_id, status_id, start_date, end_date),
making an arbitrary decision about which to promote to PK
(alternatively employ knowledge of clustering on disk <g>). Add a
CHECK constraint to ensure no overlapping periods for each
delivery_id. Assuming status codes are sequential, add a CHECK
constraint to ensure contiguous periods for each delivery_id have
sequential status values and a further CHECK constraint to ensure that
for each delivery_id there are no earlier periods with a status code
greater than a later status code.

FWIW if the transitions are non-sequential, create a two-column table
of legal transitions (e.g. 'from_status' and 'to_status') and
corresponding columns in the usage table with referential integrity
(of course) and appropriate CHECK constraints, though hard to say what
they would be without knowledge of the transitions e.g. cyclic?

Jamie.

--
 
J

Jason Lepack

Add a
CHECK constraint to ensure no overlapping periods for each
delivery_id. Assuming status codes are sequential, add a CHECK
constraint to ensure contiguous periods for each delivery_id have
sequential status values and a further CHECK constraint to ensure that
for each delivery_id there are no earlier periods with a status code
greater than a later status code.

How would you go about writing these CHECK constraints to check for
overlaps, and ordered statuses? I didn't think that was possible with
ms access. In SQL Server I implement these using triggers.

Cheers,
Jason Lepack
 
J

Jason Lepack

I don't know if my previous response went through...
CHECK constraint to ensure no overlapping periods for each
delivery_id. Assuming status codes are sequential, add a CHECK
constraint to ensure contiguous periods for each delivery_id have
sequential status values and a further CHECK constraint to ensure that
for each delivery_id there are no earlier periods with a status code
greater than a later status code.

How would you go about writing these constraints in ms access? In SQL
Server I use triggers to implement things such as this. I didn't
think it was possible to implement these in standard sql.

Cheers,
Jason Lepack
 
C

Chris2

Your proposed table wants a sequenced 'primary key' e.g.

12, 2, 4, 07/14/2007 10:30 AM, 07/14/2007 10:45 AM

creates an overlapping period i.e. duplication.


What is your solution to the prevention of overlapping dates in DDL SQL for this case?

What fully working solution would you provide to the OP?

You probably also need to pay regard to transitions: the above row
represents a 'regression' from status 5 to status 4, presumably a
business rule violation, as I assume would this:

Yes, DeliveryID and StatusID need to be indexed separately to stop that.

11, 3, 1, 07/14/2007 08:00 AM, 07/14/2007 09:00 AM -- UPDATE
13, 3, 5, 07/14/2007 09:00 AM, 07/14/2007 10:00 AM -- INSERT

which jumps straight from 'New' to 'Complete' while omitting the
intermediate steps.

No requirement specifying that all statuses be present in consecutive order (or any order)
was hinted at.


Sincerely,

Chris O.
 
C

Chris2

Jason Lepack said:
I don't know if my previous response went through...


How would you go about writing these constraints in ms access? In SQL
Server I use triggers to implement things such as this. I didn't
think it was possible to implement these in standard sql.

Cheers,
Jason Lepack

In theory, you could write some supporting CHECK constraints.

Personally, I would be happy to see a set of CHECK constraints in MS Access that can
prevent the INSERTion of (or UPDATing to) overlapping dates, by group, in the presence of
many prior intervals.

In practice, triggers have to be implemented as VBA code in Form objects in MS Access . .
.. which I find to be, well, unpleasant to say the least.

That, or you go with unbound forms and have Class Modules containing all code that
accesses tables, which in turn are instantiated as an object on any form needing to access
the data via the object (that represents an entity, which in turn is represented by a
table) and have any required "additional logic" controlling the table be inside the Class
Module along with the rest of the data access code.


Sincerely,

Chris O.
 
C

Chris2

Kat said:
Thanks Chris, I will give this idea a try. I suppose I was thinking too
linearly. It didn't even occur to me to have both a start and end date/time
and a new record for each status. For some strange reason my brain got stuck
on either having a new record for each status with only one date, or having
some horribly badly designed table with a date/time field for each status.
Its amazing how simple things can be when someone suggests them.

Thanks a bunch again!

Kat

Kat,

You're welcome.

As noted in another branch of this thread, you'll need an extra DeliveryID/StatusID index
on DeliveryStatuses.

The prevention of overlapping dates is something that will have to be detected and
prevented in VBA code. (Barring the appearance of a DDL SQL solution for MS Access.)

Basically, prior to any INSERT or UPDATE, you'll need to instantiate a recordset (or check
the existing one via a recordsetclone (in bound forms)), and then iterate through any
existing dates for the same DeliveryID, and make sure the new dates you are about to enter
don't overlap.


Sincerely,

Chris O.
 
K

Kat

Hi Guys,

Thanks for all your ideas. I have elimated the problem of overlaping
date/times for statuses by not exposing the date/time fields to the user and
updating them at the time of entry. Since the brief of this database is for
the users to add/amend before and after they are done with a task (read
status) this will give date/times close enough for the use of the management.
All the appending/updating/deleting is done via unbound forms with checks
for the right data in vba prior to the updates happening.

After having the discussion with the managers in question, they have also
decided to allow all status movements, ie move from status 1 to 5 back to 2 ,
without checks. If they decide after its up and running that it is a problem
I have a couple of ideas how I would limit the status interactions from your
comments above.

Thanks again for all your help!

Kat
 
C

Chris2

Kat said:
Hi Guys,

Thanks for all your ideas. I have elimated the problem of overlaping
date/times for statuses by not exposing the date/time fields to the user and
updating them at the time of entry. Since the brief of this database is for
the users to add/amend before and after they are done with a task (read
status) this will give date/times close enough for the use of the management.
All the appending/updating/deleting is done via unbound forms with checks
for the right data in vba prior to the updates happening.

After having the discussion with the managers in question, they have also
decided to allow all status movements, ie move from status 1 to 5 back to 2 ,
without checks. If they decide after its up and running that it is a problem
I have a couple of ideas how I would limit the status interactions from your
comments above.

Thanks again for all your help!

Kat

Kat,

You're welcome and I'm glad to hear that everything worked out.


Sincerely,

Chris O.
 
J

Jason Lepack

So we're on the same page in that we can't at the DDL level prevent
overlapping dates in MS Access.

Cheers,
Jason
 
J

Jamie Collins

So we're on the same page in that we can't at the DDL level prevent
overlapping dates in MS Access.

I think when Chris2 said, "In theory, you could write some supporting
CHECK constraints," I think he actually meant to say, "You can write
some supporting CHECK constraints."

Consider the following SQL DDL (which pays no regard to transitions
between statuses):

CREATE TABLE Deliveries
(
delivery_id INTEGER NOT NULL PRIMARY KEY
)
;
CREATE TABLE Statuses
(
status_id INTEGER NOT NULL PRIMARY KEY
)
;
CREATE TABLE Devlivery_Status
(
delivery_id INTEGER NOT NULL
references Deliveries,
status_id INTEGER NOT NULL
references Statuses,
start_date datetime NOT NULL,
end_date datetime,
CHECK (start_date < end_date),
UNIQUE (delivery_id, start_date),
UNIQUE (delivery_id, end_date),
CONSTRAINT no_overlapping_periods
CHECK
(
NOT EXISTS
(
SELECT *
FROM Devlivery_Status AS T1, Devlivery_Status AS T2
WHERE T1.delivery_id = T2.delivery_id
AND T1.start_date < T2.start_date
AND T2.start_date <
IIF(IIF(T1.end_date IS NULL, NOW(), T1.end_date) >
IIF(T2.end_date IS NULL, NOW(), T2.end_date), IIF(T2.end_date IS NULL,
NOW(), T2.end_date), IIF(T1.end_date IS NULL, NOW(), T1.end_date))

)
)
)
;

The above is Access/Jet ANSI-92 Query Mode SQL syntax. To be able to
execute this from a query window in ms access you must first put set
it to ANSI-92 Query Mode (http://office.microsoft.com/en-us/access/
HP030704861033.aspx. Perhaps the easiest way to execute the SQL DDL is
to use an ADO connection with the appropriate OLE DB (either
Microsoft.Jet.OLEDB.4.0 or Microsoft.ACE.OLEDB.12.0) which always uses
ANSI-92 Query Mode SQL syntax. Once in place, the CHECK constraints
will be effective regardless of query mode used to access the data.

If you missed CHECK constraints when they came out (Access 2000, circa
1999), then you may want to review the other features new to Jet 4.0:

Description of the new features that are included in Microsoft Jet 4.0
http://support.microsoft.com/default.aspx?scid=kb;en-us;275561

A new Microsoft Jet 4.0 data type (DECIMAL)
Compressible data types
Lock promotion
Record-level locking
etc

Jamie.

--
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top