Table/Query trouble

R

RipperT

InmateId In_Out HU CellNo Bunk MvDate
140060 In 4 127 B 18-Jan-06
140060 Out 8 18 18-Jan-06
497386 In 7 137 T 18-Jan-06
497386 Out 8 27 18-Jan-06
237480 In 8 1 17-Jan-06
237480 Out 7 245 B 17-Jan-06
536235 In C 75 B 17-Jan-06
536235 Out A 37 17-Jan-06
467801 In 5 123 T 16-Jan-06
467801 Out 4 158 B 16-Jan-06
258327 In 5 118 T 16-Jan-06
258327 Out 4 158 T 16-Jan-06
478494 In 8 22 15-Jan-06
478494 Out 6 117 B 15-Jan-06
229895 In 8 6 15-Jan-06
229895 Out 7 156 B 15-Jan-06

The above is a table that is programmatically written to each time an inmate moves from one bunk to another. The inspector wants to know who all lived in HU (Housing Unit) 4 on a specific date. Is it possible to extract this data from this table? IOW, who were the last inmates to move into HU 4 prior to the date in question?
Many thanx,

Rip
 
T

Tom Ellison

Dear Rip:

In one sense, the data is redundant. If Inmate 140060 moved out of HU 8 on 18-Jan-06 that fact is already represented by the fact he moved into HU 4 on that date. Presumably the Cell and Bunk from which he moved would be the same as the one to which he formerly moved.

Assuming an inmate can only be in one HU/Cell/Bunk at a time, the following is sufficient:

Inmate HU Cell Bunk MovedIn
140060 4 127 B 18-Jan-06
497386 7 137 T 18-Jan-06
237480 8 1 17-Jan-06
536235 C 75 B 17-Jan-06
467801 5 123 T 16-Jan-06
258327 5 118 T 16-Jan-06
478494 8 22 15-Jan-06
229895 8 6 15-Jan-06

The case where an inmate is discharged could be represented with HU value D (for discharged) if that has not other meaning, or with a NULL value for HU.

Why is this important? If you have an inmate moving out on 15-Jan-06 but not moving in until 18-Jan-06, then you have a real problem with tracking him. Likely this would be due to a user error. Asking for redundant information not only is a cause of such errors, but is a waste of time to enter the data and of storage space.

An important case is not represented in the data you give. You do not show any inmate having successive HU/Cell/Bunk assignments.

Now, to find a list of all inmates who are in HU 5 on 16-Jan-06, I propose a query:

SELECT InmateID
FROM YourTable T
WHERE HU = 5
AND MovedIn =
(SELECT MAX(MovedIn)
FROM YourTable T1
WHERE T1.InmateID = T.InmateID
AND T1.MovedIn >= #01/16/06#)

An inmate who moved into HU 5 on the 16th would presumably have been at some other HU/Cell/Bunk prior to that. The above does not count him as having been at both the previous HU/Cell/Bunk on the 165h, and at the new one. He could, of course, have been moved late in the day and would have been in the previous location for most of the day. The query could be adjusted for that if necessary.

It is not databases that are messy in this way, it is real live that is often less deterministic than the mathematics of set theory. It is our job to find such ambiguity and nail down the intent of those who would use the software.

Tom Ellison

"RipperT @comcast.net>" <rippertsmith<N_oS_pAm> wrote in message InmateId In_Out HU CellNo Bunk MvDate
140060 In 4 127 B 18-Jan-06
140060 Out 8 18 18-Jan-06
497386 In 7 137 T 18-Jan-06
497386 Out 8 27 18-Jan-06
237480 In 8 1 17-Jan-06
237480 Out 7 245 B 17-Jan-06
536235 In C 75 B 17-Jan-06
536235 Out A 37 17-Jan-06
467801 In 5 123 T 16-Jan-06
467801 Out 4 158 B 16-Jan-06
258327 In 5 118 T 16-Jan-06
258327 Out 4 158 T 16-Jan-06
478494 In 8 22 15-Jan-06
478494 Out 6 117 B 15-Jan-06
229895 In 8 6 15-Jan-06
229895 Out 7 156 B 15-Jan-06

The above is a table that is programmatically written to each time an inmate moves from one bunk to another. The inspector wants to know who all lived in HU (Housing Unit) 4 on a specific date. Is it possible to extract this data from this table? IOW, who were the last inmates to move into HU 4 prior to the date in question?
Many thanx,

Rip
 
T

Tom Ellison

Dear Jamie:

I do not agree. Having both a begin and end date in each row is redundant,
causing both increased data entry and requiring that these entries be made
with perfect consistency. To a significant degree, this violates one of the
fundamental rules of normalization.

If you rely on the fact that successive records for the same Inmate must
have the ending date of the prior row equal to the beginning date of the
subsequent one, you will "detect" a missing row every time a user makes a
data entry error. Likely, this will occur quite frequently just from user
errors. When an inmate moves, the user must locate his prior location row
and enter the ending date there, then enter that same date in the new
location row.

Tom Ellison


on 18-Jan-06 that fact is already represented by the fact he moved into HU 4
on that date.but not moving in until 18-Jan-06, then you have a real problem with
tracking him. Likely this would be due to a user error. Asking for
redundant information not only is a cause of such errors, but is a waste of
time to enter the data and of storage space.
 
T

Tom Ellison

Jamie Collins said:
I already covered this: if it's required for data integrity then by
definition it's not redundant. Think of DRI: you repeat the key column
value(s) in the FK column(s) but you don't think they are redundant, do
you?

This is not required for data integrity. You posit a situation in which it
could be used to detect that data is missing due to corruption. Your
proposal does not provide a way of restoring the data (you would not know
the HU/Cell/Bunk for the missing row) nor would you know if more than one
row were missing. Perhaps if you created a duplicate of every row in the
table, with every column, you would then be able to restore all the data. I
commonly do this. It's called a backup. The backup can be used to detect
and also to fully repair the type of corruption you suggest. Just having
two dates saved will only detect that there is an error, not how many, and
cannot be used to repair the damage.

Third normal form. If the user needs to alter the date an imate moves, you
must update two rows. Redundancy, pure and simple.
Please give more details: which rule, to what degree etc.


Successive records, prior/subsequent row? Why, you are now violating
the 'rules' of RM (a set has no order...?) <g>.

Yes, the table "bag" has no order unless we posit to apply one. This refers
to the ordering of each Inmate by the MvDate.
Good! I don't want bad, inconsistent data in my database, therefore I
write constraints to prevent it. Hopefully the front end application
developer will handle the bad user entry with grace. Perfect
consistency? Absolutely yes.

Having just one date in each row does this the same way as you suggest.
Having the user enter it just once, then storing it twice is not much of a
solution, as I have covered.
You've lost me now. As the database guy, I'll take responsibility for
data integrity. There's no reason why the front end guy couldn't get
just one date from the user to close off the end_date and open the new
start_date, with helper procedures provided by me. But maybe I'm not
seeing the problem here: please post an example scenario.

Fine. You can have the user enter this once, then save it in two places.
Still not a good or proper solution to the problem, as I have tried to show.

In summary, the date an imate leaves can be derived from the date in the
subsequent row (in date order and for the same inmate). Storing data that
can be derived is redundancy.

This invokes all the common problems of storing redundant data:

If the date an imate arrives is changed, you must find and fix up the date
departed from another row.

If a row is deleted, then the process must find the date arrived from the
subsequent row and update the previous row.

If a new row is inserted between two existing rows, another "fix up" is
mandated.

These kinds of housekeeping nightmares are what we avoid by not having
redundant data. That's a primary meaning for the 3rd Normal Rule.

Tom Ellison
 
R

RipperT

Thanx to you both for valuable input. I don't pretend to understand all of
it, as you are both far and away more skilled in the art of DB design than
I.

The table, as I said, is written to programmatically in the course of the
user plugging inmate ID's into each row of a table called
tblLockAllocations. Each row of this table represents a bunk in the
institution. There is one, and only one, row in this table for each bunk in
the institution, so we can take care of the bigger business of counting
inmates at specific times of the day. As the user continually changes these
inmate ID's in this table (ave. 65/day), tblLockHistory is written to with
the data originally posted (I removed other fields for clarity).
tblLockHistory works well for eyeballing the lock history of single inmate,
but I hadn't anticipated the inspector's request.

What Jamie is proposing would require the user to go to tblLockHistory and
find the move-in date of a prisoner and fill in the move-out date of that
record when the prisoner moves out. The DB's main purpose, as I said, is for
counting inmates. My users don't even know that tblLockHistory exists. Also,
this DB is only used in a single institution. If an inmate shows an 'out'
record with no corresponding 'in' record in tblLockHistory, that means he
transferred to another institution and he is off our count, in which case we
won't worry about tracking him until he transfers back in again.

Hope this clears things up a bit. Thanks again for the help.

Rip
 
T

Tom Ellison

Dear Jamie:

Previously, you stated:

"There's no reason why the front end guy couldn't get just one date from the
user to close off the end_date and open the new start_date"

Now you're saying the end_date of the prior occupancy need not be the same
as the start_date of the new occupancy.

Previously, you said the repetition of the same date ending one occupancy
and beginning the next could be used to detect a corruption in the data.

With your current assertion that the occupancy need not be continuous, this
can no longer be the case.

If you continuously make arguments that are contradictory and make
assumptions about the data that may not be appropriate, you will doubtless
come up with a combination that makes my suggestions inappropriate. Whether
this would have any bearing on the original question is doubtful.

Using the old axiom, "everybody has to be someplace" it is not an unlikely
assumption that each inmate is assigned to one HU/Cell/Bunk right up to the
moment he is assigned to another. I do not think this is unlikely. Only
the person who originally asked the question here can evaluate that.

When you shift your objections every time I respond, I can only see that
what you want is to argue with me, not to help with the original question.
That is not my goal, so I will not continue this.

Tom Ellison
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top