Responses inline.
Jamie Collins said:
In a training records database, each training session has one or more
instructors. Most of the training sessions are one-time events, perhaps
to
learn about processing a new type of product or something like that
In
other words, it's not a situation where there is a list of courses.
Sometimes the training is conducted by other employees such as
supervisors
(who also attend training sessions). Sometimes the training is conducted
by
people from outside the company. My approach in an early version of the
database was to just store the name. There is a combo box to select
an
employee's name; if the instructor is not an employee, the combo box
Limit
to List property is set to no, and the user can just type a name into the
box. I don't really object to doing it that way, although I realize
it
violates some normalization principles.
Which normalization 'principles' were you concerned about? A person's
name is single a atomic fact which in a DBMS is usually split into
elements (e.g. first_name, middle_names, last_name) for practical
purposes, therefore normalization considerations are moot.
The normalization principle is storing the name redundantly. Joe Jones
appears in the Employee table (stored as LastName and FirstName). Then Joe
Jones is stored in the Instructor field as the full name each time he
conducts a training session. In truth, that doesn't bother me very much,
but I wonder if there is a better way.
A person's name is a reasonable natural key in some circumstances e.g.
unique for all practical purposes in a company with few employees.
EmployeeID is used in other company records, including payroll. For now the
databases are separate, and will likely remain so for some time to come, but
there may well be a time when the data need to be combined. I think it
is a
reasonable single-field key, expecially considering that it is being used as
the key elsewhere in the company. I see that if I were storing three fields
(Last, First, Middle) as the FK in a table related to the Employee table
then there is no question of reduncancy, since I would be storing the
full
name in every record related to the employee table. As I expect you know
from our previous discussions, I am inclined to use surrogate keys unless a
simple natural key presents itself. I have read the arguments on both
sides, and have concluded that I am on solid ground with a surrogate key. I
will continue to use one-field keys where I can.
It seems to me that there will need to be a separate instructor table,
maybe
with just InstructorID, FirstName, LastName, and Company.
Is InstructorID a relational key? Perhaps its trusted source is a
'closed-shop' association (i.e. compulsory membership) of instructors
for your region who ensure that the same instructor cannot be assigned
two different InstructorIDs? But if InstructorID is what I think it is
then this may violate some relational principles. Be wary of trading a
weak natural key for no key at all.
InstructorID would be an arbitrary number, incremented by 1 starting from
1000000, as I see it. It would not be part of correspondence, or in any
other way used in the real world. When somebody comes in to to HazMat or
First Aid training, we contact a qualified company, and they send an
instructor. The instructor is identified by name.
The employee
table also has FirstName and LastName fields. EmployeeID is a four-digit
number, so I could start the Instructor table with a number such as
1000000
and use DMax to increment by 1.
All you'd be doing is assigning a sequence number to non-employees;
that's not the same thing as uniquely identifying instructors. Would
you issue instructors with this sequence number to and ask them to
quote it in all related correspondence etc? A key must exist in the
reality being modelled therefore if you've invented it then you have
to expose it.
I don't see that there is a reason why the key "must" exist in the
reality
being modelled, or that I have to expose the key. See above.
I suspect the natural key for external trainers in current use in your
organization is company + trainer name and is unique for all practical
purposes. Are you in a position to impose a new identifier on your
employees and their clients? It's great to be able to modify the
reality to fit your model but most of the time this isn't an option.
If you are in this fortunate position, do some research first because
an incrementing integer does not always make the best key (hint: think
of adding a check digit to reduce keying errors).
This has nothing to do with imposing a new identifier on anybody. The
identifier is behind the scenes. I am not at this time going to convert to
using natural keys at any cost. I will index as needed, and will not worry
about physical position on the disk or clustered indexes. The surrogate key
does what I need.
I'm not sure what you mean by the incremented number representing a
potential problem. It is assigned automatically. In a multi-user
environment I would simply assign it during the form's Before Update
event
(with error handling in the very unlikely case that a problem arises from
two users creating a record at the same moment).
[Reminds me, about seven years I worked for a company where *every*
time I was asked to quote my employee number there was general
disbelieve that I really could be employee 2000 in the year 2000. I
wager you can deduce why I don't need to ask fellow current employees
their employee IDs when I tell mine is 'jamiec'.]
Then I could combine the two tables into a
query that would be the row source for the Instructor combo box on the
Training Session form. By selecting a name I would store the ID number.
The thing I'm not quite sorting out in my head here is that the number
will
relate to one of two tables (Employee and Instructor), so how would I go
about handling that relationship? Also, each instructor could be
associated
with many training sessions, and each training session could have several
instructors, so it looks like there is a many-to-many there.
Somehow this is all starting to seem more complicated than is necessary.
Am
I missing a more direct route?
I think I'd keep things separate as far as possible i.e. internal
employee led training as a activity distinct from that of external
instructor led training and use UNION ALL when the results need to be
combined.
It seems to me that 'employees' and 'external instructors' are not of
the same 'domain' (modelling a superclass of 'people' might be a bit
too generic e.g. privacy and trust issues associated with reliable
identifiers) and it sounds like you are having problems combining them
anyhow so why force it? Issues of keys aside, what practical problem
are you trying to address?
I agree that the Employee records and Outside Instructor records are
separate enough that they should be stored separately. A single generic
table has its own difficulties.
Let's say that I am using two tables, and combining them with a union query
as needed. Please accept for now that I have a single numeric field as the
identifier. How can I relate the instructor field in the Training
Session
table to that key, which may be in either of two tables.
If I use a natural key consisting of First, Last, Company for the
instructor, and First, Middle, Last for the Employee, how do I handle the
relationship with the Instructor field (or combination of fields, since
if I
am using a compond PK I also need a compound FK) in the Training Session
table?
In the end I may well decide to just store the name as I have in a single
field. As I explained to Tina, that is the simplest solution for the user.
In the end, this is about creating the best possible tool for the user.
Thanks for your thoughts and comments, Jamie. We may take a different
approach on some matters, but I always appreciate your thoughtful and
thorough postings.