difficulty creating relational database

C

CAPcoordinator

Hi all-

I am trying to create a db with three tables, each cascading down to the
next as a 'one to many' relationship. My data is largely non-numeric, and I
am having a problem with determining (a) what fields should be related to
each other, and (b) how to create a one-to-many relationship if Access won't
automatically recognize a relationship as such (am I even able to do that?)

thanks
-C
 
R

Roger Carlson

The fields that should be related to each other are called (in general
terms) the Primary Key (of the one-side table) and the Foreign Key (of the
many-side table). For instance, if I had an Employee Table and a Position
table (each employee could hold many positions) then my primary key would
probably be Employee_ID. So I would have to create a foreign key in the
Position table called Employee_ID. It doesn't actually have to be *named*,
just hold the same data.

In order for Access to create the one-to-many relationship, the primary key
field (Employee_ID) MUST have a unique index on it. But if you have
designated this field as the Primary Key, it will automatically have a
unique index, so there won't be a problem. The foreign key MUST NOT have a
unique index on it. It may be indexed, but it needs to allow duplicates.
Thus you cannot designate it as a primary key.

Does that help?

--
--Roger Carlson
MS Access MVP
Access Database Samples: www.rogersaccesslibrary.com
Want answers to your Access questions in your Email?
Free subscription:
http://peach.ease.lsoft.com/scripts/wa.exe?SUBED1=ACCESS-L
 
K

Ken Sheridan

Essentially a one-to-many relationship involves the primary key column of one
table being referenced by a corresponding foreign key column in another. A
primary key is defined in the table design, a foreign key is not defined as
such in the table design but is implicit in the definition of the
relationship which you create between tables via the Relationships window.

It would help us advise you if you explained what your tables represent in
terms of the real world. In the meantime perhaps the following brief
summaries of (A) the relational model and (B) normalization will help (There
is some duplication between the two)

A: "The database relational model was originally proposed by E F Codd in a
paper in 1970 in the journal 'Communications of the Association for Computing
Machinery'. Since then there has been a vast amount of further theoretical
work, and the relational model has shown itself to be a robust one. Without
going too deeply into the theoretical basis, which can be quite abstract, a
relational database in essence models a part of the real world in terms of
its entity types and the relationship types between them. Note the inclusion
of the word 'type' in both cases here. While its almost always used in the
former case, its often omitted in the latter case. This is a little bit
sloppy but not too important. When one talks about a 'relationship' it
really refers to a relationship value. As an example 'marriage' is a
relationship type, but my being married to my wife Fiona is a relationship
value, represented by our names on the marriage certificate, which is the
physical equivalent of a row in a Marriages table with columns Husband and
Wife, each referencing the primary key of a table People. This is a
many-to-many relationship type (I've been married twice so would be in two
rows, my first wife would also be in two rows as she remarried too). It is
resolved into two one-to-many relationship types, People to Marriages in each
case, in one case via the Husband column in the other via the Wife column.

In a relational database tables model Entity Types. In the above example
People is an entity type, modelled by the People table. Marriage is also an
entity type, modelled by the Marriages table. As we've seen its also a
relationship type. In fact a relationship type is just a special kind of
entity type.

Each column in a table represents an attribute type of each entity type, so
attribute types of People might be FirstName, LastName, DateOfBirth etc.
This table would also have a PersonID numeric column (usually an autonumber)
as its primary key as names are not unique. Each row in a table represents
one instance of the entity type, and the attributes of each instance are
represented by values at column positions in the row. This is the only way
that data can be legitimately stored in a relational database.

Its important that there is no redundancy in the information content of the
database. This is achieved by the process of 'normalization'. Normalization
is based on a set of 'normal form's ranging from First Normal Form (1NF) to
Fifth Normal Form (5NF). There is also one called Boyce/Codd Normal Form
(BCNF) which was inserted when it was found that the original Third Normal
Form was deficient; it didn't cater satisfactorily for tables with two or
more candidate keys where the keys were composite and overlapped, i.e. had a
column in common. I won't go into the details of normalization here; you'll
find it written up in plenty of places.

To see an example of redundancy and therefore a table which is not properly
normalized take a look at the Customers table in the sample Northwind
database which comes with Access. You'll see that it includes City, Region
and Country columns. If you look at its data you'll se that we are
redundantly told that São Paulo is in province SP which is in country Brazil
4 times. This is not just inefficient, it is dangerous as it leaves the
table open to inconsistent data being entered. There is nothing to stop
somebody putting São Paulo in the UK, USA or in each in separate rows in the
table for instance. To normalize the table it should be decomposed into
Customers, Cities, Regions and Countries tables, each of the first three with
a foreign key referencing the primary key of the next table up in the
hierarchy."

B: "Normalization is the process of eliminating redundancy from a database,
and involves decomposing a table into several related tables. In a
relational database each table represents an entity type, e.g. Contacts,
Companies, Cities, States etc. and each column in a table represents an
attribute type of the entity type, e.g. ContactID, FirstName and LastName
might be attribute types of Contacts and hence columns of a Contacts table.
Its important that each attribute type must be specific to the entity type,
so that each 'fact' is stored once only. In the jargon its said that the
attribute type is 'functionally dependent' solely on the whole of the primary
key of a table.

To relate tables a 'referencing' table will have a foreign key column which
makes the link to the 'referenced' table, e.g. a Contacts table might have a
CompanyID column as a foreign key, while a Companies table has a CompanyID
column as its primary key. Consequently no data other than the CompanyID
needs to be stored in a row in the Contacts table for all the company
information for that contact to be known; its available via the relationship
and can be returned in a query joining the two tables on the CompanyID
columns.

Similarly the Companies table might have a CityID column and the Cities
table a StateID column. If its an international database the States (or more
generically Regions) table would have a CountryID referencing the primary key
of a Countries table. So via the relationships, simply by entering (in
reality this would be selected from a list of Companies in a combo box, not
typed in)a CompanyID in the Contacts table the location of the contact's
company is also known. Redundancy, and therefore repetitive data entry is
avoided.

To see how a database is made up of related tables take a look at the sample
Northwind database. Its not completely normalized in fact (deliberately so
for the sake of simplicity) but illustrates the main principles of how tables
representing entity types relate to each other. An example of its lack of
proper normalization can be found in its Customers table. You'll see that
this has City, Region and Country columns so we are told numerous times that
São Paulo is in SP region (as is Resende) and that SP region is in Brazil.
Not only does this require repetitive data entry, but more importantly it
opens up the risk of inconsistent data, e.g. it would be perfectly possible
to put São Paulo in California in one row and California in Ireland! Proper
normalization as I described above would prevent this as the fact that São
Paulo is in SP region would be stored only once in the database as would the
fact that SP region is in Brazil and California is in the USA.

An example of what at first sight might seem to be redundancy, but in fact
is not, can also be found in Northwind. The Products table and the
OrderDetails table both have UnitPrice columns. It might be thought that the
unit price of a product could always be looked up from the Products table, so
its unnecessary in Order Details. However, the unit price of a product will
change over time, but each order needs to retain the price in force at the
time the order was created. Consequently a UnitPrice column is needed in
both tables; that in products holds the current price and is used to get the
value for that in Order Details (code in the ProductID control's AfterUpdate
event procedure in the Order Details Subform does this), which then remains
static when the current price (in products) changes. In each case UnitPrice
is functionally dependent on the key of the table, so there is no redundancy"

Ken Sheridan
Stafford, England
 
C

CAPcoordinator

Roger thanks- a follow-up question: when you say that the primary key field
must have a unique index on it, what exactly do you mean?

In my specific case, I have a table capturing various information (dates,
etc.) about reports. Each report has within it multiple Recommendations. In a
related table, I capture the Recommendations (which I want to be the "many"
to the "one" report in the first table) along with some other information
(type of recommendation, etc..). In the third table I want to have Tasks,
which are also included in the reports- there are (often multiple) Tasks
associated with each Recommendation; I want Tasks to be the "Many" to the
"One" that is the Recommendation in the second table.

Recommendations and Tasks are textual (I use Memo field b/c some are long).
There are #s associated with each Recommendation / Task set (E.G.
Recommendation 1a, Task 1a.1) but they are not strictly numeric so I have to
use Text field for those...

So, I am wondering what having a unique index on my primary key field means,
with regards to my particular situation? Do I need to relate the Report field
to the Recommendation Primary Key field? (or the Recommendation to the Task
Primary Key?)

Thanks for any thoughts you have on this...
-Chris
 
R

Roger Carlson

Sorry, I'm being a little sloppy in my terminology. What I mean is this.

In your one-side table, you have to have a field (or group of fields) that
uniquely identify each record. Therefore, this field (or fields) cannot
have duplicate values. You will create a Primary Key (which is nothing more
than a special type of unique index) on this field (or fields).

Now, in most of my databases, I create what's called a surrogate key, which
is nothing more that an Autonumber field that I designate as my Primary Key.
It doesn't matter what the value is, as long as that value is placed into
the foreign key field of each related record in the Many-side table.

So in your particular case, you've got 3 tables Report, Recommendations, and
Tasks related this way:

Reports ----< Recommendations -----< Tasks

(where One-side ----< Many-side) Correct?

In that case, in the Reports table, I would have an autonumber field called
Report_ID which I would make the primary key. In the Recommendations table,
I would have an autonumber field called Recommendation_ID (I would make this
a primary key too) and a foreign key field (long integer) called Report_ID.
Lastly in the Tasks table, I would make a primary key called Task_ID and a
foreign key (long integer) called Recommendation_ID.

On my website (www.rogersaccesslibrary.com), is a small Access database
sample called "SubForm3Levels.mdb" which takes this one step futher and adds
a fourth table (which you don't need). Take a look at the Relationships
window to see what the relationships look like. You can find the sample
here: http://www.rogersaccesslibrary.com/forum/forum_posts.asp?TID=263.


--
--Roger Carlson
MS Access MVP
Access Database Samples: www.rogersaccesslibrary.com
Want answers to your Access questions in your Email?
Free subscription:
http://peach.ease.lsoft.com/scripts/wa.exe?SUBED1=ACCESS-L
 
C

CAPcoordinator

Roger thanks that explains a lot- I am working on this now, if I have other
questions for you I'll post them (so hopefully you won't hear from me for a
while :) )

thanks again
-Chris
 
C

CAPcoordinator

Hi again- OK one more question. I now have a ----< between Recommendations
and Tasks. However I cannot create a new relationship between Reports and
Recommendations (i.e. dragging one field over to another in Relationships
window doesn't work, and double-clicking to create a new relationship doesn't
either.) There was a previous relationship between Rec's and Tasks which I
had to modify as I couldn't drag-and-drop a new relationship between those
two tables either.

Have I done something else wrong here? Do I need to remove my partial data
from all Tables and re-build the Relationships?

Thanks again
 
C

CAPcoordinator

I think I figured it out- had to use the drop-down option to creae new
Relationship...
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top