It might help if we first have a brief look at the underlying principles of
the relational model as a grasp of the theoretical basis makes the practical
application of those principles a whole lot easier.
The database relational model was originally proposed by E F Codd in a paper
in 1970 in the journal 'Communications of the Association for Computing
Machinery'. Since then there has been a vast amount of further theoretical
work, and the relational model has shown itself to be a robust one. Without
going too deeply into the theoretical basis, which can be quite abstract, a
relational database in essence models a part of the real world in terms of
its entity types and the relationship types between them. Note the inclusion
of the word 'type' in both cases here. While its almost always used in the
former case, its often omitted in the latter case. This is a little bit
sloppy but not too important. When one talks about a 'relationship' it
really refers to a relationship value. As an example 'marriage' is a
relationship type, but my being married to my wife Fiona is a relationship
value, represented by our names on the marriage certificate, which is the
physical equivalent of a row in a Marriages table with columns Husband and
Wife, each referencing the primary key of a table People. This is a
many-to-many relationship type (I've been married twice so would be in two
rows, my first wife would also be in two rows as she remarried too). It is
resolved into two one-to-many relationship types, People to Marriages in each
case, in one case via the Husband column in the other via the Wife column.
In a relational database tables model Entity Types. In the above example
People is an entity type, modelled by the People table. Marriage is also an
entity type, modelled by the Marriages table. As we've seen its also a
relationship type. In fact a relationship type is just a special kind of
entity type.
Each column in a table represents an attribute type of each entity type, so
attribute types of People might be FirstName, LastName, DateOfBirth etc.
This table would also have a PersonID numeric column (usually an autonumber)
as its primary key as names are not unique. Each row in a table represents
one instance of the entity type, and the attributes of each instance are
represented by values at column positions in the row. This is the only way
that data can be legitimately stored in a relational database.
Its important that there is no redundancy in the information content of the
database. This is achieved by the process of 'normalization'. Normalization
is based on a set of 'normal form's ranging from First Normal Form (1NF) to
Fifth Normal Form (5NF). There is also one called Boyce/Codd Normal Form
(BCNF) which was inserted when it was found that the original Third Normal
Form was deficient; it didn't cater satisfactorily for tables with two or
more candidate keys where thee keys were composite and overlapped, i.e. had a
column in common. I won't go into the details of normalization here; you'll
find it written up in plenty of places.
To see an example of redundancy and therefore a table which is not properly
normalized take a look at the Customers table in the sample Northwind
database which comes with Access. You'll see that it includes City, Region
and Country columns. If you look at its data you'll se that we are
redundantly told that São Paulo is in province SP which is in country Brazil
4 times. This is not just inefficient, it is dangerous as it leaves the
table open to inconsistent data being entered. There is nothing to stop
somebody putting São Paulo in the UK, USA or in each in separate rows in the
table for instance. To normalize the table it should be decomposed into
Customers, Cities, Regions and Countries tables, each of the first three with
a foreign key referencing the primary key of the next table up in the
hierarchy.
So, lets look at your scenario and identify the entity types and
relationship types involved, and the tables needed to model them. The role
of the first four below, and the columns they'd have will be obvious I think:
1. Players.
2. Matches
3. ColoursPlayed
4. GameStatuses
5. Games. This table has rows representing each game, with columns
representing the attributes of the game as a whole. One of these would be
GameStatus as this is an attribute of the game, not of a particular player's
involvement in it. In the jargon its said to be 'functionally dependent' on
the key of the table. This would be a foreign key column referencing the
primary key of the GameStatuses table. You are in a better position to
identify what other columns might be in this table.
6. GamesPlayers. This models a relationship type between Players and Games
so would have foreign key columns referencing the primary keys of each, e.g.
PlayerID and GameID. Together these form the composite primary key of this
table as the combined values PlayerID and GameID must be unique. It also has
attributes ColourPlayed and GameStatus, which are also foreign keys
referencing the primary keys of the ColoursPlayed and GameStatuses tables, so
in fact the relationship type is really between all four tables. And it has
a Score column of course. I'll say more about Moves below.
7. Moves. How you include these depends on the degree of analysis you
might want to make of moves. If you simply want to record the moves for a
game, without being able to compute any analysis of the moves, then you can
simply enter than into a memo field using conventional notation. This would
be an attribute of the game as a whole so the memo field would be in the
Games table. If on the other hand you want to be able to undertake some
analysis on the moves then you'd have a separate Moves table with one row per
move, with columns Move, TimeMoved (don't call the column Time; that's the
name of a built in function). This table would have a composite foreign key
of PlayerID and GameID referencing the similarly composite primary key of
GamesPlayers.
Now as regards the interface the main input would be via a Games form, based
on the games table and a GamesPlayers subform within it, linked on GameID.
The data would not be inserted into rows in the subform via code but manually
by selecting the name of each player, the colour, status and entering the
score per player. By uniquely indexing on GameID, PlayerID and ColourPlayed
in the table definition both players cannot be given the same colour. To
ensure only one row per Bye is entered and two rows, no more no less, per
played match are entered would require some validation code in the subform's
BeforeUpdate event procedure. The same event procedure would validate the
scores, though this would only kick in when the second player is entered of
course as even Access does not have powers of prescience! Code in the
AfterUpdate event procedure and the parent form's Current event procedure
could also set the subform's AllowAdditions property to True or False as
appropriate. I'd suggest that you concentrate on getting the form/subform
set up and working before tackling the validation code, however. We can
always come back to that later.
Finally one point which has just occurred to me, arising from my own
ignorance of how chess matches are organised (I'm not a chess player, the
sort of matches I've played in generally took place in a muddy field and
involved beating the opposition to a pulp!). Is a match a set of games
between the same two players or is it a set between different combinations of
players? What I'm really asking is whether a match is between two teams or
two individuals. If it’s the latter then the Matches table models a
relationship between two instances of Players, so the Matches table would
include two columns Player1 and Player2 say, both referencing the PlayerID
key of Players. This could then be used to control which players can be
selected in the GamesPlayers subform, i.e. the combo box would only list the
two players for the match in question.
Ken Sheridan
Stafford, England
MikeB said:
Right, please bear with me. I have the feeling this is an important
concept that I'm just not getting. It may explain why I can't use the
Games table, but have to run a union query over the games table to
derive the source data for subsequent reports I'm creating.
Today I have a table [Games] with the following fields: GameID,
MatchID, WhitePlayerID, WhiteScore, BlackPlayerID, BlackScore Result,
Moves, Time
If I understand you correctly, I need to change this to:
Table [Games] with fields: GameID, MatchID, Moves, Time
and two records in Table [GamesPlayers], with the following fields:
GameID, PlayerID, Color, Score.
Then I have to design a form with a subform as follows: Form Games
allows me to Add a new Game and then in the subform (which I presume I
have to open in Datasheet mode to see two rows at the same time) I
enter in the PlayerID (how do I get them in the form of a drop-down
list?) and the result. But how does the form ensure that two players
get entered for each game, unless the Color field says "Bye?" How does
the form ensure that the sum of the two records' score is 1,
appropriately apportioned between white and black depending on who won/
draw?
I've looked in the Access books I have and this doesn't seem to be
covered.
As I type this, a thought crossed my mind. Did you intend (and I'm not
knowledgeable enough) that the data is entered in the Games Form and
then programatically propagated to the two subrecords in GamesPlayers?
I can kind of see how that would work, but it seems to be a SMOP
(small matter of programming) that I'm not familiar with.