I have 2 fields: [PHONE] and [AUTONUMBER]. I would like to define the
Primary Key as a multi-field primary key: the concatenation of
[PHONE]+[AUTONUMBER]. Now, [PHONE] may be null, but [AUTONUMBER] will
always contain a unique autonumber.
The problem is that Access won't allow null in [PHONE] even though the
multi-field concatenation of [PHONE]+[AUTONUMBER] would be unique. Why
not? Is this a bug?
Such a key could be implemented using a UNIQUE constraint i.e.
UNIQUE([PHONE], [AUTONUMBER]). In relational terms, a key is a key and
'primary' has no special meaning. The UNIQUE constraint can still be
used in a FOREIGN KEY constraint (see OT note below).
Perhaps the better approach is to ask: what does PRIMARY KEY give me
that UNIQUE does not, aside from requiring that all values be not null?
(FWIW, columns in the PRIMARY KEY designation can be declared as
nullable; it's just that you can't actually put the null value in
them.)
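To see the distinction in a runnable form, here is a sketch using Python and SQLite rather than Jet (the table and values are invented; note that SQLite treats nulls in a UNIQUE constraint as distinct from one another, so it is even more permissive here than Jet would be):

```python
import sqlite3

# Sketch only: SQLite, like most engines, accepts nulls under a
# multi-column UNIQUE constraint, where a PRIMARY KEY in Access/Jet
# would reject them.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE t (
    phone TEXT,                      -- nullable
    autonumber INTEGER NOT NULL,
    UNIQUE (phone, autonumber))""")

conn.execute("INSERT INTO t VALUES (NULL, 1)")       # null phone is fine
conn.execute("INSERT INTO t VALUES (NULL, 2)")
conn.execute("INSERT INTO t VALUES ('555-0100', 3)")
ok = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(ok)   # 3

# An exact duplicate of a fully non-null pair is still rejected.
try:
    conn.execute("INSERT INTO t VALUES ('555-0100', 3)")
    dup_rejected = False
except sqlite3.IntegrityError:
    dup_rejected = True
print(dup_rejected)   # True
```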
AutoNumbers are normally unique on their own? So why would you want to
use an autonumber plus another field?
A valid answer to your question is clustering, i.e. physical ordering
on disk.
The better 'phone book' example would be the desire to cluster on the
subscribers' names. The phone number (using a natural key) or
autonumber would be required to break duplicates because subscribers'
names may not be unique.
Care is required to declare the columns in the correct order, i.e.
PRIMARY KEY (name, phone_number): the columns should be in
left-to-right order of significance to the clustering.
Clustering on name would favour BETWEEN (e.g. grab all names beginning
with 'B') and GROUP BY on the name because the rows would be physically
contiguous (and on the same page) to begin with and would be faster to
fetch. Think how 'useful' a paper phone book physically ordered on
telephone number would be to the average user.
Clustering on phone number or *random* autonumber on the other hand
would favour concurrency but a sequential autonumber PRIMARY KEY may be
the worst choice in any case.
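SQLite has no clustered tables, but the same left-to-right ordering rule applies to any composite index, which makes the point easy to demonstrate; a sketch with an invented table, index and data:

```python
import sqlite3

# Sketch only: a composite index on (name, phone). A range predicate on
# the leading column can seek on the index; the same predicate on the
# trailing column cannot.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subscribers (name TEXT NOT NULL, phone TEXT, address TEXT)")
conn.execute("CREATE INDEX ix_name_phone ON subscribers (name, phone)")
conn.executemany("INSERT INTO subscribers VALUES (?, ?, ?)",
                 [("Baker", "555-0100", "1 High St"),
                  ("Baker", "555-0101", "2 High St"),
                  ("Able", "555-0200", "3 Low Rd"),
                  ("Cox", None, "4 Low Rd")])

# Range on the leading column: the planner seeks on the index.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM subscribers "
    "WHERE name BETWEEN 'B' AND 'C'").fetchall()
print(plan[0][3])   # e.g. SEARCH ... USING INDEX ix_name_phone (name>? AND name<?)

# Range on the trailing column: the index cannot be seeked, so a scan.
plan2 = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM subscribers "
    "WHERE phone BETWEEN '555-0100' AND '555-0200'").fetchall()
print(plan2[0][3])  # e.g. SCAN subscribers
```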
[OT]
FWIW, nulls in foreign key constraints are one of the few areas where
the Jet implementation is IMO better than SQL Server's (and, Allen, I
know you are a fan of the 'CASCADE to null' feature in Jet 4.0 <g>).
SQL Server will disregard all values in the referencing columns if one
of them contains a null, and also will not cascade them, e.g. from a
null value to a not null value. When you see how Jet handles it, it
makes the SQL Server approach seem very odd e.g.
CREATE TABLE Table1 (
key_col1 INTEGER NOT NULL,
key_col2 INTEGER,
UNIQUE (key_col1, key_col2)
)
;
CREATE TABLE Table2 (
key_col1 INTEGER NOT NULL,
key_col2 INTEGER,
FOREIGN KEY (key_col1, key_col2)
REFERENCES Table1 (key_col1, key_col2)
ON DELETE CASCADE
ON UPDATE CASCADE
)
;
INSERT INTO Table1 (key_col1, key_col2) VALUES (1, NULL)
;
INSERT INTO Table1 (key_col1, key_col2) VALUES (2, NULL)
;
INSERT INTO Table1 (key_col1, key_col2) VALUES (3, 3)
;
INSERT INTO Table2 (key_col1, key_col2) VALUES (1, NULL)
;
INSERT INTO Table2 (key_col1, key_col2) VALUES (2, NULL)
;
INSERT INTO Table2 (key_col1, key_col2) VALUES (3, NULL)
;
That last insert would fail in Jet but in SQL Server it would succeed
on the logic, I guess, that Table1.key_col2 = Table2.key_col2 returns
unknown and therefore should not fail; Jet's apparent logic, 'NULL must
equal NULL', is harder to defend 'relationally'. However, what seems
strange IMO is the SQL Server CASCADE behaviour:
UPDATE Table1
SET key_col2 = 1
WHERE key_col1 = 1
;
The above change does not cascade to the referencing table (Table2) in
SQL Server. While the inability to compare null values between tables
may be sound 'relationally', it would seem to be unintuitive and not
the desired effect. Similarly, when deleting:
DELETE
FROM Table1
WHERE key_col1 = 2
the row is removed from the referencing table in Jet but not in SQL
Server.
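FWIW, SQLite follows the same 'match simple' interpretation described for SQL Server (a null anywhere in the referencing columns satisfies the constraint, and cascades never match null rows), so the whole sequence can be replayed with it; a sketch, not Jet semantics:

```python
import sqlite3

# Sketch: replaying the Table1/Table2 example under 'match simple'
# semantics (SQLite, like SQL Server; Jet would behave differently).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""CREATE TABLE Table1 (
    key_col1 INTEGER NOT NULL,
    key_col2 INTEGER,
    UNIQUE (key_col1, key_col2))""")
conn.execute("""CREATE TABLE Table2 (
    key_col1 INTEGER NOT NULL,
    key_col2 INTEGER,
    FOREIGN KEY (key_col1, key_col2)
        REFERENCES Table1 (key_col1, key_col2)
        ON DELETE CASCADE ON UPDATE CASCADE)""")
conn.executemany("INSERT INTO Table1 VALUES (?, ?)",
                 [(1, None), (2, None), (3, 3)])
conn.executemany("INSERT INTO Table2 VALUES (?, ?)",
                 [(1, None), (2, None)])

# The insert that Jet rejects succeeds here: the null in the
# referencing columns satisfies the constraint outright.
conn.execute("INSERT INTO Table2 VALUES (3, NULL)")

# Neither cascade touches the null rows:
conn.execute("UPDATE Table1 SET key_col2 = 1 WHERE key_col1 = 1")
conn.execute("DELETE FROM Table1 WHERE key_col1 = 2")
rows = conn.execute(
    "SELECT key_col1, key_col2 FROM Table2 ORDER BY key_col1").fetchall()
print(rows)   # the (2, NULL) row survives; (1, NULL) is not updated
```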
I think the Jet behaviour has more utility. Such 'practical' decisions
are hardly unprecedented, even by the famously strict ANSI SQL
standards committees. For example, if one null value cannot be compared
as being equal to another null value, then why do they GROUP together?
e.g.
DELETE FROM Table1
;
INSERT INTO Table1 (key_col1, key_col2) VALUES (1, NULL)
;
INSERT INTO Table1 (key_col1, key_col2) VALUES (2, NULL)
;
INSERT INTO Table1 (key_col1, key_col2) VALUES (3, 3)
;
SELECT key_col2, COUNT(*)
FROM Table1
GROUP BY key_col2
;
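They do indeed group together; here is the same example translated into a runnable Python/SQLite sketch (same data, invented in-memory connection):

```python
import sqlite3

# Sketch: nulls are not equal to each other under comparison, yet
# GROUP BY gathers them into a single group.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table1 (key_col1 INTEGER NOT NULL, key_col2 INTEGER)")
conn.executemany("INSERT INTO Table1 VALUES (?, ?)",
                 [(1, None), (2, None), (3, 3)])

# NULL = NULL evaluates to unknown, so only the (3, 3) row matches...
matched = conn.execute(
    "SELECT COUNT(*) FROM Table1 WHERE key_col2 = key_col2").fetchone()[0]
print(matched)   # 1

# ...yet the two null rows land in one group.
groups = conn.execute(
    "SELECT key_col2, COUNT(*) FROM Table1 "
    "GROUP BY key_col2 ORDER BY key_col2").fetchall()
print(groups)   # [(None, 2), (3, 1)]
```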
The answer is, at least in part, because most people would expect it to
work this way and such an approach has the greater practical utility.
Jamie.