Should I use auto-number for my tables?

C

CAD Fiend

I am new to Access. I am wondering the big question: Should I use
auto-number for my Primary Key field?

Why or why not?

To me, I think that the ID fields should be something relevant to the
function or purpose of that particular table, like if you were doing a
table on different job sites for your projects, you would use something
that tied in to that job site, like the project number, for the primary
key field, instead of an auto-number.

What (if any) is the disadvantage of NOT using auto-numbering?

TIA.
 
S

Steve Schapel

CAD,

This is a question on which opinions vary.

For almost all tables, it is important that each record can be uniquely
identified. If your data includes a field which does this already, such
as your example of a Project Number, then clearly this should be used as
the primary key field, and there is no advantage at all to be gained by
adding an AutoNumber field. If the data does not contain such a field,
in which case the unique identification would otherwise need to be done
on the basis of a multiple-field index, I personally always use an
AutoNumber field. To me, the main advantages are simplicity and
convenience.
 
J

John Vinson

I am new to Access. I am wondering the big question: Should I use
auto-number for my Primary Key field?

This question has set off flamewars in here in the past... let's hope
not this time!

Autonumbers are handy; Access pushes them. But they're certainly not
required and - if you have a valid "candidate key" - not even a good
idea, since they can give you a false assurance that you have no
duplicate data. The autonumber will make the record unique even if all
the other fields have duplicates!

My rule of thumb is that a good candidate primary key should be
unique, stable, and short. Unique is obviously a requirement by
definition. Stable is valuable; you can use Cascade Updates to
propagate changes if (say) Canada changes the province abbreviation
for Newfoundland, as they did a while back, but I wouldn't want to
have cascading updates invoked *frequently*. Short is perhaps the
least essential, with disk being down under a penny a megabyte these
days - but there are good performance and efficiency arguments to say
that a 4-byte Long Integer foreign key in a half-dozen tables might be
better than a 60-byte product name, say.

If you have a Project Number that you can count on to be unique and
(reasonably) stable and short, by all means use it as the Primary Key.
On the other hand, if you have a table of People, consider using an
autonumber PersonID: names are neither unique, stable, nor short, and
even if you use first/last/middle name/address you may still find
duplicates (my friend Fred Brown's son Fred Brown has moved off to
college but you get the idea).

John W. Vinson[MVP]
Join the online Access Chats
Tuesday 11am EDT - Thursday 3:30pm EDT
http://community.compuserve.com/msdevapps
 
J

Jamie Collins

John Vinson said:
<snip> My rule of thumb is that a good candidate primary key should be
unique, stable, and short <snip>

Some good points here (I too prefer a natural key if possible).

I'd just like to add another point. Although using autonumber is
useful in forcing a row to be unique, it's rarely a good idea to use
an autonumber column as the primary key (PK) for a table.

Theoretically speaking, all keys are equally powerful (Celko). In
practise, the primary key is often given special significance. With
Jet (MS Access), the PK determines the table's clustered index, being
its physical order on disk.

Having a useful physical order is a significant performance
consideration. Think of a paper copy telephone directory sorted by
telephone number - not terribly useful. Adding an index to this
directory ordered on last name then first name would make it more
useful but it would still not be a joy to manually extract the phone
numbers for all subscribers with a last name beginning with the letter
'S'. If the directory was physically ordered on last name then first
name, the same task would be simplicity itself: you'd just need to
grab the relevant pages.

So if the autonumber is needed solely to ensure uniqueness, use a
compound key of the column(s) you use most often in GROUP BY, BETWEEN
and ORDER BY clauses and just tack the autonumber on the end.

Jamie.

--
 
B

Bruce

As a relatively new Access user I perhaps don't have the diverse experience
of some who post here, but I have to disagree with the contention that it is
"rarely a good idea" to use an autonumber PK. The most commonly used fields
are often subject to change: people and company names change, addresses
change, etc. I question the usefulness of the phone book analogy, since any
comparison to a paper document is more in the nature of a thought experiment
than an actual situation.
Compound field PKs have their uses, but can make relationships difficult to
manage. The PK does not need to be useful to the user. I have made a
training records database with an autonumber PK in the parent table. I
relate that field to a corresponding number field in the child table, but I
never see it or work with it directly. I could probably combine enough
fields to come up with a unique combination, but I can't think of a reason
for doing so. I use queries to customize the order in which records appear.
 
J

Jamie Collins

Bruce said:
As a relatively new Access user I perhaps don't have the diverse experience
of some who post here, but I have to disagree with the contention that it is
"rarely a good idea" to use an autonumber PK. The most commonly used fields
are often subject to change: people and company names change, addresses
change, etc. I question the usefulness of the phone book analogy, since any
comparison to a paper document is more in the nature of a thought experiment
than an actual situation.
Compound field PKs have their uses, but can make relationships difficult to
manage. The PK does not need to be useful to the user. I have made a
training records database with an autonumber PK in the parent table. I
relate that field to a corresponding number field in the child table, but I
never see it or work with it directly. I could probably combine enough
fields to come up with a unique combination, but I can't think of a reason
for doing so. I use queries to customize the order in which records appear.

Bruce, Thank you for your comments. I'm glad you've correctly seen the
phone directory as a thought process/analogy rather than an example as
such.

In case I didn't make my point clear, I'll say a few more words. I'm
talking about performance. If this paper copy directory was sorted by
telephone number and I asked you, a human, to extract all the phone
numbers for people whose last name began with an 'S', you would have
to look at every entry and it would take a long time (bad
performance). Even if said book had an index on last name that told
you the page number, it wouldn't make your task much easier. If the
list was by last name order, it would take you mere seconds to rip out
the relevant pages (vastly better performance).

A Jet database's logical index works more effectively than a paper
equivalent but it hold true that having the data in logical *and*
physical order will give a distinct advantage. You may customize the
order in queries but there will be less work to be done if the data is
already in the correct order i.e. your quires will run faster.

I can't think of a better example, so I'll break my own rule and
continue with the telephone directory <g>. As a key, you are proposing

(phone number)

and I'm proposing

(last name, first name, phone number)

You have your reasons (to make 'relationships' easier to manage) and I
have mine (to make queries run faster). So we'll have both! We'll
create a unique index for each. They are both equally 'powerful'.

But how do we decide which shall be the *primary* key (PK)? In
theory, it makes no difference. In practice, we should look for the
'special' meaning of the PK in the product's implementation.

There is no advantage in making your key the PK. You can still define
your 'relationships' on the unique index and maintain referential
integrity plus you've only got one column for the simplicity you
desire.

However, if we make mine the PK we get the performance advantage of a
sensible physical order (clustered index).

So it's a win-win situation. The mistake you made is assuming the
wrong 'special' meaning of the PK in Jet.

Jamie.

--
 
B

Bruce

I've always understood that a good rule for PKs is to choose values that will
not change. I guess the thing I don't follow with your Last name, First
name, Phone number example is what happens if the name changes. I realize it
won't affect new records, but what about old records? This may not be
relevant for a phone directory, but for an example of a different instance I
manage a training records database for my company. PK for the employees is
EmployeeID. I wouldn't want to include name in the PK, since an employee's
name can change. What would become of records from before the name change?
Maybe I'm missing something here, and if so I would like to know what it is
so that I gain perspective for future projects.
 
J

John Vinson

I've always understood that a good rule for PKs is to choose values that will
not change. I guess the thing I don't follow with your Last name, First
name, Phone number example is what happens if the name changes.

My friends Fred Brown and Fred Brown Jr. no longer live in the same
house - but they did for years. And yes they only had the one phone.

Boxer George Foreman had, as I understand, five sons named George,
George, George, George, and George. Uncommon - but names do NOT make
good keys!

John W. Vinson[MVP]
Join the online Access Chats
Tuesday 11am EDT - Thursday 3:30pm EDT
http://community.compuserve.com/msdevapps
 
J

Jamie Collins

Steve Schapel said:
Bruce,

I agree. I use an AutoNumber Primary Key in most of my tables.

Your comprehensive counter-arguments have me beat <g>.

Jamie.

--
 
J

Jamie Collins

John Vinson said:
My friends Fred Brown and Fred Brown Jr. no longer live in the same
house - but they did for years. And yes they only had the one phone.

Boxer George Foreman had, as I understand, five sons named George,
George, George, George, and George. Uncommon - but names do NOT make
good keys!

John,
With respect, you missed the point. The example employed is admittedly
weak (FWIW in my country, only one name may be listed per residential
phone number, being the person responsible for paying the bill!)

I agree that when

(phone number)

is not a key then

(phone number, last name, first name)

is not a reliable improvement.

My argument (i.e. the latter compound key makes a superior Jet PK)
relies on phone number a providing a unique key for a person.

Jamie.

--
 
J

Jamie Collins

Bruce said:
I've always understood that a good rule for PKs is to choose values that will
not change.
I manage a training records database for my company. PK for the employees is
EmployeeID. I wouldn't want to include name in the PK, since an employee's
name can change.

I don't see the problem to which you allude. Say your employees table
had a compound PK of

(last name, first name, employee ID)

and another unique index on

(employee ID)

and 'relationships' in other tables referenced the index. If last name
and/or first name changed for a particular employee ID, your clustered
index (physical order) would be rebuilt at the next compact and any
'relationships' would be unaffected.

If, on the other hand, employee ID is permitted to be changed for an
individual (though I've never know this to be the case in practice),
then perhaps employee ID as PK would gain an edge because you could
take advantage of ON UPDATE CASCADE.
What would become of records from before the name change?

I can't answer that because you don't say how you currently handle
this. I think the most common implementation is that the names get
overwritten and no new row/record is created. If required, a 'former
name' column(s) in the same table is updated as appropriate. In my
experience of employee databases, name change situations more complex
than current names + former names are recorded in an 'employee notes'
type column. If you need to store name changes temporally to create a
history, they would need to be in a separate table to avoid data
redundancy issues.

Jamie.

--
 
B

Bruce

Jamie,

Thanks for all of the time and thought you have put into your replies. This
discussion has given me something to consider when databases experience
performance problems. The discussion is a bit beyond me because of my
limited understanding (so far) of how indexes work. But a discussion such as
the current one would have been over my head not that long ago, so there is
hope that I will be able to continue assimilating and processing new
information.
 
A

alfaista

Just wanted to say a big thank you to you all for this discussion! Yes, I
can get out there and google and find even more, but I needed a few good
discussion points and found them all on this one thread, thanks!
I have issues with AutoNumbers, especially when someone decides to use them
extensivley. Such as having used for all keys in all tables. I.e., the
product table has an autonumber for the key, that links to an autonumber for
the producer of said product, that links to an autonumber for a point of
contact for said producer. The product table also links to itself via
another level table that shows which sub-products may belong to an overall
product. This level table is a series of 10 integers, all autonumbers from
the products table.
I find this mind boggling, and I was told that the data was all properly
"normalized".
Am I missing the picture here?
An answer would be great, but I may have developed an opinion from this
thread as it is. Any more discussion is welcome.

Thanks again.
 
A

alfaista

Just wanted to say a big thank you to you all for this discussion! Yes, I
can get out there and google and find even more, but I needed a few good
discussion points and found them all on this one thread, thanks!
I have issues with AutoNumbers, especially when someone decides to use them
extensivley. Such as having used for all keys in all tables. I.e., the
product table has an autonumber for the key, that links to an autonumber for
the producer of said product, that links to an autonumber for a point of
contact for said producer. The product table also links to itself via
another level table that shows which sub-products may belong to an overall
product. This level table is a series of 10 integers, all autonumbers from
the products table.
I find this mind boggling, and I was told that the data was all properly
"normalized".
Am I missing the picture here?
An answer would be great, but I may have developed an opinion from this
thread as it is. Any more discussion is welcome.

Thanks again.
 
A

alfaista

Just wanted to say a big thank you to you all for this discussion! Yes, I
can get out there and google and find even more, but I needed a few good
discussion points and found them all on this one thread, thanks!
I have issues with AutoNumbers, especially when someone decides to use them
extensivley. Such as having used for all keys in all tables. I.e., the
product table has an autonumber for the key, that links to an autonumber for
the producer of said product, that links to an autonumber for a point of
contact for said producer. The product table also links to itself via
another level table that shows which sub-products may belong to an overall
product. This level table is a series of 10 integers, all autonumbers from
the products table.
I find this mind boggling, and I was told that the data was all properly
"normalized".
Am I missing the picture here?
An answer would be great, but I may have developed an opinion from this
thread as it is. Any more discussion is welcome.

Thanks again.
 
A

alfaista

Just wanted to say a big thank you to you all for this discussion! Yes, I
can get out there and google and find even more, but I needed a few good
discussion points and found them all on this one thread, thanks!
I have issues with AutoNumbers, especially when someone decides to use them
extensivley. Such as having used for all keys in all tables. I.e., the
product table has an autonumber for the key, that links to an autonumber for
the producer of said product, that links to an autonumber for a point of
contact for said producer. The product table also links to itself via
another level table that shows which sub-products may belong to an overall
product. This level table is a series of 10 integers, all autonumbers from
the products table.
I find this mind boggling, and I was told that the data was all properly
"normalized".
Am I missing the picture here?
An answer would be great, but I may have developed an opinion from this
thread as it is. Any more discussion is welcome.

Thanks again.
 
A

alfaista

Just wanted to say a big thank you to you all for this discussion! Yes, I
can get out there and google and find even more, but I needed a few good
discussion points and found them all on this one thread, thanks!
I have issues with AutoNumbers, especially when someone decides to use them
extensively. Such as having used for all keys in all tables. I.e., the
product table has an autonumber for the key, that links to an autonumber for
the producer of said product, that links to an autonumber for a point of
contact for said producer. The product table also links to itself via
another level table that shows which sub-products may belong to an overall
product. This level table is a series of 10 integers, all autonumbers from
the products table.
I find this mind boggling, and I was told that the data was all properly
"normalized".
Am I missing the picture here?
An answer would be great, but I may have developed an opinion from this
thread as it is. Any more discussion is welcome.

Thanks again.
 
A

alfaista

i am sorry, the following posts are dupliates!! the system kept tell me
there was an error posting my message and i should try again later, so i
did!! as you can see, several times! so sorry!!!
 
J

Jamie Collins

alfaista said:
Just wanted to say a big thank you to you all for this discussion! Yes, I
can get out there and google and find even more

Yes, this has been discussed many times before and here is a suggested
search in the google groups Usenet archive:

http://groups.google.com/groups?q=i...ial+natural+key&meta=group=microsoft.public.*

And here's a biased view from one expert:

http://www.intelligententerprise.com/030320/605celko1_1.jhtml?_requestid=139469

Note that IDENTITY the SQL Server equivalent of, and a Jet 4.0 synonym
for, automnumber.

Jamie.

--
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top