D
Darrell
I have a recurring question which I can't seem to locate any discussion
of. It relates to keeping historical data in a relational database. This
may seem basic, but I've never come across an answer which satisfies my
curiosity.
Obviously, when you are joining from one table to related table through
a key field, data accessed via this key is what is currently in the
fields of the related table. Whenever data in the related table changes,
these changes will be reflected in any joins. This, is, in fact, part of
the whole point of a RDBMS. It backfires, however, whenever you want to
collect and preserve HISTORICAL information.
The classic example is when a vendor address changes. Purchases are made
from a vendor (Vendor A), and all that is recorded for the purchases is
the key for Vendor A. Then Vendor A's address changes. All purchases
ever made from Vendor A will now show Vendor A's current current -
clearly not true.
I have two possible solutions to this problem, the first of which I have
used. The first one is to duplicate all the fields of Vendor table in
the Purchases table and write the values from Vendors into Purchases
whenever Vendor A is selected. This seems a horrible redundancy of data.
The second one is to create multiple records for the Vendors table with
date ranges as part of the key which must be referenced in joins to that
table. This is not quite so redundant but still seems clunky, involving
the complications of including the date range in ALL joins.
Are there any other options????? This presents itself in virtually every
database I've ever designed, and I would certainly like some more
enlightened approaches.
Thanks, as always, for all suggestions,
Darrell
of. It relates to keeping historical data in a relational database. This
may seem basic, but I've never come across an answer which satisfies my
curiosity.
Obviously, when you are joining from one table to related table through
a key field, data accessed via this key is what is currently in the
fields of the related table. Whenever data in the related table changes,
these changes will be reflected in any joins. This, is, in fact, part of
the whole point of a RDBMS. It backfires, however, whenever you want to
collect and preserve HISTORICAL information.
The classic example is when a vendor address changes. Purchases are made
from a vendor (Vendor A), and all that is recorded for the purchases is
the key for Vendor A. Then Vendor A's address changes. All purchases
ever made from Vendor A will now show Vendor A's current current -
clearly not true.
I have two possible solutions to this problem, the first of which I have
used. The first one is to duplicate all the fields of Vendor table in
the Purchases table and write the values from Vendors into Purchases
whenever Vendor A is selected. This seems a horrible redundancy of data.
The second one is to create multiple records for the Vendors table with
date ranges as part of the key which must be referenced in joins to that
table. This is not quite so redundant but still seems clunky, involving
the complications of including the date range in ALL joins.
Are there any other options????? This presents itself in virtually every
database I've ever designed, and I would certainly like some more
enlightened approaches.
Thanks, as always, for all suggestions,
Darrell