Mac
I have several tables of thousands of records each. The data is
imported daily from Excel sheets or flat files that I receive from the
data originator. Because the data set is so large, I can only keep the
full set of the current data. But it's become increasingly important
to find a way to log any changes that are made to the data from day to
day.
The best advice I've received so far is to try the DISTINCTROW
predicate in SQL. That would be great if we weren't talking about so
many records. Not only would I exceed the limits of Access in short
order, I would also be storing a lot of redundant data.
What I really would like to end up with is a log table for each main
table that is almost an exact copy. The difference would be that the
fields in the log table would only be populated if they are different
from the previous day's data.
For example, given an initial record for a particular employee in the
main table like the following:
tbl_emp
date_imported  emp_id  emp_loca  emp_health  emp_rating
09OCT2010      FB6K2   Georgia   Green       Green
After several weeks of imports that replace the original, the record
in the main table will look like the following:
tbl_emp
date_imported  emp_id  emp_loca  emp_health  emp_rating
21OCT2010      FB6K2   Florida   Yellow      Red
I want to end up with a change log that looks like the following:
tbl_emp_chg_log
date_imported  emp_id  emp_loca  emp_health  emp_rating
09OCT2010      FB6K2   Florida
11OCT2010      FB6K2             Yellow
15OCT2010      FB6K2                         Red
21OCT2010      FB6K2             Yellow
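For context, a log with that shape can be populated set-based rather than record by record. A sketch, assuming the day's file is first imported into a staging table (here called tbl_emp_stage, a name I've made up for illustration) before it replaces tbl_emp: one append query per tracked column, each writing only the rows where that column's value differs from what is currently in the main table.

```sql
-- Sketch for one tracked column (emp_loca); the other columns get
-- an analogous query each. tbl_emp_stage is a hypothetical staging
-- table holding today's import. Nz() treats Null and "" the same so
-- Null-to-value changes are caught too.
INSERT INTO tbl_emp_chg_log (date_imported, emp_id, emp_loca)
SELECT s.date_imported, s.emp_id, s.emp_loca
FROM tbl_emp_stage AS s
INNER JOIN tbl_emp AS m ON s.emp_id = m.emp_id
WHERE Nz(s.emp_loca, "") <> Nz(m.emp_loca, "");
```

Because each query inserts only the changed column (plus the date and key), the other log columns stay Null, which matches the sparse layout above.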
If I were editing the data myself, I would just create a change
transaction log, since there are plenty of well-documented strategies
for that. But I need a way to compare the incoming data field by field
during the import, and then log only the fields that differ from the
existing data, across thousands of records each day.
I have a couple of ideas for approaching it, but I'm pretty sure
there's a more efficient way than For Each-ing through each field of
each record. Not only would that take a long time, I've already run
into problems writing code that tries to step through every field of
every record in such a large data set.
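One set-based alternative to looping (again a sketch, assuming a hypothetical staging table tbl_emp_stage that receives each day's import): run one saved append query per tracked column to capture changes, then replace the main table's contents from staging in two statements, with no per-record code at all.

```sql
-- Sketch: after the per-column change-capture queries have run,
-- refresh the main table wholesale from the hypothetical staging
-- table. Two set-based statements replace the row-by-row loop.
DELETE FROM tbl_emp;

INSERT INTO tbl_emp (date_imported, emp_id, emp_loca, emp_health, emp_rating)
SELECT date_imported, emp_id, emp_loca, emp_health, emp_rating
FROM tbl_emp_stage;
```

With one query per column, a table of N tracked fields needs N small append queries per day instead of a field-by-field scan in code, and Access's query engine does the comparisons.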
Even if someone just recognizes this as a known problem and can
recommend search terms that would lead me down the right path, I would
be very thankful. My searches and brainstorming are coming up empty.