Best way to move multiple authors into Junction table after import

P

PaulAlex

Hello all. I am relatively new to Access and SQL (started playing with
it a few days ago), and so I have a question on the best way to move
multiple authors into a junction table.

Here's the deal:

1)
I had a flat-file database of a bunch of books, in CSV format. The
fields consisted of:

title
auth1_firstname
auth1_lastname
auth2_firstname
auth2_lastname
auth3_firstname
auth3_lastname
auth4_firstname
auth4_lastname
pubmonth
pubyear
isbn
desc(ription)

2)
In Access, I created three tables: Books, Authors, BooksAuthors, and
Publishing.

The Books table has the Title and Desc fields, plus BookID which is
AutoNum (AN) and a Primary Key (PK).

The Authors table has AuthorID (designed as AN and PK), and has all of
the Auth* fields listed above. I know this isn't set up right, and
once I get this set up right, I believe everything else will fall into
place..

The Publishing table has PubMonth, PubYear and ISBN.

The BooksAuthors table has BookAuthID as a AN and a PK. In addition, I
used the LookUp Wizard to create two more fields: link_BookID and
link_AuthID, which link to the respective fields in the Books and
Authors tables.

I had read about junction tables, and fits my scenario, since I do
have books that have multiple authors, and authors who have written
more than one book.

3)
I imported my CSV table into a Holding table.

Currently the Holding table has all the data and all the fields from
the CSV file.

My plan is to use a bunch of SQL queries to move all that data from
Holding into the various other tables (once I learn about Append,
Update and other SQL statements, that is).

4)
But my question would be: How do I move the auth1_* and auth2* and
auth3* and auth4* data to in such a way that they are linked?

-- Paul
 
J

John Nurick

Hi Paul,

You're pretty much on the right lines.

The Authors table needs to have something like
AuthorID AutoNumber PK
LastName
FirstName
One or more additional fields to help
distinguish between authors who share the
same name. The only one in the actual data is
Title, it seems, but later on you might be
able to add the author's dates or a description.

You may also need to think about how to handle (a) variant
forms of authors' names (b) pseudonyms (c) Anon and (d)
institutional authors.

Populate this by using a series of 4 append queries to get each of the
paired author fields from the holding table:

INSERT INTO Authors (LastName, Firstname, Title)
SELECT auth1_lastname, auth1_firstname, Title
FROM HoldingTable;

Having got all the authors from all fields in the holding table into the
Authors table, run a Find Duplicates query to show you names that appear
more than once. Using your knowledge of the field, delete any genuinely
duplicate entries and add dates or descriptions to distinguish between
different authors with the same names. Then scrutinise the table as a
whole to correct spelling mistakes and regularise variant forms, and
repeat the de-duping exercise. Eventually, you will have a table with
one record for each genuine author.

The next thing I'd do is add four AuthorID fields to the holding table
and run a series of update queries joining Authors and HoldingTable to
populate each of these with the ID of the relevant author. Some handwork
may be needed to deal with authors who share names, spelling mistakes
and variant forms.

You can then append the books to the books table, thus creating the
BookID values; add a BookID field to the holding table and populate it
(again, care is needed when dealing with different books with the same
titles); and finally use 4 append queries to generate the records in the
BookAuthors table.
 
P

PaulAlex

Thanks, John. I'll print out your response and go through it. I
suspect that the first time through I'll have a bumpy road to drive.
But that's okay because I'll figure I'll learn something as I try to
figure out what I did wrong.

If by institutional authors, you mean "house" names, that's a good
point that you raised. A lot of my books are credited to house names.
Originally, I used the actual ghost writer as the author name, but now
I think I should actually include the house name.

One more question tho:
INSERT INTO Authors (LastName, Firstname, Title)
SELECT auth1_lastname, auth1_firstname, Title
FROM HoldingTable;

I'm assuming of course that for each of the 4 iterations, I'll want to
change auth1* fields to auth2* and auth3* and so on. So Access will be
good enough to "remember" that the auth2* and subsequent auth* fields
will be tied to the book? In other words, those subsequent auth*
fields won't become orphaned?

If I may attempt to answer my own question (I actually haven't gotten
that far in the SQL tutorial book yet), I'm guessing the answer is No,
because each of those Auth* fields are still marked by the AuthorID
AutoNumber Primary Key field?

Thanks again, and in advance, for your help.

Paul
 
J

John Nurick

Hi Paul,

Comments in line.

Thanks, John. I'll print out your response and go through it. I
suspect that the first time through I'll have a bumpy road to drive.
But that's okay because I'll figure I'll learn something as I try to
figure out what I did wrong.

If by institutional authors, you mean "house" names, that's a good
point that you raised. A lot of my books are credited to house names.
Originally, I used the actual ghost writer as the author name, but now
I think I should actually include the house name.

One more question tho:


I'm assuming of course that for each of the 4 iterations, I'll want to
change auth1* fields to auth2* and auth3* and so on. So Access will be
good enough to "remember" that the auth2* and subsequent auth* fields
will be tied to the book? In other words, those subsequent auth*
fields won't become orphaned?

If I may attempt to answer my own question (I actually haven't gotten
that far in the SQL tutorial book yet), I'm guessing the answer is No,
because each of those Auth* fields are still marked by the AuthorID
AutoNumber Primary Key field?

No (or yes). Access doesn't remember that things are tied together
unless you set up relationships. What we're trying to do is to get from
something like this:

Title, Auth1FN, Auth1LN, Auth2FN, Auth2LN, Auth3FN, Auth3LN ...
"Well I never!","Will","Wynken","Bob","Blynken","Ned","Nod" ...
"Yes you did!","Ned","Nod","Harry","Potter" ...

to this:

Authors
AuthorID, LastName, FirstName...
1, Wynken, Will
2, Blynken, Bob,
3, Nod, Ned
4, Potter, Harry

Books
BookID, Title ...
1, Well I never!, ...
2, Yes you did!, ...

BookAuthors
BookID, AuthorID
1,1
1,2
1,3
2,3
2,4

It's necessary first to construct the Authors table by extracting all
the different authors from all four pairs of "author name" fields in the
textfile, and to correct errors and eliminate duplicates.

Having done this, you need to link up the new AuthorIDs from the authors
table with the right entries in the holding table; one way of doing this
- IMO the simplest unless one's really familiar with writing queries -
is to add four new fields, one for each of the four pairs of author name
fields.

I hope this makes sense. Read up on database normalisation if it
doesn't, and post back here if necessary.
 
P

PaulAlex

Hey, John...

Well, I've been learning quite a lot. Got the auth dupe issue taking
care of, but I've run into another problem, when I set up the tables
and fields a different way. (The reason I didn't continue the way as
depicted originally, was because nothing was linking correctly between
tables.)

Now when I try to INSERT INTO from the "holdingtable" table into the
actual new table, I get the infamous "...Access set 0 field(s) to Null
due to a type conversion failure, and it didn't add 1288 record(s) to
the table due to key violations." I've double-checked consistency
between data types and made sure that AllowZeroLength was set to Yes.
(Obviously, I must have missed something.)

I've done a bunch of experimenting and troubleshooting but alas,
nothing works.

The Key Violations error seems to happen when my au_ID and book_ID are
set to Text instead of AutoNumber. But Access will not let me set both
aut_id and book_id to Autonumber in the TITLE_AUTHORS junction table.


If I had set up these tables from scratch, then obviously I set it up
wrong. But since I modeled my database on the sample Books.mdb
(provided by Peachpit Press "Visual Quickstart Guide SQL" by Chris
Fehily) AND I even used their database to transfer my "holdingtable"
data and it still didn't work...

Any ideas on this one? This problem has been stumping me for the last
2 days...

Here is how my database is set up:
------------------------------------------------------

The "holdingtable" Table consist of these headers:

id, series, booknum, title, picture, auth1_firstname, auth1_lastname,
auth2_firstname, auth2_lastname, auth3_firstname, auth3_lastname,
auth4_firstname, auth4_lastname, pubmonth, pubyear, isbn, desc

(the ID field was automatically generated when I imported the CSV file
into Access.)

An example of the data that is stored in "holdingtable" would be:

87, The Hardy Boys, 44, Case of the Missing Pencil,
Covers\HB-CaseMissingPencil.jpg , Joe, Author, Billy, Author2, Third,
Author, Fourth, Author, February , 1999, 373811977


Here is my layout:

AUTHORS TABLE
=============
au_id (PK)
au_fname
au_lname
title_name
notes

TITLE_AUTHORS (junction table)
==============================
title_id (PK)
au_id (PK)

TITLES
======
title_id (PK)
title_name
series
pub_id
booknum
picture
description

PUBLISH
=======
pub_id (PK)
pub_month
pub_year
pub_isbn


Thanks,
Paul
 
P

PaulAlex

I tried getting around this error by deleting the Au_ID and Book_ID
Primary Keys altogether and then importing. That worked. Then I
recreated those fields. Of course, each table had different values for
Au_ID and Book_ID -- in other words, they weren't linked.

I am guessing that I am getting that error because when I try to
import data from "holdingtable" table into the Authors table, it's
complaining that I have an existing primary key value, or that the
primary key is null?
 
P

PaulAlex

Okay, I think I'm on to something here, so hold off on any assistance
yet, please. Even though I've spend 6 hours a day on this, for the
last few days, I still want to see if I can figure this out :)
 
J

John Nurick

Comments inline.

Hey, John...

Well, I've been learning quite a lot. Got the auth dupe issue taking
care of, but I've run into another problem, when I set up the tables
and fields a different way. (The reason I didn't continue the way as
depicted originally, was because nothing was linking correctly between
tables.)

Now when I try to INSERT INTO from the "holdingtable" table into the
actual new table, I get the infamous "...Access set 0 field(s) to Null
due to a type conversion failure, and it didn't add 1288 record(s) to
the table due to key violations." I've double-checked consistency
between data types and made sure that AllowZeroLength was set to Yes.
(Obviously, I must have missed something.)

I've done a bunch of experimenting and troubleshooting but alas,
nothing works.

The Key Violations error seems to happen when my au_ID and book_ID are
set to Text instead of AutoNumber. But Access will not let me set both
aut_id and book_id to Autonumber in the TITLE_AUTHORS junction table.

Neither of these should be Autonumber in the junction table. Each of
them is a foreign key to the primary key field in the Authors and Titles
tables respectively; whether those PKs are autonumbers or Longs, the
related foreign keys must be Longs.
If I had set up these tables from scratch, then obviously I set it up
wrong. But since I modeled my database on the sample Books.mdb
(provided by Peachpit Press "Visual Quickstart Guide SQL" by Chris
Fehily) AND I even used their database to transfer my "holdingtable"
data and it still didn't work...

Any ideas on this one? This problem has been stumping me for the last
2 days...

Here is how my database is set up:
------------------------------------------------------

The "holdingtable" Table consist of these headers:

id, series, booknum, title, picture, auth1_firstname, auth1_lastname,
auth2_firstname, auth2_lastname, auth3_firstname, auth3_lastname,
auth4_firstname, auth4_lastname, pubmonth, pubyear, isbn, desc

(the ID field was automatically generated when I imported the CSV file
into Access.)

An example of the data that is stored in "holdingtable" would be:

87, The Hardy Boys, 44, Case of the Missing Pencil,
Covers\HB-CaseMissingPencil.jpg , Joe, Author, Billy, Author2, Third,
Author, Fourth, Author, February , 1999, 373811977


Here is my layout: This looks OK.

AUTHORS TABLE
=============
au_id (PK)
au_fname
au_lname
title_name

Presumably title_name has nothing to do with the title of any book; if
it does, it belongs in another table.
notes

TITLE_AUTHORS (junction table)
==============================
title_id (PK)
au_id (PK)

As I said above, these fields need to be Long, not Autonumber.
 
P

PaulAlex

Thank you; setting the junction table ID's to Long Integers instead of
Text worked around the problem. When you mentioned that junction table
IDs were foreign keys to the primary field in the other tables, that
helped me see the light. I had thought that IDs in junction tables
were directly the same as those in the other tables, and thus the
properties had to be the same.
Presumably title_name has nothing to do with the title of any book; if
it does, it belongs in another table.

Yup; I had temporarily put it into here as a point of reference so I
could determine which dupe entries to delete after I had imported the
4 sets of authors. I'll be deleting the title_name column from the
Author's table shortly.


As you can see, I still have a lot to learn. I'm halfway through this
SQL book (but having a bit of a time with JOIN statements). I'd buy a
book on Access, but as soon as I'm done with this test project, I'm
gonna be porting it to SQL Server anyhow, and that's what I really
need to start studying.

Some hobby I picked up, eh?

Thanks again, John. Have a happy holidays.

-- Paul
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top