Normalizing data

C

Confused

I have an excel file which has over 5000 lines and will continue to add new
data. I would like to convert this into an Access database. The problem is
I'm not sure which field is my primary key as these fields have redundant
data lines. I'm guessing that normalization is required but not quite sure
how to go about it.

Here are some sample data lines:

Dealers Start Date Org Amt CUSIP Security
Class
BS 2/13/2008 22 07388YAE2 BSCMS 07-PW16 A4
Citi 2/13/2008 10 07388YAE2 BSCMS 07-PW16 A4
DB 2/13/2008 10 07388YAE2 BSCMS 07-PW16 A4
BS 2/14/2008 22 07388YAE2 BSCMS 07-PW16 A4
DB 2/14/2008 10 07388YAE2 BSCMS 07-PW16 A4
DB 2/15/2008 10 07388YAE2 BSCMS 07-PW16 A4
Citi 2/21/2008 30 07388YAE2 BSCMS 07-PW16 A4
BS 2/25/2008 10 07388YAE2 BSCMS 07-PW16 A4
BS 2/28/2008 10 07388YAE2 BSCMS 07-PW16 A4

Moodys SP Fitch Org Credit Curr Credit Curr Sub
Aaa AAA 30.03%
AAA AAA 24.69% 4.42%
Aaa AAA 30.04%
Aaa AAA 30.03%
Aaa AAA 30.04%
Aaa AAA 30.04%
Aaa 30.05% 3.25%
Aaa AAA 30.03%
Aaa AAA 30.03%

Prepay A/L Index Spread Dollar Type
9.11 Swaps 225 95 IG
CPR 0 1.30 Swaps 280 101 IG
9.12 Swaps 235 95-00 IG
9.11 Swaps 225 95 IG
9.12 Swaps 225 95-01 IG
9.12 Swaps 220 94-22 IG
CPR 0 9.09 Swaps 215 94 IG
9.11 Swaps 180 96 IG
9.11 Swaps 185 96 IG

Is the primary key the "CUSIP" field? Can someone please help me with
normalizing the table?

Any help is much appreciated !!!
 
G

Guillermo_Lopez

The primery key is a field or group of fields that never repeat. In
your case no field cualify, so you would have to create one. An auto
number works well here.

Normalization works best when you have several tables and then you use
key fields to link all the tables.

Without knowing what each field means or what its purpose is, there
isn't much that I can do outside of guessing.

- GL
 
J

John W. Vinson

I have an excel file which has over 5000 lines and will continue to add new
data. I would like to convert this into an Access database. The problem is
I'm not sure which field is my primary key as these fields have redundant
data lines. I'm guessing that normalization is required but not quite sure
how to go about it.

The first step when it comes to normalization is...

Click the Start button on your computer, and choose Turn Off Computer. Select
Shut Down or Stand By, as you prefer.

Get a pad of paper, and a couple of pencils with good erasers.

Identify the "Entities" - real-life things, events, or people of importance to
your application. One of these will be Stocks, clearly; for a Stock, a CUSIP
is perhaps a good primary key. Another, different KIND of entity, is a rating
agency; yet another KIND of entity is a stock transaction. These are
*different* from stocks, and require... their own different table!

You need to figure out, for your own business needs, what entities are
important and what information you need to know about them. THEN you can
design your tables.

It's a given that a spreadsheet will very rarely map directly to a table.
Spreadsheets have their own logic and their own structure; that structure is
*very different* than the structure of a relational database.

Draw out the list of entities of importance to your business need, identify
the important attributes of each entity, and the relationships between the
entities. THAT will become the basis of your table structure.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top