Unique Records

A

Anthony

Hi

Hoping someone might be able to assist.

I have a csv file with incident records, which has been extracted from
a large database and provided to me. Each incident has a number of
details including date, time, location etc etc. They also have some
details on person/s involved, such as age, sex etc. Each incident has a
unique number (Report Number). There may be a number of rows for each
incident, depending if there was more than one person involved. In
these cases there may be 2 or more rows with the same report number,
but which will differ in just one or more field (ie second record might
be identical, but have a different age, sex etc indicating more than
one person was involved).

I am trying to work out the easiest way of working with this data. I am
not used to working with such a large dataset (about 400 000 records)
so usually I can use a formula to count unique records in Excel and use
a pivot table to manipulate the data. I figure that I should import the
data into an Access database and then use queries to manipulate, but am
wondering whether it is possible to use a similar function to that in
excel. (ie in excel I would create an additional column, and input a
formula so that for the first occurrence of a report number 1 is the
result, but thereafter it produces a zero. I then use this column as
the count field). That way I can for example count the number of unique
incidents that occurred in each day, in each time period etc etc.

I hope this explanation is reasonably clear, I can provide more
information, such as field names etc if this is necessary.

Any ideas? Thanks in advance.

Cheers, Anthony
 
D

dcichelli

Hi Anthony,

This is pretty simple in Access. Just take your table created from the
CSV, then create a query with only one field (the unique ID) and group
by the unique ID. This will give you a count of the unique incidents.

Diane
 
A

Anthony

Hi Diane

Sorry I don't think I was very clear with my explanation. I have
imported the csv file into Access.

The unique Id is not actually unique per se, it can reappear in
multiple records. Therefore if I add IR_ID (the 'unique field') into a
query twice, once as a count and the other as group by, it returns a
list of all the IR_ID's in this list and the number of times they
appear (hence giving me the number of actual unique incidents, but not
in the format I require). Removing the group by field leaves me with a
total number of all rows, not just unique IR_ID's.

I have pasted the field names below to help me explain. Each row has
the following fields:

"IR_ID","REP_DTE","FROM_DTE","FROM_TME","TO_DTE","TO_TME","ADD_DTE","REP_MTHD","REP_PLAC","VICT_SEX","VICT_AGE","OFFNC_ID","ATT_FLG","ACT_NAME","OFF_NAME","SECTION","IMS_OLD","DOM_FLG","ALC_FLG","LOC_NAME","LOC_SUB","SUB_TXT","STATION","DISTRICT","REGION","POSTCODE","RES_DTE","RESULT","OFR_SEX","OFR_AGE"

IR_ID is the 'unique ID' for each row. In many cases there will be more
than one row with this IR_ID. In these cases, almost the entire row
will be the same, except for one or more other fields, for example
"OFF_NAME" - in this example there are multiple offences in the
incident.

What I would like to be able to do is to add an additional field to the
table which uses a formula similar to the one I use in Excel >>
=IF(SUMPRODUCT(($A$2:$A2=A2)*($C$2:$C2=C2))>1,0,1) (where A would be
IR_ID and C FROM_DATE for example). As I tried (very poorly) to
explain, this produces a 1 if it is the first instance of IR_ID, and a
0 if it is a repeated record. I can then use this as a count field.

Because it refers to two fields I can use this for any field in my
datasheet (in excel). For example if I want to do count the number of
offences (rather than unique incidents, in which as I said there may be
multiple offences), I would add a column in excel that uses the above
formula and refers to IR_ID and OFF_NAME. I should add that whilst
there may be multiple offences, there may be multiple offenders for the
one offence, hence the need to identify the number of unique offences
in each incident.

Can I do something similar in Access, given that I am working with such
a large data set?

I hope I have improved my explanation. I know what I am trying to do,
but it is difficult to explain.

Thanks again.

Anthony
 
D

dcichelli

Hi Anthony,

Ok --- I THINK I understand what is going on here. You need a true
unique ID in addition to your IR_ID field. Let's do this:

1. Make an exact copy of your table called Copy_Main (so if we mess up,
the original table is not affected).

2. In you Copy_Main table add an AutoNumber field called Unique_ID.

3. Create a make-table query based on your Copy_Main table including
two fields: the MIN of Copy_Main.Unique_ID and the Copy_Main.IR_ID
field where COUNT([IR_ID])> 1. Name the new table created by this query
Unique_MIN_IR_ID.

4. Create a select query by joining both tables Copy_Main and
Unique_MIN_IR_ID on the Unique_ID fields. This is a straight one-to-one
join where the query results will show all records where
Copy_Main.Unique_ID = Unique_MIN_IR_ID.Unique_ID.

For the query fields include all fields from the table Copy_Main.

Whew! Does it work??

Diane
 
D

dcichelli

Hi Anthony,

Ok --- I THINK I understand what is going on here. You need a true
unique ID in addition to your IR_ID field. Let's do this:

1. Make an exact copy of your table called Copy_Main (so if we mess up,
the original table is not affected).

2. In you Copy_Main table add an AutoNumber field called Unique_ID.

3. Create a make-table query based on your Copy_Main table including
two fields: the MIN of Copy_Main.Unique_ID and the Copy_Main.IR_ID
field where COUNT([IR_ID])> 1. Name the new table created by this query
Unique_MIN_IR_ID.

4. Create a select query by joining both tables Copy_Main and
Unique_MIN_IR_ID on the Unique_ID fields. This is a straight one-to-one
join where the query results will show all records where
Copy_Main.Unique_ID = Unique_MIN_IR_ID.Unique_ID.

For the query fields include all fields from the table Copy_Main.

Whew! Does it work??

Diane
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top