Please help with Query "dups"

T

TotallyConfused

I need some help on how to write a query that will look at a list of names
and addresses and mark the dups. I want the query to look at the list if the
address1, address 2, city, state and zip are the same than mark it "dup". If
not then leave blank or mark "No". I am thinking something like Excel.
Only reason I can't use Excel I have over 200,000 rows.

I ran the duplicate query in Access and I can't use that because I need to
keep the mbrID which is unique.

Can someone please help me how to identify the duplicate addresses? Thank
you in advance for any help you can provide.
 
B

Brendan Reynolds

TotallyConfused said:
I need some help on how to write a query that will look at a list of names
and addresses and mark the dups. I want the query to look at the list if
the
address1, address 2, city, state and zip are the same than mark it "dup".
If
not then leave blank or mark "No". I am thinking something like Excel.
Only reason I can't use Excel I have over 200,000 rows.

I ran the duplicate query in Access and I can't use that because I need to
keep the mbrID which is unique.

Can someone please help me how to identify the duplicate addresses? Thank
you in advance for any help you can provide.


I had a problem similar to this yesterday. Here's how I solved it, maybe you
can adapt this to your needs.

I needed to find students with duplicate PPSNs. (PPSN = Personal Public
Service Number). So I created a query that looked like this, called
qryDupPPSN ...

SELECT PPSN, Count(*) AS Duplicates
FROM Students
WHERE PPSN IS NOT NULL And PPSN <> ""
GROUP BY PPSN
HAVING Count(*)>1;

This gave me a list of duplicated PPSNs and the number of students with each
duplicate PPSN. But in order to fix the problem, the user needed to be able
to identify the students with the duplicate PPSNs. And if I added enough
information to the query to identify the students, the records would no
longer be duplicates. So rather than adding the identifying data to this
query, I had to create two more queries. First, a simple query to return the
students names and PPSNs, called qryNamePPSN ...

SELECT PPSN, FamilyName, GivenName, MiddleName FROM Students

.... and finally, a third query to join these two queries ...

SELECT PPSN, FamilyName, GivenName, MiddleName, Duplicates FROM qryDupPPSN
INNER JOIN qryNamePPSN ON qryDupPPSN.PPSN = qryNamePPSN.PPSN
 
E

Evi

Add your Find Duplicates query into another query. Add the table which
contains mbrID and join the query and table via Address1, Address2, City,
State and Zip.
Then add mbrID to the query grid along with the desired address fields
Evi
 
J

John Spencer

Assuming you used the query wizard to construct the duplicates query, then you
can use that duplicates query to flag all the records from your table that
have duplicate values. The SQL of the query would look like the following.

SELECT Ta.MbrID
, Ta.address1
, Ta.Address2
, Ta.City
, Ta,State
, Ta.Zip
, IIF(q.MbrID is Null,Null,"Dupe") as DuplicateExists
FROM YourTable as Ta LEFT JOIN YourDuplicateQuery as Q
ON Ta.MbrID = Q.MbrID


John Spencer
Access MVP 2002-2005, 2007-2008
Center for Health Program Development and Management
University of Maryland Baltimore County
 
T

TotallyConfused

Thank you however, what does Ta mean? I assume Table? Do I type the whole
name out or do I type as in your example? same for "q"? Thank you. Sorry
but I am not very good at SQL.
 
J

John Spencer

Ta and Q are alias names for your table and query names. This just makes it
easier to enter the SQL.

You can remove "As Ta" and "As Q" and then replace all the "Ta" with your
table name and replace all the "q" with your query name.

Or just replace YourTable with your table name and YourDuplicateQuery with the
name of your duplicate query.

John Spencer
Access MVP 2002-2005, 2007-2008
Center for Health Program Development and Management
University of Maryland Baltimore County
 
T

TotallyConfused

Okay Thank you for your help. I have identified the dups. How do I delete
only the first dup? Thank you.
 
J

John Spencer

Easiest way is manually.

Otherwise, you need some way to identify which one to keep and which one(s) to
delete.

Another way (if you don't care WHICH record of the duplicates is kept is to
save a copy of the table structure with NO records in it.

Set up a compound unique index on the fields that you used to identify the
duplicates.

To create a multiple field unique index (Compound index)
--Open up the WorkingTable in design mode
--Select View: Index from the menu
--Enter Name for Index in first blank cell under Index Name
--Select one field in the index under Field Name
--Set Unique to Yes
--Move down one line and select the next FieldName
--Continue moving down and selecting fieldnames until all needed are included.
--Close the index window and close and save the table

Now build an append query that appends all the records from your current table
to the new table. When you run the query you will get a message that some
records can not be added due to an index error. What this means is that only
one of the duplicate records can be added and any other of that set of
duplicates will be ignored.

Once you have transferred all the records to the new table, you rename the old
table and then name the new table with the old table's previous name.

Check things out and then delete the old table when you are satisfied that
everything transferred correctly. You may have to deal with relationships -
removing them and then re-establishing them.

John Spencer
Access MVP 2002-2005, 2007-2008
Center for Health Program Development and Management
University of Maryland Baltimore County
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top