mysql – identifying number of duplicates in a table

So my little query adventures with mySQL continues. Here’s another unique problem: There are duplicates in the table as there is no field that was setup as primary index. So how do we know how many do we have?

Well assume that the duplicate values happen on the “Phone Number” of the client. Here’s the query you would run:
select `BUSINESS PHONE`, count(*) as ‘Num of Dups’
from mytable
group by `BUSINESS PHONE`
having count(*) > 1

Mysql – Appending data

Problem:
Appending data from one table to another table where duplicate might exist.
Solution:
A simple “Insert into table2 select * from table1″ query will give errors saying “duplicate records exist”, to go around it, the following query should do the job:


INSERT INTO MASTERTABLE
SELECT *
FROM NEWTABLE
WHERE NEWTABLE.`UNIQUEID` NOT
IN (
SELECT `UNIQUEID`

FROM MASTERTABLE

)

MySQL non-matching Records

Problem: Finding records in Table b that do not match records in Table a

This sort of problem may arise when you need to scrub a file against another or run a duplicate scan on a file against another to weed out any duplicates that may exist prior to doing whatever it is you need to do with that file. Lets assume that both the tables have a field called “PHONE” that identifies your clients as unique.
The query that you will need to run uses LEFT JOIN function

SELECT * FROM TableB b
LEFT OUTER JOIN TableA a ON b.`PHONE` = a.`PHONE`
WHERE a.`PHONE` IS NULL

This should do it.

mysql – matching records

Problem:
Find records that match within two tables, TableA and TableB.

Solution:

Use a simple SELECT statement with a defined WHERE clause

Lets assume that TableA and TableB have a field called “Phone” which contains the phone number of the clients and these numbers are obviously stored in a unique index. So what we are going to do is run a simple query to find out which numbers in TableB match TableA.

Assuming you are using linux and you can run this query from the command prompt, this is the way to do it:

user@server:~/ mysql -u USERNAME -p
(the server will ask for your password, enter the password and press Enter)

Now you should be in mysql and the prompt should look different already

mysql> use DATABASENAME
(this command changes the database to the one where you have both the tables)
mysql> select * FROM TableB, TableA WHERE TableB.phone=TableA.phone;

This query will obviously select all fields from TableB and TableA. If you only wanted to select certain fields (for example “name” from TableB and “name” from TableA), then the query would change to:

mysql> select TableB.name, TableA.name FROM TableB, TableA WHERE TableB.phone=TableA.phone;

This will give you just the name column from TableB and name column from TableB.

Now what if you wanted to dump the query result into a text file for future use? Here’s the query:

mysql> SELECT TableB.name, TableA.name FROM TableB, TableA WHERE TableB.phone=TableA.phone INTO OUTFILE ‘/tmp/matchingnames.txt’;

That should do it.

mysql – scrubbing files against others

Problem:

TableA contains a customer database
TableB contains new names that you want to match against your customer database and remove all the names that are already customers.

Solution:

Use “LEFT JOIN” property of mysql to carry out the task.

Lets assume that TableA and TableB have a field called “Phone” which contains the phone number of the clients and these numbers are obviously stored in a unique index. So what we are going to do is run a simple query to find out which numbers in TableB do not exist in TableA.

Assuming you are using linux and you can run this query from the command prompt, this is the way to do it:

user@server:~/ mysql -u USERNAME -p
(the server will ask for your password, enter the password and press Enter)

Now you should be in mysql and the prompt should look different already

mysql> use DATABASENAME
(this command changes the database to the one where you have both the tables)
mysql> select * from TableB LEFT JOIN TableA ON TableB.phone=TableA.phone WHERE TableA.phone IS NULL;

This should show you all the names in TableA that do not have a match in TableB.

Suppose you wanted to dump this file into a .txt file so you could use it elsewhere, the above command would change to:
mysql> SELECT * FROM TableB LEFT JOIN TableA ON TableB.phone=TableA.phone WHERE TableA.phone IS NULL INTO OUTFILE ‘/tmp/scrubbedfilename.txt’;

This will dump the file with the unmatching records from TableA in the /tmp directory of your linux filesystem.