Remove duplicates

You have a set of data. Duplicate values should be removed, so that only unique values are left. This can be done in multiple ways.

ID                       Product

1                         A

2                         A

3                         B

4                         O

5                         O

6                         U

7                         U

8                         U

9                         W

10                       K

11                       A

12                       S

We have a simple data running number (ID) and list of products. We should remove the duplicates and just see each different product once.

Data – data tools – remove duplicates

Maybe the most common way is to use remove duplicate functionality. Activate the data and press duplicate functionality button.

Press ok.

The duplicates were removed, and unique values are left in the list. As you noted, the functionality is overwriting the data. If you use this way, take a copy from the data and use remove duplicates for your copied data. Then you have originals untouched.

Conditional formatting

Select home – styles – conditional formatting – highlight cell rules – duplicate values

Select unique and press ok.

Here unique means that the product is listed just once. For example, A appears three times, there A is not unique. My thinking is that, if we take unique values, we take all the product listed just once. For example, product A is also listed and in unique values A is listed just once. Out of 12 values, we have seven unique values. According to conditional formatting, only four are unique.

This is a good way if you don’t want to change the data but just visualize the duplicates.

UNIQUE function

Write the UNIQUE and select the data, press enter.

The advantage with UNIQUE is that your original data and unique values are neatly separated.

COUNTIF function

Write in C2 =COUNTIF($B$2:B2;B2) and copy the formula downwards till C13. When the formula hits the first A in B2 the formula checks how many A letters are in B2:B2, and there is just one. In B3 the formula hits A again. Now the formula checks As in B2:B3, the formula returns 2.

Filter only number ones, and you have removed the duplicates.

You can also use Access to remove the duplicates.

The table consists of two columns running number ID and the product values. Data is similar than in Excel.

Create query at create – queries – query design.

Press property sheet.

You need to select unique values as “yes”.

Then drag the table to the middle of the screen.

Double click product field.

Run with exclamation mark. You get the results, seven unique values.

If you want to load the data from Access to Excel, select external data – export – Excel.

You can see the SQL, too. Select view- SQL view in query design.

As you noticed, we have many ways to find unique values and remove duplicates. Most often, I have used the first option, remove duplicates functionality. Also UNIQUE function is handy.