Getting unique values across two files
I have two very large files we'll call Old and New. New contains many entries that Old contains. What I need to do is remove any entry from New that Old contains. There are 9,459 entries in Old with 55 columns. New contains 11,983 entries with 76 columns. I need to make the comparison based on 5 columns; 'name_last', 'name_first', 'name_middle', 'street', and 'type'
I'm using Excel 2010, I'm very new to it, and haven't got a clue where to start.
One simple approach could be as follows:
1. In a spare column of the old file, concatenate the five columns i.e. A2&C2&F2&G2&H2. I have assumed that 'name_last', 'name_first', 'name_middle', 'street', and 'type' are in column A, C, F, G and H
2. Copy this down till last row
3. Repeat steps 1 and 2 for new file as well. Supoose the spare column in new file in BZ
4. In another spare column of the new file, enter this formula
=VLOOKUP(BZ2,spare column of the old file created in step 1 above,1,0)
You will get an error against entries which are in new file and not in old file. Filter on non error values and delete those rows.
Ta reduce file size, you may now delete the spare columns created.