De-duping a mailing list

Hi everyone,

I have a spreadsheet consisting of about 150 rows. It contains the full names of the people in Column F and each of the rows have either a “billing address”, “home address”, “work address” or “PO address”. The spreadsheet is sorted by column F (Full name) in alphabetical order so that you can clearly see the duplicate names. In fact, the duplicates in column F were identified in another workbook and copied into the current separate one so that duplicate customer records could be removed in preparation for a mailout.

In most instances there are only two duplicates for any particular customer but in some cases there could be 3, 4 or even 5. The goal of the exercise is to keep the customer row that has a home address as that is the preferable address to send to. If there’s no home address, then keep the billing address. If no billing address, then keep the work address. If no work address keep the PO box address.

My first attempt at this de-duping was to sort by home address, then by billing address, then by work address, then by PO box address. Even though this helped identify which entries had a home address, it still hasn’t solved the problem as it hasn’t de-duped as I would like and as I described in the previous paragraph.

Can anyone tell me if this can be done in excel and if yes, how I would go about it? Or would I would need to get some vba code written?

Would really appreciate any advice.

* Please try a lower page number.

* Please enter only numbers.

* Please try a lower page number.

* Please enter only numbers.

it would help to find a solution if we can see the geography of you spreadsheet

Was this reply helpful?

Sorry this didn't help.

Great! Thanks for your feedback.

How satisfied are you with this reply?

Thanks for your feedback, it helps us improve the site.

How satisfied are you with this reply?

Thanks for your feedback.

Thanks for the reply, 

The following are the column names from column A to column AQ. Does that help with troubleshooting?

Date 1
Date 2
Membership Type
Secure zone Name
Title
Fullname
Firstname
Middle Name
Lastname
Email 1 (Primary)
Email Address (Alternate)
Username
Cell Phone
Profession
Double Opt-In Status
Opt-In Status
Billing Address
Billing Address City
Billing Address State
Billing Address Zipcode
Billing Address Country
Home Address
Home Address2
Home Address City
Home Address State
Home Address Zipcode
Home Address Country
Home Phone
Work Address
Work Address2
Work Address City
Work Address State
Work Address Zipcode
Work Address Country
Work Phone
Alternate Work Phone Number
PO Box Address
PO Box Suburb
PO Box State
PO Box Postcode
PO Box Country
Preferred Contact Method
Comments

Was this reply helpful?

Sorry this didn't help.

Great! Thanks for your feedback.

How satisfied are you with this reply?

Thanks for your feedback, it helps us improve the site.

How satisfied are you with this reply?

Thanks for your feedback.

on the menu ribbon select the data tab, then select filter.

then on the home address column only show non blank cells.

then copy the records to another sheet.

then set the filter to show all, then move to filtering the next criteria and copy the filtered data into another sheet and so on until you have copied all records that contain complete records for each home address, billing address, work address, and PO Box address.

now you have four lookup tables for each criteria.

next normalize all the names where you only have one instance for each name and put those name in its own sheet, this sheet is where you will be putting your formula to lookup the name in the home address lookup table first if it exists, then so on and so forth until all records are normalized.

yes it is labor intensive, but once you have normalized this data, it can then be stored in a relational database (if this is a long term endeavor) where if you would ever need to do this exercise again all you would have to do is to create a query :)

if you had thousands of rows of data writing vba code would be better, but for 150 rows of data manually doing it is less of a headache even for someone who is an expert writing vba code.

Was this reply helpful?

Sorry this didn't help.

Great! Thanks for your feedback.

How satisfied are you with this reply?

Thanks for your feedback, it helps us improve the site.

How satisfied are you with this reply?

Thanks for your feedback.

Ok thanks - I'll give it a try.

Much appreciated.

Was this reply helpful?

Sorry this didn't help.

Great! Thanks for your feedback.

How satisfied are you with this reply?

Thanks for your feedback, it helps us improve the site.

How satisfied are you with this reply?

Thanks for your feedback.

Hi,

I was just going through your steps and have managed to filter each of the addresses and have copied each address set into a separate workbook, and I named each of these workbooks as follows:

Home

Billing

Work

PO Box

I then used the "Remove Duplicates" function in the data tab to "normalise" the fullnames (from column F) and copied them into another separate workbook called "names only". They're all located in column A of this workbook. You mentioned that it was in this workbook that I would put the VLookup function.

I know that the syntax for a VLookup is:

VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup])

If the first step is to use the VLookup function on the Home workbook, can you let me know what to include in the function? The "Home" workbook contains all the columns from columns A to column AQ and the "Fullname" is still in column F in the "Home" workbook. As mentioned, all the normalised names are in column A of the workbook called "names only" so can you tell me where to place the vlookup function in that workbook and what the function should contain?

Sorry but I've never used a VLookup before.


Was this reply helpful?

Sorry this didn't help.

Great! Thanks for your feedback.

How satisfied are you with this reply?

Thanks for your feedback, it helps us improve the site.

How satisfied are you with this reply?

Thanks for your feedback.

Hi,

I was just going through your steps and have managed to filter each of the addresses and have copied each address set into a separate workbook, and I named each of these workbooks as follows:

Home

Billing

Work

PO Box

I then used the "Remove Duplicates" function in the data tab to "normalise" the fullnames (from column F) and copied them into another separate workbook called "names only". They're all located in column A of this workbook. You mentioned that it was in this workbook that I would put the VLookup function.

I know that the syntax for a VLookup is:

VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup])

If the first step is to use the VLookup function on the Home workbook, can you let me know what to include in the function? The "Home" workbook contains all the columns from columns A to column AQ and the "Fullname" is still in column F in the "Home" workbook. As mentioned, all the normalised names are in column A of the workbook called "names only" so can you tell me where to place the vlookup function in that workbook and what the function should contain?

Sorry but I've never used a VLookup before.


Ok, I am going to make certain assumptions.

1. workbooks, Home, Billing, Work, and PO Box sheets have Names and complete addresses in them

2 the names in all the sheets are in the first column

if you can show screenshots of these it would be easier to see their geography as opposed to using my imagination.

Was this reply helpful?

Sorry this didn't help.

Great! Thanks for your feedback.

How satisfied are you with this reply?

Thanks for your feedback, it helps us improve the site.

How satisfied are you with this reply?

Thanks for your feedback.

Hi,

In my second post I showed all the names of the columns from A to AQ. As you can see from that list, the fullname is in Column F.

The worksheets called Home, Billing, Work, and PO Box contain all those columns from A to AQ and the fullname is in Column F. This is because after I had filtered the columns I just copied and pasted them into the new worksheets.

The other worksheet called "names only" has the list of names in column A - there's nothing else in that worksheet except the names which have been normalised.

So to summarise, the fullnames are in column F in worksheets Home, Billing, Work, and PO Box and in column A in the worksheet "names only".

Can you help further to get the vlookup working?

Was this reply helpful?

Sorry this didn't help.

Great! Thanks for your feedback.

How satisfied are you with this reply?

Thanks for your feedback, it helps us improve the site.

How satisfied are you with this reply?

Thanks for your feedback.

Also, I forgot to add that if it's easier to do the vLookup, I can move the fullname from column F to column A in the worksheets called Home, Billing, Work, and PO Box.

Was this reply helpful?

Sorry this didn't help.

Great! Thanks for your feedback.

How satisfied are you with this reply?

Thanks for your feedback, it helps us improve the site.

How satisfied are you with this reply?

Thanks for your feedback.

for each workbook, I will need the:

workbook name

sheet name

range, i.e. A1:AR7

so I will know how to reference it into the formula.

example:

VLOOKUP($A1,[Home.xlsx]Sheet1!A1:AR100,1,FALSE)

THE REAL FORMULA I AM CREATING LOOKS SOMETHING LIKE THIS:

IF(ISNA(VLOOKUP($A1,[Home.xlsx]Sheet1!THE ENTIRE RANGE,1,FALSE))
IF(ISNA(VLOOKUP($A1,[Billing.xlsx]Sheet1!THE ENTIRE RANGE,1,FALSE)),IF(ISNA(VLOOKUP($A1,[Work.xlsx]Sheet1!THE ENTIRE RANGE,1,FALSE)),IF(ISNA(VLOOKUP($A1,[POBox.xlsx]Sheet1!THE ENTIRE RANGE,1,FALSE)),VLOOKUP($A1,[POBox.xlsx]Sheet1!THE ENTIRE RANGE,1,FALSE),"NO MATCH ANYWHERE"),VLOOKUP($A1,[Work.xlsx]Sheet1!THE ENTIRE RANGE,1,FALSE)),VLOOKUP($A1,[Billing.xlsx]Sheet1!THE ENTIRE RANGE,1,FALSE)),VLOOKUP($A1,[Home.xlsx]Sheet1!THE ENTIRE RANGE,1,FALSE))

Was this reply helpful?

Sorry this didn't help.

Great! Thanks for your feedback.

How satisfied are you with this reply?

Thanks for your feedback, it helps us improve the site.

How satisfied are you with this reply?

Thanks for your feedback.

for each workbook, I will need the:

workbook name

sheet name

range, i.e. A1:AR7

so I will know how to reference it into the formula.

example:

VLOOKUP($A1,[Home.xlsx]Sheet1!A1:AR100,1,FALSE)

THE REAL FORMULA I AM CREATING LOOKS SOMETHING LIKE THIS:

IF(ISNA(VLOOKUP($A1,[Home.xlsx]Sheet1!THE ENTIRE RANGE,1,FALSE))
IF(ISNA(VLOOKUP($A1,[Billing.xlsx]Sheet1!THE ENTIRE RANGE,1,FALSE)),IF(ISNA(VLOOKUP($A1,[Work.xlsx]Sheet1!THE ENTIRE RANGE,1,FALSE)),IF(ISNA(VLOOKUP($A1,[POBox.xlsx]Sheet1!THE ENTIRE RANGE,1,FALSE)),VLOOKUP($A1,[POBox.xlsx]Sheet1!THE ENTIRE RANGE,1,FALSE),"NO MATCH ANYWHERE"),VLOOKUP($A1,[Work.xlsx]Sheet1!THE ENTIRE RANGE,1,FALSE)),VLOOKUP($A1,[Billing.xlsx]Sheet1!THE ENTIRE RANGE,1,FALSE)),VLOOKUP($A1,[Home.xlsx]Sheet1!THE ENTIRE RANGE,1,FALSE))

and that blank worksheet with the names in it we will be formatting into something that looks like this:

Was this reply helpful?

Sorry this didn't help.

Great! Thanks for your feedback.

How satisfied are you with this reply?

Thanks for your feedback, it helps us improve the site.

How satisfied are you with this reply?

Thanks for your feedback.

* Please try a lower page number.

* Please enter only numbers.

* Please try a lower page number.

* Please enter only numbers.

 
 

Question Info


Last updated October 5, 2021 Views 280 Applies to: