I have numeric data in 4 columns, I need to compare column B to A and remove duplicates from B, then compare column C to A and B and remove duplicates from C and so on. So in the end I am left with four columns of data, all with unique info. The remove duplicates features seems to simply check each column against itself and does not compare the different columns. Each column's data has significance so I need it to stay in those columns. Thanks
Scrawny2011-08-02T13:57:11Z
Favorite Answer
Insert a helper column between columns B and C and enter this formula in the new column C and copy down the length of the data.
=IF(COUNTIF(A:A,B1)>0,"dup","")
This will identify all values in column B that appear in column A. Delete the entries that have "dup" beside them. Now delete the helper column.
Insert a helper column between columns C and D and enter the same formula changing the A:A reference to B:B and the B1 to C1 and copy down the length of the data. Again delete all entries marked with "dup" beside them.
I'm sure there's a more elegant solution, but it sounds like you can take put a cell next to each of the 4 columns and label it A, B, C. D. Like: 1, A, 10, B, 567, C, 23, D.
Then take each 2-column pair and stack them in one column. 1, A 10, B 567, C 23, D
Then remove duplicate features on the single column. Then autofilter the column and sort/filter by the letter. Then just copy each letter set back out into it's own column.
Essentially what you're doing is labeling each datapoint with the applicable letter so that after you remove duplicates in a single column, you can still put everything back where it was. You can even add another cell with a number to remember the order. Just put in 1, 2, 3, then group-select 1,2,3, and click the little black bump in the bottom right and drag it down to number the whole column sequentially.