Finding duplicate entries by several criteria

nudny ekscentryk · 2 years ago


erAck@discuss.tchncs.de · 2 years ago

Add a helper column that concatenates the contents of the cells you want to compare, using a delimiter that does not occur in the data. The following example joins columns A and B with the | (vertical line, or pipe) symbol; multiple cells or ranges are also possible, see the TEXTJOIN() online help:

C1: =TEXTJOIN("|";0;A1:B1)

Copy-paste or pull down formula C1 to the desired range of rows.

For each row count the number of occurrences, assuming data in rows 1:99 here:

D1: =COUNTIF(C$1:C$99;C1)

Copy-paste or pull down formula D1 to the desired range of rows.

Filter column D for values greater than 1 to show only the duplicates. Alternatively, sort the data in columns A:D by column D in descending order, then delete the rows at the bottom that have the value 1 in column D.
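For anyone who prefers to check the result outside the spreadsheet, the same helper-column idea can be sketched in Python. This is only an illustration under assumptions: the sample rows, the function names `duplicate_counts` and `duplicated_rows`, and the `delimiter` parameter are all made up here, and the comparison is exact and case-sensitive, which may differ from how the spreadsheet matches text.

```python
from collections import Counter

# Hypothetical sample data standing in for columns A and B.
rows = [
    ("Alice", "2024-01-05"),
    ("Bob", "2024-01-05"),
    ("Alice", "2024-01-05"),
    ("Alice", "2024-02-01"),
]


def duplicate_counts(rows, delimiter="|"):
    """Return one (key, count) pair per row, like helper columns C and D."""
    # Column C: join the cells of each row with a delimiter not found in the data.
    keys = [delimiter.join(str(cell) for cell in row) for row in rows]
    # Column D: count how often each joined key occurs across all rows.
    counts = Counter(keys)
    return [(key, counts[key]) for key in keys]


def duplicated_rows(rows, delimiter="|"):
    """Keep only the rows whose joined key occurs more than once (D > 1)."""
    counted = duplicate_counts(rows, delimiter)
    return [row for row, (_, n) in zip(rows, counted) if n > 1]


for row in duplicated_rows(rows):
    print(row)
```

As with the formula, the delimiter matters: without it, the pairs ("ab", "c") and ("a", "bc") would produce the same key.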

JackSkellington@lemmy.world · 2 years ago

Maybe you could concatenate every relevant field/criterion into a new column, with " - " between each element, and then add a filter on the new column for duplicates?

I do this quite often

nudny ekscentryk · edit-2 2 years ago

What if I sorted the orders by name and then, for each one, checked whether the one above and the one below it have the same name, date and amount due, using three columns of IFs? I could then multiply the outputs of the IFs in another column and filter out the rows that meet all three criteria. That should work, I think. The only problem is that the last step, the filtering, may break the existing IF functions.
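A rough Python sketch of this sort-and-compare-neighbours idea, under a few assumptions: the orders are (name, date, amount) tuples, the sample values and the function name `flag_adjacent_duplicates` are made up, and the list is sorted on all three fields rather than on the name alone, so that identical orders are guaranteed to sit next to each other. Comparing whole tuples plays the role of multiplying the three IF results, and a row is flagged when it matches either the row above or the row below.

```python
# Hypothetical sample orders: (name, date, amount due).
orders = [
    ("Bob", "2024-01-05", 10.0),
    ("Alice", "2024-01-05", 20.0),
    ("Alice", "2024-02-01", 20.0),
    ("Alice", "2024-01-05", 20.0),
]


def flag_adjacent_duplicates(orders):
    """Sort the orders and flag each one that equals a direct neighbour."""
    # Sorting on the full tuple puts identical orders side by side.
    ordered = sorted(orders)
    flagged = []
    for i, order in enumerate(ordered):
        # Tuple equality is true only if name, date and amount all match.
        same_as_above = i > 0 and ordered[i - 1] == order
        same_as_below = i + 1 < len(ordered) and ordered[i + 1] == order
        flagged.append((order, same_as_above or same_as_below))
    return flagged


for order, is_duplicate in flag_adjacent_duplicates(orders):
    print(order, is_duplicate)
```

Because the flags are computed into a new list before anything is filtered, filtering afterwards cannot disturb the comparisons. Copying the IF results as values before filtering would be the spreadsheet equivalent.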

JackSkellington@lemmy.world · 2 years ago

That would work for a one-dimensional table. If you have many columns, you either mess up the following columns or end up back at the beginning. With lists it works wonderfully.

nudny ekscentryk · 2 years ago

It seems to have worked