Given a List that contains multiple objects with the same property, split it into smaller lists grouped by the distinct property

Oh, that title doesn’t make this very clear at all, does it? Fuck.

So we had a problem at work today. We have a number of transaction statuses to insert into a system and then do some other work on. The system broke because they contained “duplicates”. These weren’t true duplicates but rather multiple statuses against the same ID, and that ID happens to be the primary key of a temporary table that’s used in a bulk insert. The code doesn’t cope very well with multiple statuses against the same ID for the same date, and there were thousands of them like this. The data doesn’t allow us to use a different key, so my colleague fixed it in the production system with a SQL query and a temp table, manipulating the data so that the process could be run multiple times in stages.

I then came up with a code fix, which I first implemented in a little console application to test the principle. It’s that console application I can share.

In the example below, I have a simplified class to demonstrate the problem, called Payment. A payment has three properties: an ID, a Status, and a Date. What the code needs to do is, given a list that contains multiple entries with the same ID, break it into smaller lists in which each ID appears only once. To achieve this, I use an extension method and a yield return statement.

using System;
using System.Collections.Generic;
using System.Linq;

namespace ConsoleApplication2
{
    class Payment
    {
        public int ID { get; set; }
        public string Status { get; set; }
        public DateTime Date { get; set; }
    }

    static class Extensions
    {
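        // Splits the payments into batches in which each ID appears at most once.
        // Batch 1 holds the first payment for every ID, batch 2 the second, and so on.
        // An empty input yields no batches.
        public static IEnumerable<List<Payment>> BatchPaymentById(this IEnumerable<Payment> payments)
        {
            // Work on a copy so the caller's collection is not modified.
            var remaining = payments.ToList();

            while (remaining.Count > 0)
            {
                // GroupBy keeps source order, so First() is the earliest remaining payment per ID.
                var batch = remaining.GroupBy(p => p.ID, (key, g) => g.First()).ToList();

                // A HashSet keeps the removal check cheap compared with List.Contains.
                var taken = new HashSet<Payment>(batch);
                remaining.RemoveAll(p => taken.Contains(p));

                yield return batch;
            }
        }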
    }

    class Program
    {
        static void Main(string[] args)
        {
            var today = DateTime.Today; // Today already has its time component set to midnight
            var payments = new List<Payment>();
            payments.Add(new Payment { ID = 1, Status = "Success", Date = today });
            payments.Add(new Payment { ID = 1, Status = "Dispute", Date = today });
            payments.Add(new Payment { ID = 2, Status = "Success", Date = today });
            payments.Add(new Payment { ID = 2, Status = "Dispute", Date = today });
            payments.Add(new Payment { ID = 3, Status = "Success", Date = today });

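            // With the sample data this yields two batches:
            //   1. the "Success" payments for IDs 1, 2 and 3
            //   2. the "Dispute" payments for IDs 1 and 2
            foreach (List<Payment> paymentList in payments.BatchPaymentById())
            {
                // Do something with the batch, e.g. run the bulk insert for it
            }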
        }
    }
}
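
The extension method above rescans the remaining list once per batch, which is fine for a quick fix but does more work than strictly necessary when one ID has many statuses. As a rough sketch of an alternative (not what went into the real fix), the same batches can be built in a single pass by counting how many times each ID has been seen so far: the nth occurrence of an ID goes into the nth batch. The names SinglePassExtensions and BatchPaymentByIdSinglePass are made up for this illustration, and the Payment class is repeated only so that the snippet compiles on its own. Unlike the yield return version, this one builds all the batches up front.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same shape as the Payment class above, repeated so the sketch is self-contained.
public class Payment
{
    public int ID { get; set; }
    public string Status { get; set; }
    public DateTime Date { get; set; }
}

public static class SinglePassExtensions
{
    // Hypothetical alternative: builds every batch in one pass over the input.
    public static List<List<Payment>> BatchPaymentByIdSinglePass(this IEnumerable<Payment> payments)
    {
        var batches = new List<List<Payment>>();
        var seenPerId = new Dictionary<int, int>();

        foreach (var payment in payments)
        {
            // How many payments with this ID have already been placed (0 if none).
            int seen;
            seenPerId.TryGetValue(payment.ID, out seen);

            // The nth occurrence of an ID belongs in the nth batch; create it on demand.
            if (seen == batches.Count)
            {
                batches.Add(new List<Payment>());
            }

            batches[seen].Add(payment);
            seenPerId[payment.ID] = seen + 1;
        }

        return batches;
    }
}
```

Calling payments.BatchPaymentByIdSinglePass() on the five sample payments from Main should give the same two batches as BatchPaymentById. If the batches need to be consumed lazily, or the input is small, the original version is probably the simpler choice.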

About Jerome

I am a senior C# developer in Johannesburg, South Africa. I am also a recovering addict, who spent nearly eight years using methamphetamine. I write on my recovery blog about my lessons learned and sometimes give advice to others who have made similar mistakes, often from my viewpoint as an atheist, and I also write some C# programming articles on my programming blog.