Beware of calling Contains in the Where predicate on IQueryable<T> in LINQ to SQL if your collection has more than 2100 items

Recently I ran into a problem I’d never seen before – the 2100 parameter limit in the SQL that’s generated when you call Contains in the IQueryable<T>.Where predicate in LINQ to SQL. To make matters worse, this happened when filtering data in the Selecting event of a DevExpress LinqServerModeDataSource… which suppresses the error, resulting in an empty result set. (I only saw the error after stepping over the call, and then inspecting the result set in the debugger.)

You don’t hit the problem when your collection contains exactly 2100 items… it happens with less items than that. The 2100 number refers to a limit that’s hit in the SQL code that’s generated. So even as many as 2000 items in your collection will probably trigger the issue. I haven’t experimented or dug deep enough to figure out exactly what the limit is. But it doesn’t matter. What matters is that calling Contains when you don’t know how many items are in your collection, is asking for trouble.

The solution to this is simple – it was finding the problem that took so long. If I have a collection that contains thousands of items, then I split it up into smaller collections, filtering the data each time and adding the result sets to a List<T>, which I then convert back to an IQueryable<T> at the end.

The code below is an example selecting event for the data source. I’ve put that rather than only the relevant code, so that it’s a better example of how one can filter using such an event. It’s using a strongly typed Linq to SQL class for a view. I then have an ASPxGridView control that’s bound to the LinqServerModeDataSource.

You can ignore the first bit… which gets the lines from a memo in the drop down edit window of a drop down edit control, and puts any valid ones into a List<long>. The idea here is that users can filter the data by pasting in a column of account numbers they copied from Excel. Just in case they have bad data, I then ignore anything that isn’t numeric. (Of course this introduced the problem. Allowing users to paste in an unlimited number of rows was probably a bad idea, although in theory it solved a problem they had. But until this point, we never needed to filter a list of many thousands of items using Contains.)

Also, my way of splitting the list into smaller lists is to use IEnumerable<T>GroupBy with index/count – you could achieve the same thing with less code, using Skip and Take in a loop. I like writing it this way because I find the resulting code easier to read.

Note also that the unfortunate part of this is it will put the whole data set in memory. For that reason, I don’t use this approach if there are less than 1000 items in the list, and in my real code (this one is simplified and has the view name changed), this is the last step of filtering the data.

Aside: The closest I came to finding a solution online was this question on StackOverflow, which doesn’t have an accepted answer, but ironically the top one that I tried moved the call to Contains such that it actually threw the Exception that had been suppressed all the time in my code. Hopefully what comes across in this post is not so much exactly how I implemented a solution for this problem, but that the solution is to split the call to Contains between smaller collections, add all the results to a List<T>, and convert back to the IQueryable<T> at the end. If you understand the principle, you don’t even need to use my code…

Edit: The important code to take note of is really from the block comment downwards. You should be able to see all of it when dragging the scrollbar around at the bottom of this code block. That block comment got its indenting a little messed up when I unindented the code and copied it here… (And since this code was copy-pasted to a temporary class that I deleted, it’s too much trouble to fix on here. Sorry.)

protected void dsFutureDatedTransactions_Selecting(object sender, DevExpress.Data.Linq.LinqServerModeDataSourceSelectEventArgs e)
{
    FutureDatedTransactionsDataContext Context = new FutureDatedTransactionsDataContext();
    IQueryable<FutureDatedTransactions> Data = Context.FutureDatedTransactionss;

    // Account numbers
    ASPxMemo memo = (ASPxMemo)ddeAccountNumbers.FindControl("memoAccountNumbers");

    if (!string.IsNullOrEmpty(memo.Text))
    {
        List<long> accountNumbers = new List<long>();

        string[] numbers = memo.Text.Split(new char[] { '\n' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string accountText in numbers)
        {
            long accountNumber = 0;

            if (long.TryParse(accountText, out accountNumber))
            {
                accountNumbers.Add(accountNumber);
            }
        }

        if (accountNumbers.Count > 0)
        {
            /* If there are too many elements in the list, the call to Contains will suppress an Exception from the
                * underlying SQL which cannot contain more than 2100 parameters. (Known issue in LinqToSQL.) Then we get 
                * an EMPTY grid with no error. To work around this, using 1000 to be safe, if there more than 1000 items, 
                * break it into batches, and call Contains on each batch of 1000, add the results to a List<T>, then 
                * convert back to IQueryable<T> at the end. */
            if (accountNumbers.Count <= 1000)
            {
                Data = Data.Where(a => accountNumbers.Contains(a.AccountNumber));
            }
            else
            {
                var accountQuery = accountNumbers.GroupBy(num => accountNumbers.IndexOf(num) / 1000);

                List<FutureDatedTransactions> dataList = new List<FutureDatedTransactions>();

                for (int i = 0; i < accountQuery.Count(); i++)
                {
                    var batch = accountQuery.ElementAt(i).ToList();

                    var subSet = Data.Where(a => batch.Contains(a.AccountNumber)).ToList();
                    dataList.AddRange(subSet);
                }

                Data = dataList.AsQueryable<FutureDatedTransactions>();
            }
        }
    }

    e.KeyExpression = "ID";
    e.QueryableSource = Data;
}
Advertisements

About Jerome

I am a senior C# developer in Johannesburg, South Africa. I am also a recovering addict, who spent nearly eight years using methamphetamine. I write on my recovery blog about my lessons learned and sometimes give advice to others who have made similar mistakes, often from my viewpoint as an atheist, and I also write some C# programming articles on my programming blog.
This entry was posted in Programming and tagged , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s