Linq-to-SQL, group-by, subqueries and performance

If you’re using Linq-to-SQL, doing group-by and selecting other columns than those in the grouping-key, performance might suffer. This is because there is no good translation of such queries to SQL and Linq-to-SQL has to resort to doing multiple subqueries. Matt Warren explains here. I experienced this firsthand when grouping a lot of geocoded events by latitude and longitude and selecting a few more columns (EventId and CategoryId in the example below):

from e in db.Events
group e by new { e.Lat, e.Lng } into g
select new
{
    g.Key.Lat,
    g.Key.Lng,
    es = g.Select(_ => new { _.EventId, _.CategoryId })
};

One possible solution is to fetch all events, to a ToList() and do the grouping in-memory.

var foo =
    from e in db.Events
    select new { e.Lat, e.Lng, e.EventId, e.CategoryId };

var bar = from e in foo.ToList()
            group e by new { e.Lat, e.Lng } into g
            select new
            {
                g.Key.Lat,
                g.Key.Lng,
                es = g.Select(_ => new { _.EventId, _.CategoryId })
            };

2 thoughts on “Linq-to-SQL, group-by, subqueries and performance”

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes:

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>