If you've been writing C# for a while, chances are you've already bumped into
LINQ. It sneaks its way into nearly every project, sometimes through simple
where and select clauses, sometimes through more advanced scenarios like
building queries against databases. But here's the thing: most developers I've
worked with stop once they know the basics. They filter, project, maybe join a
collection or two, and then move on. What I want to do here is take you much
deeper, closer to the metal of how LINQ actually works and what it enables once
you understand its execution model.
In my experience, the real power of LINQ isn't just in the operators you already
know. It's in understanding how the whole system is designed to be extensible,
composable, and surprisingly flexible. We'll talk about how LINQ really runs
behind the scenes, how you can build your own reusable operators that feel like
part of the framework, what expression trees mean for dynamic queries, and why
IQueryable is such a powerful abstraction.
Along the way, I'll highlight some common mistakes I've seen developers make (some of which I've made myself over the years) and offer guidance on writing production-ready LINQ that won't surprise you with performance issues down the road.
Understanding Deferred Execution
Let's start with a subtle point that many developers miss: LINQ queries don't run immediately. They're lazy. When you write something like this:
var numbers = new List<int> { 1, 2, 3, 4, 5 };

var evens = numbers
    .Where(n => n % 2 == 0)
    .Select(n => n * 10);

// Nothing has executed yet
foreach (var num in evens)
    Console.WriteLine(num);
Up until the foreach, nothing is actually happening. LINQ builds a pipeline of
operations, but execution is deferred until you iterate. This design allows
queries to be composed efficiently and gives you control over when computation
happens. It also explains why queries sometimes behave differently when the
underlying collection changes. For example, if you modify numbers before
iterating over evens, those changes show up in the query results. This can be
surprising if you expect snapshots.
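A quick illustration:

var numbers = new List<int> { 1, 2, 3, 4, 5 };
var evens = numbers.Where(n => n % 2 == 0);

numbers.Add(6); // modify the source before the query ever runs

Console.WriteLine(evens.Count()); // 3: the deferred query sees 2, 4, and the new 6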
Building Custom Operators
One of the real joys of LINQ is that it's extensible. You're not stuck with
just Select and Where. You can define your own operators that plug into the
fluent chain naturally. Let's try one: imagine we want a method called
WhereNotNull that filters out nulls.
public static class LinqExtensions
{
    public static IEnumerable<T> WhereNotNull<T>(this IEnumerable<T?> source)
        where T : class
    {
        foreach (var item in source)
        {
            if (item != null)
                yield return item;
        }
    }
}
Notice how this feels just like a built-in operator. Because we wrote it as an
extension method with yield return, it integrates seamlessly into deferred
execution. Using it looks natural:
var names = new List<string?> { "Alice", null, "Bob" };

var safeNames = names
    .WhereNotNull()
    .Select(n => n.ToUpper());
This is a very simple operator, but the idea scales. You can build higher-level abstractions specific to your domain, making queries both safer and more expressive.
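For instance, here's a hypothetical WhereBetween operator that packages a common range check; the name and shape are illustrative rather than anything built in:

public static class DomainExtensions
{
    // Keep items whose key falls within an inclusive range.
    public static IEnumerable<T> WhereBetween<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector, TKey min, TKey max)
        where TKey : IComparable<TKey>
    {
        foreach (var item in source)
        {
            var key = keySelector(item);
            if (key.CompareTo(min) >= 0 && key.CompareTo(max) <= 0)
                yield return item;
        }
    }
}

A call like orders.WhereBetween(o => o.Total, 10m, 100m) then reads like a sentence, and it stays deferred just like the built-in operators.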
One caveat: yield return creates an iterator that defers execution. That keeps
memory usage low, but it also means the source is only read while you
enumerate, and most collections will throw if you modify them mid-iteration.
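Here's the kind of failure that caveat is pointing at:

var values = new List<int> { 1, 2, 3 };
var query = values.Where(v => v > 1);

foreach (var v in query)
{
    // Throws InvalidOperationException: the list changed while the deferred
    // query was still enumerating it.
    values.Add(v * 10);
}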
Advanced Custom Operators
Let's go deeper on custom operators. Here's one I use frequently when working with time-series data:
public static class TimeSeriesExtensions
{
    public static IEnumerable<T[]> Batch<T>(
        this IEnumerable<T> source, int batchSize)
    {
        var batch = new List<T>(batchSize);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch.ToArray();
                batch.Clear();
            }
        }
        // Flush the final, possibly partial batch.
        if (batch.Count > 0) yield return batch.ToArray();
    }

    public static IEnumerable<TResult> Windowed<T, TResult>(
        this IEnumerable<T> source, int windowSize,
        Func<IEnumerable<T>, TResult> selector)
    {
        var window = new Queue<T>(windowSize);
        foreach (var item in source)
        {
            window.Enqueue(item);
            if (window.Count > windowSize) window.Dequeue();
            // The selector sees the live queue, so it should consume it right
            // away (as Average does) rather than hold on to it.
            if (window.Count == windowSize)
                yield return selector(window);
        }
    }
}
These operators let you write elegant code for common data processing patterns.
The Batch operator chunks data for bulk operations, while Windowed creates
sliding windows, which are perfect for calculating moving averages or detecting
patterns in sequences.
var prices = GetStockPrices();

// Calculate 5-day moving averages
var movingAverages = prices
    .Windowed(5, window => window.Average())
    .ToList();

// Process data in batches of 100
var batches = prices
    .Batch(100)
    .Select(batch => ProcessBatch(batch));
Expression Trees and IQueryable
If you've worked with Entity Framework or another ORM, you've probably seen
IQueryable<T>. Unlike IEnumerable<T>, which executes against in-memory objects,
IQueryable builds an expression tree. That tree represents the structure of the
query itself. When you write:
var query = db.Customers
    .Where(c => c.City == "London")
    .Select(c => c.Name);
you're not filtering customers in memory. Instead, EF translates the expression
tree into SQL and runs it against the database. This is why some LINQ operators
behave differently depending on whether you're working with IEnumerable or
IQueryable.
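To make the distinction concrete, here's a sketch using the same db context; the identical lambda takes two very different paths:

// On IQueryable<Customer>, the lambda is captured as an expression tree that
// EF can translate into SQL.
IQueryable<Customer> translated = db.Customers.Where(c => c.City == "London");

// After AsEnumerable(), the same lambda is just a compiled delegate: every
// customer is pulled into memory and filtered there.
IEnumerable<Customer> inMemory = db.Customers
    .AsEnumerable()
    .Where(c => c.City == "London");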
Expression trees also let you build dynamic queries at runtime. Suppose you want to construct a filter based on user input. Instead of string-concatenated SQL, you can generate an expression tree programmatically. It's type-safe, composable, and far less error-prone.
public static class RuleBuilder
{
    public static Expression<Func<Customer, bool>> CreateRule(
        string propertyName, string operation, object value)
    {
        var parameter = Expression.Parameter(typeof(Customer), "customer");
        var property = Expression.Property(parameter, propertyName);
        // value must already match the property's type (an int for Age, say),
        // or Expression.Constant will throw.
        var constant = Expression.Constant(value, property.Type);

        Expression comparison = operation switch
        {
            "equals" => Expression.Equal(property, constant),
            "greater" => Expression.GreaterThan(property, constant),
            "contains" => Expression.Call(property, "Contains", null, constant),
            _ => throw new ArgumentException($"Unknown operation: {operation}")
        };

        return Expression.Lambda<Func<Customer, bool>>(comparison, parameter);
    }
}
Now you can build rules from configuration files, user input, or database records:
var rule = RuleBuilder.CreateRule("Age", "greater", 21);
var eligibleCustomers = customers.Where(rule.Compile());
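One caveat: rule.Compile() produces a delegate, which is exactly right for in-memory collections. Against an IQueryable source, pass the expression itself so the provider can translate it into SQL instead of pulling every row into memory:

// With an IQueryable source (an EF Core DbSet, for example), keep the rule as
// an expression so the provider can translate it.
var eligibleFromDb = db.Customers.Where(rule);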
Performance Optimization Strategies
Performance is where things get interesting. LINQ makes queries readable, but it's easy to forget about efficiency. Let's consider a simple mistake:
var results = numbers
    .Where(n => n > 10)
    .ToList()
    .Where(n => n % 2 == 0);
Here we materialize the query too early by calling ToList(). That means the
second Where runs in memory on a full list, wasting effort. Chaining operators
before materialization avoids this.
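For comparison, here's the same query with both filters composed before anything is materialized:

var results = numbers
    .Where(n => n > 10)
    .Where(n => n % 2 == 0)
    .ToList(); // materialize once, at the end, and only if you need a list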
Another performance trick is to favor Any() over Count() when checking
existence. Any() can stop at the first match, whereas Count() has to walk the
entire sequence.
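A quick illustration with the numbers list from earlier:

// Any() short-circuits at the first element greater than 10.
bool hasLarge = numbers.Any(n => n > 10);

// Count() > 0 answers the same yes/no question but enumerates everything.
bool hasLargeSlow = numbers.Count(n => n > 10) > 0;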
Memory Efficiency and Streaming
One of LINQ's underappreciated strengths is how well it handles large datasets without loading everything into memory. This is where understanding deferred execution really pays off. You can process files with millions of records using constant memory, as long as you're careful about when materialization happens.
public static IEnumerable<LogEntry> ProcessLargeLogs(string filePath)
{
    // File.ReadLines streams the file one line at a time, so the whole
    // pipeline stays lazy. Adding an OrderBy here would break that: sorting
    // has to buffer the entire filtered sequence before yielding anything.
    return File.ReadLines(filePath)
        .Select(ParseLogEntry)
        .Where(entry => entry.Level == LogLevel.Error)
        .Where(entry => entry.Timestamp > DateTime.Today.AddDays(-7));
}

var recentErrors = ProcessLargeLogs("huge-log.txt")
    .Take(100)
    .ToList();
The beautiful thing here is that LINQ only processes as much data as needed to produce the final result. If you only take the first 100 error entries, it stops processing once it finds them. This is streaming at its finest.
Building Query DSLs
One of the most impressive things you can do with LINQ is build domain-specific languages that feel natural to your business users. The key is creating an API that exposes the right abstractions while hiding the complexity underneath.
public class CustomerQuery
{
    private readonly IQueryable<Customer> _query;

    public CustomerQuery(IQueryable<Customer> query) => _query = query;

    public CustomerQuery FromCity(string city) =>
        new(_query.Where(c => c.City == city));

    public CustomerQuery WithOrdersAfter(DateTime date) =>
        new(_query.Where(c => c.Orders.Any(o => o.Date > date)));

    public CustomerQuery TopSpenders(int count) =>
        new(_query.OrderByDescending(c => c.Orders.Sum(o => o.Total)).Take(count));

    public IQueryable<Customer> Build() => _query;
}
This creates a fluent API that business analysts can understand and use:
var customers = new CustomerQuery(db.Customers)
    .FromCity("Seattle")
    .WithOrdersAfter(DateTime.Today.AddMonths(-6))
    .TopSpenders(50)
    .Build();
The beauty is that this still compiles down to efficient SQL when used with Entity Framework, but the API surface is much more approachable than raw LINQ.
Parallel and Asynchronous Execution
LINQ to Objects has another trick: PLINQ (Parallel LINQ). By calling
AsParallel(), you can distribute query execution across multiple cores:
var bigResults = numbers
    .AsParallel()
    .Where(n => SomeExpensiveCheck(n))
    .ToList();
For CPU-bound workloads, this can dramatically cut runtime. But it's not always
a free win: you have to be careful about ordering and side effects. Parallel
execution may change result order unless you opt back in with AsOrdered().
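If the original order matters, the same query can preserve it, at some coordination cost:

var orderedResults = numbers
    .AsParallel()
    .AsOrdered() // results come back in source order
    .Where(n => SomeExpensiveCheck(n))
    .ToList();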
On the async side, EF Core supports asynchronous LINQ execution with methods
like ToListAsync(). This is essential for scaling I/O-bound systems, where
blocking calls can stall throughput.
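A minimal sketch, assuming the same EF Core db context as before:

// The query is translated to SQL and awaited, so no thread sits blocked while
// the database does its work.
var londonCustomers = await db.Customers
    .Where(c => c.City == "London")
    .ToListAsync();

You can also produce asynchronous streams of your own with IAsyncEnumerable<T> and consume them with await foreach: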
public static async IAsyncEnumerable<WeatherData> GetWeatherStream()
{
    // The base address is a placeholder; point it at whatever service you are
    // actually calling.
    using var client = new HttpClient { BaseAddress = new Uri("https://example.com") };
    for (int i = 0; i < 100; i++)
    {
        var response = await client.GetStringAsync($"/api/weather/{i}");
        yield return JsonSerializer.Deserialize<WeatherData>(response)!;
    }
}

await foreach (var weather in GetWeatherStream())
    Console.WriteLine($"Temperature: {weather.Temperature}");
Common Pitfalls
Let's pause and look at a few traps I've seen over and over. One is abusing LINQ for everything. Just because you can write a single monstrous query doesn't mean you should. Breaking queries into smaller, named steps is often more readable and debuggable.
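A sketch of what that looks like, assuming a hypothetical sequence of order records:

// Each step gets a meaningful name but, thanks to deferred execution, they
// still compose into one pipeline; nothing runs until the final enumeration.
var validOrders = orders.Where(o => o.Total > 0);
var recentOrders = validOrders.Where(o => o.Date > DateTime.Today.AddDays(-30));
var summaries = recentOrders.Select(o => new { o.Id, o.Total });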
Another pitfall is assuming LINQ queries are free. They're not. Deferred execution can hide performance issues until runtime, sometimes in production. It's wise to profile queries on realistic datasets.
One pattern I see frequently is unnecessary multiple enumeration:
// This enumerates the source multiple times!
var query = GetExpensiveData().Where(x => x.IsValid);

if (query.Any())
{
    Console.WriteLine($"Found {query.Count()} items");
    foreach (var item in query)
        ProcessItem(item);
}

// Better: materialize once
var items = GetExpensiveData().Where(x => x.IsValid).ToList();

if (items.Any())
{
    Console.WriteLine($"Found {items.Count} items");
    foreach (var item in items)
        ProcessItem(item);
}
And remember that LINQ isn't always the right tool: sometimes a plain for loop
really is clearer and faster.
Summary
LINQ is more than a set of handy operators; it's a full model for querying and transforming data. Its lazy execution allows efficient, composable pipelines, and its extensibility lets you create custom operators tailored to your domain.
We've looked at deferred execution, performance trade-offs, expression trees, and even dynamic query scenarios. These tools give you both flexibility and control when working with large or remote datasets. The real shift is in mindset: thinking in data transformations instead of loops.