Set Operations

Vaibhav • September 11, 2025

As you work with collections in C#, you’ll often encounter scenarios where you need to compare, combine, or contrast two sequences. For example, you might need to find the elements two lists have in common, remove duplicates, or identify items that exist in one list but not the other. These kinds of tasks are known as set operations, and LINQ provides a clean and expressive way to perform them. In this article, we’ll explore the core set operations in LINQ - Distinct, Union, Intersect, and Except - and how to use them effectively.

Understanding Set Semantics

Set operations treat collections like mathematical sets - groups of unique elements. LINQ’s set methods operate on sequences and return results that reflect these set-based relationships. These operations are especially useful when working with lists, arrays, or any IEnumerable<T> where you want to eliminate duplicates or compare contents.

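All LINQ set operations use the default equality comparer for the type unless you provide a custom one. For reference types, the default is reference equality: two objects are equal only if they are the same instance. Elements are compared by value only when the type overrides Equals and GetHashCode, as string does.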
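To see what the default comparer does for different kinds of types, here is a minimal sketch. It assumes C# 9 or later (top-level statements), and the variable names are illustrative rather than part of any API.

```csharp
using System;
using System.Linq;

// Strings override Equals/GetHashCode, so equal text counts as a duplicate.
string[] words = { "apple", "apple" };
int distinctWords = words.Distinct().Count();

// Arrays are reference types that do not override Equals,
// so two arrays with the same contents are still two distinct elements.
int[][] arrays = { new[] { 1, 2 }, new[] { 1, 2 } };
int distinctArrays = arrays.Distinct().Count();

// Value tuples compare field by field, so these two are duplicates.
(int, int)[] tuples = { (1, 2), (1, 2) };
int distinctTuples = tuples.Distinct().Count();

Console.WriteLine($"{distinctWords} {distinctArrays} {distinctTuples}"); // prints "1 2 1"
```

The arrays case is the one that tends to surprise people: the contents match, but the default comparer only sees two different references.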

Removing Duplicates with Distinct

The Distinct method removes duplicate elements from a sequence. It’s the simplest set operation and often used to clean up data.

List<string> fruits = new List<string> { "apple", "banana", "apple", "cherry", "banana" };

var uniqueFruits = fruits.Distinct();

This returns a sequence with only unique fruit names: ["apple", "banana", "cherry"]. The method compares elements using their default equality logic.

Combining Sequences with Union

Union combines two sequences and removes duplicates. It’s useful when merging data from multiple sources while avoiding repetition.

List<string> listA = new List<string> { "apple", "banana" };
List<string> listB = new List<string> { "banana", "cherry" };

var combined = listA.Union(listB);

This returns: ["apple", "banana", "cherry"]. The duplicate “banana” is removed. The order reflects the first appearance of each unique item.

Finding Common Elements with Intersect

Intersect returns elements that exist in both sequences. It’s ideal for identifying shared values or overlaps.

var common = listA.Intersect(listB);

This returns: ["banana"]. Only “banana” appears in both lists. The result contains no duplicates.

Finding Differences with Except

Except returns elements from the first sequence that are not in the second. It’s useful for subtracting one set from another.

var difference = listA.Except(listB);

This returns: ["apple"]. “banana” is excluded because it exists in listB. The result is a filtered version of listA.
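One detail that is easy to miss is that Intersect and Except also collapse repeated values, even when the repeats come from the first sequence. The sketch below uses made-up integer arrays (first and second are illustrative names) and assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Linq;

int[] first = { 1, 1, 2, 2, 3 };
int[] second = { 2, 3, 3, 4 };

// Both results are sets: each value appears at most once,
// even though the inputs contain repeats.
int[] inBoth = first.Intersect(second).ToArray();    // 2, 3
int[] onlyInFirst = first.Except(second).ToArray();  // 1

Console.WriteLine(string.Join(", ", inBoth));
Console.WriteLine(string.Join(", ", onlyInFirst));
```

If you need to keep repeats from the first sequence, a Where filter against a HashSet<T> of the second sequence is the usual alternative to Except.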

Using Set Operations with Custom Types

For custom objects, set operations require proper equality logic. Suppose you have a Product class:

class Product
{
    public string Name { get; set; }
    public decimal Price { get; set; }

    public override bool Equals(object obj)
    {
        return obj is Product p && Name == p.Name && Price == p.Price;
    }

    public override int GetHashCode()
    {
        return HashCode.Combine(Name, Price);
    }
}

With Equals and GetHashCode overridden, you can now use set operations:

List<Product> storeA = new List<Product>
{
    new Product { Name = "Laptop", Price = 999.99m },
    new Product { Name = "Tablet", Price = 499.50m }
};

List<Product> storeB = new List<Product>
{
    new Product { Name = "Tablet", Price = 499.50m },
    new Product { Name = "Phone", Price = 799.00m }
};

var sharedProducts = storeA.Intersect(storeB);

This returns the tablet, which exists in both stores with the same name and price. Without the overrides, Product instances would be compared by reference, and the result would be empty because storeA and storeB hold different objects.

Using IEqualityComparer for Custom Comparison

If you don’t want to override equality in your class, you can use an IEqualityComparer<T> to define comparison externally.

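class ProductComparer : IEqualityComparer<Product>
{
    // Two products are equal when their names match; Price is ignored.
    public bool Equals(Product x, Product y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Name == y.Name;
    }

    // Must agree with Equals: equal names produce equal hash codes.
    public int GetHashCode(Product obj)
    {
        return obj.Name?.GetHashCode() ?? 0;
    }
}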

var sharedByName = storeA.Intersect(storeB, new ProductComparer());

This compares products by name only, ignoring price. It’s a flexible way to customize set logic without modifying the class.

Chaining Set Operations

You can combine set operations to build complex queries. For example, finding items unique to each list:

var uniqueToA = listA.Except(listB);
var uniqueToB = listB.Except(listA);
var symmetricDifference = uniqueToA.Union(uniqueToB);

This computes the symmetric difference - items that exist in one list but not both. It’s useful for comparing datasets or tracking changes.
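As a runnable sketch of the same idea, the snippet below repeats the chained query on the two fruit lists and compares it with HashSet<T>.SymmetricExceptWith, which computes the same set but modifies the HashSet in place. The variable names viaLinq and viaHashSet are illustrative, and C# 9 or later is assumed for top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var listA = new List<string> { "apple", "banana" };
var listB = new List<string> { "banana", "cherry" };

// LINQ version: leaves both lists untouched.
List<string> viaLinq = listA.Except(listB).Union(listB.Except(listA)).ToList();

// HashSet version: SymmetricExceptWith mutates the set it is called on.
var viaHashSet = new HashSet<string>(listA);
viaHashSet.SymmetricExceptWith(listB);

Console.WriteLine(string.Join(", ", viaLinq));    // apple, cherry
Console.WriteLine(viaHashSet.SetEquals(viaLinq)); // True
```

The LINQ form fits naturally into a larger query; the HashSet form may be preferable when you already hold a mutable set and do not need the original contents afterwards.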

Set Operations and Deferred Execution

Like most LINQ methods, set operations use deferred execution. The result is not computed until you enumerate it. This allows you to build queries incrementally and efficiently.

Note: If you modify the source collections after defining the query but before enumerating it, the changes will affect the result.
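The effect described in the note can be sketched with the fruit lists from earlier. The names snapshot and afterChange are illustrative, and C# 9 or later is assumed for top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var listA = new List<string> { "apple", "banana" };
var listB = new List<string> { "banana", "cherry" };

// Defining the query does not run it.
IEnumerable<string> difference = listA.Except(listB);

// A snapshot taken now is unaffected by later changes.
List<string> snapshot = difference.ToList();

// Change a source after the query was defined.
listB.Add("apple");

// Enumerating the query again sees the updated listB.
List<string> afterChange = difference.ToList();

Console.WriteLine(snapshot.Count);    // 1 ("apple")
Console.WriteLine(afterChange.Count); // 0
```

The same query object gives two different answers because each enumeration reads the source lists as they are at that moment.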

Set Operations and Performance

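In LINQ to Objects, set operations build a hash-based set internally, so each one runs in roughly linear time in the combined size of the two inputs instead of comparing every pair of elements. That holds only when the elements have good hash distribution; a GetHashCode that returns the same value for many elements degrades the lookups toward linear scans.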

If you’re performing multiple set operations on the same data, consider materializing the sequence with ToList() to avoid repeated enumeration.
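To make the cost of repeated enumeration visible, the sketch below counts how often a deferred projection runs with and without materializing. The counter and the names squares, others, and cached are hypothetical, and C# 9 or later is assumed for top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int evaluations = 0;

// A deferred source that counts how often its projection runs.
IEnumerable<int> squares = Enumerable.Range(1, 5).Select(n =>
{
    evaluations++;
    return n * n;
});
int[] others = { 4, 25 };

// Each set operation enumerates the deferred source again.
List<int> missing = squares.Except(others).ToList();
List<int> shared = squares.Intersect(others).ToList();
int withoutMaterializing = evaluations; // 10: five per operation

// Materialize once, then reuse the list.
evaluations = 0;
List<int> cached = squares.ToList();
List<int> missingCached = cached.Except(others).ToList();
List<int> sharedCached = cached.Intersect(others).ToList();
int withMaterializing = evaluations; // 5: the projection ran only for ToList

Console.WriteLine($"{withoutMaterializing} vs {withMaterializing}");
```

With a cheap projection the difference is negligible; it matters when the source is a database query, a file read, or an expensive computation.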

Handling Case Sensitivity

For strings, set operations are case-sensitive by default. “Apple” and “apple” are treated as different values. You can use StringComparer.OrdinalIgnoreCase to ignore case.

var combinedIgnoringCase = listA.Union(listB, StringComparer.OrdinalIgnoreCase);

This treats “Apple” and “apple” as equal. It’s useful for user-facing data where case should not affect logic.
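Because the fruit lists used so far are all lowercase, the difference is easier to see with mixed-case input. The arrays entered and imported below are made-up sample data, and C# 9 or later is assumed for top-level statements.

```csharp
using System;
using System.Linq;

string[] entered = { "Apple", "banana" };
string[] imported = { "apple", "BANANA", "Cherry" };

// Default comparer: ordinal and case-sensitive, so nothing is merged.
string[] caseSensitive = entered.Union(imported).ToArray();

// OrdinalIgnoreCase: the first spelling encountered is the one that survives.
string[] ignoreCase = entered.Union(imported, StringComparer.OrdinalIgnoreCase).ToArray();

Console.WriteLine(string.Join(", ", caseSensitive)); // Apple, banana, apple, BANANA, Cherry
Console.WriteLine(string.Join(", ", ignoreCase));    // Apple, banana, Cherry
```

Note that the comparer only decides which elements count as equal; it does not normalize the casing of the values that are returned.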

Using Set Operations with Query Syntax

C# query syntax has no keywords for set operations, so you must call them with method syntax. You can still combine the two by wrapping a query expression in parentheses and chaining the method call:

var result = (from fruit in fruits
              where fruit.Length > 5
              select fruit).Distinct();

This filters fruits by length and removes duplicates. The Distinct call is in method syntax, chained after the query.

Common Mistakes to Avoid

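A frequent mistake is relying on the order of a set operation’s result. In LINQ to Objects, the current implementations yield elements in order of first appearance, as in the Union example above, but the documentation for Distinct describes its result as unordered, and other LINQ providers, such as database providers, make no ordering promise. If order matters, apply OrderBy after the set operation. Also, remember that set operations compare entire elements - not just parts - unless you use a custom comparer.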
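When the result order matters, an explicit sort after the set operation can be sketched as follows. The sample arrays are illustrative, and C# 9 or later is assumed for top-level statements.

```csharp
using System;
using System.Linq;

string[] listA = { "cherry", "apple" };
string[] listB = { "banana", "apple" };

// Sort explicitly instead of depending on the order Union happens to produce.
string[] sorted = listA.Union(listB)
                       .OrderBy(name => name, StringComparer.Ordinal)
                       .ToArray();

Console.WriteLine(string.Join(", ", sorted)); // apple, banana, cherry
```

Passing StringComparer.Ordinal here is a choice made for this sketch; it keeps the sort independent of the current culture.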

You can use DistinctBy in .NET 6+ to remove duplicates based on a property: fruits.DistinctBy(f => f.Length). It’s a more expressive alternative to custom comparers.
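Here is a small sketch of DistinctBy applied to the fruit list from earlier. It requires .NET 6 or later; the variable name onePerLength is illustrative, and the comment about which element is kept reflects the behavior of the LINQ to Objects implementation as far as I know, not a documented guarantee.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var fruits = new List<string> { "apple", "banana", "apple", "cherry", "banana" };

// One fruit per distinct length; the first fruit seen for each length is kept.
// "cherry" is dropped because "banana" already claimed length 6.
List<string> onePerLength = fruits.DistinctBy(f => f.Length).ToList();

Console.WriteLine(string.Join(", ", onePerLength)); // apple, banana
```

Unlike Distinct with a custom comparer, the key selector only picks the value to compare; the elements returned are still the original strings.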

Summary

Set operations in LINQ - Distinct, Union, Intersect, and Except - provide powerful tools for comparing and combining collections. They help you remove duplicates, find common or unique elements, and build expressive queries that reflect real-world relationships. With support for custom types and comparers, these methods adapt to a wide range of scenarios. By mastering set operations, you’ll write cleaner, more efficient, and more insightful LINQ code.

In the next article, we’ll explore Deferred Execution - how LINQ queries are evaluated lazily and what that means for performance and correctness.