Brendan Enrick's Blog

Daily Software Development.


LINQ Your Collections with IEqualityComparer and Lambda Expressions

by Brendan Enrick Monday, April 13 2009 13:50

Anyone using LINQ to manipulate in-memory collections is probably also using plenty of lambda expressions to make things quite easy. These two additions were really meant for each other. One of our interns here recently ran into an interesting problem while using LINQ. As a relatively new user of .NET based languages, reference types caused him a bit of trouble.

The problem

While using the dot notation with lambda expressions, he was using the Except method in the following way.

List<MyObject> x = myCollection.Except(otherCollection).ToList();

Well the problem here is that these two collections contain "MyObject"s, and when it does the comparison it does so based on the reference. This means if those are separate but equivalent objects that the comparison will claim they are different.

He had unit tests making sure that the except statement worked, but was using the same instance of variables to Assert, so the tests claimed to work.

I told him the problem and mentioned that there was probably an overload of Except that allows one to specify how to do the comparison. I was correct, but the overload takes an IEqualityComparer object. I was hoping for a Func<x,x,bool> as the second parameter, so I did what I always do; I Googled to see if anyone knew an easy way to get that to work without doing extra work.

The Internet was kind enough to inform me that there was no built in way of handling this situation.

Building your own was the suggestion. It is a pretty simple class, so it can just be tossed somewhere to be reused easily. It could easily come up and be needed again.

public class LambdaComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, T, bool> _lambdaComparer;
    private readonly Func<T, int> _lambdaHash;

    public LambdaComparer(Func<T, T, bool> lambdaComparer) :
        this(lambdaComparer, o => 0)
    {
    }
    
    public LambdaComparer(Func<T, T, bool> lambdaComparer, Func<T, int> lambdaHash)
    {
        if (lambdaComparer == null)
            throw new ArgumentNullException("lambdaComparer");
        if (lambdaHash == null)
            throw new ArgumentNullException("lambdaHash");

        _lambdaComparer = lambdaComparer;
        _lambdaHash = lambdaHash;
    }

    public bool Equals(T x, T y)
    {
        return _lambdaComparer(x, y);
    }

    public int GetHashCode(T obj)
    {
        return _lambdaHash(obj);
    }
}

Now that we have a nice, Generic, comparer which can take lambda expressions, we are all set to plug this in to the previous code.

List<MyObject> x = myCollection.Except(otherCollection, 
  new LambdaComparer<MyObject>((x, y) => x.Id == y.Id)).ToList();

// or

IEqualityComparer comparer = new LambdaComparer<MyObject>((x, y) => x.Id == y.Id);
List<MyObject> x = myCollection.Except(otherCollection, comparer).ToList();

I admit I am still kind of annoyed that there wasn't an overload which just took a Func<T, T, bool> or a Func<T, T, int>.

Either way, I hope this helps someone use LINQ a little more easily. I know of some alternate ways of solving this same problem. So if you think I should have solved this differently then blog it and link back here or just post a comment below.

Update 16 April 2009 - From a comment below

One commenter below posted a suggested extension method for use with this. He suggests using this nice extension method so you can hide away the fact that you're using the custom comparer class.

public static class Ext
{
    public static IEnumerable<TSource> Except<TSource>(this IEnumerable<TSource> first, 
        IEnumerable<TSource> second , Func<TSource, TSource, bool> comparer )
    {
        return first.Except(second, new LambdaComparer<TSource>(comparer));
    }
}

Thank you for the comment. I like the idea. It will very nicely hide away the fact that a silly comparer is needed.

Comments

4/16/2009 5:43:05 AM #

Something like this? :


    static class Ext


    {


        public class LambdaComparer<T> : IEqualityComparer<T>


        {


            private readonly Func<T, T, bool> _lambdaComparer;


            public LambdaComparer( Func<T, T, bool> lambdaComparer )


            {


                if(lambdaComparer == null)


                    throw new ArgumentNullException( "lambdaComparer" );


                _lambdaComparer = lambdaComparer;


            }


            public bool Equals( T x, T y )


            {


                return _lambdaComparer( x, y );


            }


            public int GetHashCode( T obj )


            {


                return obj.GetHashCode();


            }


        }


        /// <summary>


        /// Produces the set difference of two sequences by using the specified System.Collections.Generic.IEqualityComparer<T>


        /// to compare values.


        /// </summary>


        /// <typeparam name="TSource">The type of the elements of the input sequences</typeparam>


        /// <param name="first">  


        /// An System.Collections.Generic.IEnumerable<T> whose elements that are not


        /// also in second will be returned.


        /// </param>


        /// <param name="second">


        /// An System.Collections.Generic.IEnumerable<T> whose elements that also occur


        /// in the first sequence will cause those elements to be removed from the returned


        /// sequence.


        /// </param>


        /// <param name="comparer">


        ///  An Func<TSource, TSource, bool> to compare values.


        /// </param>


        /// <returns>


        /// A sequence that contains the set difference of the elements of two sequences.


        /// </returns>


        /// <exception cref="System.ArgumentNullException">first or second is null</exception>


        public static IEnumerable<TSource>


            Except<TSource>( this IEnumerable<TSource> first, IEnumerable<TSource> second , Func<TSource, TSource, bool> comparer )


        {


            return first.Except(second, new LambdaComparer<TSource>(comparer));


        }


    }


5/19/2009 4:49:27 AM #

That IEqualityComparer implementation looks broken to me - as it uses an arbitrary function for equality, but then the original hashing function for hashing. That could very easily lead to a hash which is inconsistent with equals, which means it will break in anything like HashSet.


5/19/2009 4:55:47 AM #

Specifically, it can return different hashes for equal elements, which violates the GetHashCode contract.


5/19/2009 9:12:56 AM #

Brendan Enrick

Thank you for the advice. Yes, that slipped my mind for this example since the hash was not being used in this instance. I've adjusted the code snippet.


I now allow a Func to be passed in for the hash code as well. In the instance where you will not need it, it will just use the default one.


Brendan Enrick

11/20/2009 5:02:56 PM #

'Except' and other set operators (Union,Distinct) do not use Equals at all, and only GetHashCode meaning this implementation should infact just take the hash lambda.


twistur-tales.blogspot.com/.../...-not-called.html">twistur-tales.blogspot.com/.../iequalitycompar


11/25/2009 1:33:16 PM #

Your code has a bug,


This line:


private readonly Func<T, T, bool> _lambdaHash;


should read:


private readonly Func<T, int> _lambdaHash;


11/25/2009 1:35:17 PM #

Thanks for the code, this really helps things out.


12/2/2009 11:41:28 AM #

I came up with the same solution. You actually don't need the LambdaHash. If you just return 0 for the hash the Equals comparer will kick in. The underlying evaluation checks the hash and then short circuits the evaluation if it is false. Otherwise, it checks the Equals. If you force the hash to be true (by assuming 0 for both objects), you will always fall through to the Equals check which is what your going for anyway. Via reflector, here's the underlying evaluation that the DistinctIterator uses:


   if ((this.slots[i].hashCode == hashCode) && this.comparer.Equals(this.slots[i].value, value))


1/2/2010 8:32:23 PM #

Great solution!  I hate that some LINQ methods still take in IEquityComparers instead of simple Func<>'s.


1/22/2010 5:05:36 AM #

Great! - but there is still a hash bug in the first construtor in the code snippet.


The default lamdaHash expression should be "o => 0" and not "o => o.GetHashCode()"


For the reason that Jon Skeet mentioned.


8/17/2010 8:24:14 AM #

pingback

Pingback from joe-stevens.com

Linq lambda expression IEqualityComparer for IEnumerable.Distinct and Except | Joe Stevens' Blog

joe-stevens.com

5/29/2011 4:24:54 AM #

Jeremy Lundy

Setting the hashcode to zero does work, but you also lose the performance benefits of the hashcodes.

For example, if you use this comparer to do a Distinct() over a collection of items (i.e. myCollection.Distinct(new LambdaComparer<MyObject>(...))) and you don't specify a lambda hash, each item will have to be compared with every other item in the collection until a match is found. That means n-squared calls to the Equals() method.

I would probably remove the first constructor and always require lambdaHash. Then in (hopefully) rare cases when you really want to "disable" hashcodes, you can just pass "o => 0" to the constructor.

On a side note, take a look at Jon Skeet's ProjectionEqualityComparer here:
stackoverflow.com/.../188130#188130
It does almost the same thing, but with a simpler lambda expression and it automatically infers the correct GetHashCode() result.

Jeremy Lundy United States

6/22/2011 9:21:11 AM #

pingback

Pingback from blog.aquabirdconsulting.com

LamdaComparer – A Must Have Snippet

blog.aquabirdconsulting.com

6/13/2013 7:23:14 AM #

pingback

Pingback from scoop.it

LINQ Your Collections with IEqualityComparer an...

scoop.it

7/22/2013 9:51:35 AM #

pingback

Pingback from digitalinfamy.com

Digital~Infamy | Linq LambdaComparisons

digitalinfamy.com

7/22/2013 9:51:35 AM #

pingback

Pingback from digitalinfamy.com

Digital~Infamy | Linq LambdaComparisons

digitalinfamy.com

Comments are closed