Apr 13 2009

LINQ Your Collections with IEqualityComparer and Lambda Expressions

Category: BlogBrendan Enrick @ 08:50

Anyone using LINQ to manipulate in-memory collections is probably also using plenty of lambda expressions to make things quite easy. These two additions were really meant for each other. One of our interns here recently ran into an interesting problem while using LINQ. As a relatively new user of .NET based languages, reference types caused him a bit of trouble.

The problem

While using the dot notation with lambda expressions, he was using the Except method in the following way.

List<MyObject> x = myCollection.Except(otherCollection).ToList();

Well the problem here is that these two collections contain "MyObject"s, and when it does the comparison it does so based on the reference. This means if those are separate but equivalent objects that the comparison will claim they are different.

He had unit tests making sure that the except statement worked, but was using the same instance of variables to Assert, so the tests claimed to work.

I told him the problem and mentioned that there was probably an overload of Except that allows one to specify how to do the comparison. I was correct, but the overload takes an IEqualityComparer object. I was hoping for a Func<x,x,bool> as the second parameter, so I did what I always do; I Googled to see if anyone knew an easy way to get that to work without doing extra work.

The Internet was kind enough to inform me that there was no built in way of handling this situation.

Building your own was the suggestion. It is a pretty simple class, so it can just be tossed somewhere to be reused easily. It could easily come up and be needed again.

public class LambdaComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, T, bool> _lambdaComparer;
    private readonly Func<T, int> _lambdaHash;

    public LambdaComparer(Func<T, T, bool> lambdaComparer) :
        this(lambdaComparer, o => 0)
    {
    }
    
    public LambdaComparer(Func<T, T, bool> lambdaComparer, Func<T, int> lambdaHash)
    {
        if (lambdaComparer == null)
            throw new ArgumentNullException("lambdaComparer");
        if (lambdaHash == null)
            throw new ArgumentNullException("lambdaHash");

        _lambdaComparer = lambdaComparer;
        _lambdaHash = lambdaHash;
    }

    public bool Equals(T x, T y)
    {
        return _lambdaComparer(x, y);
    }

    public int GetHashCode(T obj)
    {
        return _lambdaHash(obj);
    }
}

Now that we have a nice, Generic, comparer which can take lambda expressions, we are all set to plug this in to the previous code.

List<MyObject> x = myCollection.Except(otherCollection, 
  new LambdaComparer<MyObject>((x, y) => x.Id == y.Id)).ToList();

// or

IEqualityComparer comparer = new LambdaComparer<MyObject>((x, y) => x.Id == y.Id);
List<MyObject> x = myCollection.Except(otherCollection, comparer).ToList();

I admit I am still kind of annoyed that there wasn't an overload which just took a Func<T, T, bool> or a Func<T, T, int>.

Either way, I hope this helps someone use LINQ a little more easily. I know of some alternate ways of solving this same problem. So if you think I should have solved this differently then blog it and link back here or just post a comment below.

Update 16 April 2009 - From a comment below

One commenter below posted a suggested extension method for use with this. He suggests using this nice extension method so you can hide away the fact that you're using the custom comparer class.

public static class Ext
{
    public static IEnumerable<TSource> Except<TSource>(this IEnumerable<TSource> first, 
        IEnumerable<TSource> second , Func<TSource, TSource, bool> comparer )
    {
        return first.Except(second, new LambdaComparer<TSource>(comparer));
    }
}

Thank you for the comment. I like the idea. It will very nicely hide away the fact that a silly comparer is needed.

Tags: , ,

Comments

1.
says:

Something like this? :


    static class Ext


    {


        public class LambdaComparer<T> : IEqualityComparer<T>


        {


            private readonly Func<T, T, bool> _lambdaComparer;


            public LambdaComparer( Func<T, T, bool> lambdaComparer )


            {


                if(lambdaComparer == null)


                    throw new ArgumentNullException( "lambdaComparer" );


                _lambdaComparer = lambdaComparer;


            }


            public bool Equals( T x, T y )


            {


                return _lambdaComparer( x, y );


            }


            public int GetHashCode( T obj )


            {


                return obj.GetHashCode();


            }


        }


        /// <summary>


        /// Produces the set difference of two sequences by using the specified System.Collections.Generic.IEqualityComparer<T>


        /// to compare values.


        /// </summary>


        /// <typeparam name="TSource">The type of the elements of the input sequences</typeparam>


        /// <param name="first">  


        /// An System.Collections.Generic.IEnumerable<T> whose elements that are not


        /// also in second will be returned.


        /// </param>


        /// <param name="second">


        /// An System.Collections.Generic.IEnumerable<T> whose elements that also occur


        /// in the first sequence will cause those elements to be removed from the returned


        /// sequence.


        /// </param>


        /// <param name="comparer">


        ///  An Func<TSource, TSource, bool> to compare values.


        /// </param>


        /// <returns>


        /// A sequence that contains the set difference of the elements of two sequences.


        /// </returns>


        /// <exception cref="System.ArgumentNullException">first or second is null</exception>


        public static IEnumerable<TSource>


            Except<TSource>( this IEnumerable<TSource> first, IEnumerable<TSource> second , Func<TSource, TSource, bool> comparer )


        {


            return first.Except(second, new LambdaComparer<TSource>(comparer));


        }


    }


2.
says:

That IEqualityComparer implementation looks broken to me - as it uses an arbitrary function for equality, but then the original hashing function for hashing. That could very easily lead to a hash which is inconsistent with equals, which means it will break in anything like HashSet.


3.
says:

Specifically, it can return different hashes for equal elements, which violates the GetHashCode contract.


4.
benrick benrick says:

Thank you for the advice. Yes, that slipped my mind for this example since the hash was not being used in this instance. I've adjusted the code snippet.


I now allow a Func to be passed in for the hash code as well. In the instance where you will not need it, it will just use the default one.


5.
says:

'Except' and other set operators (Union,Distinct) do not use Equals at all, and only GetHashCode meaning this implementation should infact just take the hash lambda.


twistur-tales.blogspot.com/.../...-not-called.html">twistur-tales.blogspot.com/.../iequalitycompar


6.
says:

Your code has a bug,


This line:


private readonly Func<T, T, bool> _lambdaHash;


should read:


private readonly Func<T, int> _lambdaHash;


7.
says:

Thanks for the code, this really helps things out.


8.
says:

I came up with the same solution. You actually don't need the LambdaHash. If you just return 0 for the hash the Equals comparer will kick in. The underlying evaluation checks the hash and then short circuits the evaluation if it is false. Otherwise, it checks the Equals. If you force the hash to be true (by assuming 0 for both objects), you will always fall through to the Equals check which is what your going for anyway. Via reflector, here's the underlying evaluation that the DistinctIterator uses:


   if ((this.slots[i].hashCode == hashCode) && this.comparer.Equals(this.slots[i].value, value))


9.
says:

Great solution!  I hate that some LINQ methods still take in IEquityComparers instead of simple Func<>'s.


10.
says:

Great! - but there is still a hash bug in the first construtor in the code snippet.


The default lamdaHash expression should be "o => 0" and not "o => o.GetHashCode()"


For the reason that Jon Skeet mentioned.


11.
pingback joe-stevens.com says:

Pingback from joe-stevens.com

Linq lambda expression IEqualityComparer for IEnumerable.Distinct and Except | Joe Stevens' Blog

12.
Jack Mail Jack Mail Poland says:

Intresting and helps things, its looking very of simple

13.
Jack Mail Jack Mail Poland says:

Intresting and helps things, its looking very of simple

14.
Jack Mail Jack Mail Poland says:

Intresting and helps things, its looking very of simple

15.
Jack Mail Jack Mail Poland says:

Intresting and helps things, its looking very of simple

16.
Jack Mail Jack Mail Poland says:

Intresting and helps things, its looking very of simple

17.
outlook pst repair outlook pst repair United States says:

Do you always write such good blogs. I am interested in reading more of your work. I have bookmarked your site. Thanks in advance.

18.
john@nurserybeddingsets.com john@nurserybeddingsets.com United States says:

Easy solution to something that has been bugging me...

19.
john@nurserybeddingsets.com john@nurserybeddingsets.com United States says:

Easy solution to something that has been bugging me...

Add comment


(Will show your Gravatar icon)

  Country flag

biuquote
  • Comment
  • Preview
Loading