Tuesday, January 10, 2012

Comparing LINQ performance

I have heard rumours on the internet that we should avoid LINQ because it is slow. What can I do about that? Right, I am going to write my own test.
I decided to test the following set of methods:

  • A simple foreach loop
  • A simple for loop
  • Using the ICollection.Contains method
  • The ICollection.Contains method on a HashSet
  • The Any extension method ("LINQ")

Here is the source code of my app:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class Program
    {
        /// <summary>
        /// This method is for performance testing only.
        /// It runs each Action in a loop and uses a Stopwatch to track the elapsed time.
        /// </summary>
        /// <param name="funcList">Delegates to run, keyed by the label printed next to each result</param>
        static void ProfileReport(Dictionary<string, Action> funcList)
        {
            foreach (var func in funcList)
            {
                Stopwatch sw = Stopwatch.StartNew(); // start the timer
                var f = func.Value; // get the current delegate
                f(); // execute it
                sw.Stop(); // stop time tracking
                Console.WriteLine(func.Key + '\t' + sw.Elapsed.ToString()); // print the result
                GC.Collect(); // collect garbage before the next iteration
            }
            Console.WriteLine();
        }

        static void Main(string[] args)
        {
            // Generate the test values: the numbers 1 to 1000000 as strings.
            var names = Enumerable.Range(1, 1000000).Select(i => i.ToString()).ToList();
            var namesHash = new HashSet<string>(names);
            string testName = "99999";
            for (int i = 0; i < 10; i++)
            {
                ProfileReport(new Dictionary<string, Action>()
                {
                    { "Foreach Loop\t", () => Search(names, testName, ContainsForeachLoop) },
                    { "For Loop\t", () => ContainsForLoop(names, testName) },
                    { "Enumerable.Any\t", () => Search(names, testName, ContainsAny) },
                    { "HashSet\t\t", () => Search(namesHash, testName, ContainsCollection) },
                    { "ICollection.Contains", () => Search(names, testName, ContainsCollection) }
                });
            }
            Console.ReadLine();
        }
        static bool ContainsAny(ICollection<string> names, string name)
        {
            return names.Any(s => s == name);
        }

        static bool ContainsCollection(ICollection<string> names, string name)
        {
            return names.Contains(name);
        }

        static bool ContainsForeachLoop(ICollection<string> names, string name)
        {
            foreach (var currentName in names)
            {
                if (currentName == name)
                    return true;
            }
            return false;
        }

        static bool ContainsForLoop(List<string> names, string name)
        {
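            // Note: Count() here is the LINQ extension method, not the
            // List<T>.Count property, and it is called on every iteration.
            for (int i = 0; i < names.Count(); i++)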
            {
                if (names[i] == name)
                    return true;
            }
            return false;
        }


        static bool Search(ICollection<string> names, string name,
        Func<ICollection<string>, string, bool> containsFunc)
        {
            return containsFunc(names, name);
        }
    }

As a result, I got:

I'm a little bit surprised here. I thought that the "for" loop would be faster than foreach, because in a for loop we don't need to copy the current value from the entire data list on every iteration. Yes, it looks like a miracle; I can't explain it right now, so I have to check it on MSDN. But anyway, let's get back to LINQ, which is the main point of all these measurements.
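
One thing that may be worth checking first: the for loop above calls names.Count() in its condition, which is the LINQ extension method rather than the List<T>.Count property, so the for loop test is not entirely LINQ-free. Below is a minimal sketch of a variant that reads the property instead. The class name ForLoopVariant is made up for illustration, and whether this actually changes the timings is an assumption I have not measured.

```csharp
using System;
using System.Collections.Generic;

static class ForLoopVariant
{
    // Hypothetical variant of ContainsForLoop: the same linear search,
    // but the loop bound is read from the List<T>.Count property
    // instead of calling the LINQ Count() extension method.
    public static bool ContainsForLoop(List<string> names, string name)
    {
        for (int i = 0; i < names.Count; i++)
        {
            if (names[i] == name)
                return true;
        }
        return false;
    }
}
```

To compare, this method could be added to the dictionary passed to ProfileReport under its own label, next to the original "For Loop" entry.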

So, as you can see, LINQ really is slower than the foreach loop, HashSet and ICollection.Contains.
It should be avoided if it is not fast enough. Slow and not fast enough are not at all the same thing!
Slow is irrelevant to our customers.
I found some discussions related to this topic; for example, some people think like this:

"Performance optimization is expensive. Writing code so that it can be read and maintained by others is expensive. Those goals are frequently in opposition to each other, so in order to spend your customer's money responsibly you've got to ensure that you're only spending valuable time and effort doing performance optimizations on things that are not fast enough."


But I do not agree. I believe we have to write code that is both easy to read and fast.
So if I have the possibility to avoid LINQ, I will. =)
