Monday, April 9, 2012

Heapsort Algorithm implementation in C#


Today I would like to introduce another sorting algorithm: heapsort.
Heapsort’s running time is O(n lg n). Heapsort also introduces another algorithm design technique: using a data structure, in this case one we call a "heap", to manage information. The term "heap" was originally coined in the context of heapsort, but it has since come to refer to "garbage-collected storage", such as the languages C# and Java provide. The heap discussed here is not garbage-collected storage, and whenever I refer to heaps, I mean the data structure rather than an aspect of garbage collection.

The heapsort algorithm starts by using "Adjust" to build a max-heap on the input array A[1..n], where n = A.Length. Since the maximum element of the array is stored at the root A[1], we can put it into its correct final position by exchanging it with A[n]. If we now discard node n from the heap (we can do so by simply decrementing A.heap-size), we observe that the children of the root remain max-heaps, but the new root element might violate the max-heap property. All we need to do to restore the max-heap property, however, is call Adjust(A, 1), which leaves a max-heap in A[1..n-1]. The heapsort algorithm then repeats this process for the max-heap of size n - 1 down to a heap of size 2. Note that this description uses 1-based indexes, while the C# code uses 0-based arrays, so there the root is A[0] and the children of node i are at indexes 2i + 1 and 2i + 2.
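To make the index arithmetic and the max-heap property concrete, here is a minimal sketch for 0-based arrays. The class name HeapIndexSketch and its helpers are hypothetical names used only for illustration; they are not part of the implementation in this post.

```csharp
using System;

public static class HeapIndexSketch
{
    // 0-based index arithmetic for a binary heap stored in an array.
    public static int Left(int i) { return 2 * i + 1; }
    public static int Right(int i) { return 2 * i + 2; }
    public static int Parent(int i) { return (i - 1) / 2; }

    // Returns true when every element in a[0..heapSize-1] is <= its parent,
    // which is exactly the max-heap property.
    public static bool IsMaxHeap<T>(T[] a, int heapSize) where T : IComparable<T>
    {
        for (int i = 1; i < heapSize; i++)
        {
            if (a[Parent(i)].CompareTo(a[i]) < 0)
                return false;
        }
        return true;
    }
}
```

A check like IsMaxHeap can be handy in a debug build after the heap-building loop, to confirm the array really is a max-heap before the extraction phase starts.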

Here is the code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace SortComparison
{
    public class HeapSort<T> where T : IComparable<T>
    {
        
        public static void Sort(T[] A)
        {
            //build the initial heap
            //start from the last parent node; an empty array skips the loop entirely
            for (int i = A.Length / 2 - 1; i >= 0; i--)
                Adjust(A, i, A.Length - 1);

            //swap root node and the last heap node
            for (int i = A.Length - 1; i >= 1; i--)
            {
                T Temp = A[0];
                A[0] = A[i];
                A[i] = Temp;
                Adjust(A, 0, i - 1);
            }
        }



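        // Sifts list[i] down until the subtree rooted at i is a max-heap;
        // len is the index of the last element that still belongs to the heap
        private static void Adjust(T[] list, int i, int len)
        {
            T Temp = list[i];
            int j = i * 2 + 1; //left child of i

            while (j <= len)
            {
                //if there is a right child, pick the larger of the two children
                if (j < len)
                    if (list[j].CompareTo(list[j + 1]) < 0)
                        j = j + 1;

                //compare the sifted value with the larger child
                if (Temp.CompareTo(list[j]) < 0)
                {
                    list[i] = list[j];
                    i = j;
                    j = 2 * i + 1;
                }
                else
                {
                    break; //the value is in its final position
                }
            }
            list[i] = Temp;
        }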


    }
}

Please feel free to post any comments!

Wednesday, April 4, 2012

Binary Search Algorithm

Generally, to find a value in an unsorted array, we have two options:
1. Look through the elements of the array one by one until the searched value is found. If the value does not exist in the array, we end up visiting every element. The complexity of this approach is O(n), where n is the number of elements.
2. Sort the array using any in-place sorting algorithm, QuickSort for instance, and then apply the Binary Search algorithm.


Algorithm

The algorithm is very simple. It can be done either recursively or iteratively:


  1. get the middle element;
  2. if the middle element equals the searched value, the algorithm stops;
  3. otherwise, two cases are possible:


  • the searched value is less than the middle element. In this case, go to step 1 for the part of the array before the middle element.
  • the searched value is greater than the middle element. In this case, go to step 1 for the part of the array after the middle element.
I decided to implement it the recursive way with a few enhancements.

using System;
using System.Collections.Generic;

public class BinarySearch
    {
        // Returns the index of val in the sorted list A, or -1 if it is not found
        public int Search<T>(List<T> A, T val) where T : IComparable<T>
        {
            if (A == null || A.Count == 0) return -1;
            return InnerSearch(A, val, 0, A.Count - 1);
        }
        private int InnerSearch<T>(List<T> A, T val, int min, int max) where T : IComparable<T>
        {
            if (min > max) return -1;

            int middle = min + (max - min) / 2;
            if (A[middle].CompareTo(val) == 0) return middle;
            // enhancement: both ends are tested as well, so the next range shrinks from both sides
            if (A[min].CompareTo(val) == 0) return min;
            if (A[max].CompareTo(val) == 0) return max;

            if (A[middle].CompareTo(val) > 0) return InnerSearch(A, val, min + 1, middle - 1);
            return InnerSearch(A, val, middle + 1, max - 1);
        }
    }
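
For comparison, the iterative form of the same three steps could be sketched as follows. This is a minimal sketch without the extra end checks; the class name IterativeBinarySearch is hypothetical, and the list is assumed to be sorted in ascending order.

```csharp
using System;
using System.Collections.Generic;

public static class IterativeBinarySearch
{
    // Returns the index of val in the sorted list, or -1 when it is absent.
    public static int Search<T>(IList<T> sorted, T val) where T : IComparable<T>
    {
        if (sorted == null) return -1;
        int min = 0;
        int max = sorted.Count - 1;
        while (min <= max)
        {
            int middle = min + (max - min) / 2; // avoids overflow of min + max
            int cmp = sorted[middle].CompareTo(val);
            if (cmp == 0) return middle;
            if (cmp > 0) max = middle - 1;      // continue in the part before middle
            else min = middle + 1;              // continue in the part after middle
        }
        return -1;
    }
}
```

The loop keeps the same invariant as the recursion: if the value is present, it is always inside the range min..max.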

Quicksort Algorithm implementation in C#

Since the beginning of January 2012, I have been spending some spare time studying computer science fundamentals. Of course, I already know a lot of this material, but not all of it. In addition, it is useful to revisit the basics. I decided to create my own library of algorithms and data structures, not for production use but for studying and understanding them. I spent a lot of time surfing the internet for good articles about algorithms and data structures.
Today, I would like to talk about the QuickSort algorithm.

Quicksort, like many other sorting algorithms, applies the divide-and-conquer paradigm.
Here is the three-step divide-and-conquer process for sorting a typical subarray Arr[q..w]:

  • Divide: Partition (rearrange) the array Arr[q..w] into two subarrays Arr[q..e-1] and Arr[e+1..w] such that each element of Arr[q..e-1] is less than or equal to Arr[e] which is, in turn, less than or equal to each element of Arr[e+1..w]. Compute the index e as part of this partitioning procedure.
  • Conquer: Sort the two subarrays Arr[q..e-1] and Arr[e+1..w] by recursive calls to quicksort.
  • Combine: Because the subarrays are already sorted, no work is needed to combine them: the entire array Arr[q..w] is now sorted.
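
The three steps above can be sketched directly in code. Note that this is a Lomuto-style partition with the last element as the pivot, chosen here only because it maps one-to-one onto the description; it is not the partition scheme used in my implementation further down. The class name PartitionSketch is hypothetical.

```csharp
using System;

public static class PartitionSketch
{
    // Divide: uses arr[w] as the pivot and returns its final index e, so that
    // arr[q..e-1] <= arr[e] <= arr[e+1..w].
    public static int Partition<T>(T[] arr, int q, int w) where T : IComparable<T>
    {
        T pivot = arr[w];
        int e = q;
        for (int k = q; k < w; k++)
        {
            if (arr[k].CompareTo(pivot) <= 0)
            {
                Swap(arr, e, k);
                e++;
            }
        }
        Swap(arr, e, w);
        return e;
    }

    // Conquer: sort both sides recursively. Combine: nothing to do.
    public static void Sort<T>(T[] arr, int q, int w) where T : IComparable<T>
    {
        if (q >= w) return;
        int e = Partition(arr, q, w);
        Sort(arr, q, e - 1);
        Sort(arr, e + 1, w);
    }

    private static void Swap<T>(T[] arr, int a, int b)
    {
        T t = arr[a];
        arr[a] = arr[b];
        arr[b] = t;
    }
}
```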


The quicksort algorithm has a worst-case running time of O(n^2) on an input array of n numbers. Despite this slow worst-case running time, quicksort is often the best practical choice for sorting because it is remarkably efficient on average: its expected running time is O(n lg n), and the constant factors hidden in the O(n lg n) notation are quite small.

Here is my C# implementation using recursion:

using System;

namespace SortComparison
{
    public class QuickSort<T> where T : IComparable<T>
    {
        public static void Sort(T[] arr)
        {
            // nothing to do for null, empty or single-element arrays
            if (arr == null || arr.Length < 2) return;
            InnerSort(arr, 0, arr.Length - 1);
        }

        // the array is passed as a parameter, so no shared static state is needed
        private static void InnerSort(T[] A, int low, int high)
        {
            int i = low;
            int j = high;
            // the middle element is used as the pivot
            T pivot = A[low + (high - low) / 2];
            do
            {
                // skip the elements that are already on the correct side of the pivot
                while (A[i].CompareTo(pivot) < 0) i++;
                while (A[j].CompareTo(pivot) > 0) j--;

                if (i <= j)
                {
                    T tmp = A[i];
                    A[i] = A[j];
                    A[j] = tmp;
                    i++; j--;
                }

            } while (i <= j);
            if (i < high) InnerSort(A, i, high);
            if (low < j) InnerSort(A, low, j);
        }
    }
}
As you can see, it is quite a trivial implementation, but a very efficient one.

Tuesday, March 13, 2012

Shingles algorithm

A few years ago I wrote my "AI" web crawler for gathering news articles from all over the world in real time. Yes, that was a great adventure! One of the problems I ran into was that almost 30% of the texts were copied from each other and slightly edited, while the main content remained the same.
Searching for fuzzy duplicates lets us estimate whether two objects share the same parts. The objects can be text files or any other data type. We will work with text, but once you understand how the algorithm works, it will not be difficult to adapt my implementation to the objects you need.


Let us consider an example with text. Assume we have a text file with 10 paragraphs. Make a complete copy, and then rewrite only the last paragraph. 9 of the 10 paragraphs of the second file are a complete copy of the original.

Another example is more complicated. If we rewrite every 5th or 6th sentence in the copy of the original text, the text will still be almost the same.

How does the Shingles algorithm work?
So, we have two texts and we need to decide whether they are near-duplicates. The implementation of the algorithm involves the following steps:


  • canonization of the texts;
  • splitting the text into shingles;
  • calculating a checksum for each shingle;
  • finding matching subsequences.
Now more specifically. The algorithm compares the checksums of the shingles of the two texts. In my implementation I use MD5, but other functions, such as SHA1 or CRC32, can be used as well.
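
Before the full code, here is a simplified sketch of the steps above on short strings. It makes several assumptions that differ from my real implementation: canonization just lower-cases the text and keeps letters and digits, shingles are stored in a set (so repeated shingles count once), and the shingles are compared directly instead of through checksums. The class name ShingleSketch is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class ShingleSketch
{
    // Canonization (simplified): lower-case and keep only letters and digits.
    public static string Canonize(string text)
    {
        var builder = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            if (char.IsLetterOrDigit(c))
                builder.Append(char.ToLowerInvariant(c));
        }
        return builder.ToString();
    }

    // All overlapping character windows of the given length.
    public static HashSet<string> Shingles(string text, int length)
    {
        var set = new HashSet<string>();
        for (int i = 0; i + length <= text.Length; i++)
            set.Add(text.Substring(i, length));
        return set;
    }

    // Percentage of shared shingles: 2 * shared / (countA + countB) * 100.
    public static double Similarity(string a, string b, int length)
    {
        HashSet<string> first = Shingles(Canonize(a), length);
        HashSet<string> second = Shingles(Canonize(b), length);
        if (first.Count + second.Count == 0) return 0.0;

        int same = 0;
        foreach (string shingle in first)
        {
            if (second.Contains(shingle)) same++;
        }
        return same * 2 / (double)(first.Count + second.Count) * 100;
    }
}
```

For example, ShingleSketch.Similarity("Hello, World!", "hello world", 4) gives 100, because both texts canonize to the same string, while two texts with no common 4-character window give 0.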

Here is my code in C#:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Security.Cryptography;

namespace dataAnalyze.Algorithms
{
    public class Shingles
    {
        
        private char[] _stopSymbols = null;
        public Shingles(string stopSymbolFilePath)
        {
            if (!File.Exists(stopSymbolFilePath))
                throw new Exception("File with stop symbols not exists: " + stopSymbolFilePath);
            this._stopSymbols = File.ReadAllText(stopSymbolFilePath).ToCharArray();

        }


        public double CompareStrings(string s1, string s2)
        {
            if (s1.Length <= 50) return 0.0;
            if (s2.Length <= 50) return 0.0;
            // texts whose lengths differ by a factor of 1.7 or more are not compared
            if (s1.Length > s2.Length)
            {
                if (s1.Length / (double)s2.Length >= 1.7)
                    return 0.0;
            }
            else
            {
                if (s2.Length / (double)s1.Length >= 1.7)
                    return 0.0;
            }


            RemoveStopSymbols(ref s1);
            RemoveStopSymbols(ref s2);

            if (s1.Length <= 50) return 0.0;
            if (s2.Length <= 50) return 0.0;

            string[] shingles1 = getShingles(ref s1, 50);
            string[] shingles2 = getShingles(ref s2, 50);
            int same = 0;
            for (int i = 0; i < shingles1.Length; i++)
            {
                if (shingles2.Contains(shingles1[i]))
                    same++;
            }

            return same * 2 / ((double)(shingles1.Length + shingles2.Length)) * 100;
        }





        public double CompareStringsCashed(ref string[] shingles1, ref string[] shingles2, string s1, string s2)
        {
            if (s1 != null && s2 != null)
            {
                if (s1.Length > s2.Length)
                {
                    if (Math.Abs((s1.Length / (double)s2.Length)) >= 1.7)
                        return 0.0;
                }
                else
                {
                    if (Math.Abs((s2.Length / (double)s1.Length)) >= 1.7)
                        return 0.0;
                }
            }

            if (s1 != null)
            {
                if (s1.Length <= 50) return 0.0;
                string inS1 = s1;
                RemoveStopSymbols(ref inS1);
                if (inS1.Length <= 50) return 0.0;
                shingles1 = getShingles(ref inS1, 50);
            }

            if (s2 != null)
            {
                if (s2.Length <= 50) return 0.0;
                string inS2 = s2;
                RemoveStopSymbols(ref inS2);
                if (inS2.Length <= 50) return 0.0;
                shingles2 = getShingles(ref inS2, 50);
            }                        
           
            int same = 0;
            for (int i = 0; i < shingles1.Length; i++)
            {
                if (shingles2.Contains(shingles1[i]))
                    same++;
            }

            return same * 2 / ((double)(shingles1.Length + shingles2.Length)) * 100;
          
        }


        /// <summary>
        /// get shingles and calculate hash for everyone
        /// </summary>
        /// <param name="source"></param>
        /// <param name="shingleLenght"></param>
        /// <returns></returns>
        private string[] getShingles(ref string source, int shingleLenght)
        {
            // one shingle per start position; every window fits because of how the array is sized
            string[] shingles = new string[source.Length - (shingleLenght - 1)];
            for (int i = 0; i < shingles.Length; i++)
            {
                shingles[i] = CalculateMD5Hash(source.Substring(i, shingleLenght));
            }

            return shingles;

        }

        /// <summary>
        /// delete some inappropriate chars from the string
        /// </summary>
        /// <param name="source"></param>
        private void RemoveStopSymbols(ref string source)
        {
            // a set gives fast lookups and is not confused by duplicated stop symbols
            HashSet<char> stopSet = new HashSet<char>(this._stopSymbols);
            StringBuilder result = new StringBuilder(source.Length);
            for (int i = 0; i < source.Length; i++)
            {
                // keep every character that is not a stop symbol
                if (!stopSet.Contains(source[i]))
                    result.Append(source[i]);
            }

            source = result.ToString();
        }





        public string CalculateMD5Hash(string input)
        {
            // step 1, calculate MD5 hash from input
            // UTF8 keeps non-ASCII characters distinct (ASCII would turn them all into '?')
            byte[] hash;
            using (MD5 md5 = MD5.Create())
            {
                hash = md5.ComputeHash(Encoding.UTF8.GetBytes(input));
            }

            // step 2, convert byte array to hex string
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < hash.Length; i++)
            {
                sb.Append(hash[i].ToString("X2"));
            }
            return sb.ToString();
        }

    }
}



Unit tests


Do you use unit tests on a regular basis?
You may ask me: what is this? What exactly is a unit test?
I believe a lot of us are too lazy to write unit tests, and that's true. I would be willing to bet that only a few of us really do.
You see, the main problem here is that when most developers are asked for a definition of unit testing, they will say something like "tests that cover every unit in your application". Yes, that's partially true. Let me give you some information about unit tests.
A unit test is a test which verifies a "unit", the smallest piece of an application that can be tested. Unit testing is all about testing small pieces in isolation.

Why should we use unit tests?

Unit tests find problems early in the development cycle.
The earlier problems are found, the cheaper it is to fix them.

The development process becomes more flexible
Sometimes it may be required to fix a problem and deploy the fix very quickly. Despite our efforts, a bug may hide in the fix and an important feature may stop working. Releasing quick fixes makes us feel uneasy because we are not certain what side effects the changes might have. Running the unit tests with the fixes applied saves the day, as they should reveal undesirable side effects.

Unit tests are fast.
This is closely related to the point above. After any change you will know the current status of your source code: just run the unit tests that you wrote before.

Finally, I would like to show you how I test the QuickSort method that I wrote recently.
I'll use the NUnit framework for this purpose.


using System;
using NUnit.Framework;
using SortComparison; // QuickSort<T> is declared in this namespace

namespace TestLib
{
    [TestFixture]
    public class TestQuickSort
    {
        [Test, TestCaseSource("Test_Success_Cases")]
        public void SortTest(int[] A)
        {
            bool inputIsNull = (A == null);

            // Sort is a static method, so no QuickSort instance is needed
            QuickSort<int>.Sort(A);

            if (inputIsNull)
            {
                Assert.IsNull(A);
                return;
            }

            // every element must be greater than or equal to the one before it
            for (int i = 1; i < A.Length; i++)
                if (A[i - 1] > A[i])
                    Assert.Fail("Array not sorted properly");
        }

        // generates random values in [0, |maxVal|), negated when maxVal is negative
        static int[] GenArray(int size, int maxVal)
        {
            int[] result = new int[size];
            Random rnd = new Random();
            for (int i = 0; i < size; i++)
            {
                result[i] = rnd.Next(0, Math.Abs(maxVal));
                if (maxVal < 0)
                    result[i] *= -1;
            }
            return result;
        }

        // each case is wrapped in object[] so that the int[] is passed as the single argument
        static object[] Test_Success_Cases =
        {
            new object[] { null },
            new object[] { new int[0] },
            new object[] { GenArray(10, -11) },
            new object[] { GenArray(10, 0) },
            new object[] { GenArray(1, 1) },
            new object[] { GenArray(10, 11) },
            new object[] { GenArray(100, 101) },
            new object[] { GenArray(1000, 1001) },
            new object[] { GenArray(1000000, 10000001) }
        };
    }
}

Yes, I know this is not a very deep description of unit tests, but I'll add more later.

Monday, March 12, 2012

Image Processing

A few weeks ago I was playing with image processing. Actually, I have a dream: I want to write a super smart application for image processing. But today I'll talk about very simple image modifications.
As I said, a few weeks ago I got tons of images that I had to update with some sort of template.
So I wrote a small library and a console application for this purpose.
Result:

Original image:

Here is the Image Processing library source code:
using System.Drawing;
using System.Drawing.Drawing2D;
using System.Drawing.Imaging;

 public class EditImage
    {
        public Image cropImage(Image img, Rectangle cropArea)
        {
            Bitmap bmpImage = new Bitmap(img);
            Bitmap bmpCrop = bmpImage.Clone(cropArea,
                                            bmpImage.PixelFormat);
            return (Image)(bmpCrop);
        }

        public Image resizeImage(Image imgToResize, Size size)
        {
            int sourceWidth = imgToResize.Width;
            int sourceHeight = imgToResize.Height;

            float nPercent = 0;
            float nPercentW = 0;
            float nPercentH = 0;

            nPercentW = ((float)size.Width / (float)sourceWidth);
            nPercentH = ((float)size.Height / (float)sourceHeight);

            if (nPercentH < nPercentW)
                nPercent = nPercentH;
            else
                nPercent = nPercentW;

            int destWidth = (int)(sourceWidth * nPercent);
            int destHeight = (int)(sourceHeight * nPercent);

            Bitmap b = new Bitmap(destWidth, destHeight);
            // using guarantees that the Graphics object is released even if drawing throws
            using (Graphics g = Graphics.FromImage(b))
            {
                g.InterpolationMode = InterpolationMode.HighQualityBicubic;
                g.DrawImage(imgToResize, 0, 0, destWidth, destHeight);
            }

            return b;
        }

        public void saveJpeg(string path, Bitmap img, long quality)
        {
            // Encoder parameter for image quality
            EncoderParameter qualityParam =
                new EncoderParameter(System.Drawing.Imaging.Encoder.Quality, quality);

            // Jpeg image codec
            ImageCodecInfo jpegCodec = getEncoderInfo("image/jpeg");

            if (jpegCodec == null)
                return;

            EncoderParameters encoderParams = new EncoderParameters(1);
            encoderParams.Param[0] = qualityParam;

            img.Save(path, jpegCodec, encoderParams);
        }

        public ImageCodecInfo getEncoderInfo(string mimeType)
        {
            // Get image codecs for all image formats
            ImageCodecInfo[] codecs = ImageCodecInfo.GetImageEncoders();

            // Find the correct image codec
            for (int i = 0; i < codecs.Length; i++)
                if (codecs[i].MimeType == mimeType)
                    return codecs[i];
            return null;
        }

        public Image RoundCorners(Image StartImage, int CornerRadius, Color BackgroundColor)
        {
            // AddArc takes the width and height of the ellipse, i.e. the diameter
            CornerRadius *= 2;
            Bitmap RoundedImage = new Bitmap(StartImage.Width, StartImage.Height);
            // the drawing helpers are disposed as soon as the rounded image is painted
            using (Graphics g = Graphics.FromImage(RoundedImage))
            using (Brush brush = new TextureBrush(StartImage))
            using (GraphicsPath gp = new GraphicsPath())
            {
                g.Clear(BackgroundColor);
                g.SmoothingMode = SmoothingMode.AntiAlias;
                gp.AddArc(0, 0, CornerRadius, CornerRadius, 180, 90);
                gp.AddArc(0 + RoundedImage.Width - CornerRadius, 0, CornerRadius, CornerRadius, 270, 90);
                gp.AddArc(0 + RoundedImage.Width - CornerRadius, 0 + RoundedImage.Height - CornerRadius, CornerRadius, CornerRadius, 0, 90);
                gp.AddArc(0, 0 + RoundedImage.Height - CornerRadius, CornerRadius, CornerRadius, 90, 90);
                g.FillPath(brush, gp);
            }
            return RoundedImage;
        }

    }

Here is the console application that invokes the image processing:
static void Main(string[] args)
        {
            string[] filePaths = Directory.GetFiles(@"C:\projects\ProjectX\ImageProcessing\testProcessing\bin\Debug\input\", "*.jpg");
            Image workedImage = null;
            Image template = null;
            ImageProcessing.EditImage editImage = new ImageProcessing.EditImage();
            for (int i = 0; i < filePaths.Length; i++)
            {
                try
                {
                    workedImage = Image.FromFile(filePaths[i]);//("nissan-370z-rc-1.jpg");
                    template = Image.FromFile("template_glass_orb.png");

                }
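                catch (Exception ex)
                {
                    Console.WriteLine(ex.Message);
                    Console.ReadKey();
                    // skip this file: one of the images could not be loaded
                    if (workedImage != null) workedImage.Dispose();
                    continue;
                }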
                if (workedImage.Width < workedImage.Height || workedImage.Width < 300)
                {
                    workedImage.Dispose();
                    template.Dispose();
                    continue;
                }

                else if (workedImage.Width >= 300)
                {
                    workedImage = editImage.resizeImage(workedImage, new Size(300, 300));
                    workedImage = editImage.cropImage(workedImage, new Rectangle(0, 0, 200, workedImage.Height));
                    if (workedImage.Height > 200)
                    {
                        workedImage = editImage.cropImage(workedImage, new Rectangle(0, 0, workedImage.Width, 195));
                    }
                    workedImage = editImage.RoundCorners(workedImage, 100, Color.Transparent);

                    if (workedImage.Width < 200)
                    {
                        workedImage.Dispose();
                        template.Dispose();
                        Console.WriteLine("Oops, we are here");
                        Console.ReadKey();
                        continue;
                    }
                }



                using (var bitmap = new Bitmap(workedImage.Width, workedImage.Height))
                {
                    using (var canvas = Graphics.FromImage(bitmap))
                    {
                        canvas.InterpolationMode = InterpolationMode.HighQualityBicubic;
                        canvas.DrawImage(workedImage, new Rectangle(0, 0, workedImage.Width, workedImage.Height), new Rectangle(0, 0, workedImage.Width, workedImage.Height), GraphicsUnit.Pixel);

                        //modifiedImage = RoundCorners(disairedImage, 80, Color.Transparent);
                        canvas.DrawImage(template, 0, 0, template.Width, template.Height);
                        canvas.Save();
                    }
                    try
                    {
                        bitmap.Save(@"C:\projects\ProjectX\ImageProcessing\testProcessing\bin\Debug\input\new\" + Path.GetFileName(filePaths[i]), ImageFormat.Png);
                    }
                    catch (Exception ex)
                    {
                        Console.WriteLine(ex.Message);

                    }
                }
                workedImage.Dispose();
                template.Dispose();
            }
            Console.WriteLine("done");
            Console.ReadKey();
}

As you can see, it's very easy to use.

Thursday, January 12, 2012

How to create a datatable programmatically

Hi, today I want to discuss how to create a DataTable programmatically.
Actually, it's very easy.
We need to complete the next few steps:

  1. Declare DataTable
  2. Declare Columns
  3. Add columns to DataTable
  4. Fill the rows of the DataTable with data, if needed.
Yes, that's all.
OK, let me show an example of how I did it:


using System;
using System.Data;

namespace HowToDataTableProgr
{
    class Program
    {
        class CompanyInfo
        {
            public CompanyInfo(string CompanyName, string CompanyAddress, double budget)
            {
                this.CompanyAddress = CompanyAddress;
                this.CompanyName = CompanyName;
                this.budget = budget;
            }
            public string CompanyName { get; set; }
            public string CompanyAddress { get; set; }

            private double budget;


            public string GetBudget()
            {
                return this.budget.ToString("C0");
            }



        }

        static void Main(string[] args)
        {

            var dataTable = new DataTable();

            var columnId = new DataColumn("Id", typeof(int));
            columnId.AutoIncrement = true;
            var columnCity = new DataColumn("City", typeof(string));
            var columnCompany = new DataColumn("compInfo", typeof(CompanyInfo));
            dataTable.Columns.AddRange(new DataColumn[] { columnId, columnCity, columnCompany });

            //add values into  rows
            dataTable.Rows.Add(null, "Washington", new CompanyInfo("Bradly co", "someStreet 12", 10000000.0));
            dataTable.Rows.Add(null, "New York", new CompanyInfo("Pooply ltd", "lane stree 45 12", 4000000.0));
            dataTable.Rows.Add(null, "Cupertino", new CompanyInfo("NewGoggle", "W Fremon AVe 2", 70000000.0));
            dataTable.Rows.Add(null, "Sidney", null);

            //Show names of Columns
            Console.WriteLine("DataTable has {0} DataColumns named:",
                dataTable.Columns.Count);
            foreach (DataColumn currentColumn in dataTable.Columns)
                Console.WriteLine("\t{0}", currentColumn.ColumnName);
            
            //show  count of rows
            Console.WriteLine("DataTable has {0} rows: ",
                dataTable.Rows.Count);
            Console.WriteLine();

            //show all data from all rows
            foreach (DataRow dataRow in dataTable.Rows)
            {
                foreach (DataColumn dataColumn in dataTable.Columns)
                {
                    if (dataRow[dataColumn] is CompanyInfo)
                    {
                        var currentCompany = dataRow[dataColumn] as CompanyInfo;
                        if (currentCompany != null)
                            Console.WriteLine(string.Format("CompanyAddress:{0}, CompanyName={1}, budget={2}", currentCompany.CompanyAddress, currentCompany.CompanyName, currentCompany.GetBudget()));                        
                    }
                    else
                        Console.WriteLine(dataRow[dataColumn].ToString());
                }
                Console.WriteLine();

            }
            Console.ReadKey();
        }
    }

}

You may have a few questions regarding the source code.
Why so complex, you may ask.
You are right: if all you need is to create a DataTable, then it should look like this:

var dataTable = new DataTable(); // declare the DataTable

var columnId = new DataColumn("Id", typeof(int)); // declare the Id column
columnId.AutoIncrement = true; // let the table generate Id values automatically
var columnCity = new DataColumn("City", typeof(string)); // declare the City column
var columnCompany = new DataColumn("compInfo", typeof(CompanyInfo)); // declare the column that holds CompanyInfo objects

dataTable.Columns.AddRange(new DataColumn[] { columnId, columnCity, columnCompany }); // add all columns to the existing DataTable at once
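To see the auto-increment column at work, here is a minimal sketch that you can run on its own. It leaves out the CompanyInfo column so that it needs no extra classes; the names MinimalDataTableSketch and Build are made up for this illustration. Passing null for the Id column asks the table to generate the value, and with the default seed and step the Ids should come out as 0 and 1.

```csharp
using System;
using System.Data;

public static class MinimalDataTableSketch
{
    // Builds a two-column table (Id, City) and adds two rows to it.
    public static DataTable Build()
    {
        var dataTable = new DataTable();

        var columnId = new DataColumn("Id", typeof(int)) { AutoIncrement = true };
        var columnCity = new DataColumn("City", typeof(string));
        dataTable.Columns.AddRange(new DataColumn[] { columnId, columnCity });

        // null in the auto-increment position means "generate the Id for me"
        dataTable.Rows.Add(null, "Cupertino");
        dataTable.Rows.Add(null, "Sidney");
        return dataTable;
    }

    public static void Main()
    {
        foreach (DataRow row in Build().Rows)
            Console.WriteLine("{0}: {1}", row["Id"], row["City"]);
    }
}
```

If you need Ids that start from 1, the AutoIncrementSeed property of DataColumn is the place to look.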



Wednesday, January 11, 2012

How to convert List<string> into solid string

A few days ago I got stuck on a really trivial thing. I had to convert a List<string> into a single delimited string in my high-performance application, so I decided to write a small test for it.

What I'll test:

  • The Aggregate LINQ extension method.
  • The StringBuilder class with a for loop.
  • string.Concat - yeah, it's not my best code, especially if I use the Concat method in the wrong way =)
  • The String.Join static method.


Let's do that:

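using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;

class Program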
    {
        static void Main(string[] args)
        {
            //generate some data for our test
            var tstList = Enumerable.Range(1, 10000).Select(i => i.ToString()).ToList();
            string delimeter = ",";

            #region List<string> to solid string using Linq extension method
            Stopwatch sw = Stopwatch.StartNew();//start measuring time
            string result1 = tstList.Aggregate((i, j) => i + delimeter + j);//run string concatenation using Linq extension
            sw.Stop(); //stop measuring

            Console.WriteLine("linq aggregate func took\t {0}", sw.Elapsed.ToString()); //show result on screen
            #endregion

            #region  List<string> to solid string using StringBuilder class            
            StringBuilder sb = new StringBuilder();
            bool isEmpty = true;
            sw = Stopwatch.StartNew();//start measuring time
            for (int i = 0; i < tstList.Count; i++)
            {//yeah some mess here, I know =)
                if (!isEmpty)
                    sb.Append(delimeter);
                sb.Append(tstList[i]);
                isEmpty = false;
            }
            string result2 = sb.ToString();//build the final string so the timing covers the whole conversion
            sw.Stop(); //stop measuring
            Console.WriteLine("for loop and StringBuilder took\t {0}", sw.Elapsed.ToString());
            #endregion

            
            
            #region  List<string> to solid string using string.Concat
            string result3 = string.Empty;
            isEmpty = true;
            sw = Stopwatch.StartNew();//start measuring time
            for (int i = 0; i < tstList.Count; i++)
            {
                if (!isEmpty)
                    result3 = string.Concat(result3, delimeter);

                result3 = string.Concat(result3, tstList[i]);
                isEmpty = false;
            }
            sw.Stop(); //stop measuring
            Console.WriteLine("string.Concat took\t\t {0}", sw.Elapsed.ToString());
            #endregion


            #region  List<string> to solid string using String.Join
            string result4 = string.Empty;
            sw = Stopwatch.StartNew();//start measure time
            result4 = String.Join(delimeter, tstList.ToArray());
            sw.Stop();
            Console.WriteLine("string.Join took\t\t {0}", sw.Elapsed.ToString());
            #endregion
            //Console.WriteLine(result3); //for test 
            Console.ReadKey();

        }
    }

As a result, I got something like this:

I'm really disappointed with the LINQ Aggregate extension method. It takes forever to finish such a simple thing.
What about the other methods?
StringBuilder. Every book tells you that StringBuilder is best for frequently changing strings, because the string class is immutable. Mmmm... I don't know. I think that if you have any chance to use string.Concat, do it.
As you can see, string.Concat is super fast.
Yeah, I will definitely use string.Concat in my application instead of LINQ methods.
And what do you think?
Is LINQ good enough to use in a production environment?
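
A side note on the immutability point from the books: the small sketch below (the class and method names are mine, for illustration only) shows what it means in practice. Concatenation never changes an existing string; it builds a new one. StringBuilder keeps appending to the same buffer. It says nothing about which one is faster in your application; that is what measurements are for.

```csharp
using System;
using System.Text;

public static class ImmutabilitySketch
{
    // Returns true when concatenation left the original string object untouched.
    public static bool ConcatLeavesOriginalIntact()
    {
        string original = "1,2";
        string alias = original;                   // both variables point at the same string object
        original = string.Concat(original, ",3");  // builds a NEW string; "1,2" itself is unchanged
        return alias == "1,2"
            && original == "1,2,3"
            && !object.ReferenceEquals(alias, original);
    }

    // Appends through one reference and reads the result through another.
    public static string AppendInPlace()
    {
        var sb = new StringBuilder("1,2");
        StringBuilder sameBuilder = sb;            // second reference to the same builder
        sb.Append(",3");                           // changes the shared buffer in place
        return sameBuilder.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(ConcatLeavesOriginalIntact()); // True
        Console.WriteLine(AppendInPlace());              // 1,2,3
    }
}
```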

Tuesday, January 10, 2012

Comparing Linq performance.

I heard some rumours on the internet that we have to avoid LINQ because it's slow. What can I do here? Right, I'm going to write my own test.
I decided to test the following set of methods:

  • A simple foreach loop
  • A simple for loop
  • Using the ICollection.Contains method
  • The Any extension method using HashSet
  • The Any extension method ("LINQ")

Here is the source code of my app:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class Program
    {
        /// <summary>
        /// this method for testing performance only
        /// It runs Action method in the loop and uses Stopwatch to track time.
        /// </summary>
        /// <param name="funcList">list of delegates and  some  additional help strings</param>
        static void ProfileReport(Dictionary<string, Action> funcList)
        {
            foreach (var func in funcList)
            {
                Stopwatch sw = Stopwatch.StartNew();//start timer
                var f = func.Value;//get instance of current delegate
                f();//execute it
                sw.Stop();//stop time tracking
                Console.WriteLine(func.Key + '\t' + sw.Elapsed.ToString());//here is result
                GC.Collect();// remove unnecessary data before the next iteration
            }
            Console.WriteLine();
        }

        static void Main(string[] args)
        {
            var names = Enumerable.Range(1, 1000000).Select(i => i.ToString()).ToList();//generate some test values as sequence of numbers
            var namesHash = new HashSet<string>(names);
            string testName = "99999";
            for (int i = 0; i < 10; i++)
            {
                ProfileReport(new Dictionary<string, Action>() 
            {
                { "Foreach Loop\t", () => Search(names, testName, ContainsForeachLoop) },    
                { "For Loop\t", () => ContainsForLoop(names,testName) },    
                { "Enumerable.Any\t", () => Search(names, testName, ContainsAny) },                
                { "HashSet\t\t", () => Search(namesHash, testName, ContainsCollection) },
                { "ICollection.Contains", () => Search(names, testName, ContainsCollection) }
            });


            }
            Console.ReadLine();
        }
        static bool ContainsAny(ICollection<string> names, string name)
        {
            return names.Any(s => s == name);
        }

        static bool ContainsCollection(ICollection<string> names, string name)
        {
            return names.Contains(name);
        }

        static bool ContainsForeachLoop(ICollection<string> names, string name)
        {
            foreach (var currentName in names)
            {
                if (currentName == name)
                    return true;
            }
            return false;
        }

        static bool ContainsForLoop(List<string> names, string name)
        {
            for (int i=0; i<names.Count(); i++)
            {
                if (names[i] == name)
                    return true;
            }
            return false;
        }


        static bool Search(ICollection<string> names, string name,
        Func<ICollection<string>, string, bool> containsFunc)
        {
            return containsFunc(names, name);
        }
    }

As a result, I got:

I'm a little bit surprised here. I thought that the for loop would be faster than foreach, because in a for loop we don't need to copy the current value from the list on every iteration. Yes, it looks like a miracle. I can't explain it right now; I have to check it on MSDN. But anyway, let's get back to LINQ, which is the main point of all these measurements.
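
One possible contributor, offered as a guess and not as a measured result: the for loop in ContainsForLoop calls names.Count(), which is the LINQ extension method, on every iteration. List<string> also has a plain Count property that does not go through LINQ. A variant of the loop that uses the property could look like the sketch below (ForLoopSketch and ContainsForLoopWithProperty are names I made up for it); whether it actually closes the gap has to be checked with the same Stopwatch test.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ForLoopSketch
{
    // Same search as ContainsForLoop, but reads the List<T>.Count property once
    // instead of calling the Enumerable.Count() extension method on each pass.
    public static bool ContainsForLoopWithProperty(List<string> names, string name)
    {
        int count = names.Count;
        for (int i = 0; i < count; i++)
        {
            if (names[i] == name)
                return true;
        }
        return false;
    }

    public static void Main()
    {
        var names = Enumerable.Range(1, 1000000).Select(i => i.ToString()).ToList();
        Console.WriteLine(ContainsForLoopWithProperty(names, "99999"));   // True
        Console.WriteLine(ContainsForLoopWithProperty(names, "missing")); // False
    }
}
```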

So, as you can see, LINQ really is slower than the foreach loop, HashSet and ICollection.Contains.
It should be avoided if it is not fast enough. Slow and not fast enough are not at all the same thing!
Slow is irrelevant to our customers.
I found some discussions related to this topic. For example, some people think:

"Performance optimization is expensive. Writing code so that it can be read and maintained by others is expensive. Those goals are frequently in opposition to each other, so in order to spend your customer's money responsibly you've got to ensure that you're only spending valuable time and effort doing performance optimizations on things that are not fast enough."


But I do not agree. I believe we have to write code that is both readable and fast.
So if I have the possibility to avoid LINQ, I will. =)

Thursday, January 5, 2012

A few words about C# extension

A few days ago I was digging through the C# section of MSDN for something interesting and found several articles about C# extension methods. They are quite useful.
The basic idea is that the set of methods available on an instance of a particular type is open to extension. As a result, we can add new methods to existing types. I think most of us have already run into requirements to extend third-party classes, sealed classes and so on. In such cases, the most common approach is to create a "helper" class or methods and use them everywhere. Sure, that works.
C# has a better solution.
Imagine we want to get all Mondays starting from a selected date.
Piece of cake, check the result:





using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using DateTimeExtensions;



namespace DateTimeExtensions
{
    public static class DateHelper
    {
        public static IEnumerable<DateTime> GetMondays(this DateTime startDate)
        {
            DateTime endDate = new DateTime(startDate.Year, 12, 31);

            while (startDate.DayOfWeek != DayOfWeek.Monday)
                startDate = startDate.AddDays(1);

            while (startDate <= endDate) // <= so that a Monday falling on December 31 is included
            {
                yield return startDate;
                startDate = startDate.AddDays(7);
            }
        }
    }
}
namespace UsingExtensions
{
    class Program
    {
        static void Main(string[] args)
        {
            DateTime startDate = new DateTime(1983, 7, 23); //
            foreach (DateTime monday in startDate.GetMondays())
                Console.WriteLine(monday.ToString());
               
            Console.ReadKey();
        }      
    }

}





Finally, here are a couple of rules to consider when deciding whether or not to use extension methods:

  • Extension methods cannot be used to override existing methods.
  • An extension method with the same name and signature as an instance method will not be called.
  • The concept of extension methods cannot be applied to fields, properties or events.
  • Do not overuse extension methods; it can be a bad thing!
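
The second rule is easy to check with a small sketch. Greeter, GreeterExtensions and ExtensionRulesDemo are hypothetical names used only for this illustration. The extension method with the same name and signature compiles fine, but the normal call syntax always picks the instance method; the extension version is reachable only through the static class.

```csharp
using System;

public class Greeter
{
    public string Greet() { return "instance"; }
}

public static class GreeterExtensions
{
    // Same name and signature as Greeter.Greet(): never chosen by greeter.Greet().
    public static string Greet(this Greeter greeter) { return "extension"; }

    // A new name, so it can be called as if it were an instance method.
    public static string GreetLoudly(this Greeter greeter)
    {
        return greeter.Greet().ToUpperInvariant() + "!";
    }
}

public static class ExtensionRulesDemo
{
    public static void Main()
    {
        var greeter = new Greeter();
        Console.WriteLine(greeter.Greet());                   // instance
        Console.WriteLine(GreeterExtensions.Greet(greeter));  // extension
        Console.WriteLine(greeter.GreetLoudly());             // INSTANCE!
    }
}
```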

Wednesday, January 4, 2012

I'm back

Yes. I decided to come back. I would say "Hello world", or better, "Hello world again".
So, what am I going to write about?
I'll continue highlighting new features of .NET and the most memorable and exciting things that I like in C# and Oracle.
So please feel free to check my blog from time to time and comment on my articles; it will be a pleasure to discuss them with you.

Thanks,
Max