Sunday, October 20, 2013

Set Collection data structure implementation using C#

Hi,
today I'm going to show you how to implement own set collection type with some commonly associate algorithms
We will start by first learning what is set is? Next, I'll show you have it can be translated in class. After that I'll show four algorithms commonly associated with a sets
  • Union
  • Intersections
  • SetDifference
  • SymmetricDifference
At the end of the topic, I'll show you how to use it sample application.

Let's start from the defining what is Set is? -set is collection of objects. Any type of the object can be stored in the set, as long as object comparable
Let's looks some examples of Sets:
  • Neutrals={0}
  • Positives={1, 2, 3, 4.....}
  • Negative ={....,-4, -3, -2, -1}
  • Evens ={..., -4, -2, 0, 2, 4, …}
  • Odds={..., -3, -1, 1, 3, …}

Before we dive into individual methods, let's carefully look a class.



First- you can see this class is Generic class, and it take an argument “T” and we are make sure that T implements IComparable interface. This insure us that we can compare instance of “T” using compare method. We can also see, that class contains Ienumerable Interface, this will allow us to use this class in context as enumerable as needed. For example in foreach loop. Let start by looking class construction.

Since we are storing objects in the set, we need to have a place to put them. This could be any basic data structures. This time I chose Microsoft .Net List class. Of course it will be possible to use array instead of list. But doing so it will require adding logic to the class the is not relevant to Set class and will require additional time to showing.

private readonly List _items=new List();

So first we have empty constructor.

public Set (){ }

And we also have constructor that accepts IEnumerable, that allow us easy create a Sets from any IEnumerables such as from other sets.

  /// <summary>
  /// If the items already exists in the Set- exception will be thrown
  /// This is because our class doe not allow dublicate items
  /// </summary>
  /// <param name='item'>
  /// Item.
  /// </param>
  /// <exception cref='InvalidOperationException'>
  /// Is thrown when an operation cannot be performed.
  /// </exception>
  public void Add(T item)
  {
   if(this.Contains(item))
    throw new InvalidOperationException("Item already exists in Set");
   this._items.Add(item);
  }

We can also add several items using AddRange method


  public void AddRange(IEnumerable<T> items)
  {
   foreach(T item in items)
    this.Add(item);   
  }    


Just as important add adding items is removing them. My Set class will support single remove method. This will takes item to remove an return value that indicates if the item found and removed.


  public bool Remove(T item)
  {
   return this._items.Remove(item);
  }

I would to determine if the item already exists in the Set or to get the Cardinality of the set


  public int Count 
  {
   get
   {
    return this._items.Count;
   }
  }

  public bool Contains(T item)
  {
   return this._items.Contains(item);   
  }

The last non Set specific methods I going to show you  is Enumeration method


  #region IEnumerable[T] implementation
  public IEnumerator<T> GetEnumerator ()
  {
   return this._items.GetEnumerator();
  }
  #endregion

  #region IEnumerable implementation
  System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator ()
  {
   return this._items.GetEnumerator();
  }
  #endregion

The first Set algorithm I'm going to look at - is the Union algorithm. The Union algorithm compare two Sets and produce third Set all of the unique items in both Sets


  /// <summary>
  /// Union two Sets.
  /// <example>The union of {1,2,3} and {3,4,5} is {1,2,3,4,5}</example>
  /// </summary>
  /// <param name='other'>
  /// New Set, that contains unique itemsin both sets
  /// </param>
  public Set<T> Union(Set<T> other)
  {
   Set<T> result =new Set<T>(_items);
   result.AddRangeSkipDublicates(other._items);
   return resut;                             
  }

As you can see - a new method I'm using here- AddTRangeSkipDublicates. That's because, I don't want to throw exception if the item already present in the Set. This  method is for internal purpose only, that's mean it can be marked as private. The loginc of this method is quite simple:

  • We are loading all items from first Set
  • Do loop through second list 
  • on each item check if the item already exists in the Set.
  • If the item is not Exist- then  current item should be added to the result Set.

The second algorithm of Sets I'm going to  show you is -Intersection. Intersection algorithm  compare two Sets and produce third Set that contains all of the intersecting, or matching members of both Sets. 
Lets take  look an example



  /// <summary>
  /// Intersection of two Sets
  /// <example>The intersection of{1,2,3} and {2,3,4} is{2,3}</example>
  /// </summary>
  /// <param name='other'>
  /// Other.
  /// </param>
  public Set<T> Intersection(Set<T> other)
  {
   Set<T> result =new Set<T>();
   foreach(T item in this._items)
   {
    if(other._items.Contains(item))
     result.Add(item);
   }
   return result;
  }

The SetDifference algorithm is the third set algorithm I'm  going to  show you. Set difference works by taking two different sets and returning all of the items of first Set that not members of second Set.
Lets take a look an example


  /// <summary>
  /// Difference of two Sets
  /// <example>The Set Difference of {2,3,4} and {3,4,5} is {2}</example>
  /// </summary>
  /// <param name='other'>
  /// Other.
  /// </param>
  public Set<T> Difference(Set<T> other)
  {
   Set<T> result=new Set<T>(this._items);   
   foreach(T item in other._items)
    result.Remove(item);
   return result;
  }


The final Set algorithm I'm going to show you is Symmetric Difference algorithm.
The Symmetric Difference algorithm is similar to Difference algorithm now output set will include all of that exists in only one of two input sets. The Symmetric difference is the set difference of the intersection and union of the input sets. Lets take an example to make it more clear.


  /// <summary>
  /// Symmetrics difference.
  /// <example>The symmetric difference of {1,2,3} and {2,3,4} is {1,4}
  /// 1. Itersection={2,3}
  /// 2. Union={1,2,3,4}
  /// 3. SetDifference of {1,2,3,4} and {2,3} is {1,4}
  /// </example>
  /// </summary>
  /// <returns>
  /// The difference.
  /// </returns>
  /// <param name='other'>
  /// Other.
  /// </param>
  public Set<T> SymmetricDifference(Set<T> other)
  {
   Set<T> intersection=this.Intersection(other);
   Set<T> union=this.Union(other);
   return union.Difference(intersection);
  }

That's all.


No comments:

Post a Comment