jeudi, octobre 15, 2009

Left outer join in C#

Left outer join is well known is SQL. But how to do it in C# ? That's what I show here.

Imagine you have 2 lists of elements you want to compare.
For exemple, 2 lists of rates with a currency and a value.
First list:
AUD 0,2
BRL 4,3
EUR 1,2
GBP 1,2
USD 5

Second list:
BRL 4,2
CAD 0,2
EUR 1,3
GBP 1,2
HKD 0,2
USD 5,2

This could be for exemple 2 lists of rates of two following days , and you want to compare the variation.
It could be also rates of different types for the same day like LIBOR and EURIBOR.

You notice that list do not have the same number of element, and they do not have the same currencies so there can be some hole.
The table we want to obtain is the following one:

AUD | 0,2 | -
BRL | 4,3 | 4,2
CAD | - | 0,2
EUR | 1,2 | 1,3
GBP | 1,2 | 1,2
HKD | - | 0,2
USD | 5 | 5,2

Once we have this matrix, it is easy to calculate the difference.

Here how I implemented this in C#. The code uses generics, so it can works with any object. The only requirement is that these objects implement IComparable interface.
The application of this algorithm is wide. Compare prices of cars, size of files, etc...

This algorithm is fast, because we loop across each list once. Its complexity is O(n).
The method returns an array of Couple formed by associated elements of both list.

using System;
using System.Collections.Generic;

namespace DiffLib
{

public class Joiner<T> where T:IComparable<T>
{
public Couple<T>[] OuterJoin(IEnumerable<T> leftList, IEnumerable<T> rightList)
{
List<Couple<T>> joinedList = new List<Couple<T>>();
List<T> l1 = new List<T>(leftList);
List<T> l2 = new List<T>(rightList);
l1.Sort();
l2.Sort();
Queue<T> q1 = new Queue<T>(l1); // left queue
Queue<T> q2 = new Queue<T>(l2); // right queue
T p1 = default(T);
T p2 = default(T); ;

// to pop the first element of each empty queues
bool popLeft = (q1.Count > 0);
bool popRight = (q2.Count > 0);

while (q1.Count > 0 || q2.Count > 0) // while there is still an element
{
T leftElected = default(T);
T rightElected = default(T);
if (q1.Count == 0)
{
// left queue is empty, so just take the element on the right
leftElected = p1; // may be not null if popped from last loop
rightElected = q2.Dequeue();
}
else if (q2.Count == 0)
{
// right queue is empty, elect left element
leftElected = q1.Dequeue();
rightElected = p2;
}
else
{
if (popLeft)
p1 = q1.Dequeue();
if (popRight)
p2 = q2.Dequeue();

// need to compare elements
int comp = comp = p1.CompareTo(p2);
if (comp == 0) // equals => join the 2 elements
{
leftElected = p1;
rightElected = p2;
p1 = p2 = default(T); // clear variables for next loop
}
else if (comp < 0) // p1<p2 => take p1 only & skip p2
{
leftElected = p1;
p1 = default(T); // clear
rightElected = default(T); // empty box
}
else if (comp > 0) // p1>p2 => take p2 only & skip p1
{
leftElected = default(T); // empty box
rightElected = p2;
p2 = default(T); // clear
}
// pop the element on left queue if the last element popped on left
// was lower than the last popped on right,
popLeft = (comp <= 0);

// symetric
popRight = (comp >= 0);
}
joinedList.Add(new Couple<T>(leftElected, rightElected));
}
return joinedList.ToArray();
}
}

public class Couple<T>
{
public T Left { get; set; }
public T Right { get; set; }
public Couple(T left, T right)
{
this.Left = left;
this.Right = right;
}
}
}


-----

Using it

The following code shows an implementation of use for the case described above. I create a class Rate that implement IComparable. This allows the the objects to be associated when they are considered equals. The comparison must operate on a given key on which the match will perform. Here the key is the currency.


private class Rate : IComparable<Rate>
{
public string Currency;
public float Value;

public int CompareTo(Rate other)
{
return this.Currency.CompareTo(other.Currency);
}
}

private class RateCompare
{
public string Currency;
public string Rate1;
public string Rate2;
public override string ToString()
{
return String.Format("{0}\t| {1}\t| {2}\t",
this.Currency, this.Rate1, this.Rate2);
}
}

[Test]
public void TestObjectOuterJoin()
{
List<Rate> l1 = new List<Rate>();
l1.Add(new Rate { Currency = "AUD", Value = 0.2f });
l1.Add(new Rate { Currency = "BRL", Value = 4.3f });
l1.Add(new Rate { Currency = "EUR", Value = 1.2f });
l1.Add(new Rate { Currency = "GBP", Value = 1.2f });
l1.Add(new Rate { Currency = "USD", Value = 5.0f });


List<Rate> l2 = new List<Rate>();
l2.Add(new Rate { Currency = "BRL", Value = 4.2f });
l2.Add(new Rate { Currency = "CAD", Value = 0.2f });
l2.Add(new Rate { Currency = "EUR", Value = 1.3f });
l2.Add(new Rate { Currency = "HKD", Value = 0.2f });
l2.Add(new Rate { Currency = "GBP", Value = 1.2f });
l2.Add(new Rate { Currency = "USD", Value = 5.2f });

Joiner<Rate> joiner = new Joiner<Rate>();
Couple<Rate>[] joinList = joiner.OuterJoin(l1, l2);
foreach (Couple<Rate> couple in joinList)
{
Console.WriteLine(GetDiff(couple));
}


}
RateCompare GetDiff(Couple<Rate> couple)
{
RateCompare diff = new RateCompare();
if (couple.Left != null)
{
diff.Currency = couple.Left.Currency;
diff.Rate1 = couple.Left.Value.ToString();
}
else
{
diff.Currency = couple.Right.Currency;
diff.Rate1 = "-";
}
if (couple.Right != null)
diff.Rate2 = couple.Right.Value.ToString();
else
diff.Rate2 = "-";
return diff;
}

Output

AUD | 0,2 | -
BRL | 4,3 | 4,2
CAD | - | 0,2
EUR | 1,2 | 1,3
GBP | 1,2 | 1,2
HKD | - | 0,2
USD | 5 | 5,2

If you are interested by this code, you can download the Visual Studio 2008 solution.

Commentaires: Enregistrer un commentaire

Abonnement Publier les commentaires [Atom]





<< Accueil

This page is powered by Blogger. Isn't yours?

Abonnement Articles [Atom]