lördag 10 augusti 2013

Leaderboard, some approaches

I'm looking into the leader board problem. In short, the problem can be expressed as the most efficient way to take the n best out of a large (often semi ordered) set with the size m (the cardinality of the total league).

The most naive version is to keep a list of the n≤m best players seen so far. If we assume no sorting among these n players, the complexity of the algorithm will be n*m, which is often unacceptable, since both m and n can be huge.

Cache the worst one the list
A slightly less naive way is to cache the least good player in the current toplist. This will, at best, give only slightly above m comparisions, given that the best players came first. Beware though, if the set is given in reversed order, the complexity once again soars to n*m.

Keep a binary tree heap
A heap is an often excellent way to select items out of a set. The most general selection, the heap-sort has a well known n log n complexity. The heap could be a candidate because it keeps the players on the top list sorted and could be a good candidate. This is not the case. The binary tree has the sad property that it can be very unbalanced and we are back again at n*m.

Keep a balanced binary tree heap!
Now we're talking! There are several trees that balance themselves in a well-known worst case time available. Actually, there are some specialized heap data structures availiable, a list from Wikipedia on Heap data structures gives the following
  • 2-3 heap
  • Beap 
  • Binary heap
  • Binomial heap
  • Brodal queue
  • d-ary heap
  • Fibonacci heap
  • Leftist heap
  • Pairing heap
  • Skew heap
  • Soft heap
  • Weak heap
  • Leaf heap
  • Radix heap
  • Randomized meldable heap
where the Fibonacci and Brodal heaps seems to have among the best over-all performance (edit: no, the different heaps do of course have different properties depending on the insert/delete proportions). 

I'll give a try to implement the Fibonacci heap. And yes, it's one availiable in clojure already, courtesy of Mary Rose Cook, maryrosecook/fibonacciheap, built with zippers, nicely described a great blogpost. Super elegant.

Inga kommentarer:

Skicka en kommentar