import Control.Monad (mapM_, liftM)
import Data.List (sort)
import qualified Data.ByteString.Lazy.Char8 as L
main = L.interact processInput
processInput :: L.ByteString -> L.ByteString
processInput = L.unlines . map printInt . radixSort . map (maybe 0 fst . L.readInt) . L.lines
where
printInt = L.pack . show
radixSort :: [Int] -> [Int]
radixSort xs = sortOnDigit 1 xs
where
sortOnDigit n ls =
let
buckets = createBucketsFor $ (listWithDigit n ls)
result = concat buckets
in
if length (buckets !! 0) == length xs
then result
else sortOnDigit (n + 1) result
listWithDigit :: Int -> [Int] -> [(Int, Int)]
listWithDigit n xs
| n <= 0 = error ("Index " ++ show n ++ " not valid")
| otherwise = zip xs $ map getDigit xs
where
getDigit x = (x `div` (10^(n-1))) `mod` 10
createBucketsFor :: [(Int, Int)] -> [[Int]]
createBucketsFor xs = zipWith (elementsEndingOn) [0..9] (repeat xs)
where
elementsEndingOn :: Int -> [(Int, Int)] -> [Int]
elementsEndingOn n xs = [val | (val, digit) <- xs, digit == n]
Refactorings
No refactoring yet !
sargon
November 10, 2009, November 10, 2009 09:37, permalink
why radixsort ?
Doesn't fit mergesort better ?
Thomas Salvador
November 15, 2009, November 15, 2009 00:06, permalink
hi sargon:
radixsort is faster than mergesort, as it uses knowledge of the structure of data to sort.
speed of radixsort is O(mn) for n keys of length m => O(n) here for that constant m.
mergesort would perform as always with O(n log n), hence significant slower.
regards,
thomas.
sargon
November 16, 2009, November 16, 2009 23:44, permalink
joa.
But this is only a landuau notation trick,
as long as m > log n the dropped constant is bigger.
And haskell is a functional language thus it is dump, sry,
to not implement a algorithm which scale on parallel architectures.
http://www.haskell.org/haskellwiki/GHC/Data_Parallel_Haskell
And for the performance of this algorithm, I wouldn't transform the
string into numbers and back. Strings and thus lists are very effective in haskell ;)
Jedai
November 21, 2009, November 21, 2009 18:29, permalink
Ok... radix sort by itself is doubtlessly faster than mergesort (at least in your case), but your implementation is very slow.
For instance your createBucketsFor() traverse your list 10 times each call whereas a good radixsort would only do it once, also your test of termination (length (buckets !! 0) == length xs) have to traverse the whole bucket every time (is in O(n), even if GHC is smart enough to optimize away the length xs) whereas you could replace it by (all null . map (buckets !!) $ [1..9]) which would be O(1). Also you try to do buckets by the decimal representation of your Int instead of using the underlying binary representation.
In other words, a small test show your implementation as at least 3 times slower than the simple mergesort implemented as sort in the GHC standard library (Data.List). Of course a good radix sort can do even better and it so happen that there already is a radix sort in the standard library, though it doesn't present itself like that you can use Data.IntSet ! It is far faster than sort() and incomparably better than your implementation.
import qualified Data.IntSet as IS radixSort = IS.toList . IS.fromList
Jedai
November 21, 2009, November 21, 2009 20:37, permalink
To have an idea of the performance, I used Criterion to compare sorting times of [999999, 999998..1] :
my radixSort took 0.5s, mergesort took 1s and your implementation took 6.5s (so 13 times slower) and it only get worse (or better depending on which sort you use) with more Ints.
SemenovEvgeni
July 22, 2010, July 22, 2010 11:26, permalink
it was very interesting to read.
I want to quote your post in my blog. It can?
And you et an account on Twitter?
I'm trying to implement a solution with a radixSort algorithm. The input will be T lines each containing a number N, where 0 <= T, number_of_digits(N) <= 10^6.
I made a naive implementation, I think this can be optimized a lot, but I can't find a way to avoid list replication (my haskell foo is not that evolved yet).
I managed to optimize the input processing time using ByteString (from 58 secs to 7 secs on a compiled source with a -O3 flag), but still the algorithm is slow, (It must execute at least in 5 secs).