Haskell/Performance examples
Goal: Explain optimizations step by step with examples that actually happened.
Tight loop
[edit | edit source]dons: Write Haskell as fast as C: exploiting strictness, laziness and recursion.
dons: Write Haskell as fast as C: exploiting strictness, laziness and recursion.
CSV Parsing
[edit | edit source]haskell-cafe: another Newbie performance question I hope he doesn't mind if I post his code here, I still have to ask him. -- apfeλmus 08:46, 18 May 2008 (UTC)
type CSV = [[String]] main = do args <- getArgs file <- readFile (head args) writeFile (head args ++ "2") (processFile (args !! 1) file) processFile s = writeCSV . doInteraction s . readCSV doInteraction line csv = insertLine (show line) (length csv - 1) csv writeCSV = (\x -> x ++ "\n") . concat . intersperse "\n" . (map (concat . intersperse "," . (map show))) insertLine line pos csv = (take pos csv) ++ [readCSVLine line] ++ drop pos csv readCSVLine = read . (\x -> "["++x++"]") readCSV = map readCSVLine . lines
I think there was another cvs parsing thread on the mailing list which I deemed appropriate, but I can't remember.
Space Leak
[edit | edit source]jkff asked about some code in #haskell which was analyzing a logfile. Basically, it was building a histogram
foldl' (\m (x,y) -> insertWith' x (\[y] ys -> y:ys) [y] m) M.empty [(ByteString.copy foo, ByteString.copy bar) | (foo,bar) <- map (match regex) lines]
The input was a 1GB logfile and the program blew the available memory mainly because the ByteString.copy
weren't forced and the whole file lingered around in memory.
Branch Prediction Failure
[edit | edit source]This code demonstrates how your CPU may fail or succeed in predicting code branches (aka if clauses). The code generates a large random Vector of Ints. From this we compute the sum of all numbers larger than 999. If the data is sorted the CPU can easily deduce the probable outcome of the next branch. If the data is not sorted this will be much more difficult and less successful. Unfortunately, sorting for one purpose only is seldom a good idea. The code uses the criterion package for time measurement. Compile this e.g. with -Wall -keep-llvm-files -O2 -optlo-O3 -fllvm -rtsopts -threaded -eventlog -feager-blackholing -fmax-simplifier-iterations=20 -fsimplifier-phases=3 -fno-liberate-case -fspecialise-aggressively. The idea is from github.com/Kobzol/hardware-effects. (tkx68hamburg)
{-# LANGUAGE FlexibleInstances, RankNTypes, BangPatterns, CPP, PartialTypeSignatures, UnicodeSyntax, OverloadedStrings #-} {-# OPTIONS_GHC -fno-warn-orphans #-} module Main (main) where import Control.DeepSeq import Control.Exception (evaluate) import Control.Monad.ST import Criterion.Main import qualified Data.Vector.Algorithms.Radix as Radix import qualified Data.Vector.Unboxed as UV import System.Random #ifdef DEFINE_NFDATA_VECTOR instance (NFData a) => NFData (V.Vector a) where rnf v = V.foldr (\a b -> rnf a `seq` b) () v #endif setupUnsorted :: IO [UV.Vector Int] setupUnsorted = do small <- randomSampleUVVector 1024 med <- randomSampleUVVector $ 1024*1024 large <- randomSampleUVVector $ 20*1024*1024 return [small, med, large] setupSorted :: IO [UV.Vector Int] setupSorted = fmap (fmap sortUVec) setupUnsorted main :: IO () main = do defaultMainWith defaultConfig [ env setupUnsorted (\ ~[small, med, large] -> bgroup "Branch without sort" [ bench "small" $ whnf core1 small, bench "medium" $ whnf core1 med, bench "large" $ whnf core1 large ] ), env setupUnsorted (\ ~[small, med, large] -> bgroup "Branch without sort using filter" [ bench "small" $ whnf core1a small, bench "medium" $ whnf core1a med, bench "large" $ whnf core1a large ] ), env setupUnsorted (\ ~[small, med, large] -> bgroup "Branch including sort" [ bench "small" $ whnf core2 small, bench "medium" $ whnf core2 med, bench "large" $ whnf core2 large ] ), env setupSorted (\ ~[small, med, large] -> bgroup "Branch after sort" [ bench "small" $ whnf core1 small, bench "medium" $ whnf core1 med, bench "large" $ whnf core1 large ] ) ] {- INLINE -} core1 :: UV.Vector Int -> Int core1 = UV.foldl' condSum 0 {- INLINE -} core2 :: UV.Vector Int -> Int core2 = UV.foldl' condSum 0 . sortUVec {- INLINE -} core1a :: UV.Vector Int -> Int core1a = UV.sum . (UV.filter (>999)) {- INLINE -} condSum :: Int -> Int -> Int condSum !x !y = if y>999 then x+y else x randomSampleUVVector :: Int -> IO (UV.Vector Int) randomSampleUVVector i = evaluate $ force $ UV.fromList (take i (randoms (mkStdGen 0) :: [Int])) sortUVec :: forall a . (Ord a, Radix.Radix a, UV.Unbox a) => UV.Vector a -> UV.Vector a sortUVec vec = runST (do mv <- UV.thaw vec Radix.sort mv UV.unsafeFreeze mv)