Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generic instance with omitNothingFields = True is very slow #1030

Open
brprice opened this issue Jun 7, 2023 · 1 comment
Open

Generic instance with omitNothingFields = True is very slow #1030

brprice opened this issue Jun 7, 2023 · 1 comment

Comments

@brprice
Copy link

brprice commented Jun 7, 2023

There seems to be exponential time complexity with genericToEncoding defaultOptions {omitNothingFields = True}, although this seems to depend on the ghc version used. Consider the following code (which has been somewhat minimised, but still depends on a fair bit of aeson)

{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE DerivingStrategies #-}

module Main (main) where

import Data.Aeson.Types.ToJSON (
  ToJSON(toEncoding,toJSON),
  genericToEncoding,
 )
import Data.Aeson.Types.Internal (
  defaultOptions,
  omitNothingFields,
 )
import Data.Aeson.Encoding.Internal (encodingToLazyByteString)
import GHC.Generics (Generic)
import System.Environment (getArgs)

import qualified Data.ByteString.Lazy as L

data Tree
  = Node () [Tree]
  deriving stock (Show, Generic)

instance ToJSON Tree where
  toEncoding = genericToEncoding defaultOptions {omitNothingFields = True}

tree' :: Int -> Tree
tree' 0 = Node () []
tree' n = Node () [tree' $ n-1]

main :: IO ()
main = do
  [n'] <- getArgs
  let n = read n'
  putStrLn "about to print tree"
  let tree = tree' n
  print $ tree
  putStrLn "about to ToJSON tree"
  print $ toJSON tree
  putStrLn "about to encode tree"
  print $ encode tree
  putStrLn "done"

---- From aeson

encode :: (ToJSON a) => a -> L.ByteString
encode = encodingToLazyByteString . toEncoding

I get results such as the following with ghc 9.2.7

Command (with ghc 9.2.7) Mean [ms] Min [ms] Max [ms] Relative
cabal run -O1 aeson-slow -- 1 53.5 ± 0.1 53.4 53.9 1.00 ± 0.00
cabal run -O1 aeson-slow -- 2 53.5 ± 0.0 53.4 53.6 1.00 ± 0.00
cabal run -O1 aeson-slow -- 3 53.5 ± 0.1 53.4 53.9 1.00 ± 0.00
cabal run -O1 aeson-slow -- 4 53.5 ± 0.0 53.4 53.6 1.00 ± 0.00
cabal run -O1 aeson-slow -- 5 53.5 ± 0.1 53.4 53.9 1.00 ± 0.00
cabal run -O1 aeson-slow -- 6 53.5 ± 0.0 53.3 53.6 1.00 ± 0.00
cabal run -O1 aeson-slow -- 7 53.5 ± 0.1 53.3 54.0 1.00 ± 0.00
cabal run -O1 aeson-slow -- 8 53.5 ± 0.1 53.3 54.0 1.00 ± 0.00
cabal run -O1 aeson-slow -- 9 53.4 ± 0.1 53.3 53.6 1.00
cabal run -O1 aeson-slow -- 10 54.0 ± 2.4 53.2 64.0 1.01 ± 0.05
cabal run -O1 aeson-slow -- 11 63.6 ± 0.2 63.4 64.3 1.19 ± 0.00
cabal run -O1 aeson-slow -- 12 63.6 ± 0.2 63.3 64.4 1.19 ± 0.00
cabal run -O1 aeson-slow -- 13 63.6 ± 0.2 63.4 64.4 1.19 ± 0.00
cabal run -O1 aeson-slow -- 14 71.5 ± 4.2 63.2 74.9 1.34 ± 0.08
cabal run -O1 aeson-slow -- 15 82.0 ± 3.7 73.2 84.0 1.53 ± 0.07
cabal run -O1 aeson-slow -- 16 103.5 ± 0.1 103.4 103.9 1.94 ± 0.00
cabal run -O1 aeson-slow -- 17 143.4 ± 0.0 143.4 143.5 2.68 ± 0.00
cabal run -O1 aeson-slow -- 18 226.5 ± 4.8 223.2 233.6 4.24 ± 0.09
cabal run -O1 aeson-slow -- 19 391.6 ± 15.6 383.4 434.0 7.33 ± 0.29
cabal run -O1 aeson-slow -- 20 722.9 ± 12.0 713.9 753.9 13.53 ± 0.22
cabal run -O1 aeson-slow -- 21 1382.9 ± 14.5 1373.5 1413.9 25.87 ± 0.27
cabal run -O1 aeson-slow -- 22 2720.9 ± 40.5 2674.2 2774.0 50.91 ± 0.76
cabal run -O1 aeson-slow -- 23 5379.8 ± 49.2 5304.0 5464.0 100.65 ± 0.93
cabal run -O1 aeson-slow -- 24 10575.9 ± 32.6 10523.8 10613.9 197.87 ± 0.68
cabal run -O1 aeson-slow -- 25 21091.9 ± 54.6 21033.8 21203.5 394.61 ± 1.18
cabal run -O1 aeson-slow -- 26 42325.9 ± 264.1 41984.1 42673.9 791.89 ± 5.08

Note that this seems rather dependent on the specific ghc version used -- I don't know whether this implies that there is a ghc regression/performance fix and this is really an upstream issue, or whether it means that aeson relies on some optimization firing and that this can be unreliable (and if it is this second option, I don't know if it is possible to make it more reliable). Running a sweep over different ghc versions I see the following

GHC version Mean [ms] Min [ms] Max [ms] Relative
GHC 8.10.7 1.0 ± 0.2 1.0 3.8 1.03 ± 0.19
GHC 9.0.2 1.0 ± 0.0 1.0 1.5 1.00
GHC 9.2.4 40564.2 ± 125.1 40362.9 40783.3 40655.26 ± 1458.36
GHC 9.2.5 39164.4 ± 86.0 38997.4 39267.6 39252.38 ± 1405.47
GHC 9.2.6 39244.7 ± 81.5 39111.9 39365.8 39332.87 ± 1408.07
GHC 9.2.7 38837.3 ± 141.0 38603.0 39147.1 38924.50 ± 1398.27
GHC 9.4.2 43628.6 ± 752.3 42621.7 44991.6 43726.59 ± 1735.12
GHC 9.4.3 1272.6 ± 12.0 1261.6 1291.6 1275.50 ± 47.14
GHC 9.4.4 12.2 ± 0.1 11.8 12.4 12.18 ± 0.45
GHC 9.4.5 12.2 ± 0.2 11.7 12.6 12.19 ± 0.46
@phadej
Copy link
Collaborator

phadej commented Jun 27, 2023

I can reproduce this. The recent changes in master (and upcoming 2.2.0.0) also didn't made this go away.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants