A general serializer for Data.Map #79

Closed
nponeccop opened this issue May 5, 2012 · 15 comments

Comments

@nponeccop

I don't know if the idea is good but look at this dirty proof of concept code:

class ToPropertyName a where
    toPropertyName :: a -> String

instance (ToPropertyName a, ToJSON b) => ToJSON (M.Map a b) where
    toJSON = toJSON . M.mapKeys toPropertyName  

In many languages (think of JS, PHP, Perl) map keys can only be strings. So people serialize other values to strings to get the lookup performance of native associative containers in their languages. To interoperate with those people a Haskeller needs to parse their compound map keys (property names in ECMAScript parlance) into a more type safe form.

Of course one can put compound values in any place, but property name is a special case because of performance, so it may deserve a special treatment in Aeson.

Also, in Haskell people use enumerations and newtypes isomorphic to strings in places where in poorer languages they have to use strings. So it would be good to be able to use an enumeration (I mean data Foo = Bar | Baz | Quux) for map keys, either by detecting enumerations during TH derivation or by allowing people to write simple instances like toPropertyName = show instead of having to implement a full instance for Data.Map Foo Something.

What do you think?
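
A minimal sketch of the enumeration case described above, assuming the ToPropertyName class from the proof of concept. To stay clear of aeson's own Map instances it uses a plain helper function instead of a ToJSON instance; the name toJSONMap is made up for illustration and is not part of aeson.

```haskell
import Data.Aeson (ToJSON, Value, encode, toJSON)
import qualified Data.ByteString.Lazy.Char8 as BL
import qualified Data.Map as M

-- Class from the proof of concept: render a key as a property name.
class ToPropertyName a where
    toPropertyName :: a -> String

-- An enumeration used as a map key.
data Foo = Bar | Baz | Quux
    deriving (Show, Eq, Ord)

-- The "simple instance" mentioned above.
instance ToPropertyName Foo where
    toPropertyName = show

-- Hypothetical helper: convert the keys to strings, then reuse the
-- existing ToJSON instance for maps with String keys.
toJSONMap :: (ToPropertyName a, ToJSON b) => M.Map a b -> Value
toJSONMap = toJSON . M.mapKeys toPropertyName

main :: IO ()
main = BL.putStrLn (encode (toJSONMap (M.fromList [(Bar, 1 :: Int), (Quux, 3)])))
```

Note that M.mapKeys silently drops entries if two distinct keys map to the same property name.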

@basvandijk
Member

I actually use the following module at work: Note the TODO ;-)

So I'm very much +1 on this.

Bryan, should I go ahead and make a pull request out of this?

{-# LANGUAGE TypeSynonymInstances #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE FlexibleInstances #-}

module Data.Aeson.Name where

import qualified Data.ByteString      as B
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text            as T
import qualified Data.Text.Encoding   as TE
import qualified Data.Text.Lazy       as TL

import Data.Aeson (Value(Object), ToJSON, toJSON, FromJSON, parseJSON)

import Data.Hashable (Hashable)

import qualified Data.HashMap.Strict as H

-- TODO: Propose these classes for aeson:

class ToName   a where toName   :: a      -> T.Text
class FromName a where fromName :: T.Text -> a

instance ToName   T.Text         where toName   = id
instance FromName T.Text         where fromName = id

instance ToName   TL.Text        where toName   = TL.toStrict
instance FromName TL.Text        where fromName = TL.fromStrict

instance ToName   String         where toName   = T.pack
instance FromName String         where fromName = T.unpack

instance ToName   B.ByteString   where toName   = TE.decodeUtf8
instance FromName B.ByteString   where fromName = TE.encodeUtf8

instance ToName   BL.ByteString  where toName   = TE.decodeUtf8 . B.concat . BL.toChunks
instance FromName BL.ByteString  where fromName = BL.fromChunks . (:[]) . TE.encodeUtf8

--------------------------------------------------------------------------------

instance (ToName k, ToJSON a) => ToJSON (H.HashMap k a) where
    toJSON = Object . mapKeyVal toName toJSON

instance (Eq k, Hashable k, FromName k, FromJSON a) =>
    FromJSON (H.HashMap k a) where
    parseJSON = fmap (mapKey fromName) . parseJSON

--------------------------------------------------------------------------------
-- Copied from aeson:

-- | Transform the keys and values of a 'H.HashMap'.
mapKeyVal :: (Eq k2, Hashable k2) => (k1 -> k2) -> (v1 -> v2)
          -> H.HashMap k1 v1 -> H.HashMap k2 v2
mapKeyVal fk kv = H.foldrWithKey (\k v -> H.insert (fk k) (kv v)) H.empty
{-# INLINE mapKeyVal #-}

-- | Transform the keys of a 'H.HashMap'.
mapKey :: (Eq k2, Hashable k2) => (k1 -> k2) -> H.HashMap k1 v -> H.HashMap k2 v
mapKey fk = mapKeyVal fk id
{-# INLINE mapKey #-}

@nponeccop
Author

Key mappings can be implemented more efficiently - for example, we can use map (first toPropertyName) . M.toList which is O(n) instead of M.toList . mapKeys toPropertyName which is O(n*log n).

And we may need a more general interface. For example,

class ToJSONObject a where
   toJSONObject :: (ToPropertyName b, ToJSON c) => a -> [(b, c)]

or we may consider changing (.=) to use the ToPropertyName type class.

@basvandijk
Member

> Key mappings can be implemented more efficiently - for example, we can use map f . M.toList which is O(n) instead of mapKeys which is O(n*log n).

But map f . M.toList produces a list which still needs to be converted to a HashMap which is O(n*log n).

@nponeccop
Author

It doesn't need to be converted - we can serialize the list of key-value tuples right away by passing the result of map to object:

toJSON = object . map (\(k, v) -> toPropertyName k .= v) . M.toList

@basvandijk
Member

But note that object converts the list to a HashMap which is O(n*log n):

object :: [Pair] -> Value
object = Object . H.fromList

> we may consider changing (.=) to use the ToPropertyName type class.

I fear that this can cause ambiguity when using the OverloadedStrings extension.

@nponeccop
Author

Bummer I didn't know

@nponeccop
Author

Is there a way to avoid constructing hashtables during serialization?

@basvandijk
Member

No, serialization and deserialization both go through the Value datatype where an Object is defined as a HashMap.

Of course you can imagine a serializer for objects that doesn't construct a HashMap. However, you still want to ensure that your keys are unique so only accepting a HashMap is a nice way of guaranteeing that. It's also nicely consistent with deserialization which I think is even more important.

@bos
Collaborator

bos commented Jun 15, 2012

I think that the original request is a reasonable thing to want, but I'm not sure I want to make aeson even bigger to support it. The API is already a bit unwieldy.

By the way, I have indeed thought about adding a direct encoding function to bypass the HashMap construction. I think it would make encoding quite a bit faster. I'm not concerned about duplicate keys. The current serializer will choose a winner at random if there's a duplicate key, which isn't a very sensible behaviour to try to defend :-)
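
A hedged sketch of what such direct encoding could look like. It relies on the Encoding API (pairs, Series) and the Data.Aeson.Key module from later aeson releases (2.0 or newer is assumed here), none of which existed when this comment was written. ToPropertyName comes from the start of the thread, and encodeMapDirect is a hypothetical name.

```haskell
import Data.Aeson (ToJSON, Value, decode, pairs, (.=))
import Data.Aeson.Encoding (encodingToLazyByteString)
import qualified Data.Aeson.Key as Key
import qualified Data.ByteString.Lazy.Char8 as BL
import qualified Data.Map as M

-- Class from the original proposal in this thread.
class ToPropertyName a where
    toPropertyName :: a -> String

data Foo = Bar | Baz | Quux
    deriving (Show, Eq, Ord)

instance ToPropertyName Foo where
    toPropertyName = show

-- Hypothetical helper: write each key/value pair straight into the
-- output in one pass over the map, without building an intermediate
-- Object. Assumes aeson >= 2.0 for Data.Aeson.Key.
encodeMapDirect :: (ToPropertyName k, ToJSON v) => M.Map k v -> BL.ByteString
encodeMapDirect m =
    encodingToLazyByteString
        (pairs (M.foldMapWithKey (\k v -> Key.fromString (toPropertyName k) .= v) m))

main :: IO ()
main = do
    let bytes = encodeMapDirect (M.fromList [(Bar, 1 :: Int), (Quux, 3)])
    BL.putStrLn bytes
    -- Round-trip through the ordinary parser as a sanity check.
    print (decode bytes :: Maybe Value)
```

Nothing here checks key uniqueness: if two keys render to the same property name, both pairs end up in the output.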

@bos
Collaborator

bos commented Oct 15, 2013

See this blog post for my current thinking around direct encoding.

@ibotty

ibotty commented Oct 15, 2013

regarding your open questions: i like the homogeneous arrays, as that's what i usually use, but the same idea for objects sounds unreasonable to me. i constantly have to deal with heterogeneous objects. i guess most are, so for me the type would get in the way.

@imz

imz commented Jun 4, 2015


If there are GADT-style type annotations for the underlying JSON types of Value (like Value Object), then the constraint on keys could be expressed as:

ToJSON String

its toJSON method would give a Value String.

In fact, I have used such an assumption about my keys in my workaround for the problem of converting generic maps/assoc-lists into JSON:

instance (ToJSON a) => ToJSON (D a)
    where toJSON kvs = object [ t .= v | (k,v) <- kvs, let (String t) = toJSON k ]

Now I'm thinking about making my ugly workaround at least a bit safer and more controlled: instead of using a non-total match wherever I need it, encode the assumption in a class like ToJSONString whose method would do the ugly deconstruction. At least the programmer (me) would then have to introduce the instance explicitly and consciously, near the place where the data type and its ToJSON instance are defined. (As I said in "GADT-style type annotations for the underlying JSON types of Value", I need this kind of guarantee not only for representing keys as JSON strings, but also for other cases of deconstructing JSON objects in order to modify them and, importantly, for guaranteeing conformance to a JSON format that requires a specific type, say Object, at a specific place.)
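
One possible shape for such a class, as a rough sketch only. ToJSONString and its default method follow the description above; UserId and stringKeys are hypothetical names added for illustration. The deconstruction is still partial, but it now lives in one place and only applies to types that explicitly opt in.

```haskell
import Data.Aeson (ToJSON, Value (String), toJSON)
import qualified Data.Text as T

-- Opt-in marker class: an instance is a promise that 'toJSON'
-- always produces a JSON string for this type.
class ToJSON a => ToJSONString a where
    toJSONString :: a -> T.Text
    toJSONString x = case toJSON x of
        String t -> t
        other    -> error ("ToJSONString: not a JSON string: " ++ show other)

instance ToJSONString T.Text

-- Hypothetical key type whose JSON form is a string.
newtype UserId = UserId T.Text

instance ToJSON UserId where
    toJSON (UserId t) = String t

-- Declared next to the ToJSON instance, as suggested above.
instance ToJSONString UserId

-- Hypothetical helper: turn keys into JSON property names.
stringKeys :: ToJSONString k => [(k, v)] -> [(T.Text, v)]
stringKeys kvs = [ (toJSONString k, v) | (k, v) <- kvs ]

main :: IO ()
main = print (stringKeys [(UserId (T.pack "alice"), 1 :: Int)])
```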

@bos
Collaborator

bos commented Jul 11, 2015

I had a long think about this, and wasn't at all sure how to make progress without accidentally killing type inference and backwards compatibility.

For example, object has the following type:

object :: [(Text, Value)] -> Value

If I replace the above with a signature like this...

object :: ToName a => [(a, Value)] -> Value

...then we will get either inference failures or bad defaults chosen under the extremely common use case of something like this:

instance ToJSON Wibble where
  toJSON a = object ["foo" .= fromWibble a]
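
The inference problem can be reproduced without aeson at all. In the sketch below, objectG is a hypothetical stand-in for the generalised object (it returns a plain list of pairs, not a Value) and ToName is the class proposed earlier in the thread. With OverloadedStrings enabled, a bare "foo" has type IsString a => a, so nothing fixes the key type unless the caller annotates it.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Text as T

-- Class proposed earlier in this thread.
class ToName a where
    toName :: a -> T.Text

instance ToName T.Text where
    toName = id

-- Hypothetical stand-in for a generalised 'object'.
objectG :: ToName a => [(a, Int)] -> [(T.Text, Int)]
objectG = map (\(k, v) -> (toName k, v))

-- Rejected by GHC: the key type is ambiguous, since both
-- IsString a and ToName a must be solved and no rule picks 'a'.
-- bad = objectG [("foo", 1)]

-- Accepted once the literal is annotated.
good :: [(T.Text, Int)]
good = objectG [("foo" :: T.Text, 1)]

main :: IO ()
main = print good
```

Every existing call site that passes string literals would need an annotation like this, which is the backwards-compatibility cost mentioned above.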

@bos
Collaborator

bos commented Jul 21, 2015

I'm happy to accept a pull request that gets this right, but I'm going to close out the issue as I don't plan to address it myself.

5 participants