haskell-magic-coercible-ala.md 12 KB

Title: Haskell Magic: Coercible, ala, Monoids and all that Date: 2018-10-08 Modified: 2018-10-08 Category: Blog Slug: haskell-magic-coercible-ala Tags: haskell-magic, haskell Summary: Where I describe a penny-drop moment that too me way too long

So, I recently gushed about Foldable, thanks to a significant penny-drop moment in my own Haskelling. Today, I figured I'd share one more, which leads on from the ideas related to Foldable and monoids I canvassed previously, but adds on a few helpful and clever ideas. Let's get to it.

Newtypes and Coercible

In Haskell, the 'newtype wrapper' is a common idea; something like:

:::haskell
newtype Foo a = Foo { unFoo :: Bar a }

While most newtypes are much more meaningful than that, the idea is common for two reasons:

  1. They allow us to define a different API for Foo than the underlying Bar, ensuring that inappropriate uses are made impossible; and
  2. For types which can be (lawful) members of type classes in more than one way, we can tell the compiler which one we want.

foldMap in particular benefits from the latter use of newtype, as it allows us to vary the behaviour of Foldables full of as by wrapping those as in a different monoid. As many types can be monoids in many ways (even Bool has two ways it can be a monoid), this can give us a wide range of behaviours, as we've seen. However, this isn't the only reason to use newtypes this way: another good example are the two different instances of Applicative for lists.

When we're using newtypes in this way, we're essentially telling the compiler something like 'I want you to do this function in this way, but in reality, these two things are the same'. Thus, in an ideal world, once we've figured out what functions to call to make our type behave the way we want, the compiler can forget all about the newtypes and just work directly with the 'underlying' type. While in theory, this is exactly what the compiler does, in practice, this can be confounded a bit, even for simple cases (for example, lists of newtypes). This forces us to have a runtime cost for operations that really ought to be 'free', which is undesirable. Since GHC 7.8.1, however, we don't need to - thanks to Data.Coerce.Coercible.

So what's the deal here? Essentially, Data.Coerce provides a special Coercible type class, whose instances cannot be defined by the user, and which doesn't persist beyond compile time, for any types which are represented the same way. The exact rules for this are quite specific, but for our purposes, we need only really remember that, whenever we have:

:::haskell
newtype Foo = Foo { unFoo :: Bar }

Then GHC generates instances equivalent to:

:::haskell
instance Coercible Foo Bar
-- plus an appropriate coerce :: Foo -> Bar
instance Coercible Bar Foo
-- plus an appropriate coerce :: Bar -> Foo

Again, these instances don't really exist, and coerce is merely an instruction to treat the same representation differently for the purposes of dispatch (deciding what functions get called) - none of this translates to runtime at all. We can do some fancier things with this, but for our purposes, this will do.

Directing coerce with TypeApplications

Unfortunately, coerce is a highly polymorphic function - its type is effectively:

:::haskell
coerce :: (Coercible a b) => a -> b

This can sometimes throw GHC for a loop when writing polymorphic functions - even if we know exactly what we actually want the coercion to be. To give the compiler a helping hand, we would usually provide a type signature or an argument, but this can be awkward. TypeApplications aims to resolve this problem by giving us some syntax to allow us to state things more directly.

To demonstrate, let's consider a GHCi session. We'll set ourselves up as so:

:::haskell
import Data.Coerce
:set -fprint-explicit-foralls
:set -XTypeApplications

This gives us access to coerce, and also enables a bit of helpful output to allow us to understand how to use TypeApplications more easily. We also have to enable the extension to use it. Let's see some examples of TypeApplications in action:

:::haskell
:t coerce
-- coerce :: forall {a} {b} . Coercible a b => a -> b
:t coerce @Int
-- coerce @Int :: forall {b} . Coercible b Int => Int -> b
:t coerce @_ @Int
-- coerce @_ @Int :: forall {w} . Coercible w Int => w -> Int
:t foldMap
-- foldMap 
--   :: forall {t :: * -> *} {m} {a}.
--      (Foldable t, Monoid m) =>
--      (a -> m) -> t a -> m

Let's take a closer look at that last one. Thanks to -fprint-explicit-foralls, GHCi shows us exactly what type variables we have, and their kinds (as indicated by {t :: * -> *}. Normally, this wouldn't be
helpful, but with TypeApplications, it shows us where we have to use the @ notation to 'fix' the type we want. For example, suppose we wanted to know what type foldMap would have if our Monoid was Sum Int. As m is the second argument listed, we would need to write this:

:::haskell
import Data.Monoid
:t foldMap @_ @(Sum Int)
-- foldMap @_ @(Sum Int)
--   :: forall {w :: * -> *} {a} .
--      Foldable w =>
--      (a -> Sum Int) -> w a -> Sum Int

Any 'trailing' type variables can be left alone, but any 'leading' ones we don't care about need to be 'filled in' with @_, which says 'I don't care what this is'. We also need parentheses if our type is more complex than a single word (as in the case above). Thus, we can use TypeApplications to 'fix' the excessive polymorphism of functions like coerce that confuse GHC, as we showed above.

Making Coercible work for us

One of the downsides of using foldMap is the constant need to do something like this:

:::haskell
count :: (Foldable t) => (a -> Bool) -> t a -> Int
count p = getSum . foldMap (Sum . fromEnum . p)

What we really want to be able to do is say something like:

:::haskell
count p = foldMap (fromEnum . p)

However, this won't work, as our compiler can't possibly read our minds and figure out which monoid definition of Bool we want. We can reduce some of this boilerplate using the coercible-utils package, which provides this wonder-weapon of a function:

:::haskell
ala' :: (Coercible a b, Coercible a' b') => (a -> b) -> ((d -> b) -> c ->
b') -> (d -> a) -> c -> a'

When I first saw this function, I was a bit taken aback - it was difficult to see what on earth it did! However, it's really not all that scary. While its uses are many and varied, for our purposes, it's sufficient to see it as a way of abstracting away the wrapping and unwrapping of Monoid newtypes used with functions like foldMap. Essentially, ala' wants three arguments:

  1. A 'witness' of the Coercible a b (which is never actually used for anything);
  2. A 'higher-order function' which we want to operate with; and
  3. A 'preprocessing function' to massage our input into a more appropriate form.

Let's see how we can derive count using ala', with the help of GHCi once again. Our 'higher-order function' in this case is foldMap:

:::haskell
:t ala'
-- ala'
--  :: forall {a} {b} {a'} {b'} {d} {c}.
--     (Coercible a b, Coercible a' b') =>
--     (a -> b) -> ((d -> b) -> c -> b') -> (d -> a) -> c -> a'
:t \witness preprocessor -> ala' witness foldMap preprocessor
-- \witness preprocessor -> ala' witness foldMap preprocessor
--   :: forall {t :: * -> *} {b'} {a} {a'} {d}.
--      (Foldable t, Monoid b', Coercible a b', Coercible a' b') =>
--      (a -> b') -> (d -> a) -> t d -> a' 

This is starting to get us somewhere. We also know what we want in the end (an Int), so let's use TypeApplications to help. This is a bit wordy in GHCi, but isn't an issue in practice, as we usually give type signatures to functions in source code:

:::haskell
let partialCount :: (Foldable t, Monoid w, Coercible a w, Coercible a' w) => (a -> w) -> (d -> a) -> t d -> a'; partialCount witness preprocessor = ala' witness foldMap preprocessor
:t partialCount
-- partialCount
--  :: forall {t :: * -> *} {w} {a} {a'} {d}.
--     (Foldable t, Monoid w, Coercible a w, Coercible a' w) =>
--     (a -> w) -> (d -> a) -> t d -> a'
:t partialCount @_ @_ @_ @Int @_
-- I rewrote this one with some renaming to avoid confusion - your GHCi may
-- output something else.
-- partialCount @_ @_ @_ @Int @_
--  :: forall {t :: * -> *} {m} {w3} {a}.
--     (Foldable t, Monoid m, Coercible w3 m, Coercible m Int) =>
--     (w3 -> m) -> (a -> w3) -> t a -> Int

Based on the signature we just saw, we can see that our Monoid instance must be coercible to Int. We happened to use Sum Int in our previous definition of count, so we have a witness readily available: Sum. Now we just have to convert the elements of our Foldable into Int, which we can do using fromEnum . p. Putting it all together, we get:

:::haskell
:t \p -> ala' Sum foldMap (fromEnum . p)
-- \p -> ala' Sum foldMap (fromEnum . p)
--   :: forall {t :: * -> *} {a1} {a'} {a2}.
--      (Foldable t, Enum a1, Coercible a' Int) =>
--      (a2 -> a1) -> t a2 -> a'

Now, this is actually more general than count as we defined it before, but it works just fine. If we substitute Bool for a1 and Int for a', we recover count just as we had it. Thus, we can write it as

:::haskell
count p = ala' Sum foldMap (fromEnum . p)

No wrapping or unwrapping necessary.

coercible-utils also provides several helper functions. In many cases, ala' is too general, as we don't really need a 'preprocessor'. Consider, for example, possible default of and and or:

:::haskell
and :: (Foldable t) => t Bool -> Bool
and = ala' All foldMap id

or :: (Foldable t) => t Bool -> Bool
or = ala' Any foldMap id

Having to inject that id in both cases should really not be necessary to do manually. Hence the more restrictive cousin of ala':

:::haskell
ala :: (Coercible a b, Coercible a' b') => (a -> b) -> ((a -> b) -> c -> d')
-> c -> a'
ala f hof = ala' f hof id

So now, we can rewrite those as:

:::haskell
and = ala All foldMap
or = ala Any foldMap

In fact, these definitions are more general than the ones in Data.Foldable:

:::haskell
-- without -fprint-explicit-foralls
:t ala Any foldMap
-- ala Any foldMap :: (Foldable t, Coercible a' Bool) => t Bool -> a'

We don't have to give back a Bool as a result of this function -- anything that's coercible to Bool will do. This is another advantage of using ala and ala'; we can potentially 're-wrap' into different newtypes after each call if we need to. This is an extra layer of generality on top of what foldMap already gives us; this is why I've been using foldMap everywhere recently, along with coercible-utils.

Rounding off

Thus ends my description of my penny-drop moment about Coercible, TypeApplications and ala'. Each of these seem tailor-made for the others, and thanks to coercible-utils, can be used to do some amazing things. I encourage everyone to check out coercible-utils - if nothing else, you'll find some really good tricks for working with newtypes in it, and likely might be able to solve some of your problems much more concisely. Think hard, and have fun.