Great power newtypes

A newtype is a specialized data type declaration. Such that it contains only one constructor and a field.

newtypeFoo a = Bar anewtypeId = MkIdWord

Common Beginner Questions

What is the difference from data type data?

dataFoo a = Bar adataId = MkIdWord

The main specificity of a newtype is that it consists of the same parts as its only field. More precisely, it differs from the original at the type level, but it has the same memory representation, and it is calculated strictly (not lazily).
In short, the newtype is more efficient due to its presentation.

Yes, it means nothing to me ... I will use data
No, well, in the end, you can always turn on the -funpack-strict-fields :) extension for strict (not lazy) fields or specify directly

dataId = MkId !Word

Yet the power of newtype is not limited to the efficiency of the calculations. They are much stronger!

3 newtype roles

Hiding implementation

module Data.Id (Id()) wherenewtypeId = MkIdWord

A newtype is different from the original, internally just Word .
But we hide the MkId constructor outside the module.

Distribution implementation

{-# LANGUAGE GeneralizedNewtypeDeriving #-}newtypeId = MkIdWordderiving (Num, Eq)

Although this is not in the Haskell2010 standard, thanks to the expansion of the generalized newTypes output, you can automatically infer newtype behavior the same as the internal field behavior. In our case, the behavior of Eq Id and Num Id is the same as the Eq Word and Num Word .

Much more can be achieved through the expansion of the refined derivation ( DerivingVia ), but more on that later.

Implementation of choice

Despite your own constructor, in some cases you can use your own internal representation.

Task

There is a list of integers. Find the maximum and total amount for just one pass through the list.
And do not use the foldl and folds packages .

Typical answer

Of course, fold ! :)

foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b
{-
-- instance Foldable []
foldr :: (a -> b -> b) -> b -> [a] -> b
-}

And, the final function is described as:

aggregate :: [Integer] -> (MaybeInteger, Integer)
aggregate = foldr
      (\el (m, s) -> (Just el `max` m, el + s))
      (Nothing, 0)
{-
ghci> aggregate [1, 2, 3, 4]
(Just 4, 10)
-}

If you look closely, you can see similar operations on both sides: Just el `max` m and el + s . In both cases - mapping and binary operation. And the empty elements are Nothing and 0 .

Yes, these are monoids!

Monoid and Semigroup more details

Полугруппа — это свойство ассоциативной бинарной операции

x ⋄ (y ⋄ z) == (x ⋄ y) ⋄ z

Моноид — это свойство ассоциативной операции (то есть полугруппы)

x ⋄ (y ⋄ z) == (x ⋄ y) ⋄ z

которое имеет пустой элемент, не меняющий любой элемент ни справа, ни слева

x ⋄ empty  ==  x  ==  empty ⋄ x

Both max and (+) are associative, both have empty elements - Nothing and 0 .

And the union of the mapping of monoids together with the convolution is the same Foldable !

Foldable more details

Напомним определение сворачиваемости:

classFoldable t where 
      foldMap :: (Monoid m) => (a -> m) -> t a -> m
      ...

Let's apply the rollover behavior to max and (+) . We will be able to organize no more than one implementation of the Word monoid . It's time to use the newtype option !

{-# LANGUAGE GeneralizedNewtypeDeriving #-}-- already in Data.Semigroup & Data.MonoidnewtypeSum a = Sum {getSum :: a}deriving (Num, Eq, Ord)
instance (Numa, Orda) => Semigroup (Suma) where
    (<>) = (+)
instance (Numa, Orda) => Monoid (Suma) where
    mempty = Sum0newtypeMax a = Max {getMax :: a}deriving (Num, Eq, Ord)
instance (Numa, Orda) => Semigroup (Maxa) where
    (<>) = max

It is necessary to make a remark.

The fact is that in order to be a monoid for the Max a data type , we need a minimum element, that is, for an empty element to exist. So, a monoid can only be a limited Max a .

Theoretically correct monoid of maximal element

newtypeMax a = Max ainstanceOrd a => Semigroup (Maxa)
instanceBounded a => Monoid (Maxa)

So somehow we have to convert our data type so that an empty element appears and we can use clotting.

-- already in PreludedataMaybe a = Nothing | Just ainstanceSemigroup a => Semigroup (Maybea) whereNothing <> b = b
    b <> Nothing = b
    (Just a) <> (Just b) = Just (a <> b)
instanceSemigroup a => Monoid (Maybea) where
    mempty = Nothing-- ------instanceFunctorMaybewhere
    fmap _ Nothing = Nothing
    fmap f (Just b) = Just (f b)

The conjugate element Maybe turns a semigroup into a monoid!

Liberalization of restrictions in fresh versions of GHC

Ещё в GHC 8.2 требовался моноид в ограничении типа

instanceMonoid a => Monoid (Maybea)

а значит нам был необходим ещё один новыйТип:

-- already in Data.Semigroup & Data.MonoidnewtypeOption a = Option {getOption :: Maybea}deriving (Eq, Ord, Semigroup)
instance (Orda, Semigroupa) => Monoid (Optiona) where
    mempty = OptionNothing

И значительно проще уже в GHC 8.4, где необходима лишь полугруппа в ограничении типа, и даже нет необходимости в создании типа Опция.

instanceSemigroup a => Monoid (Maybea)

Cooldown response

Well, now let's update the code using the rollover and arrows.
Recall that (.) Is just a functional composition:

 (.) :: (b -> c) -> (a -> b) -> a -> c
 f . g = \x -> f (g x)

And remember that fmap is a functor:

fmap :: Functor f => (a -> b) -> f a -> f b

its implementation for Maybe is described just above.

Arrow more details

Стрелки — это свойства некоторых функций, которые позволяют работать с ними блок-схемно.
Более детально, можно посмотреть тут: Arrows: A General Interface to Computation
В нашем случае мы используем Стрелки функции
То есть

instanceArrow (->)

Мы будем использовать функции:

(***) :: Arrow a => a b c -> a b' c' -> a (b, b') (c, c')
(&&&) :: Arrow a => a b c -> a b c' -> a b (c, c')

Для нашего случая

a b c   ==   (->) b c   ==   b -> c

И, соответственно, подпись наших функций сокращается до:

(***) :: (b -> c) -> (b' -> c') -> ((b, b') -> (c, c'))
(&&&) :: (b -> c) -> (b -> c') -> (b -> (c, c'))

Или совсем простыми словами, функция (***) объединяет две функции с одним аргументом(и одним выходным типом) в функцию с работой пары аргументов на входе, и на выходе, соответственно пара выходных типов.

Функция (&&&) — это урезанная версия (***), где тип входного аргументов двух функций одинаков, и на входе у нас не пара аргументов, а один аргумент.

Total, unifying function acquired the form:

import Data.Semigroup 
import Data.Monoid
import Control.Arrow
aggregate :: [Integer] -> (MaybeInteger, Integer)
aggregate = 
      (fmap getMax *** getSum)
      . (foldMap (Just . Max &&& Sum))
{-
-- for GHC 8.2
aggregate = 
     (fmap getMax . getOption *** getSum)
     . (foldMap (Option . Just . Max &&& Sum))
-}

It turned out very briefly!

But, it is still tiring to wrap and wrap data from nested types!
You can still cut, and we will help resourceless forced conversion!

Safe non-resource forced conversion and role roles

There is a function from the package Unsafe.Coerce - unsafeCoerce

import Unsafe.Coerce(unsafeCoerce)
unsafeCoerce :: a -> b

The function forcibly converts the type: from a to b .
In essence, the function is magic, it tells the compiler to consider data of type a as type b , without taking into account the consequences of this step.

It can be used to convert nested types, but you must act very carefully.

In 2014, a revolution occurred with a newtype , namely, a secure resource-free forced conversion appeared!

import Data.Coerce(coerce)
coerce :: Coercible a b => a -> b

This function has opened a new era in working with newtype .

The coercible force converter works with types that have the same structure in memory. It looks like a class-type, but in fact GHC converts types during compilation and it is impossible to independently determine instances.
The Data.Coerce.coerce function allows nonresource type conversions, but for this we need access to the type constructors.

Now simplify our function:

import Data.Semigroup 
import Data.Monoid
import Control.Arrow
import Data.Coerce
aggregate :: [Integer] -> (MaybeInteger, Integer)
aggregate = 
      coerce . (foldMap (Just . Max &&& Sum))
-- coerce :: (Maybe (Max Integer), Sum Integer) -> (Maybe Integer, Integer)

We avoided the routine of pulling out nested types; we did it without wasting resources with just one function.

Roles of nested data types

With the coerce function , we can forcefully convert any nested types.
But should this feature be so widely used?

-- already in Data.Ord-- Down a - reversed ordernewtypeDown a = Down aderiving (Eq, Show)
instanceOrd a => Ord (Downa) where
    compare (Down x) (Down y) = y `compare` x
import Data.List(sort)
-- SorteddataSorted a = Sorted [a]deriving (Show, Eq, Ord)
fromList2Sorted :: Ord a => [a] -> Sorted a
fromList2Sorted = Sorted . sort
-- minimum: O(1) !minView :: Sorted a -> Maybe a
minView (Sorted []) = NothingminView (Sorted (a : _))  = Just a

Semantically, it is absurd to convert to Sorted a from Sorted (Down a) .
However, you can try:

ghci> let h = fromList2Sorted [1,2,3] :: SortedIntghci> let hDown = fromList2Sorted $ fmap Down [1,2,3] :: Sorted (DownInt)
ghci> minView h
Just (Down1)
ghci> minView (coerce h :: Sorted (DownInt))
Just (Down1)
ghci> minView hDown
Just (Down3)

All anything, but the correct answer is Just (Down 3) .
It was in order to cut off the wrong behavior that type roles were introduced.

{-# LANGUAGE RoleAnnotations #-}type role Sorted nominal

Let's try now:

ghci> minView (coerce h :: Sorted (DownInt))
error: Couldn't match type ‘Int’ with ‘DownInt’
        arising from a use of ‘coerce’

Much better!

In total there are 3 roles ( type role ):

representational - equivalent if the representation is the same.
nominal - must be exactly the same type
phantom - does not depend on real content. Equivalent to anything

In most cases, the compiler is smart enough to reveal the role of the type, but it can be helped.

Specified Injection DerivingVia Behavior

Thanks to the expansion of the language DerivingVia , the newtype distribution role has improved .

Starting with GHC 8.6, which was recently released, this new extension has appeared.

{-# LANGUAGE DerivingVia #-}newtypeId = MkIdWordderiving (Semigroup, Monoid) via MaxWord

As you can see, the type behavior is automatically derived due to the clarification of how to output.
DerivingVia can be applied to any type that supports Coercible and what's important - completely without the consumption of resources!

Even more, DerivingVia can be applied not only to newtype , but also to any isomorphic types, if they support generics Generics and forced conversion Coercible .

findings

Types newtype is a powerful force that greatly simplifies and improves the code, eliminates the routine and reduces resource consumption.

Original translation : The Great Power of newtypes (Hiromi Ishii)

PS I think after this article, published more than a year ago [not mine] article Haskell 's newtype magic about new types will be a little clearer!

Tags: