Developers spend most of their time reading code, understanding it and exploring other ways to use existing solutions. Frankly, in our profession, very little time is spent on actually writing new libraries and creating new interfaces in real-life development. So it is quite important to have some help with the most common activities. Naming conventions are one such thing: they improve readability and lower the cost of usage when they are agreed upon and widely adopted.
Some languages have their own special naming conventions that make sense. Haskell is among them. There are a bunch of naming patterns that are commonly used everywhere in the ecosystem (including the standard libraries) that may help you to recognise the function’s meaning without looking at its documentation and even its type! This ability is especially relevant because naming is one of the hardest development problems, so having some help and no-brainer rules to guide in this area improves everyone’s life.
In this post, we will explore common naming conventions in Haskell together. It is going to be useful for both creators (library and API developers) and consumers (library users), as it establishes norms accepted in the libraries’ APIs.
🦋 If you are interested in other conventions and best practices on how to write Haskell code, you can take a look at our style guide.
Checked🔗
Let’s start with the conventional and straightforward norms established in Haskell’s specifications and standards. Names in Haskell must satisfy the following simple rules:
- Types and typeclasses must start with an uppercase letter
- Functions and variables must start with a lowercase letter
- Top-level operator functions must start with any allowed symbol except for :
- Constructors as operators must start with :
These rules are in the specifications and therefore checked by the compiler. So if you try to break the naming rules, you will get errors during the compilation.
Additionally, functions follow the lowerCamelCase style and types follow the UpperCamelCase style. This is the de facto standard of writing code in Haskell, but using distinct styles doesn’t lead to a compiler error. Moreover, there are some testing libraries that use snake_case to discover tests automatically. However, you can restrict that with the Haskell tooling, e.g. HLint can check this for you.
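The rules above can be seen side by side in a small sketch. All names here (Shape, NonEmptyList, area, toPlainList, |>) are made up for illustration and are not taken from any library.

```haskell
-- Type names and constructors start with an uppercase letter.
data Shape = Circle Double | Square Double

-- A constructor written as an operator must start with ':'.
data NonEmptyList a = a :| [a]
infixr 5 :|

-- Functions and variables start with a lowercase letter (lowerCamelCase by convention).
area :: Shape -> Double
area (Circle radius) = pi * radius * radius
area (Square side)   = side * side

toPlainList :: NonEmptyList a -> [a]
toPlainList (x :| xs) = x : xs

-- An ordinary operator may start with any allowed symbol except ':'.
(|>) :: a -> (a -> b) -> b
x |> f = f x
infixl 1 |>

main :: IO ()
main = do
    print (Square 2 |> area)
    print (toPlainList ((1 :: Int) :| [2, 3]))
```

Renaming area to Area, or |> to :>, would be rejected by the compiler, while renaming area to shape_area would compile but break the de facto style.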
There are various details in the names that will give you hints on what the function does. We will walk through them to learn to recognise them all.
Type variables🔗
Let’s start our excursion with the type variables that are most commonly used in type signatures and definitions. Some variables carry a particular meaning: it could be a typeclass-related convention or just a convenient shorthand. Either way, this information is useful to know while reading and writing types:
- m: Monad or Semigroup/Monoid
- t: Foldable or Traversable
- … or types of kind Type (previously *)
Function variables🔗
The way we name arguments and variables in functions is also not accidental. The names contain hints that make it easier to read function bodies. Variables in functions use the following well-established names:
- x:xs: patterns for lists where x means a single x and xs means many xs
- Also used together with the previous one with as-patterns:
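A minimal sketch of both patterns follows. The function names and the as-pattern variable list are illustrative choices, not an established standard.

```haskell
-- x is one element, xs is the rest of the xs.
sumAll :: [Int] -> Int
sumAll []       = 0
sumAll (x : xs) = x + sumAll xs

-- The as-pattern names the whole list while still matching on its head and tail.
dropLeadingZero :: [Int] -> [Int]
dropLeadingZero [] = []
dropLeadingZero list@(x : xs)
    | x == 0    = xs
    | otherwise = list

main :: IO ()
main = do
    print (sumAll [1, 2, 3])
    print (dropLeadingZero [0, 5, 6])
```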
Suffixes🔗
A suffix in a Haskell function name can carry a lot of information about the function's purpose. Sometimes you will see several suffixes at once, combining the characteristics of each piece, so it is helpful to pay attention to them.
Apostrophe ’🔗
The ' symbol is used in functions for which there is a corresponding function without the apostrophe, e.g. foo and foo'.
The apostrophe at the end means that it’s a strict version of a similar function. Both functions must have the same type, but different implementations underneath. The only difference in their behaviour is that the one with the ' symbol evaluates intermediate results more eagerly.
Example:
foldMap :: (Foldable t, Monoid m) => (a -> m) -> t a -> m
foldMap' :: (Foldable t, Monoid m) => (a -> m) -> t a -> m
As you can see, both functions have the same type. But foldMap' is more efficient and helps to avoid space leaks when the monoidal operation <> is strict in both arguments.
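As a rough usage sketch, the lazy and strict variants are interchangeable at the call site. The helper names below are hypothetical; foldMap' and foldl' come from Data.Foldable in base.

```haskell
import Data.Foldable (foldMap', foldl')
import Data.Monoid (Sum (..))

-- Lazy version: intermediate results are only evaluated when demanded.
totalLazy :: [Int] -> Int
totalLazy = getSum . foldMap Sum

-- Strict version: same type, but the accumulator is forced at every step.
totalStrict :: [Int] -> Int
totalStrict = getSum . foldMap' Sum

-- The same naming applies to foldl and foldl'.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0

main :: IO ()
main = do
    print (totalLazy [1 .. 100])
    print (totalStrict [1 .. 100])
    print (sumStrict [1 .. 100])
```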
Typeclasses🔗
There is a group of symbols that is used to indicate that the function returns the value in some context. This suffix – an uppercase letter or word – tells us the typeclass this context should represent.
Meet the FAMily 👩👩👦👦
The suffix F suggests that it works with Functors in some way.
Such functions can have an alternative without the suffix F.
Example from the containers library:
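One such pair in containers is alter and alterF from Data.Map. The sketch below assumes the containers package is available; the helper names bump and bumpAndGetOld are made up for illustration.

```haskell
import qualified Data.Map.Strict as Map

-- alter  :: Ord k => (Maybe a -> Maybe a) -> k -> Map k a -> Map k a
-- alterF :: (Ord k, Functor f) => (Maybe a -> f (Maybe a)) -> k -> Map k a -> f (Map k a)

-- Plain version: bump a counter, inserting 1 if the key is missing.
bump :: String -> Map.Map String Int -> Map.Map String Int
bump = Map.alter (Just . maybe 1 (+ 1))

-- Functor version: the same update, but the tuple functor also reports the old value.
bumpAndGetOld :: String -> Map.Map String Int -> (Maybe Int, Map.Map String Int)
bumpAndGetOld = Map.alterF (\old -> (old, Just (maybe 1 (+ 1) old)))

main :: IO ()
main = do
    print (bump "a" (Map.fromList [("a", 1)]))
    print (bumpAndGetOld "b" Map.empty)
```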
However, sometimes the suffix F has an alternative meaning. F is frequently used as a suffix in formatting libraries, such as the fmt library, to indicate that a function is a formatter or a pretty-printer.
You can see how the same naming can have different meanings. What’s important is that the library establishes its naming convention explicitly and uses it consistently.
The suffix A means that the function works with some general Applicative type (the type that has the Applicative instance).
Examples:
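Two functions from base that carry this suffix are liftA2 and sequenceA. The wrappers around them below are hypothetical and only there to pin the types down.

```haskell
import Control.Applicative (liftA2)

-- liftA2 lifts a two-argument function into any Applicative.
addMaybes :: Maybe Int -> Maybe Int -> Maybe Int
addMaybes = liftA2 (+)

-- sequenceA turns a list of Applicative values into one Applicative value holding a list.
allPresent :: [Maybe Int] -> Maybe [Int]
allPresent = sequenceA

main :: IO ()
main = do
    print (addMaybes (Just 1) (Just 2))
    print (allPresent [Just 1, Nothing])
```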
The suffix M is among the most common ones. It usually means that the function works with Monads or in a monadic context.
Similarly to the previous suffixes, functions with M have their counterparts without M:
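For instance, map and replicate have the monadic siblings mapM and replicateM. The signatures in the comments are specialised to lists for readability, and halve is a made-up helper.

```haskell
import Control.Monad (replicateM)

-- map        ::                  (a -> b)   -> [a] -> [b]
-- mapM       :: Monad m       => (a -> m b) -> [a] -> m [b]
-- replicate  ::                  Int -> a   -> [a]
-- replicateM :: Applicative m => Int -> m a -> m [a]

halve :: Int -> Maybe Int
halve n
    | even n    = Just (n `div` 2)
    | otherwise = Nothing

-- Fails as a whole if any single element fails.
halveAll :: [Int] -> Maybe [Int]
halveAll = mapM halve

-- In the list monad, replicateM enumerates all combinations.
pairs :: [String]
pairs = replicateM 2 "ab"

main :: IO ()
main = do
    print (halveAll [2, 4])
    print pairs
```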
Note: Historically, the standard Haskell library base didn’t have Applicative functors, and Applicative wasn’t a superclass of Monad. But now the suffix M is also sometimes used with Applicative functions.
Underscore _🔗
Underscore, as a suffix of functions, also has a special meaning. It gives us a clue that the function works exactly as the one without _ but discards the result (returns () instead).
Examples:
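A small sketch with mapM, mapM_ and traverse_ from base shows the difference. The signatures in the comments are specialised to lists, and checkPositive is a hypothetical validator.

```haskell
import Data.Foldable (traverse_)

-- mapM  :: Monad m => (a -> m b) -> [a] -> m [b]
-- mapM_ :: Monad m => (a -> m b) -> [a] -> m ()

checkPositive :: Int -> Either String Int
checkPositive n
    | n > 0     = Right n
    | otherwise = Left ("not positive: " ++ show n)

-- Keeps the validated values.
validated :: [Int] -> Either String [Int]
validated = mapM checkPositive

-- Runs the same checks but only reports success or failure.
validatedOnly :: [Int] -> Either String ()
validatedOnly = mapM_ checkPositive

main :: IO ()
main = do
    print (validated [1, 2])
    print (validatedOnly [1, 2])
    traverse_ print [1, 2, 3 :: Int]
```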
Number🔗
You can often see the series of functions with the numbers at the end of their names. These groups have the same initial part but differ with the number. Numbers there represent the number of arguments each function takes.
🔢 The number 1 is usually not used with that meaning, as it would be redundant.
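Typical members of such families in base are liftA2/liftA3 and zipWith/zipWith3. The values defined below are illustrative only.

```haskell
import Control.Applicative (liftA2, liftA3)

-- liftA2   :: Applicative f => (a -> b -> c)      -> f a -> f b -> f c
-- liftA3   :: Applicative f => (a -> b -> c -> d) -> f a -> f b -> f c -> f d
-- zipWith  :: (a -> b -> c)      -> [a] -> [b] -> [c]
-- zipWith3 :: (a -> b -> c -> d) -> [a] -> [b] -> [c] -> [d]

pairUp :: Maybe (Int, Char)
pairUp = liftA2 (,) (Just 1) (Just 'x')

tripleUp :: Maybe (Int, Char, Bool)
tripleUp = liftA3 (,,) (Just 1) (Just 'x') (Just True)

sums2 :: [Int]
sums2 = zipWith (+) [1, 2] [10, 20]

sums3 :: [Int]
sums3 = zipWith3 (\a b c -> a + b + c) [1, 2] [10, 20] [100, 200]

main :: IO ()
main = do
    print pairUp
    print tripleUp
    print sums2
    print sums3
```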
👀 Have you noticed the suffix A here? :)
The number 1 has a special meaning: it requires at least one element to be present in a container:
foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b
foldr1 :: Foldable t => (a -> a -> a) -> t a -> a
Though, the fact that it works with Foldable is not ideal. There is a proposal to implement a typeclass called Foldable1 (or Semifoldable) for non-empty types.
L/R🔗
The suffixes L and R (sometimes l and r) represent the direction of function application or order of traversing a data structure.
ℹ️ Most of the time, these sibling functions have the same type, but in some functions, certain arguments are reversed for convenience.
Examples:
foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b
foldl :: Foldable t => (b -> a -> b) -> b -> t a -> b
scanr :: (a -> b -> b) -> b -> [a] -> [b]
scanl :: (b -> a -> b) -> b -> [a] -> [b]
mapAccumR :: Traversable t => (a -> b -> (a, c)) -> a -> t b -> (a, t c)
mapAccumL :: Traversable t => (a -> b -> (a, c)) -> a -> t b -> (a, t c)
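To see how the direction changes the result, here is a small sketch that runs each pair on the same input. The value names are made up.

```haskell
import Data.Traversable (mapAccumL, mapAccumR)

rightFold, leftFold :: Int
rightFold = foldr (-) 0 [1, 2, 3]  -- 1 - (2 - (3 - 0))
leftFold  = foldl (-) 0 [1, 2, 3]  -- ((0 - 1) - 2) - 3

runningFromLeft, runningFromRight :: [Int]
runningFromLeft  = scanl (+) 0 [1, 2, 3]
runningFromRight = scanr (+) 0 [1, 2, 3]

-- Each element is replaced by the accumulator seen before that element was added.
accFromLeft, accFromRight :: (Int, [Int])
accFromLeft  = mapAccumL (\acc x -> (acc + x, acc)) 0 [1, 2, 3]
accFromRight = mapAccumR (\acc x -> (acc + x, acc)) 0 [1, 2, 3]

main :: IO ()
main = do
    print (rightFold, leftFold)
    print (runningFromLeft, runningFromRight)
    print (accFromLeft, accFromRight)
```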
By/On🔗
Some of the overloaded functions that work with Foldables or lists have a non-overloaded sibling with the suffix By.
It is often convenient to use these functions together with on, for instance:
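A minimal sketch with sortBy and maximumBy from Data.List, combined with on from Data.Function. The sample data and value names are hypothetical.

```haskell
import Data.Function (on)
import Data.List (maximumBy, sortBy)

people :: [(String, Int)]
people = [("Ann", 30), ("Bob", 25), ("Cid", 41)]

-- The By functions take an explicit comparison instead of relying on the Ord instance.
byAge :: [(String, Int)]
byAge = sortBy (compare `on` snd) people

oldest :: (String, Int)
oldest = maximumBy (compare `on` snd) people

main :: IO ()
main = do
    print byAge
    print oldest
```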
This on pattern is so widely used that some functions also have the suffix On.
Compare these three types:
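The sort family from Data.List is a convenient illustration. The sample list is made up.

```haskell
import Data.List (sort, sortBy, sortOn)
import Data.Ord (comparing)

-- sort   :: Ord a => [a] -> [a]
-- sortBy :: (a -> a -> Ordering) -> [a] -> [a]
-- sortOn :: Ord b => (a -> b) -> [a] -> [a]

names :: [String]
names = ["b", "aaa", "cc"]

alphabetical, byLengthBy, byLengthOn :: [String]
alphabetical = sort names
byLengthBy   = sortBy (comparing length) names
byLengthOn   = sortOn length names

main :: IO ()
main = do
    print alphabetical
    print byLengthBy
    print byLengthOn
```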
In some libraries or application code, the suffix P shows that the function is a parser of some type, e.g. when using the optparse-applicative library. The usage of this naming convention can look like this:
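data Config = Config
    { configPort :: Port
    , configPath :: FilePath
    }

-- Definitions of the two field parsers are omitted here.
portP :: Parser Port
pathP :: Parser FilePath

-- Parser is only an Applicative, so this do-block relies on ApplicativeDo;
-- Config{..} relies on RecordWildCards.
configP :: Parser Config
configP = do
    configPort <- portP
    configPath <- pathP
    pure Config{..}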
Prefixes🔗
Now we are going to focus on function prefixes and their meaning. Similar to suffixes, there are some established patterns that are often used by developers.
newtypes🔗
Newtypes are a widespread pattern in Haskell. A newtype is a wrapper around some type. Thus, it is important for the name to mention this relation to the wrapped type, or the fact that it’s a newtype.
un/get/run🔗
Newtypes can have a name for their only field. One of the most common naming conventions is to name this field as the type name prefixed with un (short for unwrap):
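A minimal sketch of this convention, with a hypothetical Size newtype:

```haskell
-- The field name is the type name prefixed with "un".
newtype Size = Size
    { unSize :: Int
    }

doubleSize :: Size -> Size
doubleSize size = Size (unSize size * 2)

main :: IO ()
main = print (unSize (doubleSize (Size 21)))
```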
When un is followed by a lowercase letter, it usually means the inverse of the function with the same name (short for undo):
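Pairs like words/unwords, zip/unzip and curry/uncurry from the Prelude follow this pattern. In the sketch below the value names are made up, and the round trip through words is only exact when the input has single spaces and no leading or trailing whitespace.

```haskell
roundTripWords :: String
roundTripWords = unwords (words "naming is hard")

unzipped :: ([Int], String)
unzipped = unzip (zip [1, 2, 3] "abc")

-- uncurry undoes curry: a two-argument function becomes a function on pairs.
addPair :: (Int, Int) -> Int
addPair = uncurry (+)

main :: IO ()
main = do
    putStrLn roundTripWords
    print unzipped
    print (addPair (1, 2))
```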
In the standard library base, you can find a lot of Monoidal newtypes that use the prefix get for the same purpose, e.g. getSum or getMax.
However, if the newtype is some wrapper for a Monad, the prefix run is utilised instead:
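Both prefixes can be sketched as follows. Sum and All are real newtypes from Data.Monoid, while Counter is a hypothetical state-passing wrapper defined only for this example; in real code you would more likely meet runStateT from transformers or runIdentity from base.

```haskell
import Data.Monoid (All (..), Sum (..))

-- newtype Sum a = Sum { getSum :: a }
-- newtype All   = All { getAll :: Bool }

total :: Int
total = getSum (foldMap Sum [1, 2, 3])

allEven :: Bool
allEven = getAll (foldMap (All . even) [2, 4, 6 :: Int])

-- Hypothetical wrapper around a computation; "running" it means supplying the state.
newtype Counter a = Counter
    { runCounter :: Int -> (a, Int)
    }

tick :: Counter Int
tick = Counter (\n -> (n, n + 1))

main :: IO ()
main = do
    print total
    print allEven
    print (runCounter tick 5)
```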
records🔗
Fields in record data types have several well-known naming conventions that are widely used in the Haskell ecosystem, probably about equally often.
One popular naming rule is to prefix each field with the full type name to avoid name conflicts with other records:
Sometimes, the abbreviation is used as a prefix when the full name of the type is too long:
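Both record styles are sketched below. The types and field names are hypothetical.

```haskell
-- Full type name as the field prefix.
data User = User
    { userName :: String
    , userAge  :: Int
    }

-- Abbreviation of a long type name as the field prefix.
data HttpClientConfiguration = HttpClientConfiguration
    { hccTimeoutSeconds :: Int
    , hccMaxRetries     :: Int
    }

describeUser :: User -> String
describeUser user = userName user ++ " (" ++ show (userAge user) ++ ")"

main :: IO ()
main = do
    putStrLn (describeUser (User "Ann" 30))
    print (hccMaxRetries (HttpClientConfiguration 10 3))
```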
pretty🔗
The prefix pretty is used for pure functions that display values in a prettier human-readable way, unlike show, whose output is supposed to be parseable as Haskell.
data GhcVersion
    = Ghc884
    | Ghc8102
    deriving stock (Show)

-- Needs the LambdaCase, DerivingStrategies and OverloadedStrings extensions.
prettyGhcVersion :: GhcVersion -> Text
prettyGhcVersion = \case
    Ghc884  -> "GHC 8.8.4"
    Ghc8102 -> "GHC 8.10.2"
when🔗
The when* family of functions usually performs some action when a criterion is met. Usually, the first argument is the criterion, followed by the action that needs to be run. Such functions typically discard the result and return pure ().
This convention originates from the when function in base:
when :: Applicative f => Bool -> f () -> f ()
-- variations
whenM :: Monad m => m Bool -> m () -> m ()
whenJust :: Applicative m => Maybe a -> (a -> m ()) -> m ()
whenNothingM :: Monad m => m (Maybe a) -> m a -> m a
whenLeft_ :: Applicative f => Either l r -> (l -> f ()) -> f ()
🔗 See how multiple naming conventions are used together?
Similarly, there’s the prefix unless that has the inverse meaning for the check: when (not p) ≡ unless p.
The prefix is is used for predicates that check some property and return Bool. The property could also be a check on the constructor for sum types or some more specific check:
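A few predicates of this shape: isJust, isNothing and isDigit come from base, while Colour, isRed and isNumeric are hypothetical.

```haskell
import Data.Char (isDigit)
import Data.Maybe (isJust, isNothing)

data Colour = Red | Green | Blue

-- A check on the constructor of a sum type.
isRed :: Colour -> Bool
isRed Red = True
isRed _   = False

-- A more specific check.
isNumeric :: String -> Bool
isNumeric str = not (null str) && all isDigit str

main :: IO ()
main = do
    print (isJust (Just 'x'), isNothing (Just 'x'))
    print (isRed Red, isRed Green, isRed Blue)
    print (isNumeric "2020", isNumeric "8.10")
```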
We have already seen the suffix M. However, m is also often used as a prefix. When you see m in this position, it could have two different meanings described below.
When followed by a lowercase letter, it usually means that the function works with some monadic type (similar to the suffix meaning).
filter :: (a -> Bool) -> [a] -> [a]
mfilter :: MonadPlus m => (a -> Bool) -> m a -> m a
zip :: [a] -> [b] -> [(a, b)]
mzip :: MonadZip m => m a -> m b -> m (a, b)
But when followed by an uppercase letter, it usually means that this is a Maybe version of a value. This naming convention is generally used with local variables.
printPath :: Maybe FilePath -> IO ()
printPath mPath = case mPath of
    Nothing   -> putStrLn "No path given"
    Just path -> putStrLn $ "Path is: " ++ path
generic🔗
The standard library uses the prefix generic (e.g. genericLength) to provide functions that return polymorphic values or work with more polymorphic arguments. As a consequence, they are usually much slower, but in some cases they are the best option.
Smart constructors are usually named with the prefix mk, followed by the type name:
Sometimes, ordinary constructors also start with the prefix Mk:
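Both conventions can appear on the same type, as in this sketch with a hypothetical Percent newtype:

```haskell
-- The ordinary constructor carries the Mk prefix.
newtype Percent = MkPercent Int
    deriving (Eq, Show)

-- The smart constructor carries the mk prefix and validates its input.
mkPercent :: Int -> Maybe Percent
mkPercent n
    | n >= 0 && n <= 100 = Just (MkPercent n)
    | otherwise          = Nothing

main :: IO ()
main = do
    print (mkPercent 42)
    print (mkPercent 101)
```

In a real module you would typically export mkPercent but not MkPercent, so that invalid values cannot be constructed from outside.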
Operator conventions🔗
Haskell allows defining custom operators, and as regular functions, they also have a few of their own naming conventions.
Arrows around an already existing operator usually mean that it is a lifted version of it in some sense, e.g. (<$>) is ($) lifted into a Functor.
The number of <> layers can mean the number of applications of the same concept.
In the same spirit, arrows can mean the direction of function application, e.g. (>>=) and (=<<).
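The sketch below puts these operator conventions next to each other. The doubly-lifted operator (<<$>>) is defined locally for illustration (similar operators exist in some alternative preludes), and halve is a made-up helper; the other operators come from base.

```haskell
import Data.Functor (($>))

-- Two layers of arrows: fmap applied through two nested functors.
infixl 4 <<$>>
(<<$>>) :: (Functor f, Functor g) => (a -> b) -> f (g a) -> f (g b)
(<<$>>) = fmap . fmap

halve :: Int -> Maybe Int
halve n
    | even n    = Just (n `div` 2)
    | otherwise = Nothing

plain :: Int
plain = (+ 1) $ 1

lifted :: Maybe Int
lifted = (+ 1) <$> Just 1

liftedTwice :: [Maybe Int]
liftedTwice = (+ 1) <<$>> [Just 1, Nothing]

-- The arrows show which way the value flows into the function.
leftToRight, rightToLeft :: Maybe Int
leftToRight = Just 4 >>= halve
rightToLeft = halve =<< Just 4

-- The arrow points at the value that is kept.
keepLeft, keepRight :: Maybe String
keepLeft  = "done" <$ Just (1 :: Int)
keepRight = Just (1 :: Int) $> "done"

main :: IO ()
main = do
    print (plain, lifted, liftedTwice)
    print (leftToRight, rightToLeft)
    print (keepLeft, keepRight)
```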
Some operators have ! in them, which means they are stricter versions of their analogues:
($) :: (a -> b) -> a -> b
($!) :: (a -> b) -> a -> b
(<$>) :: Functor f => (a -> b) -> f a -> f b
(<$!>) :: Monad m => (a -> b) -> m a -> m b
Others🔗
Haskell also introduces several additional naming conventions.
A function that handles each constructor by returning a value or applying some action to its argument is called an eliminator. It has the same name as the type, but starts with a lowercase letter.
Examples:
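Three eliminators from base are maybe, either and bool. The functions built on top of them here are hypothetical.

```haskell
import Data.Bool (bool)

-- maybe  :: b -> (a -> b) -> Maybe a -> b
-- either :: (a -> c) -> (b -> c) -> Either a b -> c
-- bool   :: a -> a -> Bool -> a   (the value for False comes first)

showPort :: Maybe Int -> String
showPort = maybe "no port" show

describe :: Either String Int -> String
describe = either ("error: " ++) (\n -> "value: " ++ show n)

yesNo :: Bool -> String
yesNo = bool "no" "yes"

main :: IO ()
main = do
    putStrLn (showPort (Just 8080))
    putStrLn (describe (Left "timeout"))
    putStrLn (yesNo True)
```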
When functions or constructors are unsafe, they have the prefix unsafe or Unsafe.
Sometimes functions also have the prefix is or suffix Of/From (or both) to make them read more like natural language.
Examples:
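Functions from Data.List such as isPrefixOf read naturally in infix position, and enumFrom from the Prelude is one function with the From suffix. Whether these are the exact examples the convention was drawn from is an assumption; the value names are made up.

```haskell
import Data.List (isInfixOf, isPrefixOf, isSuffixOf)

startsWithGhc :: Bool
startsWithGhc = "GHC" `isPrefixOf` "GHC 8.10.2"

isHaskellFile :: Bool
isHaskellFile = ".hs" `isSuffixOf` "Main.hs"

mentionsVersion :: Bool
mentionsVersion = "8.10" `isInfixOf` "GHC 8.10.2"

-- enumFrom 5 reads as "enumerate from 5".
fromFive :: [Int]
fromFive = take 3 (enumFrom 5)

main :: IO ()
main = do
    print (startsWithGhc, isHaskellFile, mentionsVersion)
    print fromFive
```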
Possible ecosystem improvements🔗
We highlighted some of the most common and established naming conventions in the Haskell ecosystem. But sometimes different Haskell libraries or particular functions don’t follow common rules, and have inconsistent or non-obvious naming rules within the library itself.
That means that not every library uses naming conventions, which is unfortunate. It’s very confusing to get used to some rules and common sense, and then realise that they don’t work in some places and that your assumptions about how something should work are incorrect. It wastes our time and slows down development. We, as a community, should work harder on establishing and following best practices, as this is one of the most topical struggles for Haskell developers according to the 2020 State of Haskell survey results.
Here are several examples of potential areas for improvement:
- In packages with container implementations, functions to extract Map keys are called keys, but functions to extract values are called elems, not values, which would be the logical counterpart.
- The function for converting a dictionary to a list of key-value pairs is called assocs in containers and toList in unordered-containers. At the same time, toList is also a method of Foldable and behaves precisely as elems in both cases.
- Generally speaking, not having a unified interface for container data structures (maps, sets, sequences, etc.) causes pain from time to time. containers-backpack is one way to solve this problem, though the ecosystem is not yet ready for the backpack feature (which is 4 years old in Haskell).
- The *sql-simple family of libraries has functions where the suffix _ means “no arguments” instead of “this function discards the result”.
- People use the apostrophe ' to name updated versions of local variables, because it is too hard to come up with a new name that better reflects the meaning of the new variable in scope. E.g. you can often see something like let cur = f x; cur' = g cur; cur'' = h cur'. This approach makes code hard to follow and often confusing when the variables are not close to each other, and your first thought is that some stricter version of a function is used.
- Haskell has a feature called typed holes. This feature allows using a variable starting with an underscore in expressions, and it lets the compiler help you with the type of the specified expression. However, this conflicts with lens naming rules: _1 and _2 lenses for tuples, and prisms starting with _.
- Names in the standard library base are also inconsistent in some aspects. There are patterns, which we also described in the post, but some anomalies also exist. For example, newtypes like Max and Const have fields named getMax and getConst, but Identity (also a newtype) has the field runIdentity. This inconsistency can be very puzzling and requires keeping in mind different naming conventions for values of similar structure.
- The m/M letter in various functions doesn’t really explain where the monadic type should go. See for yourself:
  filterM :: Applicative m => (a -> m Bool) -> [a] -> m [a]
  mfilter :: MonadPlus m => (a -> Bool) -> m a -> m a
  Both functions satisfy the naming convention we were talking about in a sense. However, there is no logical explanation as to why exactly the first one works with lists, while the other one works with general Monads.
Challenge time!🔗
What do you think would be the type of a function called mfilterM?
We believe that we all can do better here by embracing standard rules and sharing this knowledge with each other.
Conclusion🔗
When coming to programming, one usually doesn’t know anything about the accepted rules and best practices. It takes some time (along with the right people and resources to learn from) to feel “at home”. The same applies when coming to a new language. Naming is one of the keys to code readability, usability and understandability. So, sharing this knowledge is just as important.
To make a community stronger, its users more confident and working as a team, we all need to follow some common standards in naming. We hope that our observations in this write-up could be the first steps into some more common norms and guidelines.