Supports encoding a value of type A to a BitVector and decoding a BitVector to a value of A.
Not every value of A can be encoded to a bit vector and, similarly, not every bit vector can be decoded to a value of type A. Hence, both encode and decode return either an error or the result. Furthermore, decode returns the remaining bits in the bit vector that it did not use in decoding.
There are various ways to create instances of Codec. The trait can be implemented directly or one of the constructor methods in the companion can be used (e.g., apply, derive). Most of the methods on Codec return a new codec that has been transformed in some way. For example, the xmap method converts a Codec[A] to a Codec[B] given two functions, A => B and B => A.
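The idea behind xmap can be sketched without the library. The following is a minimal model, not scodec's implementation: MiniCodec, uint8 and Age are hypothetical names, and Either[String, ...] over Vector[Byte] stands in for the library's error and bit vector types.

```scala
object XmapSketch {
  // Hypothetical toy codec: encode may fail, decode returns the unused input plus the value.
  trait MiniCodec[A] { self =>
    def encode(a: A): Either[String, Vector[Byte]]
    def decode(bytes: Vector[Byte]): Either[String, (Vector[Byte], A)]

    // Builds a MiniCodec[B] from a MiniCodec[A] and a pair of functions A => B and B => A.
    def xmap[B](f: A => B, g: B => A): MiniCodec[B] = new MiniCodec[B] {
      def encode(b: B): Either[String, Vector[Byte]] = self.encode(g(b))
      def decode(bytes: Vector[Byte]): Either[String, (Vector[Byte], B)] =
        self.decode(bytes).map { case (rest, a) => (rest, f(a)) }
    }
  }

  // One unsigned byte, 0 to 255.
  val uint8: MiniCodec[Int] = new MiniCodec[Int] {
    def encode(n: Int): Either[String, Vector[Byte]] =
      if (n >= 0 && n <= 255) Right(Vector(n.toByte))
      else Left(s"$n is out of range for uint8")
    def decode(bytes: Vector[Byte]): Either[String, (Vector[Byte], Int)] = bytes match {
      case head +: tail => Right((tail, head & 0xff))
      case _            => Left("cannot decode uint8 from empty input")
    }
  }

  final case class Age(years: Int)

  // The wire format is unchanged; only the Scala type differs.
  val age: MiniCodec[Age] = uint8.xmap[Age](n => Age(n), a => a.years)
}
```

Because both functions are total, errors can only come from the underlying codec, which is why out-of-range values still fail in the transformed codec.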
One of the simplest transformation methods is def withContext(context: String): Codec[A], which pushes the specified context string into any errors (i.e., Errs) returned from encode or decode.
See the methods on this trait for additional transformation types.
See the codecs package object for pre-defined codecs for many common data types and combinators for building larger codecs out of smaller ones.
The ~ operator supports combining a Codec[A] and a Codec[B] into a Codec[(A, B)].
For example:
val codec: Codec[Int ~ Int ~ Int] = uint8 ~ uint8 ~ uint8
Codecs generated with ~ result in left nested tuples. These left nested tuples can be pulled back apart by pattern matching with ~. For example:
Codec.decode(uint8 ~ uint8 ~ uint8, bytes) map { case a ~ b ~ c => a + b + c }
Alternatively, a function of N arguments can be lifted to a function of left-nested tuples. For example:
val add3 = (_: Int) + (_: Int) + (_: Int)
Codec.decode(uint8 ~ uint8 ~ uint8, bytes) map add3
Similarly, a left nested tuple can be created with the ~ operator. This is useful when creating the tuple structure to pass to encode. For example:
(uint8 ~ uint8 ~ uint8).encode(1 ~ 2 ~ 3)
Tuple based codecs are of limited use compared to HList based codecs, which are discussed later.
Note: this design is heavily based on Scala's parser combinator library and the syntax it provides.
Sometimes when combining codecs, a later codec depends on a previously decoded value. The flatZip method is important in these situations -- it represents a dependency between the left hand side and right hand side. Its signature is def flatZip[B](f: A => Codec[B]): Codec[(A, B)]. This is similar to flatMap except the return type is Codec[(A, B)] instead of Decoder[B].
Consider a binary format of an 8-bit unsigned integer indicating the number of bytes following it. To implement this with flatZip, we could write:
val x: Codec[(Int, ByteVector)] = uint8 flatZip { numBytes => bytes(numBytes) }
val y: Codec[ByteVector] = x.xmap[ByteVector]({ case (_, bv) => bv }, bv => (bv.size, bv))
In this example, x is a Codec[(Int, ByteVector)] but we do not need the size directly in the model because it is redundant with the size stored in the ByteVector. Hence, we remove the Int by xmap-ping over x. The notion of removing redundant data from models comes up frequently.
Note: there is a combinator that expresses this pattern more succinctly -- variableSizeBytes(uint8, bytes).
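To make the dependency concrete, here is a minimal sketch of the length-prefixed format using a toy model rather than scodec itself. MiniCodec, uint8, bytes, sized and payload are hypothetical names, and Vector[Byte] with Either[String, ...] stands in for the library's ByteVector and error types.

```scala
object FlatZipSketch {
  // Hypothetical toy codec, not scodec's API.
  trait MiniCodec[A] { self =>
    def encode(a: A): Either[String, Vector[Byte]]
    def decode(in: Vector[Byte]): Either[String, (Vector[Byte], A)]

    def xmap[B](f: A => B, g: B => A): MiniCodec[B] = new MiniCodec[B] {
      def encode(b: B): Either[String, Vector[Byte]] = self.encode(g(b))
      def decode(in: Vector[Byte]): Either[String, (Vector[Byte], B)] =
        self.decode(in).map { case (rest, a) => (rest, f(a)) }
    }

    // The codec for the right hand side is chosen from the left hand value.
    def flatZip[B](f: A => MiniCodec[B]): MiniCodec[(A, B)] = new MiniCodec[(A, B)] {
      def encode(ab: (A, B)): Either[String, Vector[Byte]] =
        for {
          left  <- self.encode(ab._1)
          right <- f(ab._1).encode(ab._2)
        } yield left ++ right
      def decode(in: Vector[Byte]): Either[String, (Vector[Byte], (A, B))] =
        for {
          ra <- self.decode(in)        // (remaining input, decoded A)
          rb <- f(ra._2).decode(ra._1) // decode B from what A left behind
        } yield (rb._1, (ra._2, rb._2))
    }
  }

  val uint8: MiniCodec[Int] = new MiniCodec[Int] {
    def encode(n: Int): Either[String, Vector[Byte]] =
      if (n >= 0 && n <= 255) Right(Vector(n.toByte)) else Left(s"$n is out of range for uint8")
    def decode(in: Vector[Byte]): Either[String, (Vector[Byte], Int)] = in match {
      case head +: tail => Right((tail, head & 0xff))
      case _            => Left("cannot decode uint8 from empty input")
    }
  }

  // Exactly n bytes.
  def bytes(n: Int): MiniCodec[Vector[Byte]] = new MiniCodec[Vector[Byte]] {
    def encode(bs: Vector[Byte]): Either[String, Vector[Byte]] =
      if (bs.size == n) Right(bs) else Left(s"expected $n bytes but got ${bs.size}")
    def decode(in: Vector[Byte]): Either[String, (Vector[Byte], Vector[Byte])] =
      if (in.size >= n) Right((in.drop(n), in.take(n)))
      else Left(s"expected $n bytes but only ${in.size} remain")
  }

  val sized: MiniCodec[(Int, Vector[Byte])] = uint8.flatZip(n => bytes(n))

  // Drop the redundant size from the model; it is recomputed when encoding.
  val payload: MiniCodec[Vector[Byte]] =
    sized.xmap[Vector[Byte]](pair => pair._2, bs => (bs.size, bs))
}
```

Decoding reads the size first and only then knows which codec to run for the rest, which is the dependency that a plain pairing of two independent codecs cannot express.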
HLists are similar to tuples in that they represent the product of an arbitrary number of types. That is, the size of an HList is known at compile time and the type of each element is also known at compile time. For more information on HLists in general, see Shapeless.
Codec makes heavy use of HLists. The primary operation is extending a Codec[L] for some L <: HList to a Codec[A :: L]. For example:
val uint8: Codec[Int] = ...
val string: Codec[String] = ...
val codec: Codec[Int :: Int :: String :: HNil] = uint8 :: uint8 :: string
The :: method is similar to cons-ing on to the HList, but it does so *inside* the Codec type. The resulting codec encodes values by passing each component of the HList to the corresponding codec and concatenating all of the results.
There are various methods on this trait that only work on Codec[L] for some L <: HList. Besides the aforementioned :: method, there are others like :::, flatPrepend, flatConcat, etc. One particularly useful method is dropUnits, which removes any Unit values from the HList.
Given a Codec[X0 :: X1 :: ... Xn :: HNil] and a case class with types X0 to Xn in the same order, the HList codec can be turned into a case class codec via the as method. For example:
case class Point(x: Int, y: Int, z: Int)
val threeInts: Codec[Int :: Int :: Int :: HNil] = uint8 :: uint8 :: uint8
val point: Codec[Point] = threeInts.as[Point]
The HList analog to flatZip is flatPrepend. It has the signature:
def flatPrepend[L <: HList](f: A => Codec[L]): Codec[A :: L]
It forms a codec of A consed on to L when called on a Codec[A] and passed a function A => Codec[L]. Note that the specified function must return an HList based codec. Implementing our example from earlier using flatPrepend:
val x: Codec[Int :: ByteVector :: HNil] = uint8 flatPrepend { numBytes => bytes(numBytes).hlist }
In this example, bytes(numBytes) returns a Codec[ByteVector], so we called .hlist on it to lift it into a Codec[ByteVector :: HNil].
There are similar methods for flat appending and flat concatenating.
Given some ordered list of types, potentially with duplicates, a value of the HList of those types has a value for *every* type in the list. In other words, an HList represents having an X0 AND X1 AND ... AND XN. A Coproduct for the same list of types represents having a value for *one* of those types. In other words, a Coproduct represents having an X0 OR X1 OR ... OR XN. This is somewhat imprecise because a coproduct can tell us exactly which Xi we have, even in the presence of duplicate types.
A coproduct can also be thought of as an Either that has an unlimited number of choices instead of just 2 choices.
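The Either analogy, including the point about duplicate types, can be illustrated with the standard library alone. This is only a model of the idea; Choice and index are hypothetical names and no Shapeless types are involved.

```scala
object CoproductSketch {
  // "Int OR Int OR String" as a nested Either; the nesting depth records which alternative is present.
  type Choice = Either[Int, Either[Int, String]]

  val first: Choice  = Left(5)
  val second: Choice = Right(Left(5))
  val third: Choice  = Right(Right("five"))

  // Position of the alternative that holds the value.
  def index(c: Choice): Int = c match {
    case Left(_)         => 0
    case Right(Left(_))  => 1
    case Right(Right(_)) => 2
  }
}
```

Although first and second both carry the Int 5, they are different values because they occupy different positions, which is the sense in which a coproduct knows exactly which Xi it holds.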
Shapeless represents coproducts in a similar way to HLists. A coproduct type is built using the :+: operator with a sentinel value of CNil. For example, an Int or Long or String is represented as the coproduct type:
Int :+: Long :+: String :+: CNil
For more information on coproducts in general, see Shapeless.
Like HList based codecs, scodec supports Coproduct based codecs by co-opting syntax from Shapeless. Specifically, the :+: operator is used:
val builder = uint8 :+: int64 :+: utf8
Unlike HList based codecs, the result of :+: is not a codec but rather a codecs.CoproductCodecBuilder. Having a list of types and a codec for each is not sufficient to build a coproduct codec. We also need to describe how each entry in the coproduct is differentiated from the other entries. There are a number of ways to do this and each way changes the binary format significantly. See the docs on CoproductCodecBuilder for details.
Codecs for case classes and sealed class hierarchies can often be automatically derived.
Consider this example:
import scodec.codecs.implicits._
case class Point(x: Int, y: Int, z: Int)
Codec[Point].encode(Point(1, 2, 3))
In this example, no explicit codec was defined for Point, yet Codec[Point] successfully created one. It did this by "reflecting" over the structure of Point and looking up a codec for each component type (note: no runtime reflection is performed; rather, this is implemented using macro-based compile time reflection). In this case, there are three components, each of type Int, so it looked for an implicit Codec[Int]. It then combined each Codec[Int] using an HList based codec and finally converted the HList codec to a Codec[Point]. It found the implicit Codec[Int] instances because scodec.codecs.implicits._ was imported. Furthermore, if there is an error encoding or decoding a field, the field name (i.e., x, y, or z) is included as context on the Err returned.
This works similarly for sealed class hierarchies -- each subtype is internally represented as a member of a coproduct. However, the following implicits must be in scope:
- Discriminated[A, D] for some discriminator type D, which provides the Codec[D] to use for encoding/decoding the discriminator
- Discriminator[A, X, D] for each subtype X of A, which provides the discriminator value for type X
- Codec[X] for each subtype X of A
Full examples are available in the test directory of this project.
Note that both case class and sealed hierarchies require implicit component codecs in scope. In both cases, those implicit codecs can themselves be automatically derived, although diverging implicit expansion errors often occur when recursively deriving codecs. These errors can be avoided by lifting derived codecs for the component types to implicit codecs like so:
case class Foo(x: Bar, y: Baz, ...)
implicit val codecBar = Codec.derive[Bar]
implicit val codecBaz = Codec.derive[Baz]
Codec.derive[Foo]
Codecs derived automatically are not defined implicitly -- meaning that if Codec.derive[Foo] returns a derived codec, that derived codec will not be available via implicitly[Codec[Foo]]. Instead, derived codecs are provided implicitly via the DerivedCodec witness. In fact, Codec.derive[A] is just an implicit summoning method for DerivedCodec[A].
When writing generic combinators that depend on implicitly available codecs, it is often useful to allow for fallback to a derived codec if there is no explicitly defined implicit codec available. This support is provided by the ImplicitCodec witness. Instead of requesting an implicit Codec[A], request an ImplicitCodec[A] to get the fallback-to-derived behavior.
Note: the decode function can be lifted to a state action via StateT[Err \/ ?, BitVector, A]. This type alias and its associated constructor are provided by DecodingContext.
Supports decoding a value of type A from a BitVector.
Provides functions for working with decoders.
Alias for state/either transformer that simplifies calling decode on a series of codecs, wiring the remaining bit vector of each into the next entry.
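The wiring of remainders can be sketched with a small hand-written state action. This is a simplified model and not the library's StateT-based alias: Decode, uint8 and rgb are hypothetical names, and Vector[Byte] with Either[String, ...] stands in for BitVector and Err \/ ?.

```scala
object DecodingSketch {
  type Bytes = Vector[Byte]

  // Hypothetical state action: consume some input, return what is left plus a value.
  final case class Decode[A](run: Bytes => Either[String, (Bytes, A)]) {
    def map[B](f: A => B): Decode[B] =
      Decode(in => run(in).map { case (rest, a) => (rest, f(a)) })
    // The remainder of this step becomes the input of the next step.
    def flatMap[B](f: A => Decode[B]): Decode[B] =
      Decode(in => run(in).flatMap { case (rest, a) => f(a).run(rest) })
  }

  val uint8: Decode[Int] = Decode[Int] {
    case head +: tail => Right((tail, head & 0xff))
    case _            => Left("no input left")
  }

  // Three decodes in a row; the remaining input is never named explicitly.
  val rgb: Decode[(Int, Int, Int)] =
    for {
      r <- uint8
      g <- uint8
      b <- uint8
    } yield (r, g, b)
}
```

Running rgb on four bytes consumes the first three and returns the fourth as the remainder, and a failure in any step short-circuits the rest.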
Wrapper for a codec that was automatically derived by compile time reflection of the structure of the type.
This type is not typically used directly. Rather, to get a derived codec, call Codec.derive[A]. This type is an implementation detail of Codec.derive.
See the docs for Codec.derive for more information.
Instances of this type can be used directly as a Codec[A]. There is no need to manually unwrap by calling d.codec.
Supports encoding a value of type A to a BitVector.
Provides functions for working with encoders.
Provides methods specific to decoders of Shapeless coproducts.
Provides methods specific to encoders of Shapeless coproducts.
Provides additional methods on HLists.
Describes an error.
An error has a message and a list of context identifiers that provide insight into where an error occurs in a large structure.
This type is not sealed so that codecs can return domain specific subtypes and dispatch on those subtypes.
Generalized codec that allows the type to encode to vary from the type to decode.
Provides common operations on a Codec[HList].
Type class that supports implicit lookup of implicit Codec instances with a fallback to automatically derived codecs.
When writing general combinators that depend on implicit codecs, it is generally best to use ImplicitCodec[A] as an implicit parameter, as opposed to using Codec[A] directly. The advantage of using ImplicitCodec[A] is that it allows for codecs that can be automatically derived via Codec.derive.
Defines derived codecs as a fallback to regular implicit Codec instances.
Typeclass that describes type constructors that support the exmap operation.
Provides method syntax for working with a type constructor that has a Transform typeclass instance.
Witness operation that supports transforming an F[A] to an F[B] for all F which have a Transform instance available.
Low priority Transformer builders.
Provides syntax related to generic programming for codecs of any type.
Provides HList related syntax for codecs of any type.
Companion for Codec.
Operations on coproducts from Shapeless 2.1 backported to 2.0.
Companion for Decoder.
Provides constructors for DecodingContext.
Companion for DerivedCodec.
Companion for Encoder.
Companion for Err.
Companion for GenCodec.
Operations on HLists that are not provided by Shapeless.
Companion for ImplicitCodec.
Companion for Transform.
Companion for Transformer.
Provides codecs for common types and combinators for building larger codecs.
The simplest of the provided codecs are those that encode/decode BitVectors and ByteVectors directly. These are provided by the bits and bytes methods. These codecs encode all of the bits/bytes directly into the result and decode *all* of the remaining bits/bytes into the result value. That is, decode always returns an empty bit vector for the remaining bits.
Similarly, fixed size alternatives are provided by the bits(size) and bytes(size) methods, which encode a fixed number of bits/bytes (or error if not provided the correct size) and decode a fixed number of bits/bytes (or error if that many bits/bytes are not available).
There are more specialized codecs for working with bits, including ignore and constant.
There are built-in codecs for Int, Long, Float, and Double.
There are a number of predefined integral codecs named using the form:
[u]int${size}[L]
where u stands for unsigned, size is replaced by one of 8, 16, 24, 32, 64, and L stands for little-endian. For each codec of that form, the type is Codec[Int] or Codec[Long] depending on the specified size.
For example, int32 supports 32-bit big-endian 2s complement signed integers, and uint16L supports 16-bit little-endian unsigned integers.
Note: uint64[L] are not provided because a 64-bit unsigned integer does not fit into a Long.
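The byte layouts these names refer to can be illustrated with java.nio.ByteBuffer from the standard library. This sketch does not use the codecs themselves; int32Bytes, uint16LBytes and readUint16L are hypothetical helper names chosen to mirror the naming scheme.

```scala
import java.nio.{ByteBuffer, ByteOrder}

object IntegerLayoutSketch {
  // 32-bit big-endian two's complement: most significant byte first.
  def int32Bytes(n: Int): Vector[Byte] =
    ByteBuffer.allocate(4).order(ByteOrder.BIG_ENDIAN).putInt(n).array().toVector

  // 16-bit little-endian unsigned: least significant byte first.
  def uint16LBytes(n: Int): Vector[Byte] = {
    require(n >= 0 && n <= 0xffff, s"$n does not fit in 16 unsigned bits")
    ByteBuffer.allocate(2).order(ByteOrder.LITTLE_ENDIAN).putShort(n.toShort).array().toVector
  }

  // Reading back: mask the signed Short to recover the unsigned value as an Int.
  def readUint16L(bytes: Vector[Byte]): Int =
    ByteBuffer.wrap(bytes.toArray).order(ByteOrder.LITTLE_ENDIAN).getShort() & 0xffff
}
```

The mask in readUint16L is the reason an unsigned 16-bit value needs an Int: the widest unsigned value does not fit in a signed Short, just as an unsigned 64-bit value does not fit in a Long.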
Additionally, methods of the form [u]int[L](size: Int) and [u]long[L](size: Int) exist to build arbitrarily sized codecs, within the limitations of Int and Long.
IEEE 754 floating point values are supported by the float, floatL, double, and doubleL codecs.
In addition to the numeric codecs, there are built-in codecs for Boolean, String, and UUID.
Boolean values are supported by the bool codecs.
There are a number of methods provided that create codecs out of other codecs. These include simple combinators such as fixedSizeBits and variableSizeBits and advanced combinators such as discriminated, which provides its own DSL for building a large codec out of many small codecs. For a list of all combinators, see the Combinators section below.
There are codecs that support working with encrypted data (encrypted), digital signatures and checksums (fixedSizeSignature and variableSizeSignature). Additionally, support for java.security.cert.Certificates is provided by certificate and x509Certificate.
Combinator library for working with binary data.
The primary abstraction of this library is Codec, which provides the ability to encode/decode values to/from binary.
There are more general abstractions though, such as Encoder and Decoder. There's also GenCodec which extends both Encoder and Decoder but allows the types to vary. Given these more general abstractions, a Codec[A] can be represented as a GenCodec[A, A].
The more general abstractions are important because they allow operations on codecs that would not otherwise be possible. For example, given a Codec[A], mapping a function A => B over the codec yields a GenCodec[A, B]. Without the more general abstractions, map is impossible to define (e.g., how would codec.map(f).encode(b) be implemented?). Given a GenCodec[A, B], the encoding functionality can be ignored by treating it as a Decoder[B], or the encoding type can be changed via contramap. If after further transformations, the two types to GenCodec are equal, we can reconstitute a Codec from the GenCodec by calling fuse.
See the codecs package object for pre-defined codecs for many common data types and combinators for building larger codecs out of smaller ones.
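The relationship between map, contramap and a codec whose two types agree can be sketched with a toy model. This is not the library's GenCodec: MiniGenCodec, MiniCodec, uint8 and Age are hypothetical names, and in this model a type alias plays the role that fuse plays in the library.

```scala
object GenCodecSketch {
  // Hypothetical toy: encodes an A, decodes a B.
  trait MiniGenCodec[A, B] { self =>
    def encode(a: A): Either[String, Vector[Byte]]
    def decode(in: Vector[Byte]): Either[String, (Vector[Byte], B)]

    // Changes only the decoded type; encoding still accepts an A.
    def map[C](f: B => C): MiniGenCodec[A, C] = new MiniGenCodec[A, C] {
      def encode(a: A): Either[String, Vector[Byte]] = self.encode(a)
      def decode(in: Vector[Byte]): Either[String, (Vector[Byte], C)] =
        self.decode(in).map { case (rest, b) => (rest, f(b)) }
    }

    // Changes only the encoded type; decoding still yields a B.
    def contramap[C](f: C => A): MiniGenCodec[C, B] = new MiniGenCodec[C, B] {
      def encode(c: C): Either[String, Vector[Byte]] = self.encode(f(c))
      def decode(in: Vector[Byte]): Either[String, (Vector[Byte], B)] = self.decode(in)
    }
  }

  // A codec is the special case where both types agree.
  type MiniCodec[A] = MiniGenCodec[A, A]

  val uint8: MiniCodec[Int] = new MiniGenCodec[Int, Int] {
    def encode(n: Int): Either[String, Vector[Byte]] =
      if (n >= 0 && n <= 255) Right(Vector(n.toByte)) else Left(s"$n is out of range for uint8")
    def decode(in: Vector[Byte]): Either[String, (Vector[Byte], Int)] = in match {
      case head +: tail => Right((tail, head & 0xff))
      case _            => Left("cannot decode uint8 from empty input")
    }
  }

  final case class Age(years: Int)

  // After map, the types differ: it encodes an Int but decodes an Age.
  val decodesAge: MiniGenCodec[Int, Age] = uint8.map(n => Age(n))

  // After contramap, the types agree again, so the result is usable as a codec.
  val age: MiniCodec[Age] = decodesAge.contramap[Age](a => a.years)
}
```

The intermediate value decodesAge shows why map alone cannot return a codec: there is no way to encode an Age until contramap supplies the function back to Int.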
For the categorically minded, note the following:
- Decoder is a monad
- Encoder is a contravariant functor
- GenCodec is a profunctor
- Codec is an invariant functor
Each type has the corresponding Scalaz typeclass defined in its companion object.