| Portability | GHC |
|---|---|
| Maintainer | Simon Meier <iridcode@gmail.com> |
Data.ByteString.Lazy.Builder.BasicEncoding
Contents
Description
This module provides the types of fixed-size and bounded-size encodings,
which are the basic building blocks for constructing Builders.
They are used for application specific performance tuning of Builders.
For example,
libraries such as blaze-html
or aeson use the functions provided by this module to implement efficient
encodings that combine escaping and character encoding.
We explain fixed-size and bounded-size encodings in three steps.
First, we define them formally.
Then, we explain how they can improve the performance of a Builder.
Finally, we give two examples to illustrate their use.
Fixed(-size) encodings are encodings that always result in a sequence of bytes
of a predetermined, fixed length.
An example for a fixed encoding is the big-endian encoding of a Word64,
which always results in exactly 8 bytes.
Bounded(-size) encodings are encodings that always result in a sequence
of bytes that is no larger than a predetermined bound.
An example for a bounded encoding is the UTF-8 encoding of a Char,
which results always in less or equal to 4 bytes.
Note that every fixed encoding is also a bounded encoding.
This module does not expose functions that exploit the special
properties of fixed-size encodings.
However,
they are exploited in the functions encodeSizePrefixed and encodeChunked
from Data.ByteString.Lazy.Builder.Extras, which prefix a Builder with
its (chunk) size.
In the following,
we therefore only refer to bounded encodings.
The goal of bounded encodings is to improve the performance of Builders.
These improvements stem from making the two
most common steps performed by a Builder more efficient.
The first most common step is the concatentation of two Builders.
Internally,
concatentation corresponds to function composition.
(Note that Builders can be seen as difference-lists
of buffer-filling functions;
cf. http://hackage.haskell.org/cgi-bin/hackage-scripts/package/dlist.
)
Function composition is a fast O(1) operation.
However,
we can use bounded encodings to
remove some of these function compositions altoghether,
which is obviously more efficient.
The second most common step performed by a Builder is to fill a buffer
using a bounded encoding,
which works as follows.
The Builder checks whether there is enough space left to
execute the bounded encoding.
If there is, then the Builder executes the bounded encoding
and calls the next Builder with the updated buffer.
Otherwise,
the Builder signals its driver that it requires a new buffer.
This buffer must be at least as large as the bound of the encoding.
We can use bounded encodings to reduce the number of buffer-free
checks by fusing the buffer-free checks of consecutive
Builders.
We can also use bounded encodings to simplify the control flow
for signalling that a buffer is full by
ensuring that we check first that there is enough space left
and only then decide on how to encode a given value.
Let us illustrate these improvements on the
CSV-table rendering example from Data.ByteString.Lazy.Builder.
Its "hot code" is the rendering of a table's cells,
which we implement as follows using only the functions from the
Builder API.
import Data.ByteString.Lazy.Builder as B import Data.ByteString.Lazy.Builder.ASCII as B renderCell :: Cell -> Builder renderCell (StringC cs) = renderString cs renderCell (IntC i) = B.intDec i renderString :: String -> Builder renderString cs = B.charUtf8 '"' <> foldMap escape cs <> B.charUtf8 '"' where escape '\\' = B.charUtf8 '\\' <> B.charUtf8 '\\' escape '\"' = B.charUtf8 '\\' <> B.charUtf8 '\"' escape c = B.charUtf8 c
Efficient encoding of Ints as decimal numbers is performed by intDec
from Data.ByteString.Lazy.Builder.ASCII.
Optimization potential exists for the escaping of Strings.
The above implementation has two optimization opportunities.
First,
we can use a single buffer free check for 4 bytes before escaping a
character and only then decide on how to escape the character.
This simplifies the control flow and allows to avoid a closure construction,
which improves the performance of the Builder.
Second,
the concatenations performed by foldMap can be eliminated.
The following implementation exploits these optimizations.
import qualified Data.ByteString.Lazy.Builder.BasicEncoding as E
import Data.ByteString.Lazy.Builder.BasicEncoding
( ifB, fromF, (>*<), (>$<) )
renderString :: String -> Builder
renderString cs =
B.charUtf8 '"' <> E.encodeListWithB escape cs <> B.charUtf8 '"'
where
escape :: E.BoundedEncoding Char
escape =
ifB (== '\\') (fixed2 ('\\', '\\')) $
ifB (== '\"') (fixed2 ('\\', '\"')) $
E.charUtf8
{-# INLINE fixed2 #-}
fixed2 x = fromF $ const x >$< E.char7 >*< E.char7
The code should be mostly self-explanatory.
The slightly awkward syntax is because the combinators
are written such that the sizeBound of the resulting BoundedEncoding
can be computed at compile time.
We also explicitly inline the fixed2 encoding,
which encodes a fixed tuple of characters,
to ensure that the bound compuation happens at compile time.
When encoding the following list of Strings,
the optimized implementation of renderString is two times faster.
maxiStrings :: [String] maxiStrings = take 1000 $ cycle ["hello", "\"1\"", "λ-wörld"]
Most of the performance gain stems from using encodeListWithB,
which encodes a list of values from left-to-right with a
BoundedEncoding.
It exploits the Builder internals to avoid unnecessary function
compositions (i.e., concatentations).
In the future,
we would expect the compiler to perform the optimizations
implemented in encodeListWithB by himself.
However,
it seems that the code is currently to complicated for the
compiler to see through.
Therefore,
we provide the BoundedEncoding escape hatch,
which allows data structures to provide very efficient encoding traversals,
like encodeListWithB for lists.
Note that BoundedEncodings are a bit verbose, but quite versatile.
Here is an example of a BoundedEncoding for combined HTML escaping and
UTF-8 encoding.
It exploits that the escaped character with the maximal Unicode
codepoint is '>'.
{-# INLINE charUtf8HtmlEscaped #-}
charUtf8HtmlEscaped :: E.BoundedEncoding Char
charUtf8HtmlEscaped =
ifB (> '>' ) E.charUtf8 $
ifB (== '<' ) (fixed4 ('&',('l',('t',';')))) $ -- <
ifB (== '>' ) (fixed4 ('&',('g',('t',';')))) $ -- >
ifB (== '&' ) (fixed5 ('&',('a',('m',('p',';'))))) $ -- &
ifB (== '"' ) (fixed5 ('&',('#',('3',('4',';'))))) $ -- "
ifB (== '\'') (fixed5 ('&',('#',('3',('9',';'))))) $ -- '
(fromF E.char7) -- fallback for Chars smaller than '>'
where
{-# INLINE fixed4 #-}
fixed4 x = fromF $ const x >$<
E.char7 >*< E.char7 >*< E.char7 >*< E.char7
{-# INLINE fixed5 #-}
fixed5 x = fromF $ const x >$<
E.char7 >*< E.char7 >*< E.char7 >*< E.char7 >*< E.char7
- data FixedEncoding a
- size :: FixedEncoding a -> Int
- emptyF :: FixedEncoding a
- pairF :: FixedEncoding a -> FixedEncoding b -> FixedEncoding (a, b)
- contramapF :: (b -> a) -> FixedEncoding a -> FixedEncoding b
- encodeWithF :: FixedEncoding a -> a -> Builder
- encodeListWithF :: FixedEncoding a -> [a] -> Builder
- encodeUnfoldrWithF :: FixedEncoding b -> (a -> Maybe (b, a)) -> a -> Builder
- encodeByteStringWithF :: FixedEncoding Word8 -> ByteString -> Builder
- encodeLazyByteStringWithF :: FixedEncoding Word8 -> ByteString -> Builder
- data BoundedEncoding a
- sizeBound :: BoundedEncoding a -> Int
- fromF :: FixedEncoding a -> BoundedEncoding a
- emptyB :: BoundedEncoding a
- pairB :: BoundedEncoding a -> BoundedEncoding b -> BoundedEncoding (a, b)
- eitherB :: BoundedEncoding a -> BoundedEncoding b -> BoundedEncoding (Either a b)
- ifB :: (a -> Bool) -> BoundedEncoding a -> BoundedEncoding a -> BoundedEncoding a
- contramapB :: (b -> a) -> BoundedEncoding a -> BoundedEncoding b
- (>$<) :: Contravariant f => (a -> b) -> f b -> f a
- encodeWithB :: BoundedEncoding a -> a -> Builder
- encodeListWithB :: BoundedEncoding a -> [a] -> Builder
- encodeUnfoldrWithB :: BoundedEncoding b -> (a -> Maybe (b, a)) -> a -> Builder
- encodeByteStringWithB :: BoundedEncoding Word8 -> ByteString -> Builder
- encodeLazyByteStringWithB :: BoundedEncoding Word8 -> ByteString -> Builder
- int8 :: FixedEncoding Int8
- word8 :: FixedEncoding Word8
- int16BE :: FixedEncoding Int16
- int32BE :: FixedEncoding Int32
- int64BE :: FixedEncoding Int64
- word16BE :: FixedEncoding Word16
- word32BE :: FixedEncoding Word32
- word64BE :: FixedEncoding Word64
- floatBE :: FixedEncoding Float
- doubleBE :: FixedEncoding Double
- int16LE :: FixedEncoding Int16
- int32LE :: FixedEncoding Int32
- int64LE :: FixedEncoding Int64
- word16LE :: FixedEncoding Word16
- word32LE :: FixedEncoding Word32
- word64LE :: FixedEncoding Word64
- floatLE :: FixedEncoding Float
- doubleLE :: FixedEncoding Double
- intHost :: FixedEncoding Int
- int16Host :: FixedEncoding Int16
- int32Host :: FixedEncoding Int32
- int64Host :: FixedEncoding Int64
- wordHost :: FixedEncoding Word
- word16Host :: FixedEncoding Word16
- word32Host :: FixedEncoding Word32
- word64Host :: FixedEncoding Word64
- floatHost :: FixedEncoding Float
- doubleHost :: FixedEncoding Double
- word8Base128LE :: BoundedEncoding Word8
- word16Base128LE :: BoundedEncoding Word16
- word32Base128LE :: BoundedEncoding Word32
- word64Base128LE :: BoundedEncoding Word64
- wordBase128LE :: BoundedEncoding Word
- int8ZigZagBase128LE :: BoundedEncoding Int8
- int16ZigZagBase128LE :: BoundedEncoding Int16
- int32ZigZagBase128LE :: BoundedEncoding Int32
- int64ZigZagBase128LE :: BoundedEncoding Int64
- intZigZagBase128LE :: BoundedEncoding Int
- char7 :: FixedEncoding Char
- int8Dec :: BoundedEncoding Int8
- int16Dec :: BoundedEncoding Int16
- int32Dec :: BoundedEncoding Int32
- int64Dec :: BoundedEncoding Int64
- intDec :: BoundedEncoding Int
- word8Dec :: BoundedEncoding Word8
- word16Dec :: BoundedEncoding Word16
- word32Dec :: BoundedEncoding Word32
- word64Dec :: BoundedEncoding Word64
- wordDec :: BoundedEncoding Word
- word8Hex :: BoundedEncoding Word8
- word16Hex :: BoundedEncoding Word16
- word32Hex :: BoundedEncoding Word32
- word64Hex :: BoundedEncoding Word64
- wordHex :: BoundedEncoding Word
- int8HexFixed :: FixedEncoding Int8
- int16HexFixed :: FixedEncoding Int16
- int32HexFixed :: FixedEncoding Int32
- int64HexFixed :: FixedEncoding Int64
- word8HexFixed :: FixedEncoding Word8
- word16HexFixed :: FixedEncoding Word16
- word32HexFixed :: FixedEncoding Word32
- word64HexFixed :: FixedEncoding Word64
- floatHexFixed :: FixedEncoding Float
- doubleHexFixed :: FixedEncoding Double
- char8 :: FixedEncoding Char
- charUtf8 :: BoundedEncoding Char
- evalF :: FixedEncoding a -> a -> [Word8]
- evalB :: BoundedEncoding a -> a -> [Word8]
- showF :: FixedEncoding a -> a -> String
- showB :: BoundedEncoding a -> a -> String
Fixed-size encodings
data FixedEncoding a Source
An encoding that always results in a sequence of bytes of a pre-determined, fixed size.
Instances
| Contravariant FixedEncoding |
size :: FixedEncoding a -> IntSource
The fixed size of the sequences of bytes generated by this
FixedEncoding.
Combinators
The combinators for FixedEncodings are implemented such that the size
of the resulting FixedEncoding is computed at compile time.
emptyF :: FixedEncoding aSource
The FixedEncoding that always results in the zero-length sequence.
pairF :: FixedEncoding a -> FixedEncoding b -> FixedEncoding (a, b)Source
Encode a pair by encoding its first component and then its second component.
contramapF :: (b -> a) -> FixedEncoding a -> FixedEncoding bSource
Change an encoding such that it first applies a function to the value to be encoded.
Note that encodings are Contrafunctors
http://hackage.haskell.org/package/contravariant.
The following laws hold.
contramapF id = id contramapF f . contramapF g = contramapF (g . f)
Builder construction
In terms of expressivity, the function encodeWithF would be sufficient
for constructing Builders from FixedEncodings. The fused variants of
this function are provided because they allow for more efficient
implementations. Our compilers are just not smart enough yet; and for some
of the employed optimizations (see the code of encodeByteStringWithF)
they will very likely never be.
Note that functions marked with "Heavy inlining." are forced to be
inlined because they must be specialized for concrete encodings,
but are rather heavy in terms of code size. We recommend to define a
top-level function for every concrete instantiation of such a function in
order to share its code. A typical example is the function
byteStringHexFixed from Data.ByteString.Lazy.Builder.ASCII, which is
implemented as follows.
{-# NOINLINE byteStringHexFixed #-}
byteStringHexFixed :: S.ByteString -> Builder
byteStringHexFixed = encodeByteStringWithF word8HexFixed
encodeWithF :: FixedEncoding a -> a -> BuilderSource
Encode a value with a FixedEncoding.
encodeListWithF :: FixedEncoding a -> [a] -> BuilderSource
Encode a list of values from left-to-right with a FixedEncoding.
encodeUnfoldrWithF :: FixedEncoding b -> (a -> Maybe (b, a)) -> a -> BuilderSource
Encode a list of values represented as an unfoldr with a FixedEncoding.
encodeByteStringWithF :: FixedEncoding Word8 -> ByteString -> BuilderSource
Heavy inlining. Encode all bytes of a strict ByteString from
left-to-right with a FixedEncoding. This function is quite versatile. For
example, we can use it to construct a Builder that maps every byte before
copying it to the buffer to be filled.
mapToBuilder :: (Word8 -> Word8) -> S.ByteString -> Builder mapToBuilder f =encodeByteStringWithF(contramapFfword8)
We can also use it to hex-encode a strict ByteString as shown in the
byteStringHexFixed example above.
encodeLazyByteStringWithF :: FixedEncoding Word8 -> ByteString -> BuilderSource
Heavy inlining. Encode all bytes of a lazy ByteString from
left-to-right with a FixedEncoding.
Bounded-size encodings
data BoundedEncoding a Source
An encoding that always results in sequence of bytes that is no longer than a pre-determined bound.
Instances
| Contravariant BoundedEncoding |
sizeBound :: BoundedEncoding a -> IntSource
The bound on the size of sequences of bytes generated by this
BoundedEncoding.
Combinators
The combinators for BoundedEncodings are implemented such that the
sizeBound of the resulting BoundedEncoding is computed at compile time.
fromF :: FixedEncoding a -> BoundedEncoding aSource
Convert a FixedEncoding to a BoundedEncoding.
emptyB :: BoundedEncoding aSource
The BoundedEncoding that always results in the zero-length sequence.
pairB :: BoundedEncoding a -> BoundedEncoding b -> BoundedEncoding (a, b)Source
Encode a pair by encoding its first component and then its second component.
eitherB :: BoundedEncoding a -> BoundedEncoding b -> BoundedEncoding (Either a b)Source
Encode an Either value using the first BoundedEncoding for Left
values and the second BoundedEncoding for Right values.
Note that the functions eitherB, pairB, and contramapB (written below
using >$<) suffice to construct BoundedEncodings for all non-recursive
algebraic datatypes. For example,
maybeB :: BoundedEncoding () -> BoundedEncoding a -> BoundedEncoding (Maybe a) maybeB nothing just =maybe(Left ()) Right>$<eitherB nothing just
ifB :: (a -> Bool) -> BoundedEncoding a -> BoundedEncoding a -> BoundedEncoding aSource
Conditionally select a BoundedEncoding.
For example, we can implement the ASCII encoding that drops characters with
Unicode codepoints above 127 as follows.
charASCIIDrop =ifB(< '\128') (fromFchar7)emptyB
contramapB :: (b -> a) -> BoundedEncoding a -> BoundedEncoding bSource
Change a BoundedEncoding such that it first applies a function to the
value to be encoded.
Note that BoundedEncodings are Contrafunctors
http://hackage.haskell.org/package/contravariant.
The following laws hold.
contramapB id = id contramapB f . contramapB g = contramapB (g . f)
We provide an overloaded operator for the contramapF and contramapB
combinators to allow for a more convenient syntax. The operator is
source-compatible with the one provided by the contravariant library.
Once this library is part of the Haskell platform, we will make
FixedEncoding and BoundedEncoding instances of its Contravariant
type-class.
(>$<) :: Contravariant f => (a -> b) -> f b -> f aSource
An overloaded infix operator for contramapF and contramapB.
We can use it for example to prepend and/or append fixed values to an
encoding. The following implementation surrounds an UTF-8 encoded character
with singlequotes; e.g., .
showF singleQuotedUtf8 'a' = "'a'"
singleQuotedUtf8 ::FixedEncodingCharsingleQuotedUtf8 = ((x -> ('\'', (x, '\'')))>$<pairFchar7(pairFcharUtf8char7)
Note that the rather verbose syntax for composition stems from the
requirement to be able to compute the sizes and sizeBounds at
compile time.
Builder construction
encodeWithB :: BoundedEncoding a -> a -> BuilderSource
Encode a value with a BoundedEncoding.
Note that consecutive uses of encodeWithB and encodeWithF are rewritten
such that their bounds-checks are fused; i.e., we rewrite using the rules
encodeWithF = encodeWithB . fromF
encodeWithB be1 x1 `mappend` (encodeWithB be2 x2) = encodeWithB (pairB be1 be2) (x1, x2)
For example,
encodeWithB (word32 x1) `mappend` encodeWithB (word32 x2)
is rewritten such that the resulting Builder checks only once, if ther are
at least 8 free bytes, instead of checking twice, if there are 4 free bytes.
This rewrite rule is not observationally equivalent, as it may change the
boundaries of the generated chunks. We deem this acceptable, as for all
use-cases of Builders known to us the precise location of chunk
boundaries does not matter.
encodeListWithB :: BoundedEncoding a -> [a] -> BuilderSource
Encodes a list of values from left-to-right using a BoundedEncoding.
encodeUnfoldrWithB :: BoundedEncoding b -> (a -> Maybe (b, a)) -> a -> BuilderSource
Create a Builder that encodes a list of values represented as an
unfoldr with a BoundedEncoding.
encodeByteStringWithB :: BoundedEncoding Word8 -> ByteString -> BuilderSource
Heavy inlining. Encode all bytes of a strict ByteString from
left-to-right with a BoundedEncoding.
For example, we can use this function to construct a Builder that filters
every byte before copying it to the buffer to be filled.
filterToBuilder :: (Word8 -> Bool) -> S.ByteString -> Builder filterToBuilder p =encodeByteStringWithB(ifBp (fromFword8)emptyB)
encodeLazyByteStringWithB :: BoundedEncoding Word8 -> ByteString -> BuilderSource
Heavy inlining. Encode all bytes of a lazy ByteString from
left-to-right with a BoundedEncoding.
Standard encodings of Haskell values
Binary encodings
int8 :: FixedEncoding Int8Source
Encoding single signed bytes as-is.
word8 :: FixedEncoding Word8Source
Encoding single unsigned bytes as-is.
Fixed-length
Big-endian
int16BE :: FixedEncoding Int16Source
Encoding Int16s in big endian format.
int32BE :: FixedEncoding Int32Source
Encoding Int32s in big endian format.
int64BE :: FixedEncoding Int64Source
Encoding Int64s in big endian format.
word16BE :: FixedEncoding Word16Source
Encoding Word16s in big endian format.
word32BE :: FixedEncoding Word32Source
Encoding Word32s in big endian format.
word64BE :: FixedEncoding Word64Source
Encoding Word64s in big endian format.
floatBE :: FixedEncoding FloatSource
Encode an IEEE Float in big endian format.
doubleBE :: FixedEncoding DoubleSource
Encode an IEEE Double in big endian format.
Little-endian
int16LE :: FixedEncoding Int16Source
Encoding Int16s in little endian format.
int32LE :: FixedEncoding Int32Source
Encoding Int32s in little endian format.
int64LE :: FixedEncoding Int64Source
Encoding Int64s in little endian format.
word16LE :: FixedEncoding Word16Source
Encoding Word16s in little endian format.
word32LE :: FixedEncoding Word32Source
Encoding Word32s in little endian format.
word64LE :: FixedEncoding Word64Source
Encoding Word64s in little endian format.
floatLE :: FixedEncoding FloatSource
Encode an IEEE Float in little endian format.
doubleLE :: FixedEncoding DoubleSource
Encode an IEEE Double in little endian format.
Non-portable, host-dependent
intHost :: FixedEncoding IntSource
Encode a single native machine Int. The Ints is encoded in host order,
host endian form, for the machine you are on. On a 64 bit machine the Int
is an 8 byte value, on a 32 bit machine, 4 bytes. Values encoded this way
are not portable to different endian or integer sized machines, without
conversion.
int16Host :: FixedEncoding Int16Source
Encoding Int16s in native host order and host endianness.
int32Host :: FixedEncoding Int32Source
Encoding Int32s in native host order and host endianness.
int64Host :: FixedEncoding Int64Source
Encoding Int64s in native host order and host endianness.
wordHost :: FixedEncoding WordSource
Encode a single native machine Word. The Words is encoded in host order,
host endian form, for the machine you are on. On a 64 bit machine the Word
is an 8 byte value, on a 32 bit machine, 4 bytes. Values encoded this way
are not portable to different endian or word sized machines, without
conversion.
word16Host :: FixedEncoding Word16Source
Encoding Word16s in native host order and host endianness.
word32Host :: FixedEncoding Word32Source
Encoding Word32s in native host order and host endianness.
word64Host :: FixedEncoding Word64Source
Encoding Word64s in native host order and host endianness.
floatHost :: FixedEncoding FloatSource
Encode an IEEE Float in native host order and host endianness. Values written
this way are not portable to different endian machines, without conversion.
doubleHost :: FixedEncoding DoubleSource
Encode an IEEE Double in native host order and host endianness.
Variable-length
Little-endian, base-128
word8Base128LE :: BoundedEncoding Word8Source
Variable-length, little-endian, base-128 encoding of a Word8.
word16Base128LE :: BoundedEncoding Word16Source
Variable-length, little-endian, base-128 encoding of a Word16.
word32Base128LE :: BoundedEncoding Word32Source
Variable-length, little-endian, base-128 encoding of a Word32.
word64Base128LE :: BoundedEncoding Word64Source
Variable-length, little-endian, base-128 encoding of a Word64.
wordBase128LE :: BoundedEncoding WordSource
Variable-length, little-endian, base-128 encoding of a Word.
Note that in contrast to the fixed-width binary encoding of a Word,
whose width depends on the register-width of a machine,
this encoding is machine-independent for values small enough to
be represented using a Word on all relevant machines.
Little-endian, base-128, zig-zag
int8ZigZagBase128LE :: BoundedEncoding Int8Source
Variable-length, little-endian, base-128, zig-zag encoding of an Int8.
int16ZigZagBase128LE :: BoundedEncoding Int16Source
Variable-length, little-endian, base-128, zig-zag encoding of an Int16.
int32ZigZagBase128LE :: BoundedEncoding Int32Source
Variable-length, little-endian, base-128, zig-zag encoding of an Int32.
int64ZigZagBase128LE :: BoundedEncoding Int64Source
Variable-length, little-endian, base-128, zig-zag encoding of an Int64.
intZigZagBase128LE :: BoundedEncoding IntSource
Variable-length, little-endian, base-128, zig-zag encoding of an Int.
Note that in contrast to the fixed-width binary encoding of an Int,
whose width depends on the register-width of a machine,
this encoding is machine-independent for values small enough to
be represented using an Int on all relevant machines.
Character encodings
ASCII
char7 :: FixedEncoding CharSource
Encode the least 7-bits of a Char using the ASCII encoding.
Decimal numbers
Decimal encoding of numbers using ASCII encoded characters.
int8Dec :: BoundedEncoding Int8Source
Decimal encoding of an Int8.
int16Dec :: BoundedEncoding Int16Source
Decimal encoding of an Int16.
int32Dec :: BoundedEncoding Int32Source
Decimal encoding of an Int32.
int64Dec :: BoundedEncoding Int64Source
Decimal encoding of an Int64.
intDec :: BoundedEncoding IntSource
Decimal encoding of an Int.
word8Dec :: BoundedEncoding Word8Source
Decimal encoding of a Word8.
word16Dec :: BoundedEncoding Word16Source
Decimal encoding of a Word16.
word32Dec :: BoundedEncoding Word32Source
Decimal encoding of a Word32.
word64Dec :: BoundedEncoding Word64Source
Decimal encoding of a Word64.
wordDec :: BoundedEncoding WordSource
Decimal encoding of a Word.
Hexadecimal numbers
Encoding positive integers as hexadecimal numbers using lower-case ASCII characters. The shortest possible representation is used. For example,
showB word16Hex 0x0a10 = "a10"
Note that there is no support for using upper-case characters. Please contact the maintainer if your application cannot work without hexadecimal encodings that use upper-case characters.
word8Hex :: BoundedEncoding Word8Source
Hexadecimal encoding of a Word8.
word16Hex :: BoundedEncoding Word16Source
Hexadecimal encoding of a Word16.
word32Hex :: BoundedEncoding Word32Source
Hexadecimal encoding of a Word32.
word64Hex :: BoundedEncoding Word64Source
Hexadecimal encoding of a Word64.
wordHex :: BoundedEncoding WordSource
Hexadecimal encoding of a Word.
Fixed-width hexadecimal numbers
Encoding the bytes of fixed-width types as hexadecimal numbers using lower-case ASCII characters. For example,
showF word16HexFixed 0x0a10 = "0a10"
int8HexFixed :: FixedEncoding Int8Source
Encode a Int8 using 2 nibbles (hexadecimal digits).
int16HexFixed :: FixedEncoding Int16Source
Encode a Int16 using 4 nibbles.
int32HexFixed :: FixedEncoding Int32Source
Encode a Int32 using 8 nibbles.
int64HexFixed :: FixedEncoding Int64Source
Encode a Int64 using 16 nibbles.
word8HexFixed :: FixedEncoding Word8Source
Encode a Word8 using 2 nibbles (hexadecimal digits).
word16HexFixed :: FixedEncoding Word16Source
Encode a Word16 using 4 nibbles.
word32HexFixed :: FixedEncoding Word32Source
Encode a Word32 using 8 nibbles.
word64HexFixed :: FixedEncoding Word64Source
Encode a Word64 using 16 nibbles.
floatHexFixed :: FixedEncoding FloatSource
Encode an IEEE Float using 8 nibbles.
doubleHexFixed :: FixedEncoding DoubleSource
Encode an IEEE Double using 16 nibbles.
ISO/IEC 8859-1 (Char8)
The ISO/IEC 8859-1 encoding is an 8-bit encoding often known as Latin-1.
The Char8 encoding implemented here works by truncating the Unicode
codepoint to 8-bits and encoding them as a single byte. For the codepoints
0-255 this corresponds to the ISO/IEC 8859-1 encoding. Note that the
Char8 encoding is equivalent to the ASCII encoding on the Unicode
codepoints 0-127. Hence, functions such as intDec can also be used for
encoding Ints as a decimal number with Char8 encoded characters.
char8 :: FixedEncoding CharSource
Char8 encode a Char.
UTF-8
The UTF-8 encoding can encode all Unicode codepoints.
It is equivalent to the ASCII encoding on the Unicode codepoints 0-127.
Hence, functions such as intDec can also be used for encoding Ints as
a decimal number with UTF-8 encoded characters.
charUtf8 :: BoundedEncoding CharSource
UTF-8 encode a Char.
Testing support
The following four functions are intended for testing use
only. They are not efficient. FixedEncodings and BoundedEncodings
are efficently executed by creating Builders from them using the
encodeXXX functions explained at the top of this module.
evalF :: FixedEncoding a -> a -> [Word8]Source
For testing use only. Evaluate a FixedEncoding on a given value.
evalB :: BoundedEncoding a -> a -> [Word8]Source
For testing use only. Evaluate a BoundedEncoding on a given value.
showF :: FixedEncoding a -> a -> StringSource
For testing use only. Show the result of a FixedEncoding of a given
value as a String by interpreting the resulting bytes as Unicode
codepoints.
showB :: BoundedEncoding a -> a -> StringSource
For testing use only. Show the result of a BoundedEncoding of a given
value as a String by interpreting the resulting bytes as Unicode
codepoints.