The Haskell ecosystem’s preferred way of representing binary data is the ByteString type. The bytestring package introduces its two variants like so:

  • Strict ByteStrings keep the string as a single large array. This makes them convenient for passing data between C and Haskell.
  • Lazy ByteStrings use a lazy list of strict chunks, which makes them suitable for I/O streaming tasks.

Otherwise, it expresses little preference between strict ByteStrings and lazy ByteStrings.
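
To make the two representations concrete, here is a minimal sketch using only the bytestring package. The names strictGreeting and lazyGreeting are made up for illustration, and the lazy value is built from explicit chunks so that its structure is visible:

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy as BL

-- A strict ByteString: one contiguous buffer.
strictGreeting :: BS.ByteString
strictGreeting = BC.pack "hello, world"

-- A lazy ByteString: here assembled by hand from three strict chunks.
lazyGreeting :: BL.ByteString
lazyGreeting = BL.fromChunks (map BC.pack ["hello", ", ", "world"])

main :: IO ()
main = do
  print (BS.length strictGreeting)                    -- 12
  print (length (BL.toChunks lazyGreeting))           -- 3
  print (BL.toStrict lazyGreeting == strictGreeting)  -- True
```

Both values hold the same twelve bytes; only the layout in memory differs.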

The broader library ecosystem often offers functions for working with both lazy and strict ByteStrings. For example, aeson exposes decode, which decodes a JSON value from a lazy ByteString, in addition to a strict variant, decodeStrict. From this, it’s reasonable to assume that lazy ByteString is the default, because aeson exposes the lazy version under the plain name decode, while relegating the strict one to a section titled “Variants for strict bytestrings”.
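
As a rough illustration of that naming, the sketch below calls both aeson functions on the same JSON document. It assumes the aeson package is available; lazyInput and strictInput are hypothetical names, and OverloadedStrings is used only to write the literals:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson (Value, decode, decodeStrict)
import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BL

-- The same JSON document in both representations.
lazyInput :: BL.ByteString
lazyInput = "{\"name\": \"bytestring\"}"

strictInput :: BS.ByteString
strictInput = "{\"name\": \"bytestring\"}"

main :: IO ()
main = do
  -- decode takes a lazy ByteString ...
  print (decode lazyInput :: Maybe Value)
  -- ... while decodeStrict takes a strict one.
  print (decodeStrict strictInput :: Maybe Value)
  -- Both parse to the same Value.
  print ((decode lazyInput :: Maybe Value) == decodeStrict strictInput)
```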

However, as a general rule, I recommend sticking to strict ByteStrings as a default:

  • more memory efficient: lazy ByteStrings include extra bookkeeping overhead for maintaining their list of strict ByteString chunks.
  • reads are faster: in a strict ByteString, reading any position in the string is just reading from an offset in memory. For a lazy ByteString, it’s necessary to follow pointers from chunk to chunk to reach values later in the bytestring.
  • O(1) conversion to lazy ByteStrings: a strict ByteString doesn’t need to be split into smaller chunks, so converting it to a lazy one simply reuses the existing memory as a single chunk. Going the other way, converting a lazy ByteString to a strict one is O(n), since the lazy ByteString’s chunks must be copied into a single contiguous block of memory.
  • avoid lazy IO: while it often seems convenient, lazy IO leads to hard-to-diagnose bugs, like IOExceptions being thrown from pure code. You’re generally better off using a streaming framework like conduit, pipes or streamly together with strict ByteStrings.
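
The conversion and indexing points can be sketched with the bytestring package alone. The helper names asLazy and asStrict are hypothetical wrappers around the real fromStrict and toStrict functions from Data.ByteString.Lazy:

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy as BL

-- Hypothetical helper: O(1), wraps the existing buffer as a single chunk.
asLazy :: BS.ByteString -> BL.ByteString
asLazy = BL.fromStrict

-- Hypothetical helper: O(n), copies all chunks into one new contiguous buffer.
asStrict :: BL.ByteString -> BS.ByteString
asStrict = BL.toStrict

main :: IO ()
main = do
  let strict = BC.pack "strict by default"
      lazy   = asLazy strict
  -- The lazy view of a strict ByteString is exactly one chunk.
  print (length (BL.toChunks lazy))             -- 1
  -- Indexing works on both; the lazy version has to walk the chunk list first.
  print (BS.index strict 7 == BL.index lazy 7)  -- True
  -- Round-tripping preserves the bytes.
  print (asStrict lazy == strict)               -- True
```

Keeping data strict and calling fromStrict at the boundary is cheap, which is why an API that only accepts lazy ByteStrings is rarely a reason to store lazy values yourself.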

Output-oriented APIs, like serialization, are the one scenario where lazy ByteStrings make good sense:

  • builders produce lazy ByteStrings: Data.ByteString.Builder only exposes toLazyByteString for converting a Builder into an actual ByteString. It takes an O(n) conversion to turn the result into a strict ByteString.
  • concatenation of lazy ByteStrings is more efficient: a lazy ByteString’s list of chunks allows appending new chunks without copying entire buffers of data. Where possible, though, use a builder instead of concatenating lazy ByteStrings directly, to avoid creating lots of small, inefficient chunks!
  • APIs for encoding/serializing often produce lazy ByteStrings: because of the above, it’s common for serialization operations like aeson’s encode to only expose versions producing a lazy ByteString.
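
A small sketch of the output-oriented case, using Data.ByteString.Builder. The "key=value" line format and the names renderPair and renderAll are invented for illustration and not taken from any particular library:

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Lazy as BL

-- Hypothetical record format: one "key=value" line per pair.
renderPair :: (String, Int) -> B.Builder
renderPair (key, value) =
  B.string7 key <> B.char7 '=' <> B.intDec value <> B.char7 '\n'

-- Builders compose with (<>) and only materialize bytes when run.
renderAll :: [(String, Int)] -> BL.ByteString
renderAll = B.toLazyByteString . foldMap renderPair

main :: IO ()
main = do
  let out = renderAll [("a", 1), ("b", 22)]
  BL.putStr out                        -- writes the two lines
  -- The extra O(n) copy is only needed if a strict result is required.
  print (BS.length (BL.toStrict out))  -- 9
```

If the result is going straight to a handle or a socket, the lazy ByteString can be written out as is and the toStrict copy skipped entirely.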