A new architecture for Cabal

Haskell Ecosystem Workshop

Sam Derbyshire, Well-Typed

June 7th, 2024

Plan

  • Cabal as a packaging framework: Common Architecture for Building Applications and Libraries.
  • The Setup.hs interface and its shortcomings.
  • The Cabal library interface, and how to leverage it (cabal-install, HLS).

Background on Cabal: framework and library

The Cabal specification (2005) was designed to allow Haskell tool authors to package their code and share it with other developers.


The Haskell Package System (Cabal) has the following main goal:

  • to specify a standard way in which a Haskell tool can be packaged, so that it is easy for consumers to use it, or re-package it, regardless of the Haskell implementation or installation platform.

Cabal packages

  • unit of distribution in source format
  • enough metadata to turn it into a system package
    • package name and version number
    • dependencies (e.g. base >= 4.17 && < 4.21, lens ^>= 5.3)
    • exposed API (libraries, exposed modules, executables…)
    • how to build (e.g. build-type: Simple)
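
The dependency bounds above can be read as predicates on version numbers. The following is a minimal sketch of that reading, not the Cabal library's own implementation: the names `Range`, `satisfies` and `caretUpperBound` are made up for illustration, and only the operators used on this slide are modelled.

```haskell
-- A version is a list of numeric components: 4.17 is [4, 17].
-- Haskell's list ordering is lexicographic, so [4, 17] < [4, 17, 0] < [4, 18].
type Version = [Int]

-- A small subset of the .cabal version-range syntax (hypothetical type).
data Range
  = AtLeast Version   -- >= v
  | Below Version     -- <  v
  | Caret Version     -- ^>= v
  | Both Range Range  -- r1 && r2

-- Exclusive upper bound implied by ^>= v: keep the first component and
-- bump the second, so ^>= 5.3 stops below 5.4 and ^>= 5 stops below 5.1.
caretUpperBound :: Version -> Version
caretUpperBound (a : b : _) = [a, b + 1]
caretUpperBound [a]         = [a, 1]
caretUpperBound []          = []  -- empty versions are invalid; nothing is below []

-- Does a concrete version fall inside a range?
satisfies :: Version -> Range -> Bool
satisfies v (AtLeast lo) = v >= lo
satisfies v (Below hi)   = v < hi
satisfies v (Caret lo)   = v >= lo && v < caretUpperBound lo
satisfies v (Both r1 r2) = satisfies v r1 && satisfies v r2

-- The two bounds quoted on the slide.
baseRange, lensRange :: Range
baseRange = Both (AtLeast [4, 17]) (Below [4, 21])  -- base >= 4.17 && < 4.21
lensRange = Caret [5, 3]                            -- lens ^>= 5.3
```

For example, `satisfies [5, 3, 1] lensRange` holds while `satisfies [5, 4] lensRange` does not.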

The Cabal library

The Cabal library (including Cabal-syntax) provides:

  • data types, parser and pretty printer for
    • the .cabal file format,
    • the hc-pkg installed package info format,
  • information about Haskell compilers (e.g. supported Haskell language extensions),
  • the Setup.hs CLI,
  • how to build a single package.

Package registration

To implement the Cabal spec, a Haskell compiler (hc) must provide a package registration program (hc-pkg).

The details of package registration are laid out in the Cabal specification.

> ghc-pkg describe attoparsec --package-db=<cabal-store>/<ghc-ver>/package.db
name:            attoparsec
version:         0.14.4
visibility:      public
id:              attoparsec-0.14.4-a723f6157cb40470c8790185a66d38e55c777104
abi:             52f592e79352f6fa9575bbf0d73225b8
exposed:         True
exposed-modules: [...]
hidden-modules:  [...]
depends:         array-0.5.6.0-4cb3 [...]
[...]

Note that, rather confusingly, one does not register packages with this tool, only individual units.

> ghc-pkg describe z-attoparsec-z-attoparsec-internal
name:            z-attoparsec-z-attoparsec-internal
version:         0.14.4
package-name:    attoparsec
lib-name:        attoparsec-internal
id:              attoparsec-0.14.4-6fbd7dcf5cca9291ea0685cbdfeb6ec13ed4a4cb
[...]

The Setup interface

To implement the Cabal specification, the build system of a package needs only provide the Setup command-line interface, consisting of a Setup executable which supports ./Setup <cmd> invocations.

<cmd>               description
configure           resolve compiler, tools and dependencies
build/haddock/repl  prepare sources and build/generate docs/open a GHCi session
test/bench          run testsuites or benchmarks
install/register    move files into final location/register libraries in the PackageDB
sdist               create an archive for distribution/packaging
clean               clean local files (local package store, local build artifacts, …)

Flags

Each command comes with its own set of flags, e.g. Cabal ConfigFlags (by far the most complex).

In practice, ./Setup configure takes many flags, with the configuration being preserved for subsequent invocations (which barely take any flags, e.g. ./Setup build -v2 --builddir=<dir>).

Manually building packages with ./Setup

In a build plan, we must manually build all dependencies in dependency order.

To build individual units:

  • ./Setup configure <compName> <confArgs>
  • ./Setup build --builddir=<buildDir>
  • ./Setup haddock --builddir=<buildDir> <haddockArgs>
  • ./Setup copy --builddir=<buildDir> --destdir=<destDir>
  • ./Setup register --builddir=<buildDir> --gen-pkg-config=<unitPkgReg> (libs only)
  • hc-pkg register --package-db=<pkgDb> <unitPkgRegFile> (libs only)
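
The per-unit steps above can be sketched as a pure function that produces the command lines a build tool would run, in order. This is an illustrative sketch only: the `UnitBuild` record and its field names are hypothetical, `ghc-pkg` stands in for hc-pkg on the assumption that the compiler is GHC, and the copy step uses the lower-case `--destdir` spelling of the flag.

```haskell
-- Libraries need two extra registration steps; executables do not.
data UnitKind = LibraryUnit | ExecutableUnit
  deriving (Eq, Show)

-- Everything needed to build one unit (hypothetical record).
data UnitBuild = UnitBuild
  { ubComponent   :: String      -- <compName>
  , ubKind        :: UnitKind
  , ubConfArgs    :: [String]    -- <confArgs>
  , ubHaddockArgs :: [String]    -- <haddockArgs>
  , ubBuildDir    :: FilePath    -- <buildDir>
  , ubDestDir     :: FilePath    -- <destDir>
  , ubPkgDb       :: FilePath    -- <pkgDb>
  , ubRegFile     :: FilePath    -- <unitPkgRegFile>
  }

-- A command is a program name followed by its arguments.
type Command = [String]

-- The sequence of invocations for a single unit, in execution order.
unitCommands :: UnitBuild -> [Command]
unitCommands u =
  [ ["./Setup", "configure", ubComponent u] ++ ubConfArgs u
  , ["./Setup", "build", buildDirFlag]
  , ["./Setup", "haddock", buildDirFlag] ++ ubHaddockArgs u
  , ["./Setup", "copy", buildDirFlag, "--destdir=" ++ ubDestDir u]
  ] ++ registration
  where
    buildDirFlag = "--builddir=" ++ ubBuildDir u
    registration
      | ubKind u == LibraryUnit =
          [ ["./Setup", "register", buildDirFlag, "--gen-pkg-config=" ++ ubRegFile u]
          , ["ghc-pkg", "register", "--package-db=" ++ ubPkgDb u, ubRegFile u]
          ]
      | otherwise = []

-- Example unit with made-up paths.
exampleLib :: UnitBuild
exampleLib = UnitBuild
  { ubComponent   = "lib:attoparsec-internal"
  , ubKind        = LibraryUnit
  , ubConfArgs    = ["--package-db=/tmp/pkgdb"]
  , ubHaddockArgs = []
  , ubBuildDir    = "dist/attoparsec-internal"
  , ubDestDir     = "/tmp/dest"
  , ubPkgDb       = "/tmp/pkgdb"
  , ubRegFile     = "dist/attoparsec-internal.conf"
  }
```

Keeping the plan as data, rather than running processes directly, makes the ordering easy to inspect before anything is executed.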

Invoking Setup: the tricky parts

  • passing appropriate arguments to ./Setup configure
    • --package-db=<pkgDb>
    • --cid=<unitId>
    • --dependency=<depPkgNm>:<depCompNm>=<depUnitId>
  • constructing the correct environment for invoking ./Setup
    • putting build-tool-depends executables in PATH
    • defining the corresponding <buildTool>_datadir environment variables.
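
A rough sketch of both tricky parts, assembling the configure arguments and the environment, is given below. The record types and function names are hypothetical. The sketch also assumes a POSIX-style `:` separator in PATH and assumes that dashes in a tool's package name become underscores in the name of its `_datadir` variable.

```haskell
import Data.List (intercalate)

-- One already-built dependency of the unit being configured.
data Dep = Dep
  { depPkgName  :: String   -- <depPkgNm>
  , depCompName :: String   -- <depCompNm>
  , depUnitId   :: String   -- <depUnitId>
  }

-- Arguments that pin the package database, the unit id and every dependency.
configureArgs :: FilePath -> String -> [Dep] -> [String]
configureArgs pkgDb unitId deps =
  [ "--package-db=" ++ pkgDb
  , "--cid=" ++ unitId
  ] ++
  [ "--dependency=" ++ depPkgName d ++ ":" ++ depCompName d ++ "=" ++ depUnitId d
  | d <- deps
  ]

-- A build-tool-depends executable that must be visible to ./Setup.
data BuildTool = BuildTool
  { toolName    :: String
  , toolBinDir  :: FilePath
  , toolDataDir :: FilePath
  }

type Env = [(String, String)]

-- Prepend tool directories to PATH and define <buildTool>_datadir variables.
setupEnv :: [BuildTool] -> Env -> Env
setupEnv tools env =
  ("PATH", newPath) : dataDirVars ++ filter ((/= "PATH") . fst) env
  where
    oldPath     = [p | Just p <- [lookup "PATH" env], not (null p)]
    newPath     = intercalate ":" (map toolBinDir tools ++ oldPath)
    dataDirVars = [(envName (toolName t) ++ "_datadir", toolDataDir t) | t <- tools]
    -- Assumption: '-' is not valid in the variable name and is replaced by '_'.
    envName     = map (\c -> if c == '-' then '_' else c)
```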

Versioning of the Setup interface

  • As the Cabal specification evolves, so does the set of flags understood by the Setup CLI.
  • Care is needed when cabal-install and the Setup executable use different versions of the Cabal library (see Distribution.Client.Setup.filterConfigureFlags).

Setup.hs too general

Each package brings its own (possibly completely custom) build system.
This limits what cabal-install or HLS can do in multi-package projects.

In practice, all packages use the Cabal library:

  1. build-type: Simple

    module Main where
    import Distribution.Simple ( defaultMain )
    main = defaultMain

  or

  2. An implementation of build-type: Custom using UserHooks.

build-type: Custom example (singletons-base)

See the Setup.hs file for singletons-base.

Instead, we would be better off with a Haskell library interface for customising the build system of a package, with cabal-install using that interface directly rather than going through the Setup CLI.

Setup hooks

The Hooks build-type provides a new way to customise how a package is built: use the Cabal library, but with custom hooks that augment (but don’t override) what the Cabal library does.

Setup hooks in practice

cabal-version: 3.14
...
build-type: Hooks
...

custom-setup
  setup-depends:
    base        >= 4.18 && < 5,
    Cabal-hooks >= 0.1  && < 0.2

module SetupHooks where

-- Cabal-hooks
import Distribution.Simple.SetupHooks

setupHooks :: SetupHooks
setupHooks =
  noSetupHooks
    { configureHooks = myConfigureHooks
    , buildHooks = myBuildHooks }

Configure hooks

There are three hooks into the configure phase:

  1. Package-wide pre-configure.
    type PreConfPackageHook = PreConfPackageInputs -> IO PreConfPackageOutputs
    custom ./configure-style logic
  2. Package-wide post-configure.
    type PostConfPackageHook = PostConfPackageInputs -> IO ()
    write package-wide information to disk for (3)
  3. Per-component pre-configure.
    type PreConfComponentHook = PreConfComponentInputs -> IO PreConfComponentOutputs
    modify components (add exposed modules, specify flags)

Configuring a custom preprocessor

See the configure hooks of the custom-preproc example.

Modifying individual components

See the configure hooks of the system-info example.

Pre-build rules

The general mechanism for preparing source files for compilation is that of pre-build rules (documented on Hackage).

Thinking in terms of custom pre-processors:

  • each rule is a pre-processor invocation with specific arguments,
  • the collection of all custom preprocessors is statically known.
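
To make the "statically known collection of pre-processor invocations" idea concrete, here is a toy model that is deliberately much simpler than the actual Cabal-hooks rule types. The `Rule` record, `scheduleRules`, and the `my-preproc` tool are all hypothetical. The point is only that, once every rule is known up front, a build tool can order them by the files they consume and produce.

```haskell
import Data.List (partition)

-- One pre-processor invocation with fixed arguments (toy model).
data Rule = Rule
  { ruleCommand :: String      -- pre-processor to run
  , ruleArgs    :: [String]    -- its specific arguments
  , ruleInputs  :: [FilePath]  -- files it reads
  , ruleOutputs :: [FilePath]  -- files it generates
  } deriving (Eq, Show)

-- Order rules so that every generated file exists before it is read.
-- Returns Nothing if the rules depend on each other cyclically.
scheduleRules :: [Rule] -> Maybe [Rule]
scheduleRules [] = Just []
scheduleRules rules =
  case partition ready rules of
    ([], _)      -> Nothing  -- nothing can run: cyclic dependency
    (now, later) -> (now ++) <$> scheduleRules later
  where
    -- Files that still have to be generated by some unscheduled rule.
    pending = concatMap ruleOutputs rules
    ready r = not (any (`elem` pending) (ruleInputs r))

-- Example rules; "my-preproc" is a made-up tool.
lexerRule, ppRule :: Rule
lexerRule = Rule "alex" ["Lexer.x", "-o", "Lexer.hs"] ["Lexer.x"] ["Lexer.hs"]
ppRule    = Rule "my-preproc" ["Lexer.hs", "Out.hs"] ["Lexer.hs"] ["Out.hs"]
```

With these definitions, `scheduleRules [ppRule, lexerRule]` places `lexerRule` first, because `ppRule` reads the file that `lexerRule` generates.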

Example: singletons-base, using Hooks

See the pre-build rules for singletons-base migrated to build-type: Hooks.

Example: custom preprocessors

See the pre-build rules of the custom-preproc example (myPreBuildRules).

Leveraging the library interface

Plan:

  • compiling hooks to an external executable,
  • versioning of the Hooks API,
  • recompilation checking for pre-build rules,
  • additional challenges of using a library interface in cabal-install.

hooks-exe: the external hooks executable

To integrate packages with build-type: Hooks through a library interface, we compile the SetupHooks module into a separate executable with which we communicate via a CLI.

> hooks-exe <inputHandle> <outputHandle> <hookName>

Note: Uses new CommunicationHandle API from process.


The API for build-type: Hooks is a library interface, not a CLI.
The CLI is an implementation detail of cabal-install, which internally compiles SetupHooks.hs to a separate executable and communicates with it.

Versioning

How do we compile a package pkg with build-type: Hooks when:

  • pkg declares setup-depends: Cabal == 3.14.*,
  • cabal-install is linked against Cabal 3.16.1.0?

First line of defense: Structured

We use Cabal’s Structured mechanism to ensure that both sides of the IPC channel agree on the Binary instances used.
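
The idea can be illustrated with a toy handshake: each message carries a tag derived from the structure of the type being sent, and the receiver refuses to decode when the tag differs from the one it expects. This is not Cabal's Structured implementation; every name below is hypothetical, and where a real implementation would hash the structure, the sketch compares a rendered description directly to stay short.

```haskell
-- Toy description of how a type is laid out on the wire.
data Structure = Structure
  { typeName   :: String
  , fieldTypes :: [String]
  } deriving (Eq, Show)

-- Stand-in for a structure hash.
type Tag = String

structureTag :: Structure -> Tag
structureTag = show

-- A message on the IPC channel: the sender's tag plus the encoded value.
data Packet = Packet
  { packetTag  :: Tag
  , packetBody :: String
  }

-- Sender side: attach the tag of the structure used for encoding.
send :: Structure -> String -> Packet
send s body = Packet (structureTag s) body

-- Receiver side: only decode if both sides agree on the structure.
receive :: Structure -> Packet -> Either String String
receive expected (Packet tag body)
  | tag == structureTag expected = Right body
  | otherwise = Left ("structure mismatch while decoding " ++ typeName expected)

-- Two made-up layouts of the same type in different library versions.
inputsOld, inputsNew :: Structure
inputsOld = Structure "PreConfComponentInputs" ["LocalBuildInfo", "Component"]
inputsNew = Structure "PreConfComponentInputs" ["LocalBuildInfo", "Component", "ExtraField"]
```

A mismatch is reported as an error up front instead of surfacing later as a corrupted value.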

Private dependencies

To provide compatibility between different Cabal library versions, we propose to use private dependencies.

cabal-install would be linked against multiple versions of the Cabal library, and would come bundled with internal adapter functions.

projectOut_localBuildInfo_V3_16_V3_14
  :: V3_16.LocalBuildInfo -> V3_14.LocalBuildInfo

projectOut_PreConfComponentInputs_V3_16_V3_14
  :: V3_16.PreConfComponentInputs -> V3_14.PreConfComponentInputs

inject_preConfComponentOutputs_V3_14_V3_16
  :: V3_14.PreConfComponentOutputs -> V3_16.PreConfComponentOutputs
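
As a rough illustration of what such adapters do, the sketch below runs a hook written against an older interface from a tool using a newer one. The records are toy stand-ins, not the real Cabal types, and the fields that differ between versions (`newVerbosity`, `newExtraFlags`) are invented for the example.

```haskell
-- Inputs and outputs as seen by the newer library version (toy records).
data InputsNew = InputsNew
  { newComponent :: String
  , newFlags     :: [String]
  , newVerbosity :: Int        -- hypothetical field the older version lacks
  } deriving (Eq, Show)

data OutputsNew = OutputsNew
  { newExtraModules :: [String]
  , newExtraFlags   :: [String]  -- hypothetical field the older version lacks
  } deriving (Eq, Show)

-- The same types as seen by the older library version.
data InputsOld = InputsOld
  { oldComponent :: String
  , oldFlags     :: [String]
  } deriving (Eq, Show)

newtype OutputsOld = OutputsOld
  { oldExtraModules :: [String]
  } deriving (Eq, Show)

-- Project newer inputs down to what the older hook understands.
projectOutInputs :: InputsNew -> InputsOld
projectOutInputs i = InputsOld (newComponent i) (newFlags i)

-- Inject older outputs into the newer type, defaulting the missing field.
injectOutputs :: OutputsOld -> OutputsNew
injectOutputs o = OutputsNew (oldExtraModules o) []

-- Run an old-style hook from new-style code (f would be IO for real hooks).
runOldHook :: Functor f => (InputsOld -> f OutputsOld) -> InputsNew -> f OutputsNew
runOldHook hook = fmap injectOutputs . hook . projectOutInputs

-- A made-up hook that adds one module per component.
exampleHook :: InputsOld -> OutputsOld
exampleHook i = OutputsOld ["Paths_" ++ oldComponent i]
```

Information that the older version cannot represent is dropped on the way in and defaulted on the way out, which is why each adapter has to be written by hand for each pair of versions.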

hooks-exe: pre-build rules

The external hooks executable supports three queries for pre-build rules:

  • ask for all known rules using the preBuildRules hook
    returns a value of type Map RuleId RuleBinary
  • run a dynamic dependency computation,
    returns additional edges to the build graph of pre-build rules
    (+ extra arguments to be passed to the rule)
  • execute a rule (e.g. run a pre-processor)
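
The three queries can be pictured as a small request/response protocol. The sketch below is a toy model, not the Cabal-hooks API: `ToyRule`, `Query`, `Response` and `answer` are hypothetical names, an association list stands in for Map RuleId RuleBinary, and executing a rule is reduced to returning a description of the command.

```haskell
type RuleId = String

-- A toy rule: what it would discover dynamically, and what it would run.
data ToyRule = ToyRule
  { dynamicDeps :: [RuleId]  -- extra edges found by the dependency computation
  , dynamicArgs :: [String]  -- extra arguments to pass to the rule
  , ruleAction  :: String    -- description of the command the rule runs
  }

-- The three kinds of request the external executable answers.
data Query
  = AllRules
  | DynamicDeps RuleId
  | RunRule RuleId

data Response
  = KnownRules [RuleId]
  | ExtraEdges [RuleId] [String]
  | Executed String
  deriving (Eq, Show)

-- Answer one query against the statically known collection of rules.
answer :: [(RuleId, ToyRule)] -> Query -> Either String Response
answer rules query =
  case query of
    AllRules        -> Right (KnownRules (map fst rules))
    DynamicDeps rid -> (\r -> ExtraEdges (dynamicDeps r) (dynamicArgs r)) <$> findRule rid
    RunRule rid     -> Executed . ruleAction <$> findRule rid
  where
    findRule rid = maybe (Left ("unknown rule: " ++ rid)) Right (lookup rid rules)

-- Example rule set with made-up rule identifiers.
exampleRules :: [(RuleId, ToyRule)]
exampleRules =
  [ ("lexer", ToyRule [] [] "alex Lexer.x")
  , ("gen",   ToyRule ["lexer"] ["--with-lexer"] "my-preproc Gen.in")
  ]
```

Splitting dependency discovery from execution lets the caller decide which rules are stale before running any of them.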


Details about how e.g. HLS would leverage this are given in the Cabal-hooks documentation.

TODO: process-global state

When cabal-install invokes ./Setup, it sets several pieces of process-global state for the child process: working directory, environment, output handles and Ctrl-C delegation.

  let cp =
        (proc path args)
          { Process.cwd = fmap getSymbolicPath $ useWorkingDir options
          , Process.env = env
          , Process.std_out = loggingHandle
          , Process.std_err = loggingHandle
          , Process.delegate_ctlc = isInteractive options
          }
  maybeExit $ rawSystemProc verbosity cp

TODO: making Custom a separate component

  • cabal-install builds packages that use a custom-setup stanza (Custom and Hooks build types) as a whole, rather than component by component.
  • As a result, these packages are locked out of certain features (e.g. multiple sublibraries).

This is a long-standing flaw in the implementation of cabal-install.


The task is up for grabs at Cabal #9986.

End of slides

Slides available online: sheaf.github.io/cabal-talk.


Cabal tickets (in increasing order of difficulty):

  • Monitoring directory-recursive file globs in cabal-install: Cabal #10064.
  • Add support for logging handles to Cabal: Cabal #9987.
  • Make setup a separate component: Cabal #9986.

Also:

  • HLS: add fine-grained recompilation logic for pre-build rules.


Setup Hooks reference material: