
Hacking the Carp Compiler

This document collects tips, notes, explanations, and examples that can help you make changes to the Carp compiler. Be forewarned that it is not an exhaustive guidebook, and it will likely remain a hodgepodge of remarks, observations, and hints contributed by people who have modified the compiler in the past.

Note: General familiarity with compilers and compilation terminology is assumed.

Structure

The Carp compiler source lives in the src/ directory. Carp is, roughly speaking, organized into four primary passes or components:

[Figure: Carp compiler phases]

Each source file plays a part in one or more components/phases in the compiler. The sections below briefly describe the purpose of each stage and list important source files. You can use these sections to get a rough idea of what files you might need to edit in order to alter the functionality of a particular phase.

Note: Some sources contain definitions that are used in nearly every phase of the compiler. As a result, some files may appear more than once in the sections below.

Parsing

The parsing phase translates .carp source files into abstract syntax trees (ASTs). In Carp, AST nodes are represented using an abstract data type called XObj. XObjs are ubiquitous across the compiler and are used in several different phases and contexts. Every XObj consists of:

The following sources are important for parsing:

Dynamic Evaluator

As stated in the Macro guide, the dynamic evaluator is the central component of the compiler. As the name suggests, the evaluator evaluates parsed Carp code (XObjs) and prepares it for emission. Evaluation entails:

In addition to the XObjs corresponding to the source file being compiled, the evaluator relies on a Context. The Context is a global object that holds the compiler's state. It is comprised of several environments, defined by the Env type, which hold references to known bindings. Different environments are used by different phases of the compiler to evaluate forms, resolve types, and, generally speaking, prepare code for emission.

Binders are another important abstract data type used in evaluation. Any value that’s bound to a name in a source program is translated into a binder, which is comprised of the XObj of the form bound to the name, as well as additional metadata for the binding. Binders are added to the environments in the Context.
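
To make the relationship between these types more concrete, here is a heavily simplified Haskell sketch. The type, field, and function names below (SimpleXObj, SimpleBinder, SimpleEnv, SimpleContext, bindGlobal, lookupGlobal) are hypothetical stand-ins invented for illustration; the compiler's real XObj, Binder, Env, and Context definitions carry considerably more information.

```haskell
-- Simplified, hypothetical models of the evaluator's core data types.
-- None of these definitions are copied from the compiler sources.

-- A toy AST node: just a symbol or an integer literal.
data SimpleXObj
  = Sym String
  | Num Int
  deriving (Show, Eq)

-- A binder pairs the bound form with metadata about the binding.
data SimpleBinder = SimpleBinder
  { binderMeta :: [(String, String)],
    binderXObj :: SimpleXObj
  }
  deriving (Show, Eq)

-- An environment maps names to binders.
newtype SimpleEnv = SimpleEnv [(String, SimpleBinder)]
  deriving (Show, Eq)

-- The context bundles the environments used by different phases.
data SimpleContext = SimpleContext
  { globalEnv :: SimpleEnv,
    typeEnv :: SimpleEnv
  }
  deriving (Show, Eq)

emptyContext :: SimpleContext
emptyContext = SimpleContext (SimpleEnv []) (SimpleEnv [])

-- Bind a name in the global environment, shadowing any older binding.
bindGlobal :: String -> SimpleXObj -> SimpleContext -> SimpleContext
bindGlobal name form ctx =
  let SimpleEnv bindings = globalEnv ctx
      binder = SimpleBinder [] form
   in ctx {globalEnv = SimpleEnv ((name, binder) : bindings)}

-- Look a name up in the global environment.
lookupGlobal :: String -> SimpleContext -> Maybe SimpleBinder
lookupGlobal name ctx =
  let SimpleEnv bindings = globalEnv ctx
   in lookup name bindings
```

In this model, binding a name returns a new context rather than mutating the old one, which mirrors the way the evaluator threads an updated Context through each step.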

The following sources are important for evaluation:

Some other pieces of the type system and borrow checking mechanisms could be included in this list as well, but this list captures the core functionality related to evaluation. Generally speaking, the evaluation component is the conductor of our compilation symphony and orchestrates all the other parts of the compiler.

Note: For a more in-depth look at the dynamic evaluator, see the section on inner workings in the Macro guide.

Type System

The type system is responsible for checking the types of Carp forms and ensuring programs are type safe. It also supports polymorphism and is responsible for replacing polymorphic types with concrete types.

Carp types are represented by the Ty data type.
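
To illustrate what replacing polymorphic types with concrete types means, here is a minimal sketch. SimpleTy and its constructors are hypothetical simplifications, not the compiler's actual Ty definition, and the substitute function only shows the general idea of the replacement step.

```haskell
-- A toy type representation; this is an assumption made for illustration
-- and is far smaller than the compiler's real Ty type.
data SimpleTy
  = IntT
  | FloatT
  | VarT String -- a type variable such as "a"
  | FuncT [SimpleTy] SimpleTy -- argument types and return type
  deriving (Show, Eq)

-- A mapping from type variable names to concrete types.
type Mappings = [(String, SimpleTy)]

-- Replace every type variable that has a mapping with its concrete type.
-- Variables without a mapping are left untouched.
substitute :: Mappings -> SimpleTy -> SimpleTy
substitute mappings ty = case ty of
  VarT name -> maybe ty id (lookup name mappings)
  FuncT args ret -> FuncT (map (substitute mappings) args) (substitute mappings ret)
  _ -> ty

-- True when no type variables remain.
isConcrete :: SimpleTy -> Bool
isConcrete ty = case ty of
  VarT _ -> False
  FuncT args ret -> all isConcrete args && isConcrete ret
  _ -> True
```

For example, substituting IntT for the variable "a" in FuncT [VarT "a"] (VarT "a") yields FuncT [IntT] IntT, which isConcrete accepts.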

The following sources are important for the type system:

Borrow Checking/Ownership System

Borrow checking and lifetime parameters are an extension of the type system. All of the files that are important to the type system are likewise important for the borrow checker.

Code Emission

The compiler’s final job is to emit C code corresponding to the source Carp input. Emission relies heavily on the concept of Templates – effectively a structured way to generate C strings based on evaluated Carp AST nodes.
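
As a rough illustration of the idea, the sketch below models a template as a C declaration and definition containing placeholders that are filled in once a concrete type is known. The SimpleTemplate type, the $T placeholder convention, and the helper names are assumptions made for this example; they are not the compiler's actual Template API.

```haskell
import Data.List (isPrefixOf)

-- A toy template: C source text with "$T" standing in for a type name.
-- This representation is hypothetical.
data SimpleTemplate = SimpleTemplate
  { templateDeclaration :: String,
    templateDefinition :: String
  }

-- Replace every occurrence of a non-empty needle in a string.
replaceAll :: String -> String -> String -> String
replaceAll needle replacement = go
  where
    go [] = []
    go str@(c : rest)
      | needle `isPrefixOf` str = replacement ++ go (drop (length needle) str)
      | otherwise = c : go rest

-- Instantiate a template for a concrete C type name.
instantiate :: String -> SimpleTemplate -> (String, String)
instantiate cType (SimpleTemplate decl def) = (fill decl, fill def)
  where
    fill = replaceAll "$T" cType

-- A made-up template for a generic "copy" function.
copyTemplate :: SimpleTemplate
copyTemplate =
  SimpleTemplate
    "$T $T_copy($T* ref)"
    "{ return *ref; }"
```

With these definitions, instantiate "int" copyTemplate produces the pair ("int int_copy(int* ref)", "{ return *ref; }").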

The following sources are important for the code emission system:

Other sources

In addition to the sources listed above, there are other miscellaneous source files that serve different purposes in the compiler:

Mini HowTos

Some compiler changes are more frequent than others and follow common high-level steps. The following sections provide some guidance on making such changes.

Adding a new Primitive

If it doesn’t require anything fancy or out of the ordinary, adding a new primitive to the compiler entails the following:

  1. Define your new primitive in Primitives.hs
  2. Add your primitive to the starting environment using makePrim in StartingEnv.hs

Define your Primitive

Primitives are functions of the Primitive type:

type Primitive = XObj -> Context -> [XObj] -> IO (Context, Either EvalError XObj)

Every primitive takes an XObj (the form that represents the primitive), a compiler Context, and a list of XObjs (the primitive form’s arguments). Primitives return a new Context, updated based on the logic they performed, and either an XObj or an evaluation error that is reported to the user.

For example, here’s how the defmodule primitive maps to the Primitive type:

(defmodule Foo (defn bar [] 1))
 |         |-----------------|
 XObj      [XObj] (arguments)

The Context argument captures the state of the compiler and doesn’t have a corresponding direct representation in Carp forms.
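
To see how the pieces of this signature fit together, consider the following self-contained sketch. It uses hypothetical stand-in types (SimpleXObj, SimpleContext, SimpleError) rather than the compiler's real XObj, Context, and EvalError, and both primitiveArgCount and callPrimitive are invented purely for illustration.

```haskell
-- Hypothetical stand-ins for XObj, Context, and EvalError.
data SimpleXObj
  = Sym String
  | Num Int
  | Lst [SimpleXObj]
  deriving (Show, Eq)

newtype SimpleContext = SimpleContext {callCount :: Int}
  deriving (Show, Eq)

newtype SimpleError = SimpleError String
  deriving (Show, Eq)

-- Same shape as the Primitive type, using the stand-in types above.
type SimplePrimitive =
  SimpleXObj -> SimpleContext -> [SimpleXObj] -> IO (SimpleContext, Either SimpleError SimpleXObj)

-- A made-up primitive that returns the number of arguments it received.
-- It also bumps a counter in the context to show how state is threaded through.
primitiveArgCount :: SimplePrimitive
primitiveArgCount _form ctx args =
  pure (ctx {callCount = callCount ctx + 1}, Right (Num (length args)))

-- Split a call form into its head and its arguments, then dispatch.
-- Following the diagram above, the head is passed as the primitive's form.
callPrimitive ::
  SimplePrimitive -> SimpleContext -> SimpleXObj -> IO (SimpleContext, Either SimpleError SimpleXObj)
callPrimitive prim ctx form = case form of
  Lst (hd : args) -> prim hd ctx args
  _ -> pure (ctx, Left (SimpleError "expected a non-empty list form"))
```

Applied to a list form modelling (defmodule Foo (defn bar [] 1)), callPrimitive hands the primitive two arguments: the symbol Foo and the nested defn form.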

In Primitives.hs, you should name your primitive using the naming scheme primitive<name>, where <name> is the name of the symbol that will call your primitive in Carp code. For example, defmodule is given by the primitive primitiveDefmodule.

Most of the time, primitives have three core steps:

Let’s step through each of these core steps by implementing a simple immutable primitive. The immutable primitive will take a variable (the name of a form passed to a def) and mark it as immutable, preventing users from calling set! on it.
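
The sketch below shows one plausible shape for such a primitive: it checks that its single argument is a symbol, looks up the corresponding binding, and records an immutable flag in that binding's metadata. Everything here uses simplified, hypothetical types (SimpleXObj, SimpleBinder, SimpleContext, SimpleError); it is an illustration of the idea, not the compiler's actual code.

```haskell
-- Hypothetical stand-ins for the compiler's XObj, Binder, Context, and EvalError.
data SimpleXObj
  = Sym String
  | Num Int
  deriving (Show, Eq)

data SimpleBinder = SimpleBinder
  { binderMeta :: [(String, Bool)],
    binderXObj :: SimpleXObj
  }
  deriving (Show, Eq)

newtype SimpleContext = SimpleContext {globalBindings :: [(String, SimpleBinder)]}
  deriving (Show, Eq)

newtype SimpleError = SimpleError String
  deriving (Show, Eq)

-- A toy version of an "immutable" primitive.
primitiveImmutable ::
  SimpleXObj -> SimpleContext -> [SimpleXObj] -> IO (SimpleContext, Either SimpleError SimpleXObj)
primitiveImmutable _form ctx args = case args of
  -- Check the shape of the arguments: exactly one symbol.
  [Sym name] ->
    -- Look up the existing binding for that symbol.
    case lookup name (globalBindings ctx) of
      Nothing ->
        pure (ctx, Left (SimpleError ("no binding named " ++ name)))
      Just binder ->
        -- Record the flag in the binding's metadata and update the context.
        let marked = binder {binderMeta = ("immutable", True) : binderMeta binder}
            others = filter ((/= name) . fst) (globalBindings ctx)
            ctx' = ctx {globalBindings = (name, marked) : others}
         in pure (ctx', Right (Sym name))
  _ -> pure (ctx, Left (SimpleError "immutable expects a single symbol"))
```

Note that the error cases return the original context unchanged, so a failed call leaves the compiler state as it was.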

And that wraps up the core logic of our primitive. To make it available, we just need to register it in StartingEnv.hs.

Add your primitive to the starting environment

To add a primitive to the starting environment, call makePrim:

, makePrim "immutable" 1 "annotates a variable as immutable" "(immutable my-var)" primitiveImmutable

That’s about it. Note that this implementation only adds special metadata to bindings. To actually prevent users from calling set! on an immutable def, we’d need to update set!’s logic to check for the presence of the immutable metadata.
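
A minimal sketch of what such a check might look like follows. The types and the function name checkedSet are hypothetical simplifications chosen for this example and do not reflect how set! is actually implemented in the compiler.

```haskell
-- Hypothetical stand-ins for the compiler's binder, environment, and error types.
data SimpleBinder = SimpleBinder
  { binderMeta :: [(String, Bool)],
    binderValue :: Int
  }
  deriving (Show, Eq)

type SimpleEnv = [(String, SimpleBinder)]

newtype SimpleError = SimpleError String
  deriving (Show, Eq)

-- True when the binder carries the "immutable" metadata flag.
isImmutable :: SimpleBinder -> Bool
isImmutable binder = lookup "immutable" (binderMeta binder) == Just True

-- A toy set!: refuse to update bindings that are marked immutable.
checkedSet :: String -> Int -> SimpleEnv -> Either SimpleError SimpleEnv
checkedSet name newValue env = case lookup name env of
  Nothing -> Left (SimpleError ("can't set unknown variable " ++ name))
  Just binder
    | isImmutable binder -> Left (SimpleError ("can't set immutable variable " ++ name))
    | otherwise ->
        let updated = binder {binderValue = newValue}
         in Right ((name, updated) : filter ((/= name) . fst) env)
```

The important design point is that the check lives in the code path that performs the assignment, since the metadata flag has no effect on its own.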