Gambit-C: Embedding C code directly in Scheme

Jan 18, 2024

black horse chess piece near roque chess piece — Photo by Piotr Makowski on Unsplash

Every programming language bills itself as a general-purpose programming language, but that always comes with tradeoffs. Working in a systems programming language gives you access to pointers and raw memory, affording you fine-grained control over your code. However, this comes at the cost of slower development and platform-specific headaches. Working in a high-level programming language like Python lets you crank out code quickly, but the performance suffers as you are many layers removed from the hardware. So if one programming language can’t cover all of your needs, how about two? One for high-level programming, and one for low-level programming? Thankfully just about every language has a way of interacting with C code through a foreign function interface. But some make that easier than others. Enter Gambit-C.

Gambit-C is a Scheme that compiles to C code, and can be built using a C compiler. There are two ways of mixing C and Gambit. One is calling C functions from Gambit, the other is calling Gambit code from C. Let’s look at both.

Using C code in Gambit

Gambit defines 5 special forms for working with C data:

c-declare
c-lambda
c-define
c-initialize
c-define-type

These are executed during the “C phase” of a Gambit-C program, before the execution of Scheme code. Here is how you would use them.

C-Declare

Let’s create an empty file called embed.scm and save it. We can compile this with the Gambit compiler gsc with the -c flag to see the corresponding C file that is created.

gsc -c embed.scm

The resulting C file contains preprocessor defines, information about the version of Gambit and an include for gambit.h. Here is a portion of what that file looks like

#ifdef ___LINKER_INFO
; File: "embed.c", produced by Gambit v4.9.3
(
409003
(C)
"embed"
(("embed"))
(
"embed"
)
(
)
(
"embed#"
)
(
)
(
)
 ()
)
#else
#define ___VERSION 409003
#define ___MODULE_NAME "embed"
#define ___LINKER_ID ___LNK_embed
#define ___MH_PROC ___H_embed
#define ___SCRIPT_LINE 0
#define ___SYMCOUNT 1
#define ___GLOCOUNT 1
#define ___SUPCOUNT 1
#define ___SUBCOUNT 1
#define ___LBLCOUNT 2
#define ___MODDESCR ___REF_SUB(0)
#include "gambit.h"

c-declare is a special form that allows us to write our own includes, function declarations and variable declarations in the resulting C file. The code defined here can be used in the rest of our c-special forms.

Let’s test this out by adding a simple c-declare to our embed.scm file that includes stdio.h

(c-declare #<<c-declare-end
#include <stdio.h>
c-declare-end
)

Now if we compile our code with the same flags as last time

gsc -c embed.scm

We see that stdio.h is now included in our embed.c file.

___BEGIN_SUB
 ___DEF_SUB(___X0)
___END_SUB




#include <stdio.h>



#undef ___MD_ALL
#define ___MD_ALL ___D_R0 ___D_R1

C-Lambda

c-lambda allows you to create a Scheme procedure that corresponds to a C function, or to a sequence of C code. It is the bridge from Scheme to C, letting you call C functions or execute C code directly from Scheme. It has 3 parts.

A list of types for each argument the C function expects.
The type for the value the C function returns.
The body with the name of the C function or a string of C code to execute.

Since we included stdio.h in the c-declare section, let’s write a simple hello world program in our c-lambda.

(c-declare #<<c-declare-end
#include <stdio.h>
c-declare-end

)

((c-lambda () void
#<<c-lambda-end
printf("Hello World!\n");
c-lambda-end
))

Now we compile our Scheme code with gsc

gsc -exe embed.scm

and execute to get our Hello World!

./embed
 
Hello World!

And now, if we look at the C file we can see our printf.

#undef ___AT_END
#define ___return goto ___return_embed_23_0
printf("Hello World!\n");
___return_embed_23_0:;

I don’t know about you, but this is starting to get interesting. Being able to execute C code by just embedding it directly into a Scheme file is super cool and convenient.

We can expand our example further by writing an actual C function and using it in Scheme. Here is our new embed.scm file

;; create our square function
(c-declare #<<c-declare-end

  int square(int x) {
    return x * x;
  }

c-declare-end
)

;; Define a Scheme function to call our 'square' c function
(define square-scheme (c-lambda (int) int "square"))

;; using our square function
(display "Enter an integer: ")
(define input (read))
(let ((result (square-scheme input)))
  (display "The square of ")
  (display input)
  (display " is ")
  (display result)
  (newline))

It contains a new function in the c-declare section that squares numbers. In the c-lambda section, I tell Scheme what the C function is called, and what the arguments and return values are.

Then I write some regular Scheme code to prompt the user for an integer, read said integer, and then call the bound C function “square” using the square-scheme function. Now we compile like last time, execute the code, type in an integer, and bam!

We square values. So we can not only define C functions directly in our Scheme code, but also bind and execute those functions as well, without ever having to leave Scheme.
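One caveat worth sketching: square is plain C, so it inherits C's int arithmetic, and a large enough input overflows. The helper below is a minimal sketch of an overflow-aware variant that could sit in the same c-declare block. The name checked_square is hypothetical and not part of the original code, and the sketch assumes long long is wider than int, which holds on common platforms.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper (not in the original article).
   Stores x * x in *out and returns true, or returns false
   when the product would not fit in an int. */
bool checked_square(int x, int *out) {
    /* Assumption: long long is wider than int, so this product cannot overflow. */
    long long wide = (long long)x * (long long)x;
    if (wide > INT_MAX) {
        return false;   /* a square is never negative, so one bound is enough */
    }
    *out = (int)wide;
    return true;
}
```

A c-lambda could then wrap this helper instead of square and report the failure on the Scheme side when it returns false.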

But I admit, writing C in a macro block inside a Scheme file is not the best user experience. There is neither syntax highlighting nor formatting. Thankfully you can define a C function in a regular ole C file and just use that. Here is a simple file named triple.c that does just that

//triple.c
int triple(int num){
  return num * 3;
}

In our embed.scm file, using our trusty c-declare and c-lambda, we can bind the C code.

(c-declare #<<c-declare-end
  #include "triple.c"
c-declare-end
)

(define num 100)
(define triple (c-lambda (int) int "triple"))

(display (string-append "The triple of " (number->string num) " is " (number->string (triple num))))
(newline)

Then we compile and execute…

And just like that we are using C code in Scheme with no fuss.

Inline Assembly

A quick detour. In C you can drop down to inline assembly when the moment requires it. Since a c-lambda allows you to write any valid C code, it stands to reason you would be able to do this in Gambit-C as well. Here is a small x86 example in a file called asm.scm

(c-declare #<<c-declare-end
#include <stdio.h>
c-declare-end
)

((c-lambda () void
#<<c-lambda-end
    int a = 10;
    int b = 20;
    int result;

    // Inline assembly block (x86, GCC/Clang syntax)
    __asm__(
        "mov %1, %0\n\t" // Copy a into result
        "add %2, %0\n\t" // Add b to result
        : "=&r" (result)   // Output operand: 'result', written before b is read
        : "r" (a), "r" (b) // Input operands: a and b are inputs, both in registers
        : "cc"             // The add instruction changes the flags
    );

    printf("The result is: %d\n", result);

c-lambda-end
))

(display "ASM example")
(newline)

After compiling and running the file like normal, we can get the result of our assembly operation.

terminal showing the output of our asm result
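The assembly can also be tried outside Gambit first. The following is a minimal sketch of the same addition as a standalone C function, assuming GCC or Clang on x86. The name add_with_asm is hypothetical, and on other compilers or architectures the sketch falls back to plain C addition.

```c
/* Hypothetical standalone version of the addition from asm.scm. */
int add_with_asm(int a, int b) {
    int result;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__(
        "movl %1, %0\n\t"   /* copy a into result */
        "addl %2, %0"       /* add b to result */
        : "=&r" (result)    /* early-clobber output: written before b is read */
        : "r" (a), "r" (b)  /* inputs, left unmodified */
        : "cc"              /* addl changes the condition flags */
    );
#else
    result = a + b;         /* fallback where this asm syntax does not apply */
#endif
    return result;
}
```

The inputs are only read and the output is marked early-clobber, so the compiler knows not to place result in the same register as b.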

Using Gambit Code in C

C-Define

The c-define special form in Gambit-C is a way to connect Scheme procedures with C, allowing C functions to call Scheme code directly. When C calls the Scheme code, its arguments are converted to Scheme types, and passed to the Scheme procedure. The Scheme procedure's result is then converted back to a C type, and returned by the function.

Let’s look at an example. Let’s say I want to read in a file using Scheme instead of C. I can do it by creating a read.scm file that looks like this using c-define…

(c-define (read-file-into-string filename) (char-string) char-string "c_read_file_into_string" ""
  (let ((port (open-input-file filename))  
        (content ""))
    (let loop ((line (read-line port)))  
      (if (eof-object? line)  
          (begin
            (close-input-port port)  
            content)  
          (begin
            (set! content (string-append content line "\n"))  
            (loop (read-line port)))))
    ))

The c-define is simple, consisting of five parts before the body.

The name of the procedure in Scheme along with its parameters: (read-file-into-string filename)
The types of the input arguments: (char-string)
The type of the output value: char-string
The name of the function in C: "c_read_file_into_string"
The scope. In our case "" but it can be "static" as well

After that it’s just a regular ole body of Scheme code. We can create a read.h file which has the prototype for our Scheme/C function

#ifndef READ_H
#define READ_H

char* c_read_file_into_string(const char* filename);

#endif // READ_H
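For comparison, here is roughly what a hand-written C function with the same shape of prototype could look like. This is only a sketch to show what the Scheme procedure saves us from writing. The name read_file_plain is hypothetical, it returns the raw file bytes (the Scheme version rebuilds the text line by line), and the caller is assumed to free the result.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical plain-C counterpart to the Scheme procedure.
   Returns a malloc'd, NUL-terminated buffer that the caller must free,
   or NULL if the file cannot be opened or memory runs out. */
char *read_file_plain(const char *filename) {
    FILE *fp = fopen(filename, "rb");
    if (fp == NULL) {
        return NULL;
    }

    size_t cap = 4096;   /* current buffer capacity */
    size_t len = 0;      /* bytes stored so far */
    char *buf = malloc(cap);
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    size_t n;
    /* Always leave one byte free for the terminating NUL. */
    while ((n = fread(buf + len, 1, cap - len - 1, fp)) > 0) {
        len += n;
        if (cap - len <= 1) {            /* buffer full: double it */
            char *grown = realloc(buf, cap * 2);
            if (grown == NULL) {
                free(buf);
                fclose(fp);
                return NULL;
            }
            buf = grown;
            cap *= 2;
        }
    }

    fclose(fp);
    buf[len] = '\0';
    return buf;
}
```

Buffer growth, error paths, and ownership all have to be handled by hand here, which is the bookkeeping the Scheme body avoids.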

And then we create our main.c, from which we will call our Scheme code. I struggled with this part a bit, but I found this old conversation on GitHub that cleared things up for me. Here is the boilerplate for how you set up C to call Scheme in your main.c. The parts you have to change are the header include, the SCHEME_LIBRARY_LINKER define, and the call to your Scheme function. The rest can stay the same

#include <stdio.h>

#include "gambit.h"

#include "your_scheme.h"

#define SCHEME_LIBRARY_LINKER ___LNK_your_scheme__

___BEGIN_C_LINKAGE
extern ___mod_or_lnk SCHEME_LIBRARY_LINKER (___global_state_struct*);
___END_C_LINKAGE


int main(int argc, char** argv) {
// C code like normal 
  printf("Hello World, this is from C\n");

  ___setup_params_struct setup_params;
  ___setup_params_reset (&setup_params);

  setup_params.version = ___VERSION;
  setup_params.linker = SCHEME_LIBRARY_LINKER;

  ___setup (&setup_params);
// scheme code
  your_scheme_function();

  ___cleanup();
  
  return 0;
}

The files we want to use will be called read.h and read_.c. To create our read_.c file properly, we need to generate C code from Scheme that does not contain a main entry point, or it will conflict with our main.c file. I’ll let Marc Feeley (the maintainer of Gambit-C) in the GitHub discussions explain why…

By default the link file (here somescheme_.c generated from somescheme.c) contains a C “main” function so that without any additional step the code can be compiled and linked (at the C level) to obtain an executable program. In other words the entry point of the program will be the Gambit runtime system.
However your file main.c also defines a C “main” function so you end up with a duplicate definition. The way around this, shown below, is to compile the link file with the -D___LIBRARY C compiler option. This will cause the somescheme_.c file to avoid the inclusion of a C “main” function (through a series of #ifdefs). In other words, the Scheme code and the Gambit runtime system are acting as a library to the main C program.

Let’s create a quick build.sh file, that will build our Scheme code without a main, and then call gcc to glue everything together…

#!/bin/bash
gsc -c read.scm
gsc -link read.c
gsc -obj -cc-options -D___LIBRARY read.c read_.c
gsc -obj main.c

# don't forget to point gcc at the location of the Gambit headers and libraries on your system!

gcc read.o read_.o main.o -I/usr/include /usr/lib/x86_64-linux-gnu/libgambit.so.4 -lm -ldl -lutil -lssl -lcrypto -o main

As you can see, read.scm is compiled to read.c using the command we talked about before. gsc -link then generates the link file read_.c from it, and gsc -obj with -cc-options -D___LIBRARY compiles read.c and read_.c into object files without a C main. We then compile an object file out of our main.c. Finally we call gcc, pass it our object files, specify all our libs, and voilà! We have a main program. Here is what our main.c looks like before we compile it. Compared to the boilerplate, I changed the header include, the SCHEME_LIBRARY_LINKER define, and the call into Scheme.

#include <stdio.h>

#include "gambit.h"

#include "read.h"

#define SCHEME_LIBRARY_LINKER ___LNK_read__

___BEGIN_C_LINKAGE
extern ___mod_or_lnk SCHEME_LIBRARY_LINKER (___global_state_struct*);
___END_C_LINKAGE


int main(int argc, char** argv) {
  printf("Hello World, this is from C\n");

  ___setup_params_struct setup_params;
  ___setup_params_reset (&setup_params);

  setup_params.version = ___VERSION;
  setup_params.linker = SCHEME_LIBRARY_LINKER;

  ___setup (&setup_params);

  char *text = c_read_file_into_string("dickinson.txt");
  printf("%s\n", text);

  ___cleanup();
  
  return 0;
}

And when we run the main executable…

scheme code in C reading Emily Dickinson

Boom! We are calling Scheme code in C.

C-Define-Type

At its simplest, c-define-type lets you define a name in Scheme that refers to a C type. You use this when you have to interact with C types that are not your standard int, char, float, etc. For example, the manual shows how it can be used with C’s FILE type, and how it can be used to create a pointer to a FILE.

(c-define-type FILE "FILE")
(c-define-type FILE* (pointer FILE))

I’ve found it the most useful when binding C code that requires custom structs. Raylib uses a custom Color struct in many of its functions

typedef struct Color {
    unsigned char r;        // Color red value
    unsigned char g;        // Color green value
    unsigned char b;        // Color blue value
    unsigned char a;        // Color alpha value
} Color;

We can make the Color struct available from our Gambit-C code by using c-define-type

(c-define-type color (struct "Color"))

This allows us to annotate functions like clear background with the “color” type

(define ClearBackground (c-lambda (color) void "ClearBackground"))

But what if we actually need to create a color? Well, we have to define a c-lambda that constructs the color for us. Here we define our make-color function, which takes in 4 unsigned-int8s (don’t annotate with unsigned-char, because Gambit converts unsigned-char to and from a Scheme character rather than an integer), and returns a color. The body of our c-lambda creates a color from the arguments we passed to the c-lambda, which we can access with ___arg1, ___arg2, etc.

(define make-color (c-lambda (unsigned-int8 unsigned-int8 unsigned-int8 unsigned-int8) color
#<<c-lambda-end
Color col = (Color){___arg1, ___arg2, ___arg3, ___arg4};
___return(col);
c-lambda-end
))

(define LIGHTGRAY (make-color 200 200 200 255))
(define YELLOW (make-color 253 249 0 255))
(define ORANGE (make-color 255 161 0 255))
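An alternative to an inline body is a named C helper, which a c-lambda can bind by name in the same way as square and triple earlier. The following is a minimal sketch of such a helper. The name make_color_c is hypothetical, and the Color struct is redeclared locally (copied from the definition above) only so the sketch compiles without raylib.h.

```c
/* Local copy of Raylib's Color layout, only so this sketch stands alone. */
typedef struct Color {
    unsigned char r;   /* red */
    unsigned char g;   /* green */
    unsigned char b;   /* blue */
    unsigned char a;   /* alpha */
} Color;

/* Hypothetical constructor: builds a Color from four 8-bit components
   using a C99 compound literal, as the inline c-lambda body does. */
Color make_color_c(unsigned char r, unsigned char g,
                   unsigned char b, unsigned char a) {
    return (Color){ r, g, b, a };
}
```

Keeping the construction in a named function also makes it easy to test the C side on its own before involving Gambit.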

If we want to get an individual value of the color, unfortunately we will have to create functions for each value, i.e. color-r, color-g, color-b, etc. Here is an example of me doing this using another type, Vector2, which is a struct with two float components, an X and a Y. It assumes a vector2 type defined the same way as color, with (c-define-type vector2 (struct "Vector2")).

(define make-vector2 (c-lambda (float float) vector2
#<<c-lambda-end
Vector2 vec =(Vector2){___arg1, ___arg2}; 
___return(vec);
c-lambda-end
))

(define vector2-x (c-lambda (vector2) float
#<<c-lambda-end
___return(___arg1.x);
c-lambda-end
))

(define vector2-y (c-lambda (vector2) float
#<<c-lambda-end
___return(___arg1.y);
c-lambda-end
))

But this is Scheme, so defining a hygienic macro to do this should be easy enough.
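The macro meant here would be a Scheme one. As a rough C-side analogue, and not the approach described above, the C preprocessor can also stamp out the accessor functions, which a c-lambda could then bind by name. DEFINE_GETTER and the generated function names are hypothetical, and Vector2 is redeclared locally so the sketch compiles without raylib.h.

```c
/* Local copy of the two-float struct described above. */
typedef struct Vector2 {
    float x;
    float y;
} Vector2;

/* Hypothetical macro: generates a function named Type_field that
   returns the given field of a struct passed by value. */
#define DEFINE_GETTER(Type, field, FieldType) \
    FieldType Type##_##field(Type value) { return value.field; }

/* Expands to: float Vector2_x(Vector2 value) { return value.x; } */
DEFINE_GETTER(Vector2, x, float)
/* Expands to: float Vector2_y(Vector2 value) { return value.y; } */
DEFINE_GETTER(Vector2, y, float)
```

Each new field then costs one line on the C side, though every accessor still needs its own c-lambda binding in Scheme.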

C-Initialize

I’ll be honest, I haven’t needed to use this, so I don’t know much about it. All I know is that it is good if you need to set up a resource before your code runs, i.e. connecting to a database.

Binding other C-Libraries

Because Gambit-C interfaces with C so well, it is straightforward to bind to C libraries. As an example, let’s expand upon the bindings for Raylib (a popular game library written in C) that we introduced earlier. Since Raylib publishes prebuilt releases for every major platform, we don’t have to compile the code ourselves. Download the build for your platform from their release page. Since I’m on Linux I’ll download the tar.gz

# Download the code
wget https://github.com/raysan5/raylib/releases/download/5.0/raylib-5.0_linux_amd64.tar.gz

# decompress
tar -xzf raylib-5.0_linux_amd64.tar.gz

Once we have Raylib on our system we see two folders in the downloaded directory. include contains all of our C header files, and lib contains our compiled code (libraylib.a, libraylib.so, etc). So as not to confuse the linker with multiple libraries for the same code, let’s delete everything in the lib folder except libraylib.a. This lets us use Raylib statically (Gambit can use dynamic libraries too!)

Now that we’ve done that we can work on writing the Gambit code. To simplify things we will just write it directly in this directory. First let’s create a file called main.scm. Once we’ve done that, we can use c-declare and c-lambda, just like we’ve been doing before, to bring in the header files and bind our functions.

(c-declare #<<c-declare-end
#include "raylib.h"
c-declare-end
)

(define screenWidth 800)
(define screenHeight 450)

;; Declare Raylib functions
(define InitWindow (c-lambda (int int char-string) void "InitWindow"))
(define SetTargetFPS (c-lambda (int) void "SetTargetFPS"))
(define WindowShouldClose (c-lambda () bool "WindowShouldClose"))
(define BeginDrawing (c-lambda () void "BeginDrawing"))
(define EndDrawing (c-lambda () void "EndDrawing"))
(define CloseWindow (c-lambda () void "CloseWindow"))


;; Initialization
(InitWindow screenWidth screenHeight "raylib [core] example - basic window")
(SetTargetFPS 60)

(let loop ()
  (unless (WindowShouldClose)
    (BeginDrawing)
    (EndDrawing)
    (loop)))

Once we’ve written our binding code, it’s time to compile. If this were C code we would compile our program with these flags

gcc -o main main.c -Iinclude/ -Llib/ -lraylib -lGL -lm -lpthread -ldl -lrt -lX11

Thankfully, gsc, the Gambit compiler, allows us to compile it in much the same way. The Gambit-C compiler works in two phases: a compilation step that creates .o files, followed by a linking step. If you pass the flags from the gcc command above to the compilation phase (-cc-options) and the linking phase (-ld-options), it will build the executable just fine

gsc -exe -cc-options "-Iinclude/ -Llib/" -ld-options "-Llib/ -lraylib -lGL -lm -lpthread -ldl -lrt -lX11" main.scm

Now we run our main program and…

We can use Raylib to make games.

One Final Experiment

I’ve recently been getting back into C and translated my classic brainfuck interpreter to C for practice.

It consists of 3 files: main.c, bf_interpreter.c, and bf_interpreter.h. I wrote this before I ever started using Gambit, so there are no tricks to make this code easier to embed. We know how easy it is for Gambit to call the C code on its own; all languages with an FFI can do that. But what if I copy and paste the code directly into a c-declare and then call it with a c-lambda? Would it still work? Let’s start by defining our c-declare. I’ll create a brainfuck.scm file and copy and paste my bf_interpreter.h into the declare…

(c-declare #<<c-declare-end

#include <string.h>
#include <stdio.h>
#include <stdbool.h>
#include <stdlib.h>
#define STACK_SIZE 30000

struct Interpreter {
    unsigned char stack[STACK_SIZE];
    unsigned char *stack_ptr;
    char *code;
    char *code_ptr;
    bool is_valid;
};
struct Interpreter createInterpreter(const char code[]);

int interpreterInterpret(struct Interpreter *interpreter);

c-declare-end
)

Then in the same block I’ll copy and paste all of bf_interpreter.c (truncated for brevity)

(c-declare #<<c-declare-end

#include <string.h>
#include <stdio.h>
#include <stdbool.h>
#include <stdlib.h>
#define STACK_SIZE 30000

struct Interpreter {
    unsigned char stack[STACK_SIZE];
    unsigned char *stack_ptr;
    char *code;
    char *code_ptr;
    bool is_valid;
};


struct Interpreter createInterpreter(const char code[]);

int interpreterInterpret(struct Interpreter *interpreter);


struct Interpreter createInterpreter(const char code[]) {
    struct Interpreter interpreter;
    interpreter.is_valid = false;
    // Initialize stack to zero
    memset(interpreter.stack, 0, STACK_SIZE);

    // Set the data pointer to the beginning of the stack
    interpreter.stack_ptr = interpreter.stack;

    // Allocate memory and copy the Brainfuck code for the interpreter
    // Ensure the code is null-terminated
    interpreter.code = strdup(code);
    if (interpreter.code == NULL) {
        perror("Failed to copy code to interpreter");
        return interpreter;
    }

    // Set the code pointer to the beginning of the code.
    interpreter.code_ptr = interpreter.code;
    interpreter.is_valid = true;

    return interpreter;
}

int interpreterInterpret(struct Interpreter *interpreter) {
    printf("===BEGINNING INTERPRETATION===\n");

    int user_number;
    char *code = interpreter->code;
    for (int i = 0; i < strlen(code); i++) {
        char instr = code[i];
        switch (instr) {
.......

Then in a c-lambda I’ll copy over my main.c code.

((c-lambda () void
#<<c-lambda-end
    char filename[] = "instructions_hello_world.txt";
    char *content = readCode(filename);
    printf("%s\n", content);
    struct Interpreter bf = createInterpreter(content);
    interpreterInterpret(&bf);

c-lambda-end
))

And it should run right? It will, with two minor updates. For the sake of simplicity I added the code from the bf_interpreter.h and .c file into the c-declare. Since I did that, I no longer need the includes in my main.c, because these are already visible from our c-declare (I know that from the documentation 😃)

// Not needed anymore in our c-lambda
#include <stdio.h>
#include <stdlib.h>
#include "bf_interpreter.h"

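Lastly, I removed the return EXIT_SUCCESS line from the c-lambda that was present in my main.c, because it makes our program segfault. This is understandable: the body of a c-lambda is pasted into a C function that Gambit generates (you can see the ___return label it sets up in the generated C from earlier), so a bare return leaves that function early while Gambit still expects to run its own code and the rest of the Scheme program afterwards. Inside a c-lambda, ___return is the way to return. But other than that, all the rest of the code is the same. Let’s execute it!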

terminal showing the results of the brainfuck interpreter

Pretty…Darn…Cool.
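Going back to the return EXIT_SUCCESS problem for a moment, the mechanism can be sketched in plain C. This is a simplified illustration and not Gambit's actual generated code: it assumes only what the generated C shown earlier suggests, namely that the pasted body is followed by a label and that ___return is a goto to that label. The names wrapper_sketch and epilogue_runs are hypothetical.

```c
/* Counts how many times the code after the pasted body has run. */
int epilogue_runs = 0;

/* Hypothetical stand-in for a generated wrapper function. */
void wrapper_sketch(int use_plain_return) {
    /* ---- pasted user body starts ---- */
    if (use_plain_return) {
        return;               /* leaves at once, the epilogue is skipped */
    }
    goto sketch_epilogue;     /* what a ___return-style macro expands to */
    /* ---- pasted user body ends ---- */

sketch_epilogue:
    epilogue_runs++;          /* stands in for the wrapper's own cleanup */
}
```

Calling wrapper_sketch(1) leaves epilogue_runs untouched, while wrapper_sketch(0) increments it, which is the difference between a bare return and the goto-based return.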

Before we wrap up, I have one more thing to talk about in regards to Gambit’s CFFI overhead. You may be wondering, as I was, what the overhead between Gambit calling C is, or C calling Gambit. I asked this very question on the gambit-c gitter, and Marc Feeley the maintainer for Gambit actually answered!

Diego: Since Gambit Compiles to C code, there shouldn't be any overhead for calling C functions like there is when using a C-FFI in other languages right?
Marc Feeley: People often misunderstand the type of “Scheme to C” compilation that is performed by Gambit. In many other compilers that compile a certain language X to C there is a 1-to-1 mapping of functions in language X and functions in C. If you define function f in your X program then you can expect some function to exist in the generated C code that is a translation of f. Some Scheme to C compilers actually do this (Bigloo, and I think Chicken also). But Gambit takes another approach in order to correctly and fully implement Scheme tail-calls and continuations, which don’t have a direct equivalent in (standard) C. Gambit uses a trampoline to implement control transfers between pieces of Scheme code (whether they are calls to a procedure or returns from a procedure). So if you use the C calling convention as a reference, then Scheme calls and returns typically have a higher cost than a C call/return.

Gambit's C FFI has the additional burden of converting the data representation of parameters and return values. A Scheme integer is not represented the same way as a C integer, so some conversion must happen when going from Scheme to C and back. This is true for all types (floats, booleans, characters, strings, etc) except the scheme-object type that keeps the same representation. There is also the burden of protecting Scheme values (and converted values) from being garbage collected and moved in a way that would cause issues on the C side. Finally there’s an overhead in supporting bidirectional calls between Scheme and C, and exceptions, and continuations.
So calls between Scheme and C using c-lambda and c-define are certainly not as cheap as a C function calling another C function. Surprisingly, the use of trampolines and some compiler optimizations can make Scheme to Scheme calls faster than C to C calls (but that is a different subject).
When performance is critical the lesser known (##c-code "<c-code>") which has no direct overhead can be quite useful. If x is a variable containing a fixnum, then computing the square of x can be done with essentially no overhead with (##c-code "___RESULT = ___FIX(___INT(___ARG1)*___INT(___ARG1));" x). Note however that this code does no type checking and no overflow checking (but that’s sometimes what you want to avoid).

A very thorough answer!

I hope that this has given you a brief overview of Gambit-C’s C interfacing capabilities, and I hope that you will give the language a try. If you could only pick two programming languages to cover all of your programming needs, what would you choose? Comment down below, I’d love to hear your thoughts.

Gitlab Repo

You can find all of the code in this example in the Gitlab repo. Also if you haven’t had enough Gambit yet check out Gerbil Scheme! It is built on top of Gambit-C and adds some cool features!

Call To Action 📣

Hi 👋 my name is Diego Crespo and I like to talk about technology, niche programming languages, and AI. I have a Twitter and a Mastodon, if you’d like to follow me on other social media platforms. If you liked the article, consider liking and subscribing. And if you haven’t why not check out another article of mine listed below! Thank you for reading and giving me a little of your valuable time. A.M.D.G

Deus In Machina