SCHEME IMPLEMENTATIONS

Oh the schemes I have seen !

Being the scheme newbie that I am/was (or is it was/am ?) I have tried many schemes. Even thought there is a section about scheme implementations in the scheme FAQ I decided it was time that someone created an evaluation of scheme implementation in terms that a scheme newbie might be able to understand and appreciate.

I also thought it would be nice to include a table of criteria that people could use to make an evaluation of the suitability of a particular scheme implementation for their application.

If you see any errors on this page, please let me know so I can correct them ! E-mail me at


Scheme Implementation Feature Table

R5RS Features
Implementation Version Numeric
Tower
Compiler Proper
Tail Calls
call/cc Multiple
Values
syntax-
rules
Lispish
Macros
Bigloo 2.5a I/F C Y? Y* Y Y Y
STklos 0.52 I/F/B/C/R byte Y Y Y Y Y
RScheme 0.7.3 I/F C Y Y N N Y
Scheme48 0.56 I/F/B/C/R byte Y Y Y Y N
SCM 5d6 I/F/B/C C Y Y Y Y Y
Gambit 3.0 I/F/B/C/R C Y Y N Y* Y
SCSH 0.6.1 I/F/B/C/R byte Y Y Y Y N
MzScheme 103p1 I/F/B/C/R C Y Y Y Y Y
Chicken 1024 I/F C Y Y Y Y Y
MIT-Scheme 7.7.0 I/F/B/C/R native Y Y Y Y Y
LispMe 3.11 I/F/B/C byte Y Y N N Y
Kawa 1.6.98 I/F/B/C/R jvm optional esc Y Y Y
Implementation Objects FFI Dynamic
Loading
Threads Modules Debugger
BIGLOO Y Y Y Y Y N
STKlos Y Y+ Y N Y N
RSCHEME Y N Y Y Y Y
SCHEME48 N Y Y Y Y Y
SCM N N Y N N N
GAMBIT N Y Y N N Y!
*SCSH N Y Y Y Y ?
MzScheme Y N? ? Y Y ?
Chicken Y Y Y Y Y Y
MIT-Scheme Y Y Y Y Y Y
LispMe N Y+ N N N N
Kawa Y Y+ Y Y Y Y
Libraries and Features
Implementation Regular
Expressions
OS
Functions
Hash
Tables
Sockets SLIB
BIGLOO Y Y Y Y N
STKlos Y Y Y Y Y
RSCHEME Y Y Y Y N
SCHEME48 Y Y Y N N
SCM Y* Y N Y Y
GAMBIT N N N N Y*
*SCSH Y Y! Y Y N
MzScheme Y Y Y Y Y
Chicken Y Y Y Y ?
MIT-Scheme Y Y Y Y Y
LispMe N Y N Y N
Kawa N Y Y Y Y-

*SCSH is a little different since it is not really a scheme implementation. It is actually a version of scheme-48 which has been extended for use as a shell programming environment. It has many powerful procedures for doing work which might be done in something like bash, or perl.

Explanation of table headings & entries

Version

The version I currently have installed which is not necessarily the latest version.

Numeric Tower

Support for floating point, integer, complex, arbitrary precision and rational numbers.

Most implementations which support bignums will "promote" operation arithmetic operations which would normally overflow into a bignum.

Proper tail calls

From section 3.5. of R5RS:

Implementations of Scheme are required to be properly tail recursive.

it's easy to demonstrate tail recursion, using the standard factorial example :

(define factorial
  (lambda (n value)
    (if (= 0 n)
       value
       (factorial (- n 1) (* n value)))))

> (factorial 7 1)
5040

Writing a routine like factorial in a language which does not properly support tail calls, will cause the stack to grow for each recursion. A more useful example involves state machines. State machines are generally represented as recursive procedures, with one procedure assigned to each state. So the states may call each other recursively. As long as proper tail calls are implemented, and the state machine procedures invoke the next procedure in a tail-recursive manner, you don't need to worry about your stack blowing up.

Why state machines might be useful is left as an exercise for the reader.

Consult usenet for long and involved flame wars as to why you might (not) care about proper tail calls.

call-with-current-continuation (call/cc)

Another feature which is hard to get used to. I actually understand the operation of call/cc more than I understand force/delay. call/cc is basically a way of "remembering" the state of a program at a certain point in its execution.

You can read a more in-depth explanation .

Some implementations suffer a rather nasty performance hit if you make heavy use of continuations. Obviously this only matters if you use continuations a lot.

Multiple Return Values

Allows you to return multiple values. This is not some trick involving lists. Separate multiple values are returned to the continuation (there's that word again).

R5RS Macros

The macros which were an "appendix" in R4RS are now part of the spec. I like 'em. It did take some deep-thought to figure them out. Just think pattern matching . Now that R5RS has made this type of macro "official", they should be showing up in more implementations. R5RS is also a good source of examples for the use of define-syntax since the "derived" expression types are defined in terms of the "primitive" expression types, e.g. cond is defined in terms of let , if and begin.

The gambit implementation has R5RS macros available on the web-page. You need to get the syntax-case macro package.

speaking of syntax-case , syntax-case is a more powerful macro system than define-syntax. More powerful in the sense that you can write define-syntax using syntax-case. I'm not sure if it is as powerful as lispish macros, i.e. allows name-mangling, but it's worth reading about .

Lispish macros

Are "define-macro" or "defmacro" style macros. Here is an example of a while macro written with lispish macros and define-syntax.

    (define-syntax while
      (syntax-rules()
       ((while test body1 ...)
	(let loop ()
	  (cond 
	   (test body1 ... (loop)))))))
   
    (define-macro (while test . body)
      `(let loop ()
	   (cond 
	    (,test
	     ,@body
	     (loop)))))
   

There are things you can do with lispish macros which cannot be done with R5RS macros. Most notably you can do "name mangling" with lispish macros, i.e. creating symbol names. This is very handy when defining "define-structure" sorts of operations where you want a bunch of procedures named automatagically using the structure name, .e.g

(define-structure foo field1)
should define the following procedures :
make-foo
foo-field1
foo-field1-set!

The downside of lispish macros is that they are subject to something called "variable capture". Even though this is often considered a disadvantage those crazy Lisp hackers will often write macros where the variable capture is intentional.

Objects

I have to admit I like OO-programming. I find it very intuitive. A lot of, but not all, problems seem to lend themselves to an object oriented solution. An object system makes structures or records redundant, although structures/records would probably be more efficient than using an OO system for simple record tasks.

STklos, RScheme : support an object system which is very similar to CLOS.

Foreign Function Interface (FFI)

There are two, sort of overlapping, definitions as to what an FFI could be. Version 1 of an FFI is something which allows you to call functions from another language. Usually this is accomplished by using the dynamic loading facilities of the OS. Here is a simple example :

Let's say, for some reason, that the cos function is not implemented for your scheme. You know, of course, that libm, the c math library includes it. If your implementation has an FFI which can interface to C, you could tell it the to use the routine named cos, in libm, which takes a double as an argument and returns a double. This definition allows the implementation to dynamically load the math modules and call the cos function, taking care of the type conversion stuff for you. An FFI can be very useful in allowing you to access OS-specific routines.

It can also be very dangerous if you don't describe things properly, or if you start calling memory allocation procedures in a haphazard manner.

Obviously this concept can be extended for use for libraries written in some other language with the express purpose of extending scheme. The most common use of this feature is to write certain procedures in C for efficiency reasons. The FFI will take care of generating the "glue" code for you.

The second "version" of an FFI is the way in which almost all of the schemes can be extended by using C-routines and then linking them into the interpreter or dynamically loading them. The specifics of how to create a new scheme type or extend the implementation in this manner is, of course, very implementation dependent.

Basically routines are written in C and compiled. Special glue routines are used to tell the implementation that you have added a new type and/or new procedures. The object file could then linked in as part of the scheme interpreter. Some implementations support dynamic loading which means that the object file can be read using the "load" procedure. This is very useful in that it doesn't bloat the core executable since the library is only pulled in if you need the routines. It also makes it easy to debug, since the module can be compiled as a separate unit.

According to David Fox, an implementation needs to include object finalization, i.e., there needs to be a procedure which can be called for an object just before it is garbage collected. And I quote :

I've been suggesting to implementors that if you have an FFI it is crucial that you also have object finalization. Otherwise a foreign object that, say, has an exclusive lock an external physical device can't properly relinquish the device when it is garbage collected.

Y+ indicates that the implementation does include object finalization as part of it's type definition.

Compiler

Many scheme implementations compile into bytecodes, which corresponds to the "byte" entry. Some allow you to compile to C-code which can then be compiled to machine code. Only one that I know of, Larceny, compiles to native code directly.

Dynamic Loading

If you write extensions in C and then compile them to object modules, dynamic loading will allow you to load the object module instead of having to link to the interpreter executable. A very nice feature. Almost indispensable if you are planning on writing C-code to extend a particular implementation.

Threads

Allows multi processing. Most schemes that support threads, support them even if the underlying OS does not.

Modules

Some people complain of name-space clutter when using scheme, i.e. the definitions you make a globally visible and so you have to be careful you don't clobber and existing name.

I personally don't find "name-space clutter" to be that big a deal. A module system does promote modular programming which is goodness. However, I don't want to modularize a quick scheme hack, and some implementations require you to always use modules - which is inconvenient if you are using different implementations with (*sigh*) different module systems.

Debugger

Most implementations provide debugging by giving you a trace function (actually I'm sure it's a macro). When a function is traced, the program reports the value of its arguments and the return value(s) whenever the function is called. Really good debuggers will allow you to set breakpoints and then step through the code. AFAIK, only Gambit allows you to do this.

Regular Expressions

Regular expression are mighty useful for text manipulation.

SCM : regular expressions are a separate item compiled in to the interpreter so you need to have the GNU regex package, which easy to get a hold of.

OS Functions

Function which allow you to invoke OS operations including things like external program execution, file manipulation, etc...

Hash Tables

Hash tables - you know tables with O(1) look-up. Personally I think that hash-tables are not really supported unless they already include a string-hashing function.

Don't worry, even if an implementation doesn't include hash-table support you can use the hash-tables found in SLIB.

Sockets

Haven't a clue as to what you use sockets for. I know it has something to do with network access. Haven't ever used them. So if you want to know more about why you should care if a scheme implementation supports sockets, visit Beej's Guide to Network Programming .

SLIB
SLIB is a collection of scheme routines do to a great deal of common, and even uncommon, tasks in scheme. You really should use it as it greatly enhances portability. A Y here means that the scheme implementation can use SLIB in some sort of useful manner. The #1 problem in using SLIB is a problem with macros. Hopefully all implementations will soon support R5RS macros and that will greatly improve the portability of SLIB.

The #2 problem is an out of date "implementation.init" file.

Gambit : Macros cannot cannot be "load"ed. They must be "include"d. This makes it a pain to use SLIB packages that require macros, which, unfortunately, is a lot of them.

My Experiences

Here are my personal experiences with the different scheme implementations. My experiences are heavily weighted by ease of building. A lot of times there is a debian or red-hat package, but sometimes, especially if you want to stay current, you have to go straight to the source.

If there is one thing that drives me nuts its having to goof around with makefiles or running around trying to find libraries to make something build. My patience limit is about 1/2 hour. If I can't get it to build in 1/2 hour, I make room on the hard drive for an implementation I can get to build. Hey, this doesn't mean its not a good implementation. SCM always gives me a lot of trouble and it's an excellent scheme implementation - if I can get it built it works great. I'm just not patient. Luckily there are people out there who can build it and then upload a debian package. "apt-get install" is really the best way to install your favorite scheme.

Chicken I haven't been using Chicken for active program development, but I have been using it in my benchmarking tests. It appears to be quite stable and is extremely feature rich.

STklos A re-invention of STk. It uses a byte-code compiler and is now one of the fastest interpreters around. The former Tk interface has now been replaced with a Gtk interface.

SCM FAST. Small and fast. Has all sorts of neat extensions like os-support and regexps. Probably the biggest pain to build. I've gotten smart and I just install the debian package, scm installation for the chronically lazy. This is the underlying implementation for guile.

Gambit Gambit can compile to C code. Use the correct "compile options" and the code is VERY fast. The default gambit distribution does not include an object system, but there is a separate package based on "Meroon" which is an object system done by Christian Quinnec. Also, Gambit does not support R5RS macros by default, but there is a separate package available on the web-page which provides R5RS macro support (look for "syntax-case" macros). Gambit includes a nice debugger and provides really thorough/accurate error messages. I've written some very numerically intensive programs and achieved quite impressive results as compare to equivalent implementations of the same program in C. Usually I see about a 10-15% penalty for using Gambit.

Scheme48 is very nice. It's not documented really well, and includes a LOT of different packages, so it's kind of like being on a treasure hunt :-).

The module system is pretty darn complete. Olin Shivers published a letter about the module system in comp.lang.scheme.scsh which you really want to read. I also want to know how to access those ,commands from within the interpreter.

SCSH A great program for text processing and shell programming which uses the scheme48 implementation. There are some really useful things in scsh like field-splitters and a really impressive awk macro. It has many built in functions to help with "information processing", e.g., a command which allow you to run a process and then return standard output as a list of string. Read the source code for the awk macro and be impressed.

I actually managed to get a modularized program working under scsh, and as long as I don't lose that example - I'll always know how to do it.

Bigloo Erick (of STk/STklos fame) and Manuel are now working together. Bigloo has a compiler, lexing, LALR(1) parsing, object system, pattern matching and modules. The lexer and parser are great ! I did a couple of small projects with them and found them to be very useful. They are based on publicly available scheme code, so it should be possible to enable this sort of functionality in scheme. What a productivity boost. Make changes to your lexer and see the results right away, it's great ! The modules in bigloo are fairly easy to understand. If you want to interpret code without having to modularize it, just use the -i option.

Rscheme Hmmm. It does have macros, define-macro. I wonder how I missed that the first time around ?

MzScheme Very fast. Very complete. I've recently discovered the GUI programming environment which is known as "Mr. Ed". The editor does some really nice real-time annotation of your source code. You need Mr Spidey to turn it on.

For one thing it will auto-magically show you is the proper scope of the variable you are pointing to. It's hard to explain, download it and try it out. Normally I'm not much of a GUI person, but I though Mr Spidey was pretty darn useful, especially since it also does static analysis of the source.

Benchmarks

I have made some modifications to the benchmarking package found in the Gambit distribution. Most of the changes are major. For one thing the driver has been completely re-written to use scsh.

Other than giving an indication of the speed of various scheme implementations, the package also contains benchmarks which will break some scheme implementations, so it makes a good regression test for a new version of your favorite implementation.

I hope to begin posting results of some of the benchmarks on this page in the near future, which pretty much corresponds to after I fix all the new bugs which have cropped up since I "upgraded" to 0.6.1.