strymonas is a streaming library design for OCaml and Scala that offers support for fast, bulk, in-memory processing. It is developed using the state of the art facilities of Multi-Stage Programming (MSP) for each language. The utmost goal of the library is to offer a streaming API that achieves stream fusion at the highest level without altering the compiler backend. It covers the combination of many interesting (and challenging) combinators. The OCaml flavor, depends on BER MetaOCaml, OCaml’s dialect for MSP. The Scala one depends on scala-lms.
The approach is described in detail in the paper Stream Fusion, to Completeness (pdf), that will be presented at the 44th ACM SIGPLAN Symposium on Principles of Programming Languages (POPL’17) in Paris.
Getting Started
The project consists of three separate projects that you can clone separately:
git clone https://github.com/strymonas/staged-streams.ocaml
git clone https://github.com/strymonas/staged-streams.scala
git clone https://github.com/strymonas/java8-benchmarks
Prerequisites
To run each one of the three benchmark and test suites, please install the following:
- Java 8 (JVM version of at least 1.8.0 65): from your system’s package manager
- Install
maven
the Apache build manager for Java projects - OCaml: from your system’s package manager
- OPAM: after you install OPAM you will need to initialize it with
opam init
and resolve any dependencies with:- either the
opam depext
command - or your system’s package manager (e.g., OML’s dependencies)
- either the
- sbt: install following sbt’s documentation
How to compile and run the benchmarks
MetaOCaml
The following will enable MetaOCaml, install dependencies, make the project and run unit tests.
$ opam update
$ opam switch 4.02.1+BER
$ eval"opam config env"
$ cd staging-streams.ocaml
$ opam pin add staged-streams http://github.com/strymonas/staged-streams.ocaml.git
To run the benchmarks via command line use the commands below:
$ ./benchmark_batteries.byte
$ ./benchmark_stream.byte
LMS
The following will fire-up the sbt console, compile and run the test suite.
$ cd staging-streams.scala
$ sbt
$ test
$ testOnly <name of test>
To run the benchmarks via the sbt command line:
$ cd staging-streams.scala
$ sbt
$ jmh:run -i 10 -wi 10-f1 .*
Java 8
To run the benchmarks via command line:
$ cd java8-benchmarks
$ mvn clean install
$ java -Xms6g -Xmx6g -XX:-TieredCompilation -jar target/benchmarks.jar -i 10 -wi 10-f1 .*
Hello World
Developers can use our library as any other streaming/collection library. We assume the reader is familiar with: i) streaming APIs and ii) Multi-Stage Programming.
Firstly, the API is similar with F#’s Seq
type, Java 8’s Stream
API, collection APIs from OCaml and OCaml Batteries and many more. Our library supports all the core combinators for pipeline construction. Creation of streams is realized with of_arr
(from arrays) and unfold
(for infinite streams). We support: fold
, map
, filter
, take
, flat_map
and zip_with
.
Secondly, Multi-Stage Programming (MSP) is the core technique that this library follows. MSP is a meta-programming technique that allows a disciplined and safe form of run-time code generation. Staging, refers to the distinction of stages between compile- and run-time with more stages in-between. When more information about run-time values is available, staging can make a program profit in performance, by partially evaluating.
In staged-streams, this kind of (dynamic) information consists of the structure of the pipeline, the in-between combinators and the bodies of the lambdas used.
We prompt the user to read about MSP in the paper A Gentle Introduction to Multi-stage Programming, on BER MetaOCaml and on Scala-Lightweight Modular Staging (LMS) which we use for our Scala implementation. The details of our technique and the benefits on the high level of stream fusion that is achieved, are included in the POPL17 paper.
Hello World in OCaml!
In the following example we create a simple stream from an array of six elements in OCaml using MetaOCaml’s staging annotations.
open Stream_combinatorslet example =
of_arr .<[| 0;1;2;3 |]>.
|> filter (fun x -> .<.~x mod2 = 0>.)
|> fold (fun z a -> .<.~a :: .~z>.) .<[]>.
Runcode.run example;;
When the execution reaches the run
method, an optimized version of the stream will be compiled (emitted) and executed (for optimal results, native code compilation must be used using ocamlopt
). The resulted code will consist of a tight, for-loop.
Hello World in Scala!
With Scala LMS, the same example is demonstrated in the snippet below.
def example (xs : Rep[Array[Int]]) : Rep[Int] = Stream[Int](xs)
.filter(d => d % 2 == 0)
.fold(0, ((a : Rep[Int]) => (b : Rep[Int]) => a + b))val example_s = compile(example)
example_s(Array(0, 1, 2, 3))
For more examples see the unit directories with the unit tests for OCaml and Scala.
Bugs and Feedback
To discuss bugs and improvements please use our Github Issues, for discussions and recommendations our gitter pages.