
One Step Closer to a Closed Internet


Today, the FCC voted on Chairman Ajit Pai’s proposal to repeal and replace net neutrality protections enacted in 2015. The verdict: to move forward with Pai’s proposal.

We’re deeply disheartened. Today’s FCC vote to repeal and replace net neutrality protections brings us one step closer to a closed internet. Although it is sometimes hard to describe the “real” impacts of these decisions, this one is easy: it leads to an internet that benefits Internet Service Providers (ISPs), not users, and erodes free speech, competition, innovation and user choice.

This vote undoes years of progress leading up to 2015’s net neutrality protections. The 2015  rules properly place ISPs under “Title II” of the Communications Act of 1934, and through that well-tested basis of legal authority, prohibit ISPs from engaging in paid prioritization and blocking or throttling of web content, applications and services. These rules ensured a more open, healthy Internet.

Pai’s proposal removes the 2015 protections and re-classifies ISPs under “Title I,” which courts already have determined is insufficient for ensuring a truly neutral net. The result: ISPs would once again be able to prioritize, block and throttle with impunity. This means fewer opportunities for startups and entrepreneurs, and a chilling effect on innovation, free expression and choice online.

Net neutrality isn’t an abstract issue — it has significant, real-world effects. For example, in the past, without net neutrality protections, ISPs have imposed limits on who can use FaceTime, determined how we stream videos, and adopted other underhanded business practices.

So what’s next and what can we do?

We’re now entering a 90-day public comment period, which ends in mid-August. The FCC may determine a path forward as soon as October of this year.

During the public comment period in 2015, nearly 4 million citizens wrote to the FCC, many of them demanding strong net neutrality protections.  We all need to show the same commitment again.

We’re already well on our way to making noise. In the weeks since Pai first announced his proposal, more than 100,000 citizens (not bots) have signed Mozilla’s net neutrality petition at mzl.la/savetheinternet. And countless callers (again, not bots) have recorded more than 50 hours of voicemail for the FCC’s ears. We need more of this.

We’re also planning strategic, direct engagement with policymakers, including through written comments in the FCC’s open proceeding. Over the next three months, Mozilla will continue to amplify internet users’ voices and fuel the movement for a healthy internet.


Building Auto-Tuners with Structured Bayesian Optimization [pdf]

Heroku CI Is Now Generally Available


Today we are proud to announce that Heroku CI, a low-configuration test runner for unit and browser testing that is tightly integrated with Heroku Pipelines, is now in General Availability.


Continuous integration (CI) is a popular best practice for building software with optimal feature release speed and quality, and an essential part of a complete continuous delivery (CD) practice. As we have done for builds, deployments, and CD, Heroku CI dramatically improves the ease, experience, and function of CI. Now your energy can go into your apps, not your process.

With today's addition of Heroku CI, Heroku now offers a complete CI/CD solution for developers in all of our officially supported languages: Node, Ruby, Java, Python, Go, Scala, PHP, and Clojure. As you would expect from Heroku, Heroku CI is simple, powerful, visual, and prescriptive. It is intended to provide the features and flexibility to be the complete CI solution for the vast majority of application development situations, serving use cases that range from small innovation teams, to large Enterprise projects.


Heroku CI requires little (or no) configuration. There is no IT involved; Heroku CI is automatically available and coordinated for all apps in Heroku Pipelines. Just turn on Heroku CI for the Pipeline, and each push to GitHub will run your tests. Tests reside in the location that is the norm for each supported language; for example, Go test scripts typically reside in files whose names end in "_test.go". These tests are executed automatically on each git push, so there is no learning curve, and little reconfiguration is typically necessary when migrating to Heroku CI from Jenkins and other CI systems.

For users who are also new to continuous delivery, we've made Heroku Pipelines set-up easier than ever with a straightforward 3-step setup that automatically creates and configures your review, development, staging, and production apps. All that's left is to click the "Tests" tab and turn on Heroku CI.


From setup, to running tests, to CI management, everything about Heroku CI is intended to be fully visual and intuitive -- even for users who are new to continuous integration. For each app, the status of the latest or currently running test run is shown clearly on the Pipelines page. Test actions are a click away, and fully available via the UI: re-run any test, run new tests against an arbitrary branch, search previous tests by branch or pull request, and see full detail for any previous test. And Heroku CI integrates seamlessly with GitHub - on every git push your tests run, allowing you to also see the test result within GitHub web or GitHub Desktop interfaces.

CI users who want more granular control, direct debug access, and programmatic control of CI actions can use the CLI interface for Heroku CI.

For every test you run, Heroku CI creates and populates an ephemeral app environment that mirrors your Staging and Production environments. These CI apps are created automatically, and then destroyed immediately after test runs complete. All the add-ons, databases, and configurations your code requires are optimized for test speed and parity with downstream environments. Over the beta period, we have been working with add-on partners to make sure the CI experience is fast and seamless.

Setup and tear-down for each CI run happen in seconds. Because we use these ephemeral Heroku apps to run your tests, there is no queue time (as is common with many CI systems). Your tests run immediately, every time, on dedicated Performance dynos.

Across the thousands of participants in our public beta, most developers observed test runs completing significantly faster than they expected.

We view CI as an essential part of effective development workflows; that is, part of a good overall delivery process.

Each CI-enabled Heroku Pipeline is charged just $10/month for an unlimited number of test runs. For each test run, dyno charges apply only for the duration of tests. We recommend and default to Performance-M dynos to power test runs, and you can specify other dyno sizes.

Note that all charges are pro-rated per second, with no commitment, so you can try out Heroku CI for pennies -- usually with little modification to your existing test scripts.

All Heroku Enterprise customers get unlimited CI-enabled Pipelines, and an unlimited number of test runs, all, of course, with zero queue time. No provisioning, authentication set-up, or management of CI is required for new projects, and Heroku CI can be turned on for any Heroku Pipeline with a single click.

Existing Heroku Enterprise dyno credits are automatically used for test runs, and invoices will contain a new section listing the CI-enabled Pipelines alongside the account-wide dyno usage for CI test runs.

All test run results are available at permanent URLs that can be referenced for compliance regimes, and all authentication is managed under existing Heroku Enterprise Teams (Org) security. Unification of security, authentication, and billing between CI and production deployments, along with a prescriptive methodology across company projects, lets Enterprises innovate on Heroku with the agility of a start-up.

Some terms are not usually associated with CI systems: we think Heroku CI is among the most pleasant, beautiful software testing systems available -- and we have you to thank for this. More than 1500 beta users tested Heroku CI, surfacing bugs and offering suggestions: telling us that some webhooks got dropped, that an icon on the tab might be nice, that it should be more obvious how to re-run a test ... and roughly 600 other notes, many of which grew into e-mail conversations with you. As is the case with all software, we will still be perfecting. And we are pretty proud of what we have here. Thank you, and keep the comments coming!

It's easy. Set up a Heroku Pipeline and you're ready. There's even a two-minute video here and a simple how-to. Give it a spin, and let us know what you think.

Startup School 13: How to Find Product Market Fit – Peter Reinhardt [video]


Peter Reinhardt, co-founder and CEO of Segment, shares his story on building different products and eventually finding product market fit.

Resources
Lecture Slides

Some materials in this MOOC are derived from content of a course offered at Stanford University, and this MOOC does not reflect the complete course offering at Stanford. This MOOC is offered solely by Y Combinator, and your participation in this MOOC does not establish any relationship between you and Stanford University.


Learn Kotlin in Y Minutes


Kotlin is a statically typed programming language for the JVM, Android and the browser. It is 100% interoperable with Java. Read more here.

// Single-line comments start with //
/*
Multi-line comments look like this.
*/

// The "package" keyword works in the same way as in Java.
package com.learnxinyminutes.kotlin

/*
The entry point to a Kotlin program is a function named "main".
The function is passed an array containing any command line arguments.
*/
fun main(args: Array<String>) {
    /*
    Declaring values is done using either "var" or "val".
    "val" declarations cannot be reassigned, whereas "vars" can.
    */
    val fooVal = 10 // we cannot later reassign fooVal to something else
    var fooVar = 10
    fooVar = 20 // fooVar can be reassigned

    /*
    In most cases, Kotlin can determine what the type of a variable is,
    so we don't have to explicitly specify it every time.
    We can explicitly declare the type of a variable like so:
    */
    val foo: Int = 7

    /*
    Strings can be represented in a similar way as in Java.
    Escaping is done with a backslash.
    */
    val fooString = "My String Is Here!"
    val barString = "Printing on a new line?\nNo Problem!"
    val bazString = "Do you want to add a tab?\tNo Problem!"
    println(fooString)
    println(barString)
    println(bazString)

    /*
    A raw string is delimited by a triple quote (""").
    Raw strings can contain newlines and any other characters.
    */
    val fooRawString = """
fun helloWorld(val name: String) {
    println("Hello, world!")
}
"""
    println(fooRawString)

    /*
    Strings can contain template expressions.
    A template expression starts with a dollar sign ($).
    */
    val fooTemplateString = "$fooString has ${fooString.length} characters"
    println(fooTemplateString)

    /*
    For a variable to hold null it must be explicitly specified as nullable.
    A variable can be specified as nullable by appending a ? to its type.
    We can access a nullable variable by using the ?. operator.
    We can use the ?: operator to specify an alternative value to use
    if a variable is null.
    */
    var fooNullable: String? = "abc"
    println(fooNullable?.length) // => 3
    println(fooNullable?.length ?: -1) // => 3
    fooNullable = null
    println(fooNullable?.length) // => null
    println(fooNullable?.length ?: -1) // => -1

    /*
    Functions can be declared using the "fun" keyword.
    Function arguments are specified in brackets after the function name.
    Function arguments can optionally have a default value.
    The function return type, if required, is specified after the arguments.
    */
    fun hello(name: String = "world"): String {
        return "Hello, $name!"
    }
    println(hello("foo")) // => Hello, foo!
    println(hello(name = "bar")) // => Hello, bar!
    println(hello()) // => Hello, world!

    /*
    A function parameter may be marked with the "vararg" keyword
    to allow a variable number of arguments to be passed to the function.
    */
    fun varargExample(vararg names: Int) {
        println("Argument has ${names.size} elements")
    }
    varargExample() // => Argument has 0 elements
    varargExample(1) // => Argument has 1 elements
    varargExample(1, 2, 3) // => Argument has 3 elements

    /*
    When a function consists of a single expression then the curly brackets
    can be omitted. The body is specified after a = symbol.
    */
    fun odd(x: Int): Boolean = x % 2 == 1
    println(odd(6)) // => false
    println(odd(7)) // => true

    // If the return type can be inferred then we don't need to specify it.
    fun even(x: Int) = x % 2 == 0
    println(even(6)) // => true
    println(even(7)) // => false

    // Functions can take functions as arguments and return functions.
    fun not(f: (Int) -> Boolean): (Int) -> Boolean {
        return { n -> !f.invoke(n) }
    }
    // Named functions can be specified as arguments using the :: operator.
    val notOdd = not(::odd)
    val notEven = not(::even)
    // Lambda expressions can be specified as arguments.
    val notZero = not { n -> n == 0 }
    /*
    If a lambda has only one parameter
    then its declaration can be omitted (along with the ->).
    The name of the single parameter will be "it".
    */
    val notPositive = not { it > 0 }
    for (i in 0..4) {
        println("${notOdd(i)} ${notEven(i)} ${notZero(i)} ${notPositive(i)}")
    }

    // The "class" keyword is used to declare classes.
    class ExampleClass(val x: Int) {
        fun memberFunction(y: Int): Int {
            return x + y
        }

        infix fun infixMemberFunction(y: Int): Int {
            return x * y
        }
    }
    /*
    To create a new instance we call the constructor.
    Note that Kotlin does not have a "new" keyword.
    */
    val fooExampleClass = ExampleClass(7)
    // Member functions can be called using dot notation.
    println(fooExampleClass.memberFunction(4)) // => 11
    /*
    If a function has been marked with the "infix" keyword then it can be
    called using infix notation.
    */
    println(fooExampleClass infixMemberFunction 4) // => 28

    /*
    Data classes are a concise way to create classes that just hold data.
    The "hashCode"/"equals" and "toString" methods are automatically generated.
    */
    data class DataClassExample(val x: Int, val y: Int, val z: Int)
    val fooData = DataClassExample(1, 2, 4)
    println(fooData) // => DataClassExample(x=1, y=2, z=4)

    // Data classes have a "copy" function.
    val fooCopy = fooData.copy(y = 100)
    println(fooCopy) // => DataClassExample(x=1, y=100, z=4)

    // Objects can be destructured into multiple variables.
    val (a, b, c) = fooCopy
    println("$a $b $c") // => 1 100 4

    // destructuring in "for" loop
    for ((a, b, c) in listOf(fooData)) {
        println("$a $b $c") // => 1 2 4
    }

    val mapData = mapOf("a" to 1, "b" to 2)
    // Map.Entry is destructurable as well
    for ((key, value) in mapData) {
        println("$key -> $value")
    }

    // The "with" function is similar to the JavaScript "with" statement.
    data class MutableDataClassExample(var x: Int, var y: Int, var z: Int)
    val fooMutableData = MutableDataClassExample(7, 4, 9)
    with(fooMutableData) {
        x -= 2
        y += 2
        z--
    }
    println(fooMutableData) // => MutableDataClassExample(x=5, y=6, z=8)

    /*
    We can create a list using the "listOf" function.
    The list will be immutable - elements cannot be added or removed.
    */
    val fooList = listOf("a", "b", "c")
    println(fooList.size) // => 3
    println(fooList.first()) // => a
    println(fooList.last()) // => c
    // Elements of a list can be accessed by their index.
    println(fooList[1]) // => b

    // A mutable list can be created using the "mutableListOf" function.
    val fooMutableList = mutableListOf("a", "b", "c")
    fooMutableList.add("d")
    println(fooMutableList.last()) // => d
    println(fooMutableList.size) // => 4

    // We can create a set using the "setOf" function.
    val fooSet = setOf("a", "b", "c")
    println(fooSet.contains("a")) // => true
    println(fooSet.contains("z")) // => false

    // We can create a map using the "mapOf" function.
    val fooMap = mapOf("a" to 8, "b" to 7, "c" to 9)
    // Map values can be accessed by their key.
    println(fooMap["a"]) // => 8

    /*
    Sequences represent lazily-evaluated collections.
    We can create a sequence using the "generateSequence" function.
    */
    val fooSequence = generateSequence(1, { it + 1 })
    val x = fooSequence.take(10).toList()
    println(x) // => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

    // An example of using a sequence to generate Fibonacci numbers:
    fun fibonacciSequence(): Sequence<Long> {
        var a = 0L
        var b = 1L

        fun next(): Long {
            val result = a + b
            a = b
            b = result
            return a
        }

        return generateSequence(::next)
    }
    val y = fibonacciSequence().take(10).toList()
    println(y) // => [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

    // Kotlin provides higher-order functions for working with collections.
    val z = (1..9).map { it * 3 }
        .filter { it < 20 }
        .groupBy { it % 2 == 0 }
        .mapKeys { if (it.key) "even" else "odd" }
    println(z) // => {odd=[3, 9, 15], even=[6, 12, 18]}

    // A "for" loop can be used with anything that provides an iterator.
    for (c in "hello") {
        println(c)
    }

    // "while" loops work in the same way as other languages.
    var ctr = 0
    while (ctr < 5) {
        println(ctr)
        ctr++
    }
    do {
        println(ctr)
        ctr++
    } while (ctr < 10)

    /*
    "if" can be used as an expression that returns a value.
    For this reason the ternary ?: operator is not needed in Kotlin.
    */
    val num = 5
    val message = if (num % 2 == 0) "even" else "odd"
    println("$num is $message") // => 5 is odd

    // "when" can be used as an alternative to "if-else if" chains.
    val i = 10
    when {
        i < 7 -> println("first block")
        fooString.startsWith("hello") -> println("second block")
        else -> println("else block")
    }

    // "when" can be used with an argument.
    when (i) {
        0, 21 -> println("0 or 21")
        in 1..20 -> println("in the range 1 to 20")
        else -> println("none of the above")
    }

    // "when" can be used as a function that returns a value.
    var result = when (i) {
        0, 21 -> "0 or 21"
        in 1..20 -> "in the range 1 to 20"
        else -> "none of the above"
    }
    println(result)

    /*
    We can check if an object is a particular type by using the "is" operator.
    If an object passes a type check then it can be used as that type without
    explicitly casting it.
    */
    fun smartCastExample(x: Any): Boolean {
        if (x is Boolean) {
            // x is automatically cast to Boolean
            return x
        } else if (x is Int) {
            // x is automatically cast to Int
            return x > 0
        } else if (x is String) {
            // x is automatically cast to String
            return x.isNotEmpty()
        } else {
            return false
        }
    }
    println(smartCastExample("Hello, world!")) // => true
    println(smartCastExample("")) // => false
    println(smartCastExample(5)) // => true
    println(smartCastExample(0)) // => false
    println(smartCastExample(true)) // => true

    // Smartcast also works with when block
    fun smartCastWhenExample(x: Any) = when (x) {
        is Boolean -> x
        is Int -> x > 0
        is String -> x.isNotEmpty()
        else -> false
    }

    /*
    Extensions are a way to add new functionality to a class.
    This is similar to C# extension methods.
    */
    fun String.remove(c: Char): String {
        return this.filter { it != c }
    }
    println("Hello, world!".remove('l')) // => Heo, word!

    println(EnumExample.A) // => A
    println(ObjectExample.hello()) // => hello
}

// Enum classes are similar to Java enum types.
enum class EnumExample {
    A, B, C
}

/*
The "object" keyword can be used to create singleton objects.
We cannot instantiate it but we can refer to its unique instance by its name.
This is similar to Scala singleton objects.
*/
object ObjectExample {
    fun hello(): String {
        return "hello"
    }
}

fun useObject() {
    ObjectExample.hello()
    val someRef: Any = ObjectExample // we use objects name just as is
}

Got a suggestion? A correction, perhaps? Open an Issue on the Github Repo, or make a pull request yourself!

What Does the Met’s New Online Collection Mean for Art Students?


This morning I was looking at a few of the 400,000 images the Metropolitan Museum of Art published online a couple days ago — zooming in close to the paintings to see how they were made. For someone who went to art school, being able to do this is a revelation. I used to go to the museum with my sketchpad and copy the old masters. I’d get as close as I could to understand the brush strokes, colors, lines. The guards knew who to watch out for and would bark suddenly when we stuck our faces over the imaginary line.

As class assignments we were required to copy hundreds — literally hundreds — of the masters’ drawings and paintings. For those we mostly worked from images in books — a picture the size of a wallet photo.

Which is one of the many reasons this new Met resource is fucking phenomenal.

You can get so, so close — far closer than one could in real life.

Here are three images of the same painting, Study Head of A Young Woman by Anthony Van Dyck: the image as it would be in a book, as close as you could get in a museum, and then as close as you can get online.

Portrait as you’d see it in a textbook.
Portrait as you’d see it in a museum gallery.
Portrait as you can see it on The Met’s online collection.

What, whoa.

I was just looking to see how the ear was formed using highlights and shadows — and turns out there’s text under the painting. It’s not legible, but I’d never have seen it if I hadn’t been able to zoom in like I did. I looked it up. This is from the Met’s website:

The picture’s style suggests a date of about 1618–20, when van Dyck collaborated with Rubens. The paper support appears to be from an account book written in Italian with traces of Flemish (possibly proper names). Perhaps the paper was used previously by Rubens, who often wrote in Italian, but it is far from certain that the barely discernible writing is in his hand.

The above text is possibly written on the wall of the gallery where the painting is hung, or as a sidebar in a textbook — but I rarely read the labels anymore — so discovering the text accidentally was a total treat for me. And it’s the kind of discovery that I think students need to have often if they are going to stay interested.

I envy art students today, with access to every painting they could ever want at their fingertips, able to take a magnifying glass to 400,000 works of art in the Met’s collection and deconstruct their surface anatomy. However — and it’s a big however — I hope they remember that the painting they are examining is no longer a painting. It is a photograph of a painting. Moreover, it’s a photograph being re-presented on a glowing screen. The result? The work becomes inherently flat, and not paint. You can see the lushness of the brush strokes, but via shadow and light represented in a photograph. You can see the color, but as light, not paint — and through at least two modifications (the color change of the digital camera, then the color change of your screen). What you can’t do is move around the surface of the painting and see the sculpture of the paint. You can’t get the Thingness of the painting that is so inherent in the making of any object, of any painting.

I have no doubt I’m going to be using the Met’s online collection frequently for research, reference or just plain enjoyment. The creation of the online gallery is a tremendously generous move and an example to other museums and private collections. For those of us who are current or eternal students of how drawings and paintings are made, I’m sure the collection will serve as an endless resource of detail and discovery. And instead of being a replacement for an actual visit to see the work in person, I hope it will be an inspiration and reminder for us to go all the more.

Writing a Lisp: Continuations


In Lisp, Haskell

This week I added continuations to my Lisp. They’re a fascinating feature, and a popular demand in the survey. It has been the most challenging feature so far, both to implement and explain, but the result has been worth it.

First, what are continuations?

A continuation is a special kind of function that’s like a bookmark to the location of an expression. Continuations let you jump back to an earlier point in the program, thereby circumventing the control flow of the usual evaluation model. (Beautiful Racket)

Continuations can be used to implement other control mechanisms like exceptions, return, generators, coroutines, and so on.

In the example below, let/cc binds the current continuation to here. It evaluates the let block, which assigns the continuation to cont. The last expression is returned as usual, and the evaluation continues.

(define cont nil)
(+ 1 (+ 2 (+ 3 (+ (let/cc here (set! cont here) 4) 5)))) ; 15

Ordinary functions

In my Lisp, continuations are just ordinary functions containing a short-circuit form. This transparency makes debugging easier, and the implementation smaller.

(lambda (x) (<primitive> (+ 1 (+ 2 (+ 3 (+ x 5))))))

When invoked, short-circuit immediately transfers control to the continuation, skipping the surrounding expression.

(* 10 (cont 10)) ; 21, not 210

Another example: the return statement. In this case control immediately jumps back to the top of the function scope, returning the given value.

(define (return-test)
  (let/cc return
    1
    (return 2)
    3))

(return-test) ; 2
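Exceptions can be sketched the same way: invoking the continuation abandons the rest of the body and hands a value straight back to the caller. A minimal sketch, assuming numeric primitives like = and / exist in my Lisp:

(define (safe-div a b)
  (let/cc throw
    (if (= b 0)
        (throw 'division-by-zero) ; escape with an error value
        (/ a b))))

(safe-div 10 2) ; 5
(safe-div 10 0) ; division-by-zero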

Let as a macro

The let form is really just syntactic sugar. It’s expanded into a call to call/cc, short for call-with-current-continuation.

(define-syntax (let/cc sym . body)
  '(call/cc ~(list* 'lambda '(~sym) body)))

Early exit with either

To account for the possibility of early exit, I added EitherT to the monad stack, where Left represents an early exit, and Right the usual order of evaluation.

newtype LispM a = LispM
  { unLispM :: EitherT LispVal (StateT Callstack IO) a }
  deriving (Monad, Functor, Applicative, MonadIO, MonadState Callstack)

run extracts the Either from the LispM monad, using an empty call stack.

run :: LispM a -> IO (Either LispVal a)
run m = evalStateT (runEitherT (unLispM m)) []

shortCircuit is just a synonym for left and throwError.

shortCircuit :: LispVal -> LispM ()
shortCircuit = LispM . left

Capturing the context

call/cc and shortCircuit are also reified special forms. call/cc is predefined in the environment. When it is invoked, it captures the surrounding computation, wraps it inside of a function, and passes it to its argument.

When the continuation is invoked, it calls shortCircuit' on its body, and the evaluation continues from there.

impurePrimitiveMacros = wrapPrimitives True Impure
  [("call/cc", callCC)]

callCC env [l] = do
  lambda <- eval env l
  cont <- makeCont
  eval env $ List [lambda, cont]
  where makeCont = do
          contFnBody <- topFrame >>= walk replaceContForm
          return $ makeFn False Anonymous [Symbol "x"]
                   [List [shortCircuit', contFnBody]] env

shortCircuit' = wrapPrimitive False Impure sc
  where sc env [val] = do
          r <- eval env val
          shortCircuit r
          return r

The outermost expression that lexically contains the call/cc form is used as the continuation. This avoids infinite loops when used inside of named functions, where the continuation and named function would keep calling each other indefinitely.

topFrame = State.get
  <&> reverse
  <&> map extractCallframe
  <&> find containsCallCCForm
  <&> fromJust

extractCallframe (Callframe val) = val

containsCallCCForm val = case val of
  List [Symbol "call/cc", _] -> True
  List xs -> any containsCallCCForm xs
  _ -> False

Finally, the call/cc form itself is replaced by the parameter ‘x’.

replaceContForm val = return $ case val of
  List [Symbol "call/cc", _] -> Symbol "x"
  _ -> val

Evaluation and error handling

After evaluating code, the result is unwrapped from the LispM monad and printed. Because both Left and Right must contain a LispVal at this point, they are treated equally.

evalString :: Env -> String -> IO ()
evalString = runWithCatch action
  where action env string = do
          readtable <- getReadtable env
          let r = readOne readtable string >>= eval env
          liftIO $ run r >>= either printVal printVal

runWithCatch :: (Env -> String -> LispM ()) -> Env -> String -> IO ()
runWithCatch f env x = do
  let action = fromRight' <$> run (f env x)
  catch action (printError :: LispError -> IO ())

Further reading

Beautiful Racket has a great chapter on continuations.
The Wikipedia page also gives a decent overview of the topic.

Triplebyte (YC S15) looking for remote engineers

We're building a consistent, reliable, credential-blind hiring process for engineers. You can read more about us here: https://triplebyte.com/press. We're looking for engineers to do remote, part-time work with us as Triplebyte technical interviewers. This is a great opportunity for highly skilled software engineers to do well-paid, flexible work, on your own schedule.

Part of our evaluation process for our candidates is a technical interview, where engineers demonstrate a variety of skills and abilities. We're looking for people to help us administer those interviews. We'll pay $300 per two-hour technical interview.

This role is a best fit for solid engineers with deep knowledge and skill in many different areas. The work is part time, with a flexible schedule. Preparation for it will involve intensive, full-time training on site, with us, in San Francisco. You'll work with our custom interview processes and interviewing software. We've done thousands of hours of interviews - we're experts, and we'll expect you to be an expert before you can administer interviews on our behalf.

If you're interested, please sign up here: https://triplebyte.com/remote_interviewer



The Nimble Type Inferencer for Common Lisp-84 (1990)

The Nimble Type Inferencer for Common Lisp-84
HENRY G. BAKER
Nimble Computer Corporation, 16231 Meadow Ridge Way, Encino, CA 91436
(818) 986-1436 (818) 986-1360 (FAX)
This work was supported in part by the U.S. Department of Energy Contract No. DE-AC03-88ER80663
Copyright (c) 1989-90 by Nimble Computer Corporation

We describe a framework and an algorithm for doing type inference analysis on programs written in full Common Lisp-84 (Common Lisp without the CLOS object-oriented extensions). The objective of type inference is to determine tight lattice upper bounds on the range of runtime data types for Common Lisp program variables and temporaries. Depending upon the lattice used, type inference can also provide range analysis information for numeric variables. This lattice upper bound information can be used by an optimizing compiler to choose more restrictive, and hence more efficient, representations for these program variables. Our analysis also produces tighter control flow information, which can be used to eliminate redundant tests which result in dead code. The overall goal of type inference is to mechanically extract from Common Lisp programs the same degree of representation information that is usually provided by the programmer in traditional strongly-typed languages. In this way, we can provide some classes of Common Lisp programs with the execution-time efficiency expected only of more strongly-typed compiled languages.

The Nimble type inference system follows the traditional lattice/algebraic data flow techniques [Kaplan80], rather than the logical/theorem-proving unification techniques of ML [Milner78]. It can handle polymorphic variables and functions in a natural way, and provides for "case-based" analysis that is quite similar to that used intuitively by programmers. Additionally, this inference system can deduce the termination of some simple loops, thus providing surprisingly tight upper lattice bounds for many loop variables.

By using a higher resolution lattice, more precise typing of primitive functions, polymorphic types and case analysis, the Nimble type inference algorithm can often produce sharper bounds than unification-based type inference techniques. At the present time, however, our treatment of higher-order data structures and functions is not as elegant as that of the unification techniques.

Categories and Subject Descriptors:

General Terms: compilers, data types, lattices, Boolean algebras, dataflow, static analysis, polymorphism.

Additional Key Words and Phrases: Common Lisp, ML, type inference, interpreted language, compiled language.


High-level programming languages can be grouped into two camps--the compiled languages such as Fortran, Algol, Pascal, Ada and C--and the interpreted languages such as Lisp, APL, and Smalltalk. The compiled languages put great emphasis on high levels of compile-time type checking and execution speed, while the interpreted languages put great emphasis on run-time type flexibility and speed of program development. As might have been expected, each camp has worked hard to add the advantages of the other camp to its own inherent advantages. Interpreted languages have developed sophisticated compilers for higher execution speed and safety, while compiled languages have developed interpreters and incremental compilers for faster program development and more sophisticated typing systems for more flexible programs and data structures.

Type inferencing is a technique by which the interpreted languages move towards the goal of achieving the safety and execution speed of traditional compiled languages, while preserving the flexibility of runtime data typing. Type inferencing is the mechanical extraction from a program of the "type declarations" that traditional compiled languages require the programmer to provide. An optimizing compiler can then use this more precise type information to generate more efficient code.

This paper describes a type inferencer we have developed to test some of the algorithmic limits of the whole notion of type inferencing. We have chosen to test our approach using Common Lisp, as it is a standardized, dynamically-typed language which already contains the infrastructure--a type description language and optional type declarations--that is needed to support type inference. These type declarations are then used by the type inferencer to demonstrate its typing of a program. To date, we have focussed on extracting the best type information possible using algorithmic methods, while giving less emphasis to computational efficiency of these methods.

This paper is structured to provide both the background and approach of the type inference system being developed:

  • Section 2 describes the nature of "type inference"
  • Section 3 describes the goals of the Nimble type inference system
  • Section 4 shows the capabilities of our algorithm through a number of examples
  • Section 5 describes the Kaplan-Ullman type inference algorithm, from which ours evolved
  • Section 6 describes the Nimble type inference algorithm
  • Section 7 shows an example of the Nimble type inference algorithm in operation
  • Section 8 briefly analyzes the complexity of the Nimble algorithm
  • Section 9 discusses previous work on type inference, both for other languages and Lisp
  • Section 10 concludes.
To begin, we need to construct a theoretical framework and background for performing "type inference" on Common Lisp programs. We call the process of automatically determining the datatype of objects "type inference" rather than "type checking", since Common Lisp already has a well-developed dynamic type system in which types are computed and checked at run-time. However, to improve the efficiency of Lisp programs which have been compiled for execution on modern RISC architectures without type-checking hardware, more information is required at compile time. We need to identify during compilation those variables which will not utilize the full range of Lisp runtime datatypes, and give them specialized representations which can be dealt with more efficiently. This process is called representation analysis. Representation analysis has quite different goals from the problem of checking type safety at compile time, which we will call compile time type checking. However, these two problems are intimately related, and maximal run-time speed is achieved when the results of both kinds of analyses are available.

There has been some confusion about the difference between type checking for the purposes of compiling traditional languages, and type checking for the purposes of ensuring a program's correctness. While the redundancy which results from incorporating type declarations enhances the possibility of detecting semantic errors at compile time, this redundancy is not the best kind for that purpose. The goal of strong typing declarations in compiled languages is the efficient mapping of the program onto hardware datatypes, yet hardware datatypes may carry little semantic meaning for the programmer. For detecting semantic errors, the most powerful kinds of redundancy are provided by abstract data types and constructs such as assert. Abstract data types model the semantic intent of the programmer with respect to individual variable values, so that global properties of these individual values (e.g., evenness or primeness of an integer value) are maintained. The assert construct allows for the specification of complex relationships among several variables. However, since we are interested in improving run-time efficiency, we will assume that the program is already semantically correct, and will therefore concern ourselves only with the determination of tight lattice bounds on the values of variables.
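To illustrate the kind of redundancy assert provides (an illustrative sketch of ours, not an example from the paper), an assertion can state a relationship among several variables that no representation-level declaration could express:

(defun withdraw (balance amount)
  ;; the assertion relates two variables: 0 <= amount <= balance
  (assert (<= 0 amount balance) (amount)
          "Cannot withdraw ~S from a balance of ~S" amount balance)
  (- balance amount))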

Performing type inference requires proving many small theorems about programs, and therefore runs the risk of being confused with the more difficult task of theorem-proving for the purpose of proving programs correct relative to some external criteria. While some of the techniques may be similar to both tasks, the goals are completely different. For example, it is considered acceptable and routine for correctness provers to interact with a programmer when reasoning about the correctness of his code, but these interactions are not appropriate in a type inferencer. The whole point of type inference is to prove these small theorems and insert type declarations mechanically, since the raison d'etre of typeless languages is to eliminate unnecessary redundancies--e.g., declarations--from programs which clutter up the code and reduce productivity in program development and maintenance. Thus, if a type inferencer cannot routinely prove a certain class of theorems without human interaction, then its use as a productivity-enhancing tool will be severely limited.

We have chosen to perform static type inference on programs in the dynamically-typed Common Lisp programming language [CLtL84]. Common Lisp-84 has a reasonably complex type system. This system involves 42 simple type specifiers, 4 standard type operators, and 21 type specialization forms. Common Lisp has the usual primitive integer and floating point types, characters, the traditional atoms and conses of Lisp, vectors, arrays, and strings. In addition, Common Lisp has a host of non-traditional datatypes, such as hash tables, readtables, and other special purpose types. The datatype system can also be extended through user-defined structures, which are analogous to structures in C and to records in Pascal and Ada.

Common Lisp functions, unlike functions in traditional compiled languages, are inherently polymorphic. A function may accept arguments of any datatype, and perform operations on them without restriction. Of course, many built-in functions will generate runtime errors if not given arguments of the proper types. The Lisp function +, for example, will complain if it is not given numbers to add. However, the programmer is under no obligation to restrict his own functions in such a simple manner. For example, he is free to define his own + function in which numeric arguments are summed in the normal fashion, but non-numeric arguments are coerced into their "print length" before summing.
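Such a user-defined + might look like the following sketch (ours; the paper gives no code for it):

(defun my+ (&rest args)
  ;; numbers are summed as usual; anything else is first coerced
  ;; to the length of its printed representation
  (flet ((coerce-arg (x)
           (if (numberp x) x (length (princ-to-string x)))))
    (apply #'+ (mapcar #'coerce-arg args))))

(my+ 1 2)     ; => 3
(my+ 1 'foo)  ; => 4, since FOO prints with length 3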

Traditional compiled languages such as Algol, Pascal, C and Ada cannot offer this level of flexibility, because they must determine a unique datatype for every value at compile time. This is because their built-in operations cannot accept more than one kind of datatype representation. While this level of specificity regarding datatypes allows for great efficiency in the datatypes and operations that they do support, these languages often cripple the programmer by forcing him to program in styles which may not be the most natural for the problem. These styles of programming may sometimes be less efficient than a style based on dynamic data types.

There has been some significant progress in incorporating more complex type specification and type checking into compiled languages. The most developed (in the sense of actual implementation) of these languages is ML [Milner78, Harper86]. ML allows for parameterized datatype specifications by means of type variables, which can stand for "any actual type". These variables are automatically instantiated during the type checking process, allowing for a much more elegant solution to the problem of polymorphic functions such as length (length produces the length of any list, regardless of the type of its elements). In Pascal, a different length procedure would be required for each type of list element. In Ada, it is possible to state a generic length function, but this generic function is not a true function, but only a template from which a true function must be instantiated before it can be called [AdaLRM].

The ML compiler can handle the typing problem of length by introducing a type variable (denoted by '<type-variable-name>) which stands for all actual types, so that arguments to length can be specified more generically as list('A) instead of list(integer) or list(character). The ML compiler itself then determines the appropriate actual type to substitute for the type variable in every instance where length is used. However, while ML allows for a parameterized type specification, ML still requires that the datatype of every instance of the variable or procedure be resolved to a single actual type. This is because ML is still a traditional strongly-typed language, without the dynamic run-time datatypes necessary to handle full polymorphism. Thus, through parameterized type specification, ML extends the power of traditional "compiled" languages in the direction of type polymorphism without giving up the level of efficiency on standard hardware expected of these languages.
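For comparison, a Lisp length needs no compile-time instantiation at all (an illustrative definition of ours, not from the paper): one definition serves every element type, with any type dispatch deferred to run time.

(defun my-length (l)
  (if (null l)
      0
      (1+ (my-length (rest l)))))

(my-length '(1 2 3))      ; => 3
(my-length '(#\a #\b))    ; => 2
(my-length '("x" y 3.0))  ; => 3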

The Nimble type inferencer, on the other hand, approaches the same goal of high efficiency on "standard" hardware, but from the opposite direction: instead of restricting the programming language itself to consist of only those constructs which can be efficiently compiled, the Nimble type inferencer identifies those usages of variables and functions within Common Lisp that can be efficiently compiled, and utilizes the traditional Lisp runtime datatype system as a fall-back strategy. Thus, the ML and Nimble type inferencing algorithms can be considered analogous in the sense that they are both looking for constructs that can be efficiently compiled. They differ, however, on what is done when no such constructs are found: ML considers this case an error, while the Nimble type inferencer accepts the program, but its subsequent execution will involve a higher level of run-time type checking than otherwise.

The goals of the Nimble type inferencing algorithm are as follows:
  • produce the most highly constrained set of datatypes possible for each variable definition, to allow for the efficient representation of the variable
  • produce the most highly constrained set of datatypes possible for each variable use, to allow for the elimination of redundant datatype checking during execution
More succinctly, Nimble would like to:
  • utilize hardware datatypes whenever possible
  • eliminate as much redundant run-time type-checking as possible
These two goals are intimately related, and extremely important when compiling Common Lisp for a modern RISC architecture.

Utilizing hardware datatypes is important because some hardware datatypes are extremely efficient. For example, on the Intel i860 [Intel89], 32-bit 2's complement integers, 32-bit IEEE floating point numbers and 64-bit IEEE floating point numbers are all extremely efficient and extremely fast. However, in order to compile a Common Lisp program using hardware instructions for these datatypes, we must know that the variables and temporaries accessed by these instructions can take on only the appropriate types at run-time. In other words, we must be able to prove that any run-time type checks for those variables will always show the variables and values to be of the correct type; i.e., we must prove that the run-time type checks are redundant. Thus, the analysis necessary to eliminate redundant type checks is a prerequisite for the utilization of specialized representations.

What do we mean by redundant runtime type-checks? Consider the following Common Lisp recursive program for the factorial function:

(defun fact (n)
  (if (zerop n) 1
      (* n (fact (1- n)))))
During the execution of (fact 5), we first must execute (zerop 5), which first checks whether 5 is a number, and if so, whether it is zero. If 5 is not a number, then an error is signalled. Having determined that 5 is not zero, we then compute (1- 5), which in turn checks again whether 5 is a number, and what type of number it is, so that it can perform the appropriate decrement operation, whether for integers, rationals, floating-point numbers, or complex numbers. However, since we have already checked 5 for numberhood, we have shown that the type check during the first part of the function 1- is redundant. We then pass the result 5-1=4 back to fact for a recursive computation. However, we must once again check 4 for numberhood as part of the call to zerop in the recursive call, so there is a redundant type check within every recursive call at the entry to zerop.

A similar kind of redundant type checking occurs during the unwinding of the recursion. At the bottom of the recursion, fact returns 1, which is a number, as well as an integer. Yet * will check again whether this result is a number, and what kind of number it is, before performing its computation. By induction, we know that fact will always return a number, if it returns at all, since 1 is a number, and * always returns a number. Hence the check for numberhood on entry to * inside fact is always redundant.

In modern RISC architectures type checks are expensive, because they involve conditional branching which is extremely expensive on a pipelined architecture due to the "pipeline turbulence" it causes. Pipeline turbulence is the inefficiency that results from idle time slots within the pipeline as a result of the conditional branch. Error checks of the sort described above can be implemented on pipelined architectures as conditional branches whose branch is almost never taken. The type dispatch operation, however, used to find the appropriate type of instruction to execute (integer, rational, floating-point, complex) almost always branches, but to different places at different times, and therefore almost always causes pipeline turbulence.

While there exist architectures and compilers whereby this sort of pipeline turbulence can be minimized (e.g., the Trace technology used in the Multiflow machine [Ellis86]), it still exacts a significant performance penalty. For example, on many RISC architectures, 32-bit integer operations are 4-6 times faster than conditional branches, and hence, to a first-order approximation, the type checks take up all of the execution time!

In this section, we show some of the capabilities of the Nimble algorithm to automatically infer tight lattice upper bounds on the datatypes of variables.

The most trivial of type inference algorithms should be capable of inferring the types of constants, whether they be constant objects, or constant built-in functions. Thus, even in the absence of additional information, the Nimble type inferencer (henceforth called "NTI") can automatically infer the following types in our simple recursive factorial program (automatically inferred information is shown in boldface type):

(defun fact (n)(declare (type number n))
  (if (zerop n) (the integer 1)
      (the number
           (* n (the number
                     (fact (the number
                                (1- n))))))))
Given the additional information that fact is only called from elsewhere with integer arguments, NTI can automatically infer the following more restrictive types:
(defun fact (n)(declare (type integer n))
  (if (zerop n) (the integer 1)
      (the integer
           (* n (the integer
                     (fact (the integer
                                (1- n))))))))
If fact is only called from elsewhere with non-negative integer arguments, NTI can automatically infer the even more restrictive types:
(defun fact (n)(declare (type nonnegative-integer n))
  (if (zerop n) (the positive-integer 1)
      (the positive-integer
           (* n (the positive-integer
                     (fact (the non-negative-integer
                                (1- n))))))))
This last example deserves some comment. The ability of NTI to predict that n will never become negative, given that n started as a positive integer, is a reasonably deep inference. That this inference then allows NTI to conclude that the result of fact will always be positive shows a significant level of subtlety.

If fact is only called from elsewhere with "small" non-negative integers (i.e., non-negative fixnums, in Common Lisp parlance), NTI can automatically infer the following types:

(defun fact (n)(declare (type nonnegative-fixnum n))
  (if (zerop n) (the positive-fixnum 1)
      (the positive-integer
           (* n (the positive-integer
                     (fact (the non-negative-fixnum
                                (1- n))))))))
This ability of NTI to automatically prove "fixnum-hood" for all recursive calls to fact is quite significant, because it allows a compiler to represent n as a 32-bit 2's complement integer, and utilize a hardware "decrement" operation to implement (1- n).

4.1 INCORPORATING PROGRAMMER-SUPPLIED INFORMATION

In the factorial example above, NTI could determine that the argument to fact would always be a nonnegative fixnum, and could show that the result would be a positive integer, but no bound could be placed on the size of this integer. We know, of course, that the factorial function grows very rapidly, and that arguments greater than 12 will produce results greater than 2^32. However, the NTI algorithm will usually be incapable of inferring such deep mathematical results, and in these cases it will be necessary for the programmer to supply additional information to aid NTI in its analysis if he wants to obtain the most efficient code. The programmer can supply this information in several different ways.

The NTI algorithm relies on all data provided by the programmer, both from declarations and from actual use, so that the following programs all produce the same code (except for the wording of any error message):

(defun test1 (x y)
  (declare (integer x) (type (integer 0 *) y))	; Info as declaration
  (expt x y))

(defun test2 (x y)
  (assert (integerp x))				; Info as assertions
  (assert (typep y '(integer 0 *)))
  (expt x y))

(defun test3 (x y)
  (if (and (integerp x)				; Info as conditional
           (integerp y)
           (not (minusp y)))
      (expt x y)
      (error "Bad types for ~S, ~S in test3" x y)))

(defun test4 (x y)
  (etypecase x					; Info as typecase
    (integer
      (etypecase y
        ((integer 0 *) (expt x y))))))

(defun test5 (x y)
  (assert (or (zerop y) (plusp y)))		; Info as assertion
  (the integer (expt x y)))			; and declaration
(The last example is not necessarily true in all Common Lisps, but true in most, as it assumes that only rational arguments can produce rational results from the two-argument exponential function.)

4.2 GENERATING DECLARATIONS WITH THE NIMBLE TYPE INFERENCER

Common Lisp already has a syntactically well-specified mechanism for the programmer to declare his intentions regarding the range of values (and hopefully the representations) of variables and temporaries--the declaration. Declarations allow the programmer to specify to the compiler that more specialized and efficient representations can be used to speed up his program. In fact, the output of the Nimble type inferencer after its analysis of an input program is a copy of the input program with additional declarations inserted.

Unfortunately, the meaning of declarations is not well-specified in Common Lisp-84. While the Common Lisp standard [CLtL84, p. 153] states that "declarations are completely optional and correct declarations do not affect the meaning of a correct program" (emphasis supplied), the meaning of "correct declaration" is not made clear, and the standard allows that "an implementation is not required to detect such errors" (where "such error" presumably means the violation of a declaration, i.e., "error" = "incorrect declaration").

The Nimble type inferencer takes a slightly different point of view. NTI considers any programmer-supplied declarations to be a material part of the program. These declarations are treated as constraining the possible values which a variable or a temporary can assume, in much the same way that the "type" of an array in Common Lisp constrains the possible values which can be stored into the array. NTI enforces this "constraint" semantics of declarations by inserting dynamic type checks of its own when it cannot prove that the constraint will always be satisfied.
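For example (our sketch of the idea, not actual NTI output), a declared constraint that cannot be proven might be enforced by an inserted run-time check:

(defun f (x)
  (declare (fixnum x))
  ;; hypothetical inserted check: the declared fixnum constraint on x
  ;; could not be proven, so it is tested dynamically
  (check-type x fixnum)
  (1+ x))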

Since our declarations do constrain the values of variables and temporaries, they can change the meaning of a program, in the sense that a program with declarations can exhibit a smaller range of correct behavior than the same program with the declarations elided. Therefore, declarations can become a source of bugs, as well as a way to find bugs (assuming that a compiler or run-time system complains about declarations which actually constrain a program's meaning). The fact that declarations may actually introduce bugs into previously correct programs means that the traditional advice to "only add declarations to well-debugged programs" becomes circular and nonsensical!
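A small example of the effect (ours, under this "constraint" reading of declarations):

(defun half (x)
  (declare (fixnum x))  ; constrains x under NTI's semantics
  (/ x 2))

(half 7)    ; => 7/2
(half 3.5)  ; a type error under "constraint" semantics, although the
            ; undeclared version would happily return 1.75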

However, if we advocate that programmers always decorate their programs with declarations, we risk losing something far more important than the correctness of previously debugged programs. We risk losing the advantages of polymorphism entirely. Lisp has traditionally been considered a polymorphic language, where functions can accept arguments of different types at different times, and produce different types as results. For example, consider a straight-forward implementation of the arcsinh function:

(defun asinh (z)
  (log (+ z (sqrt (1+ (* z z))))))
This function should work properly regardless of the argument type. asinh is polymorphic, and must return a floating-point result of the same format when given a floating-point argument, and must return a complex floating-point result of the same format when given a complex floating-point argument. This polymorphism can be implemented in one of two ways: true polymorphism, where the function can actually handle all of the possible types, and overloading, where any instance of the function works on only one type, and the particular instance is chosen at compile time. Overloading is more analogous to macro-expansion, except that the choice of which macro-expansion to use is dependent upon the type of the argument (and possibly upon the type of the expected result). Overloading is usually more efficient than true polymorphism, because no type dispatching need be performed at run-time.

By leaving the declarations out of a function definition, and instead inferring them when the argument and result types are already known, we can have reasonably generic function definitions. These generic functions can be stored in a library and compiled with the properly inferred types on demand. The alternative is precompiling a whole host of versions of the same function with differently declared types, and choosing the proper one at compile time through overloading techniques. The overloading technique is strongly advocated in the Ada language, but is only moderately successful due to the restrictions of that language, and due to the poor interaction of overloading with other Ada language features, such as separate compilation [AdaLRM]. Due to these problems, the default alternative supplied by most Common Lisp implementations is to endure the systematic inefficiency of truly polymorphic library routines.

The Nimble approach, on the other hand, allows for the above definition of asinh to act as more like an Ada generic function [AdaLRM], where each instance--whether in-line or out-of-line--will be compiled using only the functionality required by that instance. Thus, if a programmer only calls asinh with single-precision floating-point arguments in a particular program, then the asinh function compiled for that program will be specialized to work with only single-precision floating-point arguments and results. This ability to automatically specialize the standard Common Lisp libraries to the actual usage of a program is unique to the Nimble approach.

If one actually wants a truly polymorphic function which is also efficient--e.g., for a high-performance compiled library function--then one should utilize the following programming technique when using the Nimble type inferencer:

(defun asinh (z)
 (typecase z
  (single-float			(log (+ z (sqrt (1+ (* z z))))))
  (short-float			(log (+ z (sqrt (1+ (* z z))))))
  (double-float			(log (+ z (sqrt (1+ (* z z))))))
  (long-float			(log (+ z (sqrt (1+ (* z z))))))
  ((complex single-float)	(log (+ z (sqrt (1+ (* z z))))))
  ((complex short-float)	(log (+ z (sqrt (1+ (* z z))))))
  ((complex double-float)	(log (+ z (sqrt (1+ (* z z))))))
  ((complex long-float)		(log (+ z (sqrt (1+ (* z z))))))
  (fixnum			(log (+ z (sqrt (1+ (* z z))))))
  (integer			(log (+ z (sqrt (1+ (* z z))))))
  (rational			(log (+ z (sqrt (1+ (* z z))))))
  ((complex rational)		(log (+ z (sqrt (1+ (* z z))))))
  (t				(log (+ z (sqrt (1+ (* z z))))))))
This version of asinh is more efficient than the previous one when compiled for all numeric types, because while the expressions within each arm are identical, they are compiled differently. Within each arm, z is inferred as having a different type, and hence different versions of log, +, sqrt, etc., are compiled within that arm. In the first arm, for example, +, 1+, and * can be open-coded with single hardware instructions, and for some processors, log and sqrt have single hardware instructions, as well. (If the programmer also specifies in his calling program that this version of asinh should be expanded inline, then the Nimble algorithm will in most cases be able to eliminate the type check on the entry to the body of asinh, as well as the dead code from all the unused arms.)

The Nimble type inferencing algorithm allows the programmer to use standard case-based programming in addition to declarations to tell the compiler what to do. Consider, for example, the abs function:

(defun abs (z)
  (cond ((complexp z)
         (sqrt (+ (sqr (realpart z)) (sqr (imagpart z)))))
        ((minusp z) (- z))
        (t z)))
NTI is able to infer that within the first cond clause z is complex, hence numeric, hence realpart and imagpart are defined and cannot fail. Furthermore, those functions produce real results, so sqr produces non-negative real results, hence sqrt always produces a real result. Within the second cond clause, NTI can infer that z is a negative real, because minusp cannot be applied to other than real numbers, and it is true only for negative reals. NTI therefore concludes that the second clause produces a positive real result. It may be surprising, but NTI also infers that z is a non-negative real in the third cond clause even though it apparently could be any Lisp type other than complex and negative numbers. However, since minusp is not defined for non-numeric types, control will never reach the final clause unless z is a number, and since minusp cannot be applied to complex numbers and will return true for negative numbers, z must therefore be a non-negative real in the third clause. Putting all the clauses together, NTI concludes that the result from abs must always be a non-negative real number.

It is also instructive to see whether NTI can infer that the square function sqr always produces a non-negative result for non-complex arguments.

(defun sqr (z)
  (* z z))
Given this simple definition for sqr, NTI can only conclude that z is numeric, and cannot say anything about the sign of the result when z is real. However, consider the following definition for the sqr function, which is actually more efficient for complex arguments:
(defun sqr (z)
  (cond ((complexp z)
         (let ((r (realpart z)) (i (imagpart z)))
           (complex (- (sqr r) (sqr i)) (* 2 r i))))
        ((minusp z) (* z z))
        (t (* z z))))
While the second and third clauses in sqr do not help in improving the efficiency of sqr, they do allow NTI to infer that sqr applied to real arguments produces a non-negative real result. This is because NTI can easily infer that negative*negative is positive and non-negative*non-negative is non-negative, and can put these facts together to infer that the result of sqr must always be non-negative for real arguments. (Note that while NTI required the programmer to split out the negative real and non-negative real cases in order to extract the most information, an optimizing "back-end" could later merge the two clauses back together, so that no efficiency is necessarily lost.)

The Nimble type inference algorithm is not only "case-based", but also "state-based". This means that NTI can properly type the following program:

(defun test ()
  (let ((x 0))
    (let ((y x))
      (setq x 0.0)
      (values x y))))
For this program, NTI will provide declarations along the lines of the following code:
(defun test ()
  (let ((x 0)) (declare (type (or fixnum single-float) x))
    (let ((y x)) (declare (fixnum y))
      (setq x (the single-float 0.0))
      (values (the single-float x) (the fixnum y)))))
The Nimble type inferencer can also handle functions with side-effects, without open-coding them:
(defun test-swap ()
  (let ((x 0) (y 0.0))
    (flet
      ((swap (&aux z) (setq z x x y y z)))
      (print (list x y))
      (swap)
      (values x y))))
NTI would annotate this example in a manner similar to that shown below:
(defun test-swap ()
  (let ((x 0) (y 0.0)) (declare (type (or fixnum single-float) x y))
    (flet
      ((swap (&aux z) (declare (fixnum z))
         (setq z x x y y z)))
      (print (list (the fixnum x) (the single-float y)))
      (swap)
      (values (the single-float x) (the fixnum y)))))
As this example shows, the Nimble type inferencer is able to keep track of the different states of the variables x and y at different places in the program.

Since the Nimble type inference algorithm is derived from the Kaplan-Ullman type inference algorithm [Kaplan80], we will discuss the Kaplan-Ullman approach in detail. It should first be pointed out that the Kaplan-Ullman approach to type inference uses the same type of lattice techniques as does traditional data flow analysis in compilers [Aho86].

The Kaplan-Ullman algorithm utilizes a simple state-based model of computation. Within this model, a program consists of a finite number of variables, and a finite directed graph with nodes, each of which describes a simple computational step. The simple steps consist of assignment statements of the form z:=f(a,b,c,...), where a,b,c,...,z are program variables, and f(,,,...) is a predefined function. Constants are introduced as functions of zero arguments, and the flow of control is non-deterministic. The Kaplan-Ullman type inference algorithm seeks to find an assignment of variable names to datatypes in the datatype lattice which is consistent with the needs of the program, and is also minimal, in that any smaller assignment of datatypes is no longer consistent.

In the classical non-deterministic manner, the flow of control is terminated only by branches of the computation that fail in the sense that there are no legitimate values for variables. In this setting, predicates are modelled by partial functions whose results are ignored. One particularly valuable partial function of this kind is "assert(p)", which is defined only for the argument "true".

Even though the control flow of the Kaplan-Ullman model is non-deterministic, it can accurately model deterministic programs through the following mapping. For example, the "if-then-else" statement:

(if (p x y) (setq z (f x y))
            (setq z (g x y)))
can be modelled as:
(alt (progn (assert      (p x y))	(setq z (f x y)))
     (progn (assert (not (p x y)))	(setq z (g x y))))
In this case, the construct (alt A B) means "perform either A or B", with the choice made non-deterministically, and following either arm describes a legitimate computation. In the specific usage above, only one (or none) of the arms will actually succeed, since it cannot happen that both p and not p are true at the same time. Therefore, while the alt construct is powerful enough to introduce true non-determinism, we will only need it under the controlled conditions given above.

The Kaplan-Ullman model introduces a state for each node in the graph which describes the best approximation to the actual datatypes of all of the program variables at that point in the program. A state can be thought of as a simple vector of components--one for each program variable--whose values are elements of the datatype lattice needed by the algorithm. We will actually use two states for each program point, since we will want to describe the state just before as well as the state just after the execution of a particular node.

The Kaplan-Ullman model then proceeds to model the program statements z:=f(a,b,c,...) in terms of their effect on these program states. Since we are utilizing the datatype lattice, rather than actual run-time values, we will be inferring the datatype z from knowledge of the datatypes a,b,c,... The Kaplan-Ullman model utilizes a "t-function", which answers such questions in the most conservative way. Thus, if we have an approximation to the datatypes a,b,c,..., the t-function of f(,,,...) will allow us to produce an approximation to the datatype of z. Perhaps as importantly, the t-function of f(,,,...) will also allow us to approximate a,b,c,... given the datatype of z.

There is nothing paradoxical about these "t-functions". For example, if we know that the result of (+ x y) is an integer, then it must be the case that x,y were both rationals (given the standard policy of floating-point contagion in Common Lisp). While this information doesn't allow us to pin down the types of x,y more specifically to be integers (e.g., 3/4 + 1/4 = 1), we can usually get additional information regarding one or both of x and y, and thus pin down the types of all three quantities x, y, and x+y more exactly.

As an aside, it should be noted that this class of inferences is analogous to the Bayesian analysis of Markov processes [???], where the states of the program are identified with Markov states, and the transition functions are associated with transition probabilities. However, the Kaplan-Ullman model produces information about the range of possibilities, rather than probabilities. It may be possible to extend type inference analysis to perform probability analysis, given some method of producing a priori probabilities about frequency of execution. Such a probability analysis would enable a compiler to generate better code for the most probable cases.

Kaplan and Ullman call the propagation of information in the normal direction of computation (i.e., from arguments to function values) forward inferencing, and the propagation of information in the retrograde direction (i.e., from function values to argument values) backward inferencing. Thus, for every statement, we can compute a forward inference function which carries the "before-state" into the "after-state", as well as a backward inference function which carries the "after-state" into the "before-state". If we assign an index to every statement in the program, we can then construct a vector of states for the entire program, which then becomes a representation of the flowstate of the program.

This flowstate has little in common with the state of a program running on actual data. In the case of an executing program, the "state" of the program would be a program counter, holding the index of the currently executing program statement, and an environment vector, holding the current values of all of the program variables. Our data type "flowstate", on the other hand, consists of a vector of vectors whose elements are in the datatype lattice. The outer vector is indexed by program statements; the inner vectors are indexed by program variables. The reason for requiring a different state for each program statement is that the approximation to the datatype of a program variable at one particular program statement could be different from the datatype assumed at another program statement. This is especially true when variable assignment is involved, since the datatype of a program variable can then be different from one program statement to another.
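To make this concrete, a flowstate might be sketched in Common Lisp as follows; the names and the bit-mask encoding of lattice elements are illustrative assumptions, not the actual Nimble data structures:

(defconstant +bottom+ 0)        ; the empty set of datatypes

;; A state is a vector of lattice elements (bit-masks over atomic
;; datatypes), one per program variable.
(defun make-state (nvars)
  (make-array nvars :initial-element +bottom+))

;; A flowstate is a vector of states, one per program statement.
(defun make-flowstate (nstatements nvars)
  (let ((fs (make-array nstatements)))
    (dotimes (i nstatements fs)
      (setf (aref fs i) (make-state nvars)))))

;; The approximated datatype of variable V at statement S.
(defun var-type (flowstate s v)
  (aref (aref flowstate s) v))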

Once we have represented the flowstate of an entire program in the above manner, we can now describe the forward and backward inferencing operations in terms of matrix multiplication operations. To do this, we construct a diagonal square matrix F of order n, where n is the number of program statements. The element Fii is a function from states to states which describes the transition of the program statement i in going from a before-state to an after-state. The off-diagonal elements are constant functions yielding the bottom state, which assign to all program variables the bottom datatype.

We also construct a square matrix C of order n in which the element Cij is a function from states to states which describes the connectivity of the program statements. Cij is the identity function if program statement j immediately follows program statement i, and it is the constant function producing the bottom state otherwise. We will use a form of matrix multiplication in which the usual role of "+" is taken by the lattice join operation (here extended componentwise to vectors of elements), and the usual role of "*" is taken by functional composition.

Given a program flowstate S, which is a vector of states indexed by program statements, we can compute the effect of a single execution of each program statement by means of the matrix-vector product F*S. This product takes the before-flowstate S into an after-flowstate F*S. Similarly, the matrix-vector product C*S takes the after-flowstate S into a before-flowstate C*S. Thus, we can utilize the composition F*C*S to take an after-flowstate to another after-flowstate.
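As an illustrative sketch (with hypothetical names, and with the connectivity given as predecessor lists rather than an explicit matrix), the products F*S and C*S might be coded as:

;; Join of two states: elementwise LOGIOR of their bit-mask components.
(defun state-join (s1 s2)
  (map 'vector #'logior s1 s2))

;; F*S: since F is diagonal, only Fii contributes; FS is a vector of
;; per-statement transition functions from states to states.
(defun f-times-s (fs s)
  (map 'vector #'funcall fs s))

;; C*S: the before-state of each statement is the join of the
;; after-states of its predecessors.  PREDS is a vector of lists of
;; predecessor indices, and NVARS is the number of program variables.
(defun c-times-s (preds s nvars)
  (let ((result (make-array (length s))))
    (dotimes (i (length s) result)
      (let ((acc (make-array nvars :initial-element 0)))   ; bottom state
        (dolist (j (aref preds i))
          (setq acc (state-join acc (aref s j))))
        (setf (aref result i) acc)))))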

Kaplan and Ullman show that all of these operations are monotonic in the datatype lattice, and since this lattice must obey the finite chain condition, the following sequence has a limit for every S:

S, F*C*S, F*C*F*C*S, F*C*F*C*F*C*S, ...

In particular, the limit exists when S is the (vector of) bottom element(s) of the datatype lattice. This limit is one consistent datatype labeling of the program, although it may not be a minimal such labeling (in the datatype lattice ordering).

The previous limit can actually be taken with a finite computation. This computation will converge in a finite amount of time due to the finite chain condition of the original datatype lattice, and the finite numbers of program variables and program statements. The operation of the program can intuitively be described as assuming each program variable takes on the value of the bottom lattice element, and then propagating this assumption throughout the entire program. Whenever a statement of the form z:=C (wherein z is assigned a constant) is processed, the resulting after-state is forced to include the possibility of the datatype of the constant, and therefore this state can no longer assign z the bottom datatype. If a later assignment of the form x:=f(z) occurs, then the forward inferencing function will force x to include the datatype f(C) (as determined by the "t-function" for f), and so x will no longer be assigned the bottom datatype, either, if C is in the domain of f. In this fashion, the datatypes of all program variables at all program points are approximated from below until an assignment is reached that is consistent for all of the program statements. Within this computation, the assignment for a particular variable at a particular program point is always adjusted monotonically upwards from the bottom element of the lattice. (This datatype propagation process is analogous to the solution of a partial differential equation by relaxation, where the initial boundary conditions are propagated into the interior of the region until a consistent solution is obtained.)

It is obvious that this solution is consistent, but upon closer examination, we find that this solution is not necessarily minimal. The non-minimal solution arises because forward inferencing utilizes information in the same order that it occurs during normal program flow, and a more stringent constraint on the datatype of a variable may occur later in the program than the computation producing a value for the variable. Consider, for example, the sequence

x:=sqrt(y); if x>0 then A else B

In this sequence, x is constrained to be a "number" by virtue of being the result of a square root operation, but x is not necessarily a "non-complex number" until it is compared with 0 in the following step. Since complex numbers cannot be compared using the ">" operator, x must have been real, which means that y must have been non-negative. Thus, in order to compute a minimal datatype assignment for this sequence, a type inferencer must utilize "backward inferencing" in addition to "forward inferencing".

The Kaplan-Ullman algorithm therefore extends the matrix inferencing process to include backwards inferencing. To implement this, we define a backwards inferencing matrix B, which is diagonal and takes after-states into before-states. We now require the dual of the connection matrix C, where the dual matrix takes before-states into after-states; this dual connection matrix is simply the transpose C^t of the forward connection matrix C. We then construct the infinite sequence

S, C^t*B*S, C^t*B* C^t*B*S, C^t*B* C^t*B* C^t*B*S, ...

This sequence is monotonic, and must also converge to a limit by the finite chain condition. This limit is a consistent approximation to the legal datatypes of the program variables, but is usually different from the limit of the forward inferencing chain. We could perform both of these sequence computations separately, followed by a lattice meet operation on the results to get a better approximation than either the forward or the backward inferencing computations could produce separately. The result of this complex computation, however, would still not be minimal.

To obtain a better result, Kaplan and Ullman utilize a technique for incorporating an upper bound for each state into the approximation process such that no intermediate state is allowed to violate the upper bound. This improved process can best be described as a small program:

S := bottom; repeat {oldS := S; S := (F*C*S) meet U} until S=oldS;

Not allowing the upper bound U to be violated at any point results in a tighter lower bound, but it is still not the tightest possible.

The final Kaplan-Ullman algorithm is a nested pair of loops. The inner loop is similar to the one above, in which the upper bound remains constant, while the lower bound runs from bottom up to convergence. The result of this inner process is a consistent legal datatype assignment, and no actual computation can stray outside its boundaries without producing an error. At the completion of a pass through this inner loop, we can therefore take this lower bound as the new upper bound. The outer loop thus runs the inner loop using the lattice top element as an upper bound, then takes the resulting lower bound as the new upper bound, and begins the next pass of the inner loop. This process continues until the upper bound no longer changes, or equivalently, the greatest lower bound equals the least upper bound. In order to use all of the available information, Kaplan and Ullman suggest alternating forwards inferencing and backwards inferencing with each iteration of the outer loop.

/* Complete Kaplan-Ullman type inference algorithm. */
/* U is the upper bound, L is the lower bound. */
U := top;
repeat {
	oldU := U;
	L := bottom; repeat {oldL := L; L := (F*C*L) meet U} until L=oldL;
	U := L;
	L := bottom; repeat {oldL := L; L := (C^t*B*L) meet U} until L=oldL;
	U := L; } until U=oldU;
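The same nested iteration can be rendered as a Common Lisp sketch; FORWARD-STEP (computing F*C*L), BACKWARD-STEP (computing C^t*B*L), FLOW-MEET, and FLOW-EQUAL are assumed helpers operating componentwise on flowstates, not part of the published algorithm:

;; Inner loop: raise the lower bound from BOTTOM to a fixpoint, never
;; exceeding the upper bound U.
(defun fixpoint-pass (step u bottom)
  (loop for l = bottom then next
        for next = (flow-meet (funcall step l) u)
        until (flow-equal next l)
        finally (return next)))

;; Outer loop: each converged lower bound becomes the next upper bound,
;; alternating a forward pass and a backward pass, until U stabilizes.
(defun kaplan-ullman (top bottom)
  (loop for u = top then next-u
        for next-u = (fixpoint-pass #'backward-step
                                    (fixpoint-pass #'forward-step u bottom)
                                    bottom)
        until (flow-equal next-u u)
        finally (return next-u)))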
Kaplan and Ullman prove that within a certain mathematical framework, their algorithm is optimal. They propose that their algorithm produces the minimal datatype assignment within the space of all datatype assignments which can be computed via any technique which utilizes a finite procedure in a lattice having the finite chain condition.

However, the Kaplan-Ullman inferencing algorithm does not produce the minimal assignment outside the stated framework. Consider the statement z:=x*x in a datatype lattice where negative and non-negative numbers are different datatypes. Let us further assume that there are alternative paths before this statement that assign to x both negative and non-negative numbers. Since the t-function for this statement must allow for all possibilities, it claims that {+,-}*{+,-}=>{+,-}; in other words, real multiplication can produce both positive and negative results, given arguments which are both positive and negative. Nevertheless, we know that x*x is always non-negative, no matter what the arguments are, so long as they are real. In this instance, the t-function for multiplication is not producing the sharpest possible information. This is because it does not notice that the two arguments to "*" are not only dependent, but identical.

A smarter t-function which detected duplicated argument variables would give better information in this instance, but this smarter t-function would be easily defeated by a simple transformation into the computationally equivalent sequence [y:=x; z:=x*y]. The problem arises from the approximation Kaplan and Ullman make in the representation of the states of the program variables. The approximation provides that only rectangular regions in the Cartesian product of datatypes can be represented as the states of the program variables; in other words, only a variable's own datatype can be represented, not the relationship between its datatype and that of another variable. As a result, the information produced by statements like z:=assert(x>y) is ignored, because there is no way to accurately represent the non-rectangular region "x>y" in our space of program variable states.

In the Nimble type inference algorithm, we make this same approximation in our representation of states as did Kaplan and Ullman, and for the same reason: it would take an extraordinarily complex representation and a very demanding calculation in order to keep track of all such relationships. As an indication of the complexity of this task, the "first-order theory of real addition with order" can result in a multiply-exponential time complexity for its decision procedure [Ferrante75].

Curiously, the statement [if x<0 then z:=x*x else z:=x*x] allows the Kaplan-Ullman algorithm to conclude that z is always positive! This result obtains because the programmer has forced the algorithm to perform case analysis which produces the intended result.

Several rules must be observed in order for a Kaplan-Ullman algorithm to work at all. Since the high-water-marks (greatest lower bounds) achieved during any lower-bound propagation will become the upper bounds for the next iteration, it is essential that we have found the true high-water-mark. This means that the lower-bound propagation must continue until there is no change, no matter how small. It is also important to see the entire program, since even a small piece of the program could introduce some new behavior, which would violate some high-water-mark achieved on some earlier cycle in the algorithm.

Since we are performing limit steps, the datatype lattice must obey the finite chain condition (i.e., it must be a discrete lattice [MacLane67]), else the algorithm may not converge. If the lattice is not discrete, then the limit steps may run forever. For example, the lattice described in our implementation of Common Lisp's subtypep predicate [Baker92] does not obey the finite chain condition because its representation allows individual integers to be represented, and the lattice of finite ranges of integers does not obey the finite chain condition.

The Nimble type inference (NTI) algorithm produces essentially the same information as would a "straight-forward" implementation of the Kaplan-Ullman algorithm, but it does so using a completely different representation and a completely different organization of its tasks. As a result, it is also much more efficient than a straight-forward Kaplan-Ullman implementation.

The NTI algorithm represents program states in a way which is much more tuned to the requirements of an expression-oriented language than a statement-oriented language. The NTI algorithm also carries explicit lower and upper bounds for each state. These lower and upper bound computations are performed simultaneously. While this technique does not improve the worst-case speed of the Kaplan-Ullman algorithm, it dramatically improves the performance in the common cases.

Consider the example: z:=f(x,y). The "before-state" for this statement has both a lower bound and an upper bound, between which any legal datatype assignment must lie. During forward inferencing, we can utilize the lower bounds of x and y to compute a (new) lower bound for z, and we can utilize the upper bounds of x and y to compute a (new) upper bound for z. Of course, these new bounds are still subject to previously determined lower and upper bounds for z. For this case, we have the following forward propagation rule for z:=f(x,y):

upper(z) = t-function(f,0,upper(x),upper(y)) meet upper(z)
lower(z) = t-function(f,0,lower(x),lower(y)) meet upper(z)

If we have a program point where two states split, as in the beginning of an alt (alternative) statement, then we propagate the lower and upper bounds differently from the above case. In this case, we have the following forward propagation rule for when A splits into B and C:

upper(B) = upper(A) meet upper(B)
upper(C) = upper(A) meet upper(C)
lower(B) = lower(A) meet upper(B)
lower(C) = lower(A) meet upper(C)

If we have a program point where two states join ("merge"), as in the end of an alt expression, then we propagate the lower and upper bounds using the forward propagation rule for B and C merging into A:

upper(A) = (upper(B) join upper(C)) meet upper(A)
lower(A) = (lower(B) join lower(C)) meet upper(A)

Backward propagation rules for the three cases above follow a similar pattern.

Backward propagation rule for z:=f(x,y):

upper(x) = t-function(f,1,upper(x),upper(y),upper(z)) meet upper(x)
upper(y) = t-function(f,2,upper(x),upper(y),upper(z)) meet upper(y)
upper(z) = upper(z_before)  (i.e., z's upper bound in the before-state)

Backward propagation rule for split of A into B and C:

upper(A) = (upper(B) join upper(C)) meet upper(A)
lower(A) = (lower(B) join lower(C)) meet upper(A)

Backward propagation rule for join of B,C into A:

upper(B) = upper(A) meet upper(B)
upper(C) = upper(A) meet upper(C)
lower(B) = lower(A) meet upper(B)
lower(C) = lower(A) meet upper(C)
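Once lattice elements are bit-masks, these rules reduce to a few logical operations per program point. A sketch of the forward rules for assignment and merge follows; T-FUNCTION-RESULT is an assumed lookup into the t-function representation described later, and the use of the freshly computed upper bound when meeting the lower bound is our illustrative choice:

;; Forward rule for z := f(x,y): meet = LOGAND, join = LOGIOR.
(defun propagate-assignment (f x-lo x-hi y-lo y-hi z-hi)
  (let* ((new-hi (logand (t-function-result f x-hi y-hi) z-hi))
         (new-lo (logand (t-function-result f x-lo y-lo) new-hi)))
    (values new-lo new-hi)))

;; Forward rule for B and C merging into A.
(defun propagate-merge (b-lo b-hi c-lo c-hi a-hi)
  (values (logand (logior b-lo c-lo) a-hi)
          (logand (logior b-hi c-hi) a-hi)))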

The proof of termination for our algorithm is somewhat less obvious than for the Kaplan-Ullman algorithm, since we modify both lower and upper bounds simultaneously. However, upper bounds still always decrease monotonically, as we always "meet" any new value with the old one. But the lower bounds do not necessarily monotonically increase, since a new upper bound can decrease a previous lower bound.

Nevertheless, our algorithm does converge. The proof goes as follows. The upper bound computations are completely independent of the lower bound computations; they monotonically decrease, and due to the finite chain condition on the datatype lattice, they still converge to a limit in a finite number of steps. Once the upper bounds stabilize, the lower bounds can, from that point on, only monotonically increase. The situation with stable upper bounds is identical to the situation in the original Kaplan-Ullman proof. Since the finite chain condition also applies to the lower bound computation, that case also converges, in a finite number of steps, to an assignment which is less than or equal to the stabilized upper bound.

Carrying both lower and upper bounds simultaneously does not relieve us of the necessity of performing backwards inferencing. Our previous example of

x:=sqrt(y); if x>0 then A else B

does not allow us to carry back the information about the realness of x into the previous statement, whence we can conclude that x is actually non-negative, and hence y is non-negative. However, the simultaneous computation of upper and lower bounds does improve the speed of convergence.

The question arises as to why an iterative limit step is required at all; i.e., why can't we produce the best solution with only a single pass through the program? The answer is that certain programs combined with certain datatype lattices require iteration. Consider the following simple program:

(labels
  ((foo (x)
     (if (zerop x) 0
         (1+ (foo (1- x))))))
  (foo 10000))
foo is an expensive identity function for non-negative integers which goes into an infinite recursion for any other numbers. Let us assume a datatype lattice having different datatypes for the five ranges: zero, positive integers less than 100, positive integers 100 or greater, negative integers greater than -100, and negative integers -100 or less. Using this datatype lattice, the first forward inferencing iteration of our algorithm will assign x the type {i|i>=100}, and the union of the types {i|0<i<100} and {i|i>=100} (i.e., {i|i>0}) for the result of (1- x). The second iteration will assign x the type {i|i>0} (from the union of {i|i>=100} and {i|i>0}), and assign to (1- x) the union of {0} and {i|i>0} (i.e., {i|i>=0}). The third iteration will assign x the type {i|i>=0}, but the conditional expression will separate the cases x=0 and x>0, so that (1- x) is again assigned the type {i|i>=0}.

Simultaneously with the above iteration to compute the datatype of x, the algorithm is also approximating the result of (foo x). This result starts out {}, but then becomes {0} as a result of the first arm of the conditional. The second iteration assigns the result the union of {0} and {i|0<i<100} (i.e., {i|0<=i<100}). The third and subsequent iterations assign the result {i|i>=0}.

Thus, the type inference algorithm applied to this program infers the constraints x>=0 and (foo x)>=0. Note that the algorithm cannot conclude that (foo x)=x or even that (foo 10000)>100, but it is still remarkable that the algorithm does conclude that x does not go negative, and that (foo x) also does not go negative!

From the above example, it should be obvious that iterations are necessary in the type inference algorithm in order to properly handle the limit operation on looping and recursive constructs. These constructs could cause NTI to work its way up from the bottom of the datatype lattice to the top with only one step per iteration. While this looping is quite expensive computationally, it is very important, as it allows sharp bounds information to be inferred regarding loop and recursion index variables.

We thus are led to the simple data type looping/recursion type inference principle:

Loops and recursion in the input program force iteration in the type inference algorithm.

On the other hand, with a properly designed integer datatype lattice, this type inference algorithm can conclude that certain simple loops 1) terminate; and 2) stay within the bounds of an implementation-defined short integer type (e.g., -128<=x<128, or -2^30<=x<2^30).

We will also show that no fixed number (i.e., independent of the size of the program text) of limit steps (the inner loops in the Kaplan-Ullman type inference program given above) is sufficient to guarantee that we have produced the minimal datatype assignment within the Kaplan-Ullman inferencing framework.

Consider the following program:

a:=read(); b:=read(); c:=read(); d:=read(); e:=read();
w:=read(); x:=read(); y:=read(); z:=read();
assert a*w>0; v:=0; assert b*x>0; v:=1; assert c*y>0; v:=2; assert d*z>0; v:=3;
assert b*w>0; v:=4; assert c*x>0; v:=5; assert d*y>0; v:=6; assert e*z>0; v:=7;
assert e>0;

For this example, we will assume that the datatype lattice for numbers simply distinguishes between positive, negative and zero numbers. The inclusion of the dummy assignments to the variable v in the program forces the creation of multiple states during inferencing, which drastically slows down the propagation of information.

During the first (forward) inferencing pass, all variables are set to take on any values, since they are assigned the results of "read" operations, which can produce any type of value. However, we quickly learn that all of these values must be numbers, because they are used in multiplication operations, which can accept only numeric datatypes. Only at the very end of the forward inferencing pass do we learn that one of the variables--e--is a positive real number.

During the second (backward) inferencing pass, the knowledge that e is real and positive allows us to conclude that z is also real and positive, hence d is also real and positive. During the third (forward) inferencing pass, the knowledge that d is real and positive allows us to conclude that y is real and positive. During the fourth (backward) inferencing pass, the knowledge that y is real and positive allows us to conclude that c is real and positive. During the fifth (forward) inferencing pass, we conclude that x is a positive real. During the sixth (backward) inferencing pass, we conclude that b is a positive real. During the seventh (forward) inferencing pass, we conclude that w is a positive real. During the eighth (backward) inferencing pass, we conclude that a is a positive real. Thus, we have been forced into eight separate limit passes in order to finally converge on the minimal datatype assignment, which concludes that all of the variables must have been positive real numbers.

While this example was specially contrived to defeat type inferencing algorithms based on the Kaplan-Ullman algorithm, it can be extended in the obvious way to force the inferencing algorithm to perform any number of steps. Thus, through our counter-examples, we have shown that the Kaplan-Ullman inferencing algorithm must consist of a doubly nested loop, where the inner loop raises the lower bounds and the outer loop lowers the upper bounds by alternate forward and backward inferencing passes, with no a priori limits on the number of either the inner or the outer loop iterations.

An open question is whether we can change the representation of the states in the Nimble type inferencing algorithm to more quickly converge even on contrived examples like the one above.

6.1 THE DISCRETE DATATYPE LATTICE OF COMMON LISP "SINGLE-VALUES"

The Nimble type inferencing algorithm utilizes a Boolean algebra for its datatype lattice, both because it is more precise, and because it can be efficiently implemented by means of bit-vectors. These bit-vectors encode the possibilities of the different datatypes of a value as a union of "atomic" datatypes, where "atomic" here means an atom in a Boolean algebra, and not a Common Lisp atom. Thus, a bit-vector could encode the union {integer,single-float,character}, meaning that the corresponding variable could take on as a value at run-time either an integer, a single-precision floating point number, or a single character (a highly improbable situation!). Unlike the situation in strongly-typed languages such as C or Ada, there is no standard lattice required by Common Lisp. We could utilize a lattice which distinguished between Lisp symbols which were keywords and Lisp symbols which were not keywords, or we could lump the two together as simply Lisp symbols. Similarly, we are free to distinguish between negative and non-negative integers--essentially treating them as having different datatypes. A lattice allowing finer distinctions, and the resultant greater resolution for distinguishing datatypes, requires longer bit-vectors to implement. The only requirement is that the lattice satisfy the finite chain condition. This is easily achieved if the total number of bits in all the bit-vectors is bounded by an a priori bound. One implication of this requirement is that we cannot distinguish every single integer, but must group them into a finite set of equivalence classes.

We can normally utilize relatively short bit-vectors in the range of 32-64 bits. This is true since the majority of important Common Lisp distinctions can be made with less than 32 bits [Baker92], and increasing the resolution further can dramatically increase the space and time required to manipulate the Kaplan-Ullman t-functions (this is discussed in a later section). We have found that the datatype resolution required for proper analysis greatly exceeds the number of datatypes needed for efficient representation. For example, even though small positive and negative integers are both represented by the same hardware datatype, it is useful to propagate finer distinctions during analysis, since the final lattice upper bounds achieved can be significantly smaller. In the case where it is desired to achieve high performance on numeric programs, we have found the following numeric distinctions to be useful:

  • Integer classes:
    • {i | i < -2^31}
    • {-2^31} ; we need this class to handle the asymmetry of 2's complement arithmetic
    • {i | -2^31 < i < -1}
    • {-1}
    • {0}
    • {1}
    • {i | 1 < i < 2^31}
    • {2^31} ; ditto.
    • {i | 2^31 < i}
  • Floating point number classes (type,range):
    • type:
      • short-float
      • single-float
      • double-float
      • long-float
    • range:
      • {x | -infinity < x < -1.0}
      • {-1.0}
      • {x | -1.0 < x < -0.0}
      • {0.0,-0.0} ; use of IEEE standard can force {0.0},{-0.0} distinction.
      • {x | 0.0 < x < 1.0}
      • {1.0}
      • {x | 1.0 < x < infinity}
(Note that these distinctions look remarkably similar to those used by artificial intelligence researchers in "qualitative reasoning" [IJCAI-89]; perhaps they, too, should investigate Kaplan-Ullman inferencing.)
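Since each of the classes above is an atom of the Boolean algebra, a lattice element is simply a set of such atoms, and the lattice operations become single logical instructions on bit-masks. A minimal sketch, with the class indices chosen arbitrarily for illustration:

;; Hypothetical atomic-class indices from the list above.
(defconstant +zero+      (ash 1 4))   ; {0}
(defconstant +one+       (ash 1 5))   ; {1}
(defconstant +small-pos+ (ash 1 6))   ; {i | 1 < i < 2^31}

(defun type-meet (a b) (logand a b))
(defun type-join (a b) (logior a b))
(defun lattice<= (a b) (zerop (logandc2 a b)))   ; a is a subtype of b

;; Example: the set "small non-negative integers".
(defparameter *small-non-neg*
  (type-join +zero+ (type-join +one+ +small-pos+)))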

We do not currently attempt to track the contents of the higher-order datatypes of Common Lisp, as is routinely done in ML [Milner78]. In other words, we utilize the single class cons to stand for all cons cells, regardless of their contents. While tracking the contents of Lisp lists would be extremely valuable, the resolution required is far too great to be handled by our current methods (however, see [Baker90a], in which we extend ML-style unificational type inference to also perform storage use inferencing on these higher-order datatypes). Similarly, we do not currently attempt to track the types of higher order functions of Common Lisp, but lump them all together as function. Once again, this would be extremely valuable, as the experience of ML has shown, but again the resolution required is computationally prohibitive.

Note that this lumping of functional objects together as one class does not prevent us from keeping track of the arguments and results of user functions, but only of functional values ("funargs" or "closures"). This is possible because, unlike Scheme [Scheme], Common Lisp keeps functions mostly separate from data objects, and hence is amenable to a more classical compiler treatment of typed functions.

6.2 THE LATTICE OF COMMON LISP "MULTIPLE-VALUES"

Common Lisp, unlike other dialects of Lisp, has a curious notion of multiple values. These multiple values are not lists or vectors, and hence are not first-class objects. They can, however, be returned from functions or accepted as function arguments under certain carefully controlled conditions. These multiple values cause problems for program analysis; while they are rarely used, they could be used, and thus they must be everywhere allowed for.

Unlike our rather casual treatment of the other higher order datatypes in Common Lisp, we must model multiple values carefully. If we do not, we would not be able to infer anything about the results of any function call, and Lisp programs consist mainly of function calls. Multiple values can be more easily modeled than the other higher order data types due to two properties: they are functional objects, in the sense that their size and components cannot be altered via side-effects once they have been constructed; and they are not first class, in the sense that no references to them can be created or compared via eq.

Part of the problem of representing multiple values stems from the fact that one can construct functions which can return a different number of multiple values at different times--a multiple value "polymorphism". Furthermore, some primitive Lisp functions--such as eval or apply--can return any number of multiple values. Further complication arises from the fact that multiple values are coerced, in many circumstances, into a single value; the first one, if it exists, or nil, if it doesn't. Yet one can also write Lisp code which can tell how many values were returned by a function call, and what these values were.

The Nimble type inferencer represents multiple values using a record structure of 3 components. The first component is an integer interpreted as a bit-vector which indicates the set of possible numbers of values which are being represented. Thus, a multiple-value which represents a normal single value uses the integer 2 = 2^1, meaning that the only possible number of values is 1. "Zero values" is represented by the integer 1 = 2^0, and the multiple-value returned by the Common Lisp floor function is represented by the integer 4 = 2^2, meaning that exactly 2 values are returned. If a function sometimes returns one value and sometimes returns two values, then the number of values is represented by 6 = 2^1+2^2. The number 0 then represents "no possible multiple-values"--not zero values--i.e., the function never returns! Finally, the resulting types for functions like eval and apply are represented by -1 (= the sum to infinity of 2^i in 2's complement notation!). With this encoding for the number of components in a multiple value, it is easy to perform lattice meet and join--they are simply logand and logior of the representation numbers. (The finite chain condition ("fcc") holds so long as the number of multiple-values is a priori bounded; in pathological cases, it is necessary to limit the resolution of the number of values to "0,1,2,...,31,>=32", for example, in order to guarantee fcc.)

The second and third components of the multiple-value record structure are the "finite part" and the "infinite part" of the multiple-value representation. The finite part is a simple vector whose values are elements of the single-value datatype lattice--typically bit-vectors. The infinite part is a single element of the single-value datatype lattice. The interpretation of these two elements is similar to the digits and sign of a 2's complement number; any component (bit) whose index is less than the length of the finite part can be found in the finite part, and any component (bit) whose index is greater than the length of the finite part has the value of the infinite part. The reason for this structure is the fact that all multiple-values in Common Lisp have only finite lengths, but since our lattice is also a Boolean algebra, we must be capable of producing a complement, as well. But complements of finite sequences are sequences whose infinite parts are all the same--hence still representable in our scheme.

We can now describe meet and join operations on our multiple-value datatype lattice. The number-of-values integers are combined using logand or logior, respectively. Then the finite part and infinite part of the result are computed. Before performing either a meet or a join, we must first extend the finite part of the shorter operand to the length of the longer operand by extending it with copies of its infinite part. We then perform an element-wise meet or join utilizing the single-value meet or join operation from the single-value lattice. Finally, we canonicalize by collapsing the finite part of the result as far as possible; any elements at the end of the finite part vector which are equal to the infinite part are simply ignored and the vector is shortened. Thus, the implementation of these meet and join operations is similar to the implementation of addition and subtraction of multiple-precision 2's-complement integers.
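A sketch of this representation and its join operation follows; the record layout and names are our illustrative assumptions, and the single-value elements are assumed to be bit-masks so that their join is LOGIOR (meet is the same code with LOGAND and LOGAND throughout):

;; COUNT: integer whose bits encode the possible numbers of values.
;; FINITE: list of single-value lattice elements (the finite part).
;; INF: one single-value element covering all remaining components.
(defstruct (mv (:constructor make-mv (count finite inf)))
  count finite inf)

(defun mv-ref (mv i)                    ; type of the i-th component
  (if (< i (length (mv-finite mv)))
      (nth i (mv-finite mv))
      (mv-inf mv)))

(defun mv-join (a b)
  (let* ((n (max (length (mv-finite a)) (length (mv-finite b))))
         (finite (loop for i below n
                       collect (logior (mv-ref a i) (mv-ref b i))))
         (inf (logior (mv-inf a) (mv-inf b))))
    ;; Canonicalize: drop trailing components equal to the infinite part.
    (loop while (and finite (eql (car (last finite)) inf))
          do (setq finite (butlast finite)))
    (make-mv (logior (mv-count a) (mv-count b)) finite inf)))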

Some examples of this encoding are given below. The reason for the nil in the infinite part of most of the multiple-values below is that Common Lisp specifies that access to unsupplied values from a multiple-value returns nil.

	(values)			<1,<>,{nil}>
	(values 0)			<2,<{0}>,{nil}>
	(floor 3 2)			<4,<{1},{0}>,{nil}>
	(if ...				<6,<{0,1},{1,nil}>,{nil}>
	    (values 0)
	   (values 1 1))
	(eval x)			<-1,<>,t>
(Note that our representation of multiple-values is ideal as a lattice representation for any datatype consisting of finite-length sequences--e.g., character strings. The union of the strings "hi", represented by <4,<{h},{i}>,{}>, and "there", represented by <32,<{t},{h},{e},{r},{e}>,{}>, is represented by <36,<{h,t},{i,h},{e},{r},{e}>,{}>.)

6.3. THE LATTICE OF COMMON LISP "STATES"

Once the preliminary transformations described at the end of this section have been performed on a Common Lisp program, only lexically-scoped variables and global variables remain. Due to their sheer number, we have currently decided not to independently track the contents of true global variables, which in Common Lisp are the so-called "value cells" of symbols, and hence components of the "symbol" record structure. These have presently been lumped together into the single class which tracks the union of all symbol value cells. While this approximation loses valuable information about the use of global variables, the approximation must be used in all existing Lisp implementations, because these variables can be changed by any evaluated function, or by the user himself, and therefore must be capable of holding any Lisp datatype.

More interesting is our treatment of lexically-scoped variables and temporaries. The Nimble type inference algorithm collapses a Lisp program into a Fortran-like program by ignoring the recursive aspect. In other words, in our approximation, the various stack frames of a recursive function are modeled by a single stack frame which is a kind of union of all of the stack frames from the recursion. Thus, each program state within a function can be represented by its "alist environment"--i.e., a vector which associates a "single-value" lattice element with each lexically visible variable. Note that lexical variables not directly visible from within the function are not present in any of the program states within the function, since any changes to any visible variable can only be an effect of the function itself. The representation of temporaries is even easier, since a temporary can have only one "producer" and (at most) one "consumer", and these are tied (after collapsing the stack) directly to the program text. Therefore, these temporaries need not participate in the program states directly, but only as a result of being saved away (via let), or assigned to a program variable (via setq).

Thus, a program state is an environment which consists of only the named lexical variables visible at that point in the program; temporaries and global variables do not appear. Furthermore, this environment is a simple linked Lisp list of "single-value" lattice elements which is indexed by the integer which is the lexical depth of the corresponding variable in the current lexical environment. This list, and all of its elements, are functional, in that we perform no side-effects on them. This means that we are allowed to freely share the tails of these lists, which we do as much as possible. Since each program state is almost the same as its predecessors and its successors, and since most of the changes occur at the top of this list (the inner-most lexical scope), only minor modifications need to be made in order to produce the next state from a previous state.

This massive sharing of state tails saves space, but more importantly, it saves time. This is because during the processing of statements to produce the next state, only a small amount of processing need be performed. When using tail-sharing, this processing is typically O(1) instead of O(n), where n is the number of lexical variables visible at this point. Similarly, when performing the lattice meet and join operations on states, we usually only process the top few items, because the rest of the tails are identical. Thus, by using a functional representation of states and massive tail-sharing, we can represent full state information during our inferencing for only a little more than the storage and time used for a single state which is global to the whole program (like ML [Milner78]).
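A sketch of these functional, tail-shared environments is given below; the EQ test in the join is what makes the common case O(1), since identical shared tails need not be examined:

(defun env-ref (env depth) (nth depth env))

;; "Assign" the variable at DEPTH a new lattice element, copying only
;; the cells above it and sharing the untouched tail.
(defun env-update (env depth new-type)
  (if (zerop depth)
      (cons new-type (cdr env))
      (cons (car env) (env-update (cdr env) (1- depth) new-type))))

;; Join two states, returning a shared tail unchanged as soon as the
;; two lists are EQ.
(defun env-join (e1 e2)
  (cond ((eq e1 e2) e1)
        ((null e1) nil)
        (t (cons (logior (car e1) (car e2))
                 (env-join (cdr e1) (cdr e2))))))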

6.4 THE EFFICIENT IMPLEMENTATION OF KAPLAN-ULLMAN "T-FUNCTIONS"

The success of the Kaplan-Ullman type inference algorithm depends critically on getting sharp type information from the primitive functions of the language. In other words, given a function and a set of bounds on its arguments, we must produce the sharpest possible bounds for the result of the function. In addition to forward and backward inferencing information, we would also like to get "side-ways" type inference information; this information is extracted from the interaction of constraints on the various arguments of multiple-argument functions. The symmetry between arguments and results leads to a symmetrical representation for the t-function information. This representation is in the form of a mathematical relation, which is simply a subset of a Cartesian product of domains. In particular, if f:AxB->C, then f can be represented by a relation, i.e., a subset of the Cartesian product AxBxC. The datatype domain induces an equivalence relation on A, B, and C, in such a way that the t-function for f becomes the function f':A'xB'->C', where A', B', and C' are the quotient sets of A, B, and C induced by the equivalence relation. While the full relation for f may be infinite in size, if A', B', and C' are all finite, then the Cartesian product A'xB'xC' is also finite, and hence f' can be represented by a finite relation on A'xB'xC'.

If A', B', and C' are all small, then f' can be represented by a list of legal triples, or alternatively a 3-dimensional bit matrix in which bit ijk is on if and only if k is an element of f'(i,j). Neither of these representations is particularly small or computationally efficient if the number of arguments to (or returned values from) a function is very large. Nevertheless, if the tightest possible information is required, then a smaller representation may not exist unless the function can somehow be factored into a composition of simpler functions.
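For concreteness, a two-argument t-function and its forward lookup might be sketched as follows; the names and the choice of 32 atomic classes are illustrative assumptions:

(defparameter *n* 32)   ; number of atomic classes in the lattice

(defun make-tfn ()
  (make-array (list *n* *n* *n*) :element-type 'bit :initial-element 0))

;; Forward inference: join the result classes of every admissible
;; (i,j,k) triple whose argument classes lie within the given masks.
(defun tfn-forward (tfn x-mask y-mask)
  (let ((result 0))
    (dotimes (i *n* result)
      (when (logbitp i x-mask)
        (dotimes (j *n*)
          (when (logbitp j y-mask)
            (dotimes (k *n*)
              (when (= 1 (aref tfn i j k))
                (setq result (logior result (ash 1 k)))))))))))

Backward inference over the same array is the symmetric scan that fixes the result mask and joins over the admissible argument classes.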

Standard binary operations like "+" will require a bit matrix of size n^3 to be represented, where n is the number of "representatives" (see [Baker92]) in the datatype lattice representation. We expect n to range from 32-64, hence we will require between 1024 and 8192 32-bit words to simply represent the relation. Binary operations like Common Lisp's floor operation, which takes two arguments and produces two results, will require n^4 bits, requiring between 32,768 and 524,288 32-bit words for its representation. However, should this amount of space be considered excessive, then floor can instead be represented by two different functions--one for each different result--for an amount of space between 2048 and 16384 32-bit words. While this is a substantial savings in space, there is some loss of resolution due to the lack of dependence between the two different results of the floor function in the "two-function" representation.

The success of the Nimble type inferencer is due, in a large part, to its ability to completely encode the type complexity of Common Lisp primitive functions without actually interpreting the underlying implementation code. The type complexity of some Common Lisp functions is quite high. The exponential function (expt base power), for example, is required to return a rational result if base is rational and power is an integer, and may return a floating-point approximation (possibly complex) otherwise. On the other hand, if the arguments to a library function are rational and the true mathematical result is rational, then an implementation is free to return either a floating-point number or the actual rational number result [CLtL, p.203]. For example, some Common Lisp implementations of sqrt go to the trouble to detect integer perfect squares, and in these cases return integer (rather than floating-point) results! Given the number of cases to consider, especially when various subranges of numbers are considered, the exact representation for the result type of expt or sqrt becomes a nightmarish expression in traditional Common Lisp type specifier syntax.

The use in the Nimble type inferencer of bit-arrays to represent the type behavior of Common Lisp primitive functions is inelegant compared with the clever "type variable" method to handle polymorphism in ML. However, the bit-array method is capable of handling the type complexity of Common Lisp while the type variable method of ML would not extend to this higher level of complexity and polymorphism. Without type variables, ML can represent only "rectangular approximations" to the true t-functions, while type variables add the ability to represent "diagonal approximations". If a t-function cannot be decomposed into rectangular and diagonal regions, then the methods of ML will not achieve the same resolution as our bit-array method. Common Lisp's sqrt function cannot be easily decomposed into diagonal or rectangular regions, as can be seen by the chart below.

t-function for sqrt

  arg \ result | integer | ratio | c-rat | float | c-float
  -------------+---------+-------+-------+-------+--------
  integer      |    X    |       |   X   |   X   |    X
  ratio        |         |   X   |   X   |   X   |    X
  c-rational   |         |       |   X   |   X   |    X
  float        |         |       |       |   X   |    X
  c-float      |         |       |       |       |    X
Given the type complexity of the Common Lisp builtin functions, it becomes difficult to construct the correct bit-arrays for the type inferencer. For example: x-y is equivalent to x+(-y) in normal algebra and even in normal computer arithmetic. However, the t-function for binary "-" cannot be derived from the t-functions for "+" and unary "-" unless the underlying lattice is symmetric about 0. However, given a datatype lattice which is symmetric about 0, we must then derive the t-function of binary "/" from that of "*" and unary "/", which requires a datatype lattice which is symmetric about 1. Since it is impossible to produce a lattice which is symmetric about both 0 and 1 at the same time (unless it is the indiscrete lattice of all rational numbers!), we cannot, in general, derive accurate t-functions of functions from the t-functions of their functional factors. Nor can we construct these t-functions by hand; the number of Common Lisp builtin functions is very large (on the order of 500), and each of these bit-arrays contains a large number of bits. From these observations, it can be seen that we must somehow automate the process of constructing these large bit-arrays.

We can automate the production of the bit arrays for our t-functions by using the machinery of Common Lisp itself. If we are given a sufficient number of "representative elements", then we can simply execute the function on all of the representative elements, and note into which class the result falls. With properly chosen representative elements, we can elicit the complete behavior of the function over the finite set of atomic datatypes in our Boolean lattice. Of course, the choice of representative elements cannot be performed automatically, but must be made carefully and intelligently. The lack of a proper representative may result in the lack of a bit in the t-function array, and hence an improper inference may be made. Furthermore, the appropriate set of representatives may be different for each function. Nevertheless, the number of representatives required is still far less than the total number of bits in the bit-array. Additionally, the ability to get the representatives correct is much less difficult than getting every bit right in the t-function bit-arrays by hand.
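Continuing the sketch above, the bit-array for a two-argument function might be filled in as follows; CLASSIFY (mapping a Lisp value to its atomic-class index) and *REPRESENTATIVES* (a vector of lists of sample values, one list per class) stand for the hand-chosen inputs the text describes:

(defun build-tfn (fn)
  (let ((tfn (make-tfn)))
    (dotimes (i *n* tfn)
      (dolist (x (aref *representatives* i))
        (dotimes (j *n*)
          (dolist (y (aref *representatives* j))
            ;; If FN is undefined on this pair, simply leave the bit 0.
            (handler-case
                (setf (aref tfn i j (classify (funcall fn x y))) 1)
              (error () nil))))))))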

Certain functions like +, min, gcd, etc., satisfy certain commutativity properties. In these cases, we can eliminate half of the work for constructing t-functions by considering only non-commutative pairs of arguments and then or'ing the resulting bit-array with one of its transposes. However, even with such short-cuts, the production of these bit-arrays is a time-consuming process. The problem is somewhat eased, since the production of these bit-arrays need be done only once for a particular combination of Lisp implementation and datatype lattice, and so we can precompute these arrays.

An interesting problem occurs when we attempt to precompute large bit-arrays in Common Lisp. There is no efficient mechanism to read or write large bit-vector arrays in a compressed form (i.e., as individual bits) in Common Lisp! If the array is output using the print routine, the array is printed out as integers--i.e., the digits 0 or 1 followed by a single space--or 16 bits for every one bit in the array! Writing the underlying single-dimensioned bit-vector results in the vector being printed in #*10110...01 format--or 8 bits for every one bit in the array. Since the bit-arrays for the normal binary operations such as + may require 32K bytes internally, the prospect of reading 1/4 megabyte for each of these functions seems excessive.

There is no builtin mechanism in Common Lisp to convert bit-vectors into integers. Even if such a mechanism existed, we could at best print out the bit-vector in hexadecimal format--or 4 bits for every one bit in the array. For the Nimble type inferencer, we overcame this problem by tricking Common Lisp into thinking that a bit-vector was really a character vector. In this way, we achieved a one-to-one correspondence between external and internal bits. (Lest advocates of other languages be smug about this point, consider the same problem in Ada. The standard mechanism to read and write boolean arrays in Ada involves the use of the character strings "TRUE" and "FALSE", which require an average of 5.5 characters (= 44 bits) for each internal bit read or written!)
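
The essence of the problem (and of the fix) can be shown in a few lines of Python, which happens to have the integer/bytes conversions that Common Lisp lacked; this is an analogy, not the actual Nimble code. The point is that each internal bit should cost one external bit, plus a small fixed overhead:

def write_bits(bits, f):
    "Write a bit-vector as a 4-byte length followed by packed bytes."
    n = len(bits)
    as_int = int("".join(map(str, bits)) or "0", 2)
    f.write(n.to_bytes(4, "big"))
    f.write(as_int.to_bytes((n + 7) // 8, "big"))

def read_bits(f):
    "Read back a bit-vector written by write_bits()."
    n = int.from_bytes(f.read(4), "big")
    as_int = int.from_bytes(f.read((n + 7) // 8), "big")
    return [(as_int >> (n - 1 - i)) & 1 for i in range(n)]

import io
buf = io.BytesIO()
write_bits([1, 0, 1, 1, 0, 1, 1, 0, 1], buf)
buf.seek(0)
assert read_bits(buf) == [1, 0, 1, 1, 0, 1, 1, 0, 1]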

6.5 INCORPORATING USER TYPE DECLARATIONS

User type declarations for program variables are trivially incorporated into the Kaplan-Ullman type inference algorithm by using them to initialize the upper bounds instead of initializing these bounds to "top". Since the Kaplan-Ullman algorithm uniformly meets any new bounds with the upper bounds, the minimal datatype it determines will always be consistent with the declared datatype. The Nimble type inferencing algorithm follows the same procedure.
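
In sketch form (the representation is hypothetical, with types modeled as sets of atomic classes):

TOP = frozenset({"integer", "float", "symbol", "cons"})

def initial_upper_bounds(variables, declarations):
    "Seed each upper bound from its declaration, or TOP if undeclared."
    return {v: declarations.get(v, TOP) for v in variables}

bounds = initial_upper_bounds(["x", "y"], {"x": frozenset({"integer"})})
# bounds["x"] == {"integer"}; bounds["y"] == TOP

Since every subsequent inference meets with these bounds, the inferred types can only shrink, and therefore remain consistent with the declarations.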

6.6 INSERTING TYPE CHECKS

The Nimble type inferencer puts (the <type> <exp>) expressions around every expression, including constants and variables. If the actual value of <exp> at run-time is an element of <type>, then this expression acts as an identity function, while if the actual value is not of the appropriate type, then an error message will be generated and the program is (usually) aborted. If, after type inference, the type of <exp> can be proved to be a subtype of <type>, then the run-time type check is superfluous. If the compiler used to compile the program uses a complete decision procedure for subtypep [Baker92], then it will eliminate all of the type checks that the Nimble type inferencer was able to prove superfluous.
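
The elimination step can be pictured with the same toy set-based representation used above (illustrative only; the real decision procedure is the subtypep of [Baker92]):

def emit_check(asserted, inferred, exp):
    "Keep a (the <type> <exp>) check only when it might actually fail."
    if inferred <= asserted:        # inferred type provably a subtype: superfluous
        return exp
    return ("the", asserted, exp)   # otherwise keep the run-time check

emit_check(frozenset({"integer", "float"}), frozenset({"integer"}), "x")
# => 'x', i.e. the check has been eliminated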

6.7 EXTENDING KAPLAN & ULLMAN TO WORK ON USER-DEFINED FUNCTIONS

The Kaplan-Ullman type inference algorithm dealt with only primitive functions, and did not treat user-defined functions. In order to extend Kaplan-Ullman to handle user-defined functions, we must make some approximations. The most important approximation we make is to identify all instances of the same lexical variable or the same lexical function. Thus, the argument "X" in a recursive function will always be given the same type, regardless of where or when it was called--whether from outside the function or as a recursive call from inside the function. This approximation is reasonable, since the compiler will be generating only one instance of the lexical variable or the lexical function, and thus the instance generated must deal with all situations which might occur during run-time.

This approximation collapses the control structure for the program being analyzed into a Fortran-like structure in which the formal parameters and local variables of a function behave more like Fortran variables (or Algol "own" variables) than like local stack-allocated or heap-allocated variables. Some approximation of this type is necessary in order to ensure convergence of the type inferencing algorithm. This particular approximation is the most "natural" one to use, as well as one of the easiest to explain to the programmer. If collapsing a higher order control structure into an iterative control structure loses too much resolution to make the desired distinctions, then the programmer can always "inline" the function or "unroll" the loop one or more times, to achieve higher resolution. Any other policy would be too arbitrary to understand or control.

This collapsing policy also has the advantage of treating Common Lisp lexically scoped free variables (including global variables) in a uniform and reasonably intuitive manner.

6.8 CONTROL FLOW INFERENCING

In addition to inferring the types of variables, the Nimble type inferencer also infers whether a portion of the program is executed or not--i.e., it performs control flow inferencing. This is normally used for dead code elimination. By utilizing more sophisticated information to determine deadness, we can in more cases infer that a conditional will execute only one (or neither) of its arms. This is important, since dead code can ruin the sharpness of our type deductions. Consider, for example, the following code:
(let ((x 3))
  (if (integerp x) x (setq x 3.0)))
Note that the ability to tell that the assignment will not be executed is essential to determining that x will hold only integer values. A sloppier algorithm would conclude that x could be either an integer or a floating-point number, thus forgoing a significant optimization. Performing simultaneous control flow and type inferencing is important when processing the bodies of functions expanded in-line, because such in-lined functions are significant sources of dead code.

We can handle dead code elimination within our algorithm by incorporating control information in addition to our datatype information in the "flowstate" lattice during the forward inferencing loops. The control information basically indicates whether a particular node will ever be executed. This information is inferred by induction, using the fact that the first node in the program is executed as an initial condition. Nodes that have not yet been marked as executing do not participate in inferencing. If a node is still unmarked when a forward inferencing loop converges, then it must be the case that the node is dead code, and it (along with the "conditional" branch that goes to it) can be immediately eliminated.
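
A minimal sketch of the marking itself follows (the graph representation is invented for the example; in the real inferencer, a branch is followed only while its guard's type remains satisfiable):

def mark_executed(entry, successors):
    "Mark every node reachable from the entry node."
    executed, worklist = set(), [entry]
    while worklist:
        node = worklist.pop()
        if node not in executed:
            executed.add(node)
            worklist.extend(successors.get(node, ()))
    return executed

successors = {"entry": ["test"], "test": ["then"]}   # "else" arm never enabled
dead = {"entry", "test", "then", "else"} - mark_executed("entry", successors)
# dead == {"else"}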

6.9 TWO-PHASE INFERENCING

The implemented Nimble type inference algorithm actually involves two separate type inference processes. This two-phase implementation arises from the limitations of the Kaplan-Ullman algorithm discussed earlier. The two phases are quite similar, except for the following differences:
  • Phase I uses a higher resolution lattice which does not satisfy the finite chain condition
  • Phase I does forward inferencing only
  • Phase I does not loop
There are several reasons for performing type inferencing with two different lattices. In Kaplan-Ullman type inferencing, most of the progress in producing sharp lattice upper bounds comes in the early iterations of the outer loop, while most of the effort in the later stages produces only minor improvements. Thus, the higher resolution lattice (involving a correspondingly larger amount of computational effort) is used in the beginning to achieve a better starting point for a more classical Kaplan-Ullman inferencer. We also use the first forward inference pass to produce very sharp control flow information; this approach is essential to the removal of dead code which would make the subsequent inferencing of sharp bounds impossible. Thus, we utilize the higher resolution indiscrete lattice where it will be the most productive--during the first forward inferencing pass--yet we use it only once in order to avoid its non-convergence problem.

The indiscrete lattice used in the first phase of the Nimble type inferencer is the same as that described earlier in [Baker92]. Basically, the lack of discreteness in this lattice is a result of the interval representation of rational and floating-point ranges. One end of an interval can grow without bound, or can have an irrational limit; thus, this lattice is not discrete. The use of an indiscrete lattice in a Kaplan-Ullman inferencer would prevent it from converging.

The fact that the first phase does not loop means that minor modifications must be made to the inferencer to ensure that the "looping/recursion principle" stated above is not violated. In the classical Kaplan-Ullman algorithm, the carrying of simultaneous upper/lower bounds enhanced the speed of convergence of the algorithm but not its final result. The carrying of simultaneous upper/lower bounds, however, is essential to obtaining a reasonable answer from the first phase algorithm.

Forward inferencing using the indiscrete lattice of [Baker92] subsumes a number of traditional static analyses. Within the limitations of this lattice--that higher order functions and data types are not modeled, except by cons and function--this inferencing can subsume constant propagation for scalars and "interval arithmetic" on primitive arithmetic functions. Therefore, the bounds for scalar numeric values can be surprisingly tight.

6.10 SUMMARY OF THE NIMBLE TYPE INFERENCE ALGORITHM

The Nimble type inference algorithm conceptually operates in several passes, although they are not implemented that way. The first pass converts Common Lisp into a somewhat simpler language by eliminating several complex notions from the language less ruthlessly than [Kranz86]. Macros are expanded; "special" (dynamically-scoped) variables are translated into semantically equivalent implementations; the dynamic operations catch, throw and unwind-protect (analogous to "exceptions" and "signalling" in Ada) are translated into semantically equivalent operations [Haynes87]; functions declared "inline" are expanded; some constant propagation/dead code elimination is performed; and many syntactic variations are regularized. The second pass performs forward and control flow inferencing using the indiscrete lattice of [Baker92]. During the second pass, we eliminate a substantial amount of dead code which was accumulated through macro and inline function expansion, as well as from the translation of the undesired special forms of Common Lisp. The third pass (which consumes most of the time required for inferencing) performs forward, backward and control flow inferencing using the discrete lattice. The results are then passed directly to a code generator, or can be output as source code decorated with complete declarations. Due to the macro and function expansions and to the issues discussed in [Baker92], this output is not particularly readable. We show below the analysis of the TAK function [Gabriel85] by the Nimble type inferencer:
(defun do-tak ()
  (labels
    ((tak (x y z)
       (if (not (< y x))
           z
           (tak (tak (1- x) y z)
                (tak (1- y) z x)
                (tak (1- z) x y)))))
    (tak 18 12 6)))
Our algorithm is able to make the following inferences:
  1. x,y,z start as integers and are only affected by 1-, thus are always integers.
  2. x,y,z are always decremented, so they can never grow in the positive direction.
  3. The value of the function comes eventually from z, and hence is an integer.
However, the algorithm cannot determine whether the function will ever stop, or whether there is any limit to the size of x,y,z in the negative direction. Thus, although the algorithm can be optimized to assume only integer arithmetic, it must still allow for the possibility of negative bignums, even if they are never used. Unfortunately, the possibility of bignums means the possibility of garbage collection, hence the compiler must be careful about keeping clean calling sequences.

Another inference algorithm would have to be much smarter than the Nimble algorithm in order to determine better range information. Proving a restricted range is equivalent to proving termination, which cannot be proved within the limits of the Kaplan-Ullman framework. Proving termination for TAK requires that relationships among variables be tracked, which is not done by the Kaplan-Ullman algorithm.

Due to these considerations, we believe that the Nimble analysis is about the best that can be expected for an automatic inferencing process given reasonable time and memory constraints.

The complexity of the Nimble type inference algorithm is relatively difficult to analyze. As we have shown in several examples above, it is difficult to determine upper bounds upon the number of iterations in the nested loops that define the Kaplan-Ullman algorithm. The program text alone is not sufficient to determine the number of iterations, since the number will depend upon the resolution of the datatype lattice being used to analyze the program. It is easy to show examples where the number of iterations of an inner loop is equal to the height of the lattice (i.e., the length of the longest chain). It should also be possible to demonstrate that the outer loop can be forced to descend the entire height of the lattice. If nested loops of this type can be demonstrated, then the complexity of the Kaplan-Ullman algorithm would be at least exponential in the size of the input program.

We have used the Nimble type inference algorithm on a number of small examples, but have not yet used it on large programs. As we have pointed out, the first few iterations of the outer loop achieve most of the progress at producing sharp bounds, with the additional iterations "polishing" the bounds within one of the loops. While it seems obvious that one should simply cut off the process at some point, the exact point to cut it off is not yet clear. For example, it is likely that the additional polishing done by the later loops is improving the performance of the program's inner loops, which usually occupy most of the program's execution time, and therefore what appears to be minor progress may actually result in substantial reductions in execution time.

Nevertheless, the Nimble algorithm seems to converge quite rapidly on the examples tried to date. This is consistent with the general experience of dataflow analyzers: on most programs, they terminate more quickly than theory would predict.

The Nimble type inferencer required about 30 seconds to type the TAK function given above on a 4 Megabyte Macintosh Plus with a 16MHz 68020 accelerator running Coral Common Lisp. While this is quite slow, it would have been about 100 times slower without the careful tuning of the bit-vector and bit-array manipulation routines described in [Baker90b]. While 30 seconds seems excessive to some, the types of computations performed by the Nimble inferencer could be easily mapped onto parallel architectures: SIMD architectures, such as the Connection Machine [Hillis85], for high performance on bit operations; MIMD architectures, for high performance on inferencing different parts of the program simultaneously.

Schwartz and Tenenbaum [Schwartz75][Tenenbaum74] were early researchers in type inferencing for the high level language SETL. They applied dataflow techniques, both "forward" and "backward", over a lattice that included higher order data structures. Their task was eased by the functional (side-effect free) nature of their higher order data structures.

The resolution of overloaded function symbols in strongly-typed languages such as Ada [Ada83] bears much resemblance to type inference. Early researchers feared that overload resolution would require an iterative forwards-and-backwards process similar to that described here [Ichbiah79,7.5.1], but these fears proved groundless when algorithms were found that performed this resolution in a single forward and a single backward pass [Pennello80].

Abstract interpretation is the name given to the generic process of "executing" a program on a lattice which is much simpler than the standard execution lattice. This process produces a kind of "homomorphic image" of the real computation, and is often used for various kinds of static analysis [Cousot77, Mycroft81, Burn87]. Most "forward" type inference, including that performed by Kaplan-Ullman, Beer and ourselves, can be viewed as a form of abstract interpretation. However, as Tenenbaum [Tenenbaum74], Kaplan-Ullman [Kaplan80] and we show, forward inferencing, and hence abstract interpretation, is not strong enough by itself to provide the information which is desired.

Constant propagation [Aho86] can be seen as a form of forward type inference or abstract interpretation [Callahan86]. This technique detects and propagates compile-time constants by evaluating expressions (including function calls, if possible) to perform as much of the computation as possible during compilation. A complete implementation of constant propagation subsumes actual program execution, since the provision of a complete set of input data would enable the computation of all output at compile time. Since constant propagation necessitates a violation of the order of evaluation, it has much in common with strictness analysis in lazy functional languages [Burn87].

Kaplan and Ullman [Kaplan80] provide an algorithm and a characterization of a type inference algorithm for a run-time data-typed language such as APL or Lisp. Their algorithm is optimal, in that for the class of languages and programs they characterize, it provides the best possible information on the range of types that a variable can assume. Kaplan and Ullman show that both "forward" inferencing (in the normal direction of computation) and "backward" inferencing (contrary to the normal direction of computation) are required in order to extract the maximum information. Forward type inferencing propagates the type information from subexpressions to the whole expression by restricting the possibilities for the mathematical range of the subexpression functions; e.g., knowledge about the non-negativity of a "square" function might be useful to restrict the possible results from the next step in the computation. Backward type inferencing propagates the type information about the mathematical domain of functions within subexpressions; e.g., if a function computes the reciprocal of a number, then the requirement of non-zeroness of that argument must be fed backward through the computation to make sure that the reciprocal function will never see a zero argument.
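
The two directions can be pictured with a toy interval domain (invented for this example; Kaplan and Ullman's lattices are more general):

def forward_square(lo, hi):
    "Forward: the result range of x*x, given x in [lo, hi]."
    corners = (lo * lo, hi * hi)
    return (0.0 if lo <= 0.0 <= hi else min(corners), max(corners))

def backward_reciprocal(lo, hi):
    # Backward: 1/x requires x /= 0, so shrink the argument interval
    # away from zero (open endpoints crudely approximated).
    eps = 1e-12
    return (eps if lo == 0.0 else lo, -eps if hi == 0.0 else hi)

forward_square(-3.0, 2.0)        # => (0.0, 9.0): never negative
backward_reciprocal(0.0, 5.0)    # => (1e-12, 5.0): zero excluded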

Kaplan and Ullman's algorithm provides the maximal amount of information, but it depends upon a rather simplified model for a programming language: a language with variables and iteration, but no recursion or data structures. Furthermore, they do not tackle the problem of functional arguments, which makes control flow analysis difficult in Lisp [Shivers88]. The Nimble type inference algorithm extends their algorithm to handle the constructs of Common Lisp.

Most existing Lisp implementations utilize a simple forward inferencing scheme in which declaration information is propagated forwards from variables to values, function arguments to function values [Moon74, Teitelman78, Marti79, Brooks82, Yuasa85]. These schemes are not state-based, and hence cannot handle case-based inferencing. Furthermore, the lattice typically used tends to be trivial--e.g., "integer/short-float/long-float/other". Beer [Beer88] has implemented the forward portion of Kaplan's algorithm for Common Lisp using a more precise, hence indiscrete, lattice to infer types and numeric bounds. He finds that it is successful at determining the types of 80% of the variables and expressions at compile-time for an interesting benchmark. More importantly, the program ran 136% faster after type inferencing, while only an additional 3.5% improvement was realized when the rest of the declarations were inserted by hand. We believe that the Nimble two-phase approach is strictly more powerful than the Beer algorithm, although they are difficult to compare because the Beer algorithm uses heuristics to terminate its loops.

[Bauer74] pointed out the possibility of type inferencing in APL. [Budd88] has implemented an APL compiler which is successful at inferring the types of most variables and subexpressions within the APL language.

[Suzuki81] and [Borning82] attack the problem of type inferencing in the Smalltalk language. In Smalltalk, control flow and data flow analysis must be done simultaneously, since in many cases, the code executed depends upon the type and values of the data, and vice versa. They find that Smalltalk also has enough redundancy to make type inferencing quite successful.

Range inferencing is similar in concept to type inferencing. Here, we would like to narrow the range of values assumed by a variable or an expression to be less than the whole universe of values of the particular data type. For example, if a variable is inferred to be an integer, we would like to determine whether its values are restricted to a small set of integers, perhaps 0-255, so that additional optimization can be performed. Range inferencing is particularly important in reducing the need for array bounds checking, because bounds checking can slow down and possibly defeat several array indexing optimizations.
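
In sketch form (the ranges are invented for the example): a check on v(i) is redundant precisely when the inferred range of i lies within the inferred bounds of v.

def needs_bounds_check(index_range, min_length):
    "True unless the index provably stays inside [0, min_length - 1]."
    lo, hi = index_range
    return not (0 <= lo and hi <= min_length - 1)

needs_bounds_check((0, 9), 10)    # => False: the check can be dropped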

[Harrison77] is one of the first to report on compile-time range inferencing, with [Suzuki] and [Markstein82] following. Even though the results these researchers reported were positive, very few commercial compilers incorporate this sort of analysis, except for Ada compilers [Taffs85], in which range checks are required unless they can be proved redundant. To avoid the overhead of array bounds checking in those compilers which do not perform the analysis, the user must turn off all array bounds checking. This practice is too dangerous for applications where an error could cause loss of property or life.

Even in those cases where array-bounds checking cannot be eliminated, a competent type checker can still be beneficial. The programmer may already have performed his own range check to obtain a more graceful error recovery than the language system normally provides, and in many of these cases, the type checker can conclude that an additional check inserted by the compiler would be redundant.

Array bounds checking demonstrates one significant weakness of the Nimble type inference algorithm relative to strongly-typed languages like Ada [AdaLRM]. Ada is a strongly typed language which has a substantial amount of machinery for declaring and manipulating variables subject to range constraints. However, unlike Nimble ranges, whose endpoints must be numeric constants, Ada ranges can have variables as endpoints, meaning that the size of the range is not known until run-time. Thus, an Ada compiler can relatively painlessly determine that the array bounds of v are never violated in the following code, by relying on Ada's strong typing system:

type vector is array(natural range <>) of float;

function sum(v: vector) return float is
   total: float := 0.0;
begin
   for i in v'range loop
      total := total + v(i);
   end loop;
   return total;
end sum;
On the other hand, the current Nimble type inferencer cannot eliminate the bounds checking on v in the following equivalent Common Lisp code due to its inability to represent such variable ranges:
(defun sum (v &aux (total 0))
  (dotimes (i (length v) total)
    (incf total (aref v i))))
ML-style type inferencing [Milner78] elegantly solves two problems--typing higher order functions and data structures, and avoiding the forward-backward iterations of the dataflow techniques. However, ML-style type inferencing also has several deficiencies. It cannot handle case-based inferencing due to its lack of state, and it cannot handle full Lisp-like polymorphism.

The ML-style unification algorithm which comes closest in goals to ours is that of [Suzuki81] for Smalltalk-76. Suzuki extends the ML algorithm to handle unions of base types, which are quite similar to our techniques for representing Common Lisp types. He uses Milner-style unification to solve a set of simultaneous inequalities on the datatypes of the variable instances instead of the more precise (and slower) Scott-style least-fixed-point limit steps. The Suzuki method may be somewhat faster than our method and it easily extends to higher-order functions, but it does not produce bounds which are as tight as those produced by the Nimble algorithm. For example, it cannot conclude that the argument to the factorial function remains a non-negative fixnum if it starts as a non-negative fixnum, nor can it conclude that the value is always a positive integer if the argument is a non-negative integer.

[Wand84] describes an ML-style type checker for Scheme, another dialect of Lisp. It handles straightforward ML-style polymorphism, and is best characterized as "ML with parentheses". However, this method is not nearly as powerful as that in [Suzuki81], because it cannot handle the unions of datatypes introduced by Suzuki, and cannot therefore handle the polymorphism of real Lisp programs.

The Nimble type inference algorithm could be used in a functional programming environment, where it could infer sharper information than the ML unification algorithm. This is because the Nimble algorithm can handle polymorphism and case-based reasoning in a way that would be impossible for a unification-based algorithm. Its ability to type builtin functions more accurately than ML will also produce sharper type information. While it may be more expensive to run than a unification-based inference algorithm (although ML typing is itself known to be DEXPTIME-complete [Mairson90]), its better information may yield more efficient programs--a reasonable trade-off in some situations.

Type inferencing in a run-time data typed language such as Lisp or APL is not needed for simple execution. If the goal is optimized execution, however, then more specific information as to the types of variables and expressions is necessary. Type inferencing cannot be dispensed with through additional declarations; e.g., declarations force the same type for an argument in all calls to a procedure, and eliminate the possibility of polymorphism, or execution of the same code at different times with different types [Cardelli85]. Type inferencing can be a real boon in checking types across procedure call interfaces, and allows different types to be inferred within a procedure depending upon the actual arguments.

Generalized type inferencing would seem to be hopeless. However, while many examples can be contrived to show the impossibility of assigning a distinct type to an expression, most real programs have more than enough redundancy in the use of the built-in functions and operators to enable most data types to be unambiguously assigned [Beer88]. The consequence of an ambiguous assignment in Lisp is not necessarily an error, but it does reduce the possibilities for optimization; hence the more tightly the datatypes are constrained, the more efficiently the code will run.

We have described a type inference algorithm for Common Lisp which has evolved from the Kaplan-Ullman algorithm [Kaplan80] to the point that it can handle the entire Common Lisp-84 language [CLtL84]. We have shown, through a number of examples, that this algorithm uses case-based and state-based reasoning to deduce tight lattice bounds on polymorphic functions, including recursive functions. We have described a number of novel techniques for engineering an efficient implementation of the lattice manipulations required by this algorithm. We have shown how this algorithm is strictly more powerful than other popular techniques such as unification-based techniques [Milner78] on some examples, and seems more appropriate for highly polymorphic languages such as Lisp. While the algorithmic complexity of our inferencer is higher than usual for Lisp compilers, its better information can be used for a greater than usual degree of optimization. The fact that this information can be extracted in a completely mechanical fashion, and the fact that the kind of processing required can be greatly accelerated by parallel computers, mean that the cost of type inference will decrease quickly over time.

A possible improvement that could be made to the basic Kaplan-Ullman type inference machinery is the employment of a larger number of lattices. So long as every inner loop in the Kaplan-Ullman algorithm is allowed to complete, the computed bound can be used as an upper bound on the next stage execution of the inner loop. If this next stage uses a more refined lattice, then tighter bounds can be inferred. Therefore, we could conceivably start with a coarse lattice, distinguishing only between scalars, list cells, functions, etc. The next stage could distinguish various kinds of numbers, various kinds of list cells, etc. Only in the latest stages would we distinguish among the higher order kinds of data structures and their components. A large amount of space and time in type inferencing could be saved by reserving a higher resolution lattice for numbers, only for those variables which have already been shown to be numbers; a higher resolution lattice for different kinds of list cells could be reserved just for those variables shown to be only list cells; and so forth. In this way, we could utilize different lattices for different variables, which is an improvement that we could also have achieved through strong typing. However, our lattice approach allows far more flexibility, because not all variables need be resolved to the same level of refinement.

Since the Nimble type inferencer must deal with the entirety of the Common Lisp-84 language, it must have a reasonably deep understanding of every one of its datatypes, constructs and functions. One may ask whether the enormous effort involved in incorporating this knowledge into a static analyzer is worthwhile. The answer is yes, if there exist important Common Lisp programs which would be expensive to modify and which need to be statically analyzed.

In most cases, the Lisp community would be better served by a language which is much smaller than Common Lisp, since the many different and often redundant features of the language do not contribute either to its efficiency or to its ease of use. For example, the polymorphic type complexity of the Common Lisp library functions is mostly gratuitous, and both the efficiency of compiled code and the efficiency of the programmer could be increased by rationalizing this complexity. Notions such as dynamic floating-point contagion, multiple-values, complex argument-passing, and special variables are obsolete in today's world. Most strings and lists in Lisp are used in a functional manner, yet they are heavily penalized in performance by the remote possibility of side-effects. A major advance in the run-time efficiency and ease of static analysis of Lisp-like languages could be achieved if Lisp programs and argument lists were constructed from some functional data structure instead of from cons cells.

The author wishes to thank the Department of Energy for their support and James J. Hirsch for his help in editing this manuscript.

AdaLRM: Reference Manual for the Ada® Programming Language. ANSI/MIL-STD-1815A-1983, U.S. Government Printing Office, Wash., DC, 1983.

Aho, Alfred V.; Sethi, Ravi; and Ullman, Jeffrey D. Compilers: Principles, Techniques, and Tools. Addison-Wesley, 1986.

Baker90a: Baker, Henry. "Unify and Conquer (Garbage, Updating, Aliasing...) in Functional Languages". Proc. 1990 ACM Conference on Lisp and Functional Programming, June 1990.

Baker90b: Baker, Henry. "Efficient Implementation of Bit-vector Operations in Common Lisp". ACM Lisp Pointers 3,2-3-4 (April-June 1990), 8-22.

Baker92: Baker, Henry. "A Decision Procedure for Common Lisp's SUBTYPEP Predicate". Lisp and Symbolic Computation 5,3 (Sept. 1992), 157-190.

Bauer, Alan M., and Saal, Harry J. "Does APL really need run-time checking?" Software Practice and Experience 4 (1974),129-138.

Beer, Randall D. "The compile-time type inference and type checking of Common Lisp programs: a technical summary". TR 88-116, Ctr. for Automation and Intelligent Sys. Research, Case Western Reserve Univ., May 1988; also LISP Pointers 1,2 (June-July 1987),5-11.

Bobrow, et al. "Common Lisp Object System Specification X3J13", ACM SIGPLAN Notices, v.23, Sept. 1988; also X3J13 Document 88-002R, June 1988.

Borning, Alan H., and Ingalls, Daniel H. H. "A Type Declaration and Inference System for Smalltalk". ACM POPL 9 (1982),133-141.

Brooks, R., Gabriel, R., Steele, G. "S-1 Common Lisp Implementation". Proc. of '82 ACM Symp. on Lisp and Funct. Prog., (Aug. 1982),108-113.

Brooks, R., et al. "Design of an Optimizing, Dynamically Retargetable Compiler for Common Lisp". Proc. of '86 ACM Conf. on Lisp and Funct. Prog., (Aug. 1986),67-85.

Budd, Timothy. An APL Compiler. Springer-Verlag, NY, 1988.

Burn, G.L. Abstract Interpretation and the Parallel Evaluation of Functional Languages. Ph.D. Thesis, Imperial College, London, 1987.

Callahan, D., Cooper, K.D., Kennedy, K., and Torczon, L. "Interprocedural Constant Propagation". Proc. Sigplan '86 Symp. on Compiler Constr., also Sigplan Notices 21, 7 (July 1986),152-161.

Cardelli, L., and Wegner, P. "On Understanding Types, Data Abstraction, and Polymorphism". ACM Comput. Surv. 17,4 (Dec. 1985),471-522.

Cartwright, R. "User-defined Data Types as an Aid to Verifying Lisp Programs". In Michaelson, S., and Milner, R. (eds.). Automata, Languages and Programming, Edinburgh Press, Edinburgh,228-256.

CLtL84: Steele, Guy L., Jr. Common Lisp: The Language. Digital Press, 1984.

Cousot, P., and Cousot, R. "Abstract Interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints". 4'th ACM POPL, 1977,238-252.

Dijkstra, E.W. A Discipline of Programming. Prentice-Hall, Englewood Cliffs, NJ, 1976.

Ellis, John R. Bulldog: A Compiler for VLIW Architectures. MIT Press, Cambridge, MA, 1986.

Ferrante, Jeanne, and Rackoff, Charles W. "A decision procedure for the first order theory of real addition with order". SIAM J. Comput. 4, 1 (1975),69-76.

Gabriel, Richard P. Performance and Evaluation of Lisp Systems. MIT Press, Cambridge, MA, 1985.

Harper, R., et al. "Standard ML". Technical Report ECS-LFCS-86-2, Dept. of Computer Science, Edinburgh, UK, March, 1986.

Harrison, William. "Compiler Analysis of the Value Ranges for Variables". IEEE Trans. Soft. Eng. SE-3,3 (May 1977),243-250.

Haynes, Christopher T., and Friedman, Daniel P. "Embedding Continuations in Procedural Objects". ACM TOPLAS 9,4 (Oct. 1987),582-598.

Hillis, W. Daniel. The Connection Machine. The MIT Press, Cambridge, MA, 1985.

Ichbiah, J. "Rationale for the design of the Ada programming language." ACM Sigplan Notices 14,6 (June 1979),part B.

Intel Corporation. i860 64-bit Microprocessor Programmer's Reference Manual. Order #240329, Intel Corporation, Santa Clara, CA, 1989.

Jones, N.D., and Muchnick, S. "Binding time optimization in programming languages". 3'rd ACM POPL, Atlanta, GA (1976),77-94.

Kanellakis, P.C., and Mitchell, J.C. "Polymorphic unification and ML typing". ACM Funct. Prog. Langs. and Comp. Arch. (FPCA), 1989,54-74.

Kaplan, Marc A., and Ullman, Jeffrey D. "A Scheme for the Automatic Inference of Variable Types". ACM JACM 27,1 (Jan. 1980),128-145.

Katayama, Takuya. "Type Inference and Type Checking for Functional Programming Languages--A Reduced Computation Approach". 1984 ACM Conf. on Lisp and Funct. Prog., Aug. 1984,263-272.

Kranz, D., Kelsey, R., Rees, J., Hudak, P., Philbin, J., and Adams, N. "ORBIT: An Optimizing Compiler for Scheme". Proc. Sigplan '86 Symp. on Compiler Constr., also Sigplan Notices 21, 7 (July 1986),219-233.

Ma, Kwan-Liu, and Kessler, Robert R. "TICL--A Type Inference System for Common Lisp". SW--Prac. & Exper. 20,6 (June 1990),593-623.

MacLane, Saunders and Birkhoff, Garrett. ALGEBRA. Macmillan, 1967.

Mairson, H.G. "Deciding ML Typability is Complete for Deterministic Exponential Time". 17'th ACM POPL (Jan. 1990),382-401.

Markstein, Victoria; Cocke, John; and Markstein, Peter. "Optimization of Range Checking". ACM POPL '82,114-119.

Marti, J., Hearn, A.C., Griss, M.L., and Griss, C. "Standard LISP Report". Sigplan Notices 14, 10 (Oct. 1979),48-68.

Milner, Robin. "A Theory of Type Polymorphism in Programming". JCSS 17 (1978),348-375.

Moon, D. MACLISP Reference Manual Rev. 0. Proj. MAC--M.I.T., Camb., MA, April 1974.

Morris, J.H. "Types are Not Sets". ACM POPL (1973),120-124.

Mycroft, Alan. Abstract Interpretation and Optimising Transformation for Applicative Programs. Ph.D. Thesis, Univ. Edinburgh, Scotland, 1981.

Pennello, T., and Meyers, R. "A Simplified Operator Identification Scheme in Ada". ACM Sigplan Notices 15, 7&8 (July-Aug. 1980),82-87.

Schwartz, J.T. "Optimization of very high level languages--I. Value transmission and its corollaries". J. Computer Lang. 1 (1975),161-194.

Scott, D. "Data types as lattices". SIAM J. Computing, 5,3 (Sept. 1976), 522-587.

Sethi, Ravi. "Conditional Expressions with Equality Tests". J. ACM 25,4 (Oct. 1978),667-674.

Shivers, O. "Control Flow Analysis in Scheme". ACM Sigplan Conf. '88,164-174.

Steele, Guy L., Jr. Rabbit: A Compiler for SCHEME (A Study in Compiler Optimization). AI-TR-474, Artificial Intelligence Laboratory, MIT, May 1978.

Suzuki, Norihisa. "Implementation of an array bound checker". ACM POPL ???.

Suzuki, Norihisa. "Inferring Types in Smalltalk". ACM POPL 8 (1981),187-199.

Taffs, D.A., Taffs, M.W., Rienzo, J.C., Hampson, T.R. "The ALS Ada Compiler Global Optimizer". in Barnes & Fisher, "Ada in Use": Proc. Ada Int'l. Conf., Camb. Univ. Press, 1985,355-366.

Teitelman, W., et al. InterLISP Reference Manual. Xerox PARC, Palo Alto, CA, 1978.

Tenenbaum, A. "Type determination for very high level languages". Ph.D. Thesis, Rep. NSO-3, Courant Inst. Math. Sci., New York U., New York, 1974.

Thatte, Satish R. "Quasi-static Typing". 17'th ACM POPL '90, Jan. 1990,367-381.

Wand, M. "A Semantic Prototyping System". Proc. ACM Sigplan '84 Symp. on Compiler Constr., Sigplan Notices 19,6 (June 1984),213-221.

Yuasa, T., and Hagiya, M. Kyoto Common Lisp Report. Research Institute for Mathematical Sciences, Kyoto University, 1985.

Exploring Functional Reactive Programming in Python


As a software developer who cares about making robust, debuggable systems, I’ve been interested for a while now in ideas like Functional Reactive Programming, or the Elm Architecture. Sadly, regular employment does not include a lot of opportunities to play with new ideas, so it wasn’t until I took some time off over Easter that I had a chance to sit down and explore for myself.

I had originally intended to implement something like the simple counter example from the Elm documentation, but as a console application in Python, and built on top of a Functional Reactive base. Unfortunately, it turns out that Elm does not have anything to do with FRP these days, so I wound up with two loosely-related things instead of the single compelling example I’d hoped for. (If you’d like to take a look at the end result, the complete project is up on GitLab.)

Nevertheless, I did learn things, and I want to talk about them. But first, some background information.

What is functional programming?

Functional programming means a lot of different things to different people, but the relevant part to this discussion is mathematically pure functional programming, or just “pure functions” for short. A pure function is one whose return value is determined only by its inputs.

For example, this is a perfectly respectable pure function:

>>> def add_seven(x):
...     return 7 + x

Every time you pass 5 into that function, you’re always going to get 12 back out, no matter what the rest of the program might have done in the meantime.

>>> add_seven(5)
12
>>> add_seven(5)
12

On the other hand, here’s an example of an impure function:

>>> def number_from_file(somefile):
...     return int(somefile.readline())

Even if you pass in exactly the same value for somefile twice in a row, you probably won’t get the same result out. The second time, .readline() will read the next line of the file, not the same line it read originally:

>>> import io
>>> handle = io.StringIO("5\n9\n32\n17\n")
>>> number_from_file(handle)
5
>>> number_from_file(handle)
9

Pure functional code is easy to work with, because all the context you need to understand it is right in front of you. You don’t need to keep notes about what the rest of the program might be doing. It’s also easier to unit-test, since you don’t need to mock out API calls or reset test fixtures between each test.

It’s impossible to write an entire program in a pure functional style, because eventually you need to get input from the outside world and send a result somewhere. However, the more pure functional code in your codebase, the greater the chance that any given bug will be in one of the easy-to-understand, easy-to-test, easy-to-fix parts of the code.

Where does “reactive” come in?

Functional Reactive Programming (FRP, from now on) is like pure functional programming, except that instead of a function taking values and returning a new value, a function takes streams of values and returns a new stream. You could say that the function “reacts” to changes in its input streams by updating its output stream. Imagine a stream of numbers being fed into the FRP equivalent of add_seven() above. If the input stream contains 5, 9, 32, and 17, the output stream would include 12, 16, 39 and 24.

One natural use of FRP is service monitoring: given some application’s log file, a monitoring tool could read the logs, parse each line, and apply stream functions to convert that stream of records into streams of statistics like “average request latency” or “total requests per client”. You could have even more stream functions stacked on top, like a five-minute moving average of request latency, a one-hour moving average of request latency, and a stream that contains an alert while the five-minute moving average is 20% higher than the one-hour moving average.

FRP is not just about monitoring; it can be useful in any situation where events happen over time. You might think of the state of a database as a function over a stream of insert/update events, or think of a user-interface as a function over a stream of keyboard/mouse/touch events.

If you’ve written much Python, a function that takes a stream and produces another stream might sound a lot like Python’s generator functions.

Let’s try generator functions!

You could write a generator function to add 7 to a stream like this:

>>> def add_seven_generator(xs):
...     for x in xs:
...         yield 7 + x
...
>>> xs = iter([5, 9, 32, 17])
>>> numbers = add_seven_generator(xs)
>>> next(numbers)
12
>>> next(numbers)
16
>>> next(numbers)
39
>>> next(numbers)
24

However, Python’s generators aren’t quite the same thing as FRP. Generators are designed to link together into a single long chain: once a generator has yielded a value it must move on to the next value; it can’t yield the same value to any other downstream generators that might want to consume it.

As an example, let’s say you have a function that adds two numbers:

>>> def add(x, y):
...     return x + y

You can use that function to double a number by adding the number to itself:

>>> x = 5
>>> add(x, x)
10

However, if you have a generator function that adds two iterables of numbers:

>>> def add_generator(xs, ys):
...     for x, y in zip(xs, ys):
...         yield x + y

…you cannot use it to double numbers:

>>> xs = iter([5, 9, 39, 24])
>>> doubles = add_generator(xs, xs)
>>> next(doubles)
14

Doubling 5 should not produce 14!

Because of the way Python generators work, the call to zip() inside add_generator() gets the first and second values from xs, rather than two copies of the first value. To implement FRP in Python, we need different behaviour. We want streams to be re-usable, so the same stream can be passed to many different stream functions, the same way a value like 5 can be passed to many different pure functions without being “used up”.

The Stream interface

There’s two basic styles of FRP we could follow:

  • in the push model, we wait for an input event, then notify all the stream functions subscribed to that input, then notify all the stream functions subscribed to the first round of stream functions, and so forth until we get to all the downstream outputs.
  • in the pull model, an output asks its immediate upstream for a new value, that upstream asks its upstreams for a new value, and so forth until we get to all the upstream inputs that can affect the output we’re interested in.

In Python, there’s no standard way for a function to refer to where its output goes to, so a “push” system would likely be awkward to use. On the other hand, every Python function can refer to its inputs (that’s what function parameters are!) so the “pull” model should be a natural fit.

Assuming we have a Python object representing a stream, the “pull” model requires a method to poll for the current value:

>>> class IStream:
...     def poll(self):
...         "Return the current output value of this stream."

As a very basic example, we can describe a stream containing a sequence of numbers:

>>> class NumberStream(IStream):
...     "A stream of numbers"
...     def __init__(self, *numbers):
...         self._numbers = iter(numbers)
...
...     def poll(self):
...         return next(self._numbers)

It works exactly how you’d expect:

>>> numbers = NumberStream(5, 9, 32, 17)
>>> numbers.poll()
5
>>> numbers.poll()
9

The stream equivalent of add() is almost as simple:

>>> class AddStream(IStream):
...     "A stream that adds its inputs"
...     def __init__(self, xs, ys):
...         self._xs = xs
...         self._ys = ys
...
...     def poll(self):
...         # Get the latest x from the stream of xs
...         x = self._xs.poll()
...         # Get the latest y from the stream of ys
...         y = self._ys.poll()
...
...         return x + y

If we apply this stream function to a NumberStream of 5s, and a NumberStream of 7s, we will get a stream of 12s, just like applying add() to the values 5 and 7 gives the single value 12:

>>> fives = NumberStream(5, 5)
>>> sevens = NumberStream(7, 7)
>>> twelves = AddStream(fives, sevens)
>>> twelves.poll()
12
>>> twelves.poll()
12

Now we can recreate the generators example from before, but in proper FRP style:

>>> numbers = NumberStream(5, 9, 32, 17)
>>> doubles = AddStream(numbers, numbers)
>>> doubles.poll()
14

Waaaaaait, that’s not right! This is exactly the same problem we had before!

The Stream interface, take 2

If we think about a stream as a value that can change over time, it makes sense that repeated polls at the same time should return the same value. The calls to .poll() do not literally happen at exactly the same nanosecond, but they’re close enough together that we expect them to be treated the same.

That is, we want to introduce an idea of “time” to IStream so that the caller can keep it “the same time” until they’re done examining output streams, and then “advance time” when they are ready to see what happens next.

>>> class IStream:
...     def poll(self, phase):
...         "Return the current output value of this stream."

We’ve added a new parameter named phase. When a stream’s .poll() method is called, if the value of phase is the same as it was for the previous call, the method must return the same value as it did for the previous call. If the phase argument has changed since the previous call, then the stream function may recalculate its output. The idea is that a system decides it’s in “blue” phase (or whatever) and when it polls all the streams it cares about, it can be sure all the calculations are based on the system state at the time “blue” phase began. Then it can switch to “green” phase (or whatever) and be sure that none of the stream outputs are based on stale “blue”-phase data.

Because every stream needs to handle phases in the same way, let’s put that functionality into a base class where it can be easily shared:

>>> class BaseStream(IStream):
...     "A base class that handles stream phases for you."
...     def __init__(self):
...         self._last_phase = None
...         self._last_output = None
...
...     def _poll(self, phase):
...         "Override this to implement the actual stream function"
...
...     def poll(self, phase):
...         if phase != self._last_phase:
...             self._last_phase = phase
...             self._last_output = self._poll(phase)
...
...         return self._last_output

Now, instead of overriding .poll() to implement the calculation we want, we must override ._poll() (with the leading underscore), which .poll() only calls once per phase. Note that ._poll() still takes a phase argument: if it needs data from other streams, it will need to pass the current phase along when polling them.

Let’s recreate NumberStream in this new style:

>>> class NumberStream(BaseStream):
...     "A stream of numbers"
...     def __init__(self, *numbers):
...         super().__init__()
...         self._numbers = iter(numbers)
...
...     # Because NumberStream does not poll any other streams,
...     # it does not need the 'phase' argument.
...     def _poll(self, _):
...         return next(self._numbers)

Inheriting from BaseStream ensures that NumberStream respects phases:

>>> numbers = NumberStream(5, 9, 32, 17)
>>> numbers.poll("blue")
5
>>> numbers.poll("blue")
5
>>> numbers.poll("green")
9

Meanwhile, AddStream is nearly the same as it was before:

>>> class AddStream(BaseStream):
...     "A stream that adds its inputs"
...     def __init__(self, xs, ys):
...         super().__init__()
...         self._xs = xs
...         self._ys = ys
...
...     def _poll(self, phase):
...         # Get the latest x from the stream of xs
...         x = self._xs.poll(phase)
...         # Get the latest y from the stream of ys
...         y = self._ys.poll(phase)
...
...         return x + y

But this time, because we have phases, the doubling example works properly:

>>> numbers = NumberStream(5, 9, 32, 17)
>>> doubles = AddStream(numbers, numbers)
>>> doubles.poll("blue")
10
>>> doubles.poll("green")
18

Huzzah!

Stateful stream functions

So far we’ve made stream-based versions of pure functions like “addition”. Because addition is a pure function, it should not be surprising that stream addition maintains the understandable, testable nature of the pure functions it’s based on.

However, streams implicitly hide a little state (the current position in the stream), which means we can write stream functions that do more than stateless, pure functions can do. Accessing arbitrary shared state (like global variables) still breaks the rules, but a stateful stream function can keep the understandable, testable nature of pure functions as long as the stream function always produces the same output stream given a particular set of input streams.

Let’s say we have some stream that’s intermittent (that is, sometimes it has a useful value and sometimes it just has None), and we want a stream that always has a value (for example, maybe we want to compare it to some other stream). We need to “fill the gaps” in the input stream with some sensible value to produce an output stream that always has useful values.

Here’s a stateful stream that does just that:

>>> class GapFiller(BaseStream):
...     "Fills gaps in the input stream with the last good value."
...     def __init__(self, xs):
...         super().__init__()
...         self._xs = xs
...         self._last_good_value = None
...
...     def _poll(self, phase):
...         x = self._xs.poll(phase)
...
...         if x is not None:
...             self._last_good_value = x
...
...         return self._last_good_value

GapFiller maintains extra state in the form of the ._last_good_value property, but its behaviour is completely determined by the input stream, so it’s just as easy to test as a plain pure function:

>>> maybe_numbers = NumberStream(5, None, None, 17)
>>> numbers = GapFiller(maybe_numbers)
>>> numbers.poll("blue")
5
>>> numbers.poll("green")
5
>>> numbers.poll("blue")
5
>>> numbers.poll("green")
17

At this point, we’ve got basic FRP functionality working, but it’s difficult to use: creating a new subclass of BaseStream, overriding ._poll(), calling .poll() on our input streams… that’s a bunch of boilerplate that obscures the actual logic of a stream function.

A decorator for stream functions

Let’s make a decorator that turns a pure function into a pure stream function:

>>> def stream_function(func):
...     "Decorates a pure function, making a stream function."
...
...     class Inner(BaseStream):
...         # We don't know in advance how many inputs func() takes,
...         # so we'll take any number.
...         def __init__(self, *inputs):
...             super().__init__()
...             self._inputs = inputs
...
...         def _poll(self, phase):
...             # Poll all the input streams to get their current values.
...             current_input_values = [
...                 each.poll(phase)
...                 for each in self._inputs
...             ]
...
...             # Whatever func() returns, given these values,
...             # is the current output stream value.
...             return func(*current_input_values)
...
...     return Inner

The Inner class is similar to AddStream, except that it handles any number of inputs instead of hardcoding exactly two, and it calls the wrapped function instead of hard-coding x + y. Now we can write a stream function by writing a pure function and decorating it:

>>> @stream_function
... def add_streams(x, y):
...     return x + y

…and all the stream-polling and phase-handling happens automatically:

>>> numbers = NumberStream(5, 9, 32, 17)
>>> doubles = add_streams(numbers, numbers)
>>> doubles.poll("blue")
10

Huzzah again!

Stateful stream functions, too!

Although Python generator functions can’t properly iterate over an FRP stream (because they don’t pass along the current phase), they’re still a convenient way to express resumable calculations. If there were some non-iterator-based way to feed in new values from input streams, they could be quite convenient.

Luckily, Python 2.5 introduced “yield expressions”, where the generator function is paused, yields a value, and the caller can later pass in a new value which will be returned by the yield expression. We can use this to write a decorator that makes stateful stream functions out of generator functions:

>>> def stateful_stream_function(func):
...     "Decorates a generator function, making a stream function."
...
...     class Inner(BaseStream):
...         # We don't know in advance how many inputs func() takes,
...         # so we'll take any number.
...         def __init__(self, *inputs):
...             super().__init__()
...             self._inputs = inputs
...
...             # We need to store the generator object returned
...             # when we call func(), so we can repeatedly resume it.
...             self._generator = None
...
...         def _poll(self, phase):
...             # Poll all the input streams to get their current values.
...             current_input_values = [
...                 each.poll(phase)
...                 for each in self._inputs
...             ]
...
...             # If we have not yet created the generator object...
...             if self._generator is None:
...                 # ...create it, passing the initial input values...
...                 self._generator = func(*current_input_values)
...                 # ...then execute up to the first yield expression.
...                 # The yielded value is our first output.
...                 return self._generator.send(None)
...
...             # If we already have a generator object,
...             # pass in the next input values and execute up to
...             # the next yield expression. The yielded value is
...             # our next output.
...             return self._generator.send(current_input_values)
...
...     return Inner

This is more complex than stream_function() because it has to deal with Python’s generator object API, but the basic structure is the same.

We can re-implement the GapFiller stateful stream function as a generator function in this style:

>>> @stateful_stream_function
... def fill_gaps(x):
...     last_good_value = None
...     while True:
...         if x is not None:
...             last_good_value = x
...
...         x, = yield last_good_value

The initial value for x is passed in as a parameter, and subsequent values are returned from the yield expression.

Note the comma after x in the last line. Because stateful_stream_function() is designed to work for functions with any number of parameters and the generator .send() method only takes a single value, we always send a list of input values. If this function took multiple inputs, we could say:
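
x, y = yield last_good_value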

…but because it only takes one, we need the odd-looking x, to unpack the one-item list.

Although this generator function is not completely idiomatic Python, it’s still straightforward code, and it produces a proper FRP stream:

>>> maybe_numbers = NumberStream(5, None, None, 17)
>>> numbers = fill_gaps(maybe_numbers)
>>> numbers.poll("blue")
5
>>> numbers.poll("green")
5
>>> numbers.poll("blue")
5
>>> numbers.poll("green")
17

FRP is a promising technique, and (as demonstrated) it can be neatly and ergonomically implemented in Python. That said, there are a number of potential problems with the implementation shown:

  • As far as I know, there’s no standard model for handling streams that end, only infinite streams. As you can imagine, in practice streams almost always end for one reason or another.
  • If you have more than one independent input stream, like network sockets or a timer, you will need some kind of concurrency framework like Twisted Python or asyncio to manage them. If you figure out how to build an FRP system on top of one of those, I’d love to hear about it.
  • This implementation runs the entire stream network in lock-step, requiring every input stream to produce a new value at the same time. You might have input streams that produce values at different rates.
  • The stream_function() and stateful_stream_function() decorators make no attempt to handle keyword-arguments, or keyword-only arguments, or anything besides plain positional arguments.
  • The stream_function() and stateful_stream_function() decorators require every argument to be a stream. If you want some parameters to be constant, you have to wrap them in a stream that returns the same value forever, as sketched just below. That’s understandable, but clunky to use.
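
For illustration, wrapping a constant might look like this (ConstantStream is not part of the code above, just a minimal sketch reusing the BaseStream machinery):

>>> class ConstantStream(BaseStream):
...     "A stream whose value never changes."
...     def __init__(self, value):
...         super().__init__()
...         self._value = value
...
...     def _poll(self, _):
...         return self._value

>>> total = add_streams(NumberStream(5, 9), ConstantStream(7))
>>> total.poll("blue")
12
>>> total.poll("green")
16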

For the specific use-case I had in mind (a single input stream, namely window.get_wch() in the curses module), none of these limitations affected me, but your mileage may vary.

If you have comments or questions, you can discuss this post on Lobste.rs, Hacker News, or /r/Python.

Biology needs more staff scientists

Maria Nemchuk/Broad Inst.

Staff scientist Stacey Gabriel co-authored 25 of the most highly cited papers worldwide in 2015.

Most research institutions are essentially collections of independent laboratories, each run by principal investigators who head a team of trainees. This scheme has ancient roots and a track record of success. But it is not the only way to do science. Indeed, for much of modern biomedical research, the traditional organization has become limiting.

A different model is thriving at the Broad Institute of MIT and Harvard in Cambridge, Massachusetts, where I work. In the 1990s, the Whitehead Institute for Biomedical Research, a self-governing organization in Cambridge affiliated with the Massachusetts Institute of Technology (MIT), became the academic leader in the Human Genome Project. This meant inventing and applying methods to generate highly accurate DNA sequences, characterize errors precisely and analyse the outpouring of data. These project types do not fit neatly into individual doctoral theses. Hence, the institute created a central role for staff scientists — individuals charged with accomplishing large, creative and ambitious projects, including inventing the means to do so. These non-faculty scientists work alongside faculty members and their teams in collaborative groups.

When leaders from the Whitehead helped to launch the Broad Institute in 2004, they continued this model. Today, our work at the Broad would be unthinkable without professional staff scientists — biologists, chemists, data scientists, statisticians and engineers. These researchers are not pursuing a tenured academic post and do not supervise graduate students, but do cooperate on and lead projects that could not be accomplished by a single academic laboratory.

Physics long ago saw the need to expand into different organizational models. The Manhattan Project, which during the Second World War harnessed nuclear energy for the atomic bomb, was not powered by graduate students. Europe's particle-physics laboratory, CERN, does not operate as atomized labs with each investigator pursuing his or her own questions. And the Jet Propulsion Laboratory at the California Institute of Technology in Pasadena relies on professional scientists to get spacecraft to Mars.

A different tack

In biology, many institutes in addition to the Broad are experimenting with new organizational principles. The Mechanobiology Institute in Singapore pushes its scientists to use tools from other disciplines by discouraging individual laboratories from owning expensive equipment unless it is shared by all. The Howard Hughes Medical Institute's Janelia Research Campus in Ashburn, Virginia, the Salk Institute for Biological Studies in La Jolla, California, and the Allen Institute for Brain Science in Seattle, Washington, effectively mix the work of faculty members and staff scientists. Disease-advocacy organizations, such as the ALS Therapy Development Institute in Cambridge, do their own research without any faculty members at all.

Each of these institutes has a unique mandate, and many are fortunate in having deep resources. They also had to be willing to break with tradition and overcome cultural barriers.

At famed research facilities of yore, such as Bell Labs and IBM Laboratories, the title 'staff scientist' was a badge of honour. Yet to some biologists the term suggests a permanent postdoc or senior technician — someone with no opportunities for advancement who works solely in a supervisor's laboratory, or who runs a core facility providing straightforward services. That characterization sells short the potential of professional scientists.

The approximately 430 staff scientists at the Broad Institute develop cutting-edge computational methods, invent and incorporate new processes into research pipelines and pilot and optimize methodologies. They also transform initial hits from drug screens into promising chemical compounds and advance techniques to analyse huge data sets. In summary, they chart the path to answering complex scientific questions.

Although the work of staff scientists at the Broad Institute is sometimes covered by charging fees to its other labs, our faculty members would never just drop samples off with a billing code and wait for data to be delivered. Instead, they sit down with staff scientists to discuss whether there is an interesting collaboration to be had and to seek advice on project design. Indeed, staff scientists often initiate collaborations.

Naturally, tensions still arise. They can play out in many ways, from concerns over how fees are structured, to questions about authorship. Resolving these requires effort, and it is a task that will never definitively be finished.

In my view, however, the staff-scientist model is a win for all involved. Complex scientific projects advance more surely and swiftly, and faculty members can address questions that would otherwise be out of reach. This model empowers non-faculty scientists to make independent, creative contributions, such as pioneering new algorithms or advancing technologies. There is still much to do, however. We are working to ensure that staff scientists can continue to advance their careers, mentor others and help to guide the scientific direction of the institute.

As the traditional barriers break down, science benefits. Technologies that originate in a faculty member's lab sometimes attract more collaborations than one laboratory could sustain. Platforms run by staff scientists can incorporate, disseminate and advance these technologies to capture more of their potential. For example, the Broad Institute's Genetic Perturbation Platform, run by physical chemist David Root, has honed high-throughput methods for RNA interference and CRISPR screens so that they can be used across the genome in diverse biological contexts. Staff scientists make the faculty more productive through expert support, creativity, added capacity and even mentoring in such matters as the best use of new technologies. The reverse is also true: faculty members help staff scientists to gain impact.

Our staff scientists regularly win scientific prizes and are invited to give keynote lectures. They apply for grants as both collaborators and independent investigators, and publish regularly. Since 2011, staff scientists have led 36% of all the federal grants awarded for research projects at the Broad Institute (see ‘Staff-led grants’). One of our staff scientists, genomicist Stacey Gabriel, topped Thomson Reuters' citation analysis of the World's Most Influential Scientific Minds in 2016. She co-authored 25 of the most highly cited papers in 2015 — a fact that illustrates both how collaborative the Broad is and how central genome-analysis technologies are to answering key biological questions.

Source: Broad Inst.

At the Broad Institute's Stanley Center for Psychiatric Research, which I direct, staff scientists built and operate HAIL, a powerful open-source tool for analysis of massive genetics data sets. By decreasing computational time, HAIL has made many tasks 10 times faster, and some 100 times faster. Staff scientist Joshua Levin has developed and perfected RNA-sequencing methods used by many colleagues to analyse models of autism spectrum disorders and much else. Nick Patterson, a mathematician and computational biologist at the Stanley Center, began his career by cracking codes for the British government during the cold war. Today, he uses DNA to trace past migrations of entire civilizations, helps to solve difficult computational problems and is a highly valued support for many biologists.

Irrational resistance

Why haven't more research institutions expanded the roles of staff scientists? One reason is that they can be hard to pay for, especially by conventional means. Some funding agencies look askance at supporting this class of professionals; after all, graduate students and postdocs are paid much less. In my years leading the US National Institute of Mental Health, I encountered people in funding bodies across the world who saw a rising ratio of staff to faculty members or of staff to students as evidence of fat in the system.

That said, there are signs of flexibility. In 2015, the US National Cancer Institute began awarding 'research specialist' grants — a limited, tentative effort designed in part to provide opportunities for staff scientists. Sceptical funders should remember that trainees often take years to become productive. More importantly, institutions' misuse of graduate students and postdocs as cheap labour is coming under increasing criticism (see, for example, B. Alberts et al. Proc. Natl Acad. Sci. USA 111, 5773–5777; 2014).

Faculty resistance is also a factor. I served as Harvard University's provost (or chief academic officer) for a decade. Several years in, I launched discussions aimed at expanding roles for staff scientists. Several faculty members worried openly about competition for space and other scarce resources, especially if staff scientists were awarded grants but had no teaching responsibilities. Many recoiled from any trappings of corporatism or from changes that felt like an encroachment on their decision-making. Some were explicitly concerned about a loss of access and control, and were not aware of the degree to which staff scientists' technological expertise and cross-disciplinary training could help to answer their research questions.

Institutional leaders can mitigate these concerns by ensuring that staff positions match the shared goals of the faculty — for scientific output, education and training. They must explain how staff-scientist positions create synergies rather than silos. Above all, hiring plans must be developed collaboratively with faculty members, not by administrators alone.

The Broad Institute attracts world-class scientists, as both faculty members and staff. Its appeal has much to do with how staff scientists enable access to advanced technology, and a collaborative culture that makes possible large-scale projects rarely found in academia. The Broad is unusual — all faculty members also have appointments at Harvard University, MIT or Harvard-affiliated hospitals. The institute has also benefited from generous philanthropy from individuals and foundations that share our values and believe in our scientific mission.

Although traditional academic labs have been and continue to be very productive, research institutions should look critically and creatively at their staffing. Creating a structure like that of the Broad Institute would be challenging in a conventional university. Still, I believe any institution that is near an academic health centre or that has significant needs for advanced technology could benefit from and sustain the careers of staff scientists. If adopted judiciously, these positions would enable institutions to take on projects of unprecedented scope and scale. It would also create a much-needed set of highly rewarding jobs for the rising crop of talented researchers, particularly people who love science and technology but who do not want to pursue increasingly scarce faculty positions.

A scientific organization should be moulded to the needs of science, rather than constrained by organizational traditions.

A web-based roguelike written in Dart

Splash screen

Hauberk is a roguelike, an ASCII-art based procedurally-generated dungeon crawl game. It's written in Dart and runs in your browser.

Behold it in all of its glory:

Dungeon

Running it

To get it up and running locally, you'll need to have the Dart SDK installed. I use the latest dev channel release of Dart, which you can get from here.

Once you have Dart installed and its bin/ directory on your PATH, then:

  1. Clone this repo.
  2. From the root directory of the repo, run: $ pub serve
  3. In your browser, open: http://localhost:8080

Pub will automatically compile the game to JavaScript if you hit that URL with a production browser. Leave pub serve running, and whenever you change the Dart code, it will notice that and recompile the JS on the fly.

You can iterate even faster and have a much better debugging experience if you browse to the server using Dartium, which comes with the Dart SDK. Just hit the same URL and it is smart enough to serve the raw Dart code instead of the compiled JS.

I usually run the game in Dartium, so if you see any bugs in the compiled-to-JS version please do file an issue.

Getting involved

I'd love to have more people involved. You're more than welcome to contribute to Hauberk itself. There's lots to be done, both code and game content (monsters, items, recipes, areas, etc.).

If you'd like to hack some code, search through the codebase for "TODO". I sprinkle those in liberally to mark things that need fixing or are open to extension. If you find one that catches your eye, let me know and I can fill you in on the details, or just send a pull request.

I also had in mind that this codebase could be used as a springboard for other games. Feel free to fork Hauberk and make it into your own thing in any way you choose. It uses a very permissive MIT license, so you can do pretty much whatever you want with it.

An Indian village addicted to chess

The green paint on the walls of Marottichal’s village teashop had started to flake, like coin scrapings on a scratch card, exposing a light blue tone of a bygone era. Perhaps this was once a rowdy bar or beer shop. But not anymore.

Mr Unnikrishnan, the teashop’s owner, sat opposite me at one of the wooden tables, his dark eyes fixated on the chequered board that lay between us with an intimidating intensity.

A calloused hand rose and elegantly gripped the white bishop, sliding it gently into the black knight and toppling it over.

“He’s got you now,” said the spectating Baby John, slurping his chai to suppress a grin.

I surveyed the bleak scene unfolding before me. My few remaining pieces were backed into a corner, eager to surrender.

Around the teashop’s four other tables similar intense battles of wits were being fought, while a dust-coated Videocon television set languished on a shelf at the back of the room, unplugged and ignored.

Resorting to distraction, I poked a petrified pawn one square forward and asked Unnikrishnan why this game resonates so much with the people of Marottichal, a remote forest village in northern Kerala.

“Chess helps us overcome difficulties and sufferings,” said Unnikrishnan, taking my queen. “On a chess board you are fighting, as we are also fighting the hardships in our daily life.”

With a feigned bravado I took one of Unnikrishnan’s isolated pawns.

“And is it really that popular?” I asked.

Unnikrishnan shot me a wry smile. “Come, you can see for yourself,” he said, rising from the table.

I looked down to find my king cowering, surrounded by a murderous mob of white plastic pieces.

I guessed that was checkmate.

It was mid-morning and Marottichal’s tree-lined main street was busy, yet oddly quiet. The forest breeze didn’t carry the vexatious shrill of traffic horns – the deafening symphony of most Indian towns – but instead silently stirred the strips of bright bunting zigzagging overhead.

The bus stop opposite Unnikrishnan’s teashop was full of people, but no-one seemed to be going anywhere. Instead, the gathered crowd were squatted on their haunches, watching an intense chess match play out between two greying gentlemen. The men sat cross-legged and barefoot, their lungis (sarongs) taut across their thighs.

I soon spotted the bus a short distance away, though it carried no passengers; the engine was off, and the driver had turned from the wheel to contest a quick chess match with the conductor before the start of their next shift.

Friends on pavements, spouses on benches, colleagues over shop countertops; the black-and-white board perforated every scene. Around the corner from the teashop on the veranda of Unnikrishnan’s own home, reportedly one of the village’s most popular gaming spots, no fewer than three matches were taking place.

“In other Indian villages perhaps the maximum number of people that know chess is less than 50,” said Baby John, president of the Chess Association of Marottichal. “Here 4,000 of the 6,000 population are playing chess, almost daily.”

“And it is all thanks to this wonderful man,” he added, gesturing to Unnikrishnan.

Fifty years ago, Marottichal was a very different place. Like many villages in northern Kerala, alcoholism and illicit gambling were rife among its small population. Having developed a zeal for chess while living in the nearby town of Kallur, Unnikrishnan moved back to his afflicted hometown and opened his teashop, where he began teaching customers to play chess as a healthier way to pass the time.

Miraculously, the game’s popularity flourished while drinking and gambling declined. The village’s enthusiasm for the ancient pastime, which is believed to have originated in India in the 6th Century, has now become so great that Unnikrishnan estimates one person in every Marottichal household knows how to play.

“Luckily for us chess is more addictive than alcohol,” Baby John said.

Not only did the archaic game scupper alcoholism and supersede clandestine card games, but it has engrained itself into Marottichal’s identity, and, according to Baby John, it continues to protect the town’s residents from modern pitfalls.

“Chess improves concentration, builds character and creates community,” he said. “We don’t watch television here; we play chess and talk to each other.”

“Even the kids?” I asked.

Unnikrishnan shot me another wry smile.

It was lunchtime when we arrived at Marottichal Primary School, a cluster of blue walls and orange-tiled roofs, to find the dusty courtyard awash with frenzied children, like a startled flock of pigeons in a public square.

But through the fray of bodies, I could see a row of children seated serenely at a line of tables.

We approached the nearest pair, who were perched at a discoloured bench with a chess board between them. Vithun and Eldho, both 12 years old, sported matching tufts of black hair and shared a tangible enthusiasm for chess – with a fervid admiration for one piece in particular.

“The knight is the best,” Vithun said.

“Definitely,” Eldho replied.

“It’s the most powerful.”

“You can move it in any direction!”

In a country undergoing rapid digitalisation, fanning widespread fears about Indian youth becoming disconnected from their country and culture, it was strange to hear two children talk so enthusiastically about a 1,000-year-old board game that’s interwoven into India’s history. Surely they would prefer to be watching television, I wondered out loud.

“Chess is best!” shouted Eldho as he sprang from his seat, almost toppling the board. Vithun scowled at him.

“Last year we came to the school with 15 chess boards and invited the children to learn chess,” Baby John explained as we fought our way back through the courtyard. “The following week we went back and all the children in the classroom had bought chess boards of their own.”

The positive response from the students, paired with their belief in the sanative qualities of the game, has led the Chess Association of Marottichal to request that the authorities include chess as part of the official school syllabus. This, they believe, will aid their vision of living in a village where everyone plays chess.

“Only then can we truly call ourselves a chess village,” Baby John concluded, explaining that he believes the title will cement Marottichal’s association with the much-loved sport and its edifying principles.

The wholesome lifestyle promoted by the village is seemingly attractive to Keralites, indicated by the remote area’s growing population despite relatively high land prices. The village has also lured visitors from as far away as Germany and the US keen to learn the game or hone their skills.

But despite this, as we trudged back to the teashop a lingering doubt gnawed at me: would a community centred on an ancient board game be able to withstand the rapid wave of modernisation sweeping across the Indian subcontinent?

My fears were heightened when we neared a group of teenagers tapping away on their smartphones, a sight that prompted me to voice these concerns to Unnikrishnan and Baby John.

But as we drew closer, the three of us could see what was commanding the group’s undivided attention: they were all playing chess online.

Unnikrishnan gave me one last smile.

I guessed that was checkmate.

Official starter template for TypeScript and React

This quick start guide will teach you how to wire up TypeScript with React. By the end, you'll have

  • a project with React and TypeScript
  • linting with TSLint
  • testing with Jest and Enzyme, and
  • state management with Redux

We'll use the create-react-app tool to quickly get set up.

We assume that you're already using Node.js with npm. You may also want to get a sense of the basics with React.

We're going to use create-react-app because it sets up some useful tools and canonical defaults for React projects. This is just a command-line utility to scaffold out new React projects.

npm install -g create-react-app

We'll create a new project called my-app:

create-react-app my-app --scripts-version=react-scripts-ts

react-scripts-ts is a set of adjustments to take the standard create-react-app project pipeline and bring TypeScript into the mix.

At this point, your project layout should look like the following:

my-app/
├─ .gitignore
├─ node_modules/
├─ public/
├─ src/
│  └─ ...
├─ package.json
├─ tsconfig.json
└─ tslint.json

Of note:

  • tsconfig.json contains TypeScript-specific options for our project.
  • tslint.json stores the settings that our linter, TSLint, will use.
  • package.json contains our dependencies, as well as some shortcuts for commands we'd like to run for testing, previewing, and deploying our app.
  • public contains static assets like the HTML page we're planning to deploy to, or images. You can delete any file in this folder apart from index.html.
  • src contains our TypeScript and CSS code. index.tsx is the entry point for our app, and is mandatory.

Running the project is as simple as running:

npm run start

This runs the start script specified in our package.json, and will spawn off a server which reloads the page as we save our files. Typically the server runs at http://localhost:3000, but should be automatically opened for you.

This tightens the iteration loop by allowing us to quickly preview changes.

Testing is also just a command away:

npm run test

This command runs Jest, an incredibly useful testing utility, against all files whose extensions end in .test.ts or .spec.ts. Like with the npm run start command, Jest will automatically run as soon as it detects changes. If you'd like, you can run npm run start and npm run test side by side so that you can preview changes and test them simultaneously.

When running the project with npm run start, we didn't end up with an optimized build. Typically, we want the code we ship to users to be as fast and small as possible. Certain optimizations like minification can accomplish this, but often take more time. We call builds like this "production" builds (as opposed to development builds).

To run a production build, just run:

npm run build

This will create an optimized JS and CSS build in ./build/static/js and ./build/static/css respectively.

You won't need to run a production build most of the time, but it is useful if you need to measure things like the final size of your app.

We're going to write a Hello component. The component will take the name of whatever we want to greet (which we'll call name), and optionally the number of exclamation marks to trail with (enthusiasmLevel).

When we write something like <Hello name="Daniel" enthusiasmLevel={3} />, the component should render to something like <div>Hello Daniel!!!</div>. If enthusiasmLevel isn't specified, the component should default to showing one exclamation mark. If enthusiasmLevel is 0 or negative, it should throw an error.

We'll write a Hello.tsx:

// src/components/Hello.tsx

import * as React from 'react';

export interface Props {
  name: string;
  enthusiasmLevel?: number;
}

function Hello({ name, enthusiasmLevel = 1 }: Props) {
  if (enthusiasmLevel <= 0) {
    throw new Error('You could be a little more enthusiastic. :D');
  }

  return (
    <div className="hello">
      <div className="greeting">
        Hello {name + getExclamationMarks(enthusiasmLevel)}
      </div>
    </div>
  );
}

export default Hello;

// helpers

function getExclamationMarks(numChars: number) {
  return Array(numChars + 1).join('!');
}

Notice that we defined a type named Props that specifies the properties our component will take. name is a required string, and enthusiasmLevel is an optional number (which you can tell from the ? that we wrote out after its name).

We also wrote Hello as a stateless function component (an SFC). To be specific, Hello is a function that takes a Props object, and destructures it. If enthusiasmLevel isn't given in our Props object, it will default to 1.

Writing functions is one of two primary ways React allows us to make components. If we wanted, we could have written it out as a class as follows:

class Hello extends React.Component<Props, object> {
  render() {
    const { name, enthusiasmLevel = 1 } = this.props;

    if (enthusiasmLevel <= 0) {
      throw new Error('You could be a little more enthusiastic. :D');
    }

    return (
      <div className="hello">
        <div className="greeting">
          Hello {name + getExclamationMarks(enthusiasmLevel)}
        </div>
      </div>
    );
  }
}

Classes are useful when our component instances have some state. But we don't really need to think about state in this example - in fact, we specified it as object in React.Component<Props, object>, so writing an SFC tends to be shorter. Local component state is more useful at the presentational level when creating generic UI elements that can be shared between libraries. For our application's lifecycle, we will revisit how applications manage general state with Redux in a bit.

Now that we've written our component, let's dive into index.tsx and replace our render of <App /> with a render of <Hello ... />.

First we'll import it at the top of the file:

import Hello from './components/Hello';

and then change up our render call:

ReactDOM.render(
  <Hello name="TypeScript" enthusiasmLevel={10} />,
  document.getElementById('root') as HTMLElement
);

Type assertions

One final thing we'll point out in this section is the line document.getElementById('root') as HTMLElement. This syntax is called a type assertion, sometimes also called a cast. This is a useful way of telling TypeScript what the real type of an expression is when you know better than the type checker.

The reason we need to do so in this case is that getElementById's return type is HTMLElement | null. Put simply, getElementById returns null when it can't find an element with a given id. We're assuming that getElementById will actually succeed, so we need to convince TypeScript of that using the as syntax.

TypeScript also has a trailing "bang" syntax (!), which removes null and undefined from the prior expression. So we could have written document.getElementById('root')!, but in this case we wanted to be a bit more explicit.
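
To see the two spellings side by side (the variable names here are mine):

const rootA = document.getElementById('root') as HTMLElement; // type assertion
const rootB = document.getElementById('root')!;               // trailing "bang"

Both expressions have type HTMLElement rather than HTMLElement | null.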

Styling a component with our setup is easy. To style our Hello component, we can create a CSS file at src/components/Hello.css.

.hello {
  text-align: center;
  margin: 20px;
  font-size: 48px;
  font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
}

.hello button {
  margin-left: 25px;
  margin-right: 25px;
  font-size: 40px;
  min-width: 50px;
}

The tools that create-react-app uses (namely, Webpack and various loaders) allow us to just import the stylesheets we're interested in. When our build runs, any imported .css files will be concatenated into an output file. So in src/components/Hello.tsx, we'll add the following import:

import './Hello.css';

We had a certain set of assumptions about our Hello component. Let's reiterate what they were:

  • When we write something like <Hello name="Daniel" enthusiasmLevel={3} />, the component should render to something like <div>Hello Daniel!!!</div>.
  • If enthusiasmLevel isn't specified, the component should default to showing one exclamation mark.
  • If enthusiasmLevel is 0 or negative, it should throw an error.

We can use these requirements to write a few tests for our components.

But first, let's install Enzyme. Enzyme is a common tool in the React ecosystem that makes it easier to write tests for how components will behave. By default, our application includes a library called jsdom to allow us to simulate the DOM and test its runtime behavior without a browser. Enzyme is similar, but builds on jsdom and makes it easier to make certain queries about our components.

Let's install it as a development-time dependency.

npm install -D enzyme @types/enzyme react-addons-test-utils

Notice we installed packages enzyme as well as @types/enzyme. The enzyme package refers to the package containing JavaScript code that actually gets run, while @types/enzyme is a package that contains declaration files (.d.ts files) so that TypeScript can understand how you can use Enzyme. You can learn more about @types packages here.

We also had to install react-addons-test-utils. This is something enzyme expects to be installed.

Now that we've got Enzyme set up, let's start writing our test! Let's create a file named src/components/Hello.test.tsx, adjacent to our Hello.tsx file from earlier.

// src/components/Hello.test.tsx

import * as React from 'react';
import * as enzyme from 'enzyme';
import Hello from './Hello';

it('renders the correct text when no enthusiasm level is given', () => {
  const hello = enzyme.shallow(<Hello name='Daniel' />);
  expect(hello.find(".greeting").text()).toEqual('Hello Daniel!')
});

it('renders the correct text with an explicit enthusiasm of 1', () => {
  const hello = enzyme.shallow(<Hello name='Daniel' enthusiasmLevel={1} />);
  expect(hello.find(".greeting").text()).toEqual('Hello Daniel!')
});

it('renders the correct text with an explicit enthusiasm level of 5', () => {
  const hello = enzyme.shallow(<Hello name='Daniel' enthusiasmLevel={5} />);
  expect(hello.find(".greeting").text()).toEqual('Hello Daniel!!!!!');
});

it('throws when the enthusiasm level is 0', () => {
  expect(() => {
    enzyme.shallow(<Hello name='Daniel' enthusiasmLevel={0} />);
  }).toThrow();
});

it('throws when the enthusiasm level is negative', () => {
  expect(() => {
    enzyme.shallow(<Hello name='Daniel' enthusiasmLevel={-1} />);
  }).toThrow();
});

These tests are extremely basic, but you should be able to get the gist of things.

At this point, if all you're using React for is fetching data once and displaying it, you can consider yourself done. But if you're developing an app that's more interactive, then you may need to add state management.

State management in general

On its own, React is a useful library for creating composable views. However, React doesn't come with any facility for synchronizing data across your application. As far as a React component is concerned, data flows down to its children through the props you specify on each element.

Because React on its own does not provide built-in support for state management, the React community uses libraries like Redux and MobX.

Redux relies on synchronizing data through a centralized and immutable store of data, and updates to that data will trigger a re-render of our application. State is updated in an immutable fashion by sending explicit action messages which must be handled by functions called reducers. Because of the explicit nature, it is often easier to reason about how an action will affect the state of your program.

MobX relies on functional reactive patterns where state is wrapped through observables and passed through as props. Keeping state fully synchronized for any observers is done by simply marking state as observable. As a nice bonus, the library is already written in TypeScript.

There are various merits and tradeoffs to both. Generally Redux tends to see more widespread usage, so for the purposes of this tutorial, we'll focus on adding Redux; however, you should feel encouraged to explore both.

The following section may have a steep learning curve. We strongly suggest you familiarize yourself with Redux through its documentation.

Setting the stage for actions

It doesn't make sense to add Redux unless the state of our application changes. We need a source of actions that will trigger changes to take place. This can be a timer, or something in the UI like a button.

For our purposes, we're going to add two buttons to control the enthusiasm level for our Hello component.

Installing Redux

To add Redux, we'll first install redux and react-redux, as well as their types, as dependencies.

npm install -S redux react-redux @types/react-redux

In this case we didn't need to install @types/redux because Redux already comes with its own definition files (.d.ts files).

Defining our app's state

We need to define the shape of the state which Redux will store. For this, we can create a file called src/types/index.tsx which will contain definitions for types that we might use throughout the program.

// src/types/index.tsx

export interface StoreState {
    languageName: string;
    enthusiasmLevel: number;
}

Our intention is that languageName will be the programming language this app was written in (i.e. TypeScript or JavaScript) and enthusiasmLevel will vary. When we write our first container, we'll understand why we intentionally made our state slightly different from our props.

Adding actions

Let's start off by creating a set of message types that our app can respond to in src/constants/index.tsx.

// src/constants/index.tsx

export const INCREMENT_ENTHUSIASM = 'INCREMENT_ENTHUSIASM';
export type INCREMENT_ENTHUSIASM = typeof INCREMENT_ENTHUSIASM;

export const DECREMENT_ENTHUSIASM = 'DECREMENT_ENTHUSIASM';
export type DECREMENT_ENTHUSIASM = typeof DECREMENT_ENTHUSIASM;

This const/type pattern allows us to use TypeScript's string literal types in an easily accessible and refactorable way.

Next, we'll create a set of actions and functions that can create these actions in src/actions/index.tsx.

import * as constants from '../constants'

export interface IncrementEnthusiasm {
    type: constants.INCREMENT_ENTHUSIASM;
}

export interface DecrementEnthusiasm {
    type: constants.DECREMENT_ENTHUSIASM;
}

export type EnthusiasmAction = IncrementEnthusiasm | DecrementEnthusiasm;

export function incrementEnthusiasm(): IncrementEnthusiasm {
    return {
        type: constants.INCREMENT_ENTHUSIASM
    }
}

export function decrementEnthusiasm(): DecrementEnthusiasm {
    return {
        type: constants.DECREMENT_ENTHUSIASM
    }
}

We've created two types that describe what increment actions and decrement actions should look like. We also created a type (EnthusiasmAction) to describe cases where an action could be an increment or a decrement. Finally, we made two functions that actually manufacture the actions which we can use instead of writing out bulky object literals.

There's clearly boilerplate here, so you should feel free to look into libraries like redux-actions once you've got the hang of things.

Adding a reducer

We're ready to write our first reducer! Reducers are just functions that generate changes by creating modified copies of our application's state, but that have no side effects. In other words, they're what we call pure functions.

Our reducer will go under src/reducers/index.tsx. Its function will be to ensure that increments raise the enthusiasm level by 1, and that decrements reduce the enthusiasm level by 1, but that the level never falls below 1.

// src/reducers/index.tsx

import { EnthusiasmAction } from '../actions';
import { StoreState } from '../types/index';
import { INCREMENT_ENTHUSIASM, DECREMENT_ENTHUSIASM } from '../constants/index';

export function enthusiasm(state: StoreState, action: EnthusiasmAction): StoreState {
  switch (action.type) {
    case INCREMENT_ENTHUSIASM:
      return { ...state, enthusiasmLevel: state.enthusiasmLevel + 1 };
    case DECREMENT_ENTHUSIASM:
      return { ...state, enthusiasmLevel: Math.max(1, state.enthusiasmLevel - 1) };
  }
  return state;
}

Notice that we're using the object spread (...state) which allows us to create a shallow copy of our state, while replacing the enthusiasmLevel. It's important that the enthusiasmLevel property come last, since otherwise it would be overridden by the property in our old state.
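
A quick sketch of why the order matters (the values are made up for illustration):

const state = { languageName: 'TypeScript', enthusiasmLevel: 1 };

// Correct: the explicit property comes last, so it wins.
const next = { ...state, enthusiasmLevel: 2 }; // enthusiasmLevel === 2

// Wrong: the spread comes last and overwrites our update.
const oops = { enthusiasmLevel: 2, ...state }; // enthusiasmLevel === 1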

You may want to write a few tests for your reducer. Since reducers are pure functions, they can be passed arbitrary data. For every input, reducers can be tested by checking their newly produced state. Consider looking into Jest's toEqual method to accomplish this.
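
As a rough sketch of what such tests might look like (the file name here is my choice), Jest can feed the reducer a state and an action and compare the result with toEqual:

// src/reducers/index.test.tsx

import { enthusiasm } from './index';
import { incrementEnthusiasm, decrementEnthusiasm } from '../actions';

it('increments the enthusiasm level', () => {
  const state = { languageName: 'TypeScript', enthusiasmLevel: 1 };
  expect(enthusiasm(state, incrementEnthusiasm()))
    .toEqual({ languageName: 'TypeScript', enthusiasmLevel: 2 });
});

it('never lets the enthusiasm level fall below 1', () => {
  const state = { languageName: 'TypeScript', enthusiasmLevel: 1 };
  expect(enthusiasm(state, decrementEnthusiasm()))
    .toEqual({ languageName: 'TypeScript', enthusiasmLevel: 1 });
});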

Making a container

When writing with Redux, we will often write components as well as containers. Components are often data-agnostic, and work mostly at a presentational level. Containers typically wrap components and feed them any data that is necessary to display and modify state. You can read more about this concept on Dan Abramov's article Presentational and Container Components.

First let's update src/components/Hello.tsx so that it can modify state. We'll add two optional callback properties to Props named onIncrement and onDecrement:

export interface Props {
  name: string;
  enthusiasmLevel?: number;
  onIncrement?: () => void;
  onDecrement?: () => void;
}

Then we'll bind those callbacks to two new buttons that we'll add into our component.

function Hello({ name, enthusiasmLevel = 1, onIncrement, onDecrement }: Props) {
  if (enthusiasmLevel <= 0) {
    throw new Error('You could be a little more enthusiastic. :D');
  }

  return (
    <div className="hello">
      <div className="greeting">
        Hello {name + getExclamationMarks(enthusiasmLevel)}
      </div>
      <div>
        <button onClick={onDecrement}>-</button>
        <button onClick={onIncrement}>+</button>
      </div>
    </div>
  );
}

In general, it'd be a good idea to write a few tests for onIncrement and onDecrement being triggered when their respective buttons are clicked. Give it a shot to get the hang of writing tests for your components.
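
As one hedged sketch of such a test, written to live alongside the earlier cases in src/components/Hello.test.tsx (it reuses the React, enzyme, and Hello imports from that file):

it('calls onIncrement when the + button is clicked', () => {
  const onIncrement = jest.fn();
  const hello = enzyme.shallow(
    <Hello name='Daniel' enthusiasmLevel={1} onIncrement={onIncrement} />
  );
  // The second button rendered by Hello is the "+" button.
  hello.find('button').at(1).simulate('click');
  expect(onIncrement).toHaveBeenCalled();
});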

Now that our component is updated, we're ready to wrap it into a container. Let's create a file named src/containers/Hello.tsx and start off with the following imports.

import Hello from '../components/Hello';
import * as actions from '../actions/';
import { StoreState } from '../types/index';
import { connect, Dispatch } from 'react-redux';

The two key pieces here are the original Hello component and the connect function from react-redux. connect will be able to take our original Hello component and turn it into a container using two functions:

  • mapStateToProps which massages the data from the current store to part of the shape that our component needs.
  • mapDispatchToProps which creates callback props to pump actions to our store using a given dispatch function.

If we recall, our application state consists of two properties: languageName and enthusiasmLevel. Our Hello component, on the other hand, expected a name and an enthusiasmLevel. mapStateToProps will get the relevant data from the store, and adjust it if necessary, for our component's props. Let's go ahead and write that.

export function mapStateToProps({ enthusiasmLevel, languageName }: StoreState) {
  return {
    enthusiasmLevel,
    name: languageName,
  }
}

Note that mapStateToProps only creates 2 out of 4 of the properties a Hello component expects. Namely, we still want to pass in the onIncrement and onDecrement callbacks. mapDispatchToProps is a function that takes a dispatcher function. This dispatcher function can pass actions into our store to make updates, so we can create a pair of callbacks that will call the dispatcher as necessary.

export function mapDispatchToProps(dispatch: Dispatch<actions.EnthusiasmAction>) {
  return {
    onIncrement: () => dispatch(actions.incrementEnthusiasm()),
    onDecrement: () => dispatch(actions.decrementEnthusiasm()),
  }
}

Finally, we're ready to call connect. connect will first take mapStateToProps and mapDispatchToProps, and then return another function that we can use to wrap our component. Our resulting container is defined with the following line of code:

export default connect(mapStateToProps, mapDispatchToProps)(Hello);

When we're finished, our file should look like this:

// src/containers/Hello.tsx

import Hello from '../components/Hello';
import * as actions from '../actions/';
import { StoreState } from '../types/index';
import { connect, Dispatch } from 'react-redux';

export function mapStateToProps({ enthusiasmLevel, languageName }: StoreState) {
  return {
    enthusiasmLevel,
    name: languageName,
  }
}

export function mapDispatchToProps(dispatch: Dispatch<actions.EnthusiasmAction>) {
  return {
    onIncrement: () => dispatch(actions.incrementEnthusiasm()),
    onDecrement: () => dispatch(actions.decrementEnthusiasm()),
  }
}

export default connect(mapStateToProps, mapDispatchToProps)(Hello);

Creating a store

Let's go back to src/index.tsx. To put this all together, we need to create a store with an initial state, and set it up with all of our reducers.

import { createStore } from 'redux';
import { enthusiasm } from './reducers/index';
import { StoreState } from './types/index';

const store = createStore<StoreState>(enthusiasm, {
  enthusiasmLevel: 1,
  languageName: 'TypeScript',
});

store is, as you might've guessed, our central store for our application's global state.

Next, we're going to swap our use of ./src/components/Hello with ./src/containers/Hello and use react-redux's Provider to wire up our props with our container. We'll import each:

import Hello from './containers/Hello';
import { Provider } from 'react-redux';

and pass our store through to the Provider's attributes:

ReactDOM.render(
  <Provider store={store}>
    <Hello />
  </Provider>,
  document.getElementById('root') as HTMLElement
);

Notice that Hello no longer needs props, since we used our connect function to adapt our application's state for our wrapped Hello component's props.

If at any point, you feel like there are certain customizations that the create-react-app setup has made difficult, you can always opt-out and get the various configuration options you need. For example, if you'd like to add a Webpack plugin, it might be necessary to take advantage of the "eject" functionality that create-react-app provides.

Simply run:

npm run eject

and you should be good to go!

As a heads up, you may want to commit all your work before running an eject. You cannot undo an eject command, so opting out is permanent unless you can recover from a commit prior to running an eject.

create-react-app comes with a lot of great stuff. Much of it is documented in the default README.md that was generated for our project, so give that a quick read.

If you still want to learn more about Redux, you can check out the official website for documentation. The same goes for MobX.

If you want to eject at some point, you may need to know a little bit more about Webpack. You can check out our React & Webpack walkthrough here.

At some point you might need routing. There are several solutions, but react-router is probably the most popular for Redux projects, and is often used in conjunction with react-router-redux.

America’s Cars Are Getting Faster and More Efficient

Sometime in the next couple of months, the Dodge Challenger SRT Demon and its 808 horsepower will show up in dealership windows like some kind of tiny, red, tire-melting factory. Yes, 808 horsepower. That's not a typo.

Teenage boys will lose their minds. Some older ones, too. But beyond the Vin Diesel fan club, it’s actually not such a big deal anymore. Last year, U.S. drivers on the hunt for more than 600 horsepower had 18 models to choose from, including a Cadillac sedan that looks more swanky than angry. Meanwhile, even boring commuter sedans are posting power specifications that would have been unheard of during the Ford Administration.

The horses in the auto industry are running free.

We crunched four decades of data from the Environmental Protection Agency’s emission tests and arrived at a simple conclusion: All of the cars these days are fast and furious—even the trucks.

If a 1976 driver were to somehow get his hands on a car from 2017, he’d be at grave risk of whiplash. Since those days, horsepower in the U.S. has almost doubled, with the median model climbing from 145 to 283 stallions. Not surprisingly, the entire U.S. fleet grew more game for a drag-race: The median time it took for a vehicle to go from 0 to 60 miles per hour was halved, from almost 14 seconds to seven.

Four decades ago, there was one production car in America that made 285 horsepower–the Aston Martin DBS. It had a gaping maw of a hood vent and 75 more ponies than a Chevrolet Corvette. Today, more than half of the cars and trucks for sale boast as much power or more, including the milquetoast Kia Sorento. An Aston Martin Vanquish, meanwhile, makes 568 horsepower, almost double the grunt of its ancestor.

Sure, one would expect automobile engineering to advance over the decades just like any technology, but its acceleration of late would impress Don Garlits.

“It’s been wildly exciting,” said Bob Fascetti, head of powertrain engineering at Ford Motor Co. “If you go back and look at the degree of change in the last five or six years compared to the five or six before that or the five or six before that, it’s dramatic.”

Speed, of course, is a human condition, hard-wired into our DNA. The same atavistic spark that kept our ancestors safe among woolly mammoths also cooked up Dodge’s “Demon.” What’s even more remarkable, however, is that this combustion arms race has occurred under ever-increasing efficiency standards.

While vehicles have been getting more powerful, their engines have been shrinking. Moreover, the entire fleet is stretching a gallon of gas farther, thanks in part to electric engines.

Combustion engines on America’s roads are about 42 percent smaller than they were 40 years ago. At the same time, the EPA’s median measurement of miles-per-gallon has doubled, from 15 to 30. Most of those gains were made under pressure from federal efficiency mandates. The great power push began in 1985 just after the industry had hit a threshold of 27.5 miles-per-gallon.

Vehicles made another efficiency leap starting in 2007, when a new energy bill set a 35 miles-per-gallon threshold. This time, however, carmakers kept adding power.

How did engineers manage this sorcery? It wasn’t a single manufacturing breakthrough; it was about six of them. Consider the contemporary Chevrolet Camaro, which can be had with one of three different engines, each highlighting a major advancement in the race for efficient power.

The top-of-the-line V8, which makes 455 horsepower, is programmed to shut down four of its cylinders when they aren’t needed. Cylinder deactivation debuted about 10 years ago and is now standard in every eight-cylinder engine General Motors Co. makes.

It’s also put to use in the Camaro V6, the middle-of-the-pack, Goldilocks choice that makes 335 horsepower. This machine highlights one of the most critical things in engine evolution: direct fuel injection. Carburetors that mixed fuel with air disappeared from assembly lines long ago. But it was only in the 21st century that engineers perfected the practice of shooting a mist of gasoline directly into the cylinder. Less fuel is wasted and the engine is more powerful because it stays cooler. (The gas actually evaporates before it explodes, cooling the cylinder in the same way that sweat cools the skin of an athlete.)

“Today, we can model it, we can visualize it, and we can make sure the fuel ends up in the air, not on the cylinder wall,” said Prabjot Nanua, director of Detroit-based GM’s advanced engine and racing engineering. 

Meanwhile, engineers figured out how to slightly speed up or delay when an engine’s valves open, alternatively offering more power or lower emissions, depending on how much the driver is stomping the pedal.

“Even if you have that old, muscle-car philosophy, your fuel efficiency is pretty much a given these days,” Nanua said.

The real miracles come in the smallest Camaro engine, a little four-cylinder package that makes 275 horsepower. Most small engines today have a turbo unit, one that bears little resemblance to the versions in 1980s Saabs. The Camaro turbo draws air via two channels, eliminating much of the notorious lag between when a driver requests a boost and when one arrives.

Finally, cars have become far lighter on a relative basis. Even as it was stuffed with computers, airbags, sensors, and a bulky infotainment unit, the baseline Camaro went from roughly 4,000 pounds in 1976 to about 3,400 pounds in 2017. Almost every part of the car had a heavy material swapped for something lighter. Engine blocks and body panels evolved from iron and steel to lightweight aluminum alloys, while intake manifolds and oil pans were poured out of advanced plastics. Meanwhile, more expensive cars, like the Camaro’s sibling, the Corvette, are veined with carbon fiber.

And make no mistake, the same swapping has gone on across the industry’s socioeconomic spectrum, from Rolls Royce to the Kia Rio.

In short: We are all faster now than we were 40 years ago, but only because some of us got a lot more clever.


What's New in Create React App

Less than a year ago, we introduced Create React App as an officially supported way to create apps with zero configuration. The project has since enjoyed tremendous growth, with over 950 commits by more than 250 contributors.

Today, we are excited to announce that many features that have been in the pipeline for the last few months are finally released.

As usual with Create React App, you can enjoy these improvements in your existing non-ejected apps by updating a single dependency and following our migration instructions.

Newly created apps will get these improvements automatically.

webpack 2

This change was contributed by @Timer in #1291.

We have upgraded to webpack 2, which was officially released a few months ago. It is a big upgrade with many bugfixes and general improvements. We have been testing it for a while, and we now consider it stable enough to recommend to everyone.

While the Webpack configuration format has changed, Create React App users who didn't eject don't need to worry about it as we have updated the configuration on our side.

If you had to eject your app for one reason or another, Webpack provides a configuration migration guide that you can follow to update your apps. Note that with each release of Create React App, we are working to support more use cases out of the box so that you don't have to eject in the future.

The biggest notable webpack 2 feature is the ability to write and import ES6 modules directly without compiling them to CommonJS. This shouldn’t affect how you write code since you likely already use import and export statements, but it will help catch more mistakes like missing named exports at compile time:

Export validation

In the future, as the ecosystem around ES6 modules matures, you can expect more improvements to your app's bundle size thanks to tree shaking.

 Runtime Error Overlay

This change was contributed by @Timer and @nicinabox in #1101, @bvaughn in #2201.

Have you ever made a mistake in code and only realized it after the console is flooded with cryptic errors? Or worse, have you ever shipped an app with crashes in production because you accidentally missed an error in development?

To address these issues, we are introducing an overlay that pops up whenever there is an uncaught error in your application. It only appears in development, and you can dismiss it by pressing Escape.

A GIF is worth a thousand words:

Runtime error overlay

(Yes, it integrates with your editor!)

In the future, we plan to teach the runtime error overlay to understand more about your React app. For example, after React 16 we plan to show React component stacks in addition to the JavaScript stacks when an error is thrown.

Progressive Web Apps by Default

This change was contributed by @jeffposnick in #1728.

Newly created projects are built as Progressive Web Apps by default. This means that they employ service workers with an offline-first caching strategy to minimize the time it takes to serve the app to the users who visit it again. You can opt out of this behavior, but we recommend it both for new and existing apps, especially if you target mobile devices.

Loading assets from service worker

New apps automatically have these features, but you can easily convert an existing project to a Progressive Web App by following our migration guide.

We will be adding more documentation on this topic in the coming weeks. Please feel free to ask any questions on the issue tracker!

Jest 20

This change was contributed by @rogeliog in #1614 and @gaearon in #2171.

We are now using the latest version of Jest that includes numerous bugfixes and improvements. You can read more about the changes in Jest 19 and Jest 20 blog posts.

Highlights include a new immersive watch mode, a better snapshot format, improvements to printing skipped tests, and new testing APIs.

Immersive test watcher

Additionally, Create React App now supports configuring a few Jest options related to coverage reporting.
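
As a sketch of what that looks like (check the release notes for the exact set of supported keys), the jest field in package.json can carry coverage options such as:

{
  "jest": {
    "collectCoverageFrom": [
      "src/**/*.{js,jsx}"
    ],
    "coverageThreshold": {
      "global": {
        "branches": 80,
        "lines": 80
      }
    }
  }
}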

Code Splitting with Dynamic import()

This change was contributed by @Timer in #1538 and @tharakawj in #1801.

It is important to keep the initial JavaScript payload of web apps down to the minimum, and load the rest of the code on demand. Although Create React App has supported code splitting using require.ensure() since the first release, it used a webpack-specific syntax that did not work in Jest or other environments.

In this release, we are adding support for the dynamic import() proposal which aligns with the future web standards. Unlike require.ensure(), it doesn't break Jest tests, and should eventually become a part of JavaScript. We encourage you to use import() to delay loading the code for non-critical component subtrees until you need to render them.

Creating chunks with dynamic import
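
To give a flavor of the pattern (the module path and function here are hypothetical):

// Load the code for the reports view only when it's first needed.
function showReports() {
  return import('./reports').then(reports => {
    reports.renderInto(document.getElementById('main'));
  });
}

Because import() is a standard-track expression rather than webpack-specific syntax, the same code runs unchanged under Jest.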

Better Console Output

This change was contributed by @gaearon in #2120, #2125, and #2161.

We have improved the console output across the board.

For example, when you start the development server, we now display the LAN address in addition to the localhost address so that you can quickly access the app from a mobile device on the same network:

Better console output

When lint errors are reported, we no longer show the warnings so that you can concentrate on more critical issues. Errors and warnings in the production build output are better formatted, and the build error overlay font size now matches the browser font size more closely.

But Wait... There's More!

You can only fit so much in a blog post, but there are other long-requested features in this release, such as environment-specific and local .env files, a lint rule against confusingly named globals, support for multiple proxies in development, a customizable browser launch script, and many bugfixes.

You can read the full changelog and the migration guide in the v1.0.0 release notes.

Acknowledgements

This release is a result of months of work from many people in the React community. It is focused on improving both developer and end user experience, as we believe they are complementary and go hand in hand.

We are grateful to everyone who has offered their contributions, whether in code, documentation, or by helping other people. We would like to specifically thank Joe Haddad for his invaluable help maintaining the project.

We are excited to bring these improvements to everybody using Create React App, and we are looking forward to more of your feedback and contributions.

New paint colors invented by neural network

So if you’ve ever picked out paint, you know that every infinitesimally different shade of blue, beige, and gray has its own descriptive, attractive name. Tuscan sunrise, blushing pear, Tradewind, etc… There are in fact people who invent these names for a living. But given that the human eye can see millions of distinct colors, sooner or later we’re going to run out of good names. Can AI help?

For this experiment, I gave the neural network a list of about 7,700 Sherwin-Williams paint colors along with their RGB values. (RGB = red, green, and blue color values) Could the neural network learn to invent new paint colors and give them attractive names?

One way I have of checking on the neural network’s progress during training is to ask it to produce some output using the lowest-creativity setting. Then the neural network plays it safe, and we can get an idea of what it has learned for sure.
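The "creativity" setting on a character-level network like this is typically the softmax sampling temperature. Here is a minimal sketch of that idea (in Rust, with illustrative names; the post doesn't specify the exact sampler it uses):

// Hypothetical sketch: pick a character index from the network's logits.
// `r` is a uniform random draw in [0, 1).
fn sample(logits: &[f64], temperature: f64, r: f64) -> usize {
    // Low temperature sharpens the distribution toward the likeliest
    // character ("plays it safe"); high temperature flattens it.
    let weights: Vec<f64> = logits.iter().map(|l| (l / temperature).exp()).collect();
    let total: f64 = weights.iter().sum();
    let mut acc = 0.0;
    for (i, w) in weights.iter().enumerate() {
        acc += w / total;
        if r < acc {
            return i;
        }
    }
    weights.len() - 1
}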

By the first checkpoint, the neural network has learned to produce valid RGB values - these are colors, all right, and you could technically paint your walls with them. It’s a little farther behind the curve on the names, although it does seem to be attempting a combination of the colors brown, blue, and gray.

By the second checkpoint, the neural network can properly spell green and gray. It doesn’t seem to actually know what color they are, however.

Let’s check in with what the more-creative setting is producing.

…oh, okay.

Later in the training process, the neural network is about as well-trained as it’s going to be (perhaps with different parameters, it could have done a bit better - a lot of neural network training involves choosing the right training parameters). By this point, it’s able to figure out some of the basic colors, like white, red, and grey:

Although not reliably.

In fact, looking at the neural network’s output as a whole, it is evident that:

  1. The neural network really likes brown, beige, and grey.
  2. The neural network has really really bad ideas for paint names.

Google Bug Bounty – The $5k Error Page

Well, this is going to be quite a short post…
In January I was looking at some Google services hoping to find something worth a bounty. I came across https://login.corp.google.com, which is nothing more than a simple login page (it seems to be for Google employees themselves…).

login.corp.google.com Login Page


Every time the page is accessed, a new image from https://static.corp.google.com is loaded directly into the page. Nothing too fancy happening here, hm?!
An example of such an image URL is https://static.corp.google.com/corpsso/images/PICT0004.jpg

Well, after trying some other things, I thought that provoking an error was the best thing I could do here: I accessed https://static.corp.google.com/corpsso/asd/ and the default Google 404 page appeared, with one difference:

Special Google 404 Page


I found a feature!

Let's check out what this is about. The “Re-run query with SFFE debug trace” link pointed to https://static.corp.google.com/corpsso/asd/?deb=trace.

SFFE and XFE HTTP Request


Uff … That’s bad …

I was able to access internal debug information on static.corp.google.com just by adding “?deb=trace” to a 404 URL.
I saw the complete X-FrontEnd (XFE) debug trace and much more. I am still not sure what “SFFE” stands for, but it seems to be something like a request engine in Google's backend that handles, for example, Bigtable lookups. Bigtable is a “high performance NoSQL database service for large analytical and operational workloads”. For more information go here.

I was also able to view the SFFE response headers which indicate, that nothing was found …

SFFE Response Headers


In another section of the debug page I had access to the complete Bigtable lookup flow that was performed for my request (sorry for all the blacked-out parts):

Replicated Bigtable Lookup Flow


This flow contained table names and paths of different Bigtables which were queried because of my request. So basically I was able to access Google internal information like:

  • Internal IP of the server which was used for the query (I think ..) + its uptime
  • Name of the server (the name is actually a link which is not accessible from the Internet but seems to point to Google Borg clusters)
  • SFFE Request and Response Headers
  • XFE HTTP Request
  • Replicated Bigtable Lookup Flow
  • Service Policies

The page did not allow any user interaction and I haven't found anything to “go deeper” into the system, so I reported it right away.

It was the first bounty I ever got from Google!

Detailed Reporting Timeline

19/01/2017 – Initial report
20/01/2017 – Report triaged
20/01/2017 – Nice catch!
10/02/2017 – Google had already fixed the issue but forgot to tell me… I contacted them asking for an update
19/02/2017 – Got a response: they had implemented a short-term fix and forgot to send my report to the VRP panel…
10/03/2017 – Got $5,000 bounty
16/03/2017 – Google implemented permanent fix

tla+rust: writing correct lock-free and distributed stateful systems in Rust

Stable stateful systems through modeling, linear types and simulation.

I like to use things that wake me up at 4am as rarely as possible. Unfortunately, infrastructure vendors don't focus on reliability. Even if a company gives reliability lip service, it's unlikely that they use techniques like modeling or simulation to create a rock-solid core. Let's just build an open-source distributed store that takes correctness seriously at the local storage, sharding, and distributed transactional layers.

My goal: use TLA+ to verify the core lock-free and distributed algorithms used in rsdb and rasputin. Write an implementation in Rust. Use quickcheck and abstracted RPC/clocks to simulate partitions and test correctness under failure conditions.

table of contents
  1. motivations for doing this at all
  1. introductions to TLA+, PlusCal, quickcheck
  1. lock-free algorithms for efficient local storage
  1. consensus within a shard
  1. sharding operations
  1. distributed transactions

terminology

Simulation, in this context, refers to writing tests that exercise RPC-related code by simulating a buggy network over time, partitions and all. Many more failures may be tested per unit of compute time using simulation compared to black-box fault injection with something like Namazu, Jepsen, or Blockade.

Modeling, in this context, refers to the use of the TLA+ model checker to ensure the correctness of our lock-free and distributed algorithms.

why rust?

Rust is a new systems programming language that emphasizes memory safety. It is notable for its compiler, which is able to make several types of common memory corruption bugs (and attack vectors for exploits) impossible to create by default, without relying on GC. It is a Mozilla project, and as of this writing, it is starting to be included in their Firefox web browser.

It uses an "ownership" system that ensures an object's destructor will run exactly once, preventing double-frees, dangling pointers, various null pointer related bugs, etc... When an object is created inside a function's scope, it exists as the property of that scope. The object's lifetime is the same as the lifetime of the scope that created it.

When the lifetime of an object is over, the object's destructor is run. When you pass an object to a function as an argument, that object becomes the property of the called function, and when the called function returns, the objects in its possession will be destroyed unless the function is returning them. Objects returned from a function become the property of the calling scope.

In order to pass an object to several functions, you may instead pass a reference. By passing a reference, the object remains the property of the current scope. It is possible to create references that imply sole ownership, called mutable references, which may be used to, you guessed it, mutate the object being referred to. This is useful for using an object with a function that will mutate it, without the object becoming the property of that function, and allowing the object to outlive the mutating function. While only a single mutable reference may exist at a time, any number of immutable references may be created, so long as they do not outlive the object they point to.
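A minimal sketch of those rules in action (names are illustrative):

// Takes ownership: `v` is dropped when `consume` returns.
fn consume(v: Vec<u64>) -> usize {
    v.len()
}

// The sole mutable borrow may mutate; the caller keeps ownership.
fn grow(v: &mut Vec<u64>) {
    v.push(42);
}

// Shared borrows may only read; any number may exist at once.
fn peek(v: &[u64]) -> Option<&u64> {
    v.first()
}

fn main() {
    let mut v = vec![1, 2, 3];
    grow(&mut v); // borrowed mutably; this scope still owns `v`
    let (a, b) = (peek(&v), peek(&v)); // two shared borrows are fine
    println!("{:?} {:?}", a, b);
    let n = consume(v); // ownership moves into `consume`
    // println!("{:?}", v); // compile error: use of moved value `v`
    println!("{}", n);
}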

Rust does not use GC by default. However, it does have several container types that rely on reference counting for preventing an object's destructor from being called multiple times. These are useful for sharing things with multiple scopes and multiple threads. These objects are generally rare compared to the total number of objects created in a typical Rust program. The lack of GC for every object may be a compelling feature for those creating high-performance systems. Many such systems are currently written in C and C++, which have a long track record of buggy and insecure code, even when written by security-conscious life-long practitioners.

Rust has the potential to make high-performance, widely-deployed systems much more secure and crash less frequently. This means web browsers, SSL libraries, operating systems, networking stacks, toasters and many vital systems that are much harder to hack and more robust against common bugs.

For databases, the memory safety benefits are wonderful, and I'm betting on being able to achieve faster long-term iteration by not spending so much time chasing down memory-related bugs. However, it needs to be noted that when creating lock-free high-performance algorithms, we are going to need to sidestep the safety guarantees of the compiler. Our goal is to create data structures that are mutated using atomic compare-and-swap (CAS) operations by multiple threads simultaneously, while also supporting reads at the same time. We choose not to sacrifice performance by using Mutexes. This means using Rust's Box::into_raw/from_raw, AtomicPtr, unsafe pointers and mem::forget. We are giving up a significant benefit of Rust for certain very high-performance chunks of this system. In place of Rust's compiler, we use the TLA+ model checker to gain confidence in the correctness of our system!
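To make that concrete, here is a sketch (not actual rsdb code) of the kind of unsafe CAS replacement this implies, with the reclamation hazard called out:

use std::sync::atomic::{AtomicPtr, Ordering};

// Sketch: atomically swap a heap-allocated value into a shared slot.
// `Box::into_raw` hands us a raw pointer whose reclamation is now manual.
fn cas_replace<T>(slot: &AtomicPtr<T>, value: T) {
    let new = Box::into_raw(Box::new(value));
    loop {
        let old = slot.load(Ordering::SeqCst);
        if slot.compare_exchange(old, new, Ordering::SeqCst, Ordering::SeqCst).is_ok() {
            if !old.is_null() {
                // Rebuild the Box so the destructor runs exactly once. In the
                // real system this drop must be deferred (see epoch-based GC
                // below), because another thread may still be reading `old`.
                unsafe { drop(Box::from_raw(old)) };
            }
            return;
        }
    }
}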

why model?

TLA+ allows us to specify and verify algorithms in very few lines, compared to the programming language that we will use to implement and test it. It is a tool that is frequently mentioned by engineers of stateful distributed systems, but it has been used by relatively few, and has a reputation for being overkill. I believe that this reputation is unfounded for this type of work.

Many systems are not well understood by their creators at the start of the project, which leads to architectural strain as assumptions are invalidated and the project continues to grow over time. Small projects are often cheaper to complete using this approach, as an incorrect initial assumption may have a lower long-term impact. Stateful distributed systems tend to have significant costs associated with unanticipated changes in architecture: reliability, iteration time, and performance can be expected to take hits. For our system, we will specify the core algorithms before implementing them, which will allow us to catch mistakes before they result in bugs or outages.

why simulate?

We want to make sure that our implementation is robust against network partitions, disk failures, NTP issues, etc... So, why not run Namazu, Jepsen, or Blockade? They have great success with finding bugs in databases! However, it is far slower to perform black-box fault injection than simulation. A simulator can artificially advance the clocks of a system to induce a leader election, while a "real" cluster has to wait real time to trigger certain logic. It also takes a lot of time to deploy new code to a "real" cluster, and it is cumbersome to introspect.

Simulation is not a replacement for black-box testing. Simulation will be biased, and it's up to the implementor of the simulator to ensure that all sources of time, IPC, and other interaction are sufficiently encapsulated by the artificial time and interaction logic.

Simulation can allow a contributor working on a more resource-constrained system to test locally, running through thousands or millions of failure situations in the time that it takes to create the RPM/container that is then fed to a black-box fault injection system. A CI/CD pipeline can get far more test coverage per unit of compute time using simulation than with black-box fault injection.

Both simulation and black-box fault injection can be constrained to complete in a certain amount of time, but simulation will likely find a lot more bugs per unit of compute time. Simulation tests may be a reasonable thing to expect to pass for most pull requests, since they can achieve a high bug:compute time ratio. However, black box fault injection is still important, and will probably catch bugs arising from the bias of the simulation authors.

We will also use black-box testing, but we will spend less time talking about it due to its decent existing coverage.

We want to use TLA+ to model and find bugs in things like:

  • CAS operations on lists, ring buffers, and radix trees for lock-free local systems
  • paxos-like consensus for leadership, replication and shard management systems
  • lock-free distributed transactions

Distributed and concurrent algorithms have many similarities, but there are some key differences in the primitives that we build on in our work. Concurrent algorithms can rely on atomic CAS primitives, as achieving sequentially consistent access semantics is fairly well understood and implemented at this point. The distributed systems world has many databases that provide strong ordering semantics, but it doesn't have such a reliable, standard primitive as CAS that we can simply assume to be present. So we need to initially work in terms of the "asynchronous communication model" in which messages between any two processes can be reordered and arbitrarily delayed, or dropped altogether. After we have proved our own model for achieving consistency, we will build on it in later higher-level models that describe particularly interesting functionality such as lock-free distributed transactions.

In our TLA+ models, we can simply use a fairly short labeled block that performs the duties of compare and swap (or another atomic operation) on shared state when describing a concurrent algorithm, but we will need to build a complete replicated log primitive before we can work at a similar level of abstraction in our models of distributed algorithms.

So, let's learn how to describe some of our primitives and invariants!

here we go... jumping into pluscal

This is a summary of an example from a wonderful primer on TLA+…

The first thing to know is that there are two languages in play: pluscal and TLA. We test models using tlc, which understands most of TLA (not infinite sets, and maybe some other things). TLA started as a specification language; tlc came along later to actually test it, and pluscal is a simpler language that can be transpiled into TLA. Pluscal has two forms, c and p. They are functionally identical, but the c form uses braces and the p form uses prolog/ruby-esque begin and end statements that can be a little easier to spot errors with, in my opinion.

We're writing Pluscal in a TLA comment (block comments are written with (* <comment text> *)), and when we run a translator like pcal2tla it will insert TLA after the comment, in the same file.

------------------------------- MODULE pcal_intro -------------------------------
EXTENDS Naturals, TLC

(* --algorithm transfer
variables alice_account = 10, bob_account = 10,
          account_total = alice_account + bob_account

process TransProc \in 1..2
  variables money \in 1..20;
begin
  Transfer:
    if alice_account >= money then
      A: alice_account := alice_account - money;
      B: bob_account := bob_account + money;
    end if;
C: assert alice_account >= 0;
end process
end algorithm *)
\* this is a TLA comment. pcal2tla will insert the transpiled TLA here

MoneyInvariant == alice_account + bob_account = account_total

=============================================================================

This code specifies 3 global variables, alice_account, bob_account, and account_total. It specifies, using process <name> \in 1..2, that it will run in two concurrent processes. Each concurrent process has local state, money, which may take any initial value from 1 to 20, inclusive. It defines steps Transfer, A, B and C, which are evaluated as atomic units, although they will be tested against all possible interleavings of execution. All possible values will be tested.

Let's save the above example as pcal_intro.tla, transpile the pluscal comment to TLA, then run it with tlc! (if you want to name it something else, update the MODULE specification at the top)

pcal2tla pcal_intro.tla
tlc pcal_intro.tla

BOOM! This blows up because our transaction code sucks, big time:

The first argument of Assert evaluated to FALSE; the second argument was:
"Failure of assertion at line 16, column 4."
Error: The behavior up to this point is:
State 1: <Initial predicate>
/\ bob_account = 10
/\ money = <<1, 10>>
/\ alice_account = 10
/\ pc = <<"Transfer", "Transfer">>
/\ account_total = 20

State 2: <Action line 35, col 19 to line 40, col 42 of module pcal_intro>
/\ bob_account = 10
/\ money = <<1, 10>>
/\ alice_account = 10
/\ pc = <<"A", "Transfer">>
/\ account_total = 20

State 3: <Action line 35, col 19 to line 40, col 42 of module pcal_intro>
/\ bob_account = 10
/\ money = <<1, 10>>
/\ alice_account = 10
/\ pc = <<"A", "A">>
/\ account_total = 20

State 4: <Action line 42, col 12 to line 45, col 63 of module pcal_intro>
/\ bob_account = 10
/\ money = <<1, 10>>
/\ alice_account = 9
/\ pc = <<"B", "A">>
/\ account_total = 20

State 5: <Action line 47, col 12 to line 50, col 65 of module pcal_intro>
/\ bob_account = 11
/\ money = <<1, 10>>
/\ alice_account = 9
/\ pc = <<"C", "A">>
/\ account_total = 20

State 6: <Action line 42, col 12 to line 45, col 63 of module pcal_intro>
/\ bob_account = 11
/\ money = <<1, 10>>
/\ alice_account = -1
/\ pc = <<"C", "B">>
/\ account_total = 20

Error: The error occurred when TLC was evaluating the nested
expressions at the following positions:
0. Line 52, column 15 to line 52, column 28 in pcal_intro
1. Line 53, column 15 to line 54, column 66 in pcal_intro


9097 states generated, 6164 distinct states found, 999 states left on queue.
The depth of the complete state graph search is 7.

Looking at the trace that tlc outputs, it shows us how alice's account may become negative. Because processes 1 and 2 execute the steps sequentially but with different interleavings, the algorithm will check alice_account >= money before trying to transfer it to bob. By the time one process subtracts the money from alice, however, the other process may have already done so. We can specify that these steps and checks happen atomically by changing:

  Transfer:
    if alice_account >= money then
      A: alice_account := alice_account - money;
      B: bob_account := bob_account + money;
    end if;

to

  Transfer:
    if alice_account >= money then
      \* remove the labels A: and B:
      alice_account := alice_account - money;
      bob_account := bob_account + money;
    end if;

which means that the entire Transfer step is atomic. In reality, maybe this is done by punting this atomicity requirement to a database transaction. Re-running tlc should produce no errors now, because both processes atomically check + deduct + add balances to the bank accounts without violating the assertion.

The invariant, MoneyInvariant, at the bottom is not actually being checked yet. Invariants are specified in TLA, not in the pluscal comment. They can be checked by creating a pcal_intro.cfg file (or replace the one auto-generated by pcal2tla) with the following content:

SPECIFICATION Spec
INVARIANT MoneyInvariant

useful primitives

So, we've seen how to create labels, processes, and invariants. Here are some other useful primitives:

await, bags, and the standard modules pulled in with EXTENDS Naturals, FiniteSets, Sequences, Integers, TLC

For a more in-depth TLA+ introduction, refer to the tutorial that this was summarized from and the manual.

In the interest of achieving compelling price-performance, we need to make this thing sympathetic to modern hardware. Check out Dmitry's wonderful blog for a fast overview of the important ideas in writing scalable code.

lock-free ring buffer

The ring buffer is at the heart of several systems in our local storage system. It serves as the core of our concurrent persistent log IO buffer and the epoch-based garbage collector for our logical page ID allocator.

lock-free list

The list allows us to CAS a partial update to a page into a chain, avoiding the work of rewriting the entire page. To read a page, we traverse its list until we learn about what we sought. Eventually, we need to compact the list of partial updates to improve locality, probably once it reaches around 4-8 entries.

lock-free stack

The stack allows us to maintain a free list of page identifiers. Our radix tree needs to be very densely populated to achieve a favorable data to pointer ratio, and by reusing page identifiers after they are freed, we are able to keep it dense. Hence this stack. When we free a page, we push its identifier into this stack for reuse.
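A sketch of such a free-list stack (the textbook Treiber stack; the real one needs the epoch-based GC below to avoid ABA and use-after-free in pop):

use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

struct Node {
    id: u64,
    next: *mut Node,
}

pub struct IdStack {
    head: AtomicPtr<Node>,
}

impl IdStack {
    pub fn new() -> Self {
        IdStack { head: AtomicPtr::new(ptr::null_mut()) }
    }

    pub fn push(&self, id: u64) {
        let node = Box::into_raw(Box::new(Node { id, next: ptr::null_mut() }));
        loop {
            let head = self.head.load(Ordering::SeqCst);
            unsafe { (*node).next = head };
            if self.head.compare_exchange(head, node, Ordering::SeqCst, Ordering::SeqCst).is_ok() {
                return;
            }
        }
    }

    pub fn pop(&self) -> Option<u64> {
        loop {
            let head = self.head.load(Ordering::SeqCst);
            if head.is_null() {
                return None;
            }
            // Unsound without deferred reclamation: a concurrent pop may free
            // `head` between the load above and this dereference.
            let next = unsafe { (*head).next };
            if self.head.compare_exchange(head, next, Ordering::SeqCst, Ordering::SeqCst).is_ok() {
                let node = unsafe { Box::from_raw(head) };
                return Some(node.id);
            }
        }
    }
}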

lock-free radix tree

We use a radix tree for maintaining our in-memory mapping from logical page ID to its list of partial updates. A well-built radix tree can achieve a .92 total size:data ratio when densely populated and using a contiguous key range. This is way better than what we get with B+ trees, which max out between .5-.6. The downside is that with low-density we get extremely poor data:pointer ratios with a radix tree.

lock-free IO buffer

We use a ring buffer to hold buffers for writing data onto the disk, along with associated metadata about where on disk the buffer will end up. This is fraught with peril. We need to avoid ABA problems in the CAS that claims a particular buffer and later relies on a particular log offset. We also need to avoid creating a stall when all available buffers are claimed, and a write depends on flushing the end of the buffer before the beginning is free. Possible ways of avoiding this: fail reservation attempts when the buffer is full of claims, or support growing the buffer when necessary. Band-aid: don't seal the entire buffer during commit of a reservation.
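A sketch of the fail-instead-of-stall reservation policy (names and layout are illustrative, not rsdb's actual API; it assumes flushed never runs ahead of seq):

use std::sync::atomic::{AtomicUsize, Ordering};

const N: usize = 8; // number of IO buffers in the ring

struct Ring {
    seq: AtomicUsize,     // next slot to hand out to a writer
    flushed: AtomicUsize, // every slot below this has hit the disk
}

impl Ring {
    // Claim a slot via CAS, or fail fast when all N buffers are claimed,
    // so a writer never stalls waiting on its own unflushed tail.
    fn reserve(&self) -> Option<usize> {
        loop {
            let seq = self.seq.load(Ordering::SeqCst);
            if seq - self.flushed.load(Ordering::SeqCst) >= N {
                return None; // ring is full of claims; caller backs off and retries
            }
            if self.seq.compare_exchange(seq, seq + 1, Ordering::SeqCst, Ordering::SeqCst).is_ok() {
                return Some(seq % N);
            }
        }
    }
}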

lock-free epoch-based GC

The basic idea for epoch-based GC is that in our lock-free structures, we may end up making certain data inaccessible via a CAS on a node somewhere, but that doesn't mean that there isn't already some thread that is operating on it. We use epochs to track when a structure is marked inaccessible, as well as when threads begin and end operating on shared state. Before reading or mutating the shared state, a thread "enrolls" in an epoch. If the thread makes some state inaccessible, it adds it to the current epoch's free list. The current epoch may be later than the epoch that the thread initially enrolled in. The state is not dropped until there are no threads in epochs before or at the epoch where the state was marked free. When a thread stops reading or mutating the shared state, it leaves the epoch that it enrolled in.

lock-free pagecache

Maintains a radix tree mapping from logical page ID to a list of page updates, terminated by a base page. Uses the epoch-based GC for safely making logical ID's available in a stack. Facilitates atomic splits and merges of pages.

lock-free tree

Uses the pagecache to store B+ tree pages.

We use a consensus protocol as the basis of our replication across a shard. Consensus notes:

  1. support OLTP with small replication batch size
  2. support batch loading and analytical jobs with large replication batch size
  3. for max throughput with a single shard, send disparate 1/N of the batch to each other node, and then have them all forward their chunk to everybody else
  4. but this adds complexity, and if each node has several shards, we are already spreading the IO around, so we can just pick the latency-minimizing simple broadcast where the leader sends full batches to all followers.
  5. TCP is already a replicated log, hint hint
  6. UDP may be nice for receiving acks, but it doesn't work in a surprising number of DCs

harpoon consensus

Similar to Raft, but uses leader leases instead of a paxos register. Raft's preemptable paxos-register election is vulnerable to livelock in the not-unusual case of a network partition between a leader and another node, which triggers a dueling-candidates situation. Using leases allows us to make progress as long as a node has connectivity with a majority of its quorum, regardless of interfering nodes. In addition, a node that cannot reach a leader may subscribe to the replication log of any other node that has seen more successful log entries.
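A sketch of the lease check itself (illustrative; quorum grants, renewal, and clock-skew bounds are elided):

use std::time::{Duration, Instant};

struct Lease {
    granted_at: Instant, // when a quorum granted us leadership
    duration: Duration,  // must be shorter than follower timeouts, minus
                         // the maximum clock skew we are willing to assume
}

impl Lease {
    // A node acts as leader only while this holds; an interfering node
    // cannot preempt it mid-lease, which is what breaks the livelock.
    fn is_valid(&self) -> bool {
        self.granted_at.elapsed() < self.duration
    }
}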

Sharding has these ideals:

  1. avoid unnecessary data movement (wait some time before replacing a failed node)
  2. if multiple nodes fail simultaneously, minimize chances of dataloss (chainsets)
  3. minimize MTTR when a node fails (lots of shards per machine, reduce membership overlap)

Ideals 2 and 3 are somewhat at tension, but there is a goldilocks zone.

Sharding introduces the question of "who manages the mapping?" This is sometimes punted to an external consensus-backed system. We will initially create this by punting the metadata problem to such a system. Eventually, we will go single-binary with something like the following:

If we treat shard metadata as just another range, how do we prevent split brain?

General initialization and key metadata:

  1. nodes are configured with a set of "seed nodes" to initially connect to
  2. cluster is initialized when some node is explicitly given permission to do so, either via argv, env var, conf file or admin REST api request
  3. the designated node creates an initial range in an underreplicated state
  4. the metadata range contains a mapping from range to current assigned members
  5. as this node learns of others via the seeds, it assigns peers to the initial range
  6. if the metadata range (or any other range) loses quorum, a particular minority survivor can be manually chosen as a seed for fresh replication. the admin api can also trigger backup dumps for a range, and restoration of a range from a backup file.
  7. nodes each maintain their own monotonic counters, and publish a few basic stats about their ranges and utilization using a shared ORSWOT

shard splitting

Split algorithm:

  1. as operations happen in a range, we keep track of the max and min keys, and keep a running average of where inserts land between them. We then choose a split point around there (see the sketch after this list). If keys are always added to one end, the split should occur at that end.
  2. record split intent in watched meta range at the desired point
  3. record the split intent in the replicated log for the range
  4. all members of the replica set split their metadata when they see the split intent in their replicated log
  5. the half of the split that contains less density is migrated by changing consensus participants, one node at a time.
  6. once the two halves have a balanced placement, the split intent is removed
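A sketch of the split-point heuristic from step 1 (a hypothetical helper, with keys reduced to u64 for brevity):

// `avg_pos` is the running average, in [0.0, 1.0], of where inserts have
// landed between the observed min and max keys of the range.
fn split_point(min: u64, max: u64, avg_pos: f64) -> u64 {
    // Inserts clustered at one end drag the split point toward that end,
    // so an append-only workload splits near the tail instead of the middle.
    min + ((max - min) as f64 * avg_pos) as u64
}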

shard merging

Merge algorithm:

  1. merge intent written to metadata range
  2. the smaller half is to move to the larger's servers
  3. this direction is marked at the time of intent, to prevent flapping
  4. once the ranges are colocated, in the larger range's replicated log, write a merge intent, which causes it to accept keys in the new range
  5. write a merge intent into the less frequently accessed range's replicated log that causes it to redirect reads and writes to the larger range.
  6. update the metadata range to reflect the merge
  7. remove handlers and metadata for old range

cross-shard lock-free transactions

Relatively simple lock-free distributed transactions:

  1. read all involved data
  2. create txn object somewhere
  3. CAS all involved data to refer to the txn object and the conditionally mutated state
  4. CAS the txn object to successful
  5. (can crash here and the txn is still valid)
  6. CAS all affected data to replace the value, and remove the txn reference

readers at any point will CAS a txn object to aborted if they encounter an in-progress txn on something they are reading. If the txn object is successful, the reader needs to CAS the object's conditionally mutated state to be the present state, and nuke the txn reference, before continuing.

This can be relaxed to just intended writers, but then our isolation level goes from SSI to SI and we are vulnerable to write skew.
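A sketch of the txn object's state machine (illustrative; in the real system these CASes would be operations on replicated state, not a local atomic):

use std::sync::atomic::{AtomicU8, Ordering};

const PENDING: u8 = 0;
const COMMITTED: u8 = 1;
const ABORTED: u8 = 2;

struct TxnObject {
    state: AtomicU8,
}

impl TxnObject {
    // Step 4: the writer flips PENDING -> COMMITTED exactly once; after this
    // point a crash is harmless, because the txn's outcome is already decided.
    fn try_commit(&self) -> bool {
        self.state.compare_exchange(PENDING, COMMITTED, Ordering::SeqCst, Ordering::SeqCst).is_ok()
    }

    // A reader that finds an in-progress txn races it with an abort; whichever
    // CAS wins decides the outcome, which is what keeps the scheme lock-free.
    fn try_abort(&self) -> bool {
        self.state.compare_exchange(PENDING, ABORTED, Ordering::SeqCst, Ordering::SeqCst).is_ok()
    }
}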

Invariants:

  1. must never see any intermediate states: a transaction must be entirely committed or entirely invisible.

Show HN: Moon – a tiny 6kb Javascript library inspired by Vue.js

Lightweight

Weighing in at 5kb minified + gzipped, Moon is one of the lightest libraries out there.

Fast

Moon uses a fast Virtual DOM, and can rerender the DOM efficiently, only updating nodes where changes were made.

Intuitive

Moon's simple API makes it easy to learn; you can get going in no time. Just include the script!
