CockroachDB was inspired by frustration with the available open source databases and cloud DBaaS offerings. It was never conceived of as anything but open source software.
In late 2014, with encouraging interest from the GitHub community and concomitant inquiries from some forward-looking venture capitalists, it was decision time: should we start a company to accelerate CockroachDB development? On the one hand, hiring a team of exceptional people would lead more quickly to a viable product. On the other hand, our goal would no longer be solely about building the next great open source database. It would necessarily expand to include concern for our employees and investors.
We were faced with the difficult question of how to build a business around open source software.
Building a Business Around Open Source Software
There has been constant evolution of open source software business models since Red Hat blazed the first trail. Few have succeeded using Red Hat’s original model centered on support and services. In fact, most investors consider that early open source business model a losing proposition. The two common OSS business model alternatives are:
- Open Core. This typically involves a capable core product which is free and open source, often licensed under the Apache License (APL), MIT, or GPL. That’s the core. Around the core, a commercial entity provides a constellation of proprietary software, adding to or extending its capabilities. These proprietary add-ons are sold as commercial software, often bundled with support and services.
- Cloud-hosted services using open source software. Often, this also involves proprietary software (e.g. multi-tenancy, billing, service dashboards), but the end product is sold as a service instead of software.
These two models are being successfully pursued by many companies. Cloudera, Elastic, and Confluent are three examples I like, all with different models, and at different stages of turning open source products into successful businesses.
Cautionary Tales
The landscape also contains cautionary examples. Some OSS companies set the bar too low for paid features, making the core OSS product feel “hobbled”. In 2017, any product whose core capabilities cannot scale without requiring a commercial license is probably setting the bar too low. There are also examples of companies which failed to provide enough proprietary value early on, while the open core was fast becoming a standard piece of infrastructure. Large corporations which saw value in the core product, and which in other circumstances would have been happy to pay for improvements, had no choice but to build their own custom extensions.
There are some excellent open source software companies which have ceased to develop their products for lack of revenue, some quite recently, including RethinkDB. Companies which previously were more liberal with the capabilities that by default went into the open core have decided to be more discerning in the interests of viability (see Paul Dix’s InfluxDB post).
It’s a delicate balancing act. Building paid “enterprise” features for open source software can feel dirty. Paid features diminish the open source appeal and can lead to substantial community angst. On the other hand, it’s disheartening to see mammoth cloud service providers repackaging OSS for substantial gain without finding ways to foster the open source ecosystem, or hundred-billion-dollar multinationals forgoing support licenses from struggling OSS companies. If you’re serious about building a company around open source software, you must walk a narrow path: introduce paid features too soon, and risk curtailing adoption. Introduce paid features too late, and risk encouraging economic free riders. Stray too far in either direction, and your efforts will ultimately continue only as unpaid open source contributions.
So, how will CockroachDB make money?
I believe ultimately we’ll embrace both the cloud-hosted model and the open core model. Demand for DBaaS is evolving quickly, and only the first chapter has been written (spoiler alert: AWS is winning). But for the immediate future, our product is better aligned with companies which intend to run the database themselves, in either a public or a private cloud. In other words, we’re pursuing the open core model, though with some interesting Cockroach Labs peculiarities.
First, licensing. Many companies which have embraced the open core model implement their proprietary features as closed source extensions. Others ship two or more products, with enterprise versions containing closed source and distributed as compiled binaries. There are significant drawbacks to these models. They’re difficult to upgrade to, they often involve multiple development branches which are frustrating to manage, and they negate the benefits of open source where the new features are concerned: outside developers can’t debug or customize proprietary parts of the product.
The CockroachDB Community License (CCL)
We’re going to provide paid, enterprise features differently. Everything in our GitHub repository is currently licensed under the terms of the Apache License 2.0 (APL). Enterprise features we introduce will be contained in source files covered by a new license, called the CockroachDB Community License (CCL). The source code will still be available, but because the license does not include the free redistribution right, that code is not open source by definition. The license’s intent is to ensure that commercial usage of enterprise features, beyond an evaluation period, is paid. These features will not be on by default, will be clearly marked in documentation, code, and help messages, and will be enabled only by operator or developer choice. The binary which we distribute will contain these features and, as a result, cannot be distributed under a FLOSS license. However, a “pure” FLOSS distribution will also be available, with enterprise features absent, for those who require it.
Because the source code is available for all features covered by the CCL, we expect others to learn from what we’re building, and one day to build better products. We expect our customers to customize the software to accommodate their own ambitions.
How will we decide which features are covered by the CCL?
This is a difficult question, and ultimately the crux of the balancing act. We have distilled the choice to a litmus test: features necessary for a startup to succeed will be APL, and part of the open core; a feature which is primarily useful only to an already successful company will be CCL, and part of the enterprise product. Which license is chosen for a new feature will be determined by our intuition and community feedback.
However, because such decisions are subjective, they will evolve over time. It is simple to move an enterprise feature from the CCL to the APL, and we expect that to happen as a matter of course for any feature which turns out to be in high demand from startups.
What does a startup need from a database in order to succeed in 2017?
Every one of the features which motivated the design of CockroachDB:
- Cross-datacenter deployments and consistent replication to survive failures and disasters (e.g. without downtime or lost or inconsistent data).
- Horizontal scalability and a cloud-native design to future-proof the data architecture.
- A SQL API with distributed ACID transactions and query execution for developer productivity.
While some of the above features are considered enterprise in other databases, we believe they comprise a generationally appropriate foundation for building products and services, and they will remain free and available under the APL. These are, after all, the features which define CockroachDB.
So what does a startup not need in order to succeed that an established company would consider an important requirement, or even a game-changing enabler of new use cases?
We have two such offerings planned for 2017.
The first is a fully distributed, incremental capability for quickly and consistently backing up and restoring large databases using configurable storage sinks (e.g. S3 or GCS). The same functionality, but non-distributed, will be available for free to all users.
The second is geo-partitioning, a mechanism for row-level control of how and where data is replicated. Geo-partitioning allows a single, logical database to provide low-latency access for geographically disparate customers, as well as enabling compliance with data sovereignty requirements.
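As a rough illustration of row-level placement, the Go sketch below maps a value in each row to the datacenters that should hold its replicas. This is a conceptual sketch only and does not reflect how CockroachDB implements geo-partitioning: the `Row` and `Placement` types, the `ReplicasFor` method, and the datacenter labels are all hypothetical.

```go
package main

import "fmt"

// Row is a hypothetical record whose Country column acts as the partition key.
type Row struct {
	ID      int
	Country string
}

// Placement maps a partition key value to the datacenters holding its replicas.
type Placement map[string][]string

// ReplicasFor returns the datacenters for a row, or the fallback set when
// no rule matches the row's partition key.
func (p Placement) ReplicasFor(r Row, fallback []string) []string {
	if dcs, ok := p[r.Country]; ok {
		return dcs
	}
	return fallback
}

func main() {
	// Datacenter labels are made up for this sketch.
	rules := Placement{
		"DE": {"eu-1", "eu-2", "eu-3"},
		"US": {"us-1", "us-2", "us-3"},
	}
	fallback := []string{"us-1", "eu-1", "ap-1"}

	for _, r := range []Row{{1, "DE"}, {2, "US"}, {3, "JP"}} {
		fmt.Println(r.ID, r.Country, rules.ReplicasFor(r, fallback))
	}
}
```

Keeping a German customer's rows on replicas inside Europe serves both goals named above: reads and writes stay close to the customer, and the data stays within the jurisdiction that a sovereignty rule may require.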
Building CockroachDB has been a two-plus-year labor of love, and we’re now approaching our version 1.0 release. We recognize the challenge inherent in building a new database with these capabilities, and we’re trying to ensure we can continue developing CockroachDB for as long as there’s a better product to release in the next version.