
The History of Sears Predicts Nearly Everything Amazon Is Doing (2017)


From the start, Sears’s genius was to market itself to consumers as an everything store, with an unrivaled range of products, often sold for minuscule profits. The company’s feel for consumer demand was so uncanny, and its operations so efficient, that it became, for many of its diehard customers, not just the best retail option, but the only one worth considering.

By building a large base of fiercely loyal consumers, Sears was able to buy more cheaply from manufacturers and wholesalers. It managed its deluge of orders with massive warehouses, like its central facility in Chicago, in which messages to various departments and assembly workers were sent through pneumatic tubes. In the decade between 1895 and 1905, Sears’s revenue grew by a factor of 50, from about $750,000 to about $38 million, according to Alfred D. Chandler Jr.’s 1977 book The Visible Hand: The Managerial Revolution in American Business. (By comparison, in the last decade, Amazon’s revenue has grown by a factor of 10.)

Then, after one of the most successful half-centuries in U.S. corporate history, Sears did something really crazy. It opened a store.

In the early 1920s, Sears found itself in an economy that was coming off a harsh post-World War I recession, according to Daniel M. G. Raff and Peter Temin’s essay “Sears, Roebuck in the Twentieth Century.” The company was also dealing with a more lasting challenge: the rise of chain stores. To guide its corporate makeover, the company tapped a retired World War I general named Robert Wood, who turned to the U.S. Census and Statistical Abstract of the United States as a fount of marketing wisdom. In federally tabulated figures, he saw the country moving from farm to city, and then from city to suburb. His plan: Follow them with stores.

The first Sears stores opened in the company’s existing mail-order warehouses, for convenience’s sake. But soon they were popping up in new locations. Not satisfied with merely competing with urban department stores like Macy’s, Wood distinguished new Sears locations by plopping them into suburbs where land was cheap and parking space was plentiful.

Sears’s aesthetic was unadorned, specializing in “hard goods” like plumbing tools and car parts. Wood initially thought that young shoppers would prefer a cold, no-frills experience—he likened the first stores to “military commissaries.” This was a rare misstep; Sears ultimately redesigned its stores to appear more high-end.

The company’s brick-and-mortar transformation was astonishing. At the start of 1925, there were no Sears stores in the United States. By 1929, there were 300. While Montgomery Ward built 90 percent of its stores in rural areas or small cities, and Woolworth focused on rich urban areas, Sears bet on everything—rural and urban, rich and poor, farmers and manufacturers. Geographically, it disproportionately built where the Statistical Abstract showed growth: in southern, southwestern, and western cities.


Ex-Google engineer describing the company's role in China censorship


1/ Google is working on a new search engine code-named "Dragonfly" that will aid China's effort to censor information from its citizenry.

As a former Google engineer I wanted to share some information on what it's like to be inside Google as these decisions are made

2/ I previously shared that in 2006 I was an engineer who worked on Google News and was asked to write code to censor news results in China.

I've found some emails from 2006 that shed more light on the censorship requirements of the Chinese state.

3/ The emails I'm presenting came from a mailing list at Google where employees discussed topics on politics and economics. I have formatted the emails for readability and redacted the names of my colleagues, with whom I was debating. The topic was the 2008 Olympics in China.

4/ The striking fact that I had forgotten about until I rediscovered these emails was that China required Google to censor information both broadly (entire news sections were to be censored) and extremely expeditiously (Google needed to comply with requests within 15 minutes)

5/ It is very likely that the same censorship requirements will apply to the Dragonfly project that Google is currently working on and perhaps the requirements have become even more stringent given Google's new willingness to comply with the Chinese state.

6/ The other thing I find disturbing, after all these years, is not only my former colleagues’ willingness to comply with the censorship but their enthusiasm in rationalizing it. It is not a coincidence that the rationale they gave was the same one management had given them.

7/ As Blaise Pascal trenchantly observed in Pensées, *power creates opinion*. This is just as true within corporations as it is for national politics. There are benefits to toeing the line, plus obvious disadvantages to dissenting (e.g., risk of being fired).

8/ My colleagues, although they may have been well intentioned, were just regurgitating the reasons Google's management had given for its first foray into China.

The real problem, then as now, is that management seems to have no moral compass.

9/ For many people there is little difference between what is legal and what is moral. This mindset is especially dangerous when it is held by people in power, such as Google's executives. The mindset is: if it's a legal requirement to censor, then we should do it.

10/ The desire to comply becomes even more urgent for executives of large profit-seeking corporations because it gives them access to massive and lucrative markets. Without a strong moral compass, the temptation is far too strong for most of them.

11/ As I previously tweeted, Sergey Brin is a notable exception to this temptation, and he is reported to be the reason that Google left China in 2010:

12/ Unfortunately, Brin's influence in Google's decision making seems to have waned and the new management seems not only willing to be complicit in censorship, but to lie about what it's doing:

13/ I encourage employees of Google who have been asked to work on censored products to stand up against these requests, as I did in 2006, and make it known that Google's willingness to censor is immoral.

You can follow Vijay Boyapati.


The new fast.ai research datasets collection, on AWS Open Data


In machine learning and deep learning we can’t do anything without data. So the people that create datasets for us to train our models are the (often under-appreciated) heroes. Some of the most useful and important datasets are those that become important “academic baselines”; that is, datasets that are widely studied by researchers and used to compare algorithmic changes. Some of these become household names (at least, among households that train models!), such as MNIST, CIFAR 10, and Imagenet.

We all owe a debt of gratitude to those kind folks who have made datasets available for the research community. So fast.ai and the AWS Public Dataset Program have teamed up to try to give back a little: we’ve made some of the most important of these datasets available in a single place, using standard formats, on reliable and fast infrastructure. For a full list and links see the fast.ai datasets page.

fast.ai uses these datasets in the Deep Learning for Coders courses, because they provide great examples of the kind of data that students are likely to encounter, and the academic literature has many examples of model results using these datasets which students can compare their work to. If you use any of these datasets in your research, please show your gratitude by citing the original paper (we’ve provided the appropriate citation link below for each), and if you use them as part of a commercial or educational project, consider adding a note of thanks and a link to the dataset.

Dataset example: the French/English parallel corpus

One of the lessons that gets the most “wow” feedback from fast.ai students is when we study neural machine translation. It seems like magic when we can teach a model to translate from French to English, even if we can’t speak both languages ourselves!

But it’s not magic; the key is the wonderful dataset that we leverage in this lesson: the French/English parallel text corpus prepared back in 2009 by Professor Chris Callison-Burch of the University of Pennsylvania. This dataset contains over 20 million sentence pairs in French and English. He built the dataset in a really clever way: by crawling millions of Canadian web pages (which are often multi-lingual) and then using a set of simple heuristics to transform French URLs into English URLs. The dataset is particularly important for researchers since it is used in the most important annual competition for benchmarking machine translation models.
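The post doesn’t spell out those heuristics, but a minimal sketch of the general idea might look like the following; the patterns and the `guess_english_url` helper are illustrative assumptions, not Callison-Burch’s actual pipeline:

```python
# Hypothetical sketch of the kind of URL heuristic described above. It assumes
# bilingual Canadian sites often publish the same page under parallel "/fr/"
# and "/en/" paths, "fr."/"en." subdomains, or a lang= query flag.
import re

def guess_english_url(french_url: str) -> str:
    """Guess the English counterpart of a French-language URL."""
    candidates = [
        (r"/fr/", "/en/"),        # path-based language segment
        (r"\bfr\.", "en."),       # language subdomain, e.g. fr.example.ca
        (r"lang=fr", "lang=en"),  # query-string language flag
    ]
    for pattern, replacement in candidates:
        if re.search(pattern, french_url):
            return re.sub(pattern, replacement, french_url, count=1)
    return french_url  # no obvious transformation found

print(guess_english_url("https://fr.example.ca/fr/astronomie.html"))
```

Candidate English pages found this way can then be sentence-aligned with their French originals to produce the parallel pairs shown below.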

Here are some examples of the sentence pairs that our translation models can learn from:

English: Often considered the oldest science, it was born of our amazement at the sky and our need to question. Astronomy is the science of space beyond Earth’s atmosphere.
French: Souvent considérée comme la plus ancienne des sciences, elle découle de notre étonnement et de nos questionnements envers le ciel. L’astronomie est la science qui étudie l’Univers au-delà de l’atmosphère terrestre.

English: The name is derived from the Greek root astron for star, and nomos for arrangement or law.
French: Son nom vient du grec astron, qui veut dire étoile et nomos, qui veut dire loi.

English: Astronomy is concerned with celestial objects and phenomena – like stars, planets, comets and galaxies – as well as the large-scale properties of the Universe, also known as “The Big Picture”.
French: Elle s’intéresse à des objets et des phénomènes tels que les étoiles, les planètes, les comètes, les galaxies et les propriétés de l’Univers à grande échelle.

So what’s Professor Callison-Burch doing now? When we reached out to him to check some details for his dataset, he told us he’s now preparing the University of Pennsylvania’s new AI course; and part of his preparation: watching the videos at course.fast.ai! It’s a small world indeed…

The dataset collection

The following categories are currently included in the collection:

The datasets are all stored in the same tgz format, and (where appropriate) the contents have been converted into standard formats, suitable for import into most machine learning and deep learning software. For examples of using the datasets to build practical deep learning models, keep an eye on the fast.ai blog where many tutorials will be posted soon.
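As a rough illustration of how little ceremony is involved, here is a minimal sketch that downloads and unpacks one of the tgz archives using only the Python standard library; the URL is a placeholder, so substitute the real link from the fast.ai datasets page:

```python
# Minimal sketch of pulling one .tgz archive and extracting it with the
# Python standard library. The URL below is a placeholder -- take the real
# link for the dataset you want from the fast.ai datasets page.
import tarfile
import urllib.request
from pathlib import Path

ARCHIVE_URL = "https://example.com/path/to/some-dataset.tgz"  # placeholder URL
dest = Path("data")
dest.mkdir(exist_ok=True)
archive_path = dest / "some-dataset.tgz"

urllib.request.urlretrieve(ARCHIVE_URL, archive_path)  # download the archive
with tarfile.open(archive_path, "r:gz") as tar:
    tar.extractall(path=dest)                          # unpack into ./data

print(sorted(p.name for p in dest.iterdir()))
```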

Gartner picks digital ethics and privacy as a strategic trend for 2019


The analyst best known for crunching device market-share data, charting technology hype cycles and churning out predictive listicles of emergent capabilities at software’s cutting edge has now put businesses on watch: as well as dabbling in the usual crop of nascent technologies, organizations need to be thinking about wider impacts next year, on both individuals and society.

Call it a sign of the times but digital ethics and privacy has been named as one of Gartner’s top ten strategic technology trends for 2019. That, my friends, is progress of a sort. Albeit, it also underlines how low certain tech industry practices have sunk that ethics and privacy is suddenly making a cutting-edge trend agenda, a couple of decades into the mainstream consumer Internet.

The analyst’s top picks do include plenty of techie stuff too, of course. Yes blockchain is in there. Alongside the usual string of caveats that the “technologies and concepts are immature, poorly understood and unproven in mission-critical, at-scale business operations”.

So too, on the software development side, is AI-driven development — with the analyst sneaking a look beyond the immediate future to an un-date-stamped new age of the ‘non-techie techie’ (aka the “citizen application developer”) it sees coming down the pipe, when everyone will be a pro app dev thanks to AI-driven tools automatically generating the necessary models. But that’s definitely not happening in 2019.

See also: Augmented analytics eventually (em)powering “citizen data science”.

On the hardware front, Gartner uses the umbrella moniker of autonomous things to bundle the likes of drones, autonomous vehicles and robots in one big mechanical huddle — spying a trend of embodied AIs that “automate functions previously performed by humans” and work in swarming concert. Again, though, don’t expect too much of these bots quite yet — collectively, or, well, individually either.

It’s also bundling AR, VR and MR (aka the mixed reality of eyewear like Magic Leap One or Microsoft’s Hololens) into immersive experiences— in which “the spaces that surround us define ‘the computer’ rather than the individual devices. In effect, the environment is the computer” — so you can see what it’s spying there.

On the hardcore cutting edge of tech there’s quantum computing to continue to tantalize with its fantastically potent future potential. This tech, Gartner suggests, could be used to “model molecular interactions at atomic levels to accelerate time to market for new cancer-treating drugs” — albeit, once again, there’s absolutely no timeline suggested. And QC remains firmly lodged in an “emerging state”.

One nearer-term tech trend is dubbed the empowered edge, with Gartner noting that rising numbers of connected devices are driving processing back towards the end-user — to reduce latency and traffic. Distributed servers working as part of the cloud services mix is the idea, supported, over the longer term, by maturing 5G networks. Albeit, again, 5G hasn’t been deployed at any scale yet. Though some rollouts are scheduled for 2019.

Connected devices also feature in Gartner’s picks of smart spaces (aka sensor-laden places like smart cities, the ‘smart home’ or digital workplaces — where “people, processes, services and things” come together to create “a more immersive, interactive and automated experience”) and so-called digital twins, which isn’t as immediately bodysnatcherish as it first sounds, though does refer to a “digital representation of a real-world entity or system” driven by an estimated 20BN connected sensors/endpoints which it reckons will be in the wild by 2020.

But what really stands out in Gartner’s list of developing and/or barely emergent strategic tech trends is digital ethics and privacy — given the concept is not reliant on any particular technology underpinning it; yet is being (essentially) characterized as an emergent property of other already deployed (but unnamed) technologies. So is actually in play — in a way that others on the list aren’t yet (or aren’t at the same mass scale).

The analyst dubs digital ethics and privacy a “growing concern for individuals, organisations and governments”, writing: “People are increasingly concerned about how their personal information is being used by organisations in both the public and private sector, and the backlash will only increase for organisations that are not proactively addressing these concerns.”

Yes, people are increasingly concerned about privacy. Though ethics and privacy are hardly new concepts (or indeed new discussion topics). So the key point is really the strategic obfuscation of issues that people do in fact care an awful lot about, via the selective and non-transparent application of various behind-the-scenes technologies up to now — as engineers have gone about collecting and using people’s data without telling them how, why and what they’re actually doing with it.

Therefore, the key issue is about the abuse of trust that has been an inherent and seemingly foundational principle of the application of far too much cutting edge technology up to now. Especially, of course, in the adtech sphere.

And which, as Gartner now notes, is coming home to roost for the industry — via people’s “growing concern” about what’s being done to them via their data. (For “individuals, organisations and governments” you can really just substitute ‘society’ in general.)

Technology development done in a vacuum with little or no consideration for societal impacts is therefore itself the catalyst for the accelerated concern about digital ethics and privacy that Gartner is here identifying rising into strategic view.

It didn’t have to be that way though. Unlike ‘blockchain’ or ‘digital twins’, ethics and privacy are not at all new concepts. They’ve been discussion topics for philosophers and moralists for scores of generations and, literally, thousands of years. Which makes engineering without consideration of human and societal impacts a very spectacular and stupid failure indeed.

And now Gartner is having to lecture organizations on the importance of building trust. Which is kind of incredible to see, set alongside bleeding edge science like quantum computing. Yet here we seemingly are in kindergarten…

It writes: “Any discussion on privacy must be grounded in the broader topic of digital ethics and the trust of your customers, constituents and employees. While privacy and security are foundational components in building trust, trust is actually about more than just these components. Trust is the acceptance of the truth of a statement without evidence or investigation. Ultimately an organisation’s position on privacy must be driven by its broader position on ethics and trust. Shifting from privacy to ethics moves the conversation beyond ‘are we compliant’ toward ‘are we doing the right thing’.”

The other unique thing about digital ethics and privacy is that it cuts right across all other technology areas in this trend list.

You can — and should — rightly ask what does blockchain mean for privacy? Or quantum computing for ethics? How could the empowered edge be used to enhance privacy? And how might smart spaces erode it? How can we ensure ethics get baked into AI-driven development from the get-go? How could augmented analytics help society as a whole — but which individuals might it harm? And so the questions go on.

Or at least they should go on. You should never stop asking questions where ethics and privacy are concerned. Not asking questions was the great strategic fuck-up condensed into the ‘move fast and break things’ anti-humanitarian manifesto of yore. Y’know, the motto Facebook had to ditch after it realized that breaking all the things didn’t scale.

Because apparently no one at the company had thought to ask how breaking everyone’s stuff would help it engender trust. And so claiming compliance without trust, as Facebook now finds itself trying to, really is the archetypal Sisyphean struggle.

How Developers Stop Learning: Rise of the Expert Beginner


Beyond the Dead Sea: When Good Software Groups Go Bad

I recently posted what turned out to be a pretty popular post called “How to Keep Your Best Programmers,” in which I described what most skilled programmers tend to want in a job and why they leave if they don’t get it. Today, I’d like to make a post that works toward a focus on the software group at an organization rather than on the individual journeys of developers as they move within or among organizations. This post became long enough as I was writing it that I felt I had to break it in at least two pieces. This is part one.

In the previous post I mentioned, I linked to Bruce Webster’s “Dead Sea Effect” post, which describes a trend whereby the most talented developers tend to be the most marketable and thus the ones most likely to leave for greener pastures when things go a little sour. On the other hand, the least talented developers are more likely to stay put since they’ll have a hard time convincing other companies to hire them. This serves as important perspective for understanding why it’s common to find people with titles like “super-duper-senior-principal-fellow-architect-awesome-dude,” who make a lot of money and perhaps even wield a lot of authority but aren’t very good at what they do. But that perspective still focuses on the individual. It explains the group only if one assumes that a bad group is the result of a number of these individuals happening to work in the same place (or possibly that conditions are so bad that they drive everyone except these people away).

I believe that there is a unique group dynamic that forms and causes the rot of software groups in a way that can’t be explained by bad external decisions causing the talented developers to evaporate. Make no mistake–I believe that Bruce’s Dead Sea Effect is both the catalyst for and the logical outcome of this dynamic, but I believe that some magic has to happen within the group to transmute external stupidities into internal and pervasive software group incompetence. In the next post in this series, I’m going to describe the mechanism by which some software groups trend toward dysfunction and professional toxicity. In this post, I’m going to set the stage by describing how individuals opt into permanent mediocrity and reap rewards for doing so.

Learning to Bowl

Before I get to any of that, I’d like to treat you to the history of my bowling game. Yes, I’m serious.

I am a fairly athletic person. Growing up, I was always picked at least in the top 1/3rd or so of the people present for any sport or game that was being played, no matter what it was. I was a jack of all trades and master of none. This inspired in me a sort of mildly inappropriate feeling of entitlement to skill without a lot of effort, and so it went when I became a bowler. Most people who bowl put a thumb and two fingers in the ball and carefully cultivate tossing the bowling ball in a pattern that causes the ball to start wide and hook into the middle. With no patience for learning that, I discovered I could do a pretty good job faking it by putting no fingers and thumbs in the ball and kind of twisting my elbow and chucking the ball down the lane. It wasn’t pretty, but it worked.

It actually worked pretty well the more I bowled, and, when I started to play in an after work league for fun, my average really started to shoot up. I wasn’t the best in the league by any stretch–there were several bowlers, including a former manager of mine, who averaged between 170 and 200, but I rocketed up past 130, 140, and all the way into the 160 range within a few months of playing in the league. Not too shabby.

But then a strange thing happened. I stopped improving. Right at about 160, I topped out. I asked my old manager what I could do to get back on track with improvement, and he said something very interesting to me. Paraphrased, he said something like this:

There’s nothing you can do to improve as long as you keep bowling like that. You’ve maxed out. If you want to get better, you’re going to have to learn to bowl properly. You need a different ball, a different style of throwing it, and you need to put your fingers in it like a big boy. And the worst part is that you’re going to get way worse before you get better, and it will be a good bit of time before you get back to and surpass your current average.

I resisted this for a while but got bored with my lack of improvement and stagnation (a personal trait of mine–I absolutely need to be working toward mastery or I go insane) and resigned myself to the harder course. I bought a bowling ball, had it custom drilled, and started bowling properly. Ironically, I left that job almost immediately after doing that and have bowled probably eight times in the years since, but c’est la vie, I suppose. When I do go, I never have to rent bowling shoes or sift through the alley balls for ones that fit my fingers.

Dreyfus, Rapid Returns and Arrested Development

In 1980, a couple of brothers with the last name Dreyfus proposed a model of skill acquisition that has gone on to have a fair bit of influence on discussions about learning, process, and practice. Later they would go on to publish a book based on this paper and, in that book, they would refine the model a bit to its current form, as shown on Wikipedia. The model lists five phases of skill acquisition: Novice, Advanced Beginner, Competent, Proficient, and Expert. There’s obviously a lot to it, since it takes an entire book to describe it, but the gist of it is that skill acquirers move from “dogmatic following of rules and lack of big picture” to “intuitive transcending of rules and complete understanding of big picture.”

All things being equal, one might assume that there is some sort of natural, linear advancement through these phases, like earning belts in karate or money in the corporate world. But in reality, it doesn’t shake out that way, due to both perception and attitude. At the moment one starts acquiring a skill, one is completely incompetent, which triggers an initial period of frustration and being stymied while waiting for someone, like an instructor, to spoon-feed process steps to the acquirer (or else, as Dreyfus and Dreyfus put it, they “like a baby, pick it up by imitation and floundering”). After a relatively short phase of being a complete initiate, however, one reaches a point where the skill acquisition becomes possible as a solo activity via practice, and the renewed and invigorated acquirer begins to improve quite rapidly as he or she picks “low hanging fruit.” Once all that fruit is picked, however, the unsustainably rapid pace of improvement levels off somewhat, and further proficiency becomes relatively difficult from there forward. I’ve created a graph depicting this (which actually took me an embarrassingly long time because I messed around with plotting a variant of the logistic 1/(1 + e^-x) function instead of drawing a line in Paint like a normal human being).

Rapid skill acquisition
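For anyone who wants to reproduce that curve rather than draw a line in Paint, here is a minimal sketch using the logistic function the author mentions; the axis scales are arbitrary and purely illustrative:

```python
# A quick reproduction of the "rapid gains, then a plateau" curve using the
# logistic function 1 / (1 + e^-x) mentioned in the post.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-6, 6, 200)        # "time spent practicing" (arbitrary units)
skill = 1.0 / (1.0 + np.exp(-x))   # logistic curve: slow start, rapid rise, plateau

plt.plot(x, skill)
plt.xlabel("Practice time (arbitrary units)")
plt.ylabel("Skill level (arbitrary units)")
plt.title("Rapid skill acquisition leveling off")
plt.show()
```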

This is actually the exact path that my bowling game followed from bowling incompetence to some degree of bowling competence. I rapidly improved to the point of competence and then completely leveled off. In my case, improvement hit a local maximum and then stopped altogether, as I was too busy to continue on my path as-is or to follow through with my retooling. This is an example of what, for the purposes of this post, I will call “arrested development.” (I understand the overlap with a loaded psychology term, but forget that definition for our purposes here, please.) In the sense of skills acquisition, one generally realizes arrested development and remains at a static skill level due to one of two reasons: maxing out on aptitude or some kind of willingness to cease meaningful improvement.

For the remainder of this post and this series, let’s discard the first possibility (since most professional programmers wouldn’t max out at or before bare minimum competence) and consider an interesting, specific instance of the second: voluntarily ceasing to improve because of a belief that expert status has been reached and thus further improvement is not possible. This opting into indefinite mediocrity is the entry into an oblique phase in skills acquisition that I will call “Expert Beginner.”

The Expert Beginner

The Road to Expert... and Expert Beginner

When you consider the Dreyfus model, you’ll notice that there is a trend over time from being heavily rules-oriented and having no understanding of the big picture to being extremely intuitive and fully grasping the big picture. The Advanced Beginner stage is the last one in which the skill acquirer has no understanding of the big picture. As such, it’s the last phase in which the acquirer might confuse himself with an Expert. A Competent has too much of a handle on the big picture to confuse himself with an Expert: he knows what he doesn’t know. This isn’t true during the Advanced Beginner phase, since Advanced Beginners are on the “unskilled” end of the Dunning Kruger Effect and tend to epitomize the notion that, “if I don’t understand it, it must be easy.”

As such, Advanced Beginners can break one of two ways: they can move to Competent and start to grasp the big picture and their place in it, or they can ‘graduate’ to Expert Beginner by assuming that they’ve graduated to Expert. This actually isn’t as immediately ridiculous as it sounds. Let’s go back to my erstwhile bowling career and consider what might have happened had I been the only or best bowler in the alley. I would have started out doing poorly and then quickly picked the low hanging fruit of skill acquisition to rapidly advance. Dunning-Kruger notwithstanding, I might have rationally concluded that I had a pretty good aptitude for bowling as my skill level grew quickly. And I might also have concluded somewhat rationally (if rather arrogantly) that me leveling off indicated that I had reached the pinnacle of bowling skill. After all, I don’t see anyone around me that’s better than me, and there must be some point of mastery, so I guess I’m there.

The real shame of this is that a couple of inferences that aren’t entirely irrational lead me to a false feeling of achievement and then spur me on to opt out of further improvement. I go from my optimistic self-assessment to a logical fallacy as my bowling career continues: “I know that I’m doing it right because, as an expert, I’m pretty much doing everything right by definition.” (For you logical fallacy buffs, this is circular reasoning/begging the question). Looking at the graphic above, you’ll notice that it depicts a state machine of the Dreyfus model as you would expect it. At each stage, one might either progress to the next one or remain in the current one (with the exception of Novice or Advanced Beginner who I feel can’t really remain at that phase without abandoning the activity). The difference is that I’ve added the Expert Beginner to the chart as well.
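To make the modified state machine concrete, here is one illustrative way to encode it; this is my own reading of the chart described above, not code from the original post, and the `can_reach` helper is a hypothetical convenience for querying it:

```python
# Illustrative encoding of the modified Dreyfus transitions described above.
# Each stage maps to the stages reachable from it; "stay put" transitions are
# included only where the text says remaining at a stage is possible.
DREYFUS_WITH_EXPERT_BEGINNER = {
    "Novice":            ["Advanced Beginner"],
    "Advanced Beginner": ["Competent", "Expert Beginner"],  # the fork in the road
    "Competent":         ["Competent", "Proficient"],
    "Proficient":        ["Proficient", "Expert"],
    "Expert":            ["Expert"],
    "Expert Beginner":   ["Expert Beginner"],  # a dead end: no path to Competent
}

def can_reach(start: str, goal: str, graph=DREYFUS_WITH_EXPERT_BEGINNER) -> bool:
    """Graph search: is `goal` reachable from `start` in the transition map?"""
    seen, frontier = set(), [start]
    while frontier:
        stage = frontier.pop()
        if stage == goal:
            return True
        if stage in seen:
            continue
        seen.add(stage)
        frontier.extend(graph[stage])
    return False

print(can_reach("Advanced Beginner", "Expert"))   # True, via Competent
print(can_reach("Expert Beginner", "Competent"))  # False, by construction
```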

The Expert Beginner has nowhere to go because progression requires an understanding that he has a lot of work to do, and that is not a readily available conclusion. You’ll notice that the Expert Beginner is positioned slightly above Advanced Beginner but not on the level of Competence. This is because he is not competent enough to grasp the big picture and recognize the irony of his situation, but he is slightly more competent than the Advanced Beginner due mainly to, well, extensive practice being a Beginner. If you’ve ever heard the aphorism about “ten years of experience or the same year of experience ten times,” the Expert Beginner is the epitome of the latter. The Expert Beginner has perfected the craft of bowling a 160 out of 300 possible points by doing exactly the same thing week in and week out with no significant deviations from routine or desire to experiment. This is because he believes that 160 is the best possible score by virtue of the fact that he scored it.

Expert Beginners in Software

Software is, unsurprisingly, not like bowling. In bowling, feedback cycles are on the order of minutes, whereas in software, feedback cycles tend to be on the order of months, if not years. And what I’m talking about with software is not the feedback cycle of compile or run or unit tests, which is minutes or seconds, but rather the project. It’s during the full lifetime of a project that a developer gains experience writing code, source controlling it, modifying it, testing it, and living with previous design and architecture decisions during maintenance phases. With everything I’ve just described, a developer is lucky to have a first try of less than six months, which means that, after five years in the industry, maybe they have ten cracks at application development. (This is on average–some will be stuck on a single one this whole time while others will have dozens.)

What this means is that the rapid acquisition phase of a software developer–Advanced Beginnerism–will last for years rather than weeks. And during these years, the software developers are job-hopping and earning promotions, especially these days. As they breeze through rapid acquisition, so too do they breeze through titles like Software Engineer I and II and then maybe “Associate” and “Senior,” and perhaps eventually on up to “Lead” and “Architect” and “Principal.” So while in the throes of Dunning-Kruger and Advanced Beginnerism, they’re being given expert-sounding titles and told that they’re “rock stars” and “ninjas” and whatever by recruiters–especially in today’s economy. The only thing stopping them from taking the natural step into the Expert Beginner stage is a combination of peer review and interaction with the development community at large.

But what happens when the Advanced Beginner doesn’t care enough to interact with the broader community and for whatever reason doesn’t have much interaction with peers? The Daily WTF is filled with such examples. They fail even while convinced that the failure is everyone else’s fault, and the nature of the game is such that blaming others is easy and handy to relieve any cognitive dissonance. They come to the conclusion that they’ve quickly reached Expert status and there’s nowhere left to go. They’ve officially become Expert Beginners, and they’re ready to entrench themselves into some niche in an organization and collect a huge paycheck because no one around them, including them, realizes that they can do a lot better.

Until Next Time

And so we have chronicled the rise of the Expert Beginner: where they come from and why they stop progressing. In the next post in this series, I will explore the mechanics by which one or more Expert Beginners create a degenerative situation in which they actively cause festering and rot in the dynamics of groups that have talented members or could otherwise be healthy.

Next up: How Software Groups Rot: Legacy of the Expert Beginner


How Exercise Might “Clean” the Alzheimer's Brain


For the 50 million individuals worldwide ailing from Alzheimer’s disease, the announcements by pharmaceutical giants earlier this year that they will end research on therapeutics were devastating. The news is even more devastating considering projections that 100 million more people will be diagnosed with Alzheimer’s disease across the globe by 2050, all potentially without a medical means to better their quality of life.

As it happens, though, the pursuit of a therapeutic has been given a lifeline. New research shows that physical exercise can “clean up” the hostile environments in the brains of Alzheimer’s mice, allowing new nerve cells to grow in the hippocampus, the brain structure involved in memory and learning, and enabling improvements in cognition such as learning and memory. These findings imply that pharmacological agents that enrich the hippocampal environment to boost cell growth and survival might be effective in restoring brain health and function in human Alzheimer’s disease patients.

The brain of an individual with Alzheimer’s disease is a harsh place filled with buildups of harmful nerve cell junk—amyloid plaques and neurofibrillary tangles—and dramatic loss of nerve cells and connections that occur with severe cognitive decline, such as memory loss. Targeting and disrupting this harmful junk, specifically amyloid plaques, to restore brain function has been the basis of many failed clinical trials. This futility has led to a re-evaluation of the amyloid hypothesis—the central dogma for Alzheimer’s disease pathology based on the toxic accumulation of amyloid plaques.

At the same time, there have been traces of evidence for exercise playing a preventative role in Alzheimer’s disease, but exactly how this occurs and how to take advantage of it therapeutically has remained elusive. Exercise has been shown to create biochemical changes that fertilize the brain’s environment to mend nerve cell health. Additionally, exercise induces restorative changes relevant to Alzheimer’s disease pathology with improved nerve cell growth and connectivity in the hippocampus, a process called adult hippocampal neurogenesis. For these reasons, the authors Choi et al. explored whether exercise-induced effects and hippocampal nerve cell growth could be utilized for therapeutic purposes in Alzheimer’s disease to restore brain function.

The researchers found that exercised animals from a mouse model of Alzheimer’s had greatly enhanced memory compared to sedentary ones due to improved adult hippocampal neurogenesis and a rise in levels of BDNF, a molecule that promotes brain cell growth. Importantly, they could recover brain function, specifically memory, in mice with Alzheimer’s disease but without exercise by increasing hippocampal cell growth and BDNF levels using a combination of genetic—injecting a virus—and pharmacological means. On the other hand, blocking hippocampal neurogenesis early in Alzheimer’s worsened nerve cell health in later stages, leading to degeneration of the hippocampus and, subsequently, of memory function. This provides preclinical proof of concept that a combination of drugs that increase adult hippocampal neurogenesis and BDNF levels could be disease-modifying or prevent Alzheimer’s disease altogether.

With this work, things don’t look promising for the amyloid hypothesis—that Alzheimer’s disease is caused by the deposition of amyloid plaques. In this study, it was shown that eliminating amyloid plaques was not necessary to ameliorate memory defects, which is consistent with evidence that plaques can also be found in the brains of healthy individuals. On the contrary, we may be looking at a new and improved fundamental theory for Alzheimer’s disease based on promoting a healthier brain environment and adult hippocampal neurogenesis.

However, this inspiring news should be taken with an important caution—mouse models of Alzheimer’s are notorious for failing to translate into humans: treatments that have worked in mice have failed in humans. Besides, even if these findings translate into humans, they may apply only to the fraction of Alzheimer’s patients whose disease shares genetic components with the mouse model utilized. Future studies will need to replicate these results in mouse models emulating the range of known Alzheimer’s disease genetic milieus and, more importantly, prove their medical relevance to human disease.

Before translating these findings into human patients, there remains significant research to establish that a medication or drug could mimic the effects of exercise—exercise mimetics—by “cleaning up” the brain with BDNF and stimulating neurogenesis to combat Alzheimer’s disease. Currently, the method for administering BDNF to animals in the lab—by direct injection into the brain—is not ideal for use in people, and a hippocampal neurogenesis stimulating compound remains elusive.

Future attempts to generate pharmacological means to imitate and heighten the benefits of exercise—exercise mimetics—to increase adult hippocampal neurogenesis in addition to BDNF may someday provide an effective means of improving cognition in people with Alzheimer’s disease. Moreover, increasing neurogenesis in the earliest stages of the disease may protect against neuronal cell death later in the disease, providing a potentially powerful disease-modifying treatment strategy.

How Manhattan Became a Rich Ghost Town


New York’s empty storefronts are a dark omen for the future of cities.

These days, walking through parts of Manhattan feels like occupying two worlds at the same time. In a theoretical universe, you are standing in the nation’s capital of business, commerce, and culture. In the physical universe, the stores are closed, the lights are off, and the windows are plastered with for-lease signs. Long stretches of famous thoroughfares—like Bleecker Street in the West Village and Fifth Avenue in the East 40s—are filled with vacant storefronts. Their dark windows serve as daytime mirrors for rich pedestrians. It’s like the actualization of a Yogi Berra joke: Nobody shops there anymore—it’s too desirable.

A rich ghost town sounds like a capitalist paradox. So what the heck is going on? Behind the darkened windows, there’s a deeper story about money and land, with implications for the future of cities and the rest of the United States.

Let’s start with the data. Separate surveys by Douglas Elliman, a real-estate company, and Morgan Stanley determined that at least 20 percent of Manhattan’s street retail is vacant or about to become vacant. (The city government’s estimate is lower.) The number of retail workers in Manhattan has fallen for three straight years by more than 10,000. That sector has lost more jobs since 2014, during a period of strong and steady economic growth, than during the Great Recession.

There are at least three interlinked causes. First, the rent, as you may have heard, is too damn high. It’s no coincidence that retail vacancies are highest in some of the most expensive parts of the city, like the West Village and near Times Square. From 2010 to 2014, commercial rents in the most-trafficked Manhattan shopping corridors soared by 89 percent, according to ­CBRE Group, a large real-estate and investment firm. But retail sales rose by just 32 percent. In other words, commercial rents have ascended to an altitude where small businesses cannot breathe. Some of the city’s richest zip codes have become victims of their own affluence.

Second, the pain of soaring rents is exacerbated by the growth of online shopping. It’s typically simplistic to point at a problem in the U.S. and say, “Well, because Amazon.” But it is no coincidence that New York storefront vacancy is climbing just as warehousing vacancy in the U.S. has officially reached an all-century low: A lot of goods are moving from storefronts to warehouses, where they are placed in little brown boxes rather than big brown bags.

Walking around the Upper East Side, where I live, I find it striking how many of the establishments still standing among the many darkened windows are hair salons, nail salons, facial salons, eyebrow places, and restaurants. What’s the one thing they have in common? You won’t find their services on Amazon. The internet won’t cut my hair, and not even the most homesick midwesterner goes online to order a deep dish to be delivered from Chicago to New York. Online shopping has digitized a particular kind of business—mostly durable, nonperishable, and tradable goods—that one used to seek out in department stores or similar establishments. Their disappearance has opened up huge swaths of real estate.

One might expect that new companies would fill the vacuum, particularly given the evidence that e-commerce companies can boost online sales by opening physical locations. But that brings us to the third problem: Many landlords don’t want to offer short-term leases to pop-up stores if they think a richer, longer-term deal is forthcoming from a national brand with money to burn, like a bank branch or retail chain. The upshot is a stubborn market imbalance: The fastest-growing online retailers are looking to experiment with short-term leases, but the landlords are holding out for long-term tenants.

New York’s problems today are an omen for the future of cities. Most people don’t live downtown because they love drifting off to the endearing sounds of honking cars and hollering investment bankers. Rather, they want access to urban activity, diversity, and charm—the quirky bars, the curious antique shops, the family restaurant that’s been there for generations—and the best way to buy that access is to own a bedroom in the heart of the city.

What happens when cities become too expensive to afford any semblance of that boisterous diversity? The author E. B. White called New York an assembly of “tiny neighborhood units.” But the 2018 landlord waiting game is denuding New York of its particularity and turning the city into a high-density simulacrum of the American suburb. The West Village landlords hoping to lease their spaces to national chains are turning one of America’s most famous neighborhoods into a labyrinthine strip mall. Their strategy bodes the disappearance of those quirky restaurants, curious antique shops, and any coffee shops that aren’t publicly traded on the NYSE.

In Jane Jacobs’s famous vision of New York, the city ideally served as a playful laboratory, which nursed new firms and ideas and exported its blessedly strange culture to the world. Today’s New York is the opposite: a net importer of the un-weird, so desperate to bring in national chains to pay exorbitant leases that landlords are willing to sit on barren blocks.

Economics assures us that, in the long run, prices and strategies move toward an equilibrium; macroeconomics abhors a vacuum even more than physics (but apparently less than Fifth Avenue landlords). As vacancies pile up, one would think that desperate property owners would lower the rent to make room for a new generation of unique shops. In this vision, today’s vacancies are a necessary torment, the grassland fire whose ashes will nourish new native species and bring forth a better ecosystem. And, jeez, how many Wells Fargos and Duane Reades can one city block take?

But in the past five years, the problem of rising vacancies and monotony has actually gotten worse. It would be one thing if New York were simply trading eccentricity for accessibility—that is, knocking down fusty establishments to build new apartments with affordable housing. But the median home value in Manhattan is still over $1 million. For both middle-class families and emerging companies looking for a foothold in the city, it’s the same dispiriting picture: rising returns to incumbent businesses and legacy wealth, with fewer chances for the upstarts, the strivers, the rest.

“America has only three cities,” Tennessee Williams purportedly said. “New York, San Francisco, and New Orleans. Everywhere else is Cleveland.” That may have been true once. But New York’s evolution suggests that the future of cities is an experiment in mass commodification—the Clevelandification of urban America, where the city becomes the very uniform species that Williams abhorred. Paying seven figures to buy a place in Manhattan or San Francisco might have always been dubious. But what’s the point of paying New York prices to live in a neighborhood that’s just biding its time to become “everywhere else”?

This article originally appeared in The Atlantic.


Twilio to Acquire Sendgrid


Accelerates Twilio’s Mission to Fuel the Future of Communications

Brings Together the Two Leading Communication Platforms for Developers

The Combination to Create One, Best-in-Class Cloud Communications Platform for Companies to Communicate with Customers Across Every Channel

Twilio & SendGrid Together Serve Millions of Developers, Have 100,000+ Customers, and Have a Greater than $700 Million Annualized Revenue Run Rate*

Twilio (NYSE:TWLO) and SendGrid today announced that they have entered into a definitive agreement for Twilio to acquire SendGrid in an all-stock transaction valued at approximately $2 billion. At the exchange ratio of 0.485 shares of Twilio Class A common stock per share of SendGrid common stock, this price equates to approximately $36.92 per share based on today’s closing prices. The transaction is expected to close in the first half of 2019.
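As a quick sanity check on the arithmetic in the release, the implied per-share price is simply the exchange ratio multiplied by Twilio’s closing price; the sketch below back-calculates that closing price from the stated figures rather than taking it from the announcement:

```python
# Back-of-the-envelope check of the figures quoted above: the implied
# per-share price is the exchange ratio times Twilio's closing price, which
# the stated ~$36.92 implies was roughly $76 on the day of the announcement.
exchange_ratio = 0.485            # Twilio Class A shares per SendGrid share
implied_price_per_share = 36.92   # stated value of one SendGrid share, in USD

twilio_close = implied_price_per_share / exchange_ratio
print(f"Implied Twilio closing price: ${twilio_close:.2f}")              # ~ $76.12
print(f"Check: 0.485 * {twilio_close:.2f} = ${exchange_ratio * twilio_close:.2f}")
```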

Adding the leading email API platform to the leading cloud communications platform can drive tremendous value to the combined customer bases. The resulting company would offer developers a single, best-in-class platform to manage all of their important communication channels -- voice, messaging, video, and now email as well. Together, the companies currently drive more than half a trillion customer interactions annualized*, a figure that is growing rapidly.

“Increasingly, our customers are asking us to solve all of their strategic communications challenges - regardless of channel. Email is a vital communications channel for companies around the world, and so it was important to us to include this capability in our platform," said Jeff Lawson, Twilio's co-founder and chief executive officer. "The two companies share the same vision, the same model, and the same values. We believe this is a once-in-a-lifetime opportunity to bring together the two leading developer-focused communications platforms to create the unquestioned platform of choice for all companies looking to transform their customer engagement.”

“This is a tremendous day for all SendGrid customers, employees and shareholders,” said Sameer Dholakia, SendGrid’s chief executive officer. “Our two companies have always shared a common goal - to create powerful communications experiences for businesses by enabling developers to easily embed communications into the software they are building. Our mission is to help our customers deliver communications that drive engagement and growth, and this combination will allow us to accelerate that mission for our customers.”

Details Regarding the Proposed SendGrid Acquisition
The boards of directors of Twilio and SendGrid have each approved the transaction.

Under the terms of the transaction, Twilio Merger Subsidiary, Inc., a Delaware corporation and a wholly-owned subsidiary of Twilio, will be merged with and into SendGrid, with SendGrid surviving as a wholly-owned subsidiary of Twilio. At closing, each outstanding share of SendGrid common stock will be converted into the right to receive 0.485 shares of Twilio Class A common stock, which represents a per share price for SendGrid common stock of $36.92 based on the closing price of Twilio Class A common stock on October 15, 2018. The exchange ratio represents a 14% premium over the average exchange ratio for the ten calendar days ending October 15, 2018.

The transaction is expected to close in the first half of 2019, subject to the satisfaction of customary closing conditions, including shareholder approvals by each of SendGrid’s and Twilio’s respective stockholders and the expiration of the applicable waiting period under the Hart-Scott-Rodino Antitrust Improvements Act. Certain stockholders of SendGrid owning approximately 6% of the outstanding SendGrid shares have entered into voting agreements and certain stockholders of Twilio who control approximately 33% of total Twilio voting power have entered into voting agreements, or proxies, pursuant to which they have agreed, among other things, and subject to the terms and conditions of the agreements, to vote in favor of the SendGrid acquisition and the issuance of Twilio shares in connection with the SendGrid acquisition, respectively.

Goldman Sachs & Co. LLC is serving as exclusive financial advisor to Twilio and Goodwin Procter LLP is acting as legal counsel to Twilio. Morgan Stanley & Co. LLC is serving as exclusive financial advisor to SendGrid and Cooley LLP and Skadden, Arps, Slate, Meagher & Flom LLP are acting as legal counsel to SendGrid.

Q3 2018 Results and Guidance
Both companies will report their respective financial results for the three months ended September 30, 2018 on November 6, 2018. However, both Twilio and SendGrid are announcing that they have exceeded the guidance provided on Aug. 6th and July 31st, respectively, for their third fiscal quarters.

Guidance for the combined company will be provided after the proposed transaction has closed.

Conference Call Information
Twilio will host a conference call today, October 15, 2018, to discuss the SendGrid acquisition, at 2:30 p.m. Pacific Time, 5:30 p.m. Eastern Time. A live webcast of the conference call, as well as a replay of the call, will be available at https://investors.Twilio.com. The conference call can also be accessed by dialing (844) 453-4207, or +1 (647) 253-8638 (outside the U.S. and Canada). The conference ID is 6976357. Following the completion of the call through 11:59 p.m. Eastern Time on Oct. 22, 2018, a replay will be available by dialing (800) 585-8367 or +1 (416) 621-4642 (outside the U.S. and Canada) and entering passcode 6976357. Twilio has used, and intends to continue to use, its investor relations website as a means of disclosing material non-public information and for complying with its disclosure obligations under Regulation FD.

About SendGrid
SendGrid is a leading digital communications platform enabling businesses to engage with their customers via email reliably, effectively and at scale. A leader in email deliverability, SendGrid processes over 45 billion emails each month for internet and mobile-based customers as well as more traditional enterprises.

Additional Information and Where To Find It
In connection with the proposed transaction between Twilio and SendGrid, Twilio will file a Registration Statement on Form S-4 and joint proxy statement/prospectus forming a part thereof. BEFORE MAKING ANY VOTING DECISION, TWILIO’S AND SENDGRID’S RESPECTIVE INVESTORS AND STOCKHOLDERS ARE URGED TO READ THE REGISTRATION STATEMENT AND JOINT PROXY STATEMENT/PROSPECTUS (INCLUDING ANY AMENDMENTS OR SUPPLEMENTS THERETO) REGARDING THE PROPOSED TRANSACTION WHEN THEY BECOME AVAILABLE BECAUSE THEY WILL CONTAIN IMPORTANT INFORMATION. Investors and security holders will be able to obtain free copies of the Registration Statement, the joint proxy statement/prospectus (when available) and other relevant documents filed or that will be filed by Twilio or SendGrid with the SEC through the website maintained by the SEC at http://www.sec.gov. They may also be obtained for free by contacting Twilio Investor Relations by email at ir@twilio.com or by phone at 415-801-3799 or by contacting SendGrid Investor Relations by email at ir@sendgrid.com or by phone at 720-588-4496, or on Twilio’s and SendGrid’s websites at www.investors.twilio.com and www.investors.sendgrid.com, respectively.

No Offer or Solicitation
This communication does not constitute an offer to sell or the solicitation of an offer to buy any securities nor a solicitation of any vote or approval with respect to the proposed transaction or otherwise. No offering of securities shall be made except by means of a prospectus meeting the requirements of Section 10 of the U.S. Securities Act of 1933, as amended, and otherwise in accordance with applicable law.

Participants in the Solicitation
Each of Twilio and SendGrid and their respective directors and executive officers may be deemed to be participants in the solicitation of proxies from their respective shareholders in connection with the proposed transaction. Information regarding the persons who may, under the rules of the SEC, be deemed participants in the solicitation of Twilio and SendGrid shareholders in connection with the proposed transaction and a description of their direct and indirect interests, by security holdings or otherwise will be set forth in the Registration Statement and joint proxy statement/prospectus when filed with the SEC. Information regarding Twilio’s executive officers and directors is included in Twilio’s Proxy Statement for its 2018 Annual Meeting of Stockholders, filed with the SEC on April 27, 2018 and information regarding SendGrid’s executive officers and directors is included in SendGrid’s Proxy Statement for its 2018 Annual Meeting of Stockholders, filed with the SEC on April 20, 2018.
Additional information regarding the interests of the participants in the solicitation of proxies in connection with the proposed transaction will be included in the joint proxy statement/prospectus and other relevant materials Twilio and SendGrid intend to file with the SEC.

Use of Forward-Looking Statements
This communication contains “forward-looking statements” within the meaning of federal securities laws. Forward-looking statements may contain words such as “believes”, “anticipates”, “estimates”, “expects”, “intends”, “aims”, “potential”, “will”, “would”, “could”, “considered”, “likely” and words and terms of similar substance used in connection with any discussion of future plans, actions or events identify forward-looking statements. All statements, other than historical facts, including statements regarding the expected timing of the closing of the proposed transaction and the expected benefits of the proposed transaction, are forward-looking statements. These statements are based on management’s current expectations, assumptions, estimates and beliefs. While Twilio believes these expectations, assumptions, estimates and beliefs are reasonable, such forward-looking statements are only predictions, and are subject to a number of risks and uncertainties that could cause actual results to differ materially from those described in the forward-looking statements.
The following factors, among others, could cause actual results to differ materially from those described in the forward-looking statements: (i) failure of Twilio or SendGrid to obtain stockholder approval as required for the proposed transaction; (ii) failure to obtain governmental and regulatory approvals required for the closing of the proposed transaction, or delays in governmental and regulatory approvals that may delay the transaction or result in the imposition of conditions that could reduce the anticipated benefits from the proposed transaction, cause the parties to abandon the proposed transaction or otherwise prevent successful completion of the proposed transaction; (iii) failure to satisfy the conditions to the closing of the proposed transactions; (iv) unexpected costs, liabilities or delays in connection with or with respect to the proposed transaction; (v) the effect of the announcement of the proposed transaction on the ability of SendGrid or Twilio to retain and hire key personnel and maintain relationships with customers, suppliers and others with whom SendGrid or Twilio does business, or on SendGrid’s or Twilio’s operating results and business generally; (vi) the outcome of any legal proceeding related to the proposed transaction; (vii) the challenges and costs of integrating, restructuring and achieving anticipated synergies and benefits of the proposed transaction and the risk that the anticipated benefits of the proposed transaction may not be fully realized or take longer to realize than expected; (viii) competitive pressures in the markets in which Twilio and SendGrid operate; (ix) the occurrence of any event, change or other circumstances that could give rise to the termination of the merger agreement; and (x) other risks to the consummation of the proposed transaction, including the risk that the proposed transaction will not be consummated within the expected time period or at all. Additional factors that may affect the future results of Twilio and SendGrid are set forth in their respective filings with the SEC, including each of Twilio’s and SendGrid’s most recently filed Annual Report on Form 10-K, subsequent Quarterly Reports on Form 10-Q, Current Reports on Form 8-K and other filings with the SEC, which are available on the SEC’s website at www.sec.gov. See in particular Part II, Item 1A of Twilio’s Quarterly Report on Form 10-Q for the quarter ended June 30, 2018 under the heading “Risk Factors” and Part II, Item 1A of SendGrid’s Quarterly Report on Form 10-Q for the quarter ended June 30, 2018 under the heading “Risk Factors.” The risks and uncertainties described above and in Twilio’s most recent Quarterly Report on Form 10-Q and SendGrid’s most recent Quarterly Report on Form 10-Q are not exclusive and further information concerning Twilio and SendGrid and their respective businesses, including factors that potentially could materially affect their respective businesses, financial condition or operating results, may emerge from time to time. Readers are urged to consider these factors carefully in evaluating these forward-looking statements, and not to place undue reliance on any forward-looking statements. Readers should also carefully review the risk factors described in other documents that Twilio and SendGrid file from time to time with the SEC. The forward-looking statements in these materials speak only as of the date of these materials.
Except as required by law, Twilio and SendGrid assume no obligation to update or revise these forward-looking statements for any reason, even if new information becomes available in the future.

* Annualized data for the quarterly period ended June 30, 2018.

Source: Twilio Inc.


Docker raises $92M in funding

Docker, the company that did more to create today’s modern containerized computing environment than any other independent company, has raised $92 million of a targeted $192 million funding round, according to a filing with the Securities and Exchange Commission.

The new funding is a signal that while Docker may have lost its race with Google’s Kubernetes over whose toolkit would be the most widely adopted, the San Francisco-based company has become the champion for businesses that want to move to the modern hybrid model of application development and information technology operations.

To understand the importance of containers in modern programming, it may help to explain what they are. Put simply, they’re isolated application environments that don’t need to bundle their own operating system to work. In the past, this type of functionality would have been created using virtual machines, which packaged the software together with a full operating system.

Containers, by contrast, are more efficient.

Because they only contain the application and the libraries, frameworks, etc. they depend on, you can put lots of them on a single host operating system. The only operating system on the server is that one host operating system and the containers talk directly to it. That keeps the containers small and the overhead extremely low.
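To make that concrete, here is a minimal sketch using the Docker SDK for Python; it assumes Docker is installed with a local daemon running, and the image and command are just examples:

import docker  # Docker SDK for Python ("pip install docker"); talks to the local Docker daemon

client = docker.from_env()

# Run a tiny Alpine container. No guest operating system boots here: the container
# shares the host kernel, which is why it starts quickly and adds very little overhead.
output = client.containers.run("alpine:3.8", ["echo", "hello from a container"], remove=True)
print(output.decode().strip())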

Enterprises are quickly moving to containers as they are looking to improve how they develop and manage software — and do so faster. But they can’t do that alone and need partners like Docker to help them make that transition.

What many people miss is that Docker is far more than the container orchestration layer (Kubernetes won that war); it offers a full toolchain for building and managing those containers.

As with every open-source project, technology companies are quick to adopt (and adapt) the code and become well-versed in how to use it. More mainstream big businesses that aren’t quite as tech-savvy will turn to a company like Docker to help them manage projects developed with the toolkits.

It’s the natural evolution of a technology startup that serves big business customers to become less exciting as it becomes more profitable. Enterprises use it. It makes money. The hype is gone. And once a big enterprise customer signs on with a vendor, it tends to stick with that vendor for a very long time.

When Docker’s founder and former chief executive left the company earlier this year, he acknowledged as much:

… Docker has quietly transformed into an enterprise business with explosive revenue growth and a developer community in the millions, under the leadership of our CEO, the legendary Steve Singh. Our strategy is simple: every large enterprise in the world is preparing to migrate their applications and infrastructure to the cloud, en masse. They need a solution to do so reliably and securely, without expensive code or process changes, and without locking themselves to a single operating system or cloud. Today the only solution meeting these requirements is Docker Enterprise Edition. This puts Docker at the center of a massive growth opportunity. To take advantage of this opportunity, we need a CTO by Steve’s side with decades of experience shipping and supporting software for the largest corporations in the world. So I now have a new role: to help find that ideal CTO, provide the occasional bit of advice, and get out of the team’s way as they continue to build a juggernaut of a business. As a shareholder, I couldn’t be happier to accept this role.

With the money, it’s likely that Docker will ramp up its sales and marketing staff to start generating the kind of revenue numbers it needs to go out for a public offering in 2019. The company has built up a slate of independent directors (in another clear sign that it’s trying to open a window for its exit into the public markets).

Docker is already a “unicorn” worth well over $1 billion. The last time Docker reportedly raised capital was back in late 2017, when The Wall Street Journal uncovered a filing document from the Securities and Exchange Commission indicating that the company had raised $60 million of a targeted $75 million round. Investors at the time included AME Cloud Ventures, Benchmark, Coatue Management, Goldman Sachs and Greylock Partners. At the time, that investment valued the company at $1.3 billion.

We’ve reached out to the company for comment and will update this post when we hear back.

Alarming study shows massive insect loss


Ask HN: What's your advice for someone who's raising capital for the first time?

Ask yourself if you actually need the capital.

Between the ubiquity of pay-per-minute cloud computing for every imaginable service, tools to enable extremely productive remote employees, and thousands of software companies ready to handle the complexity of payments, email, ecommerce, hosting, etc., starting a tech company is easier and cheaper than ever.

What's New in DevTools (Chrome 70)

Welcome back! It's been about 12 weeks since our last update, which was for Chrome 68. We skipped Chrome 69 because we didn't have enough new features or UI changes to warrant a post.

New features and major changes coming to DevTools in Chrome 70 include:

  • Live Expressions in the Console
  • Highlighting DOM nodes during Eager Evaluation
  • Performance panel optimizations
  • More reliable debugging
  • Enabling network throttling from the Command Menu
  • Autocomplete for Conditional Breakpoints
  • Breaking on AudioContext events
  • Debugging Node.js apps with ndb

Read on, or watch the video version of this doc:

Live Expressions in the Console

Pin a Live Expression to the top of your Console when you want to monitor its value in real-time.

  1. Click Create Live Expression. The Live Expression UI opens.

    Figure 1. The Live Expression UI

  2. Type the expression that you want to monitor.

    Figure 2. Typing Date.now() into the Live Expression UI

  3. Click outside of the Live Expression UI to save your expression.

    Figure 3. A saved Live Expression

Live Expression values update every 250 milliseconds.

Highlight DOM nodes during Eager Evaluation

Type an expression that evaluates to a DOM node in the Console and Eager Evaluation now highlights that node in the viewport.

Figure 4. Since the current expression evaluates to a node, that node is highlighted in the viewport

Here are some expressions you may find useful:

  • document.activeElement for highlighting the node that currently has focus.
  • document.querySelector(s) for highlighting an arbitrary node, where s is a CSS selector. This is equivalent to hovering over a node in the DOM Tree.
  • $0 for highlighting whatever node is currently selected in the DOM Tree.
  • $0.parentElement to highlight the parent of the currently-selected node.

Performance panel optimizations

When profiling a large page, the Performance panel previously took tens of seconds to process and visualize the data. Clicking on an event to learn more about it in the Summary tab also sometimes took multiple seconds to load. Processing and visualizing the data is faster in Chrome 70.

Figure 5. Processing and loading Performance data

More reliable debugging

Chrome 70 fixes some bugs that were causing breakpoints to disappear or not get triggered.

It also fixes bugs related to sourcemaps. Some TypeScript users would instruct DevTools to blackbox a certain TypeScript file while stepping through code, and instead DevTools would blackbox the entire bundled JavaScript file. These fixes also address an issue that was causing the Sources panel to generally run slowly.

Enable network throttling from the Command Menu

You can now set network throttling to fast 3G or slow 3G from the Command Menu.

Figure 6. Network throttling commands in the Command Menu

Autocomplete Conditional Breakpoints

Use the Autocomplete UI to type out your Conditional Breakpoint expressions faster.

Figure 7. The Autocomplete UI

Break on AudioContext events

Use the Event Listener Breakpoints pane to pause on the first line of an AudioContext lifecycle event handler.

AudioContext is part of the Web Audio API, which you can use to process and synthesize audio.

Figure 8. AudioContext events in the Event Listener Breakpoints pane

Debug Node.js apps with ndb

ndb is a new debugger for Node.js applications. On top of the usual debugging features that you get through DevTools, ndb also offers:

  • Detecting and attaching to child processes.
  • Placing breakpoints before modules are required.
  • Editing files within the DevTools UI.
  • Blackboxing all scripts outside of the current working directory by default.

Figure 9. The ndb UI

Check out ndb's README to learn more.

Bonus tip: Measure real world user interactions with the User Timing API

Want to measure how long it takes real users to complete critical journeys on your pages? Consider instrumenting your code with the User Timing API.

For example, suppose you wanted to measure how long a user spends on your homepage before clicking your call-to-action (CTA) button. First, you would mark the beginning of the journey in an event handler associated with a page load event, such as DOMContentLoaded:

document.addEventListener('DOMContentLoaded', () => {
  window.performance.mark('start');
});

Then, you would mark the end of the journey and calculate its duration when the button is clicked:

document.querySelector('#CTA').addEventListener('click', () => {
  window.performance.mark('end');
  window.performance.measure('CTA', 'start', 'end');
});

You can also extract your measurements, making it easy to send them to your analytics service to collect anonymous, aggregated data:

const CTA = window.performance.getEntriesByName('CTA')[0].duration;

DevTools automatically marks up your User Timing measurements in the User Timing section of your Performance recordings.

Figure 10. The User Timing section

This also comes in handy when debugging or optimizing code. For example, if you want to optimize a certain phase of your lifecycle, call window.performance.mark() at the beginning and end of your lifecycle function. React does this in development mode.

Feedback

To discuss the new features and changes in this post, or anything else related to DevTools:

  • File bug reports at Chromium Bugs.
  • Discuss features and changes on the Mailing List. Please don't use the mailing list for support questions. Use Stack Overflow, instead.
  • Get help on how to use DevTools on Stack Overflow. Please don't file bugs on Stack Overflow. Use Chromium Bugs, instead.
  • Tweet us at @ChromeDevTools.
  • File bugs on this doc in the Web Fundamentals repository.

Consider Canary

If you're on Mac or Windows, consider using Chrome Canary as your default development browser. Canary gives you access to the latest DevTools features.

See the devtools-whatsnew tag for links to all previous DevTools release notes.

A new book calls attention to Stanford Ovshinsky and his inventions

It’s hard to look around in today’s technology-driven world and not see something that exists because of inventor Stanford R. Ovshinsky. When you turn your flat-screen TV on with the click of a remote, when a Prius silently drives past, when you see solar panels powering a home, when you save a photo on your smartphone, you have Ovshinsky, in part, to thank.

Ovshinsky is arguably one of the greatest thinkers and inventors you’ve never heard of. He’s been called his generation’s Thomas Edison and his brilliance compared to that of Albert Einstein. He was ahead of his time.

Born in 1922 in Akron, Ohio, to Jewish parents who emigrated from Eastern Europe, he only completed a high school education and initially pursued a career as a machinist in a factory that made molds for car tires. After he left to start his own company, Energy Conversion Devices, in 1960 in Auburn Hills, Michigan, his inventions drew widespread attention in the scientific community. He befriended Nobel Prize winners, such as I.I. Rabi and Nevill Mott, and made connections in the business world. Before he passed away in 2012 at age 89, the tinkerer held more than 400 patents.

In a time when gigantic, boxy, cathode ray tube television sets sat in the corners of American living rooms, Ovshinsky envisioned a flattened TV set you could hang on the wall like a picture and went on to invent technology [U.S. Patent No. 3,271,591], in 1966, that turns thin panels of glass into semiconductors that spark pixels in our screens to this day. That same patented technology has since been applied to smartphone microchips that store our data and might usher in a new chapter of information storage. While coal mining reigned supreme, Ovshinsky was pondering ways to harness power from the sun and in 1979 began streamlining the mass production of inexpensive solar panels [U.S. Patent No. 4,519,339]. Then as gas-guzzling roadsters grew more and more popular, Ovshinsky told researchers in one of his labs that the makeshift battery they presented him in a beaker would one day power an energy-efficient car. That was in 1982 and his nickel-metal hydride battery [U.S. Patent No. 4,623,597] has powered electric and hybrid vehicles since the early 2000s.

He was a man who saw tomorrow.

That's precisely the title of a new biographical memoir about his life’s work, The Man Who Saw Tomorrow: The Life and Inventions of Stanford R. Ovshinsky, written by Lillian Hoddeson and Peter Garrett.

Smithsonian.com spoke with Hoddeson and Garrett about Ovshinsky’s life, work and worldview.

Lillian, you’ve written about dozens of scientists, physicists and inventors. You have a background in physics yourself. What made Stan an interesting subject for a biography?

Lillian Hoddeson: When my first book came out, I was teaching at the University of Illinois. The history department chairman, Peter Fritzsche, gave the book to his father, who turned out to be Hellmut Fritzsche, a physicist who worked with Stan for many years. Hellmut contacted me and suggested that my next book be a biography of Stan Ovshinsky. In the fall of 2005, I first visited Stan’s company ECD, Energy Conversion Devices, in a suburb of Detroit. I was completely fired up by what I saw, especially the impressive variety of inventive work that was going on there. I was simply blown away by the completely hydrogen-powered Prius that Stan insisted I drive. It was completely silent, used green energy, and water vapor was coming out of the exhaust.

I was also impressed by Stan personally and his wife Iris Ovshinsky, and I wanted to learn more about them, even though the work required learning about many topics that I had never worked on before or studied in any depth. I started doing a few interviews with him and about the third time I visited them, in the summer of 2006, I happened to be present at the sudden death of Iris while she was swimming. I kind of became an honorary member of the family, and I decided that I would write the biography.

Ovshinsky’s interests were so wide-ranging, and yet he was high-achieving in everything he did. In what ways did his interests inform one another?

LH: Stan transformed from his early career as a machinist and a toolmaker. The story goes: he got interested in machines and materials in part by being taken along with his father, who was a scrap metal collector, to many of the machine shops in Akron, a big industrial city. And he decides to become a machinist and a toolmaker when he graduates high school, but being the kind of thinker that he was, he was always motivated to improve the machines he was working with, and eventually he realized he wanted to be an inventor.

His first significant invention was this huge heavy lathe that he named after his father: the Benjamin Lathe. It was an automated lathe used to cut and carve while the machine moved a block of wood. He was especially interested in automation. Later, he studied cybernetics, which is an interdisciplinary way to learn about animals and machines through both communication and controls.

Peter Garrett: His different interests definitely informed each other. He would say later in life when he was trying to explain how he came up with new ideas that he was always thinking about four or five different problems at the same time. He had an incredible capacity for multi-tasking. They would feed on each other and eventually he would make a connection; he would see analogies or make connections that other people weren't seeing and come up with something new.

It seems like there’s hardly an area of our modern world that Stan didn’t touch. Can you talk about the scope of his inventions?

PG: It’s certainly easier for me to get a handle on the scope of his work by looking at the interconnectivity of the discoveries. The crucial discovery, which wasn’t just an invention but a scientific breakthrough, was when he created what is now called the Ovshinsky effect, a technique that uses increasing voltage currents to turn noncrystalline [or amorphous and disordered] materials—like thin glassy films, for example—from insulators to conductors and back again at the flip of a switch. For instance, our flat panel TV displays depend on those amorphous semiconductors because unlike crystal transistors you can take this material and make it into very large sheets, so that your screen has a whole thin sheet of amorphous material covered with little transistors, each one of which is switching or interacting with crystals and turning pixels on and off.

Before this discovery it was believed that only crystalline materials, literal crystals, could do this. That’s what was used to create micro-electronic devices like the transistor. People in the field of solid state physics and in the business of making those devices all believed that you had to use crystals.

What Stan did had not been thought possible. It created quite a stir when he published his results both in The New York Times and Physical Review Letters in November 1968. There were a lot of people who were very upset about it, particularly because he was completely self-educated. He had no scientific training.

The nickel-metal hydride battery, which came when Stan had some of his scientists at ECD working on hydrogen storage, made possible the hydrogen car that Lillian was talking about before.

LH: There's this dramatic moment that we tell about in the book when these researchers first make this nickel-metal hydride battery in a beaker. They bring it to Stan and he calls a meeting of staff to demonstrate it. There's this little beaker battery—this crude experimental demonstration—and he says to the group, ‘Someday that will power an electric car.’

And they didn't believe him but eventually it did.

PG: The reason we call the book The Man Who Saw Tomorrow is because he did look very far ahead in imagining the possible developments and implications of his discoveries, and he made predictions that people thought were completely off the wall.

When he was interviewed for The New York Times story about the discovery of the Ovshinsky effect, the threshold switch, the reporter asked him, ‘What might this be good for?’ And one of the things he said was, ‘Well you could make a TV set that you could hang on your wall like a picture.’

That was in 1968 when people used cathode ray tube TVs, and people in electronics thought it was absolutely ridiculous.

You say that Ovshinsky's discovery of phase-change memory will have the most impact on the world. Can you explain what this technology is and how it stands to affect our future?

LH: Phase-change memory is an offshoot of Ovshinsky’s threshold switch in which either an electrical or laser pulse changes the amorphous chalcogenide material to crystalline; it remains in that state until a stronger pulse changes it back. This bistable feature—meaning its phase remains stable in more than one state—allows the switch to store information and so act as a nonvolatile electronic or optical memory.

Compared to the currently predominant silicon flash memory, phase-change memory is roughly a hundred times faster, requires less power, and can be cycled many more times. As manufacturers work to increase the speed and storage capacity of flash memory chips by scaling them down, they will eventually reach a limit. Chalcogenide memory does not have that limitation and because of its lower power requirements actually works better as it scales down.

Because computer technology plays such a huge role in our lives, and because Ovshinsky’s phase-change memory will both allow today’s computers to work better and allow designers to create more advanced computer architectures in the future, it seems likely to have an increasing impact.

The late John Ross, who was a pioneering chemist at Stanford, once said, “Stan is a genius, but he’s not a scientist.” He almost had a sense of intuition in a way that he himself describes using words like “feeling” or knowing what these inanimate particles that make up our very being “want.” Can you try to expand on that quality?

PG: In terms of him saying, "I see the atoms and molecules and I know what they want to do," it's because he was brilliant. He read a lot across all sorts of areas, and he read very quickly and retained everything.

People would talk about watching him read and he would just be turning the pages like the way you or I would be skimming it. He would just recall everything out of it, and he could go back to the book years later and find the exact page that he wanted to cite. That kind of store of information was one thing.

Another was because he didn't have the formal training that a physicist would have, he didn't have the mathematical techniques for working out his ideas using calculations. He relied very much on visualizing and so that's where you get that “feeling from atoms.” He used his intelligence, but he turned it into visual images that gave him intuition about what you can accomplish with combining different elements.

LH: Another approach that Stan used was drawing analogies between phenomena in different areas.

PG: He became interested in neurophysiology for a period, and he actually contributed to the field doing research for a while. But he thought of nerve cells as being like switches, and then he took it a step further. He actually constructed a switch that worked the way he understood nerve cells did and created an entirely new kind of switch. That was an important step toward discovering the Ovshinsky effect.

Although he was not a trained scientist, he hired a lot of very smart scientists to work with him on his research and also who helped to explain his work in ways that would have been hard for him to do.

Tell us a bit about the ways in which the worldview of Stan and his second wife, Iris, came to impact his inventions.

PG: It's important to include the idea that a lot of his work was motivated by social and political ideals. All of these alternative energy devices—the batteries, solar panels or hydrogen powered car—those were all ways of pursuing a goal that he and Iris identified when they founded their company, which was to try to replace fossil fuels.

This was another way in which he saw tomorrow. He anticipated some of the problems that we are now experiencing, like global warming. Having that kind of idealistic social vision was just as important to him as making the inventions or pursuing his discoveries—and it was very important in how they ran the company.

They wanted to make ECD an embodiment of their social ideals, which meant very generous benefits and also supporting individual development—a lot of educational benefits for the employees, a lot of things that created a feeling of solidarity and commitment to the goals that Stan wanted to pursue.

He wasn’t just a brilliant genius, he was really trying to make life better for people. When he talked about his background as a socialist, it was in a different sense than what a lot of us have of socialism as being a philosophy that the government should provide services or that you should nationalize industries—that didn't interest him at all. He thought of socialism as a way of making life better for people, and he was really dedicated to that and he did succeed in that to some extent.

LH: He didn't really care about the money making except insofar that he needed the money to support the research that he wanted to do.

Let’s talk about some of the pushback he faced in his field. He was highly praised and recognized in his field, but also considered an outsider. How did those competing perceptions affect him?

LH: It hurt him a great deal personally. They didn't want to accept him. Some of them felt a bit envious that they didn't come up with the things that he did.

PG: He wasn’t intentionally an oppositional figure. He wanted to be accepted, he loved science. And many of the most gifted scientists appreciated him. Several Nobel Laureates, who would come to visit just because they wanted to talk to him, recognized what an original creative mind he was.

LH: People like I.I. Rabi.

PG: Rabi, who won his Nobel Prize much earlier and was a senior statesman in the scientific establishment, really hit it off with Stan and more than once called him a genius.

But on the other hand, there were people who were suspicious of him, who thought he was a charlatan. They disliked the way that he publicized his work, which for a scientist at the time would have been considered very unprofessional, getting your discovery on the front page of The New York Times.

When he was working on his neurological research at Wayne University in Detroit, he said how wonderful it was that the other scientists working there accepted him and were interested in his research. He said, ‘I thought that’s how science was,’ that if you made a contribution, people appreciated you and accepted you. The hostile reaction he got when he announced his discovery of the Ovshinsky switch certainly surprised and dismayed him.

Lots of people call Stan “this generation’s Einstein or Edison.” What qualities might be used to describe the next Stan Ovshinsky?

PG: We have that tribute from Berkeley economist Harley Shaiken, who was a mentee of Stan’s, at the end of the book saying, “He was the last of his kind,” and in a way he was. He was a product of his early upbringing and that historical moment. The other thing about him that makes this question very hard to answer is that ordinarily there might be obvious qualities that educators should try to encourage, but what Stan shows is you can’t create someone like that. There will be brilliant, unique figures in the future, but their uniqueness makes them unpredictable.

That's why people who have been suggested as comparisons—Steve Jobs or Elon Musk— really aren’t good comparisons.

It would be interesting to me, in the future when some other completely unforeseen, brilliant, creative person comes along, whether other people will say he’s another Ovshinsky.

LH: Stan was kind of a transitional figure.

PG: His career covered the transition from the industrial age to the information age. So if you think about the question of someone like Stan coming up in the future, that might be a clue—someone who is not just working within the framework of our own time—however, we understand that—but is really seeing tomorrow and helping make this transition into a different era, which by definition is something we can't really imagine until it emerges.

An Unintuitive Take on Data Augmentation for Self-Driving Cars

DeepScale is constantly looking for ways to boost the performance of our object detection models. In this post, I’ll discuss one project we launched towards that end and the unintuitive discovery we made along the way.

Augmentation experiments and unintuitive results

At the start of my internship with DeepScale, I was tasked with implementing a new data augmentor to improve our object detection efforts. One that stood out was a simple technique called cutout regularization. In short, cutout blacks out a randomly-located square in the input image.

Cutout applied to images from the CIFAR 10 dataset.
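For readers who want the mechanics, cutout itself is only a few lines of NumPy. The sketch below is a generic illustration of the technique from the paper, not DeepScale’s implementation, and the default square size is an arbitrary choice:

import numpy as np

def cutout(image, size=16, rng=np.random):
    """Return a copy of an HxWxC image with one randomly located size x size square zeroed out."""
    h, w = image.shape[:2]
    cy, cx = rng.randint(h), rng.randint(w)              # center of the square
    y0, y1 = max(cy - size // 2, 0), min(cy + size // 2, h)
    x0, x1 = max(cx - size // 2, 0), min(cx + size // 2, w)
    out = image.copy()
    out[y0:y1, x0:x1] = 0                                # black out the (border-clipped) square
    return out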

The original paper showed that cutout can significantly improve accuracy for vision applications. Because of this, I was surprised that when I applied it to our data, our detection mmAP decreased. I searched our data pipeline for the problem and found something even more surprising: all of the augmentors we were already using were hurting performance immensely.

At the beginning of this exploration, we were using flip, crop, and weight decay regularization — a standard scheme for object detection tasks. Through an ablation study, I found that each of these hurt detection performance on our internal dataset. Removing our default augmentors resulted in a 13% mmAP boost relative to the network’s initial performance.

Generally, we would expect adding weight decay, flip and crop to improve performance by a few points each, as shown in the dashed bars. In our case, however, these augmentors hurt mmAP by a relative 8.4%, 0.1% and 4.5%, respectively. Removing all augmentors led to a total performance boost of 13%.

So why did these standard augmentors hurt our performance? To explain our unintuitive results, we had to revisit the idea of image augmentation from first principles.

Why we use data augmentation

(This section is an introduction to the intuition behind data augmentation. If you are already familiar with augmentation, feel free to skip to “Why self-driving car data is different.”)

Overfitting is a common problem for deep neural networks. Neural networks are extremely flexible; however, they are often overparameterized given the sizes of common datasets. This results in a model that learns the “noise” within the dataset instead of the “signal.” In other words, they can memorize unintended properties of the dataset instead of learning meaningful, general information about the world. As a result, overfit networks fail to yield useful results when given new, real-world data.

In order to address overfitting, we often “augment” our training data. Common methods for augmenting visual data include randomly flipping images horizontally (flip), shifting their hues (hue jitter) or cropping random sections (crop).

A picture of a giraffe (top left) shown with several common image augmentors: flip (top right), hue jitter (bottom left) and crop (bottom right). Despite these transformations, it is clear that each image is of a giraffe.
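To make these operations concrete, here is a minimal NumPy sketch of horizontal flip and random crop; it is a generic illustration (the output crop size is a parameter you choose), not DeepScale’s pipeline code:

import numpy as np

def random_flip(image, rng=np.random):
    """Flip an HxWxC image horizontally with probability 0.5."""
    return image[:, ::-1] if rng.rand() < 0.5 else image

def random_crop(image, out_h, out_w, rng=np.random):
    """Cut a random out_h x out_w window out of an HxWxC image (assumes out_h <= H and out_w <= W)."""
    h, w = image.shape[:2]
    top = rng.randint(h - out_h + 1)
    left = rng.randint(w - out_w + 1)
    return image[top:top + out_h, left:left + out_w]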

Augmentors like flip, hue jitter and crop help to combat overfitting because they improve a network’s ability to generalize. If you train a network on images of giraffes facing right and on flipped images of giraffes facing left, the network will learn that a giraffe is a giraffe, regardless of orientation. This also forces the network to learn more meaningful and general information about what makes something a giraffe — for example, the presence of brown spotted fur.

Public datasets like the COCO object detection challenge show the need for generalization. Because these datasets contain images aggregated from many sources, taken from different cameras in various conditions, networks need to generalize over many factors to perform well. Some of the variables that nets need to contend with are: lighting, scale, camera intrinsics (such as focal length, principal point offset and axis skew), and camera extrinsics (such as position, angle and rotation). By using many data augmentors, we can train networks to generalize over all of these variables, much like we were able to generalize over giraffe orientation in the previous example.

These examples from the COCO dataset were taken with different cameras, from different angles, scales and poses. It is necessary to learn invariance to these properties to perform well on COCO object detection.

Why self-driving car data is different

Unlike data from COCO and other public datasets, the data collected by a self-driving car is incredibly consistent. Cars generally have consistent pose with respect to other vehicles and road objects. Additionally, all images come from the same cameras, mounted at the same positions and angles. That means that all data collected by the same system has consistent camera properties, like the extrinsics and intrinsics mentioned above. We can collect training data with the same sensor system as will be used in production, so a neural net in a self-driving car doesn’t have to worry about generalizing over these properties. Because of this, it can actually be beneficial to overfit to the specific camera properties of a system.

These examples from a single car in the Berkeley Deep Drive dataset were all taken from the same camera, at the same angle and pose. They also have consistent artifacts, such as the windshield reflection and the object in the bottom right of each frame.

Self-driving car data can be so consistent that standard data augmentors, such as flip and crop, hurt performance more than they help. The intuition is simple: flipping training images doesn’t make sense because the cameras will always be at the same angle, and the car will always be on the right side of the road (assuming US driving laws). The car will almost never be on the left side of the road, and the cameras will never flip angles, so training on flipped data forces the network to overgeneralize to situations it will never see. Similarly, cropping has the effect of shifting and scaling the original image. Since the car’s cameras will always be in the same location with the same field of view, this shifting and scaling forces overgeneralization. Overgeneralization hurts performance because the network wastes its predictive capacity learning about irrelevant scenarios.

A front-view of the sensor array on DeepScale’s data collection car. All sensors are permanently mounted, so all data will have consistent extrinsics — position, angle and rotation. Because we use the same sensors at test-time, all data also has consistent intrinsics — focal length, principal point offset and axis skew. By harnessing the properties of a specific car’s sensors, we can boost vision performance when deploying the same sensor system.

More Improvements

The realization that self-driving car data is uniquely consistent explained our surprising augmentation results. Next, I wanted to see if we could leverage this consistency to further boost performance. Before introducing any new augmentors, I inspected our dataset to see if we could make any improvements at the data level. Our training set originally included images from two wide-angle cameras and a camera with a zoom lens. The zoom lens produces a scaling and shifting effect similar to crop augmentation. At test time, we only use the wide-angle cameras, so training on the zoom images forces the network to overgeneralize. I found that removing the zoom images from our training set gave us another large boost in mmAP. This confirmed our hypothesis that consistency between the train and test sets is important for performance.

After removing the original image augmentors, I trained and tested on a new, more consistent dataset. This improved mmAP by an additional 10.5% relative to our original scheme.

Following this, I considered augmentors that could vary our training data without changing the camera properties. Cutout, the augmentor I implemented at the start of this project, seemed like a good option. Unlike flip and crop, cutout doesn’t change the input in a way that drastically impacts camera properties (i.e. by flipping, shifting or scaling). Instead, cutout simulates obstructions. Obstructions are common in real-world driving data, and invariance to obstructions can help a network detect partially-occluded objects.

Obstructions are common in real-world driving data. In this image, two pedestrians block our view of the police car, while large bags block our view of the pedestrians.

Hue jitter augmentation can also help generalization without affecting camera properties. Hue jitter simply shifts the hue of the input by a random amount. This helps the network generalize over colors (i.e. a red car and a blue car should both be detected the same way). As expected, cutout and hue jitter both improved performance on our new test set.
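A minimal sketch of hue jitter in the same style, assuming an RGB image with float values in [0, 1] and using matplotlib’s color conversions (the maximum shift is an arbitrary example, not a tuned value):

import numpy as np
from matplotlib.colors import rgb_to_hsv, hsv_to_rgb

def hue_jitter(image, max_shift=0.1, rng=np.random):
    """Shift the hue of an HxWx3 RGB image (float values in [0, 1]) by a random amount."""
    hsv = rgb_to_hsv(image)
    hsv[..., 0] = (hsv[..., 0] + rng.uniform(-max_shift, max_shift)) % 1.0  # hue wraps around
    return hsv_to_rgb(hsv)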

Adding cutout and hue jitter augmentation to the new dataset increased relative mmAP by 1% and 0.2%, respectively. This gives us a total 24.7% boost over our original data scheme (flip, crop and weight decay on the old dataset). Note that the y axis is scaled to better show these small differences.

Caveats

It’s worth noting that these augmentation tricks won’t work on datasets that include images from different camera types, at different angles and scales. To demonstrate this, I created a test set with varied camera properties by introducing random flips and crops to our original test set. As expected, our new, specialized augmentation scheme performs worse than our original, standard augmentors on the more general dataset.

When applied to consistent self-driving car data, our specialized augmentation scheme (cutout and hue jitter) provides an 11.7% boost in mmAP over the standard augmentation scheme (flip, crop and weight decay); however, when applied to more varied data, our specialized scheme results in a drop of 24.3% vs the standard scheme.

It’s always important to make sure that your test data covers the range of examples your model will see in the real world. Using specialized data augmentation makes this sanity-check even more essential. It’s easy to fool yourself into thinking that you’ve boosted your model’s performance, when you’ve really just overfit to a dataset that’s too easy (e.g. driving data with only clear, daytime images).

If your dataset really is robust and consistent, these tricks can be a powerful toolkit to improve performance. As shown, we were able to dramatically improve our object detection performance by enabling our network to learn the camera properties of our vehicle. This can be applied to any domain where training data is collected on the same sensor system as will be used in deployment.

Takeaways

Networks that perform well on satellite images (left) or cellular data (center) might require fundamentally different approaches than those built for common research datasets like ImageNet (right).

In hindsight, these augmentation changes might seem obvious. The reality is that we were blinded by conventional wisdom. Augmentors like flip and crop have been so broadly successful on research problems that we never thought to question their applicability to our specific problem. When we revisited the concept of augmentation from first principles, it became clear that we could do better. The field of machine learning has many similar “generic best practices,” such as how to set the learning rate, which optimizer to use, and how to initialize models. It’s important for ML practitioners to continually revisit our assumptions about how to train models, especially when building for specific applications. How does the vision problem change when working with satellite mapping data, or cellular imaging, as opposed to ImageNet? We believe that questions like these are underexplored in academia. By looking at them with fresh eyes, we have the potential to dramatically improve industrial applications of machine learning.

MongoDB switches up its open source license

MongoDB is a bit miffed that some cloud providers — especially in Asia — are taking its open-source code and offering a hosted commercial version of its database to their users without playing by the open-source rules. To combat this, the company today announced it has issued a new software license, the Server Side Public License (SSPL), that will apply to all new releases of its MongoDB Community Server, as well as all patch fixes for prior versions.

Previously, MongoDB used the AGPLv3 license, but it has now submitted the SSPL for approval from the Open Source Initiative.

For virtually all regular users who are currently using the community server, nothing changes because the changes to the license don’t apply to them. Instead, this is about what MongoDB sees as the misuse of the AGPLv3 license. “MongoDB was previously licensed under the GNU AGPLv3, which meant companies who wanted to run MongoDB as a publicly available service had to open source their software or obtain a commercial license from MongoDB,” the company explains. “However, MongoDB’s popularity has led some organizations to test the boundaries of the GNU AGPLv3.”

So while the SSPL isn’t all that different from the GNU GPLv3, with all the usual freedoms to use, modify and redistribute the code (and virtually the same language), the SSPL explicitly states that anybody who wants to offer MongoDB as a service — or really any other software that uses this license — needs to either get a commercial license or open source the service to give back to the community.

“The market is increasingly consuming software as a service, creating an incredible opportunity to foster a new wave of great open source server-side software. Unfortunately, once an open source project becomes interesting, it is too easy for cloud vendors who have not developed the software to capture all of the value but contribute nothing back to the community,” said Eliot Horowitz, the CTO and co-founder of MongoDB, in a statement. “We have greatly contributed to — and benefited from — open source and we are in a unique position to lead on an issue impacting many organizations. We hope this will help inspire more projects and protect open source innovation.”

I’m sure this move will ruffle some feathers. It’s hard to discuss open-source licenses without getting religious about what this movement is all about. And because MongoDB is the commercial entity behind the software and manages outside contributions to the code, it does have a stronger grip on the actual code than other projects that are managed by a large open-source foundation, for example. For some, that alone is anathema to everything they think open source should stand for. For others, it’s simply a pragmatic way to develop software. Either way, though, this will kick off a discussion about how companies like MongoDB manage their open-source projects and how much control they can exert over how their code is used. I, for one, can’t wait to read the discussions on Hacker News today.


Do We Worship Complexity?

Software development is not really about programming. Anybody can write a ten-line program. The true challenge is complex systems. If a system is so large that a single person alone cannot understand it and develop it further, concepts such as modularization are crucial. Modularization divides the system into small units, which a single person can handle. Then complexity becomes the main challenge. With this kind of complexity, single individuals can no longer implement projects; only teams can. This leads to organizational challenges.

Conway’s Law

Conway’s Law is important in the context of organization and software development. It states that the architecture of a system represents the communication structures of the organization that implements the system. For each module in the software there is an organizational unit, and for each communication relationship between organizational units there is a dependency between the modules in the software.

Conway’s paper of 1968, however, also describes something else: If an organization wants to develop a big system, a lot of people will need to work on the project. Since communication in a large team is difficult, it collapses at a certain team size. Since communication and architecture influence each other, poor communication leads to chaotic architecture and additional complexity.

But Conway goes further: Obviously, if at all possible, you should aim for an elegant solution that a small team can implement. But a manager’s prestige depends on the size of the team and budget he or she is responsible for, says Conway. That’s why a manager will strive for as large a team and as large a budget as possible.

That doesn’t seem to be a problem at first. If the project is done with too large a team, then some people will just be sitting around doing nothing. That costs money, but doesn’t jeopardize the project or the architecture. But Conway says that Parkinson’s Law will strike. Parkinson’s Law explains why some administrations hire more employees but still don’t get more work done. The law states that a task expands to completely use up the time available to all employees. Even if the task could be handled easily and quickly, more and more people participate until everyone in the organization is busy. So in a software project, all team members will work on the project, regardless of whether this is necessary or not. Accordingly, the organization will grow, communication will collapse, and the architecture will become chaotic.

Conway’s insight, which is now 50 years old, is particularly interesting because it can explain why a large project might have a bad architecture and be difficult to develop further, even if it is a very important project.

The managers that this paper describes worship complexity without realizing it. They want as large a team as possible, and thereby they make a problem complex, because a large organization can cause the architecture to collapse.

What About Software Architects?

Not only managers but also software architects sometimes unconsciously worship complexity. This happens, for example, when we use patterns such as event sourcing, architectures with many layers, or microservices without sufficiently weighing the benefits in the specific context against the complexity they add.

Wanting to use the latest and shiniest technology can also lead to excessive complexity. After all, we are all looking for technical challenges and want to implement interesting projects. Modern approaches, and complex systems in particular, are well suited for this.

We also sometimes solve technical problems that do not exist. This might result in very generic or highly scalable solutions that the actual requirements do not call for and that therefore generate unnecessary complexity.

Complexity as an Excuse

A particularly blatant case of complexity worshiping is the statement “This doesn’t work for us. Our challenges are much greater than those of Amazon or Google.” I’ve heard that from employees of different companies. Such statements are surprising: companies like Amazon or Google have extremely complex IT systems. Their economic success depends directly on these IT systems. Not least because of these IT systems, they are among the most valuable companies in the world.

At first glance, the statements can be interpreted as defensive: Amazon and Google have a modern organization and a cloud infrastructure, but in the company’s much more complex environment, it’s obviously impossible to establish similar things. But perhaps this statement reflects pride. After all, you’re dealing with almost unprecedented challenges. Either way, the complexity of course has disadvantages, but also the advantage that you don’t have to consider certain approaches such as cloud, continuous delivery, or microservices, because they are impossible to implement anyway.

It is therefore questionable whether we really always avoid complexity. Concentrating only on techniques to make designs as simple and elegant as possible is of no use if we unconsciously worship complexity. Therefore, it is important to become aware of these unconscious mechanisms. Of course, there are still many complex problems that are actually difficult to solve.

Many thanks to my colleagues Jens Bendisposto, Jochen Christ, Lutz Hühnken, Michael Vitz and Benjamin Wolf for their comments on an earlier version of the article.

tl;dr

Software development is all about handling complexity. It would be best to avoid complexity right away. But unfortunately, there are times when complexity is worshipped - consciously or unconsciously - leading to unnecessarily complex systems.

GitHub launches Actions, its workflow automation tool

For the longest time, GitHub was all about storing source code and sharing it either with the rest of the world or your colleagues. Today, the company, which is in the process of being acquired by Microsoft, is taking a step in a different but related direction by launching GitHub Actions. Actions allow developers to not just host code on the platform but also run it. We’re not talking about a new cloud to rival AWS here, but instead about something more akin to a very flexible IFTTT for developers who want to automate their development workflows, whether that is sending notifications or building a full continuous integration and delivery pipeline.

This is a big deal for GitHub. Indeed, Sam Lambert, GitHub’s head of platform, described it to me as “the biggest shift we’ve had in the history of GitHub.” He likened it to shortcuts in iOS — just more flexible. “Imagine an infinitely more flexible version of shortcut, hosted on GitHub and designed to allow anyone to create an action inside a container to augment and connect their workflow.”

GitHub users can use Actions to build their continuous delivery pipelines, and the company expects that many will do so. And that’s pretty much the first thing most people will think about when they hear about this new project. GitHub’s own description of Actions in today’s announcement definitely fits that bill, too. “Easily build, package, release, update, and deploy your project in any language—on GitHub or any external system—without having to run code yourself,” the company writes. But it’s about more than that.

“I see CI/CD as one narrow use case of actions. It’s so, so much more,” Lambert stressed. “And I think it’s going to revolutionize DevOps because people are now going to build best in breed deployment workflows for specific applications and frameworks, and those become the de facto standard shared on GitHub. […] It’s going to do everything we did for open source again for the DevOps space and for all those different parts of that workflow ecosystem.”

That means you can use it to send a text message through Twilio every time someone uses the ‘urgent issue’ tag in your repository, for example. Or you can write a one-line command that searches your repository with a basic grep command. Or really run any other code you want to because all you have to do to turn any code in your repository into an Action is to write a Docker file for it so that GitHub can run it. “As long as there is a Docker file, we can build it, run in and connect it to your workflow,” Lambert explained. If you don’t want to write a Docker file, though, there’s also a visual editor you can use to build your workflow.

As Corey Wilkerson, GitHub’s head of product engineering also noted, many of these Actions already exist in repositories on GitHub today. And there are now over 96 million of those on GitHub, so that makes for a lot of potential actions that will be available from the start.

With Actions, which is now in limited public beta, developers can set up the workflow to build, package, release, update and deploy their code without having to run the code themselves.

Now developers could host those Actions themselves — they are just Docker containers, after all — but GitHub will also host and run the code for them. And that includes developers on the free open source plan.

Over time — and Lambert seemed to be in favor of this — GitHub could also allow developers to sell their workflows and Actions through the GitHub marketplace. For now, that’s not an option, but it’s definitely something the company has been thinking about. Lambert also noted that this could be a way for open source developers who don’t want to build an enterprise version of their tools (and the sales force that goes with that) to monetize their efforts.

While GitHub will make its own actions available to developers, this is an open platform and others in the GitHub community can contribute their own actions, too.

GitHub will slowly open Actions to developers, starting with daily batches for the time being. You can sign up for access here.

In addition to Actions, GitHub also announced a number of other new features on its platform. As the company stressed during today’s event, its mission is to make the life of developers easier — and most of the new features may be small but do indeed make it easier for developers to do their jobs.

So what else is new? GitHub Connect, which connects the silo of GitHub Enterprise with the open source repositories on its public site, is now generally available, for example. GitHub Connect enables new features like unified search, which can search through both the open source code on the site and internal code, as well as a new Unified Business Identity feature that brings together the multiple GitHub Business accounts that many businesses now manage (thanks, shadow IT) under a single umbrella to improve billing, licensing and permissions.

The company also today launched three new courses in its Learning Lab that make it easier for developers to get started with the service, as well as a business version of Learning Lab for larger organizations.

What’s maybe even more interesting for developers whose companies use GitHub Enterprise, though, is that the company will now allow admins to enable a new feature that will display those developers’ work as part of their public profile. Given that GitHub is now the de facto resume for many developers, that’s a big deal. Much of their work, after all, isn’t in open source or in building side projects, but in the day-to-day work at their companies.

The other new features the company announced today are pretty much all about security. The new GitHub Security Advisory API, for example, makes it easier for developers to find threats in their code through automatic vulnerability scans, while the new security vulnerability alerts for Java and .NET projects now extend GitHub’s existing alerts to these two languages. If your developers are prone to putting their security tokens into public code, then you can now rest easier since GitHub will now also start scanning all public repositories for known token formats. If it finds one, it’ll alert you and you can set off to create a new one.

First analysis of how Uber and Lyft have affected roadway congestion in SF

Image: Rider enters a TNC vehicle

Overview And Key Findings

"TNCs and Congestion" report provides the first comprehensive analysis of how Transportation Network Companies Uber and Lyft collectively have affected roadway congestion in San Francisco.

Key findings in the report:

The report found that Transportation Network Companies accounted for approximately 50 percent of the rise in congestion in San Francisco between 2010 and 2016, as indicated by three congestion measures: vehicle hours of delay, vehicle miles travelled, and average speeds.

Employment and population growth were primarily responsible for the remainder of the worsening congestion.

Major findings of the "TNCs and Congestion" report show that, collectively, the ride-hail services accounted for:

  • 51 percent of the increase in daily vehicle hours of delay between 2010 and 2016; 
  • 47 percent of the increase in vehicle miles travelled during that same time period; and
  • 55 percent of the average speed decline on roadways during that same time period.
  • On an absolute basis, TNCs comprise an estimated 25 percent of total vehicle congestion (as measured by vehicle hours of delay) citywide and 36 percent of delay in the downtown core.

Consistent with prior findings from the Transportation Authority’s 2017 TNCs Today report, TNCs also caused the greatest increases in congestion in the densest parts of the city - up to 73 percent in the downtown financial district - and along many of the city’s busiest corridors.  TNCs had little impact on congestion in the western and southern San Francisco neighborhoods.

The report also found that changes to street configuration (such as when a traffic lane is converted to a bus-only lane) contributed less than 5 percent to congestion.

Resources

Download a copy of "TNCs and Congestion" report.

Download a copy of the press release.

Dynamic Map

TNC Congestion Explorer: Explore a dynamic map of TNCs and Congestion.

Data Files

Download a copy of the data file used to prepare the report:

Data set 2010

Data set 2016

Connect With Us

If you have questions about "TNCs Today," or are interested in a research collaboration, please contact Joe Castiglione, Deputy Director for Technology, Data and Analysis via email or Drew Cooper, Planner, via email.

Why is FFTW written in OCaml and what makes it so fast?


FFTW was written in OCaml because it is a metaprogram (in this case a program that generates a program) and OCaml is a MetaLanguage (a family of programming languages that were specifically designed for metaprogramming) thus permitting a relatively simple implementation in OCaml. When FFTW was written, OCaml was the most pragmatic FPL for this. Nowadays there are more options such as Haskell, Scala and F#. FFTW was originally written in Scheme but the authors switched to OCaml because its built-in pattern-matching makes symbolic simplification so much easier.
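To give a flavour of what "symbolic simplification" means in a code generator, here is a toy sketch that folds two algebraic identities (x * 1 and x + 0) out of a generated expression tree. It is written in TypeScript purely for illustration; genfft does this kind of rewriting in OCaml, where pattern matching over the expression type makes such rules far terser.

```typescript
// A toy expression type for generated arithmetic, and a simplifier that
// applies two algebraic identities: x * 1 => x and x + 0 => x.
type Expr =
  | { kind: "num"; value: number }
  | { kind: "var"; name: string }
  | { kind: "add"; left: Expr; right: Expr }
  | { kind: "mul"; left: Expr; right: Expr };

function simplify(e: Expr): Expr {
  switch (e.kind) {
    case "num":
    case "var":
      return e;
    case "add": {
      const l = simplify(e.left);
      const r = simplify(e.right);
      if (l.kind === "num" && l.value === 0) return r; // 0 + x => x
      if (r.kind === "num" && r.value === 0) return l; // x + 0 => x
      return { kind: "add", left: l, right: r };
    }
    case "mul": {
      const l = simplify(e.left);
      const r = simplify(e.right);
      if (l.kind === "num" && l.value === 1) return r; // 1 * x => x
      if (r.kind === "num" && r.value === 1) return l; // x * 1 => x
      return { kind: "mul", left: l, right: r };
    }
  }
}
```

Roughly speaking, genfft applies many more rules than these (constant folding, common-subexpression elimination, DFT-specific identities), but the overall shape is the same: build an expression graph for a transform of a given size, simplify it, then emit the C code that FFTW ships.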

FFTW is fast because it combines an efficient algorithm (mixed radix with Rader's convolution IIRC) with very high-level optimisations (algebraic simplifications written in OCaml) with low-level efficiency (an aggressive optimising C compiler) as well as run-time profiling to choose the fastest codelets (which is largely affected by the target architecture).

Note that some people (such as Frank) get confused because FFTW is commonly distributed in the form of precompiled C code. That is not the source code. Most of the C code is generated by OCaml source code (see this statement by one of the authors of FFTW on Google Groups). One of the authors of FFTW, Steven G. Johnson, once patiently explained this on the caml-list to someone who continued to argue with him (d'oh!).

Lessons learned from creating a real-time collaborative rich-text editor


How we approached collaborative editing

Real-time collaboration is a feature we have wanted to introduce since the inception of CKEditor 5. The research we did back in 2012, and the failed attempts we observed all around, showed us that full support for collaborative editing of rich-text data cannot be bolted onto existing projects. A proper architecture has to be designed and implemented from scratch, with real-time collaboration treated as a first-class citizen in the entire project.

As simple as it sounds, for us it meant leaving behind years of WYSIWYG HTML editor experience and a rock-solid code base of CKEditor 4 that we were proud of and that our customers appreciated. Leaving a code base estimated at 50+ man-years and restarting.

We were quite seriously scared of repeating the infamous history of Netscape, a well-known example of a company that failed to successfully release a new version of popular software after deciding to rewrite it from scratch. Fortunately, that did not happen in our case.

It took us nearly 4 years, but we succeeded. CKEditor 5 Framework was built with real-time collaboration in mind, from its very foundations. The integrity of the platform was validated by CKEditor 5 Collaborative Editing — a set of features that enables users to create and edit content together in a real-time collaborative environment.

This article describes how we approached the problem and what challenges we had to overcome in order to provide real-time collaborative editing capable of handling rich text. Check it out if you are interested in:

  • The problems you may face when implementing real-time collaborative editing.
  • What it takes to build a rich-text editor with support for real-time collaboration.
  • How we approached collaborative editing in CKEditor 5.

Real real-time collaboration

Since collaborative editing is a highly desired feature (see [1], [2], [3]), many projects claim to support it. However, very few solutions are able to provide top quality and completeness. Additionally, the terms “collaborative editing” and “collaboration” are quite broad and can be understood in a variety of ways, which leads to even more confusion among potential users.

This article describes how real-time collaborative editing was implemented in CKEditor 5. The terms “collaboration” and “real-time collaborative editing” are used interchangeably throughout the document to refer to “real real-time collaboration” as implemented by CKEditor 5.

Alternative solutions

From the beginning, our goal was to provide a solution with no compromises when it comes to collaborative editing. There are many shortcuts one may be tempted to take to enable collaboration in an application that was not designed for it, but in the end they all result in a poor user experience:

  • Full or partial content locking. Only one user can edit the document or a given part of the document (a block element: paragraph, table, list item, etc.) at the same time.
  • Collaboration features enabled in “read-only” mode. Users are able to make comments on text but only if the editor is in “read-only” mode.
  • Manual conflict resolution. Edits in the same place would have to be resolved manually by one of the users.
  • Only basic features enabled in collaborative editing. You can bold the text or create a heading, but forget about support for tables or nested lists.
  • Lack of intention preservation. After conflicts are resolved, the user ends up with content different from what they intended to create (in other words: poor conflict resolution).

We wanted to avoid all these pitfalls. It required creating a truly real-time collaborative editing solution that enables all users to simultaneously create and edit content without any limitations or feature stripping. We always had one idea: the editor should look, feel and behave the same, no matter if collaborative editing is on or off.

It’s all about conflicts

During collaborative editing users are constantly modifying their local editor content and synchronizing the changes between themselves. When two or more users edit the same part of the content, conflicts may, and will, appear. Conflict resolution is what makes or breaks the collaborative editing experience.

For example, when two users remove a part of the same paragraph, their editors’ states need to be synchronized. However, this is problematic: when User A receives information from User B, this information is based on User B’s content — which is different from what User A is currently working on.

Figure 1. A sample collaboration scenario with no conflict resolution.

This is one of the simplest scenarios but even that, without proper mechanisms in place, would lead to lack of eventual consistency — a fundamental requirement of any collaborative editing solution. Some editors introduce full or partial content locking to prevent this from happening, but this was not the kind of limitation that we would accept.

Side note: One may think that in real-life use conflicts will not happen frequently and, perhaps, you do not need a sophisticated solution to them. Could we not simply reject changes if we discover a conflict? It turns out that in reality conflicts are quite frequent and rejecting one user’s changes when they happen leads to an awful user experience.

Our take on Operational Transformation

There are several approaches to implementing conflict resolution in real-time collaborative editing. Two main candidates are Operational Transformation (OT) and Conflict-Free Replicated Data Type (CRDT). We chose OT and perhaps one day we will write down our thoughts on the ongoing OT vs. CRDT battle.

Long story short, CKEditor 5 uses OT to make sure it is able to resolve conflicts. OT is based on a set of operations (objects describing changes) and algorithms that transform these operations accordingly, so that all users end up with the same editor content regardless of the order in which these operations were received. As a concept it is well-described in IT literature ([1], [2]) and it is proven by existing implementations (although none that could serve as a stable and powerful enough base for our needs).

Therefore, in 2015 we started working on our own OT implementation. We quickly realized that basic Operational Transformation (as usually described and implemented) is not enough to provide a top-quality user experience for rich-text editing. OT in its basic form defines three operations: insert, delete, and set attribute. These operations are meant to be executed on a linear data model. They are responsible for inserting text characters, removing text characters and changing their attributes (for example, to set them bold). However, a powerful rich-text editor requires more than that.
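As a rough illustration of what such transformation algorithms do in the linear case, here is a textbook-style sketch in TypeScript (not CKEditor 5's actual code): one operation is adjusted against a concurrent one so that both clients converge on the same text. Ties at the same position, overlapping delete ranges and client priority are deliberately left out.

```typescript
// A textbook sketch of operational transformation for a linear (plain-text)
// model: two operation types and a transform() that adjusts one operation
// against a concurrent one, so every client converges to the same text.
interface InsertOp { type: "insert"; position: number; text: string; }
interface DeleteOp { type: "delete"; position: number; length: number; }
type Op = InsertOp | DeleteOp;

function shift(op: Op, delta: number): Op {
  return op.type === "insert"
    ? { type: "insert", position: op.position + delta, text: op.text }
    : { type: "delete", position: op.position + delta, length: op.length };
}

// Transform `op` so it can be applied after `against` has already been applied.
function transform(op: Op, against: Op): Op {
  if (against.type === "insert") {
    // Concurrent insert before our position: shift right by the inserted length.
    return against.position <= op.position ? shift(op, against.text.length) : op;
  }
  // Concurrent delete before our position: shift left by however much was
  // removed before us (clamped so we never move past the deletion point).
  if (against.position < op.position) {
    const removedBefore = Math.min(against.length, op.position - against.position);
    return shift(op, -removedBefore);
  }
  return op;
}

// Both clients edit "hello" at once: A inserts "foo" at 0, B inserts "bar" at 5.
// A transforms B's op against its own and applies insert "bar" at 8;
// B transforms A's op and applies insert "foo" at 0. Both end up with "foohellobar".
```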

Support for complex data structures

The linear data model is a simple data model that is sufficient to represent plain text. HTML, by contrast, is a tree-based language, where an element can contain multiple other elements. An HTML document is represented in the browser as the Document Object Model (or DOM), which is tree-structured. It is possible to represent simple, flat structured data in a linear model, but this model falls short when it comes to complex data structures, like tables, captioned images or lists containing block elements. In a linear model, elements simply cannot contain other elements. For example, a block quote cannot contain a list item or a heading.
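A tree-structured model can be sketched with just two node types. The names below are hypothetical, not CKEditor 5's real classes, but they show why nesting (a block quote containing a list item containing a paragraph) becomes trivial to represent:

```typescript
// A minimal sketch of a tree-structured document model (hypothetical types,
// not CKEditor 5's actual classes). Elements can nest arbitrarily.
interface TextNode {
  kind: "text";
  data: string;
  attributes: Record<string, unknown>; // e.g. { bold: true }
}

interface ElementNode {
  kind: "element";
  name: string; // "paragraph", "blockQuote", "listItem", "table", ...
  children: Array<ElementNode | TextNode>;
}

// <blockQuote><listItem><paragraph>Hello</paragraph></listItem></blockQuote>
const doc: ElementNode = {
  kind: "element",
  name: "blockQuote",
  children: [{
    kind: "element",
    name: "listItem",
    children: [{
      kind: "element",
      name: "paragraph",
      children: [{ kind: "text", data: "Hello", attributes: {} }],
    }],
  }],
};
```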

Hence, we needed to go a step further and provide Operational Transformation algorithms that work for a tree data structure. Back in 2015, we could find literally one paper about OT for trees ([1]) and no evidence of anyone actively working on it. We based our work on that research, but the reality turned out to be even more challenging than we could have expected. The first implementation took us over one year, with several significant reworks over the next two years. The result is, however, outstanding. We not only managed to build the engine for real-time collaboration, but also implemented a complete end-user solution which validates what would otherwise have remained theoretical work.

The diagram below shows how simple structured content can be represented in a linear data model:

Figure 2. Simple structured data in a linear data model.

The diagram below shows how a more complex piece of rich text can be represented in a tree-structured data model:

Figure 3. Rich-text data in a tree-structured data model.

Advanced conflict resolution

Switching to the tree data model was not enough to implement bulletproof real-time collaboration. We quickly realized that the basic set of operations (insert, delete, set attribute) is insufficient to handle real-life scenarios gracefully. While these three operations perhaps provide enough semantics to implement conflict resolution in a linear data model, they do not capture the semantics of rich-text editing.

Below are some examples of situations where users simultaneously perform an action on the same part of content:

(1) User A changes the list item type (from bulleted to numbered) while User B presses Enter to split that list item:

Figure 4. A sample collaboration scenario (1) with correct and incorrect conflict resolution.

(2) User A and User B press Enter in the same paragraph:

Figure 5. A sample collaboration scenario (2) with correct and incorrect conflict resolution.

(3) User A wraps a paragraph into a block quote while User B presses Enter:

Figure 6. A sample collaboration scenario (3) with correct and incorrect conflict resolution.

(4) User A adds a link to a sentence, while User B writes inside that sentence:

Figure 7. A sample collaboration scenario (4) with correct and incorrect conflict resolution.

(5) User A adds a link to some text, while User B removes a part of that text and then undoes the removing:

Figure 8. A sample collaboration scenario (5) with correct and incorrect conflict resolution.

To properly handle these and many other situations we needed to heavily enhance our Operational Transformation algorithms. The most important enhancement that we made was adding a set of new operations to the basic three (insert, remove, set attribute). The goal was to better express the semantics of any user changes. That, in turn, allowed us to implement better conflict resolution algorithms. To the basic three operations we added:

  • The rename operation, to handle renaming an element (used, for example, to change a paragraph into a heading or a list item).
  • The split, merge, wrap, unwrap operations to better describe the user intention.
  • The insert text operation, to differentiate between inserting text content and elements.
  • Unrelated to conflict resolution, we have also introduced the marker operation.

Why do we need these new operations? Rename, split, merge, wrap and unwrap “actions” can be executed by a combination of insert, move and remove operations. For example, splitting a paragraph can be represented as a pair of “insert a new paragraph” + “move a part of the old paragraph to the new paragraph”. However, the split operation is semantic-focused — it conveys the user’s intention. It means more than insert + move which just happen to be executed one after another.

Thanks to the new operations, we can write more contextual transformation algorithms. This way we can resolve more complex use cases like scenarios (1-4) described above.
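The sketch below shows what such a semantics-aware operation set can look like as plain data. The type names and the path-based addressing are illustrative rather than CKEditor 5's actual operation classes:

```typescript
// Illustrative operation set for a tree model (not CKEditor 5's real classes).
// A path such as [1, 3] addresses the 4th node inside the document's 2nd
// root-level element.
type Path = number[];
type Range = { start: Path; end: Path };

type Operation =
  | { type: "insertText"; position: Path; text: string }
  | { type: "insertElement"; position: Path; name: string }
  | { type: "remove"; range: Range }
  | { type: "attribute"; range: Range; key: string; value: unknown }
  | { type: "rename"; position: Path; newName: string } // paragraph -> heading
  | { type: "split"; position: Path }                    // split an element in two
  | { type: "merge"; position: Path }                    // join two sibling elements
  | { type: "wrap"; range: Range; name: string }         // e.g. wrap in a blockQuote
  | { type: "unwrap"; position: Path }
  | { type: "marker"; name: string; range: Range };

// Because "split" is an explicit operation, a transform can recognise the
// user's intention: two concurrent splits of the same paragraph (scenario 2)
// can be resolved to a single split instead of three resulting paragraphs.
const splitHere: Operation = { type: "split", position: [0, 11] };
```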

Side note: We believe that the set of necessary operations is strongly connected to the semantics of the tree data that you are representing. A rich-text editor has a different nature than a genealogical tree and hence requires a different set of operations.

Further extensions

Adding the new operations still did not solve all the problems. We needed to extend our Operational Transformation implementation even further to handle the scenarios that we discovered over the years. Here are the most significant additions that we made:

  • The graveyard root – A special data tree root where removed nodes are moved, which enables better conflict resolution in scenarios when User A changes a part of data that is at the same time removed by User B (scenario (5) and similar); see the sketch after this list.
  • Generalizing operations to work on ranges instead of singular nodes for better processing and memory efficiency.
  • Operation breaking – Sometimes, when being transformed, an operation needs to be broken into two operations, for example when a part of the content was removed (scenario (5)).
  • Selective undo mechanisms – The undo feature needs to be aware of collaborative editing so that, for example, a user is able to undo only their own changes.
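Here is the sketch referenced in the first item above: a simplified picture (not CKEditor 5's internals) of how a remove can be modelled as a move into a hidden graveyard root, so concurrent operations and undo still have something to be transformed against:

```typescript
// Simplified sketch: "removing" content moves it into a hidden graveyard root
// instead of destroying it, so concurrent operations that target the removed
// nodes (or an undo of the removal) can still be transformed meaningfully.
interface Node { name: string; children: Node[]; }

const mainRoot: Node = { name: "$root", children: [] };
const graveyardRoot: Node = { name: "$graveyard", children: [] };

function remove(parent: Node, index: number): void {
  const [removed] = parent.children.splice(index, 1);
  // Keep the node alive in the graveyard so later operations can refer to it.
  graveyardRoot.children.push(removed);
}

function undoRemove(parent: Node, index: number): void {
  const restored = graveyardRoot.children.pop();
  if (restored) {
    parent.children.splice(index, 0, restored);
  }
}
```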

If you read up to this point, congratulations! 😃 In fact, we could write much more about every single thing mentioned in this article, but that would make it painfully long. If you are interested in a detailed overview of anything specific mentioned here, let us know in comments and we may create a separate article about it.

Real-time collaborative editing in CKEditor 5

So far, we have talked about implementing real-time collaborative editing in general. Those low-level topics were platform-agnostic, but there is also the second part of this big puzzle — the end-user features and the platform’s architecture that makes it possible to implement them.

Dedicated collaboration features

Apart from enabling the users to share and edit the same document simultaneously (you can test it live on https://ckeditor.com/collaborative-editing/), we implemented some dedicated collaboration features that make the users’ real-time collaborative editing experience as engaging as one would expect from a complete solution:

  • Comments feature – Adding comments in real time, as other users edit, to any selected part of content (commenting in “read-only mode” is supported, too).

  • Users’ selection feature – Visual highlights at exact places where other users are editing to further emphasize the collaboration aspect and help users navigate inside the edited document.

  • Presence list feature – Showing photos or avatars of users who are currently editing the document.

Figure 9. CKEditor 5 Collaboration Features.

Support for rich-text editing features

Our editing framework is built to support all rich-text editor features in collaboration mode: from simple ones like text styling, through image drag and drop and captioning, to complex ones like undo and redo, nested lists or tables.

Since the mechanisms used in real-time collaborative editing lie at the very foundation of CKEditor 5 Framework, any new feature added to the rich-text editor will also be available in collaboration mode.

Support for third-party plugins

The editor is usually just a component of a bigger platform or application, so we needed to design its architecture to be flexible and easily extensible. Your custom features need to be supported in a collaborative environment just as well as the core ones. If you need to develop your own piece of editor functionality, there is a high chance that you will not need to write even a single line of code to enable it for collaboration.

Developing features for collaborative editing with CKEditor 5 Framework is easy thanks to the following advantages:

1. Data abstraction (model-view-controller architecture).

The editor content (the data) is abstracted from the view and from the DOM (the browser’s content representation). This brings an important benefit: abstract data is much easier to operate on. A content element (for example, an image widget) can be represented as one element in the data model, instead of a few (as it is in the DOM or HTML). Thanks to that, the feature code can become much simpler.

2. Single entry point for changes.

Every change performed on the editor data, internally, always results in creating one or multiple operations. Operations are atomic data objects describing the change. These are then used to synchronize data between collaborating clients. As a result, one might say that it is impossible to write a CKEditor 5 feature without it being supported in real-time collaborative editing.

3. Simple API built on a powerful foundation.

All the mechanisms responsible for the magic are hidden from the developer. Instead, we provide an API resembling what you are already used to. Changing the data tree is easy thanks to intuitive methods that perform actions which are then translated into operations behind the scenes.
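For example, inserting text at the user's current selection looks roughly like the snippet below. It is based on CKEditor 5's public model API as we understand it and is meant as a sketch rather than copy-paste documentation; the loose typing of `editor` is just to keep the example short.

```typescript
// Assumes an already initialized CKEditor 5 `editor` instance.
// All changes go through model.change(), so the writer calls below are turned
// into operations that are applied locally and synchronized with other clients.
function insertAtSelection(editor: any, text: string): void {
  editor.model.change((writer: any) => {
    const position = editor.model.document.selection.getFirstPosition();
    writer.insertText(text, position);
  });
}
```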

4. Data conversion decoupled from data synchronization.

After the editor data model is changed, the changes are converted to the editor view (a custom, DOM-like data structure) and then rendered to the real DOM. The important thing is that only the editor data is synchronised — the conversion is done on every client independently. This means that even a complicated feature, if represented by an easy abstraction, is still easily supported in the collaborative environment.

5. Markers.

Markers are ranges (“selections”) on content that are trackable and automatically kept in sync while the data tree is being changed — also during collaboration. Thanks to them, creating features like user selection highlights or text comments is a breeze.
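A comment highlight, for instance, can be expressed as a marker spanning the current selection. The snippet below follows CKEditor 5's public marker API as we understand it; the `comment:<id>` naming scheme is our own convention for the example.

```typescript
// Adds a marker spanning the current selection. `usingOperation: true` means
// the marker change is expressed as an operation, so it is synchronized to
// other clients and survives concurrent edits around it.
function addCommentMarker(editor: any, commentId: string): void {
  editor.model.change((writer: any) => {
    const range = editor.model.document.selection.getFirstRange();
    writer.addMarker(`comment:${commentId}`, { range, usingOperation: true });
  });
}
```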

6. Post-fixers.

Post-fixers are callbacks which are called after the editor data changes. They are not exclusive to collaboration but can be used to fix the editor model if your feature is complicated.

Figure 10. A sample collaboration scenario with post-fixer correcting the table.
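As a sketch of the mechanism (the element names follow CKEditor 5's table feature, but the rule itself is hypothetical), the post-fixer below makes sure every table keeps at least one row after any change, including changes received from other clients:

```typescript
// Registers a post-fixer that runs after every change to the model.
// Returning true tells the editor that the model was modified, so other
// post-fixers get a chance to run again. Treat this as a sketch rather than
// production code.
function ensureTablesHaveARow(editor: any): void {
  editor.model.document.registerPostFixer((writer: any) => {
    let wasFixed = false;
    const root = editor.model.document.getRoot();

    for (const element of root.getChildren()) {
      if (element.is("element", "table") && element.childCount === 0) {
        const row = writer.createElement("tableRow");
        writer.append(writer.createElement("tableCell"), row);
        writer.append(row, element);
        wasFixed = true;
      }
    }

    return wasFixed;
  });
}
```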

Real-time collaboration backend

Real-time collaboration requires a server (backend) to propagate changes between connected clients. Such a server also offers additional benefits:

  • Your changes will not be lost if you accidentally close the document. A temporary backup in the cloud will always be available.
  • Your changes will be propagated to other connected users even if you temporarily lose your internet connection.

We have implemented the backend as a SaaS solution ready for zero-effort instant integration with your application. However, if for various reasons you cannot use a cloud solution, an on-premise version of the collaboration server is also available.

We spent significant time and effort on designing and implementing a highly optimized client-server communication protocol for real-time collaboration. We plan to talk more about some optimizations we worked on recently in another article (to be published soon).

What’s next

Apart from constantly adjusting and optimizing the real-time collaboration algorithms, we plan to introduce more features that will bring the ultimate collaborative editing experience to the CKEditor 5 Ecosystem. We have already started prototyping and preparing the architecture for them:

  • Suggestion mode (aka track changes) – Add your changes as suggestions to be reviewed later.
  • Mentions feature – Configurable autocompleting helper, providing a way to quickly insert and link names or phrases.
  • Versioning and diffing – Save versions of your document and compare them.

Summary

We started building our next generation rich-text editor with the assumption that real-time collaborative editing must be the core feature that lies at its very foundation — and this meant a rewrite from scratch. After a lengthy research and development phase we created an Operational Transformation implementation, extended to support tree-based data structures (rich-text content) for advanced conflict resolution. The successful implementation of the CKEditor 5 Framework collaboration-ready architecture was validated by working solutions from the CKEditor Ecosystem: CKEditor 5, CKEditor 5 Collaboration Features and Letters.

Behind the scenes, the implementation of it all took a lot of effort (which, frankly speaking, exceeded our initial estimates by a factor of 2… 😃). Here are some numbers about the project to give you more perspective:

  • The number of tickets closed: 5700
  • The number of tests: 12500
  • Code coverage: 100%
  • Development team: 25+
  • Estimated effort: 42 man-years (until September 2018), including time spent on writing tools to support the project like mgit and Umberto (the documentation generator used to build the project documentation)

We hope you enjoyed reading the article. If you would like to read more about anything specific related to real-time collaboration or CKEditor 5, let us know in comments.

If you would like to play with the final result of our work, check https://ckeditor.com/collaborative-editing/.
