Channel: Hacker News

A Comparison of Automatic Speech Recognition (ASR) Systems


Back in March 2016 I wrote Semi-automated podcast transcription about my interest in finding ways to make archives of podcast content more accessible. Please read that post for details of my motivations and goals.

Some 11 months later, in February 2017, I wrote Comparing Transcriptions describing how I was exploring ways to measure transcription accuracy. That turned out to be trickier, and more interesting, than I’d expected. Please read that post for details of the methods I’m using and what the WER (word error rate) score means.

Here, after another over-long gap, I’m returning to post the current results, and start thinking about next steps. One cause of the delay has been that whenever I returned to the topic there had been significant changes in at least one of the results, most recently when Google announced their enhanced models. In the end the delay turned out to be helpful.

The table below shows the results of my tests on many automated speech recognition services, ordered by WER score (lower is better). I’ll note a major caveat up front: I only used a single audio file for these tests, an almost two-hour interview in English between two North American males with no strong accents and good audio quality. I can’t be sure how the results would differ for female voices, more accented voices, lower audio quality etc. I plan to retest the top tier services with at least one other file in due course.

You can’t beat a human, at least not yet. All the human services scored between 4 and 6. I described them in my previous post, so I won’t dwell on them here.

| Service | WER | Punctuation ( . / , / ? / names ) | Timing | Other Features | Approx Cost (not bulk) |
| --- | --- | --- | --- | --- | --- |
| Human (3PlayMedia) | 4.5 | 1261/1470/76/1064 | | | $3/min |
| Human (Voicebase) | 4.6 | 1090/1626/57/1056 | | | $1.5/min |
| Human (Scribie) | 5.1 | 923/1450/49/1153 | | | $0.75/min |
| Human (Volunteer) | 5.3 | 840/1748/60/1208 | | | Goodwill |
| Google Speech-to-Text (video model, not enhanced) | 10.7 | 792/421/29/1238 | Words | C, A, V | $0.048/min |
| Otter AI | 11.5 | 786/1166/35/1030 | Pgfs | E, S | Free up to 600 mins/month |
| Spex | 11.8 | 1813/369/30/1263 | Lines | E | $0.35/min |
| Go-Transcribe | 12.1 | 979/0/0/922 | Pgfs | E | $0.22/min |
| SimonSays | 12.2 | 941/0/0/893 | Lines | E, S | $0.17/min |
| Trint | 12.3 | 968/0/0/894 | Lines | E | $0.33/min |
| Speechmatics | 12.3 | 955/0/0/929 | Words | S, C | $0.08/min |
| Sonix | 12.3 | 943/0/0/900 | Lines | D, S, E | $0.083/min + $15/month |
| Temi | 12.5 | 915/1329/51/862 | Pgfs | S, E | $0.10/min |
| TranscribeMe | 12.9 | 1203/0/63/836 | Lines | | $0.25/min |
| Scribie ASR | 12.9 | 970/1307/48/973 | None | E | Currently free |
| YouTube Captions | 15.0 | 0/0/0/1075 | Lines | S | Currently free |
| Voicebase | 16.6 | 116/0/0/1119 | Lines | E, V | $0.02/min |
| AWS Transcribe | 22.2 | 772/0/85/67 | Words | S, C, A, V | $0.02/min |
| Vocapia VoxSigma | 23.6 | 771/599/0/931 | Words | S, C | $0.02/min approx |
| IBM Watson | 25.2 | 11/0/0/896 | Words | C, A, V | $0.02/min |
| Dragon +vocabulary | 25.3 | 9/7/0/967 | None | | Free + €300 for app |
| Deepgram | 27.9 | 715/1262/52/443 | Pgfs | S, E | $0.0183 |
| SpokenData | 36.5 | 1457/0/0/680 | Words | S, E | $0.12/min |
  • WER: Word error rate (lower is better).
  • Punctuation: Number of sentences / commas / question marks / capital letters not at the start of a sentence (a rough proxy for proper nouns).
  • Timing: Approximate highest timing precision: Words typically means a data format like JSON or XML with timing information for each word, Lines typically means a subtitle format like SRT, and Pgfs (paragraphs) means something lower precision, such as per-paragraph timing.
  • Other Features: E=online editor, S=speaker identification (diarisation), A=suggested alternatives, C=confidence score, V=custom vocabulary (not used in these tests).
  • Approx Cost: base cost, before any bulk discount, in USD.
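For concreteness, the WER used above is the word-level Levenshtein distance (substitutions + deletions + insertions) divided by the length of the reference transcript, expressed as a percentage. A minimal sketch in Python (the function name is mine, and real transcripts need normalising for case and punctuation before comparison):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate, as a percentage: word-level edit distance
    (substitutions + deletions + insertions) over reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        return 0.0
    # dp[i][j] = edit distance between the first i ref words and first j hyp words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + sub)  # substitution / match
    return 100.0 * dp[len(ref)][len(hyp)] / len(ref)
```

So one wrong word in a four-word reference scores 25.0, regardless of whether it was substituted, dropped, or an extra word was inserted.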

Note the clustering of WER scores. After the human services at 4–6, the top-tier ASR services all score 10–16, with most around 12. The next tier scores roughly double: 22–28. It seems likely that the top-tier systems are using more modern technology.

For my goals I prioritise these features:

  • Accuracy is a priority, naturally, so most systems in the top-tier would do.
  • A custom vocabulary would further improve accuracy.
  • Cost. Clearly $0.02/min is much more attractive than $0.33/min when there are hundreds of hours of archives to transcribe. (I’m ignoring bulk discounts for now.)
  • Word level timing enables accurate linking to audio segments and helps enable comparison/merging of transcripts from multiple sources (such as taking punctuation from one transcript and applying it to another).
  • Good punctuation reduces the manual review effort required to polish the automated transcript into something pleasantly readable. Recognition of questions would also help with topic segmentation.
  • Speaker identification would also help identify questions and enable multiple ‘timelines’ to help resolve transcripts where there’s cross-talk.
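To illustrate the word-level timing point: given (word, start, end) tuples from a Words-precision service, regenerating subtitle cues at any granularity, or linking any word back to its audio offset, is a few lines of work. A rough sketch (the function names and the fixed words-per-cue grouping are mine):

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, e.g. 61.5 -> '00:01:01,500'."""
    h = int(seconds // 3600)
    m = int(seconds % 3600 // 60)
    s = seconds % 60
    return f"{h:02d}:{m:02d}:{s:06.3f}".replace(".", ",")

def words_to_srt(words, max_words=8):
    """Group (word, start_sec, end_sec) tuples into numbered SRT cues."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start, end = chunk[0][1], chunk[-1][2]
        text = " ".join(w for w, _, _ in chunk)
        cues.append(f"{len(cues) + 1}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}")
    return "\n\n".join(cues)
```

The same tuples would support the transcript-merging idea: align two word streams by time, then copy punctuation decisions from one onto the other.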

Before Google released their updated Speech-to-Text service in April there wasn’t a clear winner for me. Now there is. Their new video premium model is significantly better than anything else I’ve tested.

I also tested their enhanced models a few weeks after I initially posted this. They didn’t help for my test file. I also tried setting interactionType and industryNaicsCodeOfAudio in the recognition metadata of the video model, but that made the WER slightly worse. Perhaps they will improve over time.

Punctuation is clearly subjective but both Temi and Scribie get much closer than Google to the number of question marks and commas used by the human transcribers. Google did very well on capital letters though (a rough proxy for proper nouns).

I think we’ll see a growing ecosystem of tools and services using Google Speech-to-Text service as a backend. The Descript app is an interesting example.

Differential Analysis

While working on Comparing Transcriptions I’d realized that comparing transcripts from multiple services is a good way to find errors because they tend to make different mistakes.

So for this post I also compared most of the top-tier services against one another, i.e. using the transcript from one as the ‘ground truth’ for scoring others. A higher WER score in this test is good. It means the services are making different mistakes and those differences would highlight errors.

Google, Otter AI, Temi, Voicebase, Scribie, and TranscribeMe all scored a high WER, over 10, against all the others. Go-Transcribe vs Speechmatics had a WER of 6.1. SimonSays had a WER of 5.2 against Sonix, Trint, and Speechmatics. Trint, Sonix, and Speechmatics have very little difference between the transcripts, a WER of just 1.4. That suggests those three services are using very similar models and training data.
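This cross-scoring is easy to mechanise: align two transcripts word by word and keep the spans where they disagree, which are exactly the places worth highlighting for review. A self-contained sketch using Python’s difflib (the function name is mine):

```python
import difflib

def disagreements(transcript_a: str, transcript_b: str):
    """Return (a_span, b_span) pairs where two transcripts diverge.
    Divergent spans are likely error sites in one transcript or the other."""
    a, b = transcript_a.split(), transcript_b.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [(" ".join(a[i1:i2]), " ".join(b[j1:j2]))
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]
```

Run over two near-identical transcripts like Trint’s and Sonix’s, this would return very few spans; run over two top-tier services with independent errors, it becomes a review worklist.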

What Next?

My primary goal is to get the transcripts available and searchable, so the next phase would be developing a simple process to transcribe each podcast and convert the result into web pages. That much seems straightforward using the Google Speech-to-Text API. Then there’s working with the podcast host to integrate with their website, style, menus etc.
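For the transcription step, the word timings needed for those web pages come back in the recognition response. Assuming the Speech-to-Text v1 JSON shape (results[].alternatives[].words[], with times serialised as strings like "1.500s" when word time offsets are requested), flattening it might look like this sketch:

```python
def words_from_response(response: dict):
    """Flatten a Speech-to-Text v1 JSON response into (word, start_sec, end_sec)
    tuples. Assumes word time offsets were enabled in the request."""
    words = []
    for result in response.get("results", []):
        top = result["alternatives"][0]  # highest-confidence alternative
        for w in top.get("words", []):
            words.append((w["word"],
                          float(w["startTime"].rstrip("s")),
                          float(w["endTime"].rstrip("s"))))
    return words
```

From there, generating HTML that links each word to its audio offset is plain templating.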

After that the steps are fuzzier. I’ll be crossing the river by feeling the stones…

The automated transcripts will naturally have errors that people will notice (and more that they won’t). To improve the quality it’s important to make it very easy for them to contribute corrections. Being able to listen to the corresponding section of audio would be a great help. All that will require a web-based user interface backed by a service and a suitable data model.

The suggested corrections will need reviewing and merging. That will require its own low-friction workflow. I have a vague notion of using GitHub for this.

Generating transcripts from at least one other service would provide a way to highlight possible errors, in both words and punctuation. Those highlights would be useful for readers and also encourage the contribution of corrections. Otter AI, Speechmatics and Voicebase are attractive low-cost options for these extra transcriptions, as are any contributed by volunteers. This kind of multi-transcription functionality has significant implications for the data model.

I’d like to directly support translations of the transcriptions. The original transcription is a moving target as corrections are submitted over time, so the translations would need to track corrections applied to the original transcription since the translation was created. Translators are also very likely to notice errors in the original, especially if they’re working from the audio.

Before getting into any design or development work, beyond the basic transcriptions, I’d want to do another round of due-diligence research, looking for what services and open source projects might be useful components or form good foundations. Amara springs to mind. If you know of any existing projects or services that may be relevant please add a comment or let me know in some other way.

I’m not sure when, or even if, I’ll have any further updates on this hobby project. If you’re interested in helping out feel free to email me.

I hope you’ve found my rambling explorations interesting.

Updates:

  • 25th May 2018: Updated SimonSays.ai with much improved score
  • 10th June 2018: Updated notes about Google enhanced model (not helping WER score).
  • 8th September 2018: Added Otter AI, prompted by a note in a blog post by Descript comparing ASR systems.
  • 10th September 2018: Emphasised that I only used a single audio file for these tests. Noted that Otter.ai is free up to 600 mins/month.
  • 14th September 2018: Added Spex.

Portrait of Terence Tao (2006) [video]


Published Aug 2, 2018

The Heidelberg Laureate Forum Foundation presents the HLF Portraits: Terence Tao; Fields Medal, 2006

Recipients of the ACM A.M. Turing Award, the Abel Prize and Fields Medal in discussion with Marc Pachter, Director Emeritus National Portrait Gallery, Smithsonian Institution, about their lives, their research, their careers and the circumstances that led to the awards. Video interviews produced for the Heidelberg Laureate Forum Foundation by the Berlin photographer Peter Badge.

Background:

The Heidelberg Laureate Forum Foundation (HLFF) annually organizes the Heidelberg Laureate Forum (HLF), which is a networking event for mathematicians and computer scientists from all over the world. The HLFF was established and is funded by the German foundation the Klaus Tschira Stiftung (KTS), which promotes natural sciences, mathematics and computer science. The HLF is strongly supported by the award-granting institutions, the Association for Computing Machinery (ACM: ACM A.M. Turing Award, ACM Prize in Computing), the International Mathematical Union (IMU: Fields Medal, Nevanlinna Prize), and the Norwegian Academy of Science and Letters (DNVA: Abel Prize). The Scientific Partners of the HLFF are the Heidelberg Institute for Theoretical Studies (HITS) and Heidelberg University.

More information about the Heidelberg Laureate Forum:

Website: http://www.heidelberg-laureate-forum....
Facebook: https://www.facebook.com/HeidelbergLa...
Twitter: https://twitter.com/hlforum
Flickr: https://www.flickr.com/hlforum
More videos from the HLF: https://www.youtube.com/user/Laureate...
Blog: https://scilogs.spektrum.de/hlf/

Crypto Market Has Bottomed (According to Novogratz)


Billionaire Michael Novogratz believes the bottom is in for the cryptocurrency market this year — but is his call premature?

Novogratz is a former hedge fund manager for investment firm Fortress Investment Group and was a partner at financial giant Goldman Sachs. He now spends his time making waves in the cryptocurrency waters as the CEO of Galaxy Investment Partners, a cryptocurrency investment firm.

“I think we put in a low yesterday,” Novogratz tweeted on September 13 — while also noting that the Bloomberg Galaxy Crypto Index “retouched the highs of late last year and the point of acceleration that led to the massive rally/bubble.” The index in question measures the performance of the largest digital currencies traded in dollars, as noted by CNBC.

Jumping the Gun?

Novogratz has been notably bullish on cryptocurrencies throughout 2018 — which is to be expected, given that he is the CEO of a cryptocurrency investment firm. However, his call that the bottom is in could very well be premature.

Bitcoin is in the throes of a bear market — a fact which cannot be denied. Every rally this year has put in a lower high, while trading volume continues to decrease. Trading volume also goes out of the cryptocurrency market as soon as it goes in, as evidenced by the market leader’s latest cascading selloff.


Furthermore, Bitcoin is in very real danger of collapsing through its key support level of $6,000.


Patience is a Virtue

Nevertheless, Novogratz believes cryptocurrencies will bounce back. To be fair, they probably will — at least, Bitcoin will. “Markets like to retrace to the breakout,” he noted in his tweet. “We retraced the whole of the bubble.”

When the seemingly inevitable bounce back occurs, it will most likely come via the market leader. Bitcoin has seen its dominance increase significantly in 2018, and it’s within the realm of possibility we could see the first and foremost cryptocurrency once again claim 70 or 80 percent of the total market.

Do you think the bottom is in for Bitcoin and/or the cryptocurrency market? Let us know your thoughts in the comments below! 


Images courtesy of Shutterstock, Twitter/@novogratz, blockchain.info, TradingView.

Mylk Guys (YC S18) Is Hiring Our 2nd Full Stack Engineer

The current food system isn’t working - not for our health and not for our planet. Vegan food is one way to improve the food system. We are a grocery startup making it easier for folks to eat more plant-based food.

Along with solving the hard logistics and operations challenges of building a grocery store, we are also innovating on the customer experience (e.g. creating a community of food lovers, chat with a human for recommendations etc.), using data to build our catalog and rethinking a grocery store from first principles.

Join us as a full stack engineer if one of these two strongly resonates with you:

  • You love crafting beautiful front end experiences and love talking to customers.
  • You love operational challenges and process improvements that save $$s.

In either case, you are obsessed with data. You have experience designing and building large and complex (yet maintainable) systems, and you should be able to do so in about one-third the time that most competent people think possible. Expect talented, motivated, intense and interesting co-workers.

We are extremely transparent, communicate well, own our pieces and move fast. We are a team of ex-Instacart and ex-Amazon people, looking to build a strong culture, a serious business and work with the best. We are in SF, and are backed by top investors including YC & Khosla Ventures.

Our Stack:

Backend: Ruby / Rails API / AWS / Algolia
Frontend: Javascript / React / Redux / Saga

We promise to thoroughly review every application, so we appreciate you putting in the effort. Alternatively, if you are more of a "let's talk about it" person, just shoot us a quick note and we'll pick up the phone and call you!

If not you, then who? jobs@mylkguys.com

Facebook Bowler: Safe code refactoring for modern Python


Bowler

Safe code refactoring for modern Python projects.


Overview

Bowler is a refactoring tool for manipulating Python at the syntax tree level. It enables safe, large scale code modifications while guaranteeing that the resulting code compiles and runs. It provides both a simple command line interface and a fluent API in Python for generating complex code modifications in code.

Bowler uses a "fluent" Query API to build refactoring scripts through a series of selectors, filters, and modifiers. Many simple modifications are already possible using the existing API, but you can also provide custom selectors, filters, and modifiers as needed to build more complex or custom refactorings. See the Query Reference for more details.

Using the query API to rename a single function, and generate an interactive diff from the results, would look something like this:

query = (
    Query(<paths to modify>)
    .select_function("old_name")
    .rename("new_name")
    .diff(interactive=True)
)

For more details or documentation, check out https://pybowler.io

Installing Bowler

Bowler supports modifications to code from any version of Python 2 or 3, but it requires Python 3.6 or higher to run. Bowler can be easily installed using most common Python packaging tools. We recommend installing the latest stable release from PyPI with pip:

pip install bowler

You can also install a development version from source by checking out the Git repo:

git clone https://github.com/facebookincubator/bowler
cd bowler
python setup.py install

License

Bowler is MIT licensed, as found in the LICENSE file.

Burning Man's Mathematical Underbelly


Does your hometown have any mathematical tourist attractions such as statues, plaques, graves, the cafe where the famous conjecture was made, the desk where the famous initials are scratched, birthplaces, houses, or memorials? Have you encountered a mathematical sight on your travels? If so, we invite you to submit an essay to this column. Be sure to include a picture, a description of its mathematical significance, and either a map or directions so that others may follow in your tracks.

A math degree can take you to a lot of places, both physically and figuratively, and if you play your cards right, you too can argue counterfactual definiteness with a shaman. First in 2008, and several times since, a fellow math PhD and I traveled to the Burning Man art festival to sit in the desert and talk with the locals about whatever they happened to be curious about.

Burning Man began in 1986, when a group of people (who would argue endlessly over any finite list of their names) decided to assemble annually on a San Francisco beach and burn a wooden human effigy. In 1990, increasing membership and a lack of fire permits forced Burning Man to combine with Zone #4, a ‘‘Dadaist temporary autonomous zone’’ piloted by the Cacophony Society, in the Black Rock Desert, 110 miles outside of Reno, Nevada.

As of 2017, the festival has expanded to a modest 70,000 people. For the week it exists, Black Rock City (the name for the physical infrastructure of the festival) is the sixth largest city in Nevada. At first blush, it may seem a little audacious to call a festival a ‘‘city,’’ but by most definitions of the word, Black Rock City qualifies, supplying sanitation, roads, lighting (by the Lamplighter’s Guild), police, emergency services (including fire, of course), and even a Department of Mutant Vehicles.

The Black Rock Desert is ideal for fire-based art. As an alkali flat, there is nothing to burn and nothing to break. The ground is a flat expanse of white powder, reverently monikered the ‘‘Playa.’’ It is a rare moment of relief that you’re not aware of the dust on everything and everyone. The Playa is simultaneously a blank canvas for sculpture and a stunning panorama, and being completely fireproof gives artists a little more leeway than they might have in the Guggenheim. The same blank slate applies to Black Rock City as a whole. Not constrained by physical impediments, like being in a convention center or civilization in general, Black Rock City is free to follow mathematical ideals of organization.

Most cities are roughly arranged on grids, but obstructed by rivers, topography, or politics, they rarely achieve graph-paper perfection. The Playa, on the other hand, has no obstructions of any kind. Starting with an empty desert, the festival is built in about a month, and most returning visitors would be surprised to learn that it’s never in the same place twice. Such are the advantages of Black Rock Desert.

Instead of using stodgy Cartesian coordinates, Black Rock City is organized along polar coordinates, making its coordinate system unique among massive fire-themed desert art festivals. At the mathematical origin is ‘‘the Man,’’ the titular carry-through from Burning Man’s historical origins. The angular coordinate is described using time on a clock face, with the city stretching from 2:00 to 10:00, and radially from R = 0.5 miles to R = 1 mile. The innermost ring of the city is ‘‘Esplanade,’’ with each street as you move radially outward given a name beginning with a sequential letter of the alphabet. For example, in 2017 the names were Esplanade, Awe, Breath, Ceremony, Dance, Eulogy, Fire, Genuflect, Hallowed, Inspirit, Juju, Kundalini, and Lustrate. Radial and angular locations in the city are specified using a letter and a time. For example, you might describe your camp as being at ‘‘D and 7:30’’. The large art installations are harder to pinpoint, since they’re scattered throughout the center of the ring and out into the ‘‘Deep Playa,’’ beyond the 10:00–2:00 gap where there are no street signs. ‘‘Center Camp,’’ where all of the official operations, bureaucracy, and public services are located, is a secondary set of ringed roads located centrally at 6:00.
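As an aside for the mathematically inclined, the polar layout makes addresses easy to compute with. A toy conversion from a clock-face address to Cartesian coordinates, taking the Man as the origin and 12:00 as the +y direction (the function name and conventions are mine, not anything official):

```python
import math

def brc_to_xy(radius_miles: float, clock: str):
    """Convert a Black Rock City address ('D and 7:30' style, with the street
    letter already turned into a radius) to (x, y) in miles. The clock hand
    sweeps clockwise from 12:00, so 3:00 lies along +x and 6:00 along -y."""
    h, m = map(int, clock.split(":"))
    theta = math.radians((h % 12 + m / 60) * 30)  # 12 hours -> 360 degrees
    return radius_miles * math.sin(theta), radius_miles * math.cos(theta)
```

So Center Camp at 6:00 sits straight down the negative y-axis from the Man, and the 3:00 and 9:00 plazas sit on the positive and negative x-axis.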

The sudden addition of a new city to the Nevada landscape doesn’t go unnoticed. Although there’s always some grumbling whenever 70,000 people suddenly show up anywhere, capitalism has a way of bringing Burners and the citizens of Reno together. The demographic inside local Wal-Marts shifts precipitously in the days leading up to Burning Man as bikes, water, and food are stripped from the shelves. On the way out, many of the tens of thousands of visitors want a dust-free meal, and all of them want a shower (although some amount of playa dust inevitably makes it onto outbound airplanes).

Burning Man has a ‘‘gift economy,’’ so once you’ve left Reno and entered Burning Man proper, money ceases to have worth. In a barter economy you exchange goods for other goods. In a gift economy you give without the expectation of any return. It ‘‘works’’ because everyone else (or at least a larger fraction than you might expect) is doing the same. The gift economy even covers public transportation; if you see a mutant vehicle with enough space for a person, then you can jump on. I wouldn’t trust it to handle the housing market, but it did supply pancakes three days in a row from three different locations.

Against this backdrop, my good friend and office-mate, Spencer, and I headed out and set up an ‘‘Ask a Mathematician / Ask a Physicist’’ booth to discuss and hopefully answer whatever questions might come our way. We wanted to keep up our half of the gift economy contract, but giving things away is expensive and it’s difficult to find a productive application for Banach spaces and character classes in the middle of nowhere. Our one big skill is talking about math and science, so that’s exactly what we did.

The very first two questions, ‘‘How do I find the love of my life?’’ and ‘‘What is spacetime made of?’’ set the bar for variety and difficulty. With a big enough shoehorn, you can turn anything into math, so we reduced the first question to finding the best of N options, where you only meet the ‘‘options’’ sequentially and can only say yes or no to each. In other words, by ignoring all the nuance of romance we had reduced finding the love of one’s life to the ‘‘fussy suitor problem’’ (more commonly known as the ‘‘secretary problem’’). The optimal solution is to pick a value N, date N/e people (e ≈ 2.718, so this is a little more than a third of the total potential love interests), and then marry the first person you like better than anyone in that first group. But after a few more people joined the discussion, the answer evolved until we eventually settled on this: interact with lots of people, be patient and kind, and what’s wrong with flowers? It may not have a solid mathematical backing, but I bet it works better.
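The N/e rule is easy to sanity-check by simulation; its success probability should hover near 1/e ≈ 0.368. An illustrative sketch (all names are mine, and it ignores every romantic nuance, just as we did at first):

```python
import math
import random

def simulate_fussy_suitor(n: int, trials: int = 20000, seed: int = 1) -> float:
    """Estimate P(choosing the single best of n candidates) under the rule:
    reject the first n/e candidates, then take the first one better than
    everyone rejected so far (or settle for the last candidate)."""
    rng = random.Random(seed)
    cutoff = round(n / math.e)
    wins = 0
    for _ in range(trials):
        ranks = rng.sample(range(n), n)  # random arrival order; n - 1 is the best
        best_rejected = max(ranks[:cutoff], default=-1)
        chosen = next((r for r in ranks[cutoff:] if r > best_rejected), ranks[-1])
        wins += chosen == n - 1
    return wins / trials
```

For twenty candidates the simulated success rate comes out near 0.37, matching the theory: any other cutoff does worse.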

‘‘What is spacetime made of?’’ was actually a lot easier to talk about, but we were surprised by how fast it became philosophical and mathematical. Time and space come up a lot, and we’ve found that the simplest, most solid, and universally disappointing definitions for them are that time is what clocks measure and space is what rulers measure. On that first day, those definitions went over well, but then the conversation swerved into the relationship between the two. Then into how they’re affected by mass and energy. Then into what that says about space and time. There are hand-wavy ways to talk about these things, but no one was satisfied until we had gotten neck deep in spacetime intervals (the spacetime notion of distance, s² = x² + y² + z² - (ct)²) and parallel transports (a method for describing the curvature of space), and surrounded the booth with equations and diagrams in the dust. The next year we replaced the dust with a whiteboard, and the year after that (after we learned what playa dust does to whiteboards) with a blackboard.

We weren’t always so lucky. One year, before we’d even finished setting up the booth, a cadre of MIT undergraduate physicists stumped us with ‘‘Soap bubbles physically solve minimal surface problems. Are there any other physical phenomena that quickly solve NP-type problems?’’

We had initially assumed that public enthusiasm for physics and especially math would be lukewarm. We expected to get a handful of straightforward, easy questions about fractions and outer space, then break for lunch. Instead, we were a lightning rod for fascinating, wide-ranging, precise inquiries. Rather than talking to rare individuals, we found ourselves holding court with a dozen people at a time, fielding multiple new questions even as conversations about previous questions ruminated around the group and resurfaced. After the first year we invested in a bigger tent.

For every ‘‘How fast is Earth moving?’’ there was a profound debate into the nature of understanding itself followed by a delving inquiry into quantum theory. I found that detail especially surprising. My own research is in quantum information theory, which I was certain would not come up in a tiny tent in the middle of the desert. But it turns out that the big questions that drove me into my field, about entanglement, Schrödinger’s cat, and the endless train of other bizarre things quantum theory implies about our world, are nothing special. A shocking number of self-identified non-scientists have the same questions, and if you advertise yourself as ‘‘a physicist,’’ you’ll hear them. Although I’ll admit to a certain amount of if-you-have-a-hammer-every-question-is-a-nail-ness, it’s surprising how frequently the double-slit experiment (a beautifully weird keystone experiment in quantum physics) meanders into the conversation.

The simplest and often most profound questions tended to come from professional scientists. A researcher from the satellite division at Boeing sparked a heated debate about the relationship between prior probabilities and human bias when he asked, ‘‘Why do weird things happen so much?’’ One innocent question, ‘‘Is there such a thing as a one-half-dimensional space?’’ split the assembled guests into tribes on what ‘‘space’’ meant in this context. On one side were people who thought that ‘‘fractals with dimension one-half’’ was a reasonable way to talk about such a space, and for them, the answer was yes. Another faction believed that ‘‘space’’ clearly means ‘‘a space you could imagine walking around in,’’ and for them the answer was no. More remarkable than the politeness of the debate was how well informed it was and how quickly new arrivals came up to speed.

In retrospect, we shouldn’t have been surprised. While all of the art on the Playa is eye-catching, a lot of it is extremely technical as well: lasers that continuously outline your shadow, sensors that sync the Man’s heartbeat with your own, huge kinetic sculptures, interactive digital art (which is either highly random or unobviously complex), and on and on. A moment’s thought would have revealed that there must be an army of scientists and engineers behind the scenes. Fortunately for us, through our booth we were rapidly introduced to the healthy math, science, and maker communities of Black Rock City. There’s even a ‘‘Math Camp,’’ built in and around a second-order Sierpinski gasket the size of a truck and located (we were absolutely thrilled to learn) at E and 3:14. There they have guest lecturers, an ‘‘open problems’’ board, and tequila shots.

Once you’re aware of the nerdy undercurrent, it’s hard to miss. While walking around in the ‘‘Deep Playa’’ (a long walk from Black Rock City proper) and looking at art, Spencer was struggling to explain the idea behind wiki software to me. The one other person present, a man wearing sunglasses, feathers, and body paint to the strict exclusion of all else, volunteered ‘‘You know, my engineering firm uses wiki software all the time!’’ He really cleared up a lot of confusion, but in any other context that would have been an unusual conversation.

The people attending Burning Man come from all over the world, with wildly divergent personal histories. So as often as not, the festival itself is the only common experience two strangers may have. Fortunately, that common experience carries a lot with it. For example, when a question about complex numbers comes up (it always happens at least once), you can get everybody onto the same page by talking about the complex plane in terms of the layout of Black Rock City: the Man is at zero, Center Camp (Esplanade and 6:00) is at -i, the main entrance is around -2i, the 3:00 and 9:00 plazas are at ±1, and the Temple (a nondenominational, pan-belief, transient, and intentionally flammable structure) is at +i.

And when all else fails, there are very few people at Burning Man who aren’t at least a little enthusiastic about open fires. A detailed conversation about momentum and gyroscopic forces is normally difficult to follow, but if there happens to be a helpful fire dancer nearby, suddenly the theoretical becomes empirical and arguably very difficult to ignore. A fire dancer’s weapons of choice are typically poi: flaming weights at the ends of a pair of chains. Changing a poi’s plane of rotation, without that plane intersecting yourself, requires some subtle angular momentum exchange, and fire dancers, while usually not familiar with the exact vocabulary, are always more than happy to demonstrate. Who can’t learn about torque if the option is getting burned (a little)?

Most of the people who show up at the booth don’t explicitly have questions to ask. At least not at first. For most people the barrier to even beginning a conversation about science and math is way too high. Of course, everyone has things they wonder about, but as any teacher can tell you, most of us are also worried about sounding stupid. It takes listening for a little while to realize that science is something that can be understood, scientists aren’t unusually smart, and there’s no need to impress them. It warms your heart to listen to what other people wonder about and to hear how they think, and nothing makes a science conversation more inclusive than a scientist who’s deeply bothered by exactly the same things that bother you.

We were so impressed by the enthusiasm behind the questions, both from the asking itself and the inevitable ‘‘Oh yeah, I’ve always wondered that too!’’ that afflicts practically everyone in earshot, that we set up the booth back in New York a few times as well as online in the form of http://askamathematician.com. We’re still surprised. Ten years on and the questions keep coming.

Reprinted with permission from the Mathematical Intelligencer

In the Balance


When you first take the Artifact, you will see a vision of ALPHANION, Demon-Sultan of the Domain of Order, who appears as a grid of spheres connected by luminous lines. Alphanion will urge you to use the Artifact to enforce cosmic order, law at its most fundamental. He will show you visions of all the most brutal and sadistic crimes of history, of all the wars caused by nations that could not live together in harmony, and he will tell you they are all preventable. He will show you dreams of perfectly clean cities with wide open streets, where everyone earns exactly the optimal amount of money and public transportation is accurate to the second. He will tell you it is all attainable.

But if you hesitate even an instant to take Alphanion’s offer, you will see a vision of CTHGHFZXAY, Demon-Shah of the Domain of Chaos, who appears as a shifting multicolored cloud. Cthghfzxay will urge you to use the Artifact to promote cosmic chaos, the ultimate principle of freedom. She will condemn the works of Order as a lie, a dystopia bought at the cost of true human liberty. She will show you visions of primaeval forests, where no two flowers are alike, where each glade holds a new mystery, where people run wild in search of new adventure. She will tell you it can all be yours.

As you weigh these two offers, you will see a vision of ZAMABAMAZ, Demon-Pharaoh of the Domain of Balance, who appears as a man and woman conjoined. They will tell you that neither Order nor Chaos is at the root of human flourishing, but an ability to strike the right balance between the two. That a virtuous life is one spent in moderation between total wild liberty and a stifling concept of rote rule-following. That Alphanion and Cthghfzxay are the two poles of the universe, and that righteousness exists in the space created by their interaction. They will ask you to devote the Artifact and its power to the Domain of Balance, so all people can better manage the interaction of Order and Chaos in their own lives.

This will seem reasonable to you, but then there will appear a vision of IYYYYYYYYYYYYYYYYYYYYYYY, Demon-Raja of the Domain of Excess, who appears as a blinding violet light. It will tell you that both Order and Chaos present coherent visions of the world, but that for the love of God, choose one or the other instead of being a wishy-washy milquetoast who refuses to commit to anything. It will tell you that blinding white and pitch black are both purer and more compelling than endless pointless grey. It will ask you to give the Artifact to somebody – anybody – other than Zamabamaz.

Just as you think you have figured all this out, there will appear a vision of MLOXO7W, Demon-Kaiser of the Domain of Meta-Balance, who appears as a face twisted into a Moebius strip. It will tell you that sometimes it is right to seek balance, and other times right to seek excess, and that a life well-lived consists of excess when excess is needed, and balance when balance is needed. It will remind you that sometimes you are a sprinter and other times a tightrope walker in the Olympiad of life, and that to commit to either eternal carefulness or eternal zealousness is to needlessly impoverish yourself. It will ask you to devote the Artifact and its power to balancing balance and imbalance, balancedly.

You will not be the least bit surprised when there will appear a vision of K!!!111ELEVEN, Demon-Shogun of the Domain of Meta-Excess, who appears as a Torricelli trumpet with eyes and a mouth. She says that seriously, pick a side, all this complicated garbage about the balance between balance and excess is just another layer of intellectualization to defend against having any real values, a trick to make you feel smart and superior for believing in nothing, not even Balance. She will ask you to choose something now, lest you be caught in an endless regress of further options.

As soon as you acknowledge that this makes sense, there will appear a vision of ILO, Demon-Chancellor of the Domain of Excessive Meta-Balance, who appears as a deep hole in space whose end you cannot see. They will point out that yes, there is potentially an infinite regress of further levels. But to act to avoid those levels is essentially to unthinkingly side with the principle of Excess over Balance. After all, if you had originally started by siding with Chaos or Order rather than waiting to hear of the existence of Balance, you would have been unknowingly favoring Excess over Balance. And if you had decided to choose either Excess or Balance, you would have been favoring the principle of Meta-Excess over Meta-Balance before even knowing they existed. So choosing at any level of the hierarchy is essentially equivalent to choosing Excess at all higher levels of the hierarchy. When viewed this way, the hierarchy collapses to chaos, order, first-level-balance, second-level-balance, third-level-balance, and so on. They offer a new, better vision: Infinite Balance, a theoretical top of the hierarchy in which you choose to balance all previous levels.

But as you start to consider this, there will appear a vision of PAHANUP, Demon-Taoiseach of the Domain of Balanced Meta-Balance, who appears as a hole in space exactly three inches deep. Ze will tell you that going to infinite lengths to ensure perfect balance at an infinite number of levels actually seems a bit excessive in ways. To choose either Chaos or Order outright would be insufficiently careful, but to give yourself an intractable problem with an endless number of meta-levels would be excessively careful. Ze will suggest seeking balance in the number of levels you seek balance in.

This will seem plausible to you right up until the sudden appearance of a fiery vision of IFNI, Demon-Secretary-General of the Domain of Chaotic Meta-Excess, who appears as static. She will point out that there is now another infinite regress, more difficult than the last – to wit, how long you should spend calculating the number of levels on which to seek balance. She will state her case thus: suppose you want to calculate the correct amount of balance in the universe. Let us call this Calculation A. You need to calculate how long to spend on this calculation before giving up and satisficing; let us call this Calculation B. But you need to calculate how long to spend on Calculation B before giving up and satisficing; let us call this Calculation C. Clearly you will never be able to complete any of the calculations. Therefore in order to avoid spending your entire life in an infinite regress of calculation, you should flip a coin right now and use it to decide either Chaos or Order, no takebacks.

But as you reach for the coin, you will see a vision of GOSAGUL, Demon-Admiral of the Domain of Ordered Meta-Balanced Excess, who appears as a cube with constantly flashing black and white faces. He will lecture you on how it seems pretty strange that, when faced with the most important decision in the history of the universe, you decide to flip a coin. Surely, even if Ifni’s argument is correct, you can do better than that! For example, you can just go a specific finite number of levels, such as three, then seek balance at that many levels, then stop. This will be strictly better than Ifni’s plan of choosing completely randomly.

But this sage advice is interrupted by MEGAHAHA, Demon-Pope of the Domain of Excessively Ordered Meta-Balance, who appears as a pattern of black and white that cycles between a line, square, cube, and hypercube. It will point out that if you’re in the business of accepting arguments along the lines that “it seems pretty strange that when faced with the most important decision in the history of the universe you…”, then it seems pretty strange that when faced with the most important decision in the history of the universe, you agree to a kind of random number of levels chosen by a demon you have no reason to trust. By what logic do you reject making the decision itself randomly, but accept making the decision about how many levels to make the decision on randomly? Any amount of Balance in Meta-Balancing Excess is just arbitrary capriciousness; you either need to act fully randomly, or embrace the entire difficulty of the problem.

At this point, you will remember that the Artifact is cursed and demons are evil. With a final effort of will, you will shout the words “I choose Balance! Just normal Balance! First-level Balance! That’s it!” and throw the Artifact to the ground, where it will shatter into a thousand pieces and the voices of the demonic hierarchy will suddenly all go silent.

And for a thousand years to come, heroes will grumble “Why, exactly, are we seeking balance in the universe? Isn’t that kind of dumb? Don’t we want more good stuff, and less bad stuff? Doesn’t really seem that balance is really what we’re after, exactly.”

And you will tell them the story of how once you found the Artifact that gave you mastery of the universe, and you refused to take more than about three minutes figuring out what to use it for, because that would have been annoying.

The Omnigenic Model as Metaphor for Life


The collective intellect is change-blind. Knowledge gained seems so natural that we forget what it was like not to have it. Piaget says children gain long-term memory at age 4 and don’t learn abstract thought until ten; do you remember what it was like not to have abstract thought? We underestimate our intellectual progress because every sliver of knowledge acquired gets backpropagated unboundedly into the past.

For decades, people talked about “the gene for height”, “the gene for intelligence”, etc. Was the gene for intelligence on chromosome 6? Was it on the X chromosome? What happens if your baby doesn’t have the gene for intelligence? Can they still succeed?

Meanwhile, the responsible experts were saying traits might be determined by a two-digit number of genes. Human Genome Project leader Francis Collins estimated that there were “about twelve genes” for diabetes, and “all of them will be discovered in the next two years”. Quanta Magazine reminds us of a 1999 study which claimed that “perhaps more than fifteen genes” might contribute to autism. By the early 2000s, the American Psychological Association was a little more cautious, saying intelligence might be linked to “dozens – if not hundreds” of genes.

The most recent estimate for how many genes are involved in complex traits like height or intelligence is approximately “all of them” – by the latest count, about twenty thousand. From this side of the veil, it all seems so obvious. It’s hard to remember back a mere twenty or thirty years ago, when people earnestly awaited “the gene for depression”. It’s hard to remember the studies powered to find genes that increased height by an inch or two. It’s hard to remember all the crappy p-hacked results that okay, we found the gene for extraversion, here it is! It’s hard to remember all the editorials in The Guardian about how since nobody had found the gene for IQ yet, genes don’t matter, science is fake, and Galileo was a witch.

And even remembering those times, they seem incomprehensible. Like, really? Only a few visionaries considered the hypothesis that the most complex and subtle of human traits might depend on more than one protein? Only the boldest revolutionaries dared to ask whether maybe cystic fibrosis was not the best model for the entirety of human experience?

This side of the veil, instead of looking for the “gene for intelligence”, we try to find “polygenic scores”. Given a person’s entire genome, what function best predicts their intelligence? The most recent such effort uses over a thousand genes and is able to predict 10% of variability in educational attainment. This isn’t much, but it’s a heck of a lot better than anyone was able to do under the old “dozen genes” model, and it’s getting better every year in the way healthy paradigms are supposed to.

Genetics is interesting as an example of a science that overcame a diseased paradigm. For years, basically all candidate gene studies were fake. “How come we can’t find genes for anything?” was never as popular as “where’s my flying car?” as a symbol of how science never advances in the way we optimistically feel like it should. But it could have been.

And now it works. What lessons can we draw from this, for domains that still seem disappointing and intractable?

Turn-of-the-millennium behavioral genetics was intractable because it was more polycausal than anyone expected. Everything interesting was an excruciating interaction of a thousand different things. You had to know all those things to predict anything at all, so nobody predicted anything and all apparent predictions were fake.

Modern genetics is healthy and functional because it turns out that although genetics isn’t easy, it is simple. Yes, there are three billion base pairs in the human genome. But each of those base pairs is a nice, clean, discrete unit with one of four values. In a way, saying “everything has three billion possible causes” is a mercy; it’s placing an upper bound on how terrible genetics can be. The “secret” of genetics was that there was no “secret”. You just had to drop the optimistic assumption that there was any shortcut other than measuring all three billion different things, and get busy doing the measuring. The field was maximally perverse, but with enough advances in sequencing and computing, even the maximum possible level of perversity turned out to be within the limits of modern computing.

(this is an oversimplification: if it were really maximally perverse, chaos theory would be involved somehow. Maybe a better claim is that it hits the maximum perversity bound in one specific dimension)

One possible lesson here is that the sciences where progress is hard are the ones that have what seem like an unfair number of tiny interacting causes that determine everything. We should go from trying to discover “the” cause, to trying to find which factors we need to create the best polycausal model. And we should go from seeking a flash of genius that helps sweep away the complexity, to figuring out how to manage complexity that cannot be swept away.

Late-90s/early-00s psychiatry was a lot like late-90s/early-00s genetics. The public was talking about “the cause” of depression: serotonin. And the responsible experts were saying oh no, depression might be caused by as many as several different things.

Now the biopsychosocial model has caught on and everyone agrees that depression is complicated. I don’t know if we’re still at the “dozens of things” stage or the “hundreds of things” stage, but I don’t think anyone seriously thinks it’s fewer than a dozen. The structure of depression seems different from the structure of genetic traits in that one cause can still have a large effect; multiple sclerosis might explain less than 1% of the variance in depressedness, but there will be a small sample of depressives whose condition is almost entirely because of multiple sclerosis. But overall, I think the analogy to genetics is a good one.

If this is true, what can psychiatry (and maybe other low-rate-of-progress sciences) learn from genetics?

One possible lesson is: there are more causes than you think. Stop looking for “a cause” or “the ten causes” and start figuring out ways to deal with very numerous causes.

There are a bunch of studies that are basically like this one linking depression to zinc deficiency. They are good as far as they go, but it’s hard to really know what to do with them. It’s like finding one gene for intelligence. Okay, that explains 0.1% of the variability, now what?

We might imagine trying to combine all these findings into a polycausal score. Take millions of people, measure a hundred different variables – everything from their blood zinc levels, to the serotonin metabolites in their spinal fluid, to whether their mother loved them as a child – then do statistics on them and see how much of the variance in depression we can predict based on the inputs. “Do statistics on them” is a heck of a black box; genes are kind of pristine and causally unidirectional, but all of these psychological factors probably influence each other in a hundred different ways. In practice I think this would end up as a horribly expensive boondoggle that didn’t work at all. But in theory I think this is what a principled attempt to understand depression would look like.

(“understand depression” might be the wrong term here; it conflates being able to predict a construct with knowing what real-world phenomenon the construct refers to. We are much better at finding genes for intelligence than at understanding exactly what intelligence is, and whether it’s just a convenient statistical construct or a specific brain parameter. By analogy, we can imagine a Martian anthropologist who correctly groups “having a big house”, “driving a sports car”, and “wearing designer clothes” into a construct called “wealth”, and is able to accurately predict wealth from a model including variables like occupation, ethnicity, and educational attainment – but who doesn’t understand that wealth = having lots of money. I think it’s still unclear to what degree intelligence and depression have a simple real-world wealth-equals-lots-of-money style correspondence – though see here and here.)

A more useful lesson might be skepticism about personalized medicine. Personalized medicine – the idea that I can read your genome and your blood test results and whatever and tell you what antidepressant (or supplement, or form of therapy) is right for you – has been a big idea over the past decade. And so far it’s mostly failed. A massively polycausal model would explain why. The average personalized medicine company gives you recommendations based on at most a few things – zinc levels, gut flora balance, etc. If there are dozens or hundreds of things, then you need the full massively polycausal model – which as mentioned before is computationally intractable at least without a lot more work.

(you can still have some personalized medicine. We don’t have to know the causes of depression to treat it. You might be depressed because your grandfather died, but Prozac can still make you feel better. So it’s possible that there’s a simple personalized monocausal way to check who eg responds better to Prozac vs. Lexapro, though the latest evidence isn’t really bullish about this. But this seems different from a true personalized medicine where we determine the root cause of your depression and fix it in a principled way.)

Even if we can’t get much out of this, I think it can be helpful just to ask which factors and sciences are oligocausal vs. massively polycausal. For example, what percent of variability in firm success are economists able to determine? Does most of the variability come from a few big things, like talented CEOs? Or does most of it come from a million tiny unmeasurable causes, like “how often does Lisa in Marketing get her reports in on time”?

Maybe this is really stupid – I’m neither a geneticist nor a statistician – but I imagine an alien society where science is centered around polycausal scores. Instead of publishing a paper claiming that lead causes crime, they publish a paper giving the latest polycausal score for predicting crime, and demonstrating that they can make it much more accurate by including lead as a variable. I don’t think you can do this in real life – you would need bigger Big Data than anybody wants to deal with. But like falsifiability and compressibility, I think it’s a useful thought experiment to keep in mind when imagining what science should be like.


Interactive Git Cheatsheet with Visualisation

NDP Software :: Git Cheatsheet
stash
workspace
index
local repository
upstream repository
status
Displays paths that have differences between the index file and the current HEAD commit, paths that have differences between the workspace and the index file, and paths in the workspace that are not tracked by git.
diff
Displays the differences not added to the index.
diff commit or branch
View the changes you have in your workspace relative to the named <em>commit</em>. You can use HEAD to compare it with the latest commit, or a branch name to compare with the tip of a different branch.
add file... or dir...
Adds the current content of new or modified files to the index, thus staging that content for inclusion in the next commit. Use <code>add --interactive</code> to add the modified contents in the workspace interactively to the index.
add -u
Adds the current content of modified (NOT NEW) files to the index. This is similar to what 'git commit -a' does in preparation for making a commit.
rm file(s)...
Remove a file from the workspace and the index.
mv file(s)...
Move file in the workspace and the index.
commit -a -m 'msg'
Commit all files changed since your last commit, except untracked files (i.e., all files that are already listed in the index). Remove files in the index that have been removed from the workspace.
checkout files(s)... or dir
Updates the file or directory in the workspace. Does NOT switch branches.
reset HEAD file(s)...
Remove the specified files from the next commit. Resets the index but not the working tree (i.e., the changed files are preserved but not marked for commit) and reports what has not been updated.
reset --soft HEAD^
Undo the last commit, leaving changes in the index.
reset --hard
Matches the workspace and index to the local tree. WARNING: Any changes to tracked files in the working tree since commit are lost. Use this if merging has resulted in conflicts and you'd like to start over. Pass ORIG_HEAD to undo the most recent successful merge and any changes after.
checkout branch
Switches branches by updating the index and workspace to reflect the specified branch, <em>branch</em>, and updating HEAD to be <em>branch</em>.
checkout -b name of new branch
Create a branch and switch to it
merge commit or branch
Merge changes from <em>branch name</em> into current branch.<br>Use <code>&#8209;&#8209;no-commit</code> to leave changes uncommitted.
rebase upstream
Reverts all commits since the current branch diverged from <em>upstream</em>, and then re-applies them one-by-one on top of changes from the HEAD of <em>upstream</em>.
cherry-pick commit
Integrate changes in the given commit into the current branch.
revert commit
Reverse commit specified by <em>commit</em> and commit the result. This requires your working tree to be clean (no modifications from the HEAD commit).
diff --cached commit
View the changes you staged vs the latest commit. Can pass a <em>commit</em> to see changes relative to it.
commit -m 'msg'
Stores the current contents of the index in a new commit along with a log message from the user describing the changes.
commit --amend
Modify the last commit with the current index changes.
log
Show recent commits, most recent on top. Options:<br><code>&#8209;&#8209;decorate</code> with branch and tag names on appropriate commits<br><code>&#8209;&#8209;stat</code> with stats (files changed, insertions, and deletions) <br><code>&#8209;&#8209;author=<em>author</em></code> only by a certain author<br><code>&#8209;&#8209;after="MMM DD YYYY"</code> ex. ("Jun 20 2008") only commits after a certain date<br><code>&#8209;&#8209;before="MMM DD YYYY"</code> only commits that occur before a certain date <br><code>&#8209;&#8209;merge</code> only the commits involved in the current merge conflicts
diff commit commit
View the changes between two arbitrary commits.
branch
List all existing branches. Option -r causes the remote-tracking branches to be listed, and option -a shows both.
branch -d branch
Delete the specified branch. Use -D to force.
branch --track new remote/branch
Create a new local branch that tracks a remote branch.
clone repo
Download the repository specified by <em>repo</em> and checkout HEAD of the master branch.
pull remote refspec
Incorporates changes from a remote repository into the current branch. In its default mode, <code>git pull</code> is shorthand for <code>git fetch</code> followed by <code>git merge FETCH_HEAD</code>.
reset --hard remote/branch
Reset local repo and working tree to match a remote branch. Use <code>reset &#8209;&#8209;hard origin/master</code> to throw away all commits to the local master branch. Use this to start over on a failed merge.
fetch remote refspec
Download objects and refs from another repository.
push
Update the server with your commits across all branches that are *COMMON* between your local copy and the server. Local branches that were never pushed to the server in the first place are not shared.
push remote branch
Push new (or existing) branch to remote repository
push remote branch:branch
Push new branch to remote repository with a different name
branch -r
List remote branches
push remote :branch
Remove a remote branch. Literally &quot;push nothing to this branch&quot;
clean
Cleans the working tree by recursively removing files that are not under version control, starting from the current directory.
stash save msg
Save your local modifications to a new stash, and run git reset &#8209;&#8209;hard to revert them. The <em>msg</em> part is optional and gives the description along with the stashed state. For quickly making a snapshot, you can omit both "save" and <em>msg</em>.
stash apply stash
Move changes from the specified stash into the workspace. The latest stash is the default.
stash pop
Applies the changes from the last (or specified) stash and then removes the given stash.
stash list
List the stashes that you currently have.
stash show stash
Show the changes recorded in the stash as a diff between the stashed state and its original parent. When no <em>stash</em> is given, shows the latest one.
stash drop stash
Remove a single stashed state from the stash list. When no <em>stash</em> is given, it removes the latest one.
stash clear
Remove all the stashed states. Note that those states will then be subject to pruning, and may be impossible to recover.
stash branch branchname stash
Creates and checks out a new branch named <em>branchname</em> starting from the commit at which the <em>stash</em> was originally created, applies the changes recorded in <em>stash</em> to the new working tree and index. <br>If that succeeds, and <em>stash</em> is a reference of the form stash@{<em>revision</em>}, it then drops the <em>stash</em>. When no <em>stash</em> is given, applies the latest one. <br>This is useful if the branch on which you ran git stash save has changed enough that git stash apply fails due to conflicts. Since the stash is applied on top of the commit that was HEAD at the time git stash was run, it restores the originally stashed state with no conflicts.

Mary Meeker, the legendary internet analyst, is leaving Kleiner Perkins


Mary Meeker of Kleiner Perkins Caufield & Byers, one of the premier Silicon Valley investors at one of its premier venture capital firms, is leaving her position in an abrupt, high-profile splitting of the firm she helped lead.

Meeker is leading an exodus of late-stage investors from Kleiner Perkins in its most dramatic shake-up since legendary investor John Doerr stepped back from his role more than two years ago. Meeker’s exit — she, along with three of her partners, will form a new firm — will undoubtedly deal a hard blow to Kleiner Perkins, given her high profile in the business community and her stature as by far the most senior woman in venture capital.

Meeker, “the first Wall Street analyst to become a household name” during the dot-com boom, has been most famous in this era for her agenda-setting, unusually thorough “Internet Trends” slide decks, delivered — in memorable, rapid-fire fashion — in recent years at our annual Code Conference. This year’s deck included 294 slides and has been viewed more than a million times; last year’s, with 355 slides, is well past three million views.

The departures of Meeker and her colleagues — Mood Rowghani, Noah Knauf and Juliet de Baubigny — are rooted in different visions for the types of deals they would like to do. But like so many disputes in recent years at Kleiner Perkins, there was also persistent friction between the two sides in ways that had little to do with the firm’s core business — over mundane things such as whether to host a holiday party in San Francisco or closer to Sand Hill Road in Silicon Valley — according to people familiar with the situation.

Kleiner Perkins is one of tech’s oldest firms and its early investments in Amazon and Google have made it part of Silicon Valley lore. That legacy has helped sustain the firm even as it self-admittedly missed out on a wave of generation-defining startups. Meeker’s new team will lose the connection to that brand, effectively placing a bet on the singular brand of Meeker.

“The way the teams operate are just very different,” Meeker said in an interview with Recode.

The principals are naturally downplaying how important a split this is, but this is a massive moment for the firm.

“I don’t think it’s a huge deal,” Ted Schlein, who succeeded Doerr as the de facto head of the firm, said in an interview. The late-stage investing group, which started in 2011, “continued to diverge away from what the core part of Kleiner Perkins has done for 46 years, and will continue to do for another 46 years.”

People familiar with the situation described a distant relationship between the early-stage group, which backs U.S. startups that sometimes don’t have a finished product, and the later-stage group, which can compete with sovereign wealth funds to back companies worth billions of dollars all across the world. The two teams spent fairly little time with one another in recent years, sharing office space, some limited partners, and — critically — a storied brand. But deals? Not many.

Lots of top investing firms have two teams that operate semi-independently, and those firms don’t feel the need to go separate ways. But at Kleiner Perkins, there were cultural clashes between the two squads. The firm has long been home to personal squabbles between its big-personality investors — in-fighting that spilled into very public view during the gender discrimination lawsuit brought by Ellen Pao — and it has been unable to put those to rest even as it elevated a new, likable leader in Mamoon Hamid, who joined the firm last year.

Hamid is a bit of a purist for early-stage investing, though Schlein said the decision was not solely his.

What the decision was, according to the people familiar with the situation, was fairly abrupt. While the two sides have been growing apart for years, they had a “moment of clarity” in the last week or so, as one person said, and the two teams quickly moved to go public with their plans for a split before it leaked. The firm’s investors — its limited partners, in industry parlance — were only informed early this morning.

The breaking point came as the team had to decide whether to raise another later-stage fund under the Kleiner banner or under a new one.

The new, still-to-be-named firm led by Meeker will have immediate cachet given Meeker’s connections on Wall Street, where she served as a top research analyst at Morgan Stanley, and in Silicon Valley. Unlike most big-name tech investors, Meeker has name recognition that goes well beyond the sector because of her annual trends report.

“There is only one Mary Meeker,” Schlein said at the time of her hire.

Schlein now says he is “not naive” about what it will be like to lose Meeker. Kleiner Perkins will have to move on without its most famous name. And after the exit of another investor, Beth Seidenberg, in an unrelated departure, it will have no female general partners. Kleiner Perkins plans to take some economic stake in the new firm.

“People have gotten emotional, not in an angry way,” said one of the people, “but in a poignant way.”

Cttrie – Compile-time trie-based string matching for C++


Tobias Hoffmann

C++ User Treffen Aachen, 2018-09-13


C/C++: switch for non-integers - Stack Overflow

switch statement - cppreference.com

fastmatch.h

switch.hpp

Can we do better?

We want to:

  • compare only the remainder
  • get rid of the sorting requirement
  • keep "O(log n)" lookup complexity
  • still have clean code

Trie


  1. Raben
  2. Rabe
  3. Rasten
  4. Rasen

smilingthax/cttrie

cttrie usage example i


#include "cttrie.h"
...
  const char *str = ...; // or std::string, ...

  TRIE(str) printf("E\n");
  CASE("Raben") printf("0\n");
  CASE("Rabe") printf("1\n");
  CASE("Rasten") printf("2\n");
  CASE("Rasen") printf("3\n");
  ENDTRIE;
  

cttrie usage example ii


  printf("%d\n",
         TRIE(str) return -1;
         CASE("abc") return 0;
         CASE("bcd") return 1;
         ENDTRIE);

Agenda

  • Lifting the Hood
  • C++ template techniques, index sequences
  • Trie as C++ types
  • Trie lookup
  • String literals and TMP
  • Building the trie
  • Additional features
  • Two applications
  • Extensions to cttrie, other approaches

Lifting the Hood i


#define TRIE(str)  CtTrie::doTrie((str), [&]{

#define CASE(str)  }, CSTR(str), [&]{

#define ENDTRIE    })

template <typename ArgE, typename... Args>
constexpr auto doTrie(stringview str,
                      ArgE&& argE, Args&&... args)
  -> decltype(argE())
{ ... }

// CSTR("abc")  ->  string_t<.../>
cttrie.h

Lifting the Hood ii

struct stringview {
  template <unsigned int N>
  constexpr stringview(const char (&ar)[N]) // implicit
    // strips trailing \0
    : begin(ar), size((ar[N-1]==0) ? N-1 : N) {}

  template <typename String,
            typename Sfinae=decltype(
              std::declval<String>().c_str(),
              std::declval<String>().size())>
  constexpr stringview(String&& str)
    : begin(str.c_str()), size(str.size()) {}

  stringview(const char *begin)
    : begin(begin), size(std::strlen(begin)) {}

  constexpr stringview(const char *begin, unsigned int size)
    : begin(begin), size(size) {}

  constexpr bool empty() const {
    return (size==0);
  }

  constexpr char operator*() const {
    // assert(!empty());  // or: throw ?
    return *begin;
  }

  constexpr stringview substr(unsigned int start) const {
    return { begin+start,
             (start<size) ? size-start : 0 };
  }

  constexpr stringview substr(unsigned int start,
                              unsigned int len) const {
    return { begin+start,
             (start<size) ?
               (len<size-start) ? len : size-start
             : 0 };
  }

private:
  const char *begin;
  unsigned int size;
};
  
stringview.h

C++ template techniques

// provides  pack_tools::get_index<I>(Ts&&... ts)
// (≙ std::get<I>(std::make_tuple(ts...)) )

namespace pack_tools {
namespace detail {
  template <unsigned int> struct int_c {};

  template <unsigned int I>
  constexpr void *get_index_impl(int_c<I>) // invalid index
  {
    return {};
  }

  template <typename T0, typename... Ts>
  constexpr T0&& get_index_impl(int_c<0>,
                                T0&& t0, Ts&&... ts)
  {
    return (T0&&)t0;
  }

  template <unsigned int I, typename T0, typename... Ts>
  constexpr auto get_index_impl(int_c<I>,
                                T0&& t0, Ts&&... ts)
    -> decltype(get_index_impl(int_c<I-1>(), (Ts&&)ts...))
  {
    return get_index_impl(int_c<I-1>(), (Ts&&)ts...);
  }
} // namespace detail

template <unsigned int I, typename... Ts>
constexpr auto get_index(Ts&&... ts)
  -> decltype(detail::get_index_impl(detail::int_c<I>(),
                                     (Ts&&)ts...))
{
  static_assert((I<sizeof...(Ts)), "Invalid Index");
  return detail::get_index_impl(detail::int_c<I>(),
                                (Ts&&)ts...);
}

} // namespace pack_tools
  
get_index.h

Index sequences i

// using seq3_t = std::make_index_sequence<3>; // not c++11
using seq3_t = decltype(detail::make_index_sequence<3>());

// => seq3_t = detail::index_sequence<0, 1, 2>;

template <unsigned int... Is>
void foo(detail::index_sequence<Is...>) { ... }

foo(detail::make_index_sequence<3>());

// c++14: index_sequence = integer_sequence<size_t, Is...>;

Index sequences ii

struct nil {};

template <bool B>
using Sfinae = typename std::enable_if<B>::type;

template <unsigned int... Is>
struct index_sequence {};

template <unsigned int N, unsigned int... Is,
          typename =Sfinae<N==0>>
constexpr index_sequence<Is...> make_index_sequence(...)
{ return {}; }

template <unsigned int N, unsigned int... Is,
          typename =Sfinae<(N>0)>>
constexpr auto make_index_sequence(...)
  // argument forces ADL
  -> decltype(make_index_sequence<N-1, N-1, Is...>(nil()))
{ return {}; }

Index sequences iii

namespace detail {
  template <unsigned int... Is,
            typename ArgE, typename... Args>
  constexpr auto doTrie(index_sequence<Is...>, stringview str,
                        ArgE&& argE, Args&&... args)
    -> decltype(argE())
  {
    return checkTrie(
      makeTrie<0>(
        nil(),
        pack_tools::get_index<(2*Is)>((Args&&)args...)...),
      str, (ArgE&&)argE,
      pack_tools::get_index<(2*Is+1)>((Args&&)args...)...);
  }
} // namespace detail

template <typename ArgE, typename... Args>
constexpr auto doTrie(stringview str, ArgE&& argE, Args&&... args)
  -> decltype(argE())
{
  return detail::doTrie(
    detail::make_index_sequence<sizeof...(args)/2>(),
    str, (ArgE&&)argE, (Args&&)args...);
}

Trie as C++ types

namespace CtTrie {
using pack_tools::detail::int_c;

template <int Char, typename Next>
struct Transition {};

// multiple inheritance used for cttrie_sw256 ...
template <typename... Transitions>
struct TrieNode : Transitions... {};

// ...

Trie lookup i

check(node, str):
  if (str.empty):
    if (node.Transition[0].Char==-1):
      return node.Transition[0].Next // i.e. index
    return error

  switch (str[0]):
    case node.Transition[0].Char:
      return check(node.Transition[0].Next, str[1:])
    case node.Transition[1].Char:
      return check(node.Transition[1].Next, str[1:])
    ...
  return error // (default)
(pseudocode)

Trie lookup ii

// possible via Transition<-1, int_c<...>>
template <typename FnE, typename... Fns>
constexpr auto checkTrie(TrieNode<> trie, stringview str,
                         FnE&& fne, Fns&&... fns)
  -> decltype(fne())
{
  return fne();
}

template <int Char, typename Next,
          typename FnE, typename... Fns,
          typename =Sfinae<(Char>=0)>>
constexpr auto checkTrie(
  TrieNode<Transition<Char,Next>> trie,
  stringview str, FnE&& fne, Fns&&... fns)
  -> decltype(fne())
{
  return (!str.empty() && (*str==Char))
    ? checkTrie(Next(), str.substr(1), (FnE&&)fne, (Fns&&)fns...)
    : fne();
}

template <typename... Transitions,
          typename FnE, typename... Fns>
constexpr auto checkTrie(
  TrieNode<Transitions...> trie,
  stringview str, FnE&& fne, Fns&&... fns)
  -> decltype(fne())
{
  return (!str.empty())
    ? Switch(*str, str.substr(1),
             trie, (FnE&&)fne, (Fns&&)fns...)
    : fne();
}

template <unsigned int Index, typename... Transitions,
          typename FnE, typename... Fns>
constexpr auto checkTrie(
  TrieNode<Transition<-1,int_c<Index>>, Transitions...>,
  stringview str, FnE&& fne, Fns&&... fns)
  -> decltype(fne())
{
  return (str.empty())
    ? pack_tools::get_index<Index>((Fns&&)fns...)()
    : checkTrie(TrieNode<Transitions...>(), str, (FnE&&)fne, (Fns&&)fns...);
}

Trie lookup: Switch i

template <...>
auto Switch(unsigned char ch, stringview str,
            TrieNode<Transitions...>, FnE&& fne, Fns&&... fns)
  -> decltype(fne())
{
  switch (ch) {
    {
    case (Transitions::Char):
      return checkTrie(Transitions::Next(), str,
                       (FnE&&)fne, (Fns&&)fns...);
    }...
  }
  return fne();
}

Trie lookup: Switch ii

template <int Char0, typename Next0,
          int Char1, typename Next1,
          typename FnE, typename... Fns>
auto Switch(unsigned char ch, stringview str,
            TrieNode<Transition<Char0,Next0>,
                     Transition<Char1,Next1>>,
            FnE&& fne, Fns&&... fns)
  -> decltype(fne())
{
  switch (ch) {
  case Char0: return checkTrie(Next0(), str, (FnE&&)fne, (Fns&&)fns...);
  case Char1: return checkTrie(Next1(), str, (FnE&&)fne, (Fns&&)fns...);
  }
  return fne();
}

Trie lookup: Switch iii

// TNext obtained by partial specialization!
next_or_nil<I>(node) =
   has_base(node, Transition<I, TNext>) ? TNext : nil

type table[256] = { next_or_nil<Is>(node)... };
// actually: type_array<A00,A01,...> parameter

switch (str[0]):
  case 0: static_if (is_nil(table[0])): return error;
    return check(table[0], str[1:])
  case 1: static_if (is_nil(table[1])): return error;
    return check(table[1], str[1:])
  ...
  case 255:
    return check(table[255], str[1:])
  

String literals and TMP

Problem: "foo" as template parameter?!

Idea: "abc"[1] == 'b' is possible

template <unsigned char... Chars>
struct string_t {
  static constexpr unsigned int size() {
    return sizeof...(Chars);
  }
  static const char *data() {
    static constexpr const char data[]={Chars...};
    return data;
  }
};

namespace detail {
template <typename Str, unsigned int N, unsigned char... Chars>
struct make_string_t
  : make_string_t<Str, N-1, Str().chars[N-1], Chars...> {};

template <typename Str, unsigned char... Chars>
struct make_string_t<Str, 0, Chars...> {
   typedef string_t<Chars...> type;
 };
} // namespace detail

#define CSTR(str) []{ \
    struct Str { const char *chars = str; }; \
    return ::detail::make_string_t<Str,sizeof(str)>::type(); \
  }()
cstr.h

Building the trie i

makeTrie(String0, String1, ..., StringN):
  for each I=0...N:
    trie = trieAdd<I, StringI>(trie)
template <unsigned int I>
constexpr TrieNode<> makeTrie(nil) // nil forces adl
{ return {}; }

template <unsigned int I,typename String0, typename... Strings>
constexpr auto makeTrie(nil, String0, Strings...)
  -> decltype(
    trieAdd<I, String0>(makeTrie<I+1>(nil(), Strings()...)
    ))
{ return {}; }

Building the trie ii

trieAdd<Index, String>(TrieNode<Transitions...>):
  insertSorted<Index>(String, TrieNode< | Transitions...>)
insertSorted:
  • Either there is no transition yet for the next char:
    Insert new Transition into TrieNode at appropriate position.
  • Or, when there is one:
    Take transition, repeat.
  • Start of iteration is (TrieNode(), Transitions...).

Building the trie iii

trieAdd<Index, String>(TrieNode<Transitions...>):
  insertSorted<Index>(String, TrieNode< | Transitions...>)
template <unsigned int Index, typename String,
             typename... Transitions>
constexpr auto trieAdd(TrieNode<Transitions...>)
  -> decltype(
    insertSorted<Index>(
      nil(), String(), // nil forces adl
      TrieNode<>(), Transitions()...))
{ return {}; }

Building the trie iv: Chains

transitionAdd<Index>(string_t<...>) →
  (string_t<Ch0, Chars...>)
    = Transition<Ch0,
                 transitionAdd<Index>(string_t<Chars...>)>

  (string_t<>)
    = Transition<-1, int_c<Index>>

  (string_t<'\0'>)  // alternative ...
    = Transition<-1, int_c<Index>>

Building the trie v: Chains

template <unsigned int Index>
constexpr Transition<-1, int_c<Index>>
transitionAdd(nil, string_t<0>)  // or: string_t<>
{ return {}; }

template <unsigned int Index,
          unsigned char Ch0, unsigned char... Chars>
constexpr Transition<Ch0, TrieNode<decltype(
    transitionAdd<Index>(nil(), string_t<Chars...>()))>>
transitionAdd(nil, string_t<Ch0, Chars...>)
{ return {}; }

Building the trie vi

insertSorted<Index>(
  string_t<Ch0, Chars...> s,
  TrieNode<Prefix... | Transition<Ch,Next>, Transitions...>
):

  if (Ch>Ch0):
    TrieNode<Prefix..., transitionAdd<Index>(s),
             Transition<Ch,Next>, Transitions...>

  else if (Ch==Ch0):
    TrieNode<Prefix...,
      Transition<Ch,
        trieAdd<Index, string_t<Chars...>>(Next())>,
      Transitions...>

  else // (Ch<Ch0)
    insertSorted<Index>(s,
      TrieNode<Prefix...,
               Transition<Ch, Next> | Transitions...>)

Building the trie vii

template <unsigned int Index,
          unsigned char... Chars,
          typename... Prefix, typename... Transitions,
          typename =Sfinae<(sizeof...(Chars)==0 ||
                            sizeof...(Transitions)==0)>>
constexpr auto insertSorted(nil,
  string_t<Chars...> s,
  TrieNode<Prefix...>, Transitions...)
  -> TrieNode<Prefix...,
    decltype(transitionAdd<Index>(nil(), s)),
    Transitions...>
{ return {}; }

template <unsigned int Index,
          unsigned char Ch0, unsigned char... Chars,
          typename... Prefix,
          int Ch, typename Next,
          typename... Transitions,
          typename =Sfinae<(Ch>Ch0)>>
constexpr auto insertSorted(nil,
  string_t<Ch0, Chars...> s,
  TrieNode<Prefix...>,
  Transition<Ch,Next>,
  Transitions...)
  -> TrieNode<Prefix...,
    decltype(transitionAdd<Index>(nil(), s)),
    Transition<Ch,Next>,
    Transitions...>
{ return {}; }

template <unsigned int Index,
          unsigned char Ch0, unsigned char... Chars,
          typename... Prefix,
          int Ch, typename Next,
          typename... Transitions,
          typename =Sfinae<(Ch==Ch0)>>
constexpr auto insertSorted(nil,
  string_t<Ch0, Chars...> s,
  TrieNode<Prefix...>,
  Transition<Ch, Next>,
  Transitions...)
  -> TrieNode<
    Prefix...,
    Transition<Ch,
      decltype(trieAdd<Index, string_t<Chars...>>(Next()))>,
    Transitions...>
{ return {}; }

template <unsigned int Index,
          unsigned char Ch0, unsigned char... Chars,
          typename... Prefix,
          int Ch, typename Next,
          typename... Transitions,
          typename =Sfinae<(Ch<Ch0)>>
constexpr auto insertSorted(nil,
  string_t<Ch0, Chars...> s,
  TrieNode<Prefix...>,
  Transition<Ch, Next>,
  Transitions...)
  -> decltype(insertSorted<Index>(nil(), s,
    TrieNode<Prefix..., Transition<Ch, Next>>(),
    Transitions()...))
{ return {}; }

Additional features

template <typename TrieNode, typename FnE, typename... Fns>
constexpr auto checkTrie(TrieNode trie, stringview str,
                         FnE&& fne, Fns&&... fns)
  -> decltype(fne())
{
  return detail::checkTrie(trie, str,
                           (FnE&&)fne, (Fns&&)fns...);
}

// Strings must be string_t
template <typename... Strings>
constexpr auto CtTrie::makeTrie(Strings... strs)
  -> decltype(detail::makeTrie<0>(detail::nil(), strs...))
{ return {}; }

// ---

auto trie=CtTrie::makeTrie(
  CSTR("Rosten"),
  CSTR("Raben"));

// CtTrie::checkTrie(trie, "ab", [&]{...}, [&]{...}, ...);

#include "cttrie-print.h"
CtTrie::printTrie(trie); // or: decltype(trie)() ...
  

Application: XML

  for (node=node->children; node; node=node->next) {
    if (node->type != XML_ELEMENT_NODE) {
      continue;
    }
    TRIE((const char *)node->name)
      fprintf(stderr, "Warning: unknown ltconfig/text element: %s\n", (const char *)node->name);

    CASE("in")
      ensure_onlyattr(node, "!rel at");
      unique_xmlFree rel(xmlGetProp(node, (const xmlChar *)"rel"));
      txt.in.rel_loop =
        TRIE((const char *)rel)
          throw UsrError("Unknown text/in/@rel value: %s\n", (const char *)rel);
          return bool(); // needed for return type deduction
        CASE("in") return false;
        CASE("loop") return true;
        ENDTRIE;
      txt.in.at = get_attr_int(node, "at", 0);
      parse_fade_only(node, txt.in.fade_duration);

    CASE("out")
      ensure_onlyattr(node, "at");
      txt.out.at = get_attr_int(node, "at", 0);
      parse_fade_only(node, txt.out.fade_duration);
    ENDTRIE;
  }
  

Extensions to cttrie

  • Partial/substring matching
  • Case insensitive
  • Suffix-at-once

Other approaches

Forecasting at Uber: An Introduction


This article is the first in a series dedicated to explaining how Uber leverages forecasting to build better products and services. In recent years, machine learning, deep learning, and probabilistic programming have shown great promise in generating accurate forecasts. In addition to standard statistical algorithms, Uber builds forecasting solutions using these three techniques. Below, we discuss the critical components of forecasting we use, popular methodologies, backtesting, and prediction intervals.

Forecasting is ubiquitous. In addition to strategic forecasts, such as those predicting revenue, production, and spending, organizations across industries need accurate short-term, tactical forecasts, such as the amount of goods to be ordered and number of employees needed, to keep pace with their growth. Not surprisingly, Uber leverages forecasting for several use cases, including:  

  • Marketplace forecasting: A critical element of our platform, marketplace forecasting enables us to predict user supply and demand at a fine spatio-temporal granularity, so we can direct driver-partners to high-demand areas before they arise, thereby increasing their trip count and earnings. Spatio-temporal forecasting is still an open research area.
Figure 1. Marketplace forecasting in California’s Bay Area allows us to direct drivers to high-demand areas.
  • Hardware capacity planning: Hardware under-provisioning may lead to outages that can erode user trust, but over-provisioning can be very costly. Forecasting can help find the sweet spot: not too many and not too few.
  • Marketing: It is critical to understand the marginal effectiveness of different media channels while controlling for trends, seasonality, and other dynamics (e.g., competition or pricing). We leverage advanced forecasting methodologies to help us build more robust estimates and to enable us to make data-driven marketing decisions at scale.

What makes forecasting (at Uber) challenging?

The Uber platform operates in the real, physical world, with its many actors of diverse behaviors and interests, physical constraints, and unpredictability. Physical constraints, like geographic distance and road throughput, move forecasting from the temporal to the spatio-temporal domain.

Although Uber is a relatively young company (eight years and counting), its hypergrowth has made it particularly critical that our forecasting models keep pace with the speed and scale of our operations.  

Figure 2, below, offers an example of Uber trip data in a city over 14 months. You can see a lot of variability, but also a positive trend and weekly seasonality; December often has more peak dates because of the sheer number of major holidays scattered throughout the month.

Figure 2. Leveraging the daily sum of Uber trips in a city, we can better predict the amount and frequency of future trips.

If we zoom in (Figure 3, below) and switch to hourly data for the month of July 2017, you will notice both daily and weekly (7*24 hours) seasonality. You may also notice that weekends tend to be busier.

Figure 3. The hourly sum of Uber trips in a given month (July 2017) helps us model user patterns.

Forecasting methodologies need to be able to model such complex patterns.

Prominent forecasting approaches

Apart from qualitative methods, quantitative forecasting approaches can be grouped as follows: model-based or causal methods, classical statistical methods, and machine learning approaches.

Model-based forecasting is the strongest choice when the underlying mechanism, or physics, of the problem is known, and as such it is the right choice in many scientific and engineering situations at Uber. It is also the usual approach in econometrics, with a broad range of models following different theories.

When the underlying mechanisms are not known or are too complicated, e.g., the stock market, or not fully known, e.g., retail sales, it is usually better to apply a simple statistical model. Popular classical methods that belong to this category include ARIMA (autoregressive integrated moving average), exponential smoothing methods, such as Holt-Winters, and the Theta method, which is less widely used, but performs very well. In fact, the Theta method won the M3 Forecasting Competition, and we also have found it to work well on Uber’s time series (moreover, it is computationally cheap).

In recent years, machine learning approaches, including quantile regression forests (QRF), the cousins of the well-known random forest, have become part of the forecaster’s toolkit. Recurrent neural networks (RNNs) have also been shown to be very useful if sufficient data, especially exogenous regressors, are available. Typically, these machine learning models are of a black-box type and are used when interpretability is not a requirement. Below, we offer a high level overview of popular classical and machine learning forecasting methods:

Classical & Statistical
  • Autoregressive integrated moving average (ARIMA)
  • Exponential smoothing methods (e.g., Holt-Winters)
  • Theta

Machine Learning
  • Recurrent neural networks (RNN)
  • Quantile regression forest (QRF)
  • Gradient boosting trees (GBM)
  • Support vector regression (SVR)
  • Gaussian process regression (GP)

Interestingly, the winning entry to the M4 Forecasting Competition was a hybrid model that included both hand-coded smoothing formulas inspired by the well-known Holt-Winters method and a stack of dilated long short-term memory units (LSTMs).

Actually, classical and ML methods are not that different from each other; they are distinguished by whether the models are simpler and more interpretable or more complex and flexible. In practice, classical statistical algorithms tend to be much quicker and easier to use.

At Uber, choosing the right forecasting method for a given use case is a function of many factors, including how much historical data is available, if exogenous variables (e.g., weather, concerts, etc.) play a big role, and the business needs (for example, does the model need to be interpretable?). The bottom line, however, is that we cannot know for sure which approach will result in the best performance and so it becomes necessary to compare model performance across multiple approaches.

Comparing forecasting methods

It is important to carry out chronological testing since time series ordering matters. Experimenters cannot cut out a piece in the middle, and train on data before and after this portion. Instead, they need to train on a set of data that is older than the test data.

Figure 4: Two major approaches to test forecasting models are the sliding window approach (left) and the expanding window approach (right).

With this in mind, there are two major approaches, outlined in Figure 4, above: the sliding window approach and the expanding window approach. In the sliding window approach, one uses a fixed size window, shown here in black, for training. Subsequently, the method is tested against the data shown in orange.

On the other hand, the expanding window approach uses more and more training data, while keeping the testing window size fixed. The latter approach is particularly useful if there is a limited amount of data to work with.

It is also possible, and often best, to marry the two methods: start with the expanding window method and, when the window grows sufficiently large, switch to the sliding window method.

Many evaluation metrics have been proposed in this space, including absolute errors and percentage errors, which have a few drawbacks. One particularly useful approach is to compare model performance against the naive forecast. In the case of a non-seasonal series, a naive forecast is when the last value is assumed to be equal to the next value. For a periodic time series, the forecast estimate is equal to the previous seasonal value (e.g., for an hourly time series with weekly periodicity the naive forecast assumes the next value is at the current hour one week ago).

To make choosing the right forecasting method easier for our teams, the Forecasting Platform team at Uber built a parallel, language-extensible backtesting framework called Omphalos to provide rapid iterations and comparisons of forecasting methodologies.

The importance of uncertainty estimation

Determining the best forecasting method for a given use case is only one half of the equation. We also need to estimate prediction intervals. The prediction intervals are upper and lower forecast values that the actual value is expected to fall between with some (usually high) probability, e.g. 0.9. We highlight how prediction intervals work in Figure 5, below:

Figure 5: Prediction intervals are critical to informed decision making. Although point forecasts may be the same, their prediction intervals may be significantly different.

In Figure 5, the point forecasts shown in purple are exactly the same. However, the prediction intervals in the left chart are considerably narrower than in the right chart. The difference in prediction intervals results in two very different forecasts, especially in the context of capacity planning: the second forecast calls for much higher capacity reserves to allow for the possibility of a large increase in demand.

Prediction intervals are just as important as the point forecast itself and should always be included in your forecasts. Prediction intervals are typically a function of how much data we have, how much variation is in this data, how far out we are forecasting, and which forecasting approach is used.

Moving forward

Forecasting is critical for building better products, improving user experiences, and ensuring the future success of our global business. It goes without saying that there are endless forecasting challenges to tackle on our Data Science teams. In future articles, we will delve into the technical details of these challenges and the solutions we’ve built to solve them. The next article in this series will be devoted to preprocessing, an often under-appreciated and underserved but crucially important task.

If you’re interested in building forecasting systems with impact at scale, apply for a role on our team.

Subscribe to our newsletter to keep up with the latest innovations from Uber Engineering.

Fran Bell is a Data Science Director at Uber, leading platform data science teams including Applied Machine Learning, Forecasting, and Natural Language Understanding.

Slawek Smyl is a forecasting expert working at Uber. Slawek has ranked highly in international forecasting competitions. For example, he won the M4 Forecasting competition (2018) and the Computational Intelligence in Forecasting International Time Series Competition 2016 using recurrent neural networks. Slawek also built a number of statistical time series algorithms that surpass all published results on M3 time series competition data set using Markov Chain Monte Carlo (R, Stan).

Writing a JIT Compiler in C#


Ludovic Henry, Miguel de Icaza, Aleksey Kliger, Bernhard Urban and Ming Zhou

During the 2018 Microsoft Hack Week, members of the Mono team explored the idea of replacing Mono’s code generation engine written in C with a code generation engine written in C#.

In this blog post we describe our motivation, the interface between the native Mono runtime and the managed compiler, and how we implemented the new managed compiler in C#.

Motivation

Mono’s runtime and JIT compiler are entirely written in C, a highly portable language that has served the project well. Yet, we feel jealous of our own users, who get to write code in a high-level language and reap the benefits of its safety and comfort, while the Mono runtime continues to be written in C.

We decided to explore whether we could make Mono’s compilation engine pluggable and then plug a code generator written entirely in C#. If this were to work, we could more easily prototype, write new optimizations and make it simpler for developers to safely try changes in the JIT.

This idea has been explored by research projects like the Jikes RVM, Maxine, and Graal for Java. In the .NET world, the Unity team wrote an IL-to-C++ compiler called il2cpp. They also experimented with a managed JIT recently.

In this blog post, we discuss the prototype that we built. The code mentioned in this blog post can be found here: https://github.com/lambdageek/mono/tree/mjit/mcs/class/Mono.Compiler

Interfacing with the Mono Runtime

The Mono runtime provides various services: just-in-time compilation, assembly loading, an I/O interface, thread management, and debugging capabilities. The code generation engine in Mono is called mini and is used both for static compilation and just-in-time compilation.

Mono’s code generation has a number of dimensions:

  • Code can be either interpreted or compiled to native code
  • When compiling to native code, this can be done just-in-time, or it can be batch compiled, also known as ahead-of-time compilation.
  • Mono today has two code generators: the light and fast mini JIT engine, and the heavy-duty engine based on the LLVM optimizing compiler. The two are not completely independent of each other; Mono’s LLVM support reuses many parts of the mini engine.

This project started with a desire to make this division even clearer, and to swap out the native code generation engine in ‘mini’ for one that could be completely implemented in a .NET language. In our prototype we used C#, but other languages like F# or IronPython could be used as well.

To move the JIT to the managed world, we introduced the ICompiler interface which must be implemented by your compilation engine, and it is invoked on demand when a specific method needs to be compiled.

This is the interface that you must implement:

The CompileMethod () receives an IRuntimeInformation reference, which provides services for the compiler, as well as a MethodInfo that represents the method to be compiled; it is expected to set the nativeCode parameter to the generated code information.

The NativeCodeHandle merely represents the generated code address and its length.

This is the IRuntimeInformation definition, which shows the methods available to the CompileMethod to perform its work:

We currently have one implementation of ICompiler; we call it the “BigStep” compiler. When wired up, this is what the process looks like when we compile a method with it:

Managed JIT overview

The mini runtime can call into managed code via CompileMethod upon a compilation request.
For the code generator to do its work, it needs to obtain some information about the current environment. This information is surfaced by the IRuntimeInformation interface. Once the compilation is done, it will return a blob of native instructions to the runtime. The returned code is then “installed” in your application.

Now there is a trick question: Who is going to compile the compiler?

The compiler written in C# is initially executed with one of the built-in engines (either the interpreter or the JIT engine).

The BigStep Compiler

Our first ICompiler implementation is called the BigStep compiler.

This compiler was designed and implemented by a developer (Ming Zhou) not affiliated with the Mono runtime team. It is a perfect showcase of how the work presented in this project can quickly enable a third party to build their own compiler without much hassle interacting with the runtime internals.

The BigStep compiler implements an IL to LLVM compiler. This was convenient to build the proof of concept and ensure that the design was sound, while delegating all the hard compilation work to the LLVM compiler engine.

A lot can be said when it comes to the design and architecture of a compiler, but our main point here is to emphasize how easy it can be, with what we have just introduced to Mono runtime, to bridge IL code with a customized backend.

The IL code is streamed into the compiler interface through an iterator, with information such as op-code, index, and parameters immediately available to the user. See below for more details about the prototype.

Hosted Compiler

Another beauty of moving parts of the runtime to the managed side is that we can test the JIT compiler without recompiling the native runtime, essentially developing it like a normal C# application.

InstallCompilationResult () can be used to register a compiled method with the runtime, and ExecuteInstalledMethod () can be used to invoke a method with the provided arguments.

Here is an example of how this is used:

We can ask the host VM for the actual result, assuming it’s our gold standard:

This eases development of a compiler tremendously.

We don’t need to eat our own dog food during debugging, but when we feel ready we can flip a switch and use the compiler as our system compiler. This is actually what happens if you run make -C mcs/class/Mono.Compiler run-test in the mjit branch: We use this API to test the managed compiler while running on the regular Mini JIT.

Native to Managed to Native: Wrapping Mini JIT into ICompiler

As part of this effort, we also wrapped Mono’s JIT in the ICompiler interface.

MiniCompiler

MiniCompiler calls back into native code and invokes the regular Mini JIT. It works surprisingly well, but there is a caveat: once back in the native world, the Mini JIT doesn't go through IRuntimeInformation and just uses its old ways to retrieve runtime details. However, we can now turn this into an incremental process: we can identify those parts, add them to IRuntimeInformation, and change the Mini JIT so that it uses the new API.

Conclusion

We strongly believe in the long-term value of this project. A code base in managed code is more approachable for developers and thus easier to extend and maintain. Even if we never see this work upstream, it helped us better understand the boundary between the runtime and the JIT compiler, and who knows, it might help us integrate RyuJIT into Mono one day 😉

We should also note that IRuntimeInformation can be implemented by any other .NET VM: Hello CoreCLR folks 👋

If you are curious about this project, ping us on our Gitter channel.


Appendix: Converting Stack-Based OpCodes into Register Operations

Since the target language was LLVM IR, we had to build a translator that converted the stack-based operations from IL into the register-based operations of LLVM.

Since many potential targets are register-based, we designed a framework that makes the part where we interpret the IL logic reusable. To this end, we implemented an engine that turns the stack-based operations into register operations.

Consider the ADD operation in IL. This operation pops two operands from the stack, performs the addition, and pushes the result back onto the stack. This is documented in ECMA-335 as follows:

  Stack Transition: 
      ..., value1, value2 -> ..., result 

The actual kind of addition that is performed depends on the types of the values in the stack. If the values are integers, the addition is an integer addition. If the values are floating point values, then the operation is a floating point addition.

To re-interpret this with register-based semantics, we treat each pushed frame in the stack as a different temporary value. This means that if a frame is popped out and a new one comes in, although it has the same stack depth as the previous one, it is a new temporary value.

Each temporary value is assigned a unique name. An IL instruction can then be unambiguously represented in a form that uses temporary names instead of stack changes. For example, the ADD operation becomes

Temp3 := ADD Temp1 Temp2
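The stack-to-temporary renaming can be sketched in a few lines of Python. This is a toy model to illustrate the idea; the op tuples and names are invented for this sketch and are not the actual BigStep engine:

```python
# Toy sketch: each value pushed onto the evaluation stack gets a fresh
# temporary name, turning stack-based IL into three-address code.

def to_register_form(il_ops):
    """il_ops: list of ('ldc', const) or ('add',) tuples (a toy IL subset)."""
    stack = []    # names of the temps currently on the stack
    counter = 0   # fresh-name generator
    out = []
    for op in il_ops:
        if op[0] == "ldc":           # push a constant
            counter += 1
            name = f"Temp{counter}"
            out.append(f"{name} := CONST {op[1]}")
            stack.append(name)
        elif op[0] == "add":         # pop two operands, push the result
            rhs, lhs = stack.pop(), stack.pop()
            counter += 1
            name = f"Temp{counter}"
            out.append(f"{name} := ADD {lhs} {rhs}")
            stack.append(name)
    return out

# ldc 1; ldc 2; add  ->
# ['Temp1 := CONST 1', 'Temp2 := CONST 2', 'Temp3 := ADD Temp1 Temp2']
print(to_register_form([("ldc", 1), ("ldc", 2), ("add",)]))
```

Note that popping a frame and pushing a new one at the same depth yields a new name, exactly as described above.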

Besides the stack, there are other sources of data during evaluation: local variables, arguments, constants and instruction offsets (used for branching). These sources are typed differently from the stack temporaries, so that the downstream processor (discussed shortly) can properly map them into its context.

A third problem that might be common among these target languages is the jump target of branching operations. IL's branching operations assume an implicit target when the branch is not taken: the next instruction. But branching operations in LLVM IR must explicitly declare the targets for both the taken and not-taken paths. To make this possible, the engine performs a pre-pass before the actual execution, during which it gathers all the explicit and implicit targets. In the actual execution, it emits branching instructions with both targets.
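The pre-pass that collects branch targets might look like the following toy sketch. The instruction shape and opcode names are invented for illustration, not the real engine:

```python
# Toy sketch of the branch-target pre-pass: collect explicit targets
# (the operand of each branch) and implicit ones (the fall-through
# instruction after a conditional branch) before emitting any code.

def collect_branch_targets(ops):
    """ops: list of (offset, opcode, target_or_None) tuples (invented shape)."""
    targets = set()
    for i, (offset, opcode, target) in enumerate(ops):
        if opcode.startswith("br"):          # any branch opcode
            targets.add(target)              # explicit (taken) target
            if opcode != "br" and i + 1 < len(ops):
                targets.add(ops[i + 1][0])   # implicit fall-through target
    return targets

ops = [
    (0, "ldc", None),
    (1, "brtrue", 4),   # conditional: explicit target 4, fall-through 2
    (2, "ldc", None),
    (3, "br", 5),       # unconditional: explicit target 5 only
    (4, "ldc", None),
    (5, "ret", None),
]
print(sorted(collect_branch_targets(ops)))  # [2, 4, 5]
```

With these targets known up front, every branch can be emitted with both a taken and a not-taken destination, as LLVM IR requires.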

As we mentioned earlier, the execution engine is a common layer that merely translates each instruction into a more generic form. It then sends each instruction to IOperationProcessor, an interface that performs the actual translation. Compared to the instructions received from ICompiler, the representation here, OperationInfo, is much more consumable: in addition to the op code, it has an array of input operands and a result operand:

There are several types of operands: ArgumentOperand, LocalOperand, ConstOperand, TempOperand, BranchTargetOperand, etc. Note that the result, if it exists, is always a TempOperand. The most important property on IOperand is its Name, which unambiguously defines the source of data in the IL runtime. If an operand with the same name comes in another operation, it unquestionably tells us that the very same data address is targeted again. It is paramount that the processor accurately map each name to its own storage.

The processor handles each operand according to its type. For example, if it is an argument operand, we might retrieve the value from the corresponding argument. An x86 processor may map this to a register. In the case of LLVM, we simply fetch it from a named value that is pre-allocated at the beginning of method construction. The resolution strategy is similar for the other operands:

  • LocalOperand: fetch the value from pre-allocated address
  • ConstOperand: use the const value carried by the operand
  • BranchTargetOperand: use the index carried by the operand
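The resolution strategy above can be mirrored by a small dispatcher. This is a hedged Python sketch; the operand shapes and names are invented stand-ins, not the real Mono.Compiler types:

```python
# Toy operand-resolution dispatch, one branch per operand kind.

def resolve(operand, args, locals_, registers):
    """operand: a (kind, payload) tuple, e.g. ("arg", 0) or ("const", 42)."""
    kind, payload = operand
    if kind == "arg":      # ArgumentOperand: read the corresponding argument
        return args[payload]
    if kind == "local":    # LocalOperand: fetch from a pre-allocated slot
        return locals_[payload]
    if kind == "const":    # ConstOperand: the value is carried inline
        return payload
    if kind == "temp":     # TempOperand: mapped to a (virtual) register
        return registers[payload]
    raise ValueError(f"unknown operand kind: {kind}")

print(resolve(("const", 42), [], [], {}))                 # 42
print(resolve(("arg", 0), [7], [], {}))                   # 7
print(resolve(("temp", "Temp1"), [], [], {"Temp1": 3}))   # 3
```

The key invariant is the one stated above: the same operand name always resolves to the same storage.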

Since a temp value uniquely represents an expression stack frame from the CLR runtime, it is mapped to a register. Luckily for us, LLVM allows an unlimited number of registers, so we simply name a new one for each distinct temp operand. If a temp operand is reused, however, the very same register must be reused as well.

We use the LLVMSharp binding to communicate with LLVM.

How to write X in both Python 3 and JavaScript (ES2015)

I'm Saya, a jewelry designer and a junior JavaScript/Python developer living in Tokyo. Find me on Twitter: @sayajewels

Math: Absolute Value

Python 3 / abs

# 100
print(abs(-100))
# 50
print(abs(50))

JavaScript (ES2015) / abs

// 100
console.log(Math.abs(-100))
// 50
console.log(Math.abs(50))

Math: Round numbers

Python 3 / ceil, floor, round

import math

# 2
print(math.ceil(1.5))
# 1
print(math.floor(1.5))
# 2
print(round(1.5))

JavaScript (ES2015) / ceil, floor, round

// 2
console.log(Math.ceil(1.5))
// 1
console.log(Math.floor(1.5))
// 2
console.log(Math.round(1.5))

Math: Max and Min

Python 3 / max, min

# 100
print(max(100, 50))
# 40
print(min(80, 40))

JavaScript (ES2015) / max, min

// 100
console.log(Math.max(100, 50))

// 40
console.log(Math.min(80, 40))

Control Flow: If else

Python 3 / if

some_number = 3

# Number is 3
if some_number == 1:
    print("Number is 1")
elif some_number == 2:
    print("Number is 2")
elif some_number == 3:
    print("Number is 3")
else:
    print("?")

JavaScript (ES2015) / if

const someNumber = 3

// Number is 3
if (someNumber === 1) {
  console.log('Number is 1')
} else if (someNumber === 2) {
  console.log('Number is 2')
} else if (someNumber === 3) {
  console.log('Number is 3')
} else {
  console.log('?')
}

Control Flow: Ternary

Python 3

x = 2
y = 3

# even
print("even" if x % 2 == 0 else "odd")
# odd
print("even" if y % 2 == 0 else "odd")

JavaScript (ES2015) / Conditional Operator

const x = 2
const y = 3

// even
console.log(x % 2 === 0 ? 'even' : 'odd')
// odd
console.log(y % 2 === 0 ? 'even' : 'odd')

Control Flow: Boolean

Python 3

# yes
if "abc" == "abc":
    print("yes")
else:
    print("no")

# yes
if "abc" != "def":
    print("yes")
else:
    print("no")

# no
if True and False:
    print("yes")
else:
    print("no")

# yes
if True or False:
    print("yes")
else:
    print("no")

# yes
if not False:
    print("yes")
else:
    print("no")

# no
if 0:
    print("yes")
else:
    print("no")

# no
if "":
    print("yes")
else:
    print("no")

# no
if []:
    print("yes")
else:
    print("no")

# no
if None:
    print("yes")
else:
    print("no")

# yes
if not not not None:
    print("yes")
else:
    print("no")

JavaScript (ES2015) / Falsy Values

// yes
if ('abc' === 'abc') {
  console.log('yes')
} else {
  console.log('no')
}

// yes
if ('abc' !== 'def') {
  console.log('yes')
} else {
  console.log('no')
}

// no
if (true && false) {
  console.log('yes')
} else {
  console.log('no')
}

// yes
if (true || false) {
  console.log('yes')
} else {
  console.log('no')
}

// yes
if (!false) {
  console.log('yes')
} else {
  console.log('no')
}

// no
if (0) {
  console.log('yes')
} else {
  console.log('no')
}

// no
if ('') {
  console.log('yes')
} else {
  console.log('no')
}

// no
if (undefined) {
  console.log('yes')
} else {
  console.log('no')
}

// no
if (null) {
  console.log('yes')
} else {
  console.log('no')
}

// yes
if (!!!null) {
  console.log('yes')
} else {
  console.log('no')
}

Control Flow: Exceptions

Python 3 / Errors

# foo is not defined
try:
    foo()
except NameError:
    print("foo is not defined")

JavaScript (ES2015) / try...catch

// foo is not defined
try {
  foo()
} catch (error) {
  console.log('foo is not defined')
}

Control Flow: Continue / Break

Python 3 / break/continue

# 1
# 2
# Fizz
# 4
# Buzz
for number in range(1, 101):
    if number == 3:
        print("Fizz")
        continue
    if number == 5:
        print("Buzz")
        break
    print(number)

JavaScript (ES2015) / continue, break

// 1
// 2
// Fizz
// 4
// Buzz
for (let i = 1; i <= 100; i++) {
  if (i === 3) {
    console.log('Fizz')
    continue
  }
  if (i === 5) {
    console.log('Buzz')
    break
  }
  console.log(i)
}

String: Length

Python 3 / len

some_string = "abcd"

# 4
print(len(some_string))

JavaScript (ES2015) / length

const someString = 'abcd'

// 4
console.log(someString.length)

String: Interpolation

Python 3

x = "Hello"

# Hello World
print(f"{x} World")

JavaScript (ES2015)

const x = 'Hello'

// Hello World
console.log(`${x} World`)

String: Multiline

Python 3

x = """------
Line 1
Line 2
Line 3
------"""

# ------
# Line 1
# Line 2
# Line 3
# ------
print(x)

JavaScript (ES2015) / Multiline Strings

const x = `------
Line 1
Line 2
Line 3
------`

// ------
// Line 1
// Line 2
// Line 3
// ------
console.log(x)

String: String to Integer

Python 3 / int

string_1 = "1"
number_1 = int(string_1)

# 3
print(number_1 + 2)

JavaScript (ES2015) / parseInt

const string1 = '1'
const number1 = parseInt(string1)

// 3
console.log(number1 + 2)

String: Contains

Python 3 / in

# 2 is in the string
if "2" in "123":
    print("2 is in the string")

# 2 is not in the string
if "2" not in "456":
    print("2 is not in the string")

JavaScript (ES2015) / includes

// 2 is in the string
if ('123'.includes('2')) {
  console.log('2 is in the string')
}

// 2 is not in the string
if (!'456'.includes('2')) {
  console.log('2 is not in the string')
}

String: Substring

Python 3 / [i:j]

some_string = "0123456"

# 234
print(some_string[2:5])

JavaScript (ES2015) / substring

const someString = '0123456'

// 234
console.log(someString.substring(2, 5))

String: Join

Python 3 / join

some_list = ["a", "b", "c"]

# a,b,c
print(",".join(some_list))

JavaScript (ES2015) / join

const someList = ['a', 'b', 'c']

// a,b,c
console.log(someList.join(','))

String: Strip

Python 3 / strip

some_string = "   abc   "

# abc
print(some_string.strip())

JavaScript (ES2015) / trim

const someString = '   abc   '

// abc
console.log(someString.trim())

String: Split

Python 3 / split

some_string = "a,b,c"
some_string_split = some_string.split(",")

# a
print(some_string_split[0])
# b
print(some_string_split[1])
# c
print(some_string_split[2])

JavaScript (ES2015) / split

const someString = 'a,b,c'
const someStringSplit = someString.split(',')

// a
console.log(someStringSplit[0])
// b
console.log(someStringSplit[1])
// c
console.log(someStringSplit[2])

String: Replace

Python 3 / replace

some_string = "a b c d e"

# a,b,c,d,e
print(some_string.replace(" ", ","))

JavaScript (ES2015) / Regular Expressions, replace

const someString = 'a b c d e'

// Only changes the first space
// a,b c d e
// console.log(someString.replace(' ', ','))

// Use / /g instead of ' ' to change every space
console.log(someString.replace(/ /g, ','))

String: Search

Python 3 / search

import re

# Has a number
if re.search(r"\d", "iphone 8"):
    print("Has a number")

# Doesn't have a number
if not re.search(r"\d", "iphone x"):
    print("Doesn't have a number")

JavaScript (ES2015) / Regular Expressions, match

// Has a number
if ('iphone 8'.match(/\d/g)) {
  console.log('Has a number')
}

// Doesn't have a number
if (!'iphone x'.match(/\d/g)) {
  console.log("Doesn't have a number")
}

List/Array: Iteration

Python 3

some_list = [6, 3, 5]

# 6
# 3
# 5
for item in some_list:
    print(item)

JavaScript (ES2015) / forEach

const someList = [6, 3, 5]

// 6
// 3
// 5
someList.forEach(element => {
  console.log(element)
})

List/Array: Length

Python 3 / len

some_list = [1, 4, 9]

# 3
print(len(some_list))

JavaScript (ES2015) / length

const someList = [1, 4, 9]

// 3
console.log(someList.length)

List/Array: Enumerate

Python 3 / enumerate, list

some_list = [6, 3, 5]

# 0 6
# 1 3
# 2 5
for i, item in enumerate(some_list):
    print(f"{i} {item}")

# If you're not using this in a for loop, use list()
# list(enumerate(some_list)) # [(0, 6), (1, 3), (2, 5)]

JavaScript (ES2015) / forEach, map

const someList = [6, 3, 5]

// 0 6
// 1 3
// 2 5

someList.forEach((element, index) => {
  console.log(`${index} ${element}`)
})

List/Array: Contains

Python 3 / in

# 2 is in the list
if 2 in [1, 2, 3]:
    print("2 is in the list")

# 2 is not in the list
if 2 not in [4, 5, 6]:
    print("2 is not in the list")

JavaScript (ES2015) / includes

// 2 is in the list
if ([1, 2, 3].includes(2)) {
  console.log('2 is in the list')
}

// 2 is not in the list
if (![4, 5, 6].includes(2)) {
  console.log('2 is not in the list')
}

List/Array: Reverse

Python 3 / reversed, list

some_list = [1, 2, 3, 4]

# reversed(some_list) is just an iterable.
# To convert an iterable into a list, use list()
reversed_list = list(reversed(some_list))

# 4
# 3
# 2
# 1
for item in reversed_list:
    print(item)

# You can use an iterable instead of a list in a for loop
# for item in reversed(some_list):

JavaScript (ES2015) / reverse

const someList = [1, 2, 3, 4]

someList.reverse()

// 4
// 3
// 2
// 1
someList.forEach(element => {
  console.log(element)
})

List/Array: Range Iteration

Python 3 / range

# 0
# 1
# 2
# 3
for i in range(4):
    print(i)

# 4
# 5
# 6
# 7
for i in range(4, 8):
    print(i)

# 6
# 5
# 4
for i in range(6, 3, -1):
    print(i)

JavaScript (ES2015)

// 0
// 1
// 2
// 3
for (let i = 0; i < 4; i++) {
  console.log(i)
}

// 4
// 5
// 6
// 7
for (let i = 4; i < 8; i++) {
  console.log(i)
}

// 6
// 5
// 4
for (let i = 6; i > 3; i--) {
  console.log(i)
}

List/Array: Append with Modification

Python 3 / append

some_list = [1, 2]

some_list.append(3)

# 1
# 2
# 3
for x in some_list:
    print(x)

JavaScript (ES2015) / push

const someList = [1, 2]

someList.push(3)

// 1
// 2
// 3
someList.forEach(element => {
  console.log(element)
})

List/Array: Append without Modification

Python 3 / Unpacking

original_list = [1, 2]

# [original_list, 3] -> [[1, 2], 3]
# [*original_list, 3] -> [1, 2, 3]
new_list = [*original_list, 3]
original_list[0] = 5

# 1
# 2
# 3
for x in new_list:
    print(x)

JavaScript (ES2015) / Spread

const originalList = [1, 2]

// [originalList, 3] -> [[1, 2], 3]
// [...originalList, 3] -> [1, 2, 3]
const newList = [...originalList, 3]
originalList[0] = 5

// 1
// 2
// 3
newList.forEach(element => {
  console.log(element)
})

List/Array: Extend with Modification

Python 3 / extend

some_list = [1]

some_list.extend([2, 3])

# 1
# 2
# 3
for x in some_list:
    print(x)

JavaScript (ES2015) / Spread

const someList = [1]

someList.push(...[2, 3])

// 1
// 2
// 3
someList.forEach(element => {
  console.log(element)
})

List/Array: Extend without Modification

Python 3 / List addition

original_list = [1]
new_list = original_list + [2, 3]
original_list[0] = 5

# 1
# 2
# 3
for x in new_list:
    print(x)

JavaScript (ES2015) / concat

const originalList = [1]
const newList = originalList.concat([2, 3])
originalList[0] = 5

// 1
// 2
// 3
newList.forEach(element => {
  console.log(element)
})

List/Array: Prepend

Python 3 / insert

some_list = [4, 5]

some_list.insert(0, 3)

# 3
# 4
# 5
for x in some_list:
    print(x)

JavaScript (ES2015) / unshift

const someList = [4, 5]
someList.unshift(3)

// 3
// 4
// 5
someList.forEach(element => {
  console.log(element)
})

List/Array: Remove

Python 3 / del

some_list = ["a", "b", "c"]
del some_list[1]

# a
# c
for x in some_list:
    print(x)

JavaScript (ES2015) / splice

const someList = ['a', 'b', 'c']
someList.splice(1, 1)

// a
// c
someList.forEach(element => {
  console.log(element)
})

List/Array: Pop

Python 3 / pop

some_list = [1, 2, 3, 4]

# 4
print(some_list.pop())
# 1
print(some_list.pop(0))

# 2
# 3
for x in some_list:
    print(x)

JavaScript (ES2015) / pop, shift

const someList = [1, 2, 3, 4]

// 4
console.log(someList.pop())

// 1
console.log(someList.shift())

// 2
// 3
someList.forEach(element => {
  console.log(element)
})

List/Array: Index

Python 3 / index

some_list = ["a", "b", "c", "d", "e"]

# 2
print(some_list.index("c"))

JavaScript (ES2015) / indexOf

const someList = ['a', 'b', 'c', 'd', 'e']

// 2
console.log(someList.indexOf('c'))

List/Array: Copy

Python 3 / [i:j]

original_list = [1, 2, 3]
new_list = original_list[:]  # or original_list.copy()

original_list[2] = 4

# 1
# 2
# 3
for x in new_list:
    print(x)

JavaScript (ES2015) / Spread

const originalList = [1, 2, 3]
const newList = [...originalList]

originalList[2] = 4

// 1
// 2
// 3
newList.forEach(element => {
  console.log(element)
})

List/Array: Map

Python 3 / List Comprehensions

original_list = [1, 2, 3]
new_list = [x * 2 for x in original_list]

# 2
# 4
# 6
for x in new_list:
    print(x)

JavaScript (ES2015) / map

const originalList = [1, 2, 3]

// You can also do this:
// const newList = originalList.map(x => { return x * 2 })
const newList = originalList.map(x => x * 2)

// 2
// 4
// 6
newList.forEach(element => {
  console.log(element)
})

List/Array: Map (Nested)

Python 3 / List Comprehensions

first_list = [1, 3]
second_list = [3, 4]
combined_list = [[x + y for y in second_list] for x in first_list]

# 4
print(combined_list[0][0])
# 5
print(combined_list[0][1])
# 6
print(combined_list[1][0])
# 7
print(combined_list[1][1])

JavaScript (ES2015) / map

const firstList = [1, 3]
const secondList = [3, 4]

const combinedList = firstList.map(x => {
  return secondList.map(y => {
    return x + y
  })
})

// 4
console.log(combinedList[0][0])
// 5
console.log(combinedList[0][1])
// 6
console.log(combinedList[1][0])
// 7
console.log(combinedList[1][1])

List/Array: Filter

Python 3 / List Comprehensions

original_list = [1, 2, 3, 4, 5, 6]
new_list = [x for x in original_list if x % 2 == 0]

# 2
# 4
# 6
for x in new_list:
    print(x)

JavaScript (ES2015) / filter

const originalList = [1, 2, 3, 4, 5, 6]
const newList = originalList.filter(x => x % 2 === 0)

// 2
// 4
// 6
newList.forEach(element => {
  console.log(element)
})

List/Array: Sum

Python 3 / sum

some_list = [1, 2, 3]

# 6
print(sum(some_list))

JavaScript (ES2015) / reduce

const someList = [1, 2, 3]
const reducer = (accumulator, currentValue) => accumulator + currentValue

// 6
console.log(someList.reduce(reducer))

List/Array: Zip

Python 3 / zip

list_1 = [1, 3, 5]
list_2 = [2, 4, 6]

# 1 2
# 3 4
# 5 6
for x, y in zip(list_1, list_2):
    print(f"{x} {y}")

JavaScript (ES2015) / map

const list1 = [1, 3, 5]
const list2 = [2, 4, 6]

// [[1, 2], [3, 4], [5, 6]]
const zippedList = list1.map((element, index) => {
  return [element, list2[index]]
})

zippedList.forEach(element => {
  console.log(`${element[0]} ${element[1]}`)
})

List/Array: Slice

Python 3 / [i:j]

original_list = ["a", "b", "c", "d", "e"]
new_list = original_list[1:3]
original_list[1] = "x"

# b
# c
for x in new_list:
    print(x)

JavaScript (ES2015) / slice

const originalList = ['a', 'b', 'c', 'd', 'e']
const newList = originalList.slice(1, 3)
originalList[1] = 'x'

// b
// c
newList.forEach(element => {
  console.log(element)
})

List/Array: Sort

Python 3 / sorted

some_list = [4, 2, 1, 3]

# 1
# 2
# 3
# 4
for item in sorted(some_list):
    print(item)

JavaScript (ES2015) / sort

const someList = [4, 2, 1, 3]

// 1
// 2
// 3
// 4
someList.sort().forEach(element => {
  console.log(element)
})

List/Array: Sort Custom

Python 3 / sorted

some_list = [["c", 2], ["b", 3], ["a", 1]]

# a 1
# c 2
# b 3
for item in sorted(some_list, key=lambda x: x[1]):
    print(f"{item[0]} {item[1]}")

JavaScript (ES2015) / sort

const someList = [['c', 2], ['b', 3], ['a', 1]]

// a 1
// c 2
// b 3
someList
  .sort((a, b) => {
    return a[1] - b[1]
  })
  .forEach(element => {
    console.log(`${element[0]} ${element[1]}`)
  })

Dictionary/Object: Iteration

Python 3 / items

some_dict = {"one": 1, "two": 2, "three": 3}

# one 1
# two 2
# three 3
# NOTE: If you're not using this in a for loop,
# convert it into a list: list(some_dict.items())
for key, value in some_dict.items():
    print(f"{key} {value}")

JavaScript (ES2015) / entries

const someDict = { one: 1, two: 2, three: 3 }

// one 1
// two 2
// three 3

Object.entries(someDict).forEach(element => {
  console.log(`${element[0]} ${element[1]}`)
})

Dictionary/Object: Contains

Python 3 / in

some_dict = {"one": 1, "two": 2, "three": 3}

# one is in the dict
if "one" in some_dict:
    print("one is in the dict")

# four is not in the dict
if "four" not in some_dict:
    print("four is not in the dict")

JavaScript (ES2015) / hasOwnProperty

const someDict = { one: 1, two: 2, three: 3 }

// one is in the dict
if (someDict.hasOwnProperty('one')) {
  console.log('one is in the dict')
}

// four is not in the dict
if (!someDict.hasOwnProperty('four')) {
  console.log('four is not in the dict')
}

Dictionary/Object: Add with Modification

Python 3

original_dict = {"one": 1, "two": 2}
original_dict["three"] = 3

# one 1
# two 2
# three 3
for key, value in original_dict.items():
    print(f"{key} {value}")

JavaScript (ES2015)

const originalDict = { one: 1, two: 2 }
originalDict.three = 3

// one 1
// two 2
// three 3
Object.entries(originalDict).forEach(element => {
  console.log(`${element[0]} ${element[1]}`)
})

Dictionary/Object: Add without Modification

Python 3 / Unpacking

original_dict = {"one": 1, "two": 2}
new_dict = {**original_dict, "three": 3}
original_dict["one"] = 100

# one 1
# two 2
# three 3
for key, value in new_dict.items():
    print(f"{key} {value}")

JavaScript (ES2015) / Spread

const originalDict = { one: 1, two: 2 }
const newDict = { ...originalDict, three: 3 }
originalDict.one = 100

// one 1
// two 2
// three 3
Object.entries(newDict).forEach(element => {
  console.log(`${element[0]} ${element[1]}`)
})

Dictionary/Object: Keys

Python 3 / keys

some_dict = {"one": 1, "two": 2, "three": 3}

# one
# two
# three
# NOTE: If you're not using this in a for loop,
# convert it into a list: list(some_dict.keys())
for x in some_dict.keys():
    print(x)

JavaScript (ES2015) / keys

const someDict = { one: 1, two: 2, three: 3 }

// one
// two
// three
Object.keys(someDict).forEach(element => {
  console.log(element)
})

Dictionary/Object: Values

Python 3 / values

some_dict = {"one": 1, "two": 2, "three": 3}


# 1
# 2
# 3
# NOTE: If you're not using this in a for loop,
# convert it into a list: list(some_dict.values())
for x in some_dict.values():
    print(x)

JavaScript (ES2015) / values

const someDict = { one: 1, two: 2, three: 3 }

// 1
// 2
// 3
Object.values(someDict).forEach(element => {
  console.log(element)
})

Dictionary/Object: Remove

Python 3 / del

some_dict = {"one": 1, "two": 2, "three": 3}

del some_dict["two"]

# one 1
# three 3
for key, value in some_dict.items():
    print(f"{key} {value}")

JavaScript (ES2015) / delete

const someDict = { one: 1, two: 2, three: 3 }

delete someDict.two

Object.entries(someDict).forEach(element => {
  console.log(`${element[0]} ${element[1]}`)
})

Dictionary/Object: Get

Python 3 / get

some_dict = {"one": 1, "two": 2, "three": 3}

# some_dict["five"] would raise a KeyError

# Doesn't exist
print(some_dict.get("five", "Doesn't exist"))

JavaScript (ES2015) / ||

const someDict = { one: 1, two: 2, three: 3 }

// Doesn't exist
console.log(someDict.five || "Doesn't exist")

Dictionary/Object: Map

Python 3 / Dictionary Comprehensions

original_dict = {"one": 1, "two": 2}

# {"one": 1, "two": 4}
new_dict = {key: value * value for key, value in original_dict.items()}

# one 1
# two 4
for key, value in new_dict.items():
    print(f"{key} {value}")

JavaScript (ES2015) / entries

const originalDict = { one: 1, two: 2 }
const newDict = {}

Object.entries(originalDict).forEach(element => {
  // newDict.element[0] doesn't work - for variable key use []
  newDict[element[0]] = element[1] * element[1]
})

// one 1
// two 4
Object.entries(newDict).forEach(element => {
  console.log(`${element[0]} ${element[1]}`)
})

Dictionary/Object: Copy

Python 3 / copy

original_dict = {"one": 1, "two": 2, "three": 3}
new_dict = original_dict.copy()
original_dict["one"] = 100

# one 1
# two 2
# three 3
for key, value in new_dict.items():
    print(f"{key} {value}")

JavaScript (ES2015) / Spread

const originalDict = { one: 1, two: 2, three: 3 }
const newDict = { ...originalDict }
originalDict.one = 100

// one 1
// two 2
// three 3
Object.entries(newDict).forEach(element => {
  console.log(`${element[0]} ${element[1]}`)
})

Dictionary/Object: to JSON

Python 3 / json.dumps

import json

some_dict = {"one": 1, "two": 2, "three": 3}

# {
#   "one": 1,
#   "two": 2,
#   "three": 3
# }
print(json.dumps(some_dict, indent=2))

JavaScript (ES2015) / JSON.stringify

const someDict = { one: 1, two: 2, three: 3 }

// {
//   "one": 1,
//   "two": 2,
//   "three": 3
// }
console.log(JSON.stringify(someDict, null, 2))

Dictionary/Object: from JSON

Python 3 / json.loads

import json

some_json = """{
  "one": 1,
  "two": 2,
  "three": 3
}"""

dict_from_json = json.loads(some_json)

# 2
print(dict_from_json["two"])

JavaScript (ES2015) / JSON.parse

const someJson = `{
  "one": 1,
  "two": 2,
  "three": 3
}`

const dictFromJson = JSON.parse(someJson)

// 2
console.log(dictFromJson.two)

Dictionary/Object: Variable Key

Python 3

some_variable = 2
some_dict = {(some_variable + 1): "three"}

# three
print(some_dict[3])

JavaScript (ES2015) / Computed Property Names

const someVariable = 2
const someDict = { [someVariable + 1]: 'three' }

// three
console.log(someDict[3])

Function: Definitions

Python 3

def add(x, y):
    print(f"Adding {x} and {y}")
    return x + y


# Adding 1 and 2
# 3
print(add(1, 2))

JavaScript (ES2015)

const add = (x, y) => {
  console.log(`Adding ${x} and ${y}`)
  return x + y
}
// Adding 1 and 2
// 3
console.log(add(1, 2))

Function: Keyword Arguments

Python 3

def add(a, b, c):
    return a + b + c


# 6
print(add(b=1, a=2, c=3))

JavaScript (ES2015)

const add = ({ a, b, c }) => {
  return a + b + c
}

// 6
console.log(
  add({
    b: 1,
    a: 2,
    c: 3
  })
)

Function: Default Arguments

Python 3

def greet(name, word="Hello"):
    print(f"{word} {name}")


# Hello World
greet("World")

# Goodbye World
greet("World", "Goodbye")

JavaScript (ES2015) / Default Parameters

const greet = (name, word = 'Hello') => {
  console.log(`${word} ${name}`)
}

// Hello World
greet('World')

// Goodbye World
greet('World', 'Goodbye')

Function: Default Keyword Arguments

Python 3

def greet(name="World", word="Hello"):
    print(f"{word} {name}")


# Goodbye World
greet(word="Goodbye")

# Hello Programming
greet(name="Programming")

JavaScript (ES2015) / Default Value

const greet = ({ name = 'World', word = 'Hello' }) => {
  console.log(`${word} ${name}`)
}

// Goodbye World
greet({ word: 'Goodbye' })

// Hello Programming
greet({ name: 'Programming' })

Function: Positional Arguments

Python 3 / Arbitrary Argument Lists

def positional_args(a, b, *args):
    print(a)
    print(b)
    for x in args:
        print(x)


# 1
# 2
# 3
# 4
positional_args(1, 2, 3, 4)

JavaScript (ES2015) / Rest Parameters

const positionalArgs = (a, b, ...args) => {
  console.log(a)
  console.log(b)
  args.forEach(element => {
    console.log(element)
  })
}

// 1
// 2
// 3
// 4
positionalArgs(1, 2, 3, 4)

Function: Variable Keyword Arguments

Python 3 / Keyword Arguments

def func_1(**kwargs):
    for key, value in kwargs.items():
        print(f"{key} {value}")


def func_2(x, *args, **kwargs):
    print(x)
    for arg in args:
        print(arg)
    for key, value in kwargs.items():
        print(f"{key} {value}")


# one 1
# two 2
func_1(one=1, two=2)

# 1
# 2
# 3
# a 4
# b 5
func_2(1, 2, 3, a=4, b=5)

JavaScript (ES2015) / Rest Parameters

const func1 = kwargs => {
  Object.entries(kwargs).forEach(element => {
    console.log(`${element[0]} ${element[1]}`)
  })
}

//  ...args must be the last argument
const func2 = (x, kwargs, ...args) => {
  console.log(x)
  args.forEach(element => {
    console.log(element)
  })
  Object.entries(kwargs).forEach(element => {
    console.log(`${element[0]} ${element[1]}`)
  })
}

// one 1
// two 2
func1({ one: 1, two: 2 })

// 1
// 2
// 3
// a 4
// b 5
func2(1, { a: 4, b: 5 }, 2, 3)

Class: Basics

Python 3 / Classes

class Duck:
    def __init__(self, name):
        self.name = name

    def fly(self):
        print(f"{self.name} can fly")

    # not @classmethod: call a method on an instance
    # duck = Duck(...)
    # duck.create(...)
    #
    # @classmethod: call a method on a class
    # Duck.create(...)
    @classmethod
    def create(cls, name, kind):
        if kind == "mallard":
            return MallardDuck(name)
        elif kind == "rubber":
            return RubberDuck(name)
        else:
            # cls = Duck
            return cls(name)


class MallardDuck(Duck):
    # @property:
    # use duck.color instead of duck.color()
    @property
    def color(self):
        return "green"


class RubberDuck(Duck):
    def __init__(self, name, eye_color="black"):
        super().__init__(name)
        self.eye_color = eye_color

    def fly(self):
        super().fly()
        print(f"Just kidding, {self.name} cannot fly")

    @property
    def color(self):
        return "yellow"


regularDuck = Duck("reggie")
# reggie can fly
regularDuck.fly()

mallardDuck = Duck.create("mal", "mallard")
# mal
print(mallardDuck.name)
# green
print(mallardDuck.color)

rubberDuck = RubberDuck("vic", "blue")
# vic can fly
# Just kidding, vic cannot fly
rubberDuck.fly()
# yellow
print(rubberDuck.color)
# blue
print(rubberDuck.eye_color)

JavaScript (ES2015) / Classes, Method Definitions, static, get, super

class Duck {
  constructor(name) {
    this.name = name
  }

  fly() {
    console.log(`${this.name} can fly`)
  }

  // not static: call a method on an instance
  // const duck = new Duck(...)
  // duck.create(...)
  //
  // static: call a method on a class
  // Duck.create(...)
  static create(name, kind) {
    if (kind === 'mallard') {
      return new MallardDuck(name)
    } else if (kind === 'rubber') {
      return new RubberDuck(name)
    } else {
      // this = Duck
      return new this(name)
    }
  }
}

class MallardDuck extends Duck {
  // get:
  // use duck.color instead of duck.color()
  get color() {
    return 'green'
  }
}

class RubberDuck extends Duck {
  constructor(name, eyeColor = 'black') {
    super(name)
    this.eyeColor = eyeColor
  }

  fly() {
    super.fly()
    console.log(`Just kidding, ${this.name} cannot fly`)
  }

  get color() {
    return 'yellow'
  }
}

const regularDuck = new Duck('reggie')
// reggie can fly
regularDuck.fly()

const mallardDuck = Duck.create('mal', 'mallard')
// mal
console.log(mallardDuck.name)
// green
console.log(mallardDuck.color)

const rubberDuck = new RubberDuck('vic', 'blue')
// vic can fly
// Just kidding, vic cannot fly
rubberDuck.fly()

// yellow
console.log(rubberDuck.color)

// blue
console.log(rubberDuck.eyeColor)

Distractions Cause Bad Code


We are barraged by constant distractions, and they are degrading the quality of our work. Our digital society is now set up to let us focus for mere minutes at a time: we live in an attention economy, and the sole objective of these companies is to capture more of our time. Facebook, Google, and Snapchat are all incentivized to get us to look at our phones many times a day.

Distractions permeate everything, even at work. GitHub has notifications for so many things that if you have work and personal projects on the same account, you will get unrelated notifications all the time. Our employers set us up with ping-pong tables, open offices, and Slack, the open-office of chat tools.

With all these distractions surrounding us and with all these notifications, we are expected to get deep work done. Personally, I cannot. And you cannot, either.

When I’m highly distracted, I’m prevented from entering flow. To my core, I’m a maker. I get such a thrill from making things that are usable and useful, and these distractions cut through that in a way that makes it impossible to have a productive, fulfilling day.

Flow is important if you want to get anything meaningful done. Context switching takes a lot of effort and time and you can only do it so many times in a day. If you are constantly distracted, you will never enter flow and you will never have great, innovative ideas.

If you never concentrate and go deep, you will produce bad work: you will produce bugs, and fail to debug them; you will create security issues; you will cause performance problems; and you will architect things poorly.

You cannot live in a vacuum: you have to talk to users and stakeholders and your teammates and your manager. But that should be done on your schedule, not on theirs (most of the time) so that when you are done talking to them and you have a good idea of what to build, you can go crank out a high quality first version. This version will be on the right track, technically: good architecture, usable performance, well-tested, with minimal bugs. This is a first iteration you can go take to users to get concrete feedback and keep iterating.

Our attention is being squandered and we have an opportunity now to reclaim it. Fight back. Get rid of the ping-pong table; delete Facebook and Snapchat; disable push notifications for emails; build some walls to establish real offices. Set up processes on your teams to give people large chunks of time where they can go deep, for days at a time. Put walls or even hundreds of miles between your employees. Embrace flow, and get some work done. You’ll feel better, I promise, and what you produce will be better as well.


Kotlin Multiplatform IRL: running the same code on iOS and Android


Motivation

At Makery we are huge fans of Kotlin, and we always get excited when anything new comes up about our favorite language. You can imagine how hyped we were after last year's KotlinConf, when JetBrains introduced Kotlin Native for multiplatform projects. Writing Kotlin code for iOS was a dream for us, and the reality is even better: we can create a shared layer with the business logic and use it on both major mobile platforms. Maybe this is the cross-platform solution we had been waiting for all along? After this point, it was just a matter of time before we started experimenting with MP projects for Android and iOS.

In this article, I'm going to guide you through the labyrinth of Kotlin Multiplatform (Kotlin MP or MP from now on) and show you how we created a simple example app to test out the capabilities of the technology.
Also, I highly recommend checking out the project's source code on GitHub, because not all the details are covered in this article, just the most interesting ones.

GitHub example

First, I needed a feature set realistic enough that we could say: "If it's working with this example, it will probably work in production too."

Usually, a baseline for any application is to be able to communicate with the outside world. In a developer's head, this means the capability to fire network requests and process the received data. So with these guidelines in our mind, that's what we came up with:

  1. Authenticate with Basic HTTP Authentication to GitHub
  2. Fetch the list of the user's repositories
  3. Deserialize the result
  4. Display it in a list view

The aim is to implement the first three points from the list as a shared library. Anything outside of that (the UI layer basically) will be written with the platform native SDKs (in Swift for iOS and Kotlin for Android).

Project structure

We started our research by looking for documentation about how we should structure our project. We found a pretty good description here with a simple example app.

This tutorial by JetBrains suggests the following directory structure for an MP project:

project/
├── androidApp/
├── iosApp/
└── shared/
    ├── common/
    ├── android/
    └── ios/

The androidApp and the iosApp folders are for the native applications receiving the compiled artifacts from the shared module.

The common folder in the shared directory should have most of the implementation of the desired business logic and the android and ios directories are there for the platform specific code parts.

Let's look at all these in detail!

Shared module

The shared module is literally a Gradle module with every component we need for our business logic.

The main idea is to look for a library which can help us with HTTP requests and another one to deserialize the received data. Also when we talk about I/O, threading is always something to consider with care.

Fortunately, we can find all these components ready to work with Kotlin MP.

We will use ktor as our HTTP client and kotlinx.serialization for processing/mapping request and response entities. Finally, Kotlin coroutines are JetBrains' solution for handling asynchronous computation.

These libraries already support Kotlin Native (which is necessary to work on iOS) and they also work well with Android.

The next step is to wire these parts together and implement our business logic.

Common module

The common module is the heart of our MP project. It's the place where all the magic happens. After setting up everything correctly in our Gradle build files (they are written using Gradle Kotlin DSL) we can start using our libraries to implement our GitHub client.

An important thing to mention here is that you only have access to the Kotlin Stdlib and the libraries you included (and they should support MP projects of course).

class GitHubApiClient(..) {
..
fun repos(successCallback: (List<GitHubRepo>) -> Unit,
          errorCallback: (Exception) -> Unit) {
    launch(ApplicationDispatcher) {
        try {
            val result: String = httpClient.get {
                url {
                    protocol = URLProtocol.HTTPS
                    port = 443
                    host = "api.github.com"
                    encodedPath = "user/repos"
                    header("Authorization", "Basic " +
                        "$githubUserName:$githubPassword".encodeBase64())
                }
            }

            val repos = JsonTreeParser(result).read().jsonArray
                .map { it.jsonObject }
                .map { GitHubRepo(it["name"].content,
                                  it["html_url"].content) }
            successCallback(repos)
        } catch (ex: Exception) {
            errorCallback(ex)
        }
    }
}
}

That's it. That wasn't so hard, right? Let's take a look at the interesting parts:

  • line 3-4 The repos() function needs two callback functions for handling the success and the error case.
  • line 5 We start a coroutine with launch{} to push work onto a background thread. But where does the ApplicationDispatcher come from? We defined it in the common module: internal expect val ApplicationDispatcher: CoroutineDispatcher. The expect keyword means that the concrete implementation must come from the platform-specific modules, which we will check out later.
  • line 7 Our httpClient instance is created by using Ktor.
  • line 8-16 We define the target url and encode the user credentials in the Authorization header according to the HTTP Basic Authentication protocol.
  • line 18-21 Parsing the result looks more manual than what we are used to, but currently this is the only solution that works with Kotlin Native.
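
The two non-obvious steps in that listing, building the Basic auth header and manually mapping the JSON array down to (name, html_url) pairs, are language-agnostic ideas. Here is a minimal Python sketch of the same logic; the `octocat` credentials and the sample response body are made up for illustration:

```python
import base64
import json

def basic_auth_header(username, password):
    # Per HTTP Basic Authentication (RFC 7617):
    # "Basic " followed by base64 of "username:password"
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")

def parse_repos(body):
    # Mirrors the manual mapping in repos(): keep only name and html_url
    return [(repo["name"], repo["html_url"]) for repo in json.loads(body)]

print(basic_auth_header("octocat", "secret"))
# Basic b2N0b2NhdDpzZWNyZXQ=

sample = '[{"name": "hello-world", "html_url": "https://github.com/octocat/hello-world"}]'
print(parse_repos(sample))
```

The Kotlin version does exactly this, just with ktor's `header()` builder and kotlinx.serialization's `JsonTreeParser` instead of the standard library calls.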

After we finished with the business logic, we need to write the concrete implementation for our expected ApplicationDispatcher variable mentioned above. We have to do this in the platform specific modules (android, ios).

Android module

The android MP module is basically an Android library project. Kotlin's main target was to work well with Java, so in the case of Android there are not a lot of tricks. Our only job here is to provide an actual implementation of the ApplicationDispatcher.

It looks like this:

internal actual val ApplicationDispatcher: CoroutineDispatcher = DefaultDispatcher

Because in this module we have access to the Android SDK and the coroutines API for Android (unlike in the common module), we can use the DefaultDispatcher.

iOS module

In case of iOS we have to work a little bit more on our actual implementation:

internal actual val ApplicationDispatcher: CoroutineDispatcher = NsQueueDispatcher(dispatch_get_main_queue())

internal class NsQueueDispatcher(private val dispatchQueue: dispatch_queue_t) : CoroutineDispatcher() {
    override fun dispatch(context: CoroutineContext,
                          block: Runnable) {
        dispatch_async(dispatchQueue) {
            block.run()
        }
    }
}

We create a new class called NsQueueDispatcher, which provides a dispatching method for coroutines on Kotlin Native (currently behind the coroutines implementation on the other platforms, using a single-threaded (JS-style) event loop).

The next step is to check out how we can integrate the product artifacts of Kotlin MP into our native mobile applications.

Android app

For Android, as in the previous cases, working with Kotlin MP is hassle-free. You just need to create a regular Android application in Android Studio.

After the initial project setup we will use a Gradle trick called composite build. This feature allows us to access other builds and in this case, build artifacts.

To enable composite build I inserted this line into androidApp/settings.gradle:

includeBuild("../")

In the root project level build script we have already defined the group and the version property:

allprojects {
    group = "com.adrianbukros.github.example"
    version = "0.1.0"
}

Now, in the Android app module, we have access to the .aar generated by Kotlin MP.

To include it in our application, simply add it as a dependency:

dependencies {
    ...
    implementation("com.adrianbukros.github.example:android:0.1.0")
}

Now we can easily instantiate and use our GitHubApiClient class:

GitHubApiClient(username, password).repos(
    successCallback = {
        // handle the result
    },
    errorCallback = {
        // handle the error
    })

iOS app

Let's add our Kotlin MP library into a native iOS application! In the case of iOS, the story is more complicated than it was with Android. What the Kotlin Native compiler does is basically grab our Kotlin code from the common and ios modules and compile it into an Objective-C framework.

What we have to do is to include this framework in our Xcode project. Here are the steps:

  1. Add a new framework to your project by clicking File/New/Target/Cocoa Touch Framework and call it Shared.
  2. On the project settings page, select the Shared library below targets and go to the Build phases tab.
  3. Delete all the default build phases except Target Dependencies.
  4. Hit the + at the top left of this view and add a New Run script phase.
  5. Open the new phase and insert the following lines:
"$SRCROOT/../gradlew" -p "$SRCROOT/../shared/ios" "$KOTLIN_NATIVE_TASK" copyFramework \
-PconfigurationBuildDir="$CONFIGURATION_BUILD_DIR"

What this script does for you is call a piece of Gradle code, which copies the Obj-C lib from the Kotlin MP build folder and inserts it into the Xcode build folder, e.g. /Users/{user}/Library/Developer/Xcode/DerivedData/github-multiplatform-example-bqzqpuwrnlaafjgkqfwustgrqnwa/Build/Products/Debug-iphonesimulator ($CONFIGURATION_BUILD_DIR marks this).

This way Xcode will have access to our freshly built Kotlin Native library at compile time.

Wait a minute! Do we have the copyFramework task available in Gradle by default? Of course not. Let's add these lines to shared/ios/build.gradle.kts:

tasks {
    "copyFramework" {
        doLast {
            val buildDir = tasks.getByName("compileDebugFrameworkIos_x64KotlinNative").outputs.files.first().parent
            val configurationBuildDir: String by project
            copy {
                from(buildDir)
                into(configurationBuildDir)
            }
        }
    }
}

One thing is still missing: we have to define the $KOTLIN_NATIVE_TASK environment variable's value. On the Shared framework target, select Build Settings, hit the + in the middle of the screen, then select Add User-Defined Setting. If you want the build to work with every target, you have to add the following values:

Debug
  Any iOS Simulator SDK: assembleDebugFrameworkIos_x64
  Any iOS SDK:           assembleDebugFrameworkIos_arm64
Release
  Any iOS Simulator SDK: assembleReleaseFrameworkIos_x64
  Any iOS SDK:           assembleReleaseFrameworkIos_arm64

Now after building your project, you can import the Shared module into your Swift files and use it like this:

import Shared

class LoginViewController: UIViewController {
    @objc private func loginButtonDidTap() {
        GitHubApiClient(githubUserName: username,
                        githubPassword: password).repos(
            successCallback: { [weak self] repos in
                // handle the result
                return StdlibUnit()
            }, errorCallback: { [weak self] error in
                // handle the error
                return StdlibUnit()
        })
    }
}
  • line 1 You have to import the Shared framework before you start using it in your class.
  • line 5 Initialize the GitHubApiClient and call the repos() function on it with a success and an error closure!
  • line 9 and 12 Currently there is a limitation in Kotlin Native: if a function doesn't have a return value, you still have to return StdlibUnit(). I guess it's just one of those things that can be omitted in future versions of Kotlin Native.

Conclusion

Kotlin Multiplatform for mobile applications is no question an experimental technology. I think there is still a long way to go until we can say that it's production-ready, but in my opinion it is a great thing.
There is plenty of room to improve the framework and the tooling, but it's amazing that even at this stage we were able to develop this small example app. I'm really positive about the project's future, and I'm sure that I will give it another try a few months from now (maybe after KotlinConf '18?).

Show HN: rwtxt – a space for reading and writing text


rwtxt

A cms for absolute minimalists. Try it at rwtxt.com.

rwtxt is an open-source website where you can store any text online for easy sharing and quick recall. In more specific terms, it is a light-weight and fast content management system (CMS) where you write in Markdown with emphasis on reading.

rwtxt builds off cowyo, a similar app I made previously. In improving on cowyo with rwtxt, I aimed to avoid second-system syndrome: I got rid of features I never used in cowyo (self-destruction, encryption, locking), while integrating a useful new feature that wasn't available previously: you can create domains. A domain is basically a personalized namespace where you can write private/public posts that are searchable. I personally use rwtxt to collect and jot notes for work, personal life, and coding, each of which has its own searchable and indexed domain.

rwtxt is backed by a single sqlite3 database, so it's portable and very easy to back up. It's written in Go and all the assets are bundled, so you can just download a single binary and start writing. You can also try the online version: rwtxt.com.

Usage

Reading. You can share rwtxt links to read them on another computer. rwtxt is organized in domains - places where you can privately or publicly store your text. If the domain is private, you must be signed in to read it, even if you have the permalink.

You can easily create your own domain in 10 seconds. When you make a new domain it will be private by default, so only you can view/search/edit your own text.

Once you make a domain you will see an option to make your domain public so that anyone can view/search it. However, only people with the domain password can edit in your domain - making rwtxt useful as a password-protected wiki. (The one exception is the /public domain, which anyone can edit/view - making rwtxt useful as a pastebin.)

Writing. To write in rwtxt, just create a new page and click "Edit", or go to a URL for the thing you want to write about - like rwtxt.com/something-i-want-to-write. When you write in rwtxt you can format your text in Markdown.

In addition, writing triple backtick code blocks:

```javascript
console.log("hello, world");
```

produces code highlighted with prism.js:

console.log("hello, world");

Deleting. You can easily delete your page. Just erase all the content from it and it will disappear forever within 10 seconds.

Install

You can easily install and run rwtxt on your own computer.

You can download a binary for the latest release.

Or you can install from source. First make sure to install Go. Then clone the repo:

$ git clone https://github.com/schollz/rwtxt.git

Then you can make it with:

And then run it!

Notice

By using rwtxt.com you agree to the terms of service.

License

MIT

Cycling Is Key to Safer, Healthier, More Vital Cities

A mother and two children riding bikes in the Netherlands.
Bicycle helmets are uncommon in the Netherlands, even for children. (Modacity)

In their new book Building the Cycling City: The Dutch Blueprint for Urban Vitality, Melissa and Chris Bruntlett use the example of the Netherlands to show how a cycling culture promotes community building and health.

Frustrated by the obstacles to urban cycling in North America, Melissa and Chris Bruntlett traveled with their two kids from Vancouver to the Netherlands in 2016 to take a deep five-week dive into places that do cycling better. Traversing cities in the Netherlands by bike, they found that cycling is not just a better way to get around; when done right, it leads to healthier, safer, more vibrant, more family-friendly communities. They wrote it all up in their new book, Building the Cycling City: The Dutch Blueprint for Urban Vitality, which provides a guide for cities and communities that want to do cycling right, and for urban cyclists and families who want to learn the keys to cycling as a way of life.

I spoke to the Bruntletts by phone earlier this month about what they’ve learned and about what cities and people in the United States and Canada can learn from the cycling lifestyle in the Netherlands. Our conversation has been lightly edited for space and flow.

Why did you decide to go to the Netherlands and start cycling like the Dutch?

Melissa: We lived so long experiencing cycling in Vancouver and telling a lot of great stories about what building cities for cycling can do. We felt that in order to really tell that story, we needed to go to the place where that is what people enjoy throughout the country and learn what has made them so successful.

Sometimes critics of cycling say it's about “yuppies,” “hipsters,” and “the creative class,” and a force for “gentrification.” But your book talks more about the role of cycling for families and in building stronger communities.

C: Cycling plays a tremendous role in how we now look at cities for families. If it's not safe enough for our 8-year-old son, then it's just simply not good enough. I think for far too long in North America, we've made cycling acceptable only for the "fit and the brave" who are willing to suit up and get on their bikes, but there are entire segments of the population that are completely ignored.

M: What people overlook in those conversations are the people that can't drive. For anyone that is not of driving age, cycling is an independent means of transportation, so they don't need to rely on an adult or a bus. When we get older, there is a certain point when we may not be legally allowed to drive anymore. A lot of the conversation in terms of the elderly population is around aging in place. But it also includes the ability to still feel connected to their community, being able to go outside and travel comfortably even with limited mobility. Bicycles play a key part in that. It's less stress on the joints. It also affords elderly people a way to move around the places where they have always lived and where they want to continue living. To say that the infrastructure and the investment in cycling is only for the “fit and the brave” is to completely ignore entire swaths of our population and not afford them the same rights that we afford able-bodied people in their 20s and 30s.

I remember when I was a boy growing up in New Jersey, my brother and I rode our 10 speeds everywhere. LeBron James recently said the thing that most affected his youth growing up in Akron, Ohio, was the ability to ride a bike everywhere. How can cycling help kids get a sense of the city or even a sense of freedom?

C: The Netherlands ranks as having the happiest children in the world. That's not by accident. That's because they give them safe places to cycle and they trust kids to get from place to place without adult supervision. They don't quite have the stranger danger that we have. It's also because their streets are traffic-calmed, there are fewer cars, and they're going slower. Kids are given free rein to get around the city, whether it's by foot, bicycle, or bus.

M: A lot of kids are getting less and less physical activity. And that simple bike ride to school is one of the easiest ways to build in 15-30 minutes of physical activity in a day and help them be a little bit healthier. The Dutch are one of the only advanced countries to reduce their obesity rate. It's not because they have the healthiest diets. It’s because they have built exercise into their day-to-day activity.

I had a colleague from Sweden who visited Toronto and she said she wouldn't ride in Toronto or let her kids ride there, not just because of cars and inadequate bike lanes, but because the cyclists ride too fast—like they're in the Tour de France is how she put it. But as you point out in your book, cyclists in the Netherlands ride at a slower pace. Why is that important?

C: I think that's an indication of how you build your streets. If you build hostile streets, people are going to want to keep up with car traffic and armor themselves up with protective equipment. There’s a differentiation in the Dutch language between a sports cyclist and a utilitarian cyclist; the two phrases loosely translate to “walking with wheels” versus “running with wheels.” The “wheeled walkers” make up the vast majority of people that bike in the Netherlands because they've created these conditions that aren't as hostile, so anybody feels like they can do it.

A bakfiets is often used to transport large cargo. (Modacity)

Another point you make in the book that's so important is about the different kind of bikes Dutch cyclists ride.

M: They're upright, they’re a little bit slower but they're meant for utility. They're meant to get them comfortably, without any complication, from point A to B, hauling some goods along the way or hauling children. Those utility bikes mean a lot in terms of simplifying the trip. They don't overcomplicate it. The bikes already come with all the gear, you don’t have to worry about buying lights or a bell separately. They're meant for day-to-day transportation.

Why is the bike shop such an important part of the cycling environment?

C: In Vancouver, the cycling shops were still very sport-focused. The staff weren't trained to sell bikes, and they usually only had one or two collecting dust in the corner. Because the vast majority of people riding bikes in North America are doing it for sport and recreation, the retail industry is still playing catch up. It's almost become this chicken and egg scenario where they don't see a large market for transportation bikes so they're not putting many resources into developing that market. Bike sharing has kind of changed this a little bit, because people are riding these more upright utilitarian bikes. But if they ultimately want to invest in one, they have a real job on their hands trying to find one.

In the book, you point out that a very small fraction of Dutch riders wear helmets, but the rate of injury and death from cycling is much lower.

M: It's not even a part of the conversation in the Netherlands because they've engineered their streets to take out a lot of the possible stresses and risk of collisions that would inherently make people feel like they need the extra bit of safety. Less than 1 percent of the population in the Netherlands actually wears helmets, because they've got the investment in the safe infrastructure and safety in numbers.

How much do the protected lanes matter, or are there other elements of the infrastructure that are of equal or more importance?

M: In North America, a lot of the times we talk more about protected and fully separated infrastructure. But in the Netherlands, the conversation is actually much more about traffic calming. A lot of their streets don't have speeds over 30 kilometers [approximately 18 miles] an hour. On neighborhood streets, they build in surface treatments that inherently force you to slow down, like laying cobblestone or narrowing the street. Also, because more people bike there, drivers have a more empathetic approach toward cyclists.

A separated cycling track. (Modacity)

Why do you think we have this mentality that the car is more important than a cyclist or even a person?

C: It's been a product of 60 years of post-war planning and propaganda from the automobile industries that streets are for one purpose: moving cars from A to B. Before the Second World War, streets in a lot of Dutch cities were places for connection and community and commerce. Then the post-war planners came along and pushed all those public functions into parks and private spaces. The Dutch resisted that urge to modernize their city around the car, so they kind of have this 40 to 50-year head start on us.

You see cycling as a way of connecting diverse groups of people not just hard infrastructure, but an element of what Eric Klinenberg calls the social infrastructure which binds people and communities together.

M: On a bike, you inherently have to make a physical connection with people. In a car, you're separated by glass and steel. But when you're out on a bike, you can actually see everybody, you can say hello to the people that you meet along the way. The side of the road becomes a place to reconnect or have a quick wave in the morning to brighten your day. That's how we need to see our streets—as places for connection as opposed to just a place to pass through.

There's a great chapter in your book titled "Not Sport. Transport." You make the point that riding a bike can connect more people to public transit.

C: The prevailing understanding is that the bicycle doesn't replace the car. Neither does the public transit system. But by combining bikes with trains and buses, and trams, suddenly you've got this game-changing seamless transportation network that can get you from door-to-door often quicker than a car, with a little bit of exercise and social connection. In the Netherlands, that means providing bike parking and infrastructure that leads to the transit stop, and then providing a last-mile solution like bike sharing or rental on the other side of the trip. That ultimately can be used as a strategy to reduce congestion in our cities.

Bike parking at a public transit stop. (Modacity)

What about electric bikes now that they are growing in use?

M: Electric bikes are quite prevalent throughout the Netherlands.

C: One in three new bike purchases is electric-assist.

M: For e-bikes that can go over 40 kilometers [approximately 25 miles] an hour, those see a lot more restrictions than just the regular e-bike.

C: They can't use the bike infrastructure, they have to wear a helmet, and they have to have some kind of insurance and registration.

M: E-bikes give people an option to travel longer distances without worrying about the sweat or having to change clothes or even just the extra exertion and time it takes.

A really nice bike isn't cheap. A Dutch bike can cost more than $1000; a cargo bike several thousand; and an electric bike can be even more. Does this reflect and reinforce our growing urban economic divide?

C: Countries such as Belgium, France, and Germany started incentivizing electric and cargo bikes by providing tax rebates or cash discounts for residents in the knowledge that they will ultimately reduce the amount of capital they have to spend on car infrastructure. Keep in mind that any transportation system that requires somebody to own and maintain a $20-30,000 motor vehicle is the ultimate inequitable solution. Supplementing bike share is another way: In Vancouver, we now have a community pass where you can get an annual membership for $20 if you qualify as a low-income resident. There are solutions out there, they just involve subsidy and incentivizing these purchases.

What are some of the Netherlands’ key steps that cities might be able to emulate to make themselves safer for pedestrians and cyclists?

C: They've developed the Sustainable Safety principles which categorizes and codifies all of these safety ideas into a manual that their street designers and engineers would always have to follow. In North America, we’ve maybe only now reached a point where people are rightfully shocked by the carnage that takes place every day. One of the biggest challenges we have right now in city building is reducing the amount of death and injury on our streets.

You make this great point in the book that there's no “one size fits all.” Every city is different and you can't just copy and paste.

M: Too often cities are like, “Let's just put in a bunch of separated cycle tracks and everyone will be safe.” It's more about looking at how those spaces are used and what you want that space to be used for.  That’s why “8-80” is such a great term. If it's not great for 8 year olds and it's not great for 80 year olds to move around a city on however they want to move, whether that's foot, bike, car, or public transportation, then something needs to change.

Your book focuses on the Netherlands and Europe, but do any North American cities get it close to right?

M: Oh, for sure! We've got it pretty good in Vancouver. We talk about New York; it's still slow progress and there's always a battle against parking and less space to work with. We'd like to also point out Calgary, Alberta. In a Canadian context, it’s a pretty unlikely city to be adopting cycling and they did it in a very affordable and quite effective way. So we're all doing it in our own way that makes sense for our city. Also, the Dutch have been doing this for 50 or 60 years, but they made a lot of mistakes along the way. We can now look at that and say that idea didn't work, let's not do that, let's try this better one.

What are the constraints that hold cities back from doing this? What are the things that tend to get in the way of implementing a more vigorous agenda for cycling and safety?

C: Perhaps the most dangerous idea is that cities are done or they've reached the peak. Once the place is declared America's or Canada's “bicycle capital,” politicians feel like their work is done. The job of building the cycling city is never done.


Show HN: Guitar Dashboard – Open source music theory explorer for guitarists

Forget the new iPhones, Apple's best product is now privacy


When my friends come to me asking which smartphone or laptop they should buy, I almost always recommend an Apple product–the latest iPhone or MacBook. I recommend these products not just because they are Apple’s best, but because as someone who covers technology for a living, I believe that for most people, Apple offers better products and solutions than its competitors.

Yes, Apple’s products are more expensive than many, “but you get what you pay for,” I frequently explain. In the case of iPhones, they generally have the fastest smartphone processors on the market, sport arguably the best industrial design, and have the most refined and stable operating system. I attribute similar qualities to Apple’s MacBooks, although my recommendation for those also includes the line, “you’ll pay a little more up front, but they’ll last you twice as long as a PC laptop.”

Of course, this week Apple introduced its newest iPhones, the iPhone XS, XS Max, and XR. Once again, journalists, analysts, and armchair Apple pundits have taken to social media to state that the new iPhones are Apple’s best products ever.

Yet I no longer think this is a true statement. I now believe the best product Apple offers is intangible, yet far more valuable than a flagship smartphone. The best product Apple has–and the single biggest reason that consumers should choose an Apple device over competing devices–is privacy.

In 2018, no issue is more important than user privacy–or the lack of it. We’re tracked by private industry on an unprecedented scale, with major corporations having so much data about us–much of it gleaned without our knowledge–that they can tell when a teenager is pregnant (and inform the teen’s father) or even predict your future actions based on decisions you haven’t made yet. If you want to be part of this world, designed by advertisers and tech giants, you must relinquish your right to privacy. In other words, we live in a commercial surveillance state.

Well, unless you use Apple’s products.

Apple’s devices and software–and the company’s ethos–are now steeped in user privacy protections that other tech companies would never dream of embracing. And this isn’t a stance Apple has only recently adopted. It is something that has been building for years at the company, starting under Steve Jobs’ leadership and rapidly accelerating under Tim Cook’s reign.

It has only been in the last few years that the perils of online privacy have made their way to the forefront of national conversation, thanks to the Cambridge Analytica scandal and a seemingly unending string of data breaches and hacks. Such events have left consumers rightly worried about just how the data tech companies collect about them is being used and abused. Yet Apple seems to be the only major tech company that had the foresight–and the will–to begin tackling these issues before they reached a crisis point.

Apple protects your privacy other tech companies won’t

With each recent iteration of iOS and MacOS, Apple has steadily made it harder for third-parties to siphon our data from us. For example, Apple’s Safari browser was the first browser to block third-party cookies by default. In iOS 11 and MacOS High Sierra, Apple went a step further and implemented Intelligent Tracking Prevention, which reduces the ability of advertisers to track your movements around the web.

iOS 12, which ships on the new iPhones announced this week–and will be available for all iPhones and iPads going back to the iPhone 5s and original iPad Air–will allow users to shield themselves even more from the likes of Google and Facebook, whose prying digital eyes track us around the web via the embedded Like and Share buttons on web pages. Yep, Facebook and Google can track your movements even if you don’t interact with these buttons–well, until Apple shut that down.

In iOS 12 Apple is also introducing anti-fingerprinting technology in Safari. Fingerprinting is a tracking technology advertisers and data firms use to identify your movements online. They do this by recording characteristics about the device you are using–such as hard drive size, screen resolution, installed fonts, and more–and then recording a log of that device’s movements. Though fingerprinting doesn’t give the firms access to your name, they know what the owner of a specific device does online and can build a profile around those actions. Well, again, until Apple shut that down with iOS 12 by stripping the unique characteristics of your device away from advertisers’ tracking software. These same benefits are also found in Apple’s latest MacOS Mojave, by the way.
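The mechanics are simple enough to sketch. Below is a minimal, hypothetical illustration (the attribute names and hashing scheme are invented for the example, not taken from any real tracker) of how a handful of device characteristics can be combined into a stable identifier:

```python
import hashlib

def device_fingerprint(attributes: dict) -> str:
    """Combine device characteristics into a single stable identifier."""
    # Serialize the attributes in a fixed order so the same device
    # always produces the same string.
    canonical = "|".join(f"{k}={attributes[k]}" for k in sorted(attributes))
    # Hash the string into a compact, stable ID for this device.
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

visitor = {
    "screen": "2436x1125",
    "fonts": "Helvetica,SF Pro,Times",
    "timezone": "America/New_York",
    "storage_gb": 256,
}

# The same characteristics always yield the same ID, so the device can be
# recognized across unrelated sites without cookies or a login.
print(device_fingerprint(visitor))
```

Anti-fingerprinting countermeasures like Safari's work by making these characteristics look generic across devices, so the derived identifier is no longer unique to any one person.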

iPhone XS and iPhone XS Max [Photo: courtesy of Apple]

And Apple’s privacy protections extend to the hardware itself. In iOS 11, Apple introduced tech that physically disables data transfers from a device’s Lightning port to thwart bad actors from using cracking tools to access your data on your device. The company was also the first to introduce full disk encryption on its laptops and desktops with its FileVault technology. With FileVault 2, it turned this encryption on by default on every Mac–making it infinitely harder for someone with access to your Mac to access the data on it without the password. And with its latest MacBook Pros Apple even introduced a hardware backstop where the microphone in the laptop is automatically disabled when the lid is closed–ensuring no one can listen in on you. Further, in MacOS Mojave, apps will now need your explicit permission to access your Mac’s camera and microphone, so malware can’t hijack your camera to creep in on you and advertisers can’t use ultrasonic ad tracking to hear what you are watching on television.

Apple is now also enforcing its strict privacy protection policies on third-party developers. The company recently forced Facebook to remove its Onavo Protect VPN app from the App Store because Facebook was using it to create a log of every website an Onavo user went to (the app is still available for Android devices). Apple is also requiring all of its app developers to have a publicly posted privacy policy on how they use user data if they want their apps to continue to be available in the App Store after their next app update. While simply forcing app developers to publish their privacy policies can’t stop bad developers from abusing user data, it will create a written record that will make it easier to call developers out on privacy violations.

Why Apple can do what Facebook and Google can’t

Once I’ve explained all of these points, some of my friends ask me why Apple can do this. Or to put it another way, why don’t other tech giants like Facebook and Google do the same? A small part of the reason is ideological. So far, Tim Cook hasn’t said or done anything that makes me think his claim that privacy is a fundamental human right isn’t sincere.

But let’s be honest–Apple is a corporation, and a corporation’s goal is to make as much money as possible. In this age of tech giants, user data may be the new black gold, but Apple’s business model doesn’t rely on monetizing such information. Apple makes its hundreds of billions every year by selling physical products that have a high markup. Facebook and Google, on the other hand, have a business model built around advertisers who want as much data about users as possible so they can better target them. This is why, for example, Google would never build the types of anti-tracking and privacy protections into the Android OS that Apple has done with MacOS and iOS. Google–and Facebook–aren’t going to cut off their access to all that black gold.

That’s not to say Apple doesn’t collect user data. It does; it just keeps it to a minimum. In fact, iOS devices send ten times less data to Apple than Android devices send to Google, according to independent researchers. And most of the information an iOS device does send back to Apple is obfuscated with a technique called Differential Privacy, which adds random information to a user’s data before it reaches Apple so the company has no way of knowing that it came from your device.
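Apple hasn't published every detail of its implementation, but the classic textbook version of this idea, randomized response, shows how adding noise protects each individual while still letting aggregate statistics survive (the rates and parameters below are invented purely for illustration):

```python
import random

def randomized_response(truth: bool, p_honest: float = 0.5) -> bool:
    """Report the true answer with probability p_honest; otherwise report
    a coin flip. Any single report reveals almost nothing about one user."""
    if random.random() < p_honest:
        return truth
    return random.random() < 0.5

def estimate_true_rate(reports: list[bool], p_honest: float = 0.5) -> float:
    """Invert the noise: observed = p_honest * true + (1 - p_honest) * 0.5."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_honest) * 0.5) / p_honest

random.seed(0)
true_rate = 0.30  # fraction of users with some property
reports = [randomized_response(random.random() < true_rate)
           for _ in range(100_000)]
# Over many users the noise averages out; the estimate lands near 30%
# even though no individual report can be trusted.
print(estimate_true_rate(reports))
```

The design trade-off is deliberate: the collector learns population-level trends, but cannot work backward from any single submission to a single user.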

Apple can still improve its privacy protections

Mind you, all this isn’t to say Apple still can’t improve user privacy. My biggest gripe here is that while Apple uses end-to-end encryption for user passwords and messages, among other data–preventing hackers, authorities, and even Apple itself from accessing such information–it hasn’t expanded that end-to-end encryption to other places that need it. For instance, as many of us move to storing all of our files online, it’s disappointing that Apple doesn’t offer end-to-end encryption for files stored in iCloud Drive and the Notes app.

However, I should add a caveat. I understand why Apple (and Dropbox and other cloud storage providers) doesn’t provide end-to-end encryption for documents stored in the cloud: it’s a trade-off between user experience and security. If Apple were to enable end-to-end encryption on iCloud Drive and a user, such as my 71-year-old mother, forgot a password, Apple would be physically unable to recover the photos, financial documents, and any other data stored in iCloud. Still, having an option to enable end-to-end encryption on iCloud Drive would be nice for those of us willing to take the risk.

Another area where Apple could take the lead in improved privacy protections is by restricting which data fields are shared when a user decides to grant an app access to their contacts. Right now, all contacts data–from names to emails, phone numbers, birthdays, and home addresses–is uploaded to an app developer’s servers, and who knows what those developers do with that information at that point? Additionally, the notes section of a contact card is also uploaded–presenting a big security risk, as many people use the notes section on a contact card to write personal information (such as a child’s social security number). Ideally, Apple will restrict contact data uploads to just the names and email addresses in the future–and this is something I expect we’ll see the company do sooner rather than later.

Yet despite these limited gripes, my recommendation stands. When you pay that extra money for an Apple product, you’re not just buying better industrial design or more advanced underlying tech–you’re buying the right to keep more information about yourself to yourself. In an age when data breaches are the norm, data manipulation is a business model, and corporate surveillance of your life is at an all-time high–what better product is there than privacy?
