
New half-light half-matter particles may hold the key to a computing revolution


Scientists have discovered new particles that could lie at the heart of a future technological revolution based on photonic circuitry, leading to superfast, light-based computing.

Current computing technology is based on electronics, where electrons are used to encode and transport information.

Due to fundamental limitations, such as energy loss through resistive heating, it is expected that electrons will eventually need to be replaced by photons, leading to light-based computers that are much faster and more efficient than current electronic ones.

Physicists at the University of Exeter have taken an important step towards this goal, as they have discovered new half-light half-matter particles that inherit some of the remarkable features of graphene, the so-called “wonder material”.

This discovery opens the door for the development of photonic circuitry using these alternative particles, known as ‘massless Dirac polaritons’, to transport information rather than electrons.

Dirac polaritons emerge in honeycomb metasurfaces, which are ultra-thin materials that are engineered to have structure on the nanoscale, much smaller than the wavelength of light.

A unique feature of Dirac particles is that they mimic relativistic particles with no mass, allowing them to travel very efficiently. This property is what makes graphene one of the most conductive materials known.
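
In technical terms, this massless behavior is captured by the linear low-energy dispersion of graphene-like honeycomb lattices. The relation below is the standard textbook form, not a formula quoted from the Exeter paper:

```latex
% Massless Dirac dispersion near a Dirac point of a honeycomb lattice:
% the two bands E_± are linear in the momentum k measured from the
% Dirac point, so the group velocity dE/d(hbar k) = v_F is constant
% and the excitations behave like massless relativistic particles.
\[
  E_{\pm}(\mathbf{k}) = \pm \hbar v_F \,\lvert \mathbf{k} \rvert
\]
```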

However, despite these extraordinary properties, Dirac particles are very difficult to control. In graphene, for example, it is impossible to switch electrical currents on and off using a simple electrical potential, which hinders the implementation of graphene in electronic devices.

This fundamental drawback, the lack of tunability, has now been overcome in a unique way by the physicists at the University of Exeter.

Charlie-Ray Mann, the lead author of the paper published in Nature Communications, explains: "For graphene, one usually has to modify the honeycomb lattice to change its properties, for example by straining the honeycomb lattice, which is extremely challenging to do controllably."

"The key difference here is that the Dirac polaritons are hybrid particles, a mixture of light and matter components. It is this hybrid nature that presents us with a unique way to tune their fundamental properties, by manipulating only their light component, something that is impossible to do in graphene."

The researchers show that by embedding the honeycomb metasurface between two reflecting mirrors and changing the distance between them, one can tune the fundamental properties of the Dirac polaritons in a simple, controllable and reversible way.

"Our work has crucial implications for the research fields of photonics and of Dirac particles", adds Dr Eros Mariani, principal investigator on the study.

"We have shown the ability to slow down or even stop the Dirac particles, and modify their internal structure, their “chirality” in technical terms, which is impossible to do in graphene itself”

"The achievements of our work will constitute a key step along the photonic circuitry revolution".

The study "Manipulating type-I and type-II Dirac polaritons in cavity-embedded honeycomb metasurfaces" (DOI:10.1038/s41467-018-03982-7) was published in Nature Communications. The authors are Charlie-Ray Mann, Bill Barnes and Eros Mariani (University of Exeter), Thomas Sturges (formerly at University of Exeter, now at Warsaw University) and Guillaume Weick (University of Strasbourg).

The project was funded by the Engineering and Physical Sciences Research Council (EPSRC) of the United Kingdom through the EPSRC Centre for Doctoral Training in Metamaterials (Grant No. EP/L015331/1) and through Grant No. EP/K041150/1, by the Leverhulme Trust (Research Project Grant RPG-2015-101), by the Royal Society (International Exchange Grant No. IE140367, Newton Mobility Grant NI160073, and Theo Murphy Award TM160190), by the EU ERC project Photmat (ERC-2016-ADG-742222), by the Agence Nationale de la Recherche (Project ANR-14-CE26-0005 Q-MetaMat) and by the Centre National de la Recherche Scientifique (Contract No. 6384 APAG).


Piano Genie: An Intelligent Musical Interface


We introduce Piano Genie, an intelligent controller that maps 8-button input to a full 88-key piano in real time.

Piano Genie is in some ways reminiscent of video games such as Rock Band and Guitar Hero that are accessible to novice musicians, with the crucial difference that users can freely improvise on Piano Genie rather than re-enacting songs from a fixed repertoire. You can try it out yourself via our interactive web demo!

There are many ways one could map 8-button controller sequences to full piano performances. We restrict ourselves to 1-to-1 mappings between button presses and notes, giving the user precise control over timing and degree of polyphony but not which notes are played. Even given that restriction, there are many possible mappings; for example, the 8 buttons could map to a fixed scale over a single octave. Instead of using such a fixed mapping, we learn a time-varying mapping using a discrete autoencoder architecture trained on a set of existing piano performances:

A bidirectional LSTM encoder maps a sequence of piano notes to a sequence of controller buttons (4 buttons in the paper's illustrative figure, 8 in the actual system). A unidirectional LSTM decoder then decodes these controller sequences back into piano performances. After training, the encoder is discarded and controller sequences are provided by user input.
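
To make the architecture concrete, here is a minimal PyTorch sketch of the discrete autoencoder described above. The layer sizes, the scalar quantizer, and the straight-through gradient trick are illustrative assumptions; the authors' open-sourced implementation (linked below) is the authoritative version:

```python
# Minimal sketch of a Piano Genie-style discrete autoencoder.
# Sizes and the quantizer are assumptions, not the paper's exact setup.
import torch
import torch.nn as nn

NUM_KEYS = 88     # piano keys (decoder output vocabulary)
NUM_BUTTONS = 8   # controller buttons (discrete bottleneck)

class PianoGenieSketch(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(NUM_KEYS, hidden)
        # Bidirectional encoder: may look at the whole performance.
        self.encoder = nn.LSTM(hidden, hidden, batch_first=True,
                               bidirectional=True)
        self.to_scalar = nn.Linear(2 * hidden, 1)
        # Unidirectional decoder: must run causally at play time.
        self.decoder = nn.LSTM(hidden + 1, hidden, batch_first=True)
        self.to_logits = nn.Linear(hidden, NUM_KEYS)

    def quantize(self, z):
        # Squash to [0, NUM_BUTTONS - 1], round to a discrete button,
        # and keep gradients flowing with a straight-through estimator.
        z = torch.sigmoid(z) * (NUM_BUTTONS - 1)
        return z + (z.round() - z).detach()

    def forward(self, notes):                          # notes: (B, T) ints
        x = self.embed(notes)                          # (B, T, H)
        enc, _ = self.encoder(x)                       # (B, T, 2H)
        buttons = self.quantize(self.to_scalar(enc))   # (B, T, 1)
        # Teacher forcing: the decoder sees the previous note plus the
        # current button and predicts the current note.
        prev = torch.cat([torch.zeros_like(x[:, :1]), x[:, :-1]], dim=1)
        dec, _ = self.decoder(torch.cat([prev, buttons], dim=-1))
        return self.to_logits(dec), buttons            # logits: (B, T, 88)
```

Training then minimizes a standard cross-entropy reconstruction loss between the predicted logits and the true notes; at play time, the quantized button values are replaced by the user's actual button presses.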

In the demo video, you may notice that pitch contours in the piano performance closely mimic the contours of the button sequence. Such behavior is encouraged by a training loss term that penalizes the encoder for violating relative pitch ordering, e.g. if an ascending piano interval maps to a descending button interval. In the image accompanying the original post, a pianoroll for a real piano performance (top) and the 8-button sequence output by the contour-regularized encoder (bottom) have closely matching contours.
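
Continuing the sketch above, the contour regularizer might look as follows; the squared-hinge form and unit margin are assumptions rather than the paper's exact formulation:

```python
import torch

def contour_loss(buttons, notes):
    # buttons: (B, T) real-valued encoder outputs (squeeze the model's
    # trailing dimension first); notes: (B, T) integer pitches.
    db = buttons[:, 1:] - buttons[:, :-1]        # button intervals
    dn = (notes[:, 1:] - notes[:, :-1]).float()  # pitch intervals
    # sign(dn) * db is negative exactly when the two contours move in
    # opposite directions; a squared hinge penalizes only those steps.
    return torch.relu(1.0 - torch.sign(dn) * db).pow(2).mean()
```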


We trained the model on a set of ~1400 virtuosic performances from the International Piano-e-Competition, the same dataset used in Performance RNN. This choice of dataset naturally affects the style of the generated performances; as we have open-sourced the model (see below), you are welcome to train Piano Genie on your own set of MIDI files.

You can try Piano Genie yourself with our web demo. For more details, check out our arXiv paper. The web demo code and model training code are available on GitHub. Additional improvisation videos are available here.

What I loved about Paul Allen


Gone too soon

October 16, 2018

Paul Allen, one of my oldest friends and the first business partner I ever had, died yesterday. I want to extend my condolences to his sister, Jody, his extended family, and his many friends and colleagues around the world.

I met Paul when I was in 7th grade, and it changed my life.

I looked up to him right away. He was two years ahead of me in school, really tall, and proved to be a genius with computers. (Later, he also had a very cool beard, which I could never pull off.) We started hanging out together, especially once the first computer arrived at our school. We spent just about all our free time messing around with any computer we could get our hands on.

Here we are in school. That’s Paul on the left, our friend Ric Weiland, and me on the right.


Paul foresaw that computers would change the world. Even in high school, before any of us knew what a personal computer was, he was predicting that computer chips would get super-powerful and would eventually give rise to a whole new industry. That insight of his was the cornerstone of everything we did together.

In fact, Microsoft would never have happened without Paul. In December 1974, he and I were both living in the Boston area—he was working, and I was going to college. One day he came and got me, insisting that I rush over to a nearby newsstand with him. When we arrived, he showed me the cover of the January issue of Popular Electronics. It featured a new computer called the Altair 8800, which ran on a powerful new chip. Paul looked at me and said: “This is happening without us!” That moment marked the end of my college career and the beginning of our new company, Microsoft. It happened because of Paul.

As the first person I ever partnered with, Paul set a standard that few other people could meet. He had a wide-ranging mind and a special talent for explaining complicated subjects in a simple way. Since I was lucky enough to know him from such a young age, I saw that before the rest of the world did. As a teenager, I was curious about (of all things) gasoline. What did “refining” even mean? I turned to the most knowledgeable person I knew. Paul explained it in a super-clear and interesting way. It was just one of many enlightening conversations we would have over the coming decades.

Paul was cooler than I was. He was really into Jimi Hendrix as a teenager, and I remember him playing Are You Experienced? for me. I wasn’t experienced at much of anything back then, and Paul wanted to share this amazing music with me. That’s the kind of person he was. He loved life and the people around him, and it showed.

Sports was another passion that Paul loved to share with his friends. In later years he would take me to see his beloved Portland Trail Blazers and patiently helped me understand everything that was happening on the court.


When I think about Paul, I remember a passionate man who held his family and friends dear. I also remember a brilliant technologist and philanthropist who wanted to accomplish great things, and did.

Paul deserved more time in life. He would have made the most of it. I will miss him tremendously.

"+(mainIIG+1)+"  of  "+listOfObjects.length+"

"+capt01+" "+capt02+""); }else{ $('.InlineImageGalleryText').html("

"+(mainIIG+1)+"  of  "+listOfObjects.length+"

"+capt01+" "+capt02+""); } $('.iigNextArrow').click(function() { iignext(); }) $('.iigPrevArrow').click(function() { iigprev(); }) adjustIIG(); } function iigprev(){ var ww = $(window).width(); mainIIG-=1; if (mainIIGlistOfObjects.length-1){nIIG=0;} $('.InlineImageGalleryNextImage').html(listOfObjects[nIIG].outerHTML); pIIG = mainIIG-1; if (pIIG

"+(mainIIG+1)+"  of  "+listOfObjects.length+"

"+capt01+" "+capt02+""); }else{ $('.InlineImageGalleryText').html("

"+(mainIIG+1)+"  of  "+listOfObjects.length+"

"+capt01+" "+capt02+""); } $('.iigNextArrow').click(function() { iignext(); }) $('.iigPrevArrow').click(function() { iigprev(); }) adjustIIG(); } function adjustPQ(){ var pqW = $('.suspendedR').width(); var pqH = $('.suspendedR').height(); $('.pqrq').css("margin-left",pqW-60); $('.pqrq').css("margin-top",pqH-110); } function adjustIIG(){ var ww = $(window).width(); var iigW = $('.InlineImageGalleryBase').width(); //console.log(ww); //if (iigWlistOfObjects.length-1){nIIG=0;} $('.InlineImageGalleryNextImage').html(listOfObjects[nIIG].outerHTML); pIIG = mainIIG-1; if (pIIG

"+(mainIIG+1)+"  of  "+listOfObjects.length+"

"+capt01+" "+capt02+""); $('.IIGheadline').css("width",iigMW-200); var headH = $('.IIGheadline').height(); if (headH==15){ $('.IIGheadline').css("top",17); }else if(headH==30){ $('.IIGheadline').css("top",9); }else if(headH==45){ $('.IIGheadline').css("top",2); }else{ $('.IIGheadline').css("top",17); } $('.InlineImageGalleryText').css("height", 50); //console.log("h: "+$('.IIGheadline').height()+" w: "+$('.IIGheadline').width()); } } // ADD SUPPORT FOR INLINE INTERACTIVE FILES function formatInlineElems(){ // related content //small-12 medium-12 large-12 columns var listOfRelated = []; $('.promounit').each(function (i, obj) { listOfRelated.push(obj); }); //console.log(listOfRelated); // CTA $('.bgcta').each(function (i, obj) { //console.log($(this).children().eq(1).prop('src')); $(this).after( "" ); var imgOn = $(this).children().eq(1).prop('src'); var imgOff = $(this).children().eq(0).prop('src'); $("#bgctab"+i).mouseenter(function() { $(this).find('img').attr("src", ""+imgOn); }).mouseleave(function() { $(this).find('img').attr("src", ""+imgOff); }); $(this).remove(); $("#bgctab"+i).children(0).attr("width", "100%" ); }); // marginalia right $('.bgsbnoter').each(function (i, obj) { var ww = $(window).width(); if (ww > 980) { $(this).after( "" ); $(this).remove(); $("#bgsbnoter"+i).children(0).attr("width", "100%" ); $("#bgsbnoter"+i).css("position","relative"); $("#bgsbnoter"+i).css("top",$(this).attr("offsetY")); $("#bgsbnoter"+i).css("left",$(this).attr("offsetX")); } else { $(this).remove(); } }); // marginalia left $('.bgsbnotel').each(function (i, obj) { var ww = $(window).width(); if (ww > 980) { $(this).after( "" ); $(this).remove(); $("#bgsbnotel"+i).children(0).attr("width", "100%" ); $("#bgsbnotel"+i).css("position","relative"); $("#bgsbnotel"+i).css("top",$(this).attr("offsetY")); $("#bgsbnotel"+i).css("left",$(this).attr("offsetX")); } else { $(this).remove(); } }); // left image $('.bgli').each(function (i, obj) { //$(this).after( "

"+listOfRelated[0].outerHTML+"

" ); // related content $(this).after( "" ); $(this).remove(); $("#gblib"+i).children(0).attr("width", "100%" ); }); // block quote $('.bgbq').each(function (i, obj) { var PQtext = $(this).html(); var s = PQtext.replace(/\u201C/g, " "); var s2 = s.replace(/\u201D/g, ""); $(this).after( "" ); $(this).remove(); $(".articleContent p").css("z-index", "1"); $(".articleContent p").css("position", "relative"); }); // underline $('.TGNil_underline').each(function (i, obj) { $(this).css("border-bottom","0px solid "+"#f5d840");//+$(this).attr("mycolor")); $(this).css("box-shadow","inset 0 -5px 0 "+"#f5d840");//+$(this).attr("mycolor")); $(this).css("color","inherit"); $(this).css("-webkit-transition","background .15s cubic-bezier(.33,.66,.66,1)"); $(this).css("transition","background .15s cubic-bezier(.33,.66,.66,1)"); }); // highlight $('.TGNil_highlight').each(function (i, obj) { $(this).css("border-bottom","2px solid "+$(this).attr("mycolor")); $(this).css("box-shadow","inset 0 -23px 0 "+$(this).attr("mycolor")); $(this).css("color","inherit"); $(this).css("-webkit-transition","background .15s cubic-bezier(.33,.66,.66,1)"); $(this).css("transition","background .15s cubic-bezier(.33,.66,.66,1)"); }); // inline image gallery $('.bgiig').css("display", "block");//remove after testing $('.bgiig').each(function (i, obj) { $(this).children().each(function () { $(this).attr('style', 'height:100%!important;width:auto!important;'); listOfObjects.push(this); }); var capt01 = listOfObjects[0].getAttribute('data-caption1');//listOfObjects[0].dataset.caption1 var capt02 = listOfObjects[0].getAttribute('data-caption2'); if (capt01==null){capt01="";} if (capt02==null){capt02="";} //thumb.getAttribute('data-bigwidth') $(this).after( "

"+listOfObjects[0].outerHTML+"

1  of  "+listOfObjects.length+"

"+capt01+" "+capt02+"
"); //$(this).after( "

"+listOfObjects[0].outerHTML+"

1  of  "+listOfObjects.length+"

"+listOfObjects[0].dataset.caption1+" "+listOfObjects[mainIIG].dataset.caption2+"
" ); $(this).remove(); //console.log(listOfObjects); //$('.articleContent').css("color","#F00"); }); var IIGlength = listOfObjects.length; //console.log(IIGlength); $('.InlineImageGalleryNext').click(function() { iignext(); }) $('.InlineImageGalleryPrev').click(function() { iigprev(); }) $('.iigNextArrow').click(function() { iignext(); }) $('.iigPrevArrow').click(function() { iigprev(); }) // if( $('.InlineImageGalleryBase').length ) { adjustIIG(); } // if( $('.suspendedR').length ) { adjustPQ(); } //$('.suspended').each(function (i, obj) { // var thisW = $(this).width(); // var thisH = $(this).height(); // if(thisW>thisH){ // $(this).css("width",450); // } //}); } function getCommentCount() { var see = $(".disqusLink").text(); if (see == "") { setTimeout(getCommentCount, 500); } else { var i = parseInt(see); $(".commentCount").each(function (ii, v) { $(v).text(i.toString()); }); } } var scrollwidthDU = function () { var inner = document.createElement('p'); inner.style.width = "100%"; inner.style.height = "200px"; var outer = document.createElement('div'); outer.style.position = "absolute"; outer.style.top = "0px"; outer.style.left = "0px"; outer.style.visibility = "hidden"; outer.style.width = "200px"; outer.style.height = "150px"; outer.style.overflow = "hidden"; outer.appendChild(inner); document.body.appendChild(outer); var w1 = inner.offsetWidth; outer.style.overflow = 'scroll'; var w2 = inner.offsetWidth; if (w1 == w2) w2 = outer.clientWidth; document.body.removeChild(outer); return (w1 - w2); }(); function adjustPrevNext() { var hn = $(".nextText").height(); var hp = $(".prevText").height(); var h = Math.max(hn+24, hp+24); //console.log("prev " + hp + " next " + hn); $("#content_0_nextLink").css("height", h + "px"); $("#content_0_prevLink").css("height", h + "px"); //$(".nextText").css("padding-top", (((h-hn) / 2) + 8) + "px"); //$(".prevText").css("padding-top", (((h-hp) / 2) + 8) + "px"); $(".nextArrow").css("padding-top", (((h-33) / 2)) + "px"); $(".prevArrow").css("padding-top", (((h-33) / 2)) + "px"); } function doOmnitureEvent(event) { try { s.linkTrackVars = "events,eVar9"; s.linkTrackEvents = "event11,event12,event13,event15"; s.events = event; s.tl(true, 'o', 'article'); } catch (e) { //console.log(e.message); //console.log(e.name); } } $(document).ready(function () { // Omniture Stuff var tagsLocation = $(".bottom").offset().top; // was tagsLocation var scrolledToTags = false; var scrolledToComments = false; var tgnbody = $("#tgnbody"); $(".fbShareThis").click(function () { s.eVar9 = "facebook share"; doOmnitureEvent("event11"); }); $(".twitterShareThis").click(function () { s.eVar9 = "twitter share"; doOmnitureEvent("event12"); }); $("#tgnbody").scroll(function () { if (scrolledToTags === false) { if (tgnbody.height() + tgnbody.scrollTop() > tagsLocation) { scrolledToTags = true; doOmnitureEvent("event13"); } } if (scrolledToComments === false) { if (tgnbody.scrollTop() > tagsLocation) { scrolledToComments = true; doOmnitureEvent("event15"); } } }); }); $(document).ready(function () { var clicked = false; var title = document.location.pathname; if ($(".videoEmbed iframe").length > 0) { $(".videoEmbed iframe").iframeTracker({ blurCallback: function () { if (clicked === false) { try { s.events = "event8"; s.eVar6 = title; s.prop9 = title; s.linkTrackVars = 'events,eVar6,prop9'; s.linkTrackEvents = "event8"; s.tl(true, 'o', title); } catch (e) { //console.log(title); //console.log(event); //console.log(e.message); //console.log(e.name); } clicked = true; } return true; } }); 
} });

ChuChu TV is responsible for widely viewed toddler content on YouTube


ChuChu TV, the company responsible for some of the most widely viewed toddler content on YouTube, has a suitably cute origin story. Vinoth Chandar, the CEO, had always played around on YouTube, making Hindu devotionals and little videos of his father, a well-known Indian music producer. But after he and his wife had a baby daughter, whom they nicknamed “Chu Chu,” he realized he had a new audience—of one. He drew a Chu Chu–like character in Flash, the animation program, and then created a short video of the girl dancing to the popular and decidedly unwoke Indian nursery rhyme “Chubby Cheeks.” (“Curly hair, very fair / Eyes are blue, lovely too / Teacher’s pet, is that you?”)


Chu Chu loved it. “She wanted me to repeat it again and again,” Chandar recalls. Which gave him an idea: “If she is going to like it, the kids around the world should like it.” He created a YouTube channel and uploaded the video. In a few weeks, it had 300,000 views. He made and uploaded another video, based on “Twinkle, Twinkle, Little Star,” and it took off. After posting just two videos, he had 5,000 subscribers to his channel. Someone from YouTube reached out and, as Chandar remembers it, said, “You guys are doing some magic with your content.” So Chandar and several of his friends formed a company in Chennai, in the South Indian state of Tamil Nadu, from the bones of an IT business they’d run. They hired a few animators and started putting out a video a month.

Five years on, ChuChu TV is a fast-growing threat to traditional competitors, from Sesame Street to Disney to Nickelodeon. With all its decades of episodes, well-known characters, and worldwide brand recognition, Sesame Street has more than 5 billion views on YouTube. That’s impressive, but ChuChu has more than 19 billion. Sesame Street’s main feed has 4 million subscribers; the original ChuChu TV channel has 19 million—placing it among the top 25 most watched YouTube channels in the world, according to the social-media-tracking site Social Blade—and its subsidiary channels (primarily ChuChu TV Surprise Eggs Toys and ChuChu TV Español) have another 10 million.

According to ChuChu, its two largest markets are the United States and India, which together generate about one-third of its views. But each month, tens of millions of views also pour in from the U.K., Canada, Mexico, Australia, and all over Asia and Africa. Roughly 20 million times a day, a caretaker somewhere on Earth fires up YouTube and plays a ChuChu video. What began as a lark has grown into something very, very big, inflating the company’s ambitions. “We want to be the next Disney,” Chandar told me.

But whereas Disney has long mined cultures around the world for legends and myths—dropping them into consumerist, family-friendly American formats—ChuChu’s videos are a different kind of hybrid: The company ingests Anglo-American nursery rhymes and holidays, and produces new versions with subcontinental flair. The characters’ most prominent animal friend is a unicorn-elephant. Nursery rhymes become music videos, complete with Indian dances and iconography. Kids of all skin tones and hair types speak with an Indian accent.

Many observers respond to ChuChu’s unexpected success by implying that the company has somehow gamed the system. “Whenever we go to the U.S.,” Chandar told me, “people say, ‘You guys cracked the algorithm.’ But we didn’t do anything. The algorithm thing is a complete myth.”

ChuChu does not employ the weird keyword-stuffed titles used by lower-rent YouTube channels. The company’s titles are simple, sunny, consistent. Its theory of media is that good stuff wins, which is why its videos have won. “We know what our subscribers want, and we give it to them,” Chandar says. ChuChu says it adds roughly 40,000 subscribers a day.

That kind of growth suggests that something unpredictable and wild is happening: America’s grip on children’s entertainment is coming to an end. ChuChu is but the largest of a new constellation of children’s-media brands on YouTube that is spread out across the world: Little Baby Bum in London, Animaccord Studios in Moscow, Videogyan in Bangalore, Billion Surprise Toys in Dubai, TuTiTu TV in Tel Aviv, and LooLoo Kids in Iași, a Romanian town near the country’s border with Moldova. The new children’s media look nothing like what we adults would have expected. They are exuberant, cheap, weird, and multicultural. YouTube’s content for young kids—what I think of as Toddler YouTube—is a mishmash, a bricolage, a trash fire, an explosion of creativity. It’s a largely unregulated, data-driven grab for toddlers’ attention, and, as we’ve seen with the rest of social media, its ramifications may be deeper and wider than you’d initially think.

With two small kids in my own house, I haven’t been navigating this new world as a theoretical challenge. My youngest, who is 2, can rarely sustain her attention to watch the Netflix shows we put on for my 5-year-old son. But when I showed her a ChuChu video, just to see how she’d react, I practically had to wrestle my phone away from her. What was this stuff? Why did it have the effect it did?

To find out, I had to go to Chennai.

Uber in Chennai is essentially the same as Uber in Oakland, California, where I live. In the airport I hit a button on my phone, and soon a white sedan pulled up outside. My driver was a student who had come to Chennai to break into Tollywood. Yes, Tollywood: T for Telugu, the language spoken by 75 million people, mostly in South India.

The driver dropped me off just south of the center of the city, in an area of new high-rises that overlook Srinivasapuram, a fishing village on the Bay of Bengal. The village hangs on to the edge of the city, which has been modernizing fast; the government has been trying to relocate the village for years. From my hotel, I watched tiny figures wander over to the Adyar River estuary and squat, staring up at the opulence of the new Chennai.

ChuChu’s headquarters take up the entire first floor of a blue-glass building with bright-yellow stripes. Rows of animators flank a center aisle that houses big, colorful flourishes—weird chairs, structural columns with graffiti on them—signifying “fun tech office!” The work floor is ringed by maybe 10 offices that house the higher-ups. ChuChu says it employs about 200 people.

Chandar met me and led me into a massive conference room. In addition to being the CEO, he composes music for ChuChu. He’s the public face of the company and, at 39, a few years younger than the other four founders, who each hold an equal stake. He sent a young man to get me a coffee, and then we sat down together with his friend B. M. Krishnan, a former accountant and a ChuChu co-founder who is now the company’s chief creative officer.

It was after Krishnan joined the creative team, Chandar told me, that ChuChu really began to achieve global popularity. What made the difference, in part, was that Krishnan decided to rewrite nursery rhymes that he felt didn’t end well or teach good morals. What if Jack and Jill, after falling down while fetching the pail of water, get back up, learn from the resilience of birds and ants, actually get the damn pail of water, and give it to their mom? “It was ‘Jack and Jill 2.0,’” Chandar said. “I thought, This is how a nursery rhyme should be.”

After Krishnan rewrote a nursery rhyme, Chandar would then take the lyrics and compose music around them. The songs are simple, but if you hear them once, you will hear them for the rest of your life. Krishnan would storyboard the videos, imagining the sequence of shots, as befitting his youthful dream of becoming a movie director. ChuChu productions are essentially music videos for kids, sometimes featuring Tollywood dance moves that Chandar and Krishnan demonstrate for the animators.

ChuChu TV employs about 200 people; its headquarters, in Chennai, India, is full of big, colorful flourishes that signify “fun tech office!” (Asmita Parelkar)

The ChuChu guys didn’t set out to make educational programming. They were just making videos for fun. How were they to know they’d become a global force in children’s entertainment? As time went on and the staff expanded, the company created a teaching series, called Learning English Is Fun, and worked with a preschool company to develop an app, ChuChu School, that has an explicitly didactic purpose. But generally speaking, Chandar and Krishnan just wanted their videos to be wholesome—to deliver entertainment that perhaps provided kids with a dose of moral instruction.

Krishnan had no experience other than his own parenting. But if whatever he did as a parent worked for his kids, he felt, why wouldn’t it work for everyone? For example, when he taught his kids left from right, he liked to do it in the car, when they were in the back seat. That way, if he pointed left, it was left for them, too. So when ChuChu made a video teaching the left-right concept, it made sure to always show the characters from behind, not mirrored, so that when a character pointed left, the kids watching would understand.

As it became clear that ChuChu videos were being watched by millions of people on six continents, Krishnan and Chandar started branching out into original songs and nursery rhymes, which Krishnan has been writing for the past couple of years. Their content runs the gamut, from an adaptation of “Here We Go Round the Mulberry Bush,” dedicated to tree planting as a way to fight global warming, to “Banana Song” (“Na na na banana / long and curved banana”).

But their most popular video, by far, is a compilation that opens with “Johny Johny Yes Papa,” a take on a nursery rhyme popular in India. With 1.5 billion views, it’s one of the most watched videos of any kind, ever.

In it, a small boy wakes up in the middle of the night and sneaks to the kitchen. He grabs a jar of sugar; just as he’s spooning some into his mouth, the light switches on, and his father walks in.

“Johny Johny?” his father says.

“Yes, Papa?”

“Eating sugar?”

“No, Papa.”

“Telling lies?”

“No, Papa.”

“Open your mouth.”

“Ha ha ha!”

As the son laughs, the song kicks up, and all the kids in the family play and dance together.

When Krishnan watches “Johny Johny,” he sees a universal father-child interaction. The kid tries to get one over on the dad, and when the dad catches him, the parent isn’t actually annoyed. Instead, he’s almost delighted by the sly willfulness. “Inside, the father will be a little happy,” Krishnan said. “This child is having some brains.”

To an adult, the appeal of ChuChu videos is not totally obvious. On the one hand, the songs are catchy, the colors are bright, and the characters are cute. On the other, the animation is two-dimensional and kind of choppy, a throwback to the era before Pixar. And there is a lot of movement; sometimes every pixel of the screen seems to be in motion. Krishnan and Chandar believe that any given shot needs to include many different things a child could notice: A bird flying in the background. Something wiggling. These things hold kids’ attention.

The men know this with quantitative precision. YouTube analytics show exactly when a video’s audience falls off. ChuChu and other companies like it—whatever their larger philosophy—can see exactly what holds a toddler’s attention, moment by moment, and what causes it to drift. If a video achieves a 60 percent average completion rate, ChuChu knows it has a hit. Using these data doesn’t let it “crack the algorithm”; everyone has access to a version of these numbers. Instead, Chandar uses the analytics to tune his and other creators’ intuition about what works.

But what people want changes. As YouTube became the world’s babysitter—an electronic pacifier during trips, or when adults are having dinner—parents began to seek out videos that soaked up more time. So nowadays what’s most popular on Toddler YouTube is not three-minute songs, but compilations that last 30 to 45 minutes, or even longer.

Vinoth Chandar (right) started creating animated videos for his daughter, then decided to share them with the world. He went on to found ChuChu TV with his friend B. M. Krishnan (left) and three others. (Asmita Parelkar)

ChuChu learns many lessons from parents, who provide the company with constant feedback. It heard from parents who questioned the diversity of its characters, who were all light-skinned; it now has two light-skinned and two dark-skinned main characters. It heard from parents who wondered about the toy guns in one video; it removed them. It heard from parents about an earlier version of the “Johny Johny” video, in which the little boy sleeps in a communal bed with his family, as is common in India; in a new version, he has his own room.

ChuChu is largely making things up as it goes, responding—as any young company would—to what its consumers want. Despite the company’s earnest desire to educate the kids who watch its videos, it has not tried to use the lessons generated by previous generations of educational-TV makers. Its executives and developers don’t regularly work with academics who could help them shape their content to promote healthy development of young brains. So what effects are ChuChu’s shows having on kids? How does what it’s producing compare with whatever kids were watching before?

Part of the absurdity of the internet is that these questions get asked only after something metastasizes and spreads across the world. But children’s content reflects its time, and this is how we live.

Fifty years ago, the most influential children’s-television studio of the 20th century, Children’s Television Workshop, came into being, thanks to funding from the Ford Foundation, the Carnegie Corporation of New York, and the United States government. It created an unprecedented thing—Sesame Street—with help from a bevy of education experts and Jim Henson, the creator of the Muppets. The cast was integrated. The setting was urban. The show was ultimately broadcast on public television across America, defining a multicultural ideal at a time of racial strife. It was the preschool-media embodiment of the War on Poverty, a national government solution to the problems of America’s cities.

The 1990s and 2000s saw the growth of cable TV channels targeted at children. With the rise of ubiquitous merchandising deals and niche content, powerful American media companies such as Disney, Turner, and Viacom figured out how to make money off young kids. They created, respectively, the Disney Channel, the Cartoon Network, and, of course, Nickelodeon, which was the most watched cable channel during traditional television’s peak year, 2009–10 (Nielsen’s measurement period starts and ends in September). Since then, however, little kids have watched less and less television; as of last spring, ratings in 2018 were down a full 20 percent from the year before. As analysts like to put it, the industry is in free fall. The cause is obvious: More and more kids are watching videos online.

This might not exactly seem like a tragedy. After all, Americans watch a lot of TV. By the time Nielsen began recording how much time Americans spent in front of TV screens in 1949–50, each household was already averaging four hours and 35 minutes a day. That number kept going up, passing six hours in 1970–71, seven hours in 1983–84, all the way up to eight hours in 2003–04. Viewing finally peaked at eight hours and 55 minutes in 2009–10. Since then, the numbers have been gliding downward, with the most recent data showing Americans’ viewing habits edging under eight hours a day for the first time since George W. Bush’s presidency.

Given this baseline, perhaps it’s fine that phones—and YouTube specifically—are spooning some number of hours from TV. Considered purely as a medium, television seems to have little to recommend it over YouTube. But that would ignore the history of children’s television, which is one of those 20th-century triumphs that people take for granted.

The institutions of the 20th century shaped television into a tool for learning. Researchers, regulators, and creators poured tremendous resources into producing a version of children’s TV that, at the very least, is not harmful to kids and that has even been shown to be good for them under the right conditions.

At first, pretty much everybody agrees, television for kids was bad—dumb cartoons, cowboy shows, locally produced slop. There also wasn’t much of it, so kids often watched whatever adult programming was on TV. In the early 1950s, one teacher enumerated the changes she’d seen in her pupils since they had “got television”: “They have no sense of values, no feeling of wonder, no sustained interest. Their shallowness of thought and feeling is markedly apparent, and they display a lack of cooperation and inability to finish a task.” There were calls for action.

Congress held hearings on television’s possible deleterious effects on children (and adults) in 1952, 1954, and 1955. But not much happened, and the government and TV networks generally settled into a cycle that has been described by the media scholar Keisha Hoerrner. “First,” she has written, “the government castigated the industry for its deplorable programming, then the industry took its verbal punishment and promised to do better, followed by the government staying out of the industry’s business.”

Absent substantive oversight by regulators, in the late 1960s the calls for change entered a new, more creative phase. A group calling itself Action for Children’s Television began advocating for specific changes to programming for young kids. The Corporation for Public Broadcasting was formed in 1968 with government dollars. At the same time, Children’s Television Workshop began producing Sesame Street, and the forerunner to PBS, National Educational Television, began distributing Mister Rogers’ Neighborhood. These shows were tremendously successful in creating genuinely educational television. By the time children’s programming got swept up into the growing cable industry, the big channels had learned a lot from the public model, which they incorporated into shows such as Dora the Explorer and Blue’s Clues.

Add all these factors up, and a surprising thing is revealed: Through the sustained efforts of children’s-TV reformers, something good happened. “Basic scientific research on how children attend to and comprehend television has evolved into sophisticated studies of how children can learn from electronic media,” a literature review by the Kaiser Family Foundation concluded. “This, in turn, has led to the design and production of a number of effective educational television programs, starting with Sesame Street, which many experts regard as one of the most important educational innovations of recent decades.”


Among the specific findings, researchers demonstrated that Sesame Street improved children’s vocabulary, regardless of their parents’ education or attitudes. Another study found that regular adult TV stunted vocabulary development, while high-quality educational programs accelerated language acquisition. The most fascinating study began in the 1980s, when a University of Massachusetts at Amherst team installed video cameras in more than 100 homes, and had those families and hundreds of others keep a written log of their media diet. Following up more than a decade later, researchers found that “viewing educational programs as preschoolers was associated with higher grades, reading more books, placing more value on achievement, greater creativity, and less aggression.” On the flip side, violent programming led to lower grades among girls, in particular. The team was unequivocal about the meaning of these results: What kids watched was much more important than how much of it they watched. Or, as the researchers’ refutation of Marshall McLuhan’s famous aphorism went, “The medium is not the message: The message is.”

So what message are very young kids receiving from the most popular YouTube videos today? And how are those children being shaped by the videos?

To explore this question, I sought out Colleen Russo Johnson, a co-director of UCLA’s Center for Scholars & Storytellers. Johnson did her doctoral work on kids’ media and serves as a consultant to studios that produce children’s programming. I asked her to watch “Johny Johny Yes Papa” and a few other ChuChu videos and tell me what she saw.

Her answer was simple: “Bright lights, extraneous elements, and faster pacing.” In one of the videos I had her watch, a little boy dances flanked by two cows on a stage. A crowd waves its hands in the foreground. Lights flash and stars spin in the background. The boy and the cows perform “Head, Shoulders, Knees, and Toes,” and as they do, the dance floor lights up, à la Saturday Night Fever. Johnson told me all that movement risks distracting kids from any educational work the videos might do.

For kids to have the best chance of learning from a video, Johnson told me, it must unfold slowly, the way a book does when it is read to a child. “Calmer, slower-paced videos with less distracting features are more effective for younger children,” she said. “This also allows the video to focus attention on the relevant visuals for the song, thus aiding in comprehension.”

To be clear, it’s hard to make videos that very young children can learn from. (Johnson’s doctoral adviser, Georgene Troseth, was part of the team that demonstrated this.) Children under 2 struggle to translate the world of the screen to the one they see around them, with all its complexity and three-dimensionality. That’s why things like Baby Einstein have been debunked as educational tools. Most important for kids under 2 is rich interaction with humans and their actual environments. Older toddlers are the ones who can get something truly educational from videos, as opposed to just entertainment and the killing of time.

But even in relatively limited doses, these videos can affect young toddlers’ development. If kids watch a lot of fast-paced videos, they come to expect that that is how videos should work, which could make other educational videos less compelling and effective. “If kids get used to all the crazy, distracting, superfluous visual movement, then they may start requiring that to hold their attention,” Johnson says.

ChuChu has changed over time—it has slowed the pacing of its videos, focused on the key elements of scenes, and made more explicitly educational videos. But in the wilds of YouTube, the videos with the most views, not the most educational value, are the ones that rise to the top. ChuChu’s newer videos, which have more of the features Johnson looks for, have not had the time to hoover up as much attention, so the old ones keep appearing in YouTube searches and suggestions.

Not to put too fine a point on it, but this is almost precisely the problem that the rest of the media world finds itself in. Because quality is hard to measure, the numbers that exist are the ones that describe attention, not effect: views, watch time, completion rate, subscribers. YouTube uses those metrics, ostensibly objectively, when it recommends videos. But as Theodore Porter, the great historian of science and technology, put it in his book Trust in Numbers, “Quantification is a way of making decisions without seeming to decide.”

In a widely circulated essay last year, the artist James Bridle highlighted the many violent, odd, and nearly robotic children’s videos sitting in the vaults of YouTube. They didn’t seem made by human hands, he wrote, or at least not completely. Some were sadistic or sick. (After Bridle’s essay was published, YouTube undertook an effort to purge the site of “content that attempts to pass as family-friendly, but clearly is not,” and ultimately removed some of the disturbing videos the essay cited.) Others seemed like grab bags of keywords that had been successful for more professional operations: nursery rhymes, surprise eggs, finger family, learning colors. These were videos reverse engineered from whatever someone might enter into the YouTube search box. And though none of these videos has achieved the scale of ChuChu’s work, they definitely get seen, and are occasionally recommended to a child who has been happily watching something more virtuous.

The world of YouTube is vastly different from the world of broadcast television. While broadcasters in the United States and abroad are bound by rules, and the threat of punishment for breaking those rules, far fewer such regulations apply to the creators of YouTube content, or to YouTube itself. YouTube’s default position is that no one under 13 is watching videos on its site—because that’s the minimum age allowed under its terms of service. In addition to its main site, however, the company has developed an app called YouTube Kids. Like normal YouTube, it plays videos, but the design and content are specifically made for parents and children. It’s very good. It draws on the expertise of well-established children’s-media companies. Parents can restrict their children’s viewing in a multitude of ways, such as allowing access only to content handpicked by PBS Kids. But here’s the problem: Just a small fraction of YouTube’s 1.9 billion monthly viewers use it. (YouTube Kids is not available in as many countries as normal YouTube is.)

Little kids are responsible for billions of views on YouTube—pretending otherwise is irresponsible. In a small study, a team of pediatricians at Einstein Medical Center, in Philadelphia, found that YouTube was popular among device-using children under the age of 2. Oh, and 97 percent of the kids in the study had used a mobile device. By age 4, 75 percent of the children in the study had their own tablet, smartphone, or iPod. And that was in 2015. The sea change in children’s content that ChuChu and other new video makers have effected is, above all, profitable.

​Fast-paced, bright videos easily grab young children’s attention—but those qualities may negate any educational benefits the videos could have. (Asmita Parelkar)

To date, YouTube has hidden behind a terms-of-service defense that its own data must tell it is toothless. There don’t seem to be any imminent regulatory solutions to this; by and large, YouTube regulates itself. The company can declare its efforts for children sufficient at any point.

But there is something the company could do immediately to improve the situation. YouTube knows that I—and tens of millions of other people—have watched lots of videos made for toddlers, but it has never once recommended that I switch to YouTube Kids. Think of how hard Facebook works to push users from Instagram onto Facebook and vice versa. Why not try to get more families onto the YouTube Kids app? (Malik Ducard, YouTube’s global head of family and learning, said in a statement that YouTube has “worked hard to raise awareness of the YouTube Kids app through heavy promotion. These promos have helped drive our growth. Today, YouTube Kids has over 14 million weekly viewers and over 70 billion views.”)

If streaming video followed the broadcast model, YouTube—in partnership with governments around the world—could also subsidize research into creating educational content specifically for YouTube, and into how best to deliver it to children. The company could invest in research to develop the best quantitative signals for educational programming, so it could recommend that programming to viewers its algorithm believes to be children. It could fund new educational programming, just as broadcasters have been required to do for decades. (“We are always looking for ways to build the educational content offering in the app in a way that’s really fun and engaging for kids,” Ducard said.)

Other, more intense measures could help, too. For example, how about restricting toddler videos to the YouTube Kids app? Toddler content could, in effect, be forbidden on the main platform. If video makers wanted their work on the YouTube Kids app, they’d have to agree to have it only on the Kids app. This might hurt their view counts initially, but it would keep kids in a safer environment, and in the long term would protect the brand from the inevitable kid-related scandals. The issue of inappropriate videos popping up in YouTube Kids has received a good deal of national press—but society can live with a tiny sliver of bad things slipping through the company’s filters. It’s a small issue compared with kids watching billions of videos on regular YouTube. Why worry about the ways a kid could hurt himself in a padded room, when huge numbers of kids are tromping around the virtual city’s empty lots? (Ducard said that YouTube knows families watch videos together: “That’s why this content is available on our main YouTube site and also on our YouTube Kids app.”)

Maybe better or more refined solutions exist, but if the history of children’s television teaches us anything, it’s that the market alone will not generate the best outcomes for kids. Nor is the United States government likely to demand change, at least not without prompting. Heroes will have to emerge to push for change in the new YouTube’d world, just as they did in the early days of broadcast children’s TV. And not all of those heroes will come from the Western world. They’ll come from all over the globe, maybe even Chennai.

For any well-meaning kids’ producer, one model to look to for inspiration is Fred Rogers—PBS’s Mister Rogers. Rogers didn’t have any deep academic background in children’s development, but early on, he grasped the educational possibilities of the new medium, and in the 15 years between the first children’s show he produced and the national premiere of Mister Rogers’ Neighborhood, he worked constantly to make it better for kids. ChuChu could well be going through a similar stage now. Founded just five years ago, it’s encountering a different, and tougher, media landscape than Rogers did—but his path is still worth following.

​The subsidiary channel ChuChu TV Brazil now has more than 600,000 subscribers. Together, ChuChu’s channels have some 29 million. (Asmita Parelkar)

Watching my daughter play with my phone is a horrifying experience, precisely because her mimicry of adult behaviors is already so accurate. Her tiny fingers poke at buttons, pinch to zoom, endlessly scroll. It’s as though she’s grown a new brain from her fingertips. Most parents feel some version of this horror. Watching them poke and pinch at our devices, we realize that these rectangles of light and compulsion are not going away, and we are all dosing ourselves with their pleasures and conveniences without knowing the consequences.

It took energy and institutional imagination to fix TV for kids. Where will that come from today? Who will pay for the research and, later, the production? How would or could YouTube implement any kind of blanket recommendation?

I worry about these questions a lot, and I wonder if our 21st-century American institutions are up to the challenges they’ve created with their market successes and ethical abdications. Even so, when I visited Chennai, I felt okay about the media future we’re heading into. The toddler videos that ChuChu is posting on YouTube are cultural hybrids, exuberant and cosmopolitan, and in a philosophical sense they presuppose a world in which all children are part of one vast community, drawing on the world’s collective heritage of storytelling. That’s a rich narrative rootstock, with lots of lessons to teach—and right now who’s better poised to make the most of it than ChuChu and companies like it, especially if they can learn from the legacy of American educational TV?

ChuChu’s founders aren’t blind to the power of new-media platforms, or the undertow of crappy YouTube producers, or the addictive power of devices, but the magnitude and improbability of their success more than balance the scales. They don’t quite seem to know why (or how, exactly) they’ve been given this opportunity to speak to millions from an office in South India, but they’re not going to throw away the chance. After all, there are so many stories to tell.

On my last day in the ChuChu offices, Krishnan related a parable to me from the Mahābhārata, a Sanskrit epic. A prince wants to be known as generous, so the god Krishna decides to put him to the test: He creates two mountains of gold and tells the prince to give it all away in 24 hours. The prince begins to do so, parceling it out to people he thinks need it. But as the day ends he’s hardly made a dent in the mountains. So Krishna calls another prince and tells him he has just five minutes to give away the gold. This prince sees two people walking along, goes right over to them, and gives each a mountain. Just like that, the job is done. The moral is unsettling, but simple: Don’t impose limits on your generosity.

Krishnan loves this parable. “This is a story which I can do for ChuChu,” he told me. “But with pizza.”


This article appears in the November 2018 print edition with the headline “Raised by YouTube.”

Life Got You Down? Time to Read The Master and Margarita


‘“And what is your particular field of work?” asked Berlioz. “I specialize in black magic.”’

If many Russian classics are dark and deep and full of the horrors of the blackness of the human soul (or, indeed, are about the Gulag), then this is the one book to buck the trend. Of all the Russian classics, The Master and Margarita is undoubtedly the most cheering. It’s funny, it’s profound and it has to be read to be believed. In some ways, the book has an odd reputation. It is widely acknowledged as one of the greatest novels of the 20th century and as a masterpiece of magical realism, but it’s very common even for people who are very well read not to have heard of it, although among Russians you have only to mention a cat the size of a pig and apricot juice that makes you hiccup and everyone will know what you are talking about. Most of all, it is the book that saved me when I felt like I had wasted my life. It’s a novel that encourages you not to take yourself too seriously, no matter how bad things have got. The Master and Margarita is a reminder that, ultimately, everything is better if you can inject a note of silliness and of the absurd. Not only is this a possibility at any time; occasionally, it’s an absolute necessity: “You’ve got to laugh. Otherwise you’d cry.”

For those who already know and love The Master and Margarita, there is something of a cult-like “circle of trust” thing going on. I’ve formed friendships with people purely on the strength of the knowledge that they have read and enjoyed this novel. I have a friend who married her husband almost exclusively because he told her he had read it. I would normally say that it’s not a great idea to found a lifelong relationship on the basis of liking one particular book. But, in this case, it’s a very special book. So, if you are unmarried, and you love it and you meet someone else who loves it, you should definitely marry them. It’s the most entertaining and comforting novel. When I was feeling low about not being able to pretend to be Russian any more, I would read bits of it to cheer myself up and remind myself that, whatever the truth about where I come from, I had succeeded in understanding some important things about another culture. It is a book that takes your breath away and makes you laugh out loud, sometimes at its cleverness, sometimes because it’s just so funny and ridiculous. I might have kidded myself that you need to be a bit Russian to understand Tolstoy. But with Bulgakov, all you need to understand him is a sense of humor. His comedy is universal.


Written in the 1930s but not published until the 1960s, The Master and Margarita is the most breathtakingly original piece of work. Few books can match it for weirdness. The devil, Woland, comes to Moscow with a retinue of terrifying henchmen, including, of course, the giant talking cat (literally “the size of a pig”), a witch and a wall-eyed assassin with one yellow fang. They appear to be targeting Moscow’s literary elite. Woland meets Berlioz, influential magazine editor and chairman of the biggest Soviet writers’ club. (Berlioz has been drinking the hiccup-inducing apricot juice.) Berlioz believes Woland to be some kind of German professor. Woland predicts Berlioz’s death, which almost instantly comes to pass when the editor is decapitated in a freak accident involving a tram and a spillage of sunflower oil. All this happens within the first few pages.

A young poet, Ivan Bezdomny (his surname means “Homeless”), has witnessed this incident and heard Woland telling a bizarre story about Pontius Pilate. (This “Procurator of Judaea” narrative is interspersed between the “Moscow” chapters.) Bezdomny attempts to chase Woland and his gang but ends up in a lunatic asylum, ranting about an evil professor who is obsessed with Pontius Pilate. In the asylum, he meets the Master, a writer who has been locked away for writing a novel about Jesus Christ and, yes, Pontius Pilate. The story of the relationship between Christ and Pilate, witnessed by Woland and recounted by the Master, returns at intervals throughout the novel and, eventually, both stories tie in together. (Stick with me here. Honestly, it’s big fun.)

Meanwhile, outside the asylum, Woland has taken over Berlioz’s flat and is hosting magic shows for Moscow’s elite. He summons the Master’s mistress, Margarita, who has remained loyal to the writer and his work. At a midnight ball hosted by Satan, Woland offers Margarita the chance to become a witch with magical powers. This happens on Good Friday, the day Christ is crucified. (Seriously, all this makes perfect sense when you are reading the book. And it is not remotely confusing. I promise.) At the ball, there is a lot of naked dancing and cavorting (oh, suddenly you’re interested and want to read this book?) and then Margarita starts flying around naked, first across Moscow and then the USSR. Again, I repeat: this all makes sense within the context of the book.

“Literature can be a catalyst for change. But it can also be a safety valve for a release of tension and one that results in paralysis.”

Woland grants Margarita one wish. She chooses the most altruistic thing possible, liberating a woman she meets at the ball from eternal suffering. The devil decides not to count this wish and gives her another one. This time, Margarita chooses to free the Master. Woland is not happy about this and gets her and the Master to drink poisoned wine. They come together again in the afterlife, granted “peace” but not “light,” a limbo situation that has caused academics to wrap themselves up in knots for years. Why doesn’t Bulgakov absolve them? Why do both Jesus and the Devil seem to agree on their punishment? Bulgakov seems to suggest that you should always choose freedom—but expect it to come at a price.

One of the great strengths of The Master and Margarita is its lightness of tone. It’s full of cheap (but good) jokes at the expense of the literati, who get their comeuppance for rejecting the Master’s work. (This is a parallel of Bulgakov’s experience; he was held at arm’s length by the Soviet literary establishment and “allowed” to work only in the theatre, and even then with some difficulty). In dealing so frivolously and surreally with the nightmare society in which Woland wreaks havoc, Bulgakov’s satire becomes vicious without even needing to draw blood. His characters are in a sort of living hell, but they never quite lose sight of the fact that entertaining and amusing things are happening around them. However darkly comedic these things might sometimes be.

While The Master and Margarita is a hugely complex novel, with its quasi-religious themes and its biting critique of the Soviet system, above all it’s a big fat lesson in optimism through laughs. If you can’t see the funny side of your predicament, then what is the point of anything? Bulgakov loves to make fun of everyone and everything. “There’s only one way a man can walk round Moscow in his underwear—when he’s being escorted by the police on the way to a police station!” (This is when Ivan Bezdomny appears, half naked, at the writers’ restaurant to tell them a strange character has come to Moscow and murdered their colleague.) “I’d rather be a tram conductor and there’s no job worse than that.” (The giant cat talking rubbish at Satan’s ball.) “The only thing that can save a mortally wounded cat is a drink of paraffin.” (More cat gibberish.)

The final joke of the book is that maybe Satan is not the bad guy after all. While I was trying to recover my sense of humor about being Polish and Jewish instead of being Russian, this was all a great comfort. Life is, in Bulgakov’s eyes, a great cosmic joke. Of course, there’s a political message here, too. But Bulgakov delivers it with such gusto and playfulness that you never feel preached at. You have got to be a seriously good satirist in order to write a novel where the Devil is supposed to represent Stalin and/or Soviet power without making the reader feel you are bludgeoning them over the head with the idea. Bulgakov’s novel is tragic and poignant in many ways, but this feeling sneaks up on you only afterwards. Most of all, Bulgakov is about conjuring up a feeling of fun. Perhaps because of this he’s the cleverest and most subversive of all the writers who were working at this time. It’s almost impossible to believe that he and Pasternak were contemporaries, so different are their novels in style and tone. (Pasternak was born in 1890, Bulgakov in 1891.) The Master and Margarita and Doctor Zhivago feel as if they were written in two different centuries.

Unlike Pasternak, though, Bulgakov never experienced any reaction to his novel during his lifetime, as it wasn’t published until after he had died. One of the things that makes The Master and Margarita so compelling is the circumstances in which it was written. Bulgakov wrote it perhaps not only “for the drawer” (i.e. not to be published within his lifetime) but never to be read by anyone at all. He was writing it at a time of Black Marias (the KGB’s fleet of cars), knocks on the door and disappearances in the middle of the night. Ordinary life had been turned on its head for most Muscovites, and yet they had to find a way to keep on living and pretending that things were normal. Bulgakov draws on this and creates a twilight world where nothing is as it seems and the fantastical, paranormal and downright evil are treated as everyday occurrences.

It’s hard to imagine how Bulgakov would have survived if the novel had been released. Bulgakov must have known this when he was writing it. And he also must have known that it could never be published—which means that he did not hold back and wrote exactly what he wanted, without fear of retribution. (Although there was always the fear that the novel would be discovered. Just to write it would have been a crime, let alone to attempt to have it published.) This doesn’t mean that he in any way lived a carefree life. He worried about being attacked by the authorities. He worried about being prevented from doing any work that would earn him money. He worried about being unable to finish this novel. And he worried incessantly—and justifiably—about his health.

During his lifetime Bulgakov was known for his dystopian stories “The Fatal Eggs” (1924) and “The Heart of a Dog” (1925) and his play The Days of the Turbins (1926), about the civil war. Despite his early success, from his late twenties onwards, Bulgakov seemed to live with an awareness that he was probably going to be cut down in mid-life. He wrote a note to himself on the manuscript of The Master and Margarita: “Finish it before you die.” J.A.E. Curtis’s compelling biography Manuscripts Don’t Burn: Mikhail Bulgakov, A Life in Letters and Diaries, gives a near-cinematic insight into the traumatic double life Bulgakov was leading as he wrote the novel in secrecy. I love this book with the same intensity that I love The Master and Margarita. Curtis’s quotes from the letters and the diaries bring Bulgakov to life and are packed full of black comedy and everyday detail, from Bulgakov begging his brother not to send coffee and socks from Paris because “the duty has gone up considerably” to his wife’s diary entry from New Year’s Day 1937 which tells of Bulgakov’s joy at smashing cups with 1936 written on them.

As well as being terrified that he would never finish The Master and Margarita, Bulgakov was becoming increasingly ill. In 1934, he wrote to a friend that he had been suffering from insomnia, weakness and “finally, which was the filthiest thing I have ever experienced in my life, a fear of solitude, or to be more precise, a fear of being left on my own. It’s so repellent that I would prefer to have a leg cut off.” He was often in physical pain with a kidney disease but was just as tortured psychologically. There was the continual business of seeming to be offered the chance to travel abroad, only for it to be withdrawn. Of course, the authorities had no interest in letting him go, in case he never came back. (Because it would make them look bad if talented writers didn’t want to live in the USSR. And because it was much more fun to keep them in their own country, attempt to get them to write things praising Soviet power and torture them, in most cases literally.)

It is extraordinary that Bulgakov managed to write a novel that is so full of humor and wit and lightness of tone when he was living through this period. He grew accustomed to being in a world where sometimes the phone would ring, he would pick it up and on the other end of the line an anonymous official would say something like: “Go to the Foreign Section of the Executive Committee and fill in a form for yourself and your wife.” He would do this and grow cautiously hopeful. And then, instead of an international passport, he would receive a slip of paper that read: “M.A. Bulgakov is refused permission.” In all the years that Bulgakov continued, secretly, to write The Master and Margarita—as well as making a living (of sorts) as a playwright—what is ultimately surprising is that he did not go completely insane from all the cat-and-mouse games that Stalin and his acolytes played with him. Stalin took a personal interest in him, in the same way he did with Akhmatova. There’s some suggestion that his relationship with Stalin prevented Bulgakov’s arrest and execution. But it also prevented him from being able to work on anything publicly he wanted to work on.

How galling, too, to have no recognition in your own lifetime for your greatest work. When the book did come out in 1966-7, its significance was immense, perhaps greater than that of any other book published in the 20th century. As the novelist Viktor Pelevin once said, it's almost impossible to explain to anyone who has not lived through Soviet life exactly what this novel meant to people. "The Master and Margarita didn't even bother to be anti-Soviet, yet reading this book would make you free instantly. It didn't liberate you from some particular old ideas, but rather from the hypnotism of the entire order of things."

The Master and Margarita symbolizes dissidence; it’s a wry acknowledgement that bad things happened that can never, ever be forgiven. But it is also representative of an interesting kind of passivity or non-aggression. It is not a novel that encourages revolution. It is a novel that throws its hands up in horror but does not necessarily know what to do next. Literature can be a catalyst for change. But it can also be a safety valve for a release of tension and one that results in paralysis. I sometimes wonder if The Master and Margarita—the novel I have heard Russians speak the most passionately about—explains many Russians’ indifference to politics and current affairs. They are deeply cynical, for reasons explored fully in this novel. Bulgakov describes a society where nothing is as it seems. People lie routinely. People who do not deserve them receive rewards. You can be declared insane simply for wanting to write fiction. The Master and Margarita is, ultimately, a huge study in cognitive dissonance. It’s about a state of mind where nothing adds up and yet you must act as if it does. Often, the only way to survive in that state is to tune out. And, ideally, make a lot of jokes about how terrible everything is.

Overtly, Bulgakov also wants us to think about good and evil, light and darkness. So as not to be preachy about things, he does this by mixing in absurd humor. Do you choose to be the sort of person who joins Woland’s retinue of weirdos? (Wall-eyed goons, step forward!) Or do you choose to be the sort of person who is prepared to go to an insane asylum for writing poetry? (I didn’t say these were straightforward choices.) On a deeper level, he is asking whether we are okay with standing up for what we believe in, even if the consequences are terrifying. And he is challenging us to live a life where we can look ourselves in the eye and be happy with who we are. There is always a light in the dark. But first, you have to be the right kind of person to be able to see it.

From The Anna Karenina Fix, by Viv Groskop, courtesy Abrams. Copyright 2018, Viv Groskop.

A new python HTTP service framework from the maker of requests and pipenv

The Python world certainly doesn’t need more web frameworks. But, it does need more creativity, so I thought I’d spread some Hacktoberfest spirit around, bring some of my ideas to the table, and see what I could come up with.

import responder

api = responder.API()

@api.route("/{greeting}")
async def greet_world(req, resp, *, greeting):
    resp.text = f"{greeting}, world!"

if __name__ == '__main__':
    api.run()

That async declaration is optional.

This gets you an ASGI app, with a production static files server pre-installed, jinja2 templating (without additional imports), and a production webserver based on uvloop, serving up requests with gzip compression automatically.

Features

  • A pleasant API, with a single import statement.
  • Class-based views without inheritance.
  • ASGI framework, the future of Python web services.
  • The ability to mount any ASGI / WSGI app at a subroute.
  • f-string syntax route declaration.
  • Mutable response object, passed into each view. No need to return anything.
  • Background tasks, spawned off in a ThreadPoolExecutor.
  • GraphQL support!

Testimonials

“Pleasantly very taken with python-responder. @kennethreitz at his absolute best.”

—Rudraksh M.K.

“ASGI is going to enable all sorts of new high-performance web services. It’s awesome to see Responder starting to take advantage of that.”

“I love that you are exploring new patterns. Go go go!”

“The most ambitious crossover event in history.”

Installing Responder

$ pipenv install responder
✨🍰✨

Only Python 3.6+ is supported.

The Basic Idea

The primary concept here is to take the niceties found in both Flask and Falcon and unify them into a single framework, along with some new ideas I have. I also wanted to take some of the API primitives that are instilled in the Requests library and put them into a web framework. So, you’ll find a lot of parallels here with Requests; a short sketch follows the list below.

  • Setting resp.text sends back unicode, while setting resp.content sends back bytes.
  • Setting resp.media sends back JSON/YAML (.text/.content override this).
  • Case-insensitive req.headers dict (from Requests directly).
  • resp.status_code, req.method, req.url, and other familiar friends.
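As a quick illustration of those primitives, here's a minimal sketch (the /hello route, header, and payload are invented for the example; the rest follows the hello-world above):

import responder

api = responder.API()

@api.route("/hello")
async def hello(req, resp):
    resp.headers["X-Served-By"] = "responder"  # set a response header
    resp.media = {"message": "hello"}          # serialized as JSON by default

if __name__ == '__main__':
    api.run()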

Ideas

  • Flask-style route expression, with new capabilities – all while using Python 3.6+’s new f-string syntax.
  • I love Falcon’s “every request and response is passed into each view and mutated” methodology, especially response.media, and have used it here. In addition to supporting JSON, I have decided to support YAML as well, as Kubernetes is slowly taking over the world, and it uses YAML for all the things. Content-negotiation and all that.
  • A built in testing client that uses the actual Requests you know and love.
  • The ability to mount other WSGI apps easily.
  • Automatic gzipped-responses.
  • In addition to Falcon’s on_get, on_post, etc. methods, Responder features an on_request method, which gets called on every type of request, much like Requests. (See the class-based view sketch after this list.)
  • A production static files server is built-in.
  • Uvicorn built-in as a production web server. I would have chosen Gunicorn, but it doesn’t run on Windows. Plus, Uvicorn serves well to protect against slowloris attacks, making nginx unnecessary in production.
  • GraphQL support, via Graphene. The goal here is to have any GraphQL query exposable at any route, magically.
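As mentioned in the list above, a class-based view looks roughly like this (a sketch assuming the decorator API from the hello-world example; the route, header, and payload are made up):

import responder

api = responder.API()

@api.route("/books")
class BooksResource:
    # No base class to inherit from. on_request runs for every HTTP
    # method, in addition to the verb-specific handler (on_get, ...).
    def on_request(self, req, resp):
        resp.headers["Cache-Control"] = "no-store"

    def on_get(self, req, resp):
        resp.media = {"books": []}

if __name__ == '__main__':
    api.run()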

Future Ideas

  • Cookie-based sessions are currently an afterthought, as this is an API framework, but websites are APIs too.
  • If frontend websites are supported, provide an official way to run webpack.

Facebook lured advertisers by inflating video ad-watch times: lawsuit

Not only did Facebook inflate ad-watching metrics by up to 900 percent, it knew for more than a year that its average-viewership estimates were wrong and kept quiet about it, a new legal filing claims.

A group of small advertisers suing the Menlo Park social media titan alleged in the filing that Facebook “induced” advertisers to buy video ads on its platform because advertisers believed Facebook users were watching video ads for longer than they actually were.

That “unethical, unscrupulous” behavior by Facebook constituted fraud because it was “likely to deceive” advertisers, the filing alleged.

The latest allegations arose out of a lawsuit that the advertisers filed against Mark Zuckerberg-led Facebook in federal court in 2016 over alleged inflation of ad-watching metrics.

Facebook knew by January 2015 that its video-ad metrics had problems, and understood the nature of the issue within a few months, but sat on that information for more than a year, the plaintiffs claimed in an amended complaint Tuesday in U.S. District Court in Oakland.

Facebook disputed that allegation. “Suggestions that we in any way tried to hide this issue from our partners are false,” the company told the Wall Street Journal, which first reported the latest allegations. “We told our customers about the error when we discovered it — and updated our help center to explain the issue.”

Facebook in 2016 revealed the metrics problem, saying it had “recently discovered” it. The firm told some advertisers that it had probably overestimated the average time spent watching video ads by 60 percent to 80 percent. Tuesday’s filing alleged that Facebook had instead inflated average ad-watching time by 150 percent to 900 percent.

Facebook did not immediately respond to a request for comment on the alleged inflation of ad-watching metrics by up to 900 percent.

The plaintiffs are seeking class-action status to bring other advertisers into the legal action, plus unspecified damages. They also want the court to order a third-party audit of Facebook’s video-ad metrics.

Pijul: A Rust based distributed version control system

Saturday, April 21, 2018

I’m pleased to announce the new release of Pijul, version 0.10. This release has been a long time coming, but brings a significant number of new features and stability enhancements.

Stabilisation of the new patch format

Pijul 0.9 introduced a breaking change in the patch format. Even though the representation of data on disk did not change much, the representation of certain kinds of conflict resolutions changed, in a way that wasn’t easy to convert automatically (we have a converter though, which will be released soon).

We put a big effort into stabilising this format for Pijul 0.10, by extending our test suite with the help of kcov (we’re at over 90% test coverage at the time of writing this post).

Implementation of the entire theory: welcome rollback

By simplifying the algorithm and the representation of patches, that change in patch format allowed us to go ahead and get a clearer understanding of what it would take to finish implementing the full set of operations we wanted. The missing bit was rollback, which allows one to create a patch that “does” the opposite of another patch, as far as the working copy is concerned.

The convoluted formulation of the previous sentence is due to the fact that the main data structure in Pijul is append-only (the implementation is append-only too, even though it mutates stuff for performance reasons), so nothing is really reversible.

However, in all cases, a repository containing a patch p and another patch q created with pijul rollback p will have the exact same working copy (including conflicts) as a repository containing neither p nor q. (This is modulo possible remaining bugs.)
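As a toy illustration of that invariant (emphatically not Pijul's real patch representation, which also tracks ordering and conflicts), think of a patch as a pair of line-sets, with rollback simply swapping the two:

# A patch is modelled as (lines_added, lines_removed).
def apply_patch(working_copy, patch):
    added, removed = patch
    return (working_copy - removed) | added

def rollback(patch):
    added, removed = patch
    return (removed, added)  # the inverse patch

repo = {"line a", "line b"}
p = ({"line c"}, {"line a"})                     # add "line c", drop "line a"
with_p = apply_patch(repo, p)
with_p_and_q = apply_patch(with_p, rollback(p))  # q = "pijul rollback p"
assert with_p_and_q == repo                      # same as with neither patch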

Better SSH support

Pijul 0.10 improves the support of different SSH key formats, and support for agents. I’d like to thank all the Pijul enthusiasts who have generated and sent SSH keys using many different kinds of obsolete crypto algorithms and encodings.

One remaining challenge is SSH agents on Windows. Thrussh-keys implements an agent (client and server), but it currently only works on Unix, due to the use of Unix sockets.

VT100 line editing

The previous versions of Pijul used different line editing crates, which were more or less abandoned by their owners in the last six months. These crates caused the input line to blink in the terminal anyway, for some reason.

Just before releasing Pijul 0.10 this week, I read some VT100 documentation and decided to write my own library to solve this. It is certainly not the most complete thing, but it compiles and works on Windows, Linux and FreeBSD, and handles arrow keys, deletes and backspaces without blinking.

A VT100 Terminal. Image of a [VT100 Terminal](https://en.wikipedia.org/wiki/VT100) by [Jason Scott](https://www.flickr.com/people/54568729@N00)

Also, I’ve started to know and love the VT100, a very cool machine released in 1978, on which most terminal emulators we have today are based.

The road ahead

Aside from more testing, there are still a few things we’d like to have before announcing our first 1.0 version.

  • More robust command-line IO; not all things work correctly with Unix pipes, for instance. My line crate is probably a bit young, and needs more polishing and more features. Contributions welcome!

  • Partial repository checkouts. This is one of the coolest features of Pijul, which will hopefully allow it to scale to much bigger repositories than other systems.

  • Faster and more general diff. The diff currently implemented is a naive one, in O(n^2), and this is slow. We’d love to implement Myers Diff. Since the things added to a Pijul patch for each line addition or deletion are not obvious, we’d like a more generic interface to this, so that diff algorithms could be written in a more standard way.

Get involved!

There are many good first tasks to get acquainted with the Pijul project. Some, mentioned in this post, are almost independent from the main project:

  • Implement SSH agents on Windows.
  • Improve line, maybe using the new cool VT100 feature of Windows 10.
  • Review Thrussh, and help us get it to the highest standards of security, speed and ease of use. This is currently blocked on us converting the repository on the Nest to the last patch format, which will be done very soon.
  • Add Serde support to Sanakirja.

Others are related to the command-line, and don’t touch the theory very much:

  • Help us implement a more consistent CLI, see the ongoing discussion about this.
  • Once we have a trait for diffs, implement different diff algorithms. They’re among the most fun things to write.

Evergreen: a React UI Framework built by Segment

Works out of the box

Evergreen contains a set of polished React components that work out of the box.

Flexible & composable

Evergreen components are built on top of a React UI Primitive for endless composability.

Enterprise-grade

Evergreen features a UI design language for enterprise-grade web applications.

WEF: The Global Competitiveness Report 2018

In the midst of rapid technological change, political polarization and a fragile economic recovery, it is critical that we define, assess and implement new pathways to growth and prosperity.

The 2018 edition of the Global Competitiveness Report represents a milestone in the four-decade history of the series, with the introduction of the new Global Competitiveness Index 4.0. The new index sheds light on an emerging set of drivers of productivity and long-term growth in the era of the Fourth Industrial Revolution. It provides a much-needed compass for policy-makers and other stakeholders to help shape economic strategies and monitor progress.

Show HN: All Clear Weather – creating new live weather datasets with ML

A new weather app that emphasizes clear communication of weather data, up-to-date current conditions, and experiments with phone sensor data.

A modern weather app

Simple text and useful animations

Current conditions and today's forecast are shown when you open the app. Hourly and daily forecasts are one tap away. Widgets and notifications are available too!

We animate the weather conditions to help you know right away what to expect and to more accurately display "percent chance" cases.

Up-to-date current conditions

Sometimes the current conditions shown in weather apps are old or wrong. All Clear lets you send in your weather, so that you can fix the display to be correct for yourself - and for others nearby!

This data could also help in future weather forecast models.

Sensor data for weather experiments

All Clear can use the environmental sensors in phones, such as barometers and thermometers, to collect live, useful atmosphere data. We are conducting research and plan to eventually do data assimilation into weather models, hopefully demonstrating an ability to increase forecast accuracy.

Sky pictures for machine learning

This is a new experiment! You can send in photos of the sky with labels, and we'll use the data to train a machine learning classifier.

The goal is to automatically label weather features in outdoor photos: this could be useful for analyzing both historical and current weather.

We aim to make all outdoor photos into usable weather data. Download All Clear and check out this experiment! We'll keep you up-to-date with our progress in the Pictures section of the app.

Ask HN: What's your favorite elegant/beautiful algorithm?

Myers Diff. It hits the trifecta:

1. An irrefutable improvement over the state of the art.

2. A short paper which can be understood after only a dozen readings or so. I mean really understood, with visualizations and everything.

3. A practical algorithm which can be implemented by nearly anyone (even me).
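For anyone tempted by point 3, the greedy forward pass at the heart of the paper really is short. Here's a sketch in Python that computes just the edit distance (insertions plus deletions), not the full edit script:

def myers_distance(a, b):
    """Greedy forward pass of Myers' O(ND) diff algorithm."""
    n, m = len(a), len(b)
    max_d = n + m
    offset = max_d + 1
    # v[offset + k] = furthest x reached so far on diagonal k = x - y
    v = [0] * (2 * max_d + 3)
    for d in range(max_d + 1):
        for k in range(-d, d + 1, 2):
            # extend from whichever neighbouring diagonal reaches further
            if k == -d or (k != d and v[offset + k - 1] < v[offset + k + 1]):
                x = v[offset + k + 1]      # step down: an insertion
            else:
                x = v[offset + k - 1] + 1  # step right: a deletion
            y = x - k
            # slide along the free diagonal while elements match
            while x < n and y < m and a[x] == b[y]:
                x, y = x + 1, y + 1
            v[offset + k] = x
            if x >= n and y >= m:
                return d
    return max_d

print(myers_distance("ABCABBA", "CBABAC"))  # 5, the example from the paper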

How to Host or Attend a Tiny Conference

Every year since 2012, I’ve co-hosted an annual “tiny” conference that we call “Big Snow Tiny Conf”.  Eleven other business owners and I travel to a beautiful ski/snowboarding resort and stay together in a house.  In between ski/snowboard runs, we talk business strategy, giving and getting advice.

It’s an opportunity for you and your business to go on vacation, get relaxed, get inspired, and get energized, all in one week!

I love the “Tiny Conference” so much that I’m attending 4 of them this year, ranging from beach resorts, to island getaways, to snowboarding retreats. I find them so incredibly valuable, both professionally and personally.

Hence this page!  I want more people to host or attend more “Tiny Confs”.  If you’ve been looking to attend one, you’ll find a list of “Tiny Confs” that I know of below.  Better yet, if you want to create one, you’ll find my guide to doing so!

Why you (yes you!) should start a Tiny Conf

Most of us work on the web.  Many of us have friends and colleagues from all over the world (remote FTW!).

But there’s a certain magic that happens when you meet up in real life, in a real place, and enjoy real experiences together (experiences that happen outside of Slack chats and Zoom calls). More on what this magic is all about in a second.

First, I want to sell you on the idea of you starting your own Tiny Conf. Yes, you! Here’s what’s in it for you:

  • You can invite whoever you want.  Pick your favorite people or invite those you want to get to know better.
  • You can host it wherever you want.  Pick your favorite destination or do it in your home town.
  • You can structure it however you want.  I’ll give you my experiences and preferences below, but it’s your conf! Your rules!
  • By their nature, Tiny Confs are, well, tiny, which means spots are limited.  If you can’t attend an existing one, that’s more of a reason to start your own!
  • Starting and running a (not-tiny) “Conference” is big, scary, and requires a ton of time, money, and logistics.  Running a Tiny Conf is none of those things.  It’s easy!

Seriously.  It’s not difficult to do.  It’s super valuable, both for the host and the attendees.

Bottom line:  I want there to be more Tiny Confs, in more places, for more people to connect.  So please, start one.

And once you do, put up a landing page for it, then send me the link and I’ll include it in the list of Tiny Confs at the bottom of this page 🙂

Tiny Conferences vs. Big Conferences

Having attended over 10 Tiny Confs as of this writing, I have decided to “double down” on them as my preferred conference type to invest my time, money, and travel points on (points could mean credit card points or good will points from your spouse 😉 )

I find that I get so much more value and enjoyment from conferences with fewer than 30 people than I do from most of the 200+ attendee conferences I’ve been to.  Don’t get me wrong, there are some excellent, well-run, “real” business conferences with plenty of value.

But if I compare and evaluate them based on these criteria: “Did I get what I wanted out of this trip?” … “Will my business benefit because I went?” … “Did I have fun and enjoy my time there?” … “Would I go again?”, then I choose Tiny Confs every time.

Here’s why:

  • More fun, relaxation, and enjoyment
    Tiny Confs are not all fun and games, all the time.  More on this in a moment.  But, fun/activity time, down-time, and casual “hang out” time is a key ingredient of Tiny Confs.  It’s a chance to get away from the day-to-day hustle, and yes, also a break from the routines of home life.  There’s real value in that, especially for us entrepreneurs, who have a tendency to overwork ourselves.
  • Deep connections with awesome people
    I often come away from large conferences feeling like I didn’t get a chance to talk to all of the people I wanted to talk to, or for long enough.  At Tiny Confs, we have plenty of time for deep dive conversations, follow-up conversations, groups and 1-on-1s—all with an awesome group of fellow founders I respect.  These talks and connections simply don’t happen at larger conferences, and they don’t happen in Slack or on Zoom for that matter.
  • Space to re-think everything
    Personally, I use these trips as a chance to step back and re-assess everything I’m working on in my business.  Even if my plans were set, it’s a chance to ask “what if I did this differently?” and game those out with help from trusted advisors who’ve gone through or are going through similar paths.  The time away, mixed with fun activities to fill the day, really gives me the space for my mind to absorb a new way of looking at my business.  I always come home with a significant change of plans and clarity for my business (for the better).
  • A year-to-year mastermind group
    One of my favorite developments has been that my Tiny Conf trips have a bunch of returning attendees. They’ve become sort of an annual 3-day mastermind session, where we can get an update on how things have progressed since our last trip, and plans for the upcoming year.  We stay in touch in Slack for mid-year updates and questions too.  Tiny Confs have basically become my primary mastermind groups these days.
  • It’s a great deal!
    This is no joke.  When you’re in a group of 8+ people, there’s a ton of value to be had.  You can book the nicest AirBnB in the area and the split cost is very manageable.  Ski resorts and beach resorts tend to offer bulk pricing discounts and early-bird booking discounts too.  I end up spending a lot less on Tiny Confs than I do on traveling to larger conferences or family vacations (don’t worry honey, we’re not ditching the family vacations!).

8 Keys to Running a Great Tiny Conf

Here are some tips based on my experiences attending and hosting Tiny Confs through the years.  Keep in mind, this is what worked for us.  If you’re running your own, make it how you want!

1. Keep it “Tiny”

My favorite trips have had around 10-12 attendees. I’ve found this small enough to spend plenty of time deep-diving with everyone on the trip, yet large enough to enjoy full-group sessions while still breaking into several 2-4 person conversations at times.

Also, if you plan to do attendee talks, it’s difficult to fit everyone in when there are more than 12 talks to get through.

If you plan on housing everyone in a single AirBnB, ~12 will probably be your limit on number of beds (more on sleeping arrangements below).

I have attended ~30-person conferences, which I’d still consider to be in the “Tiny” category, and these can work well too.  But you’ll need to make different sleeping arrangements and probably spend more time planning and coordinating a group that large.

2. Make it application & invite-only

For the first few years of running our Tiny Conf, we made registrations open to anyone.  For the most part, this worked out well, especially in the early days when we simply didn’t know enough people to fill all the spots.

But as the years went on and our networks grew, we found that it adds more value for everyone when we hand-curate the attendee list.

I suggest you create an application form with a few questions about their business type, size, website, years in business, and anything else you think is relevant.  Then take some time to check each person out before inviting them.

While it’s great to have returning attendees, it’s also good to reserve some spots for new faces as well.  That means you’ll probably need to not-invite-back some people, which can be awkward, but I think necessary.  My hope is that this page can serve as a go-to referral place to send people if they can’t attend your conference.

3. Pick a fun location with an activity

As the organizer, you should choose a destination that you know and love, along with an activity you enjoy.

I’ve been co-organizing Snowboarding trips to some of my favorite Vermont destinations.  A friend started a Tiny Conf on Martha’s Vineyard since he’s been going there all his life and knows all the cool spots.  Another friend runs one at his favorite all-inclusive resort in Cabo San Lucas, Mexico.  Pick your favorite stomping grounds and invite your favorite people!

Having a main activity as the “focal point” of the Tiny Conf helps.  We spend two half-days skiing and snowboarding.  In Mexico, it was swimming, eating and drinking.  At Martha’s Vineyard it was biking and exploring the island.  I tend to prefer some physical activity to balance out sitting around chatting business all day.  But the chats don’t need to stop…  Some of my best mastermind sessions happened on the chair-lifts on the mountain!

4.  “Sessions”, not “Talks”

Traditional conferences have speakers who give talks from up on stage.  At Tiny Confs, I think of them more like “Sessions”, with one person leading an open discussion for everyone to participate in.

With a group of 12 or less, all attendees should be encouraged to lead sessions.  We tell people they can use their ~30-minute session to talk about anything they want.  Some people share an interesting tactic they’ve had success with.  Some ask for feedback on their new product idea.  Some ask for advice on a key strategic decision they’re wrestling with.  Some people prepare slides, some just have notes, some have nothing and just talk.  All of these can work!

Note:  Your session will definitely not be the only time you get to talk about “your stuff”.  You’ll quickly find that when you get a group of entrepreneurs in a house together, we dive right in with the shop talk and it doesn’t stop until we leave.

One of the coolest things is that there’s plenty of time for follow-up conversation after the main session.  For example, someone might talk about something on night 1, then dig in more on that while on the chairlift or in the pool the next day.  There’s time for both large-group discussion and small-group or 1-on-1 feedback.  You won’t get these opportunities at larger conferences or workshops.

5. Plan everything in advance

Since a tiny conf can feel pretty small and laid back (and it is), it can be tempting to keep things loose and play things by ear.

Don’t do that.

Better to have a ready-made plan in place before the trip starts than to have to figure things out on the fly. As the organizer, this is your responsibility.

The reason this is important is that it can be super distracting to have to pause high-quality conversations and distract everyone with “where are we going to eat tonight?”, etc.

Pre-plan how you’ll do each meal:

  • Cooking in?  Which groceries will you need to buy?
  • Going out?  Which restaurant?  Are they open?  Should you book a reservation?
  • Ordering in?  From where?  Do they deliver to where you’re staying?
  • Hire a chef?  How much would that cost?

Plan the schedule.  Here’s my suggested (rough) itinerary

  • Arrival day:  Easy dinner.  Take-in pizza or similar.  Round of introductions.  Casual chats and hangouts.
  • Full conference days:  Sessions in the morning or evening; leave a big space for open hangouts or doing the activity (snowboarding, biking, beach, etc.)
  • Final evening:  Group dinner out at a nice restaurant
  • Fill in the gaps as needed…

Plan the list of sessions and when each person will go.  This gives folks a heads up on when they should be ready and when they can expect to have their session.  Also, as the organizer, try and moderate the time.  It’s very easy to drift into lengthy group discussions, which eat into others’ session time.  Remember, there will be plenty of time to follow-up later, during casual hangout time.

6.  Manage the money

This probably sounds obvious, but you need to have a plan to manage the cashflow for a Tiny Conf.

During the planning stage, try and put together a ballpark budget for things like the AirBnB cost, groceries, activity costs (ski lift tickets, beach passes, etc.), and anything else that will be “included” in the price of each person’s admission.

Don’t feel like you have to take a big risk and potentially lose money as the organizer.  I’ve seen two strategies work equally well:

Strategy 1:  Set price, budget accordingly

Based on your budgeting, come up with a flat fee to charge people up-front.  Then it’s up to you as the organizer(s) to pay for all the things.  Ideally, you’ll pad the pricing a bit so that you’ll have extra in case someone cancels and needs a refund, or other unexpected things come up.  After several years of running ours, we’ve had surplus cash, which we’ve been able to use to keep prices down, book nicer houses, and book in advance.

Strategy 2:  Split the cost as you go

If this is your first time organizing a trip, it can be difficult to predict your costs.  Do what my friend Ben did at the Tiny Conf he organized at Martha’s Vineyard:  Start by gauging interest with a group of friends.  Then, when you have enough “Yes, I’ll go” responses, book the AirBnB and ask everyone to pay their share of that cost.  During the trip, Ben paid for all meals on his card; after the trip, he added up those food costs and asked everyone to reimburse him for their share.

Either way, there shouldn’t be a scenario where the organizer is losing their shirt on the trip.  Nor should any attendee’s costs skyrocket unexpectedly.

7.  Keep in touch before, during and after the trip

I highly recommend starting a Slack group for your Tiny Conf.

Before the trip, it’s always nice to chat and get to know people a bit before you meet in person.  Did I mention entrepreneurs are a chatty bunch?

During the trip, the Slack has proven essential for us.  We’ve used it to communicate travel arrangements, flight delays, and ride shares.  We’ve used it to communicate while we’re spread throughout a ski resort.  And we’ve used it to share pics with everyone.

After the trip, we keep the Slack open and active throughout the year.  This is where the ongoing mastermind aspect comes into play.  We post updates, questions, and continue the conversations all year long until we meet again at another Tiny Conf.

8.  You do you!

I can’t stress this enough.  I’ve been to several Tiny Confs organized by several different people and all had their own ways of doing things.  And all were great!

Take it upon yourself to design your Tiny Conf just like you’d design a product to scratch your own itch.  Make it great for you and great for your type of people.


Frequently Asked Questions about Tiny Confs

In case you’re wondering…

When is the best time to schedule a Tiny Conf?

This really depends on the destination.  The main rules of thumb would be:  Try and avoid holidays (most people won’t be able to make it).  And be aware of smaller holidays, which tend to have higher rates at resorts and hotels.  Avoid those too.

It helps to go on weekdays rather than weekends.  First, rates tend to be lower.  But also, you’ll attract high-quality business owners with flexible schedules, rather than only those tied to full-time jobs.

How do I attract people to my Tiny Conf?

If you don’t have an audience or network, I’d suggest starting in two places:

Attend bigger conferences where your type of people attend.  Meet folks there and float the idea of your Tiny Conf by them, then follow up with personal invites after you’ve met.

Be active in online communities, forums, Slacks, etc. with likeminded business owners.  There are plenty of those to get involved in.  Float the idea around those places to build an interest list.

Lastly, if you have a landing page, Tweet it to me and I’ll post it at the bottom of this page!

What are the sleeping arrangements?

First, you’ll need to confirm that the house rental or hotel that you’re booking has enough separate beds for the number of attendees you have.  Many AirBnBs advertise themselves as “sleeps 12” when in fact that means 6 beds that can each sleep a couple.  You’ll probably need to double-confirm with the AirBnB host about the number of separate beds.

In most cases, some or all people will share a room with someone else.  This has worked fine in my experience.  Just make sure that you’re up-front about this so that people don’t have the wrong expectation.

I think it’s a good idea to pre-assign beds to all attendees.  That way you’re not dealing with a “race to the best bed!” situation.  That’s just awkward and unfair to people who have late flights in.

Should business partners attend together?

I’ve been to some where business partners attend together, and this can work fine, if there is a large enough group.  However, if the total group is very small, then one “company” can dominate the conversation and you won’t get a variety of perspectives in the mix.

Also, if people are opening up and asking for advice, oftentimes those questions involve partnership issues, which they might avoid raising if all partners are in the room.  Just something to consider.

What should be my first step to starting a Tiny Conf?

I’d start by talking to a few people and sharing your idea.  If you think it has legs, buy a domain and put up a single-page site for it!  Slap an email form on there to collect addresses of people who are interested in hearing more about it.  Then email those people and start throwing out dates and ideas.

I won’t get into how to build a website, how to take payments, how to set up a form, etc.  You all probably know how to do all that, or can find resources elsewhere.

Where can I find Tiny Confs to attend?

I’m keeping a list of some that I know of below!

Other questions?

Tweet at me and I’ll update this FAQ as needed.


Tiny Conferences List!

All of those listed here meet the following criteria:  50 attendees or fewer (I tend to prefer those under 20), business focused, and with a website.

Note:  I’m not aware of the status or all of the details about these.  You’ll need to click through and contact the organizers with any questions.

Know of a Tiny Conf that should be listed here?  Tweet me the link and I’ll take a look.

This list should be longer!  Start up a Tiny Conf, then send me the link!

Show HN: Mole – an open source tool to easily create ssh tunnels

Mole is a cli application to create ssh tunnels, forwarding a local port to a remote address through a ssh server.

Highlighted Features

  • Auto local address selection: find an available port and start listening on it, so the -local flag doesn’t need to be given every time you run the app.
  • Aliases: save your tunnel settings under an alias, so it can be reused later.
  • Leverage the SSH Config File: use some options (e.g. user name, identity key and port), specified in $HOME/.ssh/config whenever possible, so there is no need to have the same SSH server configuration in multiple places.

…or why on Earth would I need something like this?

Access a computer or service behind a firewall

Mole can help you to access computers and services outside the perimeter network that are blocked by a firewall, as long as the user has ssh access to a computer with access to the target computer or service.

+----------+          +----------+          +----------+
|          |          |          |          |          |
|          |          | Firewall |          |          |
|          |          |          |          |          |
|  Local   |  tunnel  +----------+  tunnel  |          |
| Computer |--------------------------------|  Server  |
|          |          +----------+          |          |
|          |          |          |          |          |
|          |          | Firewall |          |          |
|          |          |          |          |          |
+----------+          +----------+          +----------+
                                                 |
                                                 |
                                                 | tunnel
                                                 |
                                                 |
                                            +----------+
                                            |          |
                                            |          |
                                            |          |
                                            |          |
                                            |  Remote  |
                                            | Computer |
                                            |          |
                                            |          |
                                            |          |
                                            +----------+

NOTE: Server and Remote Computer could potentially be the same machine.

Access a service that is listening only on a local address

$ mole -local 127.0.0.1:3306 -remote 127.0.0.1:3306 -server example@172.17.0.100
+-------------------+             +--------------------+
| Local Computer    |             | Remote / Server    |
|                   |             |                    |
|                   |             |                    |
| (172.17.0.10:     |    tunnel   |                    |
|        50001)     |-------------| (172.17.0.100:22)  |
|  tunnel client    |             |  tunnel server     |
|       |           |             |         |          |
|       | port      |             |         | port     |
|       | forward   |             |         | forward  |
|       |           |             |         |          |
| (127.0.0.1:3306)  |             | (127.0.0.1:50000)  |
|  local address    |             |         |          |
|                   |             |         | local    |
|                   |             |         | conn.    |
|                   |             |         |          |
|                   |             | (127.0.0.1:3306)   |
|                   |             |  remote address    |
|                   |             |      +----+        |
|                   |             |      | DB |        |
|                   |             |      +----+        |
+-------------------+             +--------------------+

NOTE: Server and Remote Computer could potentially be the same machine.
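Conceptually, what mole sets up here is classic SSH local port forwarding. For comparison only (this is not how mole is implemented; mole is written in Go), the tunnel from the diagram above can be sketched in Python with the third-party sshtunnel library:

# pip install sshtunnel
from sshtunnel import SSHTunnelForwarder

with SSHTunnelForwarder(
    ('172.17.0.100', 22),                     # the ssh (tunnel) server
    ssh_username='example',
    remote_bind_address=('127.0.0.1', 3306),  # where the DB listens remotely
    local_bind_address=('127.0.0.1', 3306),   # where we listen locally
) as tunnel:
    # while this block runs, connections to 127.0.0.1:3306 are
    # forwarded through the ssh server to the remote address
    pass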

macOS

brew tap davrodpin/homebrew-mole && brew install mole

Linux

curl -L https://github.com/davrodpin/mole/releases/download/v0.2.0/mole0.2.0.linux-amd64.tar.gz | tar xz -C /usr/local/bin

$ mole -help
usage:
  mole [-v] [-local [<host>]:<port>] -remote [<host>]:<port> -server [<user>@]<host>[:<port>] [-key <key_path>]
  mole -alias <alias_name> [-v] [-local [<host>]:<port>] -remote [<host>]:<port> -server [<user>@]<host>[:<port>] [-key <key_path>]
  mole -alias <alias_name> -delete
  mole -start <alias_name>
  mole -help
  mole -version

  -alias string
        Create a tunnel alias
  -delete
        delete a tunnel alias (must be used with -alias)
  -help
        list all options available
  -key string
        (optional) Set server authentication key file path
  -local value
        (optional) Set local endpoint address: [<host>]:<port>
  -remote value
        set remote endpoint address: [<host>]:<port>
  -server value
        set server address: [<user>@]<host>[:<port>]
  -start string
        Start a tunnel using a given alias
  -v
        (optional) Increase log verbosity
  -version
        display the mole version

Examples

Provide all supported options

$ mole -v -local 127.0.0.1:8080 -remote 172.17.0.100:80 -server user@example.com:22 -key ~/.ssh/id_rsa
DEBU[0000] cli options                                   key=/home/mole/.ssh/id_rsa local="127.0.0.1:8080" remote="172.17.0.100:80" server="user@example.com:22" v=true
DEBU[0000] using ssh config file from: /home/mole/.ssh/config
DEBU[0000] server: [name=example.com, address=example.com:22, user=user, key=/home/mole/.ssh/id_rsa]
DEBU[0000] tunnel: [local:127.0.0.1:8080, server:example.com:22, remote:172.17.0.100:80]
INFO[0000] listening on local address                    local_address="127.0.0.1:8080"

Use the ssh config file to lookup a given server host

$ cat $HOME/.ssh/config
Host example1
  Hostname 10.0.0.12
  Port 2222
  User user
  IdentityFile ~/.ssh/id_rsa

$ mole -v -local 127.0.0.1:8080 -remote 172.17.0.100:80 -server example1
DEBU[0000] cli options                                   key= local="127.0.0.1:8080" remote="172.17.0.100:80" server=example1 v=true
DEBU[0000] using ssh config file from: /home/mole/.ssh/config
DEBU[0000] server: [name=example1, address=10.0.0.12:2222, user=user, key=/home/mole/.ssh/id_rsa]
DEBU[0000] tunnel: [local:127.0.0.1:8080, server:10.0.0.12:2222, remote:172.17.0.100:80]
INFO[0000] listening on local address                    local_address="127.0.0.1:8080"

Let mole randomly select the local endpoint

$ mole -remote 172.17.0.100:80 -server example1
INFO[0000] listening on local address                    local_address="127.0.0.1:61305"

Bind the local address to 127.0.0.1 by specifying only the local port

$ mole -v -local :8080 -remote 172.17.0.100:80 -server example1
DEBU[0000] cli options                                   key= local="127.0.0.1:8080" remote="172.17.0.100:80" server=example1 v=true
DEBU[0000] using ssh config file from: /home/mole/.ssh/config
DEBU[0000] server: [name=example1, address=10.0.0.12:2222, user=user, key=/home/mole/.ssh/id_rsa]
DEBU[0000] tunnel: [local:127.0.0.1:8080, server:10.0.0.12:2222, remote:172.17.0.100:80]
INFO[0000] listening on local address                    local_address="127.0.0.1:8080"

Connect to a remote service that is running on 127.0.0.1 by specifying only the remote port

$ mole -v -local 127.0.0.1:8080 -remote :80 -server example1
DEBU[0000] cli options                                   key= local="127.0.0.1:8080" remote="127.0.0.1:80" server=example1 v=true
DEBU[0000] using ssh config file from: /home/mole/.ssh/config
DEBU[0000] server: [name=example1, address=10.0.0.12:2222, user=user, key=/home/mole/.ssh/id_rsa]
DEBU[0000] tunnel: [local:127.0.0.1:8080, server:10.0.0.12:2222, remote:127.0.0.1:80]
INFO[0000] listening on local address                    local_address="127.0.0.1:8080"

Create an alias, so there is no need to remember the tunnel settings afterwards

$ mole -alias example1 -v -local :8443 -remote :443 -server user@example.com

$ mole -start example1
DEBU[0000] cli options                                   options="[local=:8443, remote=:443, server=user@example.com, key=, verbose=true, help=false, version=false]"
DEBU[0000] using ssh config file from: /home/mole/.ssh/config
DEBU[0000] server: [name=example.com, address=example.com:22, user=user, key=/home/mole/.ssh/id_rsa]
DEBU[0000] tunnel: [local:127.0.0.1:8443, server:example.com:22, remote:127.0.0.1:443]
INFO[0000] listening on local address                    local_address="127.0.0.1:8443"

Wefunder is hiring a founder-in-residence

We've invested in hundreds of incredibly diverse startups: flying cars, robots, artificial pancreases, breweries, movie studios, and more. From solo founders toiling away in the coffee shop to unicorns now worth over $1 billion. In 34 of 50 states. Next up, 50!

Our goal:  to add so much value, that a founder would fail an intelligence test if they turned down our investment. It should be a no-brainer. Would you turn down Ron Conway? Exactly.

We also aim to sprinkle some of that Silicon Valley fairy dust over the rest of the country. We inspire more potential founders to get started, encourage them to think bigger, hone their pitch, help them focus, and give them the funding they need to prove to the "normals" in the world that they are not crazy (or, at least, crazy in a lucrative way). We believe founders who "have been there and done that" make the best early-stage investors.  That's why we are hiring a Founder-in-Residence.

Dream Candidate

Our dream candidate is a restless founder between startups, willing to commit a year. You likely finished a grueling grind and need to recover before jumping back into the next adventure.  You know - like us - that the job of an investor is so much easier than a founder. But you also know  - eventually - the siren call of another startup will beckon. 

The success or failure of your prior startup doesn't matter. What does matter is empathy for founders, the quality of advice you can offer, and the encouragement that you can give them.  The perfect candidate commands respect from founders: you've been in the trenches... and your advice proves to be true, more often than not.

In addition to empathy, we're also looking for intellectual curiosity and humbleness. One of the most fun parts of this job is to help founders in very different industries - not just tech. Are you the kind of person that would be excited to learn about the economics of ostrich farms?  We're your people.

Things You Could Do

We will design the role for your unique superpowers and interests.  Read our Charter, and tell us how you'd like to help.  However, here are some options:

  • Help a founder with the elevator pitch.  Most founders can't pitch their company. They often use jargon and fancy words. You understand that a 10 year old has to grok what they are doing in 5 seconds, and you know how to intrigue investors in as few words as possible.  Channel pg.
  • Help focus the founder on the most important thing. In the moment, everything seems like a hair on fire / cluster-fuck dumpster fire emergency.  But you have the experience and perspective to guide founders on what is truly important.
  • Spot formidable early founders.  You don't just follow the herd and like the same shiny things everyone else does. You can spot formidable founders - with no external stamps of credibility - before they have succeeded. You can sense founders who "have what it takes" after a half hour... not after a16z backs them.
  • Fix America. As a company, we've taken multiple train trips across the country, stopping in cities like Fargo and Boise. You would be offended if you knew how the local "angels" operated in these cities. We aim to give more power and leverage to the smart founders in these cities, so they can fund the next generation, and outcompete the old money which leads founders astray with horrible advice.
  • Product / Design Feedback.  We're big believers of the YC ethos: build quickly, talk to users, repeat as fast as possible.  We favor product thinkers and makers who can guide founders in this direction.
  • Be a firm cheerleader that practices tough love. One of the most valuable things an investor can do is believe in you, and prove it with their funding. It's a little like being a cheerleader, but with one difference:  when the founder is screwing up, you tell it to them straight, because you respect them. Since we're not a VC, we can be on the side of the founder... but that includes telling it like it is.
  • Be a connector.  Part of the thesis of Wefunder is that we have 200,000 investors who all want to help out and pay it forward in small ways... if only they knew how. You can match up founders with an expert who can help them with their particular challenge.
  • Run a mini-batch.  We run a tiny little incubator just for fun, and line up weekly fireside chats with awesome guest speakers. We imagine it like 2005-era-YC, with a dozen or so founders (but minus the 7%). You can help define and run these for different themes, in different cities.

Skills

This is short. You are a founder. You get shit done. You know how to create something from nothing through sheer force of will, despite the naysayers, obstacles be damned.  And you can inspire others to do that too.

Benefits

  • Market-rate salaries and options
  • Unlimited vacation days (mandatory 3 weeks off)
  • Medical, dental, & vision insurance
  • All-expense paid Wefunder vacation. We’ve taken two train trips across the country
  • Apple equipment. Whatever you want.
  • Lots of free food and drinks.
  • Reimbursement for classes and conferences.

To Apply

Please contact nick@wefunder.com with the following info:

  • What was your startup? What happened?
  • What was the most important thing you learned?
  • What is your biggest superpower?
  • What excites you about our Charter?
  • If the world collapsed, what is your post-apocalyptic survival skill?
  • Tell us how you are an interesting and cool human.

Understanding How Apache Pulsar Works


I will be writing a series of blog posts about Apache Pulsar, including some Kafka vs Pulsar posts. First up though I will be running some chaos tests on a Pulsar cluster like I have done with RabbitMQ and Kafka to see what failure modes it has and its message loss scenarios.

I will try to induce these failures by exploiting design defects, implementation bugs, or poor configuration on the part of the admin or developer.

In this post we’ll go through the Apache Pulsar design so that we can better design the failure scenarios. This post is not for people who want to learn how to use Apache Pulsar, but for those who want to understand how it works. Writing a clear, simple overview of its architecture has been a struggle, so I appreciate any feedback on this write-up.

Claims

The main claims that I am interested in are:

  • guarantees of no message loss (if the recommended configuration is applied and your whole data center doesn't burn to the ground)

  • strong ordering guarantees

  • predictable read and write latency

Apache Pulsar chooses consistency over availability, as do its sister projects BookKeeper and ZooKeeper. Every effort is made to give strong consistency.

We'll be taking a look at Pulsar's design to see if those claims are valid. In the next post we'll put the implementation of that design to the test. I won’t cover geo-replication in this post; we’ll look at that another day and focus on a single cluster for now.

Multiple layers of abstraction

Apache Pulsar has the high level concept of topics and subscriptions and at its lowest level data is stored in binary files which interleave data from multiple topics distributed across multiple servers. In between are a myriad of details and moving parts. I personally find it easier to understand the Pulsar architecture if I separate it out into different layers of abstraction, so that’s what I’ll do in this post.

Let's take a journey down the layers.

Fig 1. Layers of abstraction

Layer 1 - Topics, Subscriptions and Cursors

This is not a post about messaging architectures that you can build with Apache Pulsar. We’ll just cover the basics of what topics, subscriptions and cursors are but not any depth about the wider messaging patterns that Pulsar enables.

Fig 2. Topics and Subscriptions

Messages are stored in topics. A topic, logically, is a log structure with each message being at an offset. Apache Pulsar uses the term Cursor to describe the tracking of offsets. Producers send their messages to a given topic and Pulsar guarantees that once the message has been acknowledged it won’t be lost (bar some super bad catastrophe or poor configuration).
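
Here’s what that looks like from the producer side, as a minimal sketch with the official Java client (the service URL and topic name are placeholders):

import org.apache.pulsar.client.api.*;

public class ProducerSketch {
    public static void main(String[] args) throws PulsarClientException {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650") // placeholder broker address
                .build();

        Producer<byte[]> producer = client.newProducer()
                .topic("persistent://public/default/my-topic")
                .create();

        // send() blocks until the broker acknowledges the message;
        // once acknowledged, Pulsar promises not to lose it
        MessageId id = producer.send("hello pulsar".getBytes());

        producer.close();
        client.close();
    }
}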

A consumer consumes messages from a topic via a subscription. A subscription is a logical entity that keeps track of the cursor (the current consumer offset) and also provides some extra guarantees depending on the subscription type:

  • Exclusive Subscription - Only one consumer can read the topic via the subscription at a time

  • Shared Subscription - Competing consumers can read the topic via the same subscription at the same time.

  • Fail-Over Subscription - Active/Backup pattern for consumers. If the active consumer dies, then the backup takes over. But there are never two active consumers at the same time.

One topic can have multiple attached subscriptions. The subscriptions do not contain the data, only meta-data and a cursor.
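
And the consumer side, reusing the client from the producer sketch above (the subscription name is a placeholder; swap the subscription type to get the different behaviours):

// Reusing the client from the producer sketch above
Consumer<byte[]> consumer = client.newConsumer()
        .topic("persistent://public/default/my-topic")
        .subscriptionName("my-subscription")
        .subscriptionType(SubscriptionType.Shared) // or Exclusive / Failover
        .subscribe();

Message<byte[]> msg = consumer.receive();
consumer.acknowledge(msg); // advances this subscription only; others are unaffected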

Pulsar provides both queueing and log semantics by allowing consumers to treat a Pulsar topic like a queue that deletes messages after being acknowledged by a consumer, or like a log where consumers can rewind their cursor if they want to. Underneath the storage model is the same - a log.

If no data retention policy is set on a topic (via its namespace) then messages are deleted once all cursors of attached subscriptions have passed its offset. That is, the message has been acknowledged on all subscriptions attached to that topic.

However, if a data retention policy exists that covers the topic, then messages are removed once they pass the policy boundary (size of topic, time in topic).

Messages can also be sent with an expiration. These messages are deleted if they exceed the TTL while still unacknowledged. This means that they can be deleted before any consumer gets the chance to read them. Expiration only applies to unacknowledged messages and therefore fits more into the queuing semantics side of things.

TTLs apply to each subscription separately, meaning that “deletion” is a logical deletion. The actual deletion will occur later according to what happens in other subscriptions and any data retention policy.
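
As a sketch of how these policies can be set with the Java admin client (the namespace and values are placeholders; retention takes minutes and megabytes, TTL takes seconds):

import org.apache.pulsar.client.admin.PulsarAdmin;
import org.apache.pulsar.common.policies.data.RetentionPolicies;

PulsarAdmin admin = PulsarAdmin.builder()
        .serviceHttpUrl("http://localhost:8080") // placeholder admin endpoint
        .build();

// Retain acknowledged messages for up to 7 days or 512 MB
admin.namespaces().setRetention("public/default",
        new RetentionPolicies(7 * 24 * 60, 512));

// Expire unacknowledged messages after one hour
admin.namespaces().setNamespaceMessageTTL("public/default", 3600);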

Consumers acknowledge their messages either one by one, or cumulatively. Cumulative acknowledgement is better for throughput but introduces duplicate message processing after consumer failures. Cumulative acknowledgement is not available for shared subscriptions as acknowledgements are based on the offset. The consumer API does, however, allow for batched acknowledgements that end up with the same number of acks but fewer RPC calls. This can improve throughput for competing consumers on a shared subscription.
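
Continuing the consumer sketch from above, the two acknowledgement styles look like this:

// Acknowledge a single message
consumer.acknowledge(msg);

// Acknowledge everything on this subscription up to and including msg
// (not permitted on shared subscriptions)
consumer.acknowledgeCumulative(msg);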

Finally, there are partitioned topics, similar to Kafka's topics. The difference is that the partitions in Pulsar are also topics. Just like with Kafka, a producer can send messages round-robin, use a hashing algorithm or choose a partition explicitly.
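
As a sketch, again reusing the client object from earlier, the routing choice is made on the producer (the topic name and key are placeholders):

Producer<byte[]> partitioned = client.newProducer()
        .topic("persistent://public/default/my-partitioned-topic")
        .messageRoutingMode(MessageRoutingMode.RoundRobinPartition) // or SinglePartition
        .create();

// Messages with the same key consistently hash to the same partition
partitioned.newMessage()
        .key("user-42")
        .value("some payload".getBytes())
        .send();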

That was a whirlwind introduction to the high-level concepts, we’ll now delve deeper. Remember this is not a primer on Apache Pulsar from 10,000 feet but a look at how it all works underneath from 1000 feet.

Layer 2 - Logical Storage Model

Now Apache BookKeeper enters the scene. I will talk about BookKeeper in the context of Apache Pulsar, though BookKeeper is a general purpose log storage solution.

First of all, BookKeeper stores data across a cluster of nodes. Each BookKeeper node is called a Bookie. Secondly, Apache Zookeeper is used by both Pulsar and BookKeeper for storing meta-data and monitoring node health.

Fig 3. Apache Pulsar, BookKeeper and ZooKeeper working together

A topic is in fact a stream of Ledgers. A Ledger is a log in its own right. So we compose a parent log (the topic) from a sequence of child logs (Ledgers).

Ledgers are appended to a topic, and entries (messages or groups of messages) are appended to Ledgers. Ledgers, once closed, are immutable. Ledgers are deleted as a unit, that is, we cannot delete individual entries but ledgers as a whole.

Ledgers themselves are also broken down into Fragments. Fragments are the smallest unit of distribution across a BookKeeper cluster (depending on your perspective, striping might invalidate that claim).

Fig 4. Entries at the bottom

Topics are a Pulsar concept. Ledgers, Fragments and Entries are BookKeeper concepts, though Pulsar understands and works with ledgers and entries.

Each Ledger (consisting of one or more Fragments) can be replicated across multiple BookKeeper nodes (Bookies) for both redundancy and read performance. Each Fragment is replicated across a different set of Bookies (if enough Bookies exist).

Fig 5. Apache Pulsar, Apache BookKeeper and Apache Zookeeper working together

Each Ledger has three key configurations:

  • Ensemble Size (E)

  • Write Quorum Size (Qw)

  • Ack Quorum Size (Qa)

These configurations are applied at the Topic level, which Pulsar then sets on the BookKeeper Ledgers/Fragments of the topic.

Note: "Ensemble" means the actual list of Bookies that will be written to. Ensemble size is an instruction to Pulsar to say how big an ensemble it should create. Note that you will need at least E Bookies available for writes. By default, bookies are picked up randomly from the list of available bookies (each Bookie registers itself in Zookeeper).

There's also the option to configure rack-awareness, by marking Bookies to belong to specific racks. A rack can be a logical construct (eg: an availability zone in a cloud environment). With a rack-aware policy, the BookKeeper client of the Pulsar broker will try to pick Bookies from different racks. It's also possible to plug in a custom policy to perform a different type of selection.

Ensemble Size (E) governs the size of the pool of Bookies available for that Ledger to be written to by Pulsar. Each Fragment may have a different ensemble, the broker will select a set of Bookies on creating the fragment, but the ensemble will always be the size indicated by E. There must be enough Bookies that are write available to cover E.

Write Quorum (Qw) is the number of actual Bookies that Pulsar will write an entry to. It can be equal to or smaller than E.

Fig 6. A fragment of 8 entries stored across an ensemble of 3 with each entry written to 3 bookies.

When Qw is smaller than E then we get striping which distributes reads/writes in such a way that each Bookie need only serve a subset of read/write requests. Striping can increase total throughput and lower latency.

Ack Quorum (Qa) is the number of Bookies that must acknowledge the write, for the Pulsar broker to send its acknowledgement to its client. In order to be strongly consistent Qa should be: (Qw + 1) / 2 or greater. In practice it would either be:

  • (Qa == Qw) or

  • (Qa == Qw - 1) ---> This will improve latency by ignoring the slowest bookie.

Ultimately, every bookie must receive the write. But if we always wait for all bookies to respond we can get spiky latency and unappealing tail latencies. Pulsar promises predictable latencies after all.
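
As a sketch with the Java admin client from earlier (in the admin API these are set per namespace, and the namespace's topics inherit them; the values are just examples):

import org.apache.pulsar.common.policies.data.PersistencePolicies;

// E = 3, Qw = 3, Qa = 2; the final argument rate-limits mark-delete operations
admin.namespaces().setPersistence("public/default",
        new PersistencePolicies(3, 3, 2, 0.0));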

A Ledger is created when a topic is new or when roll-over occurs. Roll-over is the concept of creating a new Ledger when either:

  • a Ledger size or time limit has been reached

  • ownership (by a Pulsar broker) of a Ledger changes (more on that later).

A Fragment is created when:

  • a new Ledger is created

  • when a Bookie in the current Fragment ensemble returns an error or times out when a write occurs.

When a bookie cannot serve a write then the Pulsar broker gets busy creating a new fragment and making sure the write gets acknowledged by Qw bookies. It’s like the Terminator, it won’t stop until that message is persisted.

Insight #1: Increase E to optimize for latency and throughput. Increase Qw for redundancy at the cost of write throughput. Increase Qa to increase the durability of acknowledged writes at the increased risk of extra latency and longer tail latencies.

Insight #2: E and Qw are not a list of Bookies. They simply indicate how large the pool of Bookies that can serve a given Ledger is. Pulsar will use E and Qw in the moment that it creates a new Ledger or Fragment. Each Fragment has a fixed set of Bookies in its ensemble that will never change.

Insight #3: Adding new Bookies does not mean manual rebalancing needs to be performed. Automatically, those new Bookies will be candidates for new Fragments. After joining the cluster, new Bookies will be written to immediately upon new fragments/ledgers being created. Each Fragment can be stored on a different subset of Bookies in the cluster! We do not couple Topics or Ledgers to a given Bookie or set of Bookies.

Let’s stop and take stock. This is a very different and more complex model to Kafka. With Kafka each partition replica is stored in its entirety on a single broker. The partition replica is comprised of a series of segment and index files. This blog post nicely describes it.

The great thing about the Kafka model is that it is simple and fast. All reads and writes are sequential. The bad thing is that a single broker must have enough storage to cope with that replica, so very large replicas can force you to have very large disks. The second downside is that rebalancing partitions becomes necessary when you grow your cluster. This can be painful and requires good planning and execution to pull it off without any hitches.

Returning to the Pulsar + BookKeeper model. The data of a given topic is spread across multiple Bookies. The topic has been split into Ledgers and the Ledgers into Fragments and, with striping, into calculable subsets of fragment ensembles. When you need to grow your cluster, just add more Bookies and they’ll start getting written to when new fragments are created. No more Kafka-style rebalancing required. However, reads and writes now have to jump around a bit between Bookies. We’ll see how Pulsar manages this, and does it fast, further down in this post.

But now each Pulsar broker needs to keep track of the Ledgers and Fragments that each Topic is comprised of. This meta-data is stored in ZooKeeper and if you lose that then you’re in serious trouble.

In the storage layer we've written a topic evenly across a BookKeeper cluster. We've avoided the pitfalls of coupling Topic replicas to specific nodes. Where Kafka topics are like sticks of Toblerone, our Pulsar topics are like a gas expanding to fill the available space. This avoids painful rebalancing.

Layer 2 - Pulsar Brokers and Topic Ownership

Also in Layer 2 of my abstraction layers we have the Pulsar Brokers. Pulsar brokers have no persistent state that cannot be lost. They are separated from the storage layer. A BookKeeper cluster by itself does not perform replication, each Bookie is just a follower that is told what to do by a leader - the leader being a Pulsar broker. Each topic is owned by a single Pulsar broker. That broker serves all reads and writes of that topic.

When a Pulsar broker receives a write, it will perform that write against the ensemble of the current Fragment of that Topic. Remember that if no striping occurs the ensemble of each entry is the same as the fragment ensemble. If striping occurs then each entry has its own ensemble which is a subset of the fragment ensemble.

In a normal situation there will be a single Fragment in the current Ledger. Once Qa bookies have acknowledged the write the Pulsar broker will send an acknowledgement to the producer client.

An acknowledgement can only be sent if all prior messages have also been Qa acknowledged. If for a given message, a Bookie responds with an error or does not respond at all, then the broker will create a new Fragment on a new ensemble of Bookies (that does not include the problem Bookie).

Fig 8. A single broker serves all reads and writes of a given topic.

Note that the broker will only wait for Qa acks from the bookies.

Reads also go through the owner. The broker, being the singular entrypoint for a given topic, knows up to which offset has been safely persisted to BookKeeper. It needs only read from a single Bookie to serve a read. We’ll see in Layer 3 how it uses caching to serve many reads from its in-memory cache rather than sending reads to BookKeeper.

Fig 9. Reads only need go to a single Bookie

Pulsar Broker health is monitored by ZooKeeper. When a broker fails or becomes unavailable (to ZooKeeper) an ownership change occurs. A new broker becomes the topic owner and all clients are now directed to read/write to this new broker.

BookKeeper has a critically important functionality called Fencing. Fencing allows BookKeeper to guarantee that only one writer (Pulsar broker) can be writing to a ledger.

It works as follows:

  1. The current Pulsar broker (B1) that has ownership of topic X is deemed dead or unavailable (via ZooKeeper).

  2. Another broker (B2) updates the state of the current ledger of topic X to IN_RECOVERY from OPEN.

  3. B2 sends a fence message to all bookies of the current fragment of the ledger and waits for (Qw-Qa)+1 responses. Once this number of responses is received the ledger is now fenced. The old broker if it is in fact still alive, can no longer make writes as it will not be able to get Qa acknowledgements (due to fencing exception responses).

  4. B2 then requests from each bookie in the fragment ensemble what their last acknowledged entry is. It takes the most recent entry id and then starts reading forward from that point. It ensures that all entries from that point on (which may not have been previously acknowledged to the Pulsar broker) get replicated to Qw bookies. Once B2 cannot read and replicate any more entries, the ledger is fully recovered.

  5. B2 changes the state of the ledger to CLOSED

  6. B2 can now accept writes and opens a new ledger.

The great thing about this architecture is that by making the leaders (the Pulsar brokers) have no state, split-brain is trivially taken care of by BookKeeper's fencing functionality. There is no split-brain, no divergence, no data loss.

Layer 2 - Cursor Tracking

Each subscription stores a cursor. The cursor is the current offset in the log. Subscriptions store their cursor in BookKeeper in ledgers. This makes cursor tracking scalable just like topics.

Layer 3 - Bookie Storage

Ledgers and Fragments are logical constructs which are maintained and tracked in ZooKeeper. Physically, the data is not stored in files that correspond to Ledgers and Fragments. The actual implementation of storage in BookKeeper is pluggable and Pulsar uses a storage implementation called DbLedgerStorage by default.

When a write to a Bookie occurs, first that message is written to a journal file. This is a write-ahead log (WAL) and it helps BookKeeper avoid data loss in the event of a failure. It is the same mechanism by which relational databases achieve their durability guarantees.

The write is also made to the Write Cache. The Write Cache accumulates writes and periodically sorts and flushes them to disk in Entry Log files. Writes are sorted so that entries of the same ledger are placed together which improves read performance. If the entries are written in strict temporal order then reads will not benefit from a sequential layout on disk. By aggregating and sorting we achieve temporal ordering at the ledger level which is what we care about.

The Write Cache also writes the entries to RocksDB which stores an index of the location of each entry. It simply maps (ledgerId, entryId) to (entryLogId, offset in the file).

Reads hit the Write Cache first as the write cache has the latest messages. If there is a cache miss then it hits the Read Cache. If there is a second cache-miss then the Read Cache looks up the location of the requested entry in RocksDB and then reads that entry in the correct Entry Log file. It performs a read-ahead and updates the Read Cache so that following requests are more likely to get a cache hit. These two layers of caching mean that reads are generally served from memory.
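
As a rough sketch of that flow (this is not the actual BookKeeper code and the helper names are made up):

// Hypothetical sketch of the DbLedgerStorage read path
ByteBuf getEntry(long ledgerId, long entryId) {
    ByteBuf entry = writeCache.get(ledgerId, entryId);     // 1. freshest entries
    if (entry != null) return entry;

    entry = readCache.get(ledgerId, entryId);              // 2. recently read / read-ahead
    if (entry != null) return entry;

    // 3. RocksDB index: (ledgerId, entryId) -> (entryLogId, offset)
    long location = locationIndex.get(ledgerId, entryId);
    entry = entryLogger.read(location);                    // 4. read from the Entry Log file

    readCache.fillReadAhead(ledgerId, entryId);            // warm the cache for the next reads
    return entry;
}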

BookKeeper allows you to isolate disk IO from reads and writes. Writes are all written sequentially to the Journal file that can be stored on a dedicated disk and are committed in groups for even greater throughput. After that no other disk IO is synchronous from the point of view of the writer. Data is just written to memory buffers.

Asynchronously on background threads, the Write Cache performs bulk writes to Entry Log files and RocksDB, which typically run on their own shared disk. So one disk for synchronous writes (journal file) and another disk for asynchronous optimized writes and all reads.

On the read-side, readers are served from either the Read Cache or from the Entry Log files and RocksDB.

Also take into account that writes can saturate the ingress network bandwidth and reads can saturate the egress network bandwidth, but they do not affect each other.

This elegantly isolates reads from writes at both the disk and the network level.

Fig 10. A Bookie with the default (with Apache Pulsar) DbLedgerStorage architecture.

Layer 3 - Pulsar Broker Caching

Each topic has a single broker that acts as owner. All reads and writes go through that broker. This provides many benefits.

Firstly, the broker can cache the log tail in memory meaning that the broker can serve tailing readers itself without the need for BookKeeper. This avoids paying the cost of a network round-trip and a possible disk read on a Bookie.

The broker is also aware of the id of the Last Add Confirmed entry. It can track which message is the last safely persisted message.

When the broker does not have the message in its cache it will request the data from one Bookie in the ensemble of the Fragment of that message. This means that the difference in read serving performance between tail readers and catch-up readers is large. Tail readers can be served from memory on the Pulsar broker whereas a catch-up reader may have to incur the cost of an extra network round trip and multiple disk reads if neither the Write nor Read Cache have the data.

So we’ve covered from a high level the logical and physical representation of messages, as well as the different actors in a Pulsar cluster and their relationships with each other. There is plenty of detail that has not been covered but we’ll leave that as an exercise for another day.

Next up we’ll cover how an Apache Pulsar cluster ensures that messages are sufficiently replicated after node failures.

Recovery Protocol

When a bookie fails, all the ledgers that have fragments on that bookie are now under replicated. Recovery is the process of "rereplicating" fragments to ensure the replication factor (Qw) is maintained for each ledger.

There are two types of recovery: manual or automatic. The rereplication protocol is the same for both, but Automatic Recovery uses a built-in failed-node detection mechanism that registers rereplication tasks to be performed, whereas the manual process requires an operator to initiate it.

We'll focus on the Auto Recovery mode.

Auto Recovery can be run from a dedicated set of servers or hosted on the Bookies, in the AutoRecoveryMain process. One of the auto-recovery processes gets elected as Auditor. The role of the Auditor is to detect downed bookies and then:

  1. Read the full ledger list from ZK and find the ledgers hosted on the failed bookie.

  2. For each ledger it will create a rereplication task in the /underreplicated znode in ZooKeeper.

If the Auditor node fails then another node gets promoted as the Auditor. The Auditor is a thread in the AutoRecoveryMain process.

The AutoRecoveryMain process also has a thread that runs a Replication Task Worker. Each worker watches the /underreplicated znode for tasks.

On seeing a task it will try and lock it. If it is not able to acquire the lock, it will move onto the next task.

If it does manage to acquire a lock it then:

  1. Scans the ledger for fragments which its local bookie is not a member of

  2. For each matching fragment, it replicates the data from another bookie to its own bookie, updates ZooKeeper with the new ensemble and the fragment is marked as fully replicated.

If the ledger has remaining underreplicated fragments then the lock is released. If all fragments are fully replicated the task is deleted from /underreplicated.

If a fragment does not have an end entry id then the replication task waits and checks again; if the fragment still has no end entry id, it fences the ledger before rereplicating the fragment.

Therefore, with Auto Recovery mode, a Pulsar cluster is able to fully manage the details of replication to ensure the correct replication factor for each ledger. The admin must just ensure that the right number of bookies is deployed.

ZooKeeper

ZooKeeper is required by both Pulsar and BookKeeper. If a Pulsar node loses visibility of all ZooKeeper nodes then it stops accepting reads and writes and restarts itself. This is a precaution to ensure that the cluster cannot enter an inconsistent state.

This does mean that if ZooKeeper goes down, everything becomes unavailable and that all Pulsar node caches will be wiped. Therefore upon resumption of service there could in theory be a latency spike due to all reads going to BookKeeper.

Round Up

  • Each topic has an owner broker

  • Each topic is logically broken down into Ledgers, Fragments and Entries

  • Fragments are distributed across the bookie cluster. There is no coupling of a given topic to a given bookie or set of bookies.

  • Fragments can be striped across multiple bookies.

  • When a Pulsar broker fails, ownership of the topics of that broker fails over to another broker. Fencing prevents two brokers that both believe themselves the owner from writing to the current topic ledger at the same time.

  • When a bookie fails, auto recovery (if enabled) will automatically perform “rereplication” of the data to other bookies. If disabled, a manual process can be initiated

  • Brokers cache the log tail allowing them to serve tailing readers very efficiently

  • Bookies use a journal to provide guarantees on failure. The journal can be used to recover data not yet written to Entry Log files at the time of the failure.

  • Entries of all topics are interleaved in Entry Log files. A lookup index is kept in RocksDB.

  • Bookies serve reads as follows: Write Cache -> Read Cache -> Entry Log files

  • Bookies can isolate read IO from write IO by using separate disks for journal files, Entry Log files and RocksDB.

  • ZooKeeper stores all meta-data for both Pulsar and BookKeeper. If ZooKeeper is unavailable Pulsar is unavailable.

  • Storage can be scaled out separately to the Pulsar brokers. If storage is the bottleneck then simply add more bookies and they will start taking on load without the need for rebalancing.

Fig 11. Round-up of concepts

Some Initial Thoughts on Potential Data Loss

Let’s look at the RabbitMQ and Kafka acknowledged write message loss scenarios and see if they apply to Pulsar.

RabbitMQ split-brain with either Ignore or Autoheal mode.

The losing side of the partition loses any messages delivered since the partition began that were not consumed.

Split-brain of the storage layer is theoretically impossible with Apache Pulsar.

Apache Kafka, acks=1 and broker with leader replica dies.

Fail-over to a follower in the ISR occurs with potential message loss as an ack was sent once the leader had persisted the message but potentially before the follower was able to fetch it.

Apache Pulsar has no leader storage node. Given a replication factor (Qw) of 2 or more, there simply is no way for a single node failure to cause message loss.

Let’s consider two scenarios.

Scenario 1. E = 3, Qw = 2, Qa = 1. The broker sends the write to two bookies. Bookie 1 and Bookie 2 return an ack to the broker, which then sends an ack to its client. Now to produce message loss we’d need both Bookie 1 and Bookie 2 to fail. If any single bookie dies then the Auto Recovery protocol will kick in.

Scenario 2. E = 3, Qw = 2, Qa = 1. The broker sends the write to two bookies. Bookie 1 returns an ack to the broker, which then sends an ack to its client. Bookie 2 has not responded yet. Now to produce message loss we’d need both the broker and Bookie 1 to fail, and Bookie 2 to have not successfully made the write. If only Bookie 1 dies, then the broker will still end up writing the message to a second bookie in the end.

The only way a failure of a single node could cause message loss is if the Qw is 1 (which means no redundancy). Then the only copy of a message could be lost when its bookie fails. So if you want to avoid message loss, make sure you have redundancy (Qw >= 2).

Apache Kafka. Node with leader partition is isolated from ZooKeeper.

This causes short-term split-brain in Kafka.

With acks=1, the leader will continue to accept writes until it realizes it cannot talk to ZooKeeper at which point it will stop accepting writes. Meanwhile a follower got promoted to leader. Any messages persisted to the original leader during that time are lost when the original leader becomes a follower.

With acks=all, if the followers fall behind and get removed from the ISR, then the ISR consists only of the leader. Then the leader becomes isolated from ZooKeeper and continues to accept acks=all messages for a short while even after a follower got promoted to leader. The messages received in that short time window are lost when the leader becomes a follower.

Apache Pulsar cannot have split-brain of the storage layer. If the current broker owner gets isolated from ZooKeeper, or suffers a long GC, or its VM gets suspended, and another broker becomes the owner, then still only one broker can write to the topic. The new owner will fence off the ledger and prevent the original leader from making any writes that could get lost.

Apache Kafka. Acks=all with leader failure.

The followers fall behind and get removed from the ISR. Now the ISR consists of a single replica. The leader continues to ack messages even with acks=all. The leader dies. All unreplicated messages are lost.

Apache Pulsar uses a quorum based approach where this cannot happen. An ack can only be sent once Qa bookies have persisted the message to disk.

Apache Kafka. Simultaneous Loss of Power (Data Center Outage)

Kafka will acknowledge a message once written to memory. It fsyncs to disk periodically. When a data center suffers a power loss, all servers could go offline at the same time. A message might only be in memory on all replicas. That message is now lost.

Apache Pulsar only acks a message once Qa bookies have acknowledged the message. A bookie only acknowledges an entry once it is persisted to its journal file on disk. Simultaneous power loss to all servers should not lose messages unless multiple disk failures also occur.

So far Apache Pulsar is looking pretty robust. We’ll have to see how it fares in the chaos testing.

Conclusion

There are more details that I have either missed out or don’t yet know about. Apache Pulsar is significantly more complicated than Apache Kafka in terms of its protocols and storage model.

The two stand-out features of a Pulsar cluster are:

  • Separation of brokers from storage, combined with BookKeeper's fencing functionality, elegantly avoids split-brain scenarios that could provoke data loss.

  • Breaking topics into ledgers and fragments, and distributing those across a cluster, allows Pulsar clusters to scale out with ease. New data automatically starts getting written to new bookies. No rebalancing is required.

Plus I haven’t even gotten to geo-replication and tiered storage which are also amazing features.

My feeling is that Pulsar and BookKeeper are part of the next generation of data streaming systems. Their protocols are well thought out and rather elegant. But with added complexity comes added risk of bugs. In the next post we’ll start chaos testing an Apache Pulsar cluster and see if we can identify weaknesses in the protocols, and any implementation bugs or anomalies.

A guide to rhythm in web typography


This article is based on chapter 6 from the book Better Web Typography for a Better Web by Matej Latin. The book consists of 13 chapters through which the reader designs and builds an example website with awesome typography.

___

Rhythm in typography is just like rhythm in music. A text can either flow like a masterpiece symphony performed by an in-tune orchestra, or it can be a disjointed flimsy song by a one-man band wannabe. Just like in music, where order is more pleasurable to our ears than chaos, so is a well-designed text with an established rhythm easier to read and more enjoyable to consume by our eyes. Ears and eyes are just the sensory tools; it’s our mind that processes the information. And our mind is a machine for pattern recognition. That’s why a well-tuned, rhythmic and proportional text will always triumph over a scrappy one. But, unlike in music, there are two types of rhythm in typography: horizontal and vertical.

The music analogy works very well with typography because your text will either be easy to read—the reader gets into a flow, not focusing on reading but simply consuming the content—or the reader will struggle through the information before finally giving up. Horizontal rhythm mostly impacts the legibility, while vertical rhythm impacts the readability of the text and establishes a sense of visual hierarchy.

Vertical and horizontal rhythm in web typography.

Horizontal rhythm

Letter spacing (tracking)

Letter spacing is more commonly known as tracking in print design. It can have a massive impact on the legibility of the words, so it should be used with caution. Letter spacing lower case text is not recommended. “A man who would letterspace lower case would steal sheep,” Frederic Goudy used to say. Letter spacing impacts legibility because it makes words harder for our brain to decipher. Reading slows down, even for fast readers. Unless you have a very good reason for doing so, don’t letter space the main body text.

There are two occasions when letter spacing can have a positive impact. Headings tend to be larger and heavier than the body text. Because of their larger size, the spacing between the letters also looks optically larger than at smaller sizes. In this case, it’s a good idea to slightly reduce the letter spacing. We’re talking about 3–5%, not more. This will make your heading a bit more compact, and a bit closer in appearance to the body type.
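
In CSS that reduction could look something like this (treat the exact value as a starting point to adjust for your font):

h1, h2, h3 {
  letter-spacing: -0.04em; // roughly 4% tighter, within the 3–5% range
}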

Applying negative letter spacing to headings makes them more compact and closer in appearance to the body type.

Another occasion where letter spacing can be useful is when applied to small caps or uppercase text only. Using uppercase for longer text is a really bad idea, so this might be best combined with headings again. Whenever we have a line of text set in all uppercase or small caps (we’ll cover small caps in the second part of the book), it’s a good idea to increase the spacing between the letters just a bit. Again, we’re talking about small increases, but just enough to make a difference. My recommendation is from 5 to 10%.

Applying letter spacing to uppercase or small caps helps with legibility.

By doing so, we make the uppercase letters and words easier to read and process because the letters are easier to recognise. Besides that, a bit more space between the letters will add a touch of sophistication to our design. Pay attention to well-designed products or brands that use all uppercase in their branding. You’ll notice that most of them are letter spaced.
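
Again as a starting point (the class name here is just an example):

.caps {
  text-transform: uppercase;
  letter-spacing: 0.08em; // roughly 8%, within the 5–10% range
}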

Letter spacing acronyms and long series of digits is also recommended, even in the body text.

Kerning

Spacing between different letters is far from equal. Each letter comes with a default set of spacing around it, no matter the neighbouring letter. That’s why we get inconsistencies like this:

Bad kerning, also known as keming. In this particular case it’s so bad that the word “SAVE” seems to be two words “SA” and “VE”.

Kerning—altering the spaces between particular pairs of letters—can resolve these issues. The result is a much better proportioned word and optical perfection. Kerning, unlike letter spacing, changes the spacing around letters depending on their neighbouring letters.

Fixing the bad kerning.

Most web browsers default kerning to auto. This means that kerning will be enabled for larger type and disabled for smaller. Bad kerning is not as obvious on small type. If you wish, you can control it like this:

font-kerning: auto; // default
font-kerning: normal; // enables kerning
font-kerning: none; // disables kerning

That’s about it when it comes to what we can do with the default browser support right now. This is probably not good enough for special occasions when we need to move a particular letter by x pixels to achieve that optical perfection. Thankfully, there are tools like Lettering.js. With it, we can control the positioning (and also style) of each letter.

Kerning in Sketch

Sketch comes with default kerning enabled but you can change it or disable it completely if needed. To do that, you need to select the text (actual text, not the text box) and go to Type > Kern > Tighten if you want tighter kerning and Type > Kern > Loosen if you want looser kerning. Choose the default option to go back to normal or the disable option to completely disable it.

Don’t justify on the web

A break in music has a meaning. It separates the sound from silence. Nothingness from a rich sound of a chord. It seemingly breaks the rhythm (even though breaks in music always match the rhythm). We get the same effect in typography. A combination of letters, words and empty spaces defines the rhythm. For reading to flow, that rhythm needs to be consistent. And because (as we learned earlier) we read word by word, too much spacing between words breaks this rhythm. It breaks the flow of reading. It turns the easiest text to read into something that is hard to consume, no matter the language or words used. I still encounter this far too often on the web:

Comparing left-aligned and justified text on the web (no hyphenation).

Comparing two justified paragraphs: one hyphenated, one not.

Web browsers render justified text very poorly. They don’t have the necessary algorithms to properly set the spacing between words and even individual letters (which is what proper text editors do). That’s why justified texts come with patches of space between the words: rivers of white running through the black of the text. This makes it very hard to read, so justifying text on the web should be avoided at all costs. Web browsers are getting better at supporting hyphenation, though. If you do use justified text, complement it with hyphenation. For now, I recommend not using justified alignment at all.

Justified alignment and accessibility

Benedicte, a student from the Better Web Type course, sent me an email recently. He told me that he works for an organisation in Norway that specialises in books for readers with special needs. He pointed out that people with dyslexia have a particular problem reading justified text. It’s not clear where a line of text ends, which makes it very easy to switch to the wrong line. I did a bit of research and even Gov.uk (a UK public sector information website) recommends aligning text to the left.

Paragraph indenting

A common way to visually separate paragraphs in books is to indent the first line. It’s actually more common than putting an empty line between them. In terms of rhythm, we’re changing the horizontal rhythm to separate paragraphs instead of the vertical one. The opposite is true on the web—paragraphs are more commonly spaced apart—but indenting the first line is quite a simple thing to do. There are two rules that must be followed:

1. Don’t indent the first line of the first paragraph or of the paragraph that comes after a heading, an image or any other type of figure. I know it sounds quite complicated but it’s actually very simple when it comes to CSS.

p + p {
  text-indent: 1em;
}

This works because the text-indent property will only be applied to paragraphs that are preceded by another paragraph. If a paragraph is preceded by an h1, for example, the text indent won’t be applied. That’s exactly what we want.

2. Don’t put a bottom margin on your paragraphs. They’re visually divided by the indented line and that’s enough. There’s no point in having space between them. Indented and spaced-apart paragraphs make skilled typographers cringe.

p {
  margin: 0 auto;
}

This will set the top and bottom margins of all paragraphs to 0, just to make sure that a top margin doesn’t put a blank space between the paragraphs.

I recommend doing this for texts that aren’t broken down into different sections, divided by titles or a lot of images. It simply works best with longer texts divided into paragraphs. Long articles or web books are best-use cases for paragraph indentation.

How much indentation?

The recommended and most common paragraph text indent is 1em, just like we set in our example above. It’s also common to set it to one unit of line height (more on line height being a unit in the upcoming vertical rhythm section). So if we had a paragraph with a font size of 18 pixels and a line height of 27 pixels, we would set the paragraph indent to 27 pixels, or 1.5em (equalling the line height of 1.5). Half an em is considered the minimum for paragraph indentation and 3em the maximum. My recommendation is either 1em or equal to one unit of line height.
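
For example, to tie the indent to one line-height unit in the 18-pixel, 1.5-line-height case:

p + p {
  text-indent: 1.5em; // one line-height unit: 27px at an 18px font size
}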

Hanging punctuation

Hanging punctuation is something 9 out of 10 websites get wrong. I’m sure that once I tell you about it, you’ll start noticing it everywhere. By its definition, hanging punctuation “is a way of typesetting punctuation marks and bullet points, most commonly quotation marks and hyphens, so that they do not disrupt the ‘flow’ of a body of text or ‘break’ the margin of alignment”. Let’s take a look at a few examples.

Can you notice the difference in the image above? It’s jarring to me and it really hurts to see the example on the left so often. The quotation marks at the very beginning of the paragraph must sit outside the main block of text so it doesn’t disrupt the flow, or the rhythm, of it. It’s a small detail, I know, but it can have a major impact on the overall look of your typography if done right. So how do we do it right? That’s where it gets a bit complicated. There is a CSS property for hanging punctuation but it’s only well supported by Safari at the time of writing this.

blockquote p {
  hanging-punctuation: first;
}

Setting the hanging-punctuation property to “first” means that “An available character at the start of the first formatted line of an element hangs”. In our case this means the quotation mark at the start of the first line of our paragraph hangs. The “available character” part simply means that the quotation mark is on the list of characters that CSS considers as the ones that can hang.

That’s great, but because of the poor browser support, we can’t use this at the time of writing this book. Too bad, that would make it so easy. Well, as it turns out, there’s a workaround that’s equally simple: negative text indent.

blockquote p {
  text-indent: -0.5em;
}

Here’s what we get:

Just make sure you change its value so it works with the font you’re using. -0.5em should be pretty much there but make sure you change it if needed, so it’s spot on. Take a look at the live example at betterwebtype.com/book/c6e1.

There’s a problem with both of these solutions that I haven’t told you about yet. They only work when the quotation mark comes at the start of the first line. What if we were to quote someone in the middle of a sentence and that quotation happens to start on a new line?

Neither solution works in this example. The quotation mark is pushed into the main body of the text. It’s aesthetically unappealing but there’s not much we can do about it. At least not yet.

Hanging punctuation with other characters

Hanging punctuation should be applied to other characters as well. After the quotation marks, the next most important are bullets. Again, most websites get this wrong but, actually, browsers get this wrong by default as well. This is how most browsers will render ordered and unordered lists by default.

In fact, the bullets should be hanging so they don’t disrupt the flow of the text. To keep the horizontal rhythm undisrupted we need to change the padding of the unordered and ordered list elements.

ul, ol {
  padding-left: 0;
  list-style-position: outside;
}

Note: Also make sure that list-style-position is set to outside.

This will push the bullets outside and keep the horizontal rhythm unaffected. Having the bullet points inside the main block of text is not as big a crime as having the quotation marks inside. It is typographically correct but I know that it looks strange to some people, at least at first. They don’t mind it once they get used to it. We, as a web design community, have been making mistakes like this for so long that these things need to be re-hardwired. I, personally, see the hanging bullets as a recommendation, but hanging quotation marks as a rule. Take a look at a live example of the list at betterwebtype.com/book/c6e2.

Vertical rhythm

Let’s say that a website has the main body text set at 20 pixels and a line height of 30 pixels. The line length should be appropriate for this size and line height: 600 pixels looks just about right. We now have all we need to set up the vertical rhythm throughout the page. To do that, we need a basic rhythmical unit. In typography, that’s line height. We can see it by adding a 30-pixels-tall baseline grid to our website.

p {
  font-size: 20px;
  line-height: 30px;
  max-width: 600px;
}

Baseline grid indicates equal line height and vertical rhythm.

Note: Unlike in print and graphic design, the baseline grid lies right in the middle of the lines. Lots of people ask me if it shouldn’t lie right at the bottom of the bodies of letters. Not on the web. Take a look at how web browsers interpret a line of text:

A baseline grid on the web falls right in the middle between the lines, unlike in print where letter bodies lie directly on it.

We only have a paragraph of text for now, so everything looks right. To keep this rhythm going, we need to use the line height as a base unit for every size, margin and padding on the site. Let’s see. We want to add a title to our text. We assign a size of 55 pixels to it, to make it stand out. Its line height now needs to be an even multiple of the original (base) line height, our main rhythmic unit. This also applies to its margins—especially the top and bottom ones.

h3 {
  font-size: 55px;
  line-height: 60px; // = 2 × 30px (main body text line-height)
  margin-top: 90px; // = 3 × 30px
  margin-bottom: 30px; // = 1 × 30px
}

Note: I’m using pixels for these examples but you should be using units like em, rem or just a decimal 1.5 or a percentage (150%) value for line height.

Heading 3’s line height equals two lines, its margins equal three lines on top and one line at the bottom. Everything falls into place.

We assigned a line height of 60 pixels because it’s the next multiple of our base line height (30 pixels) that comfortably accommodates the title set at 55 pixels. A guideline for heading margins that I like to stick with is that the bottom margin should be noticeably smaller than the top one. A lot of websites make the mistake of having equal top and bottom margins for headings, so they float right in the middle between the blocks of text. The title needs to visually connect with the text beneath it, the text it’s actually referring to. And that’s exactly what we want to achieve with vertical rhythm and visual hierarchy. A reader can now understand the structure of the text by simply scanning it.

Because we’ll need this in most cases, it’s best practice to assign a default line height, margins and paddings to all elements and deviate from them only when necessary.

* {
  line-height: 30px;
  margin-top: 0;
  margin-bottom: 30px; // = 1 × 30px
}

So if you want your lists to have a specific bottom margin, you’d go for something like this:

ul, ol {
  margin-bottom: 60px; // = 2 × 30px
}

You may be wondering what happens when an image breaks the vertical rhythm. It’s a good question. Images come in various sizes. It’s impossible to expect that we’ll be able to control their height on every occasion, especially on large-scale websites. What do we do then? My suggestion: let it be. Use the baseline grid as a guide, not as a restraint. Your text and your page proportions are still rhythmically correct. The page still has an established visual hierarchy. Large solid blocks of content, which images certainly are, don’t visually break the rhythm apart. An image may break your grid, yes, but at the end of the day it doesn’t matter. The grid is just a tool.

Vertical rhythm with Sass

Last year, I built a tool revolving around vertical rhythm and modular scales in typography. I called it Gutenberg—a meaningful web typography starter kit. Sass was the one thing that made it so much easier to build; in fact, I don’t think it could have been built without it. Its main goal is that a developer sets a few basic sizes (font size, line height and max width) and Gutenberg takes care of the proportions and vertical rhythm. I now use it for every website I build. Here’s the main thing that Sass made so much easier:

$base-font-size: 112.5; // Is used as %
$line-height: 1.5; // = 27px

$base: 16 * ($base-font-size / 100);
$leading: $base * $line-height;

@mixin margin-bottom($number){
  margin-bottom: #{$number * $leading + 'px'};
  margin-bottom: #{$number * $line-height + 'rem'};
}

This is the mixin I used a lot. Gutenberg has similar mixins for other margins and padding as well. Instead of manually setting any margins or paddings, I used mixins every time. That way, I’m sure the vertical rhythm is left intact. An example of using the mixin above is:

h3 {
  @include margin-bottom(2);
}

Which translates to:

h3 {
  margin-bottom: 54px; // 2 × line-height (27px)
  margin-bottom: 3rem; // 2 × line-height (1.5)
}

The Sass mixin sets the bottom margin in rems, and in pixels as a fallback.

Vertical rhythm in Sketch

You don’t need a plugin to set a baseline grid in Sketch. We can do that with the features Sketch offers out of the box with a simple workaround. Once in Sketch, add an artboard. Then, go to View > Canvas > Layout Settings. The layout settings window will open. Here’s what you need to do:

  1. Tick the “Columns” checkbox and change the value of “Total width” to match the width of your artboard. “Offset” must be set to 0. You can disable the “Columns” checkbox then, as we don’t really need them.
  2. Now tick the “Rows” checkbox if it isn’t already. Change the “Gutter height” to “1px” and the “Row height” to your line height in pixels + 1 pixel (to accommodate the “Gutter height” we set in the previous step). In this case, that’s 27 + 1, which translates to 28 pixels.
  3. Then change the “Visuals” radio to “Stroke outline”.
  4. Change the colour of the lines to something subtle. You need to change the “Dark” colour. I usually set it to a light shade of blue. Click “OK”.

Bam! You now have a baseline grid in Sketch. Simple, right?

Note: your content will display behind the grid and there’s no way of changing that. What you can do is change the opacity of the lines. You can do that in the colour selection window while selecting the colour in Step 4 above.

An example of rhythm in web typography

This article is based on chapter 6 from the book Better Web Typography for a Better Web. By this point, the reader would already be familiar with the anatomy of the typeface, choosing and combining fonts, and how to shape perfect paragraphs. They have already applied all this to the example that we’re now starting to work on.

___

Well done for making it this far. Rhythm is one of the most important things in typography and it doesn’t take too much effort to get it right. I know that some designers like to use what looks right to their eyes (they make decisions based on what looks right optically), and others prefer to back their decisions with maths. I believe both can and should work together. I like to start off with what’s mathematically correct and make optical corrections when they’re needed.

In the previous chapter, we decided which typeface to use for headings on our example website. The following is where we left off.

We have our body text set to 18 pixels and our line height is 1.45 (26 pixels). With that, we have everything we need to apply vertical rhythm to our website. I’ll be using Sass throughout this example and I’ll only be focusing on desktop, to make things easier to understand. In a real example, the process would need to be repeated for the mobile text size and line height (if it’s not the same as on desktop)—something we’ll cover in the chapter about responsive web typography.

Let’s start by setting the line height and the bottom margin for all elements.

// Variables
$base-font-size: 112.5; // Gets used as %
$line-height: 1.45;

// Vertical rhythm mixins
@mixin line-height($number) {
  line-height: #{ $number * $line-height + 'rem'};
}

@mixin margin-top($number) {
  margin-top: #{ $number * $line-height + 'rem'};
}

@mixin margin-bottom($number) {
  margin-bottom: #{ $number * $line-height + 'rem'};
}

html {
  font-size: #{$base-font-size + '%'}; // 112.5% = 18 pixels
}

* {
  @include line-height(1);
  @include margin-bottom(1);
  @include margin-top(0);
}

We’ve just reset the line height and the margins of every element. They all fit the baseline grid now. But in doing so, we created a problem: headings are usually larger than the body text size and won’t fit into a single line height. We need to change the line height for all headings. Let’s change their margins while we’re at it as well.

The easiest way to do that is to create an object with a list of all headings and their values for their line height, top and bottom margins.

// Headings parameters [ h1: line-height: 2 × 26px, margin-top: 3 × 26px, margin-bottom: 1 × 26px ]
$headings: (
  h1: (2, 3, 1),
  h2: (1.5, 2, 1),
  h3: (1.5, 1, 0),
  h4: (1, 1, 0),
  h5: (1, 1, 0),
  h6: (1, 1, 0)
);

// Set line-heights and margins
@each $heading, $properties in $headings {
  #{$heading} {
    @include line-height(nth($properties, 1));
    @include margin-top(nth($properties, 2));
    @include margin-bottom(nth($properties, 3));
  }
}

All right, now we’re talking. Even the major text elements fit the baseline grid now. And that’s pretty much it when it comes to vertical rhythm. We have a good foundation to work on. Let’s make sure we set all our line heights and margins with the Sass mixins from now on and everything will be all right. Check out our example website so far at betterwebtype.com/book/c6.

From here on, the book explores modular scales and what “meaningful typography” means, composing pages and responsive web typography, and also dives into 4 additional chapters about micro typography. This is the finished live example website that gets built as you progress through the chapters.

___

Free web typography course

Get 7 free web typography lessons and learn how to craft websites with better web typography in just a week!

Sign up for the free web typography course →

Resources & tools for vertical rhythm

Here’s a list of really cool and useful tools and resources when it comes to rhythm in web typography.

  • Syncope

    Syncope is a WYSIWYG tool for establishing vertical rhythm on websites.

  • Archetype

    Create beautiful web typography designs, in the browser.

  • Grid Lover

    Establish a typographic system with modular scale & vertical rhythm.

  • Gutenberg

    A meaningful web typography starter kit.

Vectorized Emulation: fuzzing at 2 trillion instructions per second


This is the introduction of a multipart series. It gives a high-level overview without diving deeply into any individual component.

Vectorized emulation, why do I do this to myself?

Why

Date        Info
2018-10-14  Initial

Follow me at @gamozolabs on Twitter if you want notifications when new blogs come up, or I think you can use RSS or something if you’re still one of those people.

All benchmarks done here are on a single Xeon Phi 7210 with 96 GiB of RAM. This comes out to about $4k USD, but if you cheap out on RAM and buy used Phis you could probably get the same setup for $1k USD.

This machine has 64 cores and 256 hardware threads. Using AVX-512 I run 4096 32-bit VMs at a time ((512 / 32) * 256).

All performance numbers in this article refer to the machine running at 100% on all cores.

Term  Meaning
Lane  A single component in a larger vector (often 32-bit throughout this document)
VM    A single VM; in terms of vectorized emulation it refers to a single lane of a vector

In this blog I’m going to introduce you to a concept I’ve been working on for almost 2 years now. Vectorized emulation. The goal is to take standard applications and JIT them to their AVX-512 equivalent such that we can fuzz 16 VMs at a time per thread. The net result of this work allows for high performance fuzzing (approx 40 billion to 120 billion instructions per second [the 2 trillion clickbait number is theoretical maximum]) depending on the target, while gathering differential coverage on code, register, and memory state.

By gathering more than just code coverage we are able to track state of code deeper than just code coverage itself, allowing us to fuzz through things like memcmp() without any hooks or static analysis of the target at all.

Further since we’re running emulated code we are able to run a soft MMU implementation which has byte-level permissions. This gives us stronger-than-ASAN memory protections, making bugs fail faster and cleaner.

My history with fuzzing tools starts off effectively with my hypervisor for fuzzing, falkervisor. falkervisor served me well for quite a long time, but my work rotated more towards non-x86 targets, which it did not support. With a demand for emulation I made modifications to QEMU for high-performance fuzzing, and ultimately swapped out their MMU implementation for my own which has byte-level permissions. This new byte-level permission model allowed me to catch even the smallest memory corruptions, leading to finding pretty fun bugs!

More and more after working with QEMU I got annoyed. It’s designed for whole systems yet I was using it for fuzzing targets that were running with unknown hardware and running from dynamically dumped memory snapshots. Due to the level of abstraction in QEMU I started to get concerned with the potential unknowns that would affect the instrumentation and fuzzing of targets.

I developed my first MIPS emulator. It was not designed for performance, but rather purely for simple usage and perfect single stepping. You step an instruction, registers and memory get updated. No JIT, no intermediate registers, no flushing or weird block level translation changes. I eventually made a JIT for this that maintained the flush-state-every-instruction model and successfully used it against multiple targets. I also developed an ARM emulator somewhere in this timeframe.

When early 2017 rolls around I’m bored and want to buy a Xeon Phi. Who doesn’t want a 64-core 256-thread single processor? I really had no need for the machine so I just made up some excuse in my head that the high bandwidth memory on die would make reverting snapshots faster. Yeah… like that really matters? Oh well, I bought it.

While the machine was on the way I had this idea… when fuzzing from a snapshot all VMs initially start off fuzzing with the exact same state, except for maybe an input buffer and length being changed. Thus they do identical operations until user-controlled data is processed. I’ve done some fun vectorization work before, but what got me thinking is why not just emit vpaddd instead of add when JITting, and now I can run 16 VMs at a time!

Alas… the idea was born

Snapshot fuzzing is fundamental to this work and almost all fuzzing work I have done from 2014 and beyond. It warrants its own blog entirely.

Snapshot fuzzing is a method of fuzzing where you start from a partially-executed system state. For example I can run an application under GDB, like a parser, put a breakpoint after the file/network data has been read, and then dump memory and register state to a core dump using gcore. At this point I have full memory and register state for the application. I can then load up this core dump into any emulator, set up memory contents and permissions, set up register state, and continue execution. While this is an example with core dumps on Linux, this methodology works the same whether the snapshot is a core dump from GDB, a minidump on Windows, or even an exotic memory dump taken from an exploit on a locked-down device like a phone.

All that matters is that I have memory state and register state. From this point I can inject/modify the file contents in memory and continue execution with a new input!

It can get a lot more complex when dealing with kernel state, like file handles, network packets buffered in the kernel, and really anything that syscalls. However in most targets you can make some custom rigging using strace to know which FDs line up, where they are currently seeked, etc. Further a full system snapshot can be used instead of a single application and then this kernel state is no longer a concern.

The benefits of snapshot fuzzing are performance (linear scaling), high levels of introspection (even without source or symbols), and most importantly… determinism. Unless the emulator has bugs snapshot fuzzing is typically deterministic (sometimes relaxed for performance). Find some super exotic race condition while snapshot fuzzing? Well, you can single step through with the same input and now you can look at the trace as a human, even if it’s a 1 in a billion chance of hitting.

Since the 90s many computer architectures have some form of SIMD (vectorized) instruction set. SIMD stands for single instruction multiple data. This means that a single instruction performs an operation (typically the same) on multiple different pieces of data. SIMD instruction sets fall under names like MMX, SSE, AVX, AVX512 for x86, NEON for ARM, and AltiVec for PPC. You’ve probably seen these instructions if you’ve ever looked at a memcpy() implementation on any 64-bit x86 system. They’re the ones with the gross 15 character mnemonics and registers you didn’t even know existed.

For a simple case let’s talk about standard SSE on x86. Since x86_64 started with the Pentium 4, and the Pentium 4 had up to SSE3 implementations, almost any x86_64 compiler will generate SSE instructions, as they’re always valid on 64-bit systems.

SSE provides 128-bit SIMD operations to x86. SSE introduced 16 128-bit registers named xmm0 through xmm15 (only 8 xmm registers on 32-bit x86). These 128-bit registers can be treated as groups of different sized smaller pieces of data which sum up to 128 bits.

  • 4 single precision floats
  • 2 double precision floats
  • 2 64-bit integers
  • 4 32-bit integers
  • 8 16-bit integers
  • 16 8-bit integers

Now with a single instruction it is possible to perform the same operation on multiple floats or integers. For example there is an instruction paddd, which stands for packed add dwords. This means that the 128-bit registers provided are treated as 4 32-bit integers, and an add operation is performed.

Here’s a real example, adding xmm0 and xmm1 together treating them as 4 individual 32-bit integer lanes and storing them back into xmm0

paddd xmm0, xmm1

Register      | Dword 1 | Dword 2 | Dword 3 | Dword 4
xmm0          |       5 |       6 |       7 |       8
xmm1          |      10 |      20 |      30 |      40
xmm0 (result) |      15 |      26 |      37 |      48

Cool. Starting with AVX these registers were expanded to 256-bits thus allowing twice the throughput per instruction. These registers are named ymm0 through ymm15. Further AVX introduced three operand form instructions which allow storing a result to a different register than the ones being used in the operation. For example you can do vpaddd ymm0, ymm1, ymm2 which will add the 8 individual 32-bit integers in ymm1 and ymm2 and store the result into ymm0. This helps a lot with register scheduling and prevents many unnecessary movs just to save off registers before they are clobbered.
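To make the lane-wise behavior concrete, here’s a minimal Rust model of a packed add (purely illustrative: the lane count, types, and function name are mine, not anything from a real JIT):

// Model of paddd/vpaddd: the same add applied independently to every
// 32-bit lane. Four lanes model a 128-bit XMM register.
const LANES: usize = 4;
type Xmm = [u32; LANES];

// Three-operand AVX form: dst = a + b per lane, clobbering neither input.
fn vpaddd(a: Xmm, b: Xmm) -> Xmm {
    let mut dst = [0u32; LANES];
    for i in 0..LANES {
        dst[i] = a[i].wrapping_add(b[i]); // x86 integer adds wrap on overflow
    }
    dst
}

fn main() {
    let xmm0: Xmm = [5, 6, 7, 8];
    let xmm1: Xmm = [10, 20, 30, 40];
    assert_eq!(vpaddd(xmm0, xmm1), [15, 26, 37, 48]); // matches the table above
}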

AVX-512 is a continuation of x86’s SIMD model by expanding from 16 256-bit registers to 32 512-bit registers. These registers are named zmm0 through zmm31. Further AVX-512 introduces 8 new kmask registers named k0 through k7 where k0 has a special meaning.

The kmask registers are used to perform masking on instructions, either by merging or zeroing. This makes it possible to loop through data and process it while having conditional masking to disable operations on a given lane of the vector.

The syntax for the common instructions using kmasks is the following:

vpaddd zmm0 {k1}, zmm1, zmm2

chart simplified to show 4 lanes instead of 16

Register      | Dword 1 | Dword 2 | Dword 3 | Dword 4
zmm0          |       9 |       9 |       9 |       9
zmm1          |       1 |       2 |       3 |       4
zmm2          |      10 |      20 |      30 |      40
k1            |       1 |       0 |       1 |       1
zmm0 (result) |      11 |       9 |      33 |      44

or

vpaddd zmm0 {k1}{z}, zmm1, zmm2

chart simplified to show 4 lanes instead of 16

Register      | Dword 1 | Dword 2 | Dword 3 | Dword 4
zmm0          |       9 |       9 |       9 |       9
zmm1          |       1 |       2 |       3 |       4
zmm2          |      10 |      20 |      30 |      40
k1            |       1 |       0 |       1 |       1
zmm0 (result) |      11 |       0 |      33 |      44

The first example uses k1 as the kmask for the add operation. In this case the k1 register is treated as a 16-bit number, where each bit corresponds to each of the 16 32-bit lanes in the 512-bit register. If the corresponding bit in k1 is zero, then the add operation is not performed and that lane is left unchanged in the resultant register.

In the second example there is a {z} suffix on the kmask register selection, this means that the operation is performed with zeroing rather than merging. If the corresponding bit in k1 is zero then the resultant lane is zeroed out rather than left unchanged. This gets rid of a dependency on the previous register state of the result and thus is faster, however it might not be suitable for all applications.

The k0 mask is implicit and does not need to be specified. The k0 register is hardwired to having all bits set, thus the operation is performed on all lanes unconditionally.
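Here’s a small Rust sketch of the two masking modes, modeling a kmask as one bit per lane (illustrative only; 4 lanes as in the simplified charts above):

const LANES: usize = 4;
type Zmm = [u32; LANES];
type Kmask = u16; // one bit per lane

// vpaddd dst {k}, a, b -- merge masking: disabled lanes keep dst's old value.
fn vpaddd_merge(dst: Zmm, k: Kmask, a: Zmm, b: Zmm) -> Zmm {
    let mut out = dst;
    for i in 0..LANES {
        if (k >> i) & 1 == 1 {
            out[i] = a[i].wrapping_add(b[i]);
        }
    }
    out
}

// vpaddd dst {k}{z}, a, b -- zero masking: disabled lanes are zeroed.
fn vpaddd_zero(k: Kmask, a: Zmm, b: Zmm) -> Zmm {
    let mut out = [0u32; LANES];
    for i in 0..LANES {
        if (k >> i) & 1 == 1 {
            out[i] = a[i].wrapping_add(b[i]);
        }
    }
    out
}

fn main() {
    let zmm0 = [9, 9, 9, 9];
    let zmm1 = [1, 2, 3, 4];
    let zmm2 = [10, 20, 30, 40];
    let k1: Kmask = 0b1101; // lane 1 disabled, as in the charts above
    assert_eq!(vpaddd_merge(zmm0, k1, zmm1, zmm2), [11, 9, 33, 44]);
    assert_eq!(vpaddd_zero(k1, zmm1, zmm2), [11, 0, 33, 44]);
}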

Prior to AVX-512, compare instructions in SIMD typically yielded all ones in a given lane if the comparison was true, or all zeroes if it was false. In AVX-512, comparisons are done using kmasks.

vpcmpgtd k2 {k1}, zmm10, zmm11

You may have seen this instruction in the picture at the start of the blog. What this instruction does is compare the 16 dwords in zmm10 with the 16 dwords in zmm11, and only performs the compare on lanes enabled by k1, and stores the result of the compare into k2. If the lane was disabled due to k1 then the corresponding bit in the k2 result will be zero. Meaning the only set bits in k2 will be from enabled lanes which were greater in zmm10 than in zmm11. Phew.
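In the same toy Rust model, a masked compare looks like this (again a sketch, not real intrinsics): the result is a kmask, and lanes disabled by the input kmask can never set a bit.

const LANES: usize = 4;
type Zmm = [i32; LANES]; // vpcmpgtd is a signed dword compare
type Kmask = u16;

// vpcmpgtd k2 {k1}, a, b -- set bit i of k2 iff lane i is enabled and a > b.
fn vpcmpgtd(k1: Kmask, a: Zmm, b: Zmm) -> Kmask {
    let mut k2: Kmask = 0;
    for i in 0..LANES {
        if (k1 >> i) & 1 == 1 && a[i] > b[i] {
            k2 |= 1 << i;
        }
    }
    k2
}

fn main() {
    let zmm10 = [5, 50, 5, 50];
    let zmm11 = [10, 10, 10, 10];
    let k1: Kmask = 0b0111; // lane 3 disabled
    // Lane 1 is the only lane that is both enabled and greater.
    assert_eq!(vpcmpgtd(k1, zmm10, zmm11), 0b0010);
}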

Now that you’ve made it this far you might already have some gears turning in your head telling you where this might be going next.

Since with snapshot fuzzing we start executing the same code, we are doing the same operations. This means we can convert the x86 instructions to their vectorized counterparts and run 16 VMs at a time rather than just one.

Let’s make up a fake program:

mov eax, 5
mov ebx, 10
add eax, ebx
sub eax, 20

How can we vectorize this code?

; Register allocation:
;   eax = zmm0
;   ebx = zmm1

vpbroadcastd zmm0, dword ptr [memory containing constant 5]
vpbroadcastd zmm1, dword ptr [memory containing constant 10]
vpaddd       zmm0, zmm0, zmm1
vpsubd       zmm0, zmm0, dword ptr [memory containing constant 20] {1to16}

Well that was kind of easy. We’ve got a few new AVX concepts here. We’re using the vpbroadcastd instruction to broadcast a dword value to all lanes of a given ZMM register. Since the Xeon Phi is bottlenecked on the instruction decoder it’s actually faster to load from memory than it is to load an immediate into a GPR, move this into a XMM register, and then broadcast it out.

Further we introduce the {1to16} broadcasting that AVX-512 offers. This allows us to use a single dword constant value with, in our example, vpsubd. It broadcasts the memory pointed to out to all 16 lanes and then performs the operation. This saves an instruction, as we don’t need an explicit vpbroadcastd.

In this case if we executed this code with any VM state we will have no divergence (no VMs do anything different), thus this example is very easy. It’s pretty much a 1-to-1 translation of the non-vectorized x86 to vectorized x86.

Alright, let’s try one that’s a bit more complex; this time let’s work with VMs in different states:

add eax, 10

becomes

; Register allocation:
;   eax = zmm0

vpaddd zmm0, zmm0, dword ptr [memory containing constant 10] {1to16}

Let’s imagine that the value in eax prior to execution is different, let’s say it’s [1, 2, 3, 4] for 4 different VMs (simplified, in reality there are 16).

Register      | Dword 1 | Dword 2 | Dword 3 | Dword 4
zmm0          |       1 |       2 |       3 |       4
const         |      10 |      10 |      10 |      10
zmm0 (result) |      11 |      12 |      13 |      14

Oh? This is exactly what AVX is supposed to do… so it’s easy?

So you might have noticed we’ve dodged a few things here that are hard. First we’ve ignored memory operations, and second we’ve ignored branches.

Let’s talk a bit about AVX memory.

With AVX-512 we can load and store directly from/to memory, and ideally this memory is aligned, as 512-bit registers are whole 64-byte cache lines. In AVX-512 we use the vmovdqa32 instruction. This will load an entire aligned 64-byte piece of memory into a ZMM register ala vmovdqa32 zmm0, [memory], and we can store with vmovdqa32 [memory], zmm0. Further, when using kmasks with vmovdqa32 loads, lanes whose mask bit is clear are either left unmodified (merge masking) or zeroed (zero masking). For stores the value is simply not written if the corresponding mask bit is zero.

That’s pretty easy. But this doesn’t really work well when we have 16 unique VMs we’re running with unique address spaces.

… or does it?

VM memory interleaving

Since most VM memory operations are not affected by user input, and thus are the same in all VMs, we need a way to organize the 16 VMs memory such that we can access them all quickly. To do this we actually interleave all 16 VMs at the dword level (32-bit). This means we can perform a single vmovdqa32 to load or store to memory for all 16 VMs as long as they’re accessing the same address.

This is pretty simple, just interleave at the dword level:

chart simplified to show 4 lanes instead of 16

Guest Address | Host Address | Dword 1 | Dword 2 | Dword 3 | … | Dword 16
0x0000        | 0x0000       |       1 |       2 |       3 | … |       33
0x0004        | 0x0040       |      32 |      74 |      55 | … |       45
0x0008        | 0x0080       |      24 |      24 |      24 | … |       24

All we need to do is take the guest address, multiply it by 16, and then vmovdqa32 from/to that address. It once again does not matter what the contents of the memory are for each VM and they can differ. The vmovdqa32 does not care about the memory contents.

In reality the host address is not just the guest address multiplied by 16, as we need some translation layer. But that will get its own entire blog. For now let’s just assume a flat, infinite memory model where we can just multiply by 16.
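Under that flat-memory assumption, the address arithmetic is just the following sketch (a hypothetical helper, not the real translation layer):

// 16 32-bit VMs interleaved at the dword level: guest dword at address A
// occupies host bytes [A*16, A*16 + 64), and VM `lane` owns dword `lane`
// within that 64-byte cache line.
const LANES: u64 = 16;

fn host_offset(guest_addr: u64, lane: u64) -> u64 {
    assert!(guest_addr % 4 == 0, "dword-aligned guest access");
    assert!(lane < LANES);
    guest_addr * LANES + lane * 4
}

fn main() {
    // Guest 0x0004 maps to host 0x0040, matching the table above.
    assert_eq!(host_offset(0x0004, 0), 0x0040);
    // VM #3's copy of that dword sits 3 dwords further in.
    assert_eq!(host_offset(0x0004, 3), 0x004c);
}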

So what are the limitations of this model?

Well when reading bytes we must read the whole dword value and then shift and mask to extract the specific byte. When writing a byte we need to read the memory first, shift, mask, and or in the new byte, and write it out. And when doing non-aligned operations we need to perform multiple memory operations and combine the values via shifting and masking. Luckily compilers (and programmers) typically avoid these unaligned operations and they’re rare enough to not matter much.
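A quick sketch of those byte-access fixups (illustrative; assumes little-endian byte order within a lane):

// Reading a byte out of an interleaved dword: shift and mask.
fn read_byte(dword: u32, byte_in_dword: u32) -> u8 {
    ((dword >> (byte_in_dword * 8)) & 0xff) as u8
}

// Writing a byte is read-modify-write: clear the old byte, OR in the new one.
fn write_byte(dword: u32, byte_in_dword: u32, val: u8) -> u32 {
    let shift = byte_in_dword * 8;
    (dword & !(0xffu32 << shift)) | ((val as u32) << shift)
}

fn main() {
    let d = 0x4433_2211u32;
    assert_eq!(read_byte(d, 2), 0x33);
    assert_eq!(write_byte(d, 2, 0xaa), 0x44aa_2211);
}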

So far everything we have talked about does not care about the values it is operating on at all, thus everything has been easy so far. But in reality values do matter. There are 3 places where divergence matters in this entire system:

  • Loads/stores with different addresses
  • Branches
  • Exceptions/faults

Loads/stores with different addresses

Let’s knock out the first one real quick: loads and stores with different addresses. For all memory accesses we do a very quick horizontal comparison of all the lanes first. If they have the same address then we take a fast path and issue a single vmovdqa32. If their addresses differ then we simply perform 16 individual memory operations and emulate the behavior we desire. It technically can get a bit better, as AVX-512 has scatter/gather instructions which allow the CPU to do this loading/storing to different addresses for us. This is done with a base and an offset, and with 32 bits it’s not possible to address the whole address space we need. However with 64-bit vectorization (8 64-bit VMs) we can leverage scatter/gather instructions to their fullest, and all loads and stores become either a fast path with one vmovdqa32, or a slow (but still fast) path with a single scatter/gather instruction.
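Here’s a sketch of that dispatch decision (the operation counts are what matter; the helper names are mine):

const LANES: usize = 16;
type Kmask = u16;

// "Horizontal compare": does every enabled lane use VM #0's address?
fn all_lanes_match(k: Kmask, addrs: &[u32; LANES]) -> bool {
    let base = addrs[0]; // VM #0 is always enabled and always followed
    (0..LANES).all(|i| (k >> i) & 1 == 0 || addrs[i] == base)
}

// How many host memory operations the JIT would issue for this access.
fn load_cost(k: Kmask, addrs: &[u32; LANES]) -> usize {
    if all_lanes_match(k, addrs) {
        1 // fast path: a single vmovdqa32 of the interleaved cache line
    } else {
        k.count_ones() as usize // slow path: one access per enabled lane
    }
}

fn main() {
    let same = [0x1000u32; LANES];
    assert_eq!(load_cost(0xffff, &same), 1);
    let mut diff = same;
    diff[7] = 0x2000; // one VM diverged on its address
    assert_eq!(load_cost(0xffff, &diff), 16);
}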

Branches

We’ve avoided this until now for a reason. It’s the single hardest thing in vectorized emulation. How can we possibly run 16 VMs at a time if one branches to another location? Now we cannot run an AVX-512 instruction, as it would be invalid for the VMs which have gone down a different path.

Well it turns out this isn’t a terribly hard problem on AVX-512. And when I say AVX-512 I mean specifically AVX-512. Feel free to ponder why this might be based on what you’ve learned is unique to AVX-512.

Okay it’s kmasks. Did you get it right? Well kmasks save our lives. Remember the merging kmasks we talked about which would disable updates to a given lane of a vector and ignore writes to a given lane if it is not enabled in the kmask?

Well by using a kmask register on all JITted AVX-512 instructions we can simply change the kmask to disable updates on a given VM.

What this allows us to do is start execution at the same location on all 16 VMs as they start with the same EIP. On all branches we will horizontally compare the branch targets and compute a new kmask value to use when we continue execution on the new branch.

AVX-512 doesn’t have a great way of extracting or broadcasting arbitrary elements of a vector. However it has a fast way to broadcast the 0th lane in a vector ala vpbroadcastd zmm0, xmm0. This takes the first lane from xmm0 and broadcasts it to all 16 lanes in zmm0. We actually never stop following VM #0. This means VM #0 is always executing, which is important for all of the horizontal compares that we talk about. When I say horizontal compare I mean a broadcast of the VM#0 and compare with all other VMs.

Let’s look in detail at the entire JIT that I use for conditional indirect branches:

; IL operation is Beqz(val, true_target, false_target)
;
; val          - 16 32-bit values to conditionally branch by
; true_target  - 16 32-bit guest branch target addresses if val == 0
; false_target - 16 32-bit guest branch target addresses if val != 0
;
; IL pseudocode:
;
; if val == 0 {
;    goto true_target;
; } else {
;    goto false_target;
; }
;
; Register usage
; k1    - The execution kmask, this is the kmask used on all JITted instructions
; k2    - Temporary kmask, just used for scratch
; val   - Dynamically allocated zmm register containing val
; ttgt  - Dynamically allocated zmm register containing true_target
; ftgt  - Dynamically allocated zmm register containing false_target
; zmm0  - Scratch register
; zmm31 - Desired branch target for all lanes

; Compute a kmask `k2` which contains `1`s for the corresponding lanes
; for VMs which are enabled by `k1` and also have a non-zero value.
; TL;DR: k2 contains a mask of VMs which will be taking `ftgt`
vptestmd k2 {k1}, val, val

; Store the true branch target unconditionally, while not clobbering
; VMs which have been disabled
vmovdqa32 zmm31 {k1}, ttgt

; Store the false branch target for VMs not taking the branch
; Note the use of k2
vmovdqa32 zmm31 {k2}, ftgt

; At this point `zmm31` contains the targets for all VMs, including ones
; that previously got disabled.

; Broadcast the target that VM #0 wants to take to all lanes in `zmm0`
vpbroadcastd zmm0, xmm31

; Compute a new kmask which represents all VMs which are going to
; the same location as VM #0
vpcmpeqd k1, zmm0, zmm31

; ...
; Now just rip out the target for VM #0 and translate the guest address
; into the host JIT address and jump there.
; Or break out and generate the JIT if it hasn't been hit before

The above code is quite fast and isn’t a huge performance issue, especially as we’re running 16 VMs at a time and branches are “rare” with respect to expensive operations like memory loads and stores.

One thing that is important to note is that zmm31 always contains the last desired branch target for a given VM. Even after it has been disabled. This means that it is possible for a VM which has been disabled to come back online if VM #0 ends up going to the same location.
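Here’s that branch step as a Rust model (semantics only, not the JITted code): pick each enabled lane’s desired target, then rebuild the execution kmask as “lanes whose pending target equals VM #0’s target”. Because disabled lanes keep their last pending target, they can rejoin later.

const LANES: usize = 16;
type Kmask = u16;

fn beqz_step(
    k1: Kmask,
    val: &[u32; LANES],
    true_tgt: u32,
    false_tgt: u32,
    zmm31: &mut [u32; LANES], // pending branch target per lane
) -> Kmask {
    // vmovdqa32 zmm31 {k1}, ttgt / vmovdqa32 zmm31 {k2}, ftgt
    for i in 0..LANES {
        if (k1 >> i) & 1 == 1 {
            zmm31[i] = if val[i] == 0 { true_tgt } else { false_tgt };
        }
    }
    // vpbroadcastd + vpcmpeqd: follow VM #0, re-enabling any lane whose
    // pending target matches where VM #0 is going. Note the compare is
    // unmasked, which is exactly what lets dead lanes come back online.
    let vm0_target = zmm31[0];
    let mut k: Kmask = 0;
    for i in 0..LANES {
        if zmm31[i] == vm0_target {
            k |= 1 << i;
        }
    }
    k
}

fn main() {
    let mut zmm31 = [0u32; LANES];
    let mut val = [0u32; LANES];
    val[1] = 5; // only VM #1 takes the false branch
    let k = beqz_step(0xffff, &val, 0x1000, 0x2000, &mut zmm31);
    assert_eq!(k, 0xffff & !(1 << 1)); // VM #1 falls off
    // If VM #0's path later reaches 0x2000, VM #1 comes back online.
}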

Let’s go through a more thorough example:

; Register allocation:
;   ebx - Pointer to some user controlled buffer
;   ecx - Length of controlled buffer

    ; Validate buffer size
    cmp ecx, 4
    jne .end

    ; Fallthrough
.next:
    ; Check some magic from the buffer
    cmp dword ptr [ebx], 0x13371337
    jne .end

    ; Fallthrough
.next2:
    ; Unconditionally jump to end, for clarity
    jmp .end

.end:

And the theoretical vectorized output (not actual JIT output):

; Register allocation:
;   zmm10 - ebx
;   zmm11 - ecx
; k1    - The execution kmask, this is the kmask used on all JITted instructions
; k2    - Temporary kmask, just used for scratch
; zmm0  - Scratch register
; zmm8  - Scratch register
; zmm31 - Desired branch target for all lanes

; Compute kmask register for VMs which have `ecx` == 4
vpcmpeqd k2 {k1}, zmm11, dword ptr [memory containing 4] {1to16}

; Update zmm31 to reference the respective branch target
vmovdqa32 zmm31 {k1}, address of .end   ; By default we go to .end
vmovdqa32 zmm31 {k2}, address of .next  ; If `ecx` == 4, go to .next

; Broadcast the target that VM #0 wants to take to all lanes in `zmm0`
vpbroadcastd zmm0, xmm31

; Compute a new kmask which represents all VMs which are going to
; the same location as VM #0
vpcmpeqd k1, zmm0, zmm31

; Branch to where VM #0 is going (simplified)
jmp where_vm0_wants_to_go

.next:
; Magically load memory at ebx (zmm10) into zmm8
vmovdqa32 zmm8, complex_mmu_operation_and_stuff

; Compute kmask register for VMs which have packet contents 0x13371337
vpcmpeqd k2 {k1}, zmm8, dword ptr [memory containing 0x13371337] {1to16}

; Go to .next2 if memory is 0x13371337, else go to .end
vmovdqa32 zmm31 {k1}, address of .end    ; By default we go to .end
vmovdqa32 zmm31 {k2}, address of .next2  ; If contents == 0x13371337, go to .next2

; Broadcast the target that VM #0 wants to take to all lanes in `zmm0`
vpbroadcastd zmm0, xmm31

; Compute a new kmask which represents all VMs which are going to
; the same location as VM #0
vpcmpeqd k1, zmm0, zmm31

; Branch to where VM #0 is going (simplified)
jmp where_vm0_wants_to_go

.next2:
; Everyone still executing is unconditionally going to .end
vmovdqa32 zmm31 {k1}, address of .end

; Broadcast the target that VM #0 wants to take to all lanes in `zmm0`
vpbroadcastd zmm0, xmm31

; Compute a new kmask which represents all VMs which are going to
; the same location as VM #0
vpcmpeqd k1, zmm0, zmm31

.end:

Okay so what does the VM state look like for a theoretical version (simplified to 4 VMs):

Starting state, all VMs enabled with different memory contents (pointed to by ebx) and different packet lengths:

Register | VM 0       | VM 1       | VM 2 | VM 3
ecx      | 4          | 3          | 4    | 4
memory   | 0x13371337 | 0x13371337 | 3    | 0x13371337
k1       | 1          | 1          | 1    | 1

First branch, all VMs with ecx != 4 are disabled and are pending branches to .end, VM #1 falls off

Register | VM 0       | VM 1       | VM 2 | VM 3
ecx      | 4          | 3          | 4    | 4
memory   | 0x13371337 | 0x13371337 | 3    | 0x13371337
k1       | 1          | 0          | 1    | 1
zmm31    | .next      | .end       | .next | .next

Second branch, VMs without 0x13371337 in memory are pending branches to .end, VM #2 falls off

Register | VM 0       | VM 1       | VM 2 | VM 3
ecx      | 4          | 3          | 4    | 4
memory   | 0x13371337 | 0x13371337 | 3    | 0x13371337
k1       | 1          | 0          | 0    | 1
zmm31    | .next2     | .end       | .end | .next2

Final branch, everyone ends up at .end, all VMs are enabled again as they’re following VM #0 to .end

Register | VM 0       | VM 1       | VM 2 | VM 3
ecx      | 4          | 3          | 4    | 4
memory   | 0x13371337 | 0x13371337 | 3    | 0x13371337
k1       | 1          | 1          | 1    | 1
zmm31    | .end       | .end       | .end | .end

Branch summary

So we saw branches will disable VMs which do not follow VM #0. When VMs are disabled all modifications to their register states or memory states are blocked by hardware. The kmask mechanism allows us to keep performance up and not use different JITs based on different branch states.

Further, VMs can come back online if they were pending to go to a location which VM #0 eventually ends up going to.

Exceptions/faults

These are really just glorified branches with a VM exit to save the input and memory/register state related to the crash. No reason to really go in depth here.



Okay we’ve covered all the very high level details of how vectorized emulation is possible but that’s just academic thought. It’s pointless unless it accomplishes something.

At this point all of the next topics are going to be their own blogs, and thus are only lightly touched on here.

Differential coverage

Differential coverage is a special type of coverage that we are able to gather with this vectorized emulation model. This is the most important aspect of all of this tooling and is the main reason it is worth doing.

Since we are running 16 VMs at a time we are able to very cheaply (a few cycles) do a horizontal comparison with other VMs. Since VMs are deterministic and only have differing user-controlled inputs any situation where VMs have different branches, different register states, different memory states, etc is when the user input directly or indirectly caused a change in behavior.

I would consider this to be the holy grail of coverage. Any effect the input has on program state we can easily and cheaply detect.
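The core check is cheap enough to sketch in a few lines (an illustrative model, with lanes as an array):

const LANES: usize = 16;
type Kmask = u16;

// After an instruction, did any enabled VM diverge from VM #0's state?
fn diverged(k: Kmask, reg: &[u32; LANES]) -> bool {
    let base = reg[0];
    (0..LANES).any(|i| (k >> i) & 1 == 1 && reg[i] != base)
}

fn main() {
    let mut eax = [7u32; LANES];
    assert!(!diverged(0xffff, &eax)); // common case: skip all bookkeeping
    eax[3] = 8; // user input indirectly changed VM #3's register
    assert!(diverged(0xffff, &eax)); // rare case: record coverage
}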

How differential coverage combats state explosion

If we wanted to track all register states for all instructions the state explosion would be way too huge. This can be somewhat capped by limiting the amount of state each instruction can generate. For example instead of storing all unique register values for an instruction we could simply store the minimums and maximums, or store up to n unique values, etc. However even when limited to just a few values per instruction, the state explosion is too large for any real application.
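As a sketch of that capping idea (the names and cap value are assumptions for illustration, not the real database):

use std::collections::{HashMap, HashSet};

const MAX_VALUES_PER_INSTR: usize = 4; // arbitrary cap for illustration

#[derive(Default)]
struct StateDb {
    // instruction address -> unique register values observed there
    seen: HashMap<u64, HashSet<u32>>,
}

impl StateDb {
    // Returns true if this (address, value) pair is new coverage.
    fn record(&mut self, addr: u64, value: u32) -> bool {
        let set = self.seen.entry(addr).or_default();
        if set.len() >= MAX_VALUES_PER_INSTR && !set.contains(&value) {
            return false; // capped: stop accumulating state here
        }
        set.insert(value)
    }
}

fn main() {
    let mut db = StateDb::default();
    assert!(db.record(0x1000, 1)); // new coverage
    assert!(!db.record(0x1000, 1)); // already seen
}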

However, since most memory and register states are not influenced by user input, with differential coverage we can greatly reduce the number of instructions for which state is stored, as we only store state that was influenced by user data.

This works for code coverage as well: for example, hitting a printf with completely uncontrolled parameters would register as potentially hundreds of new blocks of coverage. With differential coverage all of this state can be ignored.

How differential coverage is great for performance

While the focus of this tool is not performance, the cost of updating databases on every instruction would not be feasible. By filtering to only the instructions which have user-influenced data we’re able to perform much more complex operations in the cases where new coverage was detected.

For example all of my register loads and stores start with a horizontal compare and a quick jump out if they all match. If one differs it’s a rare enough occasion that it’s feasible to spend a few more cycles to do a hash calculation based on state and insertion into the global input and coverage databases. Without differential coverage I would have to unconditionally do this every instruction.

The soft MMU

Since the soft MMU deserves a blog entirely of its own, we’ll just go slightly into the details here.

As mentioned before, we interleave memory at the dword level, but for every byte there is also a corresponding permission byte. In memory this looks like 16 32-bit dwords representing the permissions, followed by 16 32-bit dwords containing their corresponding memory contents. This allows me to read a 64-byte cache line with the permissions which are checked first, followed by reading the 64-byte cache line directly following with the contents.

For permissions: the read, write, and execute bits are completely separate. This allows more exotic memory models like execute-only memory.

Since permissions are at the byte level, this means we can punch a one-byte hole anywhere in memory and accessing that byte would cause a fault. For some targets I’ll do special modifications to permissions and punch holes in unused or padding fields of structures to catch overflows of buffers contained inside structures.

Further I have a special read-after-write (RaW) bit, which is used to mark memory as uninitialized. Memory returned from allocators is marked as RaW and thus will fault if ever read before written to. This is tracked at the byte level and is one of the most useful features of the MMU. We’ll talk about how this can be made fast in a subsequent blog.
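A byte-level permission check could look like the following sketch (the flag encoding and structure are assumptions for illustration, not the real MMU):

const PERM_READ: u8 = 1 << 0;
const PERM_WRITE: u8 = 1 << 1;
// (an execute bit exists too per the description above; elided here)
const PERM_RAW: u8 = 1 << 2; // read-after-write: fault on read-before-write

struct SoftMmu {
    perms: Vec<u8>, // one permission byte per guest byte
    bytes: Vec<u8>, // guest memory contents
}

impl SoftMmu {
    fn read(&self, addr: usize) -> Result<u8, &'static str> {
        let p = self.perms[addr];
        if p & PERM_RAW != 0 {
            return Err("uninitialized read"); // fresh allocation, never written
        }
        if p & PERM_READ == 0 {
            return Err("read fault");
        }
        Ok(self.bytes[addr])
    }

    fn write(&mut self, addr: usize, val: u8) -> Result<(), &'static str> {
        if self.perms[addr] & PERM_WRITE == 0 {
            return Err("write fault");
        }
        self.perms[addr] &= !PERM_RAW; // first write makes future reads legal
        self.bytes[addr] = val;
        Ok(())
    }
}

fn main() {
    let mut mmu = SoftMmu {
        perms: vec![PERM_READ | PERM_WRITE | PERM_RAW; 16], // as if from an allocator
        bytes: vec![0; 16],
    };
    assert_eq!(mmu.read(0), Err("uninitialized read"));
    mmu.write(0, 0x41).unwrap();
    assert_eq!(mmu.read(0), Ok(0x41));
    mmu.perms[5] = 0; // punch a one-byte hole to catch structure overflows
    assert_eq!(mmu.read(5), Err("read fault"));
}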

Performance

Performance is not the goal of this project; however, the numbers are a bit better than expected from the theorycrafting.

In reality it’s possible to hit up to 2 trillion emulated instructions per second, which is the clickbait title of this blog. However this is on a 32-deep unrolled loop that is just adding numbers and not hitting memory. This unrolling makes the branch divergence checking costs disappear, and integer operations are almost a 1-to-1 translation into AVX-512 instructions.

For a real target the numbers are more in the 40 billion to 120 billion emulated instructions per second range. For a real target like OpenBSD’s DHCP client I’m able to do just over 5 million fuzz cases per second (fuzz case is one DHCP transaction, typically 1 or 2 packets). For this specific target the emulation speed is 54 billion instructions per second. This is while gathering PC-level coverage and all register and memory divergence coverage.

I’ve been working on this tooling for almost 2 years now and it’s been usable since month 3. It’s my primary tool for fuzzing and has successfully found bugs in various targets. Sadly most of these bugs are not public yet, but soon.

This tool was used to find a remote bluescreen in Windows Firewall: CVE-2018-8206 (okay technically I found it first manually, but was able to find it with the fuzzer with a byte flipper even though it has much more complex constraints)

It was also used to find a theoretical OOB in OpenBSD’s dhclient: dhclient bug. This is a fun one, as really no traditional fuzzer would find this since it’s an out-of-bounds by 1 inside of a structure.

Future blogs

  • Description of the IL used, as it’s got some specific designs for vectorized emulation

  • Internal details of the MMU implementation

  • Showing the power of differential coverage by looking at a real example of fuzzing an HTTP parser and having a byte flipper quickly (under 5 seconds) find the basic “VERB HTTP/number.number\r\n” structure. No magic, no `strings` feedback, no static analysis. Just a useless fuzzer with strong harnessing.

  • Talk about the new IL which handles graphs and can do cross-block optimizations

  • Showing better branch divergence handling via post-dominator analysis and stepping VMs until they sync up at a known future merge point

Show HN: How to start a business in France


Looking at the cost of hiring, it's crazy to see that the employee only gets ~50% (or even less as the salary grows) of what the employer pays.

This makes it very expensive for an employer to pay their employees well, and explains very well why salaries are so low in France.

Something else to note is employment law, which is way more strict in France and makes it either extremely expensive or just impossible to fire someone. The cost of hiring seems way too high for startups, which are high-risk businesses already.


That's because in France you get indirect salary as well, whereas in other countries you have to pay for some of those things out of your own pocket. To compare to a US salary, I would personally add 30 to 40% to those raw figures to reflect external costs.

> Something else to note is employment law, which is way more strict in France and makes it either extremely expensive or just impossible to fire someone. The cost of hiring seems way too high for startups, which are high-risk businesses already.

No it's not: first there's the trial period for permanent employees, during which you can fire at will; once it's over you can still let someone go, but you need to justify your decision.


While I do agree with your opinion about employment in France, that index could also be considered an "Employee abuse index", depending on your perspective.

No severance package - yay for flexibility! I'm not saying you should pay a lot, but 1-2 months' salary after 5 years with the company and 3-6 after 10 years shouldn't kill the company, and should provide a respite for the just-fired loyal employee. By the way, in your index the country would get a 100/100 when it has no severance package.

No extra payment for overtime - again, this seems like abuse. This would be 100/100 in your index but I don't really see how it can be justified. Same with the time restrictions for overtime.

Paid annual leave - ...

It's one thing to be overbearing and another to offer decent social support to employees (I find France overbearing).


Reality always comes back to snap you out of delusion. And the reality is that France is not attractive. Without core modifications this kind of initiative will be close to useless.

(I'm one of many french engineers who left France to have a decent career)


Same here. I left because France is not attractive indeed. And I also agree core changes are needed to make France attractive for entrepreneurs.

At the same time, the points that OP makes are still valid :). For someone who wants to open a company, social aids and healthcare access can matter. Other countries are more open than France is, but it doesn't make France a black hole where nobody can make a good career. The Paris tech scene is very much alive.


I am French and I applaud any initiative to help the part of the French economy not under government control (circa 45%)!

1. But even if my browser is set to English, I see French on https://embauche.beta.gouv.fr/ and most of the issues on the Github repository are written in French! What signal do you want to send?

2. As one can find on "https://embauche.beta.gouv.fr/" the employee gets only half of what the employer pays.

Contrary to what is stated on the same site, an employee does not get much benefit from what the employer pays for her retirement and healthcare, because:

* if she is hired now, there is a good probability that she will get only a small retirement pension in 40 years, even if the law does not change (probably close to the minimum salary, at least I hope so). So all this considerable amount of money will be wasted in the deep pockets of the state and the "sécurité sociale".

* It is the same for the health fund: what makes the French health system seem free for workers is that they have a mandatory mutual fund [0] which pays for most of the costs. The "Social security" pays around 10% of most usual costs; the mutual fund pays the rest...

Most French companies are trying hard to use other schemes to pay their employees better, for example through a company investment fund, but "shh"...

[0] This website is provided by the URSSAF, the French social security contributions collector.


Why spread FUD?

Social security paid for 76% of all medical costs in 2014 (last numbers I could find).

92% of hospital bills, 64% of doctor visits, 62% of drugs, glasses and such. Most drugs prescribed by your doctor are reimbursed at 100%, so you often don't even have to pay at the pharmacy.

http://www.vie-publique.fr/decouverte-institutions/protectio...

You also have a very simplistic understanding of how our retirement pension system works and the role it has in society.


> the part of the French economy not under the government control (circa 45%)!

That public spending is c. 55% of GDP does not mean that private spending is 45% of GDP, since private + public spending does not equal GDP. Back of the envelope computations put private spending at 265% of GDP [0].

> if she is hired now, there is a good probability that she will have only a little retirement pension in 40 years, even if the law does not change. (probably close to the minimum salary, at least I hope so). So all this considerable amount of money will be wasted in the state and the "sécurité sociale" deep pockets.

It is hard to understand how French people can both have "very low" pensions and at the same time be one of the countries that spends the most on pensions as a share of GDP [1], with retirees richer than working people [2, p. 5].

[0] https://www.nouvelobs.com/rue89/rue89-chez-les-economistes-a...

[1] https://data.oecd.org/socialexp/pension-spending.htm

[2] http://www.cor-retraites.fr/IMG/pdf/doc-4099.pdf


> The "Social security" pays around of 10% of most usual costs, the mutual fund pays the rest...

What? The basic insurance scheme pays at least 60% of a set of reasonable, negotiated fees, and the "mutuelle" covers the rest. There are a few corner cases (complicated dental work, optics) but these are being addressed. Usually the out-of-pocket expense is minimal to nil (a "forfait moderateur" of around 1 euro per doctor's appointment, for instance).


I'm not sure what you mean by "deep pockets" of the "Sécurité Sociale" (is it "deep" in the sense that it's seen as bottomless, has a rather big deficit in parts, and is kinda assumed to be bailed out by the taxpayer; or "deep" because someone keeps the money?)

Also, we have to make one thing clear about the whole system in France, which is quite different from other places: your employer does not pay for your retirement and healthcare; she pays for everyone's. Since you're most likely not retired when you have an employer (duh), and most likely not heavily sick, that's probably mostly "everyone else's".

Of course, the line gets more blurred if you happen to have kids (and get the "allocations familiales", or housing help, etc...) or sick relatives, or retired parents.

It's a long, heated, and complicated debate whether this is a good thing or not, and whether this is sustainable in the long-term, and whether the current generation of worker will be able to get the same "generosity" from the next generation.

It's also true that most people find some way to save money "for themselves", and that companies can benefit from systems where they save money directly for their employees.

And, guess what? As usual, this is being debated at the very moment we speak, so expect nothing except change :/


Would be great to understand more about accounting requirements, recurring filings, the costs of preparing and filing these, and the business tax system in general. Also, are there any other demanding/taxing laws for a company with, let's say, 4 employees? What about sick leave, what about parental leave? And what if the first hire is underperforming, can you easily get rid of that person, etc.? (Important in startups with a handful of employees.)

At least in my case, I would want some information regarding that so I could decide if incorporating in France is a good idea, compared to let's say Estonia or the US.

Deprecation of Legacy TLS 1.0 and 1.1 Versions


This is a guest post from Apple’s Secure Transports team about TLS protocol version deprecations. This announcement may require changes for your websites.

Transport Layer Security (TLS) is a critical security protocol used to protect web traffic. It provides confidentiality and integrity of data in transit between clients and servers exchanging (often sensitive) information. To best safeguard this data, it is important to use modern and more secure versions of this protocol. Specifically, applications should move away from TLS 1.0 and 1.1. Doing so provides many benefits, including:

  • Modern cryptographic cipher suites and algorithms with desirable performance and security properties, e.g., perfect forward secrecy and authenticated encryption, that are not vulnerable to attacks such as BEAST.
  • Removal of mandatory and insecure SHA-1 and MD5 hash functions as part of peer authentication.
  • Resistance to downgrade-related attacks such as LogJam and FREAK.

Now is the time to make this transition. Properly configured for App Transport Security (ATS) compliance, TLS 1.2 offers security fit for the modern web. It is the standard on Apple platforms and represents 99.6% of TLS connections made from Safari. TLS 1.0 and 1.1 — which date back to 1999 — account for less than 0.36% of all connections. With the recent finalization of TLS 1.3 by the IETF in August 2018, the proportion of legacy TLS connections will likely drop even further. TLS 1.2 is also required for HTTP/2, which delivers significant performance improvements for the web.

Therefore, we are deprecating support for TLS 1.0 and 1.1. Complete support will be removed from Safari in updates to Apple iOS and macOS beginning in March 2020. Firefox, Chrome, and Edge are also planning to drop TLS 1.0 and 1.1 support at that time. If you own or operate a web server that does not support TLS 1.2 or newer, please upgrade now. If you use legacy services or devices that cannot be upgraded, please let us know by contacting our Web Technologies Evangelist or by filing a bug report with details.
