
India's top court rules against instant divorce

[Image: a bride with her hand covered in henna (Getty Images). Caption: Five Muslim women asked the Supreme Court to declare instant divorce unconstitutional]

India's top court has ruled the practice of instant divorce in Islam unconstitutional, marking a major victory for women's rights activists.

In a 3-2 majority verdict, the court called the practice "un-Islamic".

India is one of a handful of countries where a Muslim man can divorce his wife in minutes by saying the word talaq (divorce) three times.

The landmark court decision came in response to petitions challenging the so-called "triple talaq" custom.

The cases were filed by five Muslim women who had been divorced in this way and two rights groups.

"Muslim women in India have suffered for the last 70 years. It's a historic day for us, but it doesn't end here. I cannot tell you how much Indian women have supported us, despite their religions," Zakia Soman, an activist from Bharatiya Muslim Mahila Andolan, one of the groups which contested the practice, told reporters.

What is instant divorce?

In recent years, there have been cases in which Muslim men in India have divorced their wives by issuing the so-called triple talaq by letter, telephone and, increasingly, by text message, WhatsApp and Skype. A number of these cases made their way to the courts as women contested the custom.

Yet triple talaq divorce is not mentioned in Sharia law or the Koran, even though the practice has existed for decades.

Islamic scholars say the Koran clearly spells out how to issue a divorce - it has to be spread over three months, allowing a couple time for reflection and reconciliation.

Activists say most Islamic countries, including Pakistan and Bangladesh, have banned triple talaq, but the custom has continued in India, which does not have a uniform set of laws on marriage and divorce that apply to every citizen.

What did the court say?

Three of the judges called the controversial practice "un-Islamic, arbitrary and unconstitutional". One of the judges, Justice Kurien Joseph, said the practice was not an essential part of Islam and enjoyed no protection.

Chief Justice JS Khehar, in a differing opinion, said that personal law could not be touched by a constitutional court of law.

The dissenting judgements also recommended that parliament legislate on the issue, but this recommendation is not binding and it is up to parliament to take it up.


'Strong message' - By Geeta Pandey, Editor, India women and social affairs

The judgement is a huge victory for Muslim women. For decades, they have had to live with the threat of instant divorce dangling over their heads like a sword.

Campaigners say over the years thousands of women, especially those from poor families, have been discarded by their husbands in this manner. Many have been rendered destitute, with nowhere to go, or have been forced to return to their parental homes or fend for themselves.

The top court has also sent a very strong message to Muslim clergy. India's Muslim personal law board had called the practice "reprehensible" but said that it was not an issue for the courts and government to interfere in. With this latest ruling, this will no longer be the case.


How are people reacting?

The judgement is being widely hailed as a major victory for Muslim women and women's rights.

The hashtags #TripleTalaq and #SupremeCourt began trending on Twitter India even as the verdict was being announced. The hashtag #Tripletalaq is also trending globally on Twitter.


Initial Hammer2 filesystem implementation

Next DFly release will have an initial HAMMER2 implementation
Matthew Dillon  dillon at backplane.com
Fri Aug 18 23:40:22 PDT 2017
The next DragonFly release (probably in September some time) will have an
initial HAMMER2 implementation.  It WILL be considered experimental and
won't be an installer option yet.  This initial release will only have
single-image support operational plus basic features.  It will have live
dedup (for cp's), compression, fast recovery, snapshot, and boot support
out of the gate.

This first H2 release will not have clustering or multi-volume support, so
don't expect those features to work.  I may be able to get bulk dedup and
basic mirroring operational by release time, but it won't be very
efficient.  Also, right now, sync operations are fairly expensive and will
stall modifying operations to some degree during the flush, and there is no
reblocking (yet).  The allocator has a 16KB granularity (on HAMMER1 it was
2MB), so for testing purposes it will still work fairly well even without
reblocking.

The design is in a good place.  I'm quite happy with how the physical
layout turned out.  Allocations down to 1KB are supported.  The freemap has
a 16KB granularity with a linear counter (one counter per 512KB) for
packing smaller allocations.  Inodes are 1KB and can directly embed 512
bytes of file data for files <= 512 bytes, or have four top-level blockrefs
for files > 512 bytes.  The freemap is also zoned by type for I/O locality.
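
To illustrate the 1KB inode layout just described, here is a rough C
sketch (field names and the exact metadata split are illustrative, not
the real on-disk structures):

    struct h2_blockref_sketch {
        unsigned char bytes[128];   /* top-level blockref, 128 bytes each */
    };

    struct h2_inode_sketch {
        unsigned char meta[512];    /* inode metadata (illustrative split) */
        union {
            /* files <= 512 bytes embed their data directly */
            unsigned char direct_data[512];
            /* larger files get four top-level blockrefs (4 x 128 = 512) */
            struct h2_blockref_sketch blockrefs[4];
        } u;
    };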

The blockrefs are 'fat' at 128 bytes but enormously powerful.  That will
allow us to ultimately support up to a 512-bit crypto hash and blind dedup
using said hash.  Not on release, but that's the plan.

I came up with an excellent solution for directory entries.  The 1KB
allocation granularity was a bit high but I didn't want to reduce it.
However, because blockrefs are now 128 byte entities, and directory entries
are hashed just like in H1, I was able to code them such that a directory
entry is embedded in the blockref itself and does not require a separate
data reference or allocation beyond that.  Filenames up to 64 bytes long
can be accommodated in the blockref using the check-code area of the
blockref.  Longer filenames will use an additional data reference hanging
off the blockref to accommodate up to 255 char filenames.  Of course, a
minimum of 1KB will have to be allocated in that case, but filenames are <=
64 bytes in the vast majority of use cases so it just isn't an issue.
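
Roughly, a directory-entry blockref looks like this (again an
illustrative C sketch with an assumed field split, not the actual
headers):

    struct h2_dirent_blockref_sketch {
        unsigned long long inode_num;   /* target inode */
        unsigned long long hash_key;    /* directory hash, as in H1 */
        unsigned char      name_len;    /* filenames up to 255 chars */
        unsigned char      misc[47];    /* type, method fields, etc. */
        /* the 64-byte check-code area doubles as filename storage for
         * names <= 64 bytes; longer names spill into a separate 1KB
         * data allocation referenced by this blockref */
        char               name[64];
    };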

This gives directory entries optimal packing and indexing and is a huge win
in terms of performance since blockrefs are arrayed in 16KB and 64KB
blocks.  In addition, since inodes can embed up to four blockrefs, the
directory entries for 'small' directories with <= 4 entries ('.' and '..'
don't count) can actually be embedded in the directory inode itself.

So, generally speaking, the physical layout is in a very happy place.  The
basics are solid on my test boxes so it's now a matter of implementing as
many of the more sophisticated features as I can before release, and
continuing to work on the rest after the release.

-Matt

All 50 startups from Y Combinator’s Summer 2017 Demo Day 1


Biotech and artificial intelligence have emerged as the top startup trends at Y Combinator's 25th Demo Day. The 124 companies presenting at the entrepreneur school's twice-yearly graduation event make up YC's largest batch in its 12.5 years of operation.

YC partner Michael Seibel kicked off the event by reiterating the accelerator’s commitment to advancing diversity in Silicon Valley. In this class, 12 percent of the founders are female and 9.5 percent are black or latinx.

While those percentages have been pretty stable over the years, YC shines in its inclusion of international startups: 28 percent of this batch's startups are based internationally, thanks in part to outreach via its scalable online Startup School and global events.

Pyka shows off its self-flying personal plane outside Y Combinator Demo Day at the Computer History Museum in Mountain View, CA

Past YC hits include Airbnb, Dropbox, and Stripe, plus newer unicorns like Twitch, Instacart, and Coinbase. Investors from across Silicon Valley and the world packed Mountain View’s Computer History Museum to look for the next big thing.

Here’s a look at every company that presented on the record at Demo Day 1 of 2. Check back for our picks of the best of today’s startups, plus write-ups of all tomorrow’s companies and the highlights.

Zendar– High definition radar that allows self-driving vehicles to see in all weather conditions

Zendar develops high-definition radar for autonomous vehicles. Today, autonomous vehicles rely on two main technologies: Lidar and traditional radar. Lidar can see in high definition but does poorly in bad weather, while radar is great in bad weather conditions but can't see in high resolution. Zendar seeks to provide high-res imaging for self-driving cars in bad weather, allowing all-weather autonomy. Zendar says 10 million autonomous vehicles will be made in the next three years, and it's hoping to be used by as many of them as possible.

Image via Sombre Lidar

Meetingbird– Team-wide meeting scheduling optimization

Having scattered meetings throughout the day destroys productivity. But it’s tough to coordinate meetings by yourself, let alone with the rest of your team’s schedule in mind. Meetingbird is a smart calendar startup that makes it simple to plan a meeting, overlays schedules to find times that work for everyone, and optimizes everything to condense meetings so everyone can get back to work. Meetingbird is now signing up paid enterprise customers for its premium service, with 53 percent week-over-week growth and inherent virality. While competitors are trying to create AI assistants that try to handle meeting communication for you, Meetingbird just gets things scheduled as fast as possible.

Read more about Meetingbird on TechCrunch.

Thematic– Text analysis for surveys and reviews

Getting people to type out all the things they love or hate about your product in reviews and surveys can be a great source of quality feedback, but distilling massive walls of text into insights can be a nightmare. Thematic is devoted to analyzing unstructured sources to give customers more actionable steps for increasing customer satisfaction. The company has already analyzed millions of data sources since its launch earlier this year, and it's delivering insights to partners like Vodafone and Stripe.

PullRequest– A marketplace for code review

PullRequest is a marketplace pairing corporate code with freelance code reviewers looking for a side hustle. The team is recruiting reviewers with experience at top tech companies like Amazon, Facebook and Dropbox. With this pedigree, PullRequest has managed to draw interest from 450 teams. Though only a portion of these are actually using the service, PullRequest touts a $136,000 annualized revenue run rate. Together, startups and Fortune 500 companies spend an estimated $40 billion on code reviews. The secret sauce of PullRequest lies in automation techniques that allow the startup to do reviews faster and more accurately.

Helium Healthcare– Electronic Medical Records For Africa

Paper medical records can cost lives. Helium is making them a thing of the past with its “rugged” electronic medical records system for Africa. Designed for minimal training and offline access from any device, Helium can handle patient records for doctor’s visits, prescriptions, and billing. Helium offers both pay-as-you-go billing and traditional enterprise subscriptions for larger hospitals. With over 20 facilities and 500 medical professionals on board, Helium hopes to improve healthcare across Africa by making EMR easy to adopt.

Darmiyan– Early detection of Alzheimer’s disease up to 15 years before symptoms

Darmiyan reduces the cost and time it takes to test for early-onset Alzheimer's disease. Anyone over the age of 45 should be tested, and the company has already tested 3,000 patients. Even before submitting to the FDA, the company has signed a $1 million contract. Currently there are 26 million Americans who should be tested, and each test costs $500, which means a potential $13 billion market for the company.

Roofr– Satellite-powered roofing estimates

Roofr uses satellite imagery to let consumers easily get a quote on the cost of their roof and then get connected with roofers to tackle repairs. Property owners can set their address, trace an image of their home's roof on a satellite map, and within 30 seconds get an estimate; they can then be connected with a roofing installer within 72 hours. The startup takes a 10 percent fee for the process and says it's saving its customers about 20 percent.

CashFree – Payments automation for the Indian market

Payments products are a dime a dozen these days, but CashFree is hoping its focus on the Indian market will set it apart. CashFree is a payment gateway that automates both inbound and outbound ACH payments. The founder of CashFree explains that this could enable instant transactions on an individual basis — i.e. an Uber-esque service could pay drivers directly after their trips. The company is currently processing $3.5 million in payments and earning an attractive 40 basis points on each transaction.

Skyways– VTOL drones

Skyways is building vertical take-off and landing (VTOL) drones for the military to transport goods without putting people in danger, since the military currently operates in places with little infrastructure. Its drones are fully autonomous and have a payload capacity of 45 pounds. While it's starting with military drones, the company wants to eventually use that business to fund a consumer vehicle in the long term.

Mystro– Helping on-demand drivers earn more

Juggling different apps like Uber and Lyft can distract drivers and cause them to miss the most lucrative rides. Mystro's service auto-accepts the most profitable fares for a driver so they can focus on the dollars and the road. And since it enhances driver satisfaction, it's chipping away at Uber and Lyft's huge driver retention problem, which sees 96 percent of drivers quit within their first year. That's why the ride-share services don't block Mystro, and it's expanding beyond Lyft and Uber. The $12/month Mystro subscription is growing 25 percent week-over-week, and the service handles 100,000 rides a week. With 20 million on-demand drivers worldwide, Mystro is chasing a $3 billion a year opportunity. While there's a risk that the ride-share platforms will try to add similar functionality, none will work cross-platform, leaving a big opportunity for Mystro.

Read more about Mystro on TechCrunch.

10 By 10– Recruitment agency hiring marketplace

Hiring at big tech companies is an intensive and expensive process for recruiters. 10by10 is building a marketplace to more quickly match qualified candidates with companies by pooling data across recruitment agencies. The startup takes what a lot of agencies are already doing on an informal level and brings it onto its platform to get stuff done "ten times easier and ten times faster." Though 10by10 just launched last month, it has already booked $60k in revenue. The startup splits its fee 50/50 with the recruiter.

Honeydue– Financial planning for couples

Honeydue is a collaboration tool for couples to manage their finances together. We’ve all heard that the number one point of contention for couples is money. Eugene Park, the startup’s founder, aims to reduce this friction with transparency. The app currently has about 24,000 users monitoring $68 million in cash balances. This is music to the ears of anyone looking to target financial products to the millennial couples demographic. Park proudly noted a 16x click through rate for financial products offered up via Honeydue.

Read more about Honeydue on TechCrunch

D-ID– Protect your identity from face recognition technologies.

D-ID has developed an AI to protect your photo from facial recognition. With just your photo, hackers can steal your identity and hack your devices. But unlike passwords, you can’t change your face. D-ID has created software that processes your photo and creates a protected image that looks similar to the naked eye. The company is targeting customers and security agencies who store user photos, and has two $1 million letters of intent signed.

Life Bot– One voice app for everything

It’s tough to remember the names and scripts of all the different voice apps when you don’t have icons to browse like on mobile. That’s why Life Bot says the average retention of an Amazon Alexa app is 3 percent, while it has 52 percent, and plans to launch on Google Home and Microsoft Cortana. Life Bot’s app can give you personalized news, manage your calendar, or find your phone. And since it knows your phone number, it can send you reminders even when you’re not home. Eventually it wants to work in your car and on every other device. While it may have to contend with native omni-apps from voice platforms like Amazon and Google, the voice bot space is exploding and there are few name brands.

Read more about Life Bot on TechCrunch

Modular Science– Outdoor robot farming.

Elon Musk may be concerned about robots taking over the world, but Modular Science just wants robots to farm our vegetables. The startup, which currently has robots out in the field (!) in Petaluma, CA, is aiming to automate 99 percent of the processes involved in vegetable farming within the next six months with its specialized farming bots. Modular Science is looking to charge $2,000 per acre, which it says is half of what farms are currently paying for human labor.

Audm– Subscription audio content

Unafraid of Apple, Spotify and other incumbents, Audm is trying to find white space in monetizing spoken word audio content. By taking a revenue sharing approach, Audm has managed to get Buzzfeed, The Atlantic, Wired, Esquire and more on board. About 1,150 subscribers are paying $7 per month to access that audio content. The startup sees itself as the disruptor of Sirius XM, beginning the long journey of building out a library of podcasts, news and talk radio.

Read more about Audm on TechCrunch

GameLynx– Next generation mobile eSport

GameLynx wants to build a competitive eSport game to bring hardcore gaming to mobile. The company believes that success will be defined not just by creating a new type of game, but creating a better user experience. Mobile devices are now powerful enough to support the types of games hardcore gamers love to play, so now the company wants to bring eSports gaming to that platform. In doing so, it hopes to build eSports games that aren’t just fun to watch for gamers, but for everyone. GameLynx will launch its first game in its first test market in the next six months, but is already backed by one of the largest game companies in the world.

Gopher– An app platform atop email  

We all hate email, but still spend most of our day there. Gopher wants to make that time more productive by letting any developer build apps for your inbox. For example, you can forward it emails of data for entry into Salesforce, or collaboration plans to schedule a meeting. Its first extension for sending follow-up emails has earned it 13,00 monthly users, and 300 devs have signed up to build on the platform. Rather than forcing you to waste your hours hopping back and forth between email and other apps, Gopher will help you get things done all in one place.

70 Million Jobs – Job recruitment platform for America’s formerly incarcerated

There are 70 million Americans with a criminal record, and when it comes to finding employment, things can get complicated. 70 Million Jobs is a for-profit recruitment platform that connects companies with these applicants. Founder Richard Bronson knows some of the challenges facing the recently incarcerated, as he spent two years in a federal prison after being convicted of securities fraud in 2002. Since then he has joined with Defy Ventures to help formerly incarcerated people get a second chance through entrepreneurship. "What we do is use advanced insights to connect ignored talent with jobs that companies can't fill," Bronson told the crowd of investors. The startup is starting its efforts with job recruitment, working with companies like Uber, but Bronson hopes the startup becomes a hub for providing services to those with a criminal record.

May Mobility– Autonomous vehicles for urban environments

May Mobility is the latest of a ballooning number of startups tackling the autonomous vehicle space. The team, formerly University of Michigan roboticists, is pretty deep in R&D. Rather than beat competitors purely on technology, May just wants to be first to market. And with a paid partnership lined up with the City of Detroit, that actually just might happen. The vision is one of reduced variables — the vehicles would operate in more predictable environments like central business districts and residential communities. And Detroit isn't alone: negotiations are progressing with four cities to get autonomy on the road and make money sooner rather than later.

Read more about May Mobility on TechCrunch.

Flock– Wireless security systems for neighborhoods

Flock builds wireless cameras that can be used to protect neighborhoods. The company has developed an outdoor camera that can track cars and record license plates. It can provide data to local police officers when crimes occur, but it can also proactively notify them when a stolen vehicle enters a neighborhood. The company has already solved its first crime and is being used by multiple neighborhoods, but believes it is targeting a $1.5 billion market opportunity in protecting local municipalities.

Indivio– Video ad A/B testing

Advertisers know that the best performing ads come from creating tons of variants and whittling them down to what works. That’s easy with text and images, but much harder with video. Indivio takes the work out of video ad optimization. It can use motion graphics instead of traditionally filmed video to make different versions of an ad for different locations and target customers. Indivio reduced Instacart’s cost per acquisition by 25 percent, and now it wants to optimize all the video ads on Facebook and Instagram. As ad spend shifts from television to social, plenty of brands will need help, and Indivio will charge them 5 percent to 10 percent to make sure their marketing resonates with our fast-moving feeds.

Relationship Hero– Relationship help for the digital age

If there’s anything Silicon Valley hasn’t proven itself adept at helping with, it may be navigating  the complexities of human relationships. Thankfully it’s not AI-based and unlike so many of the gimmicky chat bots or Dear Abby-style products, Relationship Hero is looking to help you solve relationship issues by connecting users with live relationship experts over the phone or through online chat. Through what the startup calls “tactical step-by-step plans,” the startups wants to help you through issues with family members, coworkers and significant others. 30 million people go to therapy, Relationship Hero says they want to create a “lighter weight” solution. They won’t just offer you random truisms either, in some cases the experts will tell you what to say in a text and when to send it. The average client spends over $100 inside the app as they get live expert help from relationship coaches.

ShiftDoc– A marketplace for healthcare professionals

ShiftDoc is building a better way to fill shifts at private healthcare practices. The startup is undercutting staffing agencies and offering a better user experience than job boards with its marketplace. The nice part about addressing the healthcare market is that the take for each shift filled is very high: ShiftDoc says it's earning $50 per shift it fills. Of course, the hard part is building up enough initial supply and demand for the marketplace to sustain itself. To this end, the team has on-boarded 150 part-time doctors willing to fill shifts at 50 private practices.

Dropleaf– Netflix for indie video games

Dropleaf provides a subscription service for independently produced PC games. It's taking advantage of growth in the number of indie games, which doubles each year, and interest from PC gamers. With its $10 per month service, Dropleaf offers more than 50 games to users. In a limited beta, 90 percent of its users play games at least twice a week, and it believes it has an addressable market of 120 million PC gamers around the world.

Sunu– Sonar bracelet for the blind

The vision-impaired frequently hurt themselves, with one blind person going to the hospital every 5 seconds due to head injury. But their options are limited to a low-tech $30 cane or a pricey $30,000 guide dog. Sunu is a sonar bracelet that vibrates to let the vision-impaired know that they’re approaching an object. Its six-month beta test saw users reduce accidents by 90 percent. Sunu has sold $25,000-worth of its bracelets that ship in October. Now that the product has been built and patented, it’s seeking to sell one to all 10 million blind people in the US. People are willing to pay a premium for safety, so even if cheaper devices emerge, Sunu could win by becoming a trusted brand.

Wildfire– An administration-approved Yik Yak for college campuses

Wildfire seems to be a bit of a mixture of Yik Yak and Patch, bringing local user-submitted news and administration-sanctioned campus alerts. The app's initial draw is as a system for sending out campus safety push notifications so students are alerted if there's a robbery or active-shooter situation on campus. In less dire, day-to-day use cases, the app is a "hyperlocal news app" letting users share what's happening on campus, whether it's an extracurricular event or a party. Wildfire says it has 23,000 MAUs across six college campuses and will be available on 50 campuses by the end of the year.

OncoBox– Better drug treatment decisions for late-stage cancer patients

When a patient is suffering from late-stage cancer, every treatment decision made by an oncologist has a huge impact on potential survival. There are over 150 cancer drugs on the market today — everyone would love a panacea, but the pragmatic problem of today is deciding which patients should be assigned which drugs. OncoBox provides pre-testing to estimate the likelihood that a given drug will improve outcomes for a specific patient. The team is charging $1,000 for its test and estimates that there are about 500,000 tests done per year. The $500 million market is just a starting point for the startup, which promises drug matches 2x more effective than doctors'.

VergeSense– Facility management powered by AI

VergeSense uses hardware sensors and machine learning techniques to help companies operate buildings more efficiently. For most companies, real estate is the second-largest cost of doing business, but VergeSense believes it can reduce those costs by 10 to 15 percent. By installing wireless sensors around a company's buildings, it can recognize the flow of human movement and make recommendations to customers to lower costs. VergeSense already has two paid pilots with Fortune 500 clients, and believes every big company needs a product like the one it's built.

Pyka– Self-driving personal aircraft

Pyka wants to make "flying cars" a reality with its auto-piloting single-person planes. The company has already built a 400lb plane that flies itself and can take off and land in 90 feet. But since regulators want to see tons of testing before allowing humans aboard, Pyka has developed a placeholder business doing crop dusting in New Zealand. That helps it earn $600 per hour while logging the hours necessary to prepare for the human transportation market. Crop dusting alone is a $1.5 billion business in the US. But with employees from Zee airplanes and Google's Waymo, Pyka aims to become a first-mover in self-flying personal planes.

Fastpad– Job applicant tracking system for India

Fastpad is building hiring software for the Indian market that gets rid of spam and ensures that companies can see quick snapshots of real candidates. Fastpad claims that most job openings in India have thousands of applications and candidates often apply without even reading the descriptions. Because of this, around 70 percent of actual hires end up coming from third-party recruiters. Fastpad is looking to create the dominant recruitment marketplace by cutting through the noise in an Indian hiring marketplace that’s growing 40 percent year-over-year.

Gustav– Marketplace aggregating small staffing agencies

Gustav might not look like a traditional staffing agency, but that hasn’t stopped it from earning money like a traditional staffing agency. The startup works with companies to fill temporary positions. Traditionally this work is done by large staffing agencies, but Gustav is testing its thesis that an aggregation of small staffing agencies outperforms the big legacy players. Uber, Sony, H&M, Vice and others have done work with Gustav to hire about 20 individuals to date. And even as a middleman, using automation to organize the 19,000 small staffing agencies in the U.S., Gustav gets to collect three percent of the salary paid out to contractors. This tends to give each hire about $1,000 in LTV.

Forever Labs– Transplant your stem cells to your older self to combat aging

Forever Labs wants to help users cryogenically freeze their stem cells, allowing them to use those cells to fight age-related diseases in the future. Stem cells have been shown to extend the lifespan of mice by 16 percent, but the older you get, the less helpful your cells become in fighting disease. Forever Labs now has 20 doctors providing the procedure, and expects to be in every major US market by this time next year. Stem cell banking could be a $56 billion market, the company believes.

Read more about Forever Labs on TechCrunch.

Ubiq– Screen-sharing solution for enterprise conference rooms

No matter how amazing technological advances seem to get, telepresence business meetings are still awful. Ubiq is aiming to simplify conference room screen sharing with their cable-free setup that cuts down on confusion and lets businesses focus on the tasks at hand. It’s basically bringing enterprise-grade AirPlay-like streaming tech into the conference room with wireless HDMI output. The startup’s solution has already been deployed at more than 150 companies and has increased revenue 3.5X in the past four months.

Airthium – Energy storage using hydrogen compressors

Energy storage is one of those holy grails that everyone knows exists but nobody has been able to come close to capturing. Airthium is chipping off a tiny portion of the huge market with its hydrogen compressor-based energy storage. The team of physicists and experts in fluid dynamics is building small systems without moving parts, a decision that is saving Airthium serious money. Despite the R&D-heavy nature of the business, Airthium has managed to obtain two letters of intent at a value of $4 million per year and a third letter for a smaller $300,000 energy system.

UpCodes– Construction legal compliance

UpCodes helps the construction industry navigate compliance. Currently, most compliance codes are hidden in physical books and PDFs, which means multimillion-dollar mistakes are common in the industry. UpCodes has brought those analog compliance resources online, growing to 61,000 monthly unique visitors through SEO alone. It has a freemium model that it's using to go after the 18 million professionals who deal with code compliance globally.

Read more about UpCodes on TechCrunch

Cambridge Cancer Genomics– Blood test cancer treatment monitoring

It can take six months before a cancer patient's doctor knows whether the chemotherapy regimen they chose is working, yet two-thirds of treatments fail. Cambridge Cancer Genomics has developed a blood test that can detect failed treatments up to several months faster than standard monitoring, so doctors can switch plans sooner when necessary. Founded by four PhDs with cancer research experience, CCG is also building AI for personalizing cancer treatment using a data set it says is 4X larger than what's available to the public, as it absorbs data from each medical facility it signs on.

HelpWear– Medical grade heart-monitoring wearables

For the 17 million Americans suffering from acute heart conditions, HelpWear is building a more versatile ECG system that patients can wear around the clock. Existing systems are uncomfortable amalgams of wires and adhesives that can only be worn for 72 hours and have to be taken off before hopping in the shower, a major inconvenience for those suffering from acute heart conditions. HelpWear's solution is a much more svelte system of three wearable units akin to fitness trackers, which are wireless, waterproof and can be worn 24/7. The startup is on track to be FDA-approved in nine months.

Net30 – Getting construction workers paid faster

The construction industry is one of those places where, despite increasing attention from startups, there always seems to be an infinite number of archaic processes that need solving. Net30 is pursuing online invoicing and payments for construction companies. Typically, general contractors collect invoices from subcontractors, but this seemingly easy process often involves over 200 pages of complex accounting. The end result is a basically unacceptable 70-day pay delay. With a background in construction project management, the Net30 team is cutting pay periods down to just 30 days. The pitch has proven so attractive that the startup is expecting $400,000 in annual revenue.

Read more about Net30 on TechCrunch.

Guggy– Transform text messages to personalized funny GIFs.

GIFs are everywhere these days, with GIF views having grown more than 100x since 2014. With that in mind, Guggy helps users express themselves with personalized GIFs. Using a natural language processing engine that understands slang and emotion, the company can instantly create GIFs that represent a user's words. The company already has 1 million active users on its API, but it's looking to build the messaging app of the future and deliver it direct to consumers.

Escher Reality– Augmented reality’s data backend

To augment the real world, you need data about it. Escher Reality aggregates AR video data from people's camera phones and pins it to locations so other developers can build better experiences on top. And while Facebook and Apple have their own AR platforms, Escher works across iOS and Android right inside developers' apps. It now has 600 devs on its waitlist, 10 letters of intent from potential clients like game studios, and a signed deal to power an AR app for the blockbuster robot-fighting movie Pacific Rim. If Escher Reality can be the device- and platform-agnostic engine for AR, it could become a gateway to tons of developer spending and consumer time spent.

Read more about Escher Reality on TechCrunch.

Carrot Fertility– Fertility benefits for corporate health plans

Carrot Fertility wants to bring fertility benefits to company health plans so that employers cover fertility procedures like IVF or egg-freezing just as they do vision or dental. Though big tech companies like Apple, Facebook and Google already offer fertility benefits to employees, other companies that aren't so flush with cash may not have the resources to seek out the best path toward adding this coverage. Carrot Fertility makes it easier for companies to add the service to health plans, helping them keep their list of benefits attractive to potential new hires.

Feather– Stylish furniture rental for millennials

It’s 2017 — owning things isn’t cool because owning things is expensive and requires commitment. Feather is rescuing millennials from IKEA purgatory with its furniture rental service. By focusing on style, Feather wants to offer furniture that people actually want. The New York-based startup is making about $275 per month, per order. On an average order size of $2,200, Feather earns $830. And the company manages this without actually owning any of its own furniture. Working alongside a debt capital partner, the startup leases its furniture as a middleman, renting it back to customers at a convenience premium.

Read more about Feather on TechCrunch.

Prism IO– Help companies fix churn

Churn kills companies, but Prism IO wants to help kill churn. Most companies try to quantify customer loyalty, because as they scale they can no longer talk to customers the way they used to. To help them, Prism IO talks to customers who have churned to find out why they left and what a company can do to win them back. Over the last few months, Prism IO has grown MRR 12x to $10,000 and today has 20 paying B2B and B2C customers.

PayFazz– Bankless payments for Indonesia

China has WeChat Pay. India has PayTM. And this startup wants to give Indonesia its own app for bill payment and money transfers. PayFazz works by verifying people to become mobile bank agents. PayFazz users can hand cash to the agents, who then route the money to the appropriate bank or pay a balance for the user. Agents earn a cut for their services and risk, while PayFazz takes a 1 percent fee. It now has 70 percent month-over-month growth and is processing over $1 million per month. With more unbanked people around the world wanting a gateway to the worlds of ecommerce and on-demand employment, PayFazz could help modernize Indonesia.

Read more about PayFazz on TechCrunch.

Sixty– On-demand web app support platform

Web applications are becoming increasingly essential in just about every office, and new updates bring new problems that IT departments aren't always equipped to deal with. Sixty is building a platform that brings in-app, on-demand experts to help users navigate the ins and outs of web apps like QuickBooks or MailChimp. The startup found that nearly 20 percent of support tickets for some web services were best suited to paid professional help; Sixty serves as an outlet for these issues so businesses aren't saddled with the toughest problems and customers aren't left waiting in the interim.

Totemic Labs– Safety device for senior citizens

Totemic Labs is disrupting legacy fall detection solutions for seniors. Millions of seniors fall every year, and even when seniors own necklaces and other wearables, they regularly forget to wear them. This is both a safety hazard and a failure of product design. Totemic Labs is building a device that resembles an Amazon Echo that uses sound to identify and respond to falls automatically. The team promises that a single device can monitor an entire home. For its efforts, Totemic earns $300 per year, per device within a $12 billion market.

Peergrade– Student feedback and grading platform

Peergrade wants to help teachers save time and teach more effectively by letting students grade each other's work. When students give feedback anonymously and grade their colleagues' work, they also learn. Meanwhile, teachers have a lot less work and can drastically reduce costs — universities can save on average $13,000 a year by replacing teaching assistants with Peergrade. The company already has $150,000 in ARR, but it has a huge market opportunity ahead of it, with more than 100,000 university departments it could go after.

Kestrel Materials– Temperature-responsive fabrics

William Gore built a multi-decade business by inventing the breathable, waterproof fabric Gore-Tex. Kestrel wants to make the next big leap in textile science. It's developing a fabric that responds to cold by flexing to create air pockets that trap heat and keep people warm. When the air warms up, these pockets collapse so less heat is trapped and people stay cooler. Kestrel already has a letter of intent from Casper to use its materials in bedding. With people spending tons of money on outerwear, athletic clothes and more, temperature-reactive technology could find its way into the fabrics that fill our day-to-day lives.

SMB Rate– Kayak for small business loans

SMB Rate is aiming to help small businesses get loans and build up the credit history they need in order to qualify for more attractive loans. On the lending side, SMB Rate takes a peek into the customer’s financials and uses its analysis platform to determine what the best loan rates are for businesses and subsequently connects them with lenders.

Android Oreo


Swift moves, behind the scenes

2x faster:

Get started on your favorite tasks more quickly with 2x the boot speed when powering up.*

*Boot time as measured on Google Pixel.

Background limits:

Android Oreo helps minimize background activity in the apps you use least. It's the super power you can't even see.

Exploring bump mapping with WebGL


04-Mar-2017
Introduction
============

Bump mapping is a collective term for a number of techniques used in graphics to simulate meso-features - features which aren't large enough to be necessarily represented by geometry, and aren't small enough to be represented by a shading model. These techniques allow us to add details without adding geometry.

This blog post is an attempt to study bump mapping techniques and implement a WebGL demo to compare and visualize them. We will not be looking at displacement mapping techniques, which rely on adding/manipulating geometry to the shaded surface. We will discuss four techniques: normal mapping, parallax mapping, steep parallax mapping and parallax occlusion mapping.

The WebGL demo is built using a single [JavaScript file](assets/bump_mapping/bump_mapping.js). Shader code is at the bottom of the page. The [diffuse map](assets/bump_mapping/diffuse.png), [normal map](assets/bump_mapping/normal.png) and the [depth map](assets/bump_mapping/depth.png) were taken from [learnopengl.com](https://learnopengl.com/#!Advanced-Lighting/Parallax-Mapping) and are licensed CC BY 4.0.

Each face of the cube below is a simple square made up of two triangles. The detail is added using bump mapping. You can click on the cube to pause its rotation, and play around with the controls below to visualize the various techniques and their parameters.

Normal Mapping
==============

![Normal map](assets/bump_mapping/normal.png)

Normal mapping is probably the most widely used bump mapping technique, and readers even casually associated with graphics will have worked with normal maps at some point. In the WebGL demo above, all four of the bump mapping techniques use normal mapping; the parallax shaders just add extra math on top.

Lighting at a point on a surface varies with the angle of incidence, which is the angle between the incoming light ray and the surface normal. If we modify the surface normal, the lighting at the given point also changes. Normal mapping relies on this property and uses a texture to store surface normals. The red, green and blue channels are used to encode the normal's vector components along the X, Y and Z axes. Since these components can be negative, a channel value of 128 is considered to be zero: anything below this is negative, and anything above it is positive. For example, the color RGB (128, 128, 255) denotes a unit vector pointing in the positive Z direction, i.e. the vector $0\bar{i} + 0\bar{j} + 1\bar{k}$.
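As a quick illustration, the decode from channel values to vector components can be written in plain C like so (this is not part of the demo; the fragment shader does the equivalent with a `* 2.0 - 1.0` on the sampled texel):

#include <stdio.h>

/* Decode an 8-bit normal-map texel into vector components in [-1, 1].
   A channel value of 128 lands at (almost exactly) zero. */
static void decode_normal(unsigned char r, unsigned char g, unsigned char b,
                          float n[3])
{
    n[0] = r / 255.0f * 2.0f - 1.0f;
    n[1] = g / 255.0f * 2.0f - 1.0f;
    n[2] = b / 255.0f * 2.0f - 1.0f;
}

int main(void)
{
    float n[3];
    decode_normal(128, 128, 255, n);               /* the example texel above */
    printf("%.3f %.3f %.3f\n", n[0], n[1], n[2]);  /* prints ~0 0 1 */
    return 0;
}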
One problem with this approach is that the surface normals stored in the texture cannot be represented in world space. If they were, the normals would change every time the normal-mapped object was rotated, and the texture would have to be updated. Moreover, multiple objects with the same meso-features, but with differing geometry and orientations, would not be able to share the same normal map. This problem is solved by storing the normals in tangent space instead of in world space.

To achieve this, we use a concept called [the change of basis](https://en.wikipedia.org/wiki/Change_of_basis). Any 3D coordinate space is defined by its three basis vectors. Any given vector in this space can be uniquely represented as a linear combination of these basis vectors. For example, the vector (1, 2, 3) is actually the result of $1\bar{i} + 2\bar{j} + 3\bar{k}$, and the vectors $\bar{i}, \bar{j}$ and $\bar{k}$ are called the basis vectors. In world space, the basis vectors are simply the X, Y and Z axes. This is called the [standard basis](https://en.wikipedia.org/wiki/Standard_basis).

While choosing the basis vectors for the tangent space, we want them to be unperturbed by changes in the orientation of the mesh in world space. In other words, these basis vectors should "stick" to the mesh. The three basis vectors that are chosen are called the tangent, the bitangent and the normal. The tangent at each vertex is the partial derivative of the UV texture coordinate with respect to the `u` component. This means that the tangent points in the direction of change of `u` at that vertex. The bitangent is similar, but points in the direction of change of the `v` component.

These tangents and bitangents are stored in model space - the coordinate space where the object is located at the origin and has unit scale and zero rotation. In the vertex shader, the tangent and the bitangent vectors are transformed into world space using the inverse transpose of the model-view matrix. More information on this [here](http://web.archive.org/web/20120228095346/http://www.arcsynthesis.org/gltut/Illumination/Tut09%20Normal%20Transformation.html). These tangent and bitangent basis vectors are usually computed for each vertex when a mesh is loaded, and are stored in the vertex buffers. In this example, I am not computing them; I am directly storing tangents and bitangents in hard-coded arrays for the sake of simplicity.

In the vertex shader, once we have the tangent and the bitangent vectors in world space, we can simply take a cross product to get the normal in world space at that vertex. An important thing to note is that the tangent and the bitangent vectors may not necessarily be perpendicular to each other, depending on the mesh and its UV map.

Once we have the tangent $(T_x, T_y, T_z)$, bitangent $(B_x, B_y, B_z)$ and normal $(N_x, N_y, N_z)$ vectors in world space, we can construct a matrix to change the basis of a vector between the two spaces as required. The following matrix multiplication, with the basis vectors as columns, allows us to "move" a tangent-space vector $V (V_x, V_y, V_z)$ to world space:

$$ V_{world\ space} = \begin{pmatrix} T_x & B_x & N_x \\ T_y & B_y & N_y \\ T_z & B_z & N_z \end{pmatrix} \begin{pmatrix} V_x \\ V_y \\ V_z \end{pmatrix} \tag{a} $$

The inverse of this matrix can be used to do the opposite change of basis, i.e. conversion of a world-space vector into tangent space. Since our test mesh is a uniformly texture-mapped cube, the tangent and the bitangent are perpendicular at each vertex; hence this matrix is an [orthogonal matrix](https://en.wikipedia.org/wiki/Orthogonal_matrix), and its inverse is equal to its transpose. We take advantage of this fact, since the transpose is far cheaper to compute than the inverse. Thus, for a world-space vector $V (V_x, V_y, V_z)$:

$$ V_{tangent\ space} = \begin{pmatrix} T_x & T_y & T_z \\ B_x & B_y & B_z \\ N_x & N_y & N_z \end{pmatrix} \begin{pmatrix} V_x \\ V_y \\ V_z \end{pmatrix} \tag{b} $$

If your meshes do have texture shearing, the tangents and the bitangents will not be perfectly orthogonal. In this case, in order to get fully correct lighting, you will have to compute the proper inverse of the matrix inside the vertex shader instead of using the relatively cheaper transpose operation shown above.

Now that we can change basis at will, we can proceed with the lighting calculations. Our light positions are defined in world space, while our surface normals are in tangent space.
We need to bring both of these things into the same space to calculate the lighting. We could follow two different approaches here:

1. Convert the surface normals to world space
2. Convert the lights, camera and fragment positions to tangent space

If we follow approach 1, we will have to perform the matrix multiplication in equation (a) **for each fragment** in order to convert normals to world space. If we follow approach 2, we will have to perform the matrix multiplication in equation (b) **for each vertex**. We will need to convert all the light positions (this demo only has one), and other vectors such as the fragment position and the view position, from world space to tangent space. We only need to do this in the vertex shader, since the transformed vectors can be interpolated over the fragments; hence this approach is usually cheaper than the first. We follow this approach in the demo. Once everything is in tangent space, the angle of incidence can be computed along with its cosine, and the Lambertian lighting in the demo can be evaluated as usual.

Parallax mapping
================

![Depth map](assets/bump_mapping/depth.png)

In reality, bumps will obscure the area behind them, and this behaviour will change depending upon the viewing angle. For instance, this blocking effect will be negligible if you view the surface head-on, and will be very apparent as your angle of viewing becomes more oblique. While normal mapping does a good job of appropriately simulating lighting, it does not simulate this obscuring parallax effect at all. Regardless of the normal map, the position of the sampled texel is never changed, and the texture is simply uniformly mapped over the surface. Parallax mapping techniques remedy this shortcoming by modifying the texel coordinates based on a depth map before any further lighting calculations are done. This way, the bumpiness of the surface perturbs the texture mapped onto it.

In the diagrams below, a cross-section of the parallax-mapped surface is shown. The eye represents the camera, and the top-most horizontal line represents the surface. The area below the surface represents a "virtual" area, with depth increasing as you go down. The curve represents a topology described by the depth map. Note that the depth map is simply an image mapped flatly onto the surface; the curve is a visual representation of the topology that it describes. The point of intersection of the view ray and the surface denotes the fragment that the shader is running on.

![Ideal parallax mapping](assets/bump_mapping/parallax_ideal.svg)

As described before, we will have to perturb the texture coordinates sampled at any point depending on the depth of the surface. With simple normal mapping, the texel sampled will be the one located at the current fragment. We will alter this coordinate by $\Delta\text{uv}$ to get a parallax effect. The ideal value of $\Delta\text{uv}$ relies on finding the precise location of the intersection of the depth field and the view ray, making it a ray-tracing problem.

Simple parallax mapping does a first-order approximation by calculating $\Delta\text{uv}$ as follows:

$$ \Delta\text{uv} = \frac{h \cdot v_{xy}}{v_z} \tag{c} $$

Here, $h$ is the value of the depth map sampled at the fragment, and $v$ is the normalized view direction (a.k.a. the view ray). There is a variant of this technique with offset limiting, which reduces the amount of drift that occurs at oblique view angles:

$$ \Delta\text{uv} = h \cdot v_{xy} \tag{d} $$

These approximations make the shader fast, but they are extremely crude, and lead to obviously incorrect results at oblique view angles and larger depth scales, which you can see in the WebGL demo.

![Simple parallax mapping](assets/bump_mapping/parallax.svg)

Steep parallax mapping
======================

Steep parallax mapping does a more accurate ray intersection calculation than the previous approach. It does so by breaking the depth space down into a number of equal layers (controlled by the steps slider in the demo above). We step through each depth layer along the view ray, moving from the surface towards increasing depth.

![Steep parallax mapping](assets/bump_mapping/steep.svg)

At each step, we check whether the current layer depth is greater than the value of the depth map sampled at that step. If it is greater, then it means that we are "inside" the depth field. We select the first such point found, and derive the $\Delta\text{uv}$ based on this point. This technique is thus a linear search along the depth space. A greater step count leads to a better visual result at the cost of shader clock cycles.

The number of steps can be adaptively determined based on the view angle - steps increasing with the obliqueness of the view direction. The performance gained from this adaptive step count will vary greatly because of shader divergence resulting from variations in the view direction vector, which depend on the shaded geometry, the projection parameters and the position of the camera. The demo doesn't have this adaptive step count.

Parallax occlusion mapping
==========================

Steep parallax mapping results in stair-stepping artefacts for smaller step counts. This happens because neighboring ray intersections resolve to the same depth layer, since we only take the depth at the first layer at which the intersection occurred.

![Parallax occlusion mapping](assets/bump_mapping/pom.svg)

Parallax occlusion mapping (POM) fixes this by taking an additional depth texture sample from the layer _before_ the intersection happened. Now we have two depths - the depth at the layer after the intersection, and the depth at the layer before it. We do a lerp between these values to get a better approximation of the point of intersection. Read the shader code below to get a clearer idea of these calculations. This technique assumes that between two layers, the curvature of the depth map can be approximated with straight lines. This assumption holds true for sufficiently large step counts, and gives a good visual result.
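Written out explicitly, the lerp computes an interpolation weight (the notation here is ad hoc, but these are the same quantities the fragment shader below works with):

$$ w = \frac{d_{after} - \ell_{after}}{(d_{after} - \ell_{after}) - (d_{before} - \ell_{before})} $$

where $\ell$ is the layer depth and $d$ is the depth map value sampled at the layers immediately after and before the intersection. The final texture coordinate is then $\text{lerp}(\text{uv}_{after},\ \text{uv}_{before},\ w)$.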
Further reading
===============

* I was inspired to study and write this because of the great book [Real-Time Rendering](http://www.realtimerendering.com/book.html), which has a short section on bump mapping.
* [learnopengl.com](https://learnopengl.com/#!Advanced-Lighting/Parallax-Mapping) and [sunandblackcat.com](http://sunandblackcat.com/tipFullView.php?topicid=28) have good beginner-friendly writeups with code samples.
* The parallax techniques discussed above work well with flat surfaces, but may cause artifacts with curved meshes. It may be possible to analytically compensate for mesh curvature, as discussed very briefly in [this Unreal Engine 4 video](https://youtu.be/4gBAOB7b5Mg?t=42m6s).
* The original paper for parallax mapping by Tomomichi Kaneko can be found [here](https://www.researchgate.net/publication/228583097_Detailed_shape_representation_with_parallax_mapping).
* The original paper for steep parallax mapping by Morgan McGuire and Max McGuire can be found [here](http://graphics.cs.brown.edu/games/SteepParallax/).
* An in-depth slide deck about parallax occlusion mapping by Natalya Tatarchuk can be found [here](https://developer.amd.com/wordpress/media/2012/10/Tatarchuk-ParallaxOcclusionMapping-FINAL_Print.pdf).

Closing notes
=============

The JavaScript file behind this demo, and the shader code at the bottom of this post, are public domain. Feel free to use them however you want. If you have comments, suggestions, corrections or questions, feel free to get in touch with me on Twitter or email (links in the footer).

Shader source
=============

Vertex Shader
-------------


precision highp float;

attribute vec3 vert_pos;
attribute vec3 vert_tang;
attribute vec3 vert_bitang;
attribute vec2 vert_uv;

uniform mat4 model_mtx;
uniform mat4 norm_mtx;
uniform mat4 proj_mtx;

varying vec2 frag_uv;
varying vec3 ts_light_pos; // Tangent space values
varying vec3 ts_view_pos;  //
varying vec3 ts_frag_pos;  //

mat3 transpose(in mat3 inMatrix)
{
    vec3 i0 = inMatrix[0];
    vec3 i1 = inMatrix[1];
    vec3 i2 = inMatrix[2];

    mat3 outMatrix = mat3(
        vec3(i0.x, i1.x, i2.x),
        vec3(i0.y, i1.y, i2.y),
        vec3(i0.z, i1.z, i2.z)
    );

    return outMatrix;
}

void main(void)
{
    gl_Position = proj_mtx * vec4(vert_pos, 1.0);
    ts_frag_pos = vec3(model_mtx * vec4(vert_pos, 1.0));
    vec3 vert_norm = cross(vert_bitang, vert_tang);

    vec3 t = normalize(mat3(norm_mtx) * vert_tang);
    vec3 b = normalize(mat3(norm_mtx) * vert_bitang);
    vec3 n = normalize(mat3(norm_mtx) * vert_norm);
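    // t, b and n are orthonormal, so the transpose of the TBN matrix is
    // its inverse; multiplying by it takes world-space vectors into
    // tangent space.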
    mat3 tbn = transpose(mat3(t, b, n));

    vec3 light_pos = vec3(1, 2, 0);
    ts_light_pos = tbn * light_pos;
    // Our camera is always at the origin
    ts_view_pos = tbn * vec3(0, 0, 0);
    ts_frag_pos = tbn * ts_frag_pos;
 
    frag_uv = vert_uv;
} 
Fragment Shader
---------------
precision highp float;

uniform sampler2D tex_norm;
uniform sampler2D tex_diffuse;
uniform sampler2D tex_depth;
/*
    The type is controlled by the radio buttons below the canvas.
    0 = No bump mapping
    1 = Normal mapping
    2 = Parallax mapping
    3 = Steep parallax mapping
    4 = Parallax occlusion mapping
*/
uniform int type;
uniform int show_tex;
uniform float depth_scale;
uniform float num_layers;

varying vec2 frag_uv;
varying vec3 ts_light_pos;
varying vec3 ts_view_pos;
varying vec3 ts_frag_pos;

vec2 parallax_uv(vec2 uv, vec3 view_dir)
{
    if (type == 2) {
        // Parallax mapping
        float depth = texture2D(tex_depth, uv).r;    
        vec2 p = view_dir.xy * (depth * depth_scale) / view_dir.z;
        return uv - p;  
    } else {
        float layer_depth = 1.0 / num_layers;
        float cur_layer_depth = 0.0;
        vec2 delta_uv = view_dir.xy * depth_scale / (view_dir.z * num_layers);
        vec2 cur_uv = uv;

        float depth_from_tex = texture2D(tex_depth, cur_uv).r;

        for (int i = 0; i < 32; i++) {
            // Walk layer by layer until the layer depth first exceeds the
            // sampled depth, i.e. until we are "inside" the depth field.
            // (GLSL ES 1.00 needs a constant loop bound, hence 32.)
            cur_layer_depth += layer_depth;
            cur_uv -= delta_uv;
            depth_from_tex = texture2D(tex_depth, cur_uv).r;
            if (depth_from_tex < cur_layer_depth) {
                break;
            }
        }

        if (type == 3) {
            // Steep parallax mapping: use the first point inside the field.
            return cur_uv;
        } else {
            // Parallax occlusion mapping: lerp between the layers after and
            // before the intersection.
            vec2 prev_uv = cur_uv + delta_uv;
            float next = depth_from_tex - cur_layer_depth;
            float prev = texture2D(tex_depth, prev_uv).r - cur_layer_depth
                         + layer_depth;
            float weight = next / (next - prev);
            return mix(cur_uv, prev_uv, weight);
        }
    }
}
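To see the layer search and the final interpolation outside of GLSL, here is a rough Python sketch of the same logic along a single texture axis. It is an illustration only, not part of the demo; the depth function and parameter values are made up.

    import math

    def parallax_uv_1d(u, view_x, view_z, depth,
                       depth_scale=0.1, num_layers=32):
        # depth: callable mapping a texture coordinate to a depth in [0, 1].
        layer_depth = 1.0 / num_layers
        delta_u = view_x * depth_scale / (view_z * num_layers)
        cur_layer_depth = 0.0
        cur_u = u
        depth_from_tex = depth(cur_u)

        # Steep parallax search: march until the layer depth first exceeds
        # the sampled depth.
        for _ in range(num_layers):
            cur_layer_depth += layer_depth
            cur_u -= delta_u
            depth_from_tex = depth(cur_u)
            if depth_from_tex < cur_layer_depth:
                break

        # POM: lerp between the layers after and before the intersection.
        prev_u = cur_u + delta_u
        nxt = depth_from_tex - cur_layer_depth
        prv = depth(prev_u) - cur_layer_depth + layer_depth
        weight = nxt / (nxt - prv)
        return cur_u * (1.0 - weight) + prev_u * weight

    # Example: a sinusoidal depth field viewed at 45 degrees.
    print(parallax_uv_1d(0.5, 0.7, 0.7, lambda t: 0.5 + 0.5 * math.sin(10 * t)))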




Harmful Consequences of Postel's Maxim

Network Working Group                                         M. Thomson
Internet-Draft                                                   Mozilla
Intended status: Informational                             June 12, 2017
Expires: December 14, 2017


              The Harmful Consequences of Postel's Maxim
                   draft-thomson-postel-was-wrong-01

Abstract

   Jon Postel's famous statement in RFC 1122 of "Be liberal in what you
   accept, and conservative in what you send" is a principle that has
   long guided the design of Internet protocols and implementations of
   those protocols.  The posture this statement advocates might promote
   interoperability in the short term, but that short-term advantage is
   outweighed by negative consequences that affect the long-term
   maintenance of a protocol and its ecosystem.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 14, 2017.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Protocol Decay
   3.  The Long Term Costs
   4.  A New Design Principle
     4.1.  Fail Fast and Hard
     4.2.  Implementations Are Ultimately Responsible
     4.3.  Protocol Maintenance is Important
   5.  Security Considerations
   6.  IANA Considerations
   7.  Informative References
   Author's Address

1.  Introduction

   Of the great many contributions Jon Postel made to the Internet, his
   remarkable technical achievements are often ignored in favor of the
   design and implementation philosophy that he first captured in the
   original IPv4 specification [RFC0760]:

      In general, an implementation should be conservative in its
      sending behavior, and liberal in its receiving behavior.

   In comparison, his contributions to the underpinnings of the
   Internet, which are in many respects more significant, enjoy less
   conscious recognition.  Postel's principle has been hugely
   influential in shaping the Internet and the systems that use Internet
   protocols.  Many consider this principle to be instrumental in the
   success of the Internet as well as the design of interoperable
   protocols in general.

   Over time, considerable changes have occurred in both the scale of
   the Internet and the level of skill and experience available to
   protocol and software designers.  Much of that experience is with
   protocols that were designed, informed by Postel's maxim, in the
   early phases of the Internet.

   That experience shows that there are negative long-term consequences
   to interoperability if an implementation applies Postel's advice.
   Correcting the problems caused by divergent behavior in
   implementations can be difficult.







   It might be suggested that the posture Postel advocates was indeed
   necessary during the formative years of the Internet, and even key to
   its success.  This document takes no position on that claim.

   This document instead describes the negative consequences of the
   application of Postel's principle to the modern Internet.  A
   replacement design principle is suggested.

   There is good evidence to suggest that designers of protocols in the
   IETF widely understand the limitations of Postel's principle.  This
   document serves primarily as a record of the shortcomings of his
   principle for the wider community.

2.  Protocol Decay

   Divergent implementations of a specification emerge over time.  When
   variations occur in the interpretation or expression of semantic
   components, implementations cease to be perfectly interoperable.

   Implementation bugs are often identified as the cause of variation,
   though it is often a combination of factors.  Application of a
   protocol to new and unanticipated uses, and ambiguities or errors in
   the specification are often confounding factors.

   Of course, situations where two peers disagree are common, and should
   be expected over the lifetime of a protocol.  Even with the best
   intentions, the pressure to interoperate can be significant.  No
   implementation can hope to avoid having to trade correctness for
   interoperability indefinitely.

   An implementation that reacts to variations in the manner advised by
   Postel sets up a feedback cycle:

   o  Over time, implementations progressively add new code to constrain
      how data is transmitted, or to permit variations in what is
      received.

   o  Errors in implementations, or confusion about semantics can
      thereby be masked.

   o  These errors can become entrenched, forcing other implementations
      to be tolerant of those errors.

   For example, the original JSON specification [RFC4627] omitted
   critical details on a range of points including Unicode handling,
   ordering and duplication of object members, and number encoding.
   Consequently, a range of interpretations were used by
   implementations.  An update [RFC7159] was unable to correct these
   errors, instead concentrating on defining the interoperable subset of
   JSON.  I-JSON [RFC7493] defines a new format that is substantially
   similar to JSON without the interoperability flaws.  I-JSON also
   intentionally omits some interoperability: an I-JSON implementation
   will fail to accept some valid JSON texts.  Consequently, most JSON
   parsers do not implement I-JSON.
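
   As a concrete illustration in Python (an example added here, not
   taken from the draft): the standard json module keeps the last of
   two duplicate object members, while other parsers keep the first,
   expose both, or reject the text outright.

      import json

      # RFC 4627 left the handling of duplicate object members
      # unspecified, and implementations diverged.  Python's json
      # module keeps the last value.
      print(json.loads('{"a": 1, "a": 2}'))   # prints {'a': 2}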

   An entrenched flaw can become a de facto standard.  Any
   implementation of the protocol is required to replicate the aberrant
   behavior, or it is not interoperable.  This is both a consequence of
   applying Postel's advice, and a product of a natural reluctance to
   avoid fatal error conditions.  This is colloquially referred to as
   being "bug for bug compatible".

   It is debatable as to whether decay can be completely avoided, but
   Postel's maxim encourages a reaction that compounds this issue.

3.  The Long Term Costs

   Once deviations become entrenched, there is little that can be done
   to rectify the situation.

   For widely used protocols, the massive scale of the Internet makes
   large-scale interoperability testing infeasible for all but a
   privileged few.  Without good maintenance, new implementations can be
   restricted to niche uses, where the problems arising from
   interoperability issues can be more closely managed.

   This has a negative impact on the ecosystem of a protocol.  New
   implementations are important in ensuring the continued viability of
   a protocol.  New protocol implementations are also more likely to be
   developed for new and diverse use cases and often are the origin of
   features and capabilities that can be of benefit to existing users.
   These problems also reduce the ability of established implementations
   to change.

   Protocol maintenance can help by carefully documenting divergence and
   recommending limits on what is both acceptable and interoperable.
   The time-consuming process of documenting the actual protocol -
   rather than the protocol as it was originally conceived - can restore
   the ability to create and maintain interoperable implementations.

   Such a process was undertaken for HTTP/1.1 [RFC7230].  This effort
   took more than 6 years to document protocol variations and describe
   what has - over time - become a far more complex protocol.

4.  A New Design Principle

   The following principle applies not just to the implementation of a
   protocol, but to the design and specification of the protocol.

      Protocol designs and implementations should fail noisily in
      response to bad or undefined inputs.

   Though less pithy than Postel's formulation, this principle is based
   on the lessons of protocol deployment.  The principle is also based
   on valuing early feedback, a practice central to modern engineering
   discipline.

4.1.  Fail Fast and Hard

   Protocols need to include error reporting mechanisms that ensure
   errors are surfaced in a visible and expedient fashion.

   Generating fatal errors in place of recovering from a possible fault
   is preferred, especially if there is any risk that the error
   represents an implementation flaw.  A fatal error provides excellent
   motivation for addressing problems.

   In contrast, generating warnings provides no incentive to fix a
   problem as the system remains operational.  Users can become inured
   to frequent use of warnings and thus systematically ignore them,
   whereas a fatal error can only happen once and will demand attention.

   On the whole, implementations already have ample motivation to prefer
   interoperability over correctness.  The primary function of a
   specification is to proscribe behavior in the interest of
   interoperability.  Specifications should mandate fast failure where
   possible.

4.2.  Implementations Are Ultimately Responsible

   Implementers are encouraged to expose errors immediately and
   prominently, especially in cases of underspecification.

   Exposing errors is particularly important for early implementations
   of a protocol.  If preexisting implementations generate errors in
   response to divergent behaviour, then new implementations will be
   able to detect and correct their own flaws quickly.

   An implementer that discovers a scenario that is not covered by the
   specification does the community a greater service by generating a
   fatal error than by attempting to interpret and adapt.  Hiding errors
   can cause long-term problems.  Ideally, specification shortcomings
   are taken to protocol maintainers.

   Unreasoning strictness can be detrimental.  Protocol designers and
   implementers are expected to exercise judgment in determining what level
   of strictness is ultimately appropriate.  In every case, documenting
   the decision to deviate from what is specified can avoid later
   issues.

4.3.  Protocol Maintenance is Important

   Protocol designers are strongly encouraged to continue to maintain
   and evolve protocols beyond their initial inception and definition.
   If protocol implementations are less tolerant of variation, protocol
   maintenance becomes critical.  Good extensibility [RFC6709] can
   relieve some of the pressure on maintenance.

5.  Security Considerations

   Sloppy implementations, lax interpretations of specifications, and
   uncoordinated extrapolation of requirements to cover gaps in
   specification can result in security problems.  Hiding the
   consequences of protocol variations encourages the hiding of issues,
   which can conceal bugs and make them difficult to discover.

   Designers and implementers of security protocols generally understand
   these concerns.  However, general-purpose protocols are not exempt
   from careful consideration of security issues.  Furthermore, because
   general-purpose protocols tend to deal with flaws or obsolescence in
   a less urgent fashion than security protocols, there can be fewer
   opportunities to correct problems in protocols that develop
   interoperability problems.

6.  IANA Considerations

   This document has no IANA actions.

7.  Informative References

   [RFC0760]  Postel, J., "DoD standard Internet Protocol", RFC 760,
              DOI 10.17487/RFC0760, January 1980,
              <http://www.rfc-editor.org/info/rfc760>.

   [RFC4627]  Crockford, D., "The application/json Media Type for
              JavaScript Object Notation (JSON)", RFC 4627,
              DOI 10.17487/RFC4627, July 2006,
              <http://www.rfc-editor.org/info/rfc4627>.

   [RFC6709]  Carpenter, B., Aboba, B., Ed., and S. Cheshire, "Design
              Considerations for Protocol Extensions", RFC 6709,
              DOI 10.17487/RFC6709, September 2012,
              <http://www.rfc-editor.org/info/rfc6709>.

   [RFC7159]  Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
              Interchange Format", RFC 7159, DOI 10.17487/RFC7159, March
              2014, <http://www.rfc-editor.org/info/rfc7159>.

   [RFC7230]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
              Protocol (HTTP/1.1): Message Syntax and Routing", RFC 7230,
              DOI 10.17487/RFC7230, June 2014,
              <http://www.rfc-editor.org/info/rfc7230>.

   [RFC7493]  Bray, T., Ed., "The I-JSON Message Format", RFC 7493,
              DOI 10.17487/RFC7493, March 2015,
              <http://www.rfc-editor.org/info/rfc7493>.

Author's Address

   Martin Thomson
   Mozilla

   Email: martin.thomson@gmail.com


Dinosaur-Killing Asteroid Could Have Caused 2 Years of Darkness


Tremendous amounts of soot, lofted into the air from global wildfires following a massive asteroid strike 66 million years ago, would have plunged Earth into darkness for nearly two years, new research finds.

This would have shut down photosynthesis, drastically cooled the planet, and contributed to the mass extinction that marked the end of the age of dinosaurs.

These new details about how the climate could have dramatically changed following the impact of a 10-kilometer-wide asteroid will be published Aug. 21 in the Proceedings of the National Academy of Sciences. The study, led by the National Center for Atmospheric Research (NCAR) with support from NASA and the University of Colorado Boulder, used a world-class computer model to paint a rich picture of how Earth's conditions might have looked at the end of the Cretaceous Period, information that paleobiologists may be able to use to better understand why some species died, especially in the oceans, while others survived.

Scientists estimate that more than three-quarters of all species on Earth, including all non-avian dinosaurs, disappeared at the boundary of the Cretaceous-Paleogene periods, an event known as the K-Pg extinction. Evidence shows that the extinction occurred at the same time that a large asteroid hit Earth in what is now the Yucatán Peninsula. The collision would have triggered earthquakes, tsunamis, and even volcanic eruptions.

Scientists also calculate that the force of the impact would have launched vaporized rock high above Earth's surface, where it would have condensed into small particles known as spherules. As the spherules fell back to Earth, they would have been heated by friction to temperatures high enough to spark global fires and broil Earth's surface. A thin layer of spherules can be found worldwide in the geologic record.

"The extinction of many of the large animals on land could have been caused by the immediate aftermath of the impact, but animals that lived in the oceans or those that could burrow underground or slip underwater temporarily could have survived," said NCAR scientist Charles Bardeen, who led the study. "Our study picks up the story after the initial effects -- after the earthquakes and the tsunamis and the broiling. We wanted to look at the long-term consequences of the amount of soot we think was created and what those consequences might have meant for the animals that were left."

Other study co-authors are Rolando Garcia and Andrew Conley, both NCAR scientists, and Owen "Brian" Toon, a researcher at the University of Colorado Boulder.

A world without photosynthesis

In past studies, researchers have estimated the amount of soot that might have been produced by global wildfires by measuring soot deposits still preserved in the geologic record. For the new study, Bardeen and his colleagues used the NCAR-based Community Earth System Model (CESM) to simulate the effect of the soot on global climate going forward. They used the most recent estimates of the amount of fine soot found in the layer of rock left after the impact (15,000 million tons), as well as larger and smaller amounts, to quantify the climate's sensitivity to more or less extensive fires.

In the simulations, soot heated by the Sun was lofted higher and higher into the atmosphere, eventually forming a global barrier that blocked the vast majority of sunlight from reaching Earth's surface. "At first it would have been about as dark as a moonlit night," Toon said.

While the skies would have gradually brightened, photosynthesis would have been impossible for more than a year and a half, according to the simulations. Because many of the plants on land would have already been incinerated in the fires, the darkness would likely have had its greatest impact on phytoplankton, which underpin the ocean food chain. The loss of these tiny organisms would have had a ripple effect through the ocean, eventually devastating many species of marine life.

The research team also found that photosynthesis would have been temporarily blocked even at much lower levels of soot. For example, in a simulation using only 5,000 million tons of soot -- about a third of the best estimate from measurements -- photosynthesis would still have been impossible for an entire year.

In the simulations, the loss of sunlight caused a steep decline in average temperatures at Earth's surface, with a drop of 50 degrees Fahrenheit (28 degrees Celsius) over land and 20 degrees Fahrenheit (11 degrees Celsius) over the oceans.

While Earth's surface cooled in the study scenarios, the atmosphere higher up in the stratosphere actually became much warmer as the soot absorbed light from the Sun. The warmer temperatures caused ozone destruction and allowed for large quantities of water vapor to be stored in the upper atmosphere. The water vapor then chemically reacted in the stratosphere to produce hydrogen compounds that led to further ozone destruction. The resulting ozone loss would have allowed damaging doses of ultraviolet light to reach Earth's surface after the soot cleared.

The large reservoir of water in the upper atmosphere formed in the simulations also caused the layer of sunlight-blocking soot to be removed abruptly after lingering for years, a finding that surprised the research team. As the soot began to settle out of the stratosphere, the air began to cool. This cooling, in turn, caused water vapor to condense into ice particles, which washed even more soot out of the atmosphere. As a result of this feedback loop -- cooling causing precipitation that caused more cooling -- the thinning soot layer disappeared in just a few months.

Challenging the model

While the scientists think the new study gives a robust picture of how large injections of soot into the atmosphere can affect the climate, they also caution that the study has limitations.

For example, the simulations were run in a model of modern-day Earth, not a model representing what Earth looked like during the Cretaceous Period, when the continents were in slightly different locations. The atmosphere 66 million years ago also contained somewhat different concentrations of gases, including higher levels of carbon dioxide.

Additionally, the simulations did not try to account for volcanic eruptions or sulfur released from the Earth's crust at the site of the asteroid impact, which would have resulted in an increase in light-reflecting sulfate aerosols in the atmosphere.

The study also challenged the limits of the computer model's atmospheric component, known as the Whole Atmosphere Community Climate Model (WACCM).

"An asteroid collision is a very large perturbation -- not something you would normally see when modeling future climate scenarios," Bardeen said. "So the model was not designed to handle this and, as we went along, we had to adjust the model so it could handle some of the event's impacts, such as warming of the stratosphere by over 200 degrees Celsius."

These improvements to WACCM could be useful for other types of studies, including modeling a "nuclear winter" scenario. Like global wildfires millions of years ago, the explosion of nuclear weapons could also inject large amounts of soot into the atmosphere, which could lead to a temporary global cooling.

"The amount of soot created by nuclear warfare would be much less than we saw during the K-Pg extinction," Bardeen said. "But the soot would still alter the climate in similar ways, cooling the surface and heating the upper atmosphere, with potentially devastating effects."


The University Corporation for Atmospheric Research manages the National Center for Atmospheric Research under sponsorship by the National Science Foundation. Any opinions, findings and conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.



Recognizing when two arithmetic expressions are essentially the same



In this article, I discuss “twenty-four puzzles”. The puzzle «4 6 7 9 ⇒ 24» means that one should take the numbers 4, 6, 7, and 9, and combine them with the usual arithmetic operations of addition, subtraction, multiplication, and division, to make the number 24. In this case the unique solution is !!6·\frac{7 + 9}{4}!!. This is a continuation of my previous articles on this topic:

My first cut at writing a solver for twenty-four puzzles was a straightforward search program. It had a couple of hacks in it to cut down the search space by recognizing that !!a+E!! and !!E+a!! are the same, but other than that there was nothing special about it and I've discussed it before.

It would quickly and accurately report whether any particular twenty-four puzzle was solvable, but as it turned out that wasn't quite good enough. The original motivation for the program was this: Toph and I play this game in the car. Pennsylvania license plates have three letters and four digits, and if we see a license plate FBV 2259 we try to solve «2 2 5 9 ⇒ 24». Sometimes we can't find a solution and then we wonder: is it because there isn't one, or is it because we just didn't get it yet? So the searcher turned into a phone app, which would tell us whether there was a solution, so we'd know whether to give up or keep searching.

But this wasn't quite good enough either, because after we would find that first solution, say !!2·(5 + 9 - 2)!!, we would wonder: are there any more? And here the program was useless: it would cheerfully report that there were three, so we would rack our brains to find another, fail, ask the program to tell us the answer, and discover to our disgust that the three solutions it had in mind were:

$$ 2 \cdot (5 + (9 - 2)) \\ 2 \cdot (9 + (5 - 2)) \\ 2 \cdot ((5 + 9) - 2) $$

The computer thinks these are different, because it uses different data structures to represent them. It represents them with an abstract syntax tree, which means that each expression is either a single constant, or is a structure comprising an operator and its two operand expressions—always exactly two. The computer understands the three expressions above as having these structures:
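
Sketched in ASCII (as stand-ins for the post's diagrams), the three trees are:

         ×                ×                ×
        / \              / \              / \
       2   +            2   +            2   −
          / \              / \              / \
         5   −            9   −            +   2
            / \              / \          / \
           9   2            5   2        5   9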

It's not hard to imagine that the computer could be taught to understand that the first two trees are equivalent. Getting it to recognize that the third one is also equivalent seems somewhat more difficult.

Commutativity and associativity

I would like the computer to understand that these three expressions should be considered “the same”. But what does “the same” mean? This problem is of a kind I particularly like: we want the computer to do something, but we're not exactly sure what that something is. Some questions are easy to ask but hard to answer, but this is the opposite: the real problem is to decide what question we want to ask. Fun!

Certainly some of the question should involve commutativity and associativity of addition and multiplication. If the only difference between two expressions is that one has !!a + b!! where the other has !!b + a!!, they should be considered the same; similarly !!a + (b + c)!! is the same expression as !!(a + b) + c!! and as !!(b + a) + c!! and !!b + (a + c)!! and so forth.

The «2 2 5 9 ⇒ 24» example above shows that commutativity and associativity are not limited to addition and multiplication. There are commutative and associative properties of subtraction also! For example, $$a+(b-c) = (a+b)-c$$ and $$(a+b)-c = (a-c)+b.$$ There ought to be names for these laws but as far as I know there aren't. (Sure, it's just commutativity and associativity of addition in disguise, but nobody explaining these laws to school kids ever seems to point out that subtraction can enter into it. They just observe that !!(a-b)-c ≠ a-(b-c)!!, say “subtraction isn't associative”, and leave it at that.)

Closely related to these identities are operator inversion identities like !!a-(b+c) = (a-b)-c!!, !!a-(b-c) = (a-b)+c!!, and their multiplicative analogues. I don't know names for these algebraic laws either.

One way to deal with all of this would be to build a complicated comparison function for abstract syntax trees that tried to transform one tree into another by applying these identities. A better approach is to recognize that the data structure is over-specified. If we want the computer to understand that !!(a + b) + c!! and !!a + (b + c)!! are the same expression, we are swimming upstream by using a data structure that was specifically designed to capture the difference between these expressions.

Instead, I invented a data structure, called an Ezpr (“Ez-pur”), that can represent expressions, but in a somewhat more natural way than abstract syntax trees do, and in a way that makes commutativity and associativity transparent.

An Ezpr has a simplest form, called its “canonical” or “normal” form. Two Ezprs represent essentially the same mathematical expression if they have the same canonical form. To decide if two abstract syntax trees are the same, the computer converts them to Ezprs, simplifies them, and checks to see if the resulting canonical forms are identical.

The Ezpr

Since associativity doesn't matter, we don't want to represent it. When we (humans) think about adding up a long column of numbers, we don't think about associativity because we don't add them pairwise. Instead we use an addition algorithm that adds them all at once in a big pile. We don't treat addition as a binary operation; we normally treat it as an operator that adds up the numbers in a list. The Ezpr makes this explicit: its addition operator is applied to a list of subexpressions, not to a pair. Both !!a + (b + c)!! and !!(a + b) + c!! are represented as the Ezpr

    SUM [ a b c # ]

which just says that we are adding up !!a!!, !!b!!, and !!c!!. (The # sign is just punctuation; ignore it for now.)

Similarly the Ezpr MUL [ a b c # ] represents the product of !!a!!, !!b!!, and !!c!!.

To handle commutativity, we want those [ a b c ] lists to be bags. Perl doesn't have a built-in bag object, so instead I used arrays and required that the array elements be in sorted order. (Exactly which sorted order doesn't really matter.)

Subtraction and division

This doesn't yet handle subtraction and division, and the way I chose to handle them is the only part of this that I think is at all clever. A SUM object has not one but two bags, one for the positive and one for the negative part of the expression. An expression like !!a - b + c - d!! is represented by the Ezpr:

SUM [ a c # b d ]

and this is also the representation of !!a + c - b - d!!, of !!c + a - d - b!!, of !!c - d + a - b!!, and of any other expression of the idea that we are adding up !!a!! and !!c!! and then deducting !!b!! and !!d!!. The # sign separates the terms that are added from those that are subtracted. (I am not happy with this notation, especially the crabbed-looking # sign, but I haven't found any I liked better.)

Either of the two bags may be empty, so for example !!a + b!! is just SUM [ a b # ].

Division is handled similarly. Here conventional mathematical notation does a little bit better than in the sum case: MUL [ a c # b d ] is usually written as !!\frac{ac}{bd}!!.
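
A rough Python analogue of this structure (my illustration; the post's implementation is in Perl, and these names are invented) represents a node as an operator tag plus two bags, here as Counter multisets over leaf symbols:

    from collections import Counter

    # SUM [ a c # b d ]  ~  ('SUM', Counter(pos), Counter(neg))
    def sum_node(pos, neg=()):
        return ('SUM', Counter(pos), Counter(neg))

    def mul_node(num, den=()):
        return ('MUL', Counter(num), Counter(den))

    # a - b + c - d and c + a - d - b build the identical structure:
    assert sum_node('ac', 'bd') == sum_node('ca', 'db')

Counter equality ignores order, which is exactly the commutativity we want. Nested subexpressions would additionally need a hashable canonical form, which is the role the sorted arrays play in the Perl code.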

Ezprs handle the associativity and commutativity of subtraction and division quite well. I pointed out earlier that subtraction has an associative law !!(a + b) - c = a + (b - c)!! even though it's not usually called that.
No code is required to understand that those two expressions are equal if they are represented as Ezprs, because they are represented by completely identical structures:

        SUM [ a b # c ]

Similarly there is a commutative law for subtraction: !!a + b - c = a - c + b!! and once again that same Ezpr does for both.

Ezpr laws

Ezprs are more flexible than binary trees. A binary tree can represent the expressions !!(a+b)+c!! and !!a+(b+c)!! but not the expression !!a+b+c!!. Ezprs can represent all three and it's easy to transform between them. Just as there are rules for building expressions out of simpler expressions, there are a few rules for combining and manipulating Ezprs.

Lifting and flattening

The most important transformation is lifting, which is the Ezpr version of the associative law. In the canonical form of an Ezpr, a SUM node may not have subexpressions that are also SUM nodes. If you have

  SUM [ a SUM [ b c # ] # … ]

you should lift the terms from the inner sum into the outer one:

  SUM [ a b c # … ]

effectively transforming !!a+(b+c)!! into !!a+b+c!!. More generally, in

   SUM [ a SUM [ b # c ]
       # d SUM [ e # f ] ]

we lift the terms from the inner Ezprs into the outer one:

   SUM [ a b f # c d e ]

This effectively transforms !!a + (b - c) - d - (e - f)!! to !!a + b + f - c - d - e!!.

Similarly, when a MUL node contains another MUL, we can flatten the structure.

Say we are converting the expression !!7 ÷ (3 ÷ (6 × 4))!! to an Ezpr. The conversion function is recursive and the naïve version computes this Ezpr:

      MUL [ 7 # MUL [ 3 # MUL [ 6 4 # ] ] ]

But then at the bottom level we have a MUL inside a MUL, so the 4 and 6 in the innermost MUL are lifted upward:

      MUL [ 7 # MUL [ 3 # 6 4 ] ]

which represents !!\frac7{\frac{3}{6\cdot 4}}!!. Then again we have a MUL inside a MUL, and again the subexpressions of the innermost MUL can be lifted:

      MUL [ 7 6 4 # 3 ]

which we can imagine as !!\frac{7·6·4}3!!.

The lifting only occurs when the sub-node has the same type as its parent; we may not lift terms out of a MUL into a SUM or vice versa.
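
Continuing the illustrative Python (with bags as plain lists so that nested nodes can be elements), the lifting rule is a small recursive pass; this is a sketch, not the post's Perl code:

    def lift(node):
        # Flatten nested nodes of the same type: the Ezpr associative law.
        # A node is (op, pos_bag, neg_bag); a bag is a list here.
        if not isinstance(node, tuple):
            return node                        # a constant leaf
        op, pos, neg = node
        bags = ([], [])                        # new (pos, neg)
        for source, sign in ((pos, 0), (neg, 1)):
            for child in map(lift, source):
                if isinstance(child, tuple) and child[0] == op:
                    _, cpos, cneg = child
                    bags[sign].extend(cpos)      # child's pos keeps the slot's sign
                    bags[1 - sign].extend(cneg)  # child's neg flips it
                else:
                    bags[sign].append(child)
        return (op, sorted(bags[0], key=repr), sorted(bags[1], key=repr))

    # SUM [ a SUM [ b # c ] # d SUM [ e # f ] ]  ->  SUM [ a b f # c d e ]
    t = ('SUM', ['a', ('SUM', ['b'], ['c'])], ['d', ('SUM', ['e'], ['f'])])
    assert lift(t) == ('SUM', ['a', 'b', 'f'], ['c', 'd', 'e'])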

Trivial nodes

The Ezpr SUM [ a # ] says we are adding up just one thing, !!a!!, and so it can be eliminated and replaced with just !!a!!. Similarly SUM [ # a ] can be replaced with the constant !!-a!!, if !!a!! is a constant. MUL can be handled similarly.

An even simpler case is SUM [ # ] which can be replaced by the constant 0; MUL [ # ] can be replaced with 1. These sometimes arise as a result of cancellation.

Cancellation

Consider the puzzle «3 3 4 6 ⇒ 24». My first solver found 49 solutions to this puzzle. One is !!(3 - 3) + (4 × 6)!!. Another is !!(4 + (3 - 3)) × 6!!. A third is !!4 × (6 + (3 - 3))!!.

I think these are all the same: the solution is to multiply the 4 by the 6, and to get rid of the threes by subtracting them to make a zero term. The zero term can be added onto the rest of the expression or to any of its subexpressions—there are ten ways to do this—and it doesn't really matter where.

This is easily explained in terms of Ezprs: If the same subexpression appears in both of a node's bags, we can drop it. For example, the expression !!(4 + (3 - 3)) × 6!! starts out as

    MUL [ 6 SUM [ 3 4 # 3 ] # ]

but the duplicate threes in SUM [ 3 4 # 3 ] can be canceled, to leave

    MUL [ 6 SUM [ 4 # ] # ]

The sum is now trivial, as described in the previous section, so can be eliminated and replaced with just 4:

    MUL [ 6 4 # ]

This Ezpr records the essential feature of each of the three solutions to «3 3 4 6 ⇒ 24» that I mentioned: they all are multiplying the 6 by the 4, and then doing something else unimportant to get rid of the threes.

Another solution to the same puzzle is !!(6 ÷ 3) × (4 × 3)!!. Mathematically we would write this as !!\frac63·4·3!! and we can see this is just !!6×4!! again, with the threes gotten rid of by multiplication and division, instead of by addition and subtraction. When converted to an Ezpr, this expression becomes:

    MUL [ 6 4 3 # 3 ]

and the matching threes in the two bags are cancelled, again leaving

    MUL [ 6 4 # ]

In fact there aren't 49 solutions to this puzzle. There is only one, with 49 trivial variations.
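
In the same illustrative Python, cancellation is a matter of removing matched pairs across a node's two bags:

    def cancel(node):
        # Drop subexpressions that appear in both bags, e.g.
        # MUL [ 6 4 3 # 3 ]  ->  MUL [ 6 4 # ].
        if not isinstance(node, tuple):
            return node
        op, pos, neg = node
        pos, neg = [cancel(x) for x in pos], [cancel(x) for x in neg]
        for x in list(pos):
            if x in neg:
                pos.remove(x)
                neg.remove(x)
        return (op, pos, neg)

    assert cancel(('MUL', [6, 4, 3], [3])) == ('MUL', [6, 4], [])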

Identity elements

In the preceding example, many of the trivial variations on the !!4×6!! solution involved multiplying some subexpression by !!\frac 33!!. When one of the input numbers in the puzzle is a 1, one can similarly obtain a lot of useless variations by choosing where to multiply the 1.

Consider «1 3 3 5 ⇒ 24»: We can make 24 from !!3 × (3 + 5)!!. We then have to get rid of the 1, but we can do that by multiplying it onto any of the five subexpressions of !!3 × (3 + 5)!!:

$$ 1 × (3 × (3 + 5)) \\ (1 × 3) × (3 + 5) \\ 3 × (1 × (3 + 5)) \\ 3 × ((1 × 3) + 5) \\ 3 × (3 + (1×5)) $$

These should not be considered different solutions. Whenever we see any 1's in either of the bags of a MUL node, we should eliminate them. The first expression above, !!1 × (3 × (3 + 5))!!, is converted to the Ezpr

 MUL [ 1 3 SUM [ 3 5 # ] # ]

but then the 1 is eliminated from the MUL node leaving

 MUL [ 3 SUM [ 3 5 # ] # ]

The fourth expression, !!3 × ((1 × 3) + 5)!!, is initially converted to the Ezpr

 MUL [ 3 SUM [ 5 MUL [ 1 3 # ] # ] # ]

When the 1 is eliminated from the inner MUL, this leaves a trivial MUL [ 3 # ] which is then replaced with just 3, leaving:

 MUL [ 3 SUM [ 5 3 # ] # ]

which is the same Ezpr as before.

Zero terms in the bags of a SUM node can similarly be dropped.

Multiplication by zero

One final case is that MUL [ 0 … # … ] can just be simplified to 0.

The question about what to do when there is a zero in the denominator is a bit of a puzzle. In the presence of division by zero, some of our simplification rules are questionable. For example, when we have MUL [ a # MUL [ b # c ] ], the lifting rule says we can simplify this to MUL [ a c # b ]—that is, that !!\frac a{\frac bc} = \frac{ac}b!!. This is correct, except that when !!b=0!! or !!c=0!! it may be nonsense, depending on what else is going on. But since zero denominators never arise in the solution of these puzzles, there is no issue in this application.

Results

The Ezpr module is around 200 lines of Perl code, including everything: the function that converts abstract syntax trees to Ezprs, functions to convert Ezprs to various notations (both MUL [ 4 # SUM [ 3 # 2 ] ] and 4 ÷ (3 - 2)), and the two versions of the normalization process described in the previous section. The normalizer itself is about 35 lines.

Associativity is taken care of by the Ezpr structure itself, and commutativity is not too difficult; as I mentioned, it would have been trivial if Perl had a built-in bag structure. I find it much easier to reason about transformations of Ezprs than abstract syntax trees. Many operations are much simpler; for example the negation ofSUM [ A # B ] is simply SUM [ B # A ]. Pretty-printing is also easier because the Ezpr better captures the way we write and think about expressions.

It took me a while to get the normalization tuned properly, but the results have been quite successful, at least for this problem domain. The current puzzle-solving program reports the number of distinct solutions to each puzzle. When it reports two different solutions, they are really different; when it fails to report the exact solution that Toph or I found, it reports one that is essentially the same. (There are some small exceptions, which I will discuss below.)

Since there is no specification for “essentially the same” there is no hope of automated testing. But we have been using the app for several months looking for mistakes, and we have not found any. If the normalizer failed to recognize that two expressions were essentially similar, we would be very likely to notice: we would be solving some puzzle, be unable to find the last of the solutions that the program claimed to exist, and then when we gave up and saw what it was we would realize that it was essentially the same as one of the solutions we had found. I am pretty confident that there are no errors of this type, but see “Arguable points” below.

A harder error to detect is whether the computer has erroneously conflated two essentially dissimilar expressions. To detect this we would have to notice that an expression was missing from the computer's solution list. I am less confident that nothing like this has occurred, but as the months have gone by I feel better and better about it.

I consider the problem of “how many solutions does this puzzle really have?” to have been satisfactorily solved. There are some edge cases, but I think we have identified them.

Code for my solver is on Github. The Ezpr code is in the Ezpr package in the Expr.pm file. This code is all in the public domain.

Some examples

The original program claims to find 35 different solutions to «4 6 6 6 ⇒ 24». The revised program recognizes that these are of only two types:

    !!4 × 6 × 6 ÷ 6!!        MUL [ 4 6 # ]
    !!(6 - 4) × (6 + 6)!!    MUL [ SUM [ 6 # 4 ] SUM [ 6 6 # ] # ]

Some of the variant forms of the first of those include:

$$ 6 × (4 + (6 - 6)) \\ 6 + ((4 × 6) - 6) \\ (6 - 6) + (4 × 6) \\ (6 ÷ 6) × (4 × 6) \\ 6 ÷ ((6 ÷ 4) ÷ 6) \\ 6 ÷ (6 ÷ (4 × 6)) \\ 6 × (6 × (4 ÷ 6)) \\ (6 × 6) ÷ (6 ÷ 4) \\ 6 ÷ ((6 ÷ 6) ÷ 4) \\ 6 × (6 - (6 - 4)) \\ 6 × (6 ÷ (6 ÷ 4)) \\ \ldots
$$

In an even more extreme case, the original program finds 80 distinct expressions that solve «1 1 4 6 ⇒ 24», all of which are trivial variations on !!4·6!!.

Of the 715 puzzles, 466 (65%) have solutions; for 175 of these the solution is unique. There are 3 puzzles with 8 solutions each («2 2 4 8 ⇒ 24», «2 3 6 9 ⇒ 24», and «2 4 6 8 ⇒ 24»), one with 9 solutions («2 3 4 6 ⇒ 24»), and one with 10 solutions («2 4 4 8 ⇒ 24»).

The 10 solutions for «2 4 4 8 ⇒ 24» are as follows:

    !!4 × 8 - 2 × 4!!        SUM [ MUL [ 4 8 # ] # MUL [ 2 4 # ] ]
    !!4 × (2 + 8 - 4)!!      MUL [ 4 SUM [ 2 8 # 4 ] # ]
    !!(8 - 4) × (2 + 4)!!    MUL [ SUM [ 8 # 4 ] SUM [ 2 4 # ] # ]
    !!4 × (4 + 8) ÷ 2!!      MUL [ 4 SUM [ 4 8 # ] # 2 ]
    !!(4 - 2) × (4 + 8)!!    MUL [ SUM [ 4 # 2 ] SUM [ 4 8 # ] # ]
    !!8 × (2 + 4/4)!!        MUL [ 8 SUM [ 1 2 # ] # ]
    !!2 × 4 × 4 - 8!!        SUM [ MUL [ 2 4 4 # ] # 8 ]
    !!8 + 2 × (4 + 4)!!      SUM [ 8 MUL [ 2 SUM [ 4 4 # ] # ] # ]
    !!4 + 4 + 2 × 8!!        SUM [ 4 4 MUL [ 2 8 # ] # ]
    !!4 × (8 - 4/2)!!        MUL [ 4 SUM [ 8 # MUL [ 4 # 2 ] ] # ]

A complete listing of every essentially different solution to every «a b c d ⇒ 24» puzzle is available here. There are 1,063 solutions in all.

Arguable points

There are a few places where we have not completely pinned down what it means for two solutions to be essentially the same; I think there is room for genuine disagreement.

  1. Any solution involving !!2×2!! can be changed into a slightly different solution involving !!2+2!! instead. These expressions are arithmetically different but numerically equal. For example, I mentioned earlier that «2 2 4 8 ⇒ 24» has 8 solutions. But two of these are !! 8 + 4 × (2 + 2)!! and !! 8 + 4 × 2 × 2!!. I am willing to accept these as essentially different. Toph, however, disagrees.

  2. A similar but more complex situation arises in connection with «1 2 3 7 ⇒ 24». Consider !!3×7+3!!, which equals 24. To get a solution to «1 2 3 7 ⇒ 24», we can replace either of the threes in !!3×7+3!! with !!(1+2)!!, obtaining !!((1 + 2) × 7) + 3!! or !! (3×7)+(1 +2)!!. My program considers these to be different solutions. Toph is unsure.

It would be pretty easy to adjust the normalization process to handle these the other way if the user wanted that.

Some interesting puzzles

«1 2 7 7 ⇒ 24» has only one solution, quite unusual. (Spoiler) «2 2 6 7 ⇒ 24» has two solutions, both somewhat unusual. (Spoiler)

Somewhat similar to «1 2 7 7 ⇒ 24» is «3 9 9 9 ⇒ 24» which also has an unusual solution. But it has two other solutions that are less surprising. (Spoiler)

«1 3 8 9 ⇒ 24» has an easy solution but also a quite tricky solution. (Spoiler)

One of my neighbors has the license plate JJZ 4631. «4 6 3 1 ⇒ 24» is one of the more difficult puzzles.

What took so long?

Back in March, I wrote:

I have enough material for at least three or four more articles about this that I hope to publish here in the coming weeks.

But the previous article on this subject ended similarly, saying

I hope to write a longer article about solvers in the next week or so.

and that was in July 2016, so don't hold your breath.

And here we are, five months later!

This article was a huge pain to write. Sometimes I sit down to write something and all that comes out is dreck. I sat down to write this one at least three or four times and it never worked. The tortured Git history bears witness. In the end I had to abandon all my earlier drafts and start over from scratch, writing a fresh outline in an empty file.

But perseverance paid off! WOOOOO.




Sonos: users must accept new privacy policy or devices may “cease to function”


Sonos has confirmed that existing customers will not be given an option to opt out of its new privacy policy, leaving customers with sound systems that may eventually "cease to function".

It comes as the home sound system maker prepares to begin collecting audio settings, error data, and other account data before the launch of its smart speaker integration in the near future.

A spokesperson for the home sound system maker told ZDNet that, "if a customer chooses not to acknowledge the privacy statement, the customer will not be able to update the software on their Sonos system, and over time the functionality of the product will decrease."

"The customer can choose to acknowledge the policy, or can accept that over time their product may cease to function," the spokesperson said.

News of the changes was announced to customers in an email last week.

But the company's move to disallow any existing customer a way to opt-out of the policy has riled many who commented in various tweets and Reddit threads. It comes as the company becomes the latest tech firm to offer a new privacy policy for its users, which governs how the company collects its customers' data and shares it with partners.

Sonos said that users "can opt out of submitting certain types of personal information to the company; for instance, additional usage data such as performance and activity information."

But users will not be able to switch off data that the company considers necessary for each Sonos device to perform its basic functions.

That "functional data" includes email addresses, IP addresses, and account login information -- as well as device data, information about Wi-Fi antennas and other hardware information, room names, and error data.

The move has drawn ire from several privacy and policy experts.

"Sonos is a perfect illustration of how effective privacy, when it comes to not just services but also physical objects, requires more than just 'more transparency' -- it also requires choices and effective controls for users," said Joe Jerome, a policy analyst at the Center for Democracy & Technology.

"We're going to see this more and more where core services for things that people paid for are going to be conditioned on accepting ever-evolving privacy policies and terms of use," he said. "That's not going to be fair unless companies start providing users with meaningful choices and ensure that basic functionality continues if users say no to new terms."

Lee Tien, a senior staff attorney at the Electronic Frontier Foundation, said it was a "growing" problem among the consumer electronics space.

"[Device] makers obviously can do a lot about the problem," said Tien. "They can design their systems to separate more data collection side from product feature. Obviously some features don't work without data but even so, you can often choose to store data locally and not transmit it to some mothership somewhere."

"Society as a whole continues down a path where devices in your home, traditionally our most private space, are largely controlled by other people who want to know what you're doing," he said.

Sonos isn't the only company under scrutiny for changes to its privacy policy.

Plex, a software multi-platform media server, also told customers last week that it would begin collecting more data on its users, limited to non-identifiable device data. Like Sonos, the company did not allow users to opt-out of the changes.

After users complained on several Reddit threads and across social media that their only option was to no longer use the services, Plex reversed course.

It now proposes to "make it even more clear that we don't collect data that tells us what is in your library," says an updated page on its website.


Understanding Asymmetric Numeral Systems


Published on August 20, 2017; updated on August 22, 2017; tags: Haskell, Mathematics, Probability

Apparently, Google is trying to patent (an application of) Asymmetric Numeral Systems, so I spent some time today learning what it is.

In its essence lies a simple and beautiful idea.

ANS is a lossless compression algorithm. Its input is a list of symbols from some finite set. Its output is a positive integer. Each symbol \(s\) has a fixed known probability \(p_s\) of occurring in the list. The algorithm tries to assign each list a unique integer so that the more probable lists get smaller integers.

If we ignore the compression part (assigning smaller integers to more probable inputs), the encoding could be done as follows: convert each symbol to a number from \(0\) to \(B-1\) (where \(B\) is the number of symbols), add a leading 1 to avoid ambiguities caused by leading zeros, and interpret the list as an integer written in a base-\(B\) positional system.

This encoding process is an iterative/recursive algorithm:

  1. Start with the number 1;
  2. If the current number is \(n\), and the incoming symbol corresponds to a number \(s\), update the number to be \(s + n\cdot B\).

The decoding process is a corecursive algorithm:

  1. Start with the number that we are decoding;
  2. Split the current number \(n\) into the quotient and remainder modulo \(B\);
  3. Emit the remainder and continue decoding the quotient;
  4. Stop when the current number reaches 1.

(The decoding is LIFO: the first decoded element will be the last encoded element.)
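
In Python (my sketch; the article's own implementation below is in Haskell), the two procedures are a fold and an unfold:

    B = 3  # number of symbols; a symbol is an int in 0..B-1

    def encode(symbols):
        n = 1                      # the leading 1 avoids leading-zero ambiguity
        for s in symbols:
            n = s + n * B
        return n

    def decode(n):
        out = []
        while n != 1:
            n, s = divmod(n, B)
            out.append(s)
        return out[::-1]           # decoding is LIFO, so reverse

    assert decode(encode([2, 0, 1, 1])) == [2, 0, 1, 1]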

This encoding scheme relies on the standard isomorphism between the sets \(\{0,\ldots,B-1\}\times \mathbb{Z}_{\geq 1}\) and \(\mathbb{Z}_{\geq B}\), established by the functions

\[f(s,n) = s + n\cdot B;\]
\[g(n) = (n \bmod B, \lfloor n/B \rfloor).\]

(The peculiar domain and codomain of this isomorphism are chosen so that we have \(\forall n,s.\;f(s,n) > n\); this ensures that the decoding process doesn’t get stuck.)

We can represent this in Haskell as

{-# LANGUAGE ScopedTypeVariables, TypeApplications,
             NamedFieldPuns, AllowAmbiguousTypes #-}

import Data.Ord
import Data.List
import Numeric.Natural

data Iso a b = Iso
  { to :: a -> b
  , from :: b -> a
  }

encode :: Iso (s, Natural) Natural -> [s] -> Natural
encode Iso{to} = foldl' (\acc s -> to (s, acc)) 1

decode :: Iso (s, Natural) Natural -> Natural -> [s]
decode Iso{from} = reverse . unfoldr
  (\n -> if n == 1 then Nothing else Just $ from n)

And the standard isomorphism which we used in the simple encoding process is

std_iso :: forall s . (Bounded s, Enum s) => Iso (s, Natural) Natural
std_iso = Iso
  (\(s, n) -> s2n s + base @s * n)
  (\n -> (n2s $ n `mod` base @s, n `div` base @s))

s2n :: forall s . (Bounded s, Enum s) => s -> Natural
s2n s = fromIntegral $
  ((fromIntegral . fromEnum) s             :: Integer) -
  ((fromIntegral . fromEnum) (minBound @s) :: Integer)

n2s :: forall s . (Bounded s, Enum s) => Natural -> s
n2s n = toEnum . fromIntegral $
  (fromIntegral n + (fromIntegral . fromEnum) (minBound @s) :: Integer)

base :: forall s . (Bounded s, Enum s) => Natural
base = s2n (maxBound @s) + 1

(The functions are more complicated than they have to be to support symbol types like Int. Int does not start at 0 and is prone to overflow.)

Let’s now turn to the general form of the isomorphism

\[f \colon \{0,\ldots,B-1\}\times \mathbb{Z}_{\geq 1} \to \mathbb{Z}_{\geq \beta};\]\[g \colon \mathbb{Z}_{\geq \beta} \to \{0,\ldots,B-1\}\times \mathbb{Z}_{\geq 1}.\]

(In general, \(\beta\), the smallest value of \(f\), does not have to equal \(B\), the number of symbols.)

If we know (or postulate) that the second component of \(g\), \(g_2\colon \mathbb{Z}_{\geq \beta} \to \mathbb{Z}_{\geq 1}\), is increasing, then we can recover it from the first component, \(g_1\colon \mathbb{Z}_{\geq \beta} \to \{0,\ldots,B-1\}\).

Indeed, for a given \(s=g_1(n)\), \(g_2\) must be the unique increasing isomorphism from \[A_s = \{f(s,m)\mid m\in\mathbb{Z}_{\geq 1}\} = \{n\mid n\in\mathbb{Z}_{\geq \beta}, g_1(n) = s\}\] to \(\mathbb{Z}_{\geq 1}\). To find \(g_2(n)\), count the number of elements in \(A_s\) that are \(\leq n\).

Similarly, we can recover \(f\) from \(g_1\). To compute \(f(s,n)\), take \(n\)th smallest number in \(A_s\).

In Haskell:

ans_iso :: forall s . Eq s => (Natural, Natural -> s) -> Iso (s, Natural) Natural
ans_iso (b, classify) = Iso{to, from}
  where
    to :: (s, Natural) -> Natural
    to (s, n) = [ k | k <- [b..], classify k == s ] `genericIndex` (n - 1)

    from :: Natural -> (s, Natural)
    from n =
      let s  = classify n
          n' = genericLength [ () | k <- [b..n], classify k == s ]
      in (s, n')

For every function \(g_1\colon \mathbb{Z}_{\geq \beta} \to \{0,\ldots,B-1\}\) (named classify in Haskell), we have a pair of encode/decode functions, provided that each of the sets \(A_s\) is infinite. In particular, we can get the standard encode/decode functions (originally defined by std_iso) by setting classify to

classify_mod_base :: forall s . (Bounded s, Enum s) => (Natural, Natural -> s)
classify_mod_base = (base @s, \n -> n2s (n `mod` base @s))

By varying \(g_1\) (and therefore the sets \(A_s\)), we can control which inputs get mapped to smaller integers.

If \(A_s\) is more dense, \(f(s,n)\), defined as \(n\)th smallest number in \(A_s\), will be smaller.

If \(A_s\) is more sparse, \(f(s,n)\) will be larger.

The standard isomorphism makes the sets \[A_s = \{ s+n\cdot B \mid n\in \mathbb Z_{\geq 1} \} \] equally dense for all values of \(s\). This makes sense when all \(s\) are equally probable.

But in general, we should make \(A_s\) denser for those \(s\) that are more frequent. Specifically, we want

\[ \frac{|\{k\in A_s \mid k \leq x\}|}{x} \approx p_s. \]

Substituting \(x=f(s,n)\), for which the left-hand count is exactly \(n\), gives \(n/f(s,n) \approx p_s\), i.e. \(f(s,n) \approx n/p_s\), and hence \(\log_2 f(s,n) \approx \log_2 n + \log_2 (1/p_s)\). This means that adding a symbol \(s\) costs \(\log_2 (1/p_s)\) bits, which is what we should strive for.

Here’s a simple example of a suitable \(g_1\):

classify_prob :: Show s => (Bounded s, Enum s) => [Double] -> (Natural, Natural -> s)
classify_prob probs =
  let beta = 2 -- arbitrary number > 1
      t = genericLength l
      l = concatMap (\(s, t) -> replicate t s)
        . sortBy (comparing (Down . snd))
        . zip [minBound..maxBound]
        $ map (round . (/ minimum probs)) probs
      g1 n = l `genericIndex` ((n - beta) `mod` t)
  in (beta, g1)

This is a periodic function. It computes the number of times each symbol \(s\) will appear within a single period as \(k_s=\mathrm{round}(p_s/\min \{p_s\})\). The number \(p_s/\min \{p_s\}\) is chosen for its following two properties:

  1. it is proportional to the probability of the symbol, \(p_s\);
  2. it is \(\geq 1\), so that even the least likely symbol occurs among the values of the function.

The function then works by mapping the first \(k_0\) numbers to symbol \(0\), the next \(k_1\) numbers to symbol \(1\), and so on, until it maps \(k_{B-1}\) numbers to symbol \(B-1\) and repeats itself. The period of the function is \(\sum_s k_s\approx 1/\min \{p_s\}\).

classify_prob rearranges the symbols in the order of decreasing probability, which gives further advantage to the more probable symbols. This is probably the best strategy if we want to allocate integers in blocks; a better way would be to interleave the blocks in a fair or random way in order to keep the densities more uniform.

Another downside of this function is that its period may be too small to distinguish between similar probabilities, such as 0.4 and 0.6. The function used in rANS is better in this regard; it uses progressively larger intervals, which provide progressively better approximations.

But classify_prob is enough to demonstrate the idea. Let’s encode a list of booleans where True is expected 90% of the time.

> iso = ans_iso $ classify_prob [0.1,0.9]
> encode iso (replicate 4 True)
5
> encode iso (replicate 4 False)
11111
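To see where these numbers come from (assuming, as the outputs suggest, that encode starts from 1 and repeatedly applies to): with \(\beta=2\) and period 10, \(A_{\mathrm{True}}=\{2,\ldots,10,12,\ldots,20,\ldots\}\) is dense, so four Trues step through \(1\mapsto 2\mapsto 3\mapsto 4\mapsto 5\). \(A_{\mathrm{False}}=\{11,21,31,\ldots\}\) is sparse, so four Falses blow up: \(1\mapsto 11\mapsto 111\mapsto 1111\mapsto 11111\), each step taking the current number as the index into \(A_{\mathrm{False}}\).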

Four Trues compress much better than four Falses. Let’s also compare the number of bits in 11111 with the number of bits that information theory predicts are needed to encode four events with probability 0.1:

> logBase 2 11111
13.439701045971955
> 4 * logBase 2 (1/0.1)
13.28771237954945

Not bad.

The implementation of ANS in this article is terribly inefficient, especially its decoding part, mostly because the isomorphism uses brute force search instead of computation. The intention is to elucidate what the encoding scheme looks like and where it comes from. An efficient implementation of ANS and its different variants is an interesting topic in itself, but I’ll leave it for another day.

The full code (including tests) is available here.

Thanks to /u/sgraf812 for pointing out a mistake in a previous version of classify_prob.

A Brief Survey of Deep Reinforcement Learning


(Submitted on 19 Aug 2017)

Abstract: Deep reinforcement learning is poised to revolutionise the field of AI and represents a step towards building autonomous systems with a higher level understanding of the visual world. Currently, deep learning is enabling reinforcement learning to scale to problems that were previously intractable, such as learning to play video games directly from pixels. Deep reinforcement learning algorithms are also applied to robotics, allowing control policies for robots to be learned directly from camera inputs in the real world. In this survey, we begin with an introduction to the general field of reinforcement learning, then progress to the main streams of value-based and policy-based methods. Our survey will cover central algorithms in deep reinforcement learning, including the deep $Q$-network, trust region policy optimisation, and asynchronous advantage actor-critic. In parallel, we highlight the unique advantages of deep neural networks, focusing on visual understanding via reinforcement learning. To conclude, we describe several current areas of research within the field.
Comments: To appear in IEEE Signal Processing Magazine, Special Issue on Deep Learning for Image Understanding
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as: arXiv:1708.05866 [cs.LG]
 (or arXiv:1708.05866v1 [cs.LG] for this version)
From: Kai Arulkumaran [view email]
[v1] Sat, 19 Aug 2017 15:55:31 GMT (3565kb,D)

Bitdefender Anti-Virus: Heap Buffer Overflow via 7z LZMA


A few days after having published the post about the Bitdefender stack buffer overflow via 7z PPMD, I discovered a new bug in Bitdefender’s product. While this is a 7z bug, too, it has nothing to do with the previous bug or with the PPMD codec. Instead, it concerns dynamic memory management. In contrast to the previous post, which described an arbitrary free vulnerability in F-Secure’s anti-virus product, this post presents the first heap buffer overflow of this blog series.

Introduction

For the write-up on the 7z PPMD bug, I read a lot of the original 7-Zip source code and discovered a few new things that looked promising to investigate in anti-virus products. Therefore, I took another stab at analyzing Bitdefender’s 7z module.

I previously wrote about relaxed file processing. The Bitdefender 7z PPMD stack buffer overflow was a good example of relaxed file processing by removing a check (that is, removing code).

This bug demonstrates another fundamental difficulty that arises when incorporating new code into an existing code base. In particular, a minimal set of changes to the new code is often inevitable. Mostly, this affects memory allocation and code that is concerned with file access, especially if a totally different file abstraction is used. The presented bug is an example of the former type of difficulty. More specifically, an incorrect use of a memory allocation function that extends the 7-Zip source code in Bitdefender’s 7z module causes a heap buffer overflow.

Getting Into the Details

When Bitdefender’s 7z module discovers an EncodedHeader in a 7z archive, it tries to decompress it with the LZMA decoder. Their code seems to be based on 7-Zip, but they made a few changes. Loosely speaking, the extraction of a 7z EncodedHeader is implemented as follows:

  1. Read the unpackSize from the 7z EncodedHeader.
  2. Allocate unpackSize bytes.
  3. Use the C API of the LZMA decoder that comes with 7-Zip and let it decompress the stream.
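In code, these steps boil down to something like the following sketch (my own minimal reconstruction for illustration; apart from SZ_AllocBuffer and the unpackSize, the names are hypothetical and the real module is certainly more involved):

#include <cstdint>
#include <cstdlib>

// Sketch of the allocator: note the 32-bit size parameter.
void *SZ_AllocBuffer(void **resultptr, uint32_t size) {
    *resultptr = std::malloc(size);
    return *resultptr;
}

// Sketch of the caller: unpackSize comes straight from the 7z EncodedHeader.
int extract_encoded_header(uint64_t unpackSize) {
    void *dest;
    // The 64-bit unpackSize is implicitly truncated to 32 bits here:
    // (1ULL << 32) + 1 becomes 1, so a 1-byte buffer is allocated.
    if (SZ_AllocBuffer(&dest, unpackSize) == nullptr)
        return -1;
    // The LZMA decoder is then told it may write up to the full 64-bit
    // unpackSize bytes into dest: a heap buffer overflow.
    std::free(dest);
    return 0;
}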

The following snippet shows how the allocation function is called:

1DD02A845FA lea     rcx, [rdi+128h] //<-------- result
1DD02A84601 mov     rbx, [rdi+168h]
1DD02A84608 mov     [rsp+128h], rsi
1DD02A84610 mov     rsi, [rax+10h]
1DD02A84614 mov     [rsp+0E0h], r15
1DD02A8461C mov     edx, [rsi]      //<-------- size
1DD02A8461E call    SZ_AllocBuffer

Recall the x64 calling convention. In particular, the first two integer arguments (from left to right) are passed via rcx and rdx.

SZ_AllocBuffer is a function within the Bitdefender 7z module. It has two arguments:

  • The first argument result is a pointer to which the result (a pointer to the allocated buffer in case of success or NULL in case of a failure) is written.
  • The second argument size is the allocation size.

Let us look at the function’s implementation.

260ED3025D0 SZ_AllocBuffer proc near
260ED3025D0
260ED3025D0 mov     [rsp+8], rbx
260ED3025D5 push    rdi
260ED3025D6 sub     rsp, 20h
260ED3025DA mov     rbx, rcx
260ED3025DD mov     edi, edx //<-------- edi holds size
260ED3025DF mov     rcx, [rcx]
260ED3025E2 test    rcx, rcx
260ED3025E5 jz      short loc_260ED3025EC
260ED3025E7 call    near ptr irrelevant_function
260ED3025EC
260ED3025EC loc_260ED3025EC:
260ED3025EC cmp     edi, 0FFFFFFFFh  //<------- {*}
260ED3025EF jbe     short loc_260ED302606
260ED3025F1 xor     ecx, ecx
260ED3025F3 mov     [rbx], rcx
260ED3025F6 mov     eax, ecx
260ED3025F8 mov     [rbx+8], ecx
260ED3025FB mov     rbx, [rsp+30h]
260ED302600 add     rsp, 20h
260ED302604 pop     rdi
260ED302605 retn
260ED302606 ; ------------------------------------
260ED302606
260ED302606 loc_260ED302606:                        
260ED302606 mov     rcx, rdi  //<------ set size argument for mymalloc
260ED302609 call    mymalloc  // [rest of the function omitted]

Note that mymalloc is just a wrapper function that eventually calls malloc and returns the result.

Apparently, the programmer expected the size argument of SZ_AllocBuffer to be of a type with size greater than 32 bits. Obviously, it is only a 32-bit value.

It is funny to see that the compiler failed to optimize away the comparison at {*}, given that its result is only used for an unsigned comparison jbe. If you have any hints on why this might happen, I’d be very interested to hear them.

After SZ_AllocBuffer returns, the function LzmaDecode is called:

LzmaDecode(Byte *dest, SizeT *destLen, const Byte *src, SizeT *srcLen, /* further arguments omitted */)

Note that dest is the buffer allocated with SZ_AllocBuffer and destLen is supposed to be a pointer to the buffer’s size.

In the reference implementation, SizeT is defined as size_t. Interestingly, Bitdefender’s 7z module uses a 64-bit type for SizeT in both the 32-bit and the 64-bit version, making both versions vulnerable to this bug. I suspect that this is the result of an effort to create identical behavior for the 32-bit and 64-bit versions of the engine.

The LZMA decoder extracts the given src stream and writes (up to) *destLen bytes to the dest buffer, where *destLen is the 64-bit unpackSize from the 7z EncodedHeader. This results in a neat heap buffer overflow.

Triggering the Bug

To trigger the bug, we create a 7z LZMA stream containing the data we want to write on the heap. Then, we construct a 7z EncodedHeader with a Folder that has an unpackSize of (1<<32) + 1. This should make the function SZ_AllocBuffer allocate a buffer of 1 byte.
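To double-check the truncation arithmetic (a standalone snippet, purely illustrative):

#include <cstdint>
#include <cstdio>

int main() {
    uint64_t unpackSize = (1ULL << 32) + 1;  // from the crafted EncodedHeader
    uint32_t size = unpackSize;              // what SZ_AllocBuffer effectively receives
    std::printf("%llu truncates to %u\n",
                (unsigned long long)unpackSize, size);
    // prints: 4294967297 truncates to 1
    return 0;
}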

That sounds nice, but does this actually work?

0:000> g
!Heap block at 1F091472D40 modified at 1F091472D51 past requested size of 1
(2f8.14ec): Break instruction exception - code 80000003 (first chance)
ntdll!RtlpNtMakeTemporaryKey+0x435e:
00007ff9`d849c4ce cc              int     3

0:000> db 1F091472D51
000001f0`91472d51  59 45 53 2c 20 54 48 49-53 20 57 4f 52 4b 53 ab  YES, THIS WORKS.

Attacker Control and Exploitation

The attacker can write completely arbitrary data to the heap without any restriction. A file system minifilter is used to scan all files that touch the disk, making this vulnerability easily exploitable remotely, for example by sending an email with a crafted file as attachment to the victim.

Moreover, the engine runs unsandboxed and as NT Authority\SYSTEM. Hence, this bug is highly critical. However, since ASLR and DEP are in place, successful exploitation for remote code execution might require another bug (e.g. an information leak) to bypass ASLR.

Note also that Bitdefender’s engine is licensed to many different anti-virus vendors, all of which could be affected by this bug.

The Fix

The patched version of the function SZ_AllocBuffer looks as follows:

1E0CEA52AE0 SZ_AllocBuffer proc near
1E0CEA52AE0
1E0CEA52AE0 mov     [rsp+8], rbx
1E0CEA52AE5 mov     [rsp+10h], rsi
1E0CEA52AEA push    rdi
1E0CEA52AEB sub     rsp, 20h
1E0CEA52AEF mov     esi, 0FFFFFFFFh
1E0CEA52AF4 mov     rdi, rdx  //<-----rdi holds the size
1E0CEA52AF7 mov     rbx, rcx
1E0CEA52AFA cmp     rdx, rsi  //<------------{1}
1E0CEA52AFD jbe     short loc_1E0CEA52B11
1E0CEA52AFF xor     eax, eax
1E0CEA52B01 mov     rbx, [rsp+30h]
1E0CEA52B06 mov     rsi, [rsp+38h]
1E0CEA52B0B add     rsp, 20h
1E0CEA52B0F pop     rdi
1E0CEA52B10 retn
1E0CEA52B11 ; -----------------------------------
1E0CEA52B11
1E0CEA52B11 loc_1E0CEA52B11: 
1E0CEA52B11 mov     rcx, [rcx]
1E0CEA52B14 test    rcx, rcx
1E0CEA52B17 jz      short loc_1E0CEA52B1E
1E0CEA52B19 call    near ptr irrelevant_function
1E0CEA52B1E
1E0CEA52B1E loc_1E0CEA52B1E: 
1E0CEA52B1E cmp     edi, esi  //<------------{2}
1E0CEA52B20 jbe     short loc_1E0CEA52B29
1E0CEA52B22 xor     ecx, ecx
1E0CEA52B24 mov     [rbx], rcx
1E0CEA52B27 jmp     short loc_1E0CEA52B3B
1E0CEA52B29 ; -----------------------------------
1E0CEA52B29
1E0CEA52B29 loc_1E0CEA52B29:
1E0CEA52B29 mov     ecx, edi
1E0CEA52B2B call    near ptr mymalloc  // [rest of the function omitted]

Most importantly, we see that the function’s second argument size has been changed to a 64-bit type.

Note that at {1}, a check ensures that the passed size is not greater than 0xFFFFFFFF.

At {2}, the value of rdi is guaranteed to be at most 0xFFFFFFFF, hence it suffices to use the 32-bit register edi. However, just as in the original version (see above), it is useless to compare this 32-bit value once more to 0xFFFFFFFF and it is a mystery to me why the compiler does not optimize this away.

Using a full 64-bit type for the second argument size resolves the described bug.

Conclusion

In a nutshell, the discovered bug is a 64-bit value size being passed to the allocation function SZ_AllocBuffer which looks roughly like this:

void* SZ_AllocBuffer(void *resultptr, uint32_t size);

Assuming that the size is not explicitly cast, the compiler should throw a warning of the following kind:

warning C4244: 'argument': conversion from 'uint64_t' to 'uint32_t', possible loss of data

Note that in Microsoft’s MSVC compiler, this is a Level2 warning (Level1 being the lowest and Level4 being the highest level). Hence, this bug most likely could have been avoided simply by taking compiler warnings seriously.

For a critical codebase such as the engine of an anti-virus product, it would be adequate to treat warnings as errors, at least up to a warning level of 2 or 3.

Nevertheless, this general type of bug shows that even if only a few lines of additional code are necessary to incorporate external code (such as the 7-Zip code) into a code base, those very lines can be particularly prone to error.

Do you have any comments, feedback, doubts, or complaints? I would love to hear them. You can find my email address on the about page.

Alternatively, you are invited to join the discussion on HackerNews or on /r/netsec.

Timeline of Disclosure

  • 07/24/2017 - Discovery
  • 07/24/2017 - Report
  • 07/24/2017 - “Thank you for your report, we will investigate and come back with an answer.”
  • 08/22/2017 - “confirm that this vulnerability has been patched” and “will have an internal discussion about the reward”
  • ??/??/2017 - Bug bounty paid

Thanks & Acknowledgements

I want to thank Bitdefender and especially Marius for their response as well as for fixing the bug.

Why is this C++ code faster than my hand-written assembly (2016)


If you think a 64-bit DIV instruction is a good way to divide by two, then no wonder the compiler's asm output beat your hand-written code, even with -O0 (compile fast, no extra optimization, and store/reload to memory after/before every C statement so a debugger can modify variables).

See Agner Fog's Optimizing Assembly guide to learn how to write efficient asm. He also has instruction tables and a microarch guide for specific details for specific CPUs. See also the tag wiki for more perf links.

See also this more general question about beating the compiler with hand-written asm: Is inline assembly language slower than native C++ code?. TL:DR: yes if you do it wrong (like this question). But usually you're fine letting the compiler do its thing. Also see is assembly faster than compiled languages?. One of the answers links to these neat slides showing how various C compilers optimize some really simple functions with cool tricks.


even:
    mov rbx, 2
    xor rdx, rdx
    div rbx

On Intel Haswell, div r64 is 36 uops, with a latency of 32-96 cycles, and a throughput of one per 21-74 cycles. (Plus the 2 uops to set up RBX and zero RDX, but out-of-order execution can run those early). High-uop-count instructions like DIV are microcoded, which can also cause front-end bottlenecks. In this case, latency is the most relevant factor because it's part of a loop-carried dependency chain.

shr rax, 1 does the same unsigned division: It's 1 uop, with 1c latency, and can run 2 per clock cycle.

For comparison, 32-bit division is faster, but still horrible vs. shifts. idiv r32 is 9 uops, 22-29c latency, and one per 8-11c throughput on Haswell.


As you can see from looking at gcc's asm output (with -O0 on the Godbolt compiler explorer), it doesn't use any divide instructions in its implementation of sequence(), just shifts. clang -O0 does actually compile naively like you thought, even using 64-bit IDIV twice. (When optimizing, compilers do use both outputs of IDIV when the source does a division and modulus with the same operands, if they use IDIV at all)

gcc doesn't have a totally-naive mode; it always transforms through GIMPLE, which means some "optimizations" can't be disabled. This includes recognizing division-by-constant and using shifts (power of 2) or a fixed-point multiplicative inverse (non power of 2) to avoid IDIV (see div_by_13 in the above godbolt link).
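As a minimal illustration (my own example, echoing the div_by_13 case mentioned above):

// Optimizing compilers (and gcc even at -O0, per the above) turn this into a
// multiply by the fixed-point inverse 0x4EC4EC4F = ceil(2^34 / 13) plus a
// shift, with no division instruction at all.
unsigned div_by_13(unsigned x) {
    return x / 13;
}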

gcc -Os (optimize for size) does use IDIV for non-power-of-2 division, instead of multiplicative inverses (unfortunately even in cases where the multiplicative inverse code is only slightly larger but much slower).


(summary for this case: use uint64_t n)

First of all, it's only interesting to look at optimized compiler output. (-O3). -O0 speed is basically meaningless.

Look at the asm output (on Godbolt, or see How to remove "noise" from GCC/clang assembly output?). When the compiler doesn't make optimal code in the first place, writing your C/C++ source in a way that guides the compiler into making better code is usually the best approach. You have to know asm and know what's efficient; you just apply it indirectly. Compilers are also a good source of ideas: sometimes clang will do something cool, and you can hand-hold gcc into doing the same thing (see this answer and what I did with the non-unrolled loop in @Veedrac's code below).

This approach is portable, and in 20 years some future compiler can compile it to whatever is efficient on future hardware (x86 or not), maybe using a new ISA extension or auto-vectorizing. Hand-written x86-64 asm from 15 years ago would usually not be optimally tuned for Skylake. e.g. compare&branch macro-fusion didn't exist back then. What's optimal now for hand-crafted asm for one microarchitecture might not be optimal for other current and future CPUs. Comments on @johnfound's answer discuss major differences between AMD Bulldozer and Intel Haswell, which have a big effect on this code. But in theory, g++ -O3 -march=bdver3 and g++ -O3 -march=skylake will do the right thing. (Or g++ -O3 -march=native when host = target.) (Or -mtune=... to just tune, without using instructions that other CPUs might not support.)

My feeling is that guiding the compiler to asm that's good for a current CPU you care about shouldn't be a problem for future compilers. They're hopefully better than current compilers at finding ways to transform code, and can find a way that works for future CPUs. Regardless, future x86 probably won't be terrible at anything that's good on current x86, and the future compiler will avoid any asm-specific pitfalls while implementing something like the data movement from your C source, if it doesn't see something better.

Hand-written asm is a black-box for the optimizer, so constant-propagation doesn't work when inlining makes an input a compile-time constant. Other optimizations are also affected. Read https://gcc.gnu.org/wiki/DontUseInlineAsm before using asm. (And avoid MSVC-style inline asm: inputs/outputs have to go through memory which adds overhead.)

In this case: your n has a signed type, and gcc uses the SAR/SHR/ADD sequence that gives the correct rounding. (IDIV and arithmetic-shift "round" differently for negative inputs, see the SAR insn set ref manual entry). (IDK if gcc tried and failed to prove that n can't be negative, or what. Signed-overflow is undefined behaviour, so it should have been able to.)

You should have used uint64_t n, so it can just SHR. And so it's portable to systems where long is only 32-bit (e.g. x86-64 Windows). (I made this change in the version I linked on Godbolt).
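For reference, a sketch of the loop under discussion with the unsigned type (my reconstruction for illustration, not the asker's exact code):

#include <cstdint>

// Collatz step count.  With uint64_t, n/2 is a plain logical shift (SHR);
// with a signed type the compiler has to emit the SAR/SHR/ADD rounding fixup.
unsigned sequence(uint64_t n) {
    unsigned count = 1;
    while (n != 1) {
        n = (n % 2) ? 3 * n + 1 : n / 2;
        ++count;
    }
    return count;
}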


BTW, gcc's optimized asm output looks pretty good (using unsigned long n): the inner loop it inlines into main() does this:

 # from gcc5.4 `-O3`, plus my comments

 # edx= count=1
 # rax= uint64_t n

.L9:                   # do{
    lea    rcx, [rax+1+rax*2]   # rcx = 3*n + 1
    mov    rdi, rax
    shr    rdi         # rdi = n>>1;
    test   al, 1       # set flags based on n%2 (aka n&1)
    mov    rax, rcx
    cmove  rax, rdi    # n= (n%2) ? 3*n+1 : n/2;
    add    edx, 1      # ++count;
    cmp    rax, 1 
    jne   .L9          #}while(n!=1)

  cmp/branch to update max and maxi, and then do the next n

The inner loop is branchless, and the critical path of the loop-carried dependency chain is:

  • 3-component LEA (3 cycles)
  • cmov (2 cycles on Haswell, 1c on Broadwell or later).

Total: 5 cycles per iteration, latency bottleneck. Out-of-order execution takes care of everything else in parallel with this (in theory: I haven't tested with perf counters to see if it really runs at 5c/iter).

The FLAGS input of cmov (produced by TEST) is faster to produce than the RAX input (from LEA->MOV), so it's not on the critical path.

Similarly, the MOV->SHR that produces CMOV's RDI input is off the critical path, because it's also faster than the LEA. MOV on IvyBridge and later has zero latency (handled at register-rename time). (It still takes a uop, and a slot in the pipeline, so it's not free, just zero latency). The extra MOV in the LEA dep chain is part of the bottleneck on other CPUs.

The cmp/jne is also not part of the critical path: it's not loop-carried, because branch prediction takes control dependencies out of the critical path.


gcc did a pretty good job here. It could save one code byte by using inc edx instead of add edx, 1, because nobody cares about P4 and its false-dependencies for partial-flag-modifying instructions.

It could also save all the MOV instructions, and the TEST: SHR sets CF= the bit shifted out, so we can use cmovc instead of test / cmovz.

 ### Hand-optimized version of what gcc does
.L9:                       #do{
    lea     rcx, [rax+1+rax*2] # rcx = 3*n + 1
    shr     rax            # n>>=1;  // in-place.  CF = n&1 = n%2
    cmovc   rax, rcx       # n= (n&1) ? 3*n+1 : n/2;
    inc     edx            # ++count;
    cmp     rax, 1
    jne     .L9            #}while(n!=1)

See @johnfound's answer for another clever trick: remove the CMP and branch on SHR's flag result: zero only if n was 1 (or 0) to start with.

This doesn't help with the latency at all on Haswell. It does help significantly on CPUs like Intel pre-IvB, and AMD Bulldozer-family, where MOV is not zero-latency. The compiler's wasted MOV instructions do affect the critical path. BD's complex-LEA and CMOV are both lower latency (2c and 1c respectively), so it's a bigger fraction of the latency. Also, throughput bottlenecks become an issue, because it only has two integer ALU pipes. See @johnfound's answer, where he has timing results from an AMD CPU.

Even on Haswell, this version may help a bit by avoiding some occasional delays where a non-critical uop steals an execution port from one on the critical path, delaying execution by 1 cycle. (This is called a resource conflict). It also saves a register, which may help when doing multiple n values in parallel in an interleaved loop (see below).

LEA's latency depends on the addressing mode, on Intel SnB-family CPUs. 3c for 3 components ([base+idx+const], which takes two separate adds), but only 1c with 2 or fewer components (one add). Some CPUs (like Core2) do even a 3-component LEA in a single cycle, but SnB-family doesn't. Worse, Intel SnB-family standardizes latencies so there are no 2c uops, otherwise 3-component LEA would be only 2c like Bulldozer. (3-component LEA is slower on AMD as well, just not by as much).

So lea rcx, [rax + rax*2] / inc rcx is only 2c latency, faster than lea rcx, [rax + rax*2 + 1], on Intel SnB-family CPUs like Haswell. Break-even on BD, and worse on Core2. It does cost an extra uop, which normally isn't worth it to save 1c latency, but latency is the major bottleneck here and Haswell has a wide enough pipeline to handle the extra uop throughput.

Neither gcc, icc, nor clang (on godbolt) used SHR's CF output, always using an AND or TEST. Silly compilers. :P They're great pieces of complex machinery, but a clever human can often beat them on small-scale problems. (Given thousands to millions of times longer to think about it, of course! Compilers don't use exhaustive algorithms to search for every possible way to do things, because that would take too long when optimizing a lot of inlined code, which is what they do best. They also don't model the pipeline in the target microarchitecture; they just use some heuristics.)


Simple loop unrolling won't help; this loop bottlenecks on the latency of a loop-carried dependency chain, not on loop overhead / throughput. This means it would do well with hyperthreading (or any other kind of SMT), since the CPU has lots of time to interleave instructions from two threads. This would mean parallelizing the loop in main, but that's fine because each thread can just check a range of n values and produce a pair of integers as a result.

Interleaving by hand within a single thread might be viable, too. Maybe compute the sequence for a pair of numbers in parallel, since each one only takes a couple registers, and they can all update the same max / maxi. This creates more instruction-level parallelism.

The trick is deciding whether to wait until all the n values have reached 1 before getting another pair of starting n values, or whether to break out and get a new start point for just one that reached the end condition, without touching the registers for the other sequence. Probably it's best to keep each chain working on useful data, otherwise you'd have to conditionally increment its counter.


You could maybe even do this with SSE packed-compare stuff to conditionally increment the counter for vector elements where n hadn't reached 1 yet. And then to hide the even longer latency of a SIMD conditional-increment implementation, you'd need to keep more vectors of n values up in the air. Maybe only worth it with a 256b vector (4x uint64_t).

I think the best strategy to make detection of a 1 "sticky" is to mask the vector of all-ones that you add to increment the counter. So after you've seen a 1 in an element, the increment-vector will have a zero, and +=0 is a no-op.

Untested idea for manual vectorization

# starting with YMM0 = [ n_d, n_c, n_b, n_a ]  (64-bit elements)
# ymm4 = _mm256_set1_epi64x(1):  increment vector
# ymm5 = all-zeros:  count vector

.inner_loop:
    vpaddq    ymm1, ymm0, ymm0              # ymm1 = 2*n
    vpaddq    ymm1, ymm1, ymm0              # ymm1 = 3*n
    vpaddq    ymm1, ymm1, set1_epi64(1)     # ymm1= 3*n + 1.  Maybe could do this more efficiently?

    vpsllq    ymm3, ymm0, 63                # shift the low bit (n&1) to the sign bit

    vpsrlq    ymm0, ymm0, 1                 # n /= 2

    # There may be a better way to do this blend, avoiding the bypass delay for an FP blend between integer insns, not sure.  Probably worth it
    vpblendvpd ymm0, ymm0, ymm1, ymm3       # variable blend controlled by the sign bit of each 64-bit element.  I might have the source operands backwards, I always have to look this up.

    # ymm0 = updated n  in each element.

    vpcmpeqq ymm1, ymm0, set1_epi64(1)
    vpandn   ymm4, ymm1, ymm4         # zero out elements of ymm4 where the compare was true

    vpaddq   ymm5, ymm5, ymm4         # count++ in elements where n has never been == 1

    vptest   ymm4, ymm4
    jnz  .inner_loop
    # Fall through when all the n values have reached 1 at some point, and our increment vector is all-zero

    vextracti128 xmm0, ymm5, 1
    vpmaxq .... crap this doesn't exist
    # Actually just delay doing a horizontal max until the very very end.  But you need some way to record max and maxi.

You can and should implement this with intrinsics, instead of hand-written asm.
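Here's a hedged sketch of what that might look like with AVX2 intrinsics (untested, mirroring the untested asm idea above; the function name and the omitted horizontal-max cleanup are my own assumptions):

#include <immintrin.h>
#include <cstdint>

// Run four Collatz chains in parallel.
// counts[i]: steps counted while lane i had not yet reached 1
// (the step that lands on 1 is not counted -- adjust to taste).
void collatz4(const uint64_t start[4], uint64_t counts[4]) {
    __m256i n   = _mm256_loadu_si256((const __m256i *)start);
    __m256i one = _mm256_set1_epi64x(1);
    __m256i inc = one;                         // increment vector (ymm4 above)
    __m256i cnt = _mm256_setzero_si256();      // count vector (ymm5 above)

    do {
        __m256i t = _mm256_add_epi64(_mm256_add_epi64(n, n), n);
        t = _mm256_add_epi64(t, one);                   // t = 3*n + 1
        __m256i ctrl = _mm256_slli_epi64(n, 63);        // low bit -> sign bit
        __m256i half = _mm256_srli_epi64(n, 1);         // n / 2
        // n = (n & 1) ? 3*n+1 : n/2, blend controlled by the sign bit
        n = _mm256_castpd_si256(_mm256_blendv_pd(
                _mm256_castsi256_pd(half), _mm256_castsi256_pd(t),
                _mm256_castsi256_pd(ctrl)));
        // "sticky" termination: once a lane hits 1, its increment stays 0
        __m256i done = _mm256_cmpeq_epi64(n, one);
        inc = _mm256_andnot_si256(done, inc);
        cnt = _mm256_add_epi64(cnt, inc);               // count++ where still running
    } while (!_mm256_testz_si256(inc, inc));            // until all increments are 0

    _mm256_storeu_si256((__m256i *)counts, cnt);
}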


Besides just implementing the same logic with more efficient asm, look for ways to simplify the logic, or avoid redundant work. e.g. memoize to detect common endings to sequences. Or even better, look at 8 trailing bits at once (gnasher's answer)

@EOF points out that tzcnt (or bsf) could be used to do multiple n/=2 iterations in one step. That's probably better than SIMD vectorizing, because no SSE or AVX instruction can do that. It's still compatible with doing multiple scalar ns in parallel in different integer registers, though.

So the loop might look like this:

goto loop_entry;  // C++ structured like the asm, for illustration only
do {
   n = n*3 + 1;
  loop_entry:
   shift = _tzcnt_u64(n);
   n >>= shift;
   count += shift;
} while(n != 1);

This may do significantly fewer iterations, but variable-count shifts are slow on Intel SnB-family CPUs without BMI2. 3 uops, 2c latency. (They have an input dependency on the FLAGS because count=0 means the flags are unmodified. They handle this as a data dependency, and take multiple uops because a uop can only have 2 inputs (pre-HSW/BDW anyway)). This is the kind that people complaining about x86's crazy-CISC design are referring to. It makes x86 CPUs slower than they would be if the ISA was designed from scratch today, even in a mostly-similar way. (i.e. this is part of the "x86 tax" that costs speed / power.) SHRX/SHLX/SARX (BMI2) are a big win (1 uop / 1c latency).

It also puts tzcnt (3c on Haswell and later) on the critical path, so it significantly lengthens the total latency of the loop-carried dependency chain. It does remove any need for a CMOV, or for preparing a register holding n>>1, though. @Veedrac's answer overcomes all this by deferring the tzcnt/shift for multiple iterations, which is highly effective (see below).

We can safely use BSF or TZCNT interchangeably, because n can never be zero at that point. TZCNT's machine-code decodes as BSF on CPUs that don't support BMI1. (Meaningless prefixes are ignored, so REP BSF runs as BSF).

TZCNT performs much better than BSF on AMD CPUs that support it, so it can be a good idea to use REP BSF, even if you don't care about setting ZF if the input is zero rather than the output. Some compilers do this when you use __builtin_ctzll even with -mno-bmi.

They perform the same on Intel CPUs, so just save the byte if that's all that matters. TZCNT on Intel (pre-Skylake) still has a false-dependency on the supposedly write-only output operand, just like BSF, to support the undocumented behaviour that BSF with input = 0 leaves its destination unmodified. So you need to work around that unless optimizing only for Skylake, so there's nothing to gain from the extra REP byte. (Intel often goes above and beyond what the x86 ISA manual requires, to avoid breaking widely-used code that depends on something it shouldn't, or that is retroactively disallowed. e.g. Windows 9x assumed no speculative prefetching of TLB entries, which was safe when the code was written, before Intel updated the TLB management rules.)

Anyway, LZCNT/TZCNT on Haswell have the same false dep as POPCNT: see this Q&A. This is why in gcc's asm output for @Veedrac's code, you see it breaking the dep chain with xor-zeroing on the register it's about to use as TZCNT's destination, when it doesn't use dst=src. Since TZCNT/LZCNT/POPCNT never leave their destination undefined or unmodified, this false dependency on the output on Intel CPUs is purely a performance bug / limitation. Presumably it's worth some transistors / power to have them behave like other uops that go to the same execution unit. The only software-visible upside is in the interaction with another microarchitectural limitation: they can micro-fuse a memory operand with an indexed addressing mode on Haswell, but on Skylake where Intel removed the false dependency for LZCNT/TZCNT they "un-laminate" indexed addressing modes while POPCNT can still micro-fuse any addr mode.


@hidefromkgb's answer has a nice observation that you're guaranteed to be able to do one right shift after a 3n+1. You can compute this even more efficiently than just leaving out the checks between steps. The asm implementation in that answer is broken, though (it depends on OF, which is undefined after SHRD with a count > 1), and slow: ROR rdi,2 is faster than SHRD rdi,rdi,2, and using two CMOV instructions on the critical path is slower than an extra TEST that can run in parallel.

I put tidied / improved C (which guides the compiler to produce better asm), and tested+working faster asm (in comments below the C) up on Godbolt: see the link in @hidefromkgb's answer. (this answer hit the 30k char limit from the large Godbolt URLs. I could post a second answer for these code-tweak parts of this answer, but that link is not out of place in the other answer. And no, I don't want to shorten them because shortlinks can rot, and couldn't anyway because they exceed goo.gl's length limit.)

Also improved the output-printing to convert to a string and make one write() instead of writing one char at a time. This minimizes impact on timing the whole program with perf stat ./collatz (to record performance counters), and I de-obfuscated some of the non-critical asm.


@Veedrac's code

I got a very small speedup from right-shifting as much as we know needs doing, and checking to continue the loop. From 7.5s for limit=1e8 down to 7.275s, on Core2Duo (Merom), with an unroll factor of 16.

code + comments on Godbolt. Don't use this version with clang; it does something silly with the defer-loop. Using a tmp counter k and then adding it to count later changes what clang does, but that slightly hurts gcc.

See discussion in comments: Veedrac's code is excellent on CPUs with BMI1 (i.e. not Celeron/Pentium)

The Synthesis Kernel (1988) [pdf]

Apple IIe Design Guidelines (1982) [pdf]


Apple Scales Back Its Ambitions for a Self-Driving Car


Apple’s testing vehicles will carry employees between its various Silicon Valley offices. The new effort is called PAIL, short for Palo Alto to Infinite Loop, the address of the company’s main office in Cupertino, Calif., and a few miles down the road from Palo Alto, Calif.

Apple’s in-house shuttle service, which isn’t operational yet, follows Waymo, Uber and a number of car companies that have been testing driverless cars on city streets around the world.

Apple has a history of tinkering with a technology until its engineers figure out what to do with it. The company worked on touch screens for years, for example, before that technology became an essential part of the iPhone.

[Photo: Apple will soon be moving into a new Silicon Valley headquarters. Credit: Justin Sullivan/Getty Images]

But the initial scale of Apple’s driverless ambitions went beyond tinkering or building underlying technology. The Titan project started in 2014, and it was staffed by many Apple veterans. The company also hired engineers with expertise in building cars, and not just the software that would run an autonomous vehicle.

It was a do-it-all approach typical of Apple, which prefers to control every aspect of a product, from the software that runs it to the look and feel of the hardware.

From the beginning, the employees dedicated to Project Titan looked at a wide range of details. That included motorized doors that opened and closed silently. They also studied ways to redesign a car interior without a steering wheel or gas pedals, and they worked on adding virtual or augmented reality into interior displays.

The team also worked on a new light detection and ranging sensor, also known as lidar. Lidar sensors normally protrude from the top of a car like a spinning cone and are essential in driverless cars. Apple, as always focused on clean designs, wanted to do away with the awkward cone.

Apple even looked into reinventing the wheel. A team within Titan investigated the possibility of using spherical wheels — round like a globe — instead of the traditional, round ones, because spherical wheels could allow the car better lateral movement.

But the car project ran into trouble, said the five people familiar with it, dogged by its size and by the lack of a clearly defined vision of what Apple wanted in a vehicle. Team members complained of shifting priorities and arbitrary or unrealistic deadlines.

There was disagreement about whether Apple should develop a fully autonomous vehicle or a semiautonomous car that could drive itself for stretches but allow the driver to retake control.

Steve Zadesky, an Apple executive who was initially in charge of Titan, wanted to pursue the semiautonomous option. But people within the industrial design team including Jonathan Ive, Apple’s chief designer, believed that a fully driverless car would allow the company to reimagine the automobile experience, according to the five people.

A similar debate raged inside Google’s self-driving car effort for years. There, the fully autonomous vehicle won out, mainly because researchers worried drivers couldn’t be trusted to retake control in an emergency.

[Photo: Driverless cars are being tested on city streets throughout the country. Credit: Gene J. Puskar/Associated Press]

Even though Apple had not ironed out many of the basics, like how the autonomous systems would work, a team had already started working on an operating system software called CarOS. There was fierce debate about whether it should be programmed using Swift, Apple’s own programming language, or the industry standard, C++.

Mr. Zadesky, who worked on the iPod and iPhone, eventually left Titan and took a leave of absence from the company for personal reasons in 2016. He is still at Apple, although he is no longer involved in the project. Mr. Zadesky could not be reached for comment.

Last year, Apple started to rein in the project. The company tapped Bob Mansfield, a longtime executive who over the years had led hardware engineering for some of Apple’s most successful products, to oversee Titan.

Mr. Mansfield shelved plans to build a car and focused the project on the underlying self-driving technology. He also laid off some hardware staff, though the exact number of employees dedicated to working on car technology was unclear.

More recently, the team has grown again, adding personnel with expertise in autonomous systems, rather than car production.

Apple’s headlong foray into autonomous vehicles underscores one of the biggest challenges facing the company: finding the next breakthrough product. As Apple celebrates the iPhone’s 10th anniversary, the company remains heavily dependent on smartphone sales for growth. It has introduced new products like the Apple Watch and expanded revenue from services, but the iPhone still accounts for more than half of its sales.

In April, the California Department of Motor Vehicles granted Apple a test permit to allow the company to test autonomous driving technology in three 2015 Lexus RX 450h sport utility vehicles. There will be a safety driver monitoring the car during testing.

While many companies are pursuing driverless technology and see it as a game changer for car ownership and transportation, no one has figured out how to cash in yet.

With expectations reset and the team more focused, people on the Titan project said morale has improved under Mr. Mansfield. Still, one of the biggest challenges is holding onto talented engineers because self-driving technology is one of the hottest things in Silicon Valley, and Apple is hardly the only company working on it.


Show HN: The best time to visit any city

I wanted to build a tool to help people decide when and where to travel. As I started building, I realized that "when" and "where" need separate treatment to be most useful. The map tool handles "where" best:

https://championtraveler.com/travel-weather-map/

Clicking through each week would be frustrating for those who know where they want to travel but not when. For these people I built "best time to travel" pages using the same data.

https://championtraveler.com/best-time-to-travel/

My hope is this site will help travelers plan.

This data is taken from the National Oceanic and Atmospheric Administration's global summaries of the day (NOAA's GSOD). I used an SQL database to crunch the numbers into monthly and weekly averages by station. For the "best time" pages I calculated several more variables. I then imported the data into Tableau and added the filters you see on the map. I also used data from the State Department regarding travel advisories.

Would love your thoughts!

The whole buildout was a solo project, but I owe Ryan Whitacker a big "thank you" for his guidance. He built a similar tool on his site (https://decisiondata.org/the-best-time-to-visit-anywhere/) in April, and was generous to offer me guidance for expanding upon his idea.

Known issues:

* I am aware that the map is bad on mobile, so my next step is to improve the mobile experience.



Build-a-Coin Cryptocurrency Creator




basics

  • coin name

  • currency code

  • address identifier byte

  • testnet address identifier byte

  • multisig address identifier byte

  • testnet multisig address identifier byte

  • TCP port

  • testnet TCP port

  • JSON-RPC TCP port

  • testnet JSON-RPC TCP port
money supply

  • initial block reward

  • blocks until reward halves
transactions

  • blocks before mined coins can be spent

  • minimum sendable without dust fee

  • minimum sendable at all

  • largest tx in bytes without size fee
blockchain

  • desired seconds between blocks

  • starting difficulty

  • desired seconds to difficulty change

  • miner-configurable limit on block size in bytes

  • hard limit on block size in bytes

  • genesis block embedded message
governance

  • network alert signing pubkey

  • testnet alert signing pubkey

Winner-takes all effects in autonomous cars


There are now several dozen companies trying to make the technology for autonomous cars, across OEMs, their traditional suppliers, existing major tech companies and startups. Clearly, not all of these will succeed, but enough of them have a chance that one wonders what and where the winner-take-all effects could be, and what kinds of leverage there might be. Are there network effects that would allow the top one or two companies to squeeze the rest out, as happened in smartphone or PC operating systems? Or might there be room for five or ten companies to compete indefinitely? And for what layers in the stack does victory give power in other layers? 

These kinds of questions matter because they point to the balance of power in the car industry of the future. A world in which car manufacturers can buy commodity ‘autonomy in a box’ from any of half a dozen companies (or make it themselves), much as they buy ABS today, is very different from one in which Waymo and perhaps Uber are the only real options, and can set the business model of their choice, as Google did with Android. Microsoft and Intel found choke points in the PC world, and Google did in smartphones - what might those points be in autonomy?

To begin with, it seems pretty clear that the hardware and sensors for autonomy - and, probably, for electric - will be commodities. There is plenty of science and engineering in these (and a lot more work to do), just as there is in, say, LCD screens, but there is no reason why you have to use one rather than another just because everyone else is. There are strong manufacturing scale effects, but no network effect. So, LIDAR, for example, will go from a ‘spinning KFC bucket’ that costs $50k to a small solid-state widget at a few hundred dollars or less, and there will be winners within that segment, but there’s no network effect, while winning LIDAR doesn’t give leverage at other layers of the stack (unless you get a monopoly), any more than making the best image sensors (and selling them to Apple) helps Sony’s smartphone business. In the same way, it’s likely that batteries (and motors and battery/motor control) will be as much of a commodity as RAM is today - again, scale, lots of science and perhaps some winners within each category, but no broader leverage.

On the other hand, there probably won’t be direct parallels to the third party software developer ecosystems that we see in PCs or smartphones. Windows squashed the Mac and then iOS and Android squashed Windows Phone because of the virtuous circle of developer adoption above anything else, but you won’t buy a car (if you own a car at all, of course) based on how many apps you can run on it. They’ll all run Uber and Lyft and Didi, and have Netflix embedded in the screens, but any other apps will happen on your phone (or watch, or glasses).  

Rather, the place to look is not within the cars directly but still further up the stack - in the autonomous software that enables a car to move down a road without hitting anything, in the city-wide optimisation and routing that mean we might automate all cars as a system, not just each individual car, and in the on-demand fleets of 'robo-taxis' that will ride on all of this. The network effects in on-demand are self-evident, but they will get much more complex with autonomy (which will cut the cost of an on-demand ride by three quarters or more). On-demand robo-taxi fleets will dynamically pre-position their cars, and both these and quite possibly all other cars will co-ordinate their routes in real time for maximum efficiency, perhaps across fleets, to avoid, for example, all cars picking the same route at the same time. This in turn could be combined not just with surge pricing but with all sorts of differential road pricing - you might pay more to get to your destination faster in busy times, or pick an arrival time by price.

From a technological point of view, these three layers (driving, routing & optimisation, and on-demand) are largely independent - you could install the Lyft app in a GM autonomous car and let the pre-installed Waymo autonomy module drive people around, hypothetically. Clearly, some people hope there will be leverage across layers, or perhaps bundling - Tesla says that it plans to forbid people from using its autonomous cars with any on-demand service other than its own. This doesn't work the other way - Uber won't insist you use only its own autonomous systems. But though Microsoft cross-leveraged Office and Windows, both of these won in their own markets with their own network effects: a small OEM insisting you use its small robo-taxi service would be like Apple insisting you buy AppleWorks instead of Microsoft Office in 1995. I suspect that a more neutral approach might prevail. This would especially be the case if we have cross-city co-ordination of all vehicles, or even vehicle-to-vehicle communication at junctions - you would need some sort of common layer (though my bias is always towards decentralised systems). 

All this is pretty speculative, though, like trying to predict what traffic jams would look like from 1900. The one area where we can talk about what the key network effects might look like is in autonomy itself. This is about hardware, and sensors, and software, but mostly it's about data, and there are two sorts of data that matter for autonomy - maps and driving data. First, ‘maps.’ 

Our brains are continuously processing sensor data and building a 3D model of the world around us, in real time and quite unconsciously, such that when we run through a forest we don’t trip over a root or bang our head on a branch (mostly). In autonomy this is referred to as SLAM (Simultaneous Localisation And Mapping) - we map our surroundings and localise ourselves within them. This is obviously a basic requirement for autonomy - AVs need to work out where they are on the road and what features might be around (lanes, turnings, curbs, traffic lights etc), and they also need to work out what other vehicles are on the road and how fast they’re moving. 

Doing this in real time on a real road remains very hard. Humans drive using vision (and sound), but extracting a sufficiently accurate 3D model of your surroundings from imaging alone (especially 2D imaging) remains an unsolved problem: machine learning makes it conceivable but no-one can do it yet with the accuracy necessary for driving. So, we take shortcuts. This is why almost all autonomy projects are combining imaging with 360 degree LIDAR: each of these sensors has its limitations, but by combining them (‘sensor fusion’) you can get a complete picture. Building a model of the world around you with imaging alone will certainly be possible at some point in the future, but using more sensors gets you there a lot quicker, even given that you have to wait for the cost and form factor of those sensors to become practical. That is, LIDAR is a shortcut to get to a model of the world around you. Once you've got that, you often use machine learning to understand what's in it - that shape is a car, or a cyclist, but for this, there doesn't seem to be a network effect (or at least not a strong one): you can get enough images of cyclists yourself without needing a fleet of cars.

If LIDAR is one shortcut to SLAM, the other and more interesting one is to use prebuilt maps, which actually means ‘high-definition 3D models’. You survey the road in advance, process all the data at leisure, build a model of the street and then put it onto any car that’s going to drive down the road. The autonomous car doesn’t now have to process all that data and spot the turning or traffic light against all the other clutter in real-time at 65 miles an hour - instead it knows where to look for the traffic light, and it can take sightings of key landmarks against the model to localise itself on the road at any given time. So, your car uses cameras and LIDAR to work out where it is on the road and where the traffic signals etc are by comparing what it can see with a pre-built map instead of having to do it from scratch, and also uses those inputs to spot other vehicles around it in real time. 

Maps have network effects. When any autonomous car drives down a pre-mapped road, it is both comparing the road to the map and updating the map: every AV can also be a survey car. If you have sold 500,000 AVs and someone else has only sold 10,000, your maps will be updated more often and be more accurate, and so your cars will have less chance of encountering something totally new and unexpected and getting confused. The more cars you sell the better all of your cars are - the definition of a network effect. 

The risk here is that in the long term it is possible that just as cars could do SLAM without LIDAR, they could also do it without pre-built maps - after all, again, humans do. When and whether that would happen is unclear, but at the moment it appears that it would be long enough after autonomous cars go on sale that all the rest of the landscape might look quite different as well (that is, 🤷🏻‍♂️).  

So, maps are the first network effect in data - the second comes in what the car does once it understands its surroundings. Driving on an empty road, or indeed on a road full of other AVs, is one problem, once you can see it, but working out what the other humans on the road are going to do, and what to do about it, is another problem entirely. 

One of the breakthroughs supporting autonomy is that machine learning should work very well for this: instead of trying to write complex rules explaining how you think that people will behave, machine learning uses data - the more the better. The more data that you can collect of how real drivers behave and react in the real world (both other drivers and the drivers of your survey vehicles themselves), the better your software will be at understanding what is going on around it and the better it will be at planning what to do next. Just as for maps, before launch your test cars collect this data, but after launch, every car that you sell is collecting this data and sending it home. So, just as for maps, the more cars you sell the better all of your cars are - the definition of a network effect.

Driving data also has a secondary use: simulation. This seeks to solve the question “if X happens, how will our autonomous software react?” One way to do this is by making an AV and letting it drive itself around the city all day to see how it reacts to whatever random things any other drivers happen to do. The problem is that this is not a controlled experiment - you can’t rerun a scenario with new software to see what changes and whether any problems have been fixed. Hence, a great deal of effort is now going into simulation - you put your AV software into Grand Theft Auto (almost literally) and test it on whatever you want. This doesn’t necessarily capture some things (“will the LIDAR detect that truck?”), and some simulation scenarios would be circular, but it does tell you how your system will react to defined situations, and you can collect those situations from your real-world driving data. So, there is an indirect network effect: the more real world driving data that you have, the more accurate you can make your simulation and therefore the better you can make your software. There are also clear scale advantages to simulation, in how much computing resource you can afford to devote to this, how many people you have working on it, and how much institutional expertise you have in large computing projects. Being part of Google clearly gives Waymo an advantage: it reports driving 25,000 ‘real’ autonomous miles each week, but also one billion simulated miles in 2016 (an average of 19 million miles a week).

It could be argued that Tesla has a lead in both maps and driving data: since late 2016, those of its new vehicles whose buyers bought the ‘Autopilot’ add-on have eight cameras giving a near-360 degree field of view, supplemented by a forward-facing radar (there is also a set of ultrasonic sensors, which have pretty short range and are mostly used for parking). All of those can collect both mapping and driver behaviour data and send it back to Tesla, and it appears that Tesla has very recently begun actually collecting some of this. The catch is that since the radar only points forwards, Tesla will have to use imaging alone to build most of the model of the world around itself, but, as I noted above, we don’t yet know how to do that accurately. This means that Tesla is effectively collecting data that no-one today can read (or at least, read well enough to produce a complete solution). Of course, you would have to solve this problem both to collect the data and actually to drive the car, so Tesla is making a big contrarian bet on the speed of computer vision development. Tesla saves time by not waiting for cheap/practical LIDAR (it would be impossible for Tesla to put LIDAR on all of its cars today), but doing without LIDAR means the computer vision software will have to solve harder problems and so could well take longer. And if all the other parts of the software for autonomy - the parts that decide what the car should actually do - take long enough, then LIDAR might get cheap and practical long before autonomy is working anyway, making Tesla’s shortcut irrelevant. We’ll see. 

So, the network effects - the winner-takes-all effects - are in data: in driving data and in maps. This prompts two questions: who gets that data, and how much do you need? 

Ownership of the data is an interesting power and value chain question. Obviously Tesla plans to make all of the significant parts of the technology itself and put it in its own cars, so it owns the data as well. But some OEMs have argued that it’s their vehicle and their customer relationship, so it’s their data to own and allocate, and not for any technology partners. This looks like a reasonable position to take in regard to a sensor vendor: I’m not sure that it’s sustainable to sell commodity GPUs, cameras or LIDAR on their own and want to keep the data. But the company that makes the actual autonomous unit itself needs to have the data, because that is how it works. If you don’t cycle the data back into the technology it can’t improve. This means that the OEM is generating network value for a supplier without getting any of that value itself, except in the form of better autonomy, but that better autonomy becomes a commodity across all products from any OEM using it. This is the position of PC or Android OEMs: they create the network effect by agreeing to use the software in their products, and this makes it possible to sell their products, but their product has become a near-commodity with the network value going to the tech company. It's a virtuous circle where most of the value goes to the vendor, not the OEM. This of course is why most car OEMs want to make it themselves: they don’t want to end up like Compaq.

This leads me to the final question: how much data do you really need? Does the system get better more or less indefinitely as you add more data, or is there an S-Curve - is there a point at which adding more data has diminishing returns? 

That is - how strong is the network effect? 

This is a pretty obvious question for maps. What density of cars with what frequency do you need for the maps to be good enough, and what minimum market share does that translate to? How many participants does the market have room for? Could ten companies have this, or two? Could a bunch of second-tier OEMs get together and pool all of their mapping data? Can delivery trucks sell their data just as they sell other kinds of mapping data today? Again, this isn't like consumer software ecosystems - RIM and Nokia couldn't pool Blackberry and S60 user bases, but you could pool maps. Is this a barrier to entry or a condition of entry? 

This question also applies to driving data, and indeed to all machine-learning projects: at what point do you hit diminishing returns as you add more data, at what point does the curve flatten, and how many companies can get that amount of data? For, say, general-purpose search, the improvement does seem indefinite - the answers can (almost) always get more relevant. But for autonomy, intuitively, it does seem as though there should be a ceiling: if a car can drive in Naples for a year without ever getting confused, how much more is there to improve? At some point you’re effectively finished. So, a network effect means that your product gets better as you get more users, but how many users do you need before the product stops getting significantly better? How many cars do you need to sell before your autonomy is as good as the best on the market? How many companies might be able to reach that? And meanwhile, machine learning itself is changing quickly - one cannot rule out the possibility that the amount of data you need to get autonomy working might shrink dramatically. 
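To make the shape of that question concrete, here is a minimal sketch in Python of the power-law learning curves often reported for machine-learning systems. The functional form and every constant are invented for illustration - nothing here is measured from any real autonomy programme - but it shows how each extra order of magnitude of data can buy less and less improvement once you approach an irreducible floor:

# Hypothetical learning curve: error falls as a power law in data volume,
# bottoming out at an irreducible floor. All constants are invented.

def error_rate(n_miles: float, a: float = 2.0, b: float = 0.35,
               floor: float = 0.01) -> float:
    """Toy model: error ~ a * n^-b, clipped at `floor`."""
    return max(a * n_miles ** -b, floor)

prev = None
for n in [1e5, 1e6, 1e7, 1e8, 1e9]:
    err = error_rate(n)
    note = "" if prev is None else f" (gain: {prev - err:.4f})"
    print(f"{n:>13,.0f} miles -> error {err:.4f}{note}")
    prev = err

On these invented numbers, going from 100,000 to 1m miles cuts the error by more than half, while going from 100m to 1bn miles buys nothing at all - which is exactly the scenario in which a data lead stops being a moat.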

Implicit in all of this, finally, is an assumption that there is even such a thing as better and worse autonomy. But what would 'worse' autonomy mean? Would it mean you are slightly more likely to die, or just that the car is more likely to get confused, pull over to the side of the road and connect to a remote support centre for a human operator to take over? Would manual controls burst out of a console in a shower of polystyrene packaging, and would the car make encouraging comments?

The answer, I suspect, is that Level 5 will come as an evolution out of Level 4 - every car will have manual controls, but they will be used less and less, and explicit Level 5 will emerge in stages, as the manual controls shrink, then are hidden, and then removed - they atrophy. This will probably happen scenario by scenario - we might have Level 5 for Germany before Naples, or Moscow. This would mean that the data was being collected at network scale and put to use well before full autonomy. 

We can’t really know the answers to these questions now. Very few people in the field expect full, ‘Level 5’ autonomy within the next five years, and most tend closer to ten. However, they point to a range of outcomes that would have dramatically different implications for the car industry. At one extreme, it might be that network effects are relatively weak and there are five or ten companies with a viable autonomy platform. In this case, the car industry would buy autonomy as a component at a price much like ABS, airbags or satnav today. It would still face radical change - autonomy means the cost of an on-demand ride falls by at least three quarters, which would make many people reconsider car ownership, while the shift to electric reduces the number of moving parts in a car by five to ten times, totally changing the engineering dynamics, supplier base and barriers to entry. But it wouldn’t get Androided. At the other extreme, it might be that only Waymo gets it working, and then the industry would look very different. 

A used-car glut has depreciation accelerating


Car sales in the U.S. have now risen for seven consecutive years, and it’s denting the value of whatever is currently parked in your garage or driveway. With so many new cars rolling off dealership lots and instantly becoming used cars, the secondary market is glutted and the pace of depreciation is rapidly accelerating.

Your not-that-old car might not be a clunker quite yet, but it’s probably a lot closer than you think.

The average used car lost 17 percent of its value in the past 12 months, dropping from $18,400 to $15,300, according to data from Black Book, an auto analytics company. That annual depreciation figure has been increasing steadily, too: the average used car today depreciates nearly twice as fast as it did in 2014, when the annual rate was just 9.5 percent.
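As a sanity check, those Black Book figures can be reproduced in a couple of lines of Python (the values are taken straight from the paragraph above):

# Annual depreciation implied by the Black Book averages quoted above.
value_then, value_now = 18_400, 15_300  # average used-car value, 12 months apart

annual_rate = (value_then - value_now) / value_then
print(f"one-year depreciation: {annual_rate:.1%}")  # ~16.8%, the '17 percent' quoted

rate_2014 = 0.095  # the 2014 annual rate quoted above
print(f"vs 2014: {annual_rate / rate_2014:.2f}x")   # ~1.77x, i.e. 'nearly twice as fast'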

“We’ve got ourselves in an oversupply situation,” said Jim Hallett, chief executive officer of KAR Auction Services Inc., which sells about 5 million used cars every year. “Nobody is interested in stockpiling inventory right now.” Translation: If you’re trading in a used car, don’t expect to get much of a deal.

Certain segments are shedding value even more quickly. Subcompact cars, such as the Honda Fit, and large sedans, such as the Chevrolet Impala, are depreciating faster than average. Big SUVs, vans, and pickups are holding their value a little better, and imports tend to drop more quickly than domestic models.

The problem, of course, is supply. Seven consecutive years of increasing U.S. auto sales have put a glut of vehicles on the road. What’s more, an increasing share of those sales came with a lease, so there’s now a rising tide of machines flowing back onto the market when their three-year contracts run out. 

Automakers, having added manufacturing capacity, are also offering larger incentives on new vehicles just to maintain their record sales momentum. That puts downward pressure on the entire market, according to Hallett, even for used cars.

Consequently, the number of drivers who are upside down on their car loans is surging. Americans are paying—or trying to pay—108 million auto loans at the moment, according to the most recent Federal Reserve data. That represents roughly half of licensed drivers in the U.S. At the same time, 14 percent of Americans have a negative net worth. Among those who have more debts than assets, the Federal Reserve says auto loans make up between 10 percent and 23 percent of their total financial obligations.

Not surprisingly, KAR Auctions is seeing a rising number of repossessions. The company expects nearly 2 million vehicles to be seized by lenders this year and added to the used-car market, up from 1.1 million at the nadir of the last recession.

The increasing pace of depreciation is also bad news at the corporate level. Companies with huge fleets of cars and trucks—think dealerships and rental chains—are seeing their balance sheets tick down by the day. Consider Avis Budget Group: In the quarter ended June 30, the rental-car empire managed to rent more cars than in the year-earlier period, and at fairly stable prices. Yet its profit dropped 92 percent as it struggled to sell vehicles it wasn’t using. Expenses tied to vehicle depreciation and lease charges increased 12 percent in the quarter.

Hallett at KAR Auction said many fleet managers are in a similar pickle. At Hertz Global Holdings, for example, depreciation per vehicle was up 27 percent in the recent quarter; at the time, Hertz had 500,000 vehicles. “We call it losing money by volume,” Hallett said.

Avis, in response, has resorted to selling more cars directly to consumers, cutting out the middleman at dealerships to realize slightly higher prices. And it’s buying fewer 2018 models as it gears up for next year. 

The upside is that America is in the midst of a buyer’s market for used vehicles. In 2012, the average three-year-old vehicle was selling at 26 percent off its original sticker price on Cargurus.com, an online platform listing some 2.5 million vehicles. A three-year-old car is currently trading at a 34 percent discount from the sticker price, said Cargurus spokeswoman Amy Mueller.
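Those total discounts can be translated into an implied average annual rate; here is a short sketch, assuming constant-rate compounding (a simplification - real-world depreciation is front-loaded in the first year):

# Convert the total discount on a three-year-old car into an implied
# constant annual depreciation rate. Compounding is an assumption here.

def implied_annual_rate(total_discount: float, years: int = 3) -> float:
    retained = 1.0 - total_discount
    return 1.0 - retained ** (1.0 / years)

print(f"2012:  {implied_annual_rate(0.26):.1%} per year")  # ~9.6%
print(f"today: {implied_annual_rate(0.34):.1%} per year")  # ~12.9%

On that simplified model, the 2012 discount works out to roughly the 9.5 percent annual rate Black Book reported for 2014, while today's 34 percent discount implies annual depreciation approaching 13 percent.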

A cheap used car hasn’t been this cheap in quite some time. 
