Only a couple of weeks ago, there were a lot of news headlines about how Germany had banned an internet-connected doll called "Cayla" over fears hackers could target children. One of their primary concerns was the potential risk to the privacy of children:
conversations between the child and others can be recorded and forwarded
The Germans had a good point: kids' toys which record their voices and send the recordings up to the web pose some serious privacy risks. It's not that the risks are particularly any different to the ones you and I face every day with the volumes of data we produce and place online (and if you merely have a modern phone, that's precisely what you're doing), it's that our tolerances are very different when kids are involved. I've got young kids myself and frankly, I'm with the Germans on this one; I don't see a need for them to have things like their voices recorded and stored online. That's not to say I don't want them to have an online presence and I'm gradually exposing both of them to more and more modern internet things, but I don't particularly want innocent childish behaviour like playing with a toy to be recorded and stored on other people's computers.
Cayla isn't the first connected toy to raise concerns either. Just over a year ago it was "Hello Barbie" making the headlines for precisely the same reasons. Yes, it's a cool idea but no, I (and many others) don't want my kids exposed in that way. In fact, just before that, we had the VTech data breach which exposed a huge amount of very personal information after parents bought their kids connected tablets, joined them to the wifi network and created accounts for them. Those accounts were ultimately exposed and included the kids' names, genders, birth dates, photos and links to parent with full physical addresses. That should have been the wakeup call where we all said "hey, if we put our kids' data on the web, we need to expect it to be leaked", but evidently it hasn't stopped the flood of connected toy things.
Which brings us to CloudPets (a brand owned by Spiral Toys) which is a toy that represents the nexus of both the problems discussed above: kids' voices being recorded and their data consequently being leaked. The best way to understand what these guys do is to simply watch the video:
Now firstly, put yourself in the shoes of the average parent, that is one who's technically literate enough to know the wifi password but not savvy enough to understand how the "magic" of daddy talking to the kids through the bear (and vice versa) actually works. They don't necessarily realise that every one of those recordings – those intimate, heartfelt, extremely personal recordings – between a parent and their child is stored as an audio file on the web. They certainly wouldn't realise that in CloudPets' case, that data was stored in a MongoDB that was in a publicly facing network segment without any authentication required and had been indexed by Shodan (a popular search engine for finding connected things).
Unfortunately, things only went downhill from there. People found the exposed database online. Many people and the worrying thing is, it's highly unlikely anyone knows quite how many. The first I knew of it was when earlier last week, someone sent me data from the table holding the user accounts, about 583k records in total (this subsequently turned out to be a subset of the total number in the CloudPets service). I started going through my usual verification process to ensure it was legitimate and by pure coincidence, I was in the US running a private security workshop at the time and one of the guys in my class had a CloudPets account. Sure enough, his email address was in the breach and it was time-stamped Christmas day, the day his daughter had been given the toy. His record looked somewhat like these, the first few in the data I was given:
The password was stored as a bcrypt hash and to verify it was legitimate, he gave me his original password (I asked him to change it on CloudPets first) and I successfully validated that the hash against his record was the correct one (I'd previously validated the Dropbox data breach by doing the same thing with my wife's account). The data was real.
CloudPets left their database exposed publicly to the web without so much as a password to protect it.
Getting back to who sent it to me, this is someone who travels in data breach trading circles so I have no idea how far the data had actually circulated. However, I subsequently discovered that the database had definitely been accessed well beyond just this individual. But first, it gets worse still...
The guy who sent me the data had tried to contact CloudPets three times to warn them about the exposure. The first was on December 31 when he reached out to the email address listed on their support page but the message immediately bounced back. He subsequently followed up with the email address on their WHOIS record (it appears the WHOIS contact is a marketing company called On Demand):
4 days later after still not getting a response, he contacted their hosting provider:
Note the record count here – he'd identified "over 820k users" – the 583k in circulation was not the full amount. So 3 attempts to warn the organisation of a serious security vulnerability and not a single response. I've said many times before in many blog posts, public talks and workshops that one of the greatest difficulties I have in dealing with data breaches is getting a response from the organisation involved. Time and time again, there are extensive delays or no response at all from the very people that should be the most interested in incidents like this. If you run any sort of online service whatsoever, think about what's involved in ensuring someone can report this sort of thing to you because this whole story could have had a very different outcome otherwise. (For reference, check out Tesla's Security Vulnerability Reporting Policy which is beautiful in its simplicity.) But it gets worse still and this brings me back to the earlier point about multiple people having accessed the data...
Now knowing that not only was the data legitimate, it was highly likely to be circulating and the company in question wasn't responding to emails, I reached out to Lorenzo from Motherboard. I worked with Lorenzo on that VTech data breach I mentioned earlier and he's both someone who knows how this industry works and a guy I trust to be fair, accurate and responsible in dealing with these incidents. Journalists have a knack of getting responses from organisations and I had confidence that he'd do his utmost to approach CloudPets in an appropriate fashion and alert them to this incident before publishing anything. When he heard the details of the breach, he was immediately interested and took up the task of investigating it. And then he found these images in previous communications he'd had with someone else:
He had already been contacted about CloudPets! It wasn't just the images either, he'd received small samples of data from a selection of the tables in the exposed DB. Journos get a heap of communication from people about all number of things and a bunch of it turns out to be red herrings but in this case, it was exactly the same vulnerability identified by someone completely different. Not only that, but this individual had also contacted CloudPets on December 30 and sent this message:
I want to inform you that 45.79.147.159 is running a MongoDB instance which appears not to be correctly configured or protected by a firewall allowing connections via port 2701
Clearly, CloudPets weren't just ignoring my contact, they simply weren't even reading their emails.
4 attempts (that we know of) were made to contact CloudPets and warn them of this risk.
The images sent to Lorenzo both confirm the findings of the individual who sent me the data (it shows 821,296 records) and tell us new things about the extent of the exposure. For example, we can see 2,182,337 voice recordings in the system which seems to be a feasible number for 821k registered users. We can also see two databases of identical size at almost 10GB each, "cloudpets-staging" and "cloudpets-test". Assuming their names are self-explanatory, these support both a staging environment and a test environment although they're both facing the public web and have real customer data in them. This breaks the cardinal rule of never putting production data into a non-production system (read my post on test data done right for more on this). It also potentially exposes the production system (and production customer data) to developers building the software (another cardinal rule broken), but at this stage when it's entirely open to the internet anyway, that would be the least of their worries. The point is, what's disclosed in the images above suggests the problems go deeper than data exposure alone.
There are references to almost 2.2 million voice recordings of parents and their children exposed by databases that should never have contained production data.
But then I dug a little deeper and took a look at the mobile app:
This app communicates with a website at spiraltoys.s.mready.net which is on a domain owned by Romanian company named mReady. That URL is bound to a server with IP address 45.79.147.159, the exact same address the exposed databases were on. That's a production website there too because it's the one the mobile app is hitting so in other words, the test and staging databases along with the production website were all sitting on the one box. The most feasible explanation I can come up with for this is that one of those databases is being used for production purposes and the other non-production (a testing environment, for example).
I wanted to understand more about how the application was communicating with the server so I took a closer look at the traffic. With the help of a Have I been pwned (HIBP) subscriber I found in the data, I was added as a friend and was then able to observe how the app communicated with the back end services. One of the first things I noticed was that the profile picture I uploaded (I just screen-captured a bunny and used that), was stored in an Amazon S3 bucket:
My profile photo: https://cloudpets-prod.s3.amazonaws.com/d0b5962f8a168dbb1c7761bf795f929e_CB53FC23-9E61-4551-9661-BC252975A931.jpeg
As you can see by loading the image, all that's required to access the file is the path which is returned by the app every time my profile is loaded. That profile also contains other personal information; the data sent to Lorenzo shows that along with references to their profile photos, it contained the names of children and their day and month of birth (although not year). It also contains relationships to parents and "friends" (i.e. grandmother, uncle) that have been authorised to share messages with the child.
I was curious as to whether or not the voice recordings would demonstrate the same behaviour, so I left a test message for my new "friend" and discovered a similar pattern:
My voice recording: https://cloudpets-prod.s3.amazonaws.com/990e5b17ad63146f4aa729209b5fea7a_3FBDE242-0954-466E-800D-421BF0CB0951.wav
Once again, an Amazon S3 bucket with no specific authorisation required, merely knowledge of the file path which is obviously stored in the app itself (returned via the API). Based on how CloudPets position their toys, you can imagine the sorts of voice messages the system contains. By virtue of the support I got from members of the service, I was given access to some of the short clips recorded directly on the toy. One little girl who sounded about the same age as my own 4-year old daughter left a message to her parents:
Hello mommy and daddy, I love you so much
Another one has her singing a short song, others have precisely the sorts of messages you'd expect a young child to share with her parents. I didn't download either pictures or recordings from other parties, only those I was specifically granted access to by HIBP subscribers, but the risk was clear:
The services sitting on top of the exposed database are able to point to the precise location of the profile pictures and voice recordings of children.
But even if you didn't know the exact location of the files on AWS, there's still another risk which would expose them and that relates to the passwords on profiles. Here's where things are a bit of a double-edged sword: on the one hand, CloudPets stored passwords as a bcrypt hash which is a good thing. It's a slow hashing algorithm designed to be more resilient to cracking should the hashes be leaked in precisely the sort of circumstance we have here. However, counteracting that is the fact that CloudPets has absolutely no password strength rules. When I say "no rules", I mean you can literally have a password of "a". That's right, just a single character. Not only that, check out how the tutorial demonstrates account creation and particular, how to choose a password:
The password used here in the demonstration is literally just "qwe"; 3 characters and a keyboard sequence. What this meant is that when I passed the bcrypt hashes into hashcat and checked them against some of the world's most common passwords ("qwerty", "password", "123456", etc.) along with the passwords "qwe" and "cloudpets", I cracked a large number in a very short time:
That's just a very small sample from a brief run, but the figures showed there would be thousands of passwords adhering to this very small handful of bad examples.
Due to there being absolutely no password strength requirements whatsoever, anyone with the data could crack a large number of passwords, log on to accounts and pull down the voice recordings.
By now it's pretty obvious that multiple parties identified the exposed database, it remained open for a long period of time and it exposed some very personal data. It would be a safe bet to assume that many other parties located and then exfiltrated the same data because that's what people do; scanning for this sort of thing is enormously prevalent and that data – including the kids' and parents' intimate audio clips – is now in the hands of an untold number of people. But it gets even worse again...
My good friend Niall Merrigan was doing some really good work cataloguing exposed MongoDBs recently so I reached out to him for some support to investigate the extent of CloudPets' exposure. He used Shodan's API to go back and look at the historical states of the IP address and the exposed databases running on it. December 25 was the first recorded instance of the data being exposed (that's as far back as Shodan's search will take us):
You're looking the JSON emitted by Shodan's API here and it showed both those databases present on Xmas day. As of Jan 5, both databases were still there. However, a couple of days later on Jan 7 they were joined by a new one:
This is where things took a turn for the worse because as innocuous as the name may seem, it has a more sinister meaning:
Niall is highlighting a pattern he began to see only just before this new database appeared which was the use of that name to demand a ransom. He was seeing databases named "PLEASE_READ" appear across many compromised systems containing a ransom as follows:
You DB is backed up on our servers, send 1 BTC to 1J5ADzFv1gx3fsUPUY1AWktuJ6DF9P6hiF then send your ip address to email:kraken0@india.com
Whilst Shodan doesn't index the contents of exposed databases it finds, it's a safe bet that the exposed CloudPets one contained the same message as so many other compromised ones with the same name did. The analysis that Niall was doing at the time showed that at this stage, the two original CloudPets databases had been deleted which is what you'd expect when a ransom is being demanded.
On January 8, it got worse again:
Like the earlier image, these are yet more indicators of compromise (IOC) consistent with the ransom demands that were going around for MongoDBs in early Jan. Niall called them out later that month as part of his commentary on how the whole saga was unfolding:
There were many malicious parties taking action against exposed databases during this period and we frequently saw the same system accessed multiple times by different actors, each demanding their own ransom. It wasn't until Jan 13 that Shodan reported no publicly accessible databases remained on CloudPets' IP Address.
The CloudPets data was accessed many times by unauthorised parties before being deleted and then on multiple occasions, held for ransom.
For a great write-up on how MongoDBs were being compromised in this fashion, have a read of Extortionists Wipe Thousands of Databases, Victims Who Pay Up Get Stiffed by Brian Krebs.
So just to tie the whole saga together in a neat chronological fashion, here's the entire timeline of events as I know them:
- Dec 30: Lorenzo's contact attempted to alert CloudPets
- Dec 31: My contact attempted to alert CloudPets via their published support address
- Dec 31: My contact attempted to alert CloudPets via the WHOIS record contact
- Jan 4: My contact attempted to alert Linode, CloudPets' hosting provider
- Jan 7: The original databases were deleted and a ransom demand was left on the exposed system via the IOC named "PLEASE_READ"
- Jan 8: Another ransom demand was left for "README_MISSING_DATABASES" and another again for "PWNED_SECURE_YOUR_STUFF_SILLY"
- Jan 13: No remaining databases were found to still be publicly accessible
It's impossible to believe that CloudPets (or mReady) did not know that firstly, the databases had been left publicly exposed and secondly, that malicious parties had accessed them. Obviously, they've changed the security profile of the system and you simply could not have overlooked the fact that a ransom had been left. So both the exposed database and intrusion by those demanding the ransom must have been identified yet this story never made the headlines. Certainly, the guy in my workshop whose data was in the set I was provided had never heard anything about the exposure and that his private conversations with his daughter had potentially been illegally accessed.
Unauthorised access must have been detected but impacted parents were never notified.
There's another angle to all this as well which helps explain why nobody was returning emails or phone calls (Lorenzo tried in vain to phone both CloudPets and Spiral Toys) as well as why there appear to be some serious shortcuts taken with the hosting situation. One look at the stock price for Spiral Toys tells the story:
Spiral Toys is worth less than half a cent per share. Since late 2015, they've been in rapid decline to the point where the company is near worthless with a market cap of only $262k (that's down more than 99% of their peak value). Not even the launch of their new connected piggy bank (yes, you read that correctly) back in November could save them; they saw a momentary bump in the share price then it went back downhill and stayed flat. The CloudPets Twitter account has also been dormant since July last year so combined with the complete lack of response to all communications, it looks like operations have well and truly been shuttered.
Circling back to the parents' position for a moment, you must assume data like this will end up in other peoples' hands. Whether it's the Cayla doll, the Barbie, the VTech tablets or the CloudPets, assume breach. It only takes one little mistake on behalf of the data custodian – such as misconfiguring the database security – and every single piece of data they hold on you and your family can be in the public domain in mere minutes. If you're fine with your kids' recordings ending up in unexpected places then sobeit, but that's the assumption you have to work on because there's a very real chance it'll happen. There's no doubt whatsoever in my mind that there are many other connected toys out there with serious security vulnerabilities in the services that sit behind them. Inevitably, some would already have been compromised and the data taken without the knowledge of the manufacturer or parents.
If you think you've been impacted by the CloudPets breach, you can now search for your email address in HIBP. You can also read Lorenzo's full writeup in Internet of Things Teddy Bear Leaked 2 Million Parent and Kids Message Recordings.