FreshRSS

πŸ”’
❌ About FreshRSS
There are new available articles, click to refresh the page.
Before yesterdayTroy Hunt

Weekly Update 323

By Troy Hunt
Weekly Update 323

Finally, after nearly 3 long years, I'm back in Norway! We're here at last, leaving our sunny paradise for a winter wonderland. It's almost surreal given how much has happened in that time, not just the pandemic but returning to Oslo with Charlotte as my Norwegian wife is super cool 😎 Other things this week are not so different, namely people complaining on Twitter (albeit also complaining about Twitter). As I find myself continually caveating, YMMV but it does feel like events are being overly dramatised by some at present. Time will tell, but I think we'll all still be using the platform to complain about things just as effectively in a year from now as we are today πŸ™‚

Weekly Update 323
Weekly Update 323
Weekly Update 323
Weekly Update 323

References

  1. Catch me this week in Oslo doing a free meetup for NDC and NNUG (Tuesday from 17:00 onwards)
  2. Have you heard there's some controversy surrounding Twitter at present? (geez this thread opened a can of worms, it's a massively divisive topic right now)
  3. Acxiom didn't get breached, but that doesn't stop people shipping around "The Acxiom Breach" (I hate breach misattribution with a passion)
  4. You can now get Pwned for 30% less! (because it's a holiday in America, we've made my book cheaper 😊)
  5. Sponsored by: 1Password, a secure password manager, is building the passwordless experience you deserve. See how passkeys work

Get Pwned, for 30% Less!

By Troy Hunt
Get Pwned, for 30% Less!

We've had great feedback from people who have gotten Pwned. Loads of people had told us how much they've enjoyed it and would like to get their friends Pwned too. Personally, I think everyone should get Pwned! Which is why we're making it possible for 30% less 😊

Ok, being more serious for a moment, I'm talking about Pwned the book which we launched a couple of months ago and it's chock full of over 800 pages worth of epic blog posts and more importantly, the stories behind them. Because it's a holiday on the other side of the world right now (still true as I write this as an Aussie about to jump on a plane to Norway), we've smashed 30% off the price over at book.troyhunt.com

So, go get Pwned, you'll love it!

Data Breach Misattribution, Acxiom & Live Ramp

By Troy Hunt
Data Breach Misattribution, Acxiom & Live Ramp

If you find your name and home address posted online, how do you know where it came from? Let's assume there's no further context given, it's just your legitimate personal data and it also includes your phone number, email address... and over 400 other fields of data. Where on earth did it come from? Now, imagine it's not just your record, but it's 246 million records. Welcome to my world.

This is a story about a massive corpus of data circulating widely within the hacking community and misattributed to a legitimate organisation. That organisation is Acxiom, and their business hinges on providing their customers with data on their customers. By the very nature of their business, they process large volumes of data that includes a broad set of personal attributes. By pure coincidence, there is nominal commonality between Acxiom’s records and the ones in the 246M corpus I mentioned earlier. But I'm jumping ahead to the conclusion, let's go back to the beginning:

Disclosure and Attribution Debunking

In June last year, I received an email from someone I trust who had sent me data for Have I Been Pwned (HIBP) in the past:

Have you seen Axciom [sic] data? It was just sent to us. Seems to being traded/sold on some forums. Have you received it yet? If not i can upload it for you. It's quite large tho, ~250M Records.

A corpus of data that size is particularly interesting as it impacts such a huge number of people. So, I reviewed the data and concluded... pretty much nothing. Looks legit, smells legit but there was absolutely nothing beyond the word of one person to tie it to Acxiom (and who knows who they got that word from). Burdened by other more immediately actionable data breaches, I filed it away until recently when that name popped up again, this time on a popular hacking forum:

Data Breach Misattribution, Acxiom & Live Ramp

It was referred to as "LiveRamp (Formerly Acxiom)" and before I go any further, let's just clarify the problem with that while you're looking at the image above: LiveRamp was previously a subsidiary of Acxiom, but that hasn't been the case since they separated businesses in 2018 so whoever put this together is referring back to a very old state of play. Regardless, those downloading it from the forum were clearly very excited about it. Seeing this for the second time and spreading far more broadly, I decided to reach out to the (alleged) source and ask Acxiom what was going on.

I dread this process - contacting an organisation about a breach - because I usually get either no response whatsoever or a standoffish one. Rarely do I find a receptive organisation willing to fully investigate an alleged incident, but that's exactly what I found on this occasion. Much of the reason why I wanted to write this post is because whilst I hate breached organisations not properly investigating an incident, I also hate seeing misattribution of a breach to an innocent party. That's a particularly sore point for me right now because of this incident just last week:

This is the dumbest infosec story I’ve read in… forever? It is so profoundly incorrect, poorly researched, never verified, rambling and indistinguishable from parody that I literally went looking for the parody reference. I think he’s actually serious! https://t.co/oLyIHxb8D3

β€” Troy Hunt (@troyhunt) November 15, 2022

I've had various public users of HIBP, commercial users and even governments reach out to ask what's going on because they were concerned about their data. Whilst this incident won't do HIBP any actual harm (and frankly, I'm stunned anyone took that story seriously), I can very easily see how misattribution can be damaging to an organisation, indeed that's a key reason why I invest so much effort into properly investigating these claims before putting anything into HIBP. But that ridiculous example is nothing compared to the amount of traction some misattributions get. Remember how just recently a couple of billion TikTok accounts had been "breached"? This made massive news headlines until...

The thread on the hacking forum with the samples of alleged TikTok data has been deleted and the user banned for β€œlying about data breaches” https://t.co/9ZKkKvu8JT

β€” Troy Hunt (@troyhunt) September 5, 2022

"Lying about data breaches". Ugh, criminals are so untrustworthy! This happens all the time and when I'm not sure of the origin of a substantial breach, I often write a blog post like this and on many occasions, the masses help establish the origin. So, here goes:

The Data

Let's jump into the data, starting with 2 of the most obvious things I look for in any new data breach:

  1. The total number of unique email addresses is 51,730,831 (many records don't have this field populated)
  2. The most recent data I can find is from mid-2020 (which also speaks to the inaccuracy of the LiveRamp association)

As to the aforementioned attributes, they total 410 different columns:

To my eye, this data is very generic and looks like a superset of information that may be collected across a large number of people. For example, the sort of data requested when filling out dodgy online competitions. However, unlike many large corpuses of aggregated data I've seen in the past, this one is... neat. For example, here's a little sample of the first 5 columns (redaction of some chars with a dash), note how the names are all uniformly presented:

120321486,4,BE-----,B,TAYLOR
120321487,2,JOY,M,----EY
120321466,1,DOYLE,E,------HAM
120321486,3,L----,,TAYLOR
120321486,2,R---,M,TAYLOR

Sure, this is just uppercasing characters but over and over again, I found data that was just too neat. The addresses. The phone numbers. Everything about it was far to curated to simply be text entered by humans. My suspicion is that it's likely a result of either a very refined collection process or in the case of addresses, matched using a service to resolve the human-entered address to a normalised form stored centrally.

Perhaps what I was most interested in though was the URL column as that seems to give some indication of where the data might have come from. I queried out the top 100 most common ones and took a look:

Eyeballing them, I couldn't help feel that my earlier hunch was on the money - "dodgy online competitions". Not just competitions but a general theme of getting stuff for cheap or more specifically, services that look like they've been built to entice people to part with their personal data.

Take the first one, for example, DIRECTEDUCATIONCENTER.COM. That's a dead domain as of now but check out what it looked line in March last year:

Data Breach Misattribution, Acxiom & Live Ramp

"I may be contacted by trusted partners and others". What's "others"? Untrusted partners? πŸ€·β€β™‚οΈ

Let's try the next one being originalcruisegiveaway.com and again, the site is now gone so it's back over to archive.org:

Data Breach Misattribution, Acxiom & Live Ramp

It's different, but somehow the same. Clicking through to the claim form, it seems the only way you can enter is if you agree to receive comms from all sorts of other parties:

Data Breach Misattribution, Acxiom & Live Ramp

Ok, one more, this time free-ukstuff.com which is also now a dead site, and not even indexed by archive.org. Next then, is findyourdegreenow.com which is - you're not gonna believe this - a dead site! Here's what it used to look like:

Data Breach Misattribution, Acxiom & Live Ramp

And again, it feels the same. Same same, but different.

To try and get a sense of how localised this data was, I queried out all the values in the "state" column. Is this a US-only data set? If that column is anything to go by, yes:

Something didn't add up when I first saw that and after a quick check of the population of each US state, it become immediately obvious: there's no California, the most populous state in the country. Nor Texas, the second most populous state. In fact, with only 35 rows there's a bunch of US states missing. Why? Who knows, the only thing I can say for sure is that this is a subset of the population with some glaring geographical omissions.

Then there's another curveball - what about the URL quickquid.co.uk, that doesn't look very US-centric. Heading over there redirects to casheuronetukadministration.grantthornton.co.uk which advises that as of last month, "The Administration of CashEuroNet UK, LLC has closed and the Joint Administrators have ceased to act". So something has obviously been wound up, wonder what was there originally? I had to go back a few years to find this:

Data Breach Misattribution, Acxiom & Live Ramp

To my mind, this is more of the same ilk in terms of a service targeted at people after quick money. But it's clearly all in GBP and with a .co.uk TLD, this being right after I've just said all the states are in the US, what gives? Back to the source data, filter out the records based on that URL and sure enough, everyone has a US address. Grabbing a random selection of IP addresses had them all resolving to the US too so I have absolutely no idea how his geographically inconsistent set of data came to being.

And that's really the theme across the data set when doing independent analysis - how is this so? What service or process could have pulled the data together in this way? Maybe the people who this data actually refers to will have the answers, let's go and ask them.

Responses From Impacted HIBP Subscribers

We're approaching 4.5M subscribers to HIBP's free notification service now which makes for a great corpus of people I can reach out to when doing breach verification. I grabbed a handful of addresses from this data set and asked them if they could help out. I sent those that responded positively their full record and asked some questions about the legitimacy of the data, and where they thought it might have come from, here's what they said:


1. The data is mostly accurate.

A few things are off, such as date of birth (could very well be a fake one I've entered before) and details of household members.

There are a lot of columns with single-letter values, which I can't verify without knowing what they mean.

But overall, it's quite accurate.

2. No idea where it came from, sorry. There is a URL in the third-to-last column, but it doesn't seem like a website I would have used before.


I looked through the csv file and couldn't find anything I recognized. I saw the names [redacted], [redacted] and [redacted]- I don't know anyone by those names. I live in Ontario, Canada, but addresses in the file were located in the united states.

Data says I have one child between the ages of 0 and 2, but that's not true - my only son is five. Birth date is wrong - my birthday is [redacted], but the file says [redacted].

There were a few urls in the file and I don't recognize any of them.

Not sure if this last thing is relevant or not. I sometimes get emails intended for other people. I searched my inboxes for the names [redacted] and [redacted]. Nothing came up for [redacted], but I do see an email for [redacted] from [redacted]. I searched through the csv to see if anything matched the data in the email (member number, confirmation number), but nothing matched.

I also noticed that although my email address ([redacted]) is in the csv data, there's also another email address ([redacted]) which is not mine.

I'm not sure if that's helpful or not, but if there's anything more I can do, let me know. :)


As far as name and address they are correct. Β number of ppl living at the house has changed. Β The other information I can't seem to understand what the information for example under column AQ row 2 it has a U and I don't know what the U is for. Β I have noticed that some information is really outdated, so I wouldn't know where the data originated from.


Thank you for sharing, I took a look at the data, let me see if I can answer your questions:

1. While that is my email, the rest of the data actually belongs to an immediate family member. With the exception of a few outdated fields, the data on my family member is correct.

2. I am unfamiliar with Acxiom and am unsure of where this data originated from. I want to note that I have recently been doxxed and have reason to believe data breaches may have been used; however, the data you've provided here was not used in the attacks, to my knowledge.

Please let me know if you have any other questions, or if there is anything else I may do to help.


"Mostly accurate". The feeling I have when reading this is that whoever is responsible for this corpus of data has put it together from multiple sources and quite likely made some assumptions along the way. I can picture how that would happen; imagine trying to match various sources of data based on human-provided text fields in order to "enrich" the collection.

Analysis by Acxiom

This isn't the fist time Acxiom has had to deal with misattribution, and they'd seen exactly the same data set passed around before. Think about it from their perspective: every time there's a claim like this they need to treat it as though it could be legitimate, because we've all seen what happens when an organisation brushes off a disclosure attempt (I could literally write a book about this!) Thus it becomes a burdensome process for them as they repeat the same analysis over and over again, each time drawing the same conclusion.

And what was that conclusion? Simply put, the circulating data didn't align with their own. They're in the best position of all of us to draw that conclusion as they have access to both data sets and whilst I suspect some people may retort with "how do you know you can trust them", not only do I not have a good reason to doubt their findings, I also don't have a good reason to attribute it to them. Every reference I've seen to Acxiom has been from whoever is handing the data around; I've been able to find absolutely nothing within the data set itself to tie it back to them. In almost all breaches I've processed, the truth is in the data and there's nothing here that points the finger at them.

I offered Acxiom the opportunity to further clarify their position with a statement which I've included in its entirety here:

β€œAcxiom has worked to build a reputation over the course of fifty years for having the highest standards around data privacy, data protection and security. In the past, questionable organizations have falsely attached our name to a data file in an attempt to create a deceitful sense of legitimacy for an asset. In every instance, Acxiom conducts an extensive analysis under our cyber incident response and privacy programs. These programs are guided by stakeholders including working with the appropriate authorities to inform them of these crimes. Β The forensic review of the case that Troy has looked into, along with our continuous monitoring of security, means we can conclusively attest that the claims are indeed false and that the data, which has been readily available across multiple environments, does not come from Acxiom and is in no way the subject of an Acxiom breach.

Acxiom’s Commitment To Data Protection/ Data Privacy:
We value consumer privacy. Β  U.S. consumers who would like to know what information Acxiom has collected about them and either delete it or opt out of Acxiom’s marketing products, may visit acxiom.com/privacy for more information.”

Summary

The email addresses from the data set have now been loaded into HIBP and are searchable. One point of note that became evident after loading the data is that 94% of the email addresses has already been pwned. That's a very high number (a quick look through the HIBP Twitter feed shows the count is normally between 40% and 80%), and it suggests that this corpus of data may be at least partially constructed from other data already in circulation.

Because the question will inevitably come up, no, I won't send you your full record, I simply don't have the capacity to operate as a personal data lookup and delivery service. I know it's frustrating finding yourself in a breach like this and not being able to take any action, all you can really do at this point is treat it as another reminder of how our data spread around the web and often, we have no idea about it.

Full disclosure: I have absolutely no commercial interest in Acxiom, no money has changed hands and I wasn't incentivised in any way, I just want everyone to have a much healthier suspicion when alleging the source of a data breach πŸ™‚

Weekly Update 322

By Troy Hunt
Weekly Update 322

It's very strange to have gone 1,051 days without spending more than a few hours apart, but here we are... very temporarily:

Only 15,501km away 😒 And only 4 days until I head back to Oslo 😊 pic.twitter.com/PDn1Syplig

β€” Troy Hunt (@troyhunt) November 20, 2022

Which means that right now, I'm throwing myself into a gazillion other things to keep me busy including how schools advise parents to manage devices, wrapping gup that HTML signature, asking probing questions about paying ransoms and, unbelievably, fighting off the most ridiculous claim of HIBP having been P'd. That last one especially, FFS, just listen...

Weekly Update 322
Weekly Update 322
Weekly Update 322
Weekly Update 322

References

  1. Does your child's school provide any guidance around the use of native parental controls on their devices? (not a poll, but a near unanimous "no" response anyway)
  2. My HTML email signature is finally done - it was not a fun process 😭 (for my next trick - making it actually work in Exchange for iOS)
  3. Should there be a government ban on paying a ransom to stop breached data from being publicly leaked? (this one is a poll... with a very clear result)
  4. Have I Been Pwned didn't get pwned (I can't believe how this got written in the first place, nor how anyone ever even took it seriously πŸ€¦β€β™‚οΈ)
  5. Sponsored by: Varonis. Reduce your SaaS blast radius with data-centric security for AWS, G Drive, Box, Salesforce, Slack and more.

Weekly Update 321

By Troy Hunt
Weekly Update 321

What a week to pick to be in Canberra. Planned well before things got cyber-crazy in Australia, I spent a few days catching up with folks in our capital and talking to the Australia Federal Police for scam awareness week. That it coincided with the dumping of Medibank customer health records made it an especially interesting time to talk with police, politicians and industry leaders. A bit of a bizarre, whirlwind week if I'm honest, but full of very positive encounters even though it coincided with such a demanding time for many of us in this industry down here.

Weekly Update 321
Weekly Update 321
Weekly Update 321
Weekly Update 321

References

  1. Mastodon has been... entertaining 🀣 (just a collection of fun tweets that perfectly illustrate how much many of us have struggled to wrap our heads around it)
  2. HTML email signatures are a complete nightmare ("mjml" bubbled to the top a few times as a way of tackling this)
  3. HIBP API keys can be bought at different rate limits and paid a year in advance! (by some unexplainable miracle, 100% of feedback has been positive!)
  4. I've honestly become a bit lost for words over the Medibank ransom saga, it's just absolutely horrendous (that's a link to my thread commentating on the data dumps)
  5. Sponsored by: Varonis. Reduce your SaaS blast radius with data-centric security for AWS, G Drive, Box, Salesforce, Slack and more.

The Have I Been Pwned API Now Has Different Rate Limits and Annual Billing

By Troy Hunt
The Have I Been Pwned API Now Has Different Rate Limits and Annual Billing

A couple of weeks ago I wrote about some big changes afoot for Have I Been Pwned (HIBP), namely the introduction of annual billing and new rate limits. Today, it's finally here! These are two of the most eagerly awaited, most requested features on HIBP's UserVoice so it's great to see them finally knocked off after years of waiting. In implementing all this, there are changes to the existing "one size fits all" model so if you're using the HIBP API, please make sure you read this carefully and understand the impact (if any) on you. Here goes:

The Rate Limits and (Some) Pricing is Different

The launch blog post for the authenticated API explained the original rationale behind the $3.50 per month price and most importantly, how I wanted to ensure it didn't pose a barrier:

In choosing the $3.50 figure, I wanted to ensure it was a number that was inconsequential to a legitimate user of the service

As I said in the previous blog post, what I didn't understand at the time was that paradoxically, the low amount was a barrier to many organisations! But equally, it's made the API super accessible to the masses so that price stays. The rate limit, however, needed revisiting and to understand why, let's go back to the beginning:

The "1 request per 1,500ms" rate dated all the way back to 2016 where I'd initially attempted to combat abuse by applying the limit per IP. This was an entirely non-empirical, gut feel, "let's just try and fix the problem right now" decision and it was only very recently I actually started trawling through the data and looking at how the API was being consumed. 1 request every 1,500ms is a maximum of 57,600 requests in a day; here's the number of requests by the top 20 consumers of the service in a recent 24 hour mid-week period:

The Have I Been Pwned API Now Has Different Rate Limits and Annual Billing

Keeping in mind that you're never going to achieve the full 57,600 requests in a day as you'd have to time every single one of them perfectly so as not to hit the rate limit, only 1 subscriber even achieved half that potential. In fact, only 9 subscribers achieved even a quarter of the potential with everyone else very quickly falling back to a small fraction of even that. To be fair, I'm conscious that I'm taking a full day of data and talking about requests as if they were evenly distributed across the entire period when there are inevitably use cases where it's more a short burst rather than a prolonged, even distribution. Regardless, what the data is saying is that the default "one size fits all" rate limit is way above and beyond what almost every single subscriber is actually consuming, and by a significant order of magnitude too. In a way, what we ended up with is the little guys subsidising the big guys.

The bottom line is that we're simultaneously adding a bunch of higher rate limits whilst reducing the entry level rate limit. It's easier if you see it all in context so let's just jump straight into the pricing (all in USD):

The Have I Been Pwned API Now Has Different Rate Limits and Annual Billing

This is from Stripe's embeddable pricing table I mentioned in the previous post and it's what you see when you first sign up for a key. With new limits, it's easier to talk about "requests per minute" or RPM so that's the nomenclature we're sticking with now. That entry level 10RPM model will work for well in excess of 90% of current subscribers and it's only a very small percentage of the existing subscriber base exceeding it. (And yes, again, I know these requests are sometimes made in bursts but even still, 10RPM is far in excess of the vast majority of use cases.)

There are economies of scale that have been factored in here. Going from 10RPM to 100RPM isn't a 10x increase, it's about a 7x increase. Going to 5 times more requests is only 4 times the price, and so on and so forth. The hope is that this makes it easier for the folks who were previously buying multiple keys to justify scratching all the kludge previously used to do that and replacing it with a single key at a higher RPM.

To get to this outcome, we trawled back through heaps of data ranging from the high-level aggregated stats in the earlier chart to the nature of the organisations buying multiple keys (which we can obviously determine based on the email address used). I also chatted with a bunch of API users both during this process and over the preceding years and have a pretty good sense of the use cases. A few trends became immediately clear:

Firstly, use cases that are genuinely personal have a very low rate limit requirement. Checking your own address(es) or those of your family by a custom app, for example. Or one of my favourite uses (and one I definitely use), the Home Assistant integration:

The Have I Been Pwned API Now Has Different Rate Limits and Annual Billing

On an ongoing basis, HA makes 1 request every 15 minutes. That's all. Each time we looked at genuine personal use cases, 10RPM was plenty.

Next, we found a bunch of use cases used within internal corporate environments, for example to monitor staff exposure in breaches. Now we're talking larger numbers of requests, but it's also something that's way more efficiently done via the existing domain search feature on the website. It's an on-demand, self-service and totally free feature that's been there for years. I know it's not API-based and there are good reasons for that (see the comment from me on that idea), but there's also the Enterprise route if API access is really that important (more on that later). Other examples included things like scanning customer emails to assess exposure at points where, for example, account takeover was a risk. In each of these cases, we're primarily talking about business entities using the service and I'm comfortable with commercial ventures wearing a greater cost.

And finally, there were the "heavy hitters", the ones with large volumes of keys. One such example using the API en masse provides security services to the big end of town and was funded to the tune of a figure that looks like a phone number. And again, I'm perfectly comfortable with them wearing a cost that's more commensurate to the value as opposed to a figure that was originally arrived at just to keep the bad guys out.

Existing Subscribers are Grandfathered in for 60 Days

Before I talk about the annual pricing, I want to make sure this headline is clear. Nothing changes for existing subscribers until the 6th of Jan next year, which is 60 days from today. On that date, the legacy rate limit of 1 request every 1,500ms will roll to the new 10RPM limit at exactly the same price. For that handful of big users for whom the 10RPM limit will be insufficient, you've got a couple of months to work out the best path forward. I'll be emailing every single active subscriber today to ensure everyone is notified well in advance (there's also an updated Terms of Use which requires a notification email to be sent).

What does this mean in practical terms? If you want annual billing or a higher rate limit, you can go and implement that whenever you're ready (more on that soon). Alternatively, if you just want to stick with 10 RPM then you don't have to do anything, nothing will change. What I do strongly suggest though (and this hasn't changed, it's always been the guidance), is to make sure you're handling HTTP 429 responses gracefully. Regardless of what your rate limit is, if you're consuming the API in a fashion where you're not directly controlling the rate yourself, make sure you handle those responses appropriately.

Billing Can Now Occur Annually

This is the easy one to explain: annual payments are now a thing 😊 As I explained in the previous blog post, frequent payments of small amounts can play havoc with reimbursements in the corporate environment. It sucks, I've been there, but it is what it is. Annual billing alleviates that through a combination of a 12x reduction in the frequency of an expense claim and a larger single sum that's easier to explain to your procurement people than $3.50.

So, what do you charge for annual rather than monthly billing? My initial temptation was just to make it literally 12 times more because I don't have a lot of patience for spivvy marketing guff. However, there's a valid case to be made that a 12x reduction on individual payments warrants a discount as it removes overhead from our end (there's a constant percentage of all payments that are disputed or fail or cause other demands on our time), plus there's an argument to be made along the lines of customer loyalty warranting a discount. There's also just the very simple mathematics of the whole thing, best illustrated by a recent payment in Stripe:

The Have I Been Pwned API Now Has Different Rate Limits and Annual Billing

That's 8.5% that disappears on every transaction, largely due to the 30c AUD charge no matter what the price of the transaction is:

The Have I Been Pwned API Now Has Different Rate Limits and Annual Billing

The point is that there's merit for all in incentivising annual rather than monthly payments. We decided to look at what a typical annual discount was and time and time again, found the same thing:

The Have I Been Pwned API Now Has Different Rate Limits and Annual Billing
The Have I Been Pwned API Now Has Different Rate Limits and Annual Billing
The Have I Been Pwned API Now Has Different Rate Limits and Annual Billing

Or in other words, a couple of months for free when you sign up for a year. In fact, coincidentally, that's exactly what I just signed up for with Nabu Casa (Home Assistant cloud) after receiving an email saying annual billing was now available 😊

The Have I Been Pwned API Now Has Different Rate Limits and Annual Billing

It's never exactly 17%, rather it's like each example took 17% off 12 month's worth of a normal monthly fee then moved the number to something that looked pretty πŸ™‚ Some examples were less (Pluralsight is 14%) and others were more (the higher tiers of Zendesk are 20%), but ultimately we decided to work to that 17% number and came up with the following:

The Have I Been Pwned API Now Has Different Rate Limits and Annual Billing

In keeping with the "pay for 10, get 12" theme, these prices are exactly 10 times the monthly ones. Easy peasy.

Stripe Customer Portal Magic Makes Changing Plans Easy

As I mentioned in the "big changes ahead" blog post, I've been deleting code like crazy in favour of deferring more processing back to Stripe themselves. By using their Customer Portal paradigm, it's now easy to change an existing plan:

The Have I Been Pwned API Now Has Different Rate Limits and Annual Billing

The change can be to a different rate limit or to a different renewal cadence:

The Have I Been Pwned API Now Has Different Rate Limits and Annual Billing

Stripe automatically proratas everything too so whilst you can upgrade immediately to a higher RPM or from monthly to annually, you'll only pay for the difference between the previous plan and the new one. Or, you can downgrade and on next renewal the lower plan will be automatically applied. It's super simple and it's all self-service.

Enterprise

For more than 7 years now, a small handful of organisations have used HIBP in a larger scale commercial fashion. Some of them you're familiar with, for example both 1Password and Mozilla do email address searches using k-Anonymity and that's not something that's a self-service "put your card into Stripe" sort of model (in part because k-Anonymity returns a huge number of results for each search). Infosec firms use Enterprise to support customers via domain level API searches. Identity theft companies use it to advise customers when they're exposed in a breach. One firm even uses it to help detect bot signups; it turns out that so many of us are so pwned, if someone signs up for their service and they're not pwned, that's a little bit suspicious (that's just one of many indicators they use).

This is a fundamentally different model, one that involves a close working relationship, lots of legal documents, procurement people, invoicing instead of credit cards and all sorts of other "Enterprisey" things. That still exists and nothing in today's blog post changes that. I mention this now in today's post simply because some of the folks from those organisations with Enterprise subscriptions will read this post and wonder where they sit. Likewise, I suspect those "100+ key" subscribers of the public API really should be on Enterprise and I'll be reaching out to them separately given the rate limit change will have a bigger impact on them than most.

In Closing

For that vast majority of users who are only at a fraction of the old rate limit, nothing changes other than there now being a key available for 17% less than before on an annual subscription. Meanwhile, for the folks battling corporate bureaucracy around small, frequent payments, this will sort you out and give you choices around rate limits you didn't have before.

There will be some people that fall between the cracks of the use cases outlined above and won't be happy with the changes. I expect that - I know it will happen - but I hope the rationale outlined here demonstrates the volume of thought and consideration that has gone into trying the find the sweet spot for pricing and rate limits. I also expect people will ask about adding other rate limits, for example to fill the gaps between say, 100RPM and 500RPM. We started out with more options, but a combination of that creating the whole paradox of choice problem and deeper analysis of how the API was actually being used led us to simplifying things. But who knows over the longer term, feedback is certainly welcome.

Lastly, if you're watching closely, you'll notice a lot more structure going in around the way HIBP is run. Last week I wrote about rolling out Zendesk for support so there's now a formal ticketing system in place. I also explained how Charlotte is playing a very active role in the management of HIBP and in the coming months, you'll see more around other initiatives to make the project more sustainable. I'm thinking of it like this: what must HIBP do to be sustainable in a post-Troy world? Or in other words, how can we get what has increasingly become an essential service for so many to be more robust and more self-sustaining beyond what one person can do as a sole operator devoting spare time to a passion project.

Stay tuned, there's much more to come πŸ™‚

Weekly Update 320

By Troy Hunt
Weekly Update 320

I feel like life is finally complete: I have beaches, sunshine and fast internet! (Yes, and of course an amazing wife, but that goes without saying 😊) For the folks asking via various channels, the speed is not exactly symmetrical at 1000/400 and I'm honestly not sure why that's the case here in Australia. I also had to shell out quite a bit extra to go from 50 up to a "business" plan of 400 up, but with the volumes of data I ship around it'll make a pretty big difference to the way I work over time. Also this week, much more on the work we're doing with HIBP from pricing the annual plans to a proper support system via Zendesk. I'm really hoping that by next week's update we'll have shipped the new rate limits too, stay tuned for that but for now, here's number 320:

Weekly Update 320
Weekly Update 320
Weekly Update 320
Weekly Update 320

References

  1. Finally - I have fast internet! (just a "little" 25x speed boost, thank you very much 😊)
  2. Everyone seems to be doing 17% discounts for annual over monthly billing (that's Slack's pricing page and as someone pointed out in the live stream, it's effectively 2 free months)
  3. We now have a proper support system up and running for the HIBP API keys (we're really happy with Zendesk, hoping this makes both subscribers' and our lives easier)
  4. Sponsored by: Kolide is a fleet visibility solution for Mac, Windows, and Linux that can help you securely scale your business. Learn more here.

Better Supporting the Have I Been Pwned API with Zendesk

By Troy Hunt
Better Supporting the Have I Been Pwned API with Zendesk

I've been investing a heap of time into Have I Been Pwned (HIBP) lately, ranging from all the usual stuff (namely trawling through masses of data breaches) to all new stuff, in particular expanding and enhancing the public API. The API is actually pretty simple: plug in an email address, get a result, and that's a very clearly documented process. But where things get more nuanced is when people pay money for it because suddenly, there are different expectations. For example, how do you cancel a subscription once it's started? You could read the instructions when signing up for a key, but who remembers what they read months ago? There's also a greater expectation of support for everything from how to construct an API request to what to do when you keep getting 429 responses because you're (allegedly) making too many requests. And yes, some of these queries are, um, "basic", but they're still things people want support with.

In the beginning, all emails from HIBP came from noreply@haveibeenpwned.com because I simply wasn't geared up to provide support. In my naivety, I assumed people would see "noreply" and not reply. Instead, they'd send email to that address, get frustrated when there was no reply (from the "noreply" address...) and seek out my personal contact info. Or they'd lodge a dispute with Stripe because they'd emailed noreply@ asking for their subscription to be cancelled and it wasn't. So, back in September I started looking for a better solution:

I’m thinking of setting up a more formal support process for @haveibeenpwned, especially for folks buying API keys and having queries around billing or implementation. Any suggestions on a service? Something that can triage requests, perhaps also have FAQs. Thoughts?

β€” Troy Hunt (@troyhunt) September 29, 2022

This was a non-trivial exercise. We've all used support services before, so we have an idea of what to expect from an end user perspective, but it's a different story once you dive into all the management bits behind them. Frankly, I find this sort of thing mind-numbing but fortunately it's a task my amazing wife Charlotte picked up with gusto. She has become increasingly involved in all things troyhunt.com and HIBP lately as she brings order, calm and frankly, much needed sanity into my otherwise crazy, demanding professional life. We also figured that if we did this right, she'd be able to handle a lot of the support queries I previously did myself, so she was always going to play a big part in choosing the support platform.

Largely based on Charlotte's work, we settled on Zendesk and about a week ago, silently pushed out support.haveibeenpwned.com:

Better Supporting the Have I Been Pwned API with Zendesk

There are FAQs that cover a bunch of frequent questions, troubleshooting that addresses common problems and, of course, the ability to submit a request if you still need help. These are all a work in progress, and we'll add a lot more content in response to queries, just so long as they're about the right thing. Speaking of which:

This service is only for users of the public commercial API key, not for general HIBP queries.

Why? Because I constantly get queries like this:

Uh… and why am I sleeping during the day?! pic.twitter.com/BUGTJtgl7t

β€” Troy Hunt (@troyhunt) November 1, 2022

Is that even a query?! I don't know! But I do know that someone took the time to track down my personal email address this week and send it to me, and it's not the sort of thing we're going to be responding to on Zendesk. Nor are queries along the lines of the following:

I've been pwned, now what?

Or:

How do I remove my data from data breaches?

Or one of my personal favourites:

I demand you delete all my data from the data breaches or you'll get a letter from my lawyer!

This whole data breach landscape is a foreign concept for many people, and I understand there being questions, but Charlotte and I can't simultaneously run a free service and reply to queries like this from the masses. But the queries that come in via Zendesk are something we can manage as it's clearly scoped, there's lots of supporting docs and for the most part, we're dealing with tech professionals who understand this world a bit better than your average punter in the first place.

As I announced in last week's blog post, we're pushing ahead with new rate limits and annual billing for the API key and getting this piece out first was always an important prerequisite. It's all part of gearing up for bigger things ahead for HIBP 😊

Weekly Update 319

By Troy Hunt
Weekly Update 319

Geez we've been getting hammered down here: Optus, MyDeal, Vinomofo, Medibank and now Australian Clinical Labs. It's crazy how much press interest there's been down here and whilst I think some of it is a bit hyperbolic, bringing the issue to the forefront and ensuring it's being discussed is certainly a good thing. Anyway, let's see what happens between now and next week's video, at this rate there'll be at least one more major Aussie breach to talk about!

Weekly Update 319
Weekly Update 319
Weekly Update 319
Weekly Update 319

References

  1. Big Ass Fan IoT integration has been a big pain in the ass (it really shouldn't be this hard)
  2. Australian Clinical Labs is the latest Aussie company to make the data breach headlines (includes pathology test results 😲)
  3. The E-Pal breach went into HIBP (100k email addresses, more than half in HIBP already)
  4. The Doomworld breach also went into HIBP (they "got pwned by a script kiddie", according to their disclosure)
  5. I've been putting a heap of work into the Stripe integration for the HIBP API key (deleting code is so satisfying!)
  6. Sponsored by: Varonis. Reduce your SaaS blast radius with data-centric security for AWS, G Drive, Box, Salesforce, Slack and more.

Big Changes are Afoot: Expanding and Enhancing the Have I Been Pwned API

By Troy Hunt
Big Changes are Afoot: Expanding and Enhancing the Have I Been Pwned API

Just over 3 years ago now, I sat down at a makeshift desk (ok, so it was a kitchen table) in an Airbnb in Olso and built the authenticated API for Have I Been Pwned (HIBP). As I explained at the time, the primary goal was to combat abuse of the service and by adding the need to supply a credit card, my theory was that the bad guys would be very reluctant to, well, be bad guys. The theory checked out, and now with the benefit of several years of data, I can confidently say abuse is near non-existent. I just don't see it. Which is awesome 😊

But there were other things I also didn't see, and it's taken a while for me to get around to addressing them. Some of them are fixed now (like right now, already in production), and some of them will be fixed very, very soon. I think it's all pretty cool, let me explain:

Payments Can Be Hard... if You Don't Stripe Right

A little more background will help me explain this better: in the opening sentence of this blog post I mentioned building the original authenticated API out on a kitchen table at an Airbnb in Oslo. By that time, everyone knew I was going through an M&A process with HIBP I called Project Svalbard, which ultimately failed. What most people didn't know at the time was the other very stressful goings on in my life which combined, had me on a crazy rollercoaster ride I had little control over. It was in that environment that I created the authenticated API, complete with the Azure API Management (APIM) component and Stripe integration. It was rough, and I wish I'd done it better. Now, I have.

In the beginning, I pushed as much of the payment processing as possible to the HIBP website. This was due to a combination of me wanting to create a slick UX and frankly, not understanding Stripe's own UI paradigms. It looked like this:

Big Changes are Afoot: Expanding and Enhancing the Have I Been Pwned API

Cards never ended up hitting HIBP directly, rather the site did a dance with Stripe that involved the card data going to them directly from the client side, a token coming back and then that being used for the processing. It worked, but it had numerous problems ranging from lack of support for things like 3D Secure payments, no support for other payments mechanisms such as Google Pay and Apple Pay and increasingly, large amounts of plumbing required to tie it all together. For example, there were hundreds of lines of code on my end to process payments, change the default card and show a list of previous receipts. The Stripe APIs are extraordinarily clever, but I couldn't escape writing large troves of my own code to make it work the way I originally designed it.

Two new things from Stripe since I originally wrote the code have opened up a whole new way of doing this:

  1. Customer Portal: This is a fully hosted environment where payments are made, cards and subscriptions are managed, invoices and receipts are retrieved and basically, a huge amount of the work I'd previously hand-built can be managed by them rather than by me
  2. Embeddable Pricing Table: This brings the products and prices defined in Stripe into the UI of third party services (such as HIBP) such that customers can select their product then head off to Stripe and do the purchasing there

Rolling to these services removed a huge amount of code from HIBP with the bulk of what's left being email address verification, API key management and handling callbacks from Stripe when a payment is successful. What all this means is that when you first create a subscription, after verifying your email address, you see these two screens:

Big Changes are Afoot: Expanding and Enhancing the Have I Been Pwned API
Big Changes are Afoot: Expanding and Enhancing the Have I Been Pwned API

That's the embeddable pricing table following by Stripe's own hosted payment page. I left the browser address bar in the latter to highlight that this is served by Stripe rather than HIBP. I love distancing myself from any sort of card processing and what's more, everything to do with actually taking the payment is now Stripe's problem 😊 If you're interested in the mechanics of this, a successful payment calls a webhook on HIBP with the customer's details which updates their account with a month of API key whilst the screen above redirects them over to the HIBP website where they can grab their key. Easy peasy.

I silently rolled this out a week ago, watched it being used, made a few little tweaks and then waited until now to write about it. The rollout coincided with a typical email I've received so many times before:

First of all I would like to thank you for the wonderful service that helps people to keep track of their email breaches. I was trying to build a product to provide your services via my website, something similar to Firefox, avast and 100's of other companies doing. We were trying to do it according to the guidelines mentioned in the website. However I am not able to renew my purchase due to payment gateway failures at stripe payment. Requesting you to kindly check the same and advise me on alternate methods for making the payment.

The old model often caused payments to be rejected, especially from subscribers in India. The painful thing for me when trying to help folks is that Stripe would simply report the failed payment as follows:

Big Changes are Afoot: Expanding and Enhancing the Have I Been Pwned API

However, going back to the individual who raised the query above after rolling out this update, things changed very dramatically:

Big Changes are Afoot: Expanding and Enhancing the Have I Been Pwned API

To the title of this section, I simply wasn't "Striping" right. I'm sure there's a way with enough plumbing that it's feasible, but why bother? I cut hundreds of lines of code out just by delegating more of the workload back to them. Further, with ever tightening PCI DSS standards (read Scott's piece, interesting stuff) the less I have to do with cards, the better.

This was a "penny drop" moment for me and it's already made a big difference in a positive way. But there's another penny that dropped for me at the same time: one-off keys were an unnecessary problem.

There Are No More One-Off Keys

It was at the moment I was ripping out those hundreds of lines of code that I wondered: why do I have all the additional kludge to support the paradigm of a one-off key that only lasts a month? Why had I built (and was now maintaining) server side code to handle different types of purchases and UX paradigms to represent one-off versus recurring keys? My gut feel was that most payments formed part of an ongoing subscription but hey, who needs gut feels when you have real data?! So I pulled the numbers:

Only 7% of payments were one-offs, with 93% of payments forming part of ongoing subscriptions.

And so I killed the one-off keys. Kinda, because you can still have a key for only one month, you just purchase a monthly subscription then immediately cancel it via the Stripe Customer Portal:

Big Changes are Afoot: Expanding and Enhancing the Have I Been Pwned API

That's linked into from the API key dashboard on HIBP and it'll take all of 5 seconds to do (also note the ability to change payment method directly on the Stripe site). I've added text to that effect on the HIBP website (you may have spotted that in the earlier screen cap) so in practice, the ability to purchase a one-off key is still there and the main upside of this is that I've just killed a trove of code I no longer have to worry about πŸ™‚ Because this is the internet, I'm sure someone will still be upset, but if you only want a key for a month then that capability still well and truly exists.

All of this so far amounts to doing the same things that were always there but better. Now let's talk about the all new stuff!

Annual Billing and Different Rate Limits are Coming... Very Soon!

The title is self-explanatory and "very soon" is in about 2 weeks from now 😎

Let me illustrate the first part of that title with a message I received recently:

Is there a way to procure a 10 year API key? Our client wants to use the Have I been Pwned plugin for [redacted service name]; however, the $3.50 monthly subscription is too small to go through procurement.

What's that saying about no good deed going unpunished? In my naivety, I made the pricing low with the thinking that was a good thing, yet here we are with that posing barriers! This was a recurring message over and over again with folks simply struggling to get their $3.50 reimbursed. I should have seen this coming after years of living the corporate life myself (I have vivid flashbacks of how hard it was to get small sums reimbursed), and filling out an untold number of expense reports. Speaking of which, this was another recurring theme:

Is there a way to pay yearly for HIBP API access vs monthly? Β Monthly adds overhead in paperwork.

And again, I get it, this is a painful process. It somehow feels even more painful due to the fact the sum is so low; how much time are people burning trying to justify $3.50 to their boss?! It's painful, and this likely explains why the request for annual payments is the second most requested idea on HIBP's UserVoice. The comments there speak for themselves, and I'm having corporate PTSD flashbacks just reading them again now!

Sticking with the UserVoice theme, the 5th most requested feature is for different pricing on different rate limits. This is mostly self-explanatory but what I wasn't aware of until I went and pulled the stats was just how many people were hacking around the rate limit problem. There are heaps of API accounts like this:

hibp+1@domain.com
hibp+2@domain.com
hibp+3@domain.com
...

Because there can only be one key per email address, organisations are creating heaps of unique sub-addressed emails in order to buy multiple keys. This would have been a manual, laborious process; there's no automated way to do this, quite the contrary with anti-automation controls built into the process. Further, each key has it's own rate limit so I imagine they were also building a bunch of plumbing in the back end to then distribute requests across a collection of keys which, yeah, I get it, but man that seems like hard work! When I say "a collection of keys", I'm not just talking about a few of them either; the largest number of active in-use keys by a single organisation is 112. One hundred and twelve! The next largest is 110. I never expected that 🀯 (Incidentally, these orgs and the others obtaining multiple keys are all precisely the kinds I want using the API to do good things.)

Building the mechanics of annual billing and different rate limits is only part of the challenge and most of that is already done, the harder part is pricing it. I'm pulling troves of analytics from APIM at present to better understand the usage patterns, and it's quite interesting to see the data as it relates to requests for the API:

Big Changes are Afoot: Expanding and Enhancing the Have I Been Pwned API

There's no persistent logging of the actual queries themselves, but APIM makes it easy to understand both the volume of queries and how many of them are successful versus failed, namely because they exceed the existing rate limit or were made with an invalid (likely expired) key. So, that's what I need to work out over the next couple of weeks when I'll launch everything and write it up, as always, in detail πŸ™‚

Summary

The HIBP API has become an increasingly important part of all sorts of different tools and systems that use the data to help protect people impacted by data breaches. The changes I've pushed out over the last week help make the service more accessible and easier to manage, but it's the coming changes I'm most excited about. These are the ones that will make life so much easier on so many people integrating the service and, I sincerely hope, will enable them to do things that make a much more profound impact on all of us who've been pwned before.

Go and check out how the whole API key process works, I'd love to hear your feedback 😊

Weekly Update 318

By Troy Hunt
Weekly Update 318

Aussie breachapalooza! That what it feels like this week between Optus (ok, it was weeks ago but it's still in the news), Vinomofo, My Deal and the mother of all of them (at least as far as media interest goes), Medibank. That last one totally smashed my week out with unprecedented press enquiries, so is it any wonder I totally missed the Microsoft one? I read through that last one live in this week's video and as you'll hear, a breach of any kind is never a good look but what stands out for me about this one isn't the breach itself, rather the marketing effort SOCRadar has made around it. As I say in the video, it just feels... icky. See if you agree.

Weekly Update 318
Weekly Update 318
Weekly Update 318
Weekly Update 318

References

  1. The Optus breach really got the nation down here paying attention to data breaches (that alone got a huge amount of attention, and then Medibank happened...)
  2. I myself got an email from My Deal saying I'm in the breach (ok, so password reset and then they tell me I have no account!)
  3. Vinomofo also had themselves a data breach (they were just using production data for testing "as is industry practice" πŸ€¦β€β™‚οΈ)
  4. The Medibank breach has made massive news down here (it's particularly nasty when we're talking about health data being held to ransom)
  5. The BlueBleed marketing campaign (sorry - "breach") is more about how it was reported rather than what it actually is (note in the thread that Kevin mentions the search tool has now been removed)
  6. Sponsored by: EPAS by Detack. No EPAS protected password has ever been cracked and won't be found in any leaks. Give it a try, millions of users use it.

Weekly Update 317

By Troy Hunt
Weekly Update 317

I decided to do something a bit different this week and mostly just answer questions from my talk at GOTO Copenhagen last week. I wasn't actually in Denmark this time, but a heap of really good questions came through and as I started reading them, I thought "this would actually make for a really good weekly update". So here we are, and those questions then spurned on a whole heap more from the live audience too so this week's video became one large Q&A. I hope you enjoy this one, let me know if I should do more of these in the future.

Weekly Update 317
Weekly Update 317
Weekly Update 317
Weekly Update 317

References

  1. I now have a teenager... on social media! (it's been fun setting stuff up with Ari and locking it down, lots of fundamentals there everyone should know)
  2. Here's all the questions from GOTO (also includes the ratings, which please me 😊)
  3. Sponsored by: Varonis. Reduce your SaaS blast radius with data-centric security for AWS, G Drive, Box, Salesforce, Slack and more.

Weekly Update 316

By Troy Hunt
Weekly Update 316

Geez it's nice to be home 😊 It's nice to live in a home that makes you feel that way when returning from a place as beautiful as Bali 😊 This week's video is dominated by the whole discussion around this tweet:

I love that part of the Microsoft Security Score for Identity in Azure improves your score if you *don't* enforce password rotation, what a sign of the times! Who out there still works somewhere that forces rotation (because "reasons")? pic.twitter.com/a2yQQvNRpa

β€” Troy Hunt (@troyhunt) October 6, 2022

I love this for the way it throws traditional logic out the window, logic we all knew sucked and I suspect the massive engagement the tweet drove is due to precisely that: Microsoft giving us all a good reason to whinge about a sucky practice that still prevails so broadly. So... I hope you enjoy listening to just how bad enforced password rotation sucks 😊

Weekly Update 316
Weekly Update 316
Weekly Update 316
Weekly Update 316

References

  1. We've known that mandatory password rotation has passed its used by date for years now (that blog post was actually the genesis for Pwned Passwords)
  2. The Bhinneka breach went into HIBP (Indonesian e-commerce service with 83% of pwnees being repeat visitors to HIBP)
  3. The Wakanim breach also went in, a pretty fresh one from 6 weeks ago (actually thought this was quite under-reported for an incident impacting 6.7M people)
  4. Sponsored by: Kolide can help you nail third-party audits and internal compliance goals with endpoint security for your entire fleet. Learn more here.

Weekly Update 315

By Troy Hunt
Weekly Update 315

How's this weeks video for a view?! It's a stunning location here in Bali and it's just been the absolute most perfect spot for a honeymoon, especially after weeks of guests and celebrations. But whoever hacked and ransom'd Optus didn't care about me taking time out and I've done more media in the last week than I have in a long time. I don't mind, it's a fascinating story the way this has unfolded and that's where most of the time in this week's video has gone, I hope you enjoy my analysis of what has become a pretty crazy story back home in Australia.

Weekly Update 315
Weekly Update 315
Weekly Update 315
Weekly Update 315

References

  1. Bali is a stunning place with postcard worthy shots around every corner (link through to the tweet thread with all the magic 😍)
  2. I've never seen a data breach make as much local news as Optus has, not even close! (link through to Jeremy Kirk's thread explaining how it went down)
  3. When people are wondering if they need to change their name and date of birth in the wake of a data breach, you know there's bigger problems to be solved (seriously, depending on numbers as some sort of secret source sufficient to form a significant part of an identity theft attack is madness and needs to die in a fire)
  4. Sponsored by: Varonis. Reduce your SaaS blast radius with data-centric security for AWS, G Drive, Box, Salesforce, Slack and more.

Weekly Update 314

By Troy Hunt
Weekly Update 314

Wow, what a week! Of course there's lots of cyber / tech stuff in this week's update, but it was really only the embedded tweet below on my mind so I'm going to leave you with this then come to you from somewhere much more exotic than usual (and I reckon that's a pretty high bar for me!) next week 😎

Absolutely over the moon to formally make @Charlotte_Hunt_ a part of our family ❀️ πŸ’ pic.twitter.com/XfahXElboC

β€” Troy Hunt (@troyhunt) September 21, 2022
Weekly Update 314
Weekly Update 314
Weekly Update 314
Weekly Update 314

References

  1. Optus disclosed a breach, but really didn't share much solid information about it... unlikely what Jeremy Kirk has since tweeted (these tweets came out after I recorded the vid so I didn't reference them, but it's the best analysis of the legitimacy of the data that I've seen to date)
  2. Lots of gigabytes of TAP Air Portugal customers is now floating around (and it's searchable within HIBP)
  3. Sponsored by: SecAlerts vulnerability awareness: Receive CVE & zero-day alerts, news & version updates all matched to your software. Discount code within!

Weekly Update 313

By Troy Hunt
Weekly Update 313

I came so close to skipping this week's video. I'm surrounded by family, friends and my amazing wife to be in only a couple of days. But... this video has been my constant companion through very difficult times, and I'm happy to still being doing it at the best of times 😊 So, with that, I'm signing out and heading off to do something much more important. See you next week.

Taking a bit of time off Twitter while @charlottelyng and I do more important things πŸ’ πŸ‘°β€β™€οΈ pic.twitter.com/9JJrPM9kWX

β€” Troy Hunt (@troyhunt) September 13, 2022
Weekly Update 313
Weekly Update 313
Weekly Update 313
Weekly Update 313

References

  1. The Brand New Tube video site was breached and is now in HIBP (350k account details of what seems to be a very, uh, "unique" demographic were exposed)
  2. The TikTok breach that... wasn't (why is this still getting media attention?!)
  3. Sponsored by: Varonis. Reduce your SaaS blast radius with data-centric security for AWS, G Drive, Box, Salesforce, Slack and more.

Weekly Update 312

By Troy Hunt
Weekly Update 312

I'm so excited to see the book finally out and awesome feedback coming in, but I'm disappointed with this week's video. I frankly wasn't in the right frame of mind to do it justice (it's been a very hard road up until this point, for various reasons), then my connection dropped out halfway through and I had to roll to 5G, and now I'm hearing (both from other people and with my own ears), a constant background noise being picked up by the mic. Argh! But, that's the reality of scheduled live streams and for better or worse, you end up getting the "warts and all" version. It is what it is, and next week's will be better 😊

Weekly Update 312
Weekly Update 312
Weekly Update 312
Weekly Update 312

References

  1. book.troyhunt.com
  2. Sponsored by: Kolide believes that maintaining endpoint security shouldn’t mean compromising employee privacy. Check out our manifesto: Honest Security.

"Pwned", the Book, is Finally Here!

By Troy Hunt
"Pwned", the Book, is Finally Here!

The first time I ever wrote publicly about a company's security vulnerabilities, my boss came to have a word with me after seeing my name in the news headlines.

One of the worst days I've ever had was right in the middle of the Have I Been Pwned sale process, and it left me an absolute emotional wreck.

When I wrote about how I deal with online abuse, it was off the back of some pretty nasty stuff... which I've now included in this book 😊

These are the stories behind the stories and finally, the book about it all is here:

"Pwned", the Book, is Finally Here!

I announced the book back in April last year after Rob, Charlotte and I had already invested a heap of effort before releasing a preview in October. I'd hoped to have it out by Christmas... but it wasn't perfect. Ok, so it'll never be perfect without faults of any kind, but it had to meet an extremely high bar for me to be happy with an end result we could charge people money for. I completely rewrote the intro, changed a bunch of the posts we decided to include, reordered them all and edited a heap of the personal stuff. Rob wrote an amazing intro (I genuinely felt emotional reading it, given the time of both our lives he refers to), and both Charlotte and I wrote pieces at the end of it all. That bit in particular is very personal, and it's what was the most natural to us at the time. With our wedding being only next week, it just felt... right. That's why we're publishing today; to get this story about the earlier phases of my life out before the next phase begins. There are things we both wanted to say, and I hope you enjoying reading them 😊

We actually pushed the book out to a preview audience a couple of weeks ago and requested feedback to make it even better before releasing to the masses. We got two kinds of feedback: typos, which we've now fixed and testimonials, which have been awesome:

Great read! Troy has the technical know-how, and is able to effectively communicate complex topics so that anyone understands them. The personal stories kept me hooked!

- Mikko HyppΓΆnnen
I haven't been able to put the book down. The added intros and epilogue on each post in particular and the retrospectives from today's perspective are particularly interesting... Captivating stuff, apart from infosec, you really feel as though you’ve been taken on a journey with Troy through the years of living in paradise a.k.a. Gold Coast, craft beer, good coffee’s, travel and jet skis. Top of my list for resources I point anyone new to the space to. Simply Outstanding read - 10/10.

- Henk Brink
Love, love, LOVE the intros - i'm only familiar with Lars in terms of his online identity and videos, but the "commendations" from Richard Campbell, and in particular Rob Conery are fantastic and really anchor an emotional aspect to the book, that i was not expecting. Great to see a book deliver this authenticity - we're all only human after all!. I "cheated" and also skipped to read Charlotte's epilogue and again was blown away by the depth and genuine nature of the emotion on display. I honestly was not expecting there to be so much heart on display, but am very glad there is.

- C. Morgan
A famous American newsman used to call this "The rest of the story." This is the kind of insight that only comes with time, as the author can reflect and pick out important details that didn't get the coverage they deserved and these then find a life of their own. This book provides "the rest of the story" behind other stories which broke months or years ago. It gives these stories new life, and new significance.

- Pat Phelan
PWNED! Troy Hunt takes us on his life journey, ups and downs, explaining how haveIbeenpwned came to be, raising awareness of the world’s poor password and online security habits... This book has it all. Plenty of tech, data breaches, career hacks, IoT, Cloud, password management, application security, and more, delivered in a fun way. This info is gold, and has improved and complemented my career in IT, and also my digital life at home too. I highly recommend this book as it explains cyber security so well, and hope you have your eureka moment and improve your cyber hygiene too.

- George Ousak

I never imagined turning blog posts into a book, but here we are, and it's come out awesome 😎 This book is a labour of love the three of us have poured ourselves into over the last couple of years. It's a personal story I'm publishing at a very personal time and it's now live at book.troyhunt.com

Weekly Update 311

By Troy Hunt
Weekly Update 311

Well, after a crazy amount of work, a lot of edits, reflection, and feedback cycles, "Pwned" is almost here:

This better be a sizzling read @troyhunt or I'll be crashing the wedding in ways never done before.

Also, I thought they'd cancelled Neighbours? πŸ˜‰β€οΈ pic.twitter.com/jrYIKtL0Uh

β€” Mike Thompson (@AppSecBloke) August 30, 2022

The preview cycle is in full swing with lots of feedback coming in and revisions being made before we push it live to the masses. This is really exciting and I can't wait to get the book out there in front of everyone, stay tuned 😊

Weekly Update 311
Weekly Update 311
Weekly Update 311
Weekly Update 311

References

  1. There's clearly more going on behind the scenes with Krebs' "Final Thoughts on Ubiquiti" post (but hey, I love what they both do so hopefully that's that and everyone can get back to doing what they do best)
  2. The Russian streaming service START made it into HIBP (should I have done anything differently because it's Russian, or mostly full of Russian subscribers?)
  3. The Stripchat data is also now in HIBP (a very adult website so flagged as "sensitive" and not publicly searchable)
  4. I love a good crazy corporate response on Twitter, so here's a couple of them for you 😊 (quite funny that Ocado now decides to delete their crazy tweet!)
  5. Sponsored by: Kolide is an endpoint security solution for teams that want to meet SOC2 compliance goals without sacrificing privacy. Learn more here.

Weekly Update 310

By Troy Hunt
Weekly Update 310

By all accounts, this was one of the best weekly updates ever courtesy of a spam caller giving me a buzz at the 38:40 mark and struggling with "pwn" versus "porn". It resulted in an entertaining little on-air call and subsequently caused me to go out and register both haveibeeninpwn.com and haveibeeninporn.com. I figure these will result in much ongoing hilarity the next time I get a call of this nature about one of those domains 🀣 Oh - and there's a whole bunch of data breach stuff this week, enjoy!

Weekly Update 310
Weekly Update 310
Weekly Update 310
Weekly Update 310

References

  1. The Mudge v. Twitter scandal has some pretty serious accusations in it (there's a 6 min CNN vid in that tweet that's worth a look)
  2. Plex has gone for another round of data breach this week (actually pretty impressed that they now have 30M subscribers!)
  3. LastPass has also gone for another round (I know the optics aren't good, but the real world impact of this is almost certainly insignificant)
  4. I got a very convincing SMS phish this week (think about the human vulnerabilities this exploits, no wonder phishing remains so lucrative)
  5. Sponsored by: Kolide is a fleet visibility solution for Mac, Windows, and Linux that can help you securely scale your business. Learn more here.

Weekly Update 309

By Troy Hunt
Weekly Update 309

Right off the back of a visit to our wedding venue (4 weeks and counting!) and a few hours before heading to the snow (yes, Australia has snow), I managed to slip in a weekly update earlier today. I've gotta say, the section on Shitexpress is my favourite because there's just so much to give with this one; a service that literally ships shit with a public promise of multiple kinds of animal shit whilst data that proves only horse shit was ever shipped, a promise of 100% anonymity whilst the data set clearly shows both shit-senders and shit-receivers and possibly the most eye-opening of all, the messages accompanying the shit. So, uh, yeah, enjoy! πŸ’©

Weekly Update 309
Weekly Update 309
Weekly Update 309
Weekly Update 309

References

  1. The acoustic panelling in my office is starting to come together, but it needs more work (I'll always notice those little misaligned lines... and you probably will too now that I've mentioned it!)
  2. Kickstarter's password reset email left a lot of people confused (turns out they were just rolling people on Facebook auth to native Kickstarter accounts, but by their own admission the messaging was really confusing)
  3. Turns out the source of the templated emails I was getting about removing data from HIBP was Rightly (their intentions are good, but IMHO their execution is poor)
  4. Shitexpress - where do I even being with this one?! (just read my Twitter thread on it, it's all kinds of crazy this one)
  5. Sponsored by: Kolide can help you nail third-party audits and internal compliance goals with endpoint security for your entire fleet. Learn more here.

Weekly Update 308

By Troy Hunt
Weekly Update 308

It was all a bit last minute today after travel, office works and then a quick rebuild of desk and PC before doing this livestream (didn't even have time to comb my hair!) So yes, I took a shortcut with the description of this video, but it all worked out well in the end IMHO with plenty of content that wasn't entirely data breach related, but yeah, that does seem to be a bit of a recurring theme in these vids. Enjoy 😊

Weekly Update 308
Weekly Update 308
Weekly Update 308
Weekly Update 308

References

  1. The acoustic panelling in my office is starting to look awesome (some stuff is not lining up so it will be a little longer yet before completion)
  2. The QuestionPro breach has been pretty poorly handled (it's also now well beyond debate that it's real)
  3. If you're sending a C&D notice to a data breach forum, you're really got no idea how these things work (and now their data is... everywhere)
  4. Here's that UniFi Protect Theta cam (they're pumping out so much cool stuff lately 😎)
  5. The stage at NEXTGEN's Cyber Republic event was pretty awesome (the delayed flight home, late night and early start the next day was... less awesome πŸ™)
  6. I got what will possibly be the funniest set of spammer responses to Password Purgatory this week 🀣 (also learned a few things, I'm determined to get even better at this!)
  7. Sponsored by: Kolide believes that maintaining endpoint security shouldn’t mean compromising employee privacy. Check out our manifesto: Honest Security.

Weekly Update 307

By Troy Hunt
Weekly Update 307

A very early weekly update this time after an especially hectic week. The process with the couple of data breaches in particular was a real time sap and it shouldn't be this hard. Seriously, the amount of effort that goes into trying to get organisations to own their breach (or if they feel strongly enough about it, help attribute it to another party) is just nuts. It's not getting any better either πŸ™ Regardless, listen to how these couple went and as always, if you've got any bright ideas about how to make this process less painful then I'd love to hear them.

Weekly Update 307
Weekly Update 307
Weekly Update 307
Weekly Update 307

References

  1. The 3D models of Looney Toons characters are so cool! (were you looking for a "good" reason to get a 3D printer? 😊)
  2. The bloke behind some nasty stalkerware has now been charged (how he managed to run this for 9 years is a bit beyond me...)
  3. Just read the mSpy website about how by design, it's intended to evade detection and run surreptitiously ("think of the children" is a rubbish excuse)
  4. Speaking of thinking of the children, here's how to do it right (native controls, conversations with your kids and simply being present)
  5. There's lots of pointed to Tuned Global being involved in a breach (the way those dots line up with JB Hi-Fi's music service is the real smoking gun)
  6. I ended up doing a long tweet thread on the QuestionPro breach right after this week's video (I've also now removed the "unverified" flag)
  7. Ah, spammer hell 😈 (as I say in the vid, some tweaking of my initial email response is probably required to maximise the success rate)
  8. Sponsored by: Cloudflare. Speed up and protect your apps, APIs and websites with the world's fastest DNS. Add CDN, SSL, WAF, bot management and much more.

  • August 6th 2022 at 05:43

Sending Spammers to Password Purgatory with Microsoft Power Automate and Cloudflare Workers KV

By Troy Hunt
Sending Spammers to Password Purgatory with Microsoft Power Automate and Cloudflare Workers KV

How best to punish spammers? I give this topic a lot of thought because I spend a lot of time sifting through the endless rubbish they send me. And that's when it dawned on me: the punishment should fit the crime - robbing me of my time - which means that I, in turn, need to rob them of their time. With the smallest possible overhead on my time, of course. So, earlier this year I created Password Purgatory with the singular goal of putting spammers through the hellscape that is attempting to satisfy really nasty password complexity criteria. And I mean really nasty criteria, like much worse than you've ever seen before. I opened-sourced it, took a bunch of PRs, built out the API to present increasingly inane password complexity criteria then left it at that. Until now because finally, it's live, working and devilishly beautiful 😈

Step 1: Receive Spam

This is the easy bit - I didn't have to do anything for this step! But let me put it into context and give you a real world sample:

Sending Spammers to Password Purgatory with Microsoft Power Automate and Cloudflare Workers KV

Ugh. Nasty stuff, off to hell for them it is, and it all begins with filing the spam into a special folder called "Send Spammer to Password Purgatory":

Sending Spammers to Password Purgatory with Microsoft Power Automate and Cloudflare Workers KV

That's the extent of work involved on a spam-by-spam basis, but let's peel back the covers and look at what happens next.

Step 2: Trigger a Microsoft Power Automate Flow

Microsoft Power Automate (previously "Microsoft Flow") is a really neat way of triggering a series of actions based on an event, and there's a whole lot of connectors built in to make life super easy. Easy on us as the devs, that is, less easy on the spammers because here's what happens as soon as I file an email in the aforementioned folder:

Sending Spammers to Password Purgatory with Microsoft Power Automate and Cloudflare Workers KV

Using the built in connector to my Microsoft 365 email account, the presence of a new email in that folder triggers a brand new instance of a flow. Following, I've added the "HTTP" connector which enables me to make an outbound request:

Sending Spammers to Password Purgatory with Microsoft Power Automate and Cloudflare Workers KV

All this request does is makes a POST to an API on Password Purgatory called "create-hell". It passes an API key because I don't want just anyone making these requests as it will create data that will persist at Cloudflare. Speaking of which, let's look at what happens over there.

Step 3: Call a Cloudflare Worker and Create a Record in KV

Let's start with some history: Back in the not too distant past, Cloudflare wasn't a host and instead would just reverse proxy requests through to origin services and do cool stuff with them along the way. This made adding HTTPS to any website easy (and free), added heaps of really neat WAF functionality and empowered us to do cool things with caching. But this was all in-transit coolness whilst the app logic, data and vast bulk of the codebase sat at that origin site. Cloudflare Workers started to change that and suddenly we had code on the edge running in hundreds of nodes around the world, nice and close to our visitors. Did that start to make Cloudflare a "host"? Hmm... but the data itself was still on the origin service (transient caching aside). Fast forward to now and there are multiple options to store data on Cloudflare's edges including their (presently beta) R2 service, Durable Objects, the (forthcoming) D1 SQL database and of most importance to this blogpost, Workers KV. Does this make them a host if you can now build entire apps within their environment? Maybe so, but let's skip the titles for now and focus on the code.

All the code I'm going to refer to here is open source and available in the public Password Purgatory Logger Github repo. Very early on in the index.js file that does all the work, you'll see a function called "createHell" which is called when the flow step above runs. That code creates a GUID then stores it in KV after which I can easily view it in the Cloudflare dashboard:

Sending Spammers to Password Purgatory with Microsoft Power Automate and Cloudflare Workers KV

There's no value yet, just a key and it's returned via a JSON response in a property called "kvKey". To read that back in the flow, I need a "Parse JSON" step with a schema I generated from a sample:

Sending Spammers to Password Purgatory with Microsoft Power Automate and Cloudflare Workers KV

At this point I now have a unique ID in persistent storage and it's available in the flow, which means it's time to send the spammer an email.

Step 4: Invite the Spammer to Hell

Because it would be rude not to respond, I'd like to send the spammer back an email and invite them to my very special registration form. To do this, I've grabbed the "Reply to email" connector and fed the kvKey through to a hyperlink:

Sending Spammers to Password Purgatory with Microsoft Power Automate and Cloudflare Workers KV

It's an HTML email with the key hidden within the hyperlink tag so it doesn't look overtly weird. Using this connector means that when the email sends, it looks precisely like I've lovingly crafted it myself:

Sending Spammers to Password Purgatory with Microsoft Power Automate and Cloudflare Workers KV

With the entire flow now executed, we can view the history of each step and see how the data moves between them:

Sending Spammers to Password Purgatory with Microsoft Power Automate and Cloudflare Workers KV

Now, we play the waiting game 😊

Step 5: Log Spammer Pain

Wasting spammer time in and of itself is good. Causing them pain by having them attempt to pass increasingly obtuse password complexity criteria is better. But the best thing - the pièce de résistance - is to log that pain and share it publicly for our collective entertainment 🀣

So, by following the link the spammer ends up here (you're welcome to follow that link and have a play with it):

Sending Spammers to Password Purgatory with Microsoft Power Automate and Cloudflare Workers KV

The kvKey is passed via the query string and the page invites the spammer to begin the process of becoming a partner. All they need to leave is an email address... and a password. That page then embeds 2 scripts from the Password Purgatory website, both of which you can find in the open source and public Github repository I created in the original blog post. Each attempt at creating an account sends off the password only to the original Password Purgatory API I created months ago, after which it responds with the next set of criteria. But each attempt also sends off both the criteria that was presented (none on the first go, then something increasingly bizarre on each subsequent go), the password they tried to use to satisfy the criteria and the kvKey so it can all be tied together. What that means is that the Cloudflare Workers KV entry created earlier gradually builds up as follows:

Sending Spammers to Password Purgatory with Microsoft Power Automate and Cloudflare Workers KV

There are a couple of little conditions built into the code:

  1. If a kvKey is passed in the log request that doesn't actually exist on Cloudflare, HTTP 404 is returned. This is to ensure randos out there don't attempt to submit junk logs into KV.
  2. Once the first password is logged, there's a 15 minute window within which any further passwords can be logged. The reason is twofold: firstly, I don't want to share the spammers attempts publicly until I'm confident no more passwords can be logged just in case they add PII or something else inappropriate. Secondly, once they know the value of the kvKey a non-spammer could start submitting logs (for example, when I tweet it later on or share it via this blog post).

That's everything needed to lure the spammer in and record their pain, now for the really fun bit 😊

Step 6: Enjoy Revelling in Spammer Pain

The very first time the spammer's password attempt is logged, the Cloudflare Worker sends me an email to let me know I have a new spammer hooked (this capability using MailChannels only launched this year):

Sending Spammers to Password Purgatory with Microsoft Power Automate and Cloudflare Workers KV

It was so exciting getting this email yesterday, I swear it's the same sensation as literally getting a fish on your line! That link is one I can share to put the spammer's pain on display for the world to see. This is achieved with another Cloudflare Workers route that simply pulls out the logs for the given kvKey and formats it neatly in an HTML response:

Sending Spammers to Password Purgatory with Microsoft Power Automate and Cloudflare Workers KV

Ah, satisfaction 😊 I listed the amount of time the spammer burned with a goal to further refining the complexity criteria in the future to attempt to keep them "hooked" for longer. Is the requirement for a US post code in the password a bit too geographically specific, for example? Time will tell and I wholeheartedly welcome PRs to that effect in the original Password Purgatory API repo.

Oh - and just to ensure traction and exposure are maximised, there's a neatly formatted Twitter card that includes the last criteria and password used, you know, the ones that finally broke the spammer's spirit and caused them to give up:

Spammer burned a total of 80 seconds in Password Purgatory 😈 #PasswordPurgatory https://t.co/VwSCHNZ2AW

β€” Troy Hunt (@troyhunt) August 3, 2022

Summary

Clearly, I've taken a great deal of pleasure in messing with spammers and I hope you do too. I've gotta be honest - I've never been so excited to go through my junk mail! But I also thoroughly enjoyed putting this together with Power Automate and Workers KV, I think it's super cool that you can pull an app together like this with a combination of browser-based config plus code and storage that runs directly in hundreds of globally distributed edge nodes around the world. I hope the spammers appreciate just how elegant this all is 🀣

Weekly Update 306

By Troy Hunt
Weekly Update 306

I didn't intend for a bunch of this week's vid to be COVID related, but between the breach of an anti-vaxxer website and the (unrelated) social comments directed at our state premier following some pretty simple advice, well, it just kinda turned out that way. But there's more on other breaches too, in particular the alleged Paytm one and the actual Customer.io one.

I'm really looking forward to next week's update, here's a little teaser of what you can expect to hear about then 🀣

Weekly Update 306
Weekly Update 306
Weekly Update 306
Weekly Update 306

References

  1. I've updated the Paytm data breach to be flagged as "fabricated" (full thread on the reasons why, it's a tricky one)
  2. Anti-vax dating site that let people advertise β€˜mRNA FREE’ semen left all its user data exposed (😲😳😲)
  3. I'm genuinely sympathetic to all politicians on any side of the political fence who have to deal with the COVID mess (just read the volume of ridiculous crap they're at the receiving end of)
  4. We're still seeing the long tail of the Customer.io data breach (protecting against malicious insiders is a hard one)
  5. Sponsored by: Kolide is an endpoint security solution for teams that want to meet SOC2 compliance goals without sacrificing privacy. Learn more here.

Weekly Update 305

By Troy Hunt
Weekly Update 305

I broke Yoda's stick! 3D printing woes, and somehow I managed to get through the explanation without reverting to a chorus of My Stick by a Bad Lip Reading (and now you'd got that song stuck in your head). Loads of data breaches this week and whilst "legacy", still managed to demonstrate how bad some practices remain today (hi Shadi.com πŸ‘‹). Never a dull moment in data breach land, more from there next week 😊

Weekly Update 305
Weekly Update 305
Weekly Update 305
Weekly Update 305

References

  1. The Yoda 3D print looks amazing (just don't touch his stick)
  2. New flash - social media platform collects lots of data! (seriously, the TicTok hyperbole got a bit too much this week)
  3. What if... some free stuff is actually free? (you're not always "the product" and in many cases, that's frankly a pretty disingenuous term)
  4. Sponsored by: Kolide is a fleet visibility solution for Mac, Windows, and Linux that can help you securely scale your business. Learn more here.

If You're Not Paying for the Product, You Are... Possibly Just Consuming Goodwill for Free

By Troy Hunt
If You're Not Paying for the Product, You Are... Possibly Just Consuming Goodwill for Free

How many times have you heard the old adage about how nothing in life is free:

If you're not paying for the product, you are the product

Facebook. LinkedIn. TikTok. But this isn't an internet age thing, the origins go back way further, originally being used to describe TV viewers being served ads. Sure, TV was "free" in that you don't pay to watch it (screwy UK TV licenses aside), but running a television network ain't cheap so it was (and still is) supported by advertisers paying to put their message in front of viewers. A portion of those viewers then go out and buy the goods and services they've been pitched hence becoming the "product" of TV.

But what I dislike - no, vehemently hate - is when the term is used disingenuously to imply that nobody ever does anything for free and that there is a commercial motive to every action. To bring it closer to home for my audience, there is a suggestion that those of us who create software and services must somehow be in it for the money. Our time has a value. We pay for hardware and software to build things. We pay for hosting services. If not to make money, then why would we do it?

There are many, many non-financial motives and I'm going to talk about just a few of my own. In my very first ever blog post almost 13 years ago now, I posited that it was useful to one's career to have an online identity. My blog would give me an opportunity to demonstrate over a period of time where my interests lie and one day, that may become a very useful thing. Nobody that read that first post became a "product", quite the contrary if the feedback is correct.

The first really serious commitment I made to blogging was the following year when I began the OWASP Top 10 for ASP.NET series. That was ten blog posts of many thousands of words each that took a year and a half to complete. I had the idea whilst literally standing in the shower one day thinking about the things that bugged me at work: "I'm so sick of sending developers who write code for us basic guidance on simple security things". I wanted to solve that problem, and as I started writing the series, it turned out to be useful for a whole range of people which was awesome! Did that make them the product? No, of course not, it just made them a consumer of free content.

I can't remember exactly when I put ads on my blog. I think it was around the end of 2012, and they were terrible! I made next to no money out of them and I got rid of them altogether in 2016 in favour of the sponsorship line of text you still see at the top of the page today. Did either of these make viewers "the product" in a way that they weren't when reading the same content prior to their introduction? By any reasonable measure, no, not unless you stretch reality far enough to claim that the ads consumed some of their bandwidth or device power or in some other way was detrimental such that they pivoted from being a free consumer to a monetised reader. Then that argument dies when ads rolled to sponsorship. Perhaps it could be claimed that people became the product because the very nature of sponsorship is to get a message out there which may one day convert visitors (or their employers) to customers and that's very true, but that doesn't magically pivot them from being a free consumer of content to a "product" at the moment sponsorship arrived, that's a nonsense argument.

How about ASafaWeb in 2011? Totally free and designed to solve the common problem of ASP.NET website misconfiguration. I never made a cent from that. Never planned to, never did. So why do it? Because it was fun πŸ™‚ Seriously, I really enjoyed building that service and seeing people get value from it was enormously fulfilling. Of course nobody was the product in that case, they just consumed something for free that I enjoyed building.

Which brings me to Have I Been Pwned (HIBP), the project that's actually turned out to be super useful and is the most frequent source of the "if you're not paying for the product" bullshit argument. There were 2 very simple reasons I built that and I've given this same answer in probably a hundred interviews since 2013:

  1. I wanted to build something on Azure in anger. I was trying to drive Pfizer (where I worked at the time) down the cloud path and in particular, towards PaaS. I wanted to learn more about modern cloud paradigms myself and I didn't want to build "Hello World", so HIBP seemed like a good way to achieve this.
  2. I wanted to build a data breach search service. Ok, obvious answer, but I'd just found both my personal and Pfizer email addresses in the Adobe data breach which was somewhere I never expected to see them. But I'd given them to Macromedia (Dreamweaver FTW!) and they subsequently flowed to the new parent company after the acquisition.

That's it. Those 2 reasons. No visions of grandeur, no expectation of a return on my time, just itches I wanted to scratch. Months later, I posed this question:

A number of people have asked for a donate button on @haveibeenpwned. What do you think? Worth donating to? Or does it come across as cheap?

β€” Troy Hunt (@troyhunt) March 7, 2014

Which is exactly what it looks like on face value: people appreciating the service and wanting to support what I was doing. It didn't make anyone "the product". Nor did the first commercial use of HIBP the following year make anyone a product, it didn't change their experience one little bit. The partnership with 1Password several years later is the same again; arguably, it made HIBP more useful for the masses or non-techies that had never given any consideration to a password manager.

What about Why No HTTPS? Definitely not a product either as the service itself or the people that use it. Or HTTPS is Easy? Nope, and Cloudflare certainly didn't pay me a cent for it either, they had no idea I was building it, I just got up and felt like it one day. Password Purgatory? I just want to mess with spammers, and I'm happy to spend some of my time doing that 😊 (Unless... do they become the product if their responses are used for our amusement?!) And then what must be 100+ totally free user group talks, webinars, podcasts and other things I can't even remember that by their very design, were simply intended to get information to people for free.

What gets me a bit worked up about the "you're the product" sentiment is that it implies there's an ulterior motive for any good deed. I'm dependent on a heap of goodwill for every single project I build and none of that makes me feel like "the product". I use NWebsec for a bunch of my security headers. I use Cloudflare across almost every single project (they provide services to HIBP for free) and that certainly doesn't make me a product. The footer of this blog mentions the support Ghost Pro provides me - that's awesome, I love their work! But I don't feel like a "product".

Conversely, there are many things we pay for yet we remain "the product" of by the definition referred to in this post. YouTube Premium, for example, is worth every cent but do you think you cease being "the product" once you subscribe versus when you consume the service for free? Can you imagine Google, of all companies, going "yeah, nah, we don't need to collect any data from paying subscribers, that wouldn't be cool". Netflix. Disqus. And pretty much everything else. Paying doesn't make you not the product any more than not paying makes you the product, it's just a terrible term used way too loosely and frankly, often feels insulting.

Before jumping on the "you're the product" bandwagon, consider how it makes those who simply want to build cool stuff and put it out there for free feel. Or if you're that jaded and convinced that everything is done for personal fulfilment then fine, go and give me a donation. And now you're thinking "I bet he wrote this just to get donations" so instead, go and give Let's Encrypt a donation... but then that would kinda make free certs a commercial endeavour! See how stupid this whole argument is?

Weekly Update 304

By Troy Hunt
Weekly Update 304

It's very much a last-minute agenda this week as I catch up on the inevitable post-travel backlog and pretty much just pick stuff from my tweet timeline over the week 😊 But hey, there's some good stuff in there and I still managed to knock out almost an hour worth of content!

Weekly Update 304
Weekly Update 304
Weekly Update 304
Weekly Update 304

References

  1. La Poste Mobile got themselves ransom'd and their data dumped (and they're still offline)
  2. Mangatoon are very clearly covering up their breach (which is now hard to do given it's in HIBP and received plenty of press)
  3. The "Seconds" app is my secret presenting sauce! (any workout app that can run a sequence of timed intervals will do it)
  4. I'm totally loving Apple's AirTags to track all my things! (not loving that my AMG is still sitting Melbourne πŸ€¦β€β™‚οΈ)
  5. The Wi-Fi BBQ thermometer is actually really neat (and it does benefit from being connected, too)
  6. Sponsored by: Kolide can help you nail third-party audits and internal compliance goals with endpoint security for your entire fleet. Learn more here.

Weekly Update 303

By Troy Hunt
Weekly Update 303

And we're finally done with this trip. 26 days, 14 different accommodations, 5,146km of driving through 4 states and the last 4 weekly vids all done on the road. Travel is great, but right now going home is even better 😊 Next week's vid will be back in my comfy office with good lighting, video, audio and better planning. Until then, here's a (late) weekly update 303:

Weekly Update 303
Weekly Update 303
Weekly Update 303
Weekly Update 303

References

  1. If you're going to scrape someone else's content, don't embed the images directly off their site! (referrer header based Rickrolls 😎)
  2. The Shanghai police data breach is massive... (if it turns out to be legitimate)
  3. SHA-1 is fine and k-anonymity isn't PII (and frankly, if an organisation doesn't understand these simple facts, they've got bigger issues to deal with)
  4. The Polish government is the 34th to use HIBP's gov service (and I'm still toying with the idea of doing a "visit all the govs" tour one day)
  5. My 12th MVP award came in this week (it's still such an important part of my career 😊)
  6. Sponsored by: CrowdSec - The open-source & collaborative security stack: respond to attacks & share signals across the community. Download it for free

MVP Award 12

By Troy Hunt
MVP Award 12

11 years now, wow 😲 It's actually 11 and a bit because it was April Fool's Day in 2011 that my first MVP award came through. At the time, I referred to myself as "The Accidental MVP" as I'd no expectation of an award, it just came from me being me. It's the same again today, and the last year has been full of just doing the stuff I love; loads of talks (which, like the one above at AusCERT, are actually starting to happen in front of real live humans again), live streams every week, blog posts and perhaps my favourite thing of all, open sourcing Pwned Passwords and standing up an ingestion pipeline for the FBI. Cool 😎

But it has to be said that all these things only happen through the support of the community. There'd be no open source Pwned Passwords if nobody wanted to contribute, no live streams or blog posts if people didn't want to watch them and no conference talks if nobody attended. So, thank you for tuning in and giving me a platform to do what I love 😊

Welcoming the Polish Government to Have I Been Pwned

By Troy Hunt
Welcoming the Polish Government to Have I Been Pwned

Continuing the rollout of Have I Been Pwned (HIBP) to national governments around the world, today I'm very happy to welcome Poland to the service! The Polish CSIRT GOV is now the 34th onboard the service and has free and open access to APIs allowing them to query their government domains.

Seeing the ongoing uptake of governments using HIBP to do useful things in the wake of data breaches is enormously fulfilling and I look forward to welcoming many more national CSIRTs in the future.

Weekly Update 302

By Troy Hunt
Weekly Update 302

In a complete departure from the norm, this week's video is the much-requested "cultural differences" one with Charlotte. No tech (other than my occasional plug for the virtues of JavaScript), but lots of experiences from both of us living and working in different parts of the world. Most of it is what Charlotte has learned being thrown into the deep end of Aussieness (without the option of even getting out of the country until very recently), which I thought made for some pretty funny viewing 🀣

We almost got through the entire content I had planned... then my phone went into battery saving mode and killed the mic so apologies for that last little bit of missing content. But hey, it was worth it when the battery was low due to capturing these epic shots earlier in the day:

Stunning 🀩 pic.twitter.com/s1TRJ3bcb1

β€” Troy Hunt (@troyhunt) July 1, 2022

I think this made for fun viewing with heaps of audience engagement, I hope you enjoy watching it 😊

Weekly Update 302
Weekly Update 302
Weekly Update 302
Weekly Update 302

References

  1. Sponsored by: Detack. Detect & prevent weak, leaked, shared passwords with EPAS, a patented, privacy compliant solution used in 40 countries. Try it free!

Understanding Have I Been Pwned's Use of SHA-1 and k-Anonymity

By Troy Hunt
Understanding Have I Been Pwned's Use of SHA-1 and k-Anonymity

Four and a half years ago now, I rolled out version 2 of HIBP's Pwned Passwords that implemented a really cool k-anonymity model courtesy of the brains at Cloudflare. Later in 2018, I did the same thing with the email address search feature used by Mozilla, 1Password and a handful of other paying subscribers. It works beautifully; it's ridiculously fast, efficient and above all, anonymous. Yet from time to time, I get messages along the lines of this:

Why are you using SHA-1? It's insecure and deprecated.

Or alternatively:

Our [insert title of person who fills out paperwork but has no technical understanding here] says that k-anonymity involves sending you PII.

Both these positions make no sense whatsoever when you peel back the covers and understand what's happening underneath, but I get how on face value these conclusions can be drawn. So, let's settle it here in a more complete fashion than what I can do via short tweets or brief emails.

SHA-1 is Just Fine for k-Anonymity

Let's begin with the actual problem SHA-1 presents. Actually, the multiple problems, the first of which is that it's just way too fast for storing user passwords in an online system. More than a decade ago now, I wrote about how Our Password Hashing Has no Clothes and in that post, showed the massive rate at which consumer-grade hardware can calculate these hashes and consequently "crack" the password. Since that time, Moore's Law has done its thing many times over making the proposition of SHA-1 (or SHA-256 or SHA-512) even worse than before. For a modern day reference of how you should be storing passwords, check out OWASP's Password Storage Cheat Sheet.

The other problem relates to how SHA-1 is used for integrity checks. Hashing algorithms provide an efficient means of comparing two files and establishing if their contents is the same due to the deterministic nature of the algorithm (the same input always produces the same output). If a trustworthy source says "the hash of the file is 3713..42" (shown in abbreviated form) then any file with that same hash is assumed to be the same as the one described by the trustworthy source. We use hashes all over the place for precisely this purpose; for example, if I wanted to download Windows 11 Business Editions from my MSDN subscription, I can refer to the hash Microsoft provides on the download page:

Understanding Have I Been Pwned's Use of SHA-1 and k-Anonymity

After download, I can then use a utility such as PowerShell's Get-FileHash to verify that the file I downloaded is indeed the same one listed above. (There's another rabbit hole we can go down about how you trust the hash above, but I'll leave that for another post.)

We also use hashes when implementing subresource integrity (SRI) on websites to ensure external dependencies haven't been modified. Every time this very blog loads Font Awesome from Cloudflare's CDN, for example, it's verified against the hash in the integrity attribute of the script tag (view source for yourself).

And finally (although not exhaustively - there are many other places we use hashing algorithms in tech), we use hashing algorithms on digital certificate signatures. To pick another example from this blog, the certificate issued by Cloudflare uses SHA-256 as the signature hash algorithm:

Understanding Have I Been Pwned's Use of SHA-1 and k-Anonymity

But ponder this: if a hashing algorithm always produces a fixed length output (in the case of SHA-1, it's 40 hexadecimal characters), then there are a finite number of hashes in the world. In that SHA-1 example, the finite number is 16^40 as there are 16 possible values (0-9 and a-f) and 40 positions for them. But how many different input strings are there in the world? Infinite! So, there must be multiple input strings that produce the same output, and this is what we refer to as a "hash collision". It's possible for this to occur naturally, although it's exceedingly unlikely simply due to the massive number of possibilities 16^40 presents. However, what if you could manufacture a hash collision? I mean what if you could take an existing hash for an existing document and say "I'm going to create my own document that's different but when passed through SHA-1, produces the same hash!"?

Half a decade ago now, Google researchers demonstrated precisely this with their SHAttered attack. Their simple infographic tells the story:

Understanding Have I Been Pwned's Use of SHA-1 and k-Anonymity

And this is the heart of the integrity problem with SHA-1: it's simply past its used by date as an algorithm we can be confident in. That's why the signature hash algorithm of the TLS cert on this blog uses SHA-256 instead, among other examples of where we've eschewed the weaker algorithm in favour of stronger variants.

So, now that you understand the problem with SHA-1, let's look at how it's used in HIBP and why it isn't a problem there. There are actually 2 reasons, and I'll start with a sample of passwords used in Pwned Passwords:

P@ssw0rd
abc123
635,someone@example.com,+61430978216,37 example street
money
qwerty

That middle line isn't a password, it's a parsing problem. Not necessarily my parsing problem, it just turns out that you can't always trust hackers to dump breached data in a clean format πŸ€·β€β™‚οΈ So, instead of providing passwords to people in plain text format, I provide them as SHA-1 hashes:

21BD12DC183F740EE76F27B78EB39C8AD972A757
6367C48DD193D56EA7B0BAAD25B19455E529F5EE
A4DDCDA001E137C72FF8259F36BC67C5F9E083AA
C95259DE1FD719814DAEF8F1DC4BD64F9D885FF0
B1B3773A05C0ED0176787A4F1574FF0075F7521E

4 of those hashes are easily cracked (Google is great at that, just try searching for the first one) and that's just fine; nobody is put at risk by learning that some unidentified party used a common password. The 1 hash that won't yield any search results (until Google indexes this blog post...) is the middle one. The fact that SHA-1 is fast to calculate and has proven hash collision attacks against its integrity doesn't diminish the purpose it serves in protecting badly parsed data.

The second reason is best explained by walking through the process of how the API is queried. Let's take an example of someone signing up to a website with the following password:

P@ssw0rd

This will pass many password complexity criteria (uppercase, lowercase, number, non-alphanumeric character, 8 chars long) but is clearly terrible. Because they're signing up to a responsible website that checks Pwned Passwords on registration, that website now creates a SHA-1 hash of the provided password:

21BD12DC183F740EE76F27B78EB39C8AD972A757

Let's pause here for a sec: whether it's a hash of a password or a hash of an email address, what we're looking at is a pseudonymous representation of the original data. There's no anonymity of substance achieved here because in the specific case above, you can simply Google the hash and in the case of an email address, you can determine with near certainty (hash collisions aside), if a given plain text email address is the one used to generate the hash.

This, however, is a different story:

21BD1

This is the first 5 characters only of the hash and it's passed to the Pwned Passwords API as follows:

https://api.pwnedpasswords.com/range/21BD1

You can easily run this yourself and see the result but to summarise, the API then responds with 788 lines, including the following 5:

2D6980B9098804E7A83DC5831BFBAF3927F:1
2D8D1B3FAACCA6A3C6A91617B2FA32E2F57:1
2DC183F740EE76F27B78EB39C8AD972A757:83129
2DE4C0087846D223DBBCCF071614590F300:3
2DEA2B1D02714099E4B7A874B4364D518F6:1

What we're looking at here is the hash suffix of every hash that begins with 21BD1 followed by the number of times that password has been seen. Turns out that "P@ssw0rd" ain't a great choice as it's the one in the middle that's been seen over 83k times. The consumer of the Pwned Passwords service knows it's this one because when combined with the prefix, it's a perfect match to the full hash of the password. I'll touch more on the mathematical properties of this in a moment, for now I want to explain the second reason why SHA-1 is used:

SHA-1 makes it very easy to segment the entire corpus of hashes into roughly equal equivalent sized chunks that can be queried by prefix. As I already touched on, there are 16^5 different possible hash prefixes which is specifically 1,048,576 or "roughly a million". Not every hash prefix has 788 associated suffixes, some have more and others less but if we take that as an average, that explains how the approximately 850M passwords in the service are divided down into a million smaller collections.

Why the first 5 characters? Because if it was the first 4 then each response would be 16 times larger and it would start hurting response times. If it was the first 6 then each response would be 16 times smaller and it would start hurting anonymity. 5 characters was the sweet spot between the two.

Why not SHA-256? Instead of 40 characters each hash would be 64 characters and whilst I could have achieved the same anonymity properties by still just using the first 5 characters of the hash, each suffix in the response would be an additional 24 characters and multiplying that 788 times over adds multiple kb to each response, even when compressed on the transport layer. It's also a slower hashing algorithm; still totally unsuitable for storing user passwords in an online system, but it can have a hit on the consuming service if doing huge amounts of calculations. And for what? Integrity doesn't matter because there's no value in modifying the source password to forge a colliding hash. You'd further increase the anonymity by 16^24 more possibilities, but then why not use SHA-512 which is 128 characters therefore another 16^64 possibilities than even SHA-256? Because, as you'll read in the next section, even SHA-1 provides way more practical anonymity than you'll ever need anyway.

In summary, think of the choice of SHA-1 simply being to obfuscate poorly parsed input data to protect inadvertently included info, and as a means of dividing the collection of data down into nice easily segmentable and queryable collections. If your position is "SHA-1 is broken", then you simply don't understand its purpose here.

PII and the Protection Provided by k-Anonymity

Let's turn the discussion more to the privacy aspects of the email address search I mentioned earlier on. The principles are identical to the password search but for one difference in the technical implementation: queries are done on the first 6 characters of a SHA-1 hash, not the first 5. The reason is simple: there are a lot more email addresses in the system than passwords, about 5 billion in total. Querying via the first 6 characters of a SHA-1 hash means there are 16 times more possibilities than with the password search, therefore 16^6 or just over 16M. Let's take this email address:

test@example.com

Which hashes down to this value with SHA-1:

567159D622FFBB50B11B0EFD307BE358624A26EE

And similar to the password search, it's only the prefix that is sent to HIBP when performing a query:

567159

So, putting the privacy hat on, what's the risk when a service sends this data to HIBP? Mathematically, with the next 34 characters unknown, there are 16^34 different possible hashes that this prefix could belong to. Just to really labour the point, given a 6 character SHA-1 hash prefix you could take a 1 in 87,112,285,931,760,200,000,000,000,000,000,000,000,000 guess as to what the full hash prefix is. And then due to the infinite number of potential input strings, multiply that number out to... well... infinity. That's the total number of possible email addresses it could represent. By any definition of the term, those first 6 characters tell you absolutely nothing useful about what email address is being searched for.

But we're left with a more semantic, possibly philosophical question: is "567159" personally identifiable information? In practice, no, for all intents and purposes it's impossible to tell who this belongs to without the remaining 34 characters and even then, you still need to be able to crack that hash which is most likely only going to happen if you have a dictionary of email address to work through in which the given one appears. But it's derived from pseudonymous PII, and this is where the occasional [insert title of person who fills out paperwork but has no technical understanding here] loses their mind.

To explain this in more colloquial terms, it's like saying that the "t" at the beginning of the email address I used above is personally identifying. Really? My own email address begins with a "t", so it must be mine! It's a nonsense argument.

I'll wrap up with a definition and I like NIST's the best, not just because it's clear and concise but because they're a great authoritative source on this sort of thing (it was actually their guidance on prohibiting passwords from previous breach corpuses that led me to create Pwned Passwords in the first place):

Any representation of information that permits the identity of an individual to whom the information applies to be reasonably inferred by either direct or indirect means.

Phone numbers are PII. Physical addresses are PII. IP addresses are PII. The first 6 characters of a SHA-1 hash of someone's email address is not PII.

Summary

None of the misunderstandings I've explained above have dented the adoption of these services. Pwned Passwords is now doing in excess of 2 billion queries a month and has an ongoing feed of new passwords directly from the FBI. The k-anonymity search for email addresses sees over 100M queries a month and is baked into everything from browsers to password managers to identity theft services. The success of these services isn't due to any technical genius on my part (hat-tip again to Cloudflare), but rather to their simple yet effective implementations that (almost) everyone can easily understand 😊

Weekly Update 301

By Troy Hunt
Weekly Update 301

First up, I'm really sorry about the audio quality on this one. It's the exact same setup I used last week (and carefully tested first) but it's obviously just super sensitive to the wind. If you look at the trees in the background you can see they're barely moving, but inevitably that was enough to really mess with the audio quality. I do actually have a windsock for the mic, but it's in a drawer at home so for the remainder of this trip it'll be indoor recording only. Speaking of which, because there was a lot of enthusiasm for Charlotte and I to do one together on the cultural differences we've both experienced living in different parts of the world, that'll be next week's video. Less techie, but hopefully something you'll all enjoy 😊

Weekly Update 301
Weekly Update 301
Weekly Update 301
Weekly Update 301

References

  1. NDC Melbourne was very much like a reunion being the first NDC event we've been back to since London in Jan 2020 (and being able to share it with the kids made it extra special 😊)
  2. The travel thread continues, with much more to come yet before hitting home (a lot of gorgeous Aussie countryside scenes in there, and the best is yet to come)
  3. Sixt had a data breach (but don't worry, lots of European companies are being hacked!)
  4. Sponsored by: Varonis for Salesforce. Protect Salesforce data from overexposure and cyberthreats. Try it free!

Weekly Update 300

By Troy Hunt
Weekly Update 300

Well, we're about 2,000km down on this trip and are finally in Melbourne, which was kinda the point of the drive in the first place (things just escalated after that). The whole journey is going into a long tweet thread you can find below (or mute - that's partly why it's in a single thread):

It’s time for the next great road trip 🏎 pic.twitter.com/9B9k9cXQvH

β€” Troy Hunt (@troyhunt) June 14, 2022

Next week is NDC Melbourne so please get along to the event if you're in town, it's kinda amazing to think I'll finally be back at an NDC after all this time 😊

Weekly Update 300
Weekly Update 300
Weekly Update 300
Weekly Update 300

References

  1. We're on another epic road trip (that's the tweet thread, I'll keep adding to it as we go)
  2. Been listening to the Hardcore History podcast which is epic... (...but very heavy listening I need to break into smaller sessions)
  3. It's NDC Melbourne nest week! (my first time back at an NDC since London in early 2020, and the inaugural event for Melbourne)
  4. The DivX SubTitles breach was 783k records worth of plain text passwords (it's a 12-year-old incident, but still...)
  5. Sponsored by: Meet compliance objectives in a remote-first world without resorting to rigid device management. Try Kolide for 14-days free!

Weekly Update 299

By Troy Hunt
Weekly Update 299

How on earth does an enterprise rack-mounted NAS not come with rails to actually install it in the rack?! So yeah, that's what's in the box, something that should have been in the original box and not in a separate purchase. Just to add to the Synology packaging insanity, I went to install a couple of spare NVMe drives in it today and... there were no screws in the NVMe slots πŸ€¦β€ I'll be doing the next four weekly updates from various locations around the country as we hit the road again, stay tuned for epic tweet threads of amazing locations 😎

Weekly Update 299
Weekly Update 299
Weekly Update 299
Weekly Update 299

References

  1. The MyElectronics.nl Raspberry Pi racks are really sweet (the rack is looking pretty slick now!)
  2. Apple Watch fall detection is pretty amazing when you actually see it work as intended (I've had lots of easily dismissible false-positives on mine, but my father just demonstrated precisely how it's meant to work)
  3. A lot of personal finance is just basic maths and simple market observations (why is anyone even remotely surprised that interest rates are going up?!)
  4. The Indonesian government is now the 33rd gov on board HIBP (also the first one from Asia)
  5. Sponsored by: Varonis for Salesforce. Detect suspicious behavior and strengthen your Salesforce security posture. Try it free!

❌