Weekly Update 307

By Troy Hunt

A very early weekly update this time after an especially hectic week. The process with the couple of data breaches in particular was a real time sap and it shouldn't be this hard. Seriously, the amount of effort that goes into trying to get organisations to own their breach (or if they feel strongly enough about it, help attribute it to another party) is just nuts. It's not getting any better either 🙁 Regardless, listen to how these couple went and as always, if you've got any bright ideas about how to make this process less painful then I'd love to hear them.

References

The 3D models of Looney Tunes characters are so cool! (were you looking for a "good" reason to get a 3D printer? 😊)
The bloke behind some nasty stalkerware has now been charged (how he managed to run this for 9 years is a bit beyond me...)
Just read the mSpy website about how by design, it's intended to evade detection and run surreptitiously ("think of the children" is a rubbish excuse)
Speaking of thinking of the children, here's how to do it right (native controls, conversations with your kids and simply being present)
There's lots pointing to Tuned Global being involved in a breach (the way those dots line up with JB Hi-Fi's music service is the real smoking gun)
I ended up doing a long tweet thread on the QuestionPro breach right after this week's video (I've also now removed the "unverified" flag)
Ah, spammer hell 😈 (as I say in the vid, some tweaking of my initial email response is probably required to maximise the success rate)
Sponsored by: Cloudflare. Speed up and protect your apps, APIs and websites with the world's fastest DNS. Add CDN, SSL, WAF, bot management and much more.

August 6th 2022 at 05:43

Troy Hunt
Sending Spammers to Password Purgatory with Microsoft Power Automate and Cloudflare Workers KV
August 3rd 2022 at 21:09

By Troy Hunt

How best to punish spammers? I give this topic a lot of thought because I spend a lot of time sifting through the endless rubbish they send me. And that's when it dawned on me: the punishment should fit the crime - robbing me of my time - which means that I, in turn, need to rob them of their time. With the smallest possible overhead on my time, of course. So, earlier this year I created Password Purgatory with the singular goal of putting spammers through the hellscape that is attempting to satisfy really nasty password complexity criteria. And I mean really nasty criteria, like much worse than you've ever seen before. I open-sourced it, took a bunch of PRs, built out the API to present increasingly inane password complexity criteria then left it at that. Until now because finally, it's live, working and devilishly beautiful 😈

Step 1: Receive Spam

This is the easy bit - I didn't have to do anything for this step! But let me put it into context and give you a real world sample:

Ugh. Nasty stuff, off to hell for them it is, and it all begins with filing the spam into a special folder called "Send Spammer to Password Purgatory":

That's the extent of work involved on a spam-by-spam basis, but let's peel back the covers and look at what happens next.

Step 2: Trigger a Microsoft Power Automate Flow

Microsoft Power Automate (previously "Microsoft Flow") is a really neat way of triggering a series of actions based on an event, and there's a whole lot of connectors built in to make life super easy. Easy on us as the devs, that is, less easy on the spammers because here's what happens as soon as I file an email in the aforementioned folder:

Using the built in connector to my Microsoft 365 email account, the presence of a new email in that folder triggers a brand new instance of a flow. Following that, I've added the "HTTP" connector which enables me to make an outbound request:

All this request does is make a POST to an API on Password Purgatory called "create-hell". It passes an API key because I don't want just anyone making these requests as it will create data that will persist at Cloudflare. Speaking of which, let's look at what happens over there.
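To make the shape of that outbound request concrete, here is a minimal Python sketch. It only builds the request rather than sending it. The host name and the header used to carry the API key are assumptions for illustration; the post only says that a POST goes to "create-hell" with an API key.

```python
from urllib.request import Request

# Hypothetical values: the real host and header name are not given in the post.
BASE_URL = "https://purgatory.example"
API_KEY_HEADER = "X-Api-Key"


def build_create_hell_request(base_url: str, api_key: str) -> Request:
    """Build (but do not send) the POST that asks the API to create a new record."""
    return Request(
        url=f"{base_url}/create-hell",
        data=b"",                            # no payload is needed for this sketch
        headers={API_KEY_HEADER: api_key},   # the key stops arbitrary callers creating data
        method="POST",
    )


request = build_create_hell_request(BASE_URL, "not-a-real-key")
print(request.get_method(), request.full_url)
```

In Power Automate the same thing is configured through the HTTP connector's method, URI and headers fields rather than written as code.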

Step 3: Call a Cloudflare Worker and Create a Record in KV

Let's start with some history: Back in the not too distant past, Cloudflare wasn't a host and instead would just reverse proxy requests through to origin services and do cool stuff with them along the way. This made adding HTTPS to any website easy (and free), added heaps of really neat WAF functionality and empowered us to do cool things with caching. But this was all in-transit coolness whilst the app logic, data and vast bulk of the codebase sat at that origin site. Cloudflare Workers started to change that and suddenly we had code on the edge running in hundreds of nodes around the world, nice and close to our visitors. Did that start to make Cloudflare a "host"? Hmm... but the data itself was still on the origin service (transient caching aside). Fast forward to now and there are multiple options to store data on Cloudflare's edges including their (presently beta) R2 service, Durable Objects, the (forthcoming) D1 SQL database and of most importance to this blogpost, Workers KV. Does this make them a host if you can now build entire apps within their environment? Maybe so, but let's skip the titles for now and focus on the code.

All the code I'm going to refer to here is open source and available in the public Password Purgatory Logger Github repo. Very early on in the index.js file that does all the work, you'll see a function called "createHell" which is called when the flow step above runs. That code creates a GUID then stores it in KV after which I can easily view it in the Cloudflare dashboard:

There's no value yet, just a key and it's returned via a JSON response in a property called "kvKey". To read that back in the flow, I need a "Parse JSON" step with a schema I generated from a sample:

At this point I now have a unique ID in persistent storage and it's available in the flow, which means it's time to send the spammer an email.
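The create-then-parse round trip can be sketched in a few lines of Python. This is not the code from the repo (which is JavaScript running in a Worker); a plain dictionary stands in for Workers KV, and the only detail taken from the post is that a GUID is stored as a key with no value and returned in a JSON property called "kvKey".

```python
import json
import uuid


def create_hell(kv_store: dict) -> str:
    """Create a unique key, store it with no value yet, and return a JSON body."""
    kv_key = str(uuid.uuid4())       # stand-in for the GUID the Worker generates
    kv_store[kv_key] = None          # key only, no value yet
    return json.dumps({"kvKey": kv_key})


def parse_kv_key(response_body: str) -> str:
    """Rough equivalent of the flow's "Parse JSON" step: pull kvKey out of the body."""
    return json.loads(response_body)["kvKey"]


store = {}                           # in-memory stand-in for Workers KV
body = create_hell(store)
key = parse_kv_key(body)
print(key in store)
```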

Step 4: Invite the Spammer to Hell

Because it would be rude not to respond, I'd like to send the spammer back an email and invite them to my very special registration form. To do this, I've grabbed the "Reply to email" connector and fed the kvKey through to a hyperlink:

It's an HTML email with the key hidden within the hyperlink tag so it doesn't look overtly weird. Using this connector means that when the email sends, it looks precisely like I've lovingly crafted it myself:

With the entire flow now executed, we can view the history of each step and see how the data moves between them:

Now, we play the waiting game 😊
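For anyone wanting to reproduce the reply from Step 4 outside of Power Automate, the hyperlink can be sketched like this. The landing page URL, the query string parameter name and the email wording are all placeholders; the post only establishes that the kvKey travels in the link's query string and is not shown in the visible text.

```python
from html import escape
from urllib.parse import urlencode

# Hypothetical landing page; the post does not spell out the real URL.
SIGNUP_PAGE = "https://partners.example/signup"


def build_reply_html(kv_key: str) -> str:
    """Return an HTML email body whose link carries the key in the query string."""
    query = urlencode({"kvKey": kv_key})
    link = escape(f"{SIGNUP_PAGE}?{query}", quote=True)
    # The key lives only inside the href, so the visible text looks hand-written.
    return (
        "<p>Thanks for reaching out! Please "
        f'<a href="{link}">register as a partner</a> to continue.</p>'
    )


print(build_reply_html("00000000-0000-0000-0000-000000000000"))
```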

Step 5: Log Spammer Pain

Wasting spammer time in and of itself is good. Causing them pain by having them attempt to pass increasingly obtuse password complexity criteria is better. But the best thing - the pièce de résistance - is to log that pain and share it publicly for our collective entertainment 🤣

So, by following the link the spammer ends up here (you're welcome to follow that link and have a play with it):

The kvKey is passed via the query string and the page invites the spammer to begin the process of becoming a partner. All they need to leave is an email address... and a password. That page then embeds 2 scripts from the Password Purgatory website, both of which you can find in the open source and public Github repository I created in the original blog post. Each attempt at creating an account sends off the password only to the original Password Purgatory API I created months ago, after which it responds with the next set of criteria. But each attempt also sends off the criteria that was presented (none on the first go, then something increasingly bizarre on each subsequent go), the password they tried to use to satisfy the criteria and the kvKey so it can all be tied together. What that means is that the Cloudflare Workers KV entry created earlier gradually builds up as follows:

There are a couple of little conditions built into the code:

If a kvKey is passed in the log request that doesn't actually exist on Cloudflare, HTTP 404 is returned. This is to ensure randos out there don't attempt to submit junk logs into KV.
Once the first password is logged, there's a 15 minute window within which any further passwords can be logged. The reason is twofold: firstly, I don't want to share the spammer's attempts publicly until I'm confident no more passwords can be logged just in case they add PII or something else inappropriate. Secondly, once they know the value of the kvKey a non-spammer could start submitting logs (for example, when I tweet it later on or share it via this blog post).
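Those two conditions can be sketched as a single function. This is an illustration in Python rather than the Worker's actual JavaScript: a dictionary of lists stands in for KV, the record layout is invented, and the 403 returned once the window has closed is an assumption because the post doesn't say which status is used. Only the 404 for unknown keys and the 15 minute window come from the text above.

```python
from datetime import datetime, timedelta, timezone

LOG_WINDOW = timedelta(minutes=15)


def log_attempt(kv_store, kv_key, criteria, password, now):
    """Append one attempt to the record for kv_key and return an HTTP-style status."""
    if kv_key not in kv_store:
        return 404                   # unknown key: refuse junk logs
    attempts = kv_store[kv_key]
    if attempts and now - attempts[0]["at"] > LOG_WINDOW:
        return 403                   # assumed status; the window after the first log has closed
    attempts.append({"at": now, "criteria": criteria, "password": password})
    return 200


start = datetime(2022, 8, 3, 12, 0, tzinfo=timezone.utc)
store = {"known-key": []}
print(log_attempt(store, "known-key", None, "hunter2", start))
print(log_attempt(store, "unknown-key", None, "hunter2", start))
```

Passing the current time in as a parameter keeps the window logic easy to test without waiting 15 minutes.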

That's everything needed to lure the spammer in and record their pain, now for the really fun bit 😊

Step 6: Enjoy Revelling in Spammer Pain

The very first time the spammer's password attempt is logged, the Cloudflare Worker sends me an email to let me know I have a new spammer hooked (this capability using MailChannels only launched this year):

It was so exciting getting this email yesterday, I swear it's the same sensation as literally getting a fish on your line! That link is one I can share to put the spammer's pain on display for the world to see. This is achieved with another Cloudflare Workers route that simply pulls out the logs for the given kvKey and formats them neatly in an HTML response:

Ah, satisfaction 😊 I listed the amount of time the spammer burned with a goal to further refining the complexity criteria in the future to attempt to keep them "hooked" for longer. Is the requirement for a US post code in the password a bit too geographically specific, for example? Time will tell and I wholeheartedly welcome PRs to that effect in the original Password Purgatory API repo.
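One plausible way to arrive at a "time burned" figure is to take the gap between the first and last logged attempts. The sketch below assumes each log entry carries a timestamp, which the post doesn't confirm, so treat it as an illustration rather than how the Worker actually does it.

```python
from datetime import datetime


def seconds_burned(attempt_times):
    """Whole seconds elapsed between the earliest and latest logged attempts."""
    if not attempt_times:
        return 0
    return int((max(attempt_times) - min(attempt_times)).total_seconds())


times = [
    datetime(2022, 8, 3, 12, 0, 0),
    datetime(2022, 8, 3, 12, 0, 45),
    datetime(2022, 8, 3, 12, 1, 20),
]
print(seconds_burned(times))
```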

Oh - and just to ensure traction and exposure are maximised, there's a neatly formatted Twitter card that includes the last criteria and password used, you know, the ones that finally broke the spammer's spirit and caused them to give up:

Spammer burned a total of 80 seconds in Password Purgatory 😈 #PasswordPurgatory https://t.co/VwSCHNZ2AW
— Troy Hunt (@troyhunt) August 3, 2022

Summary

Clearly, I've taken a great deal of pleasure in messing with spammers and I hope you do too. I've gotta be honest - I've never been so excited to go through my junk mail! But I also thoroughly enjoyed putting this together with Power Automate and Workers KV, I think it's super cool that you can pull an app together like this with a combination of browser-based config plus code and storage that runs directly in hundreds of globally distributed edge nodes around the world. I hope the spammers appreciate just how elegant this all is 🤣

Troy Hunt
Weekly Update 306
July 31st 2022 at 02:07

By Troy Hunt

I didn't intend for a bunch of this week's vid to be COVID related, but between the breach of an anti-vaxxer website and the (unrelated) social comments directed at our state premier following some pretty simple advice, well, it just kinda turned out that way. But there's more on other breaches too, in particular the alleged Paytm one and the actual Customer.io one.

I'm really looking forward to next week's update, here's a little teaser of what you can expect to hear about then 🤣

References

I've updated the Paytm data breach to be flagged as "fabricated" (full thread on the reasons why, it's a tricky one)
Anti-vax dating site that let people advertise ‘mRNA FREE’ semen left all its user data exposed (😲😳😲)
I'm genuinely sympathetic to all politicians on any side of the political fence who have to deal with the COVID mess (just read the volume of ridiculous crap they're at the receiving end of)
We're still seeing the long tail of the Customer.io data breach (protecting against malicious insiders is a hard one)
Sponsored by: Kolide is an endpoint security solution for teams that want to meet SOC2 compliance goals without sacrificing privacy. Learn more here.

Troy Hunt
Weekly Update 305
July 22nd 2022 at 06:31

By Troy Hunt

I broke Yoda's stick! 3D printing woes, and somehow I managed to get through the explanation without reverting to a chorus of My Stick by a Bad Lip Reading (and now you've got that song stuck in your head). Loads of data breaches this week and whilst "legacy", they still managed to demonstrate how bad some practices remain today (hi Shadi.com 👋). Never a dull moment in data breach land, more from there next week 😊

References

The Yoda 3D print looks amazing (just don't touch his stick)
News flash - social media platform collects lots of data! (seriously, the TikTok hyperbole got a bit too much this week)
What if... some free stuff is actually free? (you're not always "the product" and in many cases, that's frankly a pretty disingenuous term)
Sponsored by: Kolide is a fleet visibility solution for Mac, Windows, and Linux that can help you securely scale your business. Learn more here.

Troy Hunt
If You're Not Paying for the Product, You Are... Possibly Just Consuming Goodwill for Free
July 18th 2022 at 07:03

By Troy Hunt

How many times have you heard the old adage about how nothing in life is free:

If you're not paying for the product, you are the product

Facebook. LinkedIn. TikTok. But this isn't an internet age thing, the origins go back way further, originally being used to describe TV viewers being served ads. Sure, TV was "free" in that you don't pay to watch it (screwy UK TV licenses aside), but running a television network ain't cheap so it was (and still is) supported by advertisers paying to put their message in front of viewers. A portion of those viewers then go out and buy the goods and services they've been pitched hence becoming the "product" of TV.

But what I dislike - no, vehemently hate - is when the term is used disingenuously to imply that nobody ever does anything for free and that there is a commercial motive to every action. To bring it closer to home for my audience, there is a suggestion that those of us who create software and services must somehow be in it for the money. Our time has a value. We pay for hardware and software to build things. We pay for hosting services. If not to make money, then why would we do it?

There are many, many non-financial motives and I'm going to talk about just a few of my own. In my very first ever blog post almost 13 years ago now, I posited that it was useful to one's career to have an online identity. My blog would give me an opportunity to demonstrate over a period of time where my interests lie and one day, that may become a very useful thing. Nobody that read that first post became a "product", quite the contrary if the feedback is correct.

The first really serious commitment I made to blogging was the following year when I began the OWASP Top 10 for ASP.NET series. That was ten blog posts of many thousands of words each that took a year and a half to complete. I had the idea whilst literally standing in the shower one day thinking about the things that bugged me at work: "I'm so sick of sending developers who write code for us basic guidance on simple security things". I wanted to solve that problem, and as I started writing the series, it turned out to be useful for a whole range of people which was awesome! Did that make them the product? No, of course not, it just made them a consumer of free content.

I can't remember exactly when I put ads on my blog. I think it was around the end of 2012, and they were terrible! I made next to no money out of them and I got rid of them altogether in 2016 in favour of the sponsorship line of text you still see at the top of the page today. Did either of these make viewers "the product" in a way that they weren't when reading the same content prior to their introduction? By any reasonable measure, no, not unless you stretch reality far enough to claim that the ads consumed some of their bandwidth or device power or in some other way were detrimental such that they pivoted from being a free consumer to a monetised reader. Then that argument dies when ads rolled to sponsorship. Perhaps it could be claimed that people became the product because the very nature of sponsorship is to get a message out there which may one day convert visitors (or their employers) to customers and that's very true, but that doesn't magically pivot them from being a free consumer of content to a "product" at the moment sponsorship arrived, that's a nonsense argument.

How about ASafaWeb in 2011? Totally free and designed to solve the common problem of ASP.NET website misconfiguration. I never made a cent from that. Never planned to, never did. So why do it? Because it was fun 🙂 Seriously, I really enjoyed building that service and seeing people get value from it was enormously fulfilling. Of course nobody was the product in that case, they just consumed something for free that I enjoyed building.

Which brings me to Have I Been Pwned (HIBP), the project that's actually turned out to be super useful and is the most frequent source of the "if you're not paying for the product" bullshit argument. There were 2 very simple reasons I built that and I've given this same answer in probably a hundred interviews since 2013:

I wanted to build something on Azure in anger. I was trying to drive Pfizer (where I worked at the time) down the cloud path and in particular, towards PaaS. I wanted to learn more about modern cloud paradigms myself and I didn't want to build "Hello World", so HIBP seemed like a good way to achieve this.
I wanted to build a data breach search service. Ok, obvious answer, but I'd just found both my personal and Pfizer email addresses in the Adobe data breach which was somewhere I never expected to see them. But I'd given them to Macromedia (Dreamweaver FTW!) and they subsequently flowed to the new parent company after the acquisition.

That's it. Those 2 reasons. No visions of grandeur, no expectation of a return on my time, just itches I wanted to scratch. Months later, I posed this question:

A number of people have asked for a donate button on @haveibeenpwned. What do you think? Worth donating to? Or does it come across as cheap?
— Troy Hunt (@troyhunt) March 7, 2014

Which is exactly what it looks like on face value: people appreciating the service and wanting to support what I was doing. It didn't make anyone "the product". Nor did the first commercial use of HIBP the following year make anyone a product, it didn't change their experience one little bit. The partnership with 1Password several years later is the same again; arguably, it made HIBP more useful for the masses or non-techies that had never given any consideration to a password manager.

What about Why No HTTPS? Definitely not a product, either the service itself or the people that use it. Or HTTPS is Easy? Nope, and Cloudflare certainly didn't pay me a cent for it either, they had no idea I was building it, I just got up and felt like it one day. Password Purgatory? I just want to mess with spammers, and I'm happy to spend some of my time doing that 😊 (Unless... do they become the product if their responses are used for our amusement?!) And then what must be 100+ totally free user group talks, webinars, podcasts and other things I can't even remember that by their very design, were simply intended to get information to people for free.

What gets me a bit worked up about the "you're the product" sentiment is that it implies there's an ulterior motive for any good deed. I'm dependent on a heap of goodwill for every single project I build and none of that makes me feel like "the product". I use NWebsec for a bunch of my security headers. I use Cloudflare across almost every single project (they provide services to HIBP for free) and that certainly doesn't make me a product. The footer of this blog mentions the support Ghost Pro provides me - that's awesome, I love their work! But I don't feel like a "product".

Conversely, there are many things we pay for where we still remain "the product" by the definition referred to in this post. YouTube Premium, for example, is worth every cent but do you think you cease being "the product" once you subscribe versus when you consume the service for free? Can you imagine Google, of all companies, going "yeah, nah, we don't need to collect any data from paying subscribers, that wouldn't be cool"? Netflix. Disqus. And pretty much everything else. Paying doesn't make you not the product any more than not paying makes you the product, it's just a terrible term used way too loosely and frankly, often feels insulting.

Before jumping on the "you're the product" bandwagon, consider how it makes those who simply want to build cool stuff and put it out there for free feel. Or if you're that jaded and convinced that everything is done for personal fulfilment then fine, go and give me a donation. And now you're thinking "I bet he wrote this just to get donations" so instead, go and give Let's Encrypt a donation... but then that would kinda make free certs a commercial endeavour! See how stupid this whole argument is?

Troy Hunt
Weekly Update 304
July 16th 2022 at 04:15

By Troy Hunt

It's very much a last-minute agenda this week as I catch up on the inevitable post-travel backlog and pretty much just pick stuff from my tweet timeline over the week 😊 But hey, there's some good stuff in there and I still managed to knock out almost an hour's worth of content!

References

La Poste Mobile got themselves ransom'd and their data dumped (and they're still offline)
Mangatoon are very clearly covering up their breach (which is now hard to do given it's in HIBP and received plenty of press)
The "Seconds" app is my secret presenting sauce! (any workout app that can run a sequence of timed intervals will do it)
I'm totally loving Apple's AirTags to track all my things! (not loving that my AMG is still sitting in Melbourne 🤦‍♂️)
The Wi-Fi BBQ thermometer is actually really neat (and it does benefit from being connected, too)
Sponsored by: Kolide can help you nail third-party audits and internal compliance goals with endpoint security for your entire fleet. Learn more here.

Troy Hunt
Weekly Update 303
July 9th 2022 at 23:21

By Troy Hunt

And we're finally done with this trip. 26 days, 14 different accommodations, 5,146km of driving through 4 states and the last 4 weekly vids all done on the road. Travel is great, but right now going home is even better 😊 Next week's vid will be back in my comfy office with good lighting, video, audio and better planning. Until then, here's a (late) weekly update 303:

References

If you're going to scrape someone else's content, don't embed the images directly off their site! (referrer header based Rickrolls 😎)
The Shanghai police data breach is massive... (if it turns out to be legitimate)
SHA-1 is fine and k-anonymity isn't PII (and frankly, if an organisation doesn't understand these simple facts, they've got bigger issues to deal with)
The Polish government is the 34th to use HIBP's gov service (and I'm still toying with the idea of doing a "visit all the govs" tour one day)
My 12th MVP award came in this week (it's still such an important part of my career 😊)
Sponsored by: CrowdSec - The open-source & collaborative security stack: respond to attacks & share signals across the community. Download it for free

Troy Hunt
MVP Award 12
July 6th 2022 at 21:55

By Troy Hunt

11 years now, wow 😲 It's actually 11 and a bit because it was April Fool's Day in 2011 that my first MVP award came through. At the time, I referred to myself as "The Accidental MVP" as I'd no expectation of an award, it just came from me being me. It's the same again today, and the last year has been full of just doing the stuff I love; loads of talks (which, like the one above at AusCERT, are actually starting to happen in front of real live humans again), live streams every week, blog posts and perhaps my favourite thing of all, open sourcing Pwned Passwords and standing up an ingestion pipeline for the FBI. Cool 😎

But it has to be said that all these things only happen through the support of the community. There'd be no open source Pwned Passwords if nobody wanted to contribute, no live streams or blog posts if people didn't want to watch them and no conference talks if nobody attended. So, thank you for tuning in and giving me a platform to do what I love 😊

Troy Hunt
Welcoming the Polish Government to Have I Been Pwned
July 4th 2022 at 07:11

By Troy Hunt

Continuing the rollout of Have I Been Pwned (HIBP) to national governments around the world, today I'm very happy to welcome Poland to the service! The Polish CSIRT GOV is now the 34th onboard the service and has free and open access to APIs allowing them to query their government domains.

Seeing the ongoing uptake of governments using HIBP to do useful things in the wake of data breaches is enormously fulfilling and I look forward to welcoming many more national CSIRTs in the future.

Troy Hunt
Weekly Update 302
July 2nd 2022 at 07:23

By Troy Hunt

In a complete departure from the norm, this week's video is the much-requested "cultural differences" one with Charlotte. No tech (other than my occasional plug for the virtues of JavaScript), but lots of experiences from both of us living and working in different parts of the world. Most of it is what Charlotte has learned being thrown into the deep end of Aussieness (without the option of even getting out of the country until very recently), which I thought made for some pretty funny viewing 🤣

We almost got through the entire content I had planned... then my phone went into battery saving mode and killed the mic so apologies for that last little bit of missing content. But hey, it was worth it when the battery was low due to capturing these epic shots earlier in the day:

Stunning 🤩 pic.twitter.com/s1TRJ3bcb1
— Troy Hunt (@troyhunt) July 1, 2022

I think this made for fun viewing with heaps of audience engagement, I hope you enjoy watching it 😊

References

Sponsored by: Detack. Detect & prevent weak, leaked, shared passwords with EPAS, a patented, privacy compliant solution used in 40 countries. Try it free!

Troy Hunt
Understanding Have I Been Pwned's Use of SHA-1 and k-Anonymity
June 30th 2022 at 07:21

By Troy Hunt

Four and a half years ago now, I rolled out version 2 of HIBP's Pwned Passwords that implemented a really cool k-anonymity model courtesy of the brains at Cloudflare. Later in 2018, I did the same thing with the email address search feature used by Mozilla, 1Password and a handful of other paying subscribers. It works beautifully; it's ridiculously fast, efficient and above all, anonymous. Yet from time to time, I get messages along the lines of this:

Why are you using SHA-1? It's insecure and deprecated.

Or alternatively:

Our [insert title of person who fills out paperwork but has no technical understanding here] says that k-anonymity involves sending you PII.

Both these positions make no sense whatsoever when you peel back the covers and understand what's happening underneath, but I get how on face value these conclusions can be drawn. So, let's settle it here in a more complete fashion than what I can do via short tweets or brief emails.

SHA-1 is Just Fine for k-Anonymity

Let's begin with the actual problem SHA-1 presents. Actually, the multiple problems, the first of which is that it's just way too fast for storing user passwords in an online system. More than a decade ago now, I wrote about how Our Password Hashing Has no Clothes and in that post, showed the massive rate at which consumer-grade hardware can calculate these hashes and consequently "crack" the password. Since that time, Moore's Law has done its thing many times over making the proposition of SHA-1 (or SHA-256 or SHA-512) even worse than before. For a modern day reference of how you should be storing passwords, check out OWASP's Password Storage Cheat Sheet.

The other problem relates to how SHA-1 is used for integrity checks. Hashing algorithms provide an efficient means of comparing two files and establishing if their contents are the same due to the deterministic nature of the algorithm (the same input always produces the same output). If a trustworthy source says "the hash of the file is 3713..42" (shown in abbreviated form) then any file with that same hash is assumed to be the same as the one described by the trustworthy source. We use hashes all over the place for precisely this purpose; for example, if I wanted to download Windows 11 Business Editions from my MSDN subscription, I can refer to the hash Microsoft provides on the download page:

After download, I can then use a utility such as PowerShell's Get-FileHash to verify that the file I downloaded is indeed the same one listed above. (There's another rabbit hole we can go down about how you trust the hash above, but I'll leave that for another post.)
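For readers not on PowerShell, the same check can be sketched in Python. It assumes the published value is a SHA-256 hex string (SHA-256 is also what Get-FileHash uses by default); the function names are made up for this example.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large downloads don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """Compare the file's hash with the published one, ignoring case and whitespace."""
    return file_sha256(path) == published_hex.strip().lower()
```

Calling `matches_published_hash("download.iso", "<hash from the download page>")` returns True only when every byte of the file matches what the publisher hashed; both arguments here are placeholders.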

We also use hashes when implementing subresource integrity (SRI) on websites to ensure external dependencies haven't been modified. Every time this very blog loads Font Awesome from Cloudflare's CDN, for example, it's verified against the hash in the integrity attribute of the script tag (view source for yourself).
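The value that goes into an integrity attribute is just the algorithm name, a hyphen and the base64-encoded digest of the resource. Here's a minimal sketch of producing one; the choice of SHA-384 is an assumption (it's a common choice for SRI) and the sample bytes are placeholders, not the real Font Awesome file.

```python
import base64
import hashlib


def sri_value(resource_bytes, algorithm="sha384"):
    """Build an integrity attribute value such as "sha384-<base64 digest>"."""
    digest = hashlib.new(algorithm, resource_bytes).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


print(sri_value(b"console.log('hello');"))
```

If even one byte of the fetched file differs from what was hashed, the browser computes a different value and refuses to run the resource.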

And finally (although not exhaustively - there are many other places we use hashing algorithms in tech), we use hashing algorithms on digital certificate signatures. To pick another example from this blog, the certificate issued by Cloudflare uses SHA-256 as the signature hash algorithm:

But ponder this: if a hashing algorithm always produces a fixed length output (in the case of SHA-1, it's 40 hexadecimal characters), then there are a finite number of hashes in the world. In that SHA-1 example, the finite number is 16^40 as there are 16 possible values (0-9 and a-f) and 40 positions for them. But how many different input strings are there in the world? Infinite! So, there must be multiple input strings that produce the same output, and this is what we refer to as a "hash collision". It's possible for this to occur naturally, although it's exceedingly unlikely simply due to the massive number of possibilities 16^40 presents. However, what if you could manufacture a hash collision? I mean what if you could take an existing hash for an existing document and say "I'm going to create my own document that's different but when passed through SHA-1, produces the same hash!"?
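The counting argument is easy to demonstrate if the hash is shrunk to a size where collisions show up immediately. The sketch below keeps only the first 2 hex characters of SHA-1, which leaves 16^2 = 256 possible outputs, so a collision is guaranteed by the 257th distinct input. The truncation and the "input-N" strings are purely illustrative; this says nothing about how hard it is to collide the full 160-bit hash.

```python
import hashlib
from itertools import count

# 40 hex characters at 4 bits each is the same space as 160 bits.
print(16 ** 40 == 2 ** 160)  # → True


def tiny_hash(text, hex_chars=2):
    """SHA-1 truncated to a handful of hex characters (a deliberately tiny output space)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()[:hex_chars]


def first_collision(hex_chars=2):
    """Feed in distinct inputs until two of them share the same truncated hash."""
    seen = {}
    for n in count():
        text = f"input-{n}"
        h = tiny_hash(text, hex_chars)
        if h in seen:
            return seen[h], text, h
        seen[h] = text


print(first_collision())
```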

Half a decade ago now, Google researchers demonstrated precisely this with their SHAttered attack. Their simple infographic tells the story:

And this is the heart of the integrity problem with SHA-1: it's simply past its use-by date as an algorithm we can be confident in. That's why the signature hash algorithm of the TLS cert on this blog uses SHA-256 instead, among other examples of where we've eschewed the weaker algorithm in favour of stronger variants.

So, now that you understand the problem with SHA-1, let's look at how it's used in HIBP and why it isn't a problem there. There are actually 2 reasons, and I'll start with a sample of passwords used in Pwned Passwords:

P@ssw0rd
abc123
635,someone@example.com,+61430978216,37 example street
money
qwerty

That middle line isn't a password, it's a parsing problem. Not necessarily my parsing problem, it just turns out that you can't always trust hackers to dump breached data in a clean format 🤷‍♂️ So, instead of providing passwords to people in plain text format, I provide them as SHA-1 hashes:

21BD12DC183F740EE76F27B78EB39C8AD972A757
6367C48DD193D56EA7B0BAAD25B19455E529F5EE
A4DDCDA001E137C72FF8259F36BC67C5F9E083AA
C95259DE1FD719814DAEF8F1DC4BD64F9D885FF0
B1B3773A05C0ED0176787A4F1574FF0075F7521E

4 of those hashes are easily cracked (Google is great at that, just try searching for the first one) and that's just fine; nobody is put at risk by learning that some unidentified party used a common password. The 1 hash that won't yield any search results (until Google indexes this blog post...) is the middle one. The fact that SHA-1 is fast to calculate and has proven hash collision attacks against its integrity doesn't diminish the purpose it serves in protecting badly parsed data.
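As a rough sketch of that conversion (not the actual ingestion pipeline, which isn't shown here), each dumped line can be hashed regardless of whether it's a real password or a parsing accident. The variable and function names are hypothetical, and UTF-8 encoding is an assumption:

```python
import hashlib

# The sample lines from above, including the badly parsed one.
dumped_lines = [
    "P@ssw0rd",
    "abc123",
    "635,someone@example.com,+61430978216,37 example street",
    "money",
    "qwerty",
]


def sha1_upper(value: str) -> str:
    """Hash a line with SHA-1 and return uppercase hexadecimal."""
    return hashlib.sha1(value.encode("utf-8")).hexdigest().upper()


hashes = [sha1_upper(line) for line in dumped_lines]

for value in hashes:
    print(value)
```

Every output is the same 40 characters long, so the line containing an email address, phone number and street address looks no different to the others.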

The second reason is best explained by walking through the process of how the API is queried. Let's take an example of someone signing up to a website with the following password:

P@ssw0rd

This will pass many password complexity criteria (uppercase, lowercase, number, non-alphanumeric character, 8 chars long) but is clearly terrible. Because they're signing up to a responsible website that checks Pwned Passwords on registration, that website now creates a SHA-1 hash of the provided password:

21BD12DC183F740EE76F27B78EB39C8AD972A757

Let's pause here for a sec: whether it's a hash of a password or a hash of an email address, what we're looking at is a pseudonymous representation of the original data. There's no anonymity of substance achieved here because in the specific case above, you can simply Google the hash and, in the case of an email address, you can determine with near certainty (hash collisions aside) whether a given plain text email address is the one used to generate the hash.

This, however, is a different story:

21BD1

This is the first 5 characters only of the hash and it's passed to the Pwned Passwords API as follows:

https://api.pwnedpasswords.com/range/21BD1

You can easily run this yourself and see the result but to summarise, the API then responds with 788 lines, including the following 5:

2D6980B9098804E7A83DC5831BFBAF3927F:1
2D8D1B3FAACCA6A3C6A91617B2FA32E2F57:1
2DC183F740EE76F27B78EB39C8AD972A757:83129
2DE4C0087846D223DBBCCF071614590F300:3
2DEA2B1D02714099E4B7A874B4364D518F6:1

What we're looking at here is the hash suffix of every hash that begins with 21BD1 followed by the number of times that password has been seen. Turns out that "P@ssw0rd" ain't a great choice as it's the one in the middle that's been seen over 83k times. The consumer of the Pwned Passwords service knows it's this one because when combined with the prefix, it's a perfect match to the full hash of the password. I'll touch more on the mathematical properties of this in a moment, for now I want to explain the second reason why SHA-1 is used:
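Here's a minimal sketch of what a consuming service does with that response. It works offline against the 5 sample lines above rather than calling the API, and the function names (`split_hash`, `parse_range`, `times_seen`) are hypothetical:

```python
import hashlib

# The 5 sample lines quoted above, standing in for the full 788 line response.
SAMPLE_RESPONSE = """\
2D6980B9098804E7A83DC5831BFBAF3927F:1
2D8D1B3FAACCA6A3C6A91617B2FA32E2F57:1
2DC183F740EE76F27B78EB39C8AD972A757:83129
2DE4C0087846D223DBBCCF071614590F300:3
2DEA2B1D02714099E4B7A874B4364D518F6:1
"""


def split_hash(password: str, prefix_len: int = 5):
    """Return (prefix, suffix) of the uppercase SHA-1 hash of the password."""
    full = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return full[:prefix_len], full[prefix_len:]


def parse_range(body: str) -> dict:
    """Turn 'SUFFIX:COUNT' lines into a {suffix: count} dictionary."""
    counts = {}
    for line in body.splitlines():
        suffix, _, count = line.strip().partition(":")
        if suffix:
            counts[suffix.upper()] = int(count)
    return counts


def times_seen(password: str, range_body: str) -> int:
    """Look up the password's own suffix in the response; 0 means not found."""
    _, suffix = split_hash(password)
    return parse_range(range_body).get(suffix, 0)


prefix, suffix = split_hash("P@ssw0rd")
print(prefix)                                    # the only part sent to the API
print(times_seen("P@ssw0rd", SAMPLE_RESPONSE))   # count taken from the sample lines
```

Only `prefix` ever leaves the consuming service. The suffix comparison happens locally, so the API never learns which of the returned hashes (if any) was the one being checked.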

SHA-1 makes it very easy to segment the entire corpus of hashes into roughly equal sized chunks that can be queried by prefix. As I already touched on, there are 16^5 different possible hash prefixes which is specifically 1,048,576 or "roughly a million". Not every hash prefix has 788 associated suffixes, some have more and others less but if we take that as an average, that explains how the approximately 850M passwords in the service are divided down into a million smaller collections.

Why the first 5 characters? Because if it was the first 4 then each response would be 16 times larger and it would start hurting response times. If it was the first 6 then each response would be 16 times smaller and it would start hurting anonymity. 5 characters was the sweet spot between the two.
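The trade-off is simple arithmetic, sketched below. The 850M figure is the approximate corpus size mentioned above, and the sketch assumes hashes are spread evenly across prefixes, which is only roughly true in practice. `bucket_stats` is a hypothetical helper:

```python
# Approximate number of passwords in the service, per the text above.
CORPUS_SIZE = 850_000_000


def bucket_stats(prefix_len: int, corpus_size: int = CORPUS_SIZE):
    """Return (number of prefixes, average hashes per prefix)."""
    buckets = 16 ** prefix_len
    return buckets, corpus_size / buckets


for prefix_len in (4, 5, 6):
    buckets, average = bucket_stats(prefix_len)
    print(prefix_len, buckets, round(average))
```

That works out to an average of roughly 13,000 hashes per response with a 4 character prefix, about 810 with 5 characters and about 51 with 6.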

Why not SHA-256? Instead of 40 characters each hash would be 64 characters and whilst I could have achieved the same anonymity properties by still just using the first 5 characters of the hash, each suffix in the response would be an additional 24 characters and multiplying that 788 times over adds multiple KB to each response, even when compressed on the transport layer. It's also a slower hashing algorithm; still totally unsuitable for storing user passwords in an online system, but it can have a hit on the consuming service if doing huge amounts of calculations. And for what? Integrity doesn't matter because there's no value in modifying the source password to forge a colliding hash. You'd further increase the anonymity by a factor of 16^24, but then why not use SHA-512 which is 128 characters and therefore 16^64 times more possibilities again than even SHA-256? Because, as you'll read in the next section, even SHA-1 provides way more practical anonymity than you'll ever need anyway.
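A quick sketch of the size difference, before any transport compression is taken into account. The 788 line count is the sample response size from earlier, and the helper names are hypothetical:

```python
import hashlib

LINES_PER_RESPONSE = 788   # size of the sample response discussed earlier
PREFIX_LEN = 5             # characters sent in the query, not repeated in the response


def hex_length(algorithm: str) -> int:
    """Length of the hexadecimal digest for the named algorithm."""
    return len(hashlib.new(algorithm, b"").hexdigest())


def suffix_length(algorithm: str) -> int:
    """Characters left in each response line after removing the prefix."""
    return hex_length(algorithm) - PREFIX_LEN


extra_per_line = suffix_length("sha256") - suffix_length("sha1")
extra_per_response = extra_per_line * LINES_PER_RESPONSE

print(hex_length("sha1"), hex_length("sha256"), hex_length("sha512"))
print(extra_per_line, extra_per_response)
```

That's 24 extra characters per line and 18,912 extra characters per uncompressed response, all of it spent on bits nobody needs.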

In summary, think of the choice of SHA-1 simply being to obfuscate poorly parsed input data to protect inadvertently included info, and as a means of dividing the collection of data down into nice easily segmentable and queryable collections. If your position is "SHA-1 is broken", then you simply don't understand its purpose here.

PII and the Protection Provided by k-Anonymity

Let's turn the discussion more to the privacy aspects of the email address search I mentioned earlier on. The principles are identical to the password search but for one difference in the technical implementation: queries are done on the first 6 characters of a SHA-1 hash, not the first 5. The reason is simple: there are a lot more email addresses in the system than passwords, about 5 billion in total. Querying via the first 6 characters of a SHA-1 hash means there are 16 times more possibilities than with the password search, therefore 16^6 or just over 16M. Let's take this email address:

test@example.com

Which hashes down to this value with SHA-1:

567159D622FFBB50B11B0EFD307BE358624A26EE

And similar to the password search, it's only the prefix that is sent to HIBP when performing a query:

567159

So, putting the privacy hat on, what's the risk when a service sends this data to HIBP? Mathematically, with the next 34 characters unknown, there are 16^34 different possible hashes that this prefix could belong to. Just to really labour the point, given a 6 character SHA-1 hash prefix you could take a 1 in 87,112,285,931,760,246,646,623,899,502,532,662,132,736 guess as to what the full hash is. And then due to the infinite number of potential input strings, multiply that number out to... well... infinity. That's the total number of possible email addresses it could represent. By any definition of the term, those first 6 characters tell you absolutely nothing useful about what email address is being searched for.
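The numbers in that paragraph can be reproduced with a short sketch. Hashing the address exactly as typed with UTF-8 encoding is an assumption here, as any normalisation HIBP applies before hashing isn't described in this post, and `email_prefix` is a hypothetical helper:

```python
import hashlib

EMAIL_PREFIX_LEN = 6   # characters of the hash sent in an email address query
SHA1_HEX_LEN = 40


def email_prefix(email: str) -> str:
    """Return the first 6 characters of the uppercase SHA-1 hash of the address."""
    full = hashlib.sha1(email.encode("utf-8")).hexdigest().upper()
    return full[:EMAIL_PREFIX_LEN]


prefix = email_prefix("test@example.com")
possible_prefixes = 16 ** EMAIL_PREFIX_LEN
unknown_chars = SHA1_HEX_LEN - EMAIL_PREFIX_LEN
candidates = 16 ** unknown_chars   # full hashes that share any one prefix

print(prefix)
print("{:,}".format(possible_prefixes))
print("{:,}".format(candidates))
```

Python's arbitrary precision integers give the exact value of 16^34 rather than a floating point approximation.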

But we're left with a more semantic, possibly philosophical question: is "567159" personally identifiable information? In practice, no, for all intents and purposes it's impossible to tell who this belongs to without the remaining 34 characters and even then, you still need to be able to crack that hash which is most likely only going to happen if you have a dictionary of email addresses to work through in which the given one appears. But it's derived from pseudonymous PII, and this is where the occasional [insert title of person who fills out paperwork but has no technical understanding here] loses their mind.

To explain this in more colloquial terms, it's like saying that the "t" at the beginning of the email address I used above is personally identifying. Really? My own email address begins with a "t", so it must be mine! It's a nonsense argument.

I'll wrap up with a definition and I like NIST's the best, not just because it's clear and concise but because they're a great authoritative source on this sort of thing (it was actually their guidance on prohibiting passwords from previous breach corpuses that led me to create Pwned Passwords in the first place):

Any representation of information that permits the identity of an individual to whom the information applies to be reasonably inferred by either direct or indirect means.

Phone numbers are PII. Physical addresses are PII. IP addresses are PII. The first 6 characters of a SHA-1 hash of someone's email address are not PII.

Summary

None of the misunderstandings I've explained above have dented the adoption of these services. Pwned Passwords is now doing in excess of 2 billion queries a month and has an ongoing feed of new passwords directly from the FBI. The k-anonymity search for email addresses sees over 100M queries a month and is baked into everything from browsers to password managers to identity theft services. The success of these services isn't due to any technical genius on my part (hat-tip again to Cloudflare), but rather to their simple yet effective implementations that (almost) everyone can easily understand 😊

Related tags
- Have I Been Pwned
June 30th 2022 at 07:21

Troy Hunt
Weekly Update 301
June 25th 2022 at 04:49

By Troy Hunt

First up, I'm really sorry about the audio quality on this one. It's the exact same setup I used last week (and carefully tested first) but it's obviously just super sensitive to the wind. If you look at the trees in the background you can see they're barely moving, but inevitably that was enough to really mess with the audio quality. I do actually have a windsock for the mic, but it's in a drawer at home so for the remainder of this trip it'll be indoor recording only. Speaking of which, because there was a lot of enthusiasm for Charlotte and me to do one together on the cultural differences we've both experienced living in different parts of the world, that'll be next week's video. Less techie, but hopefully something you'll all enjoy 😊

References

NDC Melbourne was very much like a reunion being the first NDC event we've been back to since London in Jan 2020 (and being able to share it with the kids made it extra special 😊)
The travel thread continues, with much more to come yet before hitting home (a lot of gorgeous Aussie countryside scenes in there, and the best is yet to come)
Sixt had a data breach (but don't worry, lots of European companies are being hacked!)
Sponsored by: Varonis for Salesforce. Protect Salesforce data from overexposure and cyberthreats. Try it free!


Troy Hunt
Weekly Update 300
June 17th 2022 at 22:29

By Troy Hunt

Well, we're about 2,000km down on this trip and are finally in Melbourne, which was kinda the point of the drive in the first place (things just escalated after that). The whole journey is going into a long tweet thread you can find below (or mute - that's partly why it's in a single thread):

It’s time for the next great road trip 🏎 pic.twitter.com/9B9k9cXQvH
— Troy Hunt (@troyhunt) June 14, 2022

Next week is NDC Melbourne so please get along to the event if you're in town, it's kinda amazing to think I'll finally be back at an NDC after all this time 😊

References

We're on another epic road trip (that's the tweet thread, I'll keep adding to it as we go)
Been listening to the Hardcore History podcast which is epic... (...but very heavy listening I need to break into smaller sessions)
It's NDC Melbourne next week! (my first time back at an NDC since London in early 2020, and the inaugural event for Melbourne)
The DivX SubTitles breach was 783k records worth of plain text passwords (it's a 12-year-old incident, but still...)
Sponsored by: Meet compliance objectives in a remote-first world without resorting to rigid device management. Try Kolide for 14-days free!


Troy Hunt
Weekly Update 299
June 12th 2022 at 08:18

By Troy Hunt

How on earth does an enterprise rack-mounted NAS not come with rails to actually install it in the rack?! So yeah, that's what's in the box, something that should have been in the original box and not in a separate purchase. Just to add to the Synology packaging insanity, I went to install a couple of spare NVMe drives in it today and... there were no screws in the NVMe slots 🤦‍ I'll be doing the next four weekly updates from various locations around the country as we hit the road again, stay tuned for epic tweet threads of amazing locations 😎

References

The MyElectronics.nl Raspberry Pi racks are really sweet (the rack is looking pretty slick now!)
Apple Watch fall detection is pretty amazing when you actually see it work as intended (I've had lots of easily dismissible false-positives on mine, but my father just demonstrated precisely how it's meant to work)
A lot of personal finance is just basic maths and simple market observations (why is anyone even remotely surprised that interest rates are going up?!)
The Indonesian government is now the 33rd gov on board HIBP (also the first one from Asia)
Sponsored by: Varonis for Salesforce. Detect suspicious behavior and strengthen your Salesforce security posture. Try it free!


Troy Hunt
Welcoming the Indonesian Government to Have I Been Pwned
June 6th 2022 at 00:03

By Troy Hunt

Four years ago now, I started making domains belonging to various governments around the world freely searchable via a set of APIs in Have I Been Pwned. Today, I'm very happy to welcome the 33rd government, Indonesia! As of now, the Indonesian National CERT managed under the National Cyber and Crypto Agency has full access to this service to help protect government departments within the country.

Indonesia's inclusion marks the first Asian nation to take up this service and I look forward to many more from across the globe following in future.

Related tags
- Government

Troy Hunt
Weekly Update 298
June 4th 2022 at 08:08

By Troy Hunt

I somehow ended up blasting through an hour and a quarter in this week's video with loads of discussion on the CTARS / NDIS data breach then a real time "let's see what the fuss is about" with news that one of our state's digital driver's licenses (DDL) may be easily forgeable. I think the whole discussion is actually really interesting when looked at through the lens of how on balance, a digitised license compares to a physical one. As you'll see, I think the reporting on this is overblown however... the weak encryption keys do seem like an oversight and the response of Service NSW to criticism has been lacklustre at best. Let's see how it goes in other states, I'll be first in line when they roll out in Queensland so I can finally start leaving my wallet at home!

References

I'm doing a meetup in Tassie on July 7 (in a brewery!!!)
I got pwned in the MGM Resorts data breach (I didn't even know until I checked my old Hotmail address)
The CTARS / NDIS data breach is really nasty (just really super sensitive medical data)
The controversy around the ability to forge New South Wales digital driver's licenses feels overblown (let's stop asking whether it's a perfect security construct and instead ask how it differs to the old physical plastic licenses)
Sponsored by: Kolide enables cross-platform fleet visibility for your Linux, Mac, and Windows devices. Start your free 14-day trial today!


Troy Hunt
Weekly Update 297
May 29th 2022 at 01:19

By Troy Hunt

So I basically spent my whole day yesterday playing with Ubiquiti gear and live-tweeting the experience 😊 This was an unapologetically geeky pleasure and it pretty much dominates this week's video but hey, it's a fun topic. Still, there's a bunch of data breach stuff up front and as I write this, 25M more records courtesy of the MGM breach are making their way up into HIBP. Get ready for a bunch of notification emails going out on that one. Here's this week's video:

References

Finally worked out how to handle the MGM breach (it's loading now as a new breach to ensure HIBP subscribers are appropriately notified)
The Ubiquiti G4 PTZ is a mighty looking camera! (it'll take a professional to get it mounted though, stay tuned for more)
The G4 Doorbell Pro is a little more accessible and has a remarkably better picture quality than the old "standard" one (I know it's sold out, Ubiquiti knows it's sold out, fingers crossed for more supply soon)
The in-wall wifi 6 units look almost identical to the previous gen... (but they're not - they're much more nicely made)
Sponsored by: Varonis for Salesforce. Protect Salesforce data from overexposure and cyberthreats. Try it free!


Troy Hunt
Weekly Update 296
May 20th 2022 at 07:43

By Troy Hunt

Data breaches, 3D printing and passwords - just the usual variety of things this week. More specifically, that really cool Pwned Passwords downloader that I know a bunch of people have been waiting on, and now we've finally released. It hits the existing k-anonymity API over 1 million times and that API is already going on 2 billion requests a month so I'm kinda curious to see what happens if everyone starts running the downloader at the same time... 🤔

References

This is a much better guide to what causes a 3D printer hot end to leak out the top of the heat block (the image there makes it easy to understand)
Since I broke the heater cartridge anyway, a Revo 6 should do the job (see how the nozzle and heat break are all one part)
The Pwned Passwords downloader is here! (this is a great little tool put together by Stefán)
Sponsored by: Kolide provides endpoint security for teams that value privacy, transparency, and employee productivity. Try Kolide for free today!


Troy Hunt
Downloading Pwned Passwords Hashes with the HIBP Downloader
May 19th 2022 at 22:34

By Troy Hunt

Just before Christmas, the promise to launch a fully open source Pwned Passwords fed with a firehose of fresh data from the FBI and NCA finally came true. We pushed out the code, published the blog post, dusted ourselves off and that was that. Kind of - there was just one thing remaining...

The k-anonymity API is lovely and that's not just me saying that, that's people voting with their feet:


That's already 58% by volume from my December blog post, only 5 months ago to the day. It's also just a rounding error off a 100% cache hit ratio too 😎 But the bit that remained was the promise I made in that last blog post:

Lastly, as of right now, the code to take the ingestion pipeline and dump all passwords into a downloadable corpus is yet to be written. We want to do this - we have every intention of doing this - but given how long it frequently was between releases, we don't feel the need to rush.

The idea of taking 16^5 hash ranges, bundling them all up into a single monolithic archive then making it all downloadable seemed a non-trivial task. Plus, I was still licking my wounds from the massive costs I got hit with after releasing the last archives and them exceeding the cacheable limit at the time on Cloudflare's edge. And that's when it hit me - why don't we just write a script to download all the hashes from the same k-anonymity API so many organisations are already using? It's just 16^5 separate requests and the responses could be dumped into a big text file, how hard could it be? It'd almost all be cached and there's super efficient brotli compression between the client and the Cloudflare edge so it should be fast too, so... why not?
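To give a feel for how small the idea is, here's a sketch that enumerates every hash range and the URL that would be requested for it. It's not the downloader's actual code and it makes no network calls; the URL pattern is the range endpoint shown in the earlier post, and the function names are hypothetical:

```python
from itertools import islice

RANGE_URL = "https://api.pwnedpasswords.com/range/{prefix}"
PREFIX_LEN = 5


def all_prefixes(prefix_len: int = PREFIX_LEN):
    """Yield every hash prefix from 00000 through FFFFF in order."""
    for number in range(16 ** prefix_len):
        yield format(number, "0{}X".format(prefix_len))


def range_urls(prefix_len: int = PREFIX_LEN):
    """Yield the range URL for each prefix."""
    for prefix in all_prefixes(prefix_len):
        yield RANGE_URL.format(prefix=prefix)


first_three = list(islice(range_urls(), 3))
total = sum(1 for _ in all_prefixes())

print(first_three)
print(total)
```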

I threw the idea over to Stefán and in his typically cool Icelandic way he not only built the feature, but did it much better than I was thinking in the first place. So, here's how it works in point form:

There's a public repository for the Pwned Passwords Downloader over on GitHub where you're welcome to grab the code, submit PRs or raise issues
There's also a NuGet package so if you don't want to download and compile code yourself, you can pull the executable directly via the command line

And that's it. Run it up and it looks like this:

The -p switch defines the level of parallelism to apply and when run in the Azure VM I tested this from, it took 26 minutes to pull everything down. Obviously YMMV based on connection speed, but with that massive cache hit ratio (also reflected in the output above), at least you'll be retrieving almost every single hash range from a location very close to you.
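The general shape of a parallel download can be sketched as follows. This is an assumption about the approach rather than how the tool itself is written: the `parallelism` argument stands in for what `-p` controls, and `fake_fetch` is a hypothetical stub used in place of a real HTTP request so the example runs offline:

```python
from concurrent.futures import ThreadPoolExecutor


def download_ranges(prefixes, fetch, parallelism: int = 4):
    """Fetch each range with a pool of workers and rebuild the full lines.

    fetch is any callable that takes a prefix and returns the response body
    as 'SUFFIX:COUNT' lines. Results keep the order of the prefixes.
    """
    prefixes = list(prefixes)
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        bodies = list(pool.map(fetch, prefixes))

    full_lines = []
    for prefix, body in zip(prefixes, bodies):
        for line in body.splitlines():
            if line.strip():
                # Prepend the prefix so each line holds the whole 40 character hash.
                full_lines.append(prefix + line.strip())
    return full_lines


def fake_fetch(prefix: str) -> str:
    """Stub response with two made-up suffixes (not real data)."""
    return "0" * 35 + ":1\n" + "F" * 35 + ":2\n"


result = download_ranges(["00000", "00001"], fake_fetch, parallelism=2)
for line in result:
    print(line)
```

Because the work is almost entirely waiting on network responses, adding workers shortens the total time until bandwidth or the remote end becomes the limit.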

I'm conscious the one remaining gap we have is that this doesn't make the NTLM versions downloadable and there are folks out there eagerly awaiting that. I suspect we'll take a similar approach there so stay tuned for that, it shouldn't be a biggy now we've established a pattern. I'm also conscious that to make this tool more useful, it would be handy to know when to actually run it by seeing how many new password hashes have been added since a given date. That's on the list - we know it's wanted - and especially as the volume of inbound passwords ramps up I know it'll be super useful for people.

So, go forth and grab the tool, pull down the hashes whenever you feel like it and do good things with them. Now I'm kinda curious to see what those API hit numbers look like once the masses grab this tool and make 1M+ requests each 😊


Troy Hunt
Weekly Update 295
May 15th 2022 at 01:32

By Troy Hunt

A short one this week as the previous 7 days disappeared with AusCERT and other commitments. Geez it was nice to not only be back at an event, but out there socialising and attending all the related things that tend to go along with it. I'll leave you with this tweet which was a bit of a highlight for me, having Ari alongside me at the event and watching his enthusiasm being part of the industry I love 😊

At #AusCERT with Ari for “take your son to work” day 🙂

I’m up next on stream 2 at 14:45 talking about Pwned Passwords, the FBI, the NCA and giving the whole thing over to the community, come say hi! https://t.co/PqSgb1AjMS pic.twitter.com/Z88xIrrHYW
— Troy Hunt (@troyhunt) May 12, 2022

References

The new Elgato mic boom arm is really slick (I accidentally ordered the "LP" low-profile model, which turned out to be a much better fit for the space)
I mentioned the Pwned Passwords downloader in the video so I'm sharing the link again here (I hope to blog about it this coming week, it just needs some minor tweaks first)
Sponsored by: Varonis for Salesforce. Detect suspicious behavior and strengthen your Salesforce security posture. Try it free!


Troy Hunt
Weekly Update 294
May 6th 2022 at 21:38

By Troy Hunt

It's back to business as usual with more data breaches, more poor handling of them and more IoT pain. I think on all those fronts there's a part of me that just likes the challenge and the opportunity to fix a broken thing. Or maybe I'm just a sucker for punishment, I don't know, but either way it's kept me entertained and given me plenty of new material for this week's video 😊

References

The book is almost ready to launch! (I've totally rewritten the intro, tweaked a bunch of the stories and added more - hopefully only a month off go-live)
My fallback position for the IoT not working is literally climbing over the wall (I'm going to solve - and blog - this issue around too much broadcast traffic)
Speaking of broadcast traffic, rolling from MQTT to the native Home Assistant Shelly integrations has been... not very good (I don't want to blame HA for this, it's a network-level issue)
The wifi proximity sensor I installed in my mailbox is heading for "the drawer of broken dreams" (I spoke to Lars after recording and he agreed - it sucks!)
I'll be speaking at AusCERT on the Gold Coast next week (I've decided to call my talk "Pwning Compromised Passwords with the FBI and NCA")
How PayHere in Sri Lanka has handled their data breach is pretty much a textbook example of what not to do (although kudos to the CEO for eventually apologising and acknowledging they "messed up")
Sponsored by: Got Slack? Got Macs? Get Kolide: Device security that fixes challenging problems by messaging users on Slack. Try Kolide for 14 days free.


Troy Hunt
Weekly Update 293
May 1st 2022 at 00:52

By Troy Hunt

Didn't get a lot done this week, unless you count scuba diving, snorkelling, spear fishing and laying around on tropical sand cays 😎 This week is predominantly about the time we just spent up on the Great Barrier Reef which has very little relevance to infosec, IoT, 3D printing and the other usual topics. But as I refer to in the guitar lessons blog post referenced below, I share what I do pretty transparently and organically and this week, that's what I want to talk about. So, either enjoy it or skip it until next week when I'll be back to business as usual 😊

References

I followed Lars' guidance and installed the physical mailbox sensor (so far, I'm unhappy with it, more next week)
I've gotten a lot of mileage out of my guitar lessons blog post (watch the Ricky Gervais bit, it's funny... and true)
Pictures speak a thousand words... especially when they're amazing pictures of the Great Barrier Reef (that's the tweet thread of an amazing holiday)
Sponsored by: Got Slack? Got Macs? Get Kolide: Device security that fixes challenging problems by messaging users on Slack. Try Kolide for 14 days free.


Troy Hunt
Weekly Update 292
April 22nd 2022 at 07:23

By Troy Hunt

Well that was an unusual ending. Both my mouse and keyboard decided to drop off right at the end of this week's video and without any control whatsoever, there was no way to end the live stream! Wired devices from kids borrowed, I eventually got back control and later discovered that all things Bluetooth had suddenly decided to die without any warning whatsoever. I certainly wasn't updating drivers mid-live stream or anything like that so... 🤷‍♂️

Anyway, other than that it's business as usual this week, enjoy!

References

The shots I'm getting with the new drone are amazing! (it's crazy how much tech is jammed into this little thing)
I'm disappointed that Mailchimp has stopped offering a discount for users with 2FA enabled (I'd really love to think there was an ROI for them offering the discount)
You'd think an Attorney General's office would have better things to do than forwarding on a complaint from someone who thinks HIBP has been breached (seriously, it'd take about 3 mins for anyone paying attention to understand what's going on)
Disclosing data breaches is still way too hard (people found it painful to watch a 1 hour 15 minute video of me trying to disclose to Avvo - good - that's the point - it's painful!)
Sponsored by: Varonis for Salesforce. Protect Salesforce data from overexposure and cyberthreats. Try it free!
