
Weekly Update 396

By Troy Hunt

"More Data Breaches Than You Can Shake a Stick At". That seems like a reasonable summary and I suggest there are two main reasons for this observation. Firstly, there are simply loads of breaches happening and you know this already because, well, you read my stuff! Secondly, there are a couple of Twitter accounts in particular that are taking incidents that appear across a combination of a popular clear web hacking forum and various dark web ransomware websites and "raising them to the surface", so to speak. That is, incidents that may have previously remained on the fringe are being regularly positioned in the spotlight where they have much greater visibility. The end result is greater awareness and a longer backlog of breaches to process than I've ever had before!


References

  1. Sponsored by: Report URI: Guarding you from rogue JavaScript! Don't get pwned; get real-time alerts & prevent breaches #SecureYourSite
  2. Le Slip Français was breached by "shopifyGUY" (I wonder where all these Shopify API keys are coming from?!)
  3. Roku got hit with a pretty sizeable credential stuffing attack (looks like they're now mandating multi-step auth for everyone, which is certainly one way of tackling this)
  4. There's an extraordinary rate of new breaches appearing at the moment (that's a link to the HackManac Twitter account that's been very good at reporting on these)

Weekly Update 395

By Troy Hunt

Data breach verification: that seems like a good place to start given the discussion in this week's video about Accor. Watch the vid for the whole thing but in summary, data allegedly taken from Accor was published to a popular hacking forum and the headlines inevitably followed. However, per that story:

Cybernews couldn’t confirm the authenticity of the data. We reached out to Accor for clarification and are awaiting a response.

I couldn't confirm the authenticity of the data either and I wrote a short thread about it during the week:

I'm not convinced this data is from Accor. There are barely any references to "accor" in the data and the ones that are there just look like records where Accor is a customer of another service. https://t.co/4rT17eNQ7J

- Troy Hunt (@troyhunt) April 11, 2024

Yet that headline very clearly stated there'd been a breach, as did the SC News one a few days later: Accor database exposed by IntelBroker. So... no independent verification and no statement from the company, yet a headline stating a publicly listed multinational with billions of dollars of annual revenue has had customer data exposed. That's, uh, "brave" 😲


References

  1. Sponsored by: Kolide ensures only secure devices can access your cloud apps. It's Device Trust tailor-made for Okta. Book a demo today.
  2. I'm on Hamilton Island! (that's a Google search for Whitehaven Beach 😍)
  3. Indian service boAt had 7.5M records breached (apparently the breach was carried out by "shopifyGUY", who seems to be quite good at this...)
  4. ...hence the breach I made live during the stream, Canadian retailer Giant Tiger (and there's one more in the pipeline from shopifyGUY too)
  5. Just about everyone in El Salvador also ended up in a breach (the presence of what looks like passport photos for everyone is also a bit worrying)
  6. Accor allegedly had a breach which really didn't look like Accor when I first reviewed it (but the suggestion during the live stream about it possibly being sourced from an Accor event facility was a really interesting one which deserves more investigation)

Weekly Update 394

By Troy Hunt

I suggest, based on my experiences with data breaches over the years, that AT&T is about to have a very bad time of it. Class actions following data breaches have become all too common and I've written before about how much I despise them. The trouble for AT&T (in my non-legal but "hey, I'm the data breach guy" opinion), will be their denial of a breach in 2021 and the subsequent years in which tens of millions of social security numbers were floating around. As much as it's hard for the victim of identity theft to say "this happened because of that breach", it's also hard for the corporate victim of a breach to say that identity theft didn't happen because of their breach. Particularly in such a litigious part of the world, I wouldn't be at all surprised if the legal cost of this runs into the tens if not hundreds of millions of dollars. I doubt the plaintiffs will see much of this, but there's sure going to be some happy lawyers out there!


References

  1. Sponsored by: Kolide ensures only secure devices can access your cloud apps. It's Device Trust tailor-made for Okta. Book a demo today.
  2. AT&T have now confirmed their data breach (well, kind of: "AT&T data-specific fields were contained in a data set")
  3. The big telco is already getting hit with a bunch of class action lawsuits (that's at least 10 from one US state alone!)
  4. Pandabuy got breached (and very quickly tried to stop people talking about it!)
  5. Surveylama also got breached (that's another 4.4M email addresses now out there)
  6. Now that the new Prusa Mk4 is up and running, we're printing a modular hydroponic tower (the embedded video on that Printables page gives a great overview)

Weekly Update 393

By Troy Hunt

A serious but not sombre intro this week: I mentioned at the start of the vid that I had the classic visor hat on as I'd had a mole removed from my forehead during the week, along with another on the back of my hand. Here in Australia, we have one of the highest rates of skin cancer in the world with apparently about two-thirds of us being diagnosed with it before turning 70. At present, the bits they cut off me were entirely unremarkable (small dot about an inch over my left eye if you're really curious), but the point I wanted to make was what I mentioned in the video about us doing annual checks; every year, we voluntarily front up at the GP and he checks (almost) every square inch of skin for stuff that we'd never normally notice but under the microscope, may look a bit dodgy. It's an absolute no-brainer that takes about 10 minutes and if he does decide to remove something, there's another 10 minutes and a stitch. If you're in the sun a lot like us, just do it 🙂

With that community service notice done, let's get into today's video:


References

  1. Sponsored by: Report URI: Guarding you from rogue JavaScript! Don't get pwned; get real-time alerts & prevent breaches #SecureYourSite
  2. A MASSIVE thanks to fellow MVP Daniel Hutmacher who has been invaluable in helping us tune the new SQL bits in HIBP (turns out Daniel listened to this live stream and was happy to be named)
  3. Here's what we've landed on in terms of allowable email address alias patterns (we made it ever so slightly stricter today: no period at the end of the alias and no sequential periods either)
  4. The Prusa MK4 3D printer build is now complete! (finally wrapped it up yesterday after recording this vid, beautiful machine!)
  5. English Cricket suffered a data breach that exposed more than 40k records (cue all sorts of different cricket euphemisms...)

Weekly Update 392

By Troy Hunt

Let's get straight to the controversial bit: email address validation. A penny-drop moment during this week's video was that the native browser address validator rejects many otherwise RFC compliant forms. As an example, I asked ChatGPT about the validity of the pipe symbol during the live stream and according to the AI, it's permissible "when properly quoted":

"john|doe"@example.com

Give that a go and see how far you get in an input of type "email". Mind you, that example allows a pipe when not quoted. And the more you read, the more contradictory things seem; try this Stack Overflow question about allowable characters in an address and you'll get a heap of "yeah, that one is allowed but only if quoted"... which means it won't work in an email input box! (Unless you use the "pattern" attribute and a regex that permits it - argh!)

tl;dr - especially for the purpose in question - extracting email addresses from a data dump - I think I'm just going to boil this down to a handful of permissible characters that are broadly accepted by websites and just stick with those. If you're a unique enough snowflake to be putting a quoted pipe in your alias then you're clearly not signing up to very many websites.
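As a rough illustration of the "handful of permissible characters" idea, here's a minimal Python sketch of pulling addresses out of dump text. The character set, the regex and the function names are all assumptions made for the example, not the pattern HIBP actually uses. The only rules taken from the posts themselves are the two alias restrictions mentioned in the Weekly Update 393 references: no period at the end of the alias and no sequential periods.

```python
import re

# Assumption: a deliberately narrow set of alias characters (letters, digits
# and . _ % + -). This is an illustrative choice, not HIBP's real pattern.
ALIAS_CHARS = r"[A-Za-z0-9._%+-]"
# A domain label: starts and ends with a letter or digit, hyphens allowed inside.
DOMAIN_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"

# The lookbehind stops a match from starting in the middle of a longer alias.
CANDIDATE = re.compile(
    rf"(?<!{ALIAS_CHARS})({ALIAS_CHARS}+)@((?:{DOMAIN_LABEL}\.)+[A-Za-z]{{2,}})"
)


def is_acceptable_alias(alias: str) -> bool:
    """Apply the two stricter rules: no trailing period, no sequential periods."""
    return not alias.endswith(".") and ".." not in alias


def extract_emails(text: str) -> list[str]:
    """Return every candidate address in text whose alias passes the rules."""
    found = []
    for match in CANDIDATE.finditer(text):
        alias, domain = match.group(1), match.group(2)
        if is_acceptable_alias(alias):
            found.append(f"{alias}@{domain}")
    return found


sample = (
    'row1,john.doe@example.com,x\n'
    '"john|doe"@example.com\n'      # quoted pipe: skipped entirely
    'bad.@example.com\n'            # alias ends with a period: rejected
    'a..b@example.com\n'            # sequential periods: rejected
    'Jane+tag@Mail.Example.org'
)
print(extract_emails(sample))
```

The trade-off is the one described above: anything relying on quoting or unusual characters is simply ignored, in exchange for a pattern that lines up with what most sign-up forms accept.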


References

  1. Sponsored by: Report URI: Guarding you from rogue JavaScript! Don't get pwned; get real-time alerts & prevent breaches #SecureYourSite
  2. It just went from bad to worse for Onerep with Mozilla cutting ties (it's hard to imagine they really had any choice left)
  3. Is the alleged AT&T breach really just "alleged"? (read the comments on that blog post and see what you think...)
  4. MediaWorks in NZ got breached and their data spread all over the place (although the data is pretty benign in the scheme of things)
  5. But hey, at least MediaWorks had some solid advice around protecting yourself online! (checking if you were included in "other" breaches now needs a bit of a revision...)

Inside the Massive Alleged AT&T Data Breach

By Troy Hunt

I hate having to use that word - "alleged" - because it's so inconclusive and I know it will leave people with many unanswered questions. (Edit: 12 days after publishing this blog post, it looks like the "alleged" caveat can be dropped, see the addition at the end of the post for more.) But sometimes, "alleged" is just where we need to begin and over the course of time, proper attribution is made and the dots are joined. We're here at "alleged" for two very simple reasons: one is that AT&T is saying "the data didn't come from us", and the other is that I have no way of proving otherwise. But I have proven, with sufficient confidence, that the data is real and the impact is significant. Let me explain:

Firstly, just as a primer if you're new to this story, read BleepingComputer's piece on the incident. What it boils down to is in August 2021, someone with a proven history of breaching large organisations posted what they claimed were 70 million AT&T records to a popular hacking forum and asked for a very large amount of money should anyone wish to purchase the data. From that story:

From the samples shared by the threat actor, the database contains customers' names, addresses, phone numbers, Social Security numbers, and date of birth.

Fast forward two and a half years and the successor to this forum saw a post this week alleging to contain the entire corpus of data. Except that rather than put it up for sale, someone has decided to just dump it all publicly and make it easily accessible to the masses. This isn't unusual: "fresh" data has much greater commercial value and is often tightly held for a long period before being released into the public domain. The Dropbox and LinkedIn breaches, for example, occurred in 2012 before being broadly distributed in 2016 and just like those incidents, the alleged AT&T data is now in very broad circulation. It is undoubtedly in the hands of thousands of internet randos.

AT&T's position on this is pretty simple:

AT&T continues to tell BleepingComputer today that they still see no evidence of a breach in their systems and still believe that this data did not originate from them.

The old adage of "absence of evidence is not evidence of absence" comes to mind (just because they can't find evidence of it doesn't mean it didn't happen), but as I said earlier on, I (and others) have so far been unable to prove otherwise. So, let's focus on what we can prove, starting with the accuracy of the data.

The linked article talks about the author verifying the data with various people he knows, as well as other well-known infosec identities verifying its accuracy. For my part, I've got 4.8M Have I Been Pwned (HIBP) subscribers I can lean on to assist with verification, and it turns out that 153k of them are in this data set. What I'll typically do in a scenario like this is reach out to the 30 newest subscribers (people who will hopefully recall the nature of HIBP from their recent memory), and ask them if they're willing to assist. I linked to the story from the beginning of this blog post and got a handful of willing respondents to whom I sent their data and asked two simple questions:

  1. Does this data look accurate?
  2. Are you an AT&T customer and if not, are you a customer of another US telco?

The first reply I received was simple, but emphatic:


This individual had their name, phone number, home address and most importantly, their social security number exposed. Per the linked story, social security numbers and dates of birth exist on most rows of the data in encrypted format, but two supplemental files expose these in plain text. Taken at face value, it looks like whoever snagged this data also obtained the private encryption key and simply decrypted the vast bulk of (but not all of) the protected values.


The above example simply didn't have plain text entries for the encrypted data. Just by way of raw numbers, the file that aligns with the "70M" headline actually has 73,481,539 lines with 49,102,176 unique email addresses. The file with decrypted SSNs has 43,989,217 lines and the decrypted dates of birth file only has 43,524 rows. (Edit: the reason for this later became clear - there is only one entry per date of birth which is then referenced from multiple records.) The last file, for example, has rows that look just like this:

.encrypted_value='*0g91F1wJvGV03zUGm6mBWSg==' .decrypted_value='1996-07-18'

That encrypted value is precisely what appears in the large file hence providing an easy way of matching all the data together. But those numbers also obviously mean that not every impacted individual had their SSN exposed, and most individuals didn't have their date of birth leaked. (Edit: per above, the same entries in the DoB file are referenced by multiple source records so whilst not every record had a DoB recorded, the difference isn't as stark as I originally reported.)
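To make the matching step concrete, here's a small Python sketch that turns rows of the supplemental file into a lookup table and resolves encrypted values found in the main records. The row format is taken from the sample row above; everything else (the record layout as dictionaries, the `dob_encrypted` field name, the function names and the example addresses) is a hypothetical stand-in, since the structure of the large file isn't described here.

```python
import re

# Matches rows shaped like the sample row from the decrypted dates of birth file.
LOOKUP_LINE = re.compile(
    r"\.encrypted_value='([^']*)'\s+\.decrypted_value='([^']*)'"
)


def build_lookup(lines):
    """Map each encrypted value to its plain text counterpart."""
    lookup = {}
    for line in lines:
        match = LOOKUP_LINE.search(line)
        if match:  # silently skip rows that don't fit the expected shape
            lookup[match.group(1)] = match.group(2)
    return lookup


def decrypt_field(record, field, lookup):
    """Return the plain text for record[field], or None if there's no entry."""
    return lookup.get(record.get(field))


dob_lines = [
    ".encrypted_value='*0g91F1wJvGV03zUGm6mBWSg==' .decrypted_value='1996-07-18'",
]
lookup = build_lookup(dob_lines)

# Hypothetical records: two share one encrypted value, one has no lookup entry.
records = [
    {"email": "a@example.com", "dob_encrypted": "*0g91F1wJvGV03zUGm6mBWSg=="},
    {"email": "b@example.com", "dob_encrypted": "*0g91F1wJvGV03zUGm6mBWSg=="},
    {"email": "c@example.com", "dob_encrypted": "*notInTheLookupFile=="},
]
for record in records:
    print(record["email"], decrypt_field(record, "dob_encrypted", lookup))
```

The first two records resolve through the same lookup entry, which is the situation described in the edit: one row per date of birth, referenced from multiple source records.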


As I'm fond of saying, there's only one thing worse than your data appearing on the dark web: it's appearing on the clear web. And that's precisely where it is; the forum this was posted to isn't within the shady underbelly of a Tor hidden service, it's out there in plain sight on a public forum easily accessed by a normal web browser. And the data is real.

That last response is where most people impacted by this will now find themselves - "what do I do?" Usually I'd tell them to get in touch with the impacted organisation and request a copy of their data from the breach, but if AT&T's position is that it didn't come from them then they may not be much help. (Although if you are a current or previous customer, you can certainly request a copy of your personal information regardless of this incident.) I've personally also used identity theft protection services since as far back as the 90's now, simply to know when actions such as credit enquiries appear against my name. In the US, this is what services like Aura do and it's become common practice for breached organisations to provide identity protection subscriptions to impacted customers (full disclosure: Aura is a previous sponsor of this blog, although we have no ongoing or upcoming commercial relationship).

What I can't do is send you your breached data, or an indication of what fields you had exposed. Whilst I did this in that handful of aforementioned cases as part of the breach verification process, this is something that happens entirely manually and is infeasible en masse. HIBP only ever stores email addresses and never the additional fields of personal information that appear in data breaches. In case you're wondering why that is, we got a solid reminder only a couple of months ago when a service making this sort of data available to the masses had an incident that exposed tens of billions of rows of personal information. That's just an unacceptable risk for which the old adage of "you cannot lose what you do not have" provides the best possible fix.

As I said in the intro, this is not the conclusive end I wanted for this blog post... yet. As impacted HIBP subscribers receive their notifications and particularly as those monitoring domains learn of the aliases in the breach (many domain owners use unique aliases per service they sign up to), we may see a more conclusive outcome to this incident. That may not necessarily be confirmation that the data did indeed originate from AT&T, it could be that it came from a third party processor they use or from another entity altogether that's entirely unrelated. The truth is somewhere there in the data, I'll add any relevant updates to this blog post if and when it comes out.

As of now, all 49M impacted email addresses are searchable within HIBP.

Edit (31 March): AT&T have just released a short statement making 2 important points:

AT&T data-specific fields were contained in a data set
it is not yet known whether the data in those fields originated from AT&T or one of its vendors

They've also been mass-resetting account passcodes after TechCrunch apparently alerted AT&T to the presence of these in the data set. That article also includes the following statement from AT&T:

Based on our preliminary analysis, the data set appears to be from 2019 or earlier, impacting approximately 7.6 million current AT&T account holders and approximately 65.4 million former account holders

Between originally publishing this blog post and AT&T's announcements today, there have been dozens of comments left below that attribute the source of the breach to AT&T in ways that made it increasingly unlikely that the data could have been sourced from anywhere else. I know that many journos (and myself) reached out to folks in AT&T to draw their attention to this, I'm happy to now end this blog post by quoting myself from the opening para 😊

But sometimes, "alleged" is just where we need to begin and over the course of time, proper attribution is made and the dots are joined.

Weekly Update 391

By Troy Hunt

I'm in Japan! Without tripod, without mic and having almost completely forgotten to do this vid, simply because I'm enjoying being on holidays too much 😊 It was literally just last night at dinner the penny dropped - "don't I normally do something around now...?" The weeks leading up to this trip were especially chaotic and to be honest, I simply forgot all about work once we landed here. And when you see the pics in the thread below, you'll understand why:

Tokyo time! 🍣 pic.twitter.com/dG0Ja60eQb

- Troy Hunt (@troyhunt) March 13, 2024

Regardless, this week has a bunch of content primarily on the Onerep mess; can you imagine a company selling services to remove your data from the other services they're running?! That's the Krebs position and the story is a great read so go and check that out. We may not have heard the end of it yet either, especially given the Mozilla situation.


References

  1. Sponsored by: Kolide can get your cross-platform fleet to 100% compliance. It's Zero Trust for Okta. Want to see for yourself? Book a demo.
  2. Four new breaches into HIBP this week (these are older incidents, but they're helping us fine-tune the breach load process)
  3. Onerep got a thorough Krebsing (yet to hear any more about this too, even so much as a statement from the company)

Welcoming the Liechtenstein Government to Have I Been Pwned

By Troy Hunt

Over the last 6 years, we've been very happy to welcome dozens of national governments to have unhindered access to their domains in Have I Been Pwned, free from cost and manual verification barriers. Today, we're happy to welcome Liechtenstein's National Cyber Security Unit who now have full access to their government domains.

We provide this support to governments to help those tasked with protecting their national interests understand more about the threats posed by data breaches, and we look forward to welcoming many more national infosec teams in the future.

Weekly Update 390

By Troy Hunt

Let me begin by quoting Stefan during the livestream: "Turns out having tons of data integrity is expensive". Yeah, and working with tons of data in a fashion that's both fast and cost effective is bloody painful. I'm reminded of the old "fast, good and cheap - pick 2" saying, but there's a lot more nuance to it than that, of course. I mean Table Storage was all 3 of those, just so long as we never needed to restore at all, let alone to a point in time. Or geo-replicate. Or do ad hoc queries and so on and so forth. Mind you, I think that with a combination of Azure SQL in Hyperscale mode, some better index optimisation, and a willingness to scale up more aggressively when processing large breaches, we might be able to find a happy balance. Literally as I'm writing this, we're upgrading to Hyperscale so hopefully when I do next week's video from Tokyo, there'll be a happy story to tell (or I'll be drowning my sorrows in sake).


References

  1. Sponsored by: Kolide ensures that if a device isn't secure, it can't access your apps. It's Device Trust for Okta. Watch the demo today!
  2. The German government has become the 35th national gov to be granted access to all their gov domains in HIBP (and one more to come next week)
  3. WoTLabs got very pwned (site defacement on top of leaked data is never a good look)
  4. The Онлайн Трейд (Online Trade) breach was an oldie, but it's helping us tune the import process as part of the RDBMS rollover (which is... painful)
  5. Speaking of RDBMS rollover, most of the ideas I had during this video have proven to be completely useless, so we're now rolling to Hyperscale as well (it's actually only very slightly more expensive)
  6. We're still contributing to the HIBP UX rebuild repo (consider it a "soft launch" for now, I'll blog about it in more detail after I get back from Japan)

  • March 10th 2024 at 04:38

Welcoming the German Government to Have I Been Pwned

By Troy Hunt

Back in 2018, we started making Have I Been Pwned domain searches freely available to national government cybersecurity agencies responsible for protecting their nations' online infrastructure. Today, we're very happy to welcome Germany as the 35th country to use this service, courtesy of their CERTBund department. This access now provides them with complete access to the exposure of their government domains in data breaches.

With the unabated flood of data breaches, we're happy to provide this support to governments in the hope it better enables them to protect their national interests and we look forward to welcoming many more national CERTs in the future.

Weekly Update 389

By Troy Hunt

How on earth are we still here? You know, that place where breached companies stand up and go all Iraqi information minister on the incident as if somehow, flatly denying the blatantly obvious will make it all go away. It's the ease of debunking the "no breach here" claim that I find particularly fascinating; the truth is always sitting there in the data and it doesn't take much to bring it to the surface. Ah well, as I always end up lamenting, with behaviour like this it's a good time to be in the industry 🤷‍♂️


References

  1. Sponsored by: Report URI: Guarding you from rogue JavaScript! Don't get pwned; get real-time alerts & prevent breaches #SecureYourSite
  2. Cutout.Pro got breached and 20M email addresses leaked (for the most part, an unremarkable incident)
  3. I've stood up a GitHub repo to start collaborating on the HIBP UX redesign (consider this a "soft launch" for the moment, I'll blog about it later on)
  4. The Cutout.Pro breach isn't "alleged", it's real (it's crazy to say there's no evidence of a breach when there's all this evidence of a breach!)
  5. The FedEx phish post went up just after last week's video (still kinda nuts that's even a thing...)
  6. We're doing a full 3D printer build thread (watch the Prusa MK4 gradually take shape!)

Weekly Update 388

By Troy Hunt

It's just been a joy to watch the material produced by the NCA and friends following the LockBit takedown this week. So much good stuff from the agencies themselves, not just content but high quality trolling too. Then there's the whole ecosystem of memes that have since emerged and provided endless hours of entertainment 😊 I'm sure we'll see a lot more come out of this yet and inevitably there's seized material that will still be providing value to further investigations years from now. Good job folks!


References

  1. Sponsored by: Report URI: Guarding you from rogue JavaScript! Don't get pwned; get real-time alerts & prevent breaches #SecureYourSite
  2. LockBit got seriously taken down by a coalition of law enforcement agencies this week (that's a link through to vxunderground's Twitter profile which has had excellent commentary)
  3. FedEx or Phish? (I've since written up the blog post, so I'll talk more about that next week)

Thanks FedEx, This is Why we Keep Getting Phished

By Troy Hunt

I've been getting a lot of those "your parcel couldn't be delivered" phishing attacks lately and if you're a human with a phone, you probably have been too. Just as a brief reminder, they look like this:


These get through all the technical controls that exist at my telco and they land smack bang in my SMS inbox. However, I don't fall for the scams because I look for the warning signs: a sense of urgency, fear of missing out, and strange URLs that look nothing like any parcel delivery service I know of. They have a pretty rough go of convincing me they're from Australia Post by putting "auspost" somewhere or other within each link, but I'm a smart human so I don't fall for this (that's a joke, read why humans are bad at URLs).
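The "auspost somewhere or other within each link" trick works because people scan for the brand name rather than reading the host. A minimal Python sketch of the difference between "contains the brand" and "is actually on the official domain" follows. It assumes `auspost.com.au` is the official domain being checked against, and the suspicious URLs are invented examples on the reserved `.example` TLD rather than anything taken from the real messages.

```python
from urllib.parse import urlsplit


def is_official_host(url: str, official_domain: str) -> bool:
    """True only if the URL's host is the official domain or a subdomain of it."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    official = official_domain.lower()
    return host == official or host.endswith("." + official)


# Assumed official domain, used here purely for illustration.
OFFICIAL = "auspost.com.au"

candidates = [
    "https://www.auspost.com.au/track",              # genuine subdomain
    "https://auspost.parcel-help.example/track",     # brand as a subdomain label
    "https://auspost.com.au.parcel-help.example/",   # official domain as a prefix
    "https://parcel-help.example/auspost/track",     # brand only in the path
]
for url in candidates:
    print(url, "contains brand:", "auspost" in url,
          "official host:", is_official_host(url, OFFICIAL))
```

Every candidate contains "auspost", but only the first one is hosted on the assumed official domain. A string without a scheme (for example `auspost.com.au/track`) has no parseable host here and is treated as not official.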

However... I am expecting a parcel. It's well into the 2020's and post COVID so I'm always expecting a parcel, because that's just how we buy stuff these days. And so, when I received the following SMS earlier this week I was expecting a parcel and I was expecting phishing attacks:


So... which is it? Parcel or phish? Let's see what the people say:

Referring to the parent tweet, is this message legit and should I pay the duty and taxes?

- Troy Hunt (@troyhunt) February 20, 2024

Whoa - that's an 87% "dodgy AF" vote from over 4,000 respondents so yeah, that's pretty emphatic. Why such an overwhelmingly suspicious crowd? Let's break that message down into 7 "dodgy AF" signs:

  1. Phishers commonly make typos in their messaging and I know "FedEx" always capitalises the "E". And what's with the "-Exp"? Dodgy AF!
  2. Why does the shipment number look so short? And why is it identical to the requested payment below? Dodgy AF!
  3. Ah, so it's urgent is it? Urgency is a core tenet of social engineering as it encourages people to act without properly thinking it through. Dodgy AF!
  4. Why are the "D" and the "T" capitalised? Dodgy AF!
  5. This is a US-headquartered global delivery parcel service, why aren't they telling me the currency? Or even using a dollar sign? Dodgy AF!
  6. Does this even need explaining? What's this "bpoint.com.au" service? It's definitely not a FedEx domain nor an Aussie gov one if we're talking duty and taxes. Dodgy AF!
  7. So... you're going to give me the contact details for any "query" (not "queries", so there's another grammatical red flag), the very practice we're now moving away from for one simple reason: because it's dodgy AF!

And so, I was with the 87% of other people. However... I was expecting a package. From FedEx. Coming from outside Australia so it may attract duty and taxes. And I really want to get this package because it's a new 3D printer from Prusa, and they're awesome!

There's a sage piece of advice that's always relevant in these cases and it's very simple: if in doubt, go to the website in question and verify the request yourself. So, I went to the purchase confirmation from Prusa, found the shipping details and followed the link to the FedEx website. Now it was simply a matter of finding the section that talks about tax, except...


Dodgy. A. F.

I went all through that page and couldn't find a single reference to duty, nor for anything tax related. Try as I might, I couldn't establish the authenticity of the SMS by going directly to the (alleged) source. But what I could easily establish is that if you follow that link in the SMS, you can change the tracking number, the customer name and the amount to absolutely anything you want!


This is all done by simply changing the URL parameters; I'm not modifying the browser DOM or intercepting traffic or doing anything fancy, it's literally just query string parameter tampering reflected XSS style. This feels like every phishing site ever, not a payment service run by Australia's largest bank. Seriously, BPOINT is provided by the Commonwealth Bank and after the experience above, I'm at the point of reaching out to them and making a disclosure. Except that this is how the system was obviously designed to work and it's a completely parallel issue to phishy FedEx SMSs. Speaking of which, the very next morning I got another one from the same sender:


I don't know if this makes it better or worse 🤦‍♂️ Let's just jump into the highlights, both good and bad:

  1. My shipping number is now actually in the text of the email - yay!
  2. The words "duty" and "taxes" are now represented in the correct case - yay!
  3. The words "PAY NOW" are capitalised which seems... dodgy AF!
  4. And my favourite bit of all: the "link" isn't actually a link at all because it contains no scheme, no domain and no path, just the query string parameters! Dodgy AF!

It's quite unbelievable what they've done with the link because it makes the SMS entirely unactionable. It's impossible to click anywhere and pay the money. And while I'm here, why are all the query string parameter names now capitalised? It's like there's a completely different (broken) process somewhere generating these links. Or scammers just aren't consistent...
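Both observations (values that can be rewritten just by editing the query string, and a "link" that has no scheme or domain) are easy to show with Python's standard `urllib.parse`. This is only a sketch: the host `payments.example` and the parameter names `ref`, `name` and `amount` are made up for the example and are not the ones used in the actual messages or by the payment service.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_params(url: str, **overrides: str) -> str:
    """Return the same URL with selected query string values replaced."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    params.update(overrides)
    return urlunsplit(parts._replace(query=urlencode(params)))


def is_actionable_link(text: str) -> bool:
    """A link needs at least an http(s) scheme and a host to be followable."""
    parts = urlsplit(text)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Hypothetical payment link shaped like "page plus query string parameters".
original = "https://payments.example/pay?ref=123456&name=Troy&amount=100.00"

# No DOM editing or traffic interception: only the query string changes.
print(with_params(original, name="Anyone", amount="0.01"))

# Query string parameters on their own are not a usable link.
print(is_actionable_link(original))
print(is_actionable_link("?Ref=123456&Name=Troy&Amount=100.00"))
```

If a page displays whatever arrives in those parameters, the recipient has no way of telling a legitimate amount from one chosen by whoever crafted the link.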

Because "dodgy AF" is the prevailing theme, I needed to dig deeper, so I searched for the 1800 number. One of the first results was for a Reverse Australia page for that number which upon reading the first 3 comments, perfectly summed up the sentiment so far:

And the more you read both on that site and other top links in the search results, the more people are totally confused about the legitimacy of the messages. There's only one thing to do - call FedEx. Not by the number in the (still potentially phishy) SMS, but rather via the number on their website. So, click the "Support" menu item, down to "Customer Support" and we end up here:

I'll save you the pain of reading the response that ensued, suffice to say that it only referred to email communications and boiled down to suggesting you read the domain of the sender. But I did manage to pin the system down on a phone number which as you'll see, is completely different to the one in the SMS messages:

So, I call the number and follow the voice prompts, selecting options via the keypad to route me through to the duty and taxes section. But eventually, several steps deep into the process, the system stops responding to key presses! "1" doesn't work and neither does "2" so without a response, the same message just repeats. But it does offer an alternative and suggests I call 132610. That's the number I called in the first place to get stuck in this infinite loop!

I try again, this time following a different series of prompts that eventually asks for a tracking number and then proceeds to tell me precisely what the website already does! But it also provides the option to speak to a customer service operator and I'm actually promptly put through. The operator explains that my shipment is valued at US$799 which converts to AU$1,215.97 and it's therefore subject to some inbound fees. "Great, but how much and does it match what's in the phishy SMSs I've received?" He promises someone will call me back shortly...

And then, out of the blue 3 days after the initial phishy SMS arrived, an email landed in my inbox:

The dollar figure, the BPOINT address and the messaging all lined up with the SMSs, but that's merely correlation and if someone had both my phone number and email address they could easily attempt to phish both with the same details. But then, I looked at the attachment to the email and found this:

IT'S THE MISSING LINK!!!

My complete Prusa invoice was attached along with the order number, price and shipping details. In other words, 87% of you were wrong 😲

On a more serious note, Aussies alone are losing north of AU$3B annually to scams, and that's obviously only a drop in the ocean compared to the global scale of this problem. Our Australian Communications and Media Authority body (ACMA) recently reported 336M blocked scam SMSs and technical controls like these are obviously great, but absent from their reporting was the number of scam messages they didn't block. There's an easy explanation for this omission: they simply don't know how many are sent. But if I were to take a guess, they've merely blocked the tip of the iceberg. This is why in addition to technical controls, we rely on human controls which means helping people identify the patterns of a scam: requests for money, a sense of urgency, grammar and casing that's a bit off, odd looking URLs. You know, stuff like this:

What makes this situation so ridiculous is that while we're all watching for scammers attempting to imitate legitimate organisations, FedEx is out there imitating scammers! Here we are in the era of burgeoning AI-driven scams that are becoming increasingly hard for humans to identify, and FedEx is like "here, hold my beer" as they one-up the scammers at their own game and do a perfect job of being completely indistinguishable from them.

Ah well, as I ultimately lament in these situations, it's a good time to be in the industry 😊

Weekly Update 387

By Troy Hunt

It's a short video this week after a few days in Sydney doing both NDC and the Azure user group. For the most part, I spoke about the same things as I did at NDC Security in Oslo last month... except that since then we've had the Spoutible incident. It was fascinating to talk about this in front of a live audience and see everyone's reactions first hand; let's just say there were a lot of "oh wow!" responses 😲

References

  1. Sponsored by: Unpatched devices keeping you up at night? Kolide can get your entire fleet updated in days. It's Device Trust for Okta. Watch the demo!
  2. That's another NDC Sydney done and dusted (my "How I Met Your Data" talk will eventually be online and free to watch)
  3. Ransomware payments finally passed the $1B mark in 2023 (I've often commented over the last year that it feels like it's really up-ticked, now here we are)
  4. We're presently rolling HIBP from Table Storage to serverless SQL Azure (by next week's update we should actually have this live and I'll be able to talk a lot more about it)
  5. OpenAI's Sora is just mind-blowing 🤯 (it's the rate of change that has so many people stunned, just remember what AI video from text prompts looked like only a year ago...)

Weekly Update 386

By Troy Hunt

Somehow, an hour and a half went by in the blink of an eye this week. The Spoutible incident just has so many interesting aspects to it: loads of data that should never be returned publicly, awesome response time to the disclosure, lacklustre transparency in their disclosure, some really fundamental misunderstandings about hashing algorithms and a controversy-laden past if you read back over events of the last year. Phew! No wonder so much time went on this! (and if you want to just jump directly to the Spoutible bits, that's at the 8:50 mark)

References

  1. Sponsored by: Got Linux? (And Mac and Windows and iOS and Android?) Then Kolide has the device trust solution for you. Click here to watch the demo.
  2. I'll be speaking at NDC in Sydney next week (it's all about "How I Met Your Data")
  3. I'll also be at the Azure Sydney User Group (this one is "Cloud-Enhanced Cybersecurity Tales from the Dark Web")
  4. Spoutible's spurted deluge of personal data (how much data does it need to be before it's a deluge? 🤔)
  5. There are a lot more nuances to hashing algorithms than what many people seem to realise (perhaps most notably is that the strength of the password itself plays an enormous part in how likely a hash is to be cracked)

How Spoutible’s Leaky API Spurted out a Deluge of Personal Data

By Troy Hunt

Ever hear one of those stories where as it unravels, you lean in ever closer and mutter “No way! No way! NO WAY!” This one, as far as infosec stories go, had me leaning and muttering like never before. Here goes:

Last week, someone reached out to me with what they claimed was a Spoutible data breach obtained by exploiting an enumerable API. Just your classic case of putting someone else's username in the URL and getting back data about them, which at first glance I assumed was another scraping situation like we recently saw with Trello. They sent me a file with 207k scraped records and a URL that looked like this:

https://spoutible.com/sptbl_system_api/main/user_profile_box?username=troyhunt

But they didn't send me my account, in fact I didn't even have an account at the time and if I'm honest, I had to go and look up exactly what Spoutible was. The penny dropped as I read into it: Spoutible emerged in the wake of Elon taking over Twitter, which left a bunch of folks unhappy with their new social overlord so they sought out alternate platforms. Mastodon and Bluesky were popular options, Spoutible was another which was clearly intended to be an alternative to the incumbent.

In order to unravel this saga in increasing increments of "no way!" reactions, let's just start with the basics of what that API endpoint was returning:

{
  err_code: 0,
  status: 200,
  user: {
    id: 735525,
    username: "troyhunt",
    fname: "Troy",
    lname: "Hunt",
    about: "Creator of Have I Been Pwned. Microsoft Regional Director. Pluralsight author. Online security, technology and “The Cloud”. Australian.",

Pretty standard stuff and I'd expect any of the major social platforms to do exactly the same thing. Name, username, bio and ID are all the sorts of data attributes you'd expect to find publicly available via an API or rendered into the HTML of the website. These fields, however, are quite different:

email: "[redacted]",
ip_address: "[redacted]",
verified_phone: "[redacted]",
gender: "M",

Ok, that's now a "no way!" because I had no expectation at all of any of that data being publicly available (note: phone number is optional, I chose to add mine). It's certainly not indicated on the pages where I entered it:

But it's also not that different to previous scraping incidents; the aforementioned Trello scrape exposed the association of email addresses to usernames and the Facebook scrape of a few years ago did the same thing with phone numbers. That's not unprecedented, but this is:

password: "$2y$10$B0EhY/bQsa5zUYXQ6J.NkunGvUfYeVOH8JM1nZwHyLPBagbVzpEM2",

No way! Is it... real? Is that genuinely a bcrypt hash of my own password? Yep, that's exactly what it is:

The Spoutible API enabled any user to retrieve the bcrypt hash of any other user's password.
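To show what an attacker gets from that one string, here's a small sketch that splits a bcrypt hash into its parts. The layout (version, cost, 22 character salt, 31 character digest) is the standard bcrypt string format; the function and class names are my own:

```python
from typing import NamedTuple


class BcryptHash(NamedTuple):
    version: str  # e.g. "2y"
    cost: int     # work factor; key setup runs 2**cost iterations
    salt: str     # 22 characters of bcrypt's base64 alphabet
    digest: str   # 31 characters


def parse_bcrypt(value: str) -> BcryptHash:
    """Split a modular-crypt bcrypt string into its fields."""
    empty, version, cost, rest = value.split("$")
    if empty or len(rest) != 53:
        raise ValueError("not a bcrypt hash")
    return BcryptHash(version, int(cost), rest[:22], rest[22:])


# The hash the API returned for my account, as shown above.
leaked = "$2y$10$B0EhY/bQsa5zUYXQ6J.NkunGvUfYeVOH8JM1nZwHyLPBagbVzpEM2"
parsed = parse_bcrypt(leaked)
print(parsed.version, parsed.cost, 2 ** parsed.cost)
```

The salt and cost travel with the hash, so anyone holding the string has everything needed to test password guesses offline, at their own pace.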

I had to check, double check then triple check to make sure this was the case because I can only think of one other time I've ever seen an API do this...

<TangentialStory>

During my 14 years at Pfizer, I once reviewed an iOS app built for us by a low-cost off-shored development shop. I proxied the app through Fiddler, watched the requests and found an API that was returning every user record in the system and for each user, their corresponding password in plain text. When quizzing the developers about this design decision, their response was - and I kid you not, this isn't made up - "don't worry, our users don't use Fiddler" 🤦‍♂️

</TangentialStory>

I cannot think of any reason ever to return any user's hashed password to any interface, including an appropriately auth'd one where only the user themselves would receive it. There is never a good reason to do this. And even though bcrypt is the accepted algorithm of choice for storing passwords these days, it's far from uncrackable as I showed 7 years ago now after the Cloudpets breach. Here I used a small dictionary of weak, predictable passwords and easily cracked a bunch of the hashes. Weak passwords like... "spoutible". Wondering just how crazy things would get, I checked the change password page and found I could easily create a password of 6 or more characters (so long as it didn't exceed 20 characters) with no checks on strength whatsoever:

Strong hashing algorithms like bcrypt are weakened when poor password choices are allowed and strong password choices (such as those with more than 20 characters) are blocked. For exactly the same reason breached services advise customers to change their passwords even when hashed with a strong algorithm, all Spoutible users are now in the same boat - change your password!
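As a rough sketch of the opposite approach, here's a password check that allows long passwords and rejects well-known weak ones. The length limits and the tiny deny-list are assumptions for illustration, not Spoutible's rules or a recommendation for exact values:

```python
# Tiny illustrative deny-list; a real one would hold known-breached passwords.
COMMON_PASSWORDS = {"password", "123456", "qwerty", "spoutible"}

MIN_LENGTH = 8   # assumed minimum, not Spoutible's rule
MAX_LENGTH = 64  # assumed ceiling, well above 20 so passphrases fit


def password_problems(candidate: str) -> list[str]:
    """Return the reasons a password is rejected (empty list means accepted)."""
    problems = []
    if len(candidate) < MIN_LENGTH:
        problems.append("too short")
    if len(candidate) > MAX_LENGTH:
        problems.append("too long")
    if candidate.lower() in COMMON_PASSWORDS:
        problems.append("commonly used")
    return problems


print(password_problems("spoutible"))
print(password_problems("correct horse battery staple"))
```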

But fortunately these days many people make use of 2 factor authentication to protect against account takeover attacks where the adversary knows the password. Which brings us to the next piece of data the API returned:

2fa_secret: "7GIVXLSNKM47AM4R",
2fa_enabled_at: "2024-02-03 02:26:11",
2fa_backup_code: "$2y$10$6vQRDRDHVjyZdndGUEKLM.gmIIZVDq.E5NWTWti18.nZNQcqsEYki",

Oh wow! Why?! Let's break this down and explore both the first and last line. The 2FA secret is the seed that's used to generate the one time password to be used as the second factor. If you - as an attacker - know this value then 2FA is rendered useless. To test that this was what it looked like, I asked Stefán to retrieve my data from the public API, take the 2FA secret and send me the OTP:

It was a match. If Stefán could have cracked my bcrypted password hash (and he's a smart guy so "spoutible" would have definitely been in his word list), he could have then passed the second factor challenge. And the 2FA backup code? Thinking that would also be exactly what it looked like, I'd screen grabbed it when enabling 2FA:

Now, using the same bcrypt hash checker as I did for the password, here's what I found:

What I just don't get is if you're going to return the 2FA secret anyway, why bother bcrypting the backup code? And further, it's only a 6 digit number, do you know how long it takes to crack a bcrypted 6 digit number? Let's find out:

570075, 2m59s

- Martin Sundhaug (@sundhaug92@mastodon.social) (@sundhaug92) February 4, 2024

Many other people worked it out in single-digit minutes as well, but Martin did it fastest at the time of writing so he gets the shout-out 😊
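The arithmetic behind those times is simple enough to sketch. The guess rate below is an assumption picked to be in the same ballpark as the results above, not a benchmark of anyone's hardware:

```python
KEYSPACE = 10 ** 6    # every 6 digit code, 000000 through 999999
ASSUMED_RATE = 3_000  # bcrypt guesses per second; an assumption, not a benchmark

worst_case_seconds = KEYSPACE / ASSUMED_RATE
average_seconds = worst_case_seconds / 2

print(f"worst case: {worst_case_seconds / 60:.1f} minutes")
print(f"average:    {average_seconds / 60:.1f} minutes")
```

However slow the hash, a million possibilities is a tiny space to search.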

You know how I said you'd keep leaning in further and further? Yeah, we're not done yet because then I found this:

em_code: "c62fcf3563dc3ab38d52ba9ddb37f9b1577d1986"

Maybe I've just seen too many data breaches before, but as vague as this looks I had a really good immediate hunch of what it was but just to be sure, I logged out and went to the password reset page:

Leaning in far enough now, anticipating what's going to happen next? Yep, it's exactly what you thought:

NO WAY! Exposed password reset tokens meant that anyone could immediately take over anyone else's account 🤯
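For contrast, here's a minimal sketch of one common way to handle reset tokens so that even a leaked database row can't be used directly: email the random token to the user and store only a digest of it. This is an illustration under my own assumptions, not a description of how Spoutible's system works:

```python
import hashlib
import hmac
import secrets


def issue_reset_token() -> tuple[str, str]:
    """Return (token to email to the user, digest to store server side)."""
    token = secrets.token_urlsafe(32)
    digest = hashlib.sha256(token.encode()).hexdigest()
    return token, digest


def token_matches(presented: str, stored_digest: str) -> bool:
    """Hash the presented token and compare it in constant time."""
    presented_digest = hashlib.sha256(presented.encode()).hexdigest()
    return hmac.compare_digest(presented_digest, stored_digest)


token, stored = issue_reset_token()
print(token_matches(token, stored))
print(token_matches("a guess", stored))
```

Either way, the token is a credential and has no business appearing in any API response.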

After changing the password, no notification email was sent to the account holder so just to make things even worse, if someone's account was taken over using this technique they'd have absolutely no idea until they either realised their original password no longer worked or their account started spouting weird messages. There's also no way to see if there are other active sessions, for example the way Twitter shows them:

Further, changing the password doesn't invalidate existing sessions so as best as I can tell, if someone has successfully accessed someone else's Spoutible account there's no way to know and no way to boot them out again. That's going to make recovering from this problematic unless Spoutible has another mechanism to invalidate all active sessions.
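Here's a hedged sketch of the behaviour I'd expect on a password change: drop every other session and tell the account holder. The `Account` class and its in-memory `outbox` are hypothetical stand-ins for a real user store and email queue:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Account:
    email: str
    password_hash: str
    sessions: set[str] = field(default_factory=set)
    outbox: list[str] = field(default_factory=list)  # stand-in for an email queue


def change_password(account: Account, new_hash: str,
                    current_session: Optional[str] = None) -> None:
    """Set the new hash, drop every other session and notify the owner."""
    account.password_hash = new_hash
    if current_session in account.sessions:
        # Keep only the session that made the change.
        account.sessions = {current_session}
    else:
        # A reset via emailed token has no session to keep.
        account.sessions = set()
    account.outbox.append(f"Password changed for {account.email}")


acct = Account("user@example.com", "old-hash", {"laptop", "attacker"})
change_password(acct, "new-hash", current_session="laptop")
print(acct.sessions, acct.outbox)
```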

The one saving grace is that the token was rotated after reset so you can't use the one in the image above, but of course the new one was now publicly exposed in the API! And there's no 2FA challenge on password reset either but of course even if there was, well, you already read this far so you know how that could have been easily circumvented.
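To see why a leaked 2FA secret makes the second factor worthless, here's a minimal sketch of the standard TOTP calculation (RFC 6238 with HMAC-SHA1). It uses the RFC's published test secret rather than a real one, and the 30 second step and 6 digit default are the usual authenticator app settings, which I'm assuming rather than confirming for Spoutible:

```python
import base64
import hashlib
import hmac
import struct


def totp(secret_b32: str, unix_time: int, digits: int = 6, step: int = 30) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for the given time."""
    key = base64.b32decode(secret_b32, casefold=True)
    counter = struct.pack(">Q", unix_time // step)
    mac = hmac.new(key, counter, hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte pick the offset.
    offset = mac[-1] & 0x0F
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


# RFC 6238 test secret ("12345678901234567890" in base32), not a real account's.
RFC_SECRET = "GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ"
print(totp(RFC_SECRET, 59, digits=8))
```

The only inputs are the secret and the clock, so anyone who can read the secret can generate the same codes as the account holder.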

There's just one more "oh wow!" remaining, and it's the ease with which the vulnerable API was found. Spoutible has a feature called Pods and when you browse to that page, people listening to the pod are displayed with the ability to hover over their profile and display further information. For example, here's Rosetta and if we watch the request that's made in the dev tools...

By design, all the personal information including email and IP address, phone number, gender, bcrypt hashed password, 2FA secret and backup code and the code that can be immediately used to reset the password is returned to every single person that uses this feature. How many times has this API spouted troves of personal data out to people without them even knowing? Who knows, but I do know it wasn't the only API doing that because the one that listed the pods also did it:

Because the vulnerable APIs were requested organically as a natural part of using the service as it was intended, Spoutible almost certainly won't be able to fully identify abuse of them. To use the definition of the infamous Missouri governor who recently attempted to prosecute a journalist for pressing F12, everyone who used those features inadvertently became a hacker.

Just one last finding and I've not been able to personally validate it so let's keep it out of "oh wow!" scope: the individual that sent me the data and details of the vulnerability said that the exposed data includes access tokens for other platforms. A couple of months ago, Spoutible announced cross-posting to Mastodon and Bluesky and my own data does have a "cross_posting_auth" node, albeit set to null. I couldn't see anywhere within the UI to enable this feature, but there are profiles with values in there. During the disclosure process (more on that soon), Spoutible did say that those values were encrypted and without evidence of a private key compromise, they believe they're safe.

Here's my full record as it was originally returned by the vulnerable API:

To be as charitable as possible to Spoutible, you could argue that this is largely just the one vulnerability that is the inadvertent exposure of internal data via a public API. This is data that has a legitimate purpose in their system and it may simply be a case of a framework automatically picking all entity attributes up from the data tier and returning them via the UI. But it's the circumstances that allowed this to happen and then exacerbated the problem when it did that concern me more; clearly there's been no security review around this feature because it was so easily discoverable (at least there certainly wasn't review whilst it was live), nor has there been any thought put into notifying people of potential account takeovers or providing them with the means to invalidate other sessions. Then there are peripheral issues such as very weak password rules that make cracking bcrypt so much easier, weak 2FA backup codes and pointless bcrypting of them. Not major issues in and of themselves, but they amplify the problems the exposed data presents.
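One common guard against that "return every attribute" failure is an explicit allow-list at the API boundary. Here's a minimal sketch; the field names come from the response shown earlier, while the values and the `public_profile` helper are made up for illustration:

```python
# Fields that are safe to show to any visitor; everything else stays server side.
PUBLIC_FIELDS = ("id", "username", "fname", "lname", "about")


def public_profile(record: dict) -> dict:
    """Copy only explicitly allowed fields into the API response."""
    return {name: record[name] for name in PUBLIC_FIELDS if name in record}


# Dummy record: real field names, placeholder values.
stored = {
    "id": 1,
    "username": "example",
    "fname": "Ex",
    "lname": "Ample",
    "about": "Just a demo record.",
    "email": "user@example.com",
    "password": "<bcrypt hash>",
    "2fa_secret": "<totp seed>",
    "em_code": "<reset token>",
}
print(public_profile(stored))
```

With this shape, a new column added to the data tier stays private until someone deliberately adds it to the list.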

Clearly this required disclosure before publication. Unfortunately, Spoutible does not publish a security.txt file so I went directly to the founder Christopher Bouzy on both Twitter and email (obviously I could have reached out on Spoutible, but he's very active on Twitter and my profile has more credibility there than a brand new Spoutible account). Here's the timeline, all AEST:

  1. 4 Feb, 15:30: Initial outreach asking for security contact
  2. 4 Feb, 17:27: Response from Spoutible
  3. 4 Feb, 18:31: Full details provided to Spoutible
  4. 4 Feb, 19:48 (or earlier): API is fixed
  5. 5 Feb, 01:28 (or earlier): Announcement made about the incident
  6. 5 Feb, 07:52: Spoutible confirmed all em_code values have been rotated

To give credit where it's due, Spoutible's response time was excellent. In the space of only about 4 hours, the data returned by the API had a huge number of attributes trimmed off it and now aligns with what I'd expect to see (although the 207k previously scraped records obviously still contain all the data). I'll also add that Christopher's communication with me was commendable; he's clearly genuinely passionate about the platform and was dismayed to learn of the vulnerability. I've dealt with many founders of projects in the past that had suffered data breaches and it's especially personal for them, having poured so much of themselves into it.

Here's their disclosure in its entirety:

The revised API is now returning over 80% less data and looks like this:

If you're a detail person, yes, the forward slashes are no longer escaped and the remaining fields are ordered slightly differently so it looks like the JSON encoder has changed. In case you're interested, here's a link to a diff between the two with a little bit of manipulation to make it easier to see precisely what's changed.
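If you'd rather compare two responses like these yourself, a rough sketch is to diff the key sets and measure how much the serialised JSON shrank. The two records below are dummy stand-ins, not the real before and after data:

```python
import json


def removed_keys(before: dict, after: dict) -> list[str]:
    """Top-level keys present before the fix but absent afterwards."""
    return sorted(set(before) - set(after))


def size_reduction(before: dict, after: dict) -> float:
    """Fraction by which the serialised JSON shrank."""
    old, new = len(json.dumps(before)), len(json.dumps(after))
    return (old - new) / old


# Dummy stand-ins for the two API responses.
before = {
    "username": "example",
    "email": "user@example.com",
    "password": "<hash>",
    "em_code": "<token>",
}
after = {"username": "example"}

print(removed_keys(before, after))
print(f"{size_reduction(before, after):.0%} smaller")
```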

As to my own advice to Spoutible users, here are the actions I'd recommend:

  1. Change your Spoutible password and change any other account you reused that password on
  2. If you had 2FA turned on for Spoutible, turn it off then back on again so that it generates a different secret
  3. If you enabled cross-posting to Mastodon or Bluesky, out of an abundance of caution you should invalidate the keys on those platforms
  4. Recognise that your email address, IP address, phone number if you added it and any intentionally publicly visible data associated to your profile may have been exposed

The 207k exposed email addresses that were sent to me are now searchable in Have I Been Pwned and my impacted subscribers have received email notifications.

Weekly Update 385

By Troy Hunt

I told ya so. Right from the beginning, it was pretty obvious what "MOAB" was probably going to be and sure enough, this tweet came true:

Interesting find by @MayhemDayOne, wonder if it was from a shady breach search service (we’ve seen a bunch shut down over the years)? Either way, collecting and storing this data is now trivial so not a big surprise to see someone screw up their permissions and (re)leak it all. https://t.co/DM7udeUcRk

- Troy Hunt (@troyhunt) January 22, 2024

What I didn't know at the time was the hilarity of how similar this service would be to those that had come before it... and been shut down by law enforcement agencies. I mean seriously, when you're literally copying and pasting clauses from LeakedSource, what do you think is going to happen?! I sense another "I told ya so" coming...

References

  1. Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
  2. "MOAB" was the breach that wasn't (but it's very much the shady breach site that really is)
  3. I expected the poll on the impact of scraping to be more emphatically against it (but I do wonder if that's simply an issue of the short poll not properly explaining the impact)
  4. The Europcar breach wasn't a breach at all, but that's not what's noteworthy about it (not everything is "AI" FFS you over-hyped marketing droids!)

The Data Breach "Personal Stash" Ecosystem

By Troy Hunt

I've always thought of it a bit like baseball cards; a kid has a card of this one player that another kid is keen on, and that kid has a card the first one wants so they make a trade. They both have a bunch of cards they've collected over time and by virtue of existing in the same social circles, trades are frequent, and cards flow back and forth on a regular basis. That's the analogy I often use to describe the data breach "personal stash" ecosystem, but with one key difference: if you trade a baseball card then you no longer have the original card, but if you trade a data breach which is merely a digital file, it replicates.

There are personal stashes of data breaches all over the place and they're usually presented like this one:

You'll recognise many of those names because they're noteworthy incidents that received a bunch of press. MySpace. Adobe. LinkedIn. Ashley Madison.

The same incidents appear here:

And so on and so forth. Stashes of breaches like this are all over the place and they fuel an exchange ecosystem that replicates billions of records of personal data over and over again. Your data. My data. The data of a significant portion of the global internet-using population, just freely flowing backwards and forwards not just in the shady corners of "the dark web" but traded out there in the clear on mainstream websites. Until inevitably:

Diogo Santos Coelho was 14 when he started RaidForums, and was 21 by the time he was arrested for running the service 2 years ago. A kid, exchanging data without the maturity to understand the consequences of his actions. RaidForums left a void that was quickly filled by BreachForums:

Conor Fitzpatrick was 20 years old when he was finally picked up for running the service last year. Still just a kid, at least in the colloquial fashion in which we refer to youngsters as when we get a bit older, but surely still legally a minor when he chose to begin collecting data breaches.

Websites like these are taken down for a simple reason:

The ecosystem of personal stashes exchanged with other parties fuels crime.

For example, data breaches seed services set up with the express intent of monetising a broad range of personal attributes to the detriment of people who are already victims of a breach. Call them shady versions of Have I Been Pwned if you will, and this talk I gave at AusCERT a couple of years ago is a great explainer (deep-linked to the start of that segment):

The first service I spoke about in that segment was We Leak Info and it was run by two 22 year old guys. The website first appeared 3 years earlier - only a year after the creators had left childhood - and it allowed anyone with the money to access anyone else's personal data including:

names, email addresses, usernames, phone numbers, and passwords

One of the duo was later sentenced to 2 years in prison for his role, and when you read the sorts of conversations they were having, you can't help but think they behaved exactly like you'd expect a couple of young guys who thought they were anonymous would:

In the video, I mentioned Jordan Bloom in relation to LeakedSource, a veritable older gentleman of this class of crime being 24 when the site first appeared.

The company operating LeakedSource, Defiant Tech Inc, which was founded by Jordan Bloom, eventually entered a guilty plea to charges that included trafficking in identity information and when you read what that involved, you can see why this would attract the ire of law enforcement agencies:

However, unlike other breach notification services, such as Have I Been Pwned, LeakedSource also gave subscribers access to usernames, passwords (including in clear text), email addresses and IP addresses. LeakedSource services were often advertised on hacking forums and there was suspicion that its operators were actively looking to hack organizations whose data they could add to their database.

In 2016, a well-wisher purchased my own data from LeakedSource and sent over a dozen different records similar to this one:

Not mentioned in my talk but running in the same era was Leakbase, yet another service that collated huge volumes of sensitive data and sold it to absolutely anyone:

And just like all the other ones, the same data appeared over and over again:

It went dark at the end of 2017 amidst speculation the disappearance was tied to the takedown of the Hansa dark web market. If that was the case, why did we never hear of charges being laid as we did with We Leak Info and LeakedSource? Could it be that the operator of Leakbase was only ever so slightly younger than the other guys mentioned above and not having yet reached adulthood, managed to dodge charges? It would certainly be consistent with the demographic pattern of those with personal stashes of data breaches.

Speaking of patterns: We Leak Info, LeakedSource, Leakbase - it's like there's a theme of shady services attached to the word. As I say in the video, there's also a theme of attempting to remain anonymous (which clearly hasn't worked very well!), and a theme of attempting to eschew legal responsibility for how the data is used by merely putting words in the terms of service. For example, here's Jordan's go at deflecting his role in the ecosystem and yes, this was the entire terms of service:

I particularly like this clause:

You may only use this tool for your own personal security and data research. You may only search information about yourself, or those you are authorized in writing to do so.

That's not going to keep you out of trouble! Time and time again, I see this sort of wording on services used as if it's going to make a difference when the law comes asking hard questions; "Hey we literally told people to play nice with the data!"

We Leak Info used similar entertaining wording with some of the highlights including:

  1. We Leak Info strictly prohibits the use of its Services to cause damage or harm to others
  2. You may not use Our Services in acts deemed illegal by the laws in Your region
  3. We Leak Info does not knowingly participate in the act of obtaining or distributing Data
  4. We Leak Info will cooperate with any legal investigations that it determines worthy and valid at its own discretion

That last one in particular is an absolute zinger! But again, remember, we're talking about guys who stood this service up as teenagers and literally worked on the assumption of "as [l]ong as we cooperate they [the FBI] won't fuck with us" 🤦‍♂️ The ignorance of that attitude whilst advertising services on criminal forums is just mind-blowing, even for kids.

All of which brings me to the inspiration for this blog post:

Interesting find by @MayhemDayOne, wonder if it was from a shady breach search service (we’ve seen a bunch shut down over the years)? Either way, collecting and storing this data is now trivial so not a big surprise to see someone screw up their permissions and (re)leak it all. https://t.co/DM7udeUcRk

- Troy Hunt (@troyhunt) January 22, 2024

It's like I've seen it all before! No, really, because only a couple of days later someone running a service popped up and claimed responsibility for having exposed the data due to "a firewall misconfiguration". I'm not going to name or link the service, but I will describe a few key features:

  1. After purchasing access, it returns extensive personal information exposed in data breaches including names, email addresses, usernames, phone numbers, and passwords
  2. The operator is clearly trying to remain anonymous with no discoverable information about who is running it
  3. It has ToS that include: "You may only use this service for your own personal security and research. Furthermore, you may only search for information about yourself or those who you are authorized in writing to do so." (I know what you're thinking, so I diff'd it for you)
  4. The name of the service starts with the word "leak"
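The ToS comparison is easy to reproduce. Here's a small sketch that does a word-level diff of the LeakedSource clause quoted earlier against the new service's clause from the list above, using Python's standard `difflib`; the variable names are mine:

```python
from difflib import SequenceMatcher

LEAKEDSOURCE = (
    "You may only use this tool for your own personal security and data "
    "research. You may only search information about yourself, or those "
    "you are authorized in writing to do so."
)
NEW_SERVICE = (
    "You may only use this service for your own personal security and "
    "research. Furthermore, you may only search for information about "
    "yourself or those who you are authorized in writing to do so."
)

old_words, new_words = LEAKEDSOURCE.split(), NEW_SERVICE.split()
matcher = SequenceMatcher(None, old_words, new_words, autojunk=False)

# Collect the words that only appear on one side of the comparison.
removed, added = [], []
for tag, i1, i2, j1, j2 in matcher.get_opcodes():
    if tag in ("replace", "delete"):
        removed.extend(old_words[i1:i2])
    if tag in ("replace", "insert"):
        added.extend(new_words[j1:j2])

print("removed:", removed)
print("added:  ", added)
print(f"similarity: {matcher.ratio():.0%}")
```

Most of the words line up one for one, which is about what you'd expect from a copy and paste job with light edits.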

I could write predictions about the future of this service but if you've read this far and paid attention to the precedents, you can reliably form your own conclusion. The outcome is easily predictable and indeed it was the predictability of the whole situation when I started getting bombarded with queries about the "Mother of all Breaches" that frustrated me; of course it was someone's personal stash, because we've seen it all before and we live in an era where it's dead easy to build services like this. Cloud is ubiquitous and storage is cheap, you can stand up great looking websites in next to no time courtesy of freely available templates, and the whole data breach trading ecosystem I referred to earlier can easily seed services like this.

Maybe the young guy running this service (assuming the previously observed patterns apply) will learn from history and quietly exit while the getting is good, I don't know, time will tell. At the very least, if he reads this and takes nothing else away, don't go driving around in a bright green Lamborghini!

Edit: In the original version of this blog post, it was incorrectly implied that Jordan Bloom may have been the person who pled guilty to charges when in fact it was the company that ran LeakedSource, Defiant Tech Inc, that the plea was entered under. To the extent that the blog contained words to the effect of, or otherwise implied or contained innuendo that Mr Bloom engaged in criminal or otherwise illegal conduct, or pled guilty to trafficking identity information, I apologise and unreservedly retract such statements and this blog has been edited to ensure that the facts involved in this matter are accurately portrayed.

Weekly Update 384

By Troy Hunt

I spent longer than I expected talking about Trello this week, in part because I don't feel the narrative they presented properly acknowledges their responsibility for the incident and in part because I think the impact of scraping in general is misunderstood. I suspect many of us are prone to looking at this in a very binary fashion: if the data is publicly accessible anyway, scraping it poses no risk. But in my view, there's a hell of a big difference between say, looking at one person's personal info on LinkedIn via the browser versus having a corpus of millions of records of the same data saved offline. That's before we even get into the issue of whether in Trello's case, it should ever be possible for a third party to match email address to username and IRL name.

To add some more perspective, I've just posted a poll immediately before publishing this blog post, let's see what the masses have to say:

Scraping: should we be concerned if an individual's personal data is scraped, aggregated en mass and redistributed if that same data is already publicly accessible on the service anyway? Vote and if possible, add more context in a reply.

— Troy Hunt (@troyhunt) January 28, 2024

References

  1. Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
  2. Trello had 15M records scraped and posted publicly (somehow the narrative feels like it's pushing back on things that were never said to begin with)
  3. The "Mother of all Breaches"... which isn't (someone leaving their personal stash of existing breaches doesn't make everything re-breached)
  4. HIBP got a nice little shout-out from our MP for Cyber Security (I'm still fascinated at just how mainstream this little service has become 😊)

Weekly Update 383

By Troy Hunt

They're an odd thing, credential lists. Whether they're from a stealer as in this week's Naz.API incident, or just aggregated from multiple data breaches (which is also in Naz.API), I inevitably get some backlash after loading them: "this doesn't tell me anything useful, why are you loading this?!" The answer is easy: because that's what the vast majority of people want me to do:

If I have a MASSIVE spam list full of personal data being sold to spammers, should I load it into @haveibeenpwned?

— Troy Hunt (@troyhunt) November 15, 2016

Spam lists are the same kettle of fish in that once you learn you're in one, I can't provide you any further info about where it came from and there's no recourse available to you. You're just in there, good luck! And if you do find yourself in one of these lists and are unhappy not that you're in there, but rather that I've told you you're in there, you have 2 easy options:

  1. Ignore it
  2. Unsubscribe

Or, if you've come along to HIBP, done a search and then been unhappy with me, my guitar lessons blog post is an entertaining read 😊

That's all from Europe folks, see you from the sunny side next week!

References

  1. Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
  2. The Naz.API stealer logs and credential stuffing lists got a lot of attention (big shout out to the folks angry that I wouldn't either store truck loads of plain text passwords for them or link them through to the original breach of everyone's personal info 🤦‍♂️)
  3. Couple of phillips head screws through a laptop will stop it from disappearing (and if your takeaway is the correct identification of the laptop make, you're kinda missing the point...)

Inside the Massive Naz.API Credential Stuffing List

By Troy Hunt

It feels like not a week goes by without someone sending me yet another credential stuffing list. It's usually something to the effect of "hey, have you seen the Spotify breach", to which I politely reply with a link to my old No, Spotify Wasn't Hacked blog post (it's just the output of a small set of credentials successfully tested against their service), and we all move on. Occasionally though, the corpus of data is of much greater significance, most notably the Collection #1 incident of early 2019. But even then, the rapid appearance of Collections #2 through #5 (and more) quickly became, as I phrased it in that blog post, "a race to the bottom" I did not want to take further part in.

Until the Naz.API list appeared. Here's the back story: this week I was contacted by a well-known tech company that had received a bug bounty submission based on a credential stuffing list posted to a popular hacking forum:

Whilst this post dates back almost 4 months, it hadn't come across my radar until now and inevitably, also hadn't been sent to the aforementioned tech company. They took it seriously enough to take appropriate action against their (very sizeable) user base which gave me enough cause to investigate it further than your average cred stuffing list. Here's what I found:

  1. 319 files totalling 104GB
  2. 70,840,771 unique email addresses
  3. 427,308 individual HIBP subscribers impacted
  4. 65.03% of addresses already in HIBP (based on a 1k random sample set)
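For a rough sense of how much weight a 1k sample can carry for that 65.03% figure, here's a minimal sketch of the arithmetic. It uses a plain normal-approximation interval for a sample proportion, and it assumes the sample was exactly 1,000 addresses drawn uniformly at random; the function and constant names are mine and purely illustrative, not anything from how HIBP actually does the sampling.

```python
import math


def proportion_interval(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation interval for a sample proportion (z=1.96 is ~95%)."""
    margin = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)


# Figures quoted in the list above. SAMPLE_SIZE = 1000 is an assumption for "1k".
SEEN_RATE = 0.6503
SAMPLE_SIZE = 1000
TOTAL_ADDRESSES = 70_840_771

low, high = proportion_interval(SEEN_RATE, SAMPLE_SIZE)

# Scale the unseen share (1 - seen share) up to the whole corpus.
unseen_low = round((1.0 - high) * TOTAL_ADDRESSES)
unseen_high = round((1.0 - low) * TOTAL_ADDRESSES)

if __name__ == "__main__":
    print(f"already in HIBP: {low:.1%} to {high:.1%}")
    print(f"never seen before: roughly {unseen_low:,} to {unseen_high:,} addresses")
```

Under those assumptions the margin is only about three percentage points either side, so the share of previously unseen addresses stays in the neighbourhood of a third whichever end of the interval you pick.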

That last number was the real kicker; when a third of the email addresses have never been seen before, that's statistically significant. This isn't just the usual collection of repurposed lists wrapped up with a brand-new bow on it and passed off as the next big thing; it's a significant volume of new data. When you look at the above forum post the data accompanied, the reason why becomes clear: it's from "stealer logs" or in other words, malware that has grabbed credentials from compromised machines. Apparently, this was sourced from the now defunct illicit.services website which (in)famously provided search results for other people's data along these lines:

I was aware of this service because, well, just look at the first example query 🤦‍♂️

So, what does a stealer log look like? Website, username and password:

That's just the first 20 rows out of 5 million in that particular file, but it gives you a good sense of the data. Is it legit? Whilst I won't test a username and password pair on a service (that's way too far into the grey for my comfort), I regularly use enumeration vectors on websites to validate whether an account actually exists or not. For example, take that last entry for racedepartment.com, head to the password reset feature and mash the keyboard to generate a (quasi) random alias @hotmail.com:

And now, with the actual Hotmail address from that last line:

The email address exists.

The VideoScribe service on line 9:

Exists.

And even the service on the very first line:

From a verification perspective, this gives me a high degree of confidence in the legitimacy of the data. The question of how valid the accompanying passwords remain aside, time and time again the email addresses in the stealer logs checked out on the services they appeared alongside.

Another technique I regularly use for validation is to reach out to impacted HIBP subscribers and simply ask them: "are you willing to help verify the legitimacy of a breach and if so, can you confirm if your data looks accurate?" I usually get pretty prompt responses:

Yes, it does. This is one of the old passwords I used for some online services.

When I asked them to date when they might have last used that password, they believed it was either 2020 or 2021.

And another whose details appear alongside a Webex URL:

Yes, it does. but that was very old password and i used it for webex cuz i didnt care and didnt use good pass because of the fear of leaking

And another:

Yes these are passwords I have used in the past.

Which got me wondering: is my own data in there? Yep, turns out it is and with a very old password I'd genuinely used pre-2011 when I rolled over to 1Password for all my things. So that sucks, but it does help me put the incident in more context and draw an important conclusion: this corpus of data isn't just stealer logs, it also contains your classic credential stuffing username and password pairs too. In fact, the largest file in the collection is just that: 312 million rows of email addresses and passwords.

Speaking of passwords, given the significance of this data set we've made sure to roll every single one of them into Pwned Passwords. Stefán has been working tirelessly the last couple of days to trawl through this massive corpus and get all the data in so that anyone hitting the k-anonymity API is already benefiting from those new passwords. And there's a lot of them: it's a rounding error off 100 million unique passwords that appeared 1.3 billion times across the corpus of data 😲 Now, what does that tell you about the general public's password practices? To be fair, there are instances of duplicated rows, but there's also a massive prevalence of people using the same password across multiple different services and completely different people using the same password (there's a finite set of dog names and years of birth out there...) And now more than ever, the impact of this service is absolutely huge!

When we weren't looking, @haveibeenpwned's Pwned Passwords rocketed past 7 *billion* requests in a month 😲 pic.twitter.com/hVDxWp3oQG

— Troy Hunt (@troyhunt) January 16, 2024

Pwned Passwords remains totally free and completely open source for both code and data so do please make use of it to the fullest extent possible. This is such an easy thing to implement, and it has a profound impact on credential stuffing attacks so if you're running any sort of online auth service and you're worried about the impact of Naz.API, this now completely kills any attack using that data. Password reuse remains rampant, so attacks of this type prosper (23andMe's recent incident comes immediately to mind); definitely get out in front of this one as early as you can.
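To give a feel for how little code that takes, here's a minimal sketch of a k-anonymity lookup against the Pwned Passwords range endpoint. Only the first 5 characters of the password's SHA-1 hash leave your machine; the API returns every known hash suffix for that prefix along with a count, and the match is done locally. The function names, the User-Agent string and the canned response in the demo are my own illustrative assumptions rather than anything official, so treat this as a starting point and not a reference client.

```python
import hashlib
import urllib.request

RANGE_URL = "https://api.pwnedpasswords.com/range/{prefix}"


def split_sha1(password: str) -> tuple[str, str]:
    """Return (first 5 hex chars, remaining 35 hex chars) of the SHA-1 hash."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_from_range(suffix: str, body: str) -> int:
    """Find our suffix in a range response made of 'SUFFIX:COUNT' lines."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0  # suffix not present: the password isn't in the corpus


def pwned_count(password: str, timeout: float = 10.0) -> int:
    """Query the live API; only the 5-character prefix is ever sent."""
    prefix, suffix = split_sha1(password)
    request = urllib.request.Request(
        RANGE_URL.format(prefix=prefix),
        headers={"User-Agent": "pwned-passwords-sketch"},  # hypothetical name
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    return count_from_range(suffix, body)


if __name__ == "__main__":
    # Offline demo with a canned response; the counts here are made up.
    prefix, suffix = split_sha1("password")
    canned = f"0018A45C4D1DEF81644B54AB7F969B88D65:1\r\n{suffix}:12345\r\n"
    print(prefix, count_from_range(suffix, canned))
```

In a sign-up or password-change flow you'd call something like `pwned_count(candidate_password)` and reject (or at least warn on) anything that comes back greater than zero; where you set that threshold is a policy decision for your own service.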

So that's the story with the Naz.API data. All the email addresses are now in HIBP and searchable either individually or via domain and all those passwords are in Pwned Passwords. There are inevitably going to be queries along the lines of "can you show me the actual password" or "which website did my record appear against" and as always, this just isn't information we store or return in queries. That said, if you're following the age-old guidance of using a password manager, creating strong and unique ones and turning 2FA on for all your things, this incident should be a non-event. If you're not and you find yourself in this data, maybe this is the prompt you finally needed to go ahead and do those things right now 🙂

Edit: A few clarifications based on comments:

  1. The blog post refers to both stealer logs and classic credential stuffing lists. Some of this data does not come from malware and has been around for a significant period of time. My own email address, for example, accompanied a password not used for well over a decade and did not accompany a website indicating it was sourced from malware.
  2. If you're in this corpus of data and are not sure which password was compromised, 1Password can automatically (and anonymously) scan all your passwords against Pwned Passwords which includes all passwords from this corpus of data.
  3. It's already in the last para of the blog post but given how many comments have asked the question: no, we don't store any data beyond the email addresses in the breach. This means we don't store any additional data from the breach such as if a specific website was listed next to a given address.

Weekly Update 382

By Troy Hunt

Geez it's nice to be back in Oslo! This city has such a special place in my heart for so many reasons, not least of which is that, by virtue of it being Charlotte's home town, we have so many friends and family here. Add in NDC Security this week with so many more mutual connections, beautiful snowy weather, snowboarding, sledging and even curling, it's just an awesome time. Awesome enough to still be here for the next weekly update so until then, I'll leave you with the pics I promised at the end of this week's vid. Enjoy 😊

Perfect Oslo - fresh snow, cool temps and sunshine 🇳🇴 pic.twitter.com/yPtnCkKIwo

— Troy Hunt (@troyhunt) January 15, 2024

References

  1. Sponsored by: Kolide ensures that if a device isn't secure, it can't access your apps. It's Device Trust for Okta. Watch the demo today!
  2. Standardising on USB-C as a common connector for all phones, tablets and cameras can only be a good thing (by extension, hopefully that will filter through to all the other USB-A / C / Mini / Micro connectors as well)
  3. Capelli finally got back to Scott and Joe regarding their lapsed domain the guys subsequently registered (yet still, their JavaScript remains running on the Capelli website 🤷‍♂️)
  4. The Hathway ISP in India went into HIBP (it's a weeks old incident, but it seems they're unwilling to make a statement on the breach whatsoever)

Weekly Update 381

By Troy Hunt

It's another weekly update from the other side of the world with Scott and I in Rome as we continue a bit of downtime before hitting NDC Security in Oslo next week. This week, Scott's sharing details of how he and Joe Tiedman registered a domain Capelli Sport let lapse and now have their JavaScript running on the website's shopping cart page (check your browser console after loading that link) 😲 That's not the crazy bit though, the crazy bit is the months they've spent trying to disclose this to Capelli and getting absolutely nowhere. I'll give them a shout-out this week and see if I have any more luck but when it's this hard to report egregiously bad security issues, is it any wonder we have so many data breaches? As I keep lamenting, it's a great time to be in this industry...

References

  1. Sponsored by: Unpatched devices keeping you up at night? Kolide can get your entire fleet updated in days. It's Device Trust for Okta. Watch the demo!
  2. 23andMe is blaming end users for account takeover attacks (it's obviously lawyery deflection, but they're also partly right)
  3. Anyone got a security contact at Capelli Sport? (I'll give that line a push publicly this coming week, it's just nuts how hard it is to report this stuff)

Weekly Update 380

By Troy Hunt

We're in Paris! And feeling proper relaxed after several days of wine and cheese too, I might add. This was a very impromptu end of 2023 weekly update as we balanced family time with doing the final video for the year. On the cyber side, the constant theme over the last week has been ransomware; big firms, little firms, Aussie firms, American firms - it's just completely indiscriminate. Anecdotally, this seems to have really ramped up over 2023 so on that basis, 2024 will bring... well, let's wait and see, this industry is nothing if not full of surprises. Happy New Year friends 😊

References

  1. Sponsored by: Unpatched devices keeping you up at night? Kolide can get your entire fleet updated in days. It's Device Trust for Okta. Watch the demo!
  2. Eagers Automotive in Australia got ransom'd (that's a fairly significant Aussie brand)
  3. The University of Western Australia has had a dump turn up on a popular hacking forum (not ransom by the look of it, but obviously still bad)
  4. Ohio Lottery is another ransomware victim (play the odds, lose your data)
  5. And no, you definitely can't use a credit card in the UK to buy lottery tickets (borrowing money to gamble ain't exactly financially sensible)
  6. Even a very localised Aussie taxi firm is on this week's ransomware books (I suspect there's a degree of automation that makes it a no-brainer to add even small firms)

Weekly Update 379

By Troy Hunt

It's that time of the year again, time to head from the heat to the cold as we jump on the big plane(s) back to Europe. The next 4 weekly updates will all be from places of varying degrees colder than home, most of them done with Scott Helme too so they'll be a little different to usual. For now, here's a pretty casual Christmas edition, see you next week from the other side 🙂

References

  1. Sponsored by: Unpatched devices keeping you up at night? Kolide can get your entire fleet updated in days. It's Device Trust for Okta. Watch the demo!
  2. K'gari / Fraser Island is just exceedingly beautiful (and now we need a bigger wall to put these photos up on 🤣)
  3. The Ubiquiti Dream Wall is a really sweet looking piece of kit (awesome solution to avoid having a full rack setup if you don't need it)
  4. I'll be back at NDC Oslo in June for the first time since 2019 (this is the event that gave me everything from a career to a wife - it's kinda special to me 😊)
  5. The story about a marketing company pitching ads based on eavesdropped conversations by mobile devices is really wild (for so long, this amounted to tinfoil-hattery, now here we are...)

Weekly Update 378

By Troy Hunt

I'd say the balloon fetish segment was the highlight of this week's video. No, seriously, it's a moment of levity in an otherwise often serious industry. It's still a bunch of personal info exposed publicly and that sucks regardless of the nature of the site, but let's be honest, the subject matter did make for some humorous comments 🤣

References

  1. Sponsored by: Identity theft isn’t cheap. Secure your family with Aura the #1 rated proactive protection that helps keep you safe online. Get started.
  2. I now have solar radiation and UV sensors tied into my IoT (in a week of bright sun constantly interjected by storm cells, this has been a really cool way to control lighting)
  3. Many people were left feeling deflated after the balloon fetish website got pwned (the whole thing was a real let down)
  4. The Twitter XSS + CSRF bug was rather nasty (but - assuming the reporting is accurate - it's their claimed handling of the bug report that's particularly bad)
  5. The DC Health Link breach was earlier this year and not particularly large at only 48k records (but it's in DC with a lot of politicians in it)

Weekly Update 377

By Troy Hunt

10 years later... 🤯 Seriously, how did this thing turn into this?! It was the humblest of beginnings with absolutely no expectations of anything, and now it's, well, massive! I'm a bit lost for words if I'm honest, I hope the chat with Charlotte adds some candour to this week's update, she's seen this thing grow since before its first birthday, through the hardest times and the best times and now lives and breathes HIBP day in day out with me. I hope you enjoy this video, and we'd both love to hear those swag ideas from you too 😊

References

  1. Sponsored by: Get insights into malware’s behavior with ANY.RUN: instant results, live VM interaction, fresh IOCs, and configs without limit.
  2. I wrote up a blog post on the highlights earlier this week (it still feels like I've missed a million things)

A Decade of Have I Been Pwned

By Troy Hunt

A decade ago to the day, I published a tweet launching what would surely become yet another pet project that scratched an itch, was kinda useful to a few people but other than that, would shortly fade away into the same obscurity as all the other ones I'd launched over the previous couple of decades:

It's alive! "Have I been pwned?" by @troyhunt is now up and running. Search for your account across multiple breaches http://t.co/U0QyHZxP6k

— Have I Been Pwned (@haveibeenpwned) December 4, 2013

And then, as they say, things kinda escalated quickly. The very next day I published a blog post about how I made it so fast to search through 154M records and thus began a now 185-post epic where I began detailing the minutiae of how I built this thing, the decisions I made about how to run it and commentary on all sorts of different breaches. And now, a 10th birthday blog post about what really sticks out a decade later. And that's precisely what this 185th blog post tagging HIBP is - the noteworthy things of the years past, including a few things I've never discussed publicly before.

Pwned?

You know why it's called "Have I Been Pwned"? Try coming up with almost any conceivable normal sounding English name and getting a .com domain for it. Good luck! That was certainly part of it, but another part of the name choice was simply that I honestly didn't expect this thing to go anywhere. It's like I said in the intro of this post where I fully expected this to be another failed project, so why does the name matter?

But it's weird how "pwned" has stuck and increasingly, become synonymous with HIBP. For many people, the first time they ever hear the word is in the context of "Have I Been..." with an ensuing discussion often explaining the origins of the term as it relates to gaming culture. And if you do go and look for a definition of the term online, you'll come across resources such as How “PWNED” went from hacker slang to the internet’s favourite taunt:

Then in 2013, when various web services and sites saw an uptick in personal data breaches, security expert Troy Hunt created the website “Have I Been Pwned?” Anyone can type in an email address into the site to check if their personal data has been compromised in a security breach.

And somehow, this little project is now referenced in the definition of the name it emerged from. Weird.

But, because it's such an odd name that has so frequently been mispronounced or mistyped, I've ended up with a whole raft of bizarre domain names including haveibeenpaened.com, haveibeenpwnded.com, haveibeenporned.com and my personal favourite, haveibeenprawned.com (because a journo literally pronounced it that way in a major news segment 🤦‍♂️). Not to mention all the other weird variations including haveibeenburned.com, haveigotpwned.com, haveibeenrekt.com and after someone made the suggestion following the revelation that PornHub follows me, haveibeenfucked.com 🤷‍♂️

Press

It's difficult to even know where to start here. How does the little site with the weird name end up in the press? Inevitably, "because data breaches", and it's nuts just how much exposure this project has had because of them. These are often mainstream news events and what reporters often want to impart to people is along the lines of "Here's what you should do if you've been impacted", which often boils down to checking HIBP.

Press is great for raising awareness of the project, but it has also quite literally DDoS'd the service with the Martin Lewis Money Show in the UK knocking it offline in 2016. Cool! No, for real, I learned some really valuable lessons from that experience which, of course, I shared in a blog post. And then ensured could never happen again.

Back in 2018, Gizmodo reckoned HIBP was one of the top 100 websites that shaped the internet as we knew it, alongside the likes of Wikipedia, Google, Amazon and Goatse (don't Google it). Only the year after it launched, TIME magazine reckon'd it was one of the 50 best websites of the year. And every time I do a Google search for a major news outlet, I find this little website. The Wall Street Journal. The Standard (nice headline!) USA Today. Toronto Star. De Telegraaf. VG. Le Monde. Corriere della Sera. It's wild - I just kept Googling for the largest newspapers in various parts of the world and kept getting hits!

The point is that it's had impact, and nobody is more surprised about that than me.

Congress

How on earth did I end up here?!

6 years and a few days ago now, I found myself in a place I'd only ever seen before in the movies: Congress. American Congress. Saying "pwned"!

For reasons I still struggle to completely grasp, the folks there thought it would be a good idea if I flew to the other side of the world and talked about the impact of data breaches on identity verification. "You know they're just trying to get you to DC so they can arrest you for all that stolen data you have, right?! 🤣", the internet quipped. But instead, I had one of the most memorable moments of my career as I read my testimony (these are public hearings so it's all recorded and available to watch), responded to questions from congressmen and congresswomen and rounded out the trip staring down at where they inaugurate presidents:

Today, that photo adorns the wall outside my office and dozens of times a day I look at it and ask the same question - how did it all lead to this?!

Svalbard

The potential sale of HIBP was a very painful, very expensive chapter of life, announced in a blog post from June 2019. For the most part, I was as transparent and honest as I could be about the reasons behind the decision, including the stress:

To be completely honest, it's been an enormously stressful year dealing with it all.

More than one year later, I finally wrote about the source of so much of that stress: divorce. Relationship circumstances had put a huge amount of pressure on me and I needed a relief valve which at the time, I thought would be the sale of the project I loved so much but was becoming increasingly demanding. Ultimately, Project Svalbard (the code name for the sale of HIBP), had the opposite effect as years of bitter legal battles with my ex ensued, in part due to the perceived value that would have been realised had it been sold and some big tech company owned my arse for years to come. The project I built out of a passion to do community good was now being used as a tool to extract as much money out of me as possible. There's a wild story to be told there one day but whilst that saga is now well and truly behind me, the scars are still raw.

There were many times throughout Project Svalbard where I felt like I was living out an episode of Silicon Valley, especially as I hopped between interviews at the who's-who of tech firms in San Francisco to meet potential acquirers. But there was one moment in particular that I knew at the time would form an indelible memory, so I took a photo of it:

I'm sitting in a rental car in Yosemite whilst driving from the aforementioned meetings in SF and onto Vegas for the annual big cyber-events. I had a scheduled call with a big tech firm who was a potential acquirer and should that deal go through, the guy I was speaking to would be my new boss. I'd done that dozens of times by now and I don't know if it was because I was especially tired or emotional or if there was something in the way he phrased the question, but this triggered something deep inside me:

So Troy, what would your perfect day in the office look like?

I didn't say it this directly, but I kid you not this is exactly what popped into my mind:

I get on my jet ski and I do whatever the fuck I want

My potential new overlord had somehow managed to find exactly the raw nerve to touch that made me realise how valuable independence had become to me. 6 months later, Project Svalbard was dead after a deal I'd struck fell through. I still can't talk about the precise circumstances due to being NDA'd up to wazoo, but the term we chose to use was "a change of business circumstances on behalf of the purchaser". With the benefit of hindsight, I've never been so happy to have lost so much 😊

The FBI

10 years ago, I certainly didn't see this on the cards:

This is so cool, thanks @FBI 😊 pic.twitter.com/aqMi3as91O

— Troy Hunt (@troyhunt) June 28, 2023

Nor did I expect them to be actively feeding data into HIBP. Or the UK's NCA to be feeding data in. Or various other law enforcement agencies the world over. And I never envisioned a time where dozens of national governments would be happy to talk about using the service.

A couple of months ago, the ABC wrote a long piece on how this whole thing is, to use their term, a strange sign of the times.

He’s just “a dude on the web”, but Troy Hunt has ended up playing an oddly central role in global cybersecurity.

It's strange until you look at it through the lens of aligned objectives: the whole idea of HIBP was "to do good things after bad things happen" which is well aligned with the mandates of law enforcement agencies. You could call it... common ground:

This is something I suspect a lot of people don't understand - that law enforcement agencies often work in conjunction with private enterprise to further their goals of protecting people just like you and me. It's something I certainly didn't understand 10 years ago, and I still remember the initial surprise when agencies started reaching out. Many years on, these have become really productive relationships with a bunch of top notch people, a number of whom I now count as friends and make an effort to spend time with on my travels.

Passwords

This was never on the cards originally. In fact, I'd always been adamant that there should never be passwords in HIBP although in my defence, the sentiment was that they should never appear next to the username they originally accompanied. But looking at passwords through the lens of how breach data can be used to do good things, a list of known compromised passwords disassociated from any form of PII made a lot of sense. So, in 2017, Pwned Passwords was born. You know what I was saying earlier about things escalating quickly? Yeah:

Setting all new records for Pwned Passwords this week: biggest day ever yesterday at 282M requests and biggest rolling 30 days ever, now passing the 6 *billion* requests mark! pic.twitter.com/dQiuQim3da

— Troy Hunt (@troyhunt) September 12, 2023

As if to make the point, I just checked the latest stats and last week we did 301.6M requests in a single day. 100% of those requests - and that's not a rounded number either, it's 100.0000000000% - were served from Cloudflare's cache 🤯

There's so much I love about this service. I love that it's free, there's no auth, it's entirely open source (both code and data), the FBI feeds data into it and perhaps most importantly, it has real impact on security. It's such a simple thing, but every time you see a headline such as "Big online website hit with credential stuffing attack", a significant portion of the accounts being taken over have passwords that could easily have been blocked.

The Paradox of Handling Data Breaches

On multiple occasions now, I've had conversations that can best be paraphrased as follows:

Random Internet Person: I'm going to report you to the FBI for having all that stolen data

Me: Maybe you should start by Googling "troy hunt fbi" first...

But I understand where they're coming from and the paradox I refer to is the perceived conflict between handling what is usually the output of a crime whilst simultaneously trying to perform a community good. It's the same discussion I've often had with people citing privacy laws in their corner of the world (often the EU and GDPR) as the reason why HIBP shouldn't exist: "but you're processing data without informed consent!", they'll claim. The issue of there being other legal bases for processing aside, nobody consents to being in a data breach! The natural progression of that conversation is that being in a data breach is a parallel discussion to HIBP then indexing it and making it searchable, which is something I've devoted many words to addressing in the past.

But for all the bluster the occasional random internet person can have (and honestly, I could count the number of annual instances of this on one hand), nothing has come of any complaints. And when I say "complaints", it's often nothing more than a polite conversation which may simply conclude with an acknowledgment of opposing views and that's it. There has been one exception in the entire decade of running this service where a complaint did come via a government privacy regulator; I responded to all the questions that were asked and that was the end of it.

People

When you have a pet project like HIBP was in the beginning, it's usually just you putting in the hours. That's fine, it's a hobby and you're scratching an itch, so what does it matter that there's nobody else involved? Like many similar passion projects, HIBP consumed a lot of hours from early on, everything from obviously building the service then sourcing data breaches, verifying and disclosing them, writing up descriptions and even editing every single one of those 700+ logos by hand to be just the right dimensions and file size. But in the beginning, if I'd just stopped one day, what would happen? Nothing. But today, a genuinely important part of the internet that a huge number of individuals, corporations and governments have built dependencies on would stop working if I lost interest.

The dependency on just me was partly behind the possible sale in 2019, but clearly that didn't eventuate. There was always the option to employ people and build it out like most people would a normal company, but every time I gave that consideration it just didn't stack up for a whole bunch of reasons. It was certainly feasible from the perspective of building some sort of valuable commercial entity, but in just the same way as that question about my perfect day in the office sucked the soul from my body, so did the prospect of being responsible for other people. Employment contracts. Salary negotiations. Performance reviews. Sick leave and annual leave and all sorts of other people issues from strangers I'd need to entrust with "my baby". So, bringing in more people was a really unattractive idea, with 2 exceptions:

In early 2021, my (soon to be at the time) wife Charlotte started working for HIBP.

Charlotte had spent the last 8 years working with people just like me; software nerds. As a project manager for the NDC conferences based out of Norway, she'd dealt with hundreds of speakers (including me on many occasions), and thousands of attendees at the best conference I've ever been a part of. Plus, she spent a great deal of time coordinating sponsors, corporate attendees and all sorts of other folks that live in the tech world HIBP inhabited. For Charlotte, even though she's not a technical person (her qualifications are in PR and entrepreneurial studies), this was very familiar territory.

So, for the last few years, Charlotte has done absolutely everything that she can to ensure that I can focus on the things that need my attention. She onboards new corporate subscribers, handles masses of tickets for API and domain subscribers and does all the accounting and tax work. And she does this tirelessly every single day at all sorts of hours whether we're at home or travelling. She is... amazing 🤩

Earlier this year, Stefán Jökull Sigurðarson started working for us part time writing code, cleaning up code, migrating code and, well, doing lots of different code things.

Just today I asked Stefán what I should write about him, thinking he'd give me some bullet points I'd massage and then incorporate into this blog post. Instead, I reckon what he wrote was so spot on that I'm just going to quote the entire thing here:

"Just" that having had my eye on the service since it was released and then developing one of the first big integrations with the PwnedPasswords v2 API in EVE, coinciding with us meeting for the first time at NDC Oslo in 2018 shortly after, HIBP has managed to take me on this awesome journey where it has been a part of launching my public speaking career, contributing to OSS with Pwned Passwords, becoming an MVP and helped me meet a bunch of awesome people and allowed me to contribute to a better and hopefully safer internet. I'm very happy and honoured to be a part of this project which is full of awesome challenges and interesting problems to deal with. Having meeting invites from the FBI in my inbox a few years after doing a few experimental rest calls to the Pwned Passwords API in early 2018 was definitely not something I was expecting 😅

What really resonated with me in Stefán's message is that for him, this isn't just a job, it's a passion. His journey is my journey in that we freely devoted our time to do something we love and it led to many wonderful things, including MVP roles and speaking at "Charlotte's" conference, NDC. Stefán is based in Iceland, but we've still had many opportunities to share beers together and establish a relationship that transcends merely writing code. I can't think of anyone better to do what he does today.

Breaches

731 breaches later, here we are. So, what stands out? Just going off the top of my head here:

Ashley Madison. Everyone knows the name so it needs no introduction, but that incident in 2015 had a major impact on HIBP in terms of use of the service, and also a major impact on me in terms of the engagements I had with impacted parties. My blog post on Here's what Ashley Madison members have told me still feels harrowing to read.

Collection #1. This is the one that really contributed to my stress levels in early 2019 and had a profound impact on my decision to look at selling the service. Read about where those 773M records came from (still the largest breach in HIBP to date).

Rosebutt. Don't make a joke about it, don't make a joke about it, don't... aw man, thanks The Register! (link to an archive.org version as they seem to have thought better of their image choice later on...) The point is that even serious data breaches can have their moments of levity.

Shit Express. Sometimes, you just need a bit of hilarity in your data breach. Shit Express is literally a site to send other people pieces of that - anonymously - and they got breached, thus somewhat affecting their anonymity. The more serious point is that as I later wrote, claims of anonymity are often highly misleading.

Future

I often joke about my life being very much about getting up each morning, reading my emails and events from overnight and then just winging it from there. Of course there are the occasional scheduled things not to mention travel commitments, but for the most part it's very much just rolling with whatever is demanding attention on the day. This is also probably a significant part of why I don't really want to see this thing grow into a larger concern with more responsibilities, I just don't want to lose that freedom. Yet...

We're gradually moving in a direction where things become more formalised. 3 years ago, I did 100% of everything myself. 1 year ago, I did everything technical myself. 6 months ago, we had no ticketing system for support. But these are small, incremental steps forward and that's what I'd like to see continuing. I want HIBP to outlive me, I just don't want it to become a burden I'm beholden to in the process. I'd like to have more people involved but as you can see from above, that's been a very slow process with only those very close to me playing a role.

The only thing I have real certainty on at the moment is that there will be more breaches. I've commented many times recently that the scourge that is ransomware feels like it's really accelerated lately, I wonder how many of the people in the emails and documents and all sorts of other data that get dumped there ever learn of their exposure? It's a non-trivial exercise to index that (for all sorts of reasons), but it also seems like an increasingly worthy exercise. Who knows, let's see how I feel when I get up tomorrow morning 🙂

Finally, for this week's regular video, I'm going to make a birthday special and do it live with Charlotte. Please come and join us, I'm not entirely sure what we'll cover (I'll work it out on the morning!) but let's make a virtual 10th birthday party out of it 🎂

Weekly Update 376

By Troy Hunt

I'm irrationally excited about the new Prusa 3D printer on order, and I think that's mostly to do with planning for the NDC Oslo talk I plan to do with Elle, my 11-year old daughter. I'm all for getting the kids exposure not just to tech, but also to being able to talk to others about tech, and involving them in conference talks from a young age has been a big part of that. But what I'm especially excited about is that this won't just be an "aw, isn't it cute seeing kids talk at a conference" kinda thing; she genuinely knows enough about this technology that together, we can make a talk that adults will learn something from. That's cool 😎

References

  1. Sponsored by: Kolide ensures that if a device isn't secure, it can't access your apps. It's Device Trust for Okta. Watch the demo today!
  2. Prusa MK4 inbound! (the MK3 has been such an awesome machine, the MK4 will be part of the NDC Oslo talk Elle and I do in June)
  3. If you're handy with .NET and feel like contributing to a cool open source project, have a look at our HIBP email address extractor (check out the open issues, there are a bunch of things there waiting for input)
  4. Breaches, breaches, breaches (there's a pretty regular cadence of new breaches flowing through right now, about one every 2-and-a-bit days based on the last 4 weeks.)

Weekly Update 375

By Troy Hunt

For a weekly update with no real agenda, we sure did spend a lot of time talking about the ridiculous approach Harvey Norman took to dealing with heavy traffic on Black Friday. It was just... unfathomable. A bunch of people chimed into the tweet thread and suggested it may have been by design, but they certainly wouldn't have set out to achieve the sorts of headlines that adorned the news afterwards. Who knows, but it made for entertaining content this week 🙂

References

  1. Sponsored by: Kolide ensures that if a device isn't secure, it can't access your apps. It's Device Trust for Okta. Watch the demo today!
  2. The Harvey Norman website outage was just, dumb (some people suggested it was a deliberate strategy to create demand)
  3. Unifi has launched a search feature for license plate recognition in their Protect app (I'd really like to see this data surfaced into Home Assistant so I can trigger events off specific vehicles)
  4. I mentioned Ubiquiti's funny ads about subscription services for video being reminiscent of the old "Mac versus PC ads" (there's a whole series of these, check out their YouTube channel for more)
  5. Australia Post's approach to verifying identities using digital driver's license appears to be "she'll be right mate" (let's see if that's just a teething problem and they start using the proper verifier soon)

Weekly Update 374

By Troy Hunt

Think about it like this: in 2015, we all lost our proverbial minds at the idea of the Kazakhstan government mandating the installation of root certificates on their citizens' devices. We were outraged at the premise of a government mandating the implementation of a model that could, at their behest, allow them to intercept traffic without any transparency or accountability. The EFF said the following at the time:

If the country's ruling regime were to successfully implement this plan, it would be able to snoop on, impersonate, and alter the online communications of anyone within their borders - effectively performing a Man in the Middle attack on its entire population.

Now watch the video, listen to Scott and ask yourself how different the technical capacity he discusses is from the Kazakhstan situation. Not from a policy perspective or the intentions of the respective government bodies, but rather in terms of the capabilities and lack of transparency it results in. It's nuts. But hey, it's a good time to be in this industry!

References

  1. Sponsored by: Identity theft isn't cheap. Secure your family with Aura the #1 rated proactive protection that helps keep you safe online. Get started.
  2. If it looks like a duck, swims like a duck, and QWACs like a duck, then it's probably an EV Certificate (Scott's original Jan 2022 post on the emergence of QWACs)
  3. What the QWAC?! (Scott's post from this month that expands on eIDAS, root certificates and other - to use the technical term - batshit crazy ideas)
  4. Did we learn nothing from the death of EV certificates?! (I posted that more than 4 years ago now after the EV indicator was removed from browser omnibars, effectively making them invisible to all but the most tech-savvy people)

Acuity Who? Attempts and Failures to Attribute 437GB of Breached Data

By Troy Hunt

Allegedly, Acuity had a data breach. That's the context that accompanied a massive trove of data that was sent to me 2 years ago now. I looked into it, tried to attribute and verify it then put it in the "too hard basket" and moved onto more pressing issues. It was only this week as I desperately tried to make some space to process yet more data that I realised why I was short on space in the first place:

Ah, yeah - Acuity - that big blue 437GB blob. What follows is the process I went through trying to work out what on earth this thing is, the confusion surrounding the data, the shady characters dealing with it and ultimately, how it's now searchable in Have I Been Pwned (HIBP), which may be what brought you to this blog post in the first place.

One of the first things I do after receiving a data breach is to literally just Google it: acuity data breach. Which immediately yielded this top result from June:

Ah, so Acuity is a healthcare company. But wait - here's the next result:

That's not about healthcare, that's Acuity Brands. How many companies called "Acuity" that have been breached are there?! Let's see what references I have in my email:

Another one 🤦‍♂️ That "breach" could be circumstantial, so we'll call it a "maybe", but it's yet another Acuity with a question mark next to it. So how many "Acuity" companies are out there in total?! Just in the course of investigating this data, I came across a total of 6 of them that as far as I can tell, are completely unrelated:

  1. Acuity Healthcare (definitely breached): acuity.healthcare
  2. Acuity Brands (definitely breached): acuitybrands.com
  3. Acuity Scheduling (maybe breached): acuityscheduling.com
  4. Acuity Insurance: acuity.com
  5. Acuity "Innovative technical solutions for Federal agencies that support the National Security & Public Safety missions": myacuity.com
  6. Acuity Ads: acuityads.com (now redirects to illumin.com)

Ugh, great. We'll work through them and try to figure out where they fit into the picture in a moment, but first let's look at the actual data. We already know it's 437GB, but it's the breadth of column headings that's most stunning; here's all 414 of them:

Just by eyeballing these, it really doesn't feel like the sort of data that comes from a healthcare provider, a brands company or a scheduler. The other 3, however... Maybe.

Some more data points before going further:

  1. The file is named "ACUITY_MASTER_18062020.csv" (this is the date I've elected to stamp the breach with - 18 June 2020)
  2. There are 21,873,706 email addresses in the file
  3. Of those, "only" 14,055,729 are unique so there's some redundancy
  4. The data is cleansed and formatted in a fashion that definitely isn't reflective of how data is entered by end users

On that final point, here's an example of what I'm talking about:

The last names are the same, as are the salutations. The physical addresses are spot on accurate in their structure as are the phone numbers; there are no spaces, no dashes and no other artifacts typical of millions of different humans entering data. This is clean - too clean.

The "datasource" field is another interesting data point with the top 10 values being:

  1. Buy.com
  2. Popularliving.com
  3. studentsreview.com
  4. TAGGED.COM
  5. jamster.com
  6. Expedia.com
  7. cbsmarketwatch.com
  8. netflix.com
  9. selfwealthsystem.com
  10. gocollegedegree.com

Each of these entries appeared at least hundreds of thousands of times, if not millions. Does that mean that Netflix, for example, provided customer data to this list? Almost certainly no, but it does feel reminiscent of the Acxiom / Live Ramp misattribution post I wrote a year ago where I listed full counts of a similar column. One of the top values there was also "TAGGED.COM" (also all in uppercase), alongside several other values that also appeared in both sources.
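For anyone wanting to reproduce this sort of tally on their own data, here's a rough sketch of counting total versus unique email addresses and the most frequent "datasource" values in one streaming pass. The "datasource" column name is from the file; the "email" column name and the sample rows are assumptions for illustration, and this isn't necessarily how the numbers above were produced.

```python
import csv
import io
from collections import Counter


def summarise(rows, email_col="email", source_col="datasource", top=10):
    """Return (total emails, unique emails, most common datasource values)."""
    total = 0
    unique = set()
    sources = Counter()
    for row in rows:
        email = (row.get(email_col) or "").strip().lower()
        if email:
            total += 1
            unique.add(email)  # case-insensitive de-duplication
        source = (row.get(source_col) or "").strip()
        if source:
            sources[source] += 1  # original casing kept, e.g. "TAGGED.COM"
    return total, len(unique), sources.most_common(top)


# Tiny made-up sample standing in for the real CSV.
SAMPLE = (
    "email,datasource\n"
    "a@example.com,TAGGED.COM\n"
    "A@example.com,TAGGED.COM\n"
    "b@example.com,netflix.com\n"
)

total, unique, top_sources = summarise(csv.DictReader(io.StringIO(SAMPLE)))
```

Because `csv.DictReader` streams one row at a time, the same function works on an open file handle of any size; the set of unique addresses is the only thing that grows, and at roughly 14M entries that's still manageable on a decent machine.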

Back to attribution and a post on a popular hacking forum jumps out:

Many things here line up, for example the column names that are very unique to this data source, including "estimatedincomecode", "del_point_check_digit" and "secondaryaddresspresent". The attribution is to the insurance company named "Acuity", but is that accurate? Insurance companies collect a lot of data as it's relevant to how they run their business, but that data is highly unlikely to include fields such as:

  1. SpectatorSportsBasketball
  2. SewingKnittingNeedlework
  3. PresenceOfUpscaleRetailCard

That's much more in the "data enrichment" space where a company sells a massive data set so that it can expand the profile data of the purchaser's existing customer base. It's a legitimate, honest, legal business model. It's also indistinguishable from this:

Hey, it's 437GB! And the column names line up! And it's called Acuity! Slightly different column count to mine (and similar but different to the hacker forum post), and slightly different email count, but the similarities remain striking. How I got to this resource is also interesting, having come via someone I was discussing the data with a couple of years ago:

The YouTube video is a walkthrough of a campaign management tool to send emails to customers. Could that indicate the data as coming from Acuity Ads (now Illumin)? No, not in and of itself, the walkthrough there isn't that dissimilar to other campaign tools I've used in the past. No matter how much I looked, I just couldn't find a solid lead back to Acuity Ads and anything even remotely related was merely circumstantial. It could be from them, but it could also be from many other places and the mere fact that a near identical corpus of data was sitting there on an outright spam site only makes the whole mystery that much deeper. There was just one more interesting data point in that email:

i myself am in that dataset and i've been getting 100x more phishing/scam calls, emails, and physical mail

Let me end this with a best guess: this feels like the same situation as the massive Master Deeds incident in South Africa in 2017. In that case, a legally operating data aggregator (I think you know how I feel about those by now...) sold personal information to a real estate business who then left it publicly exposed. I say it feels the same because it's just such a clean set of data and it's clearly very comprehensive in terms of the columns. It's exactly what I'd expect a data aggregator to prepare and sell to other businesses so they could identify which of their existing customers likes needlework.

In the past, publishing blog posts like this has helped identify an origin service and if that happens again here then I'll be sure to provide an update. For now, I've loaded it into HIBP and flagged it as a spam list which means it won't impact the size of anyone's domains and bump them into a different subscription level. If you do have any interesting insights on this data, please leave a comment below and with any luck, one of the Acuity entities out there will emerge as the source.

Note: just after loading the data, I ran the calcs on how many of the addresses were pre-existing in HIBP. This seems like a statistically significant number 😲

So, 100% (just under actually, but it rounded up). Working through a bunch of sample addresses, they appeared across all sorts of other existing spam lists and dodgy data aggregator breaches. Who knows which ones came first, just more data in the big swimming pool of breaches. https://t.co/Ux2rw6uaAk

- Troy Hunt (@troyhunt) November 15, 2023

Weekly Update 373

By Troy Hunt

Most of this week's video went on the scraped (and faked) LinkedIn data, but it's the ransomware discussion that keeps coming back to mind. Even just this morning, 2 days after recording this live stream, I ended up on national TV talking about the DP World security incident and whilst we don't have any confirmation yet, it has all the hallmarks of another ransomware case. In advance of that interview, I was trawling through various ransomware Tor sites and the volume of big names appearing there is just staggering. It does get me thinking: how many other individuals and corporations alike are being exposed through these and are never told about it? I wonder...

References

  1. Sponsored by: Webinar: 'How to Defend Against the Evilginx2.' Kuba Gretzky (Evilginx2) & Marcin Szary (Secfense) show a tool that counters MFA bypass.
  2. The LinkedIn scrape was a combination of data intended to be publicly consumable and lots of guessed email addresses (if you guess enough email addresses, you're bound to get some right!)
  3. The ransomware situation is getting just nuts, and it seems like there's no level criminals won't stoop to (that's a fascinating thread by Matt Johansen)
  4. The RDBMS component of HIBP is now running on "serverless" SQL Azure (yes, there are still servers, but it's not as obvious any more)

Hackers, Scrapers & Fakers: What's Really Inside the Latest LinkedIn Dataset

By Troy Hunt

Edit (1 day later): After posting this, the party responsible for leaking the data turned around and said "that was only a small part of it, here's the whole thing", and released a further 14M records. I've added those into HIBP and will shortly be re-sending notifications to people monitoring domains as the count of impacted addresses will likely have changed. Everything else about the subsequent dataset is consistent with what you'll read below in terms of structure, patterns and conclusions.

The same threat actor has leaked larger amounts of data from LinkedIn dated 2023. They claim this new data contains 35M lines and is 12 GB uncompressed. They also issue an apology to @troyhunt. #Breach #Clearnet #DarkWeb #DarkWebInformer #Database #Leaks #Leaked #LinkedIn https://t.co/qBFAofvppU pic.twitter.com/Clg5o92b6t

- Dark Web Informer (@DarkWebInformer) November 7, 2023

I like to think of investigating data breaches as a sort of scientific search for truth. You start out with a theory (a set of data coming from an alleged source), but you don't have a vested interest in whether the claim is true or not, rather you follow the evidence and see where it leads. Verification that supports the alleged source is usually quite straightforward, but disproving a claim can be a rather time consuming exercise, especially when a dataset contains fragments of truth mixed in with data that is anything but. Which is what we have here today.

To lead with the conclusion and save you reading all the details if you're not inclined, the dataset so many people flagged to me this week titled "Linkedin Database 2023 2.5 Millions" turned out to be a combination of publicly available LinkedIn profile data and 5.8M email addresses mostly fabricated from a combination of first and last name. It all began with this tweet:

A threat actor has allegedly leaked a database from LinkedIn @LinkedIn dated 2023. They claim the database shows emails, profile data, phones, full names, and more confidential info. #Breach #Clearnet #DarkWeb #DarkWebInformer #Database #Leaks #Leaked #LinkedIn pic.twitter.com/8MQecKc1vz

- Dark Web Informer (@DarkWebInformer) November 4, 2023

All good lies are believable at face value; is it feasible a massive corpus of LinkedIn data is floating around? Well, they were properly breached in 2012 to the tune of 164M records (by which I mean that incident was genuinely internal data such as email addresses and passwords extracted out by a vulnerability), then they were massively scraped in 2021 with another 126M records going into Have I Been Pwned (HIBP). So, when you see a claim like the one above, it seems highly feasible at face value which is what many people take it at. But I'm a bit more suspicious than most people 🙂

First, the claim:

This one is similar to my twitter data scrapped [sic] but for linkedin plus 2023

Now, there's a whole debate about whether scraped data is breached data and indeed whether the definition of it even matters. With the rising prevalence of scraped data, this topic came up enough that I wrote a dedicated blog post about it a couple of years ago and concluded the following in terms of how we should define the term "breach":

A data breach occurs when information is obtained by an unauthorised party in a fashion in which it was not intended to be made available

Which makes scrapes like this alleged one a breach. If indeed it was accurate, LinkedIn data had been taken and redistributed in a way it was never intended to be by either the service itself or the individuals whose data was in this corpus. So, it's something to take seriously, and that warranted further investigation.

I scrolled through the 10M+ rows of data (many records spanned multiple rows due to line returns), and my eyes fell on a fellow Aussie who for the purposes of this exercise we'll call "EM", being the initials of her first and last name. Whilst the data I'm going to refer to is either public by design or fabricated, I don't want to use a real person as an example without their consent so let's just play it safe. Here's a fragment of EM's record:

There are 5 noteworthy parts of this that immediately caught my attention:

  1. There are 5 different email addresses here with the alias for each one represented in "[first name].[last name]@" form. These exist in a column titled "PROFILE_USERNAMES". (Incidentally, this is why the headline of 2.5M accounts expands out to 5.8M email addresses as there are often multiple addresses per account.)
  2. There's a LinkedIn profile ID in the form of "[first name]-[last name]-[random hexadecimal chars]" under a column titled "PROFILE_LINKEDIN_ID". That successfully loaded EM's legitimate profile at https://www.linkedin.com/in/[id]/
  3. The numeric value in the "PROFILE_LINKEDIN_MEMBER_ID" column matched with the value on EM's profile from the previous point.
  4. The 2 dates starting with "2020-" are in columns titled "PROFILE_FETCHED_AT" and "PROFILE_LINKEDIN_FETCHED_AT". I assume these are self-explanatory.
  5. EM's first and last name, precisely as it appears in each of her 5 email addresses.

On its own, this record would be unremarkable. It'd be entirely feasible - this could very well be legit - except when you keep looking through the remainder of the data. A pattern quickly emerged and I'm going to bold it here because it's the smoking gun that ultimately indicates that a bunch of this data is fake:

Every single record with multiple email addresses had exactly the same alias on completely unrelated domains and it was almost always in the form of "[first name].[last name]@".

Representing email addresses in this fashion is certainly common, but it's far from ubiquitous, and that's easy to demonstrate. For example, I have tons of emails from Pluralsight so I dug one out from my friend "CU":

There's no dot, rather a dash. Every single real Pluralsight email address I looked at had a dash rather than a dot, yet when I delved into the alleged LinkedIn data and dug out another sample Pluralsight address, here's what I found:

That's not LM's real address because it has a dot instead of a dash. Every. Single. One. Is. Fake.

Let's try this the other way around and load up the existing breached accounts in HIBP for the domain of one of EM's alleged email addresses and see how they're formed:

That's definitely not the same format as EM's address, not by a long shot. And time and time again, the same pattern of addresses in the corpus of data in the original tweet emerged, drawing me to what seems to be a pretty logical conclusion:

Each email address was fabricated by taking the actual domain of a company the individual legitimately worked at and then constructing the alias from their name.

And these are legitimate companies too because every single LinkedIn profile I checked had all the cues of accurate information and each domain I checked in the corpus of data was indeed the correct one for the company they worked at. I imagine someone has effectively worked through the following logic:

  1. Get a list of LinkedIn profiles whether that be by ID or username or simply parsing them out of crawler results
  2. Scrape the profiles and pull down legitimate information about each individual, including their employment history
  3. Resolve the domain for each company they worked at and construct the email addresses
  4. Profit?
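Step 3 and the telltale pattern it leaves behind can be sketched in a few lines. This is purely illustrative: the function names, the example name and the `.example` domains are all made up, and it's a guess at the logic rather than anything recovered from whoever built the dataset.

```python
def fabricate(first: str, last: str, domains: list[str]) -> list[str]:
    """Build '[first name].[last name]@' addresses for each employer domain."""
    alias = f"{first}.{last}".lower()
    return [f"{alias}@{domain}" for domain in domains]


def looks_fabricated(first: str, last: str, emails: list[str]) -> bool:
    """Flag a record whose addresses all share the same first.last alias
    across more than one unrelated domain."""
    if len(emails) < 2:
        return False  # a single address proves nothing either way
    expected = f"{first}.{last}".lower()
    aliases = set()
    domains = set()
    for email in emails:
        alias, _, domain = email.lower().rpartition("@")
        aliases.add(alias)
        domains.add(domain)
    return aliases == {expected} and len(domains) > 1


# Hypothetical record: same alias on two unrelated domains.
suspect = fabricate("Em", "Example", ["one.example", "two.example"])
```

On its own, a flag like this doesn't prove an address is fake (plenty of companies really do use that format), which is why checking real addresses on the same domain, as above, is what settles it.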

On that final point, what is the point? The data wasn't being sold in that original tweet, rather it was freely downloadable. But per the date on EM's profile, the data could have been obtained much earlier and previously monetised. And on that, the date wasn't constant across records, rather there was a broad range of them as recent as July last year and as old as... well, I stopped when the records got older than me. What is this?!

I suspect the answer may partly lie in the column headings which I've pasted here in their entirety:

"PROFILE_KEY", "PROFILE_USERNAMES", "PROFILE_SPENDESK_IDS", "PROFILE_LINKEDIN_PUBLIC_IDENTIFIER", "PROFILE_LINKEDIN_ID", "PROFILE_SALES_NAVIGATOR_ID", "PROFILE_LINKEDIN_MEMBER_ID", "PROFILE_SALESFORCE_IDS", "PROFILE_AUTOPILOT_IDS", "PROFILE_PIPL_IDS", "PROFILE_HUBSPOT_IDS", "PROFILE_HAS_LINKEDIN_SOURCE", "PROFILE_HAS_SALES_NAVIGATOR_SOURCE", "PROFILE_HAS_SALESFORCE_SOURCE", "PROFILE_HAS_SPENDESK_SOURCE", "PROFILE_HAS_ASGARD_SOURCE", "PROFILE_HAS_AUTOPILOT_SOURCE", "PROFILE_HAS_PIPL_SOURCE", "PROFILE_HAS_HUBSPOT_SOURCE", "PROFILE_FETCHED_AT", "PROFILE_LINKEDIN_FETCHED_AT", "PROFILE_SALES_NAVIGATOR_FETCHED_AT", "PROFILE_SALESFORCE_FETCHED_AT", "PROFILE_SPENDESK_FETCHED_AT", "PROFILE_ASGARD_FETCHED_AT", "PROFILE_AUTOPILOT_FETCHED_AT", "PROFILE_PIPL_FETCHED_AT", "PROFILE_HUBSPOT_FETCHED_AT", "PROFILE_LINKEDIN_IS_NOT_FOUND", "PROFILE_SALES_NAVIGATOR_IS_NOT_FOUND", "PROFILE_EMAILS", "PROFILE_PERSONAL_EMAILS", "PROFILE_PHONES", "PROFILE_FIRST_NAME", "PROFILE_LAST_NAME", "PROFILE_TEAM", "PROFILE_HIERARCHY", "PROFILE_PERSONA", "PROFILE_GENDER", "PROFILE_COUNTRY_CODE", "PROFILE_SUMMARY", "PROFILE_INDUSTRY_NAME", "PROFILE_BIRTH_YEAR", "PROFILE_MARVIN_SEARCHES", "PROFILE_POSITION_STARTED_AT", "PROFILE_POSITION_TITLE", "PROFILE_POSITION_LOCATION", "PROFILE_POSITION_DESCRIPTION", "PROFILE_COMPANY_NAME", "PROFILE_COMPANY_LINKEDIN_ID", "PROFILE_COMPANY_LINKEDIN_UNIVERSAL_NAME", "PROFILE_COMPANY_SALESFORCE_ID", "PROFILE_COMPANY_SPENDESK_ID", "PROFILE_COMPANY_HUBSPOT_ID", "PROFILE_SKILLS", "PROFILE_LANGUAGES", "PROFILE_SCHOOLS", "PROFILE_EXTERNAL_SEARCHES", "PROFILE_LINKEDIN_HEADLINE", "PROFILE_LINKEDIN_LOCATION", "PROFILE_SALESFORCE_CREATED_AT", "PROFILE_SALESFORCE_STATUS", "PROFILE_SALESFORCE_LAST_ACTIVITY_AT", "PROFILE_SALESFORCE_OWNER_CONTACT_ID", "PROFILE_SALESFORCE_OWNER_CONTACT_NAME", "PROFILE_SPENDESK_SIGNUP_AT", "PROFILE_SPENDESK_DELETED_AT", "PROFILE_SPENDESK_ROLES", "PROFILE_SPENDESK_AVERAGE_NPS_SCORE", "PROFILE_SPENDESK_NPS_SCORES_COUNT", 
"PROFILE_SPENDESK_FIRST_NPS_SCORE", "PROFILE_SPENDESK_LAST_NPS_SCORE", "PROFILE_SPENDESK_LAST_NPS_SCORE_SENT_AT", "PROFILE_SPENDESK_PAYMENTS_COUNT", "PROFILE_SPENDESK_TOTAL_EUR_SPENT", "PROFILE_SPENDESK_ACTIVE_SUBSCRIPTIONS_COUNT", "PROFILE_SPENDESK_LAST_ACTIVITY_AT", "PROFILE_AUTOPILOT_MAIL_CLICKED_COUNT", "PROFILE_AUTOPILOT_LAST_MAIL_CLICKED_AT", "PROFILE_AUTOPILOT_MAIL_OPENED_COUNT", "PROFILE_AUTOPILOT_LAST_MAIL_OPENED_AT", "PROFILE_AUTOPILOT_MAIL_RECEIVED_COUNT", "PROFILE_AUTOPILOT_LAST_MAIL_RECEIVED_AT", "PROFILE_AUTOPILOT_MAIL_UNSUBSCRIBED_AT", "PROFILE_AUTOPILOT_MAIL_REPLIED_AT", "PROFILE_AUTOPILOT_LISTS", "PROFILE_AUTOPILOT_SEGMENTS", "PROFILE_HUBSPOT_CFO_CONNECT_SLACK_MEMBER_STATUS", "PROFILE_HUBSPOT_IS_CFO_CONNECT_MEETUPS_MEMBER", "PROFILE_HUBSPOT_CFO_CONNECT_AREAS_OF_EXPERTISE", "PROFILE_HUBSPOT_CORPORATE_FINANCE_EXPERIENCE_YEARS_RANGE"

Check out some of those names: LinkedIn is obviously there, but so is Salesforce and Spendesk and Hubspot, among others. This reads more like an aggregation of multiple sources than it does data solely scraped from LinkedIn. My hope is that in posting this someone might pop up and say "I recognise those column headings, they're from..." Who knows.
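One quick way to see the aggregation is to pull the source names straight out of the headings. The sketch below uses a handful of the column names listed above; the function names are made up for illustration, and it assumes the `PROFILE_HAS_<NAME>_SOURCE` pattern marks each upstream source.

```python
import re

# A subset of the column headings from the data set.
headings = [
    "PROFILE_HAS_LINKEDIN_SOURCE",
    "PROFILE_HAS_SALES_NAVIGATOR_SOURCE",
    "PROFILE_HAS_SALESFORCE_SOURCE",
    "PROFILE_HAS_SPENDESK_SOURCE",
    "PROFILE_HAS_ASGARD_SOURCE",
    "PROFILE_HAS_AUTOPILOT_SOURCE",
    "PROFILE_HAS_PIPL_SOURCE",
    "PROFILE_HAS_HUBSPOT_SOURCE",
    "PROFILE_EMAILS",
    "PROFILE_SPENDESK_TOTAL_EUR_SPENT",
]

SOURCE_FLAG = re.compile(r"^PROFILE_HAS_(.+)_SOURCE$")


def named_sources(columns: list[str]) -> list[str]:
    """Extract the source name from every PROFILE_HAS_<NAME>_SOURCE column."""
    return [m.group(1) for c in columns if (m := SOURCE_FLAG.match(c))]


def columns_for(source: str, columns: list[str]) -> list[str]:
    """List the columns that carry data attributed to one source."""
    return [c for c in columns if c.startswith(f"PROFILE_{source}_")]


print(named_sources(headings))
print(columns_for("SPENDESK", headings))
```

Eight distinct source flags in one header row is hard to square with a single LinkedIn scrape.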

So, here's where that leaves us: this data is a combination of information sourced from public LinkedIn profiles, fabricated email addresses and in part (anecdotally based on simply eyeballing the data this is a small part), the other sources in the column headings above. But the people are real, the companies are real, the domains are real and in many cases, the email addresses themselves are real. There are over 1.8k HIBP subscribers in the data set and this is folks that have double opted-in so they've successfully received an email to that address in the past. Further, when the data was loaded into HIBP there were nearly a million email addresses that were already in the system so evidently, they were addresses that had previously been in use. Which stands to reason because even if every address was constructed by an algorithm, the pattern is common enough that there'll be a bunch of hits.

Because the conclusion is that there's a significant component of legitimate data in this corpus, I've loaded it into HIBP. But because there are also a significant number of fabricated email addresses in there, I've flagged it as a spam list which means the addresses won't impact the scale of anyone's paid subscription if they're monitoring domains. And whilst I know some people will suggest it shouldn't go in at all, time and time again when I've polled the public about similar incidents the overwhelming majority of people have said "we want to know about it then we'll make up our own minds what action needs to be taken". And in this case, even if you find an email address on your domain that doesn't actually exist, that person who either currently works at your company or previously did has still had their personal data dumped in this corpus. That's something most people will still want to know.

Lastly, one of the main reasons I decided to invest hours into this today is that I loathe disinformation and I hate people using that to then make statements that are completely off base. I'm looking at my Twitter feed now and see people angry at LinkedIn for this, blaming an insider due to recent layoffs there, accusing them of mishandling our data and so on and so forth. No, not this time, the evidence has led us somewhere completely different.

Weekly Update 372

By Troy Hunt

Yes, the Lenovo is Chinese. No, I'm not worried about Superfish. Yes, I'm running Windows. No, I don't want a Framework laptop. Seemed to be a lot of time this week gone on talking all things laptops, and there are clearly some very differing views on the topic. Some good suggestions, some neat alternatives and some ideas that, well, just seem a little crazy. But hey, I'm super happy with the machine, it's an absolute beast and I expect I'll get many years of hard work out of it. That and more in this week's video, enjoy 😊

References

  1. Sponsored by: Need centralized and real-time visibility into threat detection and mitigation? We got you! Discover the CrowdSec Console today.
  2. My primary mobile machine is now a Lenovo P16 Gen 2 ThinkPad (super happy with this machine, it's an absolute beast!)
  3. If you don't want my Coinhive script running on your website, don't put my Coinhive script on your website (I don't mean to state the obvious, but yeah...)
  4. I Lenny Troll'd our Ubiquiti doorbell to mess with kids on Halloween (these audio files are great, I've gotta actually put them to use against scammers 🤣)
  5. The kitchen is done! (compare that to where we started in the first tweet 😲)

Weekly Update 371

By Troy Hunt

So I wrapped up this week's live stream then promptly blew hours mucking around with Zigbee on Home Assistant. Is it worth it, as someone asked in the chat? Uh, yeah, kinda, mostly. But seriously, having a highly automated house is awesome and I suggest that most people watching these vids harbour the same basic instinct as I do to try and improve our lives through technology. The coordination of lights with times of day, the security checks around open doors, the controlling of fans and air conditioning to keep everyone comfy, it just rocks... when it works 😎

References

  1. Sponsored by: Got Linux? (And Mac and Windows and iOS and Android?) Then Kolide has the device trust solution for you. Click here to watch the demo.
  2. 1Password got caught up in the Okta incident (it had no impact, but it does make you wonder about the soundness of passing around HAR files...)
  3. Does a service use HIBP for their "dark web" search? (it depends: some state it explicitly and some explicitly ask it not to be stated, so I simply neither confirm nor deny)
  4. It's finally time to migrate HIBP away from Table Storage (that post is almost a decade old now and explains why I went with this construct to begin with)
  5. I'm rolling all my Zigbee things from deCONZ with a Conbee to ZHA with Home Assistant Yellow (it's painful, but shout out to those who helped during the live stream and followed up later via Twitter)

Weekly Update 370

By Troy Hunt

I did it again - I tweeted about Twitter doing something I thought was useful and the hordes did descend on Twitter to tweet about how terrible Twitter is. Right, gotcha, so 1.3M views of that tweet later... As I say in this week's video, there's a whole bunch of crazy arguments in there but the thing that continues to get me the most in every one of these discussions is the argument that Elon is a poo poo head. No, seriously, I explain at the end of the video how the counterarguments so often have no rational basis and constantly boil down to a dislike of the guy. Ironically, continuing to use Twitter to have a rant about stuff just shows that Twitter is just the same as it always was 🤣

References

  1. Sponsored by: Got Linux? (And Mac and Windows and iOS and Android?) Then Kolide has the device trust solution for you. Click here to watch the demo.
  2. I put out a little tweet about Twitter charging new accounts in a couple of test markets $1... (...and people lost. their. minds.)
  3. The virtual cards service Simon mentioned is privacy.com (I gave it a go and got about 10 seconds into it before getting "You must be a US resident, and agree to the terms and authorizations", after which I was asked for name, DoB and address... and this helps anonymity?!)
  4. If you were IM'ing like it's 1999, you may be one of 75k people in the Phoenix breach (it's "vintage messaging reborn")
  5. The AndroidLista breach with 6.6M records went into HIBP (that one had been around for a while but with no disclosure and no response when I reached out, it just took a while)

Weekly Update 369

By Troy Hunt

There seemed to be an awful lot of time gone on the 23andMe credential stuffing situation this week, but I think it strikes a lot of important chords. We're (us as end users) still reusing credentials, still not turning on MFA and still trying to sue when we don't do these things. And we as builders are still creating systems that allow this to happen en masse. All that said, I don't know how we build systems that are resilient to a single person coming along and entering someone else's (probably) reused credentials into a normal browser session, at least not without introducing additional barriers to entry that will upset the marketing manager. And so, I'm back at the only logical conclusion I think we can all agree on right now: it's a great time to be working in this industry 😊

References

  1. Sponsored by: Online fraud is everywhere. Secure your finances and personal info with Aura’s award-winning identity protection. Protect your identity now.
  2. 23andMe has been getting hammered in a credential stuffing attack (as I always say, defending against this is a shared responsibility: individuals need to work on their account security hygiene, and websites need to expect and defend against this sort of thing)
  3. And now they're getting sued in a class action, a mere 4 days after the event 🤦‍♂️ (someone really should write a blog post about how stupid this is...)
  4. ...here's a blog post about how stupid class actions like this are! (when I'm getting lawyers asking me to advertise their class action suits on HIBP, you know damn well who's getting rich out of all this, and it ain't the plaintiffs)
  5. The Bureau van Dijk data breach is now in HIBP (we should be asking a lot more questions about why data aggregators collecting this sort of info still exist)

Weekly Update 368

By Troy Hunt

This must be my first "business as usual" weekly update since August and damn it's nice to be back to normal! New sponsor, new breaches, new blog post and if you're in this part of the world, a brand new summer creeping over the horizon. I've now got a couple of months with very little in the way of travel plans and a goal to really knock a bunch of new HIBP features out of the park, some of which I talk about in this week's video. Enjoy! 🍻

References

  1. Sponsored by: NTT’s Samurai XDR offers affordable enterprise-grade security for businesses of any size. $40 /endpoint/year. Try it free for 30 days!
  2. The Horse Isle breach went into HIBP (if you're a big fan of fantasy horse games, this one is for you!)
  3. The Activision breach also went into HIBP (only employees and what looks like contractors in this one, probably more embarrassing for the organisation than actually impactful)
  4. And the Hjedd breach went into HIBP too (if you're a big fan of Chinese porn, well, uh, yeah...)
  5. You never actually believed the claims of "safe, secure, anonymous", did you? (turns out that's literally horseshit 🐎)

Safe, Secure, Anonymous, and Other Misleading Claims

By Troy Hunt

Imagine you wanted to buy some shit on the internet. Not the metaphorical kind in terms of "I bought some random shit online", but literal shit. Turds. Faeces. The kind of thing you never would have thought possible to buy online until... Shitexpress came along. Here's a service that enables you to send an actual piece of smelly shit to "An irritating colleague. School teacher. Your ex-wife. Filthy boss. Jealous neighbour. That successful former classmate. Or all those pesky haters." But it would be weird if the intended recipient of the aforementioned shit knew it came from you, so, Shitexpress makes a bold commitment:

[Image]

100% anonymous! Not 90%, not 95% but the full whack 100%! And perhaps they really did deliver on that promise, at least until one day last year:

New sensitive breach: Faeces delivery service Shitexpress had 24k email addresses breached last week. Data also included IP and physical addresses, names, and messages accompanying the posted shit. 76% were already in @haveibeenpwned. Read more: https://t.co/7R7vdi1ftZ

- Have I Been Pwned (@haveibeenpwned) August 16, 2022

When you think about it now, the simple mechanics of purchasing either metaphorical or literal shit online dictates collecting information that, if disclosed, leaves you anything but anonymous. At the very least, you're probably going to provide your own email address, your IP will be logged somewhere and payment info will be provided that links back to you (Bitcoin was one of many payment options and is still frequently traceable to an identity). Then of course if it's a physical good, there's a delivery address although in the case above, that's inevitably not going to be the address of the purchaser (sending yourself shit would also just be weird). Which is why following the Shitexpress data breach, we can now easily piece together information such as this:

[Image]

Here we have an individual who, one day last year, went on an absolute (literal) shit-posting bender, posting off half a dozen boxes of excrement to heavy hitters in the US justice system. For 42 minutes, this bright soul (whose IP address was logged with each transaction) sent abusive messages from their iPhone (the user agent is also in the logs) to some of the most powerful people in the land. Did they only do this on the assumption of being "100% anonymous"? Possibly, it certainly doesn't seem like the sort of activity you'd want to put your actual identity to but hey, here we are. Who knows if there were any precautions taken by this individual to use an IP that wasn't easily traceable back to them, but that's not really the point; an attribute that will very likely be tied back to a specific individual if required was captured, stored and then leaked. IP not enough to identify someone? Hmmm... I wonder what other information might be captured during a purchase...

[Image]

Uh, yeah, that's all pretty personally identifiable! And there are nearly 10k records in the "invoices_stripe.csv" file that include invoice IDs so if you paid by credit card, good luck not having that traced back to you (KYC obligations ain't real compatible with anonymously posting shit).

Now, where have we heard all this before? The promise of anonymity and data protection? Hmmm...

[Image]

"Anonymous". "Discreet". That was July 2015, and we all know what happened next. It wasn't just the 30M+ members of the adultery website that were exposed in the breach, it was also the troves of folks who joined the service, thought better of it, paid to have their data deleted and then realised the "full delete" service, well, didn't. Why did they think their data would actually be deleted? Because the website told them it would be.

Vastaamo, the Finnish service referred to as "the McDonalds of psychotherapy", was very clear around the privacy of the data they collected:

[Image]

Until a few years ago when the worst conceivable scenario was realised:

A security flaw in the company’s IT systems had exposed its entire patient database to the open internet - not just email addresses and social security numbers, but the actual written notes that therapists had taken.

What made the Vastaamo incident particularly insidious was that after failing to extract the ransom demand from the company itself, the perpetrator (for whom things haven't worked out so well this year) then proceeded to ransom the individuals:

If we do not receive this payment within 24 hours, you still have another 48 hours to acquire and send us 500 euros worth of Bitcoins. If we still don't receive our money after this, your information will be published: your address, phone number, social security number, and your exact patient report, which includes e.g. transcriptions of your conversations with the Receptionist's therapist/psychiatrist.

And then it was all dumped publicly anyway.

Here's what I'm getting at with all this:

Assurances of safety, security and anonymity aren't statements of fact, they're objectives, and they may not be achieved

I've written this post as I have so many others so that it may serve as a reference in the future. Time and time again, I see the same promises as above as though somehow words on a webpage are sufficient to ensure data security. You can trust those words just about as much as you can trust the promise of being able to choose the animal the excrement is sourced from, which turns out to be total horseshit 🐎

[Image]

Weekly Update 367

By Troy Hunt

Ah, home 😊 It's been more than a month since I've been able to sit at this desk and stream a weekly video. And now I'm doing it with the glorious spring weather just outside my window, which I really must make more time to start enjoying. Anyway, this week is super casual due to having had zero prep time, but I hope the discussion about the ABC's piece on HIBP and I in particular is interesting. I feel like this whole story has a long way to go yet, hopefully now having a few months at home will give us an opportunity to lay the foundation for the next phase. Stay tuned!

References

  1. Sponsored by: EPAS by Detack. No EPAS protected password has ever been cracked and won't be found in any leaks. Give it a try, millions of users use it.
  2. "A strange sign of the times" (the ABC's piece on HIBP and I)
  3. I mentioned "Outliers, the Story of Success" as one of my favourite books (turns out it's a combination of hard work and good luck, neither of which is sufficient by itself)
  4. Talking about good luck, the story of my leaving Pfizer is in one of my favourite ever talks, "Hack Your Career" (I need to do a follow-up on this, there's so much more to add now)

Weekly Update 366

By Troy Hunt

Well that's it, Europe is done! I've spent the week in Prague with highlights including catching up with Josef Prusa, keynoting at Experts Live EU and taking a "beer spa" complete with our own endless supply of tap beer. Life is good 🍻

That’s it - we’ve peaked - life is all downhill from here 🤣 🍻 #BeerSpa pic.twitter.com/ezCpUC6XEK

- Troy Hunt (@troyhunt) September 21, 2023

All that and more in this week's video, next week I'll come to you from back home in the sunshine 😎

References

  1. Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
  2. I caught up with Josef Prusa in Prague (what he has created at Prusa is massively impressive!)
  3. Experts Live EU was an awesome event 😎 (felt a lot of love in Prague, thanks everyone 😊)
  4. The dbForums data breach went into HIBP (and... that's me pwned again 😭)
  5. The ApexSMS spam operation that exposed data a few years back also went into HIBP (it's one of those ones you really can't do anything about, think of it as an "FYI")

Weekly Update 365

By Troy Hunt

It's another week of travels, this time from our "second home", Oslo. That's off the back of 4 days in the Netherlands and starting tomorrow, another 4 in Prague. But today, the 17th of September, is extra special 😊

1 year today ❤️ pic.twitter.com/vsRChdDshn

- Troy Hunt (@troyhunt) September 17, 2023

We'll be going out and celebrating accordingly as soon as I get this post published so I'll be brief: enjoy this week's video!

References

  1. Sponsored by: 1 in 3 families have been affected by fraud. Secure your personal info with Aura’s award-winning identity protection. Start free trial.
  2. We had a great visit to Politie Nederland in Rotterdam this week (lots of common goals shared, and I'm really happy we've been able to assist with victim notification via HIBP)
  3. 932k Viva Air email addresses went into HIBP (that's a Colombian airline which no longer exists, they were pwned and ransomed last year)
  4. 4.3M Malindo Air email addresses went into HIBP (it's a 2019 breach so not new, but a third of people in there had never appeared in a loaded breach before)
  5. Wasn't really expecting to be named on a notorious ransomware website, but here we are (2 days after recording I still haven't heard anything further)
  6. I wasn't expecting anything revolutionary, but I'd really hoped for more excitement in the new iPhones (but I ordered us both Pro Max units anyway 😎)

Weekly Update 364

By Troy Hunt

I'm in Spain! Alicante, to be specific, where we've spent the last few days doing family wedding things, and I reckon we scrubbed up pretty well:

Getting fancy in Spain 😍 pic.twitter.com/iDFmBORnHa

- Troy Hunt (@troyhunt) September 9, 2023

Next stop is Amsterdam and by the end of today, we'll be sipping cold beer canal side in the 31C heat 😎 Meanwhile, this week's video focuses mostly on the Dymocks breach and the noteworthiness of what appears to be excessive data retention. After recording this video, someone also pointed out that the data is already being abused in a pretty traceable fashion:

@troyhunt not sure if this is particularly useful but I just received this scam attempt. I use iCloud's Hide My Email service and the address this email was sent to was the same address iCloud generated for use with my Dymocks account. pic.twitter.com/GiFZ7EIDo2

- matt (@matt_0833) September 9, 2023

That's all for this week, a little shorter as I was rushing for the wedding, I'll come to you next week from our second home, Oslo 🇳🇴

References

  1. Sponsored by: Fastmail. Check out Masked Email, built with 1Password. One click gets you a unique email address for every online signup. Try it now!
  2. Dymocks Australia found themselves breached (I suspect the significant number of retained inactive records will cause them some grief)
  3. No, data breaches don't typically just sit on the "dark web", they circulate broadly on easily accessible forums (that's true of the vast bulk of data in HIBP!)

  • September 10th 2023 at 07:58

Weekly Update 363

By Troy Hunt

I'm super late pushing out this week's video, I mean to the point where I now have a couple of days before doing the next one. Travel from the opposite side of the world is the obvious excuse, then frankly, just wanting to hang out with friends and relax. And now, I somehow find myself publishing this from the most mind-bending set of circumstances:

Heading to 31C. Cold beer. Warm pool. How is this in England?! 🤯 pic.twitter.com/tQSbHaoLhG

- Troy Hunt (@troyhunt) September 6, 2023

On that note, straight into the video, links below and I'll do it all again in a couple of days from Spain:

References

  1. The FBI took down Qakbot and sent the data over to HIBP (that's both email addresses and passwords that are now searchable)
  2. CERT Poland also sent over a bunch of data snagged from phishing activities (another 68k records now searchable in HIBP)
  3. The Pampling breach went into HIBP despite not being able to get a response from them... (...until it went into HIBP and customers started asking questions)
  4. PlayCyberGames was also breached and the data went into HIBP... (...and they also didn't respond to disclosure attempts - at all)
  5. If you're building websites and you haven't given Report URI a go yet, you don't know what you're missing! (seriously, CSPs are so cool 😎)
  6. Sponsored by: Fastmail. Check out Masked Email, built with 1Password. One click gets you a unique email address for every online signup. Try it now!

68k Phishing Victims are Now Searchable in Have I Been Pwned, Courtesy of CERT Poland

By Troy Hunt

Last week I was contacted by CERT Poland. They'd observed a phishing campaign that had collected 68k credentials from unsuspecting victims and asked if HIBP may be used to help alert these individuals to their exposure. The campaign began with a typical email requesting more information:

[Image]

In this case, the email contained a fake purchase order attachment which requested login credentials that were then posted back to infrastructure controlled by the attacker:

[Image]

All in all, CERT Poland identified 202 other phishing campaigns using the same infrastructure which has subsequently been taken offline. Data accumulated by the malicious activity spanned from October 2022 until just last week.

The advice to impacted individuals is as follows:

  1. Get a digital password manager to help you make all passwords strong and unique
  2. If you've been reusing passwords, change them to strong and unique versions now, starting with the most important services you use
  3. Turn on multi-factor authentication wherever it's available, especially for important accounts such as email, social media and banking
  4. Never open attachments or follow links unless you're confident in the trustworthiness of their origin and if in doubt, delete the email

Data From The Qakbot Malware is Now Searchable in Have I Been Pwned, Courtesy of the FBI

By Troy Hunt

Today, the US Justice Department announced a multinational operation involving actions in the United States, France, Germany, the Netherlands, and the United Kingdom to disrupt the botnet and malware known as Qakbot and take down its infrastructure. Beyond just taking down the backbone of the operation, the FBI began actively intercepting traffic from the botnet and instructing infected machines to uninstall the malware:

To disrupt the botnet, the FBI was able to redirect Qakbot botnet traffic to and through servers controlled by the FBI, which in turn instructed infected computers in the United States and elsewhere to download a file created by law enforcement that would uninstall the Qakbot malware

As part of the operation, the FBI have requested support from Have I Been Pwned (HIBP) to help notify impacted victims of their exposure to the malware. We provided similar support in 2021 with the Emotet botnet, although this time around with a grand total of 6.43M impacted email addresses. These are now all searchable in HIBP, albeit with the incident flagged as "sensitive", so you'll need to verify you control the email address via the notification service first, or you can search any domains you control via the domain search feature. Further, the passwords from the malware will shortly be searchable in the Pwned Passwords service which can either be checked online or via the API. Pwned Passwords is presently requested 5 and a half billion times each month to help organisations prevent people from using known compromised passwords.
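For anyone wanting to check passwords via the API, here's a minimal Python sketch of the Pwned Passwords range search. It assumes the public `https://api.pwnedpasswords.com/range/{prefix}` endpoint, which takes the first 5 characters of the SHA-1 hash and returns `SUFFIX:COUNT` lines; the function names and the User-Agent string are made up for illustration.

```python
import hashlib
import urllib.request

RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_hash(password: str) -> tuple[str, str]:
    """SHA-1 the password and split it into the 5-char prefix and 35-char suffix."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(suffix: str, range_body: str) -> int:
    """Find the suffix in a 'SUFFIX:COUNT' response body; 0 means not found."""
    for line in range_body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0


def pwned_count(password: str) -> int:
    """Query the range endpoint; only the 5-char hash prefix leaves the machine."""
    prefix, suffix = split_hash(password)
    request = urllib.request.Request(
        RANGE_URL + prefix,
        headers={"User-Agent": "pwned-passwords-sketch"},  # illustrative value
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        body = response.read().decode("utf-8")
    return count_in_range(suffix, body)


# Offline demonstration of the split for the password "password".
print(split_hash("password"))
```

Calling `pwned_count("password")` makes a live request, so it's kept out of the demonstration above. The design point is that neither the password nor its full hash is ever sent to the service.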

Guidance for those impacted by this incident is the same tried and tested advice given after previous malware incidents:

  1. Keep security software such as antivirus up to date with current definitions. I personally use Microsoft Defender which is free, built into Windows and updates automatically via Windows Update.
  2. If you're reusing passwords across services, get a password manager and change them to be strong and unique.
  3. Enable multi-factor authentication where supported, at least for your most important services (email, banking, social, etc.)
  4. For administrators with affected users, CISA has a report which explains the malware in more detail, including links to YARA rules to help identify the presence of the malware within your network.

Weekly Update 362

By Troy Hunt

Somehow in this week's video, I forgot to talk about the single blog post I wrote this week! So here's the elevator pitch: Cloudflare's Turnstile is a bot-killing machine I've had enormous success with for the "API" (quoted because it's not meant to be consumed by others), behind the front page of HIBP. It's unintrusive, is super easy to implement and kills bots dead. There you go, how's that for a last minute pitch? 😊

References

  1. Sponsored by: Unpatched devices keeping you up at night? Kolide can get your entire fleet updated in days. It's Device Trust for Okta. Watch the demo!
  2. Fight the bots with Cloudflare's Turnstile (and hey, if you can find a way through it, let me know and I'll pass your feedback on to Cloudflare)
  3. If you enjoy discussing escorts on public forums, you may be in the ECCIE breach (along with your email and IP address 😳)
  4. But you probably won't be in the Atmeltomo breach (unless you're Japanese and looking for a friend)
  5. The Duolingo scrape from earlier this year is now doing the rounds (that's a 100% hit rate with other breaches)
  6. And SevenRooms had their near half a TB breach from December start circulating (that's one of the largest we've seen in a long time)

Fighting API Bots with Cloudflare's Invisible Turnstile

By Troy Hunt

There's a "hidden" API on HIBP. Well, it's not "hidden" insofar as it's easily discoverable if you watch the network traffic from the client, but it's not meant to be called directly, rather only via the web app. It's called "unified search" and it looks just like this:

Fighting API Bots with Cloudflare's Invisible Turnstile

It's been there in one form or another since day 1 (so almost a decade now), and it serves a sole purpose: to perform searches from the home page. That is all - only from the home page. It's called asynchronously from the client without needing to post back the entire page and by design, it's super fast and super easy to use. Which is bad. Sometimes.

To understand why it's bad we need to go back in time all the way to when I first launched the API that was intended to be consumed programmatically by other people's services. That was easy, because it was basically just documenting the API that sat behind the home page of the website already, the predecessor to the one you see above. And then, unsurprisingly in retrospect, it started to be abused so I had to put a rate limit on it. Problem is, that was a very rudimentary IP-based rate limit and it could be circumvented by someone with enough IPs, so fast forward a bit further and I put auth on the API which required a nominal payment to access it. At the same time, that unified search endpoint was created and home page searches updated to use that rather than the publicly documented API. So, 2 APIs with 2 different purposes.
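To illustrate why a rudimentary IP-based limit is easy to sidestep, here's a minimal sketch of a fixed-window, per-IP rate limiter. It is not the limiter HIBP used; the function names, the limit of 10 and the documentation-range IP addresses are all assumptions made up for the example.

```javascript
// Minimal fixed-window, per-IP rate limiter (illustrative only, not HIBP's code)
function createIpRateLimiter(limit, windowMs) {
  const windows = new Map(); // ip -> { start, count }

  return function allow(ip, now) {
    const entry = windows.get(ip);
    // Start a fresh window the first time an IP is seen or once the old one expires
    if (!entry || now - entry.start >= windowMs) {
      windows.set(ip, { start: now, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= limit;
  };
}

// 1,000 requests from a single IP: only the first 10 in the window get through
const allowSingle = createIpRateLimiter(10, 60000);
let singleIpAllowed = 0;
for (let i = 0; i < 1000; i++) {
  if (allowSingle('203.0.113.1', 0)) singleIpAllowed++;
}

// The same 1,000 requests spread across 100 IPs: every one of them gets through
const allowMany = createIpRateLimiter(10, 60000);
let manyIpAllowed = 0;
for (let i = 0; i < 1000; i++) {
  if (allowMany('198.51.100.' + (i % 100), 0)) manyIpAllowed++;
}

console.log(singleIpAllowed, manyIpAllowed);
```

The limiter does exactly what it was asked to do in both cases, which is the problem: the limit is attached to something the abuser can cheaply get more of.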

The primary objective for putting a price on the public API was to tackle abuse. And it did - it stopped it dead. By attaching a rate limit to a key that required a credit card to purchase it, abusive practices (namely enumerating large numbers of email addresses) disappeared. This wasn't just about putting a financial cost to queries, it was about putting an identity cost to them; people are reluctant to start doing nasty things with a key traceable back to their own payment card! Which is why they turned their attention to the non-authenticated, non-documented unified search API.

Let's look at a 3 day period of requests to that API earlier this year, keeping in mind this should only ever be requested organically by humans performing searches from the home page:

Fighting API Bots with Cloudflare's Invisible Turnstile

This is far from organic usage with requests peaking at 121.3k in just 5 minutes. Which poses an interesting question: how do you create an API that should only be consumed asynchronously from a web page and never programmatically via a script? You could chuck a CAPTCHA on the front page and require that be solved first but let's face it, that's not a pleasant user experience. Rate limit requests by IP? See the earlier problem with that. Block UA strings? Pointless, because they're easily randomised. Rate limit an ASN? It gets you part way there, but what happens when you get a genuine flood of traffic because the site has hit the mainstream news? It happens.

Over the years, I've played with all sorts of combinations of firewall rules based on parameters such as geolocations with incommensurate numbers of requests to their populations, JA3 fingerprints and, of course, the parameters mentioned above. Based on the chart above these obviously didn't catch all the abusive traffic, but they did catch a significant portion of it:

Fighting API Bots with Cloudflare's Invisible Turnstile

If you combine it with the previous graph, that's about a third of all the bad traffic in that period or in other words, two thirds of the bad traffic was still getting through. There had to be a better way, which brings us to Cloudflare's Turnstile:

With Turnstile, we adapt the actual challenge outcome to the individual visitor or browser. First, we run a series of small non-interactive JavaScript challenges gathering more signals about the visitor/browser environment. Those challenges include, proof-of-work, proof-of-space, probing for web APIs, and various other challenges for detecting browser-quirks and human behavior. As a result, we can fine-tune the difficulty of the challenge to the specific request and avoid ever showing a visual puzzle to a user.

"Avoid ever showing a visual puzzle to a user" is a polite way of saying they avoid the sucky UX of CAPTCHA. Instead, Turnstile offers the ability to issue a "non-interactive challenge" which implements the sorts of clever techniques mentioned above and as it relates to this blog post, that can be an invisible non-interactive challenge. This is one of 3 different widget types with the others being a visible non-interactive challenge and a non-intrusive interactive challenge. For my purposes on HIBP, I wanted a zero-friction implementation nobody saw, hence the invisible approach. Here's how it works:

Fighting API Bots with Cloudflare's Invisible Turnstile

Get it? Ok, let's break it down further as it relates to HIBP, starting with when the front page first loads and it embeds the Turnstile widget from Cloudflare:

<script src="https://challenges.cloudflare.com/turnstile/v0/api.js" async defer></script>

The widget takes responsibility for running the non-interactive challenge and returning a token. This needs to be persisted somewhere on the client side which brings us to embedding the widget:

<div ID="turnstileWidget" class="cf-turnstile" data-sitekey="0x4AAAAAAADY3UwkmqCvH8VR" data-callback="turnstileCompleted"></div>

Per the docs in that link, the main thing here is to have an element with the "cf-turnstile" class set on it. If you happen to go take a look at the HIBP HTML source right now, you'll see that element precisely as it appears in the code block above. However, check it out in your browser's dev tools so you can see how it renders in the DOM and it will look more like this:

Fighting API Bots with Cloudflare's Invisible Turnstile

Expand that DIV tag and you'll find a whole bunch more content set as a result of loading the widget, but that's not relevant right now. What's important is the data-token attribute because that's what's going to prove you're not a bot when you run the search. How you implement this from here is up to you, but what HIBP does is pick up the token, set it in the "cf-turnstile-response" header and send it along with the request when that unified search endpoint is called:

Fighting API Bots with Cloudflare's Invisible Turnstile
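As a rough idea of what the client side of that might look like, here's a minimal sketch. Only the "turnstileCompleted" callback name and the "cf-turnstile-response" header come from this post; the helper name, the "/unifiedsearch/" path and the token value are placeholders, and it assumes the token is taken from the argument Turnstile passes to the data-callback function rather than read from the data-token attribute. It is not HIBP's actual code.

```javascript
// Hypothetical client-side sketch; everything except the callback name and
// the cf-turnstile-response header is an assumption for illustration
let turnstileToken = null;

// Turnstile invokes the function named in data-callback with the solved token
function turnstileCompleted(token) {
  turnstileToken = token;
}

// Build the URL and fetch options for a search; the path is a placeholder
function buildSearchRequest(account, token) {
  return {
    url: '/unifiedsearch/' + encodeURIComponent(account),
    options: {
      method: 'GET',
      headers: { 'cf-turnstile-response': token },
    },
  };
}

// In a browser the next step would be: fetch(req.url, req.options)
turnstileCompleted('example-token');
const req = buildSearchRequest('test@example.com', turnstileToken);
console.log(req.url, req.options.headers);
```

Keeping the request building separate from the fetch call makes it easy to see exactly what goes over the wire: the search term in the URL and the token in a header.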

So, at this point we've issued a challenge, the browser has solved the challenge and received a token back, now that token has been sent along with the request for the actual resource the user wanted, in this case the unified search endpoint. The final step is to validate the token and for this I'm using a Cloudflare worker. I've written a lot about workers in the past so here's the short pitch: it's code that runs in each one of Cloudflare's 300+ edge nodes around the world and can inspect and modify requests and responses on the fly. I already had a worker to do some other processing on unified search requests, so I just added the following:

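// Read the Turnstile token the client attached to the request
const token = request.headers.get('cf-turnstile-response');

// No token at all: reject straight away
if (token === null) {
    return new Response('Missing Turnstile token', { status: 401 });
}

// Cloudflare populates this header with the client's IP address
const ip = request.headers.get('CF-Connecting-IP');

// Build the form body for the siteverify call
let formData = new FormData();
formData.append('secret', '[secret key goes here]');
formData.append('response', token);
formData.append('remoteip', ip);

// Ask Turnstile whether the token is valid (and previously unused)
const turnstileUrl = 'https://challenges.cloudflare.com/turnstile/v0/siteverify';
const result = await fetch(turnstileUrl, {
    body: formData,
    method: 'POST',
});
const outcome = await result.json();

// Verification failed: reject, otherwise fall through and continue with the request
if (!outcome.success) {
    return new Response('Invalid Turnstile token', { status: 401 });
}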

That should be pretty self-explanatory and you can find the docs for this on Cloudflare's server-side validation page which goes into more detail, but in essence, it does the following:

  1. Gets the token from the request header and rejects the request if it doesn't exist
  2. Sends the token, your secret key and the user's IP along to Turnstile's "siteverify" endpoint
  3. If the token is not successfully verified then return 401 "Unauthorised", otherwise continue with the request

And because this is all done in a Cloudflare worker, none of those 401 responses ever touch the origin. Not only do I not need to process the request in Azure, the person attempting to abuse my API gets a nice speedy response directly from an edge node near them 🙂

So, what does this mean for bots? If there's no token then they get booted out right away. If there's a token but it's not valid then they get booted out at the end. But can't they just take a previously generated token and use that? Well, yes, but only once:

If the same response is presented twice, the second and each subsequent request will generate an error stating that the response has already been consumed.

And remember, a real browser had to generate that token in the first place so it's not like you can just automate the process of token generation then throw it at the API above. (Sidenote: that server-side validation link includes how to handle idempotency, for example when retrying failed requests.) But what if a real human fails the verification? That's entirely up to you but in HIBP's case, that 401 response causes a fallback to a full page post back which then implements other controls, for example an interactive challenge.

Time for graphs and stats, starting with the one in the hero image of this page where we can see the number of times Turnstile was issued and how many times it was solved over the week prior to publishing this post:

Fighting API Bots with Cloudflare's Invisible Turnstile

That's a 91% hit rate of solved challenges which is great. That remaining 9% is either humans with a false positive or... bots getting rejected 😎

More graphs, this time how many requests to the unified search page were rejected by Turnstile:

Fighting API Bots with Cloudflare's Invisible Turnstile

That 990k number doesn't marry up with the 476k unsolved ones from before because they're 2 different things: the unsolved challenges are when the Turnstile widget is loaded but not solved (hopefully due to it being a bot rather than a false positive), whereas the 401 responses from the API occur when a successful (and previously unused) Turnstile token isn't in the header. This could be because the token wasn't present, wasn't solved or had already been used. You get more of a sense of how many of these rejected requests were legit humans when you drill down into attributes like the JA3 fingerprints:

Fighting API Bots with Cloudflare's Invisible Turnstile

In other words, of those 990k failed requests, almost 40% of them were from the same 5 clients. Seems legit 🤔

And about a third were from clients with an identical UA string:

Fighting API Bots with Cloudflare's Invisible Turnstile

And so on and so forth. The point being that the number of actual legitimate requests from end users that were inconvenienced by Turnstile would be exceptionally small, almost certainly a very low single-digit percentage. I'll never know exactly because bots obviously attempt to emulate legit clients and sometimes legit clients look like bots and if we could easily solve this problem then we wouldn't need Turnstile in the first place! Anecdotally, that very small false positive number stacks up as people tend to complain pretty quickly when something isn't optimal, and I implemented this all the way back in March. Yep, 5 months ago, and I've waited this long to write about it just to be confident it's actually working. Over 100M Turnstile challenges later, I'm confident it is - I've not seen a single instance of abnormal traffic spikes to the unified search endpoint since rolling this out. What I did see initially though is a lot of this sort of thing:

Fighting API Bots with Cloudflare's Invisible Turnstile

By now it should be pretty obvious what's going on here, and it should be equally obvious that it didn't work out real well for them 😊

The bot problem is a hard one for those of us building services because we're continually torn in different directions. We want to build a slick UX for humans but an obtrusive one for bots. We want services to be easily consumable, but only in the way we intend them to... which might be by the good bots playing by the rules!

I don't know exactly what Cloudflare is doing in that challenge and I'll be honest, I don't even know what a "proof-of-space" is. But the point of using a service like this is that I don't need to know! What I do know is that Cloudflare sees about 20% of the internet's traffic and because of that, they're in an unrivalled position to look at a request and make a determination on its legitimacy.

If you're in my shoes, go and give Turnstile a go. And if you want to consume data from HIBP, go and check out the official API docs, the uh, unified search doesn't work real well for you any more 😎

Weekly Update 361

By Troy Hunt

This week has been manic! Non-stop tickets related to the new HIBP domain subscription service, scrambling to support invoicing and resellers, struggling our way through some odd Stripe things and so on and so forth. It's all good stuff and there have been very few issues of note (and all of those have merely been people getting to grips with the new model), so all in all, it's happy days 😊


References

  1. Sponsored by: Unpatched devices keeping you up at night? Kolide can get your entire fleet updated in days. It's Device Trust for Okta. Watch the demo!
  2. Brett Adams built a really cool Splunk app using the new domain search API (and he talked me into adding a couple of other ones too)
  3. iMenu360 had 3.4M customer records appear in a breach (and ignored every single attempt made to disclose it 🤷‍♂️)
  4. We now have a model for education facilities, non-profits and charities (for now, it boils down to "log a ticket and we'll help you out")

All New Have I Been Pwned Domain Search APIs and Splunk Integration

By Troy Hunt

I've been teaching my 13-year-old son Ari how to code since I first got him started on Scratch many years ago, and we've gradually progressed through to the current day where he's getting into Python in Visual Studio Code. As I was writing the new domain search API for Have I Been Pwned (HIBP) over the course of this year, I was trying to explain to him how powerful APIs are:

Think of HIBP as one website that does pretty much one thing; you load it in your browser and search through data breaches which then display on the screen. But when you have an API, it's no longer just locked into your browser, it's in all sorts of other systems. Mobile apps, other websites, dashboards and if you really want, you can even integrate the lights in your room with HIBP! Why? How? Well, there's a Home Assistant integration for HIBP and being pwned in a new breach could raise an event there you can then use YAML to perform an action with, for example flashing a light red. That might be weird and unnecessary, but when you have an API, suddenly all these things you never thought of are possible.

It took Brett Adams less than a day after we released the new domain search API last Monday for him to reach out to me with one of those ideas. He wanted to build a Splunk app (Brett is a Splunk MVP so this was right up his alley) to surface breached data about an organisation's domains right into the place where so many security engineers spend their days. He just wanted 2 new APIs to make the user experience the best it could be:

  1. One that can show you the subscription level for someone's key
  2. One that can show you all the domains they're monitoring

That seems so ridiculously obvious, why didn't I think of that originally?! But hey, easy fix, so the next day Brett had his APIs. And today, you also have the APIs because they're now all publicly documented and ready for you to consume. You also have Brett's Splunk app and because he's published it to Splunkbase, you can go and pull it into your own Splunk instance, plug in your HIBP API key and it's job done!

I'll leave you with a bunch of screen caps from Brett's work, starting with a zoomed in grab of what I suspect folks will find the most valuable - the addresses on their domains and their appearances across breaches:

All New Have I Been Pwned Domain Search APIs and Splunk Integration

That's a fragment of the broader dashboard that also breaks down the incidents over time:

All New Have I Been Pwned Domain Search APIs and Splunk Integration

The starting point for this is simply plugging your API key into the interface:

All New Have I Been Pwned Domain Search APIs and Splunk Integration

I like these headline figures and I picture particularly large organisations that have gone through various acquisitions of different brands with various domains finding this really useful:

All New Have I Been Pwned Domain Search APIs and Splunk Integration

And speaking of breaches, there's a lot of them which Brett has visualised across the course of time:

All New Have I Been Pwned Domain Search APIs and Splunk Integration

So that's it, you can see all the APIs documented on the HIBP website and you can grab Brett's app right now from Splunkbase. You can also find all the code for this in Brett's GitHub repo should you wish to have a read through it.

The HIBP APIs are there for other people to build awesome things. If you're one of those people, please get in touch with me and show me what you've created, I can't wait to see more integrations like Brett's 😊

Weekly Update 360

By Troy Hunt

So about those domain searches... 😊 The new subscription model launched this week and as many of you know from your own past experiences, pushing major new code live is always a bit of a nail-biting exercise. It went out silently on Sunday morning, nothing major broke so I published the blog post Monday afternoon then emailed all the existing API key subscribers Tuesday morning and now here we are!

One thing I talk a bit about in the video today are the 2 new APIs someone reached out and requested. This was an awesome idea and I can't wait to show you what they've built with them. I expect I'll blog that this coming week and probably quietly slip out the documentation on the 2 new endpoints in advance. Stay tuned for that one, what he's done with this looks so cool 😎


References

  1. Sponsored by: Secure your assets, identity and online accounts with our award-winning ID theft protection. Get started with Aura today.
  2. It's almost all about the domain searches today (I'm really happy about how this has been received!)
  3. Education facilities and non-profits have come up a bit as organisations we might need to treat a bit differently (we're working a model for them, for now that's a link to the KB requesting they log a ticket we can then review)

Welcome to the New Have I Been Pwned Domain Search Subscription Service

By Troy Hunt

This is a big one. A massive one. It's the culmination of a solid 7 months of work that finally, as of now, is live. The full back story is in my blog post from mid-June about The Big 5 Announcements but to save you trawling through all of that, here are the cliff notes:

  1. Domain searches in HIBP are resource intensive and the impact was becoming increasingly obvious
  2. More than half the Fortune 500 are using this feature, along with a who's who of big brands
  3. We decided to introduce pricing tiers to the largest domain searches...
  4. ...but also add stuff, most notably domain searches by API and formal support...
  5. ...and remove stuff, most notably the need for verifying control of a domain after you've done it once

I've spent the last 8 weeks since publishing that post crunching numbers, writing code, doing loads of formal things (namely terms of use and privacy policy), and regularly talking about it on my weekly video. I've had loads of enormously useful feedback, much of which has shaped the state of the services we're launching here today. Thank you everyone who contributed, now let me get into it and explain exactly what we've come up with 🙂

The Pricing Structure

We've been thinking about the best way to structure this since January. How do we take something that has been provided for free for almost a decade and put a reasonable price on it? That's a highly subjective word - reasonable - and there'll never be complete consensus, so it's more about passing the pub test where your average person will look at this and go "yeah, that seems fair enough". Let me explain the thinking and how we reached the pricing structure you'll see further down:

Firstly, we wanted most domain searches to remain free. This keeps with the spirit of HIBP's roots being a community service and ensures the data is accessible without barrier to the majority of people. It would also mean that for most people, these changes would have absolutely no impact on the way they've been using the service, not unless they want access to the new bits.

Next, we wanted to divide the commercial offerings into a manageable number of tiers. The public API key has 4 tiers and I reckon that's the sweet spot; it's not too many options, but it's enough to provide a good separation between the scale of each. We then wanted to distribute the number of domains that would fall into the commercial category roughly equally between those 4 tiers, so it was pretty much a matter of taking what was left after the free ones and dividing them into 4 groups and putting a price on them.

Finally, we wanted the first commercial tier to be easily affordable so that most people could access it without thinking twice about it. My measure for that has always been "the cost of a cup of coffee", so I went down to my favourite local and checked what I was blindly paying when I waved my watch in the general direction of the EFTPOS machine:

Welcome to the New Have I Been Pwned Domain Search Subscription Service

$6 Aussie, or just under $4 in USD. Which led us to here (all in USD from now on):

Plan      Breached addresses   Percent of all domains   Price / month
Pwned 0   Up to 10             60%                      Free!
Pwned 1   Up to 25             10%                      $3.95
Pwned 2   Up to 100            10%                      $16.95
Pwned 3   Up to 500            10%                      $28.50
Pwned 4   Unlimited            10%                      $115.00

What you're looking at here is a list of plan names (more on that soon), the size of the domain it covers (expressed in the number of breached email addresses on it), what percentage of all domains presently being monitored in HIBP this represents and, of course, the monthly price. As with the public API, if you subscribe annually then it's "pay for 10, get 12" which means that "Pwned 1" price works out at only $3.25 a month. As I flagged in the earlier post, this is all based around the number of addresses that appear in a breach, with one important caveat I'll expand on later: this number excludes all breaches flagged as a spam list. As a rough rule of thumb, over the years I've found approximately 20% of addresses on a domain have been breached so by that logic, you'll need 55 actual email addresses on a domain before there's a cost. Or up to 130 before it costs more than a coffee a month. (If you're a stickler for detail and are thinking those percentages are too perfect, I've rounded them from their actual values of 59.1%, 9.7%, 11.3%, 10.4% and 9.4%.)

But what if you have multiple domains? Easy - the one plan will cover all your domains within the size of that plan. For example, if you have 3 domains and one has 5 breached addresses, one has 20 and one has 90, you can get a single "Pwned 2" plan and cover them all. Or get a single "Pwned 1" plan and cover just the first 2. It's pretty simple.
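The tier logic and the rule of thumb above can be expressed as a short sketch. The plan limits and prices are taken from the table; the function names are made up for illustration, and the one-in-five estimate is only the rough 20% rule of thumb, not how HIBP actually counts breached addresses.

```javascript
// Plan tiers from the pricing table: maximum breached addresses per domain
const PLANS = [
  { name: 'Pwned 0', maxBreached: 10, monthlyUsd: 0 },
  { name: 'Pwned 1', maxBreached: 25, monthlyUsd: 3.95 },
  { name: 'Pwned 2', maxBreached: 100, monthlyUsd: 16.95 },
  { name: 'Pwned 3', maxBreached: 500, monthlyUsd: 28.5 },
  { name: 'Pwned 4', maxBreached: Infinity, monthlyUsd: 115 },
];

// Smallest plan whose limit covers a domain with this many breached addresses
function planForBreachedCount(breached) {
  return PLANS.find((plan) => breached <= plan.maxBreached);
}

// Rule of thumb: roughly one in five addresses on a domain has been breached
function estimateBreached(totalAddresses) {
  return Math.floor(totalAddresses / 5);
}

// One plan covers every domain within its size, so the largest domain decides
function planForDomains(breachedCounts) {
  return planForBreachedCount(Math.max(...breachedCounts));
}

console.log(planForBreachedCount(estimateBreached(50)).name);  // Pwned 0
console.log(planForBreachedCount(estimateBreached(55)).name);  // Pwned 1
console.log(planForBreachedCount(estimateBreached(130)).name); // Pwned 2
console.log(planForDomains([5, 20, 90]).name);                 // Pwned 2
console.log(planForDomains([5, 20]).name);                     // Pwned 1
```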

So that was our initial thinking - stand this up as a product that sits alongside the existing API key one then you just purchase whichever one you want. Then, Brendan gave me a much better idea - combine them all together! You can see the gears turning around in my head as I read his suggestion and as the days progressed and I gave it more thought, it became a brilliant idea. It massively simplifies the code base, it removes a lot of confusion that I'm sure would have otherwise ensued and perhaps most importantly, it gives you all something more than you would have had otherwise. The one fly in the ointment was the price disparity; the above prices are 13% to 15% higher than the old corresponding API key ones. So, what we've decided to do is run the old prices until 8 October then revise everything to the new prices above. That gives more than 60 days' notice to everyone with an existing API key (we'll have to email everyone anyway as the terms of use have changed to incorporate the domain bits), and there's clear verbiage everywhere about the change for anyone purchasing a new subscription. Plus, it gives everyone a little incentive to lock in for a year now and delay the increase until later in 2024. Thanks Brendan! 😊

So that's the rationale. There's no change for 60% of domains that have previously been searched, a negligible cost for the next 10% of them with the remainder paying commensurately more based on their scale. But we didn't just want to whack a cost on an existing service and you're down a few bucks a month with nothing more to show for it, let's talk about new stuff!

But Wait, There's More!

There are two brand new features we're now offering to all commercial subscribers. Even if your domain is small and has less than 10 breached addresses on it, you can still get access to these features via the entry level plan and they're both pretty self-explanatory: API-level access and formal support.

API first as I think it's the coolest and it's exactly what it sounds like: there's now a public endpoint you can throw a domain at and get a JSON response of breached aliases and the incidents they've appeared in. It looks just like this:

GET https://haveibeenpwned.com/api/v3/breacheddomain/{domain}
hibp-api-key: [your key]

Which then responds like this:

{
  "alias1": [
   "Adobe"
  ],
  "alias2": [
    "Adobe",
    "Gawker",
    "Stratfor"
  ],
  "alias3": [
    "AshleyMadison"
  ]
}
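As a rough sketch of consuming that response, the following builds the request shown above and turns the alias map into full email addresses. The function names and the example.com domain are made up for illustration, and it assumes each alias is the part of the address before the @ sign.

```javascript
// Build the request shown above; apiKey is your own HIBP API key
function buildDomainSearchRequest(domain, apiKey) {
  return {
    url: 'https://haveibeenpwned.com/api/v3/breacheddomain/' + encodeURIComponent(domain),
    options: { method: 'GET', headers: { 'hibp-api-key': apiKey } },
  };
}

// Turn { alias: [breach, ...] } into rows keyed by the full email address
function toAddressRows(domain, response) {
  return Object.entries(response).map(([alias, breaches]) => ({
    address: alias + '@' + domain,
    breaches,
  }));
}

// The sample response from above, applied to a hypothetical domain
const sample = {
  alias1: ['Adobe'],
  alias2: ['Adobe', 'Gawker', 'Stratfor'],
  alias3: ['AshleyMadison'],
};
const request = buildDomainSearchRequest('example.com', '[your key]');
const rows = toAddressRows('example.com', sample);
console.log(request.url);
console.log(rows);
```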

If you're already paying for an API key, you have immediate access to this! Same key, same logic in terms of resolving the returned breach name to the full thing via the unauthenticated API that returns breach metadata, the only caveats are that it has to be a domain you've previously demonstrated you control and it has to be within your plan size (e.g. you have a Pwned 1 plan and your domains don't exceed 25 breached addresses). Otherwise:

Subscription upgrade required.

Just one more thing with the domain search API: it only makes sense to hit it after a new breach is loaded. There's absolutely no point in hammering away at it non-stop as you'll only get the same result so instead, try polling the brand new API we've just added to return only the most recent breach (it's massively cached at Cloudflare anyway) and just hit the domain search API when there's a new one. But because not everybody will do this and domain searches are expensive relative to other queries, the terms and conditions include this clause:

Controls such as rate limiting may be added to the domain search API if excessive API requests are made despite no new breaches appearing since the last request.

There is a rate limit based on a variety of factors and it's possible you may receive an HTTP 429 if you request it more frequently than is necessary. The only reason I'm not going into the details of how that works here is that I expect it will adapt and change pretty frequently in response to how people use the service. What I can confidently say now though, is that if you use the domain search feature in the way it's intended to work - querying each domain after a new breach is added - you won't have a problem with rate limits.
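The "only search after a new breach" pattern might look something like this minimal sketch. It assumes the most recent breach can be identified by a name you compare against the last one you saw; the function names and the "BreachA" and "BreachB" values are hypothetical, and the actual polling and HTTP calls are left out.

```javascript
// Remembers the most recent breach seen and reports when a different one appears
function createBreachWatcher(initialName = null) {
  let lastSeen = initialName;
  return function isNewBreach(latestBreachName) {
    if (latestBreachName === lastSeen) return false;
    lastSeen = latestBreachName;
    return true;
  };
}

// Simulate four polls of the "most recent breach" API; names are made up
const isNewBreach = createBreachWatcher();
const polls = ['BreachA', 'BreachA', 'BreachA', 'BreachB'];

// Only run the (expensive) domain searches when the latest breach has changed
const domainSearchesMade = polls.filter((name) => isNewBreach(name)).length;
console.log(domainSearchesMade);
```

Four polls of the cheap, heavily cached endpoint result in only two rounds of domain searches, which is the usage pattern the rate limit is designed to leave alone.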

I'm really excited to see how people will integrate this data into their existing tooling, do please let me know if you do something awesome 😊

Then there's the formal support which we offer via Zendesk at support.haveibeenpwned.com. That launched with the API key upgrades last November and since that time, we've answered almost 600 tickets. We've been trying to fine tune things to the extent that the knowledge base there answers the most common questions, but there's certainly a great deal of time that still goes into supporting the questions that pop up. Adding domain searches to the mix will inevitably increase that, possibly by a significant order of magnitude which is why we're only making this available to commercial subscribers.

So, that's the new bits. If you're in that 60% group of people with smaller domains outside of the commercial tiers, you can get access to both the API and support by subscribing to the smallest possible plan for that cup of coffee a month. We feel that's a pretty reasonable balance, and I hope you do too.

Speaking of reasonable, about those spam lists...

Data Breaches Ain't Data Breaches

I mentioned sharing as much as I could in my weekly update videos, including the intended pricing structure and how it would be based on the number of breached email addresses on a domain. Several people raised a very important point as it related to the calculations: data breaches ain't data breaches or more specifically, there are breaches in HIBP that shouldn't be treated like the other ones as they artificially inflate the pwn count. Could these be excluded?

The Onliner Spambot incident was the worst culprit and in the case of one person that contacted me, it caused his personal domain to read as though hundreds of addresses had been breached when the correct number was... zero. Someone else had their domain pegged at 40 breached addresses whereas once you took this breach out, the number came down to 13. This created somewhat of a rock and hard place situation because whilst those aliases did appear in this incident, they weren't real addresses. But what's a "real" email address anyway? Or more specifically, how can I tell via a string alone whether an address is real or not? A decade ago now I wrote about how hard this is and per the comments on that post, concluded that the only way to tell for sure is to send an email and have the recipient perform some sort of explicit action such as clicking on a link. Clearly, that's not feasible in this situation but equally, putting a price on a service based on a metric that has been artificially inflated just wasn't fair.

Adding spam lists back in 2016 was the right thing to do but equally, excluding them from the number that determines the pricing tier is also the right thing to do. We've tried to make this logic as clear as possible throughout the system and focus on a simple UX that's explicit but can also provide more insight if required:

Welcome to the New Have I Been Pwned Domain Search Subscription Service

And if you're interested in which breaches specifically have been classified as a spam list, I've added a filter to the API that lists all breaches. It's an unauthenticated API you can load directly in your browser via GET request and at the time of writing, has 11 breaches on it with nearly 1.4 billion records.
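To make the effect of excluding spam lists concrete, here's a minimal sketch of one plausible way to count breached addresses on a domain. The aliases are made up, "OnlinerSpambot" is used here as an assumed breach name for the Onliner Spambot incident, and this is not necessarily how HIBP calculates the number internally.

```javascript
// Count aliases that appear in at least one breach NOT flagged as a spam list
function countBreachedExcludingSpam(aliasBreaches, spamListNames) {
  const spam = new Set(spamListNames);
  return Object.values(aliasBreaches)
    .filter((breaches) => breaches.some((name) => !spam.has(name)))
    .length;
}

// Hypothetical domain search result: two aliases only ever appeared in a spam list
const domainResult = {
  alice: ['Adobe', 'OnlinerSpambot'],
  bob: ['OnlinerSpambot'],
  carol: ['OnlinerSpambot'],
  dave: ['Gawker'],
};

const rawCount = Object.keys(domainResult).length;
const pricingCount = countBreachedExcludingSpam(domainResult, ['OnlinerSpambot']);
console.log(rawCount, pricingCount);
```

An alias that appears in both a spam list and a regular breach still counts; only aliases seen exclusively in spam lists drop out of the number used for the pricing tier.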

The very last thing from that screen cap is the "Enable debug mode" link and for that, we need to talk about "domain creep".

Domain Creep, and Getting What You Paid For

Data breaches are obviously an ongoing thing. Always have been, always will be so what that means is when you look at a domain today and see, say, 20 breached accounts on it, that might be 30 breached accounts tomorrow. I think everyone who uses HIBP understands that, but it does create a bit of a problem when domain searches are priced on a metric that can "creep". What if you've just paid for a year's worth of Pwned 1 subscription and per the example here, you've suddenly got more than 25 breached accounts on your domain and can no longer search it?

The sentiment of how this should be handled was always obvious: people have to get what they pay for. We didn't want a situation where someone could be left disappointed, and our fear was that the organic increase in breaches could lead to that event. The solution was easy: when you buy a subscription at a certain scale, every domain you're currently monitoring that can be searched on the first day of the subscription can still be searched on the last day of the subscription. If you take out one year of Pwned 1 today and per the example above, the domain creeps beyond 25 breached accounts tomorrow, it'll have zero impact for the next 364 days.

I'm conscious that this concept can get confusing: domain searches are based on the number of breached accounts on the domain but not including spam lists and then locked in at the size of the domain until the next subscription renewal... phew! The debug mode link mentioned above aims to show all this logic in its raw detail:

Welcome to the New Have I Been Pwned Domain Search Subscription Service

Even though domain1.com in this example has grown to 26 breached addresses, it had 22 when the subscription was taken out, so that's the number it's locked at until it renews in August next year. I hope this is clear enough; please do leave a comment if we can do better.
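For anyone who finds code easier to follow than prose, the locking rule can be sketched roughly as below. This is an illustration only and not the actual HIBP implementation: the `Domain` class, its field names and `can_search` are hypothetical, and the only figure taken from this post is the 25 breached account ceiling for Pwned 1.

```python
from dataclasses import dataclass
from typing import Optional

PWNED_1_LIMIT = 25  # breached account ceiling for Pwned 1, per the example above

@dataclass
class Domain:
    name: str
    current_count: int            # breached addresses today, excluding spam lists
    locked_count: Optional[int]   # count captured when the subscription began, if any

def effective_count(domain: Domain) -> int:
    """Use the locked count while a subscription is active, else today's count."""
    if domain.locked_count is not None:
        return domain.locked_count
    return domain.current_count

def can_search(domain: Domain, plan_limit: int = PWNED_1_LIMIT) -> bool:
    """A domain is searchable when its effective count fits within the plan."""
    return effective_count(domain) <= plan_limit

# domain1.com crept from 22 to 26 after the subscription started
locked = Domain("domain1.com", current_count=26, locked_count=22)
unlocked = Domain("domain1.com", current_count=26, locked_count=None)
print(can_search(locked), can_search(unlocked))
```

With the count locked at 22 the domain stays searchable on Pwned 1, whereas the same domain without a lock would need a larger plan.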

Now, let me put some raw numbers around the "domain creep" situation as I foresee this causing concern beyond what might be warranted. Let's start with the number of unique email addresses which is approximately 6 billion. There have been about 723M records added in the last 12 months and a bunch of those will be for the same email address (shout out to everyone who was pwned again in the last year!) Further, of that number, most email addresses were already pwned. That's a link through to the Twitter feed where I broadcast the percentage of previously seen addresses and you'll see that number is regularly around the 60% to 70% range. In other words, it's probably in the order of 250M new addresses we've seen in the last year which is approximately 4% of the entire corpus. So, yes, over the course of time we'll see domains slip into higher plans, but only at about the rate of CPI.
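The back-of-envelope maths there can be reproduced in a few lines of Python. The inputs are the rounded figures quoted above; treating every record as a distinct address is my simplification, so the result is an upper bound and not a precise count.

```python
records_added = 723_000_000        # records loaded in the last 12 months
total_addresses = 6_000_000_000    # approximate unique addresses in HIBP

def new_addresses(previously_seen_rate: float) -> float:
    """Estimate new addresses, ignoring duplicates within the year's records."""
    return records_added * (1 - previously_seen_rate)

for rate in (0.60, 0.65, 0.70):
    new = new_addresses(rate)
    share = new / total_addresses
    print(f"{rate:.0%} previously seen: {new / 1e6:.0f}M new, {share:.1%} of corpus")
```

Across the 60% to 70% range that lands between roughly 217M and 289M new addresses, or about 3.6% to 4.8% of the corpus, which is where the "in the order of 250M" and "approximately 4%" figures come from.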

Lastly, locking domain counts for the duration of the subscription creates additional incentive to make it an annual one, and that's beyond the existing incentive of "buy 10 months, get 12 months". That's also in addition to massively cutting down on the number of times you may need to deal with corporate bureaucracy. Speaking of which...

Satisfying Corporate Bureaucracy

Let me start with a story: Many years ago during my lengthy tenure at Pfizer, I pushed hard to drive us away from traditional hosting models and towards modern cloud paradigms, namely the Azure App Service. Here we had a model where you could self-service provision resources that cost about $50 per month and completely replaced a model that was costing us tens of thousands a year. It was an easy win, however... the organisation demanded vendor assessments, compliance paperwork and a billing model which, of course, was favourable to them. But Microsoft's model was "chuck your credit card in and off you go", so that's what one of my colleagues did. And paid for it himself, entirely out of his own pocket in order to save one of the world's largest companies money. My point is that I've done time on the inside and I understand the barriers organisations put in place "because reasons". I touched on this in the June post about the upcoming domain changes:

To be honest, the experience with the public API keys has taught me that it's usually not money that's the barrier to using commercial services, it's corporate procurement bureaucracy. Onboarding documentation. Vendor assessments. Tax forms.

And so too, I have the experience from the outside having regularly received requests to invest hours doing manual labour for the sake of something an organisation is paying a few bucks a month for. That simply doesn't scale and the whole point of providing services like this at volume is that you can go and set everything up yourself with nothing more than a credit card. This one came in while preparing this blog post:

My company is looking to purchase an API key so we can automate user lookups on your site. Our procurement process is wildly complex and I was wondering if we have the option of submitting a Purchase Order instead of using the Stripe credit card payment method?

If this situation resonates, you have my sympathies and my own corporate bureaucracy scars are still raw! If there's more we can do to ease the onboarding path without creating manual labour on a per-customer basis then please let me know. I'm sure there are improvements that can be made, the last thing I want to see is you ending up like my old mate from Pfizer 😞

We've tried to do everything possible to remove barriers. We've made significant investments in legal counsel to get the terms of use and privacy policy right and we've tried to provide answers to all the regular questions in the FAQs. We've even publicly provided a W-8BEN-E US tax form which was often requested by folks in the US. But it won't be enough for some organisations, which is why we do exactly the same thing as Pfizer often found themselves doing which is to provide an enterprise-orientated process where we deal with all this rigmarole... and charge accordingly. If that's you, then get in touch with me.

But What About...?

There will be lots of "but what about...?" edge cases. Let me give you some examples and our views on them:

But what about addresses that don't actually exist?
For most data breaches, email addresses are extracted using a regular expression run over the entire corpus of data. You can see what this looks like in the open source email address extractor used to process breaches. So, what is an email address? Per my earlier explanation, it's anything that matches the regex when run across the breach. That could mean strings that aren't actually an address on a domain get caught up and reported incorrectly. It happens, but there's no way to practically stop it and it's extraordinarily rare.
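To show how a stray string can get caught up, here's a minimal Python sketch of regex-based extraction. The pattern is deliberately simplified and is not the one the open source extractor uses; the sample data and function name are made up as well.

```python
import re

# Simplified pattern for illustration only; NOT the real extractor's regex.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

def extract_addresses(corpus: str) -> set[str]:
    """Return the unique, lower-cased strings that look like email addresses."""
    return {match.lower() for match in EMAIL_PATTERN.findall(corpus)}

sample = """
1,alice@example.com,5f4dcc3b5aa765d61d8327deb882cf99
2,Bob.Smith@Example.com,hunter2
INSERT INTO notes VALUES ('contact img@2x.example.com for assets');
"""
print(sorted(extract_addresses(sample)))
```

The third line of the sample yields `img@2x.example.com`, which matches the pattern even though it's almost certainly not a mailbox anyone owns. That's the sort of anomaly that can slip through when the only test is "does it match the regex?"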

But what about email addresses from years ago that still appear as breached on a domain?
The argument here is that whilst these are genuine addresses that did indeed exist at one point, they aren't really relevant anymore either due to their age or the address no longer existing (e.g. ex staff). I have both a philosophical and a technical view on this, with the former being that data breaches are immutable. At a point in time, addresses were exposed, and that fact can never be reversed. As for the latter point, those addresses remain in a storage construct we need to continue to support, and every single domain query needs to pick those addresses up and return them to the code processing the search (the design of HIBP means that Azure's Table Storage returns the entire partition on each domain query). Further, in most cases, that doesn't change the total number of breached accounts being a reasonable metric for organisation size and subsequently, the pricing tier they should fit into.
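As a rough picture of why old addresses still carry a cost, here's a toy in-memory model in Python. It assumes the domain acts as the partition key and the alias as the row key; the `DomainTable` class and the breach names are hypothetical and this doesn't use the Azure SDK.

```python
from collections import defaultdict

class DomainTable:
    """Toy in-memory stand-in for a partitioned table; not the Azure SDK."""

    def __init__(self):
        # partition key (domain) -> row key (alias) -> list of breach names
        self._partitions = defaultdict(dict)

    def add(self, email: str, breach: str) -> None:
        alias, domain = email.lower().rsplit("@", 1)
        self._partitions[domain].setdefault(alias, []).append(breach)

    def query_domain(self, domain: str) -> dict:
        """Return every row in the partition, however old the breach is."""
        return dict(self._partitions.get(domain.lower(), {}))

table = DomainTable()
table.add("alice@example.com", "OldBreach2013")
table.add("bob@example.com", "RecentBreach2023")
table.add("carol@other.org", "RecentBreach2023")
rows = table.query_domain("example.com")
print(len(rows), sorted(rows))
```

In a model like this there's no cheap way to skip the decade-old rows: a domain query pulls back the whole partition and any filtering happens afterwards in code.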

But what about old breaches I don't care about any more causing me to require a higher plan?
It's a similar answer to the previous point insofar as the immutability of history and the need to store the data. It also remains the most reliable metric we have to determine the size of the domain and in many cases, the organisation that owns it. Think of this measurement primarily as a means of slicing up the corpus of data within HIBP and distributing the cost as equitably as possible across the organisations using the domain search feature.

But what about people who don't want to use a credit card?
I'll give you a two-part answer on this, beginning with the recognition that cards can pose legitimate challenges for some people. Just as I was drafting this blog post, someone trying to sign up to the public API reached out after failing to subscribe multiple times with different cards:

Welcome to the New Have I Been Pwned Domain Search Subscription Service

For a variety of reasons, I believe the guy is legit, but Stripe reports two payments declined by his bank and another due to an invalid CVC. But using Stripe doesn't just mean credit cards, it also means Apple Pay and Google Pay, WeChat Pay in China, EPS in Belgium, Afterpay in Australia and a raft of other payment mechanisms in different parts of the world. It's hard to imagine a legitimate case where someone does not have access to any of the available payment mechanisms, which brings me to the second part:

The reason we don't support the likes of anonymous cryptocurrency and rely solely on fiat money payments is that it very quickly weeds out the bad actors. That was the whole rationale for putting a payment gateway on the public API back in 2019 - to cut out the abuse. It turns out that once you have to pass the sort of KYC barriers financial institutions put in place, people don't misbehave under their own identity. And yes, there's always fraudulent use of cards, but Stripe has gotten so good at handling that (we pay for their Radar service as well), our dispute rate is only one in many thousands of transactions.

But what about [other reasons related to calculations and costs]?
Amongst the corpus of 12.6 billion records, there will be anomalies. It'll almost certainly be sub-1% and the anomalies won't be evenly distributed across domains; they'll affect some more than others. It's infeasible to ever get that down to zero and it's also infeasible to respond to every single request I know will come through asking for an anomaly to be rectified. The most practical way we could find to deal with this is to keep the pricing structure such that anomalies will be unlikely to have much impact of consequence.

We're also conscious that some people will challenge the cost and it happens all the time with the existing public API key either because of the individual's position in life or the nature of the organisation they work in. But this is why we've structured it as we have, with the majority of domains being within that free tier and the entry level cost being the cup of coffee that gets you access to things like API level access and formal support. This was the most reasonable, equitable model we could come up with and I hope that shines through in the explanations above.

Summary

I know there'll be individuals with catch-all domains that have ended up in a couple of dozen data breaches and they think paying $3.95 to see them is unreasonable. I know there'll be organisations with much larger numbers who feel it's unreasonable because similarly sized orgs are more profitable. But I also know that I've been running domain searches totally out of my own pocket for almost a decade so whilst I'm sympathetic to anyone who now needs to pay for a service that was previously free, I'm also comfortable that a reasonable and well thought out model has been arrived at.

I'm excited to see what people do with the new API. The email address search one is presently requested millions of times a day and people have built all sorts of amazing things with it, everything from corporate awareness campaigns to tooling to help protect customers from account takeover attacks to integration within the corporate SOC. It's cases like that last one where I think the domain search API will really shine and if you do something awesome with it, please get in touch and let me know.

I know this was a long read, I hope it adequately explains the rationale for the subscription service and that you use it to do amazing things 😊

You can get started right now from the domain search page on HIBP.

Update: Following feedback and consultation with a range of existing users of the service, we now provide a model for the education and non-profit sectors. See the KB titled Do you provide discounts based on the nature of the organisation? for more information.

Weekly Update 359

By Troy Hunt
Weekly Update 359

Somewhere in the next few hours from publishing this post, I'll finally push the HIBP domain search changes live. I've been speaking about it a lot in these videos over recent weeks so many of you already know what it entails, but it's only the tip of the iceberg you've seen publicly. This is the culmination of 7 months of work to get this model right with a ridiculous amount of background effort having gone into it. Case in point: read my pain from last night about converting thousands of words of lawyer speak T&Cs from Microsoft Word to HTML. As if preparing these wasn't painful enough, trying to make them simply play nice on a web page has been a nightmare! (I settled for dumping stuff in a <pre> tag for now and will invest the time in doing it right later on.)

I hope you enjoy this week's video, I'll talk much more about the domain search bits in the next video, hopefully following a successful launch!

References

  1. Sponsored by: EPAS by Detack. No EPAS protected password has ever been cracked and won't be found in any leaks. Give it a try, millions of users use it.
  2. What's the best tooling to start teaching kids to code Python on Windows with? (I decided taking Python from the Windows store then using Visual Studio Code with the Python extension made the most sense)
  3. The MagicDuel Adventure MMORPG got breached (it's a short disclosure notice, but kudos to them for that probably being the fastest turnaround I've ever seen from me reaching out to them disclosing!)
  4. My Home Assistant Yellow has finally landed! (hoping it solves the intermittent restart problems which now that I think about it, haven't happened for weeks 🤔)
  5. Finding a CM4 was the hard bit (Amazon link to the unit I bought a month ago... at A$274 at the time 😭)
  6. It's the final hours before the all new bits for domain search go live in HIBP! (the community input has been awesome - thank you!)

Weekly Update 358

By Troy Hunt
Weekly Update 358

IoT, breaches and largely business as usual so I'll skip that in the intro to this post and jump straight to the end: the impending HIBP domain search changes. As I say in the vid, I really value people's feedback on this so if nothing else, please skip through to 48:15, listen to that section and let me know what you think. By the time I do next week's vid my hope is that all the coding work is done and I'm a couple of days out from shipping it, so now is your time to provide input if you think there's something I'm missing that really should be in there 🙂

References

  1. Sponsored by: Kolide ensures that if a device isn't secure, it can't access your apps. It's Device Trust for Okta. Watch the demo today!
  2. Messing with door-knocking real estate agents is a really good use of Home Assistant and Ubiquiti IMHO (channelling my inner Password Purgatory demons on this one!)
  3. The BookCrossing breach went into HIBP (plain text passwords FTW!)
  4. An old Roblox breach surfaced and also went into HIBP (Roblox has had quite the time of it lately...)
  5. BreachForums was, itself, breached (definitely legit too, given the presence of a "lurker" account I created there)

Weekly Update 357

By Troy Hunt
Weekly Update 357

Sad news to wake up to today. Kevin was a friend and as I say in this week's video, probably the most well-known identity in infosec ever, and for good reason. He made a difference, and I have fun memories with him 😊

Felt really sad waking up and seeing “RIP Kevin” in my timeline. I doubt there is a more well known name in our industry but if he’s unfamiliar to you (or you haven’t read this book), go and grab “Ghost in the Wires” which is an exceptional read.

Kevin started regularly coming… pic.twitter.com/w1UMm7mGa8

- Troy Hunt (@troyhunt) July 20, 2023

In other news, I share a lot more on the upcoming domain search changes in this week's video and I've gotta say, I'm feeling pretty good about them. I spent most of the day after recording this writing code and drafting the blog post and I'm pretty damn happy with each right now. I'll keep sharing more info via these updates to the extent that by the time everything launches in a couple of weeks, you'll know it all anyway if you're paying attention here 😎

References

  1. Sponsored by: Kolide ensures that if a device isn't secure, it can't access your apps. It's Device Trust for Okta. Watch the demo today!
  2. If you haven't done already, go read Ghost in the Wires, the Kevin Mitnick story (it's a genuinely entertaining read)
  3. If you mistype an email address, it will go to the wrong place! 🤯 (the .mil conflation with .ml story has received way more airtime than what it's due IMHO)
  4. Shellys, Shellys everywhere (after feedback from Richard and Lars on this week's video, I'm pretty sure I'm going to ditch MQTT altogether now)
  5. The Roblox Developers Conference had 4k people's data leaked (goes back a few years and they did eventually disclose, but it would have been nice for them to beat me to it)
  6. It's more than a month ago now that I wrote about the impending domain search changes (but not long to go now 🙂)

Weekly Update 356

By Troy Hunt
Weekly Update 356

Today was a bit back-to-back having just wrapped up the British Airways Magecart attack webinar with Scott. That was actually a great session with loads of engagement and it's been recorded too, so look out for that one soon if you missed it. Anyway, I filled this week's update with a bunch of random things from the week. I especially enjoyed discussing the HIBP domain search progress and as I say in the video, talking through it with other people really helps crystallise things so I think I'll keep doing that as the dev work continues. Stay tuned for more on that next week, see you then 😊

References

  1. Sponsored by: Americans lost $8.8B to identity theft in 2022. Secure your online info with Aura the #1 rated identity theft protection. Start free trial.
  2. Scott Helme and I did a Report URI webinar just before this video, all about the Magecart attack on British Airways (stay tuned for the recording)
  3. The renos have been very trying on my patience (but the garage is looking totally epic 😎)
  4. I finally fixed this hum when the camera was on... by using a USB cable to charge it instead (this was so painful, obviously some sort of electrical interference going on there)
  5. I completely forgot to talk about my IoT lock batteries (but yeah, that linked tweet sums it all up)
  6. A full "baker's dozen" of MVP awards! (that's 13 years running now 😲)

Lucky MVP 13

By Troy Hunt
Lucky MVP 13

Each year since 2011, Microsoft has sent me a lovely email around this time:

Lucky MVP 13

I've been fortunate enough to find a passion in life that has allowed me to do what I love and make a great living out of it all whilst contributing to the community in a meaningful and impactful way. In last year's MVP announcement blog post, I talked about one of my favourite contributions of all that year being the Pwned Passwords ingestion pipeline for the FBI. This year, they sent me something nice in return:

This is so cool, thanks @FBI 😊 pic.twitter.com/aqMi3as91O

- Troy Hunt (@troyhunt) June 28, 2023

Thank you to everyone that helps me on this journey by consuming the things I create. Reading my posts, watching my videos, turning up to my talks and consuming services like HIBP and Pwned Passwords. The latter is a great example of community uptake: as of today, there were 5.12 billion requests to that service in just the last 30 days 🤯 That's amazing, thank you everyone 😊

Weekly Update 355

By Troy Hunt
Weekly Update 355

Alrighty, "The Social Media". Without adding too much here as I think it's adequately covered in the video, since last week we've had another change at Twitter that has gotten some people cranky (rate limits) and another social media platform to jump onto (Threads). I do wonder how impactful the 1k tweet view limit per day is for most people (I have no idea how many I usually see, I just know I've never hit the limit yet), and as I say in the video, I find it increasingly hard to tell when community outrage is evidence-based versus "because Elon". Strange times, for now I'll just keep a foot in each camp and then who knows how the whole thing will play out in the future.

References

  1. Sponsored by: EPAS by Detack. No EPAS protected password has ever been cracked and won't be found in any leaks. Give it a try, millions of users use it.
  2. We're still seeing the sights in Thailand (food, scenery, wildlife, people - it's all 👌)
  3. I'm now on Threads by Instagram owned by Meta (because we needed yet another social media platform to fragment across...)
  4. Some spammer somewhere has been spoofing my phone number (no further incidents since recording, but clearly the phone system is a mess as it relates to verifying phone numbers being used)

Weekly Update 354

By Troy Hunt
Weekly Update 354

I'm in Thailand! It's spectacular here, and even more so since recording this video and getting out of Bangkok and into the sorts of natural beauty you see in all the videos. Speaking of which, rather than writing more here (whilst metres away from the most amazing scenery), I'm going to push the publish button on this week's video and go enjoy it. Seeya! 😊

References

  1. Sponsored by Kolide. Kolide can get your cross-platform fleet to 100% compliance. It's Zero Trust for Okta. Want to see for yourself? Book a demo.
  2. We're in Thailand, and it's amazing 🤩 (the pictures speak for themselves, check out the linked thread)
  3. The Insta360 GO 3 is a really impressive piece of hardware (editing software could do with work, but that's fixable)
  4. The BreachForums clone got itself breached (irony upon irony, and oh so predictable too)
  5. The FBI sent me a really cool piece of recognition (definitely going straight to the pool room!)

❌