It’s not unusual for the data brokers behind people-search websites to use pseudonyms in their day-to-day lives (you would, too). Some of these personal data purveyors even try to reinvent their online identities in a bid to hide their conflicts of interest. But it’s not every day you run across a US-focused people-search network based in China whose principal owners all appear to be completely fabricated identities.
Responding to a reader inquiry concerning the trustworthiness of a site called TruePeopleSearch[.]net, KrebsOnSecurity began poking around. The site offers to sell reports containing photos, police records, background checks, civil judgments, contact information “and much more!” According to LinkedIn and numerous profiles on websites that accept paid article submissions, the founder of TruePeopleSearch is Marilyn Gaskell from Phoenix, Ariz.
The saucy yet studious LinkedIn profile for Marilyn Gaskell.
Ms. Gaskell has been quoted in multiple “articles” about random subjects, such as this article at HRDailyAdvisor about the pros and cons of joining a company-led fantasy football team.
“Marilyn Gaskell, founder of TruePeopleSearch, agrees that not everyone in the office is likely to be a football fan and might feel intimidated by joining a company league or left out if they don’t join; however, her company looked for ways to make the activity more inclusive,” this paid story notes.
Also quoted in this article is Sally Stevens, who is cited as HR Manager at FastPeopleSearch[.]io.
Sally Stevens, the phantom HR Manager for FastPeopleSearch.
“Fantasy football provides one way for employees to set aside work matters for some time and have fun,” Stevens contributed. “Employees can set a special league for themselves and regularly check and compare their scores against one another.”
Imagine that: Two different people-search companies mentioned in the same story about fantasy football. What are the odds?
Both TruePeopleSearch and FastPeopleSearch allow users to search for reports by first and last name, but proceeding to order a report prompts the visitor to purchase the file from one of several established people-finder services, including BeenVerified, Intelius, and Spokeo.
DomainTools.com shows that both TruePeopleSearch and FastPeopleSearch appeared around 2020 and were registered through Alibaba Cloud, in Beijing, China. No other information is available about these domains in their registration records, although both domains appear to use email servers based in China.
Sally Stevens’ LinkedIn profile photo is identical to a stock image titled “beautiful girl” from Adobe.com. Ms. Stevens is also quoted in a paid blog post at ecogreenequipment.com, as is Alina Clark, co-founder and marketing director of CocoDoc, an online service for editing and managing PDF documents.
The profile photo for Alina Clark is a stock photo appearing on more than 100 websites.
Scouring multiple image search sites reveals Ms. Clark’s profile photo on LinkedIn is another stock image that is currently on more than 100 different websites, including Adobe.com. Cocodoc[.]com was registered in June 2020 via Alibaba Cloud Beijing in China.
The same Alina Clark and photo materialized in a paid article at the website Ceoblognation, which in 2021 included her at #11 in a piece called “30 Entrepreneurs Describe The Big Hairy Audacious Goals (BHAGs) for Their Business.” It’s also worth noting that Ms. Clark is currently listed as a “former Forbes Council member” at the media outlet Forbes.com.
Entrepreneur #6 is Stephen Curry, who is quoted as CEO of CocoSign[.]com, a website that claims to offer an “easier, quicker, safer eSignature solution for small and medium-sized businesses.” Incidentally, the same photo for Stephen Curry #6 is also used in this “article” for #22 Jake Smith, who is named as the owner of a different company.
Stephen Curry, aka Jake Smith, aka no such person.
Mr. Curry’s LinkedIn profile shows a young man seated at a table in front of a laptop, but an online image search shows this is another stock photo. Cocosign[.]com was registered in June 2020 via Alibaba Cloud Beijing. No ownership details are available in the domain registration records.
Listed at #13 in that 30 Entrepreneurs article is Eden Cheng, who is cited as co-founder of PeopleFinderFree[.]com. KrebsOnSecurity could not find a LinkedIn profile for Ms. Cheng, but a search on her profile image from that Entrepreneurs article shows the same photo for sale at Shutterstock and other stock photo sites.
DomainTools says PeopleFinderFree was registered through Alibaba Cloud, Beijing. Attempts to purchase reports through PeopleFinderFree produce a notice saying the full report is only available via Spokeo.com.
Lynda Fairly is Entrepreneur #24, and she is quoted as co-founder of Numlooker[.]com, a domain registered in April 2021 through Alibaba in China. Searches for people on Numlooker forward visitors to Spokeo.
The photo next to Ms. Fairly’s quote in Entrepreneurs matches that of a LinkedIn profile for Lynda Fairly. But a search on that photo shows this same portrait has been used by many other identities and names, including a woman from the United Kingdom who’s a cancer survivor and mother of five; a licensed marriage and family therapist in Canada; a software security engineer at Quora; a journalist on Twitter/X; and a marketing expert in Canada.
Cocofinder[.]com is a people-search service that launched in Sept. 2019, through Alibaba in China. Cocofinder lists its market officer as Harriet Chan, but Ms. Chan’s LinkedIn profile is just as sparse on work history as the other people-search owners mentioned already. An image search online shows that outside of LinkedIn, the profile photo for Ms. Chan has only ever appeared in articles at pay-to-play media sites, like this one from outbackteambuilding.com.
Perhaps because Cocodoc and Cocosign both sell software services, they are actually tied to a physical presence in the real world — in Singapore (15 Scotts Rd. #03-12 15, Singapore). But it’s difficult to discern much from this address alone.
Who’s behind all this people-search chicanery? A January 2024 review of various people-search services at the website techjury.com states that Cocofinder is a wholly-owned subsidiary of a Chinese company called Shenzhen Duiyun Technology Co.
“Though it only finds results from the United States, users can choose between four main search methods,” Techjury explains. Those include people search, phone, address and email lookup. This claim is supported by a Reddit post from three years ago, wherein the Reddit user “ProtectionAdvanced” named the same Chinese company.
Is Shenzhen Duiyun Technology Co. responsible for all these phony profiles? How many more fake companies and profiles are connected to this scheme? KrebsOnSecurity found other examples that didn’t appear directly tied to other fake executives listed here, but which nevertheless are registered through Alibaba and seek to drive traffic to Spokeo and other data brokers. For example, there’s the winsome Daniela Sawyer, founder of FindPeopleFast[.]net, whose profile is flogged in paid stories at entrepreneur.org.
Google currently turns up nothing else for in a search for Shenzhen Duiyun Technology Co. Please feel free to sound off in the comments if you have any more information about this entity, such as how to contact it. Or reach out directly at krebsonsecurity @ gmail.com.
A mind map highlighting the key points of research in this story. Click to enlarge. Image: KrebsOnSecurity.com
It appears the purpose of this network is to conceal the location of people in China who are seeking to generate affiliate commissions when someone visits one of their sites and purchases a people-search report at Spokeo, for example. And it is clear that Spokeo and others have created incentives wherein anyone can effectively white-label their reports, and thereby make money brokering access to peoples’ personal information.
Spokeo’s Wikipedia page says the company was founded in 2006 by four graduates from Stanford University. Spokeo co-founder and current CEO Harrison Tang has not yet responded to requests for comment.
Intelius is owned by San Diego based PeopleConnect Inc., which also owns Classmates.com, USSearch, TruthFinder and Instant Checkmate. PeopleConnect Inc. in turn is owned by H.I.G. Capital, a $60 billion private equity firm. Requests for comment were sent to H.I.G. Capital. This story will be updated if they respond.
BeenVerified is owned by a New York City based holding company called The Lifetime Value Co., a marketing and advertising firm whose brands include PeopleLooker, NeighborWho, Ownerly, PeopleSmart, NumberGuru, and Bumper, a car history site.
Ross Cohen, chief operating officer at The Lifetime Value Co., said it’s likely the network of suspicious people-finder sites was set up by an affiliate. Cohen said Lifetime Value would investigate to determine if this particular affiliate was driving them any sign-ups.
All of the above people-search services operate similarly. When you find the person you’re looking for, you are put through a lengthy (often 10-20 minute) series of splash screens that require you to agree that these reports won’t be used for employment screening or in evaluating new tenant applications. Still more prompts ask if you are okay with seeing “potentially shocking” details about the subject of the report, including arrest histories and photos.
Only at the end of this process does the site disclose that viewing the report in question requires signing up for a monthly subscription, which is typically priced around $35. Exactly how and from where these major people-search websites are getting their consumer data — and customers — will be the subject of further reporting here.
The main reason these various people-search sites require you to affirm that you won’t use their reports for hiring or vetting potential tenants is that selling reports for those purposes would classify these firms as consumer reporting agencies (CRAs) and expose them to regulations under the Fair Credit Reporting Act (FCRA).
These data brokers do not want to be treated as CRAs, and for this reason their people search reports typically don’t include detailed credit histories, financial information, or full Social Security Numbers (Radaris reports include the first six digits of one’s SSN).
But in September 2023, the U.S. Federal Trade Commission found that TruthFinder and Instant Checkmate were trying to have it both ways. The FTC levied a $5.8 million penalty against the companies for allegedly acting as CRAs because they assembled and compiled information on consumers into background reports that were marketed and sold for employment and tenant screening purposes.
The FTC also found TruthFinder and Instant Checkmate deceived users about background report accuracy. The FTC alleges these companies made millions from their monthly subscriptions using push notifications and marketing emails that claimed that the subject of a background report had a criminal or arrest record, when the record was merely a traffic ticket.
The FTC said both companies deceived customers by providing “Remove” and “Flag as Inaccurate” buttons that did not work as advertised. Rather, the “Remove” button removed the disputed information only from the report as displayed to that customer; however, the same item of information remained visible to other customers who searched for the same person.
The FTC also said that when a customer flagged an item in the background report as inaccurate, the companies never took any steps to investigate those claims, to modify the reports, or to flag to other customers that the information had been disputed.
There are a growing number of online reputation management companies that offer to help customers remove their personal information from people-search sites and data broker databases. There are, no doubt, plenty of honest and well-meaning companies operating in this space, but it has been my experience that a great many people involved in that industry have a background in marketing or advertising — not privacy.
Also, some so-called data privacy companies may be wolves in sheep’s clothing. On March 14, KrebsOnSecurity published an abundance of evidence indicating that the CEO and founder of the data privacy company OneRep.com was responsible for launching dozens of people-search services over the years.
Finally, some of the more popular people-search websites are notorious for ignoring requests from consumers seeking to remove their information, regardless of which reputation or removal service you use. Some force you to create an account and provide more information before you can remove your data. Even then, the information you worked hard to remove may simply reappear a few months later.
This aptly describes countless complaints lodged against the data broker and people search giant Radaris. On March 8, KrebsOnSecurity profiled the co-founders of Radaris, two Russian brothers in Massachusetts who also operate multiple Russian-language dating services and affiliate programs.
The truth is that these people-search companies will continue to thrive unless and until Congress begins to realize it’s time for some consumer privacy and data protection laws that are relevant to life in the 21st century. Duke University adjunct professor Justin Sherman says virtually all state privacy laws exempt records that might be considered “public” or “government” documents, including voting registries, property filings, marriage certificates, motor vehicle records, criminal records, court documents, death records, professional licenses, bankruptcy filings, and more.
“Consumer privacy laws in California, Colorado, Connecticut, Delaware, Indiana, Iowa, Montana, Oregon, Tennessee, Texas, Utah, and Virginia all contain highly similar or completely identical carve-outs for ‘publicly available information’ or government records,” Sherman said.
There has been an exponential increase in breaches within enterprises despite the carefully constructed and controlled perimeters that exist around applications and data. Once an attacker can access… Read more on Cisco Blogs
Join the guided tour outside the Security Operations Center, where we’ll discuss real time network traffic of the RSA Conference, as seen in the NetWitness platform. Engineers will be using Cisco S… Read more on Cisco Blogs
The proliferation of applications across hybrid and multicloud environments continues at a blistering pace. For the most part, there is no fixed perimeter, applications and environments are woven… Read more on Cisco Blogs
Microsoft today issued security updates for more than 100 newly-discovered vulnerabilities in its Windows operating system and related software, including four flaws that are already being exploited. In addition, Apple recently released emergency updates to quash a pair of zero-day bugs in iOS.
Apple last week shipped emergency updates in iOS 17.0.3 and iPadOS 17.0.3 in response to active attacks. The patch fixes CVE-2023-42724, which attackers have been using in targeted attacks to elevate their access on a local device.
Apple said it also patched CVE-2023-5217, which is not listed as a zero-day bug. However, as Bleeping Computer pointed out, this flaw is caused by a weakness in the open-source “libvpx” video codec library, which was previously patched as a zero-day flaw by Google in the Chrome browser and by Microsoft in Edge, Teams, and Skype products. For anyone keeping count, this is the 17th zero-day flaw that Apple has patched so far this year.
Fortunately, the zero-days affecting Microsoft customers this month are somewhat less severe than usual, with the exception of CVE-2023-44487. This weakness is not specific to Windows but instead exists within the HTTP/2 protocol used by the World Wide Web: Attackers have figured out how to use a feature of HTTP/2 to massively increase the size of distributed denial-of-service (DDoS) attacks, and these monster attacks reportedly have been going on for several weeks now.
Amazon, Cloudflare and Google all released advisories today about how they’re addressing CVE-2023-44487 in their cloud environments. Google’s Damian Menscher wrote on Twitter/X that the exploit — dubbed a “rapid reset attack” — works by sending a request and then immediately cancelling it (a feature of HTTP/2). “This lets attackers skip waiting for responses, resulting in a more efficient attack,” Menscher explained.
Natalie Silva, lead security engineer at Immersive Labs, said this flaw’s impact to enterprise customers could be significant, and lead to prolonged downtime.
“It is crucial for organizations to apply the latest patches and updates from their web server vendors to mitigate this vulnerability and protect against such attacks,” Silva said. In this month’s Patch Tuesday release by Microsoft, they have released both an update to this vulnerability, as well as a temporary workaround should you not be able to patch immediately.”
Microsoft also patched zero-day bugs in Skype for Business (CVE-2023-41763) and Wordpad (CVE-2023-36563). The latter vulnerability could expose NTLM hashes, which are used for authentication in Windows environments.
“It may or may not be a coincidence that Microsoft announced last month that WordPad is no longer being updated, and will be removed in a future version of Windows, although no specific timeline has yet been given,” said Adam Barnett, lead software engineer at Rapid7. “Unsurprisingly, Microsoft recommends Word as a replacement for WordPad.”
Other notable bugs addressed by Microsoft include CVE-2023-35349, a remote code execution weakness in the Message Queuing (MSMQ) service, a technology that allows applications across multiple servers or hosts to communicate with each other. This vulnerability has earned a CVSS severity score of 9.8 (10 is the worst possible). Happily, the MSMQ service is not enabled by default in Windows, although Immersive Labs notes that Microsoft Exchange Server can enable this service during installation.
Speaking of Exchange, Microsoft also patched CVE-2023-36778, a vulnerability in all current versions of Exchange Server that could allow attackers to run code of their choosing. Rapid7’s Barnett said successful exploitation requires that the attacker be on the same network as the Exchange Server host, and use valid credentials for an Exchange user in a PowerShell session.
For a more detailed breakdown on the updates released today, see the SANS Internet Storm Center roundup. If today’s updates cause any stability or usability issues in Windows, AskWoody.com will likely have the lowdown on that.
Please consider backing up your data and/or imaging your system before applying any updates. And feel free to sound off in the comments if you experience any difficulties as a result of these patches.
There's a "hidden" API on HIBP. Well, it's not "hidden" insofar as it's easily discoverable if you watch the network traffic from the client, but it's not meant to be called directly, rather only via the web app. It's called "unified search" and it looks just like this:
It's been there in one form or another since day 1 (so almost a decade now), and it serves a sole purpose: to perform searches from the home page. That is all - only from the home page. It's called asynchronously from the client without needing to post back the entire page and by design, it's super fast and super easy to use. Which is bad. Sometimes.
To understand why it's bad we need to go back in time all the way to when I first launched the API that was intended to be consumed programmatically by other people's services. That was easy, because it was basically just documenting the API that sat behind the home page of the website already, the predecessor to the one you see above. And then, unsurprisingly in retrospect, it started to be abused so I had to put a rate limit on it. Problem is, that was a very rudimentary IP-based rate limit and it could be circumvented by someone with enough IPs, so fast forward a bit further and I put auth on the API which required a nominal payment to access it. At the same time, that unified search endpoint was created and home page searches updated to use that rather than the publicly documented API. So, 2 APIs with 2 different purposes.
The primary objective for putting a price on the public API was to tackle abuse. And it did - it stopped it dead. By attaching a rate limit to a key that required a credit card to purchase it, abusive practices (namely enumerating large numbers of email addresses) disappeared. This wasn't just about putting a financial cost to queries, it was about putting an identity cost to them; people are reluctant to start doing nasty things with a key traceable back to their own payment card! Which is why they turned their attention to the non-authenticated, non-documented unified search API.
Let's look at a 3 day period of requests to that API earlier this year, keeping in mind this should only ever be requested organically by humans performing searches from the home page:
This is far from organic usage with requests peaking at 121.3k in just 5 minutes. Which poses an interesting question: how do you create an API that should only be consumed asynchronously from a web page and never programmatically via a script? You could chuck a CAPTCHA on the front page and require that be solved first but let's face it, that's not a pleasant user experience. Rate limit requests by IP? See the earlier problem with that. Block UA strings? Pointless, because they're easily randomised. Rate limit an ASN? It gets you part way there, but what happens when you get a genuine flood of traffic because the site has hit the mainstream news? It happens.
Over the years, I've played with all sorts of combinations of firewall rules based on parameters such as geolocations with incommensurate numbers of requests to their populations, JA3 fingerprints and, of course, the parameters mentioned above. Based on the chart above these obviously didn't catch all the abusive traffic, but they did catch a significant portion of it:
If you combine it with the previous graph, that's about a third of all the bad traffic in that period or in other words, two thirds of the bad traffic was still getting through. There had to be a better way, which brings us to Cloudflare's Turnstile:
With Turnstile, we adapt the actual challenge outcome to the individual visitor or browser. First, we run a series of small non-interactive JavaScript challenges gathering more signals about the visitor/browser environment. Those challenges include, proof-of-work, proof-of-space, probing for web APIs, and various other challenges for detecting browser-quirks and human behavior. As a result, we can fine-tune the difficulty of the challenge to the specific request and avoid ever showing a visual puzzle to a user.
"Avoid ever showing a visual puzzle to a user" is a polite way of saying they avoid the sucky UX of CAPTCHA. Instead, Turnstile offers the ability to issue a "non-interactive challenge" which implements the sorts of clever techniques mentioned above and as it relates to this blog post, that can be an invisible non-interactive challenge. This is one of 3 different widget types with the others being a visible non-interactive challenge and a non-intrusive interactive challenge. For my purposes on HIBP, I wanted a zero-friction implementation nobody saw, hence the invisible approach. Here's how it works:
Get it? Ok, let's break it down further as it relates to HIBP, starting with when the front page first loads and it embeds the Turnstile widget from Cloudflare:
<script src="https://challenges.cloudflare.com/turnstile/v0/api.js" async defer></script>
The widget takes responsibility for running the non-interactive challenge and returning a token. This needs to be persisted somewhere on the client side which brings us to embedding the widget:
<div ID="turnstileWidget" class="cf-turnstile" data-sitekey="0x4AAAAAAADY3UwkmqCvH8VR" data-callback="turnstileCompleted"></div>
Per the docs in that link, the main thing here is to have an element with the "cf-turnstile" class set on it. If you happen to go take a look at the HIBP HTML source right now, you'll see that element precisely as it appears in the code block above. However, check it out in your browser's dev tools so you can see how it renders in the DOM and it will look more like this:
Expand that DIV tag and you'll find a whole bunch more content set as a result of loading the widget, but that's not relevant right now. What's important is the data-token attribute because that's what's going to prove you're not a bot when you run the search. How you implement this from here is up to you, but what HIBP does is picks up the token and sets it in the "cf-turnstile-response" header then sends it along with the request when that unified search endpoint is called:
So, at this point we've issued a challenge, the browser has solved the challenge and received a token back, now that token has been sent along with the request for the actual resource the user wanted, in this case the unified search endpoint. The final step is to validate the token and for this I'm using a Cloudflare worker. I've written a lot about workers in the past so here's the short pitch: it's code that runs in each one of Cloudflare's 300+ edge nodes around the world and can inspect and modify requests and responses on the fly. I already had a worker to do some other processing on unified search requests, so I just added the following:
const token = request.headers.get('cf-turnstile-response');
if (token === null) {
return new Response('Missing Turnstile token', { status: 401 });
}
const ip = request.headers.get('CF-Connecting-IP');
let formData = new FormData();
formData.append('secret', '[secret key goes here]');
formData.append('response', token);
formData.append('remoteip', ip);
const turnstileUrl = 'https://challenges.cloudflare.com/turnstile/v0/siteverify';
const result = await fetch(turnstileUrl, {
body: formData,
method: 'POST',
});
const outcome = await result.json();
if (!outcome.success) {
return new Response('Invalid Turnstile token', { status: 401 });
}
That should be pretty self-explanatory and you can find the docs for this on Cloudflare's server-side validation page which goes into more detail, but in essence, it does the following:
And because this is all done in a Cloudflare worker, any of those 401 responses never even touch the origin. Not only do I not need to process the request in Azure, the person attempting to abuse my API gets a nice speedy response directly from an edge node near them 🙂
So, what does this mean for bots? If there's no token then they get booted out right away. If there's a token but it's not valid then they get booted out at the end. But can't they just take a previously generated token and use that? Well, yes, but only once:
If the same response is presented twice, the second and each subsequent request will generate an error stating that the response has already been consumed.
And remember, a real browser had to generate that token in the first place so it's not like you can just automate the process of token generation then throw it at the API above. (Sidenote: that server-side validation link includes how to handle idempotency, for example when retrying failed requests.) But what if a real human fails the verification? That's entirely up to you but in HIBP's case, that 401 response causes a fallback to a full page post back which then implements other controls, for example an interactive challenge.
Time for graphs and stats, starting with the one in the hero image of this page where we can see the number of times Turnstile was issued and how many times it was solved over the week prior to publishing this post:
That's a 91% hit rate of solved challenges which is great. That remaining 9% is either humans with a false positive or... bots getting rejected 😎
More graphs, this time how many requests to the unified search page were rejected by Turnstile:
That 990k number doesn't marry up with the 476k unsolved ones from before because they're 2 different things: the unsolved challenges are when the Turnstile widget is loaded but not solved (hopefully due to it being a bot rather than a false positive), whereas the 401 responses to the API is when a successful (and previously unused) Turnstile token isn't in the header. This could be because the token wasn't present, wasn't solved or had already been used. You get more of a sense of how many of these rejected requests were legit humans when you drill down into attributes like the JA3 fingerprints:
In other words, of those 990k failed requests, almost 40% of them were from the same 5 clients. Seems legit 🤔
And about a third were from clients with an identical UA string:
And so on and so forth. The point being that the number of actual legitimate requests from end users that were inconvenienced by Turnstile would be exceptionally small, almost certainly a very low single-digit percentage. I'll never know exactly because bots obviously attempt to emulate legit clients and sometimes legit clients look like bots and if we could easily solve this problem then we wouldn't need Turnstile in the first place! Anecdotally, that very small false positive number stacks up as people tend to complain pretty quickly when something isn't optimal, and I implemented this all the way back in March. Yep, 5 months ago, and I've waited this long to write about it just to be confident it's actually working. Over 100M Turnstile challenges later, I'm confident it is - I've not seen a single instance of abnormal traffic spikes to the unified search endpoint since rolling this out. What I did see initially though is a lot of this sort of thing:
By now it should be pretty obvious what's going on here, and it should be equally obvious that it didn't work out real well for them 😊
The bot problem is a hard one for those of us building services because we're continually torn in different directions. We want to build a slick UX for humans but an obtrusive one for bots. We want services to be easily consumable, but only in the way we intend them to... which might be by the good bots playing by the rules!
I don't know exactly what Cloudflare is doing in that challenge and I'll be honest, I don't even know what a "proof-of-space" is. But the point of using a service like this is that I don't need to know! What I do know is that Cloudflare sees about 20% of the internet's traffic and because of that, they're in an unrivalled position to look at a request and make a determination on its legitimacy.
If you're in my shoes, go and give Turnstile a go. And if you want to consume data from HIBP, go and check out the official API docs, the uh, unified search doesn't work real well for you any more 😎