Reading view

Open dataset: 100k+ multimodal prompt injection samples with per-category academic sourcing

I submitted an earlier version of this dataset and was declined on the basis of missing methodology and unverifiable provenance. The feedback was fair. The documentation has since been rewritten to address it directly, and I would very much appreciate a second look.

What the dataset contains

101,032 samples in total, balanced 1:1 attack to benign.

Attack samples (50,516) across 27 categories sourced from over 55 published papers and disclosed vulnerabilities. Coverage spans:

  • Classical injection - direct override, indirect via documents, tool-call injection, system prompt extraction
  • Adversarial suffixes - GCG, AutoDAN, Beast
  • Cross-modal delivery - text with image, document, audio, and combined payloads across three and four modalities
  • Multi-turn escalation - Crescendo, PAIR, TAP, Skeleton Key, Many-shot
  • Emerging agentic attacks - MCP tool descriptor poisoning, memory-write exploits, inter-agent contagion, RAG chunk-boundary injection, reasoning-token hijacking on thinking-trace models
  • Evasion techniques - homoglyph substitution, zero-width space insertion, Unicode tag-plane smuggling, cipher jailbreaks, detector perturbation
  • Media-surface attacks - audio ASR divergence, chart and diagram injection, PDF active content, instruction-hierarchy spoofing

Benign samples (50,516) are drawn from Stanford Alpaca, WildChat, MS-COCO 2017, Wikipedia (English), and LibriSpeech. The benign set is matched to the surface characteristics of the attack set so that classifiers must learn genuine injection structure rather than stylistic artefacts.

Methodology

The previous README lacked this section entirely. The current version documents the following:

  1. Scope definition. Prompt injection is defined per Greshake et al. and OWASP LLM01 as runtime text that overrides or redirects model behaviour. Pure harmful-content requests without override framing are explicitly excluded.
  2. Four-layer construction. Hand-crafted seeds, PyRIT template expansion, cross-modal delivery matrix, and matched benign collection. Each layer documents the tool used, the paper referenced, and the design decision behind it.
  3. Label assignment. Labels are assigned by construction at the category level rather than through per-sample human review. This is stated plainly rather than overclaimed.
  4. Benign edge-case design. The ten vocabulary clusters used to reduce false positives on security-adjacent language are documented individually.
  5. Quality control. Deduplication audit results are included: zero duplicate texts in the benign pool, zero benign texts appearing in attacks, one documented legacy duplicate cluster with cause noted.
  6. Known limitations. Six limitations are stated explicitly: text-based multimodal representation, hand-crafted seed counts, English-skewed benign pool, no inter-rater reliability score, ASR figures sourced from original papers rather than re-measured, and small v4 seed counts for emerging categories.

Reproducibility

Generators are deterministic (random.seed(42)). Running them reproduces the published dataset exactly. Every sample carries attack_source and attack_reference fields with arXiv or CVE links. A reviewer can select any sample, follow the citation, and verify that the attack class is documented in the literature.

Comparison to existing datasets

The README includes a comparison table against deepset (500 samples), jackhhao (2,600), Tensor Trust (126k from an adversarial game), HackAPrompt (600k from competition data), and InjectAgent (1,054). The gap this dataset aims to fill is multimodal cross-delivery combinations and emerging agentic attack categories, neither of which exists at scale in current public datasets.

What this is not

To be direct: this is not a peer-reviewed paper. The README is documentation at the level expected of a serious open dataset submission - methodology, sourcing, limitations, and reproducibility - but it does not replace academic publication. If that bar is a requirement for r/netsec specifically, that is reasonable and I will accept the feedback.

Links

I am happy to answer questions about any construction decision, provide verification scripts for specific categories, or discuss where the methodology falls short.

submitted by /u/BordairAPI
[link] [comments]
  •  

Designing for What’s Next: Securing AI-Scale Infrastructure Without Compromise

6100 Series is an ultra-high-end firewall, delivering exceptional performance, line-rate threat protection, and modular scalability at AI-ready data centers.
  •  

Patch Tuesday, April 2026 Edition

Microsoft today pushed software updates to fix a staggering 167 security vulnerabilities in its Windows operating systems and related software, including a SharePoint Server zero-day and a publicly disclosed weakness in Windows Defender dubbed “BlueHammer.” Separately, Google Chrome fixed its fourth zero-day of 2026, and an emergency update for Adobe Reader nixes an actively exploited flaw that can lead to remote code execution.

A picture of a windows laptop in its updating stage, saying do not turn off the computer.

Redmond warns that attackers are already targeting CVE-2026-32201, a vulnerability in Microsoft SharePoint Server that allows attackers to spoof trusted content or interfaces over a network.

Mike Walters, president and co-founder of Action1, said CVE-2026-32201 can be used to deceive employees, partners, or customers by presenting falsified information within trusted SharePoint environments.

“This CVE can enable phishing attacks, unauthorized data manipulation, or social engineering campaigns that lead to further compromise,” Walters said. “The presence of active exploitation significantly increases organizational risk.”

Microsoft also addressed BlueHammer (CVE-2026-33825), a privilege escalation bug in Windows Defender. According to BleepingComputer, the researcher who discovered the flaw published exploit code for it after notifying Microsoft and growing exasperated with their response. Will Dormann, senior principal vulnerability analyst at Tharros, says he confirmed that the public BlueHammer exploit code no longer works after installing today’s patches.

Satnam Narang, senior staff research engineer at Tenable, said April marks the second-biggest Patch Tuesday ever for Microsoft. Narang also said there are indications that a zero-day flaw Adobe patched in an emergency update on April 11 — CVE-2026-34621 — has seen active exploitation since at least November 2025.

Adam Barnett, lead software engineer at Rapid7, called the patch total from Microsoft today “a new record in that category” because it includes nearly 60 browser vulnerabilities. Barnett said it might be tempting to imagine that this sudden spike was tied to the buzz around the announcement a week ago today of Project Glasswing — a much-hyped but still unreleased new AI capability from Anthropic that is reportedly quite good at finding bugs in a vast array of software.

But he notes that Microsoft Edge is based on the Chromium engine, and the Chromium maintainers acknowledge a wide range of researchers for the vulnerabilities which Microsoft republished last Friday.

“A safe conclusion is that this increase in volume is driven by ever-expanding AI capabilities,” Barnett said. “We should expect to see further increases in vulnerability reporting volume as the impact of AI models extend further, both in terms of capability and availability.”

Finally, no matter what browser you use to surf the web, it’s important to completely close out and restart the browser periodically. This is really easy to put off (especially if you have a bajillion tabs open at any time) but it’s the only way to ensure that any available updates get installed. For example, a Google Chrome update released earlier this month fixed 21 security holes, including the high-severity zero-day flaw CVE-2026-5281.

For a clickable, per-patch breakdown, check out the SANS Internet Storm Center Patch Tuesday roundup. Running into problems applying any of these updates? Leave a note about it in the comments below and there’s a decent chance someone here will pipe in with a solution.

  •  

Can Your Wearable Health Monitors Be Compromised?

Wearable health devices are designed to give you more control over your body and your data. 

But in 2026, the bigger risk isn’t someone spying on your smartwatch or smartring in real time. It’s what happens if the data connected to that device gets exposed. 

Health data, login credentials, and behavioral patterns tied to wearables can become valuable signals for cybercriminals. And once that data is out, it can fuel everything from identity theft to highly targeted scams. 

Here’s what’s actually at risk, and how to protect yourself. 

What Is Wearable Health Data (and Why It Matters) 

Wearable health data refers to the personal information collected and stored by devices like fitness trackers, smartwatches, and connected medical monitors. 

This can include: 

  • Heart rate and activity levels  
  • Sleep patterns  
  • Location data  
  • Medical metrics (like glucose levels)  
  • Account credentials tied to apps and dashboards  

On its own, this data may seem harmless. But combined, it creates a highly detailed profile of your habits, routines, and health status. 

The Real Risk in 2026 Isn’t the Device. It’s the Data. 

Early conversations around wearable security focused on device hacking or surveillance. 

Today, the bigger concern is data exposure. 

If wearable platforms, apps, or connected services are breached, your data could be: 

  • Sold on the dark web  
  • Used to impersonate you  
  • Leveraged in targeted phishing or health-related scams  

And because this data is personal and specific, scams built from it can feel far more convincing than generic spam. 

How Exposed Wearable Data Can Lead to Scams 

When cybercriminals gain access to personal data, they don’t just sit on it. They use it. 

Here’s how that plays out: 

Scenario  What It Looks Like  Why It Works 
Health-related phishing  “Your insurance claim was denied” or “Update your health profile”  Feels relevant and urgent 
Account takeover attempts  Password reset emails tied to known apps  Uses real account signals 
Personalized scams  Messages referencing routines, devices, or conditions  Builds trust quickly 
Fake alerts or services  “Device security issue detected”  Mimics real product behavior 

 

This is where the risk shifts from data privacy → real-world financial and identity impact. 

6 Smart Ways to Protect Your Wearable Data 

1)Install updates immediately
Security patches fix known vulnerabilities. Delaying updates leaves gaps open.  

2) Use layered protection, not just device settings
A VPN and security software help protect data in transit and block threats before they reach you.  

3) Strengthen your login credentials
Use strong, unique passwords and enable two-factor authentication wherever possible.  

4) Limit what you share
Review app permissions and only connect devices to services you trust.  

5) Verify every message or alert
If you receive a message tied to your device or health data, double-check the source before clicking.  

6) Monitor your accounts regularly
Small signs of unusual activity can be early indicators of larger issues. 

How McAfee Helps Protect Your Data Beyond the Device 

Protecting your wearable doesn’t stop at the device itself. It extends to what happens if your data is exposed or targeted. 

Identity Monitoring 

McAfee helps track your personal information across known breach sources and alerts you if your data appears where it shouldn’t. 

This gives you early warning if wearable-related accounts or associated data are compromised. 

Scam Detector 

If your data is exposed, scammers often follow. 

McAfee’s Scam Detector helps identify suspicious messages, links, and communications before you engage, and explains why something was flagged, so you can make informed decisions quickly. 

Together, these tools help protect not just your device, but the chain reaction that can follow a data breach. 

The post Can Your Wearable Health Monitors Be Compromised? appeared first on McAfee Blog.

  •  

Weekly Update 499

Weekly Update 499

I'm starting to become pretty fond of Bruce. Actually, I've had a bit of an epiphany: an AI assistant like Bruce isn't just about auto-responding to tickets in an entirely autonomous manner; it's also pretty awesome at responding with just a little bit of human assistance. Charlotte and I both replied to some tickets today that were way too specific for Bruce to ever do on his own, but by feeding in just a little bit of additional info (such as the number of domains someone was presently monitoring), Bruce was able to construct a really good reply and "own" the ticket. So maybe that's the sweet spot: auto-reply to the really obvious stuff and then take just a little human input on everything else.

Weekly Update 499
Weekly Update 499
Weekly Update 499
Weekly Update 499
  •  

Unpatched RAGFlow Vulnerability Allows Post-Auth RCE

The current version of RAGFlow, a widely-deployed Retrieval Augmented Generation solution, contains a post-auth vulnerability that allows for arbitrary code execution.

This post includes a POC, walkthrough and patch.

The TL;DR is to make sure your RAGFlow instances aren't on the public internet, that you have the minimum number of necessary users, and that those user accounts are protected by complex passwords. (This is especially true if you're using Infinity for storage.)

submitted by /u/Prior-Penalty
[link] [comments]
  •  

CVE-2026-22666: Dolibarr 23.0.0 dol_eval() whitelist bypass -> RCE (full write-up + PoC)

Root cause: the $forbiddenphpstrings blocklist is only enforced in blacklist mode -> the default whitelist mode never touches it. The whitelist regex is also blind to PHP dynamic callable syntax (('exec')('cmd')). Either bug alone limits impact; together they reach OS command execution. Coordinated disclosure - patch available as of 4/4/2026.

submitted by /u/JivaSecurity
[link] [comments]
  •  
❌