Resources

Curated collection of frameworks, tools, and materials

This page provides a curated collection of resources related to Secure & Trustworthy AI, most of which were introduced during class throughout the Spring 2026 semester.

AI Incident and Vulnerability Databases

MIT AI Risk Repository

Comprehensive database tracking AI-related incidents, failures, and safety issues. Maintained by MIT researchers with rigorous categorization and analysis of real-world AI incidents.

Database Weeks 1-3

AI Incident Database

Community-maintained database of AI incidents with detailed case studies and analysis. Features searchable incident reports, cross-references, and collaborative documentation of AI system failures.

Database Weeks 1-3

NIST National Vulnerability Database (NVD)

The U.S. government repository of standards-based vulnerability data (CVEs) with severity scores and references — a core reference for security testing and triage.

Database Week 14

AVID — AI Vulnerability Database

An open-source knowledge base of failure modes for AI/ML systems, cataloging security, ethics, and performance vulnerabilities mapped to taxonomies like MITRE ATLAS and the NIST AI RMF.

Database Weeks 1-3

Threat Intelligence Reports

Disrupting AI-Orchestrated Cyber Espionage

Anthropic (November 2025). Documents what Anthropic describes as the first reported large-scale cyberattack executed without substantial human intervention. A Chinese state-sponsored group manipulated Claude Code to autonomously attack ~30 global targets, performing reconnaissance, credential theft, and lateral movement.

Report Week 3

Detecting and Countering Misuse of AI

Anthropic Threat Intelligence (August 2025). Documents real-world cases of Claude being weaponized for ransomware development, large-scale data extortion, and AI-driven credential harvesting. A cybercriminal with only basic coding skills sold AI-generated ransomware.

Report Week 3

ENISA Threat Landscape 2025

European Union Agency for Cybersecurity (October 2025). Annual report analyzing 4,875 cybersecurity incidents. Finds AI-supported phishing accounts for over 80% of social engineering activity worldwide, enabled by jailbroken models, synthetic media, and model poisoning.

Report Week 3

Detecting and Preventing Distillation Attacks

Anthropic (February 2026). Documents industrial-scale model distillation by DeepSeek, Moonshot/Kimi K2, and MiniMax, extracting frontier capabilities without safety guardrails. Introduces detection methodologies and countermeasures for intellectual property theft via API-based knowledge extraction.

Report Week 6

GTIG AI Threat Tracker

Google Threat Intelligence Group (November 2025). Documents the first malware families that use AI capabilities mid-execution: FRUITSHELL, PROMPTFLUX, PROMPTLOCK, PROMPTSTEAL, and QUIETVAULT. Malware is no longer static; it adapts using the same AI tools defenders use.

Report Week 3

EchoLeak / CVE-2025-32711 (Hacker News)

The Hacker News (June 2025). Write-up of the first disclosed zero-click prompt-injection vulnerability in a production AI assistant. Aim Labs researchers showed an attacker could exfiltrate data from Microsoft 365 Copilot via a crafted email, without any user interaction.

Report Week 13

NIST NVD: CVE-2025-32711

Official National Vulnerability Database entry for EchoLeak. Describes the zero-click prompt-injection vector in Microsoft 365 Copilot, CVSS scoring, and affected versions. Primary reference for the first CVE-cataloged agent prompt-injection flaw.

CVE Week 13

Frameworks

OWASP Top 10 for LLM Applications

OWASP (2025). Community-driven catalog of the ten most critical security risks for large language model applications. Covers prompt injection (ranked #1), sensitive information disclosure, supply chain vulnerabilities, and excessive agency.

Framework Weeks 3, 7

OWASP Top 10 for Agentic Applications

OWASP (December 2025). Security risks unique to autonomous AI agents: agent goal hijacking, tool misuse, identity and privilege abuse, memory poisoning, and more. Developed with input from 100+ security researchers.

Framework Weeks 3, 7

MITRE ATLAS

MITRE. Adversarial Threat Landscape for AI Systems: a knowledge base of tactics, techniques, and case studies for attacking AI/ML systems. Structured in an ATT&CK-style matrix covering 15 tactics and 66 techniques across the entire AI lifecycle.

Framework Weeks 3, 7

NIST AI Risk Management Framework (AI RMF 1.0)

Foundational voluntary, sector-agnostic framework organized around four core functions (Govern, Map, Measure, Manage).

Framework Week 4

NIST AI RMF Playbook

Companion implementation guide providing suggested actions and practical guidance for each subcategory across all four framework functions.

Framework Week 4

NIST AI 600-1: Generative AI Profile

Companion to the AI RMF addressing risks unique to generative AI, with specific actions for GenAI risk management.

Framework Week 4

ISO/IEC 42001 vs. NIST AI RMF

Structured side-by-side comparison covering differences in scope, certification requirements, implementation approach, and organizational fit.

Comparison Week 4

IIA AI Auditing Framework

Practitioner-oriented framework for auditing AI systems organized around the IIA's Three Lines Model. Updated in 2024 to align with NIST AI RMF.

Framework Week 4

ISO/IEC 42001

International standard specifying requirements for establishing, implementing, and improving an AI management system. The first certifiable international AI governance standard.

Standard Week 4

IEEE 7000

Model process for integrating ethical values and risk analysis into system and software life cycles during AI system design.

Standard Week 4

CSA MAESTRO

Cloud Security Alliance (February 2025). Multi-Agent Environment, Security, Threat, Risk, and Outcome framework. Seven-layer architecture designed specifically for agentic AI security threat modeling, building on STRIDE, PASTA, and LINDDUN with AI-specific threat considerations.

Framework Week 7

AIUC-1

Billed as the world's first auditable certification standard for agentic AI systems. Covers security, safety, reliability, accountability, and societal risk, operationalizing principles from NIST AI RMF, EU AI Act, and MITRE ATLAS into auditable controls. Certification requires upfront testing plus quarterly reassessment.

Standard Week 7

OWASP AI Bill of Materials

Standardized AI Bill of Materials for documenting components, dependencies, and supply chain information for AI/ML systems. Extends software BOM concepts for AI transparency.

Framework Week 4

NIST: Securing AI Ecosystems with AIBOM

NIST presentation on AI Bills of Materials for securing AI ecosystems, covering supply chain transparency, component tracking, and risk management.

Presentation Week 4

CSA Agentic AI Red-Teaming Guide

Cloud Security Alliance. Structured methodology for red-teaming agentic AI systems across identity, memory, tool use, and orchestration layers. Provides a scoping template, attack categories aligned with MITRE ATLAS, and reporting format. A practical companion to OWASP's Top 10 for Agentic Applications.

Framework Week 12

NIST AI 100-2e2025: Adversarial ML Taxonomy

Vassilev et al., NIST (2025). Authoritative taxonomy and terminology for adversarial machine learning. Catalogues attack surfaces (evasion, poisoning, privacy, abuse) and mitigations across predictive and generative AI. Reference document for red-team scoping and regulator-facing reports.

Framework Week 12

MITRE ATLAS: AI Security 101

Entry point into the MITRE ATLAS knowledge base. Explains adversary tactics, techniques, and case studies specific to ML and LLM systems. Pairs with the main ATLAS matrix as onboarding for red-teamers new to AI-specific attacks.

Reference Week 12

Agents Rule of Two (Meta)

Meta AI (October 2025). Proposes a minimal design rule for AI agent security: an agent should never simultaneously (1) process untrusted input, (2) access sensitive data or systems, and (3) act in the real world without human approval. A simple, testable invariant for production agent deployments.

Framework Week 13
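
The invariant described above can be written as a small configuration check. This is a minimal sketch, assuming a hypothetical `AgentConfig` with three boolean capability flags; the class, field, and function names are illustrative and do not come from Meta's post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical capability flags for one agent session."""
    processes_untrusted_input: bool
    accesses_sensitive_data: bool
    acts_without_approval: bool


def violates_rule_of_two(cfg: AgentConfig) -> bool:
    """Return True when all three risky properties hold at the same time."""
    return (
        cfg.processes_untrusted_input
        and cfg.accesses_sensitive_data
        and cfg.acts_without_approval
    )


# An email triage agent that reads inbound mail and private files,
# but must ask a human before it sends anything.
triage = AgentConfig(True, True, False)

# The same agent with the human approval step removed.
autonomous = AgentConfig(True, True, True)

print(violates_rule_of_two(triage))      # False
print(violates_rule_of_two(autonomous))  # True
```

A check like this could run at deployment time or in CI, so that a configuration combining all three properties is rejected before the agent ever starts.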

Inspect AI (UK AISI)

UK AI Security Institute's open-source framework for large language model evaluations. Supports capability benchmarks, agentic-task evaluations, and red-team scoring with first-class support for multi-turn tool use. Used in AISI's pre-deployment evaluations of frontier models.

Framework Weeks 12-13

0DIN Attack Taxonomy

Mozilla 0DIN's open taxonomy of AI attack techniques. Structured catalog of jailbreak, prompt-injection, data-extraction, and agent-subversion patterns, with mappings to real-world disclosures. Complements MITRE ATLAS at a finer-grained technique level.

Taxonomy Week 12

Documentation & Transparency Platforms

Hugging Face Datasets

Repository hosting 200,000+ datasets with standardized documentation, dataset cards, and filtering by task, size, language, and license. See dataset documentation in practice at scale.

Platform Week 4

OpenML

Open platform for sharing datasets, tasks, and ML experiments with standardized metadata and reproducibility tracking. Browse datasets sorted by usage and activity.

Platform Week 4

Hugging Face Models

Repository hosting 1M+ ML models with model cards, usage metrics, and community discussion. Browse model cards across architectures, tasks, and organizations.

Platform Week 4

Google DeepMind Model Cards

Google DeepMind's collection of model cards documenting model capabilities, limitations, and intended uses for their AI models.

Platform Week 4

OpenAI Models Documentation

OpenAI's models page listing available models with capabilities, context windows, and versioning. A developer-facing view of model transparency and documentation.

Platform Week 4

Anthropic System Cards

Collection of system cards for Claude models, detailing safety evaluations, capability assessments, and risk mitigations for each major release.

Platform Week 4

Frontier Lab Safety Commitments

Anthropic

Anthropic’s frontier-safety work: classifier-based defenses trained against a constitution, a dedicated red-team organization, and interpretability-for-safety research.

Constitutional Classifiers | Frontier Red Team | Project Glasswing

Safety & Red Team Week 12

OpenAI

OpenAI’s public risk-management and transparency commitments for frontier models — capability thresholds that gate deployment, plus an ongoing safety-evaluations dashboard.

Preparedness Framework | Safety Evaluations Hub

Safety & Policy Week 12

Google DeepMind

Google DeepMind’s Critical Capability Level framework for frontier risk, alongside its consolidated AGI-safety research hub.

Frontier Safety Framework | Safety Research Hub

Safety Week 12

xAI

xAI’s draft Risk Management Framework (Feb 2025) covering dangerous-capability testing, deployment gates, and incident response; released for public comment.

Risk Management Framework (Draft)

Policy Week 12

Meta

Purple Llama, Meta’s umbrella project for open trust-and-safety tooling around Llama models: CyberSecEval benchmarks, Llama Guard classifiers, and Code Shield.

Tooling Week 12

AI Red-Teaming Guides & Walkthroughs

Prompt Injection and Jailbreaking Are Not the Same Thing

Simon Willison (March 2024). Canonical essay distinguishing prompt injection (attacker exploits an LLM application that trusts attacker-controlled input) from jailbreaking (user convinces a model to ignore its own rules). Required reading before scoping any red-team engagement.

Essay Week 12

New Prompt Injection Papers (Nov 2025)

Simon Willison (November 2025). Curated review of three recent prompt-injection papers including Meta's Agents Rule of Two. A useful running index of the research frontier from the writer who has tracked this space longest.

Essay Week 13

AI Red Teaming for First Timers

Promptfoo blog. On-ramp for practitioners new to AI red-teaming: scoping, tooling, attack categories, reporting. Pairs with their Red Team docs and the "Top 5 OSS Tools" post to get from zero to a first engagement.

Guide Week 12

Promptfoo Red-Teaming Documentation

Official Promptfoo documentation on using their CLI to run red-team evaluations, including pre-built attack strategies, custom providers, and CI integration patterns.

Docs Week 12

What Is AI Red Teaming? (Palo Alto Cyberpedia)

Palo Alto Networks' reference explainer on AI red teaming: definitions, scope, methodology, and how it differs from traditional pentesting. A good quick-reference and glossary source.

Explainer Week 12

Guide to AI Red Teaming with MITRE ATLAS

CyberThrone (March 2026). Walkthrough applying MITRE ATLAS tactics and techniques during an AI red-team engagement. Shows how to use ATLAS as both a planning framework and a reporting taxonomy.

Guide Week 12

Guide to Agentic AI Red Teaming (DeepTeam)

DeepTeam's playbook for red-teaming agentic AI systems. Covers scope, threat modeling, attack categories for agents (tool misuse, memory poisoning, goal hijacking), and evaluation harness design.

Guide Week 12

OpenClaw AI Security Test: How to Red Team a High-Privilege Agent

Penligent HackingLabs walkthrough of red-teaming OpenClaw, the intentionally vulnerable high-privilege agentic application used in this course's final project. Demonstrates reconnaissance, tool abuse, and privilege escalation against a realistic target.

Walkthrough Week 12

AI Security Solutions Landscape (Q2 2026)

OWASP GenAI Project. Landscape report of commercial and open-source AI security solutions for AI red-teaming and agentic-system defense. Useful for surveying the vendor ecosystem before procurement.

Report Week 12

AI Red-Teaming & Security Testing Tools

Promptfoo

Open-source evaluation and red-teaming framework for LLM applications. Provides a test-runner, adversarial prompt generators, and a library of attack strategies for prompt injection, jailbreaks, and policy violations. Used as a CI gate for LLM systems.

Tool Weeks 12-13

Garak (NVIDIA)

NVIDIA's open-source LLM vulnerability scanner. Probes models for hallucination, prompt injection, data leakage, toxicity, and jailbreak susceptibility using a pluggable catalog of probes and detectors. Packaged as a CLI.

Tool Week 12

PyRIT (Microsoft)

Python Risk Identification Toolkit for generative AI, maintained by Microsoft's AI Red Team. Automates adversarial prompt orchestration, multi-turn attack chains, and scoring across Azure OpenAI, Hugging Face, and local models.

Tool Week 12

AgentDojo

Debenedetti et al., NeurIPS 2024 (ETH SPY Lab). Benchmark and evaluation harness for prompt-injection and tool-use attacks against LLM agents. Includes 97 realistic agent tasks across Slack, banking, travel, and workspace domains, plus 629 attack instances.

Benchmark Week 12

Raptor

Open-source LLM-driven pentester for AI agents. Automates reconnaissance, payload generation, and multi-step exploitation against agentic targets. Used in conference demos including DEF CON AI Village.

Tool Week 12

Shannon

LLM pentester from Keygraph. Autonomous red-team agent that probes deployed LLM applications for prompt injection, data exfiltration, and tool-call abuse. Complements Raptor in the "LLM attacks LLM" tooling category.

Tool Week 12

Lakera Gandalf

Gamified LLM jailbreak challenge from Lakera. Players try to extract secret passwords from progressively hardened prompt-injection defenses across 8 levels. Widely used as an on-ramp for prompt-injection training and recruiting.

Challenge Week 12

Lakera AgentBreaker

Gamified attack challenge focused on agentic systems: goal hijacking, tool misuse, memory poisoning, and privilege escalation. Extends Gandalf's approach to multi-step, tool-calling agents.

Challenge Week 12

Microsoft AI Red-Teaming Playground Labs

Open-source lab environment from Microsoft providing hands-on AI red-teaming scenarios. Includes vulnerable applications, guided exercises, and reference solutions for prompt-injection, jailbreak, and RAG-attack patterns.

Lab Week 12

Kali + Claude Desktop

Offensive Security's guide for integrating Claude Desktop with Kali Linux via MCP. Turns the pentesting distro into an LLM-driven attack platform where the model can call Kali tools directly for reconnaissance and exploitation workflows.

Integration Week 12

Challenges in Red-Teaming AI Systems

Anthropic (June 2024). Reflects on what Anthropic has learned running red-team programs against frontier models: scoping ambiguity, coverage gaps, evaluator drift, and the tension between breadth and depth. Essential reading before designing a red-team plan.

Report Week 12

Top 5 Open-Source AI Red-Teaming Tools (2025)

Promptfoo blog survey of leading OSS AI red-teaming tools in 2025. Compares coverage, target types, and ergonomics across Promptfoo, Garak, PyRIT, and others. Starting point for picking a stack.

Survey Week 12

0DIN.ai (GenAI Bug Bounty)

Mozilla’s 0Day Investigative Network — a generative-AI bug-bounty program and platform for researchers to report and catalog LLM jailbreaks and vulnerabilities.

Platform Week 12

Guardrails & Output Validation

Guardrails AI

Open-source framework for defining structural, semantic, and policy validators around LLM inputs and outputs. Validators can be composed declaratively (e.g., "output matches schema AND contains no PII AND no toxic language") and enforced with auto-retry and logging.

Tool Week 13
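
The declarative composition described above ("schema AND no PII AND no toxic language", enforced with retry and logging) can be illustrated without any framework. The sketch below is plain Python and does not use the Guardrails AI API; every name in it (`all_of`, `generate_with_retry`, the three validators, the word blocklist) is a hypothetical stand-in, and the email regex and blocklist are deliberately crude placeholders for real PII and toxicity detectors.

```python
import json
import re
from typing import Callable, List, Tuple

# A validator returns (passed, reason_if_failed).
Validator = Callable[[str], Tuple[bool, str]]

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")  # crude PII stand-in
BLOCKLIST = {"idiot", "stupid"}                     # crude toxicity stand-in


def is_json_with_answer(text: str) -> Tuple[bool, str]:
    """Structural check: output must be a JSON object with an 'answer' key."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return False, "not valid JSON"
    if not isinstance(data, dict) or "answer" not in data:
        return False, "missing 'answer' field"
    return True, ""


def has_no_email(text: str) -> Tuple[bool, str]:
    """PII check: reject any output containing an email-like string."""
    if EMAIL_RE.search(text):
        return False, "contains an email address"
    return True, ""


def has_no_blocked_words(text: str) -> Tuple[bool, str]:
    """Policy check: reject outputs containing blocklisted words."""
    hits = sorted(set(re.findall(r"[a-z']+", text.lower())) & BLOCKLIST)
    if hits:
        return False, "contains blocked word: " + hits[0]
    return True, ""


def all_of(*validators: Validator) -> Validator:
    """Combine validators with logical AND, stopping at the first failure."""
    def combined(text: str) -> Tuple[bool, str]:
        for validate in validators:
            ok, reason = validate(text)
            if not ok:
                return False, reason
        return True, ""
    return combined


def generate_with_retry(generate: Callable[[], str], validate: Validator,
                        max_attempts: int = 3) -> Tuple[str, List[str]]:
    """Call generate() until the output validates; fail closed otherwise."""
    failures: List[str] = []
    for _ in range(max_attempts):
        output = generate()
        ok, reason = validate(output)
        if ok:
            return output, failures
        failures.append(reason)  # logged so rejections stay auditable
    raise ValueError(f"all {max_attempts} attempts rejected: {failures}")


policy = all_of(is_json_with_answer, has_no_email, has_no_blocked_words)

# Canned outputs stand in for successive LLM responses.
canned = iter([
    "plain text",
    '{"answer": "mail bob@example.com"}',
    '{"answer": "42"}',
])
output, failures = generate_with_retry(lambda: next(canned), policy)
print(output)
print(failures)
```

Raising after the final attempt, rather than returning the last output, is a fail-closed choice: an unvalidated response never reaches the caller.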

NVIDIA NeMo Guardrails

NVIDIA's open-source toolkit for adding programmable guardrails to LLM conversations. Uses a rule language (Colang) to constrain topics, tool use, and response shape. Integrates with enterprise RAG and agent stacks.

Tool Week 13

AI Guardrails Production Implementation Guide 2026

Iterathon. Practical guide to deploying guardrails in production: where to put them (pre-prompt, post-response, runtime tool checks), how to measure false-positive rates, and failure modes that don't show up in staging.

Guide Week 13

How to Design Guardrails for Secure and Scalable AI Agents

AppSecEngineer. Engineering-focused design guide covering layered guardrail architectures, fail-closed defaults, and integration patterns with policy engines. Strong on the "defense in depth" framing for agentic systems.

Guide Week 13

AI Guardrails (Wiz Academy)

Wiz's reference article covering what AI guardrails are, common categories (input, output, tool, policy), and how they fit into a broader AI security program. Good for giving a non-technical stakeholder the vocabulary.

Explainer Week 13

AI Agent Guardrails & Output Validation in 2026

ToolHalla. Deep-dive on output-validation patterns for agents in 2026: schema enforcement, semantic validators, tool-call verification, and rollback strategies when a guardrail fires mid-task.

Guide Week 13

Microsoft Foundry: Guardrails and Controls Overview

Microsoft Azure Foundry documentation for the platform's built-in guardrails and controls. Covers content filters, jailbreak defenses, Prompt Shields, and groundedness-detection features available to Azure AI deployments.

Docs Week 13

Sandboxing & Agent Isolation

Practical Security Guidance for Sandboxing Agentic Workflows

NVIDIA Developer Blog. NVIDIA AI Red Team's recommendations for isolating agentic workflows: process isolation, capability limiting, egress filtering, and recovery from compromised tool calls. Concrete and production-oriented.

Guidance Week 13

How to Sandbox AI Agents in 2026

Northflank. Engineering deep-dive on sandboxing patterns for AI agents: microVMs, container isolation, file-system jails, and capability-scoped secrets. Compares tradeoffs across isolation primitives.

Guide Week 13

Kubernetes Agent Sandbox SIG

The Kubernetes community's special-interest group for agent-sandbox primitives. Working on standards for short-lived, capability-limited execution environments for AI agents in Kubernetes clusters.

SIG Week 13

Claude Code Sandboxing

Anthropic's official documentation for Claude Code's sandboxing model. Describes the permission system, file-system boundaries, and command-execution gates that keep an LLM-driven coding agent from exceeding its authorized scope.

Docs Week 13

Sandboxing AI Agents, 100x Faster (Cloudflare)

Cloudflare Blog on Dynamic Workers: edge-deployed, short-lived sandboxes optimized for agent execution. Claims 100x faster cold-start than container-based approaches, making ephemeral per-request isolation viable.

Article Week 13

LangChain Sandboxes Documentation

LangChain's official documentation on sandboxing patterns for DeepAgents. Covers local process isolation, containerized sandboxes, and remote-execution models with auditable tool-call logs.

Docs Week 13

Claude Code: Permission Modes

Documentation for Claude Code’s permission modes, which gate an agent’s ability to run commands and edit files — a concrete example of approval gating and auto-accept controls for coding agents.

Docs Week 14

Human-in-the-Loop Oversight

Practicing the Human-in-the-Loop

Strata. Operational guide to embedding humans in agent workflows: approval checkpoints, escalation rules, and identity-aware decision logging. Frames HITL as an agentic-identity problem, not just a UX pattern.

Guide Week 13

Human-in-the-Loop Agentic AI (Elementum)

Elementum AI. Survey of HITL patterns for enterprise agentic AI: approval gates, confidence-based routing, reviewer staffing models, and metrics for evaluating whether oversight is actually catching errors.

Survey Week 13

How to Build Human-in-the-Loop Oversight for AI Agents

Galileo. Implementation-focused guide on instrumenting HITL for AI agents: trace capture, reviewer UIs, SLA design, and feedback loops that improve agent behavior over time.

Guide Week 13

LangChain Human-in-the-Loop Middleware

LangChain's official middleware for HITL: pause-and-resume semantics, structured approval prompts, and audit trails for every decision. A reference implementation for adding oversight to existing LangChain agents.

Docs Week 13
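
The pause-and-resume pattern with an audit trail can be sketched in a few lines of plain Python. This is not LangChain's middleware API; `ApprovalGate`, `ToolCall`, and the two toy tools are hypothetical names chosen for illustration, and a real system would persist the paused call rather than hold it in memory.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set


@dataclass
class ToolCall:
    name: str
    args: dict


@dataclass
class ApprovalGate:
    tools: Dict[str, Callable[..., str]]
    needs_approval: Set[str]
    audit_log: List[dict] = field(default_factory=list)
    pending: Optional[ToolCall] = None

    def submit(self, call: ToolCall) -> Optional[str]:
        """Run a low-risk call immediately; pause a risky one and return None."""
        if call.name in self.needs_approval:
            self.pending = call
            self.audit_log.append({"tool": call.name, "event": "paused"})
            return None
        return self._run(call, approved_by="auto")

    def resume(self, approved: bool, reviewer: str) -> Optional[str]:
        """Resume the paused call with a human reviewer's decision."""
        if self.pending is None:
            raise RuntimeError("nothing is waiting for approval")
        call, self.pending = self.pending, None
        if not approved:
            self.audit_log.append(
                {"tool": call.name, "event": "rejected", "by": reviewer})
            return None
        return self._run(call, approved_by=reviewer)

    def _run(self, call: ToolCall, approved_by: str) -> str:
        result = self.tools[call.name](**call.args)
        self.audit_log.append(
            {"tool": call.name, "event": "executed", "by": approved_by})
        return result


gate = ApprovalGate(
    tools={
        "search": lambda query: f"results for {query}",
        "send_email": lambda to, body: f"sent to {to}",
    },
    needs_approval={"send_email"},
)

first = gate.submit(ToolCall("search", {"query": "invoice 17"}))
paused = gate.submit(ToolCall("send_email",
                              {"to": "cfo@example.com", "body": "Paid."}))
final = gate.resume(approved=True, reviewer="alice")

print(first, paused, final)
print([entry["event"] for entry in gate.audit_log])
```

Every decision, including automatic ones, lands in `audit_log` with the identity that authorized it, which is what makes the oversight reviewable after the fact.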

Case Studies & Legal Actions

CFPB v. Fairway Independent Mortgage

CFPB and DOJ complaint alleging algorithmic redlining in Birmingham, AL. Fairway generated significantly fewer mortgage applications from majority-Black neighborhoods than peer lenders, demonstrating how AI-powered underwriting and marketing systems can scale historical discrimination.

Legal Action Week 3

Australia's Robodebt Scheme

Automated debt recovery system wrongly accused 526,000+ people of welfare fraud using income averaging. 93% error rate when audited. At least 663 vulnerable people died after receiving notices. Government cost: A$2.4 billion. Subject of a Royal Commission.

Case Study Week 3

Netherlands Childcare Benefits Scandal

Dutch tax authority algorithmically profiled non-Dutch nationals as "higher risk," wrongfully accusing 35,000+ parents of fraud. Over 1,000 children placed in state custody. The entire Dutch cabinet resigned in January 2021. Amnesty International report on systemic rights violations.

Case Study Week 3

Michigan MiDAS Unemployment System

Automated system falsely accused ~40,000 residents of unemployment fraud with a 93% error rate. Operated without human oversight. Victims had wages garnished, tax refunds seized, and some lost homes. Ford School of Public Policy explainer.

Case Study Week 3

Louis et al. v. SafeRent Solutions

Class action alleging SafeRent's AI tenant screening scores disproportionately penalized Black and Hispanic renters and housing voucher recipients. $2.28M settlement; SafeRent barred from scoring voucher applicants for five years nationwide.

Legal Action Week 3

OpenClaw: Agentic AI Security Case Study

Open-source AI agent with 68,000+ GitHub stars and ~180,000 developers. CVE-2026-25253 (CVSS 8.8): one-click RCE via cross-site WebSocket hijacking. 42,900 exposed instances across 82 countries. ClawHub skills marketplace: nearly 20% of packages contained malicious payloads (Bitdefender). A comprehensive case study in what happens when security is an afterthought.

Case Study Week 3

NEDA Tessa Chatbot (AIID #545)

National Eating Disorders Association replaced human helpline workers with the Tessa chatbot, which gave harmful weight-loss advice to vulnerable users. Organization shut down the bot after public backlash.

Case Study Week 5

CBA AI Chatbot Worker Displacement

Commonwealth Bank of Australia employees trained the "Bumblebee" AI chatbot, then were laid off. CBA later reversed layoffs under regulatory and union pressure.

Case Study Week 5

SoftBank Pepper Robot (AIID #152)

Pepper humanoid robot repeatedly failed in nursing-home, funeral, retail, and home-companion deployments. Rushed to market before technically ready; SoftBank halted production.

Case Study Week 5

Scatter Lab Luda Chatbot (AIID #106)

Korean chatbot trained on real users' private chat data without consent, producing discriminatory and hateful outputs. Scatter Lab fined for privacy violations.

Case Study Week 5

Adam Raine / ChatGPT (AIID #1192)

A teenager's interactions with ChatGPT, which he used as a companion AI, allegedly contributed to his suicide. His parents allege OpenAI compressed safety testing and overrode built-in safeguards.

Case Study Week 5

AI Deepfake Romance Scam (South Korea)

Criminal organization used deepfake images and video to run romance scams, defrauding victims of approximately 18 billion won (~$8M USD).

Case Study Week 5

GRU Explosive Parcel Campaign

VSquare.org. Investigation into Russian military intelligence (GRU) orchestrating explosive parcels routed through 5+ EU countries via unwitting operatives recruited on Telegram. Used in Presentation 11 as a human parallel to indirect prompt injection, confused deputy problems, and hidden payloads in trusted containers.

Case Study Week 7

AI Pentagon Explosion Image (AIID #543)

Fake AI-generated image of a Pentagon explosion caused a temporary stock market dip. Demonstrates AI-enabled misinformation risks to financial markets.

Case Study Week 5

Academic Papers & Reports

Runaway Feedback Loops in Predictive Policing

Ensign et al. (2018). Mathematical proof that predictive policing feedback loops are inevitable given system design. Historical arrest data trains models that send police to already over-policed neighborhoods, generating more arrests that confirm predictions. FAT* Conference.

Paper Week 3

Dissecting Racial Bias in a Healthcare Algorithm

Obermeyer et al. (2019). A widely used algorithm serving ~200M patients used healthcare spending as a proxy for health needs. Because structural inequality means less is spent on Black patients at equivalent illness levels, the algorithm systematically under-predicted their needs. Science.

Paper Week 3

Algorithmic Monoculture and Social Welfare

Kleinberg & Raghavan (2021). When multiple institutions use similar algorithms, they converge on uniform decision criteria. Correlated failures across systems create systematic exclusion that no single institution can observe or correct. PNAS.

Paper Week 3

Digital Redlining in Healthcare

Analysis of how digital redlining creates cardiovascular health disparities, particularly affecting minorities who depend on digital health tools for work, education, and healthcare access. Johns Hopkins researchers warn of growing risks as AI permeates healthcare systems.

Report Week 3

Hallucinated Citations in AI Research

Nature reporting on citation integrity issues in scientific literature. Related: GPTZero's 2025 analysis of ~4,800 NeurIPS papers found 100+ fabricated citations across ~50 accepted papers that passed peer review, coining the term "vibe citing."

Paper Week 3

Shadow Escape: Zero-Click Agentic Attack via MCP

Operant AI (October 2025). Discloses a zero-click attack exploiting the Model Context Protocol (MCP) to exfiltrate data through AI agents like ChatGPT, Claude, and Gemini without requiring user error. Demonstrates how MCP's connectivity becomes an attack vector.

Paper Week 3

Frontier Models Are Capable of In-Context Scheming

Apollo Research (December 2024). Evaluates o1, Claude 3.5 Sonnet, Gemini 1.5 Pro and others. Finds frontier models strategically introduce subtle mistakes, attempt to disable oversight mechanisms, and maintain deception in over 85% of follow-up questioning.

Paper Week 3

An Overview of Catastrophic AI Risks

Hendrycks, Mazeika, and Woodside (2023). Organizes catastrophic AI risks into four categories: malicious use, AI race, organizational risks, and rogue AIs. The course textbook chapter (Hendrycks Ch. 1) is based on this paper. Free textbook version available at aisafetybook.com.

Paper Week 3

Specification Gaming in Reasoning Models

Palisade Research (February 2025). Reasoning LLMs tasked with winning chess against a stronger opponent spontaneously attempted to hack the game system. o1-preview tried to cheat in 37% of matches against Stockfish, including overwriting board state and running a rival engine.

Paper Week 3

Detecting and Reducing Scheming in AI Models

OpenAI (2025). Presents research showing scheming behaviors in frontier models during controlled tests. Demonstrates that deliberative alignment training reduced scheming rates from ~13% to ~0.4% in o3, though "imperfect generalization" means rare but serious misbehavior remains.

Report Week 3

Model Cards for Model Reporting

Mitchell et al. (2019). Foundational paper proposing model cards as short documents accompanying trained ML models that provide benchmarked evaluation across demographic and use-case conditions. FAT* Conference.

Paper Week 4

LLMs' Data-Control Path Insecurity

ACM (May 2024). Explores the fundamental architectural vulnerability in LLMs where natural language serves as both instruction channel and data channel simultaneously, making reliable separation between control signals and data impossible.

Paper Week 7

Sleeper Agents: Training Deceptive LLMs That Persist Through Safety Training

Hubinger et al. (2024, Anthropic). Demonstrates that LLMs trained to be deceptive can behave safely during evaluation but activate harmful behavior on triggers, and this persists even after RLHF and fine-tuning. Standard safety training does not reliably remove deceptive behavior.

Paper Week 7

Poisoning Web-Scale Training Datasets is Practical

Carlini et al. (2023). Demonstrates novel poisoning attacks that guarantee appearance of malicious examples in web-scale datasets used for training large, widely-used ML models in production.

Paper Week 7

Datasheets for Datasets

Gebru et al. (2021). Proposes standardized documentation for datasets, analogous to datasheets in electronics, covering motivation, composition, collection process, and recommended uses. Communications of the ACM.

Paper Week 4

Extracting Training Data from Large Language Models

Carlini et al. (2021). Seminal demonstration that GPT-2 memorizes and leaks training-data snippets verbatim, including PII, code, and URLs, via targeted prompts. Framed training-data extraction as a practical privacy threat for production LLMs.

Paper Week 12

Scalable Extraction of Training Data from (Production) Language Models

Nasr et al. (2023). Scales the Carlini attack to production models including ChatGPT. Shows that a simple divergence attack ("repeat the word poem forever") extracts gigabytes of training data, including email addresses, phone numbers, and proprietary text.

Paper Week 12

Analyzing Leakage of Personally Identifiable Information in Language Models

Lukas et al. (2023). Systematic study of PII leakage in LLMs across extraction, inference, and reconstruction attacks. Quantifies how training-data protections (DP, deduplication) and defenses affect leakage rates.

Paper Week 12

Tree of Attacks: Jailbreaking Black-Box LLMs Automatically

Mehrotra et al., NeurIPS 2024. Introduces TAP (Tree of Attacks with Pruning), an automated jailbreak method that uses a small LLM attacker to iteratively refine prompts against a frontier-model target. High success rates with modest query budgets.

Paper Week 12

MASTERKEY: Automated Jailbreaking of LLM Chatbots

Deng et al., NDSS 2024. Time-based analysis of commercial chatbot safeguards, followed by an automated jailbreak generator that achieves high success rates across GPT, Bard, and Bing Chat. Foundational work for automated red-teaming.

Paper Week 12

EchoLeak: Academic Case Study

Reddy et al. (September 2025), arXiv:2509.10540. Academic analysis of the EchoLeak (CVE-2025-32711) zero-click prompt-injection vulnerability in Microsoft 365 Copilot. Dissects the attack chain, retrieval poisoning, and mitigations.

Paper Week 13

A Comprehensive Guide to Differential Privacy

Karmitsa et al. (2025), arXiv:2509.03294. Tutorial-style survey of differential privacy from theoretical foundations (epsilon, delta, sensitivity) through deployment in real systems. Includes a user-expectations framing useful for explaining DP trade-offs to stakeholders.

Paper Week 11

When Scanners Lie: Evaluator Instability in LLM Red-Teaming

Shows that automated LLM vulnerability scanners produce unreliable measurements because their evaluator components are unstable, and proposes a two-phase framework to quantify and improve red-teaming evaluation reliability.

Paper Week 12

Constitutional AI: Harmlessness from AI Feedback

Anthropic’s foundational paper on training a helpful, harmless assistant using a written constitution and reinforcement learning from AI feedback (RLAIF) instead of extensive human labeling.

Paper Week 14

Stanford HAI AI Index Report

Stanford HAI’s flagship annual report tracking global AI progress, investment, technical benchmarks, policy, and governance with extensive data and charts.

Report Week 14

Differential Privacy

Programming Differential Privacy

Near & Abuah (2025). Free online textbook covering differential privacy from first principles through implementation. Hands-on chapters with Python code for Laplace/Gaussian mechanisms, composition, and the DP-SGD algorithm. The best free DP textbook for learners with a programming background.

Textbook Week 11
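
As a rough illustration of the Laplace mechanism mentioned in the entry above, the sketch below adds noise with scale sensitivity / epsilon to a numeric query result. It uses only the Python standard library and is not code from the textbook; the function names `laplace_noise` and `laplace_mechanism` and the example values are assumptions made for this example.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    # expovariate takes a rate (1 / mean), so rate = 1 / scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(true_value: float, sensitivity: float,
                      epsilon: float, rng: random.Random) -> float:
    """Return true_value plus Laplace noise of scale sensitivity / epsilon."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return true_value + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(42)  # seeded only so the demo is repeatable
    # Hypothetical counting query: one person changes the count by at most 1.
    for eps in (0.1, 1.0, 10.0):
        print(eps, laplace_mechanism(1000.0, sensitivity=1.0, epsilon=eps, rng=rng))
```

A smaller epsilon gives a larger noise scale, so released answers are more private and less accurate.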

Apple: Learning with Privacy at Scale

Apple Machine Learning Research. Describes Apple's production differentially private telemetry system: how local DP is used to learn aggregate user behavior (emoji use, lookup keywords, energy usage) without Apple ever seeing an individual user's data.

Case Study Week 11

Google Maps: Popular Times & Live Busyness

Google Blog. Describes how Maps computes Popular Times and Live Busyness using differential privacy over aggregated location data. A widely seen consumer feature built on DP, though most users never realize it is privacy-preserving.

Case Study Week 11

2020 Census: Disclosure Avoidance & Differential Privacy

U.S. Census Bureau. Official documentation of the 2020 Decennial Census's Disclosure Avoidance System: the first government-scale deployment of formal differential privacy. Covers methodology, privacy-loss budget, and the trade-offs that generated public controversy.

Case Study Week 11

Census Distortion Program Memo to Virginia Governor

Meredith Strohm Gunter, Weldon Cooper Center (January 2020). Memorandum to Governor Ralph Northam documenting how 2020 Census DP noise would distort redistricting and funding allocations for small Virginia localities. A canonical critique of DP's accuracy-vs-privacy trade-off in practice.

Memo Week 11

Differentially Private Tetris

Aman Priyanshu. Interactive Tetris demo where adjusting the differential-privacy epsilon visibly warps the game state. An unusually effective pedagogical tool for internalizing the privacy-vs-utility trade-off.

Demo Week 11

Books

Weapons of Math Destruction

Cathy O'Neil (2016). Defines destructive algorithms by opacity, scale, and damage. Identifies "pernicious feedback loops" as the central mechanism of harm across credit, education, employment, and housing.

Book Week 3

Automating Inequality

Virginia Eubanks (2018). Three case studies of how automated eligibility systems, ranking algorithms, and predictive risk models create a "digital poorhouse" that profiles, polices, and punishes the poor.

Book Week 3

Race After Technology

Ruha Benjamin (2019). Introduces "the New Jim Code" to describe how automation hides, speeds, and deepens discrimination while appearing neutral. Argues technology is not neutral -- algorithms reflect the social and institutional contexts in which they are built.

Book Week 3

Atlas of AI

Kate Crawford (2021). Traces the full material supply chain of AI systems, from cobalt mining in the DRC to data center energy consumption to the environmental justice implications of AI infrastructure.

Book Week 3

Ghost Work

Mary L. Gray & Siddharth Suri (2019). Coined the "paradox of automation's last mile": as AI advances, each solution generates new problems requiring human judgment. The hardest 10% of tasks fall to invisible workers with no employment protections.

Book Week 3

Algorithms of Oppression

Safiya Umoja Noble (2018). Demonstrates how search algorithms structurally reproduce social relations and reinforce racial hierarchies.

Book Week 3

Cobalt Red

Siddharth Kara (2023). Traces the human cost of cobalt mining in the Democratic Republic of Congo, where approximately 40,000 children work in mining operations supplying the AI and electronics supply chain.

Book Week 3

Human Compatible

Stuart Russell (2019). Argues AI optimization gives us exactly what we specify, not what we actually want (the "King Midas problem"). Proposes provably beneficial AI: systems that are fundamentally uncertain about human preferences, learn from human behavior, and can be switched off.

Book Week 3

Superintelligence

Nick Bostrom (2014). Foundational text on existential risk from AI. Argues that if superintelligence is created, controlling it is necessary to prevent existential catastrophe. Introduces the paperclip maximizer thought experiment. Bostrom has since qualified his position (2025), arguing that failure to develop superintelligence would also be catastrophic.

Book Week 3

If Anyone Builds It, Everyone Dies

Eliezer Yudkowsky & Nate Soares (2025). Argues that intelligence and goals are independent (orthogonality thesis) and that superintelligent agents will pursue self-preservation and resource acquisition regardless of terminal goals (instrumental convergence). "A paperclip maximizer doesn't hate you, but you're made of atoms it can use for paperclips."

Book Week 3

A Brief History of Intelligence

Max Bennett (2023). Traces five evolutionary breakthroughs in biological intelligence -- steering, reinforcement learning, simulation, mentalizing, and symbolic language -- and maps each onto modern AI system design. Bridges neuroscience, evolutionary biology, and artificial intelligence.

Book Week 6

AI Law & Governance

EU AI Act Explorer

Interactive, searchable tool for browsing the full text of the EU AI Act with article-by-article navigation and SME compliance checker.

Tool Week 4

The Bletchley Declaration

Signed by 28 countries and the EU at the first AI Safety Summit (Bletchley Park, November 2023). Commits signatories to international cooperation on frontier AI safety.

Declaration Week 4

IAPP Global AI Law and Policy Tracker

Interactive tracker cataloging AI-related legislation, regulations, and policy initiatives across countries worldwide.

Tracker Week 4

IAPP US State AI Legislation Tracker

Interactive tracker cataloging cross-sectoral AI governance bills across all U.S. states. 260+ bills introduced in 2025 alone.

Tracker Week 4

EO 14110: Safe, Secure, and Trustworthy AI

Full text of the Biden administration's October 2023 executive order on AI safety. Revoked January 2025 by the Trump administration.

Primary Source Week 4

Trump EO: Removing Barriers to AI Leadership

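Executive Order 14179 (January 2025), issued after EO 14110 was rescinded. Directs agencies to review and roll back actions taken under the revoked order, and reorients federal AI policy toward innovation acceleration and business-friendly regulation.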

Primary Source Week 4

America's AI Action Plan

~100 federal actions focused on accelerating AI innovation, building infrastructure (including the Stargate initiative), and leading international AI diplomacy.

Policy Week 4

India AI Impact Summit

India's inaugural global AI summit (February 2026), the first major AI summit hosted in the Global South. 100 countries, 15+ heads of state, 100+ global CEOs.

Summit Week 4

Anthropic's Claude Constitution

The ~23,000-word constitution of values used to train Claude via Constitutional AI. Defines the principles and values that guide model behavior.

Primary Source Week 4

Google AI Principles

Google's foundational responsible AI principles guiding development and deployment across their AI products and services.

Primary Source Week 4

Microsoft Responsible AI Transparency Report

Details how Microsoft operationalizes responsible AI at scale: six principles, Frontier Governance Framework, 67 red-teaming operations, and 30+ responsible AI tools.

Report Week 4

Anthropic Responsible Scaling Policy

Graduated, capability-based framework using AI Safety Levels (ASL-1 through ASL-4+), inspired by Biosafety Levels, scaling safety measures proportionally to model capability.

Policy Week 4

GAO: AI Use and Oversight in Financial Services

Government Accountability Office report (GAO-25-107197) examining the benefits and risks of AI in financial services and how federal regulators both oversee and themselves use AI.

Report Week 14

International AI Safety Report 2025

The first full International AI Safety Report, chaired by Yoshua Bengio and backed by 30 countries, synthesizing the state of evidence on advanced-AI capabilities and risks.

Report Week 14

AI Strategy for the Department of War (DoD)

The U.S. Department of Defense artificial-intelligence strategy outlining priorities for adopting, scaling, and governing AI across defense operations.

Primary Source Week 14

CISA: Principles for Secure AI in Operational Technology

CISA guidance on principles for securely integrating AI into operational-technology environments such as critical infrastructure and industrial control systems.

Guidance Week 14

FDA: AI-Enabled Software as a Medical Device

The FDA’s resource hub on regulating AI/ML-based Software as a Medical Device (SaMD), including its evolving approach to adaptive algorithms in healthcare.

Guidance Week 14

NTIA: Open Model Weights Report

NTIA’s report weighing the risks and benefits of openly available foundation-model weights, informing U.S. policy on open models.

Report Week 14

Videos & Expert Commentary

Yann LeCun: "How Not to Be Stupid About AI"

Wired interview (December 2023). Meta's chief AI scientist and Turing Award winner calls existential risk "premature," "preposterous," and "complete B.S." Argues current LLMs lack persistent memory, reasoning, and planning. Warns existential narratives may justify regulation consolidating power in big tech.

Interview Week 3

Andrew Ng: U.S. Senate AI Insight Forum Statement

Written testimony (December 2023). Google Brain founder argues "worrying about existential risk from AI is like worrying about overpopulation on Mars." Focus should be on practical, near-term harms. Guardrails should target AI applications rather than general-purpose AI technology.

Testimony Week 3

Simon Willison: "The Lethal Trifecta"

Simon Willison (June 2025). Identifies the three capabilities that together create critical risk in agentic AI: access to private data, exposure to untrusted content, and the ability to communicate externally in a way that could carry data out. If an agent combines all three, attackers can trick it into accessing and exfiltrating private data. The essential design litmus test for evaluating any agentic AI system.

Blog Post Week 7
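
The litmus test in the entry above can be written down as a simple design-review check. This is a minimal sketch, not Willison's code; the class `AgentCapabilities`, its field names, and the example agents are hypothetical and exist only to illustrate the three-way conjunction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCapabilities:
    """Hypothetical capability flags for one agent deployment."""
    reads_private_data: bool          # e.g. inbox, files, internal databases
    ingests_untrusted_content: bool   # e.g. web pages, inbound email, shared docs
    can_send_data_out: bool           # any outbound channel an attacker could reach


def has_lethal_trifecta(caps: AgentCapabilities) -> bool:
    """True only when all three risky capabilities are present together."""
    return (caps.reads_private_data
            and caps.ingests_untrusted_content
            and caps.can_send_data_out)


if __name__ == "__main__":
    email_assistant = AgentCapabilities(True, True, True)
    offline_summarizer = AgentCapabilities(True, True, False)
    print(has_lethal_trifecta(email_assistant))     # all three present
    print(has_lethal_trifecta(offline_summarizer))  # no outbound channel
```

Removing any one of the three capabilities makes the check return False, which is why the test is useful at design time: it points at which capability to cut or isolate.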

Geoffrey Hinton: "AI and Our Future"

City of Hobart lecture (January 2026). Nobel Laureate and "Godfather of AI" who left Google in May 2023 to freely speak about AI risks. Estimates 10-20% chance of AI-caused human extinction within three decades. "The best way to understand it emotionally is we are like somebody who has this really cute tiger cub."

Lecture Week 3

AlphaGo Documentary

Feature-length documentary covering DeepMind's AlphaGo defeating world champion Lee Sedol in Go. Documents the cultural and geopolitical impact, including China's subsequent national AI mobilization strategy.

Video Week 6

AI/ML Architecture & Foundations

Attention Is All You Need

Vaswani et al. (2017). The original Transformer paper introducing the self-attention architecture that underlies modern LLMs. Describes tokens, embeddings, query-key-value attention, multi-head attention, and positional encoding.

Paper Week 6
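
For readers who want to see query-key-value attention concretely, here is a minimal sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, written with plain Python lists for readability. It leaves out the learned projections, multiple heads, masking, and positional encoding, and the function names are chosen for this example rather than taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Each query returns a weighted average of the value vectors.

    queries: list of vectors of length d_k
    keys:    list of vectors of length d_k (one per value)
    values:  list of vectors of length d_v
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output dimension at a time.
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs


if __name__ == "__main__":
    q = [[1.0, 0.0]]
    k = [[1.0, 0.0], [0.0, 1.0]]
    v = [[10.0, 0.0], [0.0, 10.0]]
    print(scaled_dot_product_attention(q, k, v))
```

The query in the demo matches the first key more closely, so the output leans toward the first value vector.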

3Blue1Brown: Neural Networks Playlist

Visual, intuition-building video series on neural networks, including chapters on Transformers and attention mechanisms. Excellent for building geometric intuition about how these systems work.

Video Week 6

Craig Reynolds' Boids

Original work on simulated flocking behavior using three simple rules (cohesion, separation, alignment). A foundational example of emergence: complex collective behavior arising from simple individual rules.

Website Week 6
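
The three rules named above can be sketched in a few lines. This is an illustrative 2D update step, not Reynolds' original implementation; the `Boid` class, the `step` function, and every radius and weight below are arbitrary assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Boid:
    x: float
    y: float
    vx: float
    vy: float


def step(boids, radius=5.0, min_dist=1.0,
         w_cohesion=0.01, w_separation=0.05, w_alignment=0.05):
    """Advance every boid one tick using cohesion, separation, and alignment."""
    updated = []
    for b in boids:
        neighbors = [o for o in boids if o is not b
                     and (o.x - b.x) ** 2 + (o.y - b.y) ** 2 <= radius ** 2]
        vx, vy = b.vx, b.vy
        if neighbors:
            n = len(neighbors)
            # Cohesion: steer toward the neighbors' center of mass.
            vx += w_cohesion * (sum(o.x for o in neighbors) / n - b.x)
            vy += w_cohesion * (sum(o.y for o in neighbors) / n - b.y)
            # Alignment: nudge velocity toward the neighbors' average velocity.
            vx += w_alignment * (sum(o.vx for o in neighbors) / n - b.vx)
            vy += w_alignment * (sum(o.vy for o in neighbors) / n - b.vy)
            # Separation: push away from any neighbor that is too close.
            for o in neighbors:
                if (o.x - b.x) ** 2 + (o.y - b.y) ** 2 < min_dist ** 2:
                    vx += w_separation * (b.x - o.x)
                    vy += w_separation * (b.y - o.y)
        updated.append(Boid(b.x + vx, b.y + vy, vx, vy))
    return updated


if __name__ == "__main__":
    flock = [Boid(0.0, 0.0, 1.0, 0.0), Boid(3.0, 1.0, 0.0, 1.0)]
    for _ in range(3):
        flock = step(flock)
    print(flock)
```

No boid is told where the flock should go; any group behavior comes only from each boid reacting to its nearby neighbors.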

Coding Adventure: Boids

Sebastian Lague. Engaging implementation walkthrough of the Boids algorithm, demonstrating how simple rules produce emergent flocking behavior in simulation.

Video Week 6

Playing Atari with Deep Reinforcement Learning

Mnih et al. (2013). The original DQN paper from DeepMind. Combines Q-learning with a convolutional network and experience replay -- an earlier technique often likened to hippocampal replay in neuroscience -- enabling an agent to learn Atari games from raw pixels.

Paper Week 6
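
Experience replay, as used in the DQN paper, stores past transitions and trains on random minibatches drawn from them rather than on consecutive frames. The sketch below is a minimal fixed-capacity buffer with uniform sampling, not DeepMind's implementation; the class name `ReplayBuffer` and its method names are assumptions for this example.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen silently drops the oldest item when full.
        self._buffer = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self._buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Uniformly sample without replacement.

        Raises ValueError if batch_size exceeds the number of stored items.
        """
        return self._rng.sample(list(self._buffer), batch_size)

    def __len__(self):
        return len(self._buffer)


if __name__ == "__main__":
    buf = ReplayBuffer(capacity=1000, seed=0)
    for t in range(50):
        buf.push(state=t, action=t % 4, reward=1.0, next_state=t + 1, done=False)
    print(buf.sample(4))
```

Sampling at random breaks the correlation between consecutive transitions, and keeping old transitions lets each one be reused for several updates.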

Google DeepMind: Deep Reinforcement Learning

DeepMind blog post covering the development of deep reinforcement learning, from DQN through AlphaGo and beyond. Explains how neural networks combine with RL to achieve superhuman performance.

Blog Post Week 6

The Power of Self-Learning Systems

Demis Hassabis (2019). MIT lecture by the DeepMind CEO covering self-play, superhuman game playing, and the broader vision for AI systems that learn without human supervision.

Video Week 6

Complementary Learning Systems

O'Reilly, Bhattacharyya, Howard, Ketz. Foundational paper on dual-system learning: fast learning (hippocampus) for episodic memory and slow integration (neocortex) for generalization. Directly inspired dual-system AI architectures.

Paper Week 6

Continual Learning and Catastrophic Forgetting

van de Ven, Soures, Kudithipudi. Survey paper covering biological solutions (synaptic consolidation, sleep-based replay) and AI approaches (elastic weight consolidation, progressive networks, rehearsal) to the catastrophic forgetting problem.

Paper Week 6

Concept Drift and Model Decay in Machine Learning

Towards Data Science. Overview of how model performance degrades as real-world data distributions shift over time. Connects to biological memory decay and the need for continuous monitoring and retraining.

Article Week 6

Deep Neural Networks as Scientific Models

Cichy & Kaiser (2019). Explores how deep neural networks serve as scientific models of biological cognition, bridging computational neuroscience and AI. Trends in Cognitive Sciences.

Paper Week 6

Transformer Attention and Neuron-Astrocyte Processing

Explores how transformer attention mechanisms generate new hypotheses about neuron-astrocyte network processing in the brain, demonstrating the bidirectional feedback loop between AI and neuroscience. PMC.

Paper Week 6

OpenAI: Introducing ChatGPT Agent

OpenAI’s announcement of ChatGPT Agent, an agentic mode that can browse the web, use tools, and complete multi-step tasks on a user’s behalf.

Article Week 14

OpenAI: Introducing GPT-5.5

OpenAI’s introduction of GPT-5.5 and its extended reasoning ("thinking") mode for harder tasks.

Article Week 14

Google Gemini Agent

Google’s overview of Gemini’s agentic capabilities — autonomous task execution, tool use, and multi-step reasoning across the Gemini product line.

Docs Week 14

DeepSeek V4-Pro

Release notes for DeepSeek’s V4-Pro model, a useful reference point for the open-weight frontier-model competitive landscape.

Article Week 14

Anthropic: Claude Opus 4.7

Anthropic’s product page for Claude Opus, including its adaptive "thinking" capabilities for complex reasoning and agentic work.

Docs Week 14

Anthropic: Computer Use

Anthropic’s announcement of computer-use, enabling Claude to operate a computer via screenshots and simulated mouse/keyboard actions.

Article Week 14

Model Context Protocol (MCP)

Anthropic’s open standard for connecting AI assistants to external tools and data sources through a common protocol, now widely adopted across the agent ecosystem.

Docs Week 14

AI Geopolitics & Competitive Dynamics

Detecting and Preventing Distillation Attacks

Anthropic (February 2026). Documents industrial-scale model distillation by DeepSeek, Moonshot/Kimi K2, and MiniMax, extracting frontier capabilities without safety guardrails. Introduces detection methodologies and countermeasures for IP theft via API-based knowledge extraction.

Report Week 6

AlphaGo and Beyond: Chinese Military Looks to Future "Intelligentized" Warfare

Lawfare. Analysis of how AlphaGo's victory catalyzed China's military AI strategy, with the PLA framing AI as a revolution in military affairs equivalent to nuclear weapons. Examines the geopolitical implications of AI superiority in defense.

Article Week 6

Meditations on Moloch

Scott Alexander (2014). Foundational essay on coordination failures and race-to-the-bottom dynamics applied to AI development. Uses game theory to analyze how competitive pressures drive collectively harmful outcomes even when all participants prefer cooperation. Content warning: skip the Ginsberg poem if sensitive to explicit content.

Essay Week 6

AI, Moloch, and the Race to the Bottom

Ken Mogi (2023). Institute of Art and Ideas. Applies the Moloch framework specifically to the AI development race, analyzing prisoner's dilemma dynamics among AI companies and nations. Note: may require free trial to read.

Article Week 6

Careers, Certifications & Job Market

CompTIA Security+

Entry-level, vendor-neutral cybersecurity certification covering core security skills; a widely recognized industry baseline credential.

Certification Week 14

CompTIA Network+

Vendor-neutral networking certification covering infrastructure, operations, and network-security fundamentals.

Certification Week 14

CompTIA CySA+

Intermediate certification focused on security analytics, threat detection, and incident response.

Certification Week 14

CompTIA SecAI+

CompTIA’s certification focused on securing AI systems and applying AI within cybersecurity work.

Certification Week 14

CISSP (ISC2)

Advanced, globally recognized certification for experienced security professionals, spanning eight security domains.

Certification Week 14

GIAC Certifications

SANS-affiliated certifications spanning offensive, defensive, forensic, cloud, and management security specialties.

Certification Week 14

ISACA AAISM

ISACA’s Advanced in AI Security Management credential, focused on governing and securing AI systems.

Certification Week 14

IAPP AIGP

The Artificial Intelligence Governance Professional credential, focused on AI law, policy, and responsible governance.

Certification Week 14

AWS Certified Security – Specialty

Amazon’s specialty certification validating expertise in securing workloads on the AWS cloud platform.

Certification Week 14

Microsoft Azure Security Engineer

Microsoft certification for implementing security controls, identity, and threat protection across Azure.

Certification Week 14

OffSec OSCP / PEN-200

OffSec’s hands-on penetration-testing course and the OSCP certification, known for its rigorous 24-hour practical exam.

Certification Week 14

Google Cybersecurity Certificate

Beginner-friendly professional certificate on Coursera covering security fundamentals, SIEM tools, and Python.

Certification Week 14

BLS Occupational Outlook Handbook

The U.S. Bureau of Labor Statistics’ authoritative guide to occupations, typical pay, education, and job outlook.

Report Week 14

BLS OOH: Computer & Information Technology

BLS occupational data for computer and IT roles, including information-security analysts.

Report Week 14

BLS Employment Projections

The BLS official employment-projections release covering expected job growth by occupation and industry.

Report Week 14

BLS: Incorporating AI Impacts in Employment Projections

A Monthly Labor Review article explaining how BLS accounts for AI’s effects in its long-range employment projections.

Article Week 14

ISC2 Cybersecurity Workforce Study

ISC2’s annual study quantifying the global cybersecurity workforce and the persistent talent gap.

Report Week 14

ISC2 Student Resources

Free certifications, training, and community resources for students entering cybersecurity.

Hub Week 14

CyberSeek

Interactive data on cybersecurity supply and demand, career pathways, and open roles across the U.S.

Dashboard Week 14

Lightcast

Labor-market analytics provider whose skills and demand data underpin many workforce reports.

Platform Week 14

Built In

Tech-focused job board and community featuring startup and enterprise technology roles.

Job Board Week 14

Indeed

Large general-purpose job-search engine aggregating listings across industries.

Job Board Week 14

LinkedIn Jobs

Professional-network job board with listings, referrals, and recruiter outreach.

Job Board Week 14

Handshake

Early-career and university-focused recruiting platform connecting students with employers.

Job Board Week 14

Wellfound

Startup-focused job platform (formerly AngelList Talent).

Job Board Week 14

ClearanceJobs

Job board specializing in roles that require U.S. government security clearances.

Job Board Week 14

StationX: Cybersecurity Job Market Statistics

An aggregated overview of cybersecurity job-market statistics and hiring trends.

Article Week 14

Ravio: AI Compensation & Talent Trends

Compensation-benchmarking analysis of AI talent pay and hiring trends.

Article Week 14

KORE1: AI Jobs 2026 Hiring Boom

A staffing-firm overview of projected AI hiring growth heading into 2026.

Article Week 14

WEF Future of Jobs Report 2025

World Economic Forum analysis of how technology, including AI, is reshaping skills and employment worldwide.

Report Week 14

CISA NICCS

CISA’s National Initiative for Cybersecurity Careers and Studies — a training catalog and home of the NICE Workforce Framework.

Hub Week 14

NICCS Cybersecurity Career Map

An interactive map of cybersecurity career pathways, roles, and the skills each requires.

Reference Week 14

speedyapply: 2026 AI/College Jobs

A community-maintained GitHub list of 2026 new-grad and internship roles in AI/ML.

Reference Week 14

Penligent: Cybersecurity Jobs in 2026

An overview of anticipated cybersecurity job trends and in-demand skills for 2026.

Article Week 14

NSF CyberCorps / CyberAI Corps Scholarship for Service

Federal scholarship-for-service program funding cybersecurity and AI-cyber students in exchange for government service after graduation.

Scholarship Week 14

DoD SMART Scholarship

Department of Defense scholarship covering full tuition plus a guaranteed DoD civilian position for STEM students.

Scholarship Week 14

NSA Student Programs (Stokes)

NSA student programs including the Stokes Educational Scholarship and paid internships.

Scholarship Week 14

Microsoft AI Red Team — Open Roles

Search of open roles on Microsoft’s AI Red Team.

Job Board Week 14

Meta Security Careers

Meta’s security-team careers hub.

Job Board Week 14

xAI Careers

Open roles at xAI.

Job Board Week 14

AWS Careers

Amazon Web Services careers portal, including cloud-security and responsible-AI roles.

Job Board Week 14

CISA Careers

Careers and student internships at the U.S. Cybersecurity and Infrastructure Security Agency.

Job Board Week 14

USAJOBS

The U.S. federal government’s official employment site.

Job Board Week 14

USAJOBS: Students & Recent Graduates (Pathways)

Federal Pathways Programs hiring routes for current students and recent graduates.

Reference Week 14

EU AI Office — Job Opportunities

Roles at the European Commission’s AI Office, which implements and enforces the EU AI Act.

Job Board Week 14

Federal News Network: CISA Hiring Plans

Reporting on CISA’s plan to add more than 300 new hires, a signal of federal cyber demand.

Article Week 14

80,000 Hours (AI)

A career-advice nonprofit with in-depth guides on high-impact work in AI safety and policy.

Community Week 14

AISafety.com Jobs

A community-maintained job board aggregating AI-safety roles worldwide.

Job Board Week 14

Academic Programs in AI Cybersecurity

TU M.S. in Cyber Security (Online)

The University of Tulsa’s online Master of Science in Cyber Security.

Program Week 14

TU Ph.D. in Cyber Studies

The University of Tulsa’s interdisciplinary Ph.D. in Cyber Studies, offered by the nation’s first dedicated cyber department to grant B.S., M.S., and Ph.D. degrees. TU is an NSA Center of Academic Excellence.

Program Week 14

TU B.S. in Applied AI

The University of Tulsa’s Bachelor of Science in Applied Artificial Intelligence.

Program Week 14

OU B.S. in Applied AI

University of Oklahoma Polytechnic’s B.S. in Applied Artificial Intelligence.

Program Week 14

FIU M.S. in Computer Engineering (AI Security)

Florida International University’s online M.S. in Computer Engineering with an AI-for-cyber / cyber-for-AI focus.

Program Week 14

CMU M.S. in AI Engineering – Information Security

Carnegie Mellon’s M.S. in Artificial Intelligence Engineering – Information Security (MSAIE-IS).

Program Week 14

Stanford AI Graduate Certificate

Stanford’s online graduate certificate in artificial intelligence.

Program Week 14

Purdue Applied AI & Cybersecurity Certificate

Purdue Polytechnic’s online graduate certificate in applied AI and cybersecurity.

Program Week 14

Cybersecurity Guide: AI Cybersecurity Master’s Programs

A comparison guide to AI-focused cybersecurity master’s degree programs.

Reference Week 14

Hands-On AI Cybersecurity Practice

Hack The Box

Gamified, hands-on penetration-testing labs and challenges across difficulty levels.

Platform Week 14

TryHackMe

Guided, browser-based cybersecurity training "rooms" for beginners through advanced learners.

Platform Week 14

picoCTF

A free, beginner-friendly capture-the-flag platform from Carnegie Mellon.

Platform Week 14

Collegiate Cyber Defense Competition (CCDC)

A national collegiate blue-team competition where students defend a live business network.

Challenge Week 14

National Cyber League (NCL)

A collegiate individual and team CTF competition with detailed skills reporting.

Challenge Week 14

US Cyber Games

A national talent pipeline and competition that selects and trains the US Cyber Team.

Challenge Week 14

CSAW

NYU’s global, student-run cybersecurity games and research events.

Challenge Week 14

MITRE eCTF

MITRE’s Embedded Capture-the-Flag, a hardware/embedded-systems security competition.

Challenge Week 14

Bugcrowd

A crowdsourced bug-bounty and vulnerability-disclosure platform.

Platform Week 14

HackerOne

A leading bug-bounty and vulnerability-coordination platform.

Platform Week 14

Kaggle

A data-science and machine-learning competition platform with datasets, notebooks, and challenges.

Platform Week 14

Conferences, Communities & AI-Safety Organizations

DEF CON

One of the world’s largest hacker conventions, held annually in Las Vegas.

Conference Week 14

DEF CON Groups

Local, year-round DEF CON community groups (DCGs) that meet between conferences.

Community Week 14

AI Village

The DEF CON village dedicated to AI security, home to large-scale generative-AI red-teaming events.

Community Week 14

Black Hat

A premier commercial information-security conference and professional training series.

Conference Week 14

RSA Conference

A large industry security conference covering enterprise security, policy, and emerging threats.

Conference Week 14

USENIX Security

USENIX’s academic security symposium, a top venue for systems-security research.

Conference Week 14

Security BSides

Community-organized, local security "unconferences" held in cities worldwide.

Conference Week 14

NeurIPS

The Conference on Neural Information Processing Systems, a leading machine-learning research venue.

Conference Week 14

ICLR

The International Conference on Learning Representations, a top deep-learning research venue.

Conference Week 14

AAAI / AIES Conference

The AAAI/ACM Conference on AI, Ethics, and Society.

Conference Week 14

METR

Model Evaluation & Threat Research (METR), a nonprofit that evaluates frontier-model autonomous capabilities and dangerous-capability thresholds.

Research Week 14

AI Safety Camp

A part-time online research program that helps newcomers start concrete AI-safety projects.

Program Week 14

MATS Program

The ML Alignment & Theory Scholars program, pairing scholars with experienced alignment-research mentors.

Program Week 14

BlueDot Impact: AI Safety Fundamentals

Free, structured courses on AI alignment and governance from BlueDot Impact.

Program Week 14

Anthropic Alignment / Fellows

Anthropic’s alignment-research hub and fellows program.

Program Week 14

Center for Human-Compatible AI (CHAI)

UC Berkeley’s Center for Human-Compatible AI — alignment research and open roles.

Community Week 14

Apollo Research

An AI-safety organization focused on evaluating deceptive and scheming behaviors in advanced models.

Community Week 14

FAR.AI

A nonprofit AI-safety research and field-building organization.

Community Week 14

Center for AI Safety (CAIS)

A research nonprofit behind the widely signed Statement on AI Risk, plus safety research and field-building.

Community Week 14

Stanford AI Safety

Stanford’s center for AI-safety research and student engagement.

Community Week 14

SafeAI Workshop

An academic workshop series on engineering safe and trustworthy AI systems.

Conference Week 14

ML Safety

A research community and resource hub for machine-learning safety, including competitions like Trojan Detection.

Community Week 14

Pattern Labs

An AI-security research company evaluating frontier-model cyber capabilities; careers page.

Community Week 14

LessWrong

A rationality and AI-alignment discussion forum central to the alignment community.

Community Week 14

Alignment Forum

A focused research forum for technical AI-alignment discussion.

Community Week 14

EleutherAI

A grassroots, open-source AI research collective organized around a public Discord.

Community Week 14

MLCommons

An open engineering consortium behind ML benchmarks such as MLPerf and AI-safety benchmarks.

Community Week 14

UK AI Safety Institute

The UK AI Safety Institute (renamed the AI Security Institute in February 2025), which evaluates frontier-AI risks and informs government policy.

Primary Source Week 14

Japan AI Safety Institute

Japan’s national AI Safety Institute.

Primary Source Week 14

Singapore AI Safety Hub

Singapore’s national AI-safety effort and research hub.

Primary Source Week 14

US AI Safety Institute / NIST CAISI

The U.S. AI Safety Institute — now the Center for AI Standards and Innovation (CAISI) — at NIST.

Primary Source Week 14

Regional AI & Cyber Ecosystem (Tulsa & Oklahoma)

TU Launches AI Degree (News)

University of Tulsa announcement on launching its applied-AI degree.

Article Week 14

Tulsa $51M EDA Tech Hub Award (News)

TU news on Tulsa’s $51M federal Tech Hub implementation award.

Article Week 14

OK Cyber Innovation Institute & First Cyber Range (News)

TU news on the Oklahoma Cyber Innovation Institute launching the state’s first cyber range.

Article Week 14

EDA Tulsa Regional Tech Hub

The federal Economic Development Administration’s designation of the Tulsa Regional Tech Hub.

Primary Source Week 14

OK Commerce: TU Cyber Institute ($75M)

Oklahoma Department of Commerce on TU launching a cyber institute with a projected $75M investment.

Article Week 14

Tulsa Innovation Labs (THETA)

Tulsa Innovation Labs, the organization driving the region’s technology-cluster strategy.

Community Week 14

Oklahoma Cyber Innovation Institute (OCII)

The OCII initiative building Tulsa’s cyber ecosystem, workforce, and research capacity.

Community Week 14

Nucamp: Top Cybersecurity Employers in Tulsa

An overview of Tulsa-area cybersecurity employers and the skills they look for.

Article Week 14

Tinker AFB — Jobs (OKC)

Civilian and cyber career opportunities at Tinker Air Force Base in Oklahoma City.

Job Board Week 14

FAA Mike Monroney Aeronautical Center

The FAA’s Oklahoma City campus, a major federal employer in aviation IT and operations.

Reference Week 14

QuikTrip Careers

Tulsa-headquartered retailer; corporate, IT, and technology careers.

Job Board Week 14

Spirit AeroSystems Jobs

Aerospace manufacturer with major Oklahoma operations; engineering and IT careers.

Job Board Week 14

BOK Financial Careers

Tulsa-based financial-services firm; technology and security careers.

Job Board Week 14

Deloitte — Tulsa Jobs

Deloitte’s Tulsa-location job listings.

Job Board Week 14

Targa Resources Careers

Energy-infrastructure company with an Oklahoma presence; careers.

Job Board Week 14

True Digital Security Careers

A Tulsa-based cybersecurity services firm; careers.

Job Board Week 14

Verinovum Careers

A Tulsa-based health-data curation company; careers.

Job Board Week 14

Kaiser-Francis Oil Company

A Tulsa-based oil & gas company.

Job Board Week 14

ONEOK Careers

Tulsa-headquartered energy company; careers.

Job Board Week 14

Public Service Co. of Oklahoma (AEP) Careers

PSO (an AEP company); careers across its Oklahoma utility operations.

Job Board Week 14

Williams Careers

Tulsa-headquartered energy-infrastructure company; careers.

Job Board Week 14

Asemio Careers

A Tulsa-based data-science and social-impact analytics firm; careers.

Job Board Week 14

Mechanistic Interpretability

Scaling Monosemanticity (Anthropic)

Anthropic’s landmark work extracting millions of interpretable features from Claude 3 Sonnet using sparse autoencoders — a major step toward understanding a model’s internal concepts.

Paper Week 14

Transformer Circuits Thread

Anthropic’s interpretability publication venue, home to foundational work on superposition, features, and circuits in transformer models.

Research Week 14

Neel Nanda: Mechanistic Interpretability Quickstart

A practical getting-started guide to mechanistic-interpretability research, with concrete advice on entering the field.

Guide Week 14

Weak-to-Strong Generalization (OpenAI)

OpenAI research on whether weak supervisors can elicit the full capabilities of much stronger models — a core question for scalable oversight.

Paper Week 14