Week 4

Potential Harms, Misuse, and the Alignment Problem

Section 1 - WHY

Day 1: Monday, February 9

Topic 3.1: Unintended Harms -- Second- and Third-Order Effects

Topics

3.1 Unintended harms: algorithmic bias and discrimination in criminal justice, hiring, lending, and healthcare

Slide Deck

Presentation 5: Unintended Harms — Bias and Discrimination

View PDF | Download PPTX

Day 2: Wednesday, February 11

Topics 3.2-3.3: Intentional Misuse and the Alignment Problem

Topics

3.2 Intentional misuse: deepfakes, coordinated misinformation, surveillance, autonomous cybercrime, and malicious applications
3.3 The alignment problem, value learning, and catastrophic risks

Slide Deck

Presentation 6: Intentional Misuse and the Alignment Problem

View PDF | Download PPTX

Hands-On Exercise

AI-assisted security audit using Gemini CLI on a deliberately insecure chatbot API (QuickChat).

Assignment Given

Assignment 4: Secure Chatbot API Repair (Due Wed, Feb 18, 12:29 PM) (Markdown | Word)

Reading, Assignments, and Resources

Required Reading

Hendrycks - Introduction to AI Safety, Ethics, and Society

Chapter 1: Overview of Catastrophic AI Risks

Read Online →

Assignments

Due: Assignment 3: AI Topic Deep Dive (Thu, Feb 12, 12:29 PM) — Markdown | Word
Given: Assignment 4: Secure Chatbot API Repair (Due Wed, Feb 18, 12:29 PM) — Markdown | Word

Additional Resources

Synthesized from instructor research and student contributions. Full resource cards with descriptions are available on the Resources page.