Week 4

Potential Harms, Misuse, and the Alignment Problem

Section 1 - WHY

Day 1: Monday, February 9

Topic 3.1: Unintended Harms -- Second- and Third-Order Effects

Topics

  • 3.1 Unintended harms: algorithmic bias and discrimination in criminal justice, hiring, lending, and healthcare

Slide Deck

Presentation 5: Unintended Harms — Bias and Discrimination

Day 2: Wednesday, February 11

Topics 3.2-3.3: Intentional Misuse and the Alignment Problem

Topics

  • 3.2 Intentional misuse: deepfakes, coordinated misinformation, surveillance, autonomous cybercrime, and malicious applications
  • 3.3 The alignment problem, value learning, and catastrophic risks

Slide Deck

Presentation 6: Intentional Misuse and the Alignment Problem

Hands-On Exercise

AI-assisted security audit using Gemini CLI on a deliberately insecure chatbot API (QuickChat).

Assignment Given

Reading, Assignments, and Resources

Required Reading

Hendrycks - Introduction to AI Safety, Ethics, and Society

Chapter 1: Overview of Catastrophic AI Risks

Assignments

Additional Resources

Synthesized from instructor research and student contributions. Full resource cards with descriptions are available on the Resources page.