About Us

What is the Center for AI Safety?

The Center for AI Safety (CAIS) is a research and field-building nonprofit. Our mission is to promote the reliability and safety of artificial intelligence through technical research and by advocating for machine learning safety within the broader research community. CAIS was founded by Dan Hendrycks, who holds a machine learning Ph.D. from UC Berkeley. Dan’s previous projects include training language models to answer ethics questions, teaching game-playing agents to behave ethically, and developing a framework for analyzing how individual AI research papers affect existential risk.

Since its founding, CAIS has grown to fourteen employees and collaborators. Our ongoing projects include AI safety community events and resources such as a NeurIPS workshop, an online course, and several competitions. Beyond these, CAIS houses a research team that works directly on empirical AI safety projects.

Why Philosophy?

Like those of most nascent fields, the concepts of AI safety remain nebulous and ill-defined. Clarifying this conceptual territory is a task that philosophers are particularly well suited to handle. Historically, philosophers have been instrumental in developing the field; notable examples include Nick Bostrom, Peter Railton, and David Chalmers. The CAIS Philosophy Fellowship seeks to leverage philosophical talent to further clarify the concepts underpinning AI safety.