Introduction

As artificial intelligence (AI) capabilities scale, safety concerns have become central to global policy discussions. Yet while governments, industry, and civil society acknowledge the risks, one critical institutional gap persists: the absence of mandatory disclosure requirements for AI system failures, vulnerabilities, and safety incidents. Without legal obligations to disclose known risks, AI safety remains largely a function of voluntary corporate self-reporting.


Disclosure as a Safety Norm

In other high-stakes industries such as aviation, pharmaceuticals, and nuclear energy, mandatory reporting of safety failures is a foundational governance principle (Cummings, 2021; Nuclear Energy Agency, 2016). These regimes combine compulsory incident reporting, independent investigation, and public disclosure so that individual failures are identified and corrected before they become systemic.

By contrast, no such statutory framework governs safety disclosures for AI systems. AI developers currently operate without binding legal obligations to report known safety incidents, emergent dangerous behaviors, or model failures to public authorities or independent bodies (Brundage et al., 2020).


Voluntary Disclosure Regimes

Some private AI labs have created internal disclosure protocols as part of their safety frameworks. For example:

  • Anthropic’s Responsible Scaling Policy includes internal monitoring, staged capability evaluations, and risk thresholds (Anthropic, 2023).

  • OpenAI’s Preparedness Framework establishes internal processes for risk monitoring and red-teaming (OpenAI, 2023).

However, these frameworks remain entirely self-governed. There are no independent audits, enforcement mechanisms, or legal penalties for non-disclosure. As Fjeld et al. (2020) argue, industry self-regulation provides some early structure but lacks the institutional authority required for systemic safety governance.


The Institutional Disclosure Gap

The absence of legal disclosure obligations creates structural vulnerabilities in AI safety governance:

  • Hidden failure modes: Dangerous behaviors may remain known only to private developers.

  • Limited external oversight: Regulators, researchers, and civil society lack full access to safety-relevant data.

  • Delayed policy response: Without verified empirical evidence of failures, policymakers struggle to calibrate effective safety regulations.

As Crootof and Bowers (2022) note, transparency without binding disclosure rules enables private actors to control both the timing and content of risk information released to the public.


Policy Options for Disclosure Governance

Closing the disclosure gap requires institutional mechanisms that impose enforceable obligations. Policy options include:

  • Statutory reporting requirements for safety incidents in high-risk AI systems.

  • Third-party audit mandates with access to internal model evaluations.

  • Legal protections for independent safety researchers and whistleblowers.

  • Centralized public registries for documented AI failures (a possible record format is sketched below).

Establishing such mechanisms would align AI governance with established safety norms in other high-stakes technological domains.
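To make the registry option concrete, the sketch below shows one possible shape for a machine-readable incident record, written in Python. It is an illustrative assumption rather than a format drawn from any existing statute, framework, or registry: the field names (system_id, severity, discovered_at, and so on) and the severity levels are hypothetical.

# A minimal, hypothetical sketch of an incident record for a public AI
# failure registry. Field names and severity levels are illustrative
# assumptions, not drawn from any existing statute or framework.
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from enum import Enum
import json


class Severity(Enum):
    LOW = "low"            # no user harm, caught during evaluation
    MODERATE = "moderate"  # limited real-world impact
    HIGH = "high"          # demonstrated harm or dangerous capability


@dataclass
class IncidentReport:
    system_id: str                 # developer-assigned model or system identifier
    developer: str                 # reporting organization
    summary: str                   # plain-language description of the failure
    severity: Severity
    discovered_at: datetime
    reported_at: datetime
    mitigations: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the record to JSON for submission to a central registry."""
        record = asdict(self)
        record["severity"] = self.severity.value
        record["discovered_at"] = self.discovered_at.isoformat()
        record["reported_at"] = self.reported_at.isoformat()
        return json.dumps(record, indent=2)


# Example: a developer files a report for a hypothetical red-teaming finding.
report = IncidentReport(
    system_id="example-model-v2",
    developer="Example Labs",
    summary="Model produced detailed instructions for a restricted task during routine red-teaming.",
    severity=Severity.MODERATE,
    discovered_at=datetime(2024, 1, 10, tzinfo=timezone.utc),
    reported_at=datetime(2024, 1, 12, tzinfo=timezone.utc),
)
print(report.to_json())

A standardized record of this kind would allow regulators and researchers to aggregate reports across developers, which is the core oversight function a centralized registry is meant to serve.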


Conclusion

AI safety cannot rest on voluntary disclosure alone. As systems scale into increasingly consequential applications, enforceable disclosure obligations will be necessary to ensure that safety knowledge is not privately enclosed. Governance that relies solely on corporate transparency lacks both institutional resilience and public legitimacy.


References

Anthropic. (2023). Responsible scaling policy. Retrieved from https://www.anthropic.com/policies/responsible-scaling

Brundage, M., Avin, S., Clark, J., Toner, H., Eckersley, P., Garfinkel, B., … & Anderson, H. (2020). Toward trustworthy AI development: Mechanisms for supporting verifiable claims. Futures, 116, 102500.

Crootof, R., & Bowers, M. (2022). Regulating AI transparency. Yale Journal on Regulation Bulletin, 39, 46–59.

Cummings, M. (2021). Rethinking the maturity of AI governance: Lessons from aviation. AI & Society, 36(2), 567–574.

Fjeld, J., Achten, N., Hilligoss, H., Nagy, A., & Srikumar, M. (2020). Principled artificial intelligence: Mapping consensus in ethical and rights-based approaches to principles for AI. Berkman Klein Center for Internet & Society.

Nuclear Energy Agency. (2016). The Fukushima Daiichi Nuclear Power Plant accident: OECD/NEA nuclear safety review. Paris: OECD Publishing.

OpenAI. (2023). Preparedness framework. Retrieved from https://openai.com/preparedness
