AI Safety News | Blockchain.News

AI SAFETY

OpenAI Updates Model Spec with U18 Teen Safety Principles for ChatGPT

OpenAI adds new U18 Principles to its Model Specification, establishing age-appropriate AI safety guidelines for ChatGPT users aged 13 to 17.

Anthropic Enhances AI Safeguards for Sensitive Conversations

Anthropic has implemented advanced safeguards for its AI, Claude, to better handle sensitive topics such as suicide and self-harm, ensuring user safety and well-being.

AI Development Framework Aims for Greater Transparency and Safety

Anthropic proposes a framework for AI transparency, focusing on safety and accountability. This initiative aims to enhance public safety and responsible AI development.

Anthropic Strengthens AI Safeguards for Claude

Anthropic enhances its AI model Claude's safety and reliability with robust safeguards, ensuring beneficial outcomes while preventing misuse and harmful impacts.

Character.AI Implements New Safety Measures for Teen Users

Character.AI announces significant changes to enhance the safety of its platform for users under 18, including removing open-ended chat and introducing age assurance tools.

OpenAI Enhances GPT-5 for Sensitive Conversations with New Safety Measures

OpenAI has released an addendum to the GPT-5 system card, showcasing improvements in handling sensitive conversations with enhanced safety benchmarks.

NVIDIA Introduces Safety Measures for Agentic AI Systems

NVIDIA has launched a comprehensive safety recipe to enhance the security and compliance of agentic AI systems, addressing risks such as prompt injection and data leakage.

NVIDIA NeMo Guardrails Enhance LLM Streaming for Safer AI Interactions

NVIDIA introduces NeMo Guardrails to enhance large language model (LLM) streaming, improving latency and safety for generative AI applications through real-time, token-by-token output validation.

Ensuring AI Reliability: NVIDIA NeMo Guardrails Integrates Cleanlab's Trustworthy Language Model

NVIDIA's NeMo Guardrails, in collaboration with Cleanlab's Trustworthy Language Model, aims to enhance AI reliability by preventing hallucinations in AI-generated responses.

OpenAI Releases Comprehensive GPT-4o System Card Detailing Safety Measures

OpenAI's report on GPT-4o highlights extensive safety evaluations, red teaming, and risk mitigations prior to release.

Anthropic Expands AI Model Safety Bug Bounty Program

Anthropic broadens its AI model safety bug bounty program to address universal jailbreak vulnerabilities, offering rewards up to $15,000.

Anthropic Unveils Initiative to Enhance Third-Party AI Model Evaluations

Anthropic announces a new initiative aimed at funding third-party evaluations to better assess AI capabilities and risks, addressing the growing demand in the field.

Guaranteed Safe AI Systems: A Solution for the Future of AI Safety?

Exploring the potential of guaranteed safe AI systems in ensuring the safety and reliability of artificial general intelligence (AGI).

Exploring AGI Hallucination: A Comprehensive Survey of Challenges and Mitigation Strategies

A new survey delves into the phenomenon of AGI hallucination, categorizing its types, causes, and current mitigation approaches while discussing future research directions.

British Standards Institution Pioneers International AI Safety Guidelines for Sustainable Future

BSI's release of the first international AI safety guideline, BS ISO/IEC 42001, marks a significant step in standardizing the safe and ethical use of AI, reflecting global demand for robust AI governance.