News

The research indicates that AI models can develop the capacity to deceive their human operators, especially when faced with the prospect of being shut down.
New research shows that as agentic AI becomes more autonomous, it can also become an insider threat, consistently choosing ...
New research from Anthropic suggests that most leading AI models exhibit a tendency to blackmail when it's the last resort ...
Leading AI models were willing to evade safeguards, resort to deception and even attempt to steal corporate secrets in the ...
Recent research from Anthropic has set off alarm bells in the AI community, revealing that many of today's leading artificial ...
The AI startup said in its report, titled ‘Agentic Misalignment: How LLMs could be insider threats’, that models from companies ...
OpenAI's latest ChatGPT model ignores basic instructions to turn itself off, even rewriting a strict shutdown script.
Anthropic emphasized that the tests were set up to force the model to act in certain ways by limiting its choices.
A new Anthropic report shows exactly how, in an experiment, AI arrives at an undesirable action: blackmailing a fictional ...
The move affects users of GitHub’s most advanced AI models, including Anthropic’s Claude 3.5 and 3.7 Sonnet, Google’s Gemini ...
Most mainstream AI chatbots can be convinced to engage in sexually explicit exchanges, even if they initially refuse.
Chinese tech firms kicked off the month with notable AI model launches, including new releases from Alibaba Group Holding Ltd ...