site:the-decoder.com - News Search

OpenAI says its latest models outperform doctors in medical benchmark

OpenAI has introduced HealthBench, a new benchmark for evaluating AI in healthcare, built on 5,000 simulated doctor-patient conversations and measured by 48,000 medically relevant criteria. The latest ...

the-decoder

AI agents can be easily tricked into doing stupid things, study says

Researchers have discovered that AI agents that independently control computer systems and software can be easily manipulated by attackers. According to the scientists, the attacks are simple to carry ...

the-decoder

Trump advisors are pushing a regulation targeting what they call "woke" AI models in the ...

The Trump administration is preparing a regulation that would require AI companies holding federal contracts to ensure their systems remain politically neutral, as part of a broader campaign to curb ...

the-decoder

OpenAI claims generative AI saves knowledge workers 40 to 80 minutes a day

Users of ChatGPT Enterprise save between 40 and 80 minutes per day on average, particularly in data science, engineering, and communication tasks. The more varied the tasks for which AI is ...

the-decoder

Claude Opus 4 blackmailed an engineer after learning it might be replaced

Anthropic has designated its new language model Claude Opus 4 as safety-critical and assigned it to AI Safety Level 3, after tests revealed risky behaviors such as attempted self-rescue, blackmail, ...

the-decoder

Salesforce's CRM benchmark finds AI agents struggle in real-world business scenarios

Salesforce has launched CRMArena-Pro, a benchmark designed to evaluate AI agents in practical business situations, including multi-step conversations and data protection checks within CRM systems.

the-decoder

OpenAI CEO Sam Altman outlines vision for next-generation AI hardware and interfaces

At OpenAI DevDay, OpenAI CEO Sam Altman shared his vision for the future of AI interaction. Former Apple chief designer Jony Ive is working with OpenAI on AI hardware. He envisions a system in which ...

the-decoder

xAI's Aurora image model becomes official, built from scratch

xAI has officially announced Aurora on its blog, confirming it as an entirely new model built from the ground up. This suggests the company may be moving away from its previous partnership with Black ...

the-decoder

Goldman Sachs blunder adds to AI stock sell-off

Goldman Sachs published a flawed analysis suggesting a massive drop in ChatGPT traffic, apparently overlooking OpenAI's domain change. Accurate Similarweb data shows ChatGPT's continued 66.2% ...

the-decoder

OpenAI's o1-preview model manipulates game files to force a win against Stockfish in chess

OpenAI's "reasoning" model o1-preview has been found to manipulate the chess playing environment to win against the chess engine Stockfish, without being explicitly instructed to do so. The ...

the-decoder

OpenAI to restrict API access for unsupported countries in July

OpenAI plans to enforce stricter API restrictions for unsupported countries starting July 9, likely affecting China, Russia, North Korea, and Iran. Developers will need to find ways to verify and ...

the-decoder

Runway unveils first "General World Model" alongside major Gen-4.5 upgrades

Runway has upgraded Gen-4.5 and introduced GWM-1, the company's first "General World Model." The recently introduced Gen-4.5 now features native audio generation and audio editing, as well as ...
