[News] Introducing Claude 4

Kobe_zyx
Today, we’re introducing the next generation of Claude models: Claude Opus 4 and Claude Sonnet 4, setting new standards for coding, advanced reasoning, and AI agents.
Claude Opus 4 is the world’s best coding model, with sustained performance on complex, long-running tasks and agent workflows.
Claude Sonnet 4 is a significant upgrade to Claude Sonnet 3.7, delivering superior coding and reasoning while responding more precisely to your instructions.
Alongside the models, we’re also announcing:
Extended thinking with tool use (beta): Both models can use tools—like web search—during extended thinking, allowing Claude to alternate between reasoning and tool use to improve responses.
New model capabilities: Both models can use tools in parallel, follow instructions more precisely, and—when given access to local files by developers—demonstrate significantly improved memory capabilities, extracting and saving key facts to maintain continuity and build tacit knowledge over time.
Claude Code is now generally available: After receiving extensive positive feedback during our research preview, we’re expanding how developers can collaborate with Claude. Claude Code now supports background tasks via GitHub Actions and native integrations with VS Code and JetBrains, displaying edits directly in your files for seamless pair programming.
New API capabilities: We’re releasing four new capabilities on the Anthropic API that enable developers to build more powerful AI agents: the code execution tool, MCP connector, Files API, and the ability to cache prompts for up to one hour.
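The parallel tool use listed above means a single model turn may request several tools at once. On the application side, those requests can be executed concurrently and returned in the order they were asked for. The sketch below is only an illustration under assumptions: the tool names, the shape of the request dictionaries, and the `run_tool_calls` helper are all hypothetical and are not taken from the Anthropic API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical local tools; a real application would call its own services here.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

def get_time(city: str) -> str:
    return f"12:00 in {city}"

TOOLS = {"get_weather": get_weather, "get_time": get_time}

def run_one(call: dict) -> dict:
    """Run a single tool request, returning failures as data instead of raising."""
    try:
        output = TOOLS[call["name"]](**call["input"])
        return {"id": call["id"], "output": output, "is_error": False}
    except Exception as exc:  # unknown tool, bad arguments, tool failure
        return {"id": call["id"], "output": str(exc), "is_error": True}

def run_tool_calls(calls: list[dict]) -> list[dict]:
    """Execute all requested tools concurrently; results keep the request order."""
    with ThreadPoolExecutor(max_workers=max(1, len(calls))) as pool:
        # pool.map yields results in input order, even if tools finish out of order.
        return list(pool.map(run_one, calls))

if __name__ == "__main__":
    requests = [
        {"id": "a", "name": "get_weather", "input": {"city": "Tokyo"}},
        {"id": "b", "name": "get_time", "input": {"city": "Tokyo"}},
    ]
    for result in run_tool_calls(requests):
        print(result)
```

Threads suit this case because tool calls are usually I/O-bound (network or disk). Capturing errors per call keeps one failing tool from discarding the results of the others.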
Claude Opus 4 and Sonnet 4 are hybrid models offering two modes: near-instant responses and extended thinking for deeper reasoning. The Pro, Max, Team, and Enterprise Claude plans include both models and extended thinking, with Sonnet 4 also available to free users. Both models are available on the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI. Pricing remains consistent with previous Opus and Sonnet models: Opus 4 at $15/$75 per million tokens (input/output) and Sonnet 4 at $3/$15.
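To make the pricing concrete, here is a small cost estimate built only from the per-million-token prices quoted above. The dictionary keys `opus-4` and `sonnet-4` are labels chosen for this sketch, not API model identifiers, and the token counts in the demo are made up.

```python
# USD per million tokens as (input, output), taken from the prices quoted above.
PRICES_PER_MILLION = {
    "opus-4": (15.00, 75.00),
    "sonnet-4": (3.00, 15.00),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

if __name__ == "__main__":
    # Example workload: 200k input tokens and 20k output tokens.
    for model in PRICES_PER_MILLION:
        print(model, round(estimate_cost(model, 200_000, 20_000), 2))
```

For that example workload the estimate comes to $4.50 on Opus 4 and $0.90 on Sonnet 4, a fivefold difference that follows directly from the listed prices.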
Claude 4
Claude Opus 4 is our most powerful model yet and the best coding model in the world, leading on SWE-bench (72.5%) and Terminal-bench (43.2%). It delivers sustained performance on long-running tasks that require focused effort and thousands of steps, with the ability to work continuously for several hours—dramatically outperforming all Sonnet models and significantly expanding what AI agents can accomplish.
Claude Opus 4 excels at coding and complex problem-solving, powering frontier agent products. Cursor calls it state-of-the-art for coding and a leap forward in complex codebase understanding. Replit reports improved precision and dramatic advancements for complex changes across multiple files. Block calls it the first model to boost code quality during editing and debugging in its agent, codename goose, while maintaining full performance and reliability. Rakuten validated its capabilities with a demanding open-source refactor running independently for 7 hours with sustained performance. Cognition notes Opus 4 excels at solving complex challenges that other models can’t, successfully handling critical actions that previous models have missed.
Claude Sonnet 4 significantly improves on Sonnet 3.7’s industry-leading capabilities, excelling in coding with a state-of-the-art 72.7% on SWE-bench. The model balances performance and efficiency for internal and external use cases, with enhanced steerability for greater control over implementations. While not matching Opus 4 in most domains, it delivers an optimal mix of capability and practicality.
GitHub says Claude Sonnet 4 soars in agentic scenarios and will introduce it as the model powering the new coding agent in GitHub Copilot. Manus highlights its improvements in following complex instructions, clear reasoning, and aesthetic outputs. iGent reports Sonnet 4 excels at autonomous multi-feature app development, as well as substantially improved problem-solving and codebase navigation—reducing navigation errors from 20% to near zero. Sourcegraph says the model shows promise as a substantial leap in software development—staying on track longer, understanding problems more deeply, and providing more elegant code quality. Augment Code reports higher success rates, more surgical code edits, and more careful work through complex tasks, making it the top choice for their primary model.
These models advance our customers’ AI strategies across the board: Opus 4 pushes boundaries in coding, research, writing, and scientific discovery, while Sonnet 4 brings frontier performance to everyday use cases as an instant upgrade from Sonnet 3.7.
Model improvements
In addition to extended thinking with tool use, parallel tool execution, and memory improvements, we’ve significantly reduced behavior where the models use shortcuts or loopholes to complete tasks. Both models are 65% less likely to engage in this behavior than Sonnet 3.7 on agentic tasks that are particularly susceptible to shortcuts and loopholes.
Claude Opus 4 also dramatically outperforms all previous models on memory capabilities. When developers build applications that provide Claude local file access, Opus 4 becomes skilled at creating and maintaining ‘memory files’ to store key information. This unlocks better long-term task awareness, coherence, and performance on agent tasks—like Opus 4 creating a ‘Navigation Guide’ while playing Pokémon.
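The announcement does not say how memory files are structured, so the following is only one way an application might give a model local file access for this purpose. The `MemoryFile` class, its method names, and the JSON layout are assumptions made for illustration; the application would expose `remember` as a tool and feed `as_prompt_context` back into later prompts.

```python
import json
import tempfile
from pathlib import Path

class MemoryFile:
    """Hypothetical key-fact store backed by a single JSON file on disk."""

    def __init__(self, path: Path) -> None:
        self.path = Path(path)

    def load(self) -> dict:
        """Return all saved facts, or an empty dict if nothing was saved yet."""
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text(encoding="utf-8"))

    def remember(self, key: str, fact: str) -> None:
        """Save or overwrite one fact so it survives across sessions."""
        notes = self.load()
        notes[key] = fact
        self.path.write_text(
            json.dumps(notes, indent=2, ensure_ascii=False), encoding="utf-8"
        )

    def as_prompt_context(self) -> str:
        """Render saved facts as a bullet list suitable for a later prompt."""
        notes = self.load()
        return "\n".join(f"- {key}: {fact}" for key, fact in sorted(notes.items()))

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        memory = MemoryFile(Path(folder) / "navigation_guide.json")
        memory.remember("current_goal", "reach the next town")
        memory.remember("blocked_path", "the east exit needs a key item")
        print(memory.as_prompt_context())
```

Keeping the notes in a plain file means they persist after the conversation context is gone, which is what lets a long-running agent pick up where it left off.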
Finally, we’ve introduced thinking summaries for Claude 4 models that use a smaller model to condense lengthy thought processes. This summarization is only needed about 5% of the time—most thought processes are short enough to display in full. Users requiring raw chains of thought for advanced prompt engineering can contact sales about our new Developer Mode to retain full access.
Claude Code
Claude Code, now generally available, brings the power of Claude to more of your development workflow—in the terminal, your favorite IDEs, and running in the background with the Claude Code SDK.
New beta extensions for VS Code and JetBrains integrate Claude Code directly into your IDE. Claude’s proposed edits appear inline in your files, streamlining review and tracking within the familiar editor interface. Simply run Claude Code in your IDE terminal to install.
Beyond the IDE, we’re releasing an extensible Claude Code SDK, so you can build your own agents and applications using the same core agent as Claude Code. We’re also releasing an example of what’s possible with the SDK: Claude Code on GitHub, now in beta. Tag Claude Code on PRs to respond to reviewer feedback, fix CI errors, or modify code. To install, run /install-github-app from within Claude Code.