Anthropic, a startup focused on AI safety and alignment founded by former OpenAI executives, recently introduced Claude 2, its state-of-the-art AI model. This text-generating AI demonstrates improved capacity for human-like conversation, along with skills such as coding, math, reasoning, summarization, and writing.
As the successor to Claude 1.3, an early 2023 release that garnered significant attention as a ChatGPT competitor, Claude 2 brings numerous enhancements, including expanded input and output lengths, heightened performance on an array of tests and benchmarks, and fortified safety features to mitigate harmful or offensive outputs.
Claude 2 holds a significant edge over ChatGPT in response length: ChatGPT's standard context window is capped at 4,096 tokens (roughly 3,000 words), while Claude 2 accepts inputs of up to 100K tokens (approximately 75,000 words) and can produce outputs several thousand tokens long in a single completion. This lets it ingest substantial documents, including technical manuals or books, and produce extensive texts such as reports, letters, or narratives. As an illustration, Anthropic reports that Claude 2 can craft a short story from a single prompt, a task that would require ChatGPT multiple prompts and outputs to accomplish.
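To make these context sizes concrete, the sketch below estimates whether a document fits in a 100K-token window. It assumes the common rule of thumb of roughly 0.75 words per English token; the real ratio depends on the tokenizer and the text, so treat the numbers as illustrative only.

```python
def estimated_tokens(word_count: int, words_per_token: float = 0.75) -> int:
    """Rough token estimate from a word count (heuristic, not a real tokenizer)."""
    return int(word_count / words_per_token)

def fits_in_context(word_count: int, context_tokens: int = 100_000) -> bool:
    """Check whether a document of this many words likely fits in a 100K-token window."""
    return estimated_tokens(word_count) <= context_tokens

# A 60,000-word technical manual comes out to about 80,000 estimated tokens,
# comfortably inside Claude 2's 100K-token window; a 90,000-word book does not fit.
print(fits_in_context(60_000))  # True
print(fits_in_context(90_000))  # False
```

By the same heuristic, a 4,096-token window corresponds to roughly 3,000 words, which is why long documents must be split into chunks for smaller-context models.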
Performance improvements represent another key selling point for Claude 2, with enhanced capabilities across numerous domains and tasks. Anthropic states that Claude 2 has markedly better coding skills than its predecessor, scoring 71.2% on the Codex HumanEval Python coding test, up from Claude 1.3's 56.0% (by comparison, OpenAI's original Codex model scored 28.8% on the same benchmark).
Claude 2 further surpasses its counterparts in mathematical ability, earning an 88.0% score on the GSM8K grade-school math problems, compared to ChatGPT's 43% and Claude 1.3's 85.2%. Claude 2's reasoning skills have also been strengthened: it scores above the 90th percentile on the GRE reading and writing exams and achieves 76.5% on the multiple-choice section of the Bar exam.
Furthermore, Claude 2 features advanced safety measures designed to curb harmful or offensive outputs. Through safety techniques such as Constitutional AI training and extensive red-teaming evaluation, Claude 2 is less prone to generating damaging or inappropriate content.
Anthropic maintains that Claude 2 is less likely to generate fake news, hate speech, personal attacks, or sensitive information than ChatGPT or Claude 1.3. In internal evaluations on a broad set of harmful prompts, Claude 2 was twice as good at giving harmless responses as its predecessor, Claude 1.3.
Presently, Claude 2 is available for general consumer use via a public beta website (claude.ai) and for business use through an API (anthropic.com/api). Thousands of businesses are utilizing the Claude API for an array of applications, including customer service, content creation, education, and research. Anthropic plans to incrementally roll out a host of capability improvements for Claude 2 in the upcoming months.
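For businesses integrating Claude 2, a request to the API follows Anthropic's text-completions HTTP endpoint. The sketch below uses only the Python standard library; the prompt content is illustrative, and a real `ANTHROPIC_API_KEY` environment variable is required before `complete` can actually be called.

```python
import json
import os
import urllib.request

API_URL = "https://api.anthropic.com/v1/complete"

def build_request(prompt_text: str, max_tokens: int = 300) -> dict:
    """Assemble the JSON payload for a Claude 2 completion request."""
    return {
        "model": "claude-2",
        # The completions endpoint expects Human/Assistant turn markers.
        "prompt": f"\n\nHuman: {prompt_text}\n\nAssistant:",
        "max_tokens_to_sample": max_tokens,
    }

def complete(prompt_text: str) -> str:
    """Send the request and return Claude's completion text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_request(prompt_text)).encode("utf-8"),
        headers={
            "x-api-key": os.environ["ANTHROPIC_API_KEY"],
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["completion"]
```

Anthropic also ships an official `anthropic` Python SDK that wraps this endpoint; the raw-HTTP version is shown here only to make the request structure visible.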
The advent of Claude 2 is a testament to the progress within the conversational AI realm and positions the model as a formidable competitor to ChatGPT. It showcases the potential of AI models to generate longer, safer, and more versatile responses across a multitude of domains and tasks.
Yet, it also underscores the ethical and social considerations tied to such potent models. Questions around how to guarantee ethical use, protect user privacy and data security, and cultivate trust and transparency between humans and AI are issues that Anthropic and the broader AI research community will need to address as they continue to elevate the state of conversational AI.