Upstage's AI beats global peers in benchmark math tests

Its MathGPT outperformed Microsoft's ToRA math-specific large language model in two global tests

Upstage's CEO Kim Seong-hoon (Courtesy of Upstage)
Kang-Ho Jang
2024-01-08 20:14:43 autonomy@hankyung.com
Artificial intelligence

South Korean artificial intelligence tech startup Upstage said on Monday that its math-specific large language model (LLM), jointly developed with local startup Masspresso and telecom leader KT Corp., has outperformed Microsoft Corp.’s ToRA in two global math benchmark tests. 

Upstage’s MathGPT scored 0.488 out of a full score of 1 in the latest MATH benchmark test for LLMs with 13 billion parameters or fewer. The test is based on a dataset of 12,500 challenging math problems.

The Korean model outperformed OpenAI's GPT-4, which scored 0.425, as well as the chatbot ChatGPT, which scored 0.355, and ToRA, which scored 0.481, Upstage said.

In the GSM8K benchmark, or Grade School Math 8K, MathGPT topped the LLM list. The Korean AI scored 0.782, beating ToRA’s 0.758. The benchmark is based on a dataset of 8,500 high-quality, linguistically diverse grade school math word problems.

Math has been a difficult field in which to apply LLMs due to the need for logical reasoning and abstract thinking. 

Upstage has been developing the math-specific LLM with Masspresso, the operator of the AI-backed learning platform Qanda, since last year. The project is part of the two AI startups’ partnership with KT, which last September invested 10 billion won ($7.6 million) in each of the tech ventures to strengthen its hyperscale AI capabilities.

Masspresso, which collects around 10 million pieces of data on math problems and explanations per day, has provided Upstage with the dataset.

KT is using Korea’s largest graphics processing unit (GPU) farm, a set of servers that allocate resources to perform calculations quickly, which it operates, to accelerate the two startups’ math-specific LLM development.

Upstage will lead the innovation of generative AI in math and other domains with its global top LLM tech, said Chief Executive Kim Seong-hoon.

AI in the global edtech industry, which has been at the level of Google search, will be upgraded with MathGPT, said a Qanda official.

Write to Kang-Ho Jang at autonomy@hankyung.com


Jihyun Kim edited this article.
