NCSOFT unveils AI dataset to rival hyperscale language models

FoCus Dataset is the first of its kind, utilizing both user personas and outside knowledge 

NCSoft building in Pangyo, near South Korea's capital Seoul (Courtesy of NCSOFT Corp.)
Jee Abbey Lee
2022-04-14 16:45:53 jal@hankyung.com
Artificial intelligence

NCSOFT Corp. on Thursday unveiled an artificial intelligence (AI) conversation dataset it developed with Korea University’s research center.

The South Korean game developer and publisher headquartered in Pangyo city is positioning the latest development as the much-awaited rival to the hyperscale language models dominating the natural language processing (NLP) field. 

Lim Hui-seok, a professor of computer science and engineering at the university, led the research. Lim also heads the academic institute’s NLP and AI research center. 

The collection of data is named FoCus Dataset, a short form of For Customized Conversation Dataset. 

The research team says it is the first such dataset that encompasses both user persona and outside knowledge. As it stands, it comprises more than 15,000 conversations on some 8,000 subjects. 

An AI that is equipped with the FoCus Dataset will be able to comprehend the experience and preferences of the person with whom it is having a conversation. It will also be able to source and learn the latest information available on Wikipedia in real time. 

The collection and utilization of language data for AI adaptation falls in the NLP category. The goal of the machine learning technology is to program computers to process and analyze large amounts of the language spoken by humans for seamless communication between machines and people. 

In this context, a persona is a profile that stands in for a large segment of data. It is easier to test a given strategy against an average of different individuals, i.e. a persona, than against thousands of individuals. 

What sets FoCus Dataset apart from other data collections is that it can enable sophisticated conversations without the help of hyperscale language models.

Even though typical large-scale language models take a long time to learn from data and deduce meaning from it, they still hit a bottleneck when it comes to drawing inferences from real-time data and reflecting personal experiences.

In late February, NCSOFT and Korea University jointly published a paper on the dataset at the AAAI 2022 conference. Founded in 1979, the Association for the Advancement of Artificial Intelligence is one of the highest-regarded scientific societies in the AI community. 

Come this October, the two entities will host the first workshop on the customized chat technology at COLING 2022, an international conference on computational linguistics. 

“Recently in the NLP academic circle, the need for alternative conversation technologies that will rival hyperscale language models has risen – for financial and environmental reasons,” Lee Yeon-soo, director of NCSOFT’s Language AI Lab, said. 

The lead scientist at NCSOFT elaborated that he hopes the dataset will spark vibrant conversation and technological development within the NLP sector. 

NCSOFT is best known for the distribution of massively multiplayer online role-playing games (MMORPGs) such as Lineage and Guild Wars. In recent years, it has been expanding its foothold in other tech sectors. 

Write to Jee Abbey Lee at jal@hankyung.com
