The Full Story

About the project

Since 2021, the TOPIK exam, managed by NIIED under the Ministry of Education, has been transitioning to NSDevil's UBT technology.

We're creating auto-scoring AI technology for the TOPIK Speaking exam.

For that reason, we need your support.

Project name: Korean Speech Database Construction Project for Learning through AI (organized by the Korean government)

Project overview

As digital transformation accelerates, it is necessary to build up the foundation of the artificial intelligence (AI) industry to strengthen the competitiveness of Korean companies and institutions. AI performance is intrinsically tied to training data and learning.

Consequently, securing high-quality data in meaningful quantities is a priority for meeting data needs on a national scale.

Building a database for AI training takes a lot of time and requires a sustainable budget plan, which in many cases keeps small-scale domestic companies and institutions from expanding AI-related projects. A 2021 KISDI study found that the biggest obstacle to introducing AI learning in the field is the lack of data (24.2%).

The project's purpose is to construct a Korean speech database that will reinforce existing Korean speech corpora and ultimately contribute to Korean language education for speakers of other languages, including the teaching of pronunciation.

AI learning in the field of language education and evaluation has drawn attention to the development of an effective AI-based Korean language education and evaluation system.

Trends observed outside of Korea clearly show a demand for this research. For Korean pronunciation and speaking training, it is necessary to collect voice data of Korean spoken by native speakers of English (USA, UK, Australia, Canada, etc.), European languages, Chinese, Japanese, and other Asian languages.

The metadata collected and studied on Korean pronunciation, speaking education, and evaluation is expected to promote software development in the related field.

This project will collect 5,000 hours of Korean speech data from native speakers of English, European languages, Chinese, Japanese, and other Asian languages.

The core development group

The project partners are NSDevil Co. Ltd (the representative of the consortium and the official assessment technology provider for TOPIK/EPS-TOPIK, the Test of Proficiency in Korean and the Employment Permit System-TOPIK), Naver (the leading IT company in South Korea), Seoul National University (SNU), Sungkyunkwan University (SKKU), and the Korean Society of Speech Sciences.

We need your support

We have reached our quota of voice data for every language group except English. We are therefore accepting applications from native English speakers.

You will record your voice on our platform:
1st step: Recording session (~1.5 hours)
After completion, you will receive a $15 Amazon gift card and have a chance to win a round-trip ticket to Korea.
2nd step: Bonus recording session (~30 minutes)
You will receive an extra $10 Amazon gift card.

Winners will be chosen through a random lottery. A video of the lottery will be posted on our Facebook page, and the prizes will be sent by email. (Estimated date: the end of January 2023)