Text Mining for Korean: Characteristics and Application to 2011 Korean Economic Census Data

Juna Goo, Kyunga Kim

Research output: Contribution to journal › Article › peer-review

Abstract

The 2011 Korean Economic Census, the first economic census in Korea, contains text data on the menus
served by Korean-food restaurants as well as structured data on restaurant characteristics, including
area, opening year, and total sales. In this paper, we apply text mining to the text data and investigate
the statistical and technical issues and characteristics of Korean text mining. Pork belly roast was the
most popular menu item across provinces and restaurant types in 2010, and the number of restaurants per
10,000 people was especially high in Kangwon-do and Daejeon Metropolitan City. Beef tartare and fried pork cutlet are popular menu items in start-up restaurants, while whole chicken soup and maeuntang (spicy fish stew) are popular in long-lived restaurants. These results can serve as a guideline for restaurant owners developing menus, and can inform a government policy-making process that leads small restaurants to choose suitable menus for a successful business.
Original language: American English
Journal: The Korean Journal of Applied Statistics
Volume: 27
Issue number: 7
State: Published - Dec 2014
Externally published: Yes

Keywords

  • Korean economic census
  • big data
  • dictionary construction
  • text mining

EGS Disciplines

  • Statistics and Probability
