A Crowdsourcing Semi-Supervised LSTM Training Approach to Identify Novel Items in Emerging Artificial Intelligent Environments

Edoardo Serra, Haritha Akella, Alfredo Cuzzocrea

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

New kinds of cuisine appear on the market all the time. Although main cuisines such as French, Italian, Japanese, Chinese and Indian are always appreciated, they are no longer the most popular. The new trend is fusion cuisine: a combination of different main cuisines that yields something new. The opening of a restaurant offering a new kind of cuisine generates a lot of excitement, and people feel the need to try it and be part of this new culture. Yelp is a platform that publishes crowd-sourced reviews of businesses, in particular restaurants. Yelp allows the kind of cuisine to be declared for each restaurant. Unfortunately, because restaurant entries in the Yelp database are often generated not by the owners but by the users writing the reviews, there is little information about the kind of cuisine, especially for restaurants offering fusion cuisines.

In this paper, we address the problem of identifying restaurants that offer new kinds of cuisine by using their Yelp reviews. These cuisines can be completely new or fusion cuisines. Discriminating between main cuisines and fusion cuisines is very difficult because fusion cuisines resemble the main ones even though they are conceptually different. We propose 4Phase, a semi-supervised procedure that trains a Long Short-Term Memory (LSTM) network using only the text reviews of restaurants providing main cuisines. The trained LSTM is ultimately used as a feature generator in combination with a standard novelty detection model (e.g., Gaussian Mixture Models). We perform experiments on Yelp to separate restaurants providing main cuisines from those providing completely new or fusion cuisines. In these experiments, our 4Phase procedure outperforms all the baselines (term frequency, Doc2Vec, autoencoder LSTM, etc.) and reaches 0.91 for both AUROC and MAP.
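The final stage described in the abstract (features from a model trained only on main cuisines, scored by a novelty detector, evaluated with AUROC) can be illustrated with a minimal sketch. This is not the authors' 4Phase procedure: the feature vectors are synthetic stand-ins for LSTM outputs, a single diagonal Gaussian is swapped in for the Gaussian Mixture Model, and all function names are hypothetical.

```python
import math
import random


def fit_diag_gaussian(rows):
    """Estimate per-dimension mean and variance from training vectors."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    # A small constant keeps the variance strictly positive.
    var = [sum((r[j] - mean[j]) ** 2 for r in rows) / n + 1e-6 for j in range(d)]
    return mean, var


def novelty_score(x, mean, var):
    """Negative log-likelihood under the fitted Gaussian; higher means more novel."""
    return sum(
        0.5 * math.log(2 * math.pi * v) + (xi - m) ** 2 / (2 * v)
        for xi, m, v in zip(x, mean, var)
    )


def auroc(scores, labels):
    """Chance that a random novel item outscores a random known item (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
# Synthetic stand-ins for feature vectors (assumption: 4 dimensions, hypothetical data).
train_main = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]
test_main = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(50)]
test_novel = [[rng.gauss(3, 1) for _ in range(4)] for _ in range(50)]

# Fit only on "main cuisine" examples, then score a mixed test set.
mean, var = fit_diag_gaussian(train_main)
test_items = test_main + test_novel
labels = [0] * len(test_main) + [1] * len(test_novel)
scores = [novelty_score(x, mean, var) for x in test_items]
result = auroc(scores, labels)
print(f"AUROC on synthetic data: {result:.3f}")
```

The detector never sees a novel example during fitting, which mirrors the training constraint stated in the abstract; the well-separated synthetic data says nothing about performance on real Yelp reviews.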

Original language: American English
Title of host publication: Proceedings - 17th IEEE International Conference on Machine Learning and Applications, ICMLA 2018
Editors: M. Arif Wani, Mehmed Kantardzic, Moamar Sayed-Mouchaweh, Joao Gama, Edwin Lughofer
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1479-1485
Number of pages: 7
ISBN (Electronic): 9781538668047
State: Published - 2 Jul 2018
Event: 17th IEEE International Conference on Machine Learning and Applications, ICMLA 2018 - Orlando, United States
Duration: 17 Dec 2018 to 20 Dec 2018

Publication series

Name: Proceedings - 17th IEEE International Conference on Machine Learning and Applications, ICMLA 2018

Conference

Conference: 17th IEEE International Conference on Machine Learning and Applications, ICMLA 2018
Country/Territory: United States
City: Orlando
Period: 17/12/18 to 20/12/18

Keywords

  • LSTM
  • deep learning
  • novelty detection
  • recurrent neural network
  • semi unsupervised

EGS Disciplines

  • Computer Sciences
