Real-Time Understanding of Complex Discriminative Scene Descriptions

Ramesh Manuvinakurike, Casey Kennington, David DeVault, David Schlangen

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

5 Scopus citations

Abstract

Real-world scenes typically have complex structure, and utterances about them consequently do as well. We devise and evaluate a model that processes descriptions of complex configurations of geometric shapes and can identify the described scenes among a set of candidates, including similar distractors. The model works with raw images of scenes, and by design can work word-by-word incrementally. Hence, it can be used in highly responsive interactive and situated settings. Using a corpus of descriptions from game-play between human subjects (who found this to be a challenging task), we show that reconstruction of description structure in our system contributes to task success and supports the performance of the word-based model of grounded semantics that we use.
Original language: American English
Title of host publication: SIGDIAL 2016: 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Proceedings of the Conference
Pages: 232-241
Number of pages: 10
ISBN (Electronic): 9781945626234
State: Published - 2016
Event: 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, SIGDIAL 2016, United States
Duration: 13 Sep 2016 - 15 Sep 2016

Publication series

Name: SIGDIAL 2016 - 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Proceedings of the Conference

Conference

Conference: 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, SIGDIAL 2016
Country/Territory: United States
Period: 13/09/16 - 15/09/16

EGS Disciplines

  • Computer Sciences
