Robust Data-centric Graph Structure Learning for Text Classification

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Over the past decades, text classification has evolved remarkably across diverse domains. Despite these advances, most existing model-centric methods do not generalize well to class-imbalanced datasets that contain highly similar textual information. Instead of developing new model architectures, data-centric approaches improve performance by manipulating the data structure. In this study, we investigate robust data-centric approaches for text classification on our collected dataset, the metadata of survey papers about Large Language Models (LLMs). In the experiments, we explore four paradigms and observe that leveraging arXiv’s co-category information on graphs can classify the text data more robustly than the other three paradigms: conventional machine-learning algorithms, fine-tuning of pre-trained language models, and zero-shot / few-shot classification using LLMs.
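
The abstract does not spell out how the co-category graph is built. As a minimal sketch only, assuming that "co-category" means two papers are linked when they share at least one arXiv category, such a graph could be assembled as follows. The function name `build_co_category_graph`, the paper IDs, and the toy category lists are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def build_co_category_graph(paper_categories):
    """Return an undirected adjacency map linking papers that share a category.

    Assumption: an edge exists between two papers if they have at least one
    arXiv category in common. The paper's actual construction may differ.
    """
    # Group paper IDs by each category they are listed under.
    by_category = defaultdict(set)
    for paper_id, categories in paper_categories.items():
        for category in categories:
            by_category[category].add(paper_id)

    # Every paper is a node, even if it ends up with no neighbours.
    adjacency = {paper_id: set() for paper_id in paper_categories}
    for members in by_category.values():
        for a, b in combinations(sorted(members), 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


# Hypothetical toy metadata: paper ID -> arXiv categories.
papers = {
    "p1": ["cs.CL", "cs.AI"],
    "p2": ["cs.CL"],
    "p3": ["cs.LG", "cs.AI"],
    "p4": ["cs.CV"],
}
graph = build_co_category_graph(papers)
```

In this toy example, `p1` is linked to `p2` (shared `cs.CL`) and to `p3` (shared `cs.AI`), while `p4` stays isolated. A graph of this kind could then serve as the structure over which a graph neural network propagates text features, which is one plausible reading of the data-centric paradigm described above.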

Original language: English
Title of host publication: WWW 2024 Companion - Companion Proceedings of the ACM Web Conference
Pages: 1486-1495
Number of pages: 10
ISBN (Electronic): 9798400701726
State: Published - 13 May 2024
Event: 33rd ACM Web Conference, WWW 2024 - Singapore, Singapore
Duration: 13 May 2024 to 17 May 2024

Publication series

Name: WWW 2024 Companion - Companion Proceedings of the ACM Web Conference

Conference

Conference: 33rd ACM Web Conference, WWW 2024
Country/Territory: Singapore
City: Singapore
Period: 13/05/24 to 17/05/24

Keywords

  • Data-centric AI
  • Graph neural networks
  • Text classification
