TY - JOUR
T1 - ScaWL
T2 - Scaling k-WL (Weisfeiler-Lehman) Algorithms in Memory and Performance on Shared and Distributed-Memory Systems
AU - Soss, Coby
AU - Sukumaran Rajam, Aravind
AU - Layne, Janet
AU - Serra, Edoardo
AU - Halappanavar, Mahantesh
AU - Gebremedhin, Assefaw H.
N1 - Publisher Copyright:
© 2025 Copyright held by the owner/author(s).
PY - 2025/3/21
Y1 - 2025/3/21
N2 - The k-dimensional Weisfeiler-Lehman (k-WL) algorithm - developed as an efficient heuristic for testing if two graphs are isomorphic - is a fundamental kernel for node embedding in the emerging field of graph neural networks. Unfortunately, the k-WL algorithm has exponential storage requirements, limiting the size of graphs that can be handled. This work presents a novel k-WL scheme with a storage requirement orders of magnitude lower while maintaining the same accuracy as the original k-WL algorithm. Due to the reduced storage requirement, our scheme allows for processing much bigger graphs than previously possible on a single compute node. For even bigger graphs, we provide the first distributed-memory implementation. Our k-WL scheme also has significantly reduced communication volume and offers high scalability. Our experimental results demonstrate that our approach is significantly faster and has superior scalability compared to five other implementations employing state-of-the-art techniques.
AB - The k-dimensional Weisfeiler-Lehman (k-WL) algorithm - developed as an efficient heuristic for testing if two graphs are isomorphic - is a fundamental kernel for node embedding in the emerging field of graph neural networks. Unfortunately, the k-WL algorithm has exponential storage requirements, limiting the size of graphs that can be handled. This work presents a novel k-WL scheme with a storage requirement orders of magnitude lower while maintaining the same accuracy as the original k-WL algorithm. Due to the reduced storage requirement, our scheme allows for processing much bigger graphs than previously possible on a single compute node. For even bigger graphs, we provide the first distributed-memory implementation. Our k-WL scheme also has significantly reduced communication volume and offers high scalability. Our experimental results demonstrate that our approach is significantly faster and has superior scalability compared to five other implementations employing state-of-the-art techniques.
KW - Optimization
KW - k-WL algorithm
KW - parallel
UR - https://www.scopus.com/pages/publications/105003631088
U2 - 10.1145/3715124
DO - 10.1145/3715124
M3 - Article
AN - SCOPUS:105003631088
SN - 1544-3566
VL - 22
JO - ACM Transactions on Architecture and Code Optimization
JF - ACM Transactions on Architecture and Code Optimization
IS - 1
M1 - 45
ER -