School of Computing Seminar

Fridays 2:30-3:30 pm
McAdams 114

October 4, 2019

Dorian Arnold
Emory University


Emerging HPC Networking Technologies and Adaptive Parallel Programming Environments

Abstract:

This talk highlights two recent projects from our laboratory. In part one, we explore the (sometimes unexpected) impacts that emerging HPC network technologies can have on application runtime and system power performance. Our Remote Direct Memory Access (RDMA) application study shows that “Network-induced Memory Contention” (NiMC) can degrade performance by up to 3x at scales as small as 8K processes, even in applications previously shown to be performance-resilient to noise. We then offer deeper insight into the root cause of NiMC and some mitigation strategies.

In part two, we present a study of multithreaded message passing (MT-MP) programs developed by coupling independently developed and maintained libraries with differing preferred degrees of threading (also known as thread-level heterogeneity). The challenge is to execute such programs in a manner that maximizes both application performance and resource utilization. We explore new ways to structure, execute and analyze coupled MT-MP programs, showing that appropriately reconfigurable execution environments can yield significant performance improvements. Our approach uses programmable facilities with modest overheads to dynamically reconfigure runtime environments for compute phases with differing threading factors and affinities. For a majority of our test workloads, our performance results show speedups greater than 50% over the static, under-subscribed baseline.

Bio:

Dorian Arnold is a tenured associate professor of Computer Science at Emory University. From 2009 to 2017 he was an assistant and associate professor at The University of New Mexico. He studies distributed systems, fault-tolerance, online (streaming) data analysis, and software tools for high-performance computing environments. In particular, he is interested in the performance, scalability and reliability issues of extreme scale environments comprising many thousands or even millions of components. He has 60+ peer-reviewed publications with 1800+ citations. His research projects have won two Top 100 R&D awards. In 2017, he was named an ACM Distinguished Speaker.

Arnold has held leadership roles at major HPC venues, including serving as chair of many technical components and as a steering committee member for the SC Conference, and as an Associate Editor of the IEEE Transactions on Parallel and Distributed Systems. He is committed to diversity and inclusion and served as General Chair for the 2017 ACM Richard Tapia Celebration of Diversity and the 2016 CRA HPC Pipeline Workshop.

Arnold earned Ph.D. and M.S. degrees in Computer Science from the Universities of Wisconsin and Tennessee, respectively. He earned a B.S. in Math and Computer Science from Regis University (Denver, CO) and an A.S. in Physics, Chemistry and Math from St. John's College (Belize).

School of Computing | Clemson University