Rogue Scholar Posts

Published in JSC Accelerating Devices Lab

Since early 2022, the Accelerating Devices Lab has been involved in the OpenGPT-X project. OpenGPT-X trains large language models to enable new data-driven business solutions and specifically address European needs. As of January 2025, the project has published its main results and is set to wrap up in early 2025.

Published in JSC Accelerating Devices Lab

From November 17th to 22nd, 2024, HPC professionals and researchers gathered in Atlanta, Georgia, for the Supercomputing Conference 2024. We presented a paper at the 2024 International Workshop on Performance, Portability, and Productivity in HPC where we introduced CARAML, a reproducible AI benchmarking framework, and jpwr, a custom energy assessment module. The presentation slides are embedded at the bottom.

Published in JSC Accelerating Devices Lab

The Supercomputing Conference 2023 took place in Denver, Colorado, from November 12th to 17th. For the Women in HPC workshop, we submitted a paper on benchmarking different accelerators for AI. The paper was accepted, and I was invited to give a lightning talk presenting the work, which was spun off from our OpenGPT-X project.

Published in JSC Accelerating Devices Lab


Published in JSC Accelerating Devices Lab

Environment Setup; Enabling UCC in OpenMPI; Enabling NCCL in UCC (Team Layer Selection); All The Variables; Results; 1. Plain OpenMPI; 2. OpenMPI with UCC; 3. OpenMPI with UCC+NCCL; Scaling Plots; Average Latency; Bus Bandwidth; Comparing MPI, UCC, UCC+NCCL; Comparing UCC+NCCL, NCCL; Summary; Technical Details. This post