Panmnesia offers full-stack CXL-based HPC server system

Author: EIS Release Date: Jun 25, 2025


Panmnesia, the Korean CXL specialist, has unveiled a full-stack HPC offering based on servers featuring its CXL 3.x Switches and CXL 3.x Intellectual Property (IP).

This integrated hardware-software system aims to enable flexible resource scaling and to enhance parallel computing performance through CXL-based memory sharing.

 


Figure 2. Panmnesia’s HPC solution based on a server featuring CXL 3.x Switches.


■ Hardware Stack: Cost-Optimized Composable Server Architecture


Panmnesia’s hardware stack addresses the cost-efficiency limitations of conventional HPC systems. Traditionally, HPC servers are provisioned with fixed ratios of compute and memory resources. Therefore, when more memory capacity is required, users are often forced to add entire servers—along with redundant compute resources—resulting in inflated costs.

In contrast, Panmnesia’s composable architecture enables independent scaling of compute and memory resources. The CXL 3.x Composable Server exhibited at ISC consists of separate compute nodes (featuring CXL-enabled CPUs or GPUs) and memory nodes (featuring CXL-enabled memory expanders), all interconnected via CXL Switches.

This configuration allows users to selectively add only the nodes equipped with the resources they need, thereby reducing unnecessary expenditure (Figure 3). For example, when memory capacity is insufficient, users can add only memory nodes—without additional compute resources—to meet the required memory demand.
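The cost argument above can be sketched with a back-of-the-envelope comparison. The prices below are purely illustrative assumptions, not vendor figures:

```python
# Illustrative cost comparison (all prices are made-up assumptions):
# adding memory in a fixed-ratio server fleet means buying whole
# servers, while a composable system adds memory nodes alone.
FULL_SERVER_COST = 20_000   # assumed price: bundled compute + memory, 1 TB
MEMORY_NODE_COST = 6_000    # assumed price: memory expander node, 1 TB

def cost_to_add_memory(extra_tb, unit_cost):
    """Cost of adding extra_tb of memory, one 1 TB unit at a time."""
    return extra_tb * unit_cost

extra = 4  # need 4 TB more memory; compute is already sufficient
print(cost_to_add_memory(extra, FULL_SERVER_COST))   # fixed-ratio: 80000
print(cost_to_add_memory(extra, MEMORY_NODE_COST))   # composable: 24000
```

Under these assumed prices, the composable configuration meets the same memory demand at a fraction of the cost, because no redundant compute is purchased.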

 


Figure 3. CXL 3.x composable servers configured in various forms based on resource demand.

At the core of this architecture are Panmnesia’s two main products: the CXL Switch and CXL IP.
First, Panmnesia’s CXL Switch connects multiple nodes into a unified system. It stands out for supporting flexible configurations in terms of device types, system scale, and connection topology.

It interconnects not only CPUs and memory, but also GPUs (graphics processing units), NPUs (neural processing units), and various other accelerators.

It further scales to multiple server nodes and racks by supporting advanced features of the latest CXL standards, such as multi-level switching and port-based routing.
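The idea behind multi-level, port-based routing can be sketched abstractly: each switch forwards a request out the port whose subtree contains the target device. This is an illustrative model only, not Panmnesia's implementation, and the device names are hypothetical:

```python
# Illustrative sketch of port-based routing through a two-level switch
# tree (not Panmnesia's implementation; device names are hypothetical).
class Switch:
    def __init__(self, name, ports):
        self.name = name
        self.ports = ports  # port number -> child Switch or device name

    def route(self, target, path=None):
        """Return the list of (switch, port) hops leading to target."""
        path = path or []
        for port, dest in self.ports.items():
            if dest == target:
                return path + [(self.name, port)]
            if isinstance(dest, Switch):
                sub = dest.route(target, path + [(self.name, port)])
                if sub:
                    return sub
        return None  # target not reachable from this switch

leaf0 = Switch("leaf0", {0: "cpu0", 1: "mem0"})
leaf1 = Switch("leaf1", {0: "gpu0", 1: "mem1"})
root  = Switch("root",  {0: leaf0, 1: leaf1})
print(root.route("mem1"))  # [('root', 1), ('leaf1', 1)]
```

Adding a device or a whole rack only adds entries to a switch's port table; the routing logic is unchanged, which is what makes the fabric scalable.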

Panmnesia’s CXL IP enables seamless memory sharing and access across these interconnected devices. Because this low-latency IP, which achieved the world’s first double-digit-nanosecond latency, performs memory management operations autonomously, it minimizes performance overhead.


Figure 4. CXL Switch and CXL IP included in Panmnesia’s showcased solution.

■ Software Stack: Accelerating Parallel Computing

Complementing its hardware, Panmnesia also introduced a software stack designed to accelerate parallel computing applications. In traditional HPC systems, data is distributed across the memory attached to each server node, requiring frequent inter-node communication.

This often leads to performance degradation due to redundant data copying and format transformation.

Panmnesia’s approach addresses these inefficiencies by replacing network-based communication with CXL-based memory sharing.

Instead of being distributed across the local memory of individual server nodes, data is stored in a unified memory pool—composed of multiple memory nodes—that all compute nodes can access via CXL. With Panmnesia’s low-latency CXL IP autonomously handling memory management operations, processing units (e.g., CPUs, GPUs) can access the shared memory using standard load/store instructions.

This significantly reduces the latency and overhead typically caused by data copying and transformation during network-based communication.
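The contrast can be sketched with ordinary OS shared memory standing in for the CXL-attached pool. This is an illustrative analogy, not Panmnesia's software stack: the point is that once memory is exposed as part of the address space, workers exchange data with plain loads and stores rather than send/receive calls:

```python
# Sketch (illustrative analogy, not Panmnesia's stack): a shared pool
# that multiple workers access directly, standing in for CXL-attached
# memory that every compute node sees as ordinary addressable memory.
from multiprocessing.shared_memory import SharedMemory

def attach(name):
    """Each worker attaches to the same pool by name, much as compute
    nodes would attach to memory exposed through a CXL switch."""
    return SharedMemory(name=name)

pool = SharedMemory(create=True, size=4096)

writer = attach(pool.name)          # "producer" node
writer.buf[:8] = bytes(range(8))    # plain stores: no send()/recv()

reader = attach(pool.name)          # "consumer" node
data = list(reader.buf[:8])         # plain loads: no deserialization
print(data)  # [0, 1, 2, 3, 4, 5, 6, 7]

writer.close(); reader.close()
pool.close(); pool.unlink()
```

No data was copied between nodes and no wire format was involved, which is the source of the latency savings described above.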


Figure 5. Overview of Panmnesia’s approach to accelerate parallel computing.

Panmnesia implemented this software stack on a Linux-based system and directly executed a fluid dynamics simulation—a representative parallel computing workload—to demonstrate its effectiveness. The demonstration resulted in a 44% reduction in execution time.
