Huawei’s KunLun AI Space Debuts FP8 Inference for DeepSeek V3.1

The recently released DeepSeek V3.1 model has drawn the attention of industry watchers for its use of the UE8M0 FP8 precision format. According to an official announcement from Huawei Computing, KunLun Technology Co., Ltd., a company focused on AI and computing solutions, has developed a "soft FP8" solution for the Ascend AI platform, written in the Ascend C operator programming language.
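
For readers unfamiliar with the name, UE8M0 describes an unsigned 8-bit value with eight exponent bits and no mantissa bits, so it can only represent powers of two. The sketch below decodes and encodes such a byte. It assumes the E8M0 convention from the OCP Microscaling specification (bias of 127, 0xFF reserved for NaN); the function names are illustrative and are not taken from the announcement.

```python
import math

def decode_ue8m0(byte: int) -> float:
    """Decode one UE8M0 byte: unsigned, 8 exponent bits, 0 mantissa bits."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte")
    if byte == 0xFF:               # assumed NaN encoding (OCP MX E8M0 convention)
        return math.nan
    return 2.0 ** (byte - 127)     # assumed bias of 127, so 127 encodes 1.0

def encode_ue8m0(scale: float) -> int:
    """Round a positive, finite scale up to the next power of two and encode it."""
    if not (scale > 0 and math.isfinite(scale)):
        raise ValueError("scale must be positive and finite")
    mantissa, exponent = math.frexp(scale)   # scale == mantissa * 2**exponent, 0.5 <= mantissa < 1
    if mantissa == 0.5:                      # already an exact power of two
        exponent -= 1
    return min(max(exponent + 127, 0), 0xFE) # clamp to the representable range
```

Rounding up rather than to nearest is a design choice in this sketch: it keeps scaled values from overflowing the FP8 range, at the cost of slightly coarser resolution.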

Because FP8 stores each weight in 8 bits rather than the 16 bits used by FP16 or BF16, it roughly halves the memory needed for model weights. That eases pressure on server hardware, and according to the announcement it also delivers higher inference accuracy than standard INT8 quantization, balancing cost against output quality.
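
The memory saving can be checked with back-of-the-envelope arithmetic. The sketch below assumes a model of about 671 billion parameters, the publicly reported size of DeepSeek-V3-class models; that figure does not come from the announcement. It counts weights only and ignores scale factors, activations, and the KV cache.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in gigabytes (10**9 bytes), weights only."""
    return num_params * bits_per_weight / 8 / 1e9

# Assumption: roughly 671 billion parameters (not stated in the announcement).
PARAMS = 671e9

for name, bits in [("BF16", 16), ("FP8", 8)]:
    print(f"{name}: {weight_memory_gb(PARAMS, bits):.0f} GB")
```

Under these assumptions the weights take about 1342 GB in BF16 and about 671 GB in FP8. INT8 has the same 8-bit footprint, so the advantage claimed for FP8 over INT8 is accuracy, not size.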

Key Breakthroughs

The newly introduced soft FP8 solution offers two critical advancements:

  • Precision Retention: FP8 weights are loaded onto Ascend hardware and converted to BF16 by exact dequantization operators, which preserves computational accuracy and prepares the platform for upcoming FP8 models.
  • Scalability Across Devices: A single KunLun G8600 machine can run the full version of DeepSeek V3.1, while less powerful units such as the KunLun G5500V2 or G5580 can host models with double the parameter count and serve more concurrent requests.
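
To illustrate what FP8-to-BF16 dequantization involves, here is a minimal reference sketch in plain Python. It is not KunLun's Ascend C operator, whose details the announcement does not give. It assumes weights stored in the E4M3 layout of the OCP FP8 specification (1 sign bit, 4 exponent bits, 3 mantissa bits, bias 7) and a per-block scale that is a power of two; all function names are hypothetical.

```python
import struct

def decode_e4m3(byte: int) -> float:
    """Decode one FP8 E4M3 byte (assumed layout: 1 sign, 4 exponent, 3 mantissa bits, bias 7)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0x0F
    man = byte & 0x07
    if exp == 0x0F and man == 0x07:       # the only NaN pattern; this layout has no infinities
        return float("nan")
    if exp == 0:                          # subnormal: no implicit leading 1
        return sign * (man / 8.0) * 2.0 ** -6
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)

def round_to_bf16(x: float) -> float:
    """Round to BF16 (nearest, ties to even) and return the result as a Python float."""
    if x != x:                            # pass NaN through unchanged
        return x
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)   # rounding increment on the 16 discarded bits
    bits &= 0xFFFF0000                    # keep sign, 8 exponent bits, 7 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def dequantize_block(weights: bytes, scale_exponent: int) -> list:
    """Dequantize a block of E4M3 bytes with a shared power-of-two scale, yielding BF16 values."""
    scale = 2.0 ** scale_exponent
    return [round_to_bf16(decode_e4m3(b) * scale) for b in weights]

print(dequantize_block(bytes([0x38, 0x7E, 0xC0, 0x01]), -2))
```

Under these assumptions the conversion loses nothing: an E4M3 value has only 3 mantissa bits, BF16 has 7, and multiplying by a power of two changes only the exponent, so every finite result fits in BF16 exactly as long as it stays within the BF16 exponent range. With an arbitrary floating-point scale the product would generally need rounding.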

Technical Details

KunLun Technology’s solution is built upon three core technologies:

  • Custom FP8 Dequantization Operators: Reduce both memory and bandwidth demands.
  • Operator Full Graph Dispatching: Raises inference efficiency by 32%.
  • Compatibility with Mainstream Models: Supports a range of FP8 models without model-specific adaptation work.

KunLun AI Space's FP8 solution is now compatible with DeepSeek V3.1 and other leading FP8 models such as DeepSeek-V3/R1 and Qwen3, giving the platform a base for supporting future FP8 releases.

For businesses, the practical result is more efficient use of computing resources: advanced FP8 models can be deployed on Ascend-based servers at lower cost without giving up inference quality.

Source: ithome.com
