During its AI Day conference last April, Qualcomm unveiled the Cloud AI 100, a chipset purpose-built for machine learning inferencing and edge computing workloads. Details were scarce at press time, evidently owing to a lengthy production schedule. But today Qualcomm announced a release date for the Cloud AI 100 — the first half of 2021, after sampling this fall — and shared details about the chipset’s technical specs.
Qualcomm expects the Cloud AI 100 to give it a leg up in an AI chipset market projected to reach $66.3 billion by 2025, according to a 2018 Tractica report. Last year, SVP of product management Keith Kressin said he anticipates that inference, the process by which a trained AI model produces results from new data, will become a “significant-sized” market for silicon, growing 10 times from 2018 to 2025. With the Cloud AI 100, Qualcomm hopes to tackle specific markets, such as datacenters, 5G infrastructure, and advanced driver-assistance systems.
The Cloud AI 100 comes in three flavors (DM.2e, DM.2, and PCIe Gen 3/4) corresponding to performance range. At the low end, the Cloud AI 100 Dual M.2e and Dual M.2 models can hit between 50 TOPS (50 trillion operations per second) and 200 TOPS, while the PCIe model achieves up to 400 TOPS, according to Qualcomm. All three ship with up to 16 AI accelerator cores paired with up to 144MB of on-chip RAM (9MB per core) and 32GB of LPDDR4x on-card DRAM. The company claims the chipset outperforms the competition by 106 times when measured by inferences per second per watt, using the ResNet-50 model. The Cloud AI 100 Dual M.2e and Dual M.2 attain 10,000 to 15,000 inferences per second at under 50 watts, and the PCIe hovers around 25,000 inferences per second at 50 to 100 watts.
Qualcomm says the Cloud AI 100, which is manufactured on a 7-nanometer process, shouldn’t exceed a power draw of 75 watts. Here’s the breakdown for each card:
- Dual M.2e: 15 watts, 70 TOPS
- Dual M.2: 25 watts, 200 TOPS
- PCIe: 75 watts, 400 TOPS
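To compare the three cards on efficiency rather than raw throughput, the listed figures can be divided out as TOPS per watt. The short Python sketch below does only that arithmetic. It assumes the wattage and TOPS values in the list are peak numbers that apply at the same time, which Qualcomm has not stated, and the names `CARDS`, `tops_per_watt`, and `efficiency_table` are illustrative, not part of any Qualcomm tool.

```python
# Rough efficiency comparison using the per-card figures listed above.
# Assumption: each card's TOPS figure is reached at its listed wattage.
CARDS = {
    "Dual M.2e": {"watts": 15, "tops": 70},
    "Dual M.2": {"watts": 25, "tops": 200},
    "PCIe": {"watts": 75, "tops": 400},
}


def tops_per_watt(tops: float, watts: float) -> float:
    """Return throughput per watt for a single card."""
    return tops / watts


def efficiency_table(cards: dict = CARDS) -> dict:
    """Map each card name to its TOPS per watt, rounded to two decimals."""
    return {
        name: round(tops_per_watt(spec["tops"], spec["watts"]), 2)
        for name, spec in cards.items()
    }


if __name__ == "__main__":
    for name, value in efficiency_table().items():
        print(f"{name}: {value} TOPS per watt")
```

By this back-of-the-envelope measure, the Dual M.2 comes out highest of the three. Real-world efficiency depends on the model, batch size, and precision being run, so these ratios are only a first-pass comparison.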
The first Cloud AI 100-powered device — the Cloud Edge AI Development Kit — is scheduled to arrive in October. It looks similar to a wireless router, with a black shell and an antenna held up by a plastic stand. But it runs CentOS 8.0 and packs a Dual M.2 Cloud AI 100, a Qualcomm Snapdragon 865 system-on-chip, a Snapdragon X55 5G modem, and an NVMe SSD.
The Cloud AI 100 and products it powers integrate a full range of developer tools, including compilers, debuggers, profilers, monitors, servicing, chip debuggers, and quantizers. They also support runtimes like ONNX, Glow, and XLA, as well as machine learning frameworks such as TensorFlow, PyTorch, Keras, MXNet, Baidu’s PaddlePaddle, and Microsoft’s Cognitive Toolkit for applications like computer vision, speech recognition, and language translation.