Aug 08, 2025 20:00:00

What were the results of the attempt to 'build an AI cluster by combining four small desktop PCs'?

To run large-scale AI models quickly, large amounts of memory and high-performance processors are required. Technology YouTuber Jeff Geerling built a cluster of four machines equipped with AMD's Ryzen AI Max+ 395 processors and published the results of his tests to see whether the cluster could run AI workloads smoothly.

I clustered four Framework Mainboards to test huge LLMs | Jeff Geerling
https://www.jeffgeerling.com/blog/2025/i-clustered-four-framework-mainboards-test-huge-llms

Benchmark Framework Desktop Mainboard and 4-node cluster · Issue #21 · geerlingguy/ollama-benchmark
https://github.com/geerlingguy/ollama-benchmark/issues/21

Geerling worked with hardware manufacturer Framework to obtain parts for the 'Framework Desktop,' a compact yet powerful PC based on the Ryzen AI Max series.

The notebook PC 'Framework Laptop 13' that you can assemble by selecting parts yourself is compatible with the Ryzen AI 300 series & Framework's first small desktop PC and 12-inch notebook PC also appear - GIGAZINE

Geerling sourced a Framework Desktop mainboard, power supply, and fan and assembled a system powered by the top-of-the-line Ryzen AI Max+ 395 processor with 128GB of RAM.

Four machines with the same configuration were prepared.

The assembled machines were installed in a rack mount called 'GeeekPi 8U' to form a cluster, which can then be operated as a single machine with a combined 512GB of RAM (4 × 128GB).

The completed cluster is quite small and fits neatly inside a large rack.

Hehe framework mini rack inside big rack pic.twitter.com/xOgoRZQsGY
— Jeff Geerling (@geerlingguy) August 7, 2025

The results of the Top500 Benchmark, a benchmark developed for clusters, are as follows. A single Framework Desktop achieved 0.31 TFLOPS, which was lower than a Mac Studio with an M4 Max, but the four-machine configuration improved processing performance to 1.18 TFLOPS.
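As a rough illustration of how close that result is to ideal linear scaling, the two figures quoted above can be compared directly. This is a minimal sketch using only the rounded numbers reported in this article; the helper name `scaling_efficiency` is hypothetical and is not part of Geerling's benchmark tooling.

```python
def scaling_efficiency(single_tflops: float, cluster_tflops: float, nodes: int) -> float:
    """Return the fraction of ideal linear scaling the cluster achieved."""
    ideal_tflops = single_tflops * nodes  # perfect scaling: N nodes give N times the throughput
    return cluster_tflops / ideal_tflops


# Rounded figures quoted in the article (assumed to be measured under comparable conditions).
single = 0.31   # TFLOPS, one Framework Desktop
cluster = 1.18  # TFLOPS, four-machine cluster
nodes = 4

speedup = cluster / single
efficiency = scaling_efficiency(single, cluster, nodes)

print(f"speedup: {speedup:.2f}x, efficiency: {efficiency:.1%}")
# prints "speedup: 3.81x, efficiency: 95.2%"
```

On this benchmark the cluster scales almost linearly, which is why the weaker results in actual AI processing stand out.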

However, in actual AI processing, many problems prevented the cluster from working properly, and the performance improvement suggested by the benchmark results did not materialize. Geerling pointed out that the expected performance could not be achieved for reasons such as 'network access between machines is too slow compared to memory' and 'various tools are slow to respond.'

The graph compares the cost (bar graph) and tokens processed per second (line graph) for the following configurations: Framework Desktop (4 units), AmpereOne 192-core, Mac Studio with M3 Ultra, and Mac mini with M4 Pro (8 units). The Framework Desktop cluster does not perform as expected, while the Mac Studio with M3 Ultra offers outstanding value for money.

Detailed benchmark results can be found at the link below.

Benchmark Framework Desktop Mainboard and 4-node cluster · Issue #21 · geerlingguy/ollama-benchmark
https://github.com/geerlingguy/ollama-benchmark/issues/21

Aug 08, 2025 20:00:00 in AI, Hardware, Posted by log1o_hf