Using a Local LLM with the RX 9070 XT

AMD’s latest graphics card, the RX 9070 XT, delivers gaming performance that can go toe-to-toe with NVIDIA’s and is drawing a lot of attention for its value for money.

But when considering local LLM usage, is the RX 9070 XT really a good choice?

Many people say that for peace of mind you should go with NVIDIA, but the comparable NVIDIA model in terms of performance and VRAM, the RTX 4080, is about AUD 500 more expensive than the RX 9070 XT. The RX 9070 XT itself already costs around AUD 1,500, so it’s by no means cheap, but NVIDIA’s pricing feels excessively inflated in terms of its raw performance.

I wanted to get the RX 7900 XTX with 24GB of VRAM, but it wasn’t easy to find one in Australia. It feels like GPUs disappear from the market as soon as they become available.

Was choosing the RX 9070 XT a bold but correct decision, betting on the continued expansion of AMD’s ecosystem in the local LLM space?

For now, since LM Studio supports the RX 9070 XT, I can use fast, GPU-accelerated, text-based AI locally.
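Beyond the chat UI, LM Studio can also expose an OpenAI-compatible local server (by default at http://localhost:1234), which makes it easy to script against the local model. A minimal sketch, assuming the server is running and a Gemma 3 model is loaded; the model identifier string is an assumption and should match whatever LM Studio actually reports:

```python
import json
import urllib.request

# LM Studio's local server speaks the OpenAI chat-completions protocol.
# Default address; adjust if you changed the port in LM Studio's settings.
API_URL = "http://localhost:1234/v1/chat/completions"

def build_payload(prompt: str, model: str = "gemma-3-12b-it") -> dict:
    """Assemble an OpenAI-style chat request for the local server.
    The model name here is a placeholder; use the one LM Studio shows."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
        "max_tokens": 256,
    }

def ask(prompt: str) -> str:
    """Send the prompt to the local server and return the reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, `ask("Say hello in Korean.")` returns the model’s reply as a plain string, so prompt-engineering iterations can be automated from a script instead of retyped in the UI.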

With 16GB of VRAM, it falls short of the commonly recommended 24GB, but it runs the Gemma 3 12B q6 model (10.51GB) at quite reasonable speeds: around 24 tokens per second.
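As a rough sanity check on what those numbers mean in practice, here is a back-of-the-envelope sketch. The model size and token rate are the figures quoted above; the ~2GB overhead for KV cache, activations, and the desktop is an assumed placeholder, not a measurement:

```python
def generation_time_s(n_tokens: int, tokens_per_s: float = 24.0) -> float:
    """Wall-clock time to generate n_tokens at a steady decode rate."""
    return n_tokens / tokens_per_s

def vram_headroom_gb(vram_gb: float = 16.0, model_gb: float = 10.51,
                     overhead_gb: float = 2.0) -> float:
    """VRAM left after the model weights and an assumed ~2GB for
    KV cache / activations / display buffers (rough placeholder)."""
    return vram_gb - model_gb - overhead_gb

# A 500-token answer takes roughly 21 seconds at 24 tok/s.
print(round(generation_time_s(500), 1))   # 20.8
print(round(vram_headroom_gb(), 2))       # 3.49
```

A few gigabytes of headroom is why the 12B quant is comfortable here while the 27B model (which would not fit its weights in VRAM at a similar quantization) spills to system RAM and slows down.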

GPU offloading sits at 47 of 48 layers, meaning the model runs almost entirely on the GPU and relies only minimally on the CPU, which allows for fairly fast performance.
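The 47/48 figure is the number of model layers placed in VRAM rather than system RAM. A hypothetical helper for reasoning about that setting, treating all layers as equally sized (real layers are not, so this is only a first approximation, and the VRAM budget here is illustrative):

```python
def layers_that_fit(total_layers: int, model_gb: float,
                    vram_budget_gb: float) -> int:
    """Estimate how many equally-sized layers fit in a VRAM budget.
    Real layer sizes vary; this is a rough first approximation."""
    per_layer_gb = model_gb / total_layers
    return min(total_layers, int(vram_budget_gb / per_layer_gb))

# Illustrative numbers: a 10.51GB model split into 48 layers, with
# ~10.3GB of VRAM left for weights after cache and desktop overhead.
print(layers_that_fit(48, 10.51, 10.3))   # 47
```

This is the kind of arithmetic LM Studio’s offload slider is doing implicitly: one layer short of full offload keeps everything else fitting, and that last CPU-resident layer costs relatively little.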

However, the next step up—the Gemma 3 27B model—is significantly slower, to the point where it’s difficult to use if you need to repeatedly iterate on prompt engineering.

Fortunately, even with the Gemma 3 12B q6 model, it’s possible to generate fairly high-quality Korean text. English text generation is expected to be even better.

As of April 2, 2025, Ollama does not yet natively support the RX 9070 XT. Once ROCm, AMD’s GPU compute platform, officially supports the RX 9070 XT, Ollama is expected to follow.

Personally, I don’t find LM Studio any less convenient than Ollama.

While running an LLM, GPU usage sits at around 90%, CPU usage (Ryzen 9900X) at about 50%, and GPU temperature stays below 50°C.

So far, I’m fairly satisfied.
