LLM Computation on AMD Platforms

A few days ago I saw this article on Hacker News: "Making AMD GPUs competitive for LLM inference (mlc.ai)". The original post is at "Making AMD GPUs competitive for LLM inference".

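Nvidia got into GPU computing very early. Besides the many tools it has developed in-house, community support is also very good. AMD lags quite a bit on this front, but that is also reflected in the price of its cards.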

The authors compare cards that all have 24GB of VRAM: AMD's 7900 XTX, and Nvidia's 3090 Ti and the newer 4090:

(Image: spec comparison of the three cards, from the original article)

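You can see that although the power draw for the same fp16 performance is quite a bit worse, the unit price is much lower. For hobbyists who only use it occasionally, it is actually an option worth considering.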

Their results also show that it performs reasonably well: running Llama 2 models, the price-performance ratio is quite high:

(Image: Llama 2 benchmark results, from the original article)

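It looks like the main support is on ROCm. In terms of the ratio of performance to power draw, does it actually come out ahead? (Or, to put it more conservatively, it is at about the same level.)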
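To make the two ratios discussed here concrete (price-performance and performance per watt), below is a minimal Python sketch. The `Card` class, the helper names, and every number in it are hypothetical placeholders chosen for illustration; they are not the figures from the article's tables, so plug in the measured throughput, price, and power draw from the original post before drawing any conclusion.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Card:
    """One GPU's figures. All fields are supplied by the reader."""
    name: str
    tokens_per_sec: float  # measured decoding throughput
    price_usd: float       # purchase price
    power_watts: float     # board power draw


def perf_per_dollar(card: Card) -> float:
    """Throughput obtained per dollar spent (price-performance)."""
    return card.tokens_per_sec / card.price_usd


def perf_per_watt(card: Card) -> float:
    """Throughput obtained per watt drawn."""
    return card.tokens_per_sec / card.power_watts


def relative(a: Card, b: Card, metric: Callable[[Card], float]) -> float:
    """How card a compares to card b on a metric; above 1.0 means a is ahead."""
    return metric(a) / metric(b)


# Hypothetical placeholder values, NOT taken from the article.
card_a = Card("card_a", tokens_per_sec=100.0, price_usd=1000.0, power_watts=350.0)
card_b = Card("card_b", tokens_per_sec=125.0, price_usd=1600.0, power_watts=450.0)

print(f"perf per dollar, a vs b: {relative(card_a, card_b, perf_per_dollar):.2f}")
print(f"perf per watt,   a vs b: {relative(card_a, card_b, perf_per_watt):.2f}")
```

With these placeholder inputs, a card that is slower in absolute terms can still lead on price-performance while only roughly matching on performance per watt, which is the kind of trade-off described above.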

For now AMD's cards are still catching up, and it looks like the community will be the main driving force…

