Qwen 3.6-27B language model on RTX 2080 Ti 22GiB
Qwen 3.6-27B is a state-of-the-art large language model suitable for local deployment. With techniques such as quantization, MTP (multi-token prediction), and TurboQuant, it can be deployed and run smoothly on an RTX 2080 Ti with 22GiB of VRAM.
The following steps have been successfully tested on the latest version of CachyOS (installed using the 260426 ISO).
Installing dependencies
sudo pacman -Syu cmake cuda nodejs npm
paru -S python-modelscope
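Configure an npm mirror if necessary. Note that the second command disables TLS certificate verification, so use it only if the mirror fails certificate checks: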
npm config set registry https://mirrors.cloud.tencent.com/npm/
npm config set strict-ssl false
Rebooting, or logging out and back in, is recommended so that the updated environment variables take effect.
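After logging back in, a quick check like the one below can confirm that the installed tools are visible on PATH. This is only a sketch: `check_tools` is a helper name made up for this example, and the `modelscope` command name is an assumption about what the python-modelscope package installs.

```shell
#!/usr/bin/env bash
# Report which of the given commands are missing from PATH.
# Returns 0 if all are found, 1 otherwise.
# (check_tools is a hypothetical helper, not part of any package.)
check_tools() {
    local tool missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok:      $tool -> $(command -v "$tool")"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Tools expected from the pacman/paru commands above.
# "modelscope" as the CLI name is an assumption.
check_tools cmake nvcc node npm modelscope \
    || echo "Some tools are not on PATH yet; try logging in again."
```

If `nvcc` is reported missing right after installation, that usually means the current session has not yet picked up the new environment variables.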