Model HQ

Private On-Device AI on Snapdragon X Elite: Download + Test NPU Models in Model HQ

LLMWare

AI & ML Tutorials

In this video, I walk through Model HQ’s model catalogue and capabilities, this time running fully on a Qualcomm Snapdragon X Elite device. Model HQ is a private, local, no-code platform for models, chat, agents, and RAG. Once models are downloaded, no Wi-Fi is needed: everything stays secure and runs on your machine.

What you’ll see in this demo:

✅ How to browse the 162 models available on Snapdragon X Elite
✅ How to identify NPU-optimized models (ONNX QNN) vs CPU models (ONNX / GGUF)
✅ Popular model families included: Phi, Qwen, Llama, DeepSeek, and more
✅ LLMWare’s specialist models: BLING (RAG) + SLIM (agent function-calling) + Dragon
✅ How to download models with one click and start using them immediately
✅ 3 ways to test any model inside Model HQ:
   - Sandbox: a quick single-prompt test that also reports inference time
   - Standard Test: the LLMWare benchmark, where each question comes with a context passage and a gold answer
   - Custom Test: bring your own dataset to evaluate model performance

Why this matters

If you’re trying to decide which model to run on Snapdragon X Elite, this video shows how to compare quality and speed, especially for RAG-style questions where accuracy matters and hallucinations are unacceptable. Everything you see here runs on-device, using the Snapdragon NPU for ONNX QNN models, so you get fast inference while keeping your data private.

🔒 Model HQ in action: private on-device AI made easy.

Subscribe for more demos on Qualcomm Snapdragon, RAG workflows, agents, and model testing.

#ModelHQ #LLMWare #Qualcomm #SnapdragonXElite #NPU #ONNX #QNN #OnDeviceAI #PrivateAI #SmallLanguageModels #RAG #NoCodeAI #SLMs
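
Model HQ itself is no-code, but it can help to see what a test with a context passage and a gold answer (as in the Standard Test and Custom Test modes described above) is measuring. The Python sketch below is an illustration only, not Model HQ’s or LLMWare’s implementation: `TestCase`, `run_test`, `normalize`, and `stub_model` are hypothetical names, the scoring rule (the gold answer must appear in the model’s answer) is an assumption, and `stub_model` is a placeholder for whatever local model you would actually call.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class TestCase:
    context: str      # passage the model must answer from (RAG-style)
    question: str
    gold_answer: str  # expected answer used for scoring


def normalize(text: str) -> str:
    # Lowercase, drop a trailing period, and collapse whitespace.
    return " ".join(text.lower().strip().rstrip(".").split())


def run_test(model: Callable[[str, str], str], cases: list[TestCase]) -> dict:
    # Score each case and time each model call.
    if not cases:
        raise ValueError("at least one test case is required")
    correct = 0
    total_seconds = 0.0
    for case in cases:
        start = time.perf_counter()
        answer = model(case.context, case.question)
        total_seconds += time.perf_counter() - start
        # Assumed scoring rule: gold answer must appear in the model's answer.
        if normalize(case.gold_answer) in normalize(answer):
            correct += 1
    return {
        "total": len(cases),
        "accuracy": correct / len(cases),
        "avg_seconds": total_seconds / len(cases),
    }


def stub_model(context: str, question: str) -> str:
    # Placeholder, not a real model: returns the first sentence of the context.
    return context.split(".")[0]


passage = "The invoice total is $450. It is due on March 1."
cases = [
    TestCase(passage, "What is the invoice total?", "$450"),
    TestCase(passage, "When is the invoice due?", "March 1"),
]
report = run_test(stub_model, cases)
print(report)
```

With the stub, only the first question is answered correctly, so the report shows an accuracy of 0.5. Swapping in different models behind the same function signature is one way to compare quality and speed side by side on your own dataset.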