Model HQ
Private On-Device AI on Snapdragon X Elite: Download + Test NPU Models in Model HQ
LLMWare
AI & ML Tutorials
In this video, I walk through Model HQ’s model catalogue and capabilities — this time running fully on a Qualcomm Snapdragon X Elite device.
Model HQ is a private, local, no-code platform for models, chat, agents, and RAG. Once models are downloaded, no internet connection is needed: everything runs on your machine, and your data stays there.
What you’ll see in this demo:
✅ How to browse 162 available models on Snapdragon X Elite
✅ How to identify NPU-optimized models (ONNX QNN) vs CPU models (ONNX / GGUF)
✅ Popular model families included: Phi, Qwen, Llama, DeepSeek, and more
✅ LLMWare’s specialist models: BLING (RAG) + SLIM (agent function-calling) + Dragon
✅ How to download models with one click and start using them immediately
✅ 3 ways to test any model inside Model HQ:
Sandbox (quick single prompt test + inference time)
Standard Test (LLMWare benchmark with context passage + gold answer)
Custom Test (bring your own dataset to evaluate model performance)
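Model HQ runs these tests without any code, but the idea behind a context-plus-gold-answer test can be sketched in a few lines of Python. This is a minimal illustration, not Model HQ's implementation: `run_test`, `normalize`, `stub_model`, the sample case, and the substring-match scoring rule are all hypothetical stand-ins chosen for the example.

```python
import time


def normalize(text):
    """Lowercase and collapse whitespace so comparisons ignore formatting."""
    return " ".join(text.lower().split())


def run_test(model_fn, cases):
    """Run each case through model_fn, recording correctness and latency.

    Scoring rule (an assumption for this sketch): the answer counts as
    correct if the normalized gold answer appears inside it.
    """
    results = []
    for case in cases:
        prompt = f"{case['context']}\n\nQuestion: {case['question']}"
        start = time.perf_counter()
        answer = model_fn(prompt)
        elapsed = time.perf_counter() - start
        results.append({
            "question": case["question"],
            "answer": answer,
            "correct": normalize(case["gold"]) in normalize(answer),
            "seconds": elapsed,
        })
    return results


def stub_model(prompt):
    """Hypothetical placeholder; swap in a call to a real local model."""
    return "The invoice total is $1,250."


# A custom dataset is just a list of context / question / gold-answer records.
cases = [
    {
        "context": "Invoice #88 lists a total of $1,250 due on March 1.",
        "question": "What is the invoice total?",
        "gold": "$1,250",
    },
]

results = run_test(stub_model, cases)
accuracy = sum(r["correct"] for r in results) / len(results)
print(f"accuracy={accuracy:.2f}")
for r in results:
    print(f"{r['question']} -> {r['answer']} ({r['seconds']:.4f}s)")
```

Swapping `stub_model` for different models and comparing accuracy against total time is the same quality-versus-speed comparison the demo walks through in the UI.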
Why this matters
If you’re trying to decide which model to run on Snapdragon X Elite, this video shows exactly how to compare quality and speed, especially for RAG-style questions where accuracy matters and hallucinations are unacceptable.
Everything you see here is running on-device, using the Snapdragon NPU for ONNX QNN models—so you get fast inference while keeping your data private.
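Model HQ handles hardware selection for you, and the video does not describe how. For readers curious what NPU targeting can look like in a generic ONNX Runtime setup outside Model HQ, here is a small sketch. It assumes an ONNX Runtime build that includes the QNN execution provider; `choose_providers` is a hypothetical helper, and the CPU fallback ordering is an illustrative choice.

```python
def choose_providers(available):
    """Prefer the QNN (NPU) provider when present, keeping CPU as a fallback."""
    preferred = ["QNNExecutionProvider", "CPUExecutionProvider"]
    return [p for p in preferred if p in available]


try:
    # Assumption: onnxruntime is installed; without it we report no providers.
    import onnxruntime as ort
    available = ort.get_available_providers()
except ImportError:
    available = []

providers = choose_providers(available)
print("Providers to request, in order:", providers)
```

If `QNNExecutionProvider` is missing from the list, the installed runtime was not built with QNN support and inference would fall back to the CPU.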
🔒 Model HQ in action: private on-device AI made easy.
Subscribe for more demos on Qualcomm Snapdragon, RAG workflows, agents, and model testing.
#ModelHQ #LLMWare #Qualcomm #SnapdragonXElite #NPU #ONNX #QNN #OnDeviceAI #PrivateAI #SmallLanguageModels #RAG #NoCodeAI #SLMs
