<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Running Home Assistant Control with a 0.5B Model? I Got the Full Pipeline Working on M5Stack LLM8850]]></title><description><![CDATA[<h2>🎯 TL;DR</h2>
<p dir="auto"><strong>A qwen2.5-0.5B model now drives a complete voice → control pipeline in under 2 seconds</strong>, covering voice wake-up, ASR recognition, pause detection, and device control. That speed is genuinely usable for everyday smart home interaction.</p>
<hr />
<p dir="auto">Hey Makers! 👋</p>
<p dir="auto">I've been working on a pretty interesting challenge: getting a <strong>tiny 0.5B parameter LLM</strong> to reliably control Home Assistant devices. Spoiler alert—it actually works, and it's fast. Let me walk you through the journey, the failures, and what finally clicked.</p>
<hr />
<h2>01 | The Core Problem: Speed vs. Intelligence</h2>
<p dir="auto"><strong>Product</strong>: M5Stack LLM8850<br />
<strong>Category</strong>: Edge AI / Smart Home Integration</p>
<p dir="auto">When building voice control for Home Assistant, you hit a classic tradeoff:</p>
<ul>
<li><strong>Large models</strong> (7B+): Great at understanding context, terrible at response time.</li>
<li><strong>Small models</strong> (0.5B-1.5B): Lightning fast, but they often hallucinate or ignore instructions.</li>
</ul>
<p dir="auto">My goal was simple: <strong>Make a 0.5B model output structured JSON commands reliably</strong>, just like ChatGPT would—without the random gibberish or making up device names.</p>
<p dir="auto">The approach? <strong>Inject real device info into the LLM's context and force it to output standardized control commands.</strong> Sounds easy, right? Well...</p>
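<p dir="auto">To make the target concrete, here's a minimal sketch of what "inject device info, demand structured JSON" looks like. This is illustrative only: the device-list format, command schema, and function names are my assumptions, not the actual implementation.</p>

```python
# Illustrative sketch: build a system prompt from a device list and
# parse the model's JSON reply. Schema and names are assumptions.
import json

def build_system_prompt(devices):
    """devices: list of (entity_id, friendly_name, state) tuples."""
    lines = ["You control these Home Assistant devices:"]
    for entity_id, name, state in devices:
        lines.append(f"{entity_id} '{name}' = {state}")
    lines.append('Reply ONLY with JSON: {"service": ..., "target_device": ...}')
    return "\n".join(lines)

def parse_command(model_output):
    """Parse the model's reply into a service call, or None on garbage."""
    try:
        cmd = json.loads(model_output)
        return cmd["service"], cmd["target_device"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return None

prompt = build_system_prompt([("light.bedroom", "Bedroom Light", "off")])
# A well-behaved model should answer "turn on the bedroom light" with:
reply = '{"service": "light.turn_on", "target_device": "light.bedroom"}'
print(parse_command(reply))  # ('light.turn_on', 'light.bedroom')
```

<p dir="auto">The whole trick is making the small model produce the second half of this exchange reliably.</p>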
<hr />
<h2>02 | First Attempt: Prompt Engineering (Epic Fail)</h2>
<p dir="auto"><strong>Category</strong>: Methodology</p>
<p dir="auto">I figured I'd try the "prompt engineering" route first—after all, it works wonders with GPT-4. I crafted a detailed system prompt with JSON examples:</p>
<p dir="auto"><img src="/assets/uploads/files/1764657796328-80193cd4-b8fe-4163-a050-72fb98980c1e-image.png" alt="80193cd4-b8fe-4163-a050-72fb98980c1e-image.png" class=" img-fluid img-markdown" /></p>
<p dir="auto"><strong>Result?</strong> The model just <strong>copy-pasted my examples verbatim</strong>. It completely ignored the actual user command.</p>
<p dir="auto"><strong>Lesson learned:</strong> Small models don't have the reasoning capacity to "follow instructions" the way larger models do. You can't trick them with clever prompts alone.</p>
<hr />
<h2>03 | Second Attempt: Dataset Fine-Tuning (This Worked)</h2>
<p dir="auto"><strong>Category</strong>: Model Training</p>
<p dir="auto">While digging through GitHub, I found <a href="https://github.com/acon96/home-llm" target="_blank" rel="noopener noreferrer nofollow ugc">home-llm</a>—a project that already proved this concept with a 3B model. The author's core insight was brilliant: <strong>Build a custom training dataset</strong> that teaches the model the exact input→output pattern.</p>
<p dir="auto"><img src="/assets/uploads/files/1764657877095-9742b9c5-e431-447f-aea5-6e79068fb15a-image.png" alt="9742b9c5-e431-447f-aea5-6e79068fb15a-image.png" class=" img-fluid img-markdown" /></p>
<p dir="auto"><strong>Their approach:</strong></p>
<ul>
<li><strong>System Prompt</strong>: Randomized lists of devices + available services (simulating real homes).</li>
<li><strong>User Prompt</strong>: Commands drawn from a pre-built library.</li>
<li><strong>Model Output</strong>: Properly formatted JSON commands matching the request.</li>
</ul>
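<p dir="auto">A single training sample in that spirit might look like this. The field names and device-list format below are illustrative, not necessarily home-llm's exact schema:</p>

```python
# One illustrative SFT sample: randomized device list in, JSON command out.
sample = {
    "system": (
        "Devices:\n"
        "light.living_room 'Living Room Light' = off\n"
        "fan.bedroom 'Bedroom Fan' = on\n"
        "Services: light.turn_on, light.turn_off, fan.turn_on, fan.turn_off"
    ),
    "user": "turn on the living room light",
    "assistant": '{"service": "light.turn_on", "target_device": "light.living_room"}',
}
```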
<p dir="auto"><strong>But there were two critical gaps:</strong></p>
<p dir="auto"><strong>❌ Gap #1: No "Device Not Found" Handling</strong><br />
If you said "turn on the washing machine in the bedroom" but that device didn't exist, the model would just pick a random device. No error handling.</p>
<p dir="auto"><strong>❌ Gap #2: The Model Didn't Know Its Own Capabilities</strong><br />
Ask it "what devices can you control?" and it couldn't answer. This caused frequent mismatches during actual use.</p>
<hr />
<h2>04 | My Dataset Improvements</h2>
<p dir="auto"><strong>Category</strong>: Data Engineering</p>
<p dir="auto">I extended the training data to fix these issues:</p>
<p dir="auto"><strong>✅ Fix #1: Added "Device Not Found" Samples</strong></p>
<ul>
<li>Included training examples where the user asks for non-existent devices.</li>
<li>Taught the model to respond with: <code>"devices not found"</code>.</li>
<li><strong>Result:</strong> Near-perfect rejection of garbage inputs.</li>
</ul>
<p dir="auto"><strong>✅ Fix #2: Added Device Query Capability</strong></p>
<ul>
<li>Added samples like "what can you control?" or "list all devices."</li>
<li>Model learned to extract and report the device list from the system prompt.</li>
<li><strong>Result:</strong> Noticeably better device name matching (though room assignments still occasionally glitch).</li>
</ul>
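<p dir="auto">For reference, the two kinds of added samples look roughly like this. The wording is illustrative; the exact phrasing lives in the dataset linked below:</p>

```python
# Illustrative versions of the two added sample types.
not_found_sample = {
    "system": "Devices:\nlight.bedroom 'Bedroom Light' = off",
    "user": "turn on the washing machine in the bedroom",
    "assistant": "devices not found",  # reject instead of guessing a device
}

query_sample = {
    "system": "Devices:\nlight.bedroom 'Bedroom Light' = off\nfan.study 'Study Fan' = on",
    "user": "what devices can you control?",
    "assistant": "I can control: Bedroom Light, Study Fan.",
}
```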
<p dir="auto"><strong>Resources:</strong></p>
<ul>
<li>Full dataset + fine-tuned model: <a href="https://huggingface.co/yunyu1258/qwen2.5-0.5b-ha" target="_blank" rel="noopener noreferrer nofollow ugc">qwen2.5-0.5b-ha @ HuggingFace</a></li>
</ul>
<hr />
<h2>05 | The Fine-Tuning Process (Using LLaMA-Factory)</h2>
<p dir="auto"><strong>Category</strong>: Training Workflow</p>
<p dir="auto">I used <a href="https://github.com/hiyouga/LLaMA-Factory" target="_blank" rel="noopener noreferrer nofollow ugc">LLaMA-Factory</a> for the actual fine-tuning. Here's the quick rundown:</p>
<p dir="auto"><strong>Step 1: Convert Dataset Format</strong><br />
Transform the JSON into LLaMA-Factory's expected structure:</p>
<p dir="auto"><img src="/assets/uploads/files/1764657913507-2bf605e2-f3d6-421f-bd57-4b278730db98-image.png" alt="2bf605e2-f3d6-421f-bd57-4b278730db98-image.png" class=" img-fluid img-markdown" /></p>
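<p dir="auto">As a hedged sketch of that conversion step: assuming the source data is a list of system/user/assistant triples and the target is LLaMA-Factory's sharegpt-style layout, the script is tiny. Verify the key names against LLaMA-Factory's data README for your version:</p>

```python
# Sketch: convert {system, user, assistant} triples into a
# sharegpt-style layout for LLaMA-Factory. Field names here are
# assumptions; check the project's data/README before training.
import json

def convert(samples):
    out = []
    for s in samples:
        out.append({
            "conversations": [
                {"from": "human", "value": s["user"]},
                {"from": "gpt", "value": s["assistant"]},
            ],
            "system": s["system"],
        })
    return out

samples = [{"system": "Devices: light.bedroom = off",
            "user": "turn on the bedroom light",
            "assistant": '{"service": "light.turn_on", "target_device": "light.bedroom"}'}]

with open("ha_dataset.json", "w") as f:
    json.dump(convert(samples), f, ensure_ascii=False, indent=2)
```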
<p dir="auto"><strong>Step 2: Register the Dataset</strong><br />
Edit <code>dataset_info.json</code> to add your dataset config:</p>
<p dir="auto"><img src="/assets/uploads/files/1764657901357-08a74128-1bbd-47ac-b2e0-c76ba3267646-image.png" alt="08a74128-1bbd-47ac-b2e0-c76ba3267646-image.png" class=" img-fluid img-markdown" /></p>
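<p dir="auto">The entry you add to <code>dataset_info.json</code> looks something like this for a sharegpt-style file. The key names follow LLaMA-Factory's documented schema, but double-check them against the version you have installed:</p>

```json
{
  "ha_control": {
    "file_name": "ha_dataset.json",
    "formatting": "sharegpt",
    "columns": {
      "messages": "conversations",
      "system": "system"
    }
  }
}
```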
<p dir="auto"><strong>Step 3: Fix Identity Presets</strong><br />
Don't forget to change the default identity dataset—otherwise your model will introduce itself as ChatGPT. 😂</p>
<p dir="auto"><strong>Step 4: Run the Training</strong><br />
Reference notebook: <a href="https://colab.research.google.com/drive/1i_RSU8Y0EpkvfhfzVPPmuEvwcT_zIdxL?usp=sharing" target="_blank" rel="noopener noreferrer nofollow ugc">Colab Fine-Tuning Script</a></p>
<p dir="auto"><strong>Key params:</strong></p>
<ul>
<li>Base model: <code>qwen2.5-0.5b</code> (best speed/quality balance)</li>
<li>Hardware: RTX 3090 or better recommended (T4 free tier is painfully slow)</li>
</ul>
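<p dir="auto">If you're following along, a LoRA SFT config for LLaMA-Factory looks roughly like this. The hyperparameters are illustrative starting points, not the exact values I used:</p>

```yaml
# Illustrative LLaMA-Factory LoRA SFT config; run with:
#   llamafactory-cli train qwen_ha_lora.yaml
model_name_or_path: Qwen/Qwen2.5-0.5B-Instruct
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
dataset: ha_control
template: qwen
cutoff_len: 2048
output_dir: saves/qwen2.5-0.5b-ha
per_device_train_batch_size: 4
gradient_accumulation_steps: 4
learning_rate: 1.0e-4
num_train_epochs: 3.0
bf16: true
```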
<hr />
<h2>06 | Deployment Architecture</h2>
<p dir="auto"><strong>Product</strong>: M5Stack LLM8850<br />
<strong>Category</strong>: System Design</p>
<p dir="auto">To make this actually usable, I built a control service: <a href="https://github.com/yuyun2000/HomeAssistant-Edge" target="_blank" rel="noopener noreferrer nofollow ugc">HomeAssistant-Edge</a></p>
<p dir="auto"><strong>Pipeline:</strong></p>
<pre><code>Voice Input → VAD (Voice Activity Detection) → KWS (Keyword Spotting)  
→ ASR (Speech Recognition) → LLM (Structured Output) → HA API Call
</code></pre>
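<p dir="auto">In code, that pipeline reduces to a simple chain of stages. Here's a stubbed orchestration sketch; every function below is a placeholder I made up, not the HomeAssistant-Edge API:</p>

```python
# Stubbed sketch of the stage chain. Each callable is a stand-in
# for the real VAD/KWS/ASR/LLM/HA components.
import json

def run_pipeline(audio, vad, kws, asr, llm, ha_call):
    """Run one voice interaction; returns the HA response or None."""
    speech = vad(audio)                    # trim silence / detect activity
    if speech is None or not kws(speech):  # wake word not heard
        return None
    text = asr(speech)                     # speech -> text
    reply = llm(text)                      # text -> JSON command string
    try:
        cmd = json.loads(reply)
    except json.JSONDecodeError:           # model produced non-JSON
        return None
    return ha_call(cmd["service"], cmd["target_device"])

# Wiring it up with trivial stand-ins:
result = run_pipeline(
    audio=b"...",
    vad=lambda a: a,
    kws=lambda s: True,
    asr=lambda s: "turn on the bedroom light",
    llm=lambda t: '{"service": "light.turn_on", "target_device": "light.bedroom"}',
    ha_call=lambda svc, dev: f"called {svc} on {dev}",
)
print(result)  # called light.turn_on on light.bedroom
```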
<p dir="auto"><strong>Why M5Stack LLM8850?</strong><br />
This hardware was critical to making the whole thing work:</p>
<ul>
<li>✅ <strong>Edge inference</strong>: the 0.5B model completes a structured-command response in ~1.5 seconds.</li>
<li>✅ <strong>Complete toolchain</strong>: Supports VAD/KWS/ASR/LLM model deployment.</li>
<li>✅ <strong>Local network</strong>: No cloud dependency = better privacy + speed.</li>
<li>✅ <strong>Ecosystem</strong>: M5Stack's modular design makes client expansion trivial.</li>
</ul>
<p dir="auto"><strong>Setup:</strong></p>
<ol>
<li>Input your Home Assistant API key.</li>
<li>Load pre-trained ASR + LLM models.</li>
<li>Start local network service and wait for M5 clients to connect.</li>
</ol>
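<p dir="auto">The final service call is a plain Home Assistant REST request. The sketch below uses HA's documented <code>/api/services/&lt;domain&gt;/&lt;service&gt;</code> endpoint with a long-lived access token; the helper function and names are mine, not HomeAssistant-Edge's:</p>

```python
# Sketch: turn a parsed LLM command into a Home Assistant REST call.
# Endpoint shape follows HA's REST API docs; helper names are mine.
import json
import urllib.request

def build_request(base_url, token, service, target_device):
    """service like 'light.turn_on', target_device like 'light.bedroom'."""
    domain, action = service.split(".", 1)
    url = f"{base_url}/api/services/{domain}/{action}"
    body = json.dumps({"entity_id": target_device}).encode()
    req = urllib.request.Request(url, data=body, method="POST")
    req.add_header("Authorization", f"Bearer {token}")
    req.add_header("Content-Type", "application/json")
    return req

req = build_request("http://homeassistant.local:8123", "LLAT_TOKEN",
                    "light.turn_on", "light.bedroom")
print(req.full_url)  # http://homeassistant.local:8123/api/services/light/turn_on
# urllib.request.urlopen(req)  # fire it (requires a reachable HA instance)
```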
<hr />
<h2>07 | Real-World Performance &amp; Limitations</h2>
<p dir="auto"><strong>Category</strong>: Testing Results</p>
<h3>🎯 What Works</h3>
<ul>
<li><strong>Speed</strong>: Full pipeline (ASR + transmission + pause detection) in <strong>under 2 seconds</strong>.</li>
<li><strong>Accuracy</strong>: Reliable control for common devices.</li>
<li><strong>Robustness</strong>: Properly rejects invalid commands and missing devices.</li>
</ul>
<h3>⚠️ Current Limitations</h3>
<ol>
<li><strong>Lab-only validation</strong>: Limited device variety; needs real-world testing.</li>
<li><strong>Client incomplete</strong>: Only the 8850 does inference; M5 microphone modules aren't integrated yet.</li>
<li><strong>No dynamic context</strong>: System prompts are cached in memory for speed, so the model can't query real-time device states, the current time, etc.</li>
</ol>
<h3>🔧 Next Steps</h3>
<ul>
<li><strong>ASR error tolerance</strong>: Add training data for typos, homophones.</li>
<li><strong>Dynamic info injection</strong>: Refactor prompt system to support live state queries.</li>
<li><strong>Advanced features</strong>: Build datasets for timers, scene automation, etc.</li>
</ul>
<hr />
<h2>💬 Discussion</h2>
<p dir="auto">Have you tried running LLMs on edge devices for smart home control? What gotchas did you run into? Drop your experiences below—I'd love to hear how others are tackling this problem!</p>
<hr />
<h2>📌 Resources</h2>
<ul>
<li>🤗 <strong>Fine-tuned Model &amp; Dataset</strong>: <a href="https://huggingface.co/yunyu1258/qwen2.5-0.5b-ha" target="_blank" rel="noopener noreferrer nofollow ugc">https://huggingface.co/yunyu1258/qwen2.5-0.5b-ha</a></li>
<li>🛠️ <strong>Control Service Code</strong>: <a href="https://github.com/yuyun2000/HomeAssistant-Edge" target="_blank" rel="noopener noreferrer nofollow ugc">https://github.com/yuyun2000/HomeAssistant-Edge</a></li>
<li>🔧 <strong>Fine-Tuning Tool</strong>: <a href="https://github.com/hiyouga/LLaMA-Factory" target="_blank" rel="noopener noreferrer nofollow ugc">https://github.com/hiyouga/LLaMA-Factory</a></li>
<li>📖 <strong>Reference Project</strong>: <a href="https://github.com/acon96/home-llm" target="_blank" rel="noopener noreferrer nofollow ugc">https://github.com/acon96/home-llm</a></li>
<li>📚 <strong>M5Stack Docs</strong>: <a href="https://docs.m5stack.com" target="_blank" rel="noopener noreferrer nofollow ugc">https://docs.m5stack.com</a></li>
<li>🗣️ <strong>Community Forum</strong>: <a href="https://community.m5stack.com">https://community.m5stack.com</a></li>
</ul>
<hr />
<p dir="auto"><em>Note: This post shares technical experiments and gotchas from building edge-based LLM systems. All implementation details are open-sourced for the maker community.</em></p>
]]></description><link>https://community.m5stack.com/topic/7924/running-home-assistant-control-with-a-0-5b-model-i-got-the-full-pipeline-working-on-m5stack-llm8850</link><generator>RSS for Node</generator><lastBuildDate>Sat, 14 Mar 2026 08:47:18 GMT</lastBuildDate><atom:link href="https://community.m5stack.com/topic/7924.rss" rel="self" type="application/rss+xml"/><pubDate>Tue, 02 Dec 2025 06:45:25 GMT</pubDate><ttl>60</ttl></channel></rss>