LLM module: how to use another LLM model?
-
Hi everyone,
I am new to the module.
I want to switch the LLM model, as new ones are available
(e.g. llama3.2-1b-prefill-ax630c or qwen2.5-1.5b-ax630c).
In short, I can't manage to load another model.
Any suggestions, or a pointer to the proper documentation? (I did not find any topic about changing the model via Arduino.)
What I have done so far:
- logged into the LLM module via serial
- ran "ip a" to get its IP address
- connected via ssh root@ip (through the Ethernet companion board)
- loaded my SSH public key, then connected over SSH
- successfully installed other models via apt-get install xxx
- rebooted (just in case)
Then I tested over serial (from an M5Stack Core Grey, with a simple Serial > Serial2 forwarding app).
The sequence, starting with a reset:
{ "request_id": "11212155", "work_id": "sys", "action": "reset" }
{"created":1746310691,"data":"None","error":{"code":0,"message":"llm server restarting ..."},"object":"None","request_id":"11212155","work_id":"sys"}
{"request_id": "0","work_id": "sys","created": 1746310696,"error":{"code":0, "message":"reset over"}}
Then I load the model:
{ "request_id": "3", "work_id": "llm", "action": "setup","object": "llm.setup", "data": { "model": "qwen2.5-1.5b-ax630c", "response_format": "llm.utf-8.stream", "input": "llm.utf-8", "enoutput": true, "max_token_len": 256, "prompt": "You are a knowledgeable assistant capable of answering various questions and providing information." } }
{"created":1746310710,"data":"None","error":{"code":-5,"message":"Model loading failed."},"object":"None","request_id":"3","work_id":"llm"}
But it works with:
{ "request_id": "3", "work_id": "llm", "action": "setup", "object": "llm.setup", "data": { "model": "qwen2.5-0.5B-prefill-20e", "response_format": "llm.utf-8.stream", "input": "llm.utf-8", "enoutput": true, "max_token_len": 256, "prompt": "You are a knowledgeable assistant capable of answering various questions and providing information." } }
{"created":1746310813,"data":"None","error":{"code":0,"message":""},"object":"None","request_id":"3","work_id":"llm.1004"}
-
@erictiquet
Have you checked this?
https://pulsar2-docs.readthedocs.io/en/latest/appendix/build_llm.html
-
Hello Kuriko, everyone,
I found the solution ;)
The best help came from:
- chat.m5stack.com (I support you guys in making it stable)
- ChatGPT, to write a small Arduino .ino sketch that serves a web page (the code is below)
- and the following page for the LLM JSON syntax (that is what is exchanged in the dialog):
https://github.com/m5stack/StackFlow/blob/main/doc/projects_llm_framework_doc/llm_llm_en.md
** First install the new models; they should appear in /opt/m5stack/data/
Connect via SSH (as on a normal Linux server):
- this requires that you plug in the RJ45 cable, access the debug port via serial and type "ip a" to get the IP, or that you look the address up on your DHCP server
- to be safe and to make the work easier, I suggest you upload your SSH key to the LLM module (ssh-copy-id)
- the default login is root@<your ip> with password "123456"; change it after loading your key successfully.
** Then install the new models (for example):
apt-get install llm-model-llama3.2-1b-prefill-ax630c llm-model-qwen2.5-1.5b-p256-ax630c
You should see something like:
root@m5stack-LLM:/# ls -la /opt/m5stack/data
total 68
drwxrwxr-x 17 root root 4096 May 4 07:23 .
drwxrwxr-x 7 root root 4096 Feb 20 21:24 ..
drwxrwxr-x 2 root root 4096 Dec 5 17:03 audio
drwxrwxr-x 3 1000 1000 4096 May 4 04:48 llama3.2-1B-prefill-ax630c
drwxrwxr-x 2 root root 4096 Dec 5 17:03 melotts_zh-cn
drwxrwxr-x 2 root root 4096 May 4 07:25 models
drwxrwxr-x 2 1000 1000 4096 May 4 05:50 qwen2.5-0.5B-prefill-20e
drwxr-xr-x 3 1000 1000 4096 May 4 04:49 qwen2.5-1.5B-p256-ax630c
drwxrwxr-x 2 root root 4096 Dec 5 17:03 sherpa-ncnn-streaming-zipformer-20M-2023-02-17
drwxrwxr-x 2 root root 4096 Dec 5 17:03 sherpa-ncnn-streaming-zipformer-zh-14M-2023-02-23
drwxrwxr-x 2 root root 4096 Dec 5 17:03 sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01
drwxrwxr-x 2 root root 4096 Dec 5 17:03 sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01
drwxrwxr-x 2 root root 4096 Dec 5 17:03 single_speaker_english_fast
drwxrwxr-x 2 root root 4096 Dec 5 17:03 single_speaker_fast
drwxrwxr-x 2 root root 4096 Dec 5 17:03 yolo11n
drwxrwxr-x 2 root root 4096 Dec 5 17:03 yolo11n-pose
drwxrwxr-x 2 root root 4096 Dec 5 17:03 yolo11n-seg
Watch the MMC space with the "df" command: with 2 more models you reach 74% of the available space.
(Another topic to tackle: how to use an SD card for additional storage space.)
root@m5stack-LLM:/# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/root 29289340 21660980 7611976 74% /
tmpfs 490876 0 490876 0% /dev/shm
tmpfs 196352 876 195476 1% /run
tmpfs 5120 0 5120 0% /run/lock
tmpfs 490876 0 490876 0% /tmp
/dev/mmcblk1p1 30554112 3424 30550688 1% /mnt/mmcblk1p1
tmpfs 98172 0 98172 0% /run/user/0
root@m5stack-LLM:/#
Then, to use a model, set the "model" field to the exact name of the model's folder as installed in /opt/m5stack/data
ex : llama3.2-1B-prefill-ax630c
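For what it's worth, this fits the earlier failure: the setup request asked for "qwen2.5-1.5b-ax630c", while the installed folder is named "qwen2.5-1.5B-p256-ax630c". Below is a minimal sketch, assuming plain C++ with std::string, of building the setup request from the folder name. jsonEscape and makeSetupRequest are hypothetical helper names of mine, not part of StackFlow or any M5Stack library.

```cpp
#include <string>

// Escape the characters that must not appear raw inside a JSON string.
// (Hypothetical helper; only the most common escapes are handled.)
std::string jsonEscape(const std::string& in) {
    std::string out;
    for (char c : in) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:   out += c;      break;
        }
    }
    return out;
}

// Build an llm.setup request with the same fields as the examples in this
// thread. `model` is assumed to be the folder name under /opt/m5stack/data,
// copied character for character.
std::string makeSetupRequest(const std::string& requestId,
                             const std::string& model,
                             const std::string& prompt,
                             int maxTokenLen = 256) {
    std::string out = "{\"request_id\":\"" + jsonEscape(requestId) + "\",";
    out += "\"work_id\":\"llm\",\"action\":\"setup\",\"object\":\"llm.setup\",";
    out += "\"data\":{\"model\":\"" + jsonEscape(model) + "\",";
    out += "\"response_format\":\"llm.utf-8.stream\",\"input\":\"llm.utf-8\",";
    out += "\"enoutput\":true,\"max_token_len\":" + std::to_string(maxTokenLen) + ",";
    out += "\"prompt\":\"" + jsonEscape(prompt) + "\"}}";
    return out;
}
```

In the sketch further down, the resulting string could be sent with something like LLM.println(request.c_str()).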
To play with the model, you can use the web page served by the .ino sketch and enter the following JSON:
{
"request_id": "2",
"work_id": "llm",
"action": "setup",
"object": "llm.setup",
"data": {
"model": "llama3.2-1B-prefill-ax630c",
"response_format": "llm.utf-8.stream",
"input": "llm.utf-8",
"enoutput": true,
"max_token_len": 256,
"prompt": "You are a helpful AI assistant."
}
}
You should receive a reply like:
{"created":1746846795,"data":"None","error":{"code":0,"message":""},"object":"None","request_id":"2","work_id":"llm.1004"}
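A minimal sketch of checking such a reply in code, assuming the module always answers in the compact one-line form shown above (no spaces after the colons). The helper names are hypothetical, and the plain substring search is a stand-in for a real JSON parser.

```cpp
#include <cstdlib>
#include <string>

// Return the value of "work_id" from a compact reply line, or "" if absent.
std::string extractWorkId(const std::string& reply) {
    const std::string key = "\"work_id\":\"";
    std::size_t start = reply.find(key);
    if (start == std::string::npos) return "";
    start += key.size();
    std::size_t end = reply.find('"', start);
    if (end == std::string::npos) return "";
    return reply.substr(start, end - start);
}

// Return the integer after "code": (the error code), or `fallback` if absent.
int extractErrorCode(const std::string& reply, int fallback = -1000) {
    const std::string key = "\"code\":";
    std::size_t start = reply.find(key);
    if (start == std::string::npos) return fallback;
    return std::atoi(reply.c_str() + start + key.size());
}

// True when the error code is 0 and the reply carries a work_id of the
// form "llm.xxxx"; that id is stored in `workId`.
bool setupSucceeded(const std::string& reply, std::string& workId) {
    if (extractErrorCode(reply) != 0) return false;
    workId = extractWorkId(reply);
    return workId.rfind("llm.", 0) == 0;  // starts with "llm."
}
```

With the two replies quoted in this thread, the failed load gives code -5 and a bare "llm" work_id, while the successful one gives code 0 and "llm.1004".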
Take the work_id value at the end of the reply (of the form "llm.xxxx") and send a prompt:
{
"request_id": "2",
"work_id": "llm.xxxx",
"action": "inference",
"object": "llm.utf-8.stream",
"data": {
"delta": "What's ur name?",
"index": 0,
"finish": true
}
}
Then you will see something like:
{"created":1746846972,"data":{"delta":"I'm an","finish":false,"index":0},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
{"created":1746846973,"data":{"delta":" artificial intelligence model","finish":false,"index":1},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
{"created":1746846974,"data":{"delta":" known as L","finish":false,"index":2},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
{"created":1746846974,"data":{"delta":"lama. L","finish":false,"index":3},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
{"created":1746846975,"data":{"delta":"lama stands for","finish":false,"index":4},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
{"created":1746846976,"data":{"delta":" \"Large Language","finish":false,"index":5},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
{"created":1746846976,"data":{"delta":" Model Meta AI","finish":false,"index":6},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
{"created":1746846977,"data":{"delta":".\"","finish":false,"index":7},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
{"created":1746846977,"data":{"delta":"","finish":true,"index":8},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
Et voilà.
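The answer arrives in pieces, one "delta" per line, until a line with "finish":true. A rough sketch of gluing the pieces back together, assuming one reply per line in the compact form shown above. readDelta, isFinished and assembleAnswer are hypothetical names of mine; \uXXXX escapes are not handled, and a real JSON parser would be more robust.

```cpp
#include <string>
#include <vector>

// Read the string value of "delta" from one reply line, decoding simple
// backslash escapes. Returns false if the field is missing or unterminated.
bool readDelta(const std::string& line, std::string& out) {
    const std::string key = "\"delta\":\"";
    std::size_t pos = line.find(key);
    if (pos == std::string::npos) return false;
    out.clear();
    for (std::size_t i = pos + key.size(); i < line.size(); ++i) {
        char c = line[i];
        if (c == '\\' && i + 1 < line.size()) {
            char n = line[++i];
            if (n == 'n') out += '\n';
            else if (n == 't') out += '\t';
            else out += n;            // covers \" and \\ (and \/)
        } else if (c == '"') {
            return true;              // closing quote of the value
        } else {
            out += c;
        }
    }
    return false;
}

// Naive check for the end-of-stream marker.
bool isFinished(const std::string& line) {
    return line.find("\"finish\":true") != std::string::npos;
}

// Concatenate the deltas in order, stopping at the first finished line.
std::string assembleAnswer(const std::vector<std::string>& lines) {
    std::string answer, piece;
    for (const std::string& line : lines) {
        if (readDelta(line, piece)) answer += piece;
        if (isFinished(line)) break;
    }
    return answer;
}
```

Fed the nine lines above, this would rebuild the full sentence starting with "I'm an artificial intelligence model".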
-
The .ino code I used:
#include <Arduino.h>
#include <M5Unified.h>
#include <WiFi.h>
#include <WebServer.h>

// ⚙️ WiFi configuration
const char* ssid = "your ssid";
const char* password = "your password";

// UART2 for the LLM module
HardwareSerial LLM(2); // GPIO16 = RX, GPIO17 = TX

WebServer server(80);
String lastSerialMessage = "";

// 🔐 Simple escaping to avoid HTML display problems
String htmlEscape(String text) {
  text.replace("&", "&amp;");   // must come first, so later entities are not re-escaped
  text.replace("<", "&lt;");
  text.replace(">", "&gt;");
  text.replace("\"", "&quot;");
  text.replace("'", "&#39;");
  return text;
}

// 💻 Dynamic HTML page
String getHTMLPage() {
  String html = R"rawliteral(
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>M5Core Web Serial</title>
</head>
<body>
<h1>M5Core Web Serial</h1>
<textarea id="msg" rows="15" cols="70" placeholder="Your message here..."></textarea><br>
<button onclick="sendMessage()">Send</button>
<button onclick="clearOutput()">Clear</button>
<p><strong>Serial response:</strong></p>
<pre id="lastMessage" style="background:#eee; padding:10px; border:1px solid #ccc;"></pre>
<script>
function sendMessage() {
  const msg = document.getElementById("msg").value;
  fetch("/send", {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: "msg=" + encodeURIComponent(msg)
  }).then(response => response.text())
    .then(text => { document.getElementById("lastMessage").innerText = text; });
}
function clearOutput() {
  fetch("/clear").then(r => r.text()).then(txt => {
    document.getElementById("lastMessage").innerText = "";
  });
}
// Poll the latest serial output every 2 seconds
setInterval(() => {
  fetch("/last").then(r => r.text()).then(txt => {
    document.getElementById("lastMessage").innerText = txt;
  });
}, 2000);
</script>
</body>
</html>
)rawliteral";
  // Has no effect unless the template above contains %LAST_MESSAGE%
  html.replace("%LAST_MESSAGE%", htmlEscape(lastSerialMessage));
  return html;
}

void handleRoot() {
  server.send(200, "text/html", getHTMLPage());
}

// Forward the posted text to the LLM module over UART2
void handleSend() {
  if (server.hasArg("msg")) {
    String msg = server.arg("msg");
    LLM.println(msg);
    lastSerialMessage = "Sent: " + msg;
    server.send(200, "text/plain", "Sent: " + msg);
  } else {
    server.send(400, "text/plain", "Missing 'msg' argument");
  }
}

void handleLast() {
  server.send(200, "text/plain", lastSerialMessage);
}

void handleClear() {
  lastSerialMessage = "";
  server.send(200, "text/plain", "Cleared");
}

void setup() {
  M5.begin();
  M5.Lcd.setTextSize(2);
  M5.Lcd.println("Initializing...");

  Serial.begin(115200);                  // USB link to the PC
  LLM.begin(115200, SERIAL_8N1, 16, 17); // RX2, TX2

  WiFi.begin(ssid, password);
  M5.Lcd.print("Connecting to WiFi");
  while (WiFi.status() != WL_CONNECTED) {
    delay(500);
    M5.Lcd.print(".");
  }
  M5.Lcd.println("\nConnected");
  M5.Lcd.println(WiFi.localIP());

  server.on("/", handleRoot);
  server.on("/send", HTTP_POST, handleSend);
  server.on("/last", handleLast);
  server.on("/clear", handleClear);
  server.begin();
  M5.Lcd.println("Web server running!");
}

void loop() {
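  server.handleClient();

  // Forward characters typed on the USB serial port to the LLM module
  if (Serial.available()) {
    char c = Serial.read();
    LLM.write(c);
    Serial.print(c);
  }

  // Copy LLM output to USB serial, the LCD and the web buffer
  if (LLM.available()) {
    char c = LLM.read();
    Serial.print(c);
    lastSerialMessage += c;
    M5.Lcd.print(c);

    // Keep only the last 20000 characters
    if (lastSerialMessage.length() > 20000) {
      lastSerialMessage = lastSerialMessage.substring(lastSerialMessage.length() - 20000);
    }
  }
}
-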