    M5Stack Community
    LLM module: how to use another LLM model?

    Modules
      erictiquet

      Hi everyone,

      I am new to the module.
      I want to switch the LLM model, since new ones are available
      (e.g. llama3.2-1b-prefill-ax630c or qwen2.5-1.5b-ax630c).
      In short, I can't get another model to load.

      Any suggestions, or a pointer to the proper documentation? (I did not find any topic about changing the model via Arduino.)


      What I did so far:

      • logged into the LLM module via serial
      • ran "ip a" to get the IP address
      • connected via ssh root@ip (through the Ethernet companion board)
      • loaded my SSH public key, then reconnected over SSH
      • successfully installed other models via apt-get install xxx
      • rebooted (just in case)

      Then I tested over serial text (via an M5Stack Core Gray running a simple Serial > Serial2 forwarding app)
      with this sequence:

      reset:

      { "request_id": "11212155", "work_id": "sys", "action": "reset" }
      {"created":1746310691,"data":"None","error":{"code":0,"message":"llm server restarting ..."},"object":"None","request_id":"11212155","work_id":"sys"}
      {"request_id": "0","work_id": "sys","created": 1746310696,"error":{"code":0, "message":"reset over"}}
      then...

      load model:

      { "request_id": "3", "work_id": "llm", "action": "setup","object": "llm.setup", "data": { "model": "qwen2.5-1.5b-ax630c", "response_format": "llm.utf-8.stream", "input": "llm.utf-8", "enoutput": true, "max_token_len": 256, "prompt": "You are a knowledgeable assistant capable of answering various questions and providing information." } }
      {"created":1746310710,"data":"None","error":{"code":-5,"message":"Model loading failed."},"object":"None","request_id":"3","work_id":"llm"}

      but it works with:

      { "request_id": "3", "work_id": "llm", "action": "setup", "object": "llm.setup", "data": { "model": "qwen2.5-0.5B-prefill-20e", "response_format": "llm.utf-8.stream", "input": "llm.utf-8", "enoutput": true, "max_token_len": 256, "prompt": "You are a knowledgeable assistant capable of answering various questions and providing information." } }
      {"created":1746310813,"data":"None","error":{"code":0,"message":""},"object":"None","request_id":"3","work_id":"llm.1004"}
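The two replies above differ only in the "error" object: code -5 with "Model loading failed." for the rejected model name, and code 0 for the one that loads. A minimal sketch of checking that field in plain C++ follows. The helper name replyErrorCode is hypothetical (it is not part of any M5Stack library), and the string search is deliberately naive: it assumes the first "code": in the line belongs to the error object.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical helper (not an M5Stack API): reads the integer that follows
// "code": in one reply line. Returns false if no such field is found.
bool replyErrorCode(const std::string& line, int& code) {
    const std::string key = "\"code\":";
    std::size_t pos = line.find(key);
    if (pos == std::string::npos) return false;
    pos += key.size();
    while (pos < line.size() && line[pos] == ' ') ++pos;  // tolerate "code": 0
    std::size_t end = pos;
    if (end < line.size() && line[end] == '-') ++end;     // optional minus sign
    const std::size_t firstDigit = end;
    while (end < line.size() &&
           std::isdigit(static_cast<unsigned char>(line[end]))) ++end;
    if (end == firstDigit) return false;                  // no digits after the key
    code = std::atoi(line.substr(pos, end - pos).c_str());
    return true;
}

// Usage with the two replies quoted above (shortened to the error object).
void exampleReplyCheck() {
    int code = 0;
    bool ok = replyErrorCode("{\"error\":{\"code\":-5,\"message\":\"Model loading failed.\"}}", code);
    assert(ok && code == -5);
    ok = replyErrorCode("{\"error\":{\"code\":0,\"message\":\"\"}}", code);
    assert(ok && code == 0);
}
```

On the forwarding sketch, a check like this could decide whether to go on to an inference request or to report the failure on the display.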

        kuriko

        @erictiquet
        Have you checked this:
        https://pulsar2-docs.readthedocs.io/en/latest/appendix/build_llm.html

        Good morning, and welcome to the Black Mesa Transit System.

          erictiquet

          Hello Kuriko, everyone,

          Found the solution ;)

          The best help came from:

          • chat.m5stack.com (kudos to you guys for making it stable)
          • ChatGPT, to write a small Arduino .ino sketch that displays a web page (the code is posted below)
          • the following page for the LLM JSON syntax (that is what gets exchanged in the dialog):
            https://github.com/m5stack/StackFlow/blob/main/doc/projects_llm_framework_doc/llm_llm_en.md

          First, install the new models; they should appear in /opt/m5stack/data/

          Connect via SSH (as with a normal Linux server):

          • this requires plugging in the RJ45 cable, then either accessing the debug port via serial and typing "ip a" to get the IP, or looking the address up on your DHCP server
          • to be safe and make the work easier, I suggest uploading your SSH key to the LLM module (ssh-copy-id)
          • the default login is root@<your ip> with password "123456"; change it once your key has been loaded successfully.

          Then install the new models (for example):
          apt-get install llm-model-llama3.2-1b-prefill-ax630c llm-model-qwen2.5-1.5b-p256-ax630c

          You should see something like:

          root@m5stack-LLM:/# ls -la /opt/m5stack/data
          total 68
          drwxrwxr-x 17 root root 4096 May 4 07:23 .
          drwxrwxr-x 7 root root 4096 Feb 20 21:24 ..
          drwxrwxr-x 2 root root 4096 Dec 5 17:03 audio
          drwxrwxr-x 3 1000 1000 4096 May 4 04:48 llama3.2-1B-prefill-ax630c
          drwxrwxr-x 2 root root 4096 Dec 5 17:03 melotts_zh-cn
          drwxrwxr-x 2 root root 4096 May 4 07:25 models
          drwxrwxr-x 2 1000 1000 4096 May 4 05:50 qwen2.5-0.5B-prefill-20e
          drwxr-xr-x 3 1000 1000 4096 May 4 04:49 qwen2.5-1.5B-p256-ax630c
          drwxrwxr-x 2 root root 4096 Dec 5 17:03 sherpa-ncnn-streaming-zipformer-20M-2023-02-17
          drwxrwxr-x 2 root root 4096 Dec 5 17:03 sherpa-ncnn-streaming-zipformer-zh-14M-2023-02-23
          drwxrwxr-x 2 root root 4096 Dec 5 17:03 sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01
          drwxrwxr-x 2 root root 4096 Dec 5 17:03 sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01
          drwxrwxr-x 2 root root 4096 Dec 5 17:03 single_speaker_english_fast
          drwxrwxr-x 2 root root 4096 Dec 5 17:03 single_speaker_fast
          drwxrwxr-x 2 root root 4096 Dec 5 17:03 yolo11n
          drwxrwxr-x 2 root root 4096 Dec 5 17:03 yolo11n-pose
          drwxrwxr-x 2 root root 4096 Dec 5 17:03 yolo11n-seg

          Watch the MMC space with the "df" command: with 2 more models you reach 74% of the available space.
          (Another topic to tackle: how to use an SD card for additional storage space.)

          root@m5stack-LLM:/# df
          Filesystem 1K-blocks Used Available Use% Mounted on
          /dev/root 29289340 21660980 7611976 74% /
          tmpfs 490876 0 490876 0% /dev/shm
          tmpfs 196352 876 195476 1% /run
          tmpfs 5120 0 5120 0% /run/lock
          tmpfs 490876 0 490876 0% /tmp
          /dev/mmcblk1p1 30554112 3424 30550688 1% /mnt/mmcblk1p1
          tmpfs 98172 0 98172 0% /run/user/0
          root@m5stack-LLM:/#
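As a worked check of the 74% figure: the Use% column is consistent with used / (used + available), rounded up to the next whole percent. The sketch below reproduces the values in the listing under that assumption; the function name dfUsePercent is made up for illustration, and the rounding rule is inferred from the numbers shown here rather than taken from the df source.

```cpp
#include <cassert>
#include <cstdint>

// Use% as it appears in the df listing: used / (used + available),
// rounded up to a whole percent. Inputs are the 1K-block counts from df.
int dfUsePercent(std::uint64_t usedKb, std::uint64_t availKb) {
    const std::uint64_t total = usedKb + availKb;
    assert(total > 0);  // an empty filesystem has no meaningful percentage
    return static_cast<int>((usedKb * 100 + total - 1) / total);  // ceiling division
}
```

For the root filesystem, dfUsePercent(21660980, 7611976) gives 74, and for /dev/mmcblk1p1, dfUsePercent(3424, 30550688) gives 1, matching the listing.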

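          Then, to use a model, set the "model" field to the name of its folder in /opt/m5stack/data,
          e.g. llama3.2-1B-prefill-ax630c
          (my failing request in the first post used "qwen2.5-1.5b-ax630c", which does not match the installed folder name "qwen2.5-1.5B-p256-ax630c").
          To play with the model, you can use the .ino web page and enter the following JSON: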

          {
            "request_id": "2",
            "work_id": "llm",
            "action": "setup",
            "object": "llm.setup",
            "data": {
              "model": "llama3.2-1B-prefill-ax630c",
              "response_format": "llm.utf-8.stream",
              "input": "llm.utf-8",
              "enoutput": true,
              "max_token_len": 256,
              "prompt": "You are a helpful AI assistant."
            }
          }

          You should receive a response like:

          {"created":1746846795,"data":"None","error":{"code":0,"message":""},"object":"None","request_id":"2","work_id":"llm.1004"}
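Since the request is sent over serial, it can be handy to generate the setup message as a single line instead of pasting it. Here is a minimal sketch in plain C++ that builds the same fields as the request above. The helpers jsonEscape and makeSetupRequest are hypothetical names, not part of StackFlow or any M5Stack library, and the escaping only covers quotes, backslashes and the common whitespace escapes.

```cpp
#include <cassert>
#include <string>

// Escape the characters that would break a JSON string literal.
// Other control characters are passed through unchanged (a simplification).
std::string jsonEscape(const std::string& s) {
    std::string out;
    for (char c : s) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:   out += c;
        }
    }
    return out;
}

// Hypothetical helper: one-line llm.setup request with the fields shown above.
// `model` should be the folder name under /opt/m5stack/data.
std::string makeSetupRequest(const std::string& requestId,
                             const std::string& model,
                             const std::string& prompt,
                             int maxTokenLen = 256) {
    assert(!model.empty());
    return "{\"request_id\":\"" + jsonEscape(requestId) +
           "\",\"work_id\":\"llm\",\"action\":\"setup\",\"object\":\"llm.setup\","
           "\"data\":{\"model\":\"" + jsonEscape(model) +
           "\",\"response_format\":\"llm.utf-8.stream\",\"input\":\"llm.utf-8\","
           "\"enoutput\":true,\"max_token_len\":" + std::to_string(maxTokenLen) +
           ",\"prompt\":\"" + jsonEscape(prompt) + "\"}}";
}
```

Calling makeSetupRequest("2", "llama3.2-1B-prefill-ax630c", "You are a helpful AI assistant.") yields the same request as above, collapsed onto one line.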

          Take the work_id value returned at the end of that response ("llm.xxxx") and use it to send a prompt:

          {
            "request_id": "2",
            "work_id": "llm.xxxx",
            "action": "inference",
            "object": "llm.utf-8.stream",
            "data": {
              "delta": "What's ur name?",
              "index": 0,
              "finish": true
            }
          }
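The hand-off between the two messages (copy the work_id from the setup reply into the inference request) can be sketched in plain C++ as follows. Both function names are hypothetical, the parsing is a plain substring search rather than a real JSON parser, and the prompt text is assumed to be a single line.

```cpp
#include <cassert>
#include <string>

// Returns the value of "work_id" from one reply line, or "" if it is absent.
std::string extractWorkId(const std::string& line) {
    const std::string key = "\"work_id\":\"";
    const std::size_t start = line.find(key);
    if (start == std::string::npos) return "";
    const std::size_t valueStart = start + key.size();
    const std::size_t end = line.find('"', valueStart);
    if (end == std::string::npos) return "";
    return line.substr(valueStart, end - valueStart);
}

// Hypothetical helper: one-line inference request for a single-line prompt.
std::string makeInferenceRequest(const std::string& requestId,
                                 const std::string& workId,
                                 const std::string& text) {
    // Expects the id returned by setup (e.g. "llm.1004"), not the bare "llm".
    assert(workId.rfind("llm.", 0) == 0);
    std::string escaped;
    for (char c : text) {
        if (c == '"' || c == '\\') escaped += '\\';  // keep the JSON string valid
        escaped += c;
    }
    return "{\"request_id\":\"" + requestId + "\",\"work_id\":\"" + workId +
           "\",\"action\":\"inference\",\"object\":\"llm.utf-8.stream\","
           "\"data\":{\"delta\":\"" + escaped + "\",\"index\":0,\"finish\":true}}";
}
```

With the setup reply shown earlier, extractWorkId returns "llm.1004", which then replaces the "llm.xxxx" placeholder in the request.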

          Then you will see something like:

          {"created":1746846972,"data":{"delta":"I'm an","finish":false,"index":0},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
          {"created":1746846973,"data":{"delta":" artificial intelligence model","finish":false,"index":1},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
          {"created":1746846974,"data":{"delta":" known as L","finish":false,"index":2},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
          {"created":1746846974,"data":{"delta":"lama. L","finish":false,"index":3},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
          {"created":1746846975,"data":{"delta":"lama stands for","finish":false,"index":4},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
          {"created":1746846976,"data":{"delta":" \"Large Language","finish":false,"index":5},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
          {"created":1746846976,"data":{"delta":" Model Meta AI","finish":false,"index":6},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
          {"created":1746846977,"data":{"delta":".\"","finish":false,"index":7},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
          {"created":1746846977,"data":{"delta":"","finish":true,"index":8},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
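The answer arrives as a stream: each line carries a "delta" fragment and an increasing "index", and the last line has "finish":true with an empty delta. A minimal sketch of stitching the fragments back together in plain C++ is given below. The function names are hypothetical, the parsing is a substring search instead of a JSON parser, quotes inside a delta are assumed to arrive backslash-escaped, and \uXXXX escapes are not decoded.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Extracts and unescapes data.delta from one reply line ("" if absent).
std::string extractDelta(const std::string& line) {
    const std::string key = "\"delta\":\"";
    std::size_t i = line.find(key);
    if (i == std::string::npos) return "";
    std::string out;
    for (i += key.size(); i < line.size() && line[i] != '"'; ++i) {
        if (line[i] == '\\' && i + 1 < line.size()) {
            ++i;  // look at the escaped character
            switch (line[i]) {
                case 'n': out += '\n'; break;
                case 't': out += '\t'; break;
                default:  out += line[i];  // \" and \\ (no \uXXXX support)
            }
        } else {
            out += line[i];
        }
    }
    return out;
}

// True for the closing line of a stream.
bool isFinalChunk(const std::string& line) {
    return line.find("\"finish\":true") != std::string::npos;
}

// Concatenates the deltas in arrival order, stopping at the final chunk.
std::string joinStream(const std::vector<std::string>& lines) {
    std::string text;
    for (const std::string& line : lines) {
        text += extractDelta(line);
        if (isFinalChunk(line)) break;
    }
    return text;
}

// Usage with three chunks shortened from the output above.
void exampleJoin() {
    const std::vector<std::string> lines = {
        R"({"data":{"delta":"lama stands for","finish":false,"index":4}})",
        R"({"data":{"delta":" \"Large Language","finish":false,"index":5}})",
        R"({"data":{"delta":"","finish":true,"index":8}})",
    };
    assert(joinStream(lines) == "lama stands for \"Large Language");
}
```

The sketch relies on arrival order; sorting by "index" would be a natural extension if chunks could ever arrive out of order.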

          et voilà.

            erictiquet

            The .ino code I used:

            #include <Arduino.h>
            #include <M5Unified.h>
            #include <WiFi.h>
            #include <WebServer.h>

            // ⚙️ WiFi configuration
            const char* ssid = "your ssid";
            const char* password = "your password";

            // UART2 for the LLM module
            HardwareSerial LLM(2); // GPIO16 = RX, GPIO17 = TX

            WebServer server(80);
            String lastSerialMessage = "";

            // 🔐 Simple escaping to avoid HTML display problems
            String htmlEscape(String text) {
              text.replace("&", "&amp;");   // must run first so later entities are not re-escaped
              text.replace("<", "&lt;");
              text.replace(">", "&gt;");
              text.replace("\"", "&quot;");
              text.replace("'", "&#39;");
              return text;
            }

            // 💻 Dynamic HTML page
            String getHTMLPage() {
              String html = R"rawliteral(
            <!DOCTYPE html>
            <html>
            <head>
              <meta charset="UTF-8">
              <title>M5Core Web Serial</title>
            </head>
            <body>
              <h1>M5Core Web Serial</h1>
              <textarea id="msg" rows="15" cols="70" placeholder="Your message here..."></textarea><br>
              <button onclick="sendMessage()">Send</button>
              <button onclick="clearOutput()">Clear</button>
              <p><strong>Serial response:</strong></p>
              <pre id="lastMessage" style="background:#eee; padding:10px; border:1px solid #ccc;">%LAST_MESSAGE%</pre>

              <script>
                // POST the textarea content to /send and show the reply
                function sendMessage() {
                  const msg = document.getElementById("msg").value;
                  fetch("/send", {
                    method: "POST",
                    headers: { "Content-Type": "application/x-www-form-urlencoded" },
                    body: "msg=" + encodeURIComponent(msg)
                  }).then(response => response.text())
                    .then(text => {
                      document.getElementById("lastMessage").innerText = text;
                    });
                }

                // Clear the serial log on the device and on the page
                function clearOutput() {
                  fetch("/clear").then(r => r.text()).then(txt => {
                    document.getElementById("lastMessage").innerText = "";
                  });
                }

                // Poll the latest serial output every 2 seconds
                setInterval(() => {
                  fetch("/last").then(r => r.text()).then(txt => {
                    document.getElementById("lastMessage").innerText = txt;
                  });
                }, 2000);
              </script>
            </body>
            </html>
            )rawliteral";
              // Fill the placeholder with the escaped serial log
              html.replace("%LAST_MESSAGE%", htmlEscape(lastSerialMessage));
              return html;
            }

            void handleRoot() {
              server.send(200, "text/html", getHTMLPage());
            }

            // Forward the posted message to the LLM module over UART2
            void handleSend() {
              if (server.hasArg("msg")) {
                String msg = server.arg("msg");
                LLM.println(msg);
                lastSerialMessage = "Sent: " + msg;
                server.send(200, "text/plain", "Sent: " + msg);
              } else {
                server.send(400, "text/plain", "Missing 'msg' argument");
              }
            }

            void handleLast() {
              server.send(200, "text/plain", lastSerialMessage);
            }

            void handleClear() {
              lastSerialMessage = "";
              server.send(200, "text/plain", "Cleared");
            }

            void setup() {
              M5.begin();
              M5.Lcd.setTextSize(2);
              M5.Lcd.println("Initializing...");

              Serial.begin(115200); // PC USB
              LLM.begin(115200, SERIAL_8N1, 16, 17); // RX2, TX2

              WiFi.begin(ssid, password);
              M5.Lcd.print("Connecting to WiFi");
              while (WiFi.status() != WL_CONNECTED) {
                delay(500);
                M5.Lcd.print(".");
              }

              M5.Lcd.println("\nConnected");
              M5.Lcd.println(WiFi.localIP());

              server.on("/", handleRoot);
              server.on("/send", HTTP_POST, handleSend);
              server.on("/last", handleLast);
              server.on("/clear", handleClear);
              server.begin();
              M5.Lcd.println("Web server running!");
            }

            void loop() {
              server.handleClient();

              // USB serial -> LLM module (with local echo)
              if (Serial.available()) {
                char c = Serial.read();
                LLM.write(c);
                Serial.print(c);
              }

              // LLM module -> USB serial, web log and LCD
              if (LLM.available()) {
                char c = LLM.read();
                Serial.print(c);
                lastSerialMessage += c;
                M5.Lcd.print(c);

                // Keep only the last 20000 characters of the log
                if (lastSerialMessage.length() > 20000) {
                  lastSerialMessage = lastSerialMessage.substring(lastSerialMessage.length() - 20000);
                }
              }
            }

              erictiquet

              screen capture M5core Webserial.png
