Add llama-server and open-webui apps for local LLM inference

- llama-server: llama.cpp REST API server, 8G memory, port 8080 - open-webui: Chat UI connecting to llama-server, 2G memory, port 3000 - Both include x-casaos metadata for ZimaOS app store - README with model download instructions and API examples
2026-04-19 22:25:22 +02:00
parent 231aba08b0
commit 0aabfc8a72
4 changed files with 309 additions and 0 deletions
@@ -0,0 +1,73 @@
+# OpenWebUI
+
+Modern chat web interface for local LLMs. Connects to llama-server via Docker internal networking.
+
+## Purpose
+
+- **Port**: 3000 (TCP)
+- **Memory**: 2G reservation
+- **Category**: AI / LLM UI
+
+Requires the **llama-server** app to be running first. Connects to `http://llama-server:8080` internally.
+
+## Prerequisites
+
+1. Deploy and start **llama-server** app first
+2. Download a GGUF model into llama-server's `/models` directory
+3. Ensure llama-server container is healthy
+
+## Access
+
+Open in browser:
+
+```
+http://<your-zimaos-host>:3000
+```
+
+First run may take a moment to initialize.
+
+## Environment Variables
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `OLLAMA_BASE_URL` | `http://llama-server:8080` | Internal URL to llama-server API |
+| `WEBUI_PORT` | `3000` | Container listen port |
+| `TZ` | `Europe/Stockholm` | Timezone |
+
+## If Connection Fails
+
+1. Verify llama-server is running: `docker ps | grep llama-server`
+2. Check llama-server logs: `docker logs llama-server`
+3. Ensure llama-server MODEL env matches your downloaded file
+4. From ZimaOS shell, test connectivity:
+   ```bash
+   curl http://llama-server:8080/v1/models
+   ```
+
+## Volumes
+
+| Path | Description |
+|------|-------------|
+| `/app/backend/data` | OpenWebUI persistent data (chat history, settings) |
+
+## Architecture
+
+- `amd64` (Intel/AMD x86_64)
+- `arm64` (Apple Silicon, ARM servers)
+
+## Security
+
+- `security_opt: no-new-privileges:true`
+- `cap_drop: ALL`
+
+## Troubleshooting
+
+**"Cannot connect to LLM" error in UI**
+- Verify llama-server is running before open-webui
+- Check that `OLLAMA_BASE_URL` is set to `http://llama-server:8080`
+- Verify model file exists in `/DATA/AppData/llama-server/models/`
+
+**Slow responses**
+- 7B models on CPU are limited by single-thread performance
+- 3B models recommended for interactive speeds (~15+ tok/s)
+- Close other apps to free RAM
@@ -0,0 +1,68 @@
+name: open-webui
+
+services:
+  open-webui:
+    image: ghcr.io/open-webui/open-webui:main
+    container_name: open-webui
+    restart: unless-stopped
+    environment:
+      TZ: Europe/Stockholm
+      OLLAMA_BASE_URL: http://llama-server:8080
+      WEBUI_PORT: "3000"
+    ports:
+      - target: 3000
+        published: "3000"
+        protocol: tcp
+    volumes:
+      - type: bind
+        source: /DATA/AppData/$AppID/data
+        target: /app/backend/data
+    deploy:
+      resources:
+        reservations:
+          memory: 2G
+    depends_on:
+      - llama-server
+    security_opt:
+      - no-new-privileges:true
+    cap_drop:
+      - ALL
+    x-casaos:
+      envs:
+        - container: OLLAMA_BASE_URL
+          description:
+            en_us: Internal URL to llama-server API (http://llama-server:8080)
+        - container: WEBUI_PORT
+          description:
+            en_us: Web UI listen port inside container
+        - container: TZ
+          description:
+            en_us: Timezone, for example Europe/Stockholm
+      ports:
+        - container: "3000"
+          description:
+            en_us: OpenWebUI web interface port
+      volumes:
+        - container: /app/backend/data
+          description:
+            en_us: OpenWebUI persistent data (chat history, settings)
+
+x-casaos:
+  architectures:
+    - amd64
+    - arm64
+  main: open-webui
+  category: ai
+  author: Joachim Friberg
+  developer: Joachim Friberg
+  icon: https://cdn.simpleicons.org/webui
+  tagline:
+    en_us: Modern chat UI for local LLMs
+  description:
+    en_us: >
+      OpenWebUI provides a modern, feature-rich web interface for interacting with local LLMs.
+      Connect to llama-server or any OpenAI-compatible API. Requires llama-server app to be running first.
+  title:
+    en_us: OpenWebUI
+  index: /
+  port_map: "3000"