Add llama-server and open-webui apps for local LLM inference

- llama-server: llama.cpp REST API server, 8G memory, port 8080
- open-webui: Chat UI connecting to llama-server, 2G memory, port 3000
- Both include x-casaos metadata for ZimaOS app store
- README with model download instructions and API examples
Joachim Friberg
2026-04-19 22:25:22 +02:00
parent 231aba08b0
commit 0aabfc8a72
4 changed files with 309 additions and 0 deletions
@@ -0,0 +1,73 @@
# OpenWebUI
Modern chat web interface for local LLMs. Connects to llama-server via Docker internal networking.
## Purpose
- **Port**: 3000 (TCP)
- **Memory**: 2G reservation
- **Category**: AI / LLM UI

Requires the **llama-server** app to be running first. OpenWebUI connects to it internally at `http://llama-server:8080`.
## Prerequisites
1. Deploy and start the **llama-server** app first
2. Download a GGUF model into llama-server's `/models` directory
3. Ensure the llama-server container is healthy
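
The third prerequisite can be checked from a shell before starting open-webui. The sketch below is one way to do it, assuming llama-server's port 8080 is published on the host and that the llama.cpp server's `/health` endpoint is reachable there. The function name `wait_for_llama` and the retry defaults are illustrative and not part of either app.

```shell
# Poll llama-server until it reports healthy, or give up after N attempts.
# Assumption: the API is reachable at the URL passed in
# (default: http://localhost:8080, i.e. the published host port).
wait_for_llama() {
  local base_url="${1:-http://localhost:8080}"
  local attempts="${2:-30}"
  local delay="${3:-2}"
  local i=1

  while [ "${i}" -le "${attempts}" ]; do
    # -f makes curl exit non-zero on HTTP errors (e.g. 503 while the model loads)
    if curl -fsS --max-time 5 "${base_url}/health" >/dev/null 2>&1; then
      echo "llama-server is healthy (attempt ${i})"
      return 0
    fi
    sleep "${delay}"
    i=$((i + 1))
  done

  echo "llama-server did not become healthy after ${attempts} attempts" >&2
  return 1
}
```

Call it as `wait_for_llama http://localhost:8080 30 2` and start open-webui only if it returns success.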
## Access
Open in browser:
```
http://<your-zimaos-host>:3000
```
First run may take a moment to initialize.
## Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `OLLAMA_BASE_URL` | `http://llama-server:8080` | Internal URL to llama-server API |
| `WEBUI_PORT` | `3000` | Container listen port |
| `TZ` | `Europe/Stockholm` | Timezone |
## If Connection Fails
1. Verify llama-server is running: `docker ps | grep llama-server`
2. Check llama-server logs: `docker logs llama-server`
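3. Ensure the llama-server `MODEL` env matches your downloaded file name
4. Test connectivity. The hostname `llama-server` resolves only inside the Docker network, so from the ZimaOS shell use the published host port instead (assuming llama-server publishes port 8080):
```bash
curl http://localhost:8080/v1/models
```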
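
The checks above can also be run in one pass. This is a minimal sketch, assuming the container is named `llama-server` and its API port is published on the host. The function name `diagnose_llama` and the return codes are hypothetical choices for illustration.

```shell
# Run the connection checks in order and stop at the first failure.
# Assumptions: the container is named "llama-server" and its API is
# reachable from the host at the URL given (default http://localhost:8080).
diagnose_llama() {
  local base_url="${1:-http://localhost:8080}"

  # Step 1: is the container running?
  if ! docker ps --format '{{.Names}}' | grep -qx 'llama-server'; then
    echo "FAIL: llama-server container is not running"
    return 1
  fi

  # Step 4: does the OpenAI-compatible API answer?
  if ! curl -fsS --max-time 5 "${base_url}/v1/models" >/dev/null 2>&1; then
    echo "FAIL: ${base_url}/v1/models is not reachable"
    # Step 2: show recent log lines, which may reveal a wrong MODEL path
    docker logs --tail 20 llama-server 2>&1
    return 2
  fi

  echo "OK: llama-server is running and serving /v1/models"
}
```

A return code of 1 points at the container itself, while 2 points at the API or the model it was asked to load.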
## Volumes
| Path | Description |
|------|-------------|
| `/app/backend/data` | OpenWebUI persistent data (chat history, settings) |
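
Because chat history and settings live in this one directory, it can be backed up with a plain archive. The sketch below assumes `$AppID` resolves to `open-webui`, so the host path is `/DATA/AppData/open-webui/data`. The function name `backup_webui_data` and the archive naming are illustrative.

```shell
# Archive the OpenWebUI data directory into a timestamped tarball.
# Assumption: the bind-mount source is /DATA/AppData/open-webui/data;
# pass a different path as the first argument if yours differs.
backup_webui_data() {
  local src="${1:-/DATA/AppData/open-webui/data}"
  local dest_dir="${2:-.}"
  local archive

  if [ ! -d "${src}" ]; then
    echo "source directory not found: ${src}" >&2
    return 1
  fi

  mkdir -p "${dest_dir}" || return 1
  archive="${dest_dir}/open-webui-data-$(date +%Y%m%d-%H%M%S).tar.gz"

  # -C keeps paths inside the archive relative to the data directory
  tar -czf "${archive}" -C "${src}" . || return 1

  # Print the archive path so callers can capture it
  echo "${archive}"
}
```

Choose a destination outside the data directory, and consider stopping the container first so files are not copied mid-write.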
## Architecture
- `amd64` (Intel/AMD x86_64)
- `arm64` (Apple Silicon, ARM servers)
## Security
- `security_opt: no-new-privileges:true`
- `cap_drop: ALL`
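
To confirm that both options were actually applied to the running container, `docker inspect` can be queried. This is a rough sketch, assuming the container is named `open-webui`. The function name `check_hardening` is hypothetical.

```shell
# Verify that the hardening options from the compose file are in effect.
# Assumption: the container name is "open-webui" unless another is given.
check_hardening() {
  local name="${1:-open-webui}"
  local caps opts

  caps="$(docker inspect --format '{{.HostConfig.CapDrop}}' "${name}")" || return 1
  opts="$(docker inspect --format '{{.HostConfig.SecurityOpt}}' "${name}")" || return 1

  # CapDrop is printed as a list such as [ALL]
  case "${caps}" in
    *ALL*) ;;
    *) echo "cap_drop ALL is not applied to ${name}"; return 1 ;;
  esac

  # SecurityOpt is printed as a list such as [no-new-privileges:true]
  case "${opts}" in
    *no-new-privileges*) ;;
    *) echo "no-new-privileges is not applied to ${name}"; return 1 ;;
  esac

  echo "hardening options are applied to ${name}"
}
```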
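## Troubleshooting

**"Cannot connect to LLM" error in UI**

- Verify llama-server is running before starting open-webui
- Check that `OLLAMA_BASE_URL` is set to `http://llama-server:8080`
- Verify the model file exists in `/DATA/AppData/llama-server/models/`
- llama-server exposes an OpenAI-compatible API under `/v1`; if no models appear, it may help to add it as an OpenAI connection instead (for example `OPENAI_API_BASE_URL=http://llama-server:8080/v1`)

**Slow responses**

- 7B models on CPU generate slowly; throughput is bounded by CPU speed and memory bandwidth
- 3B models are recommended for interactive speeds (~15+ tok/s)
- Close other apps to free RAM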
@@ -0,0 +1,68 @@
name: open-webui
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    environment:
      TZ: Europe/Stockholm
      OLLAMA_BASE_URL: http://llama-server:8080
      WEBUI_PORT: "3000"
    ports:
      - target: 3000
        published: "3000"
        protocol: tcp
    volumes:
      - type: bind
        source: /DATA/AppData/$AppID/data
        target: /app/backend/data
    deploy:
      resources:
        reservations:
          memory: 2G
    depends_on:
      - llama-server
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - ALL
    x-casaos:
      envs:
        - container: OLLAMA_BASE_URL
          description:
            en_us: Internal URL to llama-server API (http://llama-server:8080)
        - container: WEBUI_PORT
          description:
            en_us: Web UI listen port inside container
        - container: TZ
          description:
            en_us: Timezone, for example Europe/Stockholm
      ports:
        - container: "3000"
          description:
            en_us: OpenWebUI web interface port
      volumes:
        - container: /app/backend/data
          description:
            en_us: OpenWebUI persistent data (chat history, settings)
x-casaos:
  architectures:
    - amd64
    - arm64
  main: open-webui
  category: ai
  author: Joachim Friberg
  developer: Joachim Friberg
  icon: https://cdn.simpleicons.org/webui
  tagline:
    en_us: Modern chat UI for local LLMs
  description:
    en_us: >
      OpenWebUI provides a modern, feature-rich web interface for interacting with local LLMs.
      Connect to llama-server or any OpenAI-compatible API. Requires llama-server app to be running first.
  title:
    en_us: OpenWebUI
  index: /
  port_map: "3000"