Joachim Friberg 0aabfc8a72 Add llama-server and open-webui apps for local LLM inference
- llama-server: llama.cpp REST API server, 8G memory, port 8080
- open-webui: Chat UI connecting to llama-server, 2G memory, port 3000
- Both include x-casaos metadata for ZimaOS app store
- README with model download instructions and API examples
2026-04-19 22:25:22 +02:00

OpenWebUI

Modern chat web interface for local LLMs. Connects to llama-server via Docker internal networking.

Purpose

  • Port: 3000 (TCP)
  • Memory: 2G reservation
  • Category: AI / LLM UI

Requires the llama-server app to be running first. Connects to http://llama-server:8080 internally.

Prerequisites

  1. Deploy and start llama-server app first
  2. Download a GGUF model into llama-server's /models directory
  3. Ensure llama-server container is healthy
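
The health check in step 3 can be scripted. The following is a minimal sketch, assuming llama-server publishes port 8080 on the host and serves llama.cpp's /health endpoint (HTTP 200 once the model is loaded). The function name wait_for_llama and its defaults are hypothetical.

```shell
#!/bin/sh
# Sketch: poll llama-server until it reports ready.
# Assumptions: port 8080 is reachable on localhost; GET /health returns
# HTTP 200 when the model is loaded and an error status while it loads.

wait_for_llama() {
    url="${1:-http://localhost:8080/health}"   # assumed host-side URL
    tries="${2:-30}"                            # number of attempts
    delay="${3:-2}"                             # seconds between attempts
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -f makes curl exit non-zero on HTTP error statuses
        if curl -fsS "$url" >/dev/null 2>&1; then
            echo "llama-server is ready"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "llama-server not ready after $tries attempts" >&2
    return 1
}
```

Calling wait_for_llama with no arguments polls for up to about a minute; start open-webui only after it returns successfully.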

Access

Open in browser:

http://<your-zimaos-host>:3000

First run may take a moment to initialize.

Environment Variables

Variable         Default                   Description
OLLAMA_BASE_URL  http://llama-server:8080  Internal URL to llama-server API
WEBUI_PORT       3000                      Container listen port
TZ               Europe/Stockholm          Timezone
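
To confirm which of these values a running container actually received, something like the sketch below may help. It assumes the container is named open-webui; the helper name show_webui_env is hypothetical.

```shell
#!/bin/sh
# Sketch: print the three documented variables as seen by the container.
# Assumption: the container is named "open-webui".

show_webui_env() {
    name="${1:-open-webui}"   # assumed container name
    # .Config.Env is a list of KEY=value strings; print one per line
    docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' "$name" \
        | grep -E '^(OLLAMA_BASE_URL|WEBUI_PORT|TZ)='
}
```

A variable missing from the output is not set on the container, so the application's own default applies.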

If Connection Fails

  1. Verify llama-server is running: docker ps | grep llama-server
  2. Check llama-server logs: docker logs llama-server
  3. Ensure llama-server MODEL env matches your downloaded file
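  4. Test connectivity. The hostname llama-server resolves only inside the Docker network, so run this from a container attached to that network:
    curl http://llama-server:8080/v1/models
    From the ZimaOS shell itself, use the host port instead (assuming llama-server publishes 8080 on the host):
    curl http://localhost:8080/v1/models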
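
Step 3 can be checked against the API response. This is a rough sketch, assuming the /v1/models response is OpenAI-style JSON that lists each model under an "id" field. The helper list_model_ids is hypothetical, and a real JSON parser such as jq would be more robust where available.

```shell
#!/bin/sh
# Sketch: extract model ids from a /v1/models JSON response on stdin.
# Assumption: each model appears as "id": "<name>" in the response.

list_model_ids() {
    grep -o '"id"[[:space:]]*:[[:space:]]*"[^"]*"' \
        | sed 's/.*:[[:space:]]*"\(.*\)"/\1/'
}

# Usage (assumes port 8080 is published on the host):
#   curl -s http://localhost:8080/v1/models | list_model_ids
```

Compare the printed ids with the file name set in llama-server's MODEL variable.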
    

Volumes

Path               Description
/app/backend/data  OpenWebUI persistent data (chat history, settings)

Architecture

  • amd64 (Intel/AMD x86_64)
  • arm64 (Apple Silicon, ARM servers)
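
To see which of the two architectures a host falls under, the kernel's machine name can be mapped to the Docker platform name. This is a small sketch; docker_arch is a hypothetical helper, and the mapping covers only the two architectures listed above.

```shell
#!/bin/sh
# Sketch: map `uname -m` output to the Docker architecture names above.

docker_arch() {
    machine="${1:-$(uname -m)}"   # pass a value explicitly to test other hosts
    case "$machine" in
        x86_64|amd64)  echo amd64 ;;
        aarch64|arm64) echo arm64 ;;
        *) echo "unsupported: $machine" >&2; return 1 ;;
    esac
}
```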

Security

  • security_opt: no-new-privileges:true
  • cap_drop: ALL
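
Whether these two settings are in effect on a running container can be checked with docker inspect. The sketch below assumes the container is named open-webui; check_hardening is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: confirm no-new-privileges and cap_drop ALL on a running container.
# Assumption: the container is named "open-webui".

check_hardening() {
    name="${1:-open-webui}"   # assumed container name
    info=$(docker inspect \
        --format '{{.HostConfig.SecurityOpt}} {{.HostConfig.CapDrop}}' "$name") || return 2
    echo "$info"
    # Both settings must appear in the inspected host configuration
    case "$info" in
        *no-new-privileges*) ;;
        *) return 1 ;;
    esac
    case "$info" in
        *ALL*) return 0 ;;
        *) return 1 ;;
    esac
}
```

The function prints what Docker reports and returns non-zero if either setting is missing.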

Troubleshooting

"Cannot connect to LLM" error in UI

  • Verify llama-server is running before starting open-webui
  • Check that OLLAMA_BASE_URL is set to http://llama-server:8080
  • Verify model file exists in /DATA/AppData/llama-server/models/
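
The model-file check above can be done from the ZimaOS shell. This is a minimal sketch using the path given above as the default; list_gguf is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: list GGUF model files in the llama-server models directory.

list_gguf() {
    dir="${1:-/DATA/AppData/llama-server/models}"
    found=1
    for f in "$dir"/*.gguf; do
        [ -e "$f" ] || continue     # glob matched nothing
        ls -lh "$f"                 # show size to spot truncated downloads
        found=0
    done
    [ "$found" -eq 0 ] || echo "no .gguf files in $dir" >&2
    return "$found"
}
```

A non-zero return means the directory holds no .gguf file, so llama-server has nothing to load.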

Slow responses

  • 7B models on CPU are slow; token generation is limited mainly by memory bandwidth and CPU throughput
  • 3B models are recommended for interactive speeds (~15+ tok/s)
  • Close other apps to free RAM
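
To compare a measured run against the ~15 tok/s figure above, divide the number of generated tokens by the elapsed seconds. The helper below is a hypothetical sketch of that arithmetic; the token count and timing must come from your own measurement.

```shell
#!/bin/sh
# Sketch: compute tokens per second and compare with the 15 tok/s guideline.

tok_rate() {
    tokens="$1"; seconds="$2"
    awk -v t="$tokens" -v s="$seconds" 'BEGIN {
        if (s <= 0) { print "elapsed time must be positive" > "/dev/stderr"; exit 1 }
        rate = t / s
        printf "%.1f tok/s (%s)\n", rate, (rate >= 15 ? "interactive" : "slow")
    }'
}

# Example: 300 tokens generated in 20 seconds
#   tok_rate 300 20    # prints: 15.0 tok/s (interactive)
```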