diff --git a/Apps/llama-server/README.md b/Apps/llama-server/README.md
index 18cb37d..e8c632a 100644
--- a/Apps/llama-server/README.md
+++ b/Apps/llama-server/README.md
@@ -2,6 +2,8 @@
 
 Local LLM inference server using llama.cpp. Serves GGUF models via OpenAI-compatible REST API.
 
+**Image**: `ghcr.io/ggml-org/llama.cpp:server-b8840` (CPU-only, AVX2/AVX512)
+
 ## Purpose
 
 - **Port**: 8080 (TCP)
diff --git a/Apps/llama-server/docker-compose.yaml b/Apps/llama-server/docker-compose.yaml
index 6cc2126..d2557a7 100644
--- a/Apps/llama-server/docker-compose.yaml
+++ b/Apps/llama-server/docker-compose.yaml
@@ -2,7 +2,7 @@ name: llama-server
 
 services:
   llama-server:
-    image: ghcr.io/ggerganov/llama.cpp:server
+    image: ghcr.io/ggml-org/llama.cpp:server-b8840@sha256:99d2554c4c8d5339649dde530056cf10771823d7cd983dbd0441da9c419976b1
     container_name: llama-server
     restart: unless-stopped
     environment: