Open. Auditable. Self-hostable.
SoverIQ Core is the open-source foundation on which SoverIQ Stack, Cloud, and Box are built. If you need complete transparency across every layer of your AI infrastructure – or want to integrate your own extensions deep into the stack – Core is where you start.
No proprietary core. No hidden code. No trust required.
What is SoverIQ Core?
SoverIQ Core is a curated, production-ready deployment of open-source components for self-hosted generative AI. It brings all the building blocks together – from model inference to vector database to user interface – and connects them into an operational system.
The stack is designed for Kubernetes, fully configurable via Helm, and equipped with standard observability tooling.
Components and Open-Source Tools
Model Inference
| Component | Tool | Description |
|---|---|---|
| LLM Runtime | Ollama | Local model inference for CPU and GPU, OpenAI-compatible API |
| Inference Backend (alternative) | vLLM | High-throughput inference for GPU clusters, PagedAttention |
| Model Hub | Hugging Face Hub | Model sourcing; models are mirrored locally, no ongoing external access required |
Supported models (selection): Llama 3, Mistral, Phi-3, Gemma 2, Qwen 2.5 – all available in quantised variants for CPU deployments.
RAG Engine and Knowledge Base
| Component | Tool | Description |
|---|---|---|
| Vector Database | Qdrant | High-performance vector store, fully on-premise, Rust-based |
| Embedding Models | nomic-embed-text / multilingual-e5 | Locally running embedding models, GDPR-compliant |
| RAG Framework | LangChain / LlamaIndex | Document processing, chunking, retrieval pipelines |
| Document Processing | Unstructured | PDF, DOCX, XLSX, HTML, email – extraction and normalisation |
API Gateway and Orchestration
| Component | Tool | Description |
|---|---|---|
| API Gateway | LiteLLM Proxy | OpenAI-compatible proxy, model routing, rate limiting, usage tracking |
| Workflow Engine | n8n | Low-code automation for AI workflows and data pipelines |
| Authentication | Keycloak | Identity provider, SAML 2.0, OIDC, RBAC |
User Interface
| Component | Tool | Description |
|---|---|---|
| Chat Interface | Open WebUI | Full chat UI, RAG integration, model selection, conversation history |
| Admin Interface | SoverIQ Admin (proprietary development, MIT-licensed) | User management, model deployment, audit log view |
Observability and Operations
| Component | Tool | Description |
|---|---|---|
| Metrics | Prometheus + Grafana | Inference latency, token throughput, GPU utilisation, API error rate |
| Logging | Loki | Log aggregation, structured logging across all components |
| Tracing | OpenTelemetry + Tempo | Distributed tracing for RAG pipelines and API calls |
| Alerting | Alertmanager (Prometheus stack) | Alerts for GPU failures, model downtime, queue depth |
Repositories
SoverIQ Core consists of several repositories:
github.com/soveriq/
├── core # Helm charts, Kubernetes manifests, configuration reference
├── admin # SoverIQ Admin UI (React, MIT licence)
├── connector-framework # Connectors for SAP, SharePoint, DATEV, REST
├── rag-pipelines # Document processing and retrieval pipelines
└── deployment-examples # Example deployments: bare metal, K3s, EKS, Hetzner
The repositories are currently being prepared for public release. Register for early access.
Kubernetes Deployment
SoverIQ Core is delivered entirely via Helm Charts. The deployment is designed for standard Kubernetes clusters (K3s, RKE2, EKS, GKE EU region, Hetzner K8s).
Prerequisites
Kubernetes >= 1.28
Helm >= 3.12
Storage ReadWriteOnce PVC (min. 100 GB for models)
RAM min. 16 GB (CPU-only), min. 32 GB (with GPU)
GPU optional: NVIDIA, CUDA 12.x (for GPU inference)
Quick Start
# Add the repo
helm repo add soveriq https://charts.soveriq.ai
helm repo update
# Create namespace
kubectl create namespace soveriq
# Customise values file
helm show values soveriq/core > values.yaml
# → edit values.yaml: model, storage, auth, domain
# Deploy
helm install soveriq-core soveriq/core \
--namespace soveriq \
--values values.yaml
# Check status
kubectl get pods -n soveriq
What gets deployed?
soveriq-core/
├── ollama # Model inference (StatefulSet + PVC)
├── qdrant # Vector database (StatefulSet + PVC)
├── litellm # API gateway (Deployment)
├── open-webui # Chat interface (Deployment)
├── keycloak # Identity provider (StatefulSet)
├── n8n # Workflow engine (Deployment)
├── prometheus-stack # Metrics + Grafana + Loki (optional)
└── soveriq-admin # Admin UI (Deployment)
All components can be individually enabled and configured via values.yaml. If you already run a Keycloak instance, for example, simply disable the internal Keycloak component.
Resource Profiles
| Profile | Description | Minimum Hardware |
|---|---|---|
| minimal | CPU-only, small model (Phi-3 Mini, Gemma 2B) | 4 vCPU, 16 GB RAM, 50 GB SSD |
| standard | CPU-only, medium model (Llama 3 8B quantised) | 8 vCPU, 32 GB RAM, 200 GB SSD |
| gpu-single | Single NVIDIA GPU, large model (Llama 3 70B quantised) | 16 vCPU, 64 GB RAM, 1× A10G or RTX 4090 |
| gpu-cluster | Multiple GPUs, high-throughput operation (vLLM) | by requirement |
Licence
SoverIQ Core is released under the Apache 2.0 licence. Commercial use, modification, and redistribution are explicitly permitted.
For organisations running SoverIQ Core in production that need professional support, SLAs, or a managed extension (Stack, Cloud, Box), we offer commercial arrangements. Talk to us.
Community and Contributions
- GitHub Discussions: questions, ideas, field reports
- Issues: bug reports and feature requests
- Pull Requests: welcome – contribution guide in the repository
Early Access: SoverIQ Core is shortly before its first public release. If you want to be early – as a tester, contributor, or an organisation looking to run this in production – we’d love to hear from you.