SoverIQ Core

Home /
Product /
SoverIQ Core

Open. Auditable. Self-hostable.

SoverIQ Core is the open-source foundation on which SoverIQ Stack, Cloud, and Box are built. If you need complete transparency across every layer of your AI infrastructure – or want to integrate your own extensions deep into the stack – Core is where you start.

No proprietary core. No hidden code. No trust required.

What is SoverIQ Core?

SoverIQ Core is a curated, production-ready deployment of open-source components for self-hosted generative AI. It brings all the building blocks together – from model inference to vector database to user interface – and connects them into an operational system.

The stack is designed for Kubernetes, fully configurable via Helm, and equipped with standard observability tooling.

Components and Open-Source Tools

Model Inference

Component	Tool	Description
LLM Runtime	Ollama	Local model inference for CPU and GPU, OpenAI-compatible API
Inference Backend (alternative)	vLLM	High-throughput inference for GPU clusters, PagedAttention
Model Hub	Hugging Face Hub	Model sourcing; models are mirrored locally, no ongoing external access required

Supported models (selection): Llama 3, Mistral, Phi-3, Gemma 2, Qwen 2.5 – all available in quantised variants for CPU deployments.

RAG Engine and Knowledge Base

Component	Tool	Description
Vector Database	Qdrant	High-performance vector store, fully on-premise, Rust-based
Embedding Models	nomic-embed-text / multilingual-e5	Locally running embedding models, GDPR-compliant
RAG Framework	LangChain / LlamaIndex	Document processing, chunking, retrieval pipelines
Document Processing	Unstructured	PDF, DOCX, XLSX, HTML, email – extraction and normalisation

API Gateway and Orchestration

Component	Tool	Description
API Gateway	LiteLLM Proxy	OpenAI-compatible proxy, model routing, rate limiting, usage tracking
Workflow Engine	n8n	Low-code automation for AI workflows and data pipelines
Authentication	Keycloak	Identity provider, SAML 2.0, OIDC, RBAC

User Interface

Component	Tool	Description
Chat Interface	Open WebUI	Full chat UI, RAG integration, model selection, conversation history
Admin Interface	SoverIQ Admin (proprietary development, MIT-licensed)	User management, model deployment, audit log view

Observability and Operations

Component	Tool	Description
Metrics	Prometheus + Grafana	Inference latency, token throughput, GPU utilisation, API error rate
Logging	Loki	Log aggregation, structured logging across all components
Tracing	OpenTelemetry + Tempo	Distributed tracing for RAG pipelines and API calls
Alerting	Alertmanager (Prometheus stack)	Alerts for GPU failures, model downtime, queue depth

Repositories

SoverIQ Core consists of several repositories:

github.com/soveriq/
├── core                  # Helm charts, Kubernetes manifests, configuration reference
├── admin                 # SoverIQ Admin UI (React, MIT licence)
├── connector-framework   # Connectors for SAP, SharePoint, DATEV, REST
├── rag-pipelines         # Document processing and retrieval pipelines
└── deployment-examples   # Example deployments: bare metal, K3s, EKS, Hetzner

The repositories are currently being prepared for public release. Register for early access.

Kubernetes Deployment

SoverIQ Core is delivered entirely via Helm Charts. The deployment is designed for standard Kubernetes clusters (K3s, RKE2, EKS, GKE EU region, Hetzner K8s).

Prerequisites

Kubernetes   >= 1.28
Helm         >= 3.12
Storage      ReadWriteOnce PVC (min. 100 GB for models)
RAM          min. 16 GB (CPU-only), min. 32 GB (with GPU)
GPU          optional: NVIDIA, CUDA 12.x (for GPU inference)

Quick Start

# Add the repo
helm repo add soveriq https://charts.soveriq.ai
helm repo update

# Create namespace
kubectl create namespace soveriq

# Customise values file
helm show values soveriq/core > values.yaml
# → edit values.yaml: model, storage, auth, domain

# Deploy
helm install soveriq-core soveriq/core \
  --namespace soveriq \
  --values values.yaml

# Check status
kubectl get pods -n soveriq

What gets deployed?

soveriq-core/
├── ollama              # Model inference (StatefulSet + PVC)
├── qdrant              # Vector database (StatefulSet + PVC)
├── litellm             # API gateway (Deployment)
├── open-webui          # Chat interface (Deployment)
├── keycloak            # Identity provider (StatefulSet)
├── n8n                 # Workflow engine (Deployment)
├── prometheus-stack    # Metrics + Grafana + Loki (optional)
└── soveriq-admin       # Admin UI (Deployment)

All components can be individually enabled and configured via values.yaml. If you already run a Keycloak instance, for example, simply disable the internal Keycloak component.

Resource Profiles

Profile	Description	Minimum Hardware
minimal	CPU-only, small model (Phi-3 Mini, Gemma 2B)	4 vCPU, 16 GB RAM, 50 GB SSD
standard	CPU-only, medium model (Llama 3 8B quantised)	8 vCPU, 32 GB RAM, 200 GB SSD
gpu-single	Single NVIDIA GPU, large model (Llama 3 70B quantised)	16 vCPU, 64 GB RAM, 1× A10G or RTX 4090
gpu-cluster	Multiple GPUs, high-throughput operation (vLLM)	by requirement

Licence

SoverIQ Core is released under the Apache 2.0 licence. Commercial use, modification, and redistribution are explicitly permitted.

For organisations running SoverIQ Core in production that need professional support, SLAs, or a managed extension (Stack, Cloud, Box), we offer commercial arrangements. Talk to us.

Community and Contributions

GitHub Discussions: questions, ideas, field reports
Issues: bug reports and feature requests
Pull Requests: welcome – contribution guide in the repository

Early Access: SoverIQ Core is shortly before its first public release. If you want to be early – as a tester, contributor, or an organisation looking to run this in production – we’d love to hear from you.

Request early access →