# AI Limos/Isima

## Access Point
- URL: https://ia.limos.fr
## Services
- Chat - Chat with language models
- RAG - Retrieval-Augmented Generation for querying your documents
- API - Programmatic access to language models
## Current Models

- dev-model: Focused on development tasks and agentic workflows
- general: General-purpose model for classic chat tasks with reasoning
- general_nothink: Variant of general without reasoning capabilities
| Model | Context (tokens) | Params (total) | Params (active) | Aliases | Capabilities |
|---|---|---|---|---|---|
| MiniMax-M2.7 | 196608 | 230B | 10B | dev-model | Chat, Agentic |
| Mistral-Small4 | 262144 | 119B | 6.5B | general, general_nothink | Chat, Agentic, VL (img) |
| bge-m3 | 8192 | 1B | | embedding | Embedding |
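Since API calls go through LiteLLM (see the v0.4 changelog), the model aliases above should be reachable through an OpenAI-compatible chat-completions endpoint. A minimal stdlib-only sketch, assuming the base URL from the v0.4 notes; the key shown is a placeholder, generate a real one at https://keymgr.limos.fr:

```python
import json
import urllib.request

API_URL = "https://litellm.limos.fr/v1"  # LiteLLM endpoint (v0.4 changelog)
API_KEY = "sk-..."  # placeholder; generate at https://keymgr.limos.fr

def chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat-completions request for a model alias."""
    payload = {
        "model": model,  # e.g. "dev-model", "general", "general_nothink"
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{API_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To send: resp = urllib.request.urlopen(chat_request("general", "Hello"))
```

Any OpenAI-compatible client (SDKs, LangChain, etc.) pointed at the same base URL should work the same way.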
## Hardware
- 4x H100 (90GB VRAM each) = 360GB total
- 1x H200 (140GB VRAM)
## Changelog
### v1 (20/04/2026)
- URL changes
- OpenWebUI reset
- Reset of all LiteLLM keys
- Opening to teachers/researchers
- Deployment of Ragondin (rag.ia.limos.fr)
- Redeployment of various services
### v0.5 (13/04/2026)
- Update to MiniMax M2.7
- Setup of Searxng proxy
### v0.4 (31/03/2026)
- API calls now go through LiteLLM
- OpenWebUI connection with SSO
- Token management from https://keymgr.limos.fr
- Mistral Small4 replaces Qwen3.5
- Addition of the general_nothink alias, which disables reasoning
- BAAI/bge-m3 embedding model
Update `API_URL` in your configs to https://litellm.limos.fr/v1; generate tokens at https://keymgr.limos.fr.
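The bge-m3 embedding model added in this release should be reachable through the same endpoint. A sketch of building an OpenAI-style embeddings request; the request shape mirrors the chat example and is an assumption based on LiteLLM's OpenAI-compatible interface, not documented behavior of this deployment:

```python
import json
import urllib.request

def embed_request(texts: list[str], api_key: str) -> urllib.request.Request:
    """Build an OpenAI-style embeddings request for bge-m3 (8192-token context)."""
    payload = {"model": "bge-m3", "input": texts}
    return urllib.request.Request(
        "https://litellm.limos.fr/v1/embeddings",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To send: resp = urllib.request.urlopen(embed_request(["my document"], api_key))
```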
### v0.3 (25/02/2026)
- Model changes: MiniMax (dev) and Qwen3.5 (general)
- Qwen3.5 natively supports VL (image, video, audio, screenshot)
- Export of VLLM metrics via Prometheus
- Resource usage graphs in Grafana
### v0.2 (12/02/2026)
- Model changes: DevStral replaced by GLM-4.7
- Addition of a CLI to control models
- API key management via LiteLLM
### v0.1 (06/01/2026)
- Provision of OpenWebUI using LiteLLM
- Web search via SearxNG (self-hosted, separate service)
- API key generation via OpenWebUI
- Available models: DevStral-123b, gpt-oss-120b