Memory for Edge AI
On-Device Vector Search for Edge AI
A drop-in retrieval SDK for local LLMs — no cloud database, no embedding API, no network round-trip.
BUILT FOR ROBOTICS, IoT & EDGE AI
Cloud vector databases weren't built for robots, IoT gateways, or embedded devices.
Clace was.
Constant memory footprint
Cloud vector databases use more RAM as your data grows. Clace stays flat at ~200MB — whether you store 10K vectors or 10M. It runs inside your app, with no separate database to provision or manage.
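You can check this on your own workload. The sketch below is a minimal verification harness, assuming the quickstart API shown further down; `psutil` is a third-party measurement helper (not part of the Clace SDK), and the batch folder names are placeholders.

```python
# Minimal sketch: watch resident memory while ingesting larger corpora.
# Assumes the Clace quickstart API shown below; psutil is only used
# here to read this process's memory, it is not part of Clace.
import os

import psutil
from clace_sdk import Clace

process = psutil.Process(os.getpid())

def rss_mb() -> float:
    """Resident set size of this process, in MB."""
    return process.memory_info().rss / 1e6

clace = Clace(index_path="./local_index")
print(f"after init: {rss_mb():.0f} MB")

# Placeholder folders of increasing size; RSS should stay roughly flat.
for batch in ["./logs_10k/", "./logs_100k/", "./logs_1m/"]:
    clace.ingest(data_path=batch)
    print(f"after {batch}: {rss_mb():.0f} MB")
```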
Total data privacy
Cloud embedding services log every query and every document you send them. Clace runs the embedding model on the same device as the index. No telemetry, no third parties, nothing crosses the network.
Sub-200ms local retrieval
Cloud retrieval burns 300–500ms on network calls before your model sees the context. Clace returns results in under 200ms, fully local — so the answer reaches your user that much sooner.
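The latency claim is also easy to measure yourself. Below is a minimal timing sketch, assuming the quickstart API shown further down; the query string and `top_k` are taken from that example, and the sample count is arbitrary.

```python
# Minimal sketch: time local retrieval end to end with perf_counter.
# Assumes the Clace quickstart API shown below; query is a placeholder.
import statistics
import time

from clace_sdk import Clace

clace = Clace(index_path="./local_index")

timings_ms = []
for _ in range(100):
    start = time.perf_counter()
    clace.get_context(query="Summarize patient history", top_k=5)
    timings_ms.append((time.perf_counter() - start) * 1000)

timings_ms.sort()
print(f"p50: {statistics.median(timings_ms):.1f} ms")
print(f"p95: {timings_ms[94]:.1f} ms")  # 95th of 100 sorted samples
```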
Compliance-ready by default
HIPAA, SOC 2, attorney-client privilege, ITAR — every framework hates sending data off-device. Clace runs air-gapped by default: no cloud calls, no telemetry, no internet required. Legal signs off in days, not quarters.
| Feature | Cloud Vector DB | Clace |
|---|---|---|
| RAM Footprint | Scales with data | ~200MB constant |
| Network Calls | Embedding API + DB queries | Zero |
| Query Latency | 300–500ms | <200ms |
| Infrastructure | Vector DB + embedding service | Single SDK binary |
| Deployment | Cloud-dependent | Air-gap ready |
A few lines of code. Zero infrastructure.
```python
from clace_sdk import Clace

# 1. Initialize the engine (constant ~200MB footprint)
clace = Clace(index_path="./local_index", bicameral_mode=True)

# 2. Ingest compliance rules and user/episodic memory
clace.ingest_ruleset(document="HIPAA_Guidelines.pdf", title="Strict Rules")
clace.ingest(data_path="./user_chat_logs/")

# 3. Retrieve context locally (no API calls, no data egress)
context = clace.get_context(query="Summarize patient history", top_k=5)

# 4. Pass to your local LLM of choice
response = local_llm.complete(prompt=context + user_question)
```

Built for the places where your AI actually runs
Local Copilots
Ship AI features as a single offline binary.
Robotics
A memory layer that fits beside perception and planning models.
Voice & Wearables
Sub-200ms semantic recall, no cloud round-trip required.
Retail & POS
Customer-facing AI where data has to stay inside the store.
Industrial IoT
Run semantic search where bandwidth is slow, costly, or absent.
Regulated AI
Build retrieval pipelines where data egress isn't an option.
Ship local AI that actually remembers.
Start with the quickstart guide. No infrastructure, no cloud, no data egress — just a few lines of code.
Read the Quickstart