Zero-Hop Latency
Sub-10 ms retrieval from device memory, with no network round trip. Answers feel instant and human-like.
Get real-time retrieval for your conversational AI, voice assistants, and multimodal agents
Moss is a high-performance runtime for real-time semantic search. It delivers sub-10 ms retrieval, instant index updates, and zero infrastructure overhead. It runs wherever your intelligence lives (in-browser, on-device, or in the cloud), so search feels native and effortless.
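To make "in-memory retrieval with instant index updates" concrete, here is a toy sketch in plain Python. It is not Moss's implementation and does not use the Moss SDK: `InMemoryIndex`, `embed`, and `cosine` are hypothetical names, and lowercase token counts stand in for a real embedding model such as the one an actual semantic search engine would use.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding (assumption): lowercase token counts, not a learned model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryIndex:
    """Hypothetical in-process index: a document is searchable as soon as it is added."""

    def __init__(self):
        self.vectors = {}

    def add(self, doc_id, text):
        # Adding or overwriting a document updates the index immediately.
        self.vectors[doc_id] = embed(text)

    def query(self, text, top_k=3):
        # Score every stored document against the query and keep the best top_k.
        q = embed(text)
        scored = [(cosine(q, vec), doc_id) for doc_id, vec in self.vectors.items()]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(doc_id, score) for score, doc_id in scored[:top_k]]


index = InMemoryIndex()
index.add("1", "Vector search in production")
index.add("2", "Baking sourdough bread at home")
best = index.query("production search tips", top_k=1)
print(best[0][0])  # id of the closest document
```

Because everything lives in process memory, a query is a local computation with no network hop, and a call to `add` is visible to the very next `query`.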
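Install the SDK for your language.

JavaScript:

    npm install @inferedge/moss

Python:

    pip install inferedge-moss

JavaScript / TypeScript quick start:

    import { MossClient } from '@inferedge/moss'

    // The `!` non-null assertions are TypeScript syntax; drop them in plain JavaScript.
    const client = new MossClient(process.env.PROJECT_ID!, process.env.PROJECT_KEY!)
    await client.createIndex('docs', [{ id: '1', text: 'Vector search in production' }], 'moss-minilm')
    await client.loadIndex('docs')
    const results = await client.query('docs', 'production search tips')

Python quick start:

    import asyncio

    from inferedge_moss import MossClient, DocumentInfo

    async def main():
        # Replace the placeholders with your project credentials.
        client = MossClient("$PROJECT_ID", "$PROJECT_KEY")
        await client.create_index(
            "docs",
            [DocumentInfo(id="1", text="Vector search in production")],
            "moss-minilm",
        )
        # Load the index, then query it.
        await client.load_index("docs")
        results = await client.query("docs", "production search tips")

    # The client methods are coroutines, so run them inside an event loop.
    asyncio.run(main())

For detailed setup instructions, credentials configuration, and more examples, see the Getting Started guide.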
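To check the sub-10 ms figure against your own index, a small timing harness can help. The sketch below is an assumption-laden illustration, not part of the Moss SDK: `measure_latency_ms`, `summarize`, and `fake_query` are hypothetical names, and `fake_query` is a stand-in you would replace with a call such as `client.query("docs", "production search tips")`.

```python
import asyncio
import statistics
import time


async def measure_latency_ms(query_fn, runs=50):
    """Await query_fn() `runs` times and return each call's latency in milliseconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        await query_fn()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies):
    """Return the median and the nearest-rank 95th percentile, in milliseconds."""
    ordered = sorted(latencies)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) using integer arithmetic
    return {"p50_ms": statistics.median(ordered), "p95_ms": ordered[rank - 1]}


async def fake_query():
    # Hypothetical stand-in for a real query coroutine; sleeps for about 1 ms.
    await asyncio.sleep(0.001)
    return []


if __name__ == "__main__":
    stats = summarize(asyncio.run(measure_latency_ms(fake_query, runs=20)))
    print(stats)
```

Reporting a percentile alongside the median is a deliberate choice here: a single slow outlier is easy to miss in an average, and tail latency is usually what users notice in a conversational or voice interface.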
| Task | Where to look |
|---|---|
| Project setup & credentials | Getting Started |
| JavaScript usage & API docs | JavaScript SDK |
| Python usage & API docs | Python SDK |
For questions, support, commercial licensing, or partnership inquiries, contact us at contact@usemoss.dev.