# MOSS - Minimal On-Device Semantic Search
inferedge-moss enables private, on-device semantic search in your Python applications, backed by cloud storage for index synchronization.
Built for developers who want instant, memory-efficient, privacy-first AI features with seamless cloud integration.
## ✨ Features
- ⚡ On-Device Vector Search - Sub-millisecond retrieval with zero network latency
- 🔍 Semantic, Keyword & Hybrid Search - Embedding search blended with keyword matching
- ☁️ Cloud Storage Integration - Automatic index synchronization with cloud storage
- 📦 Multi-Index Support - Manage multiple isolated search spaces
- 🛡️ Privacy-First by Design - Computation happens locally, only indexes sync to cloud
- 🚀 High-Performance Rust Core - Built on optimized Rust bindings for maximum speed
## 📦 Installation

```bash
pip install inferedge-moss
```

## 🚀 Quick Start
```python
import asyncio
from inferedge_moss import MossClient, DocumentInfo

async def main():
    # Initialize search client with project credentials
    client = MossClient("your-project-id", "your-project-key")

    # Prepare documents to index
    documents = [
        DocumentInfo(
            id="doc1",
            text="How do I track my order? You can track your order by logging into your account.",
            metadata={"category": "shipping"}
        ),
        DocumentInfo(
            id="doc2",
            text="What is your return policy? We offer a 30-day return policy for most items.",
            metadata={"category": "returns"}
        ),
        DocumentInfo(
            id="doc3",
            text="How can I change my shipping address? Contact our customer service team.",
            metadata={"category": "support"}
        )
    ]

    # Create an index with documents (syncs to cloud)
    index_name = "faqs"
    await client.create_index(index_name, documents, "moss-minilm")
    print("Index created and synced to cloud!")

    # Load the index (from cloud or local cache)
    await client.load_index(index_name)

    # Search the index
    result = await client.query(
        index_name,
        "How do I return a damaged product?",
        top_k=3,
        alpha=0.6  # blend semantic (0.6) and keyword (0.4) scores
    )

    # Display results
    print(f"Query: {result.query}")
    for doc in result.docs:
        print(f"Score: {doc.score:.4f}")
        print(f"ID: {doc.id}")
        print(f"Text: {doc.text}")
        print("---")

asyncio.run(main())
```

## 🔥 Example Use Cases
- Smart knowledge base search with cloud backup
- Real-time voice AI agents with persistent indexes
- Personal note-taking search with sync across devices
- Private in-app AI features with cloud storage
- Local semantic search in edge devices with cloud fallback
## Available Models

- `moss-minilm`: Lightweight model optimized for speed and efficiency
- `moss-mediumlm`: Balanced model offering higher accuracy with reasonable performance
## 🔧 Getting Started

### Prerequisites
- Python 3.8 or higher
- Valid InferEdge project credentials
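The Python version requirement can be checked programmatically before initializing the client; a minimal sketch (the helper name and error message here are illustrative, not part of the library):

```python
import sys

def check_python_version(min_version: tuple = (3, 8)) -> None:
    """Raise if the running interpreter is older than min_version."""
    if sys.version_info < min_version:
        raise RuntimeError(
            "inferedge-moss requires Python "
            f"{min_version[0]}.{min_version[1]} or higher"
        )

check_python_version()  # no-op on a supported interpreter
```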
### Environment Setup
- Install the package:

```bash
pip install inferedge-moss
```

- Get your credentials:

  Sign up at InferEdge Platform to get your `project_id` and `project_key`.

- Set up environment variables (optional):

```bash
export MOSS_PROJECT_ID="your-project-id"
export MOSS_PROJECT_KEY="your-project-key"
```

### Basic Usage
```python
import asyncio
from inferedge_moss import MossClient, DocumentInfo

async def main():
    # Initialize client
    client = MossClient("your-project-id", "your-project-key")

    # Create and populate an index
    documents = [
        DocumentInfo(id="1", text="Python is a programming language"),
        DocumentInfo(id="2", text="Machine learning with Python is popular"),
    ]
    await client.create_index("my-docs", documents, "moss-minilm")
    await client.load_index("my-docs")

    # Search
    results = await client.query("my-docs", "programming language", alpha=1.0)
    for doc in results.docs:
        print(f"{doc.id}: {doc.text} (score: {doc.score:.3f})")

asyncio.run(main())
```

### Hybrid Search Controls
`alpha` controls how much weight `query()` gives to semantic similarity versus keyword relevance:

```python
# Pure keyword search
await client.query("my-docs", "programming language", alpha=0.0)

# Mixed results (default 0.8 => semantic-heavy)
await client.query("my-docs", "programming language")

# Pure embedding search
await client.query("my-docs", "programming language", alpha=1.0)
```

Pick any value between 0.0 and 1.0 to tune the blend for your use case.
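The blend can be understood as a convex combination of the two scores. A minimal sketch of the idea (the function, score normalization, and exact formula are illustrative assumptions, not MOSS internals):

```python
def blend_score(semantic: float, keyword: float, alpha: float = 0.8) -> float:
    """Combine a semantic and a keyword relevance score.

    alpha=1.0 keeps only the semantic score; alpha=0.0 keeps only the
    keyword score. Both inputs are assumed to lie in [0, 1].
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    return alpha * semantic + (1.0 - alpha) * keyword

# With alpha=0.6, a doc scoring 0.9 semantically and 0.5 on keywords
# blends to 0.6 * 0.9 + 0.4 * 0.5 = 0.74
```

Under this view, raising `alpha` favors documents that are conceptually similar to the query even when they share few exact words, while lowering it rewards literal term overlap.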
## 📚 API Reference

### MossClient

`MossClient(project_id: str, project_key: str)`

Initialize the client with your InferEdge project credentials.

`create_index(index_name: str, docs: List[DocumentInfo], model_id: str) -> bool`

Create a new search index with documents. The index is automatically synced to cloud storage.

`load_index(index_name: str) -> str`

Load an index from cloud storage or local cache for querying.

`query(index_name: str, query: str, top_k: int = 5, alpha: Optional[float] = None) -> SearchResult`

Search for similar documents using semantic and keyword matching.

`add_docs(index_name: str, docs: List[DocumentInfo]) -> Dict[str, int]`

Add new documents to an existing index.

`list_indexes() -> List[IndexInfo]`

Get all available indexes in your project.
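For reference, `DocumentInfo` as used throughout this README carries an `id`, a `text` body, and optional `metadata`. A hypothetical stand-in showing that shape (this is a sketch for illustration, not the library's actual class):

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class DocumentInfoSketch:
    """Illustrative stand-in for inferedge_moss.DocumentInfo."""
    id: str
    text: str
    metadata: Optional[Dict[str, str]] = None

# Mirrors the construction pattern shown in the Quick Start example
doc = DocumentInfoSketch(
    id="doc1",
    text="How do I track my order?",
    metadata={"category": "shipping"},
)
```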
## 📬 Contact
For support, commercial licensing, or partnership inquiries, contact us: contact@inferedge.dev