This guide shows how to integrate Moss with Pipecat using the `pipecat-moss` package. This setup allows your voice AI to perform search with sub-10ms latency, ensuring your agents answer questions naturally without awkward “thinking” pauses.
Note: To explore a complete example of deploying `pipecat-moss`, visit Moss Samples.
## Why Use Moss with Pipecat?
Moss retrieval operates with exceptional speed, injecting results into the LLM context before the user completes their turn. This eliminates reliance on slow “tool calling” loops and keeps interactions natural and fluid.

## Required Tools
To integrate Moss with Pipecat, you will need:

- Python and the Pipecat framework
- The `pipecat-moss` integration package
- A Moss project, along with its Project ID and Project Key

## Integration Guide
### 1. Installation

Install the official Pipecat-Moss integration package:
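Assuming the package is published under the `pipecat-moss` name used throughout this guide:

```bash
pip install pipecat-moss
```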
### 2. Environment Setup

Create a `.env` file in your project root:
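The variable names below are an assumption (the Configuration section shows the service needs a Project ID and Project Key); match them to whatever your bot script loads:

```env
# Moss credentials (variable names assumed; match what your bot script reads)
MOSS_PROJECT_ID=your-project-id
MOSS_PROJECT_KEY=your-project-key
```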
### 3. Create Knowledge Base

Before running the bot, ensure your Moss index is uploaded. Use the provided script, then run it from the command line.
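The actual script is provided with the integration (see Moss Samples). Purely as an illustration of what it does, here is a minimal sketch assuming a hypothetical `MossClient` with a `create_index` helper; these names are invented for illustration, not the real SDK surface:

```python
# Hypothetical sketch only: the real upload script is provided with the
# integration, and MossClient / create_index are invented names.
import os
from pathlib import Path

from dotenv import load_dotenv
from moss import MossClient  # hypothetical import

load_dotenv()

client = MossClient(
    project_id=os.environ["MOSS_PROJECT_ID"],   # env vars from step 2
    project_key=os.environ["MOSS_PROJECT_KEY"],
)

# Upload local documents as an index the bot can query by name.
documents = [Path(p).read_text() for p in ["faq.md", "pricing.md"]]
client.create_index(name="my-knowledge-base", documents=documents)
```

Run it once before starting the bot; the `index_name` you pass to the pipeline in the next step must match the index created here.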
### 4. Build the Pipeline

The `MossRetrievalService` integrates as a processor in the Pipecat pipeline. It sits between the user input and the LLM, injecting relevant context automatically. Build the pipeline, then run your bot script as usual.
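A structural sketch of where the service sits is below. The `pipecat_moss` import path is an assumption, Pipecat import paths vary by version, and the STT/LLM/TTS services and transport are elided, so treat this as ordering guidance rather than copy-paste code:

```python
# Structural sketch (import path and service wiring are assumptions; adapt
# to your Pipecat version and providers).
import os

from dotenv import load_dotenv
from pipecat.pipeline.pipeline import Pipeline
from pipecat_moss import MossRetrievalService  # import path assumed

load_dotenv()

moss_service = MossRetrievalService(
    project_id=os.environ["MOSS_PROJECT_ID"],
    project_key=os.environ["MOSS_PROJECT_KEY"],
)

# stt, llm, tts, transport, and context_aggregator are your usual Pipecat
# services; they are elided here to keep the sketch focused on ordering.
pipeline = Pipeline([
    transport.input(),           # user audio in
    stt,                         # speech-to-text
    moss_service.query(          # retrieval runs before the LLM sees the turn
        index_name="my-knowledge-base",  # index created in step 3
        top_k=5,
        alpha=0.8,
    ),
    context_aggregator.user(),   # aggregate user turn + injected context
    llm,                         # LLM answers with Moss context already present
    tts,                         # text-to-speech
    transport.output(),          # agent audio out
    context_aggregator.assistant(),
])
```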
## Configuration

The `MossRetrievalService` allows you to tune how results are retrieved and presented to the LLM.
### Initialization
| Parameter | Type | Description |
|---|---|---|
| `project_id` | `str` | **Required.** Your Moss Project ID. |
| `project_key` | `str` | **Required.** Your Moss Project Key. |
| `system_prompt` | `str` | Prefix text added to the retrieved context. Default: `"Here is additional context retrieved from database:\n\n"`. |
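A minimal initialization sketch with the optional prefix overridden (the `pipecat_moss` import path and the prompt text are assumptions for illustration):

```python
import os

from pipecat_moss import MossRetrievalService  # import path assumed

moss_service = MossRetrievalService(
    project_id=os.environ["MOSS_PROJECT_ID"],
    project_key=os.environ["MOSS_PROJECT_KEY"],
    # Optional: replace the default prefix placed before retrieved chunks.
    system_prompt="Use the following support articles to answer:\n\n",
)
```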
### Pipeline Processor
When adding `moss_service.query()` to your pipeline, you can adjust the following:
| Parameter | Type | Default | Description |
|---|---|---|---|
| `index_name` | `str` | `None` | The name of the Moss index to query. |
| `top_k` | `int` | `5` | The number of text chunks to retrieve and inject. |
| `alpha` | `float` | `0.8` | Hybrid search weighting. `0.0` = keyword only; `1.0` = semantic (vector) only. `0.8` is recommended for most voice use cases. |
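For example, if callers frequently speak exact part numbers or product codes, you might shift the weighting toward keyword matching; the `0.3` below is an illustrative choice, not a documented recommendation:

```python
# Favor keyword matching for jargon-heavy queries (value is illustrative).
retrieval_processor = moss_service.query(
    index_name="my-knowledge-base",
    top_k=5,
    alpha=0.3,  # closer to 0.0 = more keyword weight, per the table above
)
```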