FAQ

Frequently asked questions about Sirchmunk.

How is this different from traditional RAG systems?

Sirchmunk takes an indexless approach:

  1. No pre-indexing: Direct file search without vector database setup
  2. Self-evolving: Knowledge clusters evolve based on search patterns
  3. Multi-level retrieval: Adaptive keyword granularity for better recall
  4. Evidence-based: Monte Carlo sampling for precise content extraction

What LLM providers are supported?

Any OpenAI-compatible API endpoint, including:

  • OpenAI (GPT-4, GPT-4o, GPT-3.5)
  • Local models served via Ollama, llama.cpp, vLLM, SGLang
  • Claude via API proxy
  • Any other OpenAI-compatible provider

Simply specify the path in your search query — no pre-processing or indexing required:

result = await searcher.search(
    query="Your question",
    paths=["/path/to/folder", "/path/to/file.pdf"]
)

Where are knowledge clusters stored?

Knowledge clusters are persisted in Parquet format at:

{SIRCHMUNK_WORK_PATH}/.cache/knowledge/knowledge_clusters.parquet

You can query them using DuckDB or the KnowledgeManager API.

How do I monitor LLM token usage?

Three ways:

  1. Web Dashboard: Visit the Monitor page for real-time statistics
  2. API: GET /api/v1/monitor/llm returns usage metrics
  3. Code: Access searcher.llm_usages after search completion

Does FILENAME_ONLY mode require an LLM?

No. FILENAME_ONLY mode performs fast filename-based search without any LLM calls. Only DEEP mode requires a configured LLM API key.

What file formats are supported?

Sirchmunk leverages ripgrep-all to search across 100+ file formats, including:

  • Source code (Python, JavaScript, Java, Go, Rust, etc.)
  • Documents (PDF, DOCX, XLSX, PPTX)
  • Archives (ZIP, TAR, GZ)
  • Data files (JSON, YAML, CSV, XML)
  • Plain text (TXT, MD, RST)
  • And many more

How does the knowledge system evolve?

Knowledge clusters follow a natural lifecycle:

  1. Creation — New evidence generates a new cluster
  2. Reuse — Similar queries match and enhance existing clusters
  3. Maturation — Repeated validation transitions clusters from Emerging to Stable
  4. Deprecation — Unsupported clusters transition to Contested or Deprecated
docs