
Adding Documents

Go to Knowledge Base → Upload Document. There are three ways to add content, each suited to a different source.


Option 1 — Upload a file

Supported format: .txt

Select your file, choose the target category, set the similarity threshold, and click Upload file.

PDF and DOCX support is on the roadmap. For now, copy content from these formats into a .txt file or use Direct Input.


Option 2 — Direct Input

Use this when you want to type or paste content directly — for example, a single FAQ page, a product description, or a short policy document.

Fill in:

  • Title — used to identify the document in the list
  • Content — paste your text
  • Category — which category to add it to
  • Similarity threshold — see below

Click Upload text to save.


Option 3 — From URL (Site Crawler)

Use this to index an entire website or section of a site automatically.

Fill in:

Field                 Description
URL                   Starting URL, must include https://
Scanning Depth        How many link levels deep to follow (default: 10)
Max Pages to Scan     Hard limit on total pages (default: 100)
Category              Where to store the indexed pages
Similarity threshold  See below

Click Scan the site. The crawler runs in the background — you’ll see a progress indicator in the interface. You can navigate away; the crawl continues and a status notification appears when complete.

While a crawl is running, the Upload and Direct Input tabs are disabled for that session. Wait for the crawl to finish before starting another crawl or upload.
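
To make the interaction between Scanning Depth and Max Pages to Scan more concrete, here is a minimal sketch of a breadth-first crawl bounded by both limits. It is not the product's actual crawler: the function name `crawl`, the `get_links` callback, and the in-memory `site` link map are all hypothetical, and the sketch assumes the starting page counts as depth 0 and counts toward the page limit.

```python
from collections import deque

def crawl(start_url, get_links, max_depth=10, max_pages=100):
    """Breadth-first crawl bounded by link depth and total page count.

    Assumption: the start page is depth 0 and counts toward max_pages.
    get_links is a hypothetical callback returning the links found on a page.
    """
    if not start_url.startswith("https://"):
        raise ValueError("start URL must include https://")
    pages = [start_url]            # pages selected for indexing, in discovery order
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue               # depth limit reached: do not follow links from here
        for link in get_links(url):
            if link in seen:
                continue
            if len(pages) >= max_pages:
                return pages       # hard page limit reached: stop the crawl
            seen.add(link)
            pages.append(link)
            queue.append((link, depth + 1))
    return pages

# Hypothetical link map standing in for a real website.
site = {
    "https://example.com/": ["https://example.com/a", "https://example.com/b"],
    "https://example.com/a": ["https://example.com/c"],
    "https://example.com/b": [],
    "https://example.com/c": ["https://example.com/d"],
    "https://example.com/d": [],
}

def site_links(url):
    return site.get(url, [])

shallow = crawl("https://example.com/", site_links, max_depth=1)
capped = crawl("https://example.com/", site_links, max_pages=3)
```

In this sketch, `shallow` holds only the start page and the pages it links to directly, while `capped` stops after three pages even though deeper links remain. Whichever limit is hit first ends the crawl.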


Similarity Threshold

Every upload method has a Duplicate Similarity Threshold slider (default: 90%).

When you add a new document, the system compares it against existing documents in the same category. If the similarity score exceeds the threshold, the new document replaces the existing one instead of being stored as a duplicate.

When to adjust:

  • Higher (95–100%) — only replace near-identical content. Useful if you have many similar but intentionally distinct documents.
  • Lower (70–85%) — replace documents that are substantially the same even if wording has changed. Useful for keeping updated web pages clean.

The default of 90% works well for most cases.
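
The replace-or-add decision described above can be sketched as follows. This is an illustration only: the source does not say how similarity is computed, so the sketch substitutes Python's `difflib.SequenceMatcher` as a stand-in metric. The `add_document` function and the list-of-dicts category store are hypothetical names, not part of the product.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Stand-in similarity score in [0, 1]; the real metric is not documented."""
    return SequenceMatcher(None, a, b).ratio()

def add_document(category_docs, title, content, threshold=0.90):
    """Add a document to one category, replacing the closest match
    when its similarity score exceeds the threshold.

    category_docs is a hypothetical in-memory list of {"title", "content"} dicts.
    Returns "replaced" or "added".
    """
    best_index, best_score = None, 0.0
    for i, doc in enumerate(category_docs):
        score = similarity(doc["content"], content)
        if score > best_score:
            best_index, best_score = i, score

    new_doc = {"title": title, "content": content}
    if best_index is not None and best_score > threshold:
        category_docs[best_index] = new_doc   # treated as an update of the existing entry
        return "replaced"
    category_docs.append(new_doc)             # distinct enough to keep as a new entry
    return "added"

docs = []
first = add_document(docs, "Returns", "Returns are accepted within 30 days.")
second = add_document(docs, "Returns v2", "Returns are accepted within 30 days.")
third = add_document(docs, "Shipping", "Orders ship in 5 business days.")
```

Here the second call replaces the first document because the content is identical, while the third is stored separately because it scores well below the threshold. Lowering `threshold` makes replacement more aggressive; raising it keeps more near-duplicates as separate entries.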


Document statuses

After upload, documents go through processing:

Status      Meaning
processing  Being indexed; not yet searchable
ready       Indexed and available for agent search
error       Processing failed; try re-uploading

For URL crawls, each page gets its own document entry. You can monitor individual page statuses from the Documents list.