Knowledge & Data Sources

Learn how to upload documents, connect websites, create Q&As, and manage all the content that powers your AI agent.

Overview

The Knowledge tab in the Docimal dashboard is where you manage all the content that powers your AI agent. It allows you to upload documents, add structured text snippets, crawl websites or sitemaps, create custom Q&As, and integrate with Notion. These options give you full control over the information your agent is trained on, helping ensure accurate, relevant, and up-to-date responses for your users.

Pro Tip: Standard and Pro plans include Auto Sync, which automatically updates your agent's knowledge base every 24 hours to keep information current without manual intervention.

Multiple Data Sources

Upload files, add text snippets, crawl websites, or integrate with Notion—all in one place

Smart Processing

Automatic text extraction, chunking, and vector embeddings for semantic search

Real-time Updates

Changes to your knowledge base are reflected in your agent after retraining

Version Control

Track when each source was added and last updated for complete visibility

Files

The Files section allows you to upload and manage various document types to train your AI agent. This is the most common way to provide knowledge to your agent.

Supported File Types

Docimal supports the following file formats:

  • .pdf — PDF Documents
  • .txt — Plain Text Files
  • .doc / .docx — Microsoft Word Documents
  • .csv — Comma-Separated Values
  • .md — Markdown Files
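
Before uploading a large batch, it can help to check locally which files have a supported extension. The following is a minimal sketch and is not part of Docimal: the helper name `split_by_support` is hypothetical, and the extension set is simply copied from the list above. It only looks at file names, not at file contents.

```python
from pathlib import Path

# Extensions copied from the supported-formats list above.
SUPPORTED_EXTENSIONS = {".pdf", ".txt", ".doc", ".docx", ".csv", ".md"}


def split_by_support(paths):
    """Split paths into (supported, unsupported) by file extension.

    Hypothetical helper for a local pre-upload check; it compares the
    extension case-insensitively and does not open the files.
    """
    supported, unsupported = [], []
    for path in map(Path, paths):
        if path.suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(path.name)
        else:
            unsupported.append(path.name)
    return supported, unsupported


if __name__ == "__main__":
    ok, skipped = split_by_support(["guide.PDF", "notes.md", "logo.png"])
    print("upload:", ok)
    print("skip:", skipped)
```

Files that land in the second list would need to be converted to one of the supported formats before upload.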

Uploading Files

  1. Click the Upload Files button in the Knowledge tab
  2. Select one or more documents from your device
  3. Files are queued and uploaded one by one; you can monitor the status of each upload in real time
  4. Once the uploads finish, click Retrain Agent to process the new content
[Screenshot: File Upload Interface]

Preview and Metadata

After upload, you can:

  • Click on any document to preview its contents directly within the dashboard
  • View timestamps indicating exactly when each file was added and last updated
  • See file size, type, and processing status
  • Check the number of chunks/segments created from each document
[Screenshot: File Preview & Metadata View]
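
The chunk count shown for each document comes from splitting the extracted text into segments. Docimal's actual chunking method is not described here, so the sketch below assumes a simple fixed-size word window with overlap, purely to illustrate why longer documents produce more chunks. The names `chunk_words`, `size`, and `overlap` are hypothetical.

```python
def chunk_words(text, size=200, overlap=20):
    """Split text into overlapping chunks of at most `size` words.

    Illustrative only: fixed-size word windows are an assumption, not
    Docimal's documented behavior.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(10))
    for chunk in chunk_words(sample, size=4, overlap=1):
        print(chunk)
```

The overlap repeats a few words between neighboring chunks so that a sentence falling on a boundary is not split away from its context.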

File Management

To delete a single file, open its three-dot menu and select Delete. To delete all files at once, tick the checkbox next to "File sources" to select every document, then click the Delete button that appears.

Text Snippets

The Text Snippets feature allows you to add and manage structured content without uploading files. This is ideal for maintaining smaller, frequently updated pieces of information separate from document uploads.

Creating Text Snippets

Create multiple text snippets, each with a unique title to help you easily identify the content. This is particularly useful for segmenting information by topic, department, or use case.

Rich Text Formatting

Each snippet supports full rich text editing:

  • Add headings for clarity
  • Format with bold, italic, or strikethrough
  • Create ordered or bullet lists
  • Insert hyperlinks to external sources
  • Use Markdown syntax for better structure
[Screenshot: Text Snippet Editor]

Common Use Cases

Company Policies

Store HR policies, code of conduct, or internal guidelines

Product Information

Maintain product specs, pricing, or feature descriptions

Quick Updates

Add temporary announcements or seasonal information

Structured Data

Store contact lists, schedules, or reference tables

Website Crawling

The Website Crawling feature enables you to train your AI agent using content directly from websites. Whether you're working with a full site, a sitemap, or individual URLs, this tool gives you flexible control over what gets included in your agent's knowledge base.

Crawling Options

1. Crawl a Full Website

Provide the homepage URL and let Docimal discover all public pages automatically

2. Submit a Sitemap

Point to an XML sitemap to fetch a structured list of URLs efficiently

3. Add Individual Links

Manually input specific URLs you want to include for precise control
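
To preview which URLs a sitemap would contribute before submitting it, you can list its entries locally. This is a minimal sketch using Python's standard library and the namespace defined by the sitemaps.org protocol. The function name `urls_from_sitemap` and the example.com URLs are placeholders, and the code does not reflect how Docimal itself reads sitemaps.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text):
    """Return the <loc> value of every <url> entry in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Placeholder sitemap content for illustration.
EXAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/docs/intro</loc></url>
  <url><loc>https://example.com/blog/launch</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in urls_from_sitemap(EXAMPLE):
        print(url)
```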

Path Filtering

Refine your crawl using path filters:

  • Include Paths — Only URLs matching these paths will be fetched (e.g., /docs/*, /blog/*)
  • Exclude Paths — URLs matching these paths will be skipped (e.g., /admin/*, /login)
You can specify multiple paths in both fields. Press the space bar after each path entry. Multiple websites or links can be crawled in parallel for efficiency.
[Screenshot: Website Crawling Settings]
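
The effect of include and exclude paths can be sketched as follows. The exact matching rules Docimal applies are not specified here, so this example makes three assumptions: patterns are shell-style wildcards matched against the URL path, an exclude match always wins, and an empty include list means every path is allowed. The function name `should_crawl` is hypothetical.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def should_crawl(url, include=(), exclude=()):
    """Decide whether a URL passes the include/exclude path filters.

    Assumptions (not confirmed by Docimal): wildcard patterns are matched
    against the URL path, excludes take priority over includes, and no
    include patterns means "include everything".
    """
    path = urlparse(url).path or "/"
    if any(fnmatchcase(path, pattern) for pattern in exclude):
        return False
    if not include:
        return True
    return any(fnmatchcase(path, pattern) for pattern in include)


if __name__ == "__main__":
    include = ["/docs/*", "/blog/*"]
    exclude = ["/admin/*", "/login"]
    for url in (
        "https://example.com/docs/setup",
        "https://example.com/login",
        "https://example.com/pricing",
    ):
        print(url, should_crawl(url, include, exclude))
```

Under these assumptions, a pattern such as /docs/* matches pages below /docs/ but not /docs itself, so it is worth checking a few real URLs after the crawl completes.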

Link Management

Once crawling is complete:

  • All links from a single domain are grouped under the homepage URL for easy management
  • Click on a homepage group to view all fetched links
  • Preview the content of each link by clicking on it
  • Edit or exclude specific links from a group as needed
  • Recrawl the website anytime to fetch new pages

Custom Q&A Training

The Custom Q&A feature lets you train your AI agent with specific question-and-answer pairs, enabling it to respond precisely to frequently asked or business-specific queries. This ensures your agent provides exact answers for critical questions.

Creating Q&As

  • Each Q&A entry starts with a descriptive title for quick organization
  • Add multiple question variations to improve recognition and matching
  • Provide a single definitive answer that will be used when matched
  • Use rich text formatting to make answers clear and scannable
[Screenshot: Q&A Editor Interface]

Answer Priority

Custom Q&A answers take precedence over general knowledge retrieval. When a user's question matches a Q&A entry, your agent will use the exact answer you provided instead of generating a response from documents.
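
The precedence rule can be pictured as a lookup that runs before document retrieval. The sketch below is illustrative only: how Docimal decides that a question "matches" is not described here, so the example assumes a normalized exact match against the question variations. The names `normalize`, `answer`, and `retrieve` are hypothetical, and `retrieve` stands in for whatever document-based response the agent would otherwise generate.

```python
import string

# Translation table that removes ASCII punctuation.
_STRIP_PUNCTUATION = str.maketrans("", "", string.punctuation)


def normalize(question):
    """Lowercase, remove punctuation, and collapse whitespace."""
    return " ".join(question.lower().translate(_STRIP_PUNCTUATION).split())


def answer(question, qa_entries, retrieve):
    """Return a Q&A answer when a variation matches, else fall back.

    qa_entries: list of dicts with "questions" (variations) and "answer".
    retrieve: callable used when no Q&A entry matches.
    """
    wanted = normalize(question)
    for entry in qa_entries:
        if any(normalize(v) == wanted for v in entry["questions"]):
            return entry["answer"]  # exact stored answer takes precedence
    return retrieve(question)


if __name__ == "__main__":
    qa = [{
        "questions": ["What are your hours?", "When are you open?"],
        "answer": "9am to 5pm, Monday to Friday.",
    }]
    print(answer("when are you open", qa, lambda q: "(answer from documents)"))
    print(answer("Do you ship abroad?", qa, lambda q: "(answer from documents)"))
```

This also shows why adding several question variations helps: each variation is another chance for the stored answer to be selected instead of a generated one.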

Usage Analytics

Click on any Q&A to open its detail view and see:

  • Number of times the question has been asked by users (updated in real-time)
  • Last time the question was asked
  • Date the Q&A was added
  • Visual chart showing frequency over time

These insights help you identify which topics matter most to your users and prioritize updates accordingly.

[Screenshot: Q&A Usage Analytics Chart]

Best Practices for Q&As

✓ Do

  • Cover edge cases and specific scenarios
  • Include multiple phrasings of the same question
  • Keep answers concise but complete
  • Update regularly based on analytics

✗ Avoid

  • Overly generic questions
  • Answers that frequently change
  • Duplicate Q&As with slight variations
  • Questions already covered well by documents

Notion Integration

Connect your Notion workspace to Docimal to enable your AI agent to access and utilize information stored in your Notion databases. This integration keeps your knowledge base synchronized with your team's documentation in Notion.

Setup Requirements

When integrating with a Notion account on a paid plan, ensure you have admin access to provide all necessary permissions for the integration to work properly.

What Gets Synced

  • All pages and sub-pages you grant access to
  • Database entries with their properties
  • Text content, headings, and formatting
  • Linked pages and references

Content updates in Notion are synced automatically when using Auto Sync (available on Standard and Pro plans).

Auto Sync

Auto Sync automatically keeps your AI agent up-to-date by pulling the latest content from your data sources every 24 hours. This ensures your agent always has access to the most current information without requiring manual intervention.

Plan Availability: Auto Sync is available on Standard and Pro plans only. Free plan users will need to manually retrain their agent after updating sources.

Supported Data Sources

Auto Sync works with the following source types:

  • Websites — Discovers newly added links and updates existing page content
  • Notion — Syncs changes from your connected Notion workspace
  • Remote Storage — Google Drive, Dropbox, and other cloud sources

How It Works

  1. Your agent automatically fetches new content from all connected sources once every 24 hours
  2. Newly added pages or links on your website are automatically discovered and included
  3. Changes in Notion pages and databases are detected and synced
  4. No manual action is required; updates happen in the background
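
The 24-hour cycle described above can be modeled as a simple "is this source due?" check. This is a sketch under stated assumptions rather than Docimal's scheduler: the source-type labels ("website", "notion", "remote_storage", "file"), the dictionary fields, and the function name `due_for_sync` are all hypothetical. The only facts taken from this page are the 24-hour interval and the three supported source types.

```python
from datetime import datetime, timedelta, timezone

SYNC_INTERVAL = timedelta(hours=24)

# Hypothetical labels for the source types that Auto Sync supports.
AUTO_SYNC_SOURCE_TYPES = {"website", "notion", "remote_storage"}


def due_for_sync(sources, now):
    """Return the names of sources whose last sync is at least 24 hours old.

    Sources of other types (for example uploaded files) are never
    returned, because Auto Sync does not cover them.
    """
    return [
        source["name"]
        for source in sources
        if source["type"] in AUTO_SYNC_SOURCE_TYPES
        and now - source["last_synced"] >= SYNC_INTERVAL
    ]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sources = [
        {"name": "site", "type": "website", "last_synced": now - timedelta(hours=25)},
        {"name": "wiki", "type": "notion", "last_synced": now - timedelta(hours=3)},
        {"name": "handbook.pdf", "type": "file", "last_synced": now - timedelta(days=7)},
    ]
    print(due_for_sync(sources, now))
```

Uploaded files and text snippets fall outside this check, which is why they still need a manual retrain after you change them.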

Best Practices

Use Plain Text with Markdown

Provide content as plain text wherever possible. Use Markdown syntax for formatting, since it is processed more accurately than complex document layouts. Avoid images that contain text; use actual text instead.

Structure Your Content

Use clear headings, bullet points, and logical sections. Well-structured content helps the AI understand context and relationships between information, leading to better responses.

Files Must Contain Selectable Text

When uploading PDFs, ensure they contain selectable text rather than scanned images. Use OCR tools to convert image-based PDFs to text-selectable format before uploading.

Always Retrain After Changes

Remember to click the "Retrain Agent" button after adding, deleting, or updating your knowledge sources. Changes won't be reflected in your agent until retraining is complete.

Organize by Topic

Use descriptive names for files, snippets, and Q&As. Group related content together and consider creating separate knowledge bases for different topics or departments for better organization.

Test Incrementally

Don't wait until you've uploaded all documents. Test your agent in the Playground after each major content addition to catch issues early and verify response quality.
