MongoDB Integration

Connect to MongoDB databases and sync document collections to your knowledge base. Perfect for flexible, document-oriented data.

How It Works

The MongoDB connector reads documents from your collections and converts them into searchable text. Nested documents and arrays are automatically flattened for comprehensive indexing.

  • Automatically discovers collections and fields
  • Handles nested documents and arrays
  • Supports selective collection and field sync
  • Enables incremental sync via timestamp or ObjectId

Read-Only Access

RAG Chats only reads data from your database. We never modify, insert, or delete any documents.

Connecting MongoDB

Add Database Source

From Knowledge Base, click + Add Source → MongoDB.

Enter Connection Details

You can connect using either a connection string or individual parameters:

Option 1: Connection String

mongodb+srv://user:password@cluster.mongodb.net/database

Option 2: Individual Parameters

  • Host: Database server address
  • Port: Usually 27017
  • Database: Database name
  • Username: Database user
  • Password: User password

Configure Authentication

For authenticated databases:

  • Auth Source: Authentication database (usually "admin")
  • Replica Set: Name if using a replica set
  • TLS: Enable for secure connections
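To show how the individual parameters relate to a connection string, here is a minimal sketch that assembles one from the other. The function name `buildMongoUri` and its parameter names are hypothetical (they simply mirror the form fields above); `authSource`, `replicaSet`, and `tls` are standard MongoDB connection string options. How RAG Chats builds its connection internally is not documented here.

```javascript
// Hypothetical helper: assembles a MongoDB connection URI from the
// individual form fields. Illustrative only, not the connector's own code.
function buildMongoUri({
  host = "localhost",
  port = 27017,
  database,
  username,
  password,
  authSource = "admin",
  replicaSet,
  tls = false,
}) {
  if (!database) throw new Error("database is required");

  // Percent-encode credentials so characters such as "@" or ":" do not break the URI.
  const credentials = username
    ? `${encodeURIComponent(username)}:${encodeURIComponent(password ?? "")}@`
    : "";

  // Only add options that were actually set.
  const options = new URLSearchParams();
  if (username) options.set("authSource", authSource);
  if (replicaSet) options.set("replicaSet", replicaSet);
  if (tls) options.set("tls", "true");

  const query = options.toString();
  const base = `mongodb://${credentials}${host}:${port}/${encodeURIComponent(database)}`;
  return query ? `${base}?${query}` : base;
}
```

Encoding the username and password matters in practice: a password containing `@` or `:` would otherwise be read as part of the host.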

Select Collections

After connecting, select which collections to sync. You can sync all collections or choose specific ones.
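The selection rule can be sketched as follows, assuming an empty selection means "sync everything" and that system collections (names starting with "system.") are always skipped. The function name `resolveCollections` is hypothetical.

```javascript
// Hypothetical helper: decides which collections end up being synced.
function resolveCollections(available, selected = []) {
  // System collections (names starting with "system.") are always skipped.
  const syncable = available.filter((name) => !name.startsWith("system."));

  // An empty selection means all syncable collections.
  if (selected.length === 0) return syncable;

  // Otherwise keep only the selected collections that actually exist.
  const wanted = new Set(selected);
  return syncable.filter((name) => wanted.has(name));
}
```

A selected name that does not exist in the database simply produces no output, which is one reason a sync can finish with no data.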

Configure Fields (Optional)

By default, all document fields are indexed. You can customize which fields to include per collection.

Start Sync

Click Connect to begin syncing. Documents are processed in batches.

Connection Settings

Connection String (string, default: none)

Full MongoDB connection string (alternative to individual params)

Host (string, default: localhost)

Database server hostname or IP

Port (integer, default: 27017)

MongoDB server port

Database (string, required)

Name of the database to connect to

Username (string, default: none)

Database user (if authentication enabled)

Password (string, default: none)

User password (stored encrypted)

Auth Source (string, default: admin)

Database for authentication

Replica Set (string, default: none)

Replica set name (if applicable)

TLS (boolean, default: false)

Enable TLS/SSL connection

Sync Settings

Collections (array, default: all collections)

Specific collections to sync (empty = all)

Text fields (object, default: all fields)

Fields to index per collection (supports dot notation)

Timestamp field (string, default: none)

Field for incremental sync (e.g., updatedAt)

Max depth (integer, default: 5)

Maximum nesting depth for document extraction (1-10)

Batch size (integer, default: 1000)

Documents per batch (100-10,000)
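As a rough illustration of how these defaults and ranges fit together, the sketch below applies the documented defaults and rejects out-of-range values. The key names (`collections`, `textFields`, `timestampField`, `maxDepth`, `batchSize`) and the function `normalizeSyncSettings` are assumptions made for this example; the settings are configured through the RAG Chats interface, and no configuration file format is implied.

```javascript
// Hypothetical helper: merges user overrides with the documented defaults
// and validates the documented ranges.
function normalizeSyncSettings(overrides = {}) {
  const settings = {
    collections: [],      // empty = all collections
    textFields: {},       // empty = all fields
    timestampField: null, // no incremental sync by timestamp
    maxDepth: 5,
    batchSize: 1000,
    ...overrides,
  };

  const inRange = (value, min, max) =>
    Number.isInteger(value) && value >= min && value <= max;

  if (!inRange(settings.maxDepth, 1, 10)) {
    throw new RangeError("maxDepth must be an integer from 1 to 10");
  }
  if (!inRange(settings.batchSize, 100, 10000)) {
    throw new RangeError("batchSize must be an integer from 100 to 10,000");
  }
  return settings;
}
```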

Nested Documents

MongoDB documents often contain nested objects and arrays. The connector automatically flattens these for searchability:

{
  "_id": "507f1f77bcf86cd799439011",
  "title": "Product Guide",
  "author": {
    "name": "Jane Smith",
    "department": "Documentation"
  },
  "tags": ["guide", "getting-started", "tutorial"],
  "sections": [
    { "title": "Introduction", "content": "Welcome to..." },
    { "title": "Setup", "content": "First, install..." }
  ]
}

Becomes searchable text:

Collection: articles
title: Product Guide
author.name: Jane Smith
author.department: Documentation
tags: guide, getting-started, tutorial
sections: title: Introduction, content: Welcome to..., title: Setup, content: First, install...
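The transformation above can be approximated with a short recursive function. This is a sketch that reproduces the sample output, not the connector's actual implementation: the function name `flattenDocument`, the decision to omit `_id`, and the way `maxDepth` cuts off deeper levels are all assumptions inferred from the example.

```javascript
// Hypothetical sketch of the flattening step shown above.
function flattenDocument(doc, collection, maxDepth = 5) {
  const lines = [`Collection: ${collection}`];

  const isPlainObject = (value) =>
    value !== null && typeof value === "object" && !Array.isArray(value);

  // Render one array element: scalars as-is, objects as "key: value" pairs.
  // Objects nested inside array elements are not expanded in this sketch.
  const renderItem = (item) =>
    isPlainObject(item)
      ? Object.entries(item).map(([key, value]) => `${key}: ${value}`).join(", ")
      : String(item);

  const walk = (value, path, depth) => {
    if (depth > maxDepth) return; // assumed truncation rule
    if (Array.isArray(value)) {
      lines.push(`${path}: ${value.map(renderItem).join(", ")}`);
    } else if (isPlainObject(value)) {
      for (const [key, child] of Object.entries(value)) {
        walk(child, `${path}.${key}`, depth + 1);
      }
    } else {
      lines.push(`${path}: ${value}`);
    }
  };

  for (const [key, value] of Object.entries(doc)) {
    if (key === "_id") continue; // the sample output omits _id
    walk(value, key, 1);
  }
  return lines.join("\n");
}
```

Calling `flattenDocument(doc, "articles")` on the sample document yields the text shown above. With a lower `maxDepth`, fields nested below that level are dropped, which is the usual cause of missing nested data.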

Field Selection with Dot Notation

You can specify exactly which fields to index using dot notation for nested fields:

{
  "articles": ["title", "content", "author.name"],
  "products": ["name", "description", "specs.features"]
}

Array Fields

For array fields, use [] in the path: comments[].text will extract the text field from each comment in the array.
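To make the path syntax concrete, here is a minimal sketch of how a path such as `author.name` or `comments[].text` could be resolved against a document. The function name `extractPath` and the choice to return an array of matches are assumptions for illustration; the connector's real resolver may behave differently on edge cases.

```javascript
// Hypothetical resolver for dot-notation paths with "[]" array markers.
// Returns every value the path matches (an empty array if nothing matches).
function extractPath(doc, path) {
  let current = [doc];

  for (const rawSegment of path.split(".")) {
    const isArraySegment = rawSegment.endsWith("[]");
    const key = isArraySegment ? rawSegment.slice(0, -2) : rawSegment;
    const next = [];

    for (const node of current) {
      if (node === null || typeof node !== "object") continue;
      const value = node[key];
      if (value === undefined) continue;

      // "[]" fans out over each element; otherwise keep the value as-is.
      if (isArraySegment && Array.isArray(value)) next.push(...value);
      else next.push(value);
    }
    current = next;
  }
  return current;
}
```

For example, `extractPath({ comments: [{ text: "a" }, { text: "b" }] }, "comments[].text")` collects the `text` field from each comment.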

Incremental Sync

The connector supports two methods for incremental sync:

1. Timestamp Field

If your documents have an updatedAt or similar field:

  • Only documents modified since the last sync are processed
  • Great for documents that track modification time
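In query terms, timestamp-based incremental sync amounts to filtering on the timestamp field with MongoDB's `$gt` operator and remembering the newest value seen. The sketch below assumes the timestamp field holds `Date` values; the function names `buildTimestampFilter` and `nextCheckpoint` are hypothetical, and the connector's real checkpoint logic is not documented here.

```javascript
// Hypothetical helper: builds the query filter for one incremental run.
function buildTimestampFilter(timestampField, lastSyncedAt) {
  if (!lastSyncedAt) return {}; // first run: no checkpoint yet, so sync everything
  return { [timestampField]: { $gt: lastSyncedAt } };
}

// Hypothetical helper: returns the newest timestamp among the synced
// documents, to be stored as the checkpoint for the next run.
function nextCheckpoint(docs, timestampField, previous = null) {
  return docs.reduce((latest, doc) => {
    const ts = doc[timestampField];
    return ts instanceof Date && (!latest || ts > latest) ? ts : latest;
  }, previous);
}
```

A filter such as `{ updatedAt: { $gt: lastSyncedAt } }` is only efficient if the field is indexed, which is why an index on the timestamp field is recommended for large collections.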

2. ObjectId-Based

If you don't have a timestamp field, the connector can use MongoDB's _id field:

  • Only documents created after the last sync are processed
  • Doesn't catch updates to existing documents
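ObjectId-based sync works because the first 4 bytes of a MongoDB ObjectId encode its creation time in seconds since the Unix epoch. The sketch below shows that relationship using plain hex strings; the function names are hypothetical, and whether the connector compares against a time boundary or against the last `_id` it saw is an assumption left open here.

```javascript
// Hypothetical helper: the smallest ObjectId hex string whose embedded
// creation time equals the given date (timestamp bytes, then all zeros).
function objectIdFloorForDate(date) {
  const seconds = Math.floor(date.getTime() / 1000);
  return seconds.toString(16).padStart(8, "0") + "0".repeat(16);
}

// Hypothetical helper: reads the creation time back out of an ObjectId hex string.
function timestampFromObjectId(hex) {
  return new Date(parseInt(hex.slice(0, 8), 16) * 1000);
}
```

With a MongoDB driver, the boundary string would typically be wrapped in the driver's ObjectId type and used in a filter of the form `{ _id: { $gt: boundary } }`. Because the embedded time never changes after insertion, this approach cannot detect edits to existing documents.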

MongoDB Atlas

For MongoDB Atlas (cloud) deployments:

  1. Use the connection string from Atlas Dashboard → Connect → Connect your application
  2. Enable TLS (required for Atlas)
  3. Ensure your IP is allowlisted in Atlas Network Access settings

mongodb+srv://username:password@cluster0.xxxxx.mongodb.net/myDatabase?retryWrites=true&w=majority
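Before pasting an Atlas connection string, it can help to sanity-check its parts. The sketch below uses the standard `URL` class to do that; the function name `checkAtlasUri` and the shape of its result are hypothetical. It relies on the documented MongoDB behavior that `mongodb+srv` connection strings enable TLS by default unless `tls` is set explicitly, and it does not handle multi-host `mongodb://` strings.

```javascript
// Hypothetical helper: inspects a single-host MongoDB connection string.
function checkAtlasUri(uri) {
  const url = new URL(uri);
  const params = url.searchParams;
  const isSrv = url.protocol === "mongodb+srv:";

  return {
    isSrv,
    host: url.hostname,
    // The path after the first "/" is the default database, if any.
    database: decodeURIComponent(url.pathname.slice(1)) || null,
    hasCredentials: url.username !== "",
    // An explicit tls option wins; otherwise SRV strings default to TLS on.
    tlsEnabled: params.has("tls") ? params.get("tls") === "true" : isSrv,
  };
}
```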

Security Best Practices

  • Create a read-only user: Never use admin credentials
  • Limit collection access: Grant read only on needed collections
  • Use TLS: Always enable TLS for production databases
  • IP allowlisting: Restrict database access to RAG Chats IPs
  • Exclude sensitive data: Don't sync collections with PII

Example: Creating a Read-Only User

// Connect to the admin database
use admin

// Create a read-only user for a specific database
db.createUser({
  user: "ragchats_reader",
  pwd: "secure_password",
  roles: [
    { role: "read", db: "myDatabase" }
  ]
})

// Or for specific collections only. Built-in roles cannot be scoped to a
// single collection, so first define a custom role that grants the find
// action (the role name below is only an example)
db.createRole({
  role: "ragchats_collections_reader",
  privileges: [
    { resource: { db: "myDatabase", collection: "articles" }, actions: ["find"] },
    { resource: { db: "myDatabase", collection: "products" }, actions: ["find"] }
  ],
  roles: []
})

// Then create the user with that custom role
db.createUser({
  user: "ragchats_reader",
  pwd: "secure_password",
  roles: [
    { role: "ragchats_collections_reader", db: "admin" }
  ]
})

Troubleshooting

Connection failed

  • Verify connection string or host/port settings
  • Check if MongoDB server is running and accessible
  • Ensure firewall rules allow the connection
  • For Atlas, verify IP is allowlisted
  • Check TLS settings match server requirements

Authentication failed

  • Verify username and password
  • Check auth source database (usually "admin")
  • Ensure the user exists and has correct permissions

No data synced

  • Verify the selected collections contain documents (empty collections produce no output)
  • Ensure selected collections exist
  • System collections (starting with "system.") are skipped

Nested data not appearing

  • Increase max depth setting
  • Check if nested fields are specified correctly
  • Nesting beyond the max depth setting is truncated (the setting allows at most 10 levels)

Sync is slow

  • Reduce the number of collections being synced
  • Use incremental sync with a timestamp field
  • Increase batch size (up to 10,000) for faster processing
  • Add an index on your timestamp field

Removing the Integration

  1. In RAG Chats, delete the MongoDB data source
  2. Optionally, remove the database user:
    use admin
    db.dropUser("ragchats_reader")