MongoDB Indexing Strategies
Introduction to MongoDB Indexing
Indexing is one of the most critical aspects of MongoDB performance optimization. A well-designed indexing strategy can transform slow queries into lightning-fast operations, while poor indexing can cripple even the most powerful hardware. MongoDB indexes are data structures that improve the speed of data retrieval operations on a collection at the cost of additional storage space and write performance overhead.
Think of indexes like the index in a book - instead of reading every page to find information about a specific topic, you can use the index to jump directly to the relevant pages. Similarly, MongoDB indexes allow the database to quickly locate documents without scanning every document in a collection.
Key Benefits of Proper Indexing
- Query Performance: Dramatically reduce query execution time from seconds to milliseconds
- Memory Efficiency: Keep frequently accessed data in memory
- Scalability: Maintain performance as data volume grows
- Resource Optimization: Reduce CPU and I/O usage
- User Experience: Ensure responsive application performance
Understanding Index Fundamentals
-
How MongoDB Indexes Work
MongoDB indexes use B-tree data structures that maintain sorted pointers to documents. When a query is executed, MongoDB’s query optimizer determines whether an index can be used to satisfy the query efficiently.
Index Structure:
// Collection document{ _id: ObjectId("..."), name: "Alice", age: 30, city: "New York" }// Index on 'name' field creates a sorted structure:// Index Entry -> Document Location// "Alice" -> Document 1// "Bob" -> Document 3// "Charlie" -> Document 2Query Execution Without Index:
- Collection scan: O(n) complexity
- Examines every document in collection
- Performance degrades linearly with data size
Query Execution With Index:
- Index scan: O(log n) complexity
- Directly locates relevant documents
- Consistent performance regardless of collection size
-
Index Types Overview
MongoDB supports various index types, each optimized for specific use cases:
Primary Index Types:
- Single Field: Index on one field
- Compound: Index on multiple fields
- Multikey: Automatically handles array values
- Text: Full-text search capabilities
- Geospatial: Location-based queries
- Hashed: Evenly distributed shard keys
Specialized Index Types:
- Partial: Index subset of documents
- Sparse: Skip documents missing indexed fields
- TTL: Automatically expire documents
- Unique: Enforce uniqueness constraints
Core Indexing Strategies
-
Single Field Indexes
Single field indexes are the foundation of MongoDB indexing, created on individual document fields to optimize queries and sorting operations.
Creating Single Field Indexes:
// Create ascending index on email fielddb.users.createIndex({ email: 1 });// Create descending index on timestampdb.events.createIndex({ timestamp: -1 });// Create index with optionsdb.users.createIndex({ username: 1 },{unique: true,name: "unique_username_idx",background: true,});Optimized Query Patterns:
// Efficient queries using single field indexes// Equality queriesdb.users.find({ email: "alice@example.com" });// Range queriesdb.events.find({ timestamp: { $gte: ISODate("2023-01-01") } });// Sorting operationsdb.products.find().sort({ price: -1 });// Prefix matching with regular expressionsdb.users.find({ username: /^alice/ });Performance Considerations:
- Index selectivity: High cardinality fields perform better
- Query patterns: Match index sort order for optimal performance
- Write overhead: Each index adds overhead to insert/update operations
-
Compound Indexes
Compound indexes span multiple fields and are essential for optimizing complex queries. The order of fields in compound indexes significantly impacts performance.
Index Prefix Rule: A compound index can support queries on any prefix of its fields:
// Create compound indexdb.users.createIndex({ status: 1, age: 1, city: 1 });// Supported query patterns (index prefixes):// 1. { status: ... }// 2. { status: ..., age: ... }// 3. { status: ..., age: ..., city: ... }// NOT efficiently supported:// { age: ... } - skips first field// { city: ... } - skips first two fields// { age: ..., city: ... } - skips first fieldField Ordering Strategies:
ESR Rule (Equality, Sort, Range):
- Equality: Fields used in equality conditions first
- Sort: Fields used for sorting second
- Range: Fields used in range queries last
// Query: Find active users in NYC, aged 25-35, sorted by registration datedb.users.find({status: "active", // Equalitycity: "NYC", // Equalityage: { $gte: 25, $lte: 35 }, // Range}).sort({ registrationDate: -1 }); // Sort// Optimal compound index:db.users.createIndex({status: 1, // Equalitycity: 1, // EqualityregistrationDate: -1, // Sortage: 1, // Range});Advanced Compound Index Examples:
// E-commerce product catalogdb.products.createIndex({category: 1, // Filter by categoryfeatured: 1, // Filter by featured statusprice: 1, // Sort/filter by pricerating: -1, // Sort by rating (descending)});// Time-series datadb.metrics.createIndex({deviceId: 1, // Partition by devicetimestamp: -1, // Sort by time (newest first)metricType: 1, // Filter by metric type});// User activity logsdb.logs.createIndex({userId: 1, // Filter by useraction: 1, // Filter by action typetimestamp: -1, // Sort by time}); -
Text Indexes
Text indexes enable full-text search capabilities across string fields, supporting language-specific stemming, case-insensitive search, and relevance scoring.
Creating Text Indexes:
// Single field text indexdb.articles.createIndex({ content: "text" });// Multiple field text indexdb.articles.createIndex({title: "text",content: "text",tags: "text",});// Weighted text index (title is more important)db.articles.createIndex({title: "text",content: "text",},{weights: {title: 10,content: 1,},name: "article_text_idx",});Text Search Queries:
// Basic text searchdb.articles.find({ $text: { $search: "mongodb indexing" } });// Phrase searchdb.articles.find({ $text: { $search: '"database optimization"' } });// Exclude termsdb.articles.find({ $text: { $search: "mongodb -sql" } });// Search with score (relevance ranking)db.articles.find({ $text: { $search: "mongodb performance" } },{ score: { $meta: "textScore" } }).sort({ score: { $meta: "textScore" } });// Language-specific searchdb.articles.find({ $text: { $search: "base de données", $language: "fr" } });Text Index Limitations:
- Only one text index per collection
- Cannot be used with sort operations (except textScore)
- No support for partial word matching
- Limited language support for stemming
-
Geospatial Indexes
MongoDB provides specialized indexes for location-based queries, supporting both 2D and spherical (Earth-like) coordinate systems.
2dsphere Indexes (Recommended):
// Create 2dsphere index for GeoJSON datadb.places.createIndex({ location: "2dsphere" });// Sample GeoJSON documentdb.places.insertOne({name: "Central Park",location: {type: "Point",coordinates: [-73.9665, 40.7812], // [longitude, latitude]},});Geospatial Query Examples:
// Find nearby locations (within radius)db.places.find({location: {$near: {$geometry: {type: "Point",coordinates: [-73.9857, 40.7484], // Times Square},$maxDistance: 1000, // 1000 meters},},});// Find locations within polygondb.places.find({location: {$geoWithin: {$geometry: {type: "Polygon",coordinates: [[[-74.0, 40.7],[-74.0, 40.8],[-73.9, 40.8],[-73.9, 40.7],[-74.0, 40.7],],],},},},});// Find intersecting areasdb.regions.find({boundaries: {$geoIntersects: {$geometry: {type: "Point",coordinates: [-73.9857, 40.7484],},},},}); -
Partial and Sparse Indexes
These specialized indexes reduce storage requirements and improve performance by indexing only specific documents.
Partial Indexes: Index only documents matching specified criteria:
// Index only active usersdb.users.createIndex({ email: 1 },{partialFilterExpression: {status: "active",},});// Index only high-value ordersdb.orders.createIndex({ customerId: 1, orderDate: -1 },{partialFilterExpression: {totalAmount: { $gt: 100 },},});// Index only recent documentsdb.logs.createIndex({ userId: 1, action: 1 },{partialFilterExpression: {createdAt: {$gte: new Date("2023-01-01"),},},});Sparse Indexes: Skip documents that don’t contain the indexed field:
// Index only documents with phone numberdb.users.createIndex({ phoneNumber: 1 }, { sparse: true });// Compound sparse indexdb.products.createIndex({ category: 1, discount: 1 },{ sparse: true } // Skip products without discount); -
TTL (Time To Live) Indexes
TTL indexes automatically remove documents after a specified time period, perfect for managing temporary or time-sensitive data.
Creating TTL Indexes:
// Expire documents 30 days after creationdb.sessions.createIndex({ createdAt: 1 },{ expireAfterSeconds: 2592000 } // 30 days in seconds);// Expire at specific datedb.notifications.createIndex({ expireAt: 1 }, { expireAfterSeconds: 0 });// Sample documentsdb.sessions.insertOne({userId: "user123",sessionToken: "abc123",createdAt: new Date(), // Will expire in 30 days});db.notifications.insertOne({message: "Special offer!",expireAt: new Date("2023-12-31T23:59:59Z"),});TTL Index Considerations:
- Only works on Date fields or arrays containing Dates
- MongoDB runs a background task every 60 seconds to remove expired documents
- Cannot be compound indexes
- Useful for sessions, logs, cache data, and temporary collections
Index Performance Analysis
-
Query Execution Plans
Understanding execution plans is crucial for optimizing query performance and validating index effectiveness.
Using explain() Method:
// Basic execution plandb.users.find({ email: "alice@example.com" }).explain();// Detailed execution plandb.users.find({ status: "active", age: { $gte: 25 } }).explain("executionStats");// All available plans comparisondb.users.find({ city: "NYC" }).explain("allPlansExecution");Key Metrics to Analyze:
// Example execution plan output{"executionStats": {"totalDocsExamined": 1, // Documents scanned"totalKeysExamined": 1, // Index keys scanned"executionTimeMillis": 0, // Query execution time"indexesUsed": ["email_1"], // Which indexes were used"stage": "IXSCAN", // Index scan (good)// vs "COLLSCAN" // Collection scan (bad)}}Performance Indicators:
- Good Performance:
totalKeysExamined
≈totalDocsExamined
≈documents returned
- Poor Performance: High ratio of examined to returned documents
- Index Usage: Stage should be
IXSCAN
notCOLLSCAN
- Good Performance:
-
Index Usage Monitoring
Regular monitoring helps identify unused indexes and optimization opportunities.
Index Usage Statistics:
// View index usage statisticsdb.users.aggregate([{ $indexStats: {} }]);// Find unused indexesdb.runCommand({ collStats: "users", indexDetails: true });// Monitor slow operationsdb.setProfilingLevel(2, { slowms: 100 });db.system.profile.find().limit(5).sort({ ts: -1 }).pretty();Index Maintenance Commands:
// List all indexesdb.users.getIndexes();// Get index sizesdb.users.stats().indexSizes;// Rebuild indexesdb.users.reIndex();// Drop unused indexdb.users.dropIndex("unused_field_1");
Advanced Indexing Strategies
-
Index Intersection
MongoDB can use multiple indexes simultaneously to optimize complex queries through index intersection.
// Create separate indexesdb.products.createIndex({ category: 1 });db.products.createIndex({ price: 1 });db.products.createIndex({ rating: 1 });// MongoDB can intersect indexes for this querydb.products.find({category: "electronics",price: { $lt: 500 },rating: { $gte: 4.0 },});// Check if intersection is useddb.products.find({category: "electronics",price: { $lt: 500 },rating: { $gte: 4.0 },}).explain("executionStats"); -
Index Hints
Force MongoDB to use specific indexes when the query optimizer doesn’t choose optimally.
// Force use of specific indexdb.users.find({ name: "Alice", age: 30 }).hint({ name: 1 });// Force collection scan (disable indexes)db.users.find({ status: "active" }).hint({ $natural: 1 });// Use index by namedb.products.find({ category: "electronics" }).hint("category_price_idx"); -
Collation and Indexes
Handle locale-specific sorting and comparison rules with collation-aware indexes.
// Create index with collationdb.users.createIndex({ name: 1 },{collation: {locale: "en",strength: 2, // Case insensitive},});// Query must use same collationdb.users.find({ name: "Alice" }).collation({ locale: "en", strength: 2 });
Indexing Best Practices
-
Index Design Principles
Selectivity First:
- Index highly selective fields first
- High cardinality = better performance
- Avoid indexing boolean fields alone
Query Pattern Analysis:
// Analyze actual query patternsdb.setProfilingLevel(2, { slowms: 0 });// After collecting data, analyze common patternsdb.system.profile.aggregate([{ $group: { _id: "$command.filter", count: { $sum: 1 } } },{ $sort: { count: -1 } },]);ESR Rule Application:
// Bad: Range field firstdb.orders.createIndex({ amount: 1, status: 1, customerId: 1 });// Good: Equality fields first, range lastdb.orders.createIndex({ status: 1, customerId: 1, amount: 1 }); -
Index Maintenance
Regular Index Review:
- Monitor index usage monthly
- Remove unused indexes
- Update indexes based on new query patterns
Background Index Creation:
// Create indexes in background (production safe)db.users.createIndex({ email: 1 }, { background: true });Index Size Management:
// Monitor index sizesdb.users.stats().indexSizes;// Total collection size vs index size ratiodb.users.stats().size / db.users.stats().totalIndexSize; -
Sharding Considerations
Shard Key Indexes:
// Shard key must be indexedsh.shardCollection("mydb.users", { userId: "hashed" });// Compound shard keysh.shardCollection("mydb.orders", { customerId: 1, orderDate: 1 });Cross-Shard Query Optimization:
- Include shard key in queries when possible
- Use targeted queries to avoid scatter-gather operations
Common Indexing Pitfalls
-
Over-Indexing
Problems:
- Increased storage requirements
- Slower write operations
- Memory overhead
Solutions:
// Instead of multiple single-field indexes:db.users.createIndex({ firstName: 1 });db.users.createIndex({ lastName: 1 });db.users.createIndex({ email: 1 });// Use compound index for common query patterns:db.users.createIndex({ email: 1, firstName: 1, lastName: 1 }); -
Incorrect Field Order
Wrong Order:
// Query: Find active users in specific citydb.users.find({ status: "active", city: "NYC" });// Bad index (low selectivity first)db.users.createIndex({ status: 1, city: 1 });Correct Order:
// Good index (high selectivity first)db.users.createIndex({ city: 1, status: 1 }); -
Ignoring Sort Operations
Inefficient:
// Query with sortdb.products.find({ category: "electronics" }).sort({ price: -1 });// Index doesn't support sort efficientlydb.products.createIndex({ category: 1 });Efficient:
// Index supports both filter and sortdb.products.createIndex({ category: 1, price: -1 });
Real-World Indexing Examples
-
E-commerce Platform
// Product catalogdb.products.createIndex({category: 1, // Filter by categorysubcategory: 1, // Filter by subcategoryprice: 1, // Sort/filter by pricerating: -1, // Sort by rating (descending)});// User ordersdb.orders.createIndex({customerId: 1, // Customer lookupstatus: 1, // Filter by statuscreatedAt: -1, // Sort by date (newest first)});// Inventory trackingdb.inventory.createIndex({sku: 1, // Unique product identifierwarehouse: 1, // Filter by locationquantity: 1, // Filter by availability}); -
Social Media Application
// User posts timelinedb.posts.createIndex({authorId: 1, // Filter by authorvisibility: 1, // Public/private filtercreatedAt: -1, // Chronological order});// Friend connectionsdb.connections.createIndex({userId: 1, // User lookupstatus: 1, // Connection statusconnectedAt: -1, // Connection date});// Content searchdb.posts.createIndex({content: "text",tags: "text",},{weights: { content: 1, tags: 5 },}); -
IoT Data Platform
// Sensor readingsdb.readings.createIndex({deviceId: 1, // Partition by devicetimestamp: -1, // Time-based queriessensorType: 1, // Filter by sensor});// Device managementdb.devices.createIndex({location: "2dsphere", // Geospatial queries});// TTL for old datadb.readings.createIndex({ timestamp: 1 },{ expireAfterSeconds: 7776000 } // 90 days);
Performance Testing and Optimization
Benchmarking Strategies
-
Create Test Data:
// Generate realistic test datafor (let i = 0; i < 1000000; i++) {db.users.insertOne({_id: i,email: `user${i}@example.com`,status: Math.random() > 0.8 ? "inactive" : "active",city: cities[Math.floor(Math.random() * cities.length)],age: Math.floor(Math.random() * 60) + 18,createdAt: new Date(Date.now() - Math.random() * 365 * 24 * 60 * 60 * 1000),});} -
Test Query Performance:
// Measure query performancelet startTime = new Date();db.users.find({ city: "NYC", status: "active" }).count();let endTime = new Date();print("Query time: " + (endTime - startTime) + "ms"); -
A/B Test Index Strategies: Compare different indexing approaches using identical queries and measure the performance differences.
Conclusion
Effective MongoDB indexing is both an art and a science. It requires understanding your data patterns, query requirements, and performance goals. The key to success lies in:
- Analyzing Query Patterns: Understanding how your application queries data
- Strategic Index Design: Creating indexes that support multiple query patterns
- Continuous Monitoring: Regularly reviewing index performance and usage
- Iterative Improvement: Refining indexes as application requirements evolve
Remember that indexing is about finding the right balance between query performance, storage overhead, and write performance. Start with essential indexes for your most critical queries, then expand strategically based on actual usage patterns and performance requirements.
Additional Resources
MongoDB Documentation
Tools and Utilities
- MongoDB Compass - Visual index analysis
- explain() Method - Query execution analysis
- Database Profiler - Performance monitoring