Installing MongoDB on a Mac with Homebrew

Homebrew is a handy package manager for macOS that packages up applications and simplifies the install process.

To learn more about Homebrew, visit https://brew.sh/ or its Wiki page.

Installing MongoDB using Homebrew is pretty straightforward, but some familiarity with the terminal is beneficial.

Open the Terminal app and type the following:

brew install mongodb

This should trigger a series of actions, the first of which is a brew update:

Updating Homebrew...
==> Auto-updated Homebrew!
Updated 1 tap (homebrew/core).

After updating itself, brew will download the latest mongodb package ‘bottle’ and install it using the default settings.

==> Downloading https://homebrew.bintray.com/bottles/mongodb-3.4.4.sierra.bottle.tar.gz
######################################################################## 100.0%
==> Pouring mongodb-3.4.4.sierra.bottle.tar.gz
==> Caveats
To have launchd start mongodb now and restart at login:
  brew services start mongodb
Or, if you don't want/need a background service you can just run:
  mongod --config /usr/local/etc/mongod.conf
==> Summary
🍺  /usr/local/Cellar/mongodb/3.4.4: 17 files, 266.3MB

At this point, there is enough to get started with a local mongodb instance, or to use the mongo shell to connect to a remote instance.

To start the instance locally without a service:

mongod --config /usr/local/etc/mongod.conf --fork

Then connect into the instance:

mongo --port 27017

Alternatively, to start as a brew service, run:

brew services start mongodb

Duplicate Key Error on local.slaves

We have been getting user assertion errors showing up on our 5 node replica set for a while.

Initially these assertion errors were not showing up in the mongo logs, so we enabled increased logging – details can be found here – https://dbamohsin.wordpress.com/2015/03/31/set-additional-logging-and-tracing/

The assertion errors turned out to be related to the local.slaves collection:

[slaveTracking] User Assertion: 11000:E11000 duplicate key error index: local.slaves.$_id_ dup key: { : ObjectId('4def89b415e7ee0aa29fd64b') }
[slaveTracking] update local.slaves query: { _id: ObjectId('4def89b415e7ee0aa29fd64b'), host: "10.90.47.183", ns: "local.oplog.rs" }
update: { $set: { syncedTo: Timestamp 1323652648000|784 } }
exception 11000 E11000 duplicate key error index: local.slaves.$_id_ dup key: { : ObjectId('4def89b415e7ee0aa29fd64b') } 0ms

Taken from Mongo Docs:

The duplicate key on local.slaves error occurs when a secondary or slave changes its hostname and the primary or master tries to update its local.slaves collection with the new name. The update fails because it contains the same _id value as the document containing the previous hostname. The error resembles the log output shown above.

This is a benign error and does not affect replication operations on the secondary or slave.

To prevent the error from appearing, drop the local.slaves collection from the primary or master, with the following sequence of operations in the mongo shell:

use local
db.slaves.drop()

This should resolve the assertion errors, and the new config will be picked up the next time the replica syncs. To check the contents of the collection afterwards:

use local
db.slaves.find()

This topic is also discussed in Jira – https://jira.mongodb.org/browse/SERVER-4473

db.currentOp Queries in mongodb

Return active sessions running for more than x seconds:

db.currentOp().inprog.forEach(
  function(op) {
    if(op.secs_running > 5) printjson(op);
  }
)

Operations waiting for a lock that is not a read lock:

db.currentOp().inprog.forEach(
   function(d){
     if(d.waitingForLock && d.lockType != "read") 
       printjson(d)
     })

Finding active writes:

db.currentOp().inprog.forEach(
   function(d){
     if(d.active && d.lockType == "write") 
       printjson(d)
     })

Finding active reads:

db.currentOp().inprog.forEach(
   function(d){
     if(d.active && d.lockType == "read") 
       printjson(d)
     })
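
The filters above all follow the same pattern, so they can be combined. The following is a minimal sketch, not part of the original queries: filterOps, runningLongerThan and activeWithLock are hypothetical helper names, and sampleOps is made-up data standing in for db.currentOp().inprog so that the logic can be tried outside the shell.

```javascript
// Hypothetical helper: keep only the operations that satisfy every predicate.
function filterOps(ops, predicates) {
  return ops.filter(function (op) {
    return predicates.every(function (predicate) { return predicate(op); });
  });
}

// Predicates mirroring the checks used in the queries above.
function runningLongerThan(secs) {
  return function (op) { return op.secs_running > secs; };
}
function activeWithLock(lockType) {
  return function (op) { return op.active === true && op.lockType === lockType; };
}

// Made-up sample documents (assumption), shaped like entries of db.currentOp().inprog.
var sampleOps = [
  { opid: 1, active: true,  secs_running: 12, lockType: "write" },
  { opid: 2, active: true,  secs_running: 1,  lockType: "read"  },
  { opid: 3, active: false, secs_running: 30, lockType: "write" }
];

// Active writes that have been running for more than 5 seconds.
var slowWrites = filterOps(sampleOps, [runningLongerThan(5), activeWithLock("write")]);

// In the mongo shell, pass the live list instead of sampleOps:
//   filterOps(db.currentOp().inprog, [runningLongerThan(5), activeWithLock("write")]).forEach(printjson);
```

Only the array passed in changes between the sample run and a real session, so the predicates can be checked against known data first.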

Set additional logging and tracing in mongodb

logLevel Parameter

To set logging on an ad hoc basis, the parameter can be set in the admin database:

// Current log level:
use admin
db.runCommand({ getParameter: 1, logLevel: 1 })


Logging can be set between 0 and 5, with 5 being the most verbose logging:

// Set log level to 3
use admin
db.runCommand( { setParameter: 1, logLevel: 3 } )
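
Since only values from 0 to 5 are valid, it can be worth guarding against typos before sending the command. A small sketch under that assumption; setLogLevelCommand is a hypothetical helper name, not a MongoDB function.

```javascript
// Hypothetical helper: build the setParameter command document,
// rejecting anything that is not an integer between 0 and 5.
function setLogLevelCommand(level) {
  if (typeof level !== "number" || level % 1 !== 0 || level < 0 || level > 5) {
    throw new Error("logLevel must be an integer between 0 and 5, got: " + level);
  }
  return { setParameter: 1, logLevel: level };
}

// In the mongo shell, db.adminCommand runs the command against the admin database:
//   db.adminCommand(setLogLevelCommand(3));
```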

LogLevel can also be set at instance startup in mongod.conf via the systemLog.verbosity parameter.

For more details – http://docs.mongodb.org/manual/reference/configuration-options/#systemLog.verbosity

Database Profiling

The database profiler collects fine-grained data about MongoDB write operations, cursors, and database commands on a running mongod instance. You can enable profiling on a per-database or per-instance basis. The profiling level is also configurable when enabling profiling.

// get the tracing level and current slow ops threshold
db.getProfilingStatus()

// set the profiling level to 2
db.setProfilingLevel(2)
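
Level 2 profiles every operation, which can be heavy on a busy instance. Level 1 records only operations slower than a threshold, which db.setProfilingLevel accepts as an optional second argument (slowms). The sketch below builds the equivalent profile command document; profileCommand is a hypothetical helper name and the 100 ms threshold is just an example value.

```javascript
// Hypothetical helper: build the document for the "profile" database command.
// level: 0 = off, 1 = slow operations only, 2 = all operations.
// slowms: optional threshold in milliseconds that defines a "slow" operation.
function profileCommand(level, slowms) {
  if ([0, 1, 2].indexOf(level) === -1) {
    throw new Error("profiling level must be 0, 1 or 2, got: " + level);
  }
  var cmd = { profile: level };
  if (slowms !== undefined) {
    cmd.slowms = slowms;
  }
  return cmd;
}

// In the mongo shell, either of these profiles operations slower than 100 ms:
//   db.runCommand(profileCommand(1, 100));
//   db.setProfilingLevel(1, 100);
```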


See here for full profiling details – http://docs.mongodb.org/manual/tutorial/manage-the-database-profiler/

// last few entries
show profile

// most recent entries first (reverse natural order)
db.system.profile.find({}).sort({$natural:-1})

// slowest queries first
db.system.profile.find({}).sort({millis:-1}).limit(10);

// anything > 20ms
db.system.profile.find({"millis":{$gt:20}})

// single coll order by response time
db.system.profile.find({"ns":"test.foo"}).sort({"millis":-1})

// regular expression on namespace
db.system.profile.find( { "ns": /test\.foo/ } ).sort({millis:-1,ts:-1})

// anything that's moved
db.system.profile.find({"moved":true})

// large scans
db.system.profile.find({"nscanned":{$gt:10000}})

// anything returning more than one document (possible range or full scans)
db.system.profile.find({"nreturned":{$gt:1}})

Aggregation framework queries:

// response time by operation type
db.system.profile.aggregate([
  { $group : {
    _id : "$op",
    count : { $sum : 1 },
    "max response time" : { $max : "$millis" },
    "avg response time" : { $avg : "$millis" }
  }}
]);

// slowest by namespace
db.system.profile.aggregate([
  { $group : {
    _id : "$ns",
    count : { $sum : 1 },
    "max response time" : { $max : "$millis" },
    "avg response time" : { $avg : "$millis" }
  }},
  { $sort : { "max response time" : -1 } }
]);

// slowest by client
db.system.profile.aggregate([
  { $group : {
    _id : "$client",
    count : { $sum : 1 },
    "max response time" : { $max : "$millis" },
    "avg response time" : { $avg : "$millis" }
  }},
  { $sort : { "max response time" : -1 } }
]);

Count Distinct Values via aggregation framework

Q: Is it possible to count distinct values of a field in mongodb?

A: Yes! This can be done via the aggregation framework in mongo. This takes two group commands; the first groups by all the distinct values, and the second does a count of them all.

pipeline = [ 
    { $group: { _id: "$myNonUniqueFieldId"}  },
    { $group: { _id: 1, count: { $sum: 1 } } }
];

db.runCommand( 
    {
    "aggregate": "collection" , 
    "pipeline": pipeline
    }
);
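
To make the two stages easier to follow, here is a rough sketch that builds the same pipeline for any field and emulates in plain JavaScript what each stage computes. Both function names are hypothetical, and the emulation is only an illustration of the logic, not how the server evaluates the pipeline.

```javascript
// Hypothetical helper: build the two-stage pipeline for a given field name.
function distinctCountPipeline(field) {
  return [
    { $group: { _id: "$" + field } },          // one document per distinct value
    { $group: { _id: 1, count: { $sum: 1 } } } // count those documents
  ];
}

// Plain-JavaScript emulation of the same idea on an in-memory array.
function countDistinct(docs, field) {
  var seen = {};
  docs.forEach(function (doc) {
    // Stage 1: collapse duplicates; a missing field is grouped together with null.
    var value = doc[field] === undefined ? null : doc[field];
    seen[JSON.stringify(value)] = true;
  });
  // Stage 2: count the distinct keys.
  return Object.keys(seen).length;
}

// In the mongo shell:
//   db.runCommand({ aggregate: "collection", pipeline: distinctCountPipeline("myNonUniqueFieldId") });
```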

mongodb exception: can’t convert from BSON type String to Date

Problem: When attempting to do an aggregation on timestamps that are stored as strings, mongodb is unable to do the conversion.

repmongo:SECONDARY> db.collection.aggregate(
...     { $match : { "type" : "Review"}},
...     { $group : {
...         _id: {
...             year : { $year : "$created" },
...             month : { $month : "$created" },
...             day : { $dayOfMonth : "$created" },
...         },
...         count: { $sum: 1 }
...     }},
...     { $sort : { _id : 1}}
... );
assert: command failed: {
        "errmsg" : "exception: can't convert from BSON type String to Date",
        "code" : 16006,
        "ok" : 0
} : aggregate failed

To see which fields are of which type:

repmongo:SECONDARY> typeof db.collection.findOne().created;
string
repmongo:SECONDARY> typeof db.collection.findOne().updated;
string

Solution 1: The best thing to do is to resolve the data type inconsistency at the application layer, so that data is entered into the database in ISO format, which can then be easily worked with. This would also require converting the existing data set to the [date] data type.
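
Converting the existing documents could look roughly like the sketch below. It assumes the stored strings are in a format that JavaScript's Date constructor can parse (for example ISO 8601), which the examples here suggest but do not confirm; createdAsDateUpdate is a hypothetical helper name. Try it on a copy of the data first.

```javascript
// Hypothetical helper: build the update that replaces a string timestamp
// with a real Date. Returns null when the string cannot be parsed,
// so that bad values are skipped rather than overwritten.
function createdAsDateUpdate(doc) {
  var parsed = new Date(doc.created);
  if (isNaN(parsed.getTime())) {
    return null;
  }
  return { $set: { created: parsed } };
}

// In the mongo shell ($type: 2 matches fields stored as strings):
//   db.collection.find({ created: { $type: 2 } }).forEach(function (doc) {
//     var update = createdAsDateUpdate(doc);
//     if (update) { db.collection.update({ _id: doc._id }, update); }
//   });
```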

Solution 2: For immediate analysis of the data use regex and substring to extract the date portions.

db.collection.aggregate([
    { $match : { "created" : { $in : [/2014-10/] } } },
    { $group : {
        _id: {
            year :  { $substr : ["$created", 0, 4 ] },
            month : { $substr : ["$created", 5, 2 ] },
            day :   { $substr : ["$created", 8, 2 ] }
        },
        count: { $sum: 1 }
    }},
    { $sort : { _id : -1 } }
]);

The above returns data grouped by year, month and day with a count for each group. The regex condition acts like a LIKE clause on the date string.

{ "_id" : { "year" : "2014", "month" : "10", "day" : "21" }, "count" : 1 }

{ "_id" : { "year" : "2014", "month" : "10", "day" : "06" }, "count" : 1 }
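
The offsets passed to $substr are a zero-based start position followed by a length. As an illustration, the same extraction in plain JavaScript might look like this; dateParts is a hypothetical helper and the sample string is made up, assuming timestamps that begin with YYYY-MM-DD.

```javascript
// Hypothetical helper mirroring the three $substr expressions:
// substr(start, length) uses the same (zero-based start, length) arguments.
function dateParts(created) {
  return {
    year:  created.substr(0, 4),
    month: created.substr(5, 2),
    day:   created.substr(8, 2)
  };
}

// Made-up sample value (assumption about the stored format).
var sample = "2014-10-21 14:05:00";
var parts = dateParts(sample);

// The unanchored regex matches the pattern anywhere in the string.
var matchesOctober = /2014-10/.test(sample);
```

If the stored strings use a different layout, the offsets in both the helper and the pipeline need adjusting accordingly.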

In SQL, the above would probably look something like this:

SELECT
  SUBSTR(created, 1, 4) AS created_year,
  SUBSTR(created, 6, 2) AS created_month,
  SUBSTR(created, 9, 2) AS created_day,
  COUNT(_id)
FROM collection
WHERE created LIKE '%2014-10%'
GROUP BY
  SUBSTR(created, 1, 4),
  SUBSTR(created, 6, 2),
  SUBSTR(created, 9, 2)
ORDER BY created_year DESC, created_month DESC, created_day DESC

Passed my M202: MongoDB Advanced Deployment and Operations Course!

The course is a 7-week online programme followed by a final exam. I highly recommend it – https://university.mongodb.com/courses/10gen/M202/2014_September/about
