Correlating MongoDB connections to application threads
January 3, 2019
We are currently in the process of making sure all of our applications that connect to MongoDB Atlas use connection pooling. There are several benefits to managing connections effectively, chiefly that connections are recycled, which reduces resource overhead.
Each connection costs MongoDB about 1MB of RAM, so connections can quickly add up and eat into valuable RAM that could otherwise be used for cache.
In a managed service world, everything is based around limits and tiering on a few core resources such as CPU, IOPS, disk space and, from a database perspective, connections.
The number of connections allowed on a MongoDB cluster correlates directly to the instance size. For example:
Instance Size | Connection Limit |
M10 | 350 |
M20 | 700 |
M30 | 2000 |
For the full list of connection limits, see https://docs.atlas.mongodb.com/connection-limits/
Connection limits mattered less to a DBA in an on-prem world, where you would set the ulimit settings for open files and processes/threads to 64000 as per the MongoDB recommendations (https://docs.mongodb.com/manual/reference/ulimit/). However, they become critical when you have an M10 that only allows 350 connections, of which around 5-10% are taken up by MongoDB system processes.
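To put numbers on that, here is a minimal sketch of the arithmetic. It assumes the figures quoted in this post (the limits in the table, roughly 1MB per connection, and 5-10% of the limit used by system processes); the function names are only illustrative and the real overheads will vary.

```python
# Connection limits per Atlas instance size, as quoted in the table above.
CONNECTION_LIMITS = {"M10": 350, "M20": 700, "M30": 2000}

# Assumption from this post: roughly 1 MB of RAM per open connection.
MB_PER_CONNECTION = 1


def usable_connections(limit, system_percent):
    """Connections left for applications once the system share is removed.

    The system share is rounded up so the estimate stays conservative.
    """
    system_connections = (limit * system_percent + 99) // 100
    return limit - system_connections


def worst_case_ram_mb(limit):
    """Approximate RAM (MB) consumed if every allowed connection is open."""
    return limit * MB_PER_CONNECTION


if __name__ == "__main__":
    for size, limit in CONNECTION_LIMITS.items():
        low = usable_connections(limit, 10)
        high = usable_connections(limit, 5)
        print(f"{size}: {low}-{high} usable connections, "
              f"up to ~{worst_case_ram_mb(limit)} MB of RAM")
```

On an M10 this leaves somewhere between 315 and 332 connections for applications.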
Analysing MongoDB logs for connections
I use a little tool called mtools, developed by Thomas Rückstieß, who works at MongoDB. It is a collection of helper scripts to parse, filter, and visualize MongoDB log files (mongod, mongos).
You can pick it up here – http://blog.rueckstiess.com/mtools/
The setup is straightforward and you can quickly start seeing how many connections are being opened and closed, grouped by IP.
mloginfo mongod.log --connections

     source: core-prod-vms-scaled.log
       host: unknown
      start: 2018 Dec 31 10:56:08.404
        end: 2018 Dec 31 13:25:40.320
date format: iso8601-local
     length: 2714
     binary: unknown
    version: >= 3.0 (iso8601 format, level, component)
    storage: unknown

CONNECTIONS
     total opened: 155
     total closed: 143
    no unique IPs: 4
socket exceptions: 0

35.X.X.1    opened: 55    closed: 55
35.X.X.2    opened: 49    closed: 49
35.X.X.3    opened: 39    closed: 39
35.X.X.4    opened: 12    closed: 0
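If there are many IPs in the summary, picking out the ones that are holding connections open can be scripted. The sketch below is not part of mtools; it simply parses the per-IP lines from the output above, and the regular expression assumes that exact "opened: N closed: N" layout.

```python
import re

# Per-IP summary lines copied from the mloginfo output above.
SUMMARY = """\
35.X.X.1 opened: 55 closed: 55
35.X.X.2 opened: 49 closed: 49
35.X.X.3 opened: 39 closed: 39
35.X.X.4 opened: 12 closed: 0
"""

# Assumed layout: "<ip> opened: <n> closed: <n>".
LINE_RE = re.compile(r"^(\S+)\s+opened:\s*(\d+)\s+closed:\s*(\d+)$")


def still_open_by_ip(summary):
    """Map each IP to (opened - closed), keeping only positive balances."""
    result = {}
    for line in summary.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip headers, totals and blank lines
        ip = match.group(1)
        opened, closed = int(match.group(2)), int(match.group(3))
        if opened > closed:
            result[ip] = opened - closed
    return result


if __name__ == "__main__":
    print(still_open_by_ip(SUMMARY))
```

Bear in mind that opened minus closed only covers the time window of the log, so connections opened before the log starts are not counted.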
Correlating open connections against an app server
If we take the example output above and look at the 35.X.X.4 IP, we can see that it has opened 12 connections to mongo and closed none of them. The best way I’ve found to see established connections on an app server is to use netstat.
netstat -anp | grep ESTABLISHED | grep ":27017" | grep " 172." | awk '{print $5}' | sort | uniq -c | sort -n
12 172.X.X.1:27017
12 172.X.X.2:27017
12 172.X.X.3:27017
The above is telling us that the app server has 12 threads connected to each of 3 different IPs. Those IPs are the 3 nodes of the mongo replica set, which tells us that each connection seen on a mongo node is actually 3 threads on the app server (or however many nodes there are in the replica set).
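As a quick check of that reasoning, the counts from the netstat pipeline can be totalled up. This is only a sketch over the three lines of output shown above; the function name is illustrative.

```python
# Output of the netstat pipeline above: a count, then the remote address.
NETSTAT_COUNTS = """\
12 172.X.X.1:27017
12 172.X.X.2:27017
12 172.X.X.3:27017
"""


def summarise(counts_text):
    """Return (number of mongo nodes, total established connections)."""
    per_node = {}
    for line in counts_text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # ignore anything that is not "<count> <address>"
        count, address = parts
        per_node[address] = int(count)
    return len(per_node), sum(per_node.values())


if __name__ == "__main__":
    nodes, total = summarise(NETSTAT_COUNTS)
    print(f"{total} established connections across {nodes} nodes")
```

For this app server that is 36 established connections in total, 12 per replica set member.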
maxPoolSize
Setting the maxPoolSize property on the mongo driver helps control how many threads an app server is allowed to open against a mongodb node. Be aware that the maxPoolSize default varies between drivers: at the time of writing, for example, it’s 100 in Python but 5 in Node.js.
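One way to avoid relying on driver defaults is to set maxPoolSize explicitly in the connection string. The sketch below only builds the URI; the host names, database name and pool size are made up for illustration, and the resulting string would then be handed to whichever driver the application uses.

```python
from urllib.parse import urlencode


def build_mongo_uri(hosts, database, max_pool_size):
    """Build a standard mongodb:// URI with an explicit maxPoolSize option."""
    options = urlencode({"maxPoolSize": max_pool_size})
    return f"mongodb://{','.join(hosts)}/{database}?{options}"


if __name__ == "__main__":
    # Hypothetical hosts and database name, for illustration only.
    uri = build_mongo_uri(
        ["node1.example.net:27017", "node2.example.net:27017"],
        "appdb",
        20,
    )
    print(uri)
    # The URI can then be passed to a driver, e.g. pymongo.MongoClient(uri).
```

Keeping the value in the connection string makes it visible in configuration, which helps when auditing pool sizes across applications.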
Knowing the maxPoolSize for applications that have databases on the same cluster can then allow you to accurately calculate what the max connections could potentially be for a cluster. This could then help make more informed decisions about whether to scale or upsize a mongodb cluster or split applications out.
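A minimal sketch of that calculation follows. It assumes, based on the netstat output earlier, that every app instance can open up to maxPoolSize connections against each node, and it reserves 10% of the limit for system processes. The application names, instance counts and pool sizes are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    instances: int      # app servers/processes, each with its own pool
    max_pool_size: int  # maxPoolSize configured on the driver


def worst_case_connections_per_node(apps):
    """Upper bound on connections a single mongo node could receive."""
    return sum(app.instances * app.max_pool_size for app in apps)


def fits_on(apps, connection_limit, system_percent=10):
    """True if the worst case fits under the limit minus the system share."""
    reserved = (connection_limit * system_percent + 99) // 100
    usable = connection_limit - reserved
    return worst_case_connections_per_node(apps) <= usable


if __name__ == "__main__":
    # Hypothetical applications sharing one cluster.
    apps = [App("api", 4, 50), App("worker", 2, 100)]
    print("worst case per node:", worst_case_connections_per_node(apps))
    print("fits on M10 (350):", fits_on(apps, 350))
    print("fits on M20 (700):", fits_on(apps, 700))
```

In this made-up example the worst case is 400 connections per node, which is too many for an M10 but fits on an M20, so the options are to lower the pool sizes, upsize the cluster, or split the applications out.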
You can get more info about connection pool options here – https://docs.mongodb.com/manual/reference/connection-string/#connection-pool-options