Increase storage size of Azure SQL Managed Instance

If you need to increase the storage space available to a managed instance, it has to be done via code, as the option to edit a managed instance is not available in the Azure Portal.

I recently increased the size of a managed instance from 320 GB to 512 GB and it took around 2 minutes to complete. I used the Cloud Shell available in the portal.

# Subscription and instance details
$subId = "my_subscription_id"
$resourceGroup = "resourceGroup1"
$instanceName = "managed-instance-1"

# Select the subscription containing the managed instance
Select-AzureRmSubscription -SubscriptionId $subId

# Resize the instance storage to 512 GB
$size = 512
Set-AzureRmSqlInstance -Name $instanceName -ResourceGroupName $resourceGroup -StorageSizeInGB $size

Cores, the admin password and a few other parameters can also be changed in the same manner. See the Set-AzureRmSqlInstance documentation.
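As a minimal sketch reusing the variables from the block above (the vCore count here is hypothetical; -VCore and -AdministratorPassword are documented parameters of the cmdlet):

# Scale the instance to 16 vCores
Set-AzureRmSqlInstance -Name $instanceName -ResourceGroupName $resourceGroup -VCore 16

# Reset the admin password, prompting for it as a SecureString
$newPwd = Read-Host -Prompt "New admin password" -AsSecureString
Set-AzureRmSqlInstance -Name $instanceName -ResourceGroupName $resourceGroup -AdministratorPassword $newPwd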

Restoring into an Azure Managed Instance from SQL 2014

We are in the process of setting up our Azure environment, and using managed instances lets us efficiently move from an on-premises legacy world to a similar setup inside Azure. I’ve just completed my first migration, moving from SQL Server 2014 on-premises onto a SQL Managed Instance running SQL Azure Database V12. There are several functional steps that need to be followed in sequence to complete the migration of data, which I’ll try to cover off below.

Creating an Azure Storage Container

Similar to Amazon S3, Azure has storage accounts which can store multiple blob containers (folders). The general hierarchy is:

Subscription
  --Resource Group
    --Storage Account
      --Storage Container

The storage account can be set to only allow certain ranges of IPs (an IP whitelist) to access the data, and as best practice this should be set.

The blob container should be set to private to prevent anonymous access. This depends on what is being stored, but for databases I’d say it should be the default position.
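A minimal PowerShell sketch of both settings, in the same AzureRM style as the resize earlier (the resource group, account name, container name and IP range are all hypothetical):

# Deny all traffic by default, then whitelist a known range
Update-AzureRmStorageAccountNetworkRuleSet -ResourceGroupName "resourceGroup1" -Name "mystorageaccount" -DefaultAction Deny
Add-AzureRmStorageAccountNetworkRule -ResourceGroupName "resourceGroup1" -Name "mystorageaccount" -IPAddressOrRange "203.0.113.0/24"

# Create a private container (Permission Off = no anonymous access)
$ctx = (Get-AzureRmStorageAccount -ResourceGroupName "resourceGroup1" -Name "mystorageaccount").Context
New-AzureStorageContainer -Name "backups" -Permission Off -Context $ctx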

Backup a SQL 2014 DB to Azure Storage Blob

To be able to talk to the blob storage, we need to provide a credential for SQL Server to use, which holds the storage account name and one of its access keys.

USE master 
CREATE CREDENTIAL [backups] -- name of the credential 
WITH IDENTITY='my_storage_account' -- storage account name. 
, SECRET = 'one_of_the_two_storage_keys_for_the_account' -- this is the access key token 
GO

NOTE: If you try to use a Shared Access Signature (SAS) on SQL Server 2014, it will fail to decode the signature when trying to use it. Follow this document to create a SQL 2014 Credential

Error if using SAS on SQL 2014:

Msg 3298, Level 16, State 2, Line 11
Backup/Restore to URL device error: Error while decoding the storage key.
Msg 3013, Level 16, State 1, Line 11
BACKUP DATABASE is terminating abnormally.

To see the credentials that are in place on SQL Server:

SELECT * FROM sys.credentials

At this point, we can start referencing the credential to back up to a URL-based location:

BACKUP DATABASE mydatabase 
TO URL = 'https://<storage_account>.blob.core.windows.net/<storage_container>/mydatabase.bak'
WITH CREDENTIAL = 'backups'

Restore a SQL 2014 Backup to SQL Azure Managed Instance

To be able to allow a SQL Azure database to talk to blob storage, we need to provide a credential using a Shared Access Signature (SAS).

To create a SAS, you need to generate a signature from within the Storage Account in the Azure portal. Set an expiry on the SAS to give some added security.

NOTE: Remember to remove the initial ? from the generated SAS key, otherwise the key will be invalid.
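The SAS can also be generated from PowerShell instead of the portal; a sketch using the storage context $ctx from the container example earlier (container name and expiry are hypothetical, and note the token comes back with the leading ? attached):

# Generate a read/list SAS token for the container, valid for 7 days
$sas = New-AzureStorageContainerSASToken -Name "backups" -Permission rl -ExpiryTime (Get-Date).AddDays(7) -Context $ctx
# Strip the leading ? before using it in CREATE CREDENTIAL
$sas = $sas.TrimStart('?')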

USE master 
CREATE CREDENTIAL [https://<storage_account>.blob.core.windows.net/<storage_container>] -- this name must match the container path, start with https and must not contain a trailing forward slash. 
WITH IDENTITY='SHARED ACCESS SIGNATURE' -- this is a mandatory string and do not change it. 
, SECRET = 'sas_key' -- this is the shared access signature token 
GO

Run the restore:

RESTORE DATABASE mydatabase 
FROM URL = 'https://<storage_account>.blob.core.windows.net/<storage_container>/mydatabase.bak'

NOTE: Notice that there are no WITH clauses in the restore, as these are not supported in SQL Azure. Normally you would have a STATS clause to monitor the restore, but this can be done either via the percent_complete column in sys.dm_exec_requests, or by downloading sp_WhoIsActive – check out Brent Ozar’s post on this.
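For example, to watch progress from another session:

-- percent_complete is populated for BACKUP and RESTORE commands
SELECT session_id, command, percent_complete, estimated_completion_time
FROM sys.dm_exec_requests
WHERE command LIKE '%RESTORE%'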

If you’re happy with the restored database, it can be changed to SQL Server 2017 compatibility (level 140) if the application using the database supports it.

USE [master]
GO
ALTER DATABASE [mydatabase] SET COMPATIBILITY_LEVEL = 140
GO


Correlating mongodb connections to application threads

We are currently in the process of making sure all of our applications that connect to mongodb Atlas are using connection pooling. There are several benefits to managing connections effectively and efficiently, namely that connections are recycled, reducing resource overhead.

In mongodb, each connection costs around 1 MB of RAM, so connections can quickly add up and eat into valuable memory that could otherwise be used for cache activity.

In a managed service world, everything is based around limitations and tiering on some core components such as CPU, IOPS, disk space and, from a database perspective, connections.

The number of connections allowed on a mongodb cluster correlates directly to the instance size. For example:

Instance Size    Connection Limit
M10              350
M20              700
M30              2000

To see all the connection limits, see https://docs.atlas.mongodb.com/connection-limits/

Connection limits mattered less to a DBA in an on-premises world, as you would set the ulimit settings for open files and processes/threads to 64000 as per the mongodb recommendations (https://docs.mongodb.com/manual/reference/ulimit/). However, they become extremely critical when you have an M10 that only allows 350 connections, of which around 5-10% are taken up by mongo system processes.
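For reference, a minimal /etc/security/limits.conf sketch for a self-managed deployment, assuming mongod runs as a mongod user (values per the recommendations linked above):

# /etc/security/limits.conf
mongod soft nofile 64000
mongod hard nofile 64000
mongod soft nproc 64000
mongod hard nproc 64000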

Analysing mongodb logs for connections

I use a little app called mtools, developed by Thomas Rückstieß who works at mongodb. It is a collection of helper scripts to parse, filter, and visualize MongoDB log files (mongod, mongos).

You can pick it up here – http://blog.rueckstiess.com/mtools/

The setup is straightforward and you can quickly start seeing how many connections are being opened and closed grouped by IPs.

mloginfo mongod.log --connections
     source: core-prod-vms-scaled.log
       host: unknown
      start: 2018 Dec 31 10:56:08.404
        end: 2018 Dec 31 13:25:40.320
date format: iso8601-local
     length: 2714
     binary: unknown
    version: >= 3.0 (iso8601 format, level, component)
    storage: unknown

CONNECTIONS
     total opened: 155
     total closed: 143
     no unique IPs: 4
     socket exceptions: 0

35.X.X.1    opened: 55        closed: 55
35.X.X.2    opened: 49        closed: 49
35.X.X.3    opened: 39        closed: 39
35.X.X.4    opened: 12        closed: 0

Correlating open connections against an app server

If we take the example output above and use the 35.X.X.4 IP, we can see that it has sent 12 incoming connections to mongo. The best way I’ve found to see established connections on an app server is to use netstat.

netstat -anp | grep ESTABLISHED | grep ":27017" | grep " 172." | awk '{print $5}' | sort | uniq -c | sort -n
      12 172.X.X.1:27017
      12 172.X.X.2:27017
      12 172.X.X.3:27017

The above is telling us that there are 12 threads connected to each of 3 different IPs. Looking into the IPs, they reference the 3 nodes of a mongo replica set, which tells us that each connection on mongo is actually 3 threads on an app server (or however many nodes there are in the replica set).

maxPoolSize

Setting the maxPoolSize property on the mongo driver will help control how many connections an app server is allowed to open against a mongodb node. Be wary that the maxPoolSize default varies between drivers – for example, in Python it’s 100, but in Node.js it’s 5.
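The simplest place to set it is the connection string; a sketch with hypothetical hostnames and credentials:

mongodb://appUser:secret@node1.example.com:27017,node2.example.com:27017,node3.example.com:27017/mydb?replicaSet=rs0&maxPoolSize=50

Most drivers also expose the same option when constructing the client object.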

Knowing the maxPoolSize for each application that has a database on the same cluster then allows you to accurately calculate what the maximum connections for the cluster could potentially be. For example, four app servers each left at the Python driver’s default maxPoolSize of 100 could open up to 400 connections, already beyond an M10’s 350 limit. This can help you make more informed decisions about whether to scale or upsize a mongodb cluster, or split applications out.

You can get more info about connection pool options here – https://docs.mongodb.com/manual/reference/connection-string/#connection-pool-options

Return pods running on nodes using external node IP

Some of our databases sit in managed environments which means connections from our applications can show up in the database logs as coming from an external IP at the edge of our cloud infrastructure.

I wrote a little script which returns the pods running on a Kubernetes cluster node when given its external IP.

kubectl get pods -o wide --sort-by="{.status.phase}" --all-namespaces | grep `kubectl get nodes -o wide | grep 35.X.X.X | awk '{print $1}' | awk '{$1=$1;print}'`

I’ve found it quite useful for starting to diagnose where database connections are being initiated from.