Adding columns is super easy (and performant) in Postgres 11

Databases are generally good at adding column metadata to the data dictionary – the performance issues usually come from adding NOT NULL or a default value, because a back-population of existing rows needs to happen.

This can be problematic when tables have billions of rows and an exclusive lock needs to be taken out. Luckily, Postgres 11 includes some underlying changes that allow this previously difficult task to happen instantly.

Someone has done a great job of explaining the difference for version 11 here – https://brandur.org/postgres-default
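For comparison, this is roughly the pattern that was needed before version 11 to avoid one long, locking table rewrite – a minimal sketch only, assuming a hypothetical big_table with an id primary key and a batch size tuned to the workload:

-- Pre-Postgres 11: add the column with no default (metadata only),
-- then back-populate in batches to avoid a single huge locking rewrite.
alter table big_table add column mos int;

-- Repeat until no rows are updated (typically from a script):
update big_table
set mos = 0
where id in (
  select id from big_table where mos is null limit 10000
);

-- Once populated, apply the default and the constraint.
alter table big_table alter column mos set default 0;
alter table big_table alter column mos set not null;

-- Postgres 11 onwards: a single metadata-only statement does the lot.
alter table big_table add column mos int not null default 0;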

My testing showed the following results on a 400 million row table (200GB):

alter table stock_performance_by_version add column mos int;
--81 msec
alter table stock_performance_by_version add column mosdefaultint int default 0;
--87 msec
alter table stock_performance_by_version add column mosdefaultnotnull int NOT NULL default 0;
--104 msec
alter table stock_performance_by_version add column mosdefaultstring varchar NOT NULL default 'hello';
--126 msec

If a table column is added with NOT NULL and a default value, the value is now stored in pg_attribute and referenced back when the original data is selected from the table. To the end user, it just looks like they are selecting columns from tables, but behind the scenes, the db engine is joining the “missing attributes” back to the table.


The underlying metadata for the above can be seen in pg_attribute:

SELECT c.oid,
  n.nspname,
  c.relname, t.*
FROM pg_catalog.pg_class c
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
CROSS JOIN LATERAL (
  SELECT a.attname,
    pg_catalog.format_type(a.atttypid, a.atttypmod),
    (
      SELECT substring(pg_catalog.pg_get_expr(d.adbin, d.adrelid) for 128)
      FROM pg_catalog.pg_attrdef d
      WHERE d.adrelid = a.attrelid
        AND d.adnum = a.attnum
        AND a.atthasdef
    ),
    a.attnotnull, a.attnum,
    (
      SELECT c.collname
      FROM
        pg_catalog.pg_collation c,
        pg_catalog.pg_type t
      WHERE c.oid = a.attcollation
        AND t.oid = a.atttypid
        AND a.attcollation <> t.typcollation
    ) AS attcollation,
	a.atthasmissing,
	a.attmissingval
  FROM pg_catalog.pg_attribute a
  WHERE a.attrelid = c.oid
    AND a.attnum > 0
	AND a.attmissingval is not null --TO ONLY SEE MISSING ATTRIBUTES
    AND NOT a.attisdropped
) AS t
WHERE n.nspname ~ '^(public)$'  -- YOUR SCHEMA HERE
AND relname = 'stock_performance_by_version'
AND pg_catalog.pg_table_is_visible(c.oid);


Great Feature!

Setting up SOPS (Secrets OPerationS)

We use docker-compose to locally generate terraform plans before committing them into code and triggering pipelines. The code requires some encrypted variables, which can be locally extracted using sops.

SOPS (Secrets OPerationS) is an editor of encrypted files that supports YAML, JSON, ENV, INI and BINARY formats and encrypts with AWS KMS, GCP KMS, Azure Key Vault and PGP.

GitHub – mozilla/sops

brew install sops

Before you can use sops, you need to authorise against Application Default Credentials.

If you are not authenticated, this happens when you try to decrypt with sops:

sops -d testing.config.sops

Failed to get the data key required to decrypt the SOPS file.

Group 0: FAILED
projects/project-name-here/locations/global/path/terraform: FAILED
– | Cannot create GCP KMS service: google: could not find
| default credentials. See
| https://developers.google.com/accounts/docs/application-default-credentials
| for more information.

Recovery failed because no master key was able to decrypt the file. In
order for SOPS to recover the file, at least one key has to be successful,
but none were.

To authorise a login:

gcloud auth application-default login

Which then allows you to source the decrypted keys locally:

#Bash
eval $(sops -d testing.config.sops)

#zsh
source <(sops -d testing.config.sops)
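
The decrypted output is simply the plaintext contents of the file – for an ENV-format file that is a set of export statements, which is why eval/source works. A purely hypothetical example (variable names made up):

export TF_VAR_db_password='s3cr3t'
export TF_VAR_api_token='abc123'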

Essentially all you are doing with sops is setting up your local machine with the correct decrypted keys.

More about Application Default Credentials

https://cloud.google.com/sdk/gcloud/reference/auth/application-default/

Application Default Credentials (ADC) provide a method to get credentials used in calling Google APIs. The gcloud auth application-default command group allows you to manage active credentials on your machine that are used for local application development.

These credentials are only used by Google client libraries in your own application.


Increase storage size of Azure SQL Managed Instance

If you need to increase the storage space available to a managed instance, it has to be done via code, as the option to edit a managed instance is not available in the Portal.

I recently increased the size of a managed instance from 320GB to 512GB and it took around 2 minutes to complete. I used the cloud shell available in the portal.

$subId = "my_subscription_id"
$resourceGroup = "resourceGroup1"
$instanceName = "managed-instance-1"

Select-AzureRmSubscription -SubscriptionId $subId

$size = 512
Set-AzureRmSqlInstance -Name $instanceName -ResourceGroupName $resourceGroup -StorageSizeInGB $size

vCores, the admin password and a few other parameters can also be changed in the same manner. See the Set-AzureRmSqlInstance documentation.
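
As a rough sketch (parameter names as per the AzureRM module; the values are placeholders), changing vCores and the admin password would look something like:

$password = ConvertTo-SecureString "NewStrongPassword1!" -AsPlainText -Force

Set-AzureRmSqlInstance -Name $instanceName -ResourceGroupName $resourceGroup `
    -VCore 16 -AdministratorPassword $password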

Restoring into an Azure Managed Instance from SQL 2014

We are in the process of setting up our Azure environment, and using managed instances lets us efficiently move from an on-premise legacy world to a similar setup inside Azure. I’ve just completed my first migration, moving from SQL Server 2014 on premise onto a SQL Managed Instance running SQL Azure Database V12. There are several functional steps that need to be followed in sequence to complete the migration of data, which I’ll try to cover off below.

Creating an Azure Storage Container

Similar to S3, Azure has storage accounts which can store multiple blob containers (folders). The general hierarchy of this is:

Subscription
  --Resource Group
    --Storage Account
      --Storage Container

The Storage Account can be set to only allow certain ranges of IPs (an IP whitelist) to access the data, and as best practice this should be set.

The blob storage should be set to type private to prevent anonymous access – it depends on what is being stored, but for databases I’d say this should be the default position.

Backup a SQL 2014 DB to Azure Storage Blob

To be able to talk to the blob storage, we need to provide a credential for SQL Server to use, which has a login access key to the storage account.

USE master 
CREATE CREDENTIAL [backups] -- name of the credential 
WITH IDENTITY='my_storage_account' -- storage account name. 
, SECRET = 'one_of_the_two_storage_keys_for_the_account' -- this is the access key token 
GO

NOTE: If you try to use a Shared Access Signature (SAS) on SQL Server 2014, it will fail to decode the signature when trying to use it. Follow this document to create a SQL 2014 Credential

Error if using SAS on SQL 2014:

Msg 3298, Level 16, State 2, Line 11
Backup/Restore to URL device error: Error while decoding the storage key.
Msg 3013, Level 16, State 1, Line 11
BACKUP DATABASE is terminating abnormally.

To see the credentials that are in place on SQL Server:

select * from sys.credentials

At this point, we can start referencing the credential to back up to a URL-based location:

BACKUP DATABASE mydatabase 
TO URL = 'https://<storage_account>.blob.core.windows.net/<storage_container>/mydatabase.bak'
WITH CREDENTIAL = 'backups'

Restore a SQL 2014 Backup to SQL Azure Managed Instance

To be able to allow a SQL Azure database to talk to blob storage, we need to provide a credential using a Shared Access Signature (SAS).

To create a SAS, you need to generate a signature from within the Storage Account in the azure portal. Set an expiry on the SAS to give some added security.

NOTE: Remember to remove the initial ? from the generated SAS key, otherwise the key will be invalid.

USE master 
CREATE CREDENTIAL [https://<storage_account>.blob.core.windows.net/<storage_container>] -- this name must match the container path, start with https and must not contain a trailing forward slash. 
WITH IDENTITY='SHARED ACCESS SIGNATURE' -- this is a mandatory string and do not change it. 
, SECRET = 'sas_key' -- this is the shared access signature token 
GO

Run the restore:

RESTORE DATABASE mydatabase 
FROM URL = 'https://<storage_account>.blob.core.windows.net/<storage_container>/mydatabase.bak'

NOTE: Notice that there are no WITH clauses in the restore, as these are not supported in SQL Azure. Normally you would have a STATS clause to monitor the restore, but this can be done either via the percent_complete column in sys.dm_exec_requests or by downloading sp_WhoIsActive – check out Brent Ozar’s post on this.
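
A minimal sketch of the sys.dm_exec_requests approach, watching percent_complete for the running RESTORE:

SELECT r.session_id,
       r.command,
       r.percent_complete,
       r.estimated_completion_time / 60000 AS estimated_minutes_remaining
FROM sys.dm_exec_requests r
WHERE r.command LIKE 'RESTORE%';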

If you’re happy with the restored database, then it can be changed to 2017 compatibility if the application for the database supports it.

USE [master]
GO
ALTER DATABASE [mydatabase] SET COMPATIBILITY_LEVEL = 140
GO


Correlating mongodb connections to application threads

We are currently in the process of making sure all of our applications that connect to mongodb Atlas are using connection pooling. There are several benefits to managing connections effectively and efficiently, namely that connections are recycled, reducing resource overhead.

In mongodb, each connection costs around 1MB of RAM, so it can quickly add up and eat into valuable memory that could otherwise be used for cache activity.

In a managed service world, everything is based around limits and tiers on core components such as CPU, IOPS and disk space – and, from a database perspective, connections.

The number of connections allowed on a mongodb cluster correlates directly to the instance size. For example:

Instance Size    Connection Limit
M10              350
M20              700
M30              2000

To see all the connection limits see https://docs.atlas.mongodb.com/connection-limits/

Connection limits mattered less to a DBA in an on-prem world, as you would set the ulimit settings for open files and processes/threads to 64000 as per the mongodb recommendations (https://docs.mongodb.com/manual/reference/ulimit/). However, they become extremely critical when you have an M10 that only allows 350 connections, of which around 5-10% are taken up by mongo system processes.

Analysing mongodb logs for connections

I use a little app called mtools, developed by Thomas Rückstieß who works at mongodb. It is a collection of helper scripts to parse, filter, and visualize MongoDB log files (mongod, mongos).

You can pick it up here – http://blog.rueckstiess.com/mtools/

The setup is straightforward and you can quickly start seeing how many connections are being opened and closed grouped by IPs.

mloginfo mongod.log --connections
     source: core-prod-vms-scaled.log
     host: unknown
     start: 2018 Dec 31 10:56:08.404
     end: 2018 Dec 31 13:25:40.320
date format: iso8601-local
     length: 2714
     binary: unknown
     version: >= 3.0 (iso8601 format, level, component)
     storage: unknown

CONNECTIONS
     total opened: 155
     total closed: 143
     no unique IPs: 4
     socket exceptions: 0

35.X.X.1    opened: 55        closed: 55
35.X.X.2    opened: 49        closed: 49
35.X.X.3    opened: 39        closed: 39
35.X.X.4    opened: 12        closed: 0

Correlating open connections against an app server

If we take the example output above and use the 35.X.X.4 IP – we can see that it has sent 12 incoming connections to mongo. The best way I’ve found to see established connections on an app server is to use netstat.

netstat -anp | grep ESTABLISHED | grep ":27017" | grep " 172." | awk '{print $5}' | sort | uniq -c | sort -n
      12 172.X.X.1:27017
      12 172.X.X.2:27017
      12 172.X.X.3:27017

The above is telling us that there are 12 threads connected to 3 different IPs. When looking into the IPs, they reference the 3 nodes of a mongo replica set, which tells us that each connection seen on mongo is actually 3 threads on the app server (or however many nodes there are in the replica set).

maxPoolSize

Setting the maxPoolSize property on the mongo driver will help control how many connections an app server is allowed to open against a mongodb node. Be wary that the maxPoolSize default varies between drivers – for example, in Python it’s 100, but in Node.js it’s 5.
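
It is typically set in the connection string – a hedged example (hostnames, database and replica set name are made up):

mongodb://appuser:password@node1.example.net:27017,node2.example.net:27017,node3.example.net:27017/mydb?replicaSet=rs0&maxPoolSize=20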

Knowing the maxPoolSize for applications that have databases on the same cluster can then allow you to accurately calculate what the max connections could potentially be for a cluster. This could then help make more informed decisions about whether to scale or upsize a mongodb cluster or split applications out.
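
As a rough worked example (all numbers invented for illustration): three applications, each running 4 pods with maxPoolSize=25, could open up to 3 x 4 x 25 = 300 connections to a node – already uncomfortably close to an M10’s 350 limit once the 5-10% of system connections are accounted for.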

You can get more info about connection pool options here – https://docs.mongodb.com/manual/reference/connection-string/#connection-pool-options

Return pods running on nodes using external node IP

Some of our databases sit in managed environments which means connections from our applications can show up in the database logs as coming from an external IP at the edge of our cloud infrastructure.

I wrote a little script which will return the pods running on a kubernetes cluster node by specifying its external IP.

kubectl get pods -o wide --sort-by="{.status.phase}" --all-namespaces | grep `kubectl get nodes -o wide | grep 35.X.X.X | awk '{print $1}' | awk '{$1=$1;print}'`

I’ve found it quite useful for starting to diagnose where database connections are being initiated from.
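
An equivalent, slightly more readable two-step version (a sketch only – it assumes a reasonably recent kubectl with --field-selector support, and the IP is a placeholder):

# Find the node name that owns the external IP, then list the pods on that node
NODE=$(kubectl get nodes -o wide | grep 35.X.X.X | awk '{print $1}')
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=$NODE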

Efficiently retrieving Oracle CLOBs in high latency environments

The Problem

We have recently come across an interesting scenario whereby a Java application living in a container in Kubernetes in GCP needs to talk back to an Oracle 12.1 database in an on-premise DC.

There are several givens in these situations: the latency in the chatter between a cloud environment and an on-premise environment will be higher than between two services talking in the same DC, and, depending on the way you look at it, a high-latency situation will highlight poor code/queries or inefficiencies that may well have been hidden for years on premise thanks to the sub-2ms latency that developers and ops were lucky enough to have…

So, although there is short-term pain, there is definitely a lot to gain in spending the effort to make the app-to-database conversations more efficient.

Investigation

So how do you start investigating an application developer complaining that his database queries are taking 70-80 seconds since moving the application to the cloud, queries that used to take 2-3 seconds? Mainly by breaking down the problem and finding little nuggets (or red herrings!) to concentrate on.

We are lucky to have a multi-disciplined squad/team, so we are able to quickly troubleshoot from various angles including database, networking and our kubernetes architecture.

What metrics did we have?

Although we like to trust our developers, we also like to have proven data or logs to back up slowness claims. Our developer provided us with Kibana logs and Grafana dashboards which showed that in some cases a particularly heavy query was taking a significant amount of time (greater than 30 seconds, up to 80 seconds). The application had a timeout of 30 seconds, so in these cases it was user impacting.

We also had the query which was causing the problem, and from the Kibana logs we could see that the issue was only happening on big datasets – or at least was far more pronounced on big datasets.

Creating a safe testing environment

We knew we had a problem with the query in production, but I didn’t want to affect prod performance even more, so we backported a copy of the poorly performing dataset into our QA database environment.

As an example, if we continued to use prod to test the SELECT query, we might inadvertently flush hot data out of the buffer pool and replace it with our testing data.

Creating a repeatable test and finding patterns

Generally when I am investigating a problem, my approach is scientific: I make sure I am running both a fair test and a controlled experiment. A fair test is important because you should only change one variable at a time, and a control is important so that you have one test that follows the normal expected behaviour.

The query itself always returned 3727 rows with around 35MB of data from the database.

I embedded the query into a script and set some sqlplus options:

spool qadb21-local-5000.log
SET AUTOTRACE TRACEONLY
set ARRAYSIZE 5000
set TIMING on
set TERMOUT off
SET TRIMSPOOL ON
SET TRIMOUT ON
SET WRAP OFF
SELECT /*+ ALL_ROWS() */ V.ID as V1,A.*, V.*,M.*
FROM A JOIN V ON A.ID = V.A_ID
LEFT JOIN M ON M.V_ID = V.ID
WHERE A.ID = '2dfdda64-e15c-4cdd-9d56-2db1d013c6a0'
ORDER BY V.DISPLAY_ORDER;
spool off

AUTOTRACE TRACEONLY – I didn’t want to write any results to the spool, only the trace. The reason for this is that writing the result set to a file could add seconds to the query completion.

ARRAYSIZE – To test fetch size in sqlplus – I tested multiple array sizes of 50, 125, 250, 1000, 2500 & 5000

ALL_ROWS Hint – Just to force the same optimizer results behaviour.

I tested 4 scenarios in total:

  • Running the query directly on the Oracle DB [<1 second runtime]
  • Running the query from my laptop to the Oracle DB [25-30 seconds runtime]
  • Running the query from a container in our GCP testing project to the Oracle DB [1:50 – 2:00 minutes runtime]
  • Running the query from a container in our GCP PreProd project to the Oracle DB [1:50 – 2:00 minutes runtime]

By testing 2 different GCP projects against our database, we could see whether the problem was environment specific or a general issue. The query returned in similar durations in both our testing project and our preprod project, so we deduced that it was environment independent – but continued to double-check our networking config.

Below is an extract of the kind of output I was using to make sure the test stayed fair. I only took results if the recursive calls were 0 – i.e. no hard parses.

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
32406 consistent gets
28457 physical reads
0 redo size
3065681 bytes sent via SQL*Net to client
522366 bytes received via SQL*Net from client
6008 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
3727 rows processed

As the example output above shows, we were seeing over 6000 roundtrips for 3727 rows of data. This rang some alarm bells: in a high-latency environment every roundtrip adds to the total elapsed time, so this was a key area of focus.

SQLPlus has a default fetch size (ARRAYSIZE) of 15, so even at its worst we should be seeing no more than approximately 500 roundtrips (rows processed / fetch size * 2, i.e. 3727 / 15 * 2 ≈ 497).

I set up a SQL trace against my connection, and this also showed a high level of chatter between the database and client in all test cases. (https://oracle-base.com/articles/misc/sql-trace-10046-trcsess-and-tkprof)

EG:

...
FETCH #139847718960616:c=0,e=4,p=0,cr=0,cu=0,mis=0,r=0,dep=4,og=4,plh=2542797530,tim=15317741724313
CLOSE #139847718960616:c=0,e=1,dep=4,type=3,tim=15317741724339
EXEC #139847718275456:c=0,e=22,p=0,cr=0,cu=0,mis=0,r=0,dep=4,og=4,plh=3765558045,tim=15317741724381
FETCH #139847718275456:c=0,e=19,p=0,cr=3,cu=0,mis=0,r=1,dep=4,og=4,plh=3765558045,tim=15317741724409
FETCH #139847718275456:c=0,e=3,p=0,cr=0,cu=0,mis=0,r=1,dep=4,og=4,plh=3765558045,tim=15317741724424
FETCH #139847718275456:c=0,e=3,p=0,cr=0,cu=0,mis=0,r=0,dep=4,og=4,plh=3765558045,tim=15317741724435
CLOSE #139847718275456:c=0,e=0,dep=4,type=3,tim=15317741724446
...

Networking

Our network engineers tested various parts of our cloud to on-prem architecture, including transfer speeds between different environments, how many hops it was taking to get to and from the database, and packet loss.

We found that the network route the query was taking was doing one extra hop, so we did some remediation work to correct this – but we didn’t see much impact on query times [a reduction of less than 1-2 seconds].

We also did a tcpdump of the packets between my laptop and the database server, to see what was actually happening when the query ran…

sudo tcpdump -i en0 host 172.1.2.34 -w query.pcap
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on en0, link-type EN10MB (Ethernet), capture size 262144 bytes
08:58:05.142087 IP 10.10.1.1 > 172.1.2.34: ICMP echo request, id 35751, seq 0, length 64
08:58:05.146186 IP 172.1.2.34 > 10.10.1.1: ICMP echo reply, id 35751, seq 0, length 64
2 packets captured
804 packets received by filter
0 packets dropped by kernel

We ran the pcap file through Wireshark and could see straight away that the query was constantly going back and forth to the database – thousands of times in a 2 minute period. This matched up pretty well with the database statistics showing a high level of roundtrips, and also with the SQL trace output.

Digging Deeper into the fetch size

I broke up the query and looked into the data types of the result set, and found that there was a CLOB column storing JSON data. After investigating, I found that SQLPlus does individual row retrieval for queries with CLOBs – it has been developed so that it doesn’t use a fetch array when retrieving CLOBs (https://asktom.oracle.com/pls/apex/asktom.search?tag=performance-issue-with-clob).

To test that the CLOBs were the culprit, I converted the CLOB data to varchar2 and reran the query, and found that the data was returned in under 1 second with 2 roundtrips.

ALTER TABLE M ADD METRICS_DATA_VAR VARCHAR2(4000);
UPDATE M SET METRICS_DATA_VAR = dbms_lob.substr( METRICS_DATA, 4000, 1 );

How does the Java Driver CLOB retrieval work?

It seems it works in a similar fashion to the SQLPlus retrieval. Our CLOBs ranged between 4200 and 5000 characters. The default fetch size for CLOBs is 4000 characters, so potentially you could have 2 fetches for 1 CLOB depending on its size.

As of the Oracle 11.2 JDBC driver you can use a LOB prefetch.

statement1.setFetchSize(1000);
if (statement1 instanceof OracleStatement) {
    ((OracleStatement) statement1).setLobPrefetchSize(50000);
}

or it can be done globally for the application:

System.setProperty(OracleConnection.CONNECTION_PROPERTY_DEFAULT_LOB_PREFETCH_SIZE, "50000");
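
For reference, CONNECTION_PROPERTY_DEFAULT_LOB_PREFETCH_SIZE resolves to the connection property oracle.jdbc.defaultLobPrefetchSize (worth verifying against your driver version), so the same setting can also be passed in when the connection is created – a minimal sketch with placeholder URL and credentials:

import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

// Sketch: pass the LOB prefetch size as a connection property (values are placeholders)
Properties props = new Properties();
props.setProperty("user", "app_user");
props.setProperty("password", "app_password");
props.setProperty("oracle.jdbc.defaultLobPrefetchSize", "50000");

Connection conn = DriverManager.getConnection(
        "jdbc:oracle:thin:@//dbhost:1521/service_name", props);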

Solution

Setting CONNECTION_PROPERTY_DEFAULT_LOB_PREFETCH_SIZE to 50000 has made a huge difference to our application performance. Why 50000? I worked it out based on being able to fit 10 rows per fetch without overloading the application Java heap.

We have seen queries that were taking 70 seconds on big datasets now taking less than 3 seconds. Round-trip latency is the major factor at play here, and reducing the number of fetch trips the query makes to retrieve the data significantly boosts performance.

Group IP address counts from a log

A colleague wrote a great little script to do some analysis against an Oracle listener log and to show how many listener connections were being established from each unique IP in the log.

cd /u01/app/oracle/diag/tnslsnr/node_name/listener_qa/trace
grep DB_DEFAULT_SERVICE listener_qa.01-Nov-2018-W44.log | grep -oE "[0-9]{1,3}[.][0-9]{1,3}[.][0-9]{1,3}[.][0-9]{1,3}" | sort | uniq -c

The output looks something like this:

20277 172.0.0.1
37176 172.0.0.4
18    172.1.1.1
2300  172.2.3.10
2300  172.2.3.8
96    172.2.3.4
1012  172.2.3.7
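
If you want the busiest IPs first, the same pipeline can simply be finished with a reverse numeric sort (a small tweak, not part of the colleague’s original script):

grep DB_DEFAULT_SERVICE listener_qa.01-Nov-2018-W44.log \
  | grep -oE "[0-9]{1,3}[.][0-9]{1,3}[.][0-9]{1,3}[.][0-9]{1,3}" \
  | sort | uniq -c | sort -rn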


Using Cloudsqlproxy to initiate a connection to MySQL from the shell

The cloud sql proxy is a really useful way of managing connectivity to cloudSQL instances. The proxy provides secure access to your Cloud SQL Second Generation instances without having to whitelist IP addresses or configure SSL.

Accessing Cloud SQL instances using the Cloud SQL Proxy offers these advantages:

  • Secure connections: The proxy automatically encrypts traffic to and from the database using TLS 1.2 with a 128-bit AES cipher; SSL certificates are used to verify client and server identities.
  • Easier connection management: The proxy handles authentication with Cloud SQL, removing the need to provide static IP addresses.

There are a few pre-requisites before this will work:

  • Only Second Generation instances can benefit from this feature.
  • Cloud SQL Admin API must be enabled
  • The instance must have either a public IPv4 address or be configured to use a private IP.
  • The proxy needs to be given GCP authentication credentials, either an individual or a service account.

Full details of exactly what to do are here – https://cloud.google.com/sql/docs/mysql/sql-proxy 

Some notes on what I did on macOS Sierra:

  1. Created a Service Account, gave it the Cloud SQL Client role and then generated a new private key (JSON key).
  2. Downloaded the proxy binary:

curl -o cloud_sql_proxy https://dl.google.com/cloudsql/cloud_sql_proxy.darwin.amd64

Once downloaded, give execute permissions on the proxy file:

chmod +x cloud_sql_proxy

Then start the proxy:

./cloud_sql_proxy -instances=project:region:instance=tcp:3306 \
-credential_file=/Users/dbamohsin/sslkeys/cloudsqlproxy-env.json

The proxy will now start listening on your local machine for connection attempts against port 3306:

2018/11/09 11:53:10 Listening on 127.0.0.1:3306 for project:region:instance
2018/11/09 11:53:10 Ready for new connections

Then in another terminal window, connect as normal to a mysql instance using the localhost IP:

mysqlsh -udbamohsin -p -h127.0.0.1

This should make a connection to the cloudSQL instance using the proxy and the connection will show in the proxy log:

2018/11/09 11:53:42 New connection for "project:region:instance"

Deleting files older than X days

A couple of different ways to purge files older than a certain number of days.

find -type f -mtime +350 -name 'CopyCluster*' -execdir rm -- '{}' \;

If you want to be very precise about the number of days ago you want to delete from, then you can calculate the number of days from now using:

days=$(( ( $(date '+%s') - $(date -d '1 months ago' '+%s') ) / 86400 ))
echo $days

and then use the variable to delete

find /logs/CopyCluster* -mtime +$days -type f -delete

To do this all in one command

find /tmp/*.log -type f \
-mtime "+$(( ( $(date '+%s') - $(date -d '1 months ago' '+%s') ) / 86400 ))" -delete