Group IP address counts from a log

A colleague wrote a handy little script that analyzes an Oracle listener log and shows how many listener connections were established from each unique IP address in the log.

cd /u01/app/oracle/diag/tnslsnr/node_name/listener_qa/trace
grep DB_DEFAULT_SERVICE listener_qa.01-Nov-2018-W44.log | grep -oE "[0-9]{1,3}[.][0-9]{1,3}[.][0-9]{1,3}[.][0-9]{1,3}" | sort | uniq -c

The output looks something like this:

20277 172.0.0.1
37176 172.0.0.4
18    172.1.1.1
2300  172.2.3.10
2300  172.2.3.8
96    172.2.3.4
1012  172.2.3.7
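
For anyone who wants to see what each stage of that pipeline does, here is a rough JavaScript (Node) equivalent. It is only a sketch: the sample lines are invented and only loosely resemble listener log entries, and `countIps` is a hypothetical helper name.

```javascript
// Hypothetical helper mirroring: grep MARKER | grep -oE IP | sort | uniq -c
function countIps(lines, marker) {
  const ipPattern = /[0-9]{1,3}[.][0-9]{1,3}[.][0-9]{1,3}[.][0-9]{1,3}/g;
  const counts = new Map();
  for (const line of lines) {
    if (!line.includes(marker)) continue;            // first grep: keep matching lines
    for (const ip of line.match(ipPattern) || []) {  // grep -oE: pull out each IP
      counts.set(ip, (counts.get(ip) || 0) + 1);     // uniq -c: count occurrences
    }
  }
  // sort: order the result by IP string
  return [...counts.entries()].sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
}

// Invented sample lines, not real listener log output.
const sampleLines = [
  "... (HOST=172.0.0.1)(PORT=50001)) * establish * DB_DEFAULT_SERVICE * 0",
  "... (HOST=172.0.0.4)(PORT=50002)) * establish * DB_DEFAULT_SERVICE * 0",
  "... (HOST=172.0.0.1)(PORT=50003)) * establish * DB_DEFAULT_SERVICE * 0",
  "... (HOST=172.9.9.9)(PORT=50004)) * establish * OTHER_SERVICE * 0",
];

for (const [ip, n] of countIps(sampleLines, "DB_DEFAULT_SERVICE")) {
  console.log(n, ip);
}
```

Note that the pattern matches any dotted quad on the line, so a line containing more than one address contributes to more than one count, just as in the grep version.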

mongodb exception: can’t convert from BSON type String to Date

Problem: When aggregating on timestamps that are stored as strings, MongoDB cannot convert them to dates, so date operators such as $year fail.

repmongo:SECONDARY> db.collection.aggregate(
...     { $match : { "type" : "Review"}},
...     { $group : {
...         _id: {
...             year : { $year : "$created" },
...             month : { $month : "$created" },
...             day : { $dayOfMonth : "$created" },
...         },
...         count: { $sum: 1 }
...     }},
...     { $sort : { _id : 1}}
... );
assert: command failed: {
        "errmsg" : "exception: can't convert from BSON type String to Date",
        "code" : 16006,
        "ok" : 0
} : aggregate failed

To see which fields are of which type:

repmongo:SECONDARY> typeof db.collection.findOne().created;
string
repmongo:SECONDARY> typeof db.collection.findOne().updated;
string

Solution 1: The best fix is to resolve the data type inconsistency at the application layer, so that timestamps are written to the database as native dates (the BSON Date type, displayed as ISODate) that the date operators can work with directly. The existing string values would also need to be converted to the date type.
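
As a rough sketch of the one-off conversion that Solution 1 implies, the snippet below turns string fields into Date objects. It assumes the stored strings are in a format the JavaScript `Date` constructor can parse (such as ISO 8601) and that only `created` and `updated` need converting. `toDateFields` is a hypothetical helper. The guarded block at the end runs only inside a mongo shell and uses `updateOne` and the `$type: "string"` alias, both of which assume MongoDB 3.2 or later. Try it on a copy of the data first.

```javascript
// Hypothetical helper: builds a $set document holding parsed dates for the
// string-valued fields, skipping values that do not parse.
function toDateFields(doc, fields) {
  const changes = {};
  for (const field of fields) {
    if (typeof doc[field] !== "string") continue;
    const parsed = new Date(doc[field]);
    if (!isNaN(parsed.getTime())) changes[field] = parsed;
  }
  return changes;
}

// Plain JavaScript check with an invented document.
const example = toDateFields(
  { created: "2014-10-21T08:15:00Z", updated: "not a date" },
  ["created", "updated"]
);
console.log(example); // only "created" is converted

// Runs only where a mongo shell "db" object exists.
// Assumption: documents to fix are those whose "created" is still a string.
if (typeof db !== "undefined") {
  db.collection.find({ created: { $type: "string" } }).forEach(function (doc) {
    const changes = toDateFields(doc, ["created", "updated"]);
    if (Object.keys(changes).length > 0) {
      db.collection.updateOne({ _id: doc._id }, { $set: changes });
    }
  });
}
```

Values that fail to parse are left untouched, so they can be found and handled separately afterwards.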

Solution 2: For immediate analysis of the existing data, use a regex to filter the documents and $substr to extract the date parts.

db.collection.aggregate(
    { $match : { "created" : { $in : [/2014-10/] } } },
    { $group : {
        _id: {
            year :  { $substr : ["$created", 0, 4] },
            month : { $substr : ["$created", 5, 2] },
            day :   { $substr : ["$created", 8, 2] }
        },
        count: { $sum: 1 }
    }},
    { $sort : { _id : -1 } }
);

The above returns counts grouped by year, month and day. The regex condition acts like a SQL LIKE clause on the date string.

{ "_id" : { "year" : "2014", "month" : "10", "day" : "21" }, "count" : 1 }

{ "_id" : { "year" : "2014", "month" : "10", "day" : "06" }, "count" : 1 }

In SQL, the above would probably look something like this:

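-- SUBSTR is 1-based, so each offset is one higher than in $substr
SELECT
    SUBSTR(created, 1, 4) AS created_year,
    SUBSTR(created, 6, 2) AS created_month,
    SUBSTR(created, 9, 2) AS created_day,
    COUNT(*) AS cnt
FROM collection
WHERE created LIKE '%2014-10%'
GROUP BY
    SUBSTR(created, 1, 4),
    SUBSTR(created, 6, 2),
    SUBSTR(created, 9, 2)
ORDER BY created_year DESC, created_month DESC, created_day DESC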