Mongo DB Aggregation – Part I – Single Purpose Aggregation Operations

Posted on July 21, 2015 by By Shraddha Tambe, in Databases | 0

Mongo DB Aggregation(Part 1) – Single Purpose Aggregation Operations

MongoDB offers more than one ways of aggregating data stored in a collection. It has a flexible datamodel and gives us tools to write some pretty complex queries and derive results similar to what SQL queries could do. Like queries, aggregation operations in MongoDB use collections of documents as an input and return results in the form of one or more documents.

There are 3 main aggregation operators :

Single purpose aggregation operations
Aggregation Framework
Map Reduce

In this part, we will have a look at the “Single purpose aggregation operators” and their usage with examples. Though these are limited in scope, they might provide simple way of doing the common data processing operations.

We will consider below collection as an example for reference:

{ "_id": 1, "emp_dept": "Sales", joining_date: "01-10-2014", "job_roles": [ "Manager", "Regional Head" ], salary : { "base": 2100, "variable": 800 }}
{ "_id": 2, "emp_dept": "Sales", joining_date: "03-01-2014", "job_roles": [ "Analyst", "Sales Officer"], salary : { "base": 2000, "variable": 500 }}
{ "_id": 3, "emp_dept": "Finance", joining_date: "31-01-2015", "job_roles": "Manager" , salary : { "base": 1000, "variable": 500 }}
{ "_id": 4, "emp_dept": "Sales", joining_date: "03-03-2015", "job_roles": [ "Analyst"], salary : { "base": 1500, "variable": 500 } }
{ "_id": 5, "emp_dept": "Finance", joining_date: "28-05-2015", "job_roles": [ "Analyst" , "Regional Head"], salary : { "base": 3000, "variable": 500 } }

To help easy understanding, we will first write an SQL query for each example and then translate to a MongoDB command.

1. COUNT – counts all documents or the ones that match a query (if specified)

Syntax:

db.<collection.count(query)
or
db.runCommand({ count: , query: , limit: , skip: , hint: })

Examples:

Query ==>

select count(*) from employees where joining_date >= '01-01-2015';

In Mongo ==>

db.employees.count({ joining_date: { $gte: new Date("2014","12","31") }})
or
db.runCommand( { count:'employees',
query: { joining_date: { $gte: new Date("2014","12","31") } }
} )

Result ==>

{ "n" : 3, "ok" : 1 }

* hint option is like a performance hint, used to specify a index that can be used.
2. DISTINCT – returns distinct values for a key in collection. In case the key is an array field, it considers each value in the array as different value.

Syntax:

db..distinct(field, query)
or
db.runCommand({ distinct: "", key: "", query: })

Examples:

Query 1 ===>

Select emp_dept from employees;

In Mongo ===>

db.runCommand({ distinct: "employees", key: "emp_dept"})
or db.employees.distinct("emp_dept")

Result ==>

{
"values" : [ "Sales", "Finance" ],
"stats" : { ... },
"ok" : 1
}

Query 2 ===>

Select distinct job_roles from employees where emp_dept = 'Sales';

In Mongo ===>

db.runCommand ( { distinct: "employees", key: "job_roles", query: { emp_dept: "Sales"} } )

Result ===>

{
"values" : [ "Manager", "Regional Head", "Analyst", "Sales Officer"],
"stats" : { ... },
"ok" : 1
}

3. GROUP – groups the documents in a collection by given key and allows to perform simple aggreation functions like sum/count on fields.

Syntax:

db.collection.group({ key, reduce, initial [, keyf] [, cond] [, finalize] })
or
db.runCommand({
group:
{
ns: ,
key: ,
$reduce: ,
initial:,
$keyf: ,
cond: ,
finalize:
}
})

Examples:

Query ===>

Select emp_dept,
sum(salarybase) as basesalary, sum(salaryvariable) as variablesalary,
count(_id) as empcount, avg(salary) as avgsalary
from employees
group by emp_dept;

In Mongo ===>

db.runCommand({ group:{
    ns: "employees",
    key:{emp_dept:1},
    initial: { basesalary : 0, variablesalary: 0, empcount:0 },
    $reduce: function ( curr, result ) {
    result.emp_dept = curr.emp_dept,
    result.basesalary += curr.salary.base;
    result.variablesalary += curr.salary.variable;
    result.empcount++;
    },
    $finalize:function(result){
    result.avgsalary = (result.basesalary+result.variablesalary)/result.empcount;
    }}})

or 

db.employees.group({key:{emp_dept:1},
    initial: { basesalary : 0, variablesalary: 0,empcount:0 }, reduce:function ( curr, result ) {
    result.emp_dept = curr.emp_dept,
    result.basesalary += curr.salary.base;
    result.variablesalary += curr.salary.variable;
    result.empcount++;
},finalize:function(result){
result.avgsalary = (result.basesalary+result.variablesalary)/result.empcount;
}})

Result ===>

{"retval" :[
{ "emp_dept" : "Sales", "basesalary" : 5600 ,"variablesalary": 1800, "empcount": 3, "avgsalary": 2466.666666666667},
{ "emp_dept" : "Finance", "basesalary" : 4000 ,"variablesalary": 1000, "empcount":2,"avgsalary": 2500},
],
"count" : 5,
"keys" : 3,
"ok" : 1}

This Group function / command does not work with sharded clusters and also has limitation on the size of the array that is returned. The aggregation framework or map reduce provide more stronger options.

So, this is it. Simple but handy.

We will look at the Aggregation Framework in Part II of this blog. Stay tuned !!

Shraddha Tambe | Helical IT Solutions

References:
http://docs.mongodb.org/manual/core/single-purpose-aggregation/
http://docs.mongodb.org/manual/reference/method/js-collection/

Best Open Source Business Intelligence Software Helical Insight is Here

A Business Intelligence Framework

MongoDB mongodb aggregation

0 0 votes

Article Rating

0 Comments

Inline Feedbacks

View all comments

You might also like..

Helical Insight

Helical IT Solutions Launches Helical Insight 5.2.2 : Focus on Advance Embedded Analytics

By admin

24 Dec 2024: Helical IT Solutions is excited to unveil Helical Insight 5.2.2, the latest iteration of its cutting-edge Open Source Business Intelligence (BI) platform. This release reinforces Helical Insight's position as a cost-effective, versatile, and powerful alternative to mainstream...

Helical Insight 5.2.1

Helical IT Solutions Launches Helical Insight 5.2.1: Elevating Open Source BI to New Heights

By admin

02 Sept 2024 – Helical IT Solutions is thrilled to announce the release of Helical Insight version 5.2.1, the latest upgrade to its Open Source Business Intelligence (BI) platform. This new version delivers a powerful, cost-effective BI solution that is...

Business Intelligence

Installation of Firebird db

By admin

Steps to install firebird db 1. Go to google and type firebird in search box and then click on first link. License aggrement 2. Click on downloads and then install Firebird latest version(5.0.0). 3. It will navigate to the below...

About Helical IT Solutions Pvt Ltd

Location

Contact Us

Search what you are looking for..

Mongo DB Aggregation – Part I – Single Purpose Aggregation Operations

Posted on July 21, 2015 by By Shraddha Tambe, in Databases | 0

A Business Intelligence Framework

You might also like..

Helical Insight

Helical IT Solutions Launches Helical Insight 5.2.2 : Focus on Advance Embedded Analytics

By admin

Helical Insight 5.2.1

Helical IT Solutions Launches Helical Insight 5.2.1: Elevating Open Source BI to New Heights

By admin

Business Intelligence

Installation of Firebird db

By admin

Contact Form