Mongo DB Aggregation – Part I – Single Purpose Aggregation Operations

Posted on July 21, 2015 by By Shraddha Tambe, in Databases | 0

Mongo DB Aggregation(Part 1) – Single Purpose Aggregation Operations

MongoDB offers more than one ways of aggregating data stored in a collection. It has a flexible datamodel and gives us tools to write some pretty complex queries and derive results similar to what SQL queries could do. Like queries, aggregation operations in MongoDB use collections of documents as an input and return results in the form of one or more documents.

There are 3 main aggregation operators :

Single purpose aggregation operations
Aggregation Framework
Map Reduce

In this part, we will have a look at the “Single purpose aggregation operators” and their usage with examples. Though these are limited in scope, they might provide simple way of doing the common data processing operations.

We will consider below collection as an example for reference:

{ "_id": 1, "emp_dept": "Sales", joining_date: "01-10-2014", "job_roles": [ "Manager", "Regional Head" ], salary : { "base": 2100, "variable": 800 }}
{ "_id": 2, "emp_dept": "Sales", joining_date: "03-01-2014", "job_roles": [ "Analyst", "Sales Officer"], salary : { "base": 2000, "variable": 500 }}
{ "_id": 3, "emp_dept": "Finance", joining_date: "31-01-2015", "job_roles": "Manager" , salary : { "base": 1000, "variable": 500 }}
{ "_id": 4, "emp_dept": "Sales", joining_date: "03-03-2015", "job_roles": [ "Analyst"], salary : { "base": 1500, "variable": 500 } }
{ "_id": 5, "emp_dept": "Finance", joining_date: "28-05-2015", "job_roles": [ "Analyst" , "Regional Head"], salary : { "base": 3000, "variable": 500 } }

To help easy understanding, we will first write an SQL query for each example and then translate to a MongoDB command.

1. COUNT – counts all documents or the ones that match a query (if specified)

Syntax:

db.<collection.count(query)
or
db.runCommand({ count: , query: , limit: , skip: , hint: })

Examples:

Query ==>

select count(*) from employees where joining_date >= '01-01-2015';

In Mongo ==>

db.employees.count({ joining_date: { $gte: new Date("2014","12","31") }})
or
db.runCommand( { count:'employees',
query: { joining_date: { $gte: new Date("2014","12","31") } }
} )

Result ==>

{ "n" : 3, "ok" : 1 }

* hint option is like a performance hint, used to specify a index that can be used.
2. DISTINCT – returns distinct values for a key in collection. In case the key is an array field, it considers each value in the array as different value.

Syntax:

db..distinct(field, query)
or
db.runCommand({ distinct: "", key: "", query: })

Examples:

Query 1 ===>

Select emp_dept from employees;

In Mongo ===>

db.runCommand({ distinct: "employees", key: "emp_dept"})
or db.employees.distinct("emp_dept")

Result ==>

{
"values" : [ "Sales", "Finance" ],
"stats" : { ... },
"ok" : 1
}

Query 2 ===>

Select distinct job_roles from employees where emp_dept = 'Sales';

In Mongo ===>

db.runCommand ( { distinct: "employees", key: "job_roles", query: { emp_dept: "Sales"} } )

Result ===>

{
"values" : [ "Manager", "Regional Head", "Analyst", "Sales Officer"],
"stats" : { ... },
"ok" : 1
}

3. GROUP – groups the documents in a collection by given key and allows to perform simple aggreation functions like sum/count on fields.

Syntax:

db.collection.group({ key, reduce, initial [, keyf] [, cond] [, finalize] })
or
db.runCommand({
group:
{
ns: ,
key: ,
$reduce: ,
initial:,
$keyf: ,
cond: ,
finalize:
}
})

Examples:

Query ===>

Select emp_dept,
sum(salarybase) as basesalary, sum(salaryvariable) as variablesalary,
count(_id) as empcount, avg(salary) as avgsalary
from employees
group by emp_dept;

In Mongo ===>

db.runCommand({ group:{
    ns: "employees",
    key:{emp_dept:1},
    initial: { basesalary : 0, variablesalary: 0, empcount:0 },
    $reduce: function ( curr, result ) {
    result.emp_dept = curr.emp_dept,
    result.basesalary += curr.salary.base;
    result.variablesalary += curr.salary.variable;
    result.empcount++;
    },
    $finalize:function(result){
    result.avgsalary = (result.basesalary+result.variablesalary)/result.empcount;
    }}})

or 

db.employees.group({key:{emp_dept:1},
    initial: { basesalary : 0, variablesalary: 0,empcount:0 }, reduce:function ( curr, result ) {
    result.emp_dept = curr.emp_dept,
    result.basesalary += curr.salary.base;
    result.variablesalary += curr.salary.variable;
    result.empcount++;
},finalize:function(result){
result.avgsalary = (result.basesalary+result.variablesalary)/result.empcount;
}})

Result ===>

{"retval" :[
{ "emp_dept" : "Sales", "basesalary" : 5600 ,"variablesalary": 1800, "empcount": 3, "avgsalary": 2466.666666666667},
{ "emp_dept" : "Finance", "basesalary" : 4000 ,"variablesalary": 1000, "empcount":2,"avgsalary": 2500},
],
"count" : 5,
"keys" : 3,
"ok" : 1}

This Group function / command does not work with sharded clusters and also has limitation on the size of the array that is returned. The aggregation framework or map reduce provide more stronger options.

So, this is it. Simple but handy.

We will look at the Aggregation Framework in Part II of this blog. Stay tuned !!

Shraddha Tambe | Helical IT Solutions

References:
http://docs.mongodb.org/manual/core/single-purpose-aggregation/
http://docs.mongodb.org/manual/reference/method/js-collection/

Best Open Source Business Intelligence Software Helical Insight is Here

A Business Intelligence Framework

MongoDB mongodb aggregation

0 0 votes

Article Rating

0 Comments

Inline Feedbacks

View all comments

You might also like..

Business Intelligence

Installation of Firebird db

By admin

Steps to install firebird db 1. Go to google and type firebird in search box and then click on first link. License aggrement 2. Click on downloads and then install Firebird latest version(5.0.0). 3. It will navigate to the below...

Software Testing

Defect Life Cycle

By admin

This blog explains about the complete life cycle of a bug and different status of bug from the stage it was identified,fixed,retest and close. What is Defect life cycle? Defect life cycle is the life cycle of a defect or...

Software Testing

Different Levels of Testing in Software Testing

By admin

What are the Levels of Software Testing? In this blog,we are going to understand the various levels of software testing In Software Testing,we have four different levels of testing,which are as mentioned below: Unit Testing Integration Testing System Testing Acceptance...

About Helical IT Solutions Pvt Ltd

Location

Contact Us

Search what you are looking for..

Mongo DB Aggregation – Part I – Single Purpose Aggregation Operations

Posted on July 21, 2015 by By Shraddha Tambe, in Databases | 0

A Business Intelligence Framework

You might also like..

Business Intelligence

Installation of Firebird db

By admin

Software Testing

Defect Life Cycle

By admin

Software Testing

Different Levels of Testing in Software Testing

By admin

Contact Form