Difference between revisions of "MongoDB Aggregate Pipeline"

From mi-linux
Revision as of 15:42, 12 November 2017


Aggregation Pipeline

The aggregation pipeline is a framework for data aggregation modelled on the concept of data processing pipelines: documents enter a multi-stage pipeline that transforms them into aggregated results.

This is similar to using GROUP BY in SQL, where you might aggregate the average grades of all students taking a module.

The MongoDB aggregation pipeline consists of stages and each stage transforms the documents as they pass through the pipeline. A stage can generate new documents or filter out documents. A stage can also appear several times in the pipeline.

The syntax is:

db.collectionName.aggregate( [ { <stage> }, ... ] )

The pipeline could, for instance:

  • project certain details out of each document, such as selected employee details;
  • group the projected details by a certain field and apply an aggregate function, such as grouping by deptno and counting the number of occurrences;
  • sort the results into order;
  • limit the results to a certain number, such as the first 10.

These steps correspond to the stages $project, $group, $sort and $limit.
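As a rough sketch, the four steps listed above might be chained as follows. The field names deptno and sal are assumed from the emp examples used in this workbook, and the output field name headcount is invented for illustration. The guard around db lets the snippet load outside the mongo shell too, in which case it only builds the pipeline.

```javascript
// Sketch only: field names (deptno, sal) are assumed from the emp examples;
// "headcount" is an illustrative output name.
var pipeline = [
  // keep only the fields needed by later stages
  { $project: { deptno: 1, sal: 1 } },
  // one output document per department, counting its employees
  { $group: { _id: "$deptno", headcount: { $sum: 1 } } },
  // largest departments first
  { $sort: { headcount: -1 } },
  // return at most the first 10 results
  { $limit: 10 }
];

// db only exists inside the mongo shell, so guard the call
if (typeof db !== "undefined") {
  db.emp.aggregate(pipeline).forEach(printjson);
}
```

Each stage receives the documents produced by the stage before it, so the order of the array matters.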

A number of operations exist for the aggregation pipeline, details of which can be found in the MongoDB manual:

https://docs.mongodb.com/manual/reference/operator/aggregation/

$group

$group takes a set of input documents, groups them by a specified key and then applies an aggregate function to each group.

For example, to sum the salaries in the emp collection for each department:

db.emp.aggregate([
  { $group: { _id: "$deptno", total: { $sum: "$sal" } } }
])

This is similar to the SQL command:

SELECT deptno, SUM(sal) AS total FROM emp
GROUP BY deptno;
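
To see what the $group stage computes, the same grouping can be imitated in plain JavaScript on a small in-memory array. This is only an illustration: the sample documents and the helper name sumByDept are invented here, and MongoDB itself does this work on the server.

```javascript
// Invented sample documents standing in for the emp collection
var emp = [
  { ename: "A", deptno: 10, sal: 2450 },
  { ename: "B", deptno: 10, sal: 5000 },
  { ename: "C", deptno: 20, sal: 800 },
  { ename: "D", deptno: 20, sal: 3000 },
  { ename: "E", deptno: 30, sal: 1600 }
];

// Hypothetical helper mimicking
// { $group: { _id: "$deptno", total: { $sum: "$sal" } } }
function sumByDept(docs) {
  var totals = {}; // running salary total per deptno
  docs.forEach(function (doc) {
    totals[doc.deptno] = (totals[doc.deptno] || 0) + doc.sal;
  });
  // one result document per group, shaped like the $group output
  return Object.keys(totals).map(function (key) {
    return { _id: Number(key), total: totals[key] };
  });
}

var result = sumByDept(emp);
```

Here result holds one document per department, each with the department number in _id and the summed salaries in total. As with $group, no particular output order should be relied on unless a sort is applied afterwards.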


$lookup

Other Functions

Count

The aggregation pipeline is most useful when the data needs processing in several steps. For simple tasks such as counting, the shell also offers dedicated functions.

Let's count how many employees are in department 10:

db.emp.count({deptno: 10})


You can also append count() to a find query to count the matching documents instead of listing them:

db.dept.find({dname:"SALES"}).count()
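
The count of employees in department 10 can also be expressed as an aggregation pipeline, which may be useful when counting is only one step of a longer pipeline. A minimal sketch, assuming MongoDB 3.4 or later for the $count stage; the output field name employees is chosen here for illustration.

```javascript
// Sketch: count the employees in department 10 with a pipeline
var countPipeline = [
  { $match: { deptno: 10 } }, // keep only documents from department 10
  { $count: "employees" }     // emit a single document { employees: <n> }
];

// db only exists inside the mongo shell, so guard the call
if (typeof db !== "undefined") {
  db.emp.aggregate(countPipeline).forEach(printjson);
}
```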


Distinct

Sometimes you want to find the distinct values of a specified field (similar to DISTINCT in SQL):

db.emp.distinct("deptno")

Next Step

Updating the collection