Difference between revisions of "MongoDB Aggregate Pipeline"

From mi-linux
Revision as of 15:38, 12 November 2017


Aggregation Pipeline

The aggregation pipeline is a framework for data aggregation modelled on the concept of data processing pipelines. This means that documents enter a multi-stage pipeline, which transforms them into aggregated results.

This is similar to using GROUP BY in SQL, where you might aggregate the average grades of all students taking a module.
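To make the GROUP BY comparison concrete, here is a minimal sketch of the average-grade query written as a one-stage pipeline. The collection name grades and the fields module and grade are assumptions made for illustration; they are not part of the workbook data.

```javascript
// Hypothetical collection "grades" with documents such as
//   { student: "S1", module: "Databases", grade: 62 }
// (collection and field names are assumed for this example).
//
// Roughly equivalent SQL:
//   SELECT module, AVG(grade) FROM grades GROUP BY module;
const avgByModule = [
  // _id is the grouping key; $avg averages "grade" within each group
  { $group: { _id: "$module", avgGrade: { $avg: "$grade" } } }
];

// `db` only exists inside the mongo shell, so the guard lets this
// file be loaded elsewhere without raising a ReferenceError.
if (typeof db !== "undefined") {
  db.grades.aggregate(avgByModule).forEach(printjson);
}
```

Each result document would carry the module name in _id and the computed average in avgGrade.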

The MongoDB aggregation pipeline consists of stages and each stage transforms the documents as they pass through the pipeline. A stage can generate new documents or filter out documents. A stage can also appear several times in the pipeline.
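As an illustration of a stage appearing more than once, the sketch below uses $match twice: first to filter the input documents, then again to filter the grouped results (much like WHERE and HAVING in SQL). It assumes the emp collection holds one document per employee with a deptno field; the threshold of 3 is arbitrary.

```javascript
// Assumes db.emp has one document per employee, each with a "deptno" field.
const largeDepartments = [
  // First $match: drop employees that have no department
  { $match: { deptno: { $ne: null } } },
  // Count employees per department
  { $group: { _id: "$deptno", employees: { $sum: 1 } } },
  // Second $match: keep only departments with more than 3 employees
  // (the threshold is an arbitrary value chosen for the example)
  { $match: { employees: { $gt: 3 } } }
];

// `db` only exists inside the mongo shell.
if (typeof db !== "undefined") {
  db.emp.aggregate(largeDepartments).forEach(printjson);
}
```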

The syntax is:

db.collectionName.aggregate( [ { <stage> }, ... ] )

For instance, a pipeline could:

  • project certain details out of each document, such as the employees;
  • group the projected details by one or more fields and apply an aggregate function, such as grouping by deptno and counting the number of occurrences;
  • sort the results into order;
  • limit the results to a certain number, such as the first 10.

These steps are represented by the following stage operators: $project, $group, $sort and $limit.
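The four steps can be chained into a single pipeline, sketched here under the assumption that the emp collection stores one document per employee with a deptno field. The output field name employees is chosen for the example.

```javascript
// Assumes db.emp has one document per employee, each with a "deptno" field.
const employeesPerDept = [
  // 1. Keep only the field needed by the later stages
  { $project: { _id: 0, deptno: 1 } },
  // 2. Group by department and count the documents in each group
  { $group: { _id: "$deptno", employees: { $sum: 1 } } },
  // 3. Largest departments first
  { $sort: { employees: -1 } },
  // 4. Return at most the first 10 results
  { $limit: 10 }
];

// `db` only exists inside the mongo shell.
if (typeof db !== "undefined") {
  db.emp.aggregate(employeesPerDept).forEach(printjson);
}
```

The stages run in the order listed, so swapping $sort and $limit would limit the unsorted groups first and could return a different set of departments.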

A number of operations exist for the aggregation pipeline, details of which can be found in the MongoDB manual:

https://docs.mongodb.com/manual/reference/operator/aggregation/


$group

$lookup

Count

The power of the aggregation pipeline lies in processing the data, not just retrieving it.

Let's count how many employees are in department 10:

db.emp.count({deptno: 10})


You can also add count() to a find query to count the records returned, instead of listing them:

db.dept.find({dname:"SALES"}).count()
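For comparison, the same count can be written as an aggregation pipeline using $match followed by the $count stage (available from MongoDB 3.4). This is a sketch rather than the workbook's own method; the output field name employees is chosen for the example. Note that count() is deprecated in recent shells, where countDocuments() is the usual replacement.

```javascript
// Count the employees in department 10 with an aggregation pipeline.
const countDept10 = [
  // Keep only the employees of department 10
  { $match: { deptno: 10 } },
  // Emit a single document holding the number of documents received;
  // if nothing matched, $count emits no document at all
  { $count: "employees" }
];

// `db` only exists inside the mongo shell.
if (typeof db !== "undefined") {
  db.emp.aggregate(countDept10).forEach(printjson);
  // Non-deprecated counterpart of db.emp.count({deptno: 10})
  print(db.emp.countDocuments({ deptno: 10 }));
}
```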

Next Step

Updating the collection