Difference between revisions of "MongoDB QueryData"
Line 140: | Line 140: | ||
</pre> | </pre> | ||
− | '''$filter''' is one of the stages of the pipeline and | + | '''$filter''' is one of the stages of the pipeline and can not be used by itself. It is used with an aggregation framework operator, such as '''$project'''. |
The syntax is: | The syntax is: |
Revision as of 20:23, 10 November 2016
Main Page >> MongoDB >>MongoDB Workbook >> Querying Collections
Querying a collection
The find() function can be used to query the documents.
The format is:
db.collectionName.find(optional_query_criteria)
Where the query_criteria follows a pattern:
db.collectionName.find({fieldName: "value"})
Note:
- the criteria is enclosed in curly brackets: {}
- the value needs quotes if it is a string or date value
- quotes are optional for the fieldName, so long as they do not contain spaces
- if the fieldName refers to a nested document, the name must be in matching single or double quotes
Find all documents
For example, show all the data so far in the deptCollection:
db.deptCollection.find()
The data comes back messy. The pretty() function can be used to improve the layout::
db.deptCollection.find().pretty()
Find One document
To find just one document - department 10:
db.deptCollection.find({deptno:10}).pretty()
Finding an employee means using the array name too:
db.deptCollection.find({"employees.empno":7902}).pretty()
However, this does mean you get back all the employees in the department they were found in!
Since Version 2.2 MongoDB's new $elemMatch can be used with arrays to return only the first element matching the $elemMatch condition:
db.deptCollection.find({deptno:20}, { _id: 0, employees: {$elemMatch: {empno: 7902}}}).pretty()
$elemMatch limits the contents of the employees array to contain only the first element matching the $elemMatch condition.
This is akin to a SQL query:
SELECT * FROM Emp WHERE deptno=20 AND empno = 7902
_id is a unique value automatically generated by MongoDB (like a Primary Key, except it is unique for the whole database).
Using _id:0 suppresses the ObjectId value, however to see it:
db.deptCollection.find({deptno:20}, { employees: {$elemMatch: {empno: 7902}}}).pretty()
More about _ids in the next section.
Find with Query Criteria
The query criteria can be as complex as that found in SQL.
To find all employees earning more than 2000 in department 10:
db.deptCollection.find({deptno:10}, { employees: {$elemMatch: {sal: { $gt: 2000}}}}).pretty()
Same again for department 20 and the managers:
db.deptCollection.find({deptno:20}, { employees: {$elemMatch: {sal: { $gt: 2000}, job: "MANAGER"}}}).pretty()
employees is an array, so $elemMatch only returns the first matching value. What if we try this instead:
db.deptCollection.find({ "employees.sal" : { $gt: 2000}}).pretty()
Things to note:
- This time employees.sal must be enclosed in matching single or double quotes.
- Comment on what the above query returns.
- If you examine the data carefully:
- either find() means if one element of an array is found to be true, then all the elements are returned, even if they do not satisfy the query criteria
- or including $elemMatch means only the first element matching the query criteria is returned.
- Is this good practice?
Find departments with no managers:
db.deptCollection.find({ "employees.job" : { $ne: "MANAGER"}}).pretty()
Aggregation Pipeline
So far find() either returns all the elements of an array, if one element matches the search criteria, or $elematch returns the first one found only. The latter is fine if there is only one to be found, but not so good if several items in the array should match the search criteria.
The aggregation pipeline is a framework for data aggregation modelled on the concept of data processing pipelines. What this means, is documents enter a multi-stage pipeline that transforms the documents into aggregated results.
This is similar to using GROUP BY in SQL, where you might aggregate the average grades of all students taking a module.
The MongoDB aggregation pipeline consists of stages and each stage transforms the documents as they pass through the pipeline.
The pipeline for instance, could:
- project out certain details from each document, such as the employees;
- group the projected details by a certain fields and then using an aggregate function, such as group by the deptno and then counting the number of occurrences;
- sorting the results in order;
- limiting the results to a certain number, such as the first 10;
These are represented by the following operators: $project,$group, $sort or $limit.
A number of operations exist for the aggregation pipeline, details of which can be found in the MongoDB manual:
https://docs.mongodb.com/manual/reference/operator/aggregation/
$filter
We can use the pipeline to gather elements of our employees array to get the employees matching the query criteria only, rather than one, or everyone.
In the aggregation pipeline, array has various operators and the one we are interested in is $filter
This returns a subset of the array with only the elements that match the filter condition
$filter has the following syntax:
{ $filter: { input: <array>, /* expression for the array */ as: <string>, /* variable name for the element */ cond: <expression> /* filter condition */ } }
$filter is one of the stages of the pipeline and can not be used by itself. It is used with an aggregation framework operator, such as $project.
The syntax is:
db.collectionName.aggregate( [ { <stage> }, ... ] )
This is a simple example, with only one stage:
db.deptCollection.aggregate([ { $project: { empSet: { $filter: { input: "$employees", as: "employee", cond: { $gte: [ "$$employee.sal", 2000 ] } } } } } ]).pretty()
Now the system should only retrieve the employees with salary > 2000.
Next Step
Updating the collection