Difference between revisions of "MongoDB QueryData"

From mi-linux
Jump to navigationJump to search
 
(43 intermediate revisions by the same user not shown)
Line 7: Line 7:
 
The format is:
 
The format is:
  
<pre style="color: blue">
+
<pre style="color: purple">
 
  db.collectionName.find(optional_query_criteria)
 
  db.collectionName.find(optional_query_criteria)
 
</pre>
 
</pre>
Line 13: Line 13:
 
Where the query_criteria follows a pattern:
 
Where the query_criteria follows a pattern:
  
<pre style="color: blue">
+
<pre style="color: purple">
  db.collectionName.find({fieldName: "value"})
+
  db.collectionName.find({keyField: "value"})
 
</pre>
 
</pre>
  
Line 20: Line 20:
 
* the criteria is enclosed in curly brackets: {}
 
* the criteria is enclosed in curly brackets: {}
 
* the value needs quotes if it is a string or date value
 
* the value needs quotes if it is a string or date value
 +
* all names and values are case sensitive
 
* quotes are optional for the fieldName, so long as they do not contain spaces
 
* quotes are optional for the fieldName, so long as they do not contain spaces
* if the fieldName refers to a nested document, the name must be in matching single or double quotes
 
  
  
 
=== Find all documents ===
 
=== Find all documents ===
  
For example, show all the data so far in the deptCollection:
+
For example, show all the data so far in the ''dept'' collection:
  db.deptCollection.find()
+
  db.dept.find()
  
The data comes back messy. The pretty() function can be used to improve the layout::
+
The results should look like:
  
db.deptCollection.find().pretty()
+
<pre style="color: blue">
 +
{ "_id" : 10, "deptno" : 10, "dname" : "ACCOUNTING", "loc" : "NEW YORK" }
 +
{ "_id" : 20, "deptno" : 20, "dname" : "RESEARCH", "loc" : "DALLAS" }
 +
{ "_id" : 40, "deptno" : 40, "dname" : "OPERATIONS", "loc" : "BOSTON" }
 +
{ "_id" : 50, "deptno" : 40, "dname" : "OPERATIONS V2", "loc" : "BOSTON" }
 +
{ "_id" : 30, "deptno" : 30, "dname" : "SALES", "loc" : "CHICAGO" }
  
 +
</pre>
  
=== Find One document ===
+
To show the documents in the ''emp'' collection:
  
To find just one document - department 10:
+
db.emp.find()
  
db.deptCollection.find({deptno:10}).pretty()
 
  
Finding an employee means using the array name too:
+
The data comes back messy. The pretty() function can be used to improve the layout:
db.deptCollection.find({"employees.empno":7902}).pretty()
 
  
However, this does mean you get back all the employees in the department they were found in!
+
db.emp.find().pretty()
  
Since Version 2.2 MongoDB's new '''$elemMatch''' can be used with arrays to return only the '''first''' element matching the $elemMatch condition:
+
A subset of the ''emp'' collection is shown below:
  
db.deptCollection.find({deptno:20}
+
<pre style="color: blue">
  { _id: 0, employees: {$elemMatch: {empno: 7902}}}).pretty()
+
> db.emp.find().pretty()
 +
{
 +
        "_id" : ObjectId("5a09e79ac536e890d5a7a666"),
 +
        "empno" : 7782,
 +
        "ename" : "CLARK",
 +
        "job" : "MANAGER",
 +
        "mgr" : 7839,
 +
        "hiredate" : ISODate("1989-06-09T00:00:00Z"),
 +
        "sal" : 2450,
 +
        "deptno" : 10
 +
}
 +
{
 +
        "_id" : ObjectId("5a09e79ac536e890d5a7a667"),
 +
        "empno" : 7839,
 +
        "ename" : "KING",
 +
        "job" : "PRESIDENT",
 +
        "hiredate" : ISODate("1980-11-17T00:00:00Z"),
 +
        "sal" : 5000,
 +
        "deptno" : 10
 +
}
 +
....
  
$elemMatch limits the contents of the employees array to contain only the first element matching the $elemMatch condition.
 
 
This is akin to a SQL query:
 
 
<pre style="color: blue">
 
SELECT * FROM Emp WHERE deptno=20 AND empno = 7902
 
 
</pre>
 
</pre>
  
_id is a unique value automatically generated by MongoDB (like a Primary Key, except it is unique for the whole database).  
+
Note the object ids are now system generated (and will be different values in your own data).
  
Using <b>_id:0</b> suppresses the ''ObjectId'' value, however to see it:
+
=== Find with query criteria ===
  
db.deptCollection.find({deptno:20}, 
 
  { employees: {$elemMatch: {empno: 7902}}}).pretty()
 
  
More about _ids in the next section.
+
If working with a large collection, you will not want all the documents returned.
  
== Find with Query Criteria ==
+
Find all the employees are are clerks:
  
The query criteria can be as complex as that found in SQL.
+
  db.emp.find({job:"CLERK"})
  
To find all employees earning more than 2000 in department 10:
+
For numerical data, the greater than (>) and less than (<) operators are represented by ''$gt'' and ''$lt'' respectively. Note, for these operators, the search criteria must be enclosed in {} brackets.
  
  db.deptCollection.find({deptno:10}, 
+
Find all employees who earn more than 2400:
  { employees: {$elemMatch: {sal: { $gt: 2000}}}}).pretty()
+
  db.emp.find({sal: {$gt:2400}})
  
 +
Find all employees whose commission is less than 1000:
 +
db.emp.find({comm: {$lt:1000}})
  
Same again for department 20 and the managers:
+
Working with the date field (hiredate) is more complex, since you have to create a new date for the comparison.  
db.deptCollection.find({deptno:20}, { employees: {$elemMatch: {sal: { $gt: 2000}, job: "MANAGER"}}}).pretty()
 
  
 +
For example, find all employees who start after the 1st January 2000:
 +
db.emp.find({hiredate: {$lt: new Date("2000-01-01")}})
  
'''employees''' is an array, so $elemMatch only returns the first matching value. What if we try this instead:
+
Find employees who started on the 16th October 2015:
 +
db.emp.find({hiredate: new Date("2015-10-16")})
  
db.deptCollection.find({ "employees.sal" : { $gt: 2000}}).pretty()
+
=== Find One document ===
  
Things to note:
+
To find just one document requires the use of the equivalent of a primary key field. This can be a field that the user takes responsibility to keep unique, such as the ''deptno'':
* This time employees.sal must be enclosed in matching single or double quotes.
 
* Comment on what the above query returns.
 
* If you examine the data carefully, if an element of an array is found to be true, then all the elements are returned, or one only. Is this good practice?
 
  
 +
db.dept.find({deptno:10})
  
Find departments with no managers:
 
 
db.deptCollection.find({ "employees.job" : { $ne: "MANAGER"}}).pretty()
 
  
== Aggregation Pipeline ==
+
Or the object id can be used, which will be unique:
  
So far '''find()''' either returns all the elements of an array, if one element matches the search criteria, or '''$elematch''' returns the first one found only. The latter is fine if there is only one to be found, but not so good if several items in the array should match the search criteria.
+
db.dept.find({_id:10})
  
The aggregation pipeline is a framework for data aggregation modelled on the concept of data processing pipelines. What this means, is documents enter a multi-stage pipeline that transforms the documents into aggregated results.
 
  
This is similar to using GROUP BY in SQL, where you might aggregate the average grades of all students taking a module.
+
In the ''emp'' collection, the _ids are system generated and generally along the lines of: '5a0727e99ba81dee9b1cc6a3', so less easy to use!
  
The MongoDB aggregation pipeline consists of stages and each stage transforms the documents as they pass through the pipeline.
+
List all the records in emp:
  
We can use this to ''gather'' elements of our employees array to get the employees matching the query criteria only, rather than one, or everyone.
+
db.emp.find().pretty()
  
=== $filter ===
+
and pick an _id from the collection and then try and find one record.  
 
 
A number of operations exist for the aggregation pipeline, details of which can be found in the MongoDB manual:
 
 
 
https://docs.mongodb.com/manual/reference/operator/aggregation/
 
 
 
array has various operators and the one we are interested in is '''$filter'''
 
 
 
This returns a subset of the array with only the elements that match the filter condition
 
 
 
$filter has the following syntax:
 
 
 
<pre style="color:blue">
 
{ $filter: {
 
input: <array>,    /* expression for the array */
 
as: <string>,   /* variable name for the element */ 
 
cond: <expression> /* filter condition */
 
} }
 
</pre>
 
 
 
'''$filter''' is one of the stages of the pipeline and is used with the db.collection.aggregrate() function.
 
 
 
The syntax is:
 
 
 
<pre style="color:blue">
 
db.collectionName.aggregate( [ { <stage> }, ... ] )
 
</pre>
 
  
 +
For example (note, your object id will be different):
  
This is a simple example, with only one stage:
+
db.emp.find( {_id : ObjectId("5a0727e99ba81dee9b1cc6a3")}).pretty()
  
db.deptCollection.aggregate([ {
 
      $project: {
 
        empSet: {
 
            $filter: {
 
              input: "$employees",
 
              as: "employee",
 
              cond: { $gte: [ "$$employee.sal", 2000 ] }
 
            }
 
        }
 
      }
 
  }
 
]).pretty()
 
  
Now the system should '''only''' retrieve the employees with salary > 2000.
+
The function ''ObjectId()'' must be used to convert the value into an object id.
  
 
== Next Step ==
 
== Next Step ==
  
[[MongoDB_Update|Updating]] the collection
+
[[MongoDB_Aggregate_Pipeline|MongoDB Aggregation]] Pipeline

Latest revision as of 14:13, 5 March 2018

Main Page >> MongoDB >>MongoDB Workbook >> Querying Collections

Querying a collection

The find() function can be used to query the documents.

The format is:

 db.collectionName.find(optional_query_criteria)

Where the query_criteria follows a pattern:

 db.collectionName.find({keyField: "value"})

Note:

  • the criteria is enclosed in curly brackets: {}
  • the value needs quotes if it is a string or date value
  • all names and values are case sensitive
  • quotes are optional for the fieldName, so long as they do not contain spaces


Find all documents

For example, show all the data so far in the dept collection:

db.dept.find()

The results should look like:

{ "_id" : 10, "deptno" : 10, "dname" : "ACCOUNTING", "loc" : "NEW YORK" }
{ "_id" : 20, "deptno" : 20, "dname" : "RESEARCH", "loc" : "DALLAS" }
{ "_id" : 40, "deptno" : 40, "dname" : "OPERATIONS", "loc" : "BOSTON" }
{ "_id" : 50, "deptno" : 40, "dname" : "OPERATIONS V2", "loc" : "BOSTON" }
{ "_id" : 30, "deptno" : 30, "dname" : "SALES", "loc" : "CHICAGO" }

To show the documents in the emp collection:

db.emp.find()


The data comes back messy. The pretty() function can be used to improve the layout:

db.emp.find().pretty()

A subset of the emp collection is shown below:

> db.emp.find().pretty()
{
        "_id" : ObjectId("5a09e79ac536e890d5a7a666"),
        "empno" : 7782,
        "ename" : "CLARK",
        "job" : "MANAGER",
        "mgr" : 7839,
        "hiredate" : ISODate("1989-06-09T00:00:00Z"),
        "sal" : 2450,
        "deptno" : 10
}
{
        "_id" : ObjectId("5a09e79ac536e890d5a7a667"),
        "empno" : 7839,
        "ename" : "KING",
        "job" : "PRESIDENT",
        "hiredate" : ISODate("1980-11-17T00:00:00Z"),
        "sal" : 5000,
        "deptno" : 10
}
....

Note the object ids are now system generated (and will be different values in your own data).

Find with query criteria

If working with a large collection, you will not want all the documents returned.

Find all the employees are are clerks:

 db.emp.find({job:"CLERK"})

For numerical data, the greater than (>) and less than (<) operators are represented by $gt and $lt respectively. Note, for these operators, the search criteria must be enclosed in {} brackets.

Find all employees who earn more than 2400:

db.emp.find({sal: {$gt:2400}})

Find all employees whose commission is less than 1000:

db.emp.find({comm: {$lt:1000}})

Working with the date field (hiredate) is more complex, since you have to create a new date for the comparison.

For example, find all employees who start after the 1st January 2000:

db.emp.find({hiredate: {$lt: new Date("2000-01-01")}})

Find employees who started on the 16th October 2015:

db.emp.find({hiredate: new Date("2015-10-16")})

Find One document

To find just one document requires the use of the equivalent of a primary key field. This can be a field that the user takes responsibility to keep unique, such as the deptno:

db.dept.find({deptno:10})


Or the object id can be used, which will be unique:

db.dept.find({_id:10})


In the emp collection, the _ids are system generated and generally along the lines of: '5a0727e99ba81dee9b1cc6a3', so less easy to use!

List all the records in emp:

db.emp.find().pretty()

and pick an _id from the collection and then try and find one record.

For example (note, your object id will be different):

db.emp.find( {_id : ObjectId("5a0727e99ba81dee9b1cc6a3")}).pretty()


The function ObjectId() must be used to convert the value into an object id.

Next Step

MongoDB Aggregation Pipeline