.. _aggregation-pipeline:
********************
Aggregation Pipeline
********************
.. default-domain:: mongodb
.. contents:: On this page
   :local:
   :backlinks: none
   :depth: 2
   :class: singlecol
Mongoid exposes `MongoDB's aggregation pipeline
`_,
which is used to construct flows of operations that process and return results.
The aggregation pipeline is a superset of the deprecated
:ref:`map/reduce framework ` functionality.
Basic Usage
===========
.. _aggregation-pipeline-example-multiple-collections:
Querying Across Multiple Collections
````````````````````````````````````
The aggregation pipeline may be used for queries involving multiple
referenced associations at the same time:
.. code-block:: ruby

   class Band
     include Mongoid::Document
     has_many :tours
     has_many :awards
     field :name, type: String
   end

   class Tour
     include Mongoid::Document
     belongs_to :band
     field :year, type: Integer
   end

   class Award
     include Mongoid::Document
     belongs_to :band
     field :name, type: String
   end

To retrieve bands that toured since 2000 and have at least one award, one
could do the following:
.. code-block:: ruby

   band_ids = Band.collection.aggregate([
     { '$lookup' => {
       from: 'tours',
       localField: '_id',
       foreignField: 'band_id',
       as: 'tours',
     } },
     { '$lookup' => {
       from: 'awards',
       localField: '_id',
       foreignField: 'band_id',
       as: 'awards',
     } },
     { '$match' => {
       'tours.year' => {'$gte' => 2000},
       'awards._id' => {'$exists' => true},
     } },
     {'$project' => {_id: 1}},
   ])
   bands = Band.find(band_ids)

Because the aggregation pipeline is implemented by the MongoDB Ruby driver
rather than by Mongoid, it returns raw ``BSON::Document`` objects instead of
``Mongoid::Document`` model instances. The above example projects only
the ``_id`` field, which is then used to load full models. An alternative is
to skip the projection and work with the raw fields, which avoids passing
a potentially large list of document ids to Mongoid in a second query.
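As a minimal sketch of the two approaches described above, the following
plain-Ruby snippet uses ordinary Hashes as stand-ins for the
``BSON::Document`` objects returned by the driver. The sample data and the
variable names are hypothetical, and no database is involved:

```ruby
# Stand-ins for raw BSON::Document results. In a real application these
# would come from Band.collection.aggregate(...). The data is made up.
raw_results = [
  { '_id' => 1, 'name' => 'Depeche Mode', 'tours' => [{ 'year' => 2001 }] },
  { '_id' => 2, 'name' => 'Tool', 'tours' => [{ 'year' => 2019 }] },
]

# Approach 1: keep only the ids, to be used in a second query for models.
band_ids = raw_results.map { |doc| doc['_id'] }

# Approach 2: skip the second query and read the raw fields directly.
band_names = raw_results.map { |doc| doc['name'] }

p band_ids
p band_names
```

The second approach trades model behavior (validations, associations,
custom methods) for one fewer round trip to the server.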
.. _aggregation-pipeline-builder-dsl:
Builder DSL
===========
Mongoid provides limited support for constructing the aggregation pipeline
itself using a high-level DSL. The following aggregation pipeline operators
are supported:
- `$group `_
- `$project `_
- `$unwind `_
To construct a pipeline, call the corresponding aggregation pipeline methods
on a ``Criteria`` instance. Aggregation pipeline operations are added to the
``pipeline`` attribute of the ``Criteria`` instance. To execute the pipeline,
pass the ``pipeline`` attribute value to the ``Collection#aggregate`` method.
For example, given the following models:
.. code-block:: ruby

   class Tour
     include Mongoid::Document
     embeds_many :participants
     field :name, type: String
     field :states, type: Array
   end

   class Participant
     include Mongoid::Document
     embedded_in :tour
     field :name, type: String
   end

We can find out which states a participant visited:
.. code-block:: ruby

   criteria = Tour.where('participants.name' => 'Serenity').
     unwind(:states).
     group(_id: 'states', :states.add_to_set => '$states').
     project(_id: 0, states: 1)

   pp criteria.pipeline
   # => [{"$match"=>{"participants.name"=>"Serenity"}},
   #     {"$unwind"=>"$states"},
   #     {"$group"=>{"_id"=>"states", "states"=>{"$addToSet"=>"$states"}}},
   #     {"$project"=>{"_id"=>0, "states"=>1}}]

   Tour.collection.aggregate(criteria.pipeline).to_a

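To make the effect of each stage in the pipeline above more concrete, here is
a plain-Ruby sketch that mimics the four stages on an in-memory array. The
sample documents are hypothetical, and the code only approximates what the
server does; it does not call MongoDB or Mongoid:

```ruby
require 'set'

# Hypothetical sample documents standing in for the tours collection.
tours = [
  { 'name' => 'Spring', 'states' => ['CA', 'NY'],
    'participants' => [{ 'name' => 'Serenity' }] },
  { 'name' => 'Fall', 'states' => ['NY', 'TX'],
    'participants' => [{ 'name' => 'Serenity' }, { 'name' => 'Avery' }] },
  { 'name' => 'Winter', 'states' => ['WA'],
    'participants' => [{ 'name' => 'Avery' }] },
]

# $match: keep tours where any embedded participant is named Serenity.
matched = tours.select do |tour|
  tour['participants'].any? { |person| person['name'] == 'Serenity' }
end

# $unwind: emit one document per element of the states array.
unwound = matched.flat_map do |tour|
  tour['states'].map { |state| tour.merge('states' => state) }
end

# $group with a constant _id: a single group collecting distinct states.
grouped = [{
  '_id' => 'states',
  'states' => unwound.map { |doc| doc['states'] }.to_set,
}]

# $project: drop _id and keep only the states field.
projected = grouped.map { |doc| { 'states' => doc['states'].to_a } }

p projected
```

Because ``_id`` is the constant string ``'states'`` rather than a field
reference such as ``'$states'``, all documents fall into one group. Note that
MongoDB does not guarantee the order of elements produced by ``$addToSet``.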
group
`````
The ``group`` method adds a `$group aggregation pipeline stage
`_.
The field expressions support Mongoid symbol-operator syntax:
.. code-block:: ruby

   criteria = Tour.all.group(_id: 'states', :states.add_to_set => '$states')
   criteria.pipeline
   # => [{"$group"=>{"_id"=>"states", "states"=>{"$addToSet"=>"$states"}}}]

Alternatively, standard MongoDB aggregation pipeline syntax may be used:
.. code-block:: ruby

   criteria = Tour.all.group(_id: 'states', states: {'$addToSet' => '$states'})

project
```````
The ``project`` method adds a `$project aggregation pipeline stage
`_.
The argument should be a Hash specifying the projection:
.. code-block:: ruby

   criteria = Tour.all.project(_id: 0, states: 1)
   criteria.pipeline
   # => [{"$project"=>{"_id"=>0, "states"=>1}}]

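As a rough illustration of what an inclusion-style projection does, the
following plain-Ruby sketch applies a projection Hash to in-memory documents.
The ``project`` helper and the sample data are hypothetical, and the helper
handles only the simple include/exclude case shown here, not computed fields
or exclusion-only projections:

```ruby
# Simplified stand-in for an inclusion-style $project stage.
def project(docs, spec)
  spec = spec.transform_keys(&:to_s)
  fields = spec.select { |_, flag| flag == 1 }.keys
  # _id is included by default unless it is explicitly set to 0.
  fields.unshift('_id') unless spec['_id'] == 0 || fields.include?('_id')
  docs.map { |doc| doc.slice(*fields) }
end

docs = [{ '_id' => 1, 'name' => 'Spring', 'states' => ['CA'] }]

without_id = project(docs, _id: 0, states: 1)
with_id = project(docs, states: 1)

p without_id
p with_id
```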
.. _unwind-dsl:
unwind
``````
The ``unwind`` method adds an `$unwind aggregation pipeline stage
`_.
The argument can be a field name, given as a symbol or a string, or
a Hash or ``BSON::Document`` instance:
.. code-block:: ruby

   criteria = Tour.all.unwind(:states)
   criteria = Tour.all.unwind('states')
   criteria.pipeline
   # => [{"$unwind"=>"$states"}]

   criteria = Tour.all.unwind(path: '$states')
   criteria.pipeline
   # => [{"$unwind"=>{:path=>"$states"}}]

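The Hash form matters because MongoDB's ``$unwind`` stage accepts options in
addition to ``path``, such as ``preserveNullAndEmptyArrays``. Assuming the Hash
is passed through to the stage unchanged, as the output above suggests, such
options could be supplied alongside ``path``. The plain-Ruby sketch below
approximates how that option changes the result; the ``unwind`` helper and the
sample data are hypothetical and do not call MongoDB:

```ruby
# Simplified stand-in for $unwind; not the server implementation.
def unwind(docs, path:, preserve_null_and_empty_arrays: false)
  field = path.delete_prefix('$')
  docs.flat_map do |doc|
    values = doc[field]
    if values.nil?
      # Missing or null field: dropped unless preservation is requested.
      preserve_null_and_empty_arrays ? [doc] : []
    elsif values == []
      # Empty array: dropped, or kept without the field when preserving.
      preserve_null_and_empty_arrays ? [doc.reject { |key, _| key == field }] : []
    else
      # A non-array value is treated as a single-element array.
      values = [values] unless values.is_a?(Array)
      values.map { |value| doc.merge(field => value) }
    end
  end
end

tours = [
  { 'name' => 'Spring', 'states' => ['CA', 'NY'] },
  { 'name' => 'Fall', 'states' => [] },
]

dropped = unwind(tours, path: '$states')
kept = unwind(tours, path: '$states', preserve_null_and_empty_arrays: true)

p dropped
p kept
```

Without the option, the tour with an empty ``states`` array disappears from the
output; with it, the tour is kept as a single document without a ``states``
field.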