.. _aggregation-pipeline:

********************
Aggregation Pipeline
********************

.. default-domain:: mongodb

.. contents:: On this page
   :local:
   :backlinks: none
   :depth: 2
   :class: singlecol

Mongoid exposes MongoDB's aggregation pipeline, which is used to construct
flows of operations that process and return results. The aggregation pipeline
is a superset of the deprecated map/reduce framework functionality.


Basic Usage
===========

.. _aggregation-pipeline-example-multiple-collections:

Querying Across Multiple Collections
````````````````````````````````````

The aggregation pipeline may be used for queries involving multiple
referenced associations at the same time:

.. code-block:: ruby

  class Band
    include Mongoid::Document
    has_many :tours
    has_many :awards
    field :name, type: String
  end

  class Tour
    include Mongoid::Document
    belongs_to :band
    field :year, type: Integer
  end

  class Award
    include Mongoid::Document
    belongs_to :band
    field :name, type: String
  end

To retrieve bands that have toured since 2000 and have at least one award,
one could do the following:

.. code-block:: ruby

  band_ids = Band.collection.aggregate([
    # Join each band with its tours and awards.
    { '$lookup' => {
      from: 'tours',
      localField: '_id',
      foreignField: 'band_id',
      as: 'tours',
    } },
    { '$lookup' => {
      from: 'awards',
      localField: '_id',
      foreignField: 'band_id',
      as: 'awards',
    } },
    # Keep bands with a tour in 2000 or later and at least one award.
    { '$match' => {
      'tours.year' => {'$gte' => 2000},
      'awards._id' => {'$exists' => true},
    } },
    {'$project' => {_id: 1}},
  ]).map { |doc| doc['_id'] } # extract the ids from the raw result documents

  bands = Band.find(band_ids)

Note that the aggregation pipeline, since it is implemented by the Ruby driver
for MongoDB and not Mongoid, returns raw ``BSON::Document`` objects rather than
``Mongoid::Document`` model instances. The above example projects only
the ``_id`` field, which is then used to load full models. An alternative is
to skip the projection and work with the raw fields, which avoids sending
the list of document ids (which could be large) to Mongoid in a second query.


.. _aggregation-pipeline-builder-dsl:

Builder DSL
===========

Mongoid provides limited support for constructing the aggregation pipeline
itself using a high-level DSL. The following aggregation pipeline operators
are supported:

- ``$group``
- ``$project``
- ``$unwind``

To construct a pipeline, call the corresponding aggregation pipeline methods
on a ``Criteria`` instance. Each call appends a stage to the ``pipeline``
attribute of the ``Criteria`` instance. To execute the pipeline, pass the
value of the ``pipeline`` attribute to the ``Collection#aggregate`` method.

For example, given the following models:

.. code-block:: ruby

  class Tour
    include Mongoid::Document

    embeds_many :participants

    field :name, type: String
    field :states, type: Array
  end

  class Participant
    include Mongoid::Document

    embedded_in :tour

    field :name, type: String
  end

We can find out which states a participant visited:

.. code-block:: ruby

  criteria = Tour.where('participants.name' => 'Serenity').
    unwind(:states).
    group(_id: 'states', :states.add_to_set => '$states').
    project(_id: 0, states: 1)

  pp criteria.pipeline
  # => [{"$match"=>{"participants.name"=>"Serenity"}},
  #     {"$unwind"=>"$states"},
  #     {"$group"=>{"_id"=>"states", "states"=>{"$addToSet"=>"$states"}}},
  #     {"$project"=>{"_id"=>0, "states"=>1}}]

  Tour.collection.aggregate(criteria.pipeline).to_a

group
`````

The ``group`` method adds a ``$group`` aggregation pipeline stage.

The field expressions support Mongoid symbol-operator syntax:

.. code-block:: ruby

  criteria = Tour.all.group(_id: 'states', :states.add_to_set => '$states')
  criteria.pipeline
  # => [{"$group"=>{"_id"=>"states", "states"=>{"$addToSet"=>"$states"}}}]

Alternatively, standard MongoDB aggregation pipeline syntax may be used:

.. code-block:: ruby

  criteria = Tour.all.group(_id: 'states', states: {'$addToSet' => '$states'})

project
```````

The ``project`` method adds a ``$project`` aggregation pipeline stage.

The argument should be a Hash specifying the projection:

.. code-block:: ruby

  criteria = Tour.all.project(_id: 0, states: 1)
  criteria.pipeline
  # => [{"$project"=>{"_id"=>0, "states"=>1}}]

.. _unwind-dsl:

unwind
``````

The ``unwind`` method adds an ``$unwind`` aggregation pipeline stage.

The argument can be a field name, given as a symbol or a string, or
a Hash or a ``BSON::Document`` instance:

.. code-block:: ruby

  criteria = Tour.all.unwind(:states)
  criteria = Tour.all.unwind('states')
  criteria.pipeline
  # => [{"$unwind"=>"$states"}]

  criteria = Tour.all.unwind(path: '$states')
  criteria.pipeline
  # => [{"$unwind"=>{:path=>"$states"}}]
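
The mapping from the ``unwind`` argument forms to the resulting stage can be
sketched in plain Ruby, without Mongoid or a database connection. The helper
below, ``unwind_stage``, is hypothetical and is not part of Mongoid; it only
mirrors the outputs shown above. Leaving an already ``$``-prefixed field name
unchanged is an assumption of this sketch rather than documented behavior.

```ruby
# Hypothetical helper (not Mongoid API): builds a $unwind stage the way the
# examples above suggest the DSL does.
def unwind_stage(field_or_doc)
  case field_or_doc
  when Hash
    # A Hash (or Hash subclass) is used as the stage document unchanged,
    # so options can sit alongside :path.
    { '$unwind' => field_or_doc }
  else
    # A symbol or string is treated as a field name and gets a "$" prefix.
    # Assumption: a name that already starts with "$" is left as-is.
    name = field_or_doc.to_s
    name = "$#{name}" unless name.start_with?('$')
    { '$unwind' => name }
  end
end

# Symbol and string field names produce the same stage.
by_symbol = unwind_stage(:states)
by_string = unwind_stage('states')

# The Hash form passes through as the stage document.
by_hash = unwind_stage({ path: '$states' })

pipeline = [by_symbol, by_hash]
```

The Hash form is the one to reach for when the stage needs options beyond
the path, since the whole document is handed to the server as written.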