.. _map-reduce: ********** Map/Reduce ********** .. default-domain:: mongodb .. contents:: On this page :local: :backlinks: none :depth: 2 :class: singlecol Mongoid provides a DSL around MongoDB's map/reduce framework, for performing custom map/reduce jobs or simple aggregations. .. note:: The map-reduce operation is deprecated. The :ref:`aggregation framework ` provides better performance and usability than map-reduce operations, and should be preferred for new development. Execution --------- You can tell Mongoid off the class or a criteria to perform a map/reduce by calling ``map_reduce`` and providing map and reduce javascript functions. .. code-block:: ruby map = %Q{ function() { emit(this.name, { likes: this.likes }); } } reduce = %Q{ function(key, values) { var result = { likes: 0 }; values.forEach(function(value) { result.likes += value.likes; }); return result; } } Band.where(:likes.gt => 100).map_reduce(map, reduce).out(inline: 1) Just like criteria, map/reduce calls are lazily evaluated. So nothing will hit the database until you iterate over the results, or make a call on the wrapper that would need to force a database hit. .. code-block:: ruby Band.map_reduce(map, reduce).out(replace: "mr-results").each do |document| p document # { "_id" => "Tool", "value" => { "likes" => 200 }} end The only required thing you provide along with a map/reduce is where to output the results. If you do not provide this an error will be raised. Valid options to ``#out`` are: - ``inline: 1``: Don't store the output in a collection. - ``replace: "name"``: Store in a collection with the provided name, and overwrite any documents that exist in it. - ``merge: "name"``: Store in a collection with the provided name, and merge the results with the existing documents. - ``reduce: "name"``: Store in a collection with the provided name, and reduce all existing results in that collection. Raw Results ----------- Results of Map/Reduce execution can be retrieved via the ``execute`` method or its aliases ``raw`` and ``results``: .. code-block:: ruby mr = Band.where(:likes.gt => 100).map_reduce(map, reduce).out(inline: 1) mr.execute # => {"results"=>[{"_id"=>"Tool", "value"=>{"likes"=>200.0}}], "timeMillis"=>14, "counts"=>{"input"=>4, "emit"=>4, "reduce"=>1, "output"=>1}, "ok"=>1.0, "$clusterTime"=>{"clusterTime"=>#, "signature"=>{"hash"=>, "keyId"=>0}}, "operationTime"=>#} Statistics ---------- MongoDB servers 4.2 and lower provide Map/Reduce execution statistics. As of MongoDB 4.4, Map/Reduce is implemented via the aggregation pipeline and statistics described in this section are not available. The following methods are provided on the ``MapReduce`` object: - ``counts``: Number of documents read, emitted, reduced and output through the pipeline. - ``input``, ``emitted``, ``reduced``, ``output``: individual count methods. Note that ``emitted`` and ``reduced`` methods are named differently from hash keys in ``counts``. - ``time``: The time, in milliseconds, that Map/Reduce pipeline took to execute. The following code illustrates retrieving the statistics: .. code-block:: ruby mr = Band.where(:likes.gt => 100).map_reduce(map, reduce).out(inline: 1) mr.counts # => {"input"=>4, "emit"=>4, "reduce"=>1, "output"=>1} mr.input # => 4 mr.emitted # => 4 mr.reduced # => 1 mr.output # => 1 mr.time # => 14 .. note:: Each statistics method invocation re-executes the Map/Reduce pipeline. The results of execution are not stored by Mongoid. Consider using the ``execute`` method to retrieve the raw results and obtaining the statistics from the raw results if multiple statistics are desired.