paolo@bimodesign.com | +34 608 61 64 10


MongoDB - Exceed memory

One exercise in the MongoDB developer exam was to use the aggregation framework to find the pairs of people who communicate the most, in a dataset containing a very large number of emails. The problem was that when I ran this aggregation

db.messages.aggregate([
	{
		$unwind: "$headers.To"
	},
	{
		$project: {
			"_id": 1,
			"headers.From": 1,
			"headers.To": 1
		}
	},
	{
		$group: {
			_id: { id: "$_id", from: "$headers.From", to: "$headers.To" },
			count: { $sum: 1 }
		}
	},
	{
		$group: {
			_id: { from: "$_id.from", to: "$_id.to" },
			count: { $sum: 1 }
		}
	},
	{
		$sort: {
			count: -1
		}
	},
	{
		$limit: 5
	}
])

using $unwind (which deconstructs an array field in the input documents and outputs one document per element), $group and $sort, the result was this error message

exception: Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in.
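To make it easier to see what the pipeline computes, and why the first $group is the hungry one, here is a minimal sketch of the same steps in plain JavaScript, outside MongoDB. The sample documents and the topPairs function are hypothetical, made up for illustration, and the sketch assumes headers.To is always an array. The point to notice is that the first grouping keeps one entry per distinct (message, sender, recipient) triple, so on a big collection it grows almost as large as the unwound data.

```javascript
// Hypothetical sample documents shaped like the messages collection.
const messages = [
  { _id: 1, headers: { From: "a@x.com", To: ["b@x.com", "b@x.com", "c@x.com"] } },
  { _id: 2, headers: { From: "a@x.com", To: ["b@x.com"] } },
  { _id: 3, headers: { From: "c@x.com", To: ["a@x.com"] } },
];

// In-memory imitation of the pipeline (illustration only, not MongoDB code).
function topPairs(docs, limit) {
  // $unwind: one record per element of headers.To.
  const unwound = docs.flatMap((d) =>
    d.headers.To.map((to) => ({ id: d._id, from: d.headers.From, to }))
  );

  // First $group: one entry per distinct (id, from, to), so a recipient
  // listed twice in the same message is only counted once.
  const perMessage = new Set(
    unwound.map((r) => JSON.stringify([r.id, r.from, r.to]))
  );

  // Second $group: count the messages for each (from, to) pair.
  const counts = new Map();
  for (const key of perMessage) {
    const [, from, to] = JSON.parse(key);
    const pair = JSON.stringify([from, to]);
    counts.set(pair, (counts.get(pair) || 0) + 1);
  }

  // $sort by count descending, then $limit.
  return [...counts.entries()]
    .map(([pair, count]) => {
      const [from, to] = JSON.parse(pair);
      return { from, to, count };
    })
    .sort((x, y) => y.count - x.count)
    .slice(0, limit);
}

const result = topPairs(messages, 5);
console.log(result);
```

In this toy data the duplicated recipient in message 1 is counted once, so the pair a@x.com to b@x.com ends up with a count of 2.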

To solve it, I had to add the allowDiskUse option, which lets stages that exceed the 100 MB per-stage memory limit write temporary files to disk, and run the aggregation through runCommand, like this

db.runCommand(
   { aggregate: "messages",
     pipeline: [
		{
			$unwind: "$headers.To"
		},
		{
			$project: {
				"_id": 1,
				"headers.From": 1,
				"headers.To": 1
			}
		},
		{
			$group: {
				_id: { id: "$_id", from: "$headers.From", to: "$headers.To" },
				count: { $sum: 1 }
			}
		},
		{
			$group: {
				_id: { from: "$_id.from", to: "$_id.to" },
				count: { $sum: 1 }
			}
		},
		{
			$sort: {
				count: -1
			}
		},
		{
			$limit: 5
		}
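	],
     cursor: {},          // required by the aggregate command on MongoDB 3.6 and later
     allowDiskUse: true   // lets $group and $sort spill to temporary files on disk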
   }
)
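The same option can also be passed straight to the shell helper, without going through runCommand. This is a sketch, assuming a shell whose db.collection.aggregate() accepts an options document as its second argument (2.6 or later, as far as I know). The variable names pipeline and options are mine, and the typeof guard is only there so the snippet does nothing when it is loaded outside the shell.

```javascript
// Same stages as above, kept in a variable so they can be reused.
var pipeline = [
  { $unwind: "$headers.To" },
  { $project: { "headers.From": 1, "headers.To": 1 } },
  {
    $group: {
      _id: { id: "$_id", from: "$headers.From", to: "$headers.To" },
      count: { $sum: 1 }
    }
  },
  {
    $group: {
      _id: { from: "$_id.from", to: "$_id.to" },
      count: { $sum: 1 }
    }
  },
  { $sort: { count: -1 } },
  { $limit: 5 }
];

// Options document: allow stages to write temporary files to disk.
var options = { allowDiskUse: true };

// Only run the query when a shell connection (db) is available.
if (typeof db !== "undefined") {
  db.messages.aggregate(pipeline, options).forEach(printjson);
}
```

The helper returns a cursor, so the results are printed by iterating over it rather than by reading a result field from the command reply.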