MongoDB $inc a NaN value - javascript

I have to deal with inconsistent documents in a MongoDB collection where some field might be numeric or might hold a NaN value. I need to update it with $inc, but it looks like $inc has no effect when the field is NaN. What options are available for an atomic document update?

Well this seems to lead to two logical conclusions. The first being that if there are NaN values present in a field then how to identify them? Consider the following sample, let's call the collection "nantest"
{ "_id" : ObjectId("54055993b145d1c015a1ad41"), "n" : NaN }
{ "_id" : ObjectId("540559e8b145d1c015a1ad42"), "n" : Infinity }
{ "_id" : ObjectId("54055b59b145d1c015a1ad43"), "n" : 1 }
{ "_id" : ObjectId("54055ea1b145d1c015a1ad44"), "n" : -Infinity }
So NaN, Infinity and -Infinity are all representative of "non-numbers" that have somehow emerged in your data. The best way to find the documents where that field is set that way is to use the $where operator for a JavaScript-evaluated query condition. It is not efficient, but it is what you have got:
db.nantest.find({
"$where": "return isNaN(this.n) || Math.abs(this.n) == Infinity"
})
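To see what that condition is actually testing, and why an increment on NaN changes nothing, here is a minimal plain-JavaScript sketch that runs without a database. The helper name isBadNumber is hypothetical and only mirrors the $where string above:

```javascript
// Arithmetic on NaN always yields NaN, so adding 1 leaves the value "stuck".
const incremented = NaN + 1;

// Same test as the $where string above, written as a standalone function.
function isBadNumber(n) {
  return isNaN(n) || Math.abs(n) === Infinity;
}

// The four values of "n" from the sample collection.
const samples = [NaN, Infinity, 1, -Infinity];
const flagged = samples.filter(isBadNumber);

console.log(Number.isNaN(incremented)); // true
console.log(flagged.length);            // 3
```

Be aware that the global isNaN coerces its argument, so a missing field (undefined) or a non-numeric string is flagged as well. If only real numbers should match, a typeof check combined with Number.isFinite is stricter.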
So that gives a way of finding the data that is the problem. From here you could jump through hoops and decide that where this was encountered you would just reset it to 0 before incrementing, essentially issuing two update statements where the first one would not match a document to update if the value was correct:
db.nantest.update(
{ "$where": "return isNaN(this.n) || Math.abs(this.n) == Infinity" },
{ "$set": { "n": 0 } }
);
db.nantest.update(
{ },
{ "$inc": { "n": 1 } }
);
But really, when you look at that, why would you want to patch your code to cater for this when you can just patch the data? So the logical conclusion is to update all the NaN and possibly Infinity values to a standard reset number in one statement:
db.nantest.update(
{ "$where": "return isNaN(this.n) || Math.abs(this.n) == Infinity" },
{ "$set": { "n": 0 } },
{ "multi": true }
);
Run that one statement and you don't have to change your code; you simply process increments as you normally would.
If your trouble is knowing which fields have NaN values present in order to invoke updates to fix them, then consider something along the lines of this mapReduce process to inspect the fields:
db.nantest.mapReduce(
function () {
var doc = this;
delete doc._id;
Object.keys( doc ).forEach(function(key) {
if ( isNaN( doc[key] ) || Math.abs(doc[key]) == Infinity )
emit( key, 1 );
});
},
function (key,values) {
return Array.sum( values );
},
{ "out": { "inline": 1 } }
)
You might need to add some complexity to this for more deeply nested documents, but it tells you which fields can possibly contain the errant values so you can construct update statements to fix them.
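As a rough idea of what that extra complexity for nested documents could look like, here is a plain-JavaScript sketch of a recursive walk. The function name findBadPaths and the sample document are made up for illustration. Unlike the mapper above, it only flags values whose type is actually number, which is an assumption about what you want:

```javascript
// Hypothetical helper: collect dotted paths whose value is a non-finite number.
function findBadPaths(doc, prefix = "") {
  const paths = [];
  for (const key of Object.keys(doc)) {
    const value = doc[key];
    const path = prefix ? prefix + "." + key : key;
    if (typeof value === "number") {
      // NaN, Infinity and -Infinity all fail Number.isFinite.
      if (!Number.isFinite(value)) paths.push(path);
    } else if (value !== null && typeof value === "object") {
      // Recurse into embedded documents and arrays.
      paths.push(...findBadPaths(value, path));
    }
  }
  return paths;
}

// Made-up sample document with one bad top-level and one bad nested field.
const sample = { n: NaN, stats: { hits: 3, ratio: Infinity }, tags: ["a", "b"] };
console.log(findBadPaths(sample)); // [ 'n', 'stats.ratio' ]
```

Inside a mapReduce mapper you would emit each returned path instead of collecting it in an array.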
It would seem that rather than bending your code to suit this you "should be" doing:
Find the source that is causing the numbers to appear and fix that.
Identify the field or fields that contain these values.
Process a one-off update statement to fix the data all at once.
That means minimal messing with your code, and it fixes both the "source" of the problem and the "result" of the data corruption that was introduced.

Related

TypeError: if 'false' not working as expected

I'm building a PWA quiz application using React.js and I've run into the following problem:
I can get question objects with only one answer, and some with several.
In the case there is only one possible answer, I want to force the user to only have one possibility.
To do that, I made the following algorithm:
clickOnChoice = (key) => {
if (this.state && this.state.correctAnswers) {
let newChoices = INITIAL_CHOICES; // {}
if (this.state.multiChoice) {
console.log("this.state.multiChoice:", this.state.multiChoice); // this.state.multiChoice: false ???
newChoices = JSON.parse(JSON.stringify(this.state.choices)); // {answer_b: 1}
}
newChoices[key] = 1 - (newChoices[key] | 0); // {answer_b: 1, answer_a: 1}
this.setState({
choices: newChoices
}, this.updateNextButtonState);
}
}
However the execution seems to ignore the condition if (this.state.multiChoice).
What am I missing?
Maybe I need a cup of coffee... ☕
Anyway, thanks in advance!
More than likely you are checking the string 'false' rather than an actual boolean value. Any non-empty string is truthy, so the if block still runs.
You can either compare against the expected string, if (this.state.multiChoice === 'true'), or store an actual boolean (true or false) in the state property.
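A minimal sketch of the pitfall, assuming the value reaches the component as a string (for example from JSON or a form field). The helper toBoolean is hypothetical, not something React provides:

```javascript
// Any non-empty string is truthy in JavaScript, including "false".
const fromServer = "false";
const branchTaken = fromServer ? true : false;

// Hypothetical helper that normalises such values to a real boolean
// before they are stored in component state.
function toBoolean(value) {
  if (typeof value === "string") {
    return value.trim().toLowerCase() === "true";
  }
  return Boolean(value);
}

console.log(branchTaken);           // true
console.log(toBoolean(fromServer)); // false
```

Converting once, at the point where the value enters the state, keeps every later if (this.state.multiChoice) check honest.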

How to $pull elements from an array, $where elements' string length > a large number?

An old slash escaping bug left us with some messed up data, like so:
{
suggestions: [
"ok",
"not ok /////////// ... 10s of KBs of this ... //////",
]
}
I would like to just pull those bad values out of the array. My first idea was to $pull based on a regex that matches 4 "/" characters, but it appears that regexes do not work on large strings:
db.notes.count({suggestions: /\/\/\/\//}) // returns 0
db.notes.count({suggestions: {$regex: "////"}}) // returns 0
My next idea was to use a $where query to find documents that have suggestion strings that are longer than 1000. That query works:
db.notes.count({
suggestions: {$exists: true},
$where: function() {
return !!this.suggestions.filter(function (item) {
return (item || "").length > 1000;
}).length
}
})
// returns a plausible number
But a $where query can't be used as the condition in a $pull update.
db.notes.update({
suggestions: {$exists: true},
}, {
$pull: {
suggestions: {
$where: function() {
return !!this.suggestions.filter(function (item) {
return (item || "").length > 1000;
}).length
}
}
}
})
throws
WriteResult({
"nMatched" : 0,
"nUpserted" : 0,
"nModified" : 0,
"writeError" : {
"code" : 81,
"errmsg" : "no context for parsing $where"
}
})
I'm running out of ideas. Will I have to iterate over the entire collection, and $set: {suggestions: suggestions.filter(...)} for each document individually? Is there no better way to clean bad values out of an array of large strings in MongoDB?
(I'm only adding the "javascript" tag to get SO to format the code correctly)
The simple solution pointed out in the question comments should have worked. It does work with a test case that is a recreation of the original problem. Regexes can match on large strings; there is no special restriction there.
db.notes.updateOne({suggestions: /\/\//}, { "$pull": {suggestions: /\/\//}})
Since this didn't work for me, I ended up going with what the question discussed: updating all documents individually by filtering the array elements based on string length:
db.notes.find({
suggestions: {$exists: true}
}).forEach(function(doc) {
doc.suggestions = doc.suggestions.filter(function(item) {
return (item || "").length <= 1000;
});
db.notes.save(doc);
});
It ran slow, but that wasn't really a problem in this case.
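The filtering step itself can be tried outside the shell. This is a small plain-JavaScript sketch of the same length check; the helper name dropLongStrings and the sample data are made up for illustration:

```javascript
// Hypothetical helper: keep only entries at or below maxLength characters.
function dropLongStrings(items, maxLength) {
  return items.filter(function (item) {
    // null or undefined entries count as length 0 and are therefore kept.
    return (item || "").length <= maxLength;
  });
}

// One good value, one value bloated by repeated slashes, one null entry.
const bloated = "not ok " + "/".repeat(20000);
const cleaned = dropLongStrings(["ok", bloated, null], 1000);

console.log(cleaned); // [ 'ok', null ]
```

If null entries should go too, add an explicit check for them, since the (item || "") guard deliberately lets them through.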

How to update Field by multiply Field value Mongodb

I'm trying to update a field with a value in euro, but something is going wrong.
db.orders.update({
"_id": ObjectId("56892f6065380a21019dc810")
}, {
$set: {
"wartoscEUR": {
$multiply: ["wartoscPLN", 4]
}
}
})
I got an error:
The dollar ($) prefixed field '$multiply' in 'wartoscEUR.$multiply' is not valid for storage.
wartoscPLN and wartoscEUR are number fields, and I'd like to calculate wartoscEUR by multiplying wartoscPLN by 4.
Sorry, maybe this is really easy, but I'm just getting started with NoSQL.
The $multiply-operator is only used in aggregation. The operator you are looking for is $mul and it has a different syntax:
db.orders.update(
{"_id" : ObjectId("56892f6065380a21019dc810")},
{ $mul: { wartoscPLN: 4 } }
);
It is not necessary to combine it with $set, as $mul writes the result itself. Note that this multiplies wartoscPLN in place; it does not store the result in wartoscEUR.
The $multiply operator can only be used in the aggregation framework, not in an update operation. You could use the aggregation framework in your case to create a new result set that has the new field wartoscEUR created with the $project pipeline.
You can then loop through the result set (using the forEach() method of the cursor returned from the .aggregate() method) and update your collection with the new value. Because you cannot access a value of a document directly in an update statement, the Bulk API update operations come in handy here:
The Bulk update will at least allow many operations to be sent in a single request with a singular response.
The above operation can be depicted with this implementation:
var bulk = db.orders.initializeOrderedBulkOp(),
counter = 0;
db.orders.aggregate([
{
"$project": {
"wartoscEUR": { "$multiply": ["$wartoscPLN", 4] }
}
}
]).forEach(function (doc) {
bulk.find({ "_id": doc._id }).updateOne({
"$set": { "wartoscEUR": doc.wartoscEUR }
});
counter++;
if (counter % 1000 == 0) {
bulk.execute();
bulk = db.orders.initializeOrderedBulkOp();
}
});
if (counter % 1000 != 0) {
bulk.execute();
}
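The counter logic in that listing (flush every 1000 operations, then flush the remainder once at the end) is a general batching pattern. Here is a database-free sketch of it in plain JavaScript; processInBatches and the flush callback are hypothetical stand-ins for the bulk object and bulk.execute():

```javascript
// Hypothetical stand-in for the bulk pattern: queue items and call flush
// every batchSize items, plus once more for any remainder.
function processInBatches(items, batchSize, flush) {
  let batch = [];
  for (const item of items) {
    batch.push(item);
    if (batch.length === batchSize) {
      flush(batch);
      batch = []; // start a fresh batch, like re-initializing the bulk op
    }
  }
  // An empty final batch is never flushed.
  if (batch.length > 0) flush(batch);
}

// Record the size of every flushed batch for 2500 queued items.
const sizes = [];
const items = Array.from({ length: 2500 }, (_, i) => i);
processInBatches(items, 1000, (batch) => sizes.push(batch.length));

console.log(sizes); // [ 1000, 1000, 500 ]
```

Skipping the final flush when nothing is queued matters here, since the listing above takes the same care to call bulk.execute() only when operations are pending.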

How can I fake a multi-item $pop in MongoDB?

Quick question from someone new to Mongo.
I have a collection of documents that (simplified) look like this:
{"_id":<objectID>, "name":"fakeName", "seeds":[1231,2341,0842,1341,3451, ...]}
What I really need is a $pop that pops 2 or 3 items off my list of seeds, but $pop currently only works for one item, so I'm trying to look for another way to accomplish the same thing.
The first thing I looked at was doing $push/$each/$slice with an empty "each", like:
update: { $push: { order: { $each: [ ], $slice: ?}}}
The problem here is that I don't know exactly how long I want my new slice to be (I want it to be "current size - number of seeds I popped"). If the $slice modifier worked like the $slice projection, this would be easy, I could just do $slice: [ #of seeds, ], but it doesn't so that doesn't work.
The next thing I looked at was getting the size of the array and using that as an input to $slice, like:
update: { $push: { seeds: { $each: [ ], $slice: {$subtract: [{$size:"$seeds"}, <number of seeds to pop>]}}}}
But Mongo tells me "value for $slice must be numeric value and not an Object", so apparently the result of $subtract is an Object not a number.
Then I tried to see if I could "remove" items from the array based on an empty query with a $limit, but apparently limit gets applied later in the pipeline, so I couldn't manage to make that work.
Any other suggestions, or am I out of luck and need to go back to the drawing board?
Thanks so much for help/input.
MongoDB does not presently have any method of referencing the existing values of fields in a singular update statement. The only exceptions are operators such as $inc and $mul which can act on the present value and alter it according to a set rule.
This is in part due to the compatibility of the "phrasing" of operations to act over multiple documents, whether that is the case or not. But what you are asking for is some "variable" operation that allows the "length" of an array to be tested and used as an "input parameter" to another method. This is not supported.
So the best you can do is read the document content and test the length of the array in code, then perform the $slice update as you first surmised. Alternately, you could use the aggregation framework to work out the "possible lengths" of arrays (assuming a lot of duplication) and then work on "multi" updates for those documents that match the conditions, of course assuming that you want to do this over more than a single document.
First form:
var bulk = db.collection.initializeOrderedBulkOp();
var count = 0;
db.collection.find().forEach(function(doc) {
if ( doc.order.length > 2 ) {
bulk.find({ "_id": doc._id })
.updateOne({
"$push": {
"order": { "$each": [], "$slice": doc.order.length - 2 }
}
});
count++;
// Flush only right after queuing an operation, so an empty batch is never executed
if ( count % 1000 == 0 ) {
bulk.execute();
bulk = db.collection.initializeOrderedBulkOp();
}
}
});
// Send whatever is left in the final partial batch
if ( count % 1000 != 0 )
bulk.execute();
Second form:
var bulk = db.collection.initializeOrderedBulkOp();
var count = 0;
db.collection.aggregate([
{ "$group": { "_id": { "$size": "$order" } }},
{ "$match": { "_id": { "$gt": 2 } }}
]).forEach(function(doc) {
bulk.find({ "order": { "$size": doc._id } })
.update({
"$push": {
"order": { "$each": [], "$slice": doc._id - 2 }
}
});
count++;
if ( count % 1000 == 0 ) {
bulk.execute();
bulk = db.collection.initializeOrderedBulkOp();
}
});
// Send whatever is left in the final partial batch
if ( count % 1000 != 0 )
bulk.execute();
Noting that in both cases there is some logic to consider the length of the arrays in order not to "empty" them, or produce an undesired $slice operation.
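To make the length arithmetic concrete, here is a plain-JavaScript sketch of what the empty $push with a positive $slice does to an array: it keeps the first (length - n) elements, which amounts to popping the last n. The helper names sliceLengthAfterPop and popLast are hypothetical:

```javascript
// Hypothetical helper: the value to pass as "$slice" when popping n items,
// or null when the array is too short and the update should be skipped.
function sliceLengthAfterPop(arrayLength, n) {
  return arrayLength > n ? arrayLength - n : null;
}

// In-memory equivalent of { "$each": [], "$slice": length - n }.
function popLast(arr, n) {
  const keep = sliceLengthAfterPop(arr.length, n);
  return keep === null ? arr.slice() : arr.slice(0, keep);
}

console.log(popLast([1231, 2341, 842, 1341, 3451], 2)); // [ 1231, 2341, 842 ]
console.log(popLast([1231, 2341], 2));                  // [ 1231, 2341 ]
```

The guard corresponds to the length checks in both listings: arrays with no more than n elements are left untouched rather than emptied.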
Another possible alternative is to use the projection form of $slice in the query to get the last n elements and then $pull the matching elements from the array. Of course the identifier used for such an operation would have to be "unique", but it is a valid case where uniqueness is assured.
So whatever your case, you cannot do this in a singular update statement without having some prior knowledge of the current state of the document to be modified. The different listings though give you ways to approach "emulating" this, albeit not in a single statement.

mongodb - can't understand why/how to use map-reduce

I'm trying to use map-reduce to understand when this can be helpful.
So I have a collection named "actions" with 100k docs like this:
{
"profile_id":1111,
"action_id":2222
}
Now I'm trying to do map-reduce examples. I'm trying to get a list of "all users and total actions each one has". Is this possible? My code:
db.fbooklikes.mapReduce(
function(){
emit(this.profile_id, this.action_id);
},
function(keyProfile, valueAction){
return Array.sum(valueAction);
},
{
out:"example"
}
)
.. This is not working. The result is:
"counts" : {
"input" : 100000,
"emit" : 100000,
"reduce" : 1146,
"output" : 13
},
"ok" : 1,
"_o" : {
"result" : "map_reduce_example",
"timeMillis" : 2539,
"counts" : {
"input" : 100000,
"emit" : 100000,
"reduce" : 1146,
"output" : 13
},
"ok" : 1
},
Is what I'm trying to do possible with map-reduce?
Well yes you can use it, but the more refined response is that there are likely better tools for doing what you want.
MapReduce is handy for some tasks, but usually best suited when something else does not apply. The inclusion of mapReduce in MongoDB pre-dates the introduction of the aggregation framework, which is generally what you should be using when you can:
db.fbooklikes.aggregate([
{ "$group": {
"_id": "$profile_id",
"count": { "$sum": 1 }
}}
])
This simply returns the count of all documents in the collection, grouped by each value of "profile_id".
MapReduce requires JavaScript evaluation and therefore runs much slower than the native code functions implemented by the aggregation framework. Sometimes you have to use it, but in simple cases it is best not to, and there are some quirks that you need to understand:
db.fbooklikes.mapReduce(
function(){
emit(this.profile_id, 1);
},
function(key,values){
return Array.sum(values);
},
{
out: { "inline": 1 }
}
)
The biggest thing people miss with mapReduce is the fact that the reducer is almost never called just once per emitted key. In fact it will process output in "chunks", thus "reducing" down part of that output and placing it back to be "reduced" again against other output until there is only a single value for that key.
For this reason it is important to return the same type of data from the reduce function as is emitted from the "map" function. It's a sticking point that can lead to weird results when you don't understand that part of the process. It is in fact the underlying way that mapReduce can deal with large volumes of results for a single key value and reduce them.
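A quick way to convince yourself of this is to simulate the chunked behaviour in plain JavaScript. This is only a sketch: reduceInChunks is a hypothetical stand-in for what the server does, and the chunk size of 3 is arbitrary:

```javascript
// A reducer that is safe to re-run on its own output.
function sumReducer(key, values) {
  return values.reduce((total, value) => total + value, 0);
}

// A reducer that looks fine but breaks when fed partial results.
function lengthReducer(key, values) {
  return values.length;
}

// Hypothetical simulation: reduce the values in chunks, then reduce the partials.
function reduceInChunks(reducer, key, values, chunkSize) {
  const partials = [];
  for (let i = 0; i < values.length; i += chunkSize) {
    partials.push(reducer(key, values.slice(i, i + chunkSize)));
  }
  return reducer(key, partials);
}

// Seven emits of 1 for the same key.
const emitted = [1, 1, 1, 1, 1, 1, 1];
console.log(reduceInChunks(sumReducer, "p", emitted, 3));    // 7
console.log(reduceInChunks(lengthReducer, "p", emitted, 3)); // 3
```

The summing reducer gives the same answer however the values are chunked, while the length-based one ends up counting the partial results instead of the original emits.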
But generally speaking, you should be using the aggregation framework where possible, and where a problem requires some special calculations that would not be possible there, or otherwise has some complex document traversal where you need to inspect with JavaScript, then that is where you use mapReduce.
You don't want to sum the action ids, you want to count them. So you want something like the following:
var map = function () {
emit(this.profile_id, { action_ids : [this.action_id], count : 1 });
}
var reduce = function(profile_id, values) {
var value = { action_ids: [], count: 0 };
for (var i = 0; i < values.length; i++) {
value.count += values[i].count;
value.action_ids.push.apply(value.action_ids, values[i].action_ids);
}
return value;
}
db.fbooklikes.mapReduce(map, reduce, { out:"example" });
This will give you an array of action ids and a count for each profile id. The count could be obtained by accessing the length of the action_ids array, but I thought I would keep it separate to make the example clearer.
