mongo/mongoid MapReduce on batch inserted documents - javascript

I'm creating my batch and inserting it into the collection using the commands below:
batch = []
time = 1.day.ago
(1..2000).each{ |i| a = {:name => 'invbatch2k'+i.to_s, :user_id => BSON::ObjectId.from_string('533956cd4d616323cf000000'), :out_id => 'out', :created_at => time, :updated_at => time, :random => '0.5' }; batch.push a; }
Invitation.collection.insert batch
As shown above, every invitation record has its user_id field set to '533956cd4d616323cf000000'.
After inserting my batch with created_at: 1.day.ago I get:
2.1.1 :102 > Invitation.lte(created_at: 1.week.ago).count
=> 48
2.1.1 :103 > Invitation.lte(created_at: Date.today).count
=> 2048
also:
2.1.1 :104 > Invitation.lte(created_at: 1.week.ago).where(user_id: '533956cd4d616323cf000000').count
=> 14
2.1.1 :105 > Invitation.where(user_id: '533956cd4d616323cf000000').count
=> 2014
Also, I've got a map reduce which counts invitations sent by each unique User (both total and sent to unique out_id)
class Invitation
[...]
def self.get_user_invites_count
map = %q{
function() {
var user_id = this.user_id;
emit(user_id, {user_id : this.user_id, out_id: this.out_id, count: 1, countUnique: 1})
}
}
reduce = %q{
function(key, values) {
var result = {
user_id: key,
count: 0,
countUnique : 0
};
var values_arr = [];
values.forEach(function(value) {
values_arr.push(value.out_id);
result.count += 1
});
var unique = values_arr.filter(function(item, i, ar){ return ar.indexOf(item) === i; });
result.countUnique = unique.length;
return result;
}
}
map_reduce(map,reduce).out(inline: true).to_a.map{|d| d['value']} rescue []
end
end
The issue is:
Invitation.lte(created_at: Date.today.end_of_day).get_user_invites_count
returns
[{"user_id"=>BSON::ObjectId('533956cd4d616323cf000000'), "count"=>49.0, "countUnique"=>2.0} ...]
instead of "count" => 2014, "countUnique" => 6.0 while:
Invitation.lte(created_at: 1.week.ago).get_user_invites_count returns:
[{"user_id"=>BSON::ObjectId('533956cd4d616323cf000000'), "count"=>14.0, "countUnique"=>6.0} ...]
The data returned by the query was accurate before I inserted the batch.
I can't wrap my head around what's going on here. Am I missing something?

The part of the documentation that you seem to have missed is the problem here:
MongoDB can invoke the reduce function more than once for the same key. In this case, the previous output from the reduce function for that key will become one of the input values to the next reduce function invocation for that key.
And also later:
the type of the return object must be identical to the type of the value emitted by the map function to ensure that the following operations is true:
So what you are seeing is that your reduce function returns a value whose structure differs from the values it receives from the mapper. This is important since the reducer may not get all of the values for a given key in a single pass. Instead it gets some of them, "reduces" the result, and that reduced output may be combined with other values for the key (possibly also reduced) in a further pass through the reduce function.
Because your fields do not match, subsequent reduce passes do not see those values, and they do not count towards your totals. So you need to align the signatures of the values:
def self.get_user_invites_count
map = %q{
function() {
var user_id = this.user_id;
emit(user_id, {out_id: this.out_id, count: 1, countUnique: 0})
}
}
reduce = %q{
function(key, values) {
var result = {
out_id: null,
count: 0,
countUnique : 0
};
var values_arr = [];
values.forEach(function(value) {
if (value.out_id != null)
values_arr.push(value.out_id);
result.count += value.count;
result.countUnique += value.countUnique;
});
var unique = values_arr.filter(function(item, i, ar){ return ar.indexOf(item) === i; });
result.countUnique += unique.length;
return result;
}
}
map_reduce(map,reduce).out(inline: true).to_a.map{|d| d['value']} rescue []
end
You also do not need user_id in the values emitted or kept, as it is already the "key" value for the mapReduce. The remaining alterations account for the fact that both "count" and "countUnique" can contain an existing value that needs to be carried forward, whereas you were simply resetting the values to 0 on each pass.
Then of course if the "input" has already been through a "reduce" pass, you do not need the "out_id" values to be filtered for "uniqueness", as you already have the count and it is now included. So any null values are not added to the array of things to count, and the unique count is "added" to the total rather than replacing it.
So the reducer does get called several times. For 20 key values the input will likely not be split, which is why your sample with less input works. For pretty much anything more than that, the "groups" of the same key values will be split up, which is how mapReduce optimizes for large data processing. As the "reduced" output will be sent back to the reducer again, you need to be mindful of the values you already sent to output in the previous pass.
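The re-reduce behaviour described above can be reproduced without a database. The following is a minimal sketch in plain JavaScript, not the code from the answer: reduceInChunks is a hypothetical stand-in for the way the server may split the values for a key, and the emitted shape { count, out_ids } is an assumption chosen so that the reducer's output has the same shape as its input. Carrying the list of out_ids forward is a swapped-in technique that keeps the unique count exact across passes.

```javascript
// Plain-JavaScript simulation; no MongoDB involved.

// Mirrors the original reducer: counts the values it sees and ignores value.count.
function naiveReduce(key, values) {
  return { count: values.length, out_ids: [] };
}

// Output has the same shape as the emitted values, so it can be reduced again.
function safeReduce(key, values) {
  var result = { count: 0, out_ids: [] };
  values.forEach(function (value) {
    result.count += value.count;
    value.out_ids.forEach(function (id) {
      if (result.out_ids.indexOf(id) === -1) {
        result.out_ids.push(id);
      }
    });
  });
  return result;
}

// Hypothetical helper: reduce in chunks, then reduce the partial results again.
function reduceInChunks(reducer, key, values, chunkSize) {
  var partials = [];
  for (var i = 0; i < values.length; i += chunkSize) {
    partials.push(reducer(key, values.slice(i, i + chunkSize)));
  }
  return reducer(key, partials);
}

// 2000 emitted values spread over 6 distinct out_ids (made-up sample data).
var emitted = [];
for (var n = 0; n < 2000; n++) {
  emitted.push({ count: 1, out_ids: ['out' + (n % 6)] });
}

var naive = reduceInChunks(naiveReduce, 'user', emitted, 100);
var safe = reduceInChunks(safeReduce, 'user', emitted, 100);
console.log(naive.count);                       // 20 (number of partial results)
console.log(safe.count, safe.out_ids.length);   // 2000 6
```

The naive reducer ends up counting the partial results rather than the documents, which is the same kind of undercount seen in the question.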


how to call post api recursively until array.length condition is false

My question is how to call a recursive POST API function until the array.length condition is false.
Here is the code I tried:
const postItemArray =["<eg:cat>"]
var collectionDataContainer = [];
const recurseFunction = i => {
const [loadingStatus, mainData, addDataToPostItemArray] = useFetch(`http//blabla${postItemArray[i]}variety`);
console.log(edgeData[i]);
if (i < postItemArray.length) {
recurseFunction(i + 1);
}
if (addDataToPostItemArray.length !== 0) {
postItemArray.push(...addDataToPostItemArray);
}
if (mainData.length !== 0) {
collectionDataContainer.push(...demoData);
}
console.log(DummyArray);
};
recurseFunction(0);
Here the values returned by useFetch are:
loadingStatus gives a true or false condition
mainData gives the data after fetching
addDataToPostItemArray is filtered out of mainData and gives an array of keys, if mainData has any keys for the next recursion step
Here is an example of the data format before calling the recursive function:
step-1
initially postItemArray = ["<eg:cat>"]
after passing the postItemArray[i] value to useFetch we get data like this:
mainData=[
{key : "<eg:dog>",class : "animal"},
{key : "<eg:ant>",class : "insect"},
{key : "<eg:monkey>", category : "jumping"},
{key : "<eg:fish>", category : "swim"}
]
step-2 Inside useFetch, addDataToPostItemArray collects the keys of the objects in mainData that have a class key/value pair, in the format below:
addDataToPostItemArray =["<eg:dog>","<eg:ant>"]
If an object in the mainData array has a class key, its key is stored in addDataToPostItemArray.
Here is the logic in the useFetch file for finding the keys:
axios(config).then(response => {
let originalResult = response.data.results.bindings
.filter(ek => ek.Class !== undefined && ek)
.map(el => el.key);
setFindClass(originalResult);
My problem is this: I am able to get the keys from useFetch and push them into postItemArray,
and the length of postItemArray grows to 3 as expected,
but inside recurseFunction the if statement (i < postItemArray.length) only lets the loop run once, because it sees the initial length even after data has been pushed to postItemArray.
That is why I am not able to keep looping after the array length has increased.
This is the failing condition:
if (i < postItemArray.length) {
recurseFunction(i + 1);
}
Can anyone please help me out? I am unable to find a solution to this.
All I want is to collect the data into this container, collectionDataContainer = [],
looping for as long as mainData has objects with class and key.
If anyone has better logic, that would be much appreciated.
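One possible way to get the behaviour described above is to wait for each response before looking at the array length again. This is only a sketch under assumptions: fetchMainData and fakeResponses are hypothetical stand-ins for the real request, it runs outside of React (hooks such as useFetch cannot be called recursively like this), and the lowercase class property follows the sample data in the question.

```javascript
// Hypothetical in-memory stand-in for the real POST request.
const fakeResponses = {
  '<eg:cat>': [
    { key: '<eg:dog>', class: 'animal' },
    { key: '<eg:ant>', class: 'insect' },
    { key: '<eg:monkey>', category: 'jumping' },
    { key: '<eg:fish>', category: 'swim' }
  ]
};

// Resolves to the rows for a key, or to an empty array for unknown keys.
async function fetchMainData(key) {
  return fakeResponses[key] || [];
}

async function collectAll(startKey) {
  const postItemArray = [startKey];
  const collectionDataContainer = [];
  // postItemArray.length is re-read on every iteration, and each response is
  // awaited first, so keys pushed inside the loop are visited as well.
  for (let i = 0; i < postItemArray.length; i++) {
    const mainData = await fetchMainData(postItemArray[i]);
    collectionDataContainer.push(...mainData);
    for (const row of mainData) {
      // Queue only rows that have a class and have not been queued already.
      if (row.class !== undefined && !postItemArray.includes(row.key)) {
        postItemArray.push(row.key);
      }
    }
  }
  return { postItemArray, collectionDataContainer };
}

collectAll('<eg:cat>').then(result => {
  console.log(result.postItemArray);
  console.log(result.collectionDataContainer.length);
});
```

The includes check is there so that a key returned by more than one response is not fetched twice, which would otherwise risk an endless loop.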

Adding onto float within json object

So I have a JSON object that is dynamically created based on lots. I'm trying to get the total cost of each lot. Each object in the list is a different purchase.
var lot_list = [{lot:123456,cost:'$4,500.00'}, {lot:654321, cost:'$1,600.00'}, {lot:123456, cost:'$6,500.00'}]
I want the total cost for each lot so I tried
var totalBalances = {};
function addBalance(){
lot_list.forEach(function(lots){
totalBalances[lots[lot]] += parseFloat(lots['cost'].replace('$','').replace(',',''));
});
}
This ends with every lot having a null cost
I also tried
var totalBalances = {};
function addBalance(){
lot_list.forEach(function(lots){
totalBalances[lots[lot]] = parseInt(totalBalances[lots[lot]]) + parseFloat(lots['cost'].replace('$','').replace(',',''));
});
}
Neither of these worked. Any help is much appreciated.
You cannot get a value to sum with parseFloat('$4,500.00') because of the invalid characters. To remove the dollar sign and commas you can replace using:
> '$4,500.00'.replace(/[^\d\.]/g,'')
> "4500.00"
Here the regex matches anything that is not a digit or a decimal point, with the global modifier to replace all occurrences.
You can map your array of objects to the float values using Array.map() to get an array of float values.
> lot_list.map((l) => parseFloat(l.cost.replace(/[^\d\.]/g,'')))
> [4500, 1600, 6500]
With an array of float values you can use Array.reduce() to get the sum:
> lot_list.map((l) => parseFloat(l.cost.replace(/[^\d\.]/g,''))).reduce((a,c) => a+c)
> 12600
EDIT
To get totals for each lot, map to an object with the lot id included and then reduce onto an object:
> lot_list
.map((l) => {
return {
id: l.lot,
cost: parseFloat(l.cost.replace(/[^\d\.]/g,''))
}
})
.reduce((a,c) => {
a[c.id] = a[c.id] || 0;
a[c.id] += c.cost;
return a;
}, {})
> { 123456: 11000, 654321: 1600 }
Here the reduce function creates an object as the accumulator and then initialises the sum to zero if the lot id has not been summed before.
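The initialisation to zero matters because adding a number to a property that does not exist yet gives NaN, and NaN is serialised as null in JSON, which may be where the "null cost" in the question comes from. A small sketch with made-up variable names (broken, fixed):

```javascript
// Adding to a missing property: undefined + 4500 is NaN.
const broken = {};
broken[123456] += 4500;
console.log(broken[123456]);          // NaN
console.log(JSON.stringify(broken));  // {"123456":null}

// Falling back to 0 first gives a real running total.
const fixed = {};
fixed[123456] = (fixed[123456] || 0) + 4500;
fixed[123456] = (fixed[123456] || 0) + 6500;
console.log(fixed[123456]);           // 11000
```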
Based on your code:
var totalBalances = {};
function addBalance(){
lot_list.forEach(function(lots){
// totalBalances[123456] is not defined yet, so adding to it with += gives NaN
// totalBalances[lots[lot]] += parseFloat(lots['cost'].replace('$','').replace(',',''));
// lots[lot] also refers to an undefined variable `lot`; use lots.lot or lots['lot']
// start from 0 when the lot has not been seen before
totalBalances[lots.lot] = (totalBalances[lots.lot] || 0) + parseFloat(lots['cost'].replace('$','').replace(',',''));
});
}
But if you want the total of all costs, you might consider using Array reduce:
function countCost(){
return lot_list.reduce((accum, dt) => {
return accum + parseFloat(dt.cost.replace(/[^\d\.]/g,''))
}, 0);
} // this returns the total of all costs

what is the equivalent of a reduce in javascript

I'm a backend dev moved recently onto js side. I was going through a tutorial and came across the below piece of code.
clickCreate: function(component, event, helper) {
var validExpense = component.find('expenseform').reduce(function (validSoFar, inputCmp) {
// Displays error messages for invalid fields
inputCmp.showHelpMessageIfInvalid();
return validSoFar && inputCmp.get('v.validity').valid;
}, true);
// If we pass error checking, do some real work
if(validExpense){
// Create the new expense
var newExpense = component.get("v.newExpense");
console.log("Create expense: " + JSON.stringify(newExpense));
helper.createExpense(component, newExpense);
}
}
I tried hard to understand what's happening here: there is something called reduce and another thing named validSoFar. I'm unable to understand what's happening under the hood. :-(
I do get the regular loops as done in Java.
Can someone please shed some light on what's happening here? I will be using this a lot in my regular work.
Thanks
The reduce function here iterates through each input component of the expense form and incrementally folds them into a boolean. If you have, say, three inputs, each with a true validity, the reduce function would evaluate:
true && true, where the first true is the initial value passed into reduce.
true && true, where the first true is the result of the previous step.
true && true
At the end of the reduction, you're left with a single boolean representing the validity of the entire form: if just a single input component's validity is false, the entire reduction will amount to false. This is because validSoFar keeps track of the overall validity and is updated on each step by returning the conjunction of whether the form is valid so far and the validity of the current input.
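To see those steps outside of the component framework, here is a small sketch that uses a plain array of booleans in place of the form inputs (validities and steps are made-up names for illustration) and records each comparison:

```javascript
// Stand-in for the validity of three form inputs; the second one is invalid.
const validities = [true, false, true];
const steps = [];

const allValid = validities.reduce((validSoFar, valid) => {
  steps.push(`${validSoFar} && ${valid}`); // record what is compared on this step
  return validSoFar && valid;
}, true);

console.log(steps);    // [ 'true && true', 'true && false', 'false && true' ]
console.log(allValid); // false
```

Once one input is invalid, validSoFar stays false for every remaining step, although the callback still runs for each element.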
This is a reasonable equivalent:
var validExpense = true;
var inputCmps = component.find('expenseform');
for (var i = 0; i < inputCmps.length; i++) {
var inputCmp = inputCmps[i];
// Displays error messages for invalid fields
inputCmp.showHelpMessageIfInvalid();
if (!inputCmp.get('v.validity').valid) {
validExpense = false;
}
}
// Now we can use validExpense
This is a somewhat strange use of reduce, to be honest, because it does more than simply reducing a list to a single value. It also produces side effects (presumably) in the call to showHelpMessageIfInvalid().
The idea of reduce is simple. Given a list of values that you want to fold down one at a time into a single value (of the same or any other type), you supply a function that takes the current folded value and the next list value and returns a new folded value, and you supply an initial folded value, and reduce combines them by calling the function with each successive list value and the current folded value.
So, for instance,
var items = [
{name: 'foo', price: 7, quantity: 3},
{name: 'bar', price: 5, quantity: 5},
{name: 'baz', price: 19, quantity: 1}
]
const totalPrice = items.reduce(
(total, item) => total + item.price * item.quantity, // folding function
0 // initial value
); //=> 65
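For a look under the hood, the folding described above can be written as an ordinary loop. This is a simplified sketch, not the built-in implementation: myReduce is a made-up name, and the real Array.prototype.reduce also passes the index and the array to the callback and can run without an initial value.

```javascript
// Simplified hand-written fold: always requires an initial value.
function myReduce(list, fn, initial) {
  let folded = initial;
  for (const value of list) {
    folded = fn(folded, value); // combine the current folded value with the next item
  }
  return folded;
}

const cart = [
  { name: 'foo', price: 7, quantity: 3 },
  { name: 'bar', price: 5, quantity: 5 },
  { name: 'baz', price: 19, quantity: 1 }
];

const cartTotal = myReduce(cart, (total, item) => total + item.price * item.quantity, 0);
console.log(cartTotal); // 65
```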
It does not make sense to use reduce there and have side effects in the reducer. It is better to use Array.prototype.filter to get all invalid expense items.
Then use Array.prototype.forEach to produce the side effect(s) for each invalid item. You can then check the length of the invalid expense items array to see if your input was valid:
function(component, event, helper) {
var invalidExpenses = component.find('expenseform').filter(
function(ex){
//return not valid (!valid)
return !ex.get('v.validity').valid
}
);
invalidExpenses.forEach(
//use forEach if you need a side effect for each thing
function(ex){
ex.showHelpMessageIfInvalid();
}
);
// If we pass error checking, do some real work
if(invalidExpenses.length===0){//no invalid expense items
// Create the new expense
var newExpense = component.get("v.newExpense");
console.log("Create expense: " + JSON.stringify(newExpense));
helper.createExpense(component, newExpense);
}
}
The MDN documentation for Array.prototype.reduce has a good description and examples of how to use it.
It should take an array of things and return one other thing (which can be a different type of thing). But you won't find any examples there where side effects are initiated in the reducer function.

JavaScript/React Native array(objects) sort

I'm starting with react-native building an app to track lap times from my RC Cars. I have an arduino with TCP connection (server) and for each lap, this arduino sends the current time/lap for all connected clients like this:
{"tx_id":33,"last_time":123456,"lap":612}
In my program (in react-native), I have one state called dados with this struct:
dados[tx_id] = {
tx_id: <tx_id>,
last_time:,
best_lap:0,
best_time:0,
diff:0,
laps:[]
};
This program connects to the Arduino and, when it receives data, pushes it to this state, more specifically to the laps array of each transponder. Finally, I get something like this:
dados[33] = {
tx_id:33,
last_time: 456,
best_lap: 3455,
best_time: 32432,
diff: 32,
laps: [{lap:1,time:1234},{lap:2,time:32323},{lap:3,time:3242332}]
}
dados[34] = {
tx_id:34,
last_time: 123,
best_lap: 32234,
best_time: 335343,
diff: 10,
laps: [{lap:1,time:1234},{lap:2,time:32323},{lap:3,time:3242332}]
}
dados[35] = {
tx_id:35,
last_time: 789,
best_lap: 32234,
best_time: 335343,
diff: 8,
laps: [{lap:1,time:1234},{lap:2,time:32323},{lap:3,time:3242332},{lap:4,time:343232}]
}
This data is rendered to Views using the map function (not a FlatList).
My problem now is that I need to order this before printing it on screen.
Currently, with this code, the data is printed in tx_id order, since that is the key of the main array. Is there a way to order this array by the number of elements in the laps property and, as a secondary criterion, by the last_time property of each element?
In this case, the last tx of my example (35) would be first in the list because it has one more lap than the other elements. The second item would be 34 (because of last_time), and the third would be tx 33.
Is there any way to do this in JavaScript, or do I need to write a custom function and check every item recursively?
Thanks #crackhead420
While waiting for a reply to this question, I found what you described. :)
This is my final test/solution that worked:
var t_teste = this.state.teste;
t_teste[33] = {tx_id: 33, last_time:998,best_lap:2,best_time:123,diff:0,laps:[{lap:1,time:123},{lap:2,time:456}]};
t_teste[34] = {tx_id: 34, last_time:123,best_lap:2,best_time:123,diff:0,laps:[{lap:1,time:123},{lap:2,time:456}]};
t_teste[35] = {tx_id: 35, last_time:456,best_lap:2,best_time:123,diff:0,laps:[{lap:1,time:123},{lap:2,time:456},{lap:3,time:423}]};
t_teste[36] = {tx_id: 36, last_time:789,best_lap:2,best_time:123,diff:0,laps:[{lap:1,time:123},{lap:2,time:456}]};
console.log('Teste original: ',JSON.stringify(t_teste));
var saida = t_teste.sort(function(a, b) {
if (a.laps.length > b.laps.length) {
return -1;
}
if (a.laps.length < b.laps.length) {
return 1;
}
// In this case, the laps are equal....so let's check last_time
if (a.last_time < b.last_time) {
return -1; // fastest lap (less time) first!
}
if (a.last_time > b.last_time) {
return 1;
}
// Return the same
return 0;
});
console.log('Teste novo: ',JSON.stringify(saida));
Using some simple helper functions, this is definitely possible:
const data = [{tx_id:33,last_time:456,best_lap:3455,best_time:32432,diff:32,laps:[{lap:1,time:1234},{lap:2,time:32323},{lap:3,time:3242332}]},{tx_id:34,last_time:123,best_lap:32234,best_time:335343,diff:10,laps:[{lap:1,time:1234},{lap:2,time:32323},{lap:3,time:3242332}]},{tx_id:35,last_time:789,best_lap:32234,best_time:335343,diff:8,laps:[{lap:1,time:1234},{lap:2,time:32323},{lap:3,time:3242332},{lap:4,time:343232}]}]
const sortBy = fn => (a, b) => -(fn(a) < fn(b)) || +(fn(a) > fn(b))
const sortByLapsLength = sortBy(o => o.laps.length)
const sortByLastTime = sortBy(o => o.last_time)
const sortFn = (a, b) => -sortByLapsLength(a, b) || sortByLastTime(a, b)
data.sort(sortFn)
// show new order of `tx_id`s
console.log(data.map(o => o.tx_id))
sortBy() (more explanation at the link) accepts a function that selects a value as the sorting criteria of a given object. This value must be a string or a number. sortBy() then returns a function that, given two objects, will sort them in ascending order when passed to Array.prototype.sort(). sortFn() uses two of these functions with a logical OR || operator to employ short-circuiting behavior and sort first by laps.length (in descending order, thus the negation -), and then by last_time if two objects' laps.length are equal.
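Because both criteria here are numbers, the same ordering can also be sketched with plain subtraction and the same short-circuiting ||. This is an alternative for illustration, not the helper from the answer; byLapsThenLastTime is a made-up name and the laps arrays are trimmed-down sample data.

```javascript
// More laps first; on a tie, the smaller last_time first.
const byLapsThenLastTime = (a, b) =>
  b.laps.length - a.laps.length || a.last_time - b.last_time;

// Trimmed-down sample data based on the question (laps reduced to lap numbers).
const transponders = [
  { tx_id: 33, last_time: 456, laps: [1, 2, 3] },
  { tx_id: 34, last_time: 123, laps: [1, 2, 3] },
  { tx_id: 35, last_time: 789, laps: [1, 2, 3, 4] }
];

// slice() copies the array first, so the original order is left untouched.
const order = transponders.slice().sort(byLapsThenLastTime).map(o => o.tx_id);
console.log(order); // [ 35, 34, 33 ]
```

Subtraction only works for numeric criteria; for strings, a comparison-based helper like sortBy() is still needed.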
It's possible to sort an array of objects by their values:
dados.sort(function(a, b) {
return a.last_time - b.last_time;
});

rx: unfold array to multiple streams

I have a stream holding an array, each element of which has an id. I need to split this into a stream per id, which will complete when the source stream no longer carries the id.
E.g. input stream sequence with these three values
[{a:1}, {b:1}] [{a:2}, {b:2}, {c:1}] [{b:3}, {c:2}]
should return three streams
a -> 1 2 |
b -> 1 2 3
c -> 1 2
Where a has completed on the 3rd value, since its id is gone, and c has been created on the 2nd value, since its id has appeared.
I'm trying groupByUntil, a bit like
var input = foo.share();
var output = input.selectMany(function (s) {
return rx.Observable.fromArray(s);
}).groupByUntil(
function (s) { return s.keys()[0]; },
null,
function (g) { return input.filter(
function (s) { return !findkey(s, g.key); }
); }
)
So, group by the id, and dispose of the group when the input stream no longer has the id. This seems to work, but the two uses of input look odd to me, as if there could be a weird order dependency when using a single stream to control both the input of the groupByUntil and the disposal of the groups.
Is there a better way?
update
There is, indeed, a weird timing problem here. fromArray by default uses the currentThread scheduler, which will result in events from that array being interleaved with events from input. The dispose conditions on the group are then evaluated at the wrong time (before the groups from the previous input have been processed).
A possible workaround is to do fromArray(.., rx.Scheduler.immediate), which will keep the grouped events in sync with input.
Yeah, the only alternative I can think of is to manage the state yourself. I don't know that it is better, though.
var d = Object.create(null);
var output = input
.flatMap(function (s) {
// end completed groups
Object
.keys(d)
.filter(function (k) { return !findKey(s, k); })
.forEach(function (k) {
d[k].onNext(1);
d[k].onCompleted();
delete d[k];
});
return Rx.Observable.fromArray(s);
})
.groupByUntil(
function (s) { return s.keys()[0]; },
null,
function (g) { return d[g.key] = new Rx.AsyncSubject(); });
