rx: unfold array to multiple streams - javascript

I have a stream holding an array, each element of which has an id. I need to split this into a stream per id, which will complete when the source stream no longer carries the id.
E.g. input stream sequence with these three values
[{a:1}, {b:1}] [{a:2}, {b:2}, {c:1}] [{b:3}, {c:2}]
should return three streams
a -> 1 2 |
b -> 1 2 3
c -> 1 2
Where a has completed on the 3rd value, since its id is gone, and c has been created on the 2nd value, since its id has appeared.
I'm trying groupByUntil, a bit like
var input = foo.share();
var output = input.selectMany(function (s) {
    return Rx.Observable.fromArray(s);
}).groupByUntil(
    function (s) { return Object.keys(s)[0]; },
    null,
    function (g) {
        return input.filter(
            function (s) { return !findKey(s, g.key); });
    });
So, group by the id, and dispose of the group when the input stream no longer has the id. This seems to work, but the two uses of input look odd to me, like there could be a weird order dependency when using a single stream both to control the input of the groupByUntil and the disposal of the groups.
Is there a better way?
update
There is, indeed, a weird timing problem here. fromArray by default uses the currentThread scheduler, which results in events from that array being interleaved with events from input. The dispose conditions on the groups are then evaluated at the wrong time (before the groups from the previous input have been processed).
A possible workaround is to do fromArray(..., Rx.Scheduler.immediate), which keeps the grouped events in sync with input.
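For reference, a sketch of the original approach with that workaround applied (still assuming the same findKey helper as in the code above; hedged, not tested against a specific Rx 4 version):
var input = foo.share();
var output = input.selectMany(function (s) {
    // the immediate scheduler keeps the unfolded events in sync with `input`
    return Rx.Observable.fromArray(s, Rx.Scheduler.immediate);
}).groupByUntil(
    function (s) { return Object.keys(s)[0]; },
    null,
    function (g) {
        // dispose of the group when `input` no longer carries its id
        return input.filter(function (s) { return !findKey(s, g.key); });
    });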

Yeah, the only alternative I can think of is to manage the state yourself. I don't know that it is better, though.
var d = Object.create(null);
var output = input
    .flatMap(function (s) {
        // end completed groups
        Object
            .keys(d)
            .filter(function (k) { return !findKey(s, k); })
            .forEach(function (k) {
                d[k].onNext(1);
                d[k].onCompleted();
                delete d[k];
            });
        return Rx.Observable.fromArray(s);
    })
    .groupByUntil(
        function (s) { return Object.keys(s)[0]; },
        null,
        function (g) { return d[g.key] = new Rx.AsyncSubject(); });
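In either case, output is a stream of grouped observables, one per id; a minimal consumption sketch (assuming Rx 4 grouped observables, which carry a .key property):
output.subscribe(function (group) {
    // each group emits the unfolded elements carrying its id
    group.subscribe(
        function (s) { console.log(group.key, '->', s[group.key]); },
        function (err) { console.error(err); },
        function () { console.log(group.key, 'completed'); });
});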


dc.js Using two reducers without a simple dimension and second grouping stage

Quick question following up on my response from this post:
dc.js Box plot reducer using two groups
Just trying to fully get my head around reducers and how to filter and collect data so I'll step through my understanding first.
Data Format:
{
    "SSID": "eduroam",
    "identifier": "Client",
    "latitude": 52.4505,
    "longitude": -1.9361,
    "mac": "dc:d9:16:##:##:##",
    "packet": "PR-REQ",
    "timestamp": "2018-07-10 12:25:26",
    "vendor": "Huawei Technologies Co.Ltd"
}
(1) Using the following should result in an output array of key/value pairs (key: MAC address, value: count of networks connected to):
var MacCountsGroup = mac.group().reduce(
    function (p, v) {
        p[v.mac] = (p[v.mac] || 0) + v.counter;
        return p;
    },
    function (p, v) {
        p[v.mac] -= v.counter;
        return p;
    },
    function () {
        return {}; // KV Pair of MAC -> Count
    }
);
(2) Then, in order to use the object, it must be flattened so it can be passed to a chart as follows:
function flatten_object_group(group) {
    return {
        all: function () {
            return group.all().map(function (kv) {
                return {
                    key: kv.key,
                    value: Object.values(kv.value).filter(function (v) {
                        return v > 0;
                    })
                };
            });
        }
    };
}
var connectionsGroup = flatten_object_group(MacCountsGroup);
(3) Then I pass mac as a piechart dimension & connectionsGroup as the group. This gives back a chart with roughly 50,000 slices based on my dataset.
var packetPie = dc.pieChart("#packetPie");
packetPie
    .height(495)
    .width(350)
    .radius(180)
    .renderLabel(true)
    .transitionDuration(1000)
    .dimension(mac)
    .ordinalColors(['#07453E', '#145C54', '#36847B'])
    .group(connectionsGroup);
This works A-OK and I follow everything up to this point.
(4) Now I want to group by the values given out by the first reducer, i.e. I want to combine all of the mac addresses with 1 network connection, 2 network connections, and so on, as slices.
How would this be done as a dimension of "Network connections"? How can I produce this summarized data which doesn't exist in my source data and is generated from mac?
Or would this require an intermediate function between the first reducer and flattening to combine all of the values from the first reducer?
You don't need to do all of that to get a pie chart of mac addresses.
There are a few faulty understandings in points 1-3, which I guess I'll address first. It looks like you copied and pasted code from the previous question, so I'm not really sure if this helps.
(1) If you have a dimension of mac addresses, reducing it like this won't have any further effect. The original idea was to dimension/group by vendor and then reduce counts for each mac address. This reduction will group by mac address and then further count instances of each mac address within each bin, so it's just an object with one key. It will produce a map of key value pairs like
{key: 'MAC-123', value: {'MAC-123': 12}}
(2) This will flatten the object within the values, dropping the keys and producing just an array of counts
{key: 'MAC-123', value: [12]}
(3) Since the pie chart is expecting simple key/value pairs with the value being a number, it is probably unhappy with getting values like the array [12]. The values are probably coerced to NaN.
(4) Okay, here's the real question, and it's actually not as easy as your previous question. We got off easy with the box plot because the "dimension" (in crossfilter terms, the keys you filter and group on) existed in your data.
Let's forget the false lead in points 1-3 above, and start from first principles.
There is no way to look at an individual row of your data and determine, without looking at anything else, if it belongs to the category "has 1 connection", "has 2 connections", etc. Assuming you want to be able to click on slices in the pie chart and filter all the data, we'll have to find another way to implement that.
But first let's look at how to produce a pie chart of "number of network connections". That's a little bit easier, but as far as I know, it does require a true "double reduce".
If we use the default reduction on the mac dimension, we'll get an array of key/value pairs, where the key is a mac address, and the value is the number of connections for that address:
[
    {
        "key": "1c:b7:2c:48",
        "value": 8
    },
    {
        "key": "1c:b7:be:ef",
        "value": 3
    },
    {
        "key": "6c:17:79:03",
        "value": 2
    },
    ...
How do we now produce a key/value array where the key is number of connections, and the value is the array of mac addresses for that number of connections?
Sounds like a job for the lesser-known Array.reduce. This function is the likely inspiration for crossfilter's group.reduce(), but it's a bit simpler: it just walks through an array, combining each value with the result of the last. It's great for producing an object from an array:
var value_keys = macPacketGroup.all().reduce(function(p, kv) {
    if(!p[kv.value])
        p[kv.value] = [];
    p[kv.value].push(kv.key);
    return p;
}, {});
Great:
{
    "1": [
        "b8:1d:ab:d1",
        "dc:d9:16:3a",
        "dc:d9:16:3b"
    ],
    "2": [
        "6c:17:79:03",
        "6c:27:79:04",
        "b8:1d:aa:d1",
        "b8:1d:aa:d2",
        "dc:da:16:3d"
    ],
But we wanted an array of key/value pairs, not an object!
var key_count_value_macs = Object.keys(value_keys)
    .map(k => ({key: k, value: value_keys[k]}));
Great, that looks just like what a "real group" would produce:
[
    {
        "key": "1",
        "value": [
            "b8:1d:ab:d1",
            "dc:d9:16:3a",
            "dc:d9:16:3b"
        ]
    },
    {
        "key": "2",
        "value": [
            "6c:17:79:03",
            "6c:27:79:04",
            "b8:1d:aa:d1",
            "b8:1d:aa:d2",
            "dc:da:16:3d"
        ]
    },
    ...
Wrapping all that in a "fake group", which when asked to produce .all(), queries the original group and does the above transformations:
function value_keys_group(group) {
    return {
        all: function() {
            var value_keys = group.all().reduce(function(p, kv) {
                if(!p[kv.value])
                    p[kv.value] = [];
                p[kv.value].push(kv.key);
                return p;
            }, {});
            return Object.keys(value_keys)
                .map(k => ({key: k, value: value_keys[k]}));
        }
    };
}
Now we can plot the pie chart! The only fancy thing here is that the value accessor should look at the length of the array for each value (instead of assuming the value is just a number):
packetPie
    // ...
    .group(value_keys_group(macPacketGroup))
    .valueAccessor(kv => kv.value.length);
Demo fiddle.
However, clicking on slices won't work. I'll return to that in a minute - just want to hit "save" first!
Part 2: Filtering based on counts
As I remarked at the start, it's not possible to create a crossfilter dimension which will filter based on the count of connections. This is because crossfilter always needs to look at each row and determine, based only on the information in that row, whether it belongs in a group or filter.
If you add another chart at this point and try clicking on a slice, everything in the other charts will disappear. This is because the keys are now counts, and counts are invalid mac addresses, so we're telling it to filter to a key which doesn't exist.
However, we can obviously filter by mac address, and we also know the mac addresses for each count! So this isn't so bad. It just requires a filterHandler.
Although, hmmm, in producing the fake group, we seem to have forgotten value_keys. It's hidden away inside the function, and then let go.
It's a little ugly, but we can fix that:
function value_keys_group(group) {
    var saved_value_keys;
    return {
        all: function() {
            var value_keys = group.all().reduce(function(p, kv) {
                if(!p[kv.value])
                    p[kv.value] = [];
                p[kv.value].push(kv.key);
                return p;
            }, {});
            saved_value_keys = value_keys;
            return Object.keys(value_keys)
                .map(k => ({key: k, value: value_keys[k]}));
        },
        value_keys: function() {
            return saved_value_keys;
        }
    };
}
Now, every time .all() is called (every time the pie chart is drawn), the fake group will stash away the value_keys object. Not a great practice (.value_keys() would return undefined if you called it before .all()), but safe based on the way dc.js works.
With that out of the way, the filterHandler for the pie chart is relatively simple:
packetPie.filterHandler(function(dimension, filters) {
    if(filters.length === 0)
        dimension.filter(null);
    else {
        var value_keys = packetPie.group().value_keys();
        var all_macs = filters.reduce(
            (p, v) => p.concat(value_keys[v]), []);
        dimension.filterFunction(k => all_macs.indexOf(k) !== -1);
    }
    return filters;
});
The interesting line here is another call to Array.reduce. This function is also useful for producing an array from another array, and here we use it just to concatenate all of the values (mac addresses) from all of the selected slices (connection counts).
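For example (hypothetical values, just to show the concatenation pattern):
[['aa:bb', 'cc:dd'], ['ee:ff']].reduce((p, v) => p.concat(v), []);
// => ['aa:bb', 'cc:dd', 'ee:ff']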
Now we have a working filter. It doesn't make too much sense to combine it with the box plot from the last question, but the new fiddle demonstrates that filtering based on number of connections does work.
Part 3: what about zeroes?
As commonly comes up, crossfilter considers a bin with value zero to still exist, so we need to "remove the empty bins". However, in this case, we've added a non-standard method to the first fake group, in order to allow filtering. (We could have just used a global there, but globals are messy.)
So, we need to "pass through" the value_keys method:
function remove_empty_bins_pt(source_group) {
    return {
        all: function () {
            return source_group.all().filter(function(d) {
                return d.key !== '0';
            });
        },
        value_keys: function() {
            return source_group.value_keys();
        }
    };
}
packetPie
    .group(remove_empty_bins_pt(value_keys_group(macPacketGroup)))
Another oddity here is that we are filtering out the key zero, and it's a string here!
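That is because Object.keys always yields strings, even when the keys were written as numbers:
Object.keys({0: ['aa:bb'], 2: ['cc:dd']}) // => ["0", "2"], not [0, 2]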
Demo fiddle!
Alternately, here's a better solution! Do the bin filtering before passing to value_keys_group, and then we can use the ordinary remove_empty_bins!
function remove_empty_bins(source_group) {
    return {
        all: function () {
            return source_group.all().filter(function(d) {
                //return Math.abs(d.value) > 0.00001; // if using floating-point numbers
                return d.value !== 0; // if integers only
            });
        }
    };
}
packetPie
    .group(value_keys_group(remove_empty_bins(macPacketGroup)))
Yet another demo fiddle!!

JavaScript/React Native array(objects) sort

I'm starting out with react-native, building an app to track lap times for my RC cars. I have an arduino with a TCP connection (server), and for each lap this arduino sends the current time/lap to all connected clients like this:
{"tx_id":33,"last_time":123456,"lap":612}
In my program (in react-native), I have one state called dados with this struct:
dados[tx_id] = {
    tx_id: <tx_id>,
    last_time: <last_time>,
    best_lap: 0,
    best_time: 0,
    diff: 0,
    laps: []
};
This program connects to the arduino and, when it receives some data, just pushes it to this state, more specifically into the laps array of each transponder. Finally, I get something like this:
dados[33] = {
    tx_id: 33,
    last_time: 456,
    best_lap: 3455,
    best_time: 32432,
    diff: 32,
    laps: [{lap:1,time:1234},{lap:2,time:32323},{lap:3,time:3242332}]
};
dados[34] = {
    tx_id: 34,
    last_time: 123,
    best_lap: 32234,
    best_time: 335343,
    diff: 10,
    laps: [{lap:1,time:1234},{lap:2,time:32323},{lap:3,time:3242332}]
};
dados[35] = {
    tx_id: 35,
    last_time: 789,
    best_lap: 32234,
    best_time: 335343,
    diff: 8,
    laps: [{lap:1,time:1234},{lap:2,time:32323},{lap:3,time:3242332},{lap:4,time:343232}]
};
This data is rendered to Views using the map function (not a FlatList).
My problem now is that I need to order this before printing it on screen.
Right now, the data is printed in tx_id order, since that's the key of the main array. Is there a way to order this array by the number of elements in the laps property, with the last_time property of each element as the secondary sort key?
In this case, the last tx of my example (35) would be first in the list, because it has one more lap than the other elements. The second item would be 34 (because of last_time), and the third would be tx 33.
Is there any way to do this in JavaScript, or do I need to create custom functions and check every item in a recursive way?
Thanks #crackhead420.
While waiting for a reply to this question, I just found what you said... :)
This is my final test/solution that worked:
var t_teste = this.state.teste;
t_teste[33] = {tx_id: 33, last_time: 998, best_lap: 2, best_time: 123, diff: 0, laps: [{lap:1,time:123},{lap:2,time:456}]};
t_teste[34] = {tx_id: 34, last_time: 123, best_lap: 2, best_time: 123, diff: 0, laps: [{lap:1,time:123},{lap:2,time:456}]};
t_teste[35] = {tx_id: 35, last_time: 456, best_lap: 2, best_time: 123, diff: 0, laps: [{lap:1,time:123},{lap:2,time:456},{lap:3,time:423}]};
t_teste[36] = {tx_id: 36, last_time: 789, best_lap: 2, best_time: 123, diff: 0, laps: [{lap:1,time:123},{lap:2,time:456}]};
console.log('Teste original: ', JSON.stringify(t_teste));
var saida = t_teste.sort(function(a, b) {
    if (a.laps.length > b.laps.length) {
        return -1;
    }
    if (a.laps.length < b.laps.length) {
        return 1;
    }
    // In this case, the laps are equal... so let's check last_time
    if (a.last_time < b.last_time) {
        return -1; // fastest lap (less time) first!
    }
    if (a.last_time > b.last_time) {
        return 1;
    }
    // Return the same
    return 0;
});
console.log('Teste novo: ', JSON.stringify(saida));
console.log('Teste novo: ',JSON.stringify(saida));
Using some simple helper functions, this is definitely possible:
const data = [{tx_id:33,last_time:456,best_lap:3455,best_time:32432,diff:32,laps:[{lap:1,time:1234},{lap:2,time:32323},{lap:3,time:3242332}]},{tx_id:34,last_time:123,best_lap:32234,best_time:335343,diff:10,laps:[{lap:1,time:1234},{lap:2,time:32323},{lap:3,time:3242332}]},{tx_id:35,last_time:789,best_lap:32234,best_time:335343,diff:8,laps:[{lap:1,time:1234},{lap:2,time:32323},{lap:3,time:3242332},{lap:4,time:343232}]}]
const sortBy = fn => (a, b) => -(fn(a) < fn(b)) || +(fn(a) > fn(b))
const sortByLapsLength = sortBy(o => o.laps.length)
const sortByLastTime = sortBy(o => o.last_time)
const sortFn = (a, b) => -sortByLapsLength(a, b) || sortByLastTime(a, b)
data.sort(sortFn)
// show new order of `tx_id`s
console.log(data.map(o => o.tx_id))
sortBy() (more explanation at the link) accepts a function that selects a value as the sorting criteria of a given object. This value must be a string or a number. sortBy() then returns a function that, given two objects, will sort them in ascending order when passed to Array.prototype.sort(). sortFn() uses two of these functions with a logical OR || operator to employ short-circuiting behavior and sort first by laps.length (in descending order, thus the negation -), and then by last_time if two objects' laps.length are equal.
It's possible to sort an object array by their values:
dados.sort(function(a, b) {
    return a.last_time - b.last_time;
});
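If dados is keyed by tx_id as in the question (a sparse array or a plain object), a hedged sketch combining both criteria, using Object.values to get a dense list first:
// Object.values works on a plain object and skips the holes of a sparse array
var lista = Object.values(dados).sort(function (a, b) {
    // more laps first; on a tie, smaller last_time (faster) first
    return (b.laps.length - a.laps.length) || (a.last_time - b.last_time);
});
console.log(lista.map(function (o) { return o.tx_id; })); // [35, 34, 33] for the sample data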

Accumulating and resetting values in a stream

I'm playing with Reactive Programming, using RxJS, and stumbled upon something I'm not sure how to solve.
Let's say we implement a vending machine. You insert a coin, select an item, and the machine dispenses an item and returns change. We'll assume that price is always 1 cent, so inserting a quarter (25 cents) should return 24 cents back, and so on.
The "tricky" part is that I'd like to be able to handle cases like user inserting 2 coins and then selecting an item. Or selecting an item without inserting a coin.
It seems natural to implement inserted coins and selected items as streams. We can then introduce some sort of dependency between these 2 actions — merging or zipping or combining latest.
However, I quickly ran into an issue where I'd like coins to be accumulated up until an item is dispensed but not further. AFAIU, this means I can't use sum or scan since there's no way to "reset" previous accumulation at some point.
Here's an example diagram:
coins:    ---25---5-----10------------|->
acc:      ---25---30----40------------|->
items:    ------------foo-----bar-----|->
combined: ---------30,foo--40,bar--|->
change:   ------------29------39------|->
And a corresponding code:
this.getCoinsStream()
    .scan(function (sum, current) { return sum + current; })
    .combineLatest(this.getSelectedItemsStream(),
        function (cents, item) { return { cents: cents, item: item }; })
    .subscribe(function (pair) {
        dispenseItem(pair.item);
        dispenseChange(pair.cents - 1);
    });
25 and 5 cents were inserted and then "foo" item was selected. Accumulating coins and then combining latest would lead to "foo" being combined with "30" (which is correct) and then "bar" with "40" (which is incorrect; should be "bar" and "10").
I looked through all of the methods for grouping and filtering and don't see anything that I can use.
An alternative solution I could use is to accumulate coins separately. But this introduces state outside of a stream and I'd really like to avoid that:
var centsDeposited = 0;
this.getCoinsStream().subscribe(function (cents) {
    centsDeposited += cents;
});
this.getSelectedItemsStream().subscribe(function (item) {
    dispenseItem(item);
    dispenseChange(centsDeposited - 1);
    centsDeposited = 0;
});
Moreover, this doesn't allow making the streams dependent on each other, for example waiting for a coin to be inserted before a selection can dispense an item.
Am I missing an existing method? What's the best way to achieve something like this: accumulating values up until the moment they need to be merged with another stream, while also waiting for at least one value in the first stream before merging it with one from the second?
You could use your scan/combineLatest approach and then finish the stream with a first followed by a repeat, so that it "starts over" the stream but your Observers would not see it.
var coinStream = Rx.Observable.merge(
    Rx.Observable.fromEvent($('#add5'), 'click').map(5),
    Rx.Observable.fromEvent($('#add10'), 'click').map(10),
    Rx.Observable.fromEvent($('#add25'), 'click').map(25)
);
var selectedStream = Rx.Observable.merge(
    Rx.Observable.fromEvent($('#coke'), 'click').map('Coke'),
    Rx.Observable.fromEvent($('#sprite'), 'click').map('sprite')
);
var $selection = $('#selection');
var $change = $('#change');
function dispense(selection) {
    $selection.text('Dispensed: ' + selection);
    console.log("Dispensing Drink: " + selection);
}
function dispenseChange(change) {
    $change.text('Dispensed change: ' + change);
    console.log("Dispensing Change: " + change);
}
var dispenser = coinStream.scan(function(acc, delta) { return acc + delta; }, 0)
    .combineLatest(selectedStream,
        function(coins, selection) {
            return {coins: coins, selection: selection};
        })
    //Combine latest won't emit until both Observables have a value,
    //so you can safely get the first, which will be the point where
    //both Observables have emitted.
    .first()
    //First will complete the stream above, so use repeat
    //to resubscribe to the stream transparently.
    //You could also do this conditionally with while or doWhile.
    .repeat()
    //If you will only subscribe once, then you won't need this, but
    //here I am showing how to do it with two subscribers.
    .publish();
//Dole out the change
dispenser.pluck('coins')
    .map(function(c) { return c - 1; })
    .subscribe(dispenseChange);
//Get the selection for dispensation
dispenser.pluck('selection').subscribe(dispense);
//Wire it up
dispenser.connect();
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/rxjs/4.0.6/rx.all.js"></script>
<button id="coke">Coke</button>
<button id="sprite">Sprite</button>
<button id="add5">5</button>
<button id="add10">10</button>
<button id="add25">25</button>
<div id="change"></div>
<div id="selection"></div>
Generally speaking you have the following set of equations:
inserted_coins    :: independent source
items             :: independent source
accumulated_coins :: sum(inserted_coins)
accumulated_paid  :: sum(price(items))
change            :: accumulated_coins - accumulated_paid
coins_in_machine  :: 0 when items emits; sum(inserted_coins), starting after the last emission of items, when inserted_coins emits
The hard part is coins_in_machine. You need to switch the source observable based on some emissions from two sources.
function emits(who) {
    return function (x) { console.log([who, ": "].join(" ") + x); };
}
function sum(a, b) { return a + b; }
var inserted_coins = Rx.Observable.fromEvent(document.getElementById("insert"), 'click').map(function (x) { return 15; });
var items = Rx.Observable.fromEvent(document.getElementById("item"), 'click').map(function (x) { return "snickers"; });
console.log("running");
var accumulated_coins = inserted_coins.scan(sum);
var coins_in_machine =
    Rx.Observable.merge(
        items.tap(emits("items")).map(function (x) { return {value: x, flag: 1}; }),
        inserted_coins.tap(emits("coins inserted")).map(function (x) { return {value: x, flag: 0}; }))
    .distinctUntilChanged(function (x) { return x.flag; })
    .flatMapLatest(function (x) {
        switch (x.flag) {
        case 1:
            return Rx.Observable.just(0);
        case 0:
            return inserted_coins.scan(sum, x.value).startWith(x.value);
        }
    })
    .startWith(0);
coins_in_machine.subscribe(emits("coins in machine"));
jsbin : http://jsbin.com/mejoneteyo/edit?html,js,console,output
[UPDATE]
Explanations:
We merge the inserted_coins stream with the items stream, attaching a flag to each so that we know which of the two emitted when we receive a value in the merged stream.
When it is the items stream emitting, we want to put 0 in coins_in_machine. When it is inserted_coins, we want to sum the incoming values, as that sum represents the new amount of coins in the machine. That means the definition of coins_in_machine switches from one stream to another under the logic defined before. That logic is what is implemented in the flatMapLatest.
I use flatMapLatest and not flatMap, as otherwise the coins_in_machine stream would continue to receive emissions from formerly switched-to streams, i.e. duplicated emissions, as in the end there are only ever two streams to and from which we switch. If I may, I would say this is a "close and switch" that we need.
flatMapLatest has to return a stream, so we jump through hoops to return a stream that just emits 0 (and does not block the computer, as using the repeat operator would in that case).
We jump through some extra hoops to make inserted_coins emit the values we want. My first implementation was inserted_coins.scan(sum, 0), and that never worked. The key, and I found this quite tricky, is that by the time we get to that point in the flow, inserted_coins has already emitted one of the values that is part of the sum. That value is the one passed as a parameter to flatMapLatest, but it is no longer in the source, so calling scan after the fact won't get it; it is necessary to take that value from flatMapLatest and reconstitute the correct behaviour.
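In code, the difference is confined to the case 0 branch of the flatMapLatest:
// first attempt: misses the coin that triggered the switch,
// because the source has already emitted it
return inserted_coins.scan(sum, 0);
// working version: seed the scan with that coin and re-emit it
return inserted_coins.scan(sum, x.value).startWith(x.value);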
You can also use Window to group together multiple coin events, and use item selection as the window boundary.
Next we can use zip to acquire the item value.
Notice we instantly try to give out items, so the user does have to insert coins before deciding on an item.
Notice I decided to publish both selectedStream and dispenser for safety reasons: we don't want to cause a race condition where events fire while we're still building up the query and zip becomes unbalanced. That would be a very rare condition, but note that if our sources had been cold Observables, they would pretty much start generating as soon as we subscribe, and we must use Publish to safeguard ourselves.
(Example code shamelessly stolen from paulpdaniels.)
var coinStream = Rx.Observable.merge(
    Rx.Observable.fromEvent($('#add5'), 'click').map(5),
    Rx.Observable.fromEvent($('#add10'), 'click').map(10),
    Rx.Observable.fromEvent($('#add25'), 'click').map(25)
);
var selectedStream = Rx.Observable.merge(
    Rx.Observable.fromEvent($('#coke'), 'click').map('Coke'),
    Rx.Observable.fromEvent($('#sprite'), 'click').map('Sprite')
).publish();
var $selection = $('#selection');
var $change = $('#change');
function dispense(selection) {
    $selection.text('Dispensed: ' + selection);
    console.log("Dispensing Drink: " + selection);
}
function dispenseChange(change) {
    $change.text('Dispensed change: ' + change);
    console.log("Dispensing Change: " + change);
}
// Build the query.
var dispenser = Rx.Observable.zip(
    coinStream
        .window(selectedStream)
        .flatMap(ob => ob.reduce((acc, cur) => acc + cur, 0)),
    selectedStream,
    (coins, selection) => ({coins: coins, selection: selection})
).filter(pay => pay.coins != 0) // Do not give out items if there are no coins.
    .publish();
var dispose = new Rx.CompositeDisposable(
    //Dole out the change
    dispenser
        .pluck('coins')
        .map(function(c) { return c - 1; })
        .subscribe(dispenseChange),
    //Get the selection for dispensation
    dispenser
        .pluck('selection')
        .subscribe(dispense),
    //Wire it up
    dispenser.connect(),
    selectedStream.connect()
);
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/rxjs/4.0.6/rx.all.js"></script>
<button id="coke">Coke</button>
<button id="sprite">Sprite</button>
<button id="add5">5</button>
<button id="add10">10</button>
<button id="add25">25</button>
<div id="change"></div>
<div id="selection"></div>

Ember store adding attributes incorrectly

I'm using the latest version of ember-cli, ember-data, ember-localstorage-adapter, and ember.
I have a Node object which has a parent and children. Since I had issues with creating multiple relationships with the same type of object, I decided to store the parentID in a string and the childIDs in an array of strings. However, when I create a new Node and try to add the new Node's ID to the parent's array of IDs, the ID ends up being added to the correct parent, but also to other parents.
level 1      0
           /   \
level 2   1     2
          |     |
level 3   3     4
In a structure like this, 0, 1, and 2 all have correct child and parent IDs. However, after adding 3 and 4, node 1's and node 2's childIDs are both [3, 4], instead of [3] and [4] respectively.
The Array attribute:
var ArrayTransform = DS.Transform.extend({
    serialize: function(value) {
        if (!value) {
            return [];
        }
        return value;
    },
    deserialize: function(value) {
        if (!value) {
            return [];
        }
        return value;
    }
});
The insertNode code:
insert: function(elem) {
    var i,
        _store = elem.node.store,
        newNodeJSON = elem.node.serialize();
    newNodeJSON.childIds = [];
    newNodeJSON.level = getNextLevel();
    _store.filter('node', function(node) {
        return node.get('level') === newNodeJSON.level - 1;
    }).then(function(prevLevelNodes) {
        // if no other nodes yet
        if (prevLevelNodes.toArray().length === 0) {
            makeNewNode(_store, newNodeJSON, elem.node);
        }
        // else, generate one node per node in the previous level
        else {
            prevLevelNodes.toArray().forEach(function(node, idx) {
                newNodeJSON.parentId = node.get('id');
                makeNewNode(_store, newNodeJSON, elem.node);
            });
        }
    });
}
var makeNewNode = function(_store, newNodeJSON, node) {
    console.log(newNodeJSON.parentId); // returns correct value
    var newNode = _store.createRecord('node', newNodeJSON);
    newNode.save();
    var newNodeId = newNode.get('id');
    if (newNode.get('parentId')) {
        _store.find('node', newNode.get('parentId')).then(function(n) {
            var cids = n.get('childIds');
            console.log(newNodeId); // returns expected value
            console.log(cids); // DOESN'T RETURN AN EMPTY ARRAY: returns array with [3,4]
            cids.push(newNodeId);
            console.log(n.get('childIds')); // returns array with [3,4]
            n.save();
        });
    }
};
To top it off, this error happens 90% of the time, but 10% of the time it performs as expected. This seems to suggest there's some sort of race condition, but I'm not sure where that would even be. Some places I feel might be causing issues: the ember-cli compilation, passing the entire _store in when making a new node, ember-data being weird, ember-localstorage-adapter being funky... no clue.
For anyone else who may have this problem in the future: the problem lies in two things.
In ArrayTransform, I am typically returning the value without modification.
In my insert code, I'm passing the same JSON object that I defined at the top of the function to makeNewNode.
This JSON contains a reference to a single childIds array; therefore, each new node that gets created uses this same reference for its childIds. Although this doesn't quite explain why the cids array wasn't empty before the push executed (perhaps this is some sort of compiler oddity or console printing lag), it explains why both Level 3 children were in both Level 2 parents' childIds arrays.
tl;dr: pass by value vs pass by reference error
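A minimal sketch of the fix, hedged since the full app code isn't shown: deep-copy the template before each createRecord so every node gets its own childIds array (Ember.copy with deep = true clones nested arrays):
var makeNewNode = function(_store, newNodeJSON, node) {
    // deep copy: each record gets its own childIds array instead of
    // all nodes sharing one reference
    var json = Ember.copy(newNodeJSON, true);
    var newNode = _store.createRecord('node', json);
    newNode.save();
    // ... rest as above
};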

mongo/mongoid MapReduce on batch inserted documents

I'm creating my batch and inserting it into the collection using the commands specified below:
batch = []
time = 1.day.ago
(1..2000).each do |i|
  a = { :name => 'invbatch2k' + i.to_s,
        :user_id => BSON::ObjectId.from_string('533956cd4d616323cf000000'),
        :out_id => 'out',
        :created_at => time,
        :updated_at => time,
        :random => '0.5' }
  batch.push a
end
Invitation.collection.insert batch
As stated above, every single invitation record has its user_id field set to '533956cd4d616323cf000000'.
After inserting my batch with created_at: 1.day.ago, I get:
2.1.1 :102 > Invitation.lte(created_at: 1.week.ago).count
=> 48
2.1.1 :103 > Invitation.lte(created_at: Date.today).count
=> 2048
also:
2.1.1 :104 > Invitation.lte(created_at: 1.week.ago).where(user_id: '533956cd4d616323cf000000').count
=> 14
2.1.1 :105 > Invitation.where(user_id: '533956cd4d616323cf000000').count
=> 2014
Also, I've got a map-reduce which counts the invitations sent by each unique user (both total and sent to unique out_ids):
class Invitation
  [...]
  def self.get_user_invites_count
    map = %q{
      function() {
        var user_id = this.user_id;
        emit(user_id, {user_id: this.user_id, out_id: this.out_id, count: 1, countUnique: 1})
      }
    }
    reduce = %q{
      function(key, values) {
        var result = {
          user_id: key,
          count: 0,
          countUnique: 0
        };
        var values_arr = [];
        values.forEach(function(value) {
          values_arr.push(value.out_id);
          result.count += 1
        });
        var unique = values_arr.filter(function(item, i, ar){ return ar.indexOf(item) === i; });
        result.countUnique = unique.length;
        return result;
      }
    }
    map_reduce(map, reduce).out(inline: true).to_a.map{ |d| d['value'] } rescue []
  end
end
The issue is:
Invitation.lte(created_at: Date.today.end_of_day).get_user_invites_count
returns
[{"user_id"=>BSON::ObjectId('533956cd4d616323cf000000'), "count"=>49.0, "countUnique"=>2.0} ...]
instead of "count" => 2014, "countUnique" => 6.0 while:
Invitation.lte(created_at: 1.week.ago).get_user_invites_count returns:
[{"user_id"=>BSON::ObjectId('533956cd4d616323cf000000'), "count"=>14.0, "countUnique"=>6.0} ...]
The data provided by the query is accurate before inserting the batch.
I can't wrap my head around what's going on here. Am I missing something?
The part of the documentation that you seem to have missed appears to be the problem here:
MongoDB can invoke the reduce function more than once for the same key. In this case, the previous output from the reduce function for that key will become one of the input values to the next reduce function invocation for that key.
And also later:
the type of the return object must be identical to the type of the value emitted by the map function to ensure that the following operations is true:
So what you see is that your reduce function is returning a signature different from the input it receives from the mapper. This is important since the reducer may not get all of the values for a given key in a single pass. Instead it gets some of them, "reduces" the result, and that reduced output may be combined with other values for the key (possibly also reduced) in a further pass through the reduce function.
As a result of your fields not matching, subsequent reduce passes do not see those values, and they do not count towards your totals. So you need to align the signatures of the values:
def self.get_user_invites_count
  map = %q{
    function() {
      var user_id = this.user_id;
      emit(user_id, {out_id: this.out_id, count: 1, countUnique: 0})
    }
  }
  reduce = %q{
    function(key, values) {
      var result = {
        out_id: null,
        count: 0,
        countUnique: 0
      };
      var values_arr = [];
      values.forEach(function(value) {
        if (value.out_id != null)
          values_arr.push(value.out_id);
        result.count += value.count;
        result.countUnique += value.countUnique;
      });
      var unique = values_arr.filter(function(item, i, ar){ return ar.indexOf(item) === i; });
      result.countUnique += unique.length;
      return result;
    }
  }
  map_reduce(map, reduce).out(inline: true).to_a.map{ |d| d['value'] } rescue []
end
You also do not need user_id in the emitted values, or to keep it, as it is already the "key" value for the mapReduce. The remaining alterations consider that both "count" and "countUnique" can contain an existing value that needs to be carried forward, whereas before you were simply resetting the value to 0 on each pass.
Then of course, if the "input" has already been through a "reduce" pass, you do not need its "out_id" value to be filtered for "uniqueness", as you already have the count and that is now included. So null out_id values are not added to the array of things to count, and the unique count is "added" to the running total rather than replacing it.
So the reducer does get called several times. For 20 key values the input will likely not be split, which is why your sample with less input works. For pretty much anything more than that, the "groups" of the same key values will be split up, which is how mapReduce optimizes for large data processing. As the "reduced" output will be sent back to the reducer again, you need to be mindful that you are considering the values you already sent to output in the previous pass.
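To see the requirement in action, here is a small plain-JavaScript simulation of a split reduce pass (illustrative only; treat reduce as the aligned JavaScript function from the answer above, and the mapped values are made up):
var mapped = [
    { out_id: 'out',   count: 1, countUnique: 0 },
    { out_id: 'out',   count: 1, countUnique: 0 },
    { out_id: 'other', count: 1, countUnique: 0 }
];
// single pass over all mapped values
var onePass = reduce('user', mapped);
// split pass: the previous reduce output becomes one of the inputs
var twoPass = reduce('user', [reduce('user', mapped.slice(0, 2)), mapped[2]]);
console.log(onePass.count, onePass.countUnique); // 3 2
console.log(twoPass.count, twoPass.countUnique); // 3 2 (same totals for this input)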
