large arrays in dependent observables - cascading - javascript

I am using Knockout JS, as business requirements dictate that most, if not all, logic is processed in the browser because of low-bandwidth users. It's working out awesome so far, except for one issue.
I am using a number of multiselect dropdown lists that all contain cascading logic. I have, say, 8 lists that process hierarchical data and alter the selectable options in child lists.
This is all good until I get to the bottom 2 lists, which could potentially contain 3000 items depending on the parent list selections (especially when 'select all' is clicked).
The problem is that in IE I'm getting long-running script warning messages, which I need to get rid of. Here's some code:
viewModel.BottomLevelList = ko.dependentObservable(function () {
    if (this.ParentList().length === 0) { // nothing selected
        return [];
    }
    var result = [];
    var i = self.longMasterList.length;
    var currentId = 0;
    while (i--) {
        // pseudo-code:
        // if this.ParentList().Contains(loop-item) then
        //     put in return list based on some further logic
        // else continue
    }
    return result;
}, viewModel);
I have tried various setTimeout techniques from SO to break the large array up and return control momentarily to the browser, but with no success. The result is never returned and/or the observable seems to detach itself, leaving an empty list in the UI.
If I need to use AJAX I will, but this is a very last resort and would prefer to keep it in the client.
So my question boils down to:
How can I stop long-running script warnings caused by processing large data sets (in the context of Knockout JS dependent observables and cascading lists)?
Is there some idiomatic JavaScript technique I could or should be using in this scenario?
Am I not seeing the wood for the trees here?!
Thanks muchly for any help

I would first suggest that you optimize your dependentObservable.
When you read any observable, Knockout registers a dependency on it in its dependency manager, which contains code roughly like this:
function registerDependency(observable) {
    if (ko.utils.arrayIndexOf(dependencies, observable) < 0) {
        dependencies.push(observable);
    }
}
I can see in your pseudo-code that you access this.ParentList() inside the while loop. That means registerDependency is called 3000 times and the dependencies array is scanned 3000 times, which is particularly bad in IE (it has no built-in Array.indexOf method to fall back on).
So my number one suggestion would be: read all observables before your loops.
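For example, here is a minimal sketch of the same dependentObservable with the observable read hoisted out of the loop (item.parentId is a stand-in for whatever links a master item to its parent list):
viewModel.BottomLevelList = ko.dependentObservable(function () {
    var parentItems = this.ParentList(); // read once: one dependency registration
    if (parentItems.length === 0) {
        return [];
    }
    var result = [];
    var i = self.longMasterList.length;
    while (i--) {
        var item = self.longMasterList[i];
        // from here on it is plain array work - no observable reads in the loop
        if (ko.utils.arrayIndexOf(parentItems, item.parentId) !== -1) {
            result.push(item);
        }
    }
    return result;
}, viewModel);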
If that doesn't help, I suggest you proceed with setTimeout(). It is a bit tricky. Please check out this example: http://jsfiddle.net/romanych/4KGAv/3/
I have defined asyncObservable. You pass it an array with all the dependencies of your dependentObservable. When ko.toJS is called, all observables in that array are unwrapped. Then the callback function you passed is called with the unwrapped values of the dependencies array as arguments, and it is evaluated asynchronously. I have wrapped this code in a ko.dependentObservable so that the loader callback is re-evaluated whenever any of the observables passed in dependencies changes.
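A rough reconstruction of what the fiddle defines, just to show the shape of the idea (the names and the loader signature are assumptions based on the description above; the fiddle itself is authoritative):
ko.asyncObservable = function (dependencies, loader) {
    var result = ko.observable([]);
    ko.dependentObservable(function () {
        // ko.toJS reads every observable in the array, which both unwraps
        // them and registers them as dependencies of this computed
        var values = ko.toJS(dependencies);
        setTimeout(function () {
            // the heavy work runs in a later turn, outside this computed,
            // and writes its output into the result observable
            loader.apply(null, values.concat([result]));
        }, 0);
    });
    return result;
};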
UPDATE:
My code was overcomplicated for this issue. The throttle extender will do the trick. Please check out this sample: http://jsfiddle.net/romanych/JNwhb/1/
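The core of it is just a sketch like this (throttle is a standard Knockout extender, later superseded by rateLimit in Knockout 3.x):
viewModel.BottomLevelList = ko.dependentObservable(function () {
    // ... same computation as before ...
}, viewModel).extend({ throttle: 400 }); // waits until changes stop for 400 ms before re-evaluating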

Related

ConcurrentModificationException in amazon neptune using gremlin javascript language variant

I am trying to check and insert 1000 vertices in chunks using Promise.all(). The code is as follows:
public async createManyByKey(label: string, key: string, properties: object[]): Promise<T[]> {
  const promises = [];
  const allVertices = __.addV(label);
  const propKeys: Array<string> = Object.keys(properties[0]);
  for (const propKey of propKeys) {
    allVertices.property(propKey, __.select(propKey));
  }
  const chunkedProperties = chunk(properties, 5); // [["demo-1", "demo-2", "demo-3", "demo-4", "demo-5"], [...], ...]
  for (const property of chunkedProperties) {
    const singleQuery = this.g.withSideEffect('User', property)
      .inject(property)
      .unfold().as('data')
      .coalesce(__.V().hasLabel(label).where(eq('data')).by(key).by(__.select(key)), allVertices)
      .iterate();
    promises.push(singleQuery);
  }
  const result = await Promise.all(promises);
  return result;
}
This code throws a ConcurrentModificationException. I need help fixing or working around this issue.
I'm not quite sure about the data and parameters you are using, but I needed to modify your query a bit to get it to work with a data set I have handy (air routes), as shown below. I did this to help me think through what your query is doing. I had to change the second by step; I'm not sure how it was working otherwise.
gremlin> g.inject(['AUS','ATL','XXX']).unfold().as('d').
......1> coalesce(__.V().hasLabel('airport').limit(10).
......2> where(eq('d')).
......3> by('code').
......4> by(),
......5> constant('X'))
==>v['3']
==>v['1']
==>X
While a query like this runs fine in isolation, once you start running several asynchronous promises (that contain mutating steps, as in your query), one promise can try to access a part of the graph that is locked by another. Even though the execution is, I believe, more "concurrent" than truly "parallel", if one promise yields due to an IO wait and allows another to run, the next one may fail if the prior promise already holds locks in the database that the next one also needs. In your case, as you have a coalesce that references all vertices with a given label and properties, that can potentially cause conflicting locks to be taken. Perhaps it will work better if you await after each for-loop iteration rather than doing it all at the end in one big Promise.all, as sketched below.
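For example, the same traversal as in the question, just serialized:
for (const property of chunkedProperties) {
  // await each chunk so its locks are released before the next write starts
  await this.g.withSideEffect('User', property)
    .inject(property)
    .unfold().as('data')
    .coalesce(__.V().hasLabel(label).where(eq('data')).by(key).by(__.select(key)), allVertices)
    .iterate();
}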
Something else to keep in mind is that this query is going to be somewhat expensive regardless, as the mid-traversal V is going to run five times (in the case of your example) for each for-loop iteration. This is because the unfold of the injected data is taken from chunks of size 5 and therefore spawns five traversers, each of which starts by looking at V.
EDITED 2021-11-17
As discussed a little in the comments, I suspect the most optimal path is actually to use multiple queries. The first query simply does a g.V(id1, id2, ...) on all the IDs you are potentially going to add, and returns the list of IDs found. Remove those from the set to add. Next, break the adding part up into batches and do it without coalesce, as you now know that those elements do not exist. This is most likely the best way to reduce locking and avoid the CMEs (exceptions). Unless someone else may also be trying to add them in parallel, this is the approach I think I would take.
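A sketch of that two-phase shape (ids is an assumed array of the vertex IDs you want to end up with; chunk is lodash's, as in the question, and t comes from gremlin.process):
// Phase 1: one read-only query to find what already exists
const existing = await g.V(...ids).id().toList();
const missing = ids.filter(id => !existing.includes(id));

// Phase 2: plain addV batches for the rest - no coalesce, so far less locking
for (const batch of chunk(missing, 50)) {
  let traversal = g;
  for (const id of batch) {
    traversal = traversal.addV(label).property(t.id, id); // Neptune allows user-supplied IDs
  }
  await traversal.iterate();
}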

Fast way to get unique attribute list from object list

I have a list containing a large number of Test objects:
Test: { name: '', creator: '' }
I need to extract a unique list of creators from that list. I have tried:
const creators = Array.from(new Set(tests.map(t => t.creator)));
This works fine, but due to rapid changes I have to call it again and again, so it takes more time and lags the UI. How can I build this array more efficiently than the current implementation?
EDIT:
Context information:
I require a unique element array (not a Set), because it is rendered in the UI by:
tests.map(test => <Test data={test}/>)
You can use the lodash library's uniqBy function to overcome this issue.
Documentation: https://lodash.com/docs#uniqBy
Example:
_.uniqBy(data, function (e) {
    return e.creator;
});
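lodash also accepts the property shorthand _.uniqBy(data, 'creator'). If the repeated recomputation is what lags the UI, another option is to maintain the unique list incrementally instead of rebuilding it on every change. A sketch, assuming you control the place where new tests are added:
const seen = new Set();
const creators = []; // stays an array, so it can be rendered directly

function addTest(test) {
    tests.push(test);
    if (!seen.has(test.creator)) {
        seen.add(test.creator);
        creators.push(test.creator); // O(1) per insert instead of an O(n) rebuild
    }
}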

Temporarily accumulate objects depending on the state of a different stream

I've been trying to teach myself FRP (and bacon.js specifically) by diving in head first on a new project. I've gotten pretty far on my own but recently ran into a problem that I can't seem to fight my way through:
I have an interface with a set of clickable objects. When an object is clicked, detailed information for that object is loaded in a panel to the right.
What I need is the ability to select multiple objects, accumulating them into an array and showing a "bulk actions" panel when more than one is selected.
So far I have:
a SelectMultiple boolean property that represents the current UI mode
a CurrentObject stream that holds the currently selected object
I've gotten somewhat close with this:
var SelectedObjects = CurrentObject.filter(SelectMultiple).skipDuplicates().scan([], function (a, b) {
    return a.concat([b]);
});
There are a few problems:
The value of SelectedObjects represents the objects selected over all time; it doesn't reset when the SelectMultiple state changes.
The value of SelectedObjects does not include the original CurrentObject (of course, because the scan accumulator seed is an empty array, not the current value of CurrentObject).
The fact that I'm looking to use the current value of a property directly seems to be a hint that there's a fundamental issue here. I have a notion that the answer involves flatMapLatest and spawning a new stream every time SelectMultiple changes, funneling selected objects into this new stream and accumulating, but I can't quite work out what that should look like.
Of course there is an additional problem that skipDuplicates only skips consecutive duplicates. I can probably work this one out on my own but a solution that addresses that issue would be ideal.
Any suggestions would be greatly appreciated!
This might work (CoffeeScript):
# selectMultiple - Property[Boolean]: whether in multiselect mode
# selectedObject - Property[Object]: latest selected object
selectedObjects = selectMultiple.flatMapLatest((multiple) ->
  if !multiple
    selectedObject.map((obj) -> [obj])
  else
    selectedObject.scan([], (xs, x) ->
      xs.concat(x)
    )
).toProperty()
On each value of selectMultiple flag we start a new stream that'll either just track the current single selection or start accumulating from the single selection, adding items as they're selected. It doesn't support de-selection by toggling, but that's straightforward to add into the scan part.
OK, I figured out a solution. I realized that I could use a dynamically-sized slidingWindow combinator; I found the basis for the answer in the Implementing Snake in Bacon.js tutorial.
I got an error when I tried adding it directly to the Bacon prototype (as described in the tutorial), so I just made a function that takes the stream to observe and a boolean property that determines whether it should capture values:
slidingWindowWhile = function (sourceStream, toTakeOrNotToTake) {
    return new Bacon.EventStream(function (sink) {
        var buf = [];
        var take = false;
        sourceStream.onValue(function (x) {
            if (!take) {
                buf = [];
            }
            buf.push(x);
            sink(new Bacon.Next(buf));
        });
        toTakeOrNotToTake.onValue(function (v) {
            take = v;
        });
    });
};
It still seems like there should be a way to do this without using local variables to track state, but at least this solution is pretty well encapsulated.
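Hooking it up to the streams from the question might look like this (bulkPanel is a hypothetical stand-in for whatever renders the bulk-actions panel):
var SelectedObjects = slidingWindowWhile(CurrentObject, SelectMultiple);
SelectedObjects.onValue(function (objs) {
    bulkPanel.toggle(objs.length > 1); // hypothetical UI hook
});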

How do I directly assign values to an array but use push method on OOB exception?

It is my understanding that, from a performance perspective, direct assignment is more desirable than .push() when populating an array.
My code is currently as follows:
for each (var e in Collection) {
    do {
        DB_Query().forEach(function (e) { data.push([e.title, e.id]); });
    } while (pageToken);
}
The DB_Query() method runs a Google Drive query and returns a list.
My issue arises because DB_Query() can return a list of variable length. As such, if I construct data = new Array(100), direct assignment has the potential to go out of bounds.
Is there a method by which I could try and catch an out-of-bounds exception so that values are directly assigned for the 100 pre-allocated indices, but .push() is used for any overflow? The expectation here is that an OOB exception will not occur often.
Also, I'm not sure if it matters, but I am clearing the array after a counter variable is >= 100 using the following method:
while (data.length > 0) { data.pop(); }
In JavaScript, if you set a value at an index bigger than the array length, the array automatically "stretches" to fit it, so there's no need to bother with this. If you can make a good guess about your array size, go for it. For example:
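var data = new Array(100);
data[250] = 'x'; // no exception is thrown
data.length;     // 251 - the array grew to fit the index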
About your clearing loop: that's correct, and it seems that pop is indeed the fastest way.
My original suggestion was to set the array length back to zero: data.length = 0;
Now a tip that I think really makes a performance difference here: you're worrying about the wrong part!
In Apps Script, what takes long is not resizing arrays dynamically or transforming your data; that part is fast. The issue is always with the "API calls", that is, using UrlFetch or Spreadsheet.Range.getValue and so on.
You should take care to make the minimum number of API calls possible, and in your case (I'm guessing now, since I haven't seen your whole code) you seem to be doing it wrong. If DB_Query is costly (in API-call terms) you should not have it nested inside two loops. The best solution usually involves figuring out everything you'll need beforehand (do as many loops as you need, as long as they don't make API calls), then passing all the parameters to one bulk operation and gathering everything at once (in one API call), even if that means getting more data than you need. Then, with the whole data set at hand, loop through and transform it as required (that's the fast part).
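A sketch of that shape (DB_QueryAll is a hypothetical bulk variant of DB_Query, and e.key is a stand-in for your real lookup key; the point is the single expensive call):
// 1. Cheap, API-free loops: collect everything you'll need
var keys = [];
for each (var e in Collection) {
    keys.push(e.key);
}

// 2. One expensive call instead of one per collection entry
var rows = DB_QueryAll(keys); // hypothetical bulk query

// 3. Fast in-memory transformation
var data = rows.map(function (r) { return [r.title, r.id]; });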

Select/Order data from two tables

I have two tables holding game data for two different games. For the sake of simplicity, let's say they only share a column called Timestamp (in particular they have a different number of columns). I want to render a list holding information from both tables, simultaneously ordered by Timestamp.
What I'm currently doing works, but I'd take almost any bet that there is a much better way to do this. I'm mostly concerned about performance at some point (mobile app). This is a stub representing the structure – believe me, I know how horrible this looks right now. I just wanted to make it work first, now I'm looking for improvements. ;)
var readyA = false,
    readyB = false;
var dataA,
    dataB;

function doLoop() {
    setTimeout(renderData, 100);
}

function renderData() {
    if (!readyA || !readyB) {
        doLoop();
        return;
    }
    var dataAll = dataA.concat(dataB);
    dataAll.sort(function (a, b) {
        return a['Timestamp'] <= b['Timestamp'];
    });
    // pass data into a template depending on which game it is from and render it
    // ...
}

// wait for both queries to finish
doLoop();

// select data from game A
myDatabaseClass.query('SELECT ... FROM GameA', function (results) {
    dataA = new Array(results.rows.length);
    for (var i = 0; i < results.rows.length; i++) {
        dataA[i] = results.rows.item(i);
    }
    readyA = true;
});

// select data from game B
myDatabaseClass.query('SELECT ... FROM GameB', function (results) {
    dataB = new Array(results.rows.length);
    for (var i = 0; i < results.rows.length; i++) {
        dataB[i] = results.rows.item(i);
    }
    readyB = true;
});
The question now boils down to whether I can somehow simplify this with some kind of UNION or JOIN in the query. Obviously the timeout construction is horrible, but it would collapse into a simple callback function if the querying could be done in one query (or at least one transaction – the database class can handle that).
Edit: I did find this (Pull from two different tables and order), but this whole NULL AS some_column business feels dirty. Is there really no better alternative?
The result of a query is always a single table with a fixed number of columns, so all the SELECTs must have the same number of columns:
SELECT a1, a2, a3, Timestamp FROM GameA
UNION ALL
SELECT b1, b2, NULL, Timestamp FROM GameB
ORDER BY Timestamp
(UNION ALL is faster than UNION because it doesn't try to remove duplicates.)
Your code is pretty good. From the point of view of a SQL hacker like me, you're doing the UNION and the ORDER BY on the client side, and there's nothing wrong with that. You seem to be doing it almost right: your concat is the client-side equivalent of UNION, and your sort is the equivalent of ORDER BY.
You say that the NULL-as-missing-column construction feels somehow dirty when you use server-side UNION operations. But obviously, to treat two different result sets as the same so you can sort them in order, you have to make them conform to each other somehow. Your a['Timestamp'] <= b['Timestamp'] ordering criterion in your sort function is also a scheme for conforming two result sets to each other. It may be lower-performance than using a UNION.
Don't be afraid of using NULL as missing-column to make two result sets in a UNION conform to each other. It's not dirty, and it's not expensive.
Do consider limiting your SELECT operation somehow, perhaps by a range of timestamps. That will allow your system to scale up, especially if you put an index on the column you use to limit the SELECT.
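A sketch of that, folded into the question's myDatabaseClass (since is an assumed lower-bound variable; how you pass parameters depends on your database class's actual API):
myDatabaseClass.query(
    "SELECT a1, a2, a3, Timestamp FROM GameA WHERE Timestamp >= " + since +
    " UNION ALL " +
    "SELECT b1, b2, NULL, Timestamp FROM GameB WHERE Timestamp >= " + since +
    " ORDER BY Timestamp",
    function (results) {
        // results arrive merged and sorted - no readyA/readyB polling needed
    }
);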
(By the way, your sort function has a mistake in it. Sort comparators need to return a negative number, zero, or a positive number depending on whether the first item is less than, equal to, or greater than the second one. You're returning a true/false value, which doesn't sort properly.)
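A corrected comparator, assuming Timestamp is numeric (use string comparison or date parsing otherwise):
dataAll.sort(function (a, b) {
    return a['Timestamp'] - b['Timestamp']; // negative / zero / positive
});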
(You're parallelizing the two queries to the same MySQL instance. That's clever, but in truth it's probably a formula for overloading MySQL as your game scales up. Keep in mind that each user of your game has her own machine running JavaScript, but they all share your MySQL.)
