Conflicting benchmark tests while looping over nested arrays - javascript

Info
I am trying to create an effective way to loop over an array with nested objects/arrays, as the function looping over the data will run frequently and filter values based on whether they match records in another array of objects.
The data I am working with has the following types.
type WatchedShows = {
showId: number;
season: number;
episode: number;
}[];
type Shows = {
id: number;
seasons: {
season: {
season_number: number;
episodes: {
episode_number: number;
}[];
};
}[];
}[];
The data in WatchedShows is in my own database, so I can control how it is sent to the frontend, but the data in Shows comes from an external API.
What I am trying to do is filter out all episodes in Shows that match the data in WatchedShows, and also filter out seasons and whole shows if everything in them is marked as watched.
Problem
Currently I have 3 different solutions, but the first was 3 nested loops and became slow quickly with some data, so realistically I have 2 solutions. I also need to return the data in the format of Shows.
I have now run them through a benchmark and also tested a bit with timers, checking how many iterations each of them performs. Looking at the results from the test I did myself, there is a clear winner: one has a run time of ~5ms and 42637 iterations, while the other has a run time of ~15ms and 884052 iterations.
I then ran them through JSBench.me, and there it's the opposite solution that comes out as the winner, with the website saying it's 90% faster. How can that happen when that solution has 884052 iterations and the other 42637 iterations? Are my solutions badly optimized, or is there something else I am missing? Tips on improving the solutions would be appreciated.
Both tests were done with the same code-generated dataset with about 20 000 episode records spread over 26 shows, 20 seasons in each and 40 episodes in each season. The code that generates the dataset can be seen on the benchmark site if needed. The benchmark can be seen here
Code
The solution with 884052 iterations and run time of ~15ms looks like this
const newShowObject = {};
for (const show of shows) {
newShowObject[show.id] = { ...show };
}
for (const show of watchedShows) {
for (const season of newShowObject[show.showId].seasons) {
if (season.season_number !== show.season) {
continue;
}
season.episodes = season.episodes.filter(
(episode) => episode.episode !== show.episode
);
}
newShowObject[show.showId].seasons = newShowObject[show.showId].seasons.filter(
(season) => season.episodes.length > 0
);
}
const unwatchedShows = [...Object.values(newShowObject)].filter(
(show) => show.seasons.length > 0
);
The solution with 42637 iterations and run time of ~5ms looks like this
const newShowObject = {};
for (const show of shows) {
newShowObject[show.id] = { ...show };
const newSeasons = {};
for (const seasons of show.seasons) {
newSeasons[seasons.season_number] = { ...seasons };
const newEpisodes = {};
for (const episodes of seasons.episodes) {
newEpisodes[episodes.episode] = { ...episodes };
}
newSeasons[seasons.season_number].episodes = { ...newEpisodes };
}
newShowObject[show.id].seasons = { ...newSeasons };
}
for (const show of watchedShows) {
delete newShowObject[show.showId].seasons[show.season].episodes[show.episode];
}
let unwatchedShows = [...Object.values(newShowObject)];
for (const show of unwatchedShows) {
show.seasons = [...Object.values(show.seasons)];
for (const season of show.seasons) {
season.episodes = [...Object.values(season.episodes)];
}
show.seasons = show.seasons.filter((season) => season.episodes.length > 0);
}
unwatchedShows = unwatchedShows.filter((show) => show.seasons.length > 0);

The problem is that your benchmark is flawed. Notice that jsbench does run the "setup code" only once per test case, not for each test loop iteration.
Your first solution does mutate the input (particularly season.episodes - while it does clone each show), so only on the first run it actually gives the correct output. On all subsequent runs of the test loop, it basically runs on an empty input, which is much faster.
I fixed this here, and now the second solution is the fastest as expected. (Notice the times do now also include creating the object, though).
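To illustrate the mutation point, here is a minimal sketch of a variant that never writes to its input, so every benchmark iteration sees the same data. The function name filterUnwatched and the composite string key are my own assumptions, not part of the question; the field names (season_number, episode) follow the code in the question rather than its type definitions.

```javascript
// Hypothetical helper (not from the question): returns a filtered copy and
// leaves `shows` untouched, so it can be run repeatedly on the same input.
function filterUnwatched(shows, watchedShows) {
  // One Set lookup per episode instead of scanning watchedShows each time.
  const watched = new Set(
    watchedShows.map((w) => `${w.showId}:${w.season}:${w.episode}`)
  );
  return shows
    .map((show) => ({
      ...show,
      seasons: show.seasons
        .map((season) => ({
          ...season,
          episodes: season.episodes.filter(
            (ep) => !watched.has(`${show.id}:${season.season_number}:${ep.episode}`)
          ),
        }))
        .filter((season) => season.episodes.length > 0),
    }))
    .filter((show) => show.seasons.length > 0);
}

const shows = [
  { id: 1, seasons: [{ season_number: 1, episodes: [{ episode: 1 }, { episode: 2 }] }] },
  { id: 2, seasons: [{ season_number: 1, episodes: [{ episode: 1 }] }] },
];
const watchedShows = [
  { showId: 1, season: 1, episode: 1 },
  { showId: 2, season: 1, episode: 1 },
];

const firstRun = filterUnwatched(shows, watchedShows);
const secondRun = filterUnwatched(shows, watchedShows);
// Both runs see the same, unmodified input and therefore agree.
console.log(JSON.stringify(firstRun) === JSON.stringify(secondRun));
```

Because nothing is shared between runs, this shape can live in the test case itself without any per-iteration setup.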

I'm not sure how you are getting faster results with the second solution, I would triple check results. The JSBench results are legit, and agrees with Big-O notation.
The first solution loops through shows to create a lookup table. Then it runs a nested loop with quadratic complexity O(n^2).
The second solution loops through nested loops with cubic complexity O(n^3), because it is nested thrice. So, I would expect this algorithm to chew up more time.
The reason for this is called the sum rule. When two loops sit side-by-side, they don't multiply but add up. This makes the complexity O(n + n^2), which can be further reduced to O(n^2) because the first loop becomes negligible as n approaches infinity.
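The sum rule can be checked by simply counting loop-body executions. This is a minimal sketch with a made-up helper, countSteps, that models one plain loop followed by a two-level nested loop; it is not a model of either solution from the question.

```javascript
// Counts loop-body executions for one loop followed by a two-level nested loop.
function countSteps(n) {
  let steps = 0;
  for (let i = 0; i < n; i++) steps++; // contributes n steps
  for (let i = 0; i < n; i++) {
    for (let j = 0; j < n; j++) steps++; // contributes n * n steps
  }
  return steps;
}

for (const n of [10, 100, 1000]) {
  const total = countSteps(n);
  // The share contributed by the single loop shrinks as n grows.
  console.log(n, total, (n / total).toFixed(4));
}
```

The total is n + n^2, and the printed ratio shows the linear term becoming negligible as n grows.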

Related

Nested loop (FOR) with IF statement inside 2nd for print just one result

Basically I have 2 arrays, one with some codes and another with codes and their descriptions. What I need to do is match the codes and print the description, but my code (apparently) stops at the first iteration of the inner FOR (I've attached a screenshot to make it clearer).
If I remove the IF statement from the code it prints the counters of the 2 for as it should be.
for (x=0; x<causeoferrorlength; x++)
{
document.getElementById("mdataresult").innerHTML += "x "+causeoferrorsplit[x]+"</br>";
for(k=0; k<78; k++)
{
if ( causeoferrorsplit[x] === gbrucausesoferror[k][0] )
{
document.getElementById("mdataresult").innerHTML += "k "+gbrucausesoferror[k][0]+"</br>";
}
}
}
I have no errors from the console but it isn't printing as expected.
This is probably better handled in a declarative way versus imperative. It will be shorter and easier to reason about.
Given you're using two arrays, and that the codes in the first array will always be found somewhere in the second array:
let causes = ["001", "003", "005"];
let codes = [
["001","Earthquake"],
["002","Sunspots"],
["003","User Error"],
["004","Snakes"],
["005","Black Magic"]
];
let results = causes.map( cause => codes[ codes.findIndex( code => code[0] === cause ) ][1] );
console.log(results); // ["Earthquake", "User Error", "Black Magic"]
What's happening here? We're mapping the array of potential causes of error (the first array) to a list of descriptions taken from the second array.
Array.map takes a function that is invoked once with each array member. We'll call that member 'cause'.
Array.findIndex takes a function that is invoked once for each array member. We'll call that member 'code'.
For each 'cause' in causes we find the index in codes where the first array value is equal to the cause, then return the second array value, the description.
If you have the ability to change the second array to an object, then this gets way simpler:
let causes = ["001", "003", "005"];
let codes = {
"001":"Earthquake",
"002":"Sunspots",
"003":"User Error",
"004":"Snakes",
"005":"Black Magic"
};
let results = causes.map( cause => codes[cause] );
console.log(results); // ["Earthquake", "User Error", "Black Magic"]
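If the array of pairs has to stay as the source format, it can be converted once into a lookup structure instead. A minimal sketch, assuming the codes are unique; the "Unknown code" label is a made-up placeholder for codes that are missing from the table, a case the examples above do not handle.

```javascript
const causes = ["001", "003", "999"];
const codes = [
  ["001", "Earthquake"],
  ["002", "Sunspots"],
  ["003", "User Error"],
];

// An array of [key, value] pairs is exactly what the Map constructor accepts.
const descriptions = new Map(codes);

// `??` supplies a placeholder instead of undefined when a code is not in the table.
const results = causes.map((cause) => descriptions.get(cause) ?? "Unknown code");
console.log(results); // ["Earthquake", "User Error", "Unknown code"]
```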

Search through a big collection of objects

I have a really big collection of objects that I want to search through.
The array has > 60.000 items and the search performance can be really slow from time to time.
One object in that array looks like this:
{
"title": "title",
"company": "abc company",
"rating": 13 // internal rating based on comments and interaction
...
}
I want to search for the title and the company info and order that by the rating of the items.
This is what my search currently looks like:
onSearchInput(searchTerm) {
(<any>window).clearTimeout(this.searchInputTimeout);
this.searchInputTimeout = window.setTimeout(() => {
this.searchForFood(searchTerm);
}, 500);
}
searchForFood(searchTerm) {
if (searchTerm.length > 1) {
this.searchResults = [];
this.foodList.map(item => {
searchTerm.split(' ').map(searchTermPart => {
if (item.title.toLowerCase().includes(searchTermPart.toLowerCase())
|| item.company.toLowerCase().includes(searchTermPart.toLowerCase())) {
this.searchResults.push(item);
}
});
});
this.searchResults = this.searchResults.sort(function(a, b) {
return a.rating - b.rating;
}).reverse();
} else {
this.searchResults = [];
}
}
Question: Is there any way to improve the search, both logic-wise and performance-wise?
A bunch of hints:
It's a bit excessive to put searching through 60,000 items on the front-end. Any way you can perform part of the search on the back-end? If you really must do it on the front-end, consider searching in chunks of e.g. 10,000 and then using setImmediate() (or setTimeout() with a 0 ms delay, since setImmediate() is not available in most browsers) to perform the next part of the search so the user's browser won't completely freeze during processing time.
Do the splitting and lowercasing of the search term outside of the loop.
map() like you're using it is weird, as you don't use the return value. Better to use forEach(). Better still, use filter() to get the items that match.
When iterating over the search terms, use some() (as pointed out in the comments) as it's an opportunity to early return.
sort() mutates the original array so you don't need to re-assign it.
sort() with reverse() is usually a smell. Instead, swap the sides of your condition to be b - a.
At this scale, it may make sense to do performance tests with includes(), indexOf(), roll-your-own-for-loop, match() (can almost guarantee it will be slower though)
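Several of the hints above can be combined into one function. This is a minimal sketch rather than a drop-in replacement: searchFood is a hypothetical standalone version of searchForFood, the sample data is made up, and unlike the original (which pushes an item once per matching term) it returns each matching item only once.

```javascript
// Hypothetical standalone version of searchForFood; returns a new sorted array.
function searchFood(foodList, searchTerm) {
  if (searchTerm.length <= 1) return [];
  // Split and lowercase once, outside the loop over items.
  const parts = searchTerm.toLowerCase().split(" ").filter(Boolean);
  return foodList
    .filter((item) => {
      const title = item.title.toLowerCase();
      const company = item.company.toLowerCase();
      // some() stops at the first matching part.
      return parts.some((part) => title.includes(part) || company.includes(part));
    })
    .sort((a, b) => b.rating - a.rating); // descending, no reverse() needed
}

const foodList = [
  { title: "Apple Pie", company: "abc company", rating: 13 },
  { title: "Banana Bread", company: "xyz bakery", rating: 20 },
  { title: "Cherry Tart", company: "ABC Foods", rating: 7 },
];
console.log(searchFood(foodList, "abc bread").map((item) => item.title));
// ["Banana Bread", "Apple Pie", "Cherry Tart"]
```

Since filter() returns a new array, the in-place sort() here does not reorder foodList itself.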
Alex's suggestions are good. My only suggestion would be, if you could afford to pre-process the data during idle time (preferably don't hold up first render or interaction), you could process the data into a modified prefix trie. That would let you search for the items in O(k) time, where k is the length of the search term (right now you are searching in O(kn) time because you look at every item and then do an includes, which takes k time; it's actually a little worse because of the toLowerCase calls, but I don't want to get into the weeds of it).
If you aren't familiar with what a trie is, hopefully the code below gives you the idea or you can search for information with your search engine of choice. It's basically a mapping of characters in a string in nested hash maps.
Here's some sample code of how you might construct the trie:
function makeTries(data){
let companyTrie = {};
let titleTrie = {};
data.forEach(item => {
addToTrie(companyTrie, item.company, item, 0);
addToTrie(titleTrie, item.title, item, 0);
});
return {
companyTrie,
titleTrie
}
}
function addToTrie(trie, str, item, i){
trie.data = trie.data || [];
trie.data.push(item);
if(i >= str.length)
return;
if(! trie[str[i]]){
trie[str[i]] = {};
}
addToTrie(trie[str[i]], str, item, ++i);
}
function searchTrie(trie, term){
if(trie == undefined)
return [];
if(term == "")
return trie.data;
return searchTrie(trie[term[0]], term.substring(1));
}
var testData = [
{
company: "abc",
title: "def",
rank: 5
},{
company: "abd",
title: "deg",
rank: 5
},{
company: "afg",
title: "efg",
rank: 5
},{
company: "afgh",
title: "efh",
rank: 5
},
];
const tries = makeTries(testData);
console.log(searchTrie(tries.companyTrie, "afg"));

what is the equivalent of a reduce in javascript

I'm a backend dev moved recently onto js side. I was going through a tutorial and came across the below piece of code.
clickCreate: function(component, event, helper) {
var validExpense = component.find('expenseform').reduce(function (validSoFar, inputCmp) {
// Displays error messages for invalid fields
inputCmp.showHelpMessageIfInvalid();
return validSoFar && inputCmp.get('v.validity').valid;
}, true);
// If we pass error checking, do some real work
if(validExpense){
// Create the new expense
var newExpense = component.get("v.newExpense");
console.log("Create expense: " + JSON.stringify(newExpense));
helper.createExpense(component, newExpense);
}
}
I tried hard to understand what's happening here: there is something called reduce and another thing named validSoFar, and I'm unable to understand what's happening under the hood. :-(
I do get regular loops as done in Java.
Can someone please shed some light on what's happening here? I will be using this a lot in my regular work.
Thanks
The reduce function here is iterating through each input component of the expense form and incrementally mapping to a boolean. If you have say three inputs each with a true validity, the reduce function would return:
true && true where the first true is the initial value passed into reduce.
true && true, where the first true here is the result of the previous step.
true && true
At the end of the reduction, you're left with a single boolean representing the validity of the entire form: if just a single input component's validity is false, the entire reduction will amount to false. This is because validSoFar keeps track of the overall validity; on each step it is replaced by the conjunction of whether the form is valid so far and the validity of the current input.
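The same fold can be traced with plain booleans. In this minimal sketch the array validities stands in for the inputs' validity flags and the counter visited stands in for the side effect of showHelpMessageIfInvalid(); both names are made up for illustration.

```javascript
// Plain booleans stand in for inputCmp.get('v.validity').valid.
const validities = [true, false, true];
let visited = 0;

const allValid = validities.reduce((validSoFar, valid) => {
  visited++; // stands in for the showHelpMessageIfInvalid() side effect
  return validSoFar && valid;
}, true);

console.log(allValid); // false
console.log(visited); // 3: reduce visits every element even after a false
```

That every element is visited is what lets the original code show help messages on all invalid fields, not just the first one.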
This is a reasonable equivalent:
var validExpense = true;
var inputCmps = component.find('expenseform');
for (var i = 0; i < inputCmps.length; i++) {
var inputCmp = inputCmps[i];
// Displays error messages for invalid fields
inputCmp.showHelpMessageIfInvalid();
if (!inputCmp.get('v.validity').valid) {
validExpense = false;
}
}
// Now we can use validExpense
This is a somewhat strange use of reduce, to be honest, because it does more than simply reducing a list to a single value. It also produces side effects (presumably) in the call to showHelpMessageIfInvalid().
The idea of reduce is simple. Given a list of values that you want to fold down one at a time into a single value (of the same or any other type), you supply a function that takes the current folded value and the next list value and returns a new folded value, and you supply an initial folded value, and reduce combines them by calling the function with each successive list value and the current folded value.
So, for instance,
var items = [
{name: 'foo', price: 7, quantity: 3},
{name: 'bar', price: 5, quantity: 5},
{name: 'baz', price: 19, quantity: 1}
]
const totalPrice = items.reduce(
(total, item) => total + item.price * item.quantity, // folding function
0 // initial value
); //=> 65
It does not make sense to use reduce there and have side effects in the reducer. Better to use Array.prototype.filter to get all invalid expense items.
Then use Array.prototype.forEach to produce side effect(s) for each invalid item. You can then check the length of the invalid expense items array to see if your input was valid:
function(component, event, helper) {
var invalidExpenses = component.find('expenseform').filter(
function(ex){
//return not valid (!valid)
return !ex.get('v.validity').valid
}
);
invalidExpenses.forEach(
//use forEach if you need a side effect for each thing
function(ex){
ex.showHelpMessageIfInvalid();
}
);
// If we pass error checking, do some real work
if(invalidExpenses.length===0){//no invalid expense items
// Create the new expense
var newExpense = component.get("v.newExpense");
console.log("Create expense: " + JSON.stringify(newExpense));
helper.createExpense(component, newExpense);
}
}
The mdn documentation for Array.prototype.reduce has a good description and examples on how to use it.
It should take an array of things and return one other thing (can be different type of thing). But you won't find any examples there where side effects are initiated in the reducer function.

Comparing 2 Json Object using javascript or underscore

PS: I have already searched the forums and have seen the relevant posts for this wherein the same post exists but I am not able to resolve my issue with those solutions.
I have 2 json objects
var json1 = [{uid:"111", addrs:"abc", tab:"tab1"},{uid:"222", addrs:"def", tab:"tab2"}];
var json2 = [{id:"tab1"},{id:"new"}];
I want to compare these two and check whether the id element in json2 is present in json1 by comparing it to the tab key. If not, then set some boolean to false, i.e. by comparing id:"tab1" in json2 to tab:"tab1" in json1.
I tried using below solutions as suggested by various posts:
var o1 = json1;
var o2 = json2;
var set= false;
for (var p in o1) {
if (o1.hasOwnProperty(p)) {
if (o1[p].tab!== o2[p].id) {
set= true;
}
}
}
for (var p in o2) {
if (o2.hasOwnProperty(p)) {
if (o1[p].tab!== o2[p].id) {
set= true;
}
}
}
Also tried with underscore as:
_.each(json1, function(one) {
_.each(json2, function(two) {
if (one.tab!== two.id) {
set= true;
}
});
});
Both of them fail for some test case or other.
Can anyone tell any other better method or outline the issues above.
Don't call them JSON because they are JavaScript arrays. Read What is JSON.
To solve the problem, you may loop over second array and then in the iteration check if none of the objects in the first array matched the criteria. If so, set the result to true.
const obj1 = [{uid:"111", addrs:"abc", tab:"tab1"},{uid:"222",addrs:"def", tab:"tab2"}];
const obj2 = [{id:"tab1"},{id:"new"}];
let result = false;
for (let {id} of obj2) {
if (!obj1.some(i => i.tab === id)) {
result = true;
break;
}
}
console.log(result);
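For larger arrays, the repeated scan of obj1 can be replaced by a single pass that collects the tab values up front. A minimal sketch of that variation, using the same sample data; the variable name tabs is my own.

```javascript
const obj1 = [{uid:"111", addrs:"abc", tab:"tab1"},{uid:"222", addrs:"def", tab:"tab2"}];
const obj2 = [{id:"tab1"},{id:"new"}];

// Collect the tab values once; Set lookups are then constant time on average.
const tabs = new Set(obj1.map((o) => o.tab));

// true as soon as one id in obj2 has no matching tab in obj1.
const result = obj2.some(({ id }) => !tabs.has(id));
console.log(result); // true
```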
Unfortunately, searching the forums and reading the relevant posts is not going to replace THINKING. Step away from your computer, and write down, on a piece of paper, exactly what the problem is and how you plan to solve it. For example:
Calculate for each object in an array whether some object in another array has a tab property whose value is the same as the first object's id property.
There are many ways to do this. The first way involves using array functions like map (corresponding to the "calculate for each" in the question) and some (corresponding to the "some" in the question). To make it easier, and to try to avoid confusing ourselves, we'll do it step by step.
function calculateMatch(obj2) {
return obj2.map(doesSomeElementInObj1Match);
}
That's it. Your program is finished. You don't even need to test it, because it's obviously right.
But wait. How are you supposed to know about these array functions like map and some? By reading the documentation. No one can help you with that. You have to do it yourself. You have to do it in advance as part of your learning process. You can't do it at the moment you need it, because you won't know what you don't know!
If it's easier for you to understand, and you're just getting started with functions, you may want to write this as
obj2.map(obj2Element => doesSomeElementInObj1Match(obj2Element))
or, if you're still not up to speed on arrow functions, then
obj2.map(function(obj2Element) { return doesSomeElementInObj1Match(obj2Element); })
The only thing left to do is to write doesSomeElementInObj1Match. For testing purposes, we can make one that always returns true:
function doesSomeElementInObj1Match() { return true; }
But eventually we will have to write it. Remember the part of our English description of the problem that's relevant here:
some object in another array has a tab property whose value is the same as the first object's id property.
When working with JS arrays, for "some" we have the some function. So, following the same top-down approach, and in the same way as above, we can write this as (taking the ID from the obj2 element we are given):
function doesSomeElementInObj1Match({id}) {
return obj1.some(obj1Element => hasMatchingTabField(obj1Element, id));
}
or
obj1.some(function(obj1Element) { return hasMatchingTabField(obj1Element, id); })
Here, hasMatchingTabField is nothing more than checking to make sure obj1Element.tab and id are identical.
We're almost done! But we still have to write hasMatchingTabField. That's quite easy, it turns out:
function hasMatchingTabField(e1, id) { return e1.tab === id; }
In the following, to save space, we will write e1 for obj1Element and e2 for obj2Element, and stick with the arrow functions. This completes our first solution. We have
const tabFieldMatches = (tab, id) => tab === id;
const hasMatchingTabField = (arr, id) => arr.some(e1 => tabFieldMatches(e1.tab, id));
const findMatches = arr => arr.map(e2 => hasMatchingTabField(obj1, e2.id));
And we call this using findMatches(obj2).
Old-fashioned array
But perhaps all these maps and somes are a little too much for you at this point. What ever happened to good old-fashioned for-loops? Yes, we can write things this way, and some people might prefer that alternative.
top: for (const e2 of obj2) {
for (const e1 of obj1) {
if (e1.tab === e2.id) {
console.log("found match");
break top;
}
}
console.log("didn't find match");
}
But some people are sure to complain about the non-standard use of break here. Or, we might want to end up with an array of boolean parallel to the input array. In that case, we have to be careful about remembering what matched, at what level.
const matched = [];
for (const e2 of obj2) {
let match = false;
for (const e1 of obj1) {
if (e1.tab === e2.id) match = true;
}
matched.push(match);
}
We can clean this up and optimize it a bit, but that's the basic idea. Notice that we have to reset match each time through the outer loop.
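One possible cleanup is to stop scanning as soon as a match is found. This is a minimal sketch using the sample data from the question, assuming (as in the problem statement) that the id values come from obj2 and the tab values from obj1.

```javascript
const obj1 = [{uid:"111", addrs:"abc", tab:"tab1"},{uid:"222", addrs:"def", tab:"tab2"}];
const obj2 = [{id:"tab1"},{id:"new"}];

const matched = [];
for (const e2 of obj2) {
  let match = false; // reset for every element of the outer array
  for (const e1 of obj1) {
    if (e1.tab === e2.id) {
      match = true;
      break; // plain break: leaves only the inner loop
    }
  }
  matched.push(match);
}
console.log(matched); // [true, false]
```

The unlabeled break avoids the labeled jump while still skipping the rest of obj1 once the answer is known.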

immutable.js filter and mutate (remove) found entries

I have two loops, one for each day of the month, other with all events for this month. Let's say I have 100 000 events.
I'm looking for a way to remove events from the main events List once they were "consumed".
The code is something like:
const calendarRange = [{initialDate}, {initialDate}, {initialDate}, {initialDate}, ...] // say we have 30 dates, one for each day
const events = fromJS([{initialDate}, {initialDate}, {initialDate}, ...]) // let's say we have 100 000
calendarRange.map((day) => {
const dayEvents = events.filter((event) => day.get('initialDate').isSame(event.get('initialDate'), 'day')) // we get all events for each day
doSomeThingWithDays(dayEvents)
// how could I subtract `dayEvents` from `events` so that
// the next calendarRange iteration has fewer events to filter?
// the order of the first loop must be preserved (because it's from day 1 to day 30 or 31)
})
With lodash I could just do something like:
calendarRange.map((day) => {
const dayEvents = events.filter((event) => day.get('initialDate').isSame(event.get('initialDate'), 'day')) // we get all events for each day
doSomeThingWithDays(dayEvents)
pullAllWith(events, dayEvents, (a, b) => a === b)
})
How can I accomplish the same optimization with immutablejs? I'm not really expecting a solution for my way of iterating the list, but a smart way of reducing the events List so that it gets smaller and smaller.
You can try a Map with events split into bins. Based on your example, you bin based on dates: you can look up a bin, process it as a batch and remove it in O(1). Immutable maps are fairly inexpensive, and fare much better than iterating over lists. You incur the cost of a one-time binning, but amortize it over O(1) lookups.
Something like this perhaps:
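const eventbins = OrderedMap(events.groupBy(evt => evt.get('initialDate').dayOfYear() /* or whatever selector */))
// rangeOfDays is assumed to be an Immutable List of dates, one per day
function iter(list, bins) {
if(list.isEmpty())
return
const day = list.first()
const key = day.dayOfYear() // same selector as used for binning
const dayEvents = bins.get(key)
doSomeThingWithDays(dayEvents)
iter(list.shift(), bins.delete(key)) // delete by bin key, not by the day object
}
iter(rangeOfDays, eventbins)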
By removing already processed elements you are not going to make anything faster. The cost of all filter operations will be halved on average, but constructing the new list in every iteration will cost you some CPU cycles, so it is not going to be significantly faster (in a big-O sense). Instead, you could build an index, for example an immutable map keyed by initialDate, making all the filter calls unnecessary.
const calendarRange = Immutable.Range(0, 10, 2).map(i => Immutable.fromJS({initialDate: i}));
const events = Immutable.Range(0, 20).map(i => Immutable.fromJS({initialDate: i%10, i:i}));
const index = events.groupBy(event => event.get('initialDate'));
calendarRange.forEach(day => {
const dayEvents = index.get(day.get('initialDate'));
doSomeThingWithDays(dayEvents);
});
function doSomeThingWithDays(data) {
console.log(data);
}
<script src="https://cdnjs.cloudflare.com/ajax/libs/immutable/3.8.1/immutable.js"></script>
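The indexing idea does not depend on Immutable.js. The following is a minimal sketch of the same approach with a native Map, assuming (purely for illustration) that initialDate is an ISO date string; the sample data is made up.

```javascript
// Hypothetical plain-JS data: initialDate is an ISO date string here.
const calendarRange = ["2024-01-01", "2024-01-02", "2024-01-03"];
const events = [
  { id: "a", initialDate: "2024-01-02" },
  { id: "b", initialDate: "2024-01-01" },
  { id: "c", initialDate: "2024-01-02" },
];

// One pass over events builds the index: day -> events on that day.
const index = new Map();
for (const event of events) {
  const bucket = index.get(event.initialDate);
  if (bucket) bucket.push(event);
  else index.set(event.initialDate, [event]);
}

// Each day is now a single lookup instead of a filter over all events.
const perDay = calendarRange.map((day) => (index.get(day) ?? []).map((e) => e.id));
console.log(perDay); // [["b"], ["a", "c"], []]
```

The order of calendarRange is preserved because the days are still iterated in their original order; only the per-day filtering is replaced by a lookup.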
