Hi, everyone! I need some help with my first app:
I'm creating an application with Express + Node.js as the backend. There is no database; instead, I'm using a 3rd-party solution with some functions that do the calculations.
Frontend
50 objects. Every object has one unique value: a random number. At the start I have all these objects; I need to calculate some values for every object and position it on the form based on the calculated results.
Each object sends axios.get('/calculations?value=uniqueValue') and I accumulate the results in an array. When array.length equals 50, I compare the array elements to each other and determine the (x, y) coordinates of each object. After that, the objects appear on the form.
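For reference, a rough sketch of the frontend flow (the objects array, the uniqueValue property and the computePositions helper are illustrative names, not my real code):

// Fire one GET per object, wait for all 50 results, then position everything
const responses = await Promise.all(
  objects.map((obj) =>
    axios.get('/calculations', { params: { value: obj.uniqueValue } })
  )
);
const positions = computePositions(responses.map((res) => res.data)); // assign (x, y) to each object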
Backend
let value = uniqueValue; // the unique value received from an object
let requests = [];
for (let i = 0; i < 1500; i++) { // this loop is necessary due to the application idea
  requests.push(calculateData(value)); // 3rd-party function
  value += 1250;
}
let result = await Promise.all(requests);
let newData = transform(result); // here I transform the calculated result and then return it
return newData;
Calculations for one object take ≈700 ms. All calculations for all 50 objects take ≈10 seconds. The 3rd-party function accepts only one value at a time, but it works very quickly. The loop for (let i = 0; i < 1500; i++) {…} itself is what gets very expensive.
Issues
10 seconds is not a good result; a user can't wait that long. Maybe I should change my approach to the calculations?
The server is very busy while calculating, so other requests (e.g. axios.get('/getSomething?params=something')) are left pending.
Any advice will be much appreciated!
You can make the calls in chunks of data using async.eachLimit (together with lodash's _.chunk), for example:
const async = require('async');
const _ = require('lodash');

let values = [];
let value = uniqueValue;
for (let i = 0; i < 1500; i++) { // this loop is necessary due to the application idea
  values.push(value);
  value += 1250;
}

// Split the 1500 values into chunks of 50 and process at most 5 chunks concurrently
const arrayOfItemArrays = _.chunk(values, 50);
const results = [];

async.eachLimit(arrayOfItemArrays, 5, eachUpdate, function (err) {
  if (err) return handleError(err); // handleError is a placeholder for your error handling
  const newData = transform(results);
  // ...return / send newData from here
});

function eachUpdate (reqArr, cb) {
  // Run the 3rd-party calculation for every value in this chunk,
  // collect the chunk's results, then signal completion
  Promise.all(reqArr.map(item => calculateData(item)))
    .then(chunkResult => {
      results.push(...chunkResult);
      cb();
    })
    .catch(cb);
}
Related
I'm trying to understand how to perform some action on each element of an array, but by working in portions of that array, until each element has been touched.
As a more specific example, let's assume I have an array of 990 elements and want to perform some action on each element, but in portions of 200. What would be the most efficient way to do this?
function foo(array) {
  const results = [];
  if (array.length > 200) {
    // Loop over and perform the action on the first 200 elements, then the next 200, and so on...
    // for each element, push the result to the results array
  }
  return results;
}
EDIT:
For my specific use case, each element in the array is a URL. I'm making a GET request with each URL using Axios. There is potential for my array to contain thousands of URLs, so I don't want to make a request and wait for a response one at a time; however, the server I'm making the requests to can only handle so many requests at one time (about 200).
There are lots of ways to do this, some better than others. I'll assume you don't want to modify the original array and want to handle 200 elements at a time, at different moments:
function stepArray(arr) {
  // Create a custom index on the array object so it knows where to continue from
  if (typeof arr.myIndex === 'undefined') { arr.myIndex = 0; }
  for (var k = 0; k < 200; k++) {
    if (arr.myIndex >= arr.length) { return; }
    process(arr[arr.myIndex]); // process() stands for whatever action you perform per element
    arr.myIndex++; // remember where to pick up on the next call
  }
}
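For example (hypothetical usage, assuming the per-element work is synchronous), you could call it on a timer until the custom index reaches the end of the array:

// Process the next 200 elements every 500 ms until the whole array is done
const timer = setInterval(() => {
  stepArray(myArray);
  if (myArray.myIndex >= myArray.length) clearInterval(timer);
}, 500);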
To make multiple chunks you can use reduce, like this:
var perChunk = 200 // chunk size
var inputArray = [] // your array

const result = inputArray.reduce((resultArray, item, index) => {
  const chunkIndex = Math.floor(index / perChunk)
  if (!resultArray[chunkIndex]) {
    resultArray[chunkIndex] = [] // start a new chunk
  }
  resultArray[chunkIndex].push(item)
  return resultArray
}, [])

console.log(result);
Then you can iterate over the single chunks and make your axios requests:
for (let i = 0; i < result.length; i++) {
  const delay = 3000 * i
  setTimeout(() => {
    // your action with the chunk result[i], e.g. firing its axios requests
  }, delay)
}
While it may not be the most efficient solution per my initial request, I found the following to be easy to understand:
while (array.length !== 0) {
  array.splice(0, 200).forEach(function (url) {
    // Perform some action
  });
}
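If each batch of 200 requests should finish before the next one starts (which the loop above doesn't guarantee for asynchronous actions), a sketch along these lines could work; fetchInBatches and the batch size are just illustrative names:

const axios = require('axios');

async function fetchInBatches (urls, batchSize = 200) {
  const results = [];
  const queue = urls.slice(); // work on a copy so the original array isn't emptied
  while (queue.length !== 0) {
    const batch = queue.splice(0, batchSize);
    // Fire the whole batch concurrently, then wait for every response before continuing
    const responses = await Promise.all(batch.map((url) => axios.get(url)));
    results.push(...responses.map((res) => res.data));
  }
  return results;
}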
My application is collecting a series of adjustments to budget values to try to reach a goal. So far, I'm collecting the adjustments into an array and totaling their values to show how much more needs to be cut to reach the goal:
const goal = 25;
let cutTotal = 0;
let remaining = 0;
let cuts = [];

function update (value) {
  // Values are numbers: -1, 1, 0.1, -0.1. This function gets called
  // when any update is made to a budget value.
  cuts.push(value);
  cutTotal = 0;
  cuts.forEach(function (value) {
    cutTotal += value;
  });
  remaining = goal - cutTotal;
}
This is working as expected, but I'm thinking there has to be a reasonably performant way to manage the length of the cuts array by removing values that are redundant. (Adding and then subtracting 1 doesn't change the total, so why store both values?)
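Something like this is what I have in mind (just a sketch of the idea, reusing the variables above): cancel an incoming value against an exact opposite that's already stored, so the array only grows when the total actually changes.

function update (value) {
  const oppositeIndex = cuts.indexOf(-value);
  if (oppositeIndex !== -1) {
    cuts.splice(oppositeIndex, 1); // e.g. +1 and -1 cancel out, so store neither
  } else {
    cuts.push(value);
  }
  cutTotal = cuts.reduce((sum, v) => sum + v, 0);
  remaining = goal - cutTotal;
}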
I'm trying to solve this kata:
Given an integer N (<1000), return an array of integers 1..N where the sum of each 2 consecutive numbers is a perfect square. If that's not possible, return false.
For example, if N=15, the result should be this array: [9, 7, 2, 14, 11, 5, 4, 12, 13, 3, 6, 10, 15, 1, 8]. Below N=14, there's no answer, so the function should return false.
I thought 'how hard can this be?' and I've spent long days in the rabbit hole since. I've been programming for just a few months and don't have a CS background, so I'll write down what I understand of the problem so far, trying to use the proper concepts, but please feel free to tell me if any expression is not correct.
Apparently, the problem is very similar to a known problem in graph theory, the TSP. In this case, two vertices are connected if their sum is a perfect square (for example, with N=15, vertices 2 and 14 are adjacent because 2 + 14 = 16 = 4²). Also, I don't have to look for a cycle, just find one Hamiltonian path, not all of them.
I understand that what I'm using is backtracking. I build an object that represents the graph and then try to find the path recursively. This is how I build the object:
function buildAdjacentsObject (limit) {
  const potentialSquares = getPotentialSquares(limit)
  const adjacents = {}
  for (let i = 0; i < (limit + 1); i++) {
    adjacents[i] = {}
    for (let j = 0; j < potentialSquares.length; j++) {
      if (potentialSquares[j] > i) {
        // The neighbour is the number that completes the square
        const dif = potentialSquares[j] - i
        if (dif <= limit) {
          adjacents[i][dif] = 1
        } else {
          break
        }
      }
    }
  }
  return adjacents
}
function getPotentialSquares (limit) {
  // The largest possible sum of two distinct numbers in 1..limit
  const maxSum = limit * 2 - 1
  let square = 4
  let i = 3
  const potentialSquares = []
  while (square <= maxSum) {
    potentialSquares.push(square)
    square = i * i
    i++
  }
  return potentialSquares
}
At first I was using a hash table with an array of adjacent nodes under each key. But when my algorithm had to delete vertices from the object, it had to search those arrays several times, which took linear time each time. I made the adjacent vertices hashable and that improved my execution time. Then I look for the path with this function:
function findSquarePathInRange (limit) {
  // Build the graph object
  const adjacents = buildAdjacentsObject(limit)
  // Deep copy the object before making any changes
  const adjacentsCopy = JSON.parse(JSON.stringify(adjacents))
  // Create the empty path
  const solution = []
  // Recursively complete the path
  function getSolution (currentCandidates) {
    if (solution.length === limit) {
      return solution
    }
    // Sort the candidate vertices to start with the ones with fewer adjacent vertices
    currentCandidates = currentCandidates.sort((a, b) => {
      return Object.keys(adjacentsCopy[a]).length -
        Object.keys(adjacentsCopy[b]).length
    })
    for (const candidate of currentCandidates) {
      // Add the candidate to the path
      solution.push(candidate)
      // and delete it from the object
      for (const candidateAdjacent in adjacents[candidate]) {
        delete adjacentsCopy[candidateAdjacent][candidate]
      }
      if (getSolution(Object.keys(adjacentsCopy[candidate]))) {
        return solution
      }
      // If no solution was found, delete the element from the path
      solution.pop()
      // and add it back to the object
      for (const candidateAdjacent in adjacents[candidate]) {
        adjacentsCopy[candidateAdjacent][candidate] = 1
      }
    }
    return false
  }
  const endSolution = getSolution(
    Array.from(Array(limit).keys()).slice(1)
  )
  // The elements of the path can't be strings
  return (endSolution) ? endSolution.map(x => parseInt(x, 10)) : false
}
My solution works 'fast', but not fast enough. I need to pass more than 200 tests in less than 12 seconds, and so far it only passes 150. Probably both my algorithm and my usage of JS can be improved, so, my questions:
Can you see a bottleneck in the code? The sorting step should be the one taking the most time, but it also gets me to the solution faster. Also, I'm not sure whether I'm using the best data structure for this kind of problem. I tried classic loops instead of for..in and for..of, but it didn't change the performance.
Do you see any place where I can save previous calculations to look them up later?
Regarding the last question, I read that there is a dynamic programming solution to the problem, but everywhere I found one, it looks for the minimum distance, the number of paths or the existence of a path, not the path itself. I read this everywhere but I'm unable to apply it:
Also, a dynamic programming algorithm of Bellman, Held, and Karp can be used to solve the problem in time O(n² 2ⁿ). In this method, one determines, for each set S of vertices and each vertex v in S, whether there is a path that covers exactly the vertices in S and ends at v. For each choice of S and v, a path exists for (S, v) if and only if v has a neighbor w such that a path exists for (S − v, w), which can be looked up from already-computed information in the dynamic program.
I just can't get my head around how to implement that if I'm not looking for all the paths. I found an implementation of a similar problem in Python that uses a cache and some binary operations, but again, even though I could translate it from Python, I'm not sure how to apply those concepts to my algorithm.
I'm currently out of ideas so any hint of something to try would be super helpful.
EDIT 1:
After Photon's comment, I tried going back to using a hash table for the graph, storing the adjacent vertices as arrays. I also added a separate array of booleans to keep track of the remaining vertices.
That improved my efficiency a lot. With these changes I avoided converting object keys to arrays all the time, there was no need to copy the graph object since it was not going to be modified, and no need to loop after adding one node to the path. The bad part is that I then needed to check that separate array when sorting, to see which adjacent vertices were still available. Also, I had to filter the arrays before passing them to the next recursion.
Yosef's approach from the first answer, using an array to store the adjacent vertices and accessing them by index, proved even more efficient. My code so far (no changes to the square-finding function):
function square_sums_row (limit) {
  const adjacents = buildAdjacentsObject(limit)
  const adjacentsCopy = JSON.parse(JSON.stringify(adjacents))
  const solution = []
  function getSolution (currentCandidates) {
    if (solution.length === limit) {
      return solution
    }
    currentCandidates = currentCandidates.sort((a, b) => {
      return adjacentsCopy[a].length - adjacentsCopy[b].length
    })
    for (const candidate of currentCandidates) {
      solution.push(candidate)
      for (const candidateAdjacent of adjacents[candidate]) {
        adjacentsCopy[candidateAdjacent] = adjacentsCopy[candidateAdjacent]
          .filter(t => t !== candidate)
      }
      if (getSolution(adjacentsCopy[candidate])) {
        return solution
      }
      solution.pop()
      for (const candidateAdjacent of adjacents[candidate]) {
        adjacentsCopy[candidateAdjacent].push(candidate)
      }
    }
    return false
  }
  return getSolution(Array.from(Array(limit + 1).keys()).slice(1))
}
function buildAdjacentsObject (limit) {
  const potentialSquares = getPotentialSquares(limit)
  const squaresLength = potentialSquares.length
  const adjacents = []
  for (let i = 1; i < (limit + 1); i++) {
    adjacents[i] = []
    for (let j = 0; j < squaresLength; j++) {
      if (potentialSquares[j] > i) {
        const dif = potentialSquares[j] - i
        if (dif <= limit) {
          adjacents[i].push(dif)
        } else {
          break
        }
      }
    }
  }
  return adjacents
}
EDIT 2:
The code performs fine in most cases, but my worst-case scenarios are terrible:
// time for 51: 30138.229ms
// time for 77: 145214.155ms
// time for 182: 22964.025ms
EDIT 3:
I accepted Yosef's answer, as it was super useful for improving the efficiency of my JS code. I then found a way to tweak the algorithm to avoid paths with dead ends, using some of the restrictions from the paper A Search Procedure for Hamilton Paths and Circuits.
Basically, before calling another recursion, I check 2 things:
Whether there is any node with no edges left that is not yet part of the path, while the path is still missing more than one node
Whether there are more than 2 nodes with only 1 edge left (one can be the following node, which had 2 edges before deleting the edge to the current node, and the other can be the last node)
Both situations make it impossible to complete a Hamiltonian path with the remaining nodes and edges (if you draw the graph, it'll be clear why). Following that logic, there's another possible improvement if you check nodes with only 2 edges (one way to get in and one to get out); I think you could use that to delete other edges in advance, but it wasn't necessary, at least for me. A rough sketch of the two checks follows.
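(Just a sketch of the idea; remaining and degreeOf are placeholder names, not from my actual code.)

// remaining: vertices not yet in the path; degreeOf(v): how many of v's
// neighbours are still available. Returns false when a Hamiltonian path
// can no longer exist with what's left.
function pathStillPossible (remaining, degreeOf) {
  let oneEdgeCount = 0;
  for (const v of remaining) {
    const degree = degreeOf(v);
    // An unreachable (isolated) vertex is fatal unless it's the only one missing
    if (degree === 0 && remaining.length > 1) return false;
    // More than two degree-1 vertices means more than two forced endpoints
    if (degree === 1 && ++oneEdgeCount > 2) return false;
  }
  return true;
}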
Now the algorithm performs worse in most cases, where just sorting by remaining edges was already good enough to predict the next node and the extra work only adds overhead, but it's able to solve the worst cases in a much better time. For example, limit = 77 is now solved in 15 ms, while limit = 1000 went from 30 ms to 100 ms.
This is a really long post; if you have any edit suggestions, let me know. I don't think posting the final code is the best idea, taking into account that you can't check the solutions on the platform before solving the kata. But the accepted answer and this final edit should be good guidance for thinking about this last part while still learning something. Hope it's useful.
By replacing the object with an array, you save yourself from converting the object to an array every time you want to find the length (which you do a lot, at every step of the sort algorithm) or want to get the keys for the next candidates. In my tests the code below has been a lot more effective in terms of execution time (0.102 s vs 1.078 s for limit = 4500 on my machine):
function buildAdjacentsObject (limit) {
  const potentialSquares = getPotentialSquares(limit)
  const adjacents = [];
  for (let i = 0; i < (limit + 1); i++) {
    adjacents[i] = [];
    for (let j = 0; j < potentialSquares.length; j++) {
      if (potentialSquares[j] > i) {
        const dif = potentialSquares[j] - i
        if (dif <= limit) {
          adjacents[i].push(dif)
        } else {
          break
        }
      }
    }
  }
  return adjacents
}
function getPotentialSquares (limit) {
  const maxSum = limit * 2 - 1
  let square = 4
  let i = 3
  const potentialSquares = []
  while (square <= maxSum) {
    potentialSquares.push(square)
    square = i * i
    i++
  }
  return potentialSquares
}
function findSquarePathInRange (limit) {
  // Build the graph object
  const adjacents = buildAdjacentsObject(limit)
  // Deep copy the object before making any changes
  const adjacentsCopy = JSON.parse(JSON.stringify(adjacents))
  // Create the empty path
  const solution = [];
  // Recursively complete the path
  function getSolution (currentCandidates) {
    if (solution.length === limit) {
      return solution
    }
    // Sort the candidate vertices to start with the ones with fewer adjacent vertices
    currentCandidates = currentCandidates.sort((a, b) => {
      return adjacentsCopy[a].length - adjacentsCopy[b].length
    });
    for (const candidate of currentCandidates) {
      // Add the candidate to the path
      solution.push(candidate)
      // and delete it from the object
      for (const candidateAdjacent of adjacents[candidate]) {
        adjacentsCopy[candidateAdjacent] = adjacentsCopy[candidateAdjacent].filter(t => t !== candidate)
      }
      if (getSolution(adjacentsCopy[candidate])) {
        return solution
      }
      // If no solution was found, delete the element from the path
      solution.pop()
      // and add it back to the object
      for (const candidateAdjacent of adjacents[candidate]) {
        adjacentsCopy[candidateAdjacent].push(candidate);
      }
    }
    return false
  }
  const endSolution = getSolution(
    Array.from(Array(limit).keys()).slice(1)
  )
  // The elements of the path are already numbers, so no string conversion is needed
  return endSolution
}
var t = new Date().getTime();
var res = findSquarePathInRange(4500);
var t2 = new Date().getTime();
console.log(res, ((t2-t)/1000).toFixed(4)+'s');
I have a JSON file that contains many objects and options.
Each one looks like this:
{"item": "name", "itemId": 78, "data": "Some data", ..., "option": number or string}
There are about 10,000 objects in the file.
When part of an item value is entered ("ame", "nam", "na", etc.), it should display all the objects and their options that match this part.
RegExp is the only thing that comes to my mind, but with a 200 MB+ file it takes a long time to search (2 seconds+).
That's how I'm getting the object right now:
let reg = new RegExp(enteredName, 'gi'), // enteredName, for example "nam"
  data = await fetch("myFile.json"),
  jsonData = await data.json();

let results = jsonData.filter(jsonObj => {
  let item = jsonObj.item,
    itemId = String(jsonObj.itemId);
  return reg.test(item) || reg.test(itemId);
});
But that option is too slow for me. What would be a faster way to perform such a search in JS?
Looking up items by item number should be easy enough by creating a hash table, which others have already suggested. The big problem here is searching for items by name. You could burn a ton of RAM by creating a tree, but I'm going to go out on a limb and guess that you're not necessarily looking for raw lookup speed. Instead, I'm assuming that you just want something that'll update a list on the fly as you type, without actually interrupting your typing. Is that correct?
To that end, what you need is a search function that won't lock up the main thread, allowing the DOM to be updated between returned results. Interval timers are one way to tackle this, as they can be set up to iterate through large, time-consuming volumes of data while allowing other functions (such as DOM updates) to be executed between iterations.
I've created a Fiddle that does just that:
// Create a big array containing items with names generated randomly for testing purposes
let jsonData = [];
for (let i = 0; i < 10000; i++) {
  jsonData.push({ item: Math.random().toString(36).substring(2, 15) + Math.random().toString(36).substring(2, 15) });
}

// Now on to the actual search part
let returnLimit = 1000; // Maximum number of results to return
let intervalItr = null; // A handle used for iterating through the array with an interval timer

function nameInput (e) {
  document.getElementById('output').innerHTML = '';
  if (intervalItr) clearInterval(intervalItr); // If we were iterating through a previous search, stop it.
  if (e.value.length > 0) search(e.value);
}

let reg, idx;

function search (enteredName) {
  reg = new RegExp(enteredName, 'i');
  idx = 0;
  // Kick off the search by creating an interval that'll call searchNext() with a 0ms delay.
  // This will prevent the search function from locking the main thread while it's working,
  // allowing the DOM to be updated as you type
  intervalItr = setInterval(searchNext, 0);
}

function searchNext() {
  if (idx >= jsonData.length || idx > returnLimit) {
    clearInterval(intervalItr);
    return;
  }
  let item = jsonData[idx].item;
  if (reg.test(item)) document.getElementById('output').innerHTML += '<br>' + item;
  idx++;
}
https://jsfiddle.net/FlimFlamboyant/we4r36tp/26/
Note that this could also be handled with a WebWorker, but I'm not sure it's strictly necessary.
Additionally, this could be further optimized by utilizing a secondary array that is filled as the search takes place. When you enter an additional character and a new search is started, the new search could begin with this secondary array, switching back to the original if it runs out of data.
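As a side note on the hash-table idea for the itemId lookups mentioned at the start, a small sketch (assuming the object shape from the question) could look like this:

// Build a Map once after fetching the JSON; lookups by id are then effectively constant time
const byId = new Map();
for (const obj of jsonData) {
  byId.set(String(obj.itemId), obj);
}

// Later, an exact id lookup avoids scanning the whole array
const match = byId.get('78'); // undefined if no such itemId exists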
I would like to know whether it is worth converting an array into a Set in order to search it in Node.js.
My use case is that this search is done lots of times, but not necessarily on big sets of data (the array can occasionally hold up to ~2000 items).
Looking for a specific id in a list.
Which approach is better:
const isPresent = (myArray, id) => {
  return Boolean(myArray.some((arrayElement) => arrayElement.id === id));
}
or
const mySet = new Set(myArray.map((arrayElement) => arrayElement.id)); // build the Set from the ids so has(id) works
const isPresent = (mySet, id) => {
  return mySet.has(id);
}
I know that theoretically the second approach is better, as it is O(1) versus O(n) for the first approach. But can the instantiation of the Set offset the gain on small arrays?
@jonrsharpe - particularly for your case, I found that converting an array of 2k items to a Set itself takes ~1.15 ms. No doubt searching a Set is faster than an array, but in your case this additional conversion can be a little costly.
You can run the code below in your browser console to check. new Set(arr) takes almost ~1.2 ms.
var arr = [], set = new Set(), n = 2000;
for (let i = 0; i < n; i++) {
  arr.push(i);
}

console.time('Set');
set = new Set(arr);
console.timeEnd('Set');
Adding elements to a Set is also costlier than pushing to an array.
The code below shows the time required to insert an item into an array vs. a Set, and array insertion comes out faster.
var arr = [], set = new Set(), n = 2000;

console.time('Array');
for (let i = 0; i < n; i++) {
  arr.push(i);
}
console.timeEnd('Array');

console.time('Set');
for (let i = 0; i < n; i++) {
  set.add(i);
}
console.timeEnd('Set');
I ran the following code to compare the speed of locating an element in the array and in the set, and found the Set to be 8-10 times faster than the array.
You can copy-paste this code into your browser to analyze further:
var arr = [], set = new Set(), n = 100000;
for (let i = 0; i < n; i++) {
  arr.push(i);
  set.add(i);
}

var result;

console.time('Array');
result = arr.indexOf(12313) !== -1;
console.timeEnd('Array');

console.time('Set');
result = set.has(12313);
console.timeEnd('Set');
So for your case, array.some is better!
I will offer a different upside for using a Set: your code becomes more semantic and easier to understand at a glance.
Other than that, this post has a nice comparison - Javascript Set vs. Array performance - but make your own measurements if you really feel that this is your bottleneck. Don't optimise things that are not your bottleneck!
My own heuristic is an isPresent-like utility for nicer code, but if the check is done in a loop I always construct a Set first, as sketched below.
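A minimal sketch of that heuristic (the names are illustrative only): build the Set of ids once, outside the loop, so the O(n) construction cost is paid a single time and every check inside the loop is a constant-time has().

const knownIds = new Set(myArray.map((el) => el.id));

for (const id of idsToCheck) {
  if (knownIds.has(id)) {
    // ...handle the "already present" case
  }
}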