Angular2/RxJS - Handling Potentially Long Queries

Goal: Front-end of application allows users to select files from their local machines, and send the file names to a server. The server then matches those file names to files located on the server. The server will then return a list of all matching files.
Issue: This works great if a user selects fewer than a few hundred files; otherwise it can cause long response times. I do not want to limit the number of files a user can select, and I don't want to have to worry about the http requests timing out on the front-end.
Sample code so far:
//html on front-end to collect file information
<div>
  <input (change)="add_files($event)" type="file" multiple>
</div>
//function called from the front-end, which then calls the profile_service add_files function
//it passes along the $event object
add_files($event){
  this.profile_service.add_files($event).subscribe(
    data => console.log('request returned'),
    err => console.error(err),
    () => { /* update view function */ }
  );
}
//The following two functions are in my profile_service, which is dependency injected into my component
//formats the event object for the eventual query
add_files(event_obj){
  let file_arr = [];
  let file_obj = event_obj.target.files;
  for(let key in file_obj){
    if (file_obj.hasOwnProperty(key)){
      file_arr.push(file_obj[key]['name']);
    }
  }
  let query_obj = {files: file_arr};
  return this.save_files(query_obj);
}
//here is where the actual request to the back-end is made
save_files(query_obj){
  let payload = JSON.stringify(query_obj);
  let headers = new Headers();
  headers.append('Content-Type', 'application/json');
  return this.http.post('https://some_url/api/1.0/collection', payload, {headers: headers})
    .map((res: Response) => res.json());
}
Possible Solutions:
Process requests in batches. Rewrite the code so that the profile_service is only called with 25 files at a time, and upon each response call the profile_service again with the next 25 files. If this is the best solution, is there an elegant way to do this with observables? If not, I will use recursive callbacks, which should work fine.
Have the endpoint return a generic response immediately, like "file matches being uploaded and saved to your profile". Since all the matching files are persisted to a db on the backend, this would work, and then I could have the front-end query the db every so often to get the current list of matching files. This seems ugly, but I figured I'd throw it out there.
Any other solutions are welcome. It would be great to get a best practice for handling this type of long-running query with Angular2/observables in an elegant way.

I would recommend that you break up the number of files that you search for into manageable batches and then process more as results are returned, i.e. solution #1. The following is untested but, I think, a rather elegant way of accomplishing this:
add_files(event_obj){
  let file_arr = [];
  let file_obj = event_obj.target.files;
  for(let key in file_obj){
    if (file_obj.hasOwnProperty(key)){
      file_arr.push(file_obj[key]['name']);
    }
  }
  let self = this;
  let bufferedFiles = Observable.from(file_arr)
    .bufferCount(25); //Nice round number that you could play with
  return bufferedFiles
    //concatMap will make sure that each of your requests is not executed
    //until the previous one completes. Then all the data is merged into a single output
    .concatMap((arr) => {
      let payload = JSON.stringify({files: arr});
      let headers = new Headers();
      headers.append('Content-Type', 'application/json');
      //Use defer because http.post is eager;
      //this makes it execute only after subscription
      return Observable.defer(() =>
        self.http.post('https://some_url/api/1.0/collection', payload, {headers: headers}))
        .map((resp: Response) => resp.json());
    });
}
concatMap will keep your server from processing more than the size of your buffer at a time, by preventing new requests until the previous one has returned. You could also use mergeMap if you wanted them all to be executed in parallel, but it seems the server is the resource limitation in this case, if I am not mistaken.
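If you want a middle ground between fully sequential and fully parallel, here is a rough sketch (assuming RxJS 5 with the bufferCount/mergeMap operators patched in, and the same hypothetical endpoint as above): mergeMap accepts a concurrency limit, so a few batches can be in flight at once without flooding the server.
//Sketch only: same batching, but allow up to 3 requests in flight at a time.
return Observable.from(file_arr)
  .bufferCount(25)
  .mergeMap((arr) => {
    let payload = JSON.stringify({files: arr});
    let headers = new Headers();
    headers.append('Content-Type', 'application/json');
    return this.http.post('https://some_url/api/1.0/collection', payload, {headers: headers})
      .map((res: Response) => res.json());
  }, 3); //the trailing argument is the concurrency limit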

I'd suggest to use websocket connections instead because they don't time out.
See also
- https://www.npmjs.com/package/angular2-websocket
- http://mmrath.com/post/websockets-with-angular2-and-spring-boot/
- http://www.html5rocks.com/de/tutorials/websockets/basics/
An alternative approach would be polling, where the client makes repeated requests in a defined interval to get the current processing state from the server.
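For the polling route, a rough sketch with RxJS 5 (the status endpoint and its done flag are hypothetical, introduced only for illustration):
//Sketch only: ask the server every 5 seconds for the current state, stop after the first "done" response.
poll_matches(){
  return Observable.interval(5000)
    .switchMap(() => this.http.get('https://some_url/api/1.0/collection/status')
      .map((res: Response) => res.json()))
    .filter(status => status.done)
    .take(1);
}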
To send multiple requests and wait for all of them to complete:
getAll(urls: any[]): Observable<any[]> {
  let observables = [];
  for(var i = 0; i < urls.length; i++) {
    observables.push(this.http.get(urls[i]));
  }
  return Observable.forkJoin(observables);
}

someMethod(server: string) {
  let urls = [
    `${server}/fileService?somedata=a`,
    `${server}/fileService?somedata=b`,
    `${server}/fileService?somedata=c`];
  this.getAll(urls).subscribe(
    (value) => processValue(value),
    (err) => processError(err),
    () => onDone());
}

Related

Can we set forkJoin() API requests to be called one after another, not all at once, in Angular?

There is one scenario where I tried to make multiple update calls to the same endpoint with different bodies in forkJoin(), and the server returned:
"Transaction (Process ID 92) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction."
So the API team is asking me to send the requests one after another. For simplicity, and as recommended by my TL, I'm using forkJoin([]) rather than a for-loop based approach.
Can we configure a setting or solution to make forkJoin() call the array of Observable API endpoints one after another, not all at once? Please help.
let url = 'https://jsonplaceholder.typicode.com/users/';
if (this.inviteUpdateBatch.length) {
  forkJoin(this.http.putAsync(`${url}`, data1),
           this.http.putAsync(`${url}`, data2)).subscribe(() => {});
}
You can simply use the concat RxJS operator.
For example:
const e = of(1);
const s = of(2);
concat(e, s).subscribe(val => {
  console.log('val', val);
});
This will execute the requests one after the other.
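Applied to the calls from the question, it might look like the sketch below (putAsync, url, data1, and data2 come from the snippet above and are assumed to return cold Observables, so the second request does not start until the first completes):
import { concat } from 'rxjs';
import { toArray } from 'rxjs/operators';

//Sketch only: run the two updates sequentially; toArray collects both responses once the sequence completes.
concat(
  this.http.putAsync(`${url}`, data1),
  this.http.putAsync(`${url}`, data2)
).pipe(toArray())
  .subscribe(results => console.log(results));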

How can I improve this paginated do while async await GET request in Javascript (NodeJS)?

I am learning JavaScript (Node.js - using the Pipedream platform). I have been writing scripts to help automate some little tasks in my day to day work.
I am creating one that generates a report on recent interactions with clients.
As part of this I am using axios to get "engagements" from the Hubspot API (basically a list of identifiers I will use in later requests).
The API returns paginated responses. I have encountered pagination previously and understand the principle behind it, but have never written a script to handle it. This is my first.
It works. But I feel it could be improved. Below I've commented how I've approached it.
The endpoint returns up to 100 values 'per page' along with a "hasMore":true flag and an "offset":987654321 value which can be passed as a query parameter in subsequent requests (if hasMore === true).
Example API response:
{"results":[1234,1235,1236],"hasMore":true,"offset":987654321}
My code:
import axios from 'axios';

//function to get each page of data
async function getAssoc(req){
  const options = {
    method: 'GET',
    url: `https://api.hubapi.com/${req}`,
    headers: {
      Authorization: `Bearer ${auths}`,
    },
  };
  return await axios(options);
}
//declare array in which to store all 'associations'
const assocs = [];
//store the ID that I get in an earlier step
const id = vid;
//declare variable in which to temporarily store each request response data
var resp;
//declare query parameter value, initially blank, but will be assigned a value upon subsequent iterations of do while
var offset = '';
do {
  //make request and store response in resp variable
  resp = await getAssoc(`crm-associations/v1/associations/${id}/HUBSPOT_DEFINED/9?offset=${offset}`);
  //push the results into my 'assocs' (associations) array
  resp.data.results.forEach(element => assocs.push(element));
  //store offset value for use in next iteration's request
  offset = resp.data.offset;
} while (resp.data.hasMore); //hasMore will be false when there's no more records to request
return assocs;
I feel it could be improved because:
The DO WHILE loop, I believe, is making sequential requests. Would parallel be a better/faster/more efficient option? (EDIT: Thanks @Evert - of course I cannot make parallel requests because of the offset!)
I'm re-assigning new values to vars instead of using consts, which seems simple and intuitive to my beginner's mind, but I don't know a better way in this instance (see the sketch below).
I would welcome any feedback or suggestions on how I can improve this for my own learning.
Thank you in advance for your time and any assistance you can offer.
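One possible refactor, sketched under the same assumptions as the code above (getAssoc, vid, and HubSpot's offset/hasMore pagination): the requests have to stay sequential because each offset comes from the previous response, but wrapping the loop in a helper lets each iteration's response live in its own const.
//Sketch only: same sequential pagination, with per-iteration consts instead of reassigned vars.
async function getAllAssocs(id) {
  const assocs = [];
  let offset = '';
  while (true) {
    const resp = await getAssoc(`crm-associations/v1/associations/${id}/HUBSPOT_DEFINED/9?offset=${offset}`);
    assocs.push(...resp.data.results);
    if (!resp.data.hasMore) break;
    offset = resp.data.offset;
  }
  return assocs;
}

return await getAllAssocs(vid);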

Call stack size exceeded on re-starting Node function

I'm trying to overcome a "Call stack size exceeded" error, but with no luck.
The goal is to re-run the GET request as long as I get 'music' in the type field.
//tech: node.js + mongoose
//import components
const https = require('https');
const options = new URL('https://www.boredapi.com/api/activity');

//obtain data using GET
https.get(options, (response) => {
  //console.log('statusCode:', response.statusCode);
  //console.log('headers:', response.headers);
  response.on('data', (data) => {
    //process.stdout.write(data);
    apiResult = JSON.parse(data);
    apiResultType = apiResult.type;
    returnDataOutside(data);
  });
})
.on('error', (error) => {
  console.error(error);
});

function returnDataOutside(data){
  apiResultType;
  if (apiResultType == 'music') {
    console.log(apiResult);
  } else {
    returnDataOutside(data);
    console.log(apiResult); //Maximum call stack size exceeded
  };
};
Your function returnDataOutside() is calling itself recursively. If it doesn't get an apiResultType of 'music' the first time, it just keeps calling itself deeper and deeper until the stack overflows, with no chance of ever getting the music type because you're just calling it with the same data over and over.
It appears that you want to rerun the GET request when you don't get the music type, but your code is not doing that - it's just calling your response handler over and over. So, instead, you need to put the code that makes the GET request into a function, and call that new function, which makes a fresh GET request, whenever the apiResultType isn't what you want.
In addition, you shouldn't code something like this so that it keeps going forever, hammering some server. You should have either a maximum number of tries, or a back-off timer, or both.
And, you can't just assume that response.on('data', ...) contains a perfectly formed piece of JSON. If the data is anything but very small, then it may arrive in arbitrarily sized chunks, and it may take multiple data events to get your entire payload. This may work on fast networks but fail on slow networks, or work through some proxies but not others. Instead, you have to accumulate the data from the entire response (all the data events that occur) concatenated together, and then process that final result on the end event.
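A minimal sketch of that pattern with the plain https module (fetchActivity, getMusicActivity, and the maxAttempts guard are names introduced here for illustration):
//Sketch only: wrap the GET in a function, accumulate chunks, parse on 'end', retry with a limit.
const https = require('https');

function fetchActivity() {
  return new Promise((resolve, reject) => {
    https.get('https://www.boredapi.com/api/activity', (response) => {
      let body = '';
      response.on('data', (chunk) => body += chunk);   //accumulate every chunk
      response.on('end', () => {
        try {
          resolve(JSON.parse(body));                   //parse only once the full payload has arrived
        } catch (err) {
          reject(err);
        }
      });
    }).on('error', reject);
  });
}

async function getMusicActivity(maxAttempts = 10) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const apiResult = await fetchActivity();           //fresh GET request each time
    if (apiResult.type === 'music') return apiResult;
  }
  throw new Error('No music activity found within the attempt limit');
}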
While, you can code the plain https.get() to collect all the results for you (there's an example of that right in the doc here), it's a lot easier to just use a higher level library that brings support for a bunch of useful things.
My favorite library to use in this regard is got(), but there's a list of alternatives here and you can find the one you like. Not only do these libraries accumulate the entire request for you with you writing any extra code, but they are promise-based which makes the asynchronous coding easier and they also automatically check status code results for you, follow redirects, etc... - many things you would want an http request library to "just handle" for you.
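For comparison, a sketch of the same retry loop using got (assuming a got v11-style CommonJS API where .json() parses the response body; the back-off delay is illustrative):
//Sketch only: got handles body accumulation, JSON parsing, and status checks for you.
const got = require('got');

async function getMusicActivity(maxAttempts = 10) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const apiResult = await got('https://www.boredapi.com/api/activity').json();
    if (apiResult.type === 'music') return apiResult;
    await new Promise((resolve) => setTimeout(resolve, 1000 * (attempt + 1))); //simple back-off
  }
  throw new Error('No music activity found within the attempt limit');
}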

Synchronize critical section in API for each user in JavaScript

I want to change a user's profile picture. For this, I have to check the database to see if a picture has already been saved; if so, it should be deleted. Then the new one should be saved and entered into the database.
Here is simplified (pseudo) code for that:
async function changePic(user, file) {
  // remove old pic
  if (await database.hasPic(user)) {
    let oldPath = await database.getPicOfUser(user);
    filesystem.remove(oldPath);
  }
  // save new pic
  let path = "some/new/generated/path.png";
  file = await Image.modify(file);
  await Promise.all([
    filesystem.save(path, file),
    database.saveThatUserHasNewPic(user, path)
  ]);
  return "I'm done!";
}
I ran into the following problem with it:
If the user calls the API twice in a short time, serious errors occur. The database queries and the functions in between are asynchronous, so the changes from the first API call haven't been applied yet when the second call checks for a profile picture to delete. I'm then left with a filesystem.remove request for a file that no longer exists, and an unremoved image in the filesystem.
I would like to safely handle that situation by synchronizing this critical section of code. I don't want to reject requests only because the server hasn't finished the previous one and I also want to synchronize it for each user, so users aren't bothered by the actions of other users.
Is there a clean way to achieve this in JavaScript? Some sort of monitor, like you know from Java, would be nice.
You could use a library like p-limit to control your concurrency. Use a map to track the active/pending requests for each user. Use their ID (which I assume exists) as the key and the limit instance as the value:
const pLimit = require('p-limit');
const limits = new Map();

function changePic(user, file) {
  async function impl(user, file) {
    // your implementation from above
  }
  const { id } = user; // or similar to distinguish them
  if (!limits.has(id)) {
    limits.set(id, pLimit(1)); // only one active request per user
  }
  const limit = limits.get(id);
  return limit(impl, user, file); // schedule impl for execution
}
// TODO clean up limits to prevent memory leak?
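One way to address that TODO, sketched here as an adjustment of the function above (the pending map is introduced purely for illustration): count outstanding calls per user yourself and drop the limiter once none remain.
// Sketch only: track outstanding calls per user and delete the limiter when the last one settles.
const pLimit = require('p-limit');
const limits = new Map();
const pending = new Map();

function changePic(user, file) {
  async function impl(user, file) {
    // your implementation from above
  }
  const { id } = user;
  if (!limits.has(id)) {
    limits.set(id, pLimit(1)); // only one active request per user
    pending.set(id, 0);
  }
  pending.set(id, pending.get(id) + 1);
  const limit = limits.get(id);
  return limit(impl, user, file).finally(() => {
    pending.set(id, pending.get(id) - 1);
    if (pending.get(id) === 0) { // nothing active or queued for this user any more
      limits.delete(id);
      pending.delete(id);
    }
  });
}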

NodeJS: How to handle a variable number of callbacks run in parallel and map their responses to requests?

As an exercise to teach myself more about Node.js, I started making a basic CRUD REST server for SimpleDB (sdb) using the aws-sdk.
Everything was running smoothly until I got to a function for reading the domains. The aws-sdk has two functions for this purpose: listDomains and domainMetadata. listDomains returns an array of sdb domain names. domainMetadata will return additional statistics about a domain, but will only return them for one domain at a time. It does not include the domain name in the results.
My script is running listDomains and returning an array in the JSON response just fine. I would like to make my api readDomains function more ambitious though and have it return the metadata for all of the domains in the same single api call. After all, running a handful of domainMetadata calls at the same time is where node's async io should shine.
The problem is I can't figure out how to run a variable number of calls, use the same callback for all of them, match the results of each domainMetadata call to its domainName (since it's async and they're not guaranteed to return in the order they were requested), and tell when all of the metadata requests have finished so that I can send my final response. Put into code, my problem areas are:
domain.receiveDomainList = function(err, data){
  var domainList = [];
  for(var i = 0; i < data.DomainNames.length; i++){
    sdb.domainMetaData({"DomainName": data.DomainNames[i]}, domain.receiveMetadata);
    // alternatively: domainList.push({"DomainName": data.DomainNames[i]});
  }
  // alternatively:
  // async.map(domainList, sdb.domainMetadata, domain.receiveMetadata)
  console.log(domainList);
}

domain.receiveMetadata = function (err, data){
  // I figure I can stash the results one at a time in an array in the
  // parent scope but...
  // How can I tell when all of the results have been received?
  // Since the domainname used for the original call is not returned with
  // the results how do I tell what result matches what request?
}
Based on my reading of async's readme, the map function should at least match the metadata responses with the requests through some black magic, but it causes Node to bomb out in the aws-sdk library with an error of "has no method 'makeRequest'".
Is there any way to have it all: requests run in parallel, requests matched with responses, and knowing when I've received everything?
Using .bind() you can set the context (the this value) as well as provide leading arguments to the bound function.
The sample code below is purely to show how you might use .bind() to add additional context to your response callbacks.
In the code below, .bind is used to:
set a domainResults object as the context for the receiveMetaData callback
pass the current domain name as an argument to the callback
The domainResults object is used to:
keep track of the number of names received in the first request
keep track of the completedCount (incremented on each callback from the metaData request)
keep track of both error and success responses in list
provide a complete callback
Completely untested code for illustrative purposes only:
domain.receiveDomainList = function(err, data) {
  // Assuming err is falsey
  var domainResults = {
    nameCount: data.DomainNames.length,
    completeCount: 0,
    list: {},
    complete: function() {
      console.log(this.list);
    }
  };
  for (var i = 0; i < data.DomainNames.length; i++) {
    sdb.domainMetaData({ "DomainName": data.DomainNames[i] },
      domain.receiveMetadata.bind(domainResults, data.DomainNames[i]));
  }
}

domain.receiveMetadata = function(domainName, err, data) {
  // Because of .bind, this === domainResults
  this.completeCount++;
  this.list[domainName] = data || {error: err};
  if(this.completeCount === this.nameCount) {
    this.complete();
  }
}
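An alternative sketch using Promises (written here purely for comparison; it still uses the sdb call from the question): wrapping each domainMetaData call in a promise that carries its domain name answers both questions at once, because Promise.all resolves only when every request has finished and returns the results in request order.
// Sketch only: pair each domain name with its metadata and wait for all requests to finish.
domain.receiveDomainList = function(err, data) {
  var requests = data.DomainNames.map(function(name) {
    return new Promise(function(resolve) {
      sdb.domainMetaData({ "DomainName": name }, function(err, metadata) {
        resolve({ name: name, metadata: metadata, error: err || null });
      });
    });
  });
  Promise.all(requests).then(function(results) {
    console.log(results); // one entry per domain, in request order
  });
};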
