Handling large data sets on client side - JavaScript

I'm trying to build an application that uses Server-Sent Events in order to fetch and show some tweets (the latest 50-100 tweets) on the UI.
URL for SSE:
https://tweet-service.herokuapp.com/stream
Problem(s):
My UI is becoming unresponsive because there is a huge amount of data coming in!
How do I make sure my UI stays responsive? What strategies should I adopt to handle this volume of data?
Current Setup: (For better clarity on what I'm trying to achieve)
Currently I have a max-heap with a custom comparator to keep the latest 50 tweets.
Every time there's a change, I am re-rendering the page with the new max-heap data.

We should not keep the EventSource open indefinitely, since this will block the main thread if too many messages are sent in a short amount of time. Instead, we should only keep the event source open for as long as it takes to get 50-100 tweets. For example:
function getLatestTweets(limit) {
  return new Promise((resolve, reject) => {
    let items = [];
    let source = new EventSource('https://tweet-service.herokuapp.com/stream');
    source.onmessage = ({ data }) => {
      if (limit-- > 0) {
        items.push(JSON.parse(data));
      } else {
        // resolve this promise once we have reached the specified limit
        resolve(items);
        source.close();
      }
    };
  });
}
getLatestTweets(100).then(e => console.log(e))
You can then compare these tweets to previously fetched tweets to figure out which ones are new, and then update the UI accordingly. You can use setInterval to call this function periodically to fetch the latest tweets.
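As a rough sketch of that polling idea (this assumes each tweet object has an id field to deduplicate on, which is a guess about the payload shape, and renderTweets is a hypothetical stand-in for whatever updates your UI or max-heap):
let knownIds = new Set();

setInterval(() => {
  getLatestTweets(100).then(tweets => {
    // keep only tweets we haven't already rendered
    const fresh = tweets.filter(t => !knownIds.has(t.id));
    fresh.forEach(t => knownIds.add(t.id));
    if (fresh.length > 0) {
      renderTweets(fresh); // hypothetical UI update, e.g. rebuilding the max-heap view
    }
  });
}, 10000); // poll every 10 seconds; adjust to taste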

Related

Angular: Reduce Query Loading Time in Firebase Database

I have an Angular app where I am querying my Firebase database as below:
constructor() {
  this.getData();
}

getData() {
  this.projectSubscription$ = this.dataService.getAllProjects()
    .pipe(
      map((projects: any) =>
        projects.map(sc => ({ key: sc.key, ...sc.payload.val() }))
      ),
      switchMap(appUsers => this.dataService.getAllAppUsers()
        .pipe(
          map((admins: any) =>
            appUsers.map(proj => {
              const match: any = admins.find(admin => admin.key === proj.admin);
              return { ...proj, imgArr: this.mapObjectToArray(proj.images), adminUser: match.payload.val() };
            })
          )
        )
      )
    ).subscribe(res => {
      this.loadingState = false;
      this.projects = res.reverse();
    });
}

mapObjectToArray = (obj: any) => {
  const mappedDatas = [];
  for (const key in obj) {
    if (Object.prototype.hasOwnProperty.call(obj, key)) {
      mappedDatas.push({ ...obj[key], id: key });
    }
  }
  return mappedDatas;
};
And here is what I am querying inside dataService:
getAllProjects() {
  return this.afDatabase.list('/projects/', ref => ref.orderByChild('createdAt')).snapshotChanges();
}

getAllAppUsers() {
  return this.afDatabase.list('/appUsers/', ref => ref.orderByChild('name')).snapshotChanges();
}
The problem I am facing is that I have 400 rows of data to load, and it takes around 30 seconds, which is insanely high. Any idea how I can query this faster?
We have no way to know whether the 30s is reasonable, as that depends on the amount of data loaded, the connection latency and bandwidth of the client, and more factors we can't know/control.
But one thing to keep in mind is that you're performing 400 queries to get the users of each individual app, which is likely not great for performance.
Things you could consider:
Pre-load all the users once, and then use that list for each project.
Duplicate the name of each user into each project, so that you don't need to join any data at all.
If you come from a background in relational databases the latter may be counterintuitive, but it is actually very common in NoSQL data modeling and is one of the reasons NoSQL databases scale so well.
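As a rough sketch of the first suggestion, reusing the dataService methods and mapObjectToArray from the question, the two lists can be loaded side by side and joined in memory (combineLatest simply waits for both lists; the user list is loaded once and reused for every project):
// at the top of the component file
import { combineLatest } from 'rxjs';
import { map } from 'rxjs/operators';

// inside the component class
getData() {
  this.projectSubscription$ = combineLatest([
    this.dataService.getAllProjects(),
    this.dataService.getAllAppUsers() // loaded once, reused for every project
  ]).pipe(
    map(([projects, admins]: [any, any]) =>
      projects.map(sc => {
        const proj: any = { key: sc.key, ...sc.payload.val() };
        const match: any = admins.find(admin => admin.key === proj.admin);
        return {
          ...proj,
          imgArr: this.mapObjectToArray(proj.images),
          adminUser: match ? match.payload.val() : null
        };
      })
    )
  ).subscribe(res => {
    this.loadingState = false;
    this.projects = res.reverse();
  });
}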
I propose 3 solutions.
1. Pagination
Instead of returning all those documents on app load, limit them to just 10 and keep a record of the last one. Then display those 10 (or any arbitrary base number).
Then make the UI such that when the user clicks next or scrolls, you fetch the next set based on the last document of the previous batch.
I'm supposing you need to display all the fetched data in some table or list so having the UI paginate the data should make sense.
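For reference, a minimal sketch of what the paged Realtime Database query could look like with AngularFire (pageSize and lastCreatedAt are illustrative names; the extra row fetched acts as the cursor for the next page):
getProjectsPage(pageSize: number, lastCreatedAt?: number) {
  return this.afDatabase.list('/projects/', ref => {
    // order by createdAt; start at the last value we saw and grab one extra row as the cursor
    let query = ref.orderByChild('createdAt');
    if (lastCreatedAt != null) {
      query = query.startAt(lastCreatedAt);
    }
    return query.limitToFirst(pageSize + 1);
  }).snapshotChanges();
}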
2. Loader
Show some loader UI on website load. Then, when all the documents have been fetched, hide the loader and show the data as you want. You can build something custom for the loader, choose from any of the abundant libraries out there, or use mat-progress-spinner from Angular Material.
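For instance, a minimal sketch using the loadingState flag already set in the question's subscribe callback (this assumes MatProgressSpinnerModule is imported in the module; the component and selector names are illustrative):
import { Component } from '@angular/core';

@Component({
  selector: 'app-projects', // illustrative
  template: `
    <!-- spinner while loadingState is true, data once it flips to false -->
    <mat-progress-spinner *ngIf="loadingState" mode="indeterminate"></mat-progress-spinner>
    <ng-container *ngIf="!loadingState">
      <!-- render the projects list here -->
    </ng-container>
  `
})
export class ProjectsComponent {
  loadingState = true; // set to false in the subscribe callback shown above
}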
3. onCall Cloud Function
What if you try getting them through an onCall Cloud Function? It might be faster because it's just one request that the app will make, and Firebase's Cloud Functions are very fast within Google's data centers.
The user's network might be slow at iterating over the documents, whereas the Cloud Function fetches everything server-side and returns it all at once, and that might give you what you want.
I guess you could go for this option only if you really really need to display all that data at once on website load.
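A rough sketch of what such a function could look like (this assumes the firebase-functions onCall API and the Admin SDK; getAllProjects is an illustrative function name, not something from the question):
// functions/index.js (Cloud Functions side)
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();

exports.getAllProjects = functions.https.onCall(async (data, context) => {
  // one server-side read, close to the database
  const snapshot = await admin
    .database()
    .ref('/projects')
    .orderByChild('createdAt')
    .once('value');
  return snapshot.val();
});
On the Angular side you would then make a single call to it, for example via AngularFireFunctions' httpsCallable('getAllProjects'), instead of subscribing to the two lists separately.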
Note on cost
Fetching 400 or more documents every time a given website loads might be expensive. It'll be expensive if the website is visited very frequently by very many users. Firebase cost will increase as you are charged per document read too.
Check to see if you could optimise the data structure to avoid fetching this much.
This doesn't apply to you if this is some admin dashboard, or if fetching all the data like this happens rarely enough that the cost stays low.

For Loop Hanging Browser. How to Populate DataTable Efficiently

I am trying to populate a DataTable (DataTables.net) from a JavaScript array of objects. I don't have a problem doing so when the data is relatively small, say less than 6,000 rows, but the data can be as large as 20,000 rows. When the data is larger, Google Chrome hangs and I get the message to wait or exit. Firstly, my current code:
var data = [];
for (var i = 0; i < arr_Members_in_Radius.length; i++) {
  data.push([arr_Members_in_Radius[i].record_id]);
}

var search_results_table_2 = $('#tbl_search_results').DataTable({
  destroy: true,
  data: data
});
The above code hangs when data is large. So, following this link How to rewrite forEach to use Promises to stop "freezing" browsers? I implemented a Promise based approach:
function PopulateDataTable(current_row) {
  search_results_table.row.add([current_row.record_id, current_row.JOA_Enrollment, current_row.mbr_full_name,
    current_row.member_address, current_row.channel_subchannel, current_row.joa_clinic, current_row.serviceaddress]).draw();
}

var wait = ms => new Promise(resolve => setTimeout(resolve, ms));

arr_Members_in_Radius.reduce((p, i) => p.then(() => PopulateDataTable(i)).then(() => wait(1)), Promise.resolve());
While that seems to solve the problem of the browser hanging, the DataTable gets updated one row at a time, which is very time consuming and makes the table unusable (scrolling issues, searching, sorting, etc.) until all, or at least enough, of the 6,000 rows are loaded. It would be nice if the DataTable were loaded at least 100+ rows at a time in the PopulateDataTable() call. I wonder how I would be able to do that. Or please suggest a different approach.
Thank you!
It turns out that the DataTable was indeed able to handle 20,000 rows of data without any issue. The problem was, and to some extent still is, that a subsequent call to another function that loads large data onto a map (a Leaflet.js map) was causing the browser to hang. So I guess there was too much data processing happening too close together. Here is how I fixed it: the call that displays the data on the map now waits for 3 seconds before running. I will tweak my code to increase/reduce the timeout based on the amount of data. Not an elegant solution, but the browser doesn't hang anymore and I get to display all the data per the user's selections.
setTimeout(function () {
  createRouteOnMap(arr_Members_in_Radius);
}, 3000);
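As for the original wish to add rows in batches of 100+ instead of one at a time, here is a minimal sketch (it assumes search_results_table is the already-initialized DataTable from the question and relies on DataTables' rows.add()/draw() API; the chunk size is arbitrary):
function populateDataTableInChunks(rows, chunkSize = 500) {
  let index = 0;

  function addNextChunk() {
    const chunk = rows.slice(index, index + chunkSize).map(r => [
      r.record_id, r.JOA_Enrollment, r.mbr_full_name,
      r.member_address, r.channel_subchannel, r.joa_clinic, r.serviceaddress
    ]);
    // add a whole batch of rows, then draw once per batch instead of once per row
    search_results_table.rows.add(chunk).draw(false);
    index += chunkSize;
    if (index < rows.length) {
      // yield back to the browser so the UI stays responsive between batches
      setTimeout(addNextChunk, 0);
    }
  }

  addNextChunk();
}

populateDataTableInChunks(arr_Members_in_Radius);
Drawing once per batch rather than once per row, and yielding to the browser between batches, keeps the page responsive while still filling the table quickly.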

Node JS Auto Reload

I want to make a program with JavaScript or Node.js. What I want to achieve is that when there is a new item in the RSS feed I am reading, it gets logged through the terminal. In the future I will put the code on Firebase hosting, so the code needs to be able to run by itself; the log I get may later be changed into a text file or stored in a database.
So, like this: I run the program and get all the items in the RSS feed, but when there is a new item I don't want to have to run node app.js again; every time there is a new item in the RSS feed it should display the log by itself automatically.
So far I made it with Node.js and I use rss-parser, and the code I use is like this:
let Parser = require('rss-parser');
let parser = new Parser();

(async () => {
  let feed = await parser.parseURL('https://rss-checker.blogspot.com/feeds/posts/default?alt=rss');
  feed.items.forEach(items => {
    console.log(items);
  });
})();
There are three common ways to achieve this:
Polling
Stream push
Webhook
Based on your code sample I assume that the RSS feed is request/response. This lends itself well to polling.
A poll-based program will make a request to a resource on an interval. This interval should be informed by resource limits and the performance the end user expects. Ideally the API will accept an offset or a page so you could request all feeds above some ID. This makes the program stateful.
setInterval can be used to drive the polling loop. Below shows an example of the poller loop with no state management. It polls at 5 second intervals:
let Parser = require('rss-parser');
let parser = new Parser();

setInterval(async () => {
  let feed = await parser.parseURL('https://rss-checker.blogspot.com/feeds/posts/default?alt=rss');
  feed.items.forEach(items => {
    console.log(items);
  });
}, 5000);
This is incomplete because it needs to keep track of already seen posts. Creating a poll loop means you have a stateful process that needs to stay running.
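A minimal sketch of that state management, assuming each feed item carries a unique guid or link to key on (the state lives in memory only, so it resets when the process restarts):
let Parser = require('rss-parser');
let parser = new Parser();

const seen = new Set(); // keys of items we have already logged

setInterval(async () => {
  const feed = await parser.parseURL('https://rss-checker.blogspot.com/feeds/posts/default?alt=rss');
  for (const item of feed.items) {
    const key = item.guid || item.link; // whichever unique field the feed provides
    if (!seen.has(key)) {
      seen.add(key);
      console.log('new item:', item.title);
    }
  }
}, 5000);
Persisting those keys to a file or database would let the state survive restarts, which matters once the process is hosted somewhere.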

Synchronize critical section in API for each user in JavaScript

I want to swap a user's profile picture. To do this, I have to check the database to see whether a picture has already been saved; if so, it should be deleted. Then the new one should be saved and entered into the database.
Here is a simplified (pseudo) code of that:
async function changePic(user, file) {
  // remove old pic
  if (await database.hasPic(user)) {
    let oldPath = await database.getPicOfUser(user);
    filesystem.remove(oldPath);
  }

  // save new pic
  let path = "some/new/generated/path.png";
  file = await Image.modify(file);
  await Promise.all([
    filesystem.save(path, file),
    database.saveThatUserHasNewPic(user, path)
  ]);

  return "I'm done!";
}
I ran into the following problem with it:
If the user calls the API twice in a short time, serious errors occur. The database queries and the functions in between are asynchronous, so the changes from the first API call haven't been applied yet when the second call checks for a profile pic to delete. I'm then left with a filesystem.remove request for a file that no longer exists, and an image in the filesystem that never gets removed.
I would like to safely handle that situation by synchronizing this critical section of code. I don't want to reject requests only because the server hasn't finished the previous one and I also want to synchronize it for each user, so users aren't bothered by the actions of other users.
Is there a clean way to achieve this in JavaScript? Some sort of monitor, like you know from Java, would be nice.
You could use a library like p-limit to control your concurrency. Use a map to track the active/pending requests for each user. Use their ID (which I assume exists) as the key and the limit instance as the value:
const pLimit = require('p-limit');
const limits = new Map();

function changePic(user, file) {
  async function impl(user, file) {
    // your implementation from above
  }

  const { id } = user; // or similar to distinguish them
  if (!limits.has(id)) {
    limits.set(id, pLimit(1)); // only one active request per user
  }
  const limit = limits.get(id);
  return limit(impl, user, file); // schedule impl for execution
}

// TODO clean up limits to prevent memory leak?
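One way to address that TODO, sketched under the assumption that the p-limit instance exposes activeCount and pendingCount (recent versions do), deferring the check so p-limit can finish its own bookkeeping first:
const pLimit = require('p-limit');
const limits = new Map(); // same setup as above

function changePic(user, file) {
  async function impl(user, file) {
    // your implementation from above
  }

  const { id } = user;
  if (!limits.has(id)) {
    limits.set(id, pLimit(1));
  }
  const limit = limits.get(id);

  const result = limit(impl, user, file);
  const cleanup = () => setImmediate(() => {
    // drop the entry once this user's queue has fully drained,
    // so the map doesn't grow forever
    if (limit.activeCount === 0 && limit.pendingCount === 0) {
      limits.delete(id);
    }
  });
  result.then(cleanup, cleanup); // run cleanup whether impl succeeded or failed
  return result;
}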

Progress event for large Firebase queries?

I have the following query:
fire = new Firebase 'ME.firebaseio.com'
users = fire.child 'venues/ID/users'
users.once 'value', (snapshot) ->
  # do things with snapshot.val()
  ...
I am loading 10+ mb of data, and the request takes around 1sec/mb. Is it possible to give the user a progress indicator as content streams in? Ideally I'd like to process the data as it comes in as well (not just notify).
I tried using the on "child_added" event instead, but it doesn't work as expected - instead of children streaming in at a consistent rate, they all come at once after the entire dataset is loaded (which takes 10-15 sec), so in practice it seems to be a less performant version of on "value".
You should be able to optimize your download time from 10-20 seconds down to a few milliseconds by starting with some denormalization.
For example, we could move the images and any other peripherals comprising the majority of the payload to their own path, keep only the meta data (name, email, etc) in the user records, and grab the extras separately:
/users/user_id/name, email, etc...
/images/user_id/...
The number of event listeners you attach or paths you connect to does not have any significant overhead locally or for networking bandwidth (just the payload) so you can do something like this to "normalize" after grabbing the meta data:
var firebaseRef = new Firebase(URL);
firebaseRef.child('users').on('child_added', function(snap) {
  console.log('got user ', snap.name());
  // I chose once() here to snag the image, assuming they don't change much
  // but on() would work just as well
  firebaseRef.child('images/' + snap.name()).once('value', function(imageSnap) {
    console.log('got image for user ', imageSnap.name());
  });
});
You'll notice right away that when you move the bulk of the data out and keep only the meta info for users locally, they will be lightning-fast to grab (all of the "got user" logs will appear right away). Then the images will trickle in one at a time after this, allowing you to create progress bars or process them as they show up.
If you aren't willing to denormalize the data, there are a couple ways you could break up the loading process. Here's a simple pagination approach to grab the users in segments:
var firebaseRef = new Firebase(URL);
grabNextTen(firebaseRef, null);

function grabNextTen(ref, startAt) {
  ref.limit(startAt ? 11 : 10).startAt(startAt).once('value', function(snap) {
    var lastEntry;
    snap.forEach(function(userSnap) {
      // skip the startAt() entry, which we've already processed
      if (userSnap.name() === startAt) { return; }
      processUser(userSnap);
      lastEntry = userSnap.name();
    });
    // setTimeout closes the call stack, allowing us to recurse
    // infinitely without a maximum call stack error;
    // stop once a page comes back with no new entries
    if (lastEntry !== undefined) {
      setTimeout(grabNextTen.bind(null, ref, lastEntry));
    }
  });
}

function processUser(snap) {
  console.log('got user', snap.name());
}

function didTenUsers(lastEntry) {
  console.log('finished up to ', lastEntry);
}
A third popular approach would be to store the images in a static cloud asset like Amazon S3 and simply store the URLs in Firebase. For large data sets in the hundreds of thousands this is very economical, since those solutions are a bit cheaper than Firebase storage.
But I'd highly suggest you both read the article on denormalization and invest in that approach first.
