PHP: Request 50MB-100MB JSON - browser crashes / does not display any result - javascript

Huge JSON server responses: around 50 MB - 100 MB, for example.
From what I know, the browser might crash when loading that much data into a table (I usually use DataTables); the result: memory reaches almost 8 GB and the browser crashes. Chrome might not return a result at all, and Firefox will usually ask whether I want to wait or kill the process.
I'm about to start working on a project that will send requests for huge JSONs, all compressed (done on the server side in PHP). The purpose of my report is to fetch data and display all of it in a table that is easy to filter and order, so I can't see how lazy loading would help in this specific case.
I might use a Vue.js datatable library this time (not sure which one specifically).
What exactly is using so much of my memory? I know for sure that the JSON result is received. Is it the parsing/rendering of the JSON into the DOM? (I'm referring to the DataTables example for now: https://datatables.net/examples/data_sources/ajax)
What are the best practices in this kind of situation?
I started researching this issue and noticed that there are posts from 2010 that no longer seem relevant at all.

There is no limit on the size of an HTTP response. There are limits on other things, such as:
local storage
session storage
cache
cookies
query string length
memory (per your hardware limitations or the browser's allocation)
Instead, the problem is most likely with the implementation of your datatable. You can't just insert 100,000 nodes into the DOM and not expect some kind of performance impact. Furthermore, if the datatable is running logic against each of those records as they come in and processing them before the node insertion, that's also going to be a big no-no.
What you've done here is essentially pass the legwork of pagination from the server to the client, with dire consequences.
If you must return a response that big, consider using one of the storage options that browsers provide (a few are mentioned above), and then paginate off of the stored JSON response.
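A minimal sketch of that idea, assuming a hypothetical /api/report endpoint, a 50-row page size, and records with id/name fields. Note that for responses in the 50-100 MB range the sessionStorage quota (typically around 5 MB) is not enough, so you would keep the parsed array in memory or move it to IndexedDB instead:

const PAGE_SIZE = 50; // hypothetical page size

async function loadReport() {
  // reuse the cached copy if we already fetched it in this tab
  let rows = JSON.parse(sessionStorage.getItem('report') || 'null');
  if (!rows) {
    const res = await fetch('/api/report'); // hypothetical endpoint
    rows = await res.json();
    try {
      sessionStorage.setItem('report', JSON.stringify(rows));
    } catch (e) {
      // quota exceeded: keep the array in memory (or move it to IndexedDB)
      console.warn('Response too large for sessionStorage, keeping it in memory only.');
    }
  }
  return rows;
}

function renderPage(rows, page, tbody) {
  // only PAGE_SIZE rows ever reach the DOM, which is what keeps memory flat
  const slice = rows.slice(page * PAGE_SIZE, (page + 1) * PAGE_SIZE);
  tbody.innerHTML = slice
    .map(r => `<tr><td>${r.id}</td><td>${r.name}</td></tr>`)
    .join('');
}

Either way, the key point is that only one page of rows is ever rendered into the DOM at a time.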

Related

How to partially initialize an array with huge load of data from API response

I use Axios to get a response from an endpoint and then put that response into a variable as an object. The response I get from the server contains more than 10,000 records.
My questions are:
How do I process the data so it doesn't take up too much memory and can still be inserted into a table?
Is there a request handler that can process that response better?
Or is there a method to partially download the response so that it can be consumed piece by piece by the user on the frontend?
There are a few things to look at here:
Every site has a memory limit (assigned by the browser), and the site crashes if memory usage exceeds it.
The short and ugly fix would be to take the whole response but store and render only a part of it (ugly fix).
The right and correct fix would be to implement pagination and limit the amount of data per request. That way you'll have manageable data and a good user experience, with tabs/pages and limited scrolling, and most importantly the site's memory usage stays optimized.
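A rough sketch of that pagination approach with Axios, assuming a hypothetical /api/items endpoint that understands page and limit query parameters (your API may use offset/cursor parameters instead):

import axios from 'axios';

const PAGE_SIZE = 100;

async function fetchPage(page) {
  const { data } = await axios.get('/api/items', {
    params: { page, limit: PAGE_SIZE },
  });
  return data; // only PAGE_SIZE records are held in memory and rendered at a time
}

// e.g. render page 1 first, then fetch further pages as the user navigates
const firstPage = await fetchPage(1);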

Best way to pull static data

Consider I have a zoo app that shows all the zoos for each city. Each city is a page with a list of zoos.
In my current solution, on each page, I have ajax call to the server that pulls the list of the zoos for that particular city.
The performance is extremely important for me and my thought was to remove the ajax call and replace it with a JSON object that will live in the app. That way I will save a call to the server and I believe the data will arrive faster.
Does this solution make sense? There are around 40 cities with ~50 zoos each.
Consider the data is static and will never change.
Since 900 records is not much **, you can fetch all the records at once during the initial load and filter the full array by city. That way the user experience will be much smoother, since client-side JS processing is far cheaper than network latency.
** - note: strictly considering a data set size of ~900
Another solution: cache the data in the session scope, and whenever there is a request for a specific city, check whether it is available in the session scope; if it's not there, make a network call.
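A rough illustration of the load-once-and-filter approach (the /api/zoos endpoint and the city field name are assumptions):

let allZoos = null;

async function getZoosForCity(city) {
  if (!allZoos) {
    const res = await fetch('/api/zoos'); // single call on first use
    allZoos = await res.json();           // ~900 records is cheap to keep in memory
  }
  return allZoos.filter(z => z.city === city); // no network latency on later views
}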
I think the correct question is: what are my performance requirements?
You could write all your data into a JSON object and do everything on the client side without any AJAX call, but in that case every client who visits your page will have to download all of the data, and that is another question mark.

Scraping a webpage that is using a firebase database

DISCLAIMER: I'm just learning by doing, I have no bad intentions
So, I would like to fetch the list of the applications listed on this website: http://roaringapps.com/apps
I've done similar things in the past, but with simpler websites; this time I'm having problems getting my hands on the data behind this webpage.
The scrolling from page to page is blazing fast so, to understand how the webpage works, I've fired up a packet sniffer and analyzed the traffic. I've noticed that, after the initial loading, no traffic is exchanged between the server and my client, even if I scroll over 2500 records in the browser. How is that possible?
Anyhow, my understanding is that the website is loading the data from a stream of some sort and rendering it via JavaScript. Am I correct?
So, I've fired up chromium devtools a looked at the "network" tab, and saw that a WebSocket request is made to the following address: wss://s-usc1c-nss-123.firebaseio.com
At this point, after googling a bit, I've tried to query the very same server, using the "v=5&ns=roaringapps" query I saw on the devtools window:
import json
from websocket import create_connection

# open a WebSocket to the Firebase host seen in devtools and send the handshake
ws = create_connection('wss://s-usc1c-nss-123.firebaseio.com')
ws.send('v=5&ns=roaringapps')
print(json.loads(ws.recv()))
And got this reply:
{u't': u'c', u'd': {u't': u'h', u'd': {u'h': u's-usc1c-nss-123.firebaseio.com', u's': u'JUL5t1nC2SXfGaIjwecB6G13j1OsmMVv', u'ts': 1476799051047L, u'v': u'5'}}}
I was expecting to see a JSON response with the raw data about the applications and so on. What am I doing wrong?
Thanks a lot!
UPDATE
Actually, I just found out that the website is using JSON to load its data. I was not seeing it in repeated requests, probably because of caching - but disabling the cache in Chromium did the trick.
While the Firebase Database allows you to read/write JSON data, its SDKs don't simply transfer the raw JSON: they do many tricks on top of that to ensure an efficient and smooth experience. What you're getting there is Firebase's wire protocol. The protocol is not publicly documented, and (if you're new to it) trying to unravel it is going to give you an unpleasant time.
To retrieve the actual JSON at a location, it's easiest to use Firebase's REST API. You can use it by simply appending .json to the URL and firing an HTTP GET request against that.
So if the initial data is being loaded from:
https://mynamespace.firebaseio.com/path/to/data
You'd get the raw JSON by firing a HTTP GET against:
https://mynamespace.firebaseio.com/path/to/data.json
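For example, a minimal fetch against that REST endpoint (the namespace and path are the placeholders from above):

const url = 'https://mynamespace.firebaseio.com/path/to/data.json';

fetch(url)
  .then(res => res.json())
  .then(data => console.log(data))   // plain JSON, no wire-protocol framing
  .catch(err => console.error(err));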

Where to put "a lot" of data, array / file / somewhere else, in JS on node.js

This may be a "stupid" question to ask, but I am working with "a lot" of data for the first time.
What I want to do: Querying the World Bank API
Problem: The API is very inflexible when it comes to searching/filtering... I could query every country/indicator by itself, but that would generate a lot of calls. So I wanted to download all the information about a country or indicator at once and then sort it on my machine.
My question: Where/how should I store the data? Can I simply put it into an array, or do I have to worry about its size? Should I write it to a temporary JSON file? Or do you have another idea?
Thanks for your time!
Example:
20 Countries, 15 Indicators
If I queried every country by itself I would generate 20*15 = 300 API calls; if I called ALL countries for one indicator it would result in 15 API calls, but I would get a lot of "junk" data :/
You can keep the data in RAM in an appropriate data structure (array or object) if the following are true:
The data is only needed temporarily (during one particular operation) or can easily be retrieved again if your server restarts.
You have enough available RAM for your node.js process to store the data. In a typical server environment, there might be more than a GB of RAM available. I wouldn't recommend using all of that, but you could easily use 100 MB of it for data storage.
Keeping it in RAM will likely make it faster and easier to interact with than storing it on disk. The data will, obviously, not be persistent across server restarts if it is in RAM.
If the data is needed long term and you only want to fetch it once and then have access to it over and over again even if your server restarts, or if the data is more than hundreds of MBs, or if your server environment does not have a lot of RAM, then you will want to write the data to an appropriate database where it will persist and can be queried as needed.
If you don't know how large your data will be, you can write code that temporarily puts it in an array/object and then observe the memory usage of your node.js process after the data has been loaded.
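For example, a self-contained sketch in node.js; the loop below just fakes a few hundred thousand World Bank-style records so you can see what they actually cost in heap:

function logHeapUsage(label) {
  const { heapUsed } = process.memoryUsage();
  console.log(`${label}: ${(heapUsed / 1024 / 1024).toFixed(1)} MB heap used`);
}

logHeapUsage('before load');
const records = [];
for (let i = 0; i < 300000; i++) {
  // stand-in for one World Bank data point
  records.push({ country: 'XX', indicator: 'SP.POP.TOTL', year: 2000 + (i % 20), value: i });
}
logHeapUsage('after load');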
I would suggest storing it in a nosql database, since you'll be working with JSON, and querying from there.
mongodb is very 'node friendly' - there's the native driver - https://github.com/mongodb/node-mongodb-native
or mongoose
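A minimal sketch with the native driver; the connection string, database and collection names, and the query fields are placeholders:

const { MongoClient } = require('mongodb');

async function storeAndQuery(records) {
  const client = new MongoClient('mongodb://localhost:27017');
  await client.connect();
  const col = client.db('worldbank').collection('indicators');

  await col.insertMany(records);                        // dump the API responses once
  const result = await col
    .find({ country: 'DE', indicator: 'SP.POP.TOTL' })  // then query/filter locally
    .toArray();

  await client.close();
  return result;
}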
Storing data from an external source you don't control brings with it the complexity of keeping the data in sync if the data happens to change. Without knowing your use case or the API it's hard to make recommendations. For example, are you sure you need the entire data set? Is there a way to filter down the data based on information you already have (user input, etc)?

Browser-based caching for remote resources

I have two REST-ful resources on my server:
/someEntry/{id}
Response:
{
  someInfoAboutEntry: ...,
  entryTypeUrl: "/entryType/12345"
}
and
/entryType/{id}
Response:
{
  someInfoAboutEntryType: ...
}
The entryTypeUrl is used to fetch additional data about the type of this entry from a different URL. It will be bound to a "Detailed information" button near each entry. There can be many (let's say 100) entries, while there are only 5 types (so most entries point to the same entryTypeUrl).
I'm building a Javascript client to access those resources. Should I cache entryType results in my Javascript code, or should I rely on the browser to cache the data for me and dispatch XHR requests every time user clicks the "Detailed information" button?
As far as I can see, both approaches should work just fine. The second one (always dispatching requests) will result in clearer code, though. Should I stick with it, or are there some points I'm not aware of?
Thanks in advance.
I would definitely let the browser manage the caching, rather than writing a custom caching layer yourself.
This way you have less code to write and maintain, and you allow the server to dictate (via its HTTP headers) whether the response should be cached or not. If you write your own caching code you remove the ability to refetch stale data - which you would get for free from the browser.
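As a sketch of that division of labour (the Express route is an assumed server setup; any stack that can set Cache-Control behaves the same way, and lookupEntryType is a placeholder):

// server side: tell the browser the entry type is safe to cache for an hour
const express = require('express');
const app = express();

app.get('/entryType/:id', (req, res) => {
  res.set('Cache-Control', 'public, max-age=3600');
  res.json(lookupEntryType(req.params.id)); // lookupEntryType is a placeholder
});

// client side: simply request on every click; repeat calls to the same
// /entryType/{id} URL are answered from the browser's HTTP cache
async function showDetails(entryTypeUrl) {
  const res = await fetch(entryTypeUrl);
  return res.json();
}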
