I am creating a grid that provides sorting functionality. It uses D3.js. Currently the data set is around 2,000 records, but in the future this could be much larger, e.g. 10,000 to 1 million records.
Should client-side JS be used for sorting, or should it be done on the server, assuming we have a large recordset?
Also, at what point should I be considering lazy loading of data for the table?
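For reference, the client-side option would amount to something like the sketch below (the `#grid` selector, `data` array, and `price`/`id` fields are placeholders for my actual grid):

```javascript
// Minimal sketch: client-side sort of a D3 table by one column.
// Assumes `data` is the array bound to the rows and each record has an `id`.
const tbody = d3.select("table#grid tbody");

function sortBy(key, ascending = true) {
  data.sort((a, b) =>
    ascending ? d3.ascending(a[key], b[key]) : d3.descending(a[key], b[key])
  );

  // Re-bind and reorder the existing rows instead of rebuilding the DOM.
  tbody.selectAll("tr")
    .data(data, d => d.id) // key function keeps rows stable across sorts
    .order();              // reorder the DOM to match the sorted data
}

sortBy("price", false); // e.g. sort descending by a hypothetical "price" column
```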
Thanks
I have a form like the one below, which contains dynamic columns and rows that can be added and deleted by the user.
With around 50 to 60 rows and up to 10 columns, there are a lot of calculations taking place in the JavaScript.
I am using a MySQL database and PHP (Laravel) as the backend.
Currently, my table structure for storing the above is:
Table 1 - row_headers (row_id,row_name)
Table 2 - column_headers (column_id, column_name)
Table 3 - book_data ( book_id, row_id, column_id, data_value)
The above set of tables suffices for storing the data, but it is extremely slow for both the store call and the get call. When fetching the complete data back to the UI, there is a heavy load on the database, and rendering it into HTML is also slow (the nested for loops kill all the time) and tedious.
I want to understand how to optimize this. What table structure should be used instead, and what is the best way to reduce the load on both the backend and the frontend?
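For reference on the frontend half, the kind of thing I imagine replacing the nested loops with is a one-pass pivot like the sketch below (the JSON shape is assumed from the tables above):

```javascript
// Hedged sketch: pivot flat book_data cells into a 2D grid in one pass,
// assuming the API returns [{row_id, column_id, data_value}, ...] plus the
// row_headers and column_headers lists.
function pivot(rowHeaders, columnHeaders, cells) {
  const byKey = new Map();
  for (const c of cells) {
    byKey.set(`${c.row_id}:${c.column_id}`, c.data_value); // O(cells), built once
  }
  // Each grid cell is then a single Map lookup instead of a nested search.
  return rowHeaders.map(r =>
    columnHeaders.map(col => byKey.get(`${r.row_id}:${col.column_id}`) ?? "")
  );
}
```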
Help appreciated.
I'm thinking about the best way to feed a chart with filtered data. I have year and month filters, among others, and I want the chart to show up quickly and to switch data quickly when a filter changes.
Is it better to prepare pre-filtered data as individual CSVs and load them from the server as needed, or should I use JavaScript to filter the data on the client side from one big CSV?
Individual files:
- Load quickly
- No client-side computations
Big CSV:
- Avoids repeated network requests (loaded once)
If I choose individual files, I would have A LOT of them, since the filters create many combinations. I don't know if there is any drawback in that case. I think the individual files are the highest-performance option.
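For comparison, the big-CSV option would look roughly like the sketch below (assumes d3 v5+, columns named `year` and `month`, and an existing `updateChart()` draw function, all of which are placeholders):

```javascript
// Minimal sketch of the single-big-CSV option: load once, filter in memory.
d3.csv("data/all.csv", d3.autoType).then(rows => {
  function filtered(year, month) {
    return rows.filter(d => d.year === year && d.month === month);
  }

  // On every filter change, just re-render the chart with the subset.
  updateChart(filtered(2015, 6));
});
```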
I have been working with dc.js and crossfilter.js, and I currently have a large dataset with 550,000 rows (a 60 MB CSV). I am facing a lot of issues with it, like browser crashes, etc.
So I'm trying to understand how dc and crossfilter deal with large datasets.
http://dc-js.github.io/dc.js/
The example on their main site runs very smoothly, and looking at Timelines → Memory (in the console), it peaks at about 34 MB and slowly decreases over time.
My project takes up memory in the range of 300-500 MB per dropdown selection, when it loads a JSON file and renders the entire visualization.
So, two questions:
What is the backend for the dc site example? Is it possible to find out the exact backend file?
How can I reduce the memory load from my application, which is running very slowly and eventually crashing?
Hi, you can try loading the data and filtering it on the server. I faced a similar problem when my dataset was too big for the browser to handle.
I posted a question a few weeks back about implementing the same: Using dc.js on the clientside with crossfilter on the server.
Here is an overview of how to go about it.
On the client side, you'd create fake dimensions and fake groups that provide the basic functionality dc.js expects (https://github.com/dc-js/dc.js/wiki/FAQ#filter-the-data-before-its-charted). You create your dc.js charts on the client side and plug in the fake dimensions and groups wherever required.
On the server side you have crossfilter running (https://www.npmjs.org/package/crossfilter). You create your actual dimensions and groups there.
The fake dimensions have a .filter() function that sends an AJAX request to the server to perform the actual filtering; the filter information can be encoded as a query string. You'd also need an .all() function on your fake group to return the results of the filtering.
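To make that concrete, a stripped-down fake dimension/group pair might look like the sketch below (the /filter endpoint, its query parameters, and the [{key, value}, ...] response shape are assumptions; depending on the chart type you may need to stub a few more methods):

```javascript
// Hedged sketch: a fake dimension + fake group that proxy filtering to a
// server-side crossfilter and cache the grouped results for dc.js to read.
function remoteDimensionGroup(field, redrawAll) {
  let groups = []; // latest grouped results returned by the server

  const dimension = {
    filter(value) {
      const qs = `field=${encodeURIComponent(field)}&value=${encodeURIComponent(value ?? "")}`;
      fetch(`/filter?${qs}`)           // AJAX call carrying the filter as a query string
        .then(res => res.json())
        .then(data => { groups = data; redrawAll(); }); // e.g. dc.redrawAll()
    },
    filterAll() { this.filter(null); } // clear the filter on the server
  };

  const group = {
    all() { return groups; } // dc.js expects [{key, value}, ...] here
  };

  return { dimension, group };
}
```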
The project requirements are odd for this one, but I'm looking to get some insight...
I have a CSV file with about 12,000 rows of data and approximately 12-15 columns. I'm converting that to a JSON array and loading it via JSONP (it has to run client-side). It takes many seconds to do any kind of querying on the data set to return a smaller, filtered data set. I'm currently using JLINQ to do the filtering, but I'm essentially just looping through the array and returning a smaller set based on conditions.
Would WebDB or IndexedDB allow me to do this filtering significantly faster? Are there any tutorials/articles out there that you know of that tackle this particular type of issue?
http://square.github.com/crossfilter/ (no longer maintained, see https://github.com/crossfilter/crossfilter for a newer fork.)
Crossfilter is a JavaScript library for exploring large multivariate datasets in the browser. Crossfilter supports extremely fast (<30ms) interaction with coordinated views, even with datasets containing a million or more records...
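For a sense of the API, filtering a 12,000-row array with Crossfilter looks roughly like the sketch below (the `type` and `price` fields are made up for illustration):

```javascript
// Minimal Crossfilter sketch over the already-loaded `records` array.
const cf = crossfilter(records);
const byType  = cf.dimension(d => d.type);
const byPrice = cf.dimension(d => d.price);

byType.filter("book");             // exact-match filter on one dimension
byPrice.filter([10, 50]);          // range filter [10, 50) on another
const visible = byPrice.top(100);  // top 100 records by price that satisfy all filters
```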
This reminds me of an article John Resig wrote about dictionary lookups (a real dictionary, not a programming construct).
http://ejohn.org/blog/dictionary-lookups-in-javascript/
He starts with server-side implementations and then works toward a client-side solution. It should give you some ideas for ways to improve what you are doing right now:
Caching
Local Storage
Memory Considerations
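The local storage idea in particular is easy to sketch: cache the parsed dataset so repeat visits skip the JSONP round trip (the key name and `loadViaJsonp()` below are placeholders for your own loader):

```javascript
// Hedged sketch: cache the dataset in localStorage after the first load.
function loadDataset(callback) {
  const cached = localStorage.getItem("dataset-v1");
  if (cached) {
    callback(JSON.parse(cached)); // served from cache, no network request
    return;
  }
  loadViaJsonp(rows => {          // placeholder for the existing JSONP loader
    try {
      localStorage.setItem("dataset-v1", JSON.stringify(rows));
    } catch (e) {
      // Quota exceeded (commonly ~5 MB) -- just skip caching.
    }
    callback(rows);
  });
}
```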
If you require loading an entire data object into memory before you apply some transform on it, I would leave IndexedDB and WebSQL out of the mix as they typically both add to complexity and reduce the performance of apps.
For this type of filtering, a library like Crossfilter will go a long way.
Where IndexedDB and WebSQL can come into play in terms of filtering is when you don't need to load, or don't want to load, an entire dataset into memory. These databases are best utilized for their ability to index rows (WebSQL) and attributes (IndexedDB).
With in-browser databases, you can stream data into a database one record at a time and then cursor through it, one record at a time. The benefit for filtering is that this means you can leave your data on "disk" (a .leveldb in Chrome and a .sqlite database in Firefox) and filter out unnecessary records, either as a pre-filtering step or as the filter itself.
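As a sketch of that cursoring approach with IndexedDB (the database, store, and index names below are assumptions for illustration):

```javascript
// Hedged sketch: cursor through an IndexedDB store one record at a time,
// pre-filtering on an index so the full dataset never sits in memory.
const open = indexedDB.open("grid-db", 1);

open.onupgradeneeded = () => {
  const db = open.result;
  const store = db.createObjectStore("rows", { keyPath: "id" });
  store.createIndex("byYear", "year");
};

open.onsuccess = () => {
  const db = open.result;
  const matches = [];
  const request = db.transaction("rows")
                    .objectStore("rows")
                    .index("byYear")
                    .openCursor(IDBKeyRange.only(2015)); // index-level pre-filter

  request.onsuccess = () => {
    const cursor = request.result;
    if (cursor) {
      if (cursor.value.month === 6) matches.push(cursor.value); // per-record filter
      cursor.continue();
    } else {
      console.log("filtered rows:", matches.length);
    }
  };
};
```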
I'm using MIT's Simile to display thumbnails and links with faceted filtering. It works great, but large data sets (greater than 500 elements) start to slow down significantly. My user base will tolerate seconds, but not tens of seconds, and certainly not minutes, while the page renders.
Is it the volume of data in the JSON structure?
Is it Simile's method of parsing?
Too slow compared to what? It's probably faster than XML and easier to implement than your own custom binary format.
Exhibit version 3 (http://simile-widgets.org/exhibit) provides good interaction with up to 100,000 items. Displaying them all can take some time if the individual items' lens template is complicated, but if you use pagination, then loading, filtering, and display are all pretty quick.