Charting sporadic events over time on Highcharts (Dynamic Precision?) - javascript

How would you chart something like pageviews over time using Highcharts?
Given that page views take place at sporadic irregular intervals, how could you chart this as accurately and legibly as possible?
One way is to group pageviews into time intervals (like days), and then sum up all pageviews on any given day.
The obvious issue here is that if you are only looking at data for a few days, the intervals are too large, and the data fits basically into a few buckets (not really showing any trends).
Another solution I thought of is to enforce a minimum number of intervals (say, 7): when fewer than 7 days of data are requested (say, 3), I could divide that time period into 7 equal intervals.
However this seems like too much fuss, especially on the backend, for the purpose of simply showing data.
Given that the underlying data does not change, only the manner in which it's rendered, I figured there must be a general solution to this problem.
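The fixed-interval idea above can be sketched on the client side, so the backend only has to return raw timestamps. This is a minimal sketch, not a Highcharts feature: `bucketEvents` and its parameters are hypothetical names, and timestamps are assumed to be in milliseconds. The result is a list of `[x, y]` pairs, which is a data shape Highcharts series accept.

```javascript
// Hypothetical helper: count events in `bucketCount` equal-width intervals
// between `start` and `end` (all times in milliseconds).
function bucketEvents(timestamps, start, end, bucketCount = 7) {
  const width = (end - start) / bucketCount;
  // One [bucketStartMs, count] pair per interval.
  const buckets = Array.from({ length: bucketCount }, (_, i) => [start + i * width, 0]);
  for (const t of timestamps) {
    if (t < start || t > end) continue; // ignore events outside the window
    // Clamp so an event exactly at `end` lands in the last bucket.
    const index = Math.min(Math.floor((t - start) / width), bucketCount - 1);
    buckets[index][1] += 1;
  }
  return buckets;
}

// Usage: three days of sporadic pageviews, always rendered as 7 points.
const DAY_MS = 24 * 60 * 60 * 1000;
const windowStart = Date.UTC(2024, 0, 1);
const pageviews = [windowStart + 1000, windowStart + DAY_MS, windowStart + 2.9 * DAY_MS];
const seriesData = bucketEvents(pageviews, windowStart, windowStart + 3 * DAY_MS);
```

Because the bucket count is fixed rather than the bucket width, a 3-day window and a 3-month window both produce the same number of points.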

It depends on what you are trying to discover in the data. If you want to compare it with another data set then use the same scheme it does ("pageviews per day" or whatever). If you want to spot trends over time you need to decide on your horizon and use an appropriate time period (so, for example, if you are trying to justify the purchase of a larger server then perhaps quarterly data comparing this year to last would be good). Designing visualizations for datasets is a huge topic.
So, in other words, I think you pretty much answered your own question.

It looks like the answer is forced Data Grouping
http://api.highcharts.com/highstock/#plotOptions.series.dataGrouping
forced: Boolean
When data grouping is forced, it runs no matter how small the
intervals are. This can be handy for example when the sum should be
calculated for values appearing at random times within each hour.
Defaults to false.
I'll try this and see if it works well.
This could work for Highstock, but it's not part of Highcharts...
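For reference, a forced data grouping configuration might look like the sketch below. It assumes Highstock is loaded and that the chart is created with `Highcharts.stockChart`; the container id, series name, sample data and the chosen units are illustrative assumptions, not taken from the question.

```javascript
// Sketch of Highstock options that sum sporadic pageviews per time unit.
// Intended use (with Highstock loaded): Highcharts.stockChart('container', chartOptions)
const chartOptions = {
  series: [{
    type: 'column',
    name: 'Pageviews', // assumed series name
    // One point per pageview at irregular times: [timestampMs, 1]
    data: [
      [Date.UTC(2024, 0, 1, 9, 12), 1],
      [Date.UTC(2024, 0, 1, 9, 47), 1],
      [Date.UTC(2024, 0, 1, 13, 5), 1]
    ],
    dataGrouping: {
      forced: true,          // group even when intervals are small
      approximation: 'sum',  // add up the values inside each group
      units: [['hour', [1, 6]], ['day', [1]]] // allowed group sizes (assumed)
    }
  }]
};
```

With this kind of setup the library picks the group size from `units` based on the visible range, so the raw data does not need to be re-bucketed on the backend.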

Related

How to dynamically change a field depending on a unique date formula

I'm working on a work schedule of sorts. One feature I'm trying to create should be set up once and never need to be touched again, but I'm not sure where to start. I have a weekly schedule with three different fields: a middle field that is unique and requires no attention, and day/night fields. My user can drag users into these fields to indicate that they are expected to work during that period. Because the shifts sometimes overlap, I want to color-code each of the five shifts. The shifts follow a repeating pattern: 2 nights working, then 2 days off, then 3 days working, and lastly 2 days off. What would be the best approach to mapping this? I cannot simply say that Monday nights are color X, because by the time next week comes around the shift would start on Wednesday. In addition, some shifts work at the same time, so the color coding should not cover the entire day but only a limited number of entries (2). My initial idea was to use a HashMap or something of the sort, but I'm uncertain how to structure it to achieve what I'm looking for.
A drag-and-drop experience would be nice, but I assume it is not mandatory, so you could try to use standard features. I'm not sure which user this is for: the worker doing the shift, or a manager allocating workers to the shifts? If it is for a manager and you created a 'shift' object with shifts represented as records, you would be able to create list views of those shifts, e.g. 'next n days', 'this month', etc. You could then use a bulk action within the list view to assign a user to multiple shift records at a time. That would be reasonably efficient and quick to build. Just a thought... (And sorry, I couldn't follow what you meant about color coding.)
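On the rotation itself: the pattern in the question (2 nights, 2 off, 3 days, 2 off) repeats every 9 days, so one possible approach is to compute a position in that cycle from the date instead of storing per-weekday rules. The sketch below assumes the "3 days working" are day shifts, that each of the five shifts has its own start offset, and that dates are handled in UTC; `CYCLE`, `shiftStateOn` and the offset parameter are hypothetical names.

```javascript
// One full rotation: 2 nights, 2 off, 3 days, 2 off (9 days in total).
const CYCLE = ['night', 'night', 'off', 'off', 'day', 'day', 'day', 'off', 'off'];
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days since the Unix epoch for the UTC calendar day of `date`.
function utcDayNumber(date) {
  return Math.floor(
    Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate()) / MS_PER_DAY
  );
}

// State of a shift on `date`, given the date its rotation started and an
// optional per-shift offset in days (assumed way to stagger the five shifts).
function shiftStateOn(date, cycleStart, offsetDays = 0) {
  const elapsed = utcDayNumber(date) - utcDayNumber(cycleStart) + offsetDays;
  // Double modulo keeps the index non-negative for dates before cycleStart.
  const index = ((elapsed % CYCLE.length) + CYCLE.length) % CYCLE.length;
  return CYCLE[index];
}
```

A color lookup could then key on the shift plus the returned state, so only the entries belonging to that shift are colored. Since 9 is not a multiple of 7, the same weekday lands on a different part of the cycle each week, which matches the drift described in the question.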

How to call a JS function or Python method when a glyph is completely rendered?

I am using Bokeh as a server application. When I make a selection in a plot, I perform some actions in Python and update some sources (CDS). These changes are reflected in the plot. Is there a way to check when the glyphs are completely rendered (after the update)? I want to call a JavaScript function once everything is completely loaded. With that function I want to call another Python method to update the CDS again.
If I do not wait for these profiles to be rendered, the application will probably break, and that's what I want to avoid. In fact, in some earlier tests I had to create one huge CDS instead of several smaller ones to make it work properly.
My Use Case. Why do I want to make this?
I have many tabs in my layout (10, for example), and each tab has some plots (3-6 plots). If I update the entire ColumnDataSource at once, it takes a while. To make the app more fluid, I would like to update only the data of the currently visible tab first: it will render faster and the user will get an immediate response. I can disable the rest of the tabs temporarily to prevent malfunctions. At that point I would need to call the JS or Python method to update the content of the remaining tabs.
Here a drawing of what I want to achieve in order to speed up the process:
About the data
Basically I have two DataFrames: one to build the cloud of points (around 5000 rows and 130 columns), and another, extracted from the selected points after some filtering and selection, that tells me which lines to draw (360 columns and 5 to 15 rows). The algorithm I use comes from the answer to a question I asked some time ago. With this amount of data the algorithm takes 6 or 7 seconds to finish.
Any other ideas on how to improve the performance or how to split up the computation?
To improve the rendering speed you could try the WebGL JavaScript API. The Bokeh documentation page "Speeding up with WebGL" explains how to do it. WebGL supports circles, lines and most of the markers. Usage:
from bokeh.models import Plot
from bokeh.plotting import figure
p = Plot(output_backend="webgl")    # for the glyph API
p = figure(output_backend="webgl")  # for the plotting API
Please be aware that users report issues with WebGL, such as plot stuttering, but it may work in your case depending on which types of glyphs your plot contains.
Also make sure the data passed to the plot doesn't include NaNs, as they are known to slow down Bokeh.
To my knowledge there is no attribute that indicates whether rendering is completed or still ongoing, but you could consider other ways to speed things up, such as combining Bokeh with Datashader (pre-rendering large datasets into a fixed-size raster image) or Dask (speeding up data reading from multiple sources, such as multiple CSV files).
For example, you could have one standard Bokeh plot where you make a selection, and generate the other plots as Datashader images embedded in Bokeh plots.
This example shows how to combine Bokeh and Datashader, which significantly improves performance, especially when over-plotting takes place. Please note that each time a single point is added to the plot, the entire canvas area is re-drawn in the browser. This is how browsers work. Datashader can provide a single image, so updating the plot is much quicker, while you can still use toolbar tools like zoom and pan.
The implementation details of the Python code also count. For example, using gridplot to link many plots can slow things down, so it can be better to add them one by one to the document root.
Some time ago I tried a trick to check whether my design would work if I could trigger a function once the plots were rendered:
First I updated the current tab. This worked very well and fast.
Then I set a timeout to update the data of the rest of the tabs. But while this second algorithm was running, I could not work with the plots of the current tab because they were frozen.
So triggering a function when everything is rendered is not a good approach: even with such a callback, the app would not work as I expected.

Can I animate visual attributes for multiple objects simultaneously in D3?

I’m trying to show a D3 hive plot animation representing internet attack data over time using d3.hive.min.js. The base view has 2 axes. Along one axis is a set of source IP addresses – the IP address from which the attack originated. The other axis shows destination IP addresses.
A static view showing the set of attacks (each involving one source and one destination IP) looks like this:
Hive Plot Source/Destination Network Paths - Static View
Each attack has a start time and an end time, which provides a temporal sequencing.
The desired visualization will cause the links to appear and disappear in temporal order, preferably with a variable duration computed from the start/end times. However, a constant time for the appearance of each link would suffice.
I’m using d3.hive.min.js to create the plot axes and links.
I’ve tried a number of approaches without success:
Creating all links with opacity 0, then cycling through a set of “time steps” where I reset the opacity of each link based on its state at that time. This has the advantage of giving full control of the opacity values, computed from the start and end times. I've tried variations with and without transitions and delays.
Bringing the links in one at a time. This is not quite the desired result, as it shows a build-up of links over time rather than links appearing and disappearing over time.
Creating and removing the links as they occur over time. With this approach I don’t understand how I would be able to show overlapping attack events, where multiple links should appear at the same time.
My question at this point is foremost “is this possible with D3”? I'm no expert with D3, but have spent well over 10 hours experimenting and searching. I would very much appreciate a suggestion of an approach with which I could experiment, or a pointer to a similar example.
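The first approach listed above (every link exists from the start and only its opacity changes) can be sketched as a pure function of the current animation time. This is only an illustration: the link shape `{ start, end }` in milliseconds, the function name `linkOpacity` and the fade duration are assumptions, and the D3 wiring mentioned in the comment presumes the links are bound to elements with class `link`.

```javascript
// Opacity of one link at animation time `t` (ms). Each link decides for
// itself, so overlapping attacks are simply visible at the same time.
function linkOpacity(link, t, fadeMs = 500) {
  if (t < link.start || t > link.end) return 0; // not active
  if (fadeMs <= 0) return 1;                    // no fading requested
  const sinceStart = t - link.start;
  const untilEnd = link.end - t;
  // Ramp up after the start, ramp down before the end, cap at fully opaque.
  return Math.min(1, sinceStart / fadeMs, untilEnd / fadeMs);
}

// Possible wiring inside an animation loop (assumes D3 is loaded and the
// link paths carry the class "link"):
//   d3.selectAll('.link').style('opacity', d => linkOpacity(d, t));
```

Driving all links from a single clock value avoids chaining per-link transitions, which is where overlapping events tend to get hard to reason about.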

How should I handle the browser freezing due to a large amount of data?

I have created a web page where data can be loaded into a google chart for displaying it visually. Everything works well when the data set is small, but as soon as it approaches 10,000 points or more, the browser starts to freeze.
Is there any way to stop the freezing other than lowering the amount of data displayed or cutting it off after a certain limit is reached?
Can you scale the data before being worked with?
For example, if you have 5-10k visitors each day and you want a chart that shows the growth of visits over a month, divide all numbers by 1000 and then process them. Then note below the chart (or wherever) that the scale is x1000.
I'm guessing it is not as simple as this, but I believe it can be done along these lines.

JavaScript data grid for millions of rows [closed]

I need to present a large number of rows of data (i.e. millions of rows) to the user in a grid using JavaScript.
The user shouldn't see pages or view only finite amounts of data at a time.
Rather, it should appear that all of the data are available.
Instead of downloading the data all at once, small chunks are downloaded as the user comes to them (i.e. by scrolling through the grid).
The rows will not be edited through this front end, so read-only grids are acceptable.
What data grids, written in JavaScript, exist for this kind of seamless paging?
(Disclaimer: I am the author of SlickGrid)
UPDATE
This has now been implemented in SlickGrid.
Please see http://github.com/mleibman/SlickGrid/issues#issue/22 for an ongoing discussion on making SlickGrid work with larger numbers of rows.
The problem is that SlickGrid does not virtualize the scrollbar itself - the scrollable area's height is set to the total height of all the rows. The rows are still being added and removed as the user is scrolling, but the scrolling itself is done by the browser. That allows it to be very fast yet smooth (onscroll events are notoriously slow). The caveat is that there are bugs/limits in the browsers' CSS engines that limit the potential height of an element. For IE, that happens to be 0x123456 or 1193046 pixels. For other browsers it is higher.
There is an experimental workaround in the "largenum-fix" branch that raises that limit significantly by populating the scrollable area with "pages" set to 1M pixels height and then using relative positioning within those pages. Since the height limit in the CSS engine seems to be different and significantly lower than in the actual layout engine, this gives us a much higher upper limit.
I am still looking for a way to get to unlimited number of rows without giving up the performance edge that SlickGrid currently holds over other implementations.
Rudiger, can you elaborate on how you solved this?
https://github.com/mleibman/SlickGrid/wiki
"SlickGrid utilizes virtual rendering to enable you to easily work with hundreds of thousands of items without any drop in performance. In fact, there is no difference in performance between working with a grid with 10 rows versus a 100’000 rows."
Some highlights:
Adaptive virtual scrolling (handle hundreds of thousands of rows)
Extremely fast rendering speed
Background post-rendering for richer cells
Configurable & customizable
Full keyboard navigation
Column resize/reorder/show/hide
Column autosizing & force-fit
Pluggable cell formatters & editors
Support for editing and creating new rows.
by mleibman
It's free (MIT license).
It uses jQuery.
The best Grids in my opinion are below:
Flexigrid: http://flexigrid.info/
jQuery Grid: http://www.trirand.com/blog/
jqGridView: http://plugins.jquery.com/project/jqGridView
jqxGrid: https://www.jqwidgets.com/
Ingrid: http://reconstrukt.com/ingrid/
SlickGrid http://github.com/mleibman/SlickGrid
DataTables http://www.datatables.net/index
ShieldUI http://demos.shieldui.com/web/grid-virtualization/performance-1mil-rows
Smart.Grid https://www.htmlelements.com/demos/grid/overview/
My best 3 options are jqGrid, jqxGrid and DataTables. They can work with thousands of rows and support virtualization.
I don't mean to start a flame war, but assuming your researchers are human, you don't know them as well as you think. Just because they have petabytes of data doesn't make them capable of viewing even millions of records in any meaningful way. They might say they want to see millions of records, but that's just silly. Have your smartest researchers do some basic math: Assume they spend 1 second viewing each record. At that rate, it will take 1000000 seconds, which works out to more than six weeks (of 40 hour work-weeks with no breaks for food or lavatory).
Do they (or you) seriously think one person (the one looking at the grid) can muster that kind of concentration? Are they really getting much done in that 1 second, or are they (more likely) filtering out the stuff they don't want? I suspect that after viewing a "reasonably-sized" subset, they could describe a filter to you that would automatically filter out those records.
As paxdiablo and Sleeper Smith and Lasse V Karlsen also implied, you (and they) have not thought through the requirements. On the up side, now that you've found SlickGrid, I'm sure the need for those filters became immediately obvious.
I can say with pretty good certainty that you seriously do not need to show millions of rows of data to the user.
There is no user in the world that will be able to comprehend or manage that data set so even if you technically manage to pull it off, you won't solve any known problem for that user.
Instead I would focus on why the user wants to see the data. The user does not want to see the data just to see the data, there is usually a question being asked. If you focus on answering those questions instead, then you would be much closer to something that solves an actual problem.
I recommend the Ext JS Grid with the Buffered View feature.
http://www.extjs.com/deploy/dev/examples/grid/buffer.html
(Disclaimer: I am the author of w2ui)
I have recently written an article on how to implement a JavaScript grid with 1 million records (http://w2ui.com/web/blog/7/JavaScript-Grid-with-One-Million-Records). I discovered that ultimately there are 3 restrictions that prevent taking it higher:
Height of the div has a limit (can be overcome by virtual scrolling)
Operations such as sort and search start being slow after 1 million records or so
RAM is limited because data is stored in JavaScript array
I have tested the grid with 1 million records (except IE) and it performs well. See article for demos and examples.
dojox.grid.DataGrid offers a JS abstraction for data so you can hook it up to various backends with provided dojo.data stores or write your own. You'll obviously need one that supports random access for this many records. DataGrid also provides full accessibility.
Edit: here's a link to Matthew Russell's article, which should provide the example you need: viewing millions of records with dojox.grid. Note that it uses the old version of the grid, but the concepts are the same; there were just some incompatible API improvements.
Oh, and it's totally free open source.
I used jQuery Grid Plugin, it was nice.
Demos
Here are a couple of optimizations you can apply to speed things up. Just thinking out loud.
Since the number of rows can be in the millions, you will want a caching system just for the JSON data from the server. I can't imagine anybody wanting to download all X million items, but if they did, it would be a problem. This little test on Chrome, which fills an array with 20 million integers, consistently crashes on my machine.
var data = [];
for (var i = 0; i < 20000000; i++) {
    data.push(i);
}
console.log(data.length);
You could use LRU or some other caching algorithm and have an upper bound on how much data you're willing to cache.
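A bounded LRU cache along those lines could look like the sketch below. It is one possible implementation, not taken from any of the grids mentioned here; the class name `LruCache` and the idea of keying entries by chunk index are assumptions. It relies on the fact that a JavaScript `Map` iterates its keys in insertion order.

```javascript
// Minimal LRU cache for row chunks, keyed by e.g. a chunk index.
class LruCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.map = new Map(); // insertion order doubles as recency order
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    // Re-insert to mark this entry as the most recently used.
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldestKey = this.map.keys().next().value;
      this.map.delete(oldestKey);
    }
  }
}
```

The upper bound is expressed in entries here; bounding by approximate byte size would need an extra size estimate per chunk.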
For the table cells themselves, constructing and destroying DOM nodes can be expensive. Instead, you could pre-define a fixed number X of cells and, whenever the user scrolls to a new position, inject the JSON data into these cells. The scrollbar would then have virtually no direct relationship to how much space (height) is required to represent the entire dataset. You could arbitrarily set the table container's height, say to 5000px, and map that to the total number of rows. For example, if the container's height is 5000px and there are a total of 10M rows, then the starting row ≈ (scroll.top/5000) * 10M, where scroll.top represents the scroll distance from the top of the container. Small demo here.
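The scroll-to-row mapping described above can be written as a small function. This sketch follows the approximate formula from the text and adds clamping; `startRowForScroll` and `rowsToFill` are hypothetical names, and the pool size is an assumed parameter.

```javascript
// Map a scroll offset inside a fixed-height container to a starting row.
function startRowForScroll(scrollTop, containerHeight, totalRows) {
  // Fraction of the container scrolled past, clamped to [0, 1].
  const fraction = Math.min(Math.max(scrollTop / containerHeight, 0), 1);
  // Clamp so the result is always a valid row index.
  return Math.min(Math.floor(fraction * totalRows), totalRows - 1);
}

// Row indices to inject into a fixed pool of pre-built cells.
function rowsToFill(startRow, poolSize, totalRows) {
  const rows = [];
  for (let i = startRow; i < Math.min(startRow + poolSize, totalRows); i++) {
    rows.push(i);
  }
  return rows;
}
```

For example, with a 5000px container and 10M rows, a scroll offset of 2500px maps to row 5,000,000, and the pool of cells is refilled from there.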
To detect when to request more data, ideally an object should act as a mediator that listens to scroll events. This object keeps track of how fast the user is scrolling, and when it looks like the user is slowing down or has completely stopped, makes a data request for the corresponding rows. Retrieving data in this fashion means your data is going to be fragmented, so the cache should be designed with that in mind.
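A mediator of this kind might be sketched as follows. The names `ScrollMediator` and `handleScroll`, and the speed threshold in pixels per millisecond, are assumptions chosen for illustration. The sketch only covers the "slowing down" case; detecting a complete stop would additionally need a timer, because no further scroll events arrive once the user stops.

```javascript
// Watches scroll positions over time and calls `onSettle` when the
// scrolling speed drops to or below `maxSpeed` (px per ms).
class ScrollMediator {
  constructor(onSettle, maxSpeed = 0.5) {
    this.onSettle = onSettle;
    this.maxSpeed = maxSpeed;
    this.last = null; // previous { scrollTop, timeMs } sample
  }

  // Call from a scroll event handler with the current offset and a timestamp.
  handleScroll(scrollTop, timeMs) {
    if (this.last !== null) {
      const elapsed = timeMs - this.last.timeMs;
      const distance = Math.abs(scrollTop - this.last.scrollTop);
      const speed = elapsed > 0 ? distance / elapsed : Infinity;
      if (speed <= this.maxSpeed) {
        this.onSettle(scrollTop); // slow enough: request rows for this position
      }
    }
    this.last = { scrollTop, timeMs };
  }
}
```

In a page this would typically be fed from a `scroll` listener, passing the element's `scrollTop` and `performance.now()`.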
Also the browser limits on maximum outgoing connections can play an important part. A user may scroll to a certain position which will fire an AJAX request, but before that finishes the user can scroll to some other portion. If the server is not responsive enough the requests would get queued up and the application will look unresponsive. You could use a request manager through which all requests are routed, and it can cancel pending requests to make space.
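One way to sketch such a request manager is to keep only the latest request alive and abort the previous one with the standard `AbortController` API. The class name `LatestRequestManager` and the injected `fetchFn` parameter are hypothetical; in a browser `fetchFn` could simply be `fetch`, which accepts a `signal` option.

```javascript
// Routes all row requests through one place and cancels the pending
// request whenever a newer one is issued.
class LatestRequestManager {
  constructor(fetchFn) {
    this.fetchFn = fetchFn; // e.g. window.fetch, injected for testability
    this.current = null;    // AbortController of the in-flight request
  }

  request(url) {
    if (this.current) {
      this.current.abort(); // the user has scrolled elsewhere
    }
    const controller = new AbortController();
    this.current = controller;
    return this.fetchFn(url, { signal: controller.signal });
  }
}
```

With `fetch`, an aborted request rejects with an `AbortError`, so callers would normally catch and ignore that case. A fuller version could allow a few concurrent requests instead of exactly one.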
I know it's an old question but still.. There is also dhtmlxGrid that can handle millions of rows. There is a demo with 50,000 rows but the number of rows that can be loaded/processed in grid is unlimited.
Disclaimer: I'm from DHTMLX team.
I suggest you read this
http://www.sitepen.com/blog/2008/11/21/effective-use-of-jsonreststore-referencing-lazy-loading-and-more/
Disclaimer: I have used YUI DataTable heavily for a long time without any headaches. It is powerful and stable. For your needs, you can use a ScrollingDataTable, which supports
x-scrolling
y-scrolling
xy-scrolling
A powerful Event mechanism
For what you need, I think you want tableScrollEvent. Its API documentation says:
Fired when a fixed scrolling DataTable has a scroll.
As each DataTable uses a DataSource, you can monitor its data through tableScrollEvent, along with the render loop size, in order to populate your ScrollingDataTable according to your needs.
Render loop size says
In cases where your DataTable needs to display the entirety of a very large set of data, the renderLoopSize config can help manage browser DOM rendering so that the UI thread does not get locked up on very large tables. Any value greater than 0 will cause the DOM rendering to be executed in setTimeout() chains that render the specified number of rows in each loop. The ideal value should be determined per implementation since there are no hard and fast rules, only general guidelines:
By default renderLoopSize is 0, so all rows are rendered in a single loop. A renderLoopSize > 0 adds overhead so use thoughtfully.
If your set of data is large enough (number of rows X number of Columns X formatting complexity) that users experience latency in the visual rendering and/or it causes the script to hang, consider setting a renderLoopSize.
A renderLoopSize under 50 probably isn't worth it. A renderLoopSize > 100 is probably better.
A data set is probably not considered large enough unless it has hundreds and hundreds of rows.
Having a renderLoopSize > 0 and < total rows does cause the table to be rendered in one loop (same as renderLoopSize = 0) but it also triggers functionality such as post-render row striping to be handled from a separate setTimeout thread.
For instance
// Render 100 rows per loop
var dt = new YAHOO.widget.DataTable(<WHICH_DIV_WILL_STORE_YOUR_DATATABLE>, <HOW YOUR_TABLE_IS STRUCTURED>, <WHERE_DOES_THE_DATA_COME_FROM>, {
    renderLoopSize: 100
});
<WHERE_DOES_THE_DATA_COME_FROM> is just a single DataSource. It can be JSON, a JS function, XML or even a single HTML element.
Here you can see a simple tutorial, provided by me. Be aware that no other data table plugin supports single and double click at the same time; YUI DataTable does. What's more, you can use it even with jQuery without any headaches.
Feel free to ask anything else you want to know about YUI DataTable.
regards,
I kind of fail to see the point, for jqGrid you can use the virtual scrolling functionality:
http://www.trirand.net/aspnetmvc/grid/performancevirtualscrolling
but then again, millions of rows with filtering can be done:
http://www.trirand.net/aspnetmvc/grid/performancelinq
I really fail to see the point of "as if there are no pages" though. I mean, there is no way to display 1,000,000 rows at once in the browser: that is 10MB of raw HTML. I kind of fail to see why users would not want to see the pages.
Anyway...
The best approach I can think of is to load a chunk of data in JSON format on every scroll, or at some threshold before the scrolling reaches the end. JSON can easily be converted to objects, so table rows can be constructed easily and unobtrusively.
I would highly recommend Open rico.
It is difficult to implement in the beginning, but once you get the hang of it you will never look back.
I know this question is a few years old, but jqgrid now supports virtual scrolling:
http://www.trirand.com/blog/phpjqgrid/examples/paging/scrollbar/default.php
but with pagination disabled
I suggest Sigma Grid. It has embedded paging features that can support millions of rows. You may also need remote paging to do it.
see the demo
http://www.sigmawidgets.com/products/sigma_grid2/demos/example_master_details.html
Take a look at dGrid:
https://dgrid.io/
I agree that users will NEVER, EVER need to view millions of rows of data all at once, but dGrid can display them quickly (a screenful at a time).
Don't boil the ocean to make a cup of tea.
