Scalability of Cytoscape.js

I have an 11 MB JSON graph file with about 45K edges and 73K nodes, without x, y positions, and I want to display this graph using the BFS layout. I am using a promise/deferred to load the file. I haven't been able to get Cytoscape.js to display this graph in Chrome. So:
Are there any special techniques for displaying large graphs?
What is the largest graph anyone has displayed using Cytoscape.js?
If Cytoscape.js won't work, are there other JS frameworks that will work for large graphs?

You are limited by the performance of the browsers themselves. Cytoscape.js uses several techniques to optimise rendering performance, but you'll still hit the ceiling of the browser's performance.
I don't think you'll find any browser tech today (July 2015) that supports rendering such large datasets.

We display a graph from a 5.4 MB JSON file with predefined coordinates on different browsers with great performance. Is there a specific reason for not precalculating the coordinates (e.g. in Cytoscape desktop)?
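As a rough illustration of precalculating coordinates, here is a minimal sketch that assigns positions by breadth-first depth outside the browser (for example in a Node.js build step). The function name `bfsPositions`, the `[source, target]` edge format and the gap sizes are assumptions made for this example; the resulting map of node id to `{ x, y }` is the kind of data that could be handed to a `preset` layout.

```javascript
// Hypothetical helper: assign preset coordinates by breadth-first depth.
// `nodes` is an array of ids, `edges` an array of [source, target] pairs.
function bfsPositions(nodes, edges, rootId, xGap = 50, yGap = 100) {
  const adjacency = new Map(nodes.map((id) => [id, []]));
  for (const [source, target] of edges) {
    adjacency.get(source).push(target);
    adjacency.get(target).push(source); // treat the graph as undirected
  }

  // Plain BFS; `queue` doubles as the visit order.
  const depth = new Map([[rootId, 0]]);
  const queue = [rootId];
  for (let head = 0; head < queue.length; head++) {
    const current = queue[head];
    for (const next of adjacency.get(current)) {
      if (!depth.has(next)) {
        depth.set(next, depth.get(current) + 1);
        queue.push(next);
      }
    }
  }

  // Place nodes left to right within each depth level, in visit order.
  const countAtDepth = new Map();
  const positions = {};
  for (const id of queue) {
    const level = depth.get(id);
    const column = countAtDepth.get(level) || 0;
    countAtDepth.set(level, column + 1);
    positions[id] = { x: column * xGap, y: level * yGap };
  }
  return positions; // nodes unreachable from rootId are not included
}

const positions = bfsPositions(
  ["a", "b", "c", "d"],
  [["a", "b"], ["a", "c"], ["b", "d"]],
  "a"
);
console.log(positions);
```

This is only a sketch of the idea: it does nothing to reduce edge crossings, and a level with tens of thousands of nodes would still be very wide.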
To increase rendering performance:
use haystack edges
provide "min-zoomed-font-size" for nodes and edges
hide non-selected edge labels
use batches for series of operations
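The tips above could look roughly like this. It is a sketch rather than our actual configuration: the font-size threshold, the `data(label)` field and the helper name `addAllBatched` are assumptions for illustration. The stylesheet is plain data that would be passed as the `style` option, and the helper expects a Cytoscape.js instance as `cy`.

```javascript
// Stylesheet entries as plain data; they would be passed as the `style`
// option when the graph is created. The threshold of 10 is an assumption.
const performanceStyle = [
  {
    selector: "node",
    style: { label: "data(id)", "min-zoomed-font-size": 10 },
  },
  {
    // Haystack edges are the cheapest edge type to draw.
    selector: "edge",
    style: { "curve-style": "haystack", "min-zoomed-font-size": 10 },
  },
  // Only selected edges carry a label; all other edges stay unlabelled.
  { selector: "edge:selected", style: { label: "data(label)" } },
];

// Hypothetical helper: add many elements inside one batch so that style
// recalculation and redraw happen once rather than once per element.
function addAllBatched(cy, elements) {
  cy.batch(() => {
    for (const element of elements) {
      cy.add(element);
    }
  });
}

console.log(performanceStyle.map((entry) => entry.selector));
```

With a real instance, the call would be something like `addAllBatched(cy, elements)` after creating the graph with `style: performanceStyle`.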

Related

Library for hierarchical graph visualisation with Angular 11

I'm looking for a library to use with Angular 11 to show and interact with hierarchical graphs. I would like to zoom in and out of the graph, click nodes, and attach handlers for events such as click and hover. The problem is that my graphs are very big (thousands of nodes and edges; they are deep learning models).
I tried ngx-graph and it works great until the graph has more than 500 nodes. When it's getting bigger, the performance is very low.
I also tried echarts and it has no performance issues, but it doesn't support hierarchical graph layout.
Do you have any recommendations? I would prefer MIT or Apache license.

Cytoscape.js orphan nodes position

I am using Cytoscape.js to visualize networks.
I am currently using the cola layout which works a treat when one network (with zero orphan nodes) is generated.
However, if I add a second network on screen that has no connections to the first, the two separate networks are positioned miles apart.
I have to manually grab one of the networks as a cluster and drag it for ages until it is closer to the other network.
I assume there is a way to set the distance between two graphs manually. How can I do this?

As far as I know, Cola does not have a component-packing algorithm. That would conflict with its live-animated, n-body physics simulation. Components are naturally repelled from one another, unless there is a strong gravity force to the screen's centre.
Component packing is typically a separate step for force-directed layouts, so it would give a jittery result when coupled with live animation. You may have better luck with other force-directed layouts with different trade-offs, such as fCoSE.
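To show what a separate packing step might look like, here is a minimal sketch that runs on positions read out after a layout has stopped. It is not Cola's or fCoSE's algorithm, just a naive left-to-right placement of bounding boxes. The function name `packComponents`, the `[source, target]` edge format and the default gap are assumptions for this example.

```javascript
// Hypothetical helper: lay connected components side by side.
// `positions` maps node id -> { x, y }; `edges` is an array of [source, target].
function packComponents(positions, edges, gap = 50) {
  const ids = Object.keys(positions);
  const neighbours = new Map(ids.map((id) => [id, []]));
  for (const [source, target] of edges) {
    neighbours.get(source).push(target);
    neighbours.get(target).push(source);
  }

  // Collect connected components with a depth-first search.
  const seen = new Set();
  const components = [];
  for (const start of ids) {
    if (seen.has(start)) continue;
    const component = [];
    const stack = [start];
    seen.add(start);
    while (stack.length > 0) {
      const id = stack.pop();
      component.push(id);
      for (const next of neighbours.get(id)) {
        if (!seen.has(next)) {
          seen.add(next);
          stack.push(next);
        }
      }
    }
    components.push(component);
  }

  // Shift each component so its bounding box starts where the previous one
  // ended plus `gap`, and align the tops of all boxes at y = 0.
  const packed = {};
  let offsetX = 0;
  for (const component of components) {
    let minX = Infinity;
    let maxX = -Infinity;
    let minY = Infinity;
    for (const id of component) {
      minX = Math.min(minX, positions[id].x);
      maxX = Math.max(maxX, positions[id].x);
      minY = Math.min(minY, positions[id].y);
    }
    for (const id of component) {
      packed[id] = {
        x: positions[id].x - minX + offsetX,
        y: positions[id].y - minY,
      };
    }
    offsetX += maxX - minX + gap;
  }
  return packed;
}

const packed = packComponents(
  {
    a: { x: 0, y: 0 },
    b: { x: 100, y: 0 },
    c: { x: 5000, y: 900 },
    d: { x: 5100, y: 950 },
  },
  [["a", "b"], ["c", "d"]]
);
console.log(packed);
```

The returned positions would then have to be written back to the nodes. The sketch ignores node sizes and labels, so a larger gap may be needed in practice.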

Drawing directed acyclic graphs in a web application - only strange approach available?

We're currently building a web application (AngularJS) which needs to draw directed acyclic graphs (up to 1000 nodes, up to 10000 edges) based on dynamically created data.
For over a year I have been looking for a library/tool which calculates the required layout and draws a graph which can be styled, is zoomable and pannable, and is interactive (e.g. highlights children on mouse over).
Graphviz is the tool which produces the best results, but it's not really ready to be used on web servers (especially as I cannot guarantee the OS and don't want to).
I tried dagre and its d3 rendering, and I like it very much, but it has two major drawbacks: 1) it doesn't really support ranking and clustering, which makes the output rather chaotic, and 2) as the graphs get bigger its performance becomes unacceptable.
The next thing which looked really convincing was cytoscape.js, as the output looks very nice and it's rather fast at drawing even larger graphs (and allows some performance tweaking). But its standard layouts (e.g. cose or breadthfirst) don't produce the output we require.
From my current point of view I have two options:
1) Create the layout with dagre.js and draw the result with cytoscape.js (layout: 'preset', using the calculated x and y for nodes from the dagre layout). But 'compounds'/clusters are not supported that way.
2) Use [viz.js](https://github.com/mdaines/viz.js) (an Emscripten-compiled JavaScript version of Graphviz, which again does not really perform well in drawing the graphs) to calculate the result in the output format plain, and again use this result to draw it with cytoscape.js.
Now my question(s):
1) Do you have another idea how to achieve it?
2) Could you give me any hint on how to ideally link the chain if my ideas are correct?
(AngularJS -> REST Backend -> JSON to Frontend -> Restructure JSON for dagre or viz -> Calculate layout -> input result to cytoscape -> render in browser?!?) and is there a chance to do the layout calculation on my node.js frontend server and not the client itself (again due to performance)?
Any hints and ideas are greatly appreciated.

It sounds like you're dissatisfied with Dagre as a compound layout. That makes sense, because the author of Dagre did not implement compound support for it. Dagre is separate from Cytoscape.js but can be used by it. You'll note the author of Dagre has stated that he will no longer add new features to the library himself. Thus, your options are:
(1) Use another Cytoscape.js layout actually designed for compound nodes, like CoSE or Cola.
(2) Write your own layout for Cytoscape.js that meets your needs: http://js.cytoscape.org/#extensions/layouts
(3) Adapt Dagre to recognise compound nodes and make a pull request to the author: https://github.com/cpettitt/dagre Then the Cytoscape.js layout for Dagre can be updated to use the new Dagre APIs that you add.
Currently, compound graph layout is an open research area -- so you'll be hard pressed to find many layouts that support them at all in any library. Cytoscape.js has some compound supporting layouts, and you're free to fork them if you want different behaviour. Also note that the 2.5 branch has CoSE2 with an improved algorithm.

Cytoscape.js can use a dagre layout. See the Cytoscape.js docs for more information, but you can just add your nodes and edges to the graph and then run the dagre layout. That way you don't have to position all of the nodes manually based on the positions calculated by the separate dagre.js.
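For the viz.js route from the question (Graphviz "plain" output drawn with a preset layout), the glue code might look like the sketch below. It assumes the plain format as I understand it: a `graph <scale> <width> <height>` line first, then `node <name> <x> <y> ...` lines, with coordinates in inches and the origin at the bottom left. The function name `parsePlainLayout` and the default of 72 pixels per inch are assumptions, and quoted node names containing spaces are not handled.

```javascript
// Hypothetical helper: turn Graphviz "plain" output into a map of
// node id -> { x, y } in pixels, with y flipped so that it grows downwards.
// Assumes node names contain no spaces (quoted names are not handled).
function parsePlainLayout(plainText, pixelsPerInch = 72) {
  let graphHeight = 0;
  const positions = {};
  for (const line of plainText.split("\n")) {
    const fields = line.trim().split(/\s+/);
    if (fields[0] === "graph") {
      // graph <scale> <width> <height>
      graphHeight = Number(fields[3]);
    } else if (fields[0] === "node") {
      // node <name> <x> <y> <width> <height> ...
      positions[fields[1]] = {
        x: Number(fields[2]) * pixelsPerInch,
        y: (graphHeight - Number(fields[3])) * pixelsPerInch,
      };
    }
    // "edge" and "stop" lines are ignored in this sketch.
  }
  return positions;
}

const samplePlain = [
  "graph 1 2 3",
  "node a 1 2.5 0.75 0.5 a solid ellipse black lightgrey",
  "node b 1 0.5 0.75 0.5 b solid ellipse black lightgrey",
  "edge a b 4 1 2.2 1 1.8 1 1.2 1 0.8 solid black",
  "stop",
].join("\n");

console.log(parsePlainLayout(samplePlain));
```

Because this is plain string processing, it could run on a Node.js server as well as in the browser, with only the resulting positions sent to the client.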

Employing dynamic data for graphs

I am aiming to build a site that will contain a lot of user generated data, hopefully.
I'm in my first year of self learning programming: Python, Django, MySQL, HTML and Javascript.
I can chart dummy data on a table just fine, but I'm now looking at turning that data into nice colorful looking graphs.
I am in my first day of investigation into finding out how to do this. But before I continue, I would like to ask a few questions.
There seem to be many JavaScript frameworks for building charts, such as Google Charts and jQuery charts, and some object-oriented programs for building charts, such as Cairo Plot and matplotlib.
The JavaScript frameworks initially seem like a nice, easy way to do it. However, with tables you can put variable data tags in the body of an HTML page and have JavaScript make it look pretty, whereas the data of a graph goes in the scripting area, where the variable data tags don't quite seem to work the same way. I'm using Django, so a variable tag looks like:
{{ uniquenum }}
Q1. Should this work and am I just doing it wrong, or am I right in thinking variable tags can't go in the scripting area?
Q2. Can you have Javascript frameworks produce graphs from data outside the <script> area?
Q3. I've read that Javascript frameworks are getting more powerful, but because I'll be potentially using large amounts of dynamic data, should I be concentrating on using OO style graph programs like Cairo Plot and matplotlib, which to me don't seem to have the same levels of support?
Just looking for a nudge in the right direction.

How are the plots (typically) placed on the web page?
Here's the usual API schema for javascript-based data visualization libraries:
i. pre-allocate a div as the chart container in your markup (or template), typically identified by an id, like so:
<div id="chart1"> </div>
Often these libraries also require that this container be pre-sized--styled with height and width properties e.g.,
<div id="chart1" style="height:300px;width:500px; "></div>
HTML5 libraries are particular about the container--i.e., they require the chart to be placed inside the canvas tag.
ii. call the plot constructor from your included javascript file and pass in (i) a data source; (ii) aesthetic options (e.g., axis labels), and (iii) the location in your markup of the container that will hold the plot; this location is usually expected to be in the form of an id selector. So in jqplot for instance inside the jQuery ready event,
plot1 = $.jqplot('chart1', [dataSet1, dataSet2], chartOptions)
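As a rough sketch of the data side of that call: each series is an array of `[x, y]` pairs. The rows, field names, helper name `toSeries` and option values below are made up for illustration; in practice the rows would come from your template or from an AJAX response.

```javascript
// Hypothetical rows, e.g. decoded from a JSON response sent by the server.
const rows = [
  { day: 1, visits: 12, signups: 2 },
  { day: 2, visits: 18, signups: 5 },
  { day: 3, visits: 9, signups: 1 },
];

// Hypothetical helper: turn rows into one series of [x, y] pairs.
function toSeries(sourceRows, xKey, yKey) {
  return sourceRows.map((row) => [row[xKey], row[yKey]]);
}

const dataSet1 = toSeries(rows, "day", "visits");
const dataSet2 = toSeries(rows, "day", "signups");

// Aesthetic options; the values are assumptions for this example.
const chartOptions = {
  title: "Daily activity",
  axes: { xaxis: { label: "Day" }, yaxis: { label: "Count" } },
};

// In a page with jQuery and jqplot loaded, these would be passed as:
//   $.jqplot('chart1', [dataSet1, dataSet2], chartOptions);
console.log(dataSet1, dataSet2, chartOptions.title);
```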
JavaScript-based data visualization libraries I recommend (based on having used each for multiple projects):
I. conventional plotting formats: bar, line, point, pie
Among JavaScript plotting libraries, I recommend the jQuery-based options because you need less code to create your plots and because it's easier to use jQuery's AJAX methods to load your data. For instance, jqplot, flot, and HighCharts (the three libraries that I recommend below) all include in their basic distribution complete (HTML, CSS, JS) example plots that demonstrate loading data via AJAX.
HighCharts (open source but requires a paid license for commercial use; the most polished, with the longest feature list; active forums with a fairly high signal-to-noise ratio on the HighCharts site)
flot (perhaps the most widely used)
jqplot (large selection of plot types; highly modular, e.g., most functionality beyond the basics is added one plugin at a time)
II. graphs, trees, network diagrams, etc.
d3 (the successor to protovis; stunning graphic quality, rich interactive elements, animation; not strictly jQuery-based, but the author clearly borrowed the basic syntax patterns from jQuery; excellent tutorials on d3 by an accomplished data visualization specialist, Jan Willem Tulp). Unlike the others mentioned here, this is a low-level library; indeed there are several plotting libraries based on d3, e.g., rickshaw by Shutterstock and cube by Square. If you want conventional x-y line/bar plots, you can build them much faster in, e.g., HighCharts. D3 becomes more interesting as use cases become more specific, in particular animation and unorthodox visualizations (sunburst diagrams, chord diagrams, parallel line plots, geographic maps, etc.).
RaphaelJS (renders in SVG; as with d3 and processing.js, you can make just about anything with this library, e.g., two-player games in the browser); gRaphael is a separate library for creating the usual plot types (bar, line, pie)
III. time-series plots
dygraphs (a JavaScript library dedicated solely to time-series plotting, and its feature set reflects this mission, e.g., the capacity to process and render plots with high volumes of data (>10,000 points) and a wide range of options for time-axis tick labels, with many formatting options)
HighStock (a time-series library from the HighCharts guys)
IV. real-time/streaming data
Smoothie Charts (sparse feature set, only intended to do one thing well which is smoothly render streaming data; HighCharts, jqplot, and flot will also do this, but depending on the character of your data (variance, rate) these three general-purpose libraries might not show the data as "spiky", which is precisely what Smoothie was designed to eliminate)

If you're going to deal with very large datasets (>10000 elements), you will probably run into performance problems, no matter what Javascript library you end up picking.
Having said that, there's a growing number of JavaScript toolkits which dynamically load the data set with an HTTP request as JSON, XML, etc. flot is pretty fast and open source. Highcharts is very feature rich and free for non-commercial projects. And if you need more esoteric visualizations, you must take a look at d3.js.

I have found another JavaScript charting library to be useful for streaming data: https://github.com/INRIA/VisualSedimentation. The code is based on d3.js, but has some good extensibility.

Pie, bar, line: SVG/VML better than Canvas

I need to choose a library for "standard" charting: pies, lines and bars.
From what I've read, it seems to me that the best format is SVG/VML, like Highcharts for example. SVG is becoming standard across all major browsers, now that IE 9 accepts it. It seems easier to rescale and export than Canvas.
Still, I see that several charting libraries rely on Canvas. Am I missing something? Is there any reason for considering Canvas over SVG for such applications?

You can generally achieve the same results with either. Both end up drawing pixels to the screen for the user. The major differentiators are that HTML5 Canvas gives you pixel-level control over the results (both reading and writing), while SVG is a retained-mode graphics API that makes it very easy to handle events or manipulate the artwork with JavaScript or SMIL Animation and have all redrawing taken care of for you.
In general, I'd suggest using HTML5 Canvas if you:
need pixel-level control over effects (e.g. blurring or blending)
have a very large number of data points that will be presented once (and perhaps panned), but are otherwise static
Use SVG if you:
want complex objects drawn on the screen to be associated with events (e.g. move over a data point to see a tooltip)
want the result to print well at high resolution
need to animate the shapes of various graph parts independently
will be including text in your output that you want to be indexed by search engines
want to use XML and/or XSLT to produce the output

Canvas isn't needed unless you want heavy manipulation/animation or are going to have 10,000+ charts. More on performance analysis here.
It is also important to make the distinction: Charting and diagramming are two different things. Displaying a few bar charts is very different from (for instance) making diagramming flowcharts with 10,000+ movable, link-able, potentially-animated objects.
Every SVG element is a DOM element, and adding 10,000 or 100,000 nodes to the DOM causes incredible slowdown. But adding that many elements to Canvas is entirely doable, and can be quite fast.
In case it may have confused you: RaphaelJS (in my opinion the best SVG charting library) makes use of the word canvas, but that is in no way related to the HTML <canvas> element.

In the past two years, my preference has been to use svg, as I mainly deal with relatively small datasets to build pies, column charts or maps.
However one advantage I have found with canvas is the ability to save the chart as an image thanks to the toDataURL method. I haven't found an equivalent for svg, and it seems that the best way to save an svg chart client side is to convert it to canvas first (using for example canvg).
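One partial workaround, offered as a sketch rather than a true equivalent of toDataURL: if you have the SVG markup as a string, it can be wrapped in a data URL directly. The result is still an SVG image, not a bitmap, so it does not replace the canvas conversion when a PNG is needed. The function name `svgToDataUrl` and the sample markup are assumptions for this example.

```javascript
// Hypothetical helper: wrap SVG markup in a data URL.
// The result is an SVG image (vector), not a PNG or JPEG bitmap.
function svgToDataUrl(svgMarkup) {
  return "data:image/svg+xml;charset=utf-8," + encodeURIComponent(svgMarkup);
}

// A tiny hand-written chart: a single bar.
const markup =
  '<svg xmlns="http://www.w3.org/2000/svg" width="120" height="60">' +
  '<rect x="10" y="20" width="30" height="40" fill="steelblue"/>' +
  "</svg>";

const url = svgToDataUrl(markup);
console.log(url.slice(0, 40) + "...");
```

In a browser, such a URL could be used as the `href` of a download link or as the `src` of an image.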
