I have a d3 simulation that groups nodes according to their properties, as calculated by an algorithm. After a number of iterations of this algorithm, I want to display a 'cluster area' around certain groups, preferably by aggregating collided nodes while the algorithm is still running. So, rather than taking the approach of this block, where nodes are removed upon collision, is it possible to aggregate a count of collided nodes? And perhaps in a different way than in that block, which seems computationally expensive with hundreds of data nodes? I've looked through the documentation for d3-force and couldn't find anything that stood out.
Thank you!
Related
I am building a web app where the user can create a sort of bipartite graph like this one:
Sometimes the user would like to build a huge graph, for example a graph in which the topmost layer has 784 nodes, as in the next picture. My application can handle the computation, but the result is ugly and meaningless:
Do you have any idea for rendering a huge layer without just drawing all the nodes but, instead, summarising them with another, prettier representation?
So far I have thought about putting all the nodes of a "huge" layer in an empty compound node, but of course it is then not possible to draw some edges (so the huge layer would appear to be disconnected from the graph). Another solution would be to cap every layer with more than 100 nodes at exactly 100 nodes, and put them inside a compound node with a z-index greater than each node's z-index; but I haven't tried this yet.
If you have some other ideas, or if Cytoscape.js provides a way to summarise large graphs, please let me know.
Thank you.
You could group nodes when the number in a layer exceeds a certain threshold (e.g. if you have 100 nodes, you combine them into groups of 25).
I'd do this by iterating over the nodes in the layer, making a new node N for each subgroup (inserting all the relevant information needed), and then replacing any mention of the replaced nodes with N in all relevant edges (by finding the connected edges).
As I personally generate layouts/nodes in Python before sending them to Cytoscape for visualization, I'll refrain from posting a potentially inefficient/incorrect JavaScript/Cytoscape example :)
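For readers who want something concrete anyway, the grouping idea could be sketched roughly like this in plain JavaScript. This is only an illustration under assumptions: the node shape ({id, layer}), the edge shape ({source, target}), and the function name summariseLayer are all made up here, and nothing in it is Cytoscape.js API. It only prepares the data before it is handed to whatever renderer you use.

```javascript
// Hypothetical plain-object graph: nodes are {id, layer}, edges are {source, target}.
// If a layer has more than maxNodes nodes, replace them with group nodes of
// at most groupSize members each and rewire the edges to those group nodes.
function summariseLayer(nodes, edges, layer, maxNodes, groupSize) {
  const members = nodes.filter(n => n.layer === layer);
  if (members.length <= maxNodes) {
    return { nodes, edges }; // layer is small enough, leave it alone
  }

  // Map each original node id to the id of the group node that replaces it.
  const groupOf = new Map();
  const groupNodes = [];
  members.forEach((n, i) => {
    const g = Math.floor(i / groupSize);
    if (i % groupSize === 0) {
      groupNodes.push({ id: `${layer}-group-${g}`, layer, count: 0 });
    }
    groupNodes[g].count += 1; // remember how many nodes the group stands for
    groupOf.set(n.id, groupNodes[g].id);
  });

  // Rewire edges to the group nodes and drop the duplicates this creates.
  // Any extra data on the original edges is not carried over in this sketch.
  const seen = new Set();
  const newEdges = [];
  for (const e of edges) {
    const source = groupOf.has(e.source) ? groupOf.get(e.source) : e.source;
    const target = groupOf.has(e.target) ? groupOf.get(e.target) : e.target;
    const key = `${source}->${target}`;
    if (!seen.has(key)) {
      seen.add(key);
      newEdges.push({ source, target });
    }
  }

  const newNodes = nodes.filter(n => n.layer !== layer).concat(groupNodes);
  return { nodes: newNodes, edges: newEdges };
}
```

The count stored on each group node could be used for a label or for sizing the summary node, so the reader can still see how many nodes it represents.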
I need to use a force-directed algorithm with a network, but that network is divided into parts.
For example, the vertices of part 1 cannot leave part 1, and the vertices of part 2 must stay in part 2.
If there is a connection between a vertex from part 1 and a vertex from part 2, this connection will pull the two vertices towards each other, but will not let them leave their parts.
This draft image illustrates the idea:
I need to do this for about 8 parts. Some parts are inside others, and some are next to each other. The network will be drawn over these parts, with each vertex in its respective part, while the force algorithm still tries to pull connected vertices together.
My solution was to create constraints using the "tick" function from d3.
To improve performance and avoid "locked" vertices on corners, I decided to use only circles.
For each node:
Make sure it is inside its region.
Make sure it is also not inside each of the other regions.
This ended up requiring lots of calculations for each node vs. each region, on top of the complexity of collision detection. Still, it handled networks with around a thousand nodes.
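As a rough illustration of the two per-node checks with circular regions, here is a minimal sketch. It is not the CellNetVis code: the shapes node = {x, y} and region = {cx, cy, r}, and the names keepInside, keepOutside and constrain, are assumptions made for this example. The idea is that constrain would be called for every node from the tick handler.

```javascript
// Assumed shapes: node = {x, y}, region = {cx, cy, r} (a circle).

// If the node lies outside the circle, project it back onto the boundary.
function keepInside(node, region, margin = 0) {
  const dx = node.x - region.cx;
  const dy = node.y - region.cy;
  const dist = Math.hypot(dx, dy);
  const limit = region.r - margin;
  if (dist > limit && dist > 0) {
    const k = limit / dist;
    node.x = region.cx + dx * k;
    node.y = region.cy + dy * k;
  }
}

// If the node lies inside the circle, push it out to the boundary.
function keepOutside(node, region, margin = 0) {
  const dx = node.x - region.cx;
  const dy = node.y - region.cy;
  const dist = Math.hypot(dx, dy);
  const limit = region.r + margin;
  if (dist < limit) {
    // A node exactly at the centre has no direction, so push it along +x.
    const ux = dist > 0 ? dx / dist : 1;
    const uy = dist > 0 ? dy / dist : 0;
    node.x = region.cx + ux * limit;
    node.y = region.cy + uy * limit;
  }
}

// One pass of both checks for a single node: its own region first,
// then every region it must stay out of.
function constrain(node, home, excluded) {
  keepInside(node, home);
  for (const region of excluded) {
    keepOutside(node, region);
  }
}
```

A single pass does not guarantee that every constraint holds at once (pushing a node out of one region can move it across another boundary), but since the checks run again on every tick the positions tend to settle. The cost is still one distance computation per node per region, which matches the complexity described above.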
Details here:
Heberle, H., Carazzolle, M., Telles, G. et al. CellNetVis: a web tool for visualization of biological networks using force-directed layout constrained by cellular components. BMC Bioinformatics 18, 395 (2017). https://doi.org/10.1186/s12859-017-1787-5
Note for potential improvement: one of the most expensive calculations is checking which elements a node is over. For instance, if there were a native browser function that, given a point x,y, returned the elements of an SVG that the point is over, it would make the visualization smoother and leave room for more improvements. At the time I implemented it, I did not find such a function.
My case is that I am dynamically positioning a bunch of icons on a static map image, each one positioned absolutely with CSS. It often happens that two or more points are too near to each other, so the icons overlap and are no longer distinguishable.
I am looking for an algorithm to find these "too near to each other" points, and then spread their icons out in a manner that they do not overlap each other anymore.
I am thinking of a radial spread: finding the average (middle) point of all the points that are too near to each other, and then spreading them out relative to that point.
Is there any pattern for such a problem of which you may know?
Thanks a lot in advance.
Here are a few solutions that might solve your problem:
Use a solution to the closest pair of points problem to find the two icons that are closest to one another. If the closest pair is "too close" by your definition, you can move them apart from one another and repeat this process.
Use a spatial data structure like a k-d tree or R-tree to store all the points. You can then execute fast nearest-neighbor searches to find points that are close to one another and move them apart.
Use a force-directed layout algorithm to find a layout for the points that globally minimizes some energy function. Algorithms like Fruchterman-Reingold are pretty straightforward to code up and produce good results.
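To make the "move them apart and repeat" idea concrete, here is a small sketch. It is a naive pairwise relaxation, not a faithful implementation of the closest-pair algorithm or of Fruchterman-Reingold: it simply checks every pair on each pass (O(n²)), which should be acceptable for a modest number of icons. The function name spreadPoints and the parameters minDist and maxIterations are made up for this example.

```javascript
// points: [{x, y}]; minDist: smallest allowed distance between two icon centres.
// Returns new positions; the input array is not modified.
function spreadPoints(points, minDist, maxIterations = 50) {
  const out = points.map(p => ({ x: p.x, y: p.y }));
  for (let iter = 0; iter < maxIterations; iter++) {
    let moved = false;
    for (let i = 0; i < out.length; i++) {
      for (let j = i + 1; j < out.length; j++) {
        const dx = out[j].x - out[i].x;
        const dy = out[j].y - out[i].y;
        const dist = Math.hypot(dx, dy);
        if (dist >= minDist) continue;
        // Coincident points have no direction, so derive an arbitrary
        // but deterministic one from the indices.
        const angle = i + j;
        const ux = dist > 0 ? dx / dist : Math.cos(angle);
        const uy = dist > 0 ? dy / dist : Math.sin(angle);
        // Move each point half of the missing distance, in opposite directions.
        const push = (minDist - dist) / 2;
        out[i].x -= ux * push;
        out[i].y -= uy * push;
        out[j].x += ux * push;
        out[j].y += uy * push;
        moved = true;
      }
    }
    if (!moved) break; // no pair was too close on this pass
  }
  return out;
}
```

For larger point sets, the inner pair loop is the part a k-d tree or R-tree would replace, since only nearby points need to be compared. Dense clusters may not fully separate within maxIterations, so treat the result as a best effort.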
Hope this helps!
I've got something similar to the force-directed graph example. The main difference is that there is no force: the layout is static except for user drag interactions. I've added code that draws convex hulls (as svg:path elements) around user-defined groups of nodes. As the nodes are moved (that is, .on("drag")), the code that calculates the hulls is fired. That means it's fired constantly while the nodes are dragged. There are typically 10 to 30 hulls; a node may be in one or more hulls. See below:
I'm trying to figure out if there's a more efficient (higher performance) way to do what I'm doing. Keeping this at high-level for now:
On drag, update all hull graphics:
For each hull, create an array of the coordinates of the constituent nodes.
Replace each coordinate pair in the above array with 8 points that are some distance r away from the original point. This is to pad/dilate the hull so it's not actually touching any nodes.
Feed the calculated coordinates to d3.geom.hull.
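For reference, the three steps above can be sketched without any library like this. The convexHull function here is a standard monotone-chain implementation that merely stands in for d3.geom.hull so the example runs on its own; paddingPoints and paddedHull are names made up for this sketch, and nodes are assumed to be {x, y} objects.

```javascript
// Eight points evenly spaced on a circle of radius r around (x, y).
function paddingPoints(x, y, r) {
  const pts = [];
  for (let k = 0; k < 8; k++) {
    const a = (k * Math.PI) / 4;
    pts.push([x + r * Math.cos(a), y + r * Math.sin(a)]);
  }
  return pts;
}

// Andrew's monotone chain convex hull (stand-in for d3.geom.hull).
// Returns the hull vertices as [x, y] pairs.
function convexHull(points) {
  const pts = points.slice().sort((a, b) => a[0] - b[0] || a[1] - b[1]);
  if (pts.length < 3) return pts;
  const cross = (o, a, b) =>
    (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0]);
  const lower = [];
  for (const p of pts) {
    while (lower.length >= 2 &&
           cross(lower[lower.length - 2], lower[lower.length - 1], p) <= 0) {
      lower.pop();
    }
    lower.push(p);
  }
  const upper = [];
  for (let i = pts.length - 1; i >= 0; i--) {
    const p = pts[i];
    while (upper.length >= 2 &&
           cross(upper[upper.length - 2], upper[upper.length - 1], p) <= 0) {
      upper.pop();
    }
    upper.push(p);
  }
  // The last point of each half is the first point of the other half.
  lower.pop();
  upper.pop();
  return lower.concat(upper);
}

// Steps 1 to 3: collect node coordinates, dilate each into 8 points, take the hull.
function paddedHull(nodes, r) {
  const pts = [];
  for (const n of nodes) {
    pts.push(...paddingPoints(n.x, n.y, r));
  }
  return convexHull(pts);
}
```

Note that every call recomputes all 8 padding points for every node in the hull, which is the part that caching per node would avoid.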
I'm getting about 50 fps in Chrome, which isn't bad at all, but it seems like a dreadfully inefficient setup.
My only clear idea is to store the ID(s) of the hull(s) in which a node is contained in the nodes array, and only update that/those hulls instead of all of the hulls. I'm also wondering about more efficient data structures to store the hull data (not the paths themselves). Currently the data structure looks like this:
hulls = {1:{nodeIDs:[1,2,3,4,5], name:"hull1"}, 2:{...}, ...};
Pardon the open-ended question, but I'm hoping there's some programming technique that I'm not familiar with that would work well here.
The most efficient approach would be to only recalculate the position of the 8-tuple of coordinates orbiting the node that is being dragged, then propagating that information in the parent hull(s) for redrawing.
You can do this by making a few changes.
To each node's data structure, add the following elements.
References to all parent hulls (as you suggested)
The 8-tuple of "padding" coordinates orbiting that node (cache them)
On drag, consider only the node being dragged.
Update that node's 8-tuple.
Collect that node's parent hull's child nodes and collapse together their 8-tuples and feed them to hull (destroying the previous hull, if necessary).
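A minimal sketch of that bookkeeping might look like the following. The data layout is an assumption that extends the hulls object from the question: nodes is taken to be an object keyed by node ID holding {x, y}, and the fields hullIDs and pad, as well as the function names, are invented for this example. The function returns the point lists to feed to the hull routine rather than calling it, so it does not depend on any particular d3 version.

```javascript
// Assumed structures:
//   nodes: { [id]: { x, y } }            (hullIDs and pad are added by buildIndex)
//   hulls: { [id]: { nodeIDs: [...], name } }

// Eight points evenly spaced on a circle of radius r around (x, y).
function padPoints(x, y, r) {
  const pts = [];
  for (let k = 0; k < 8; k++) {
    const a = (k * Math.PI) / 4;
    pts.push([x + r * Math.cos(a), y + r * Math.sin(a)]);
  }
  return pts;
}

// Run once: give every node its cached 8-tuple and back-references to its hulls.
function buildIndex(nodes, hulls, r) {
  for (const id in nodes) {
    nodes[id].hullIDs = [];
    nodes[id].pad = padPoints(nodes[id].x, nodes[id].y, r);
  }
  for (const hullID in hulls) {
    for (const nodeID of hulls[hullID].nodeIDs) {
      nodes[nodeID].hullIDs.push(hullID);
    }
  }
}

// Call from the drag handler. Only the dragged node's 8-tuple is recomputed;
// the result maps each affected hull ID to the combined padding points of
// its member nodes, ready to be passed to the hull routine.
function onNodeDragged(nodes, hulls, nodeID, x, y, r) {
  const node = nodes[nodeID];
  node.x = x;
  node.y = y;
  node.pad = padPoints(x, y, r);

  const dirty = {};
  for (const hullID of node.hullIDs) {
    dirty[hullID] = hulls[hullID].nodeIDs.flatMap(id => nodes[id].pad);
  }
  return dirty;
}
```

In the drag handler you would then loop over the returned object, run the hull routine on each point list, and update only those paths. Hulls that do not contain the dragged node are never touched.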
I'm not familiar with d3.js's API, but I'm sure you can fill in any blanks that I've left.
I'm having a problem with finding N positions in a grid based on a central point and a max range.
I have this grid, where each coordinate can be either closed or open. Starting from an open coordinate, I'm trying to find all the open coordinates around it that are valid and whose walking distance from it is equal to or lower than the max walk range.
At first I tried a solution using A*, where I would select every coordinate in range and check whether it was valid; if it was, I would call A* from it to the center position and count the number of steps, and if that was higher than my max range I would remove the coordinate from my list. Of course, this is really slow for ranges higher than 3, or for grids with more than just a few coordinates.
Then I tried recursively searching for the coordinates, starting with the central one, expanding outwards and recursively checking each coordinate's validity. That solution proved to be the most effective, except that each function call was rechecking coordinates that had already been checked and returning repeated values. I thought I could just share the 'checked' list between the function calls, but that broke my code: a call would check a coordinate and close it even if it still had child nodes that were not in range of their own center (but were in range of the first center), which gave some weird results.
Finally, my question is, how do I do that? Is my approach wrong? Is there a better way of doing it?
Thanks.
I've implemented something similar once. The simple recursive solution you proposed worked fine for small walk ranges, but execution time spun out of control for larger values ...
I improved this by implementing a variant of Dijkstra's algorithm, which keeps a list of visited nodes as you said. But instead of running a full path search for every coordinate in reach like you did with A*, do the first N iterations of the algorithm if N is your walk range (the gif on Wikipedia might help you understand what I mean). Your list of visited nodes will then be the reachable tiles you're looking for.
http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm
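A minimal sketch of that idea, under some assumptions: every step costs 1 and movement is only up/down/left/right, in which case Dijkstra's algorithm reduces to a plain breadth-first search that is stopped after N rounds. The grid encoding (grid[y][x] is truthy for an open cell) and the function name reachableCells are made up for this example.

```javascript
// grid[y][x] is truthy for an open cell and falsy for a closed one.
// Returns every open cell reachable from the start in at most maxRange steps
// (the start cell itself is not included), with its step count.
function reachableCells(grid, startX, startY, maxRange) {
  const height = grid.length;
  const width = grid[0].length;
  const key = (x, y) => `${x},${y}`;
  const visited = new Set([key(startX, startY)]);
  const result = [];
  let frontier = [[startX, startY]];

  // Each round expands the frontier by exactly one step.
  for (let step = 1; step <= maxRange && frontier.length > 0; step++) {
    const next = [];
    for (const [x, y] of frontier) {
      for (const [dx, dy] of [[1, 0], [-1, 0], [0, 1], [0, -1]]) {
        const nx = x + dx;
        const ny = y + dy;
        if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
        if (!grid[ny][nx] || visited.has(key(nx, ny))) continue;
        // The first visit is also the shortest one, so the cell can be
        // marked as visited for good without breaking later expansions.
        visited.add(key(nx, ny));
        next.push([nx, ny]);
        result.push({ x: nx, y: ny, steps: step });
      }
    }
    frontier = next;
  }
  return result;
}
```

Because the search expands level by level, a shared visited list is safe here: a cell is always reached first by a shortest path, which is exactly what went wrong when sharing the 'checked' list in the depth-first recursive version. If steps had different costs, a priority queue (real Dijkstra) would be needed instead.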