I have gone through the 'point-along-path' d3 visualization through the code: https://bl.ocks.org/mbostock/1705868. I have noticed that while the point is moving along its path it consumes 7 to 11% of CPU Usage.
Current scenario, I have around 100 paths and on each path, I will have to move points(circles) from sources to destinations. So it consumes more than 90% of the CPU memory as more number of points are moving at the same time.
I have tried as:
function translateAlong(path) {
var l = path.getTotalLength();
return function(d, i, a) {
return function(t) {
var p = path.getPointAtLength(t * l);
return "translate(" + p.x + "," + p.y + ")";
};
};
}
// On each path(100 paths), we are moving circles from source to destination.
var movingCircle = mapNode.append('circle')
.attr('r', 2)
.attr('fill', 'white')
movingCircle.transition()
.duration(10000)
.ease("linear")
.attrTween("transform", translateAlong(path.node()))
.each("end", function() {
this.remove();
});
So what should be the better way to reduce the CPU usage?
Thanks.
There are a few approaches to this, which vary greatly in potential efficacy.
Ultimately, you are conducting expensive operations every animation frame to calculate each point's new location and to re render it. So, every effort should be made to reduce the cost of those operations.
If frame rate is dropping below 60, it probably means we're nearing CPU capacity. I've used frame rate below to help indicate CPU capacity as it is more easily measured than CPU usage (and probably less invasive).
I had all sorts of charts and theory for this approach, but once typed it seemed like it should be intuitive and I didn't want to dwell on it.
Essentially the goal is to maximize how many transitions I can show at 60 frames per second - this way I can scale back the number of transitions and gain CPU capacity.
Ok, let's get some transitions running with more than 100 nodes along more than 100 paths at 60 frames per second.
D3v4
First, d3v4 likely offers some benefits here. v4 synchronized transitions, which appears to have had the effect of slightly improved times. d3.transition is very effective and low cost in any event, so this isn't the most useful - but upgrading isn't a bad idea.
There are also minor browser specific gains to be had by using different shaped nodes, positioning by transform or by cx,cy etc. I didn't implement any of those because the gains are relatively trivial.
Canvas
Second, SVG just can't move fast enough. Manipulating the DOM takes time, additional elements slows down operations and takes up more memory. I realize canvas can be less convenient from a coding perspective but canvas is faster than SVG for this sort of task. Use detached circle elements to represent each node (the same as with the paths), and transition these.
Save more time by drawing two canvases: one to draw once and to hold the paths (if needed) and another to be redrawn each frame showing the points. Save further time by setting the datum of each circle to the length of the path it is on: no need to call path.getTotalLength() each time.
Maybe something like this
Canvas Simplified Lines
Third, we still have a detached node that has SVG paths so we can use path.getPointAtLength() - and this is actually pretty effective. A major point slowing this down though is the use of curved lines. If you can do it, draw straight lines (multiple segments are fine) - the difference is substantial.
As a further bonus, use context.fillRect() instead of context.arc()
Pure JS and Canvas
Lastly, D3 and the detached nodes for each path (so we can use path.getTotalLength()) can start to get in the way. If need be leave them behind using typed arrays, context.imageData, and your own formula for positioning nodes on paths. Here's a quick bare bones example (100 000 nodes, 500 000 nodes, 1 000 000 nodes (Chrome is best for this, possible browser limitations. Since the paths now essentially color the entire canvas a solid color I don't show them but the nodes follow them still). These can transition 700 000 nodes at 10 frames per second on my slow system. Compare those 7 million transition positioning calculations and renderings/second against about 7 thousand transition positioning calculations and renderings/second I got with d3v3 and SVG (three orders of magnitude difference):
canvas A is with curved lines (cardinal) and circle markers (link above), canvas B is with straight (multi-segment) lines and square markers.
As you might imagine, a machine and script that can render 1000 transitioning nodes at 60 frames per second will have a fair bit of extra capacity if only rendering 100 nodes.
If the transition position and rendering calculations are the primary activity and CPU usage is at 100%, then half the nodes should free up roughly half the CPU capacity. In the slowest canvas example above, my machine logged 200 nodes transitioning along cardinal curves at 60 frames per second (it then started to drop off, indicating that CPU capacity was limiting frame rate and consequently usage should be near 100%), with 100 nodes we have a pleasant ~50% CPU usage:
Horizontal centerline is 50% CPU usage, transition repeated 6 times
But the key savings are to be found from dropping complex cardinal curves - if possible use straight lines. The other key savings are from customizing your scripts to be purpose built.
Compare the above with straight lines (multi segment still) and square nodes:
Again, horizontal centerline is 50% CPU usage, transition repeated 6 times
The above is 1000 transitioning nodes on 1000 3 segment paths - more than an order of magnitude better than with curved lines and circular markers.
Other Options
These can be combined with methods above.
Don't animate every point each tick
If you can't position all nodes each transition tick before the next animation frame you'll be using close to all of your CPU capacity. One option is don't position each node each tick - you don't have to. This is a complex solution - but position one third of circles each tick - each circle still can be positioned 20 frames per second (pretty smooth), but the amount of calculations per frame are 1/3 of what they would be otherwise. For canvas you still have to render each node - but you could skip calculating the position for two thirds of the nodes. For SVG this is a bit easier as you could modify d3-transition to include an every() method that sets how many ticks pass before transition values are re-calculated (so that one third are transitioned each tick).
Caching
Depending on circumstance, caching is also not a bad idea - but the front-ending of all calculations (or loading of data) may lead to unnecessary delays in the commencement of animation - or slowness on first run. This approach did lead to positive outcomes for me, but is discussed in another answer so I won't go into it here.
Post edit:
Here is the default. (peak CPU around %99 for 100 points at 2.7Ghz i7)
Here is my version. (peak CPU around 20% for 100 points at 2.7Ghz i7)
On average I am 5 times faster.
I presume the bottleneck here is the call to getPointAtLength method at every 17ms. I would also avoid lengthy string concatenations if I have to, but in your case its not much long so I think the best way:
cache points beforehand and only calculate once with a given resolution (I divided here in 1000 parts)
reduce calls to DOM methods within the requestAnimationFrame (that function you see receiving the normalized t parameter)
In the default case there is 2 calls, 1 when you call getPointAtLength, and then another one when you are setting the translate(under the hood).
You can replace the translateAlong with this below:
function translateAlong(path){
var points = path.__points || collectPoints(path);
return function (d,i){
var transformObj = this.transform.baseVal[0];
return function(t){
var index = t * 1000 | 0,
point = points[index];
transformObj.setTranslate(point[0],point[1]);
}
}
}
function collectPoints(path) {
var l = path.getTotalLength(),
step = l*1e-3,
points = [];
for(var i = 0,p;i<=1000;++i){
p = path.getPointAtLength(i*step);
points.push([p.x,p.y]);
}
return path.__points = points;
}
And a small modification to that tweening line:
.tween("transform", translateAlong(path.node()))
setting attr is not necessary, calling it is enough. Here is the result:
http://jsfiddle.net/ibowankenobi/8kx04y29/
Tell me if it did improve, because I'm not 100% sure.
Another way of achieving this might be to use an svg:animateMotion which you can use to move an element along a given path. Here is an example from the docs. In essence you want:
<svg>
<path id="path1" d="...">
<circle cx="" cy="" r="10" fill="red">
<animateMotion dur="10s" repeatCount="0">
<mpath xlink:href="#path1" />
</animateMotion>
</circle>
</path>
</svg>
I've not profiled it, but I think you're struggle to get much better performance than using something built into SVG itself.
Browser Support
Note that after a comment from #ibrahimtanyalcin I started checking browser compatibility. Turns out that this isn't supported in IE or Microsoft Edge.
On my computer:
#mbostock uses 8% CPU.
#ibrahimtanyalcin uses 11% CPU.
#Ian uses 10% CPU.
For #mbostock and #ibrahimtanyalcin this means the transition and the transform update use the CPU.
If I put 100 of these animations in 1 SVG I get
#mbostock uses 50% CPU. (1 core full)
#Ian uses 40% CPU.
All animations look smooth.
One possibility is to add a sleep in the transform update function using https://stackoverflow.com/a/39914235/9938317
Edit
In the nice Andrew Reid answer i have found a few optimizations.
I have written a version of the Canvas+JS 100 000 test that only does the calculation part and counts how many iterations can be do in 4000ms. Half the time order=false as t controls it.
I have written my own random generator to be sure that I get the same random numbers each time I run the modified code.
The blocks version of the code: 200 iterations
According to the docs for parseInt
parseInt should not be used as a substitute for Math.floor()
Converting a number to a string and then parsing the string up to the . sounds not very efficient.
Replacing parseInt() with Math.floor(): 218 iterations
I also found a line that has no function and looks not important
let p = new Int16Array(2);
It is inside the inner while loop.
Replacing this line with
let p;
gives 300 iterations.
Using these modifications the code can handle more points with a frame rate of 60 Hz.
I tried a few other things but they turn out to be slower.
I was surprised that if I pre-calculate the lengths of the segments and simplify the calculation of segment to a mere array lookup it is slower than doing the 4 array lookups and a couple of Math.pow and additions and a Math.sqrt.
Related
I'm currently porting an application to React Native that captures user input as a stroke and animates it to the correct position to match an svg (pictures below). In the web, I use a combination of multiple smoothing libraries & pixijs to achieve perfectly smooth transitions with no artifacts.
With React Native & reanimated I'm limited to functions I can write by hand to handle the interpolation between two paths. Currently what I'm doing is:
Convert the target svg to a fixed number N of points
Smooth the captured input and convert it to a series of N points
Loop over each coordinate and interpolate the value between those two points (linear interpolation)
Run the resulting points array through a Catmull-Rom function
Render the resulting SVG curve
1 & 2 I can cache prior to the animation, but steps 3 4 & 5 need to happen on each render.
Unfortunately, using this method, I'm limited to a value of around 300 N as the maximum amount of points before dropping some frames. I'm also still seeing some artifacts at the end of an animation that I don't know how to fix.
This is sufficient, but given that in the web I can animate tens of thousands of points without dropping frames, I feel like I am missing a key performance optimization here. For example, is there a way to combine steps 3 & 4? Are there more performant algorithms than Catmull-Rom?
Is there a better way to achieve a smooth transition between two vector paths using just pure JavaScript (or dropping into Swift if that is possible)?
Is there something more I can do to remove the artifacts pictured in the last photo? I'm not sure what these are called technically so it's hard for me to research - the catmull-rom spline removed most of them but I still see a few at the tail ends of the animation.
Animation end/start state:
Animation middle state:
Animation start/end state (with artifact):
You might want to have a look at flubber.js
Also why not ditch the catmull-rom for simple linear sections (probably detailed enough with 1000+ points)
If neither helps, or you want to get as fast as possible, you might want to leverage the GPUs power for embarrassingly parallel workflows like the interpolation between to N-sized arrays.
edit:
also consider using the skia renderer which already leverages the gpu and supports stuff perfectly fitting your use-case
import {Canvas, Path, Skia, interpolatePath} from "#shopify/react-native-skia";
//obv. you need to modify this to use your arrays
const path1 = new Path();
path1.moveTo(0, 0);
path1.lineTo(100, 0);
const path2 = new Path();
path2.moveTo(0, 0);
path2.lineTo(0, 100);
//you have to do this dynamically (maybe using skia animations)
let animationProgress = 0.5;
//magic already implemented for you
let path = interpolatePath(animationProgress, [0, 1], [path1, path2]);
const PathDemo = () => {
return (
<Canvas style={{ flex: 1 }}>
<Path
path={path}
color="lightblue"
/>
</Canvas>
);
};
I am trying to calculate with higher precision numbers in JavaScript to be able to zoom in more on the Mandlebrot set.
(after a certain amount of zooming the results get "pixelated", because of the low precision)
I have looked at this question, so I tried using a library such as BigNumber but it was unusably slow.
I have been trying to figure this out for a while and I think the only way is to use a slow library.
Is there a faster library?
Is there any other way to calculate with higher precision numbers?
Is there any other way to be able to zoom in more on the Mandlebrot set?
Probably unneceseary to add this code, but this is the function I use to check if a point is in the Mandlebrot set.
function mandelbrot(x, y, it) {
var z = [0, 0]
var c1 = [x, y]
for (var i = 0; i < it; i++) {
z = [z[0]*z[0] - z[1]*z[1] + c1[0], 2*z[0]*z[1] + c1[1]]
if (Math.abs(z[0]) > 2, Math.abs(z[1]) > 2) {
break
}
}
return i
}
The key is not so much the raw numeric precision of JavaScript numbers (though that of course has its effects), but the way the basic Mandelbrot "escape" test works, specifically the threshold iteration counts. To compute whether a point in the complex plane is in or out of the set, you iterate on the formula (which I don't exactly remember and don't feel like looking up) for the point over and over again until the point obviously diverges (the formula "escapes" from the origin of the complex plane by a lot) or doesn't before the iteration threshold is reached.
The iteration threshold when rendering a view of the set that covers most of it around the origin of the complex plane (about 2 units in all directions from the origin) can be as low as 500 to get a pretty good rendering of the whole set at a reasonable magnification on a modern computer. As you zoom in, however, the iteration threshold needs to increase in inverse proportion to the size of the "window" onto the complex plane. If it doesn't, then the "escape" test doesn't work with sufficient accuracy to delineate fine details at higher magnifications.
The formula I used in my JavaScript implementation is
maxIterations = 400 * Math.log(1/dz0)
where dz0 is (arbitrarily) the width of the window onto the plane. As one zooms into a view of the set (well, the "edge" of the set, where things are interesting), dz0 gets pretty small so the iteration threshold gets up into the thousands.
The iteration count, of course, for points that do "escape" (that is, points that are not part of the Mandelbrot set) can be used as a sort of "distance" measurement. A point that escapes within a few iterations is clearly not "close to" the set, while a point that escapes only after 2000 iterations is much closer. That distance quality can be used in various ways in visualizations, either to provide a color value (common) or possibly a z-axis value if the set is being rendered as a 3D view (with the set as a sort of "mesa" in three dimensions and the borders being a vertical "cliff" off the sides).
I am creating a Multiplayer game using socket io in javascript. The game works perfectly at the moment aside from the client interpolation. Right now, when I get a packet from the server, I simply set the clients position to the position sent by the server. Here is what I have tried to do:
getServerInfo(packet) {
var otherPlayer = players[packet.id]; // GET PLAYER
otherPlayer.setTarget(packet.x, packet.y); // SET TARGET TO MOVE TO
...
}
So I set the players Target position. And then in the Players Update method I simply did this:
var update = function(delta) {
if (x != target.x || y != target.y){
var direction = Math.atan2((target.y - y), (target.x - x));
x += (delta* speed) * Math.cos(direction);
y += (delta* speed) * Math.sin(direction);
var dist = Math.sqrt((x - target.x) * (x - target.x) + (y - target.y)
* (y - target.y));
if (dist < treshhold){
x = target.x;
y = target.y;
}
}
}
This basically moves the player in the direction of the target at a fixed speed. The issue is that the player arrives at the target either before or after the next information arrives from the server.
Edit: I have just read Gabriel Bambettas Article on this subject, and he mentions this:
Say you receive position data at t = 1000. You already had received data at t = 900, so you know where the player was at t = 900 and t = 1000. So, from t = 1000 and t = 1100, you show what the other player did from t = 900 to t = 1000. This way you’re always showing the user actual movement data, except you’re showing it 100 ms “late”.
This again assumed that it is exactly 100ms late. If your ping varies a lot, this will not work.
Would you be able to provide some pseudo code so I can get an Idea of how to do this?
I have found this question online here. But none of the answers provide an example of how to do it, only suggestions.
I'm completely fresh to multiplayer game client/server architecture and algorithms, however in reading this question the first thing that came to mind was implementing second-order (or higher) Kalman filters on the relevant variables for each player.
Specifically, the Kalman prediction steps which are much better than simple dead-reckoning. Also the fact that Kalman prediction and update steps work somewhat as weighted or optimal interpolators. And futhermore, the dynamics of players could be encoded directly rather than playing around with abstracted parameterizations used in other methods.
Meanwhile, a quick search led me to this:
An improvement of dead reckoning algorithm using kalman filter for minimizing network traffic of 3d on-line games
The abstract:
Online 3D games require efficient and fast user interaction support
over network, and the networking support is usually implemented using
network game engine. The network game engine should minimize the
network delay and mitigate the network traffic congestion. To minimize
the network traffic between game users, a client-based prediction
(dead reckoning algorithm) is used. Each game entity uses the
algorithm to estimates its own movement (also other entities'
movement), and when the estimation error is over threshold, the entity
sends the UPDATE (including position, velocity, etc) packet to other
entities. As the estimation accuracy is increased, each entity can
minimize the transmission of the UPDATE packet. To improve the
prediction accuracy of dead reckoning algorithm, we propose the Kalman
filter based dead reckoning approach. To show real demonstration, we
use a popular network game (BZFlag), and improve the game optimized
dead reckoning algorithm using Kalman filter. We improve the
prediction accuracy and reduce the network traffic by 12 percents.
Might seem wordy and like a whole new problem to learn what it's all about... and discrete state-space for that matter.
Briefly, I'd say a Kalman filter is a filter that takes into account uncertainty, which is what you've got here. It normally works on measurement uncertainty at a known sample rate, but it could be re-tooled to work with uncertainty in measurement period/phase.
The idea being that in lieu of a proper measurement, you'd simply update with the kalman predictions. The tactic is similar to target tracking applications.
I was recommended them on stackexchange myself - took about a week to figure out how they were relevant but I've since implemented them successfully in vision processing work.
(...it's making me want to experiment with your problem now !)
As I wanted more direct control over the filter, I copied someone else's roll-your-own implementation of a Kalman filter in matlab into openCV (in C++):
void Marker::kalmanPredict(){
//Prediction for state vector
Xx = A * Xx;
Xy = A * Xy;
//and covariance
Px = A * Px * A.t() + Q;
Py = A * Py * A.t() + Q;
}
void Marker::kalmanUpdate(Point2d& measuredPosition){
//Kalman gain K:
Mat tempINVx = Mat(2, 2, CV_64F);
Mat tempINVy = Mat(2, 2, CV_64F);
tempINVx = C*Px*C.t() + R;
tempINVy = C*Py*C.t() + R;
Kx = Px*C.t() * tempINVx.inv(DECOMP_CHOLESKY);
Ky = Py*C.t() * tempINVy.inv(DECOMP_CHOLESKY);
//Estimate of velocity
//units are pixels.s^-1
Point2d measuredVelocity = Point2d(measuredPosition.x - Xx.at<double>(0), measuredPosition.y - Xy.at<double>(0));
Mat zx = (Mat_<double>(2,1) << measuredPosition.x, measuredVelocity.x);
Mat zy = (Mat_<double>(2,1) << measuredPosition.y, measuredVelocity.y);
//kalman correction based on position measurement and velocity estimate:
Xx = Xx + Kx*(zx - C*Xx);
Xy = Xy + Ky*(zy - C*Xy);
//and covariance again
Px = Px - Kx*C*Px;
Py = Py - Ky*C*Py;
}
I don't expect you to be able to use this directly though, but if anyone comes across it and understand what 'A', 'P', 'Q' and 'C' are in state-space (hint hint, state-space understanding is a pre-req here) they'll likely see how connect the dots.
(both matlab and openCV have their own Kalman filter implementations included by the way...)
This question is being left open with a request for more detail, so I’ll try to fill in the gaps of Patrick Klug’s answer. He suggested, reasonably, that you transmit both the current position and the current velocity at each time point.
Since two position and two velocity measurements give a system of four equations, it enables us to solve for a system of four unknowns, namely a cubic spline (which has four coefficients, a, b, c and d). In order for this spline to be smooth, the first and second derivatives (velocity and acceleration) should be equal at the endpoints. There are two standard, equivalent ways of calculating this: Hermite splines (https://en.wikipedia.org/wiki/Cubic_Hermite_spline) and Bézier splines (http://mathfaculty.fullerton.edu/mathews/n2003/BezierCurveMod.html). For a two-dimensional problem such as this, I suggested separating variables and finding splines for both x and y based on the tangent data in the updates, which is called a clamped piecewise cubic Hermite spline. This has several advantages over the splines in the link above, such as cardinal splines, which do not take advantage of that information. The locations and velocities at the control points will match, you can interpolate up to the last update rather than the one before, and you can apply this method just as easily to polar coordinates if the game world is inherently polar like Space wars. (Another approach sometimes used for periodic data is to perform a FFT and do trigonometric interpolation in the frequency domain, but that doesn’t sound applicable here.)
What originally appeared here was a derivation of the Hermite spline using linear algebra in a somewhat unusual way that (unless I made a mistake entering it) would have worked. However, the comments convinced me it would be more helpful to give the standard names for what I was talking about. If you are interested in the mathematical details of how and why this works, this is a better explanation: https://math.stackexchange.com/questions/62360/natural-cubic-splines-vs-piecewise-hermite-splines
A better algorithm than the one I gave is to represent the sample points and first derivatives as a tridiagonal matrix that, multiplied by a column vector of coefficients, produces the boundary conditions, and solve for the coefficients. An alternative is to add control points to a Bézier curve where the tangent lines at the sampled points intersect and on the tangent lines at the endpoints. Both methods produce the same, unique, smooth cubic spline.
One situation you might be able to avoid if you were choosing the points rather than receiving updates is if you get a bad sample of points. You can’t, for example, intersect parallel tangent lines, or tell what happened if it’s back in the same place with a nonzero first derivative. You’d never choose those points for a piecewise spline, but you might get them if an object made a swerve between updates.
If my computer weren’t broken right now, here is where I would put fancy graphics like the ones I posted to TeX.SX. Unfortunately, I have to bow out of those for now.
Is this better than straight linear interpolation? Definitely: linear interpolation will get you straight- line paths, quadratic splines won't be smooth, and higher-order polynomials will likely be overfitted. Cubic splines are the standard way to solve that problem.
Are they better for extrapolation, where you try to predict where a game object will go? Possibly not: this way, you’re assuming that a player who’s accelerating will keep accelerating, rather than that they will immediately stop accelerating, and that could put you much further off. However, the time between updates should be short, so you shouldn’t get too far off.
Finally, you might make things a lot easier on yourself by programming in a bit more conservation of momentum. If there’s a limit to how quickly objects can turn, accelerate or decelerate, their paths will not be able to diverge as much from where you predict based on their last positions and velocities.
Depending on your game you might want to prefer smooth player movement over super-precise location. If so, then I'd suggest to aim for 'eventual consistency'. I think your idea of keeping 'real' and 'simulated' data-points is a good one. Just make sure that from time to time you force the simulated to converge with the real, otherwise the gap will get too big.
Regarding your concern about different movement speed I'd suggest you include the current velocity and direction of the player in addition to the current position in your packet. This will enable you to more smoothly predict where the player would be based on your own framerate/update timing.
Essentially you would calculate the current simulated velocity and direction taking into account the last simulated location and velocity as well as last known location and velocity (put more emphasis on the second) and then simulate new position based on that.
If the gap between simulated and known gets too big, just put more emphasis on the known location and the otherPlayer will catch up quicker.
I am creating a video game based on Node.js/WebGL/Canvas/PIXI.js.
In this game, blocks have a generic size: they can be circles, polygons, or everything. So, my physical engine needs to know where exactly the things are, what pixels are walls and what pixels are not. Since I think PIXI don't allow this, I create an invisible canvas where I put all the wall's images of the map. Then, I use the function getImageData to create a function "isWall" at (x, y):
function isWall(x, y):
return canvas.getImageData(x, y, 1, 1).data[3] != 0;
However, this is very slow (it takes up to 70% of the CPU time of the game, according to Chrome profiling). Also, since I introduced this function, I sometimes got the error "Oops, WebGL crashed" without any additional advice.
Is there a better method to access the value of the pixel? I thought about storing everything in a static bit array (walls have a fixed size), with 1 corresponding to a wall and 0 to a non-wall. Is it reasonable to have a 10-million-cells array in memory?
Some thoughts:
For first check: Use collision regions for all of your objects. The regions can even be defined for each side depending on shape (ie. complex shapes). Only check for collisions inside intersecting regions.
Use half resolution for hit-test bitmaps (or even 25% if your scenario allow). Our brains are not capable of detecting pixel-accurate collisions when things are moving so this can be taken advantage of.
For complex shapes, pre-store the whole bitmap for it (based on its region(s)) but transform it to a single value typed array like Uint8Array with high and low values (re-use this instead of getting one and one pixels via the context). Subtract object's position and use the result as a delta for your shape region, then hit-testing the "bitmap". If the shape rotates, transform incoming check points accordingly (there is probably a sweet-spot here where updating bitmap becomes faster than transforming a bunch of points etc. You need to test for your scenario).
For close-to-square shaped objects do a compromise and use a simple rectangle check
For circles and ellipses use un-squared values to check distances for radius.
In some cases you can perhaps use collision predictions which you calculate before the games starts and when knowing all objects positions, directions and velocities (calculate the complete motion path, find intersections for those paths, calculate time/distance to those intersections). If your objects change direction etc. due to other events during their path, this will of course not work so well (or try and see if re-calculating is beneficial or not).
I'm sure why you would need 10m stored in memory, it's doable though - but you will need to use something like a quad-tree and split the array up, so it becomes efficient to look up a pixel state. IMO you will only need to store "bits" for the complex shapes, and you can limit it further by defining multiple regions per shape. For simpler shapes just use vectors (rectangles, radius/distance). Do performance tests often to find the right balance.
In any case - these sort of things has to be hand-optimized for the very scenario, so this is just a general take on it. Other factors will affect the approach such as high velocities, rotation, reflection etc. and it will quickly become very broad. Hope this gives some input though.
I use bit arrays to store 0 || 1 info and it works very well.
The information is stored compactly and gets/sets are very fast.
Here is the bit library I use:
https://github.com/drslump/Bits-js/blob/master/lib/Bits.js
I've not tried with 10m bits so you'll have to try it on your own dataset.
The solution you propose is very "flat", meaning each pixel must have a corresponding bit. This results in a large amount of memory being required--even if information is stored as bits.
An alternative testing data ranges instead of testing each pixel:
If the number of wall pixels is small versus the total number of pixels you might try storing each wall as a series of "runs". For example, a wall run might be stored in an object like this (warning: untested code!):
// an object containing all horizontal wall runs
var xRuns={}
// an object containing all vertical wall runs
var yRuns={}
// define a wall that runs on y=50 from x=100 to x=185
// and then runs on x=185 from y=50 to y=225
var y=50;
var x=185;
if(!xRuns[y]){ xRuns[y]=[]; }
xRuns[y].push({start:100,end:185});
if(!yRuns[x]){ yRuns[x]=[]; }
yRuns[x].push({start:50,end:225});
Then you can quickly test an [x,y] against the wall runs like this (warning untested code!):
function isWall(x,y){
if(xRuns[y]){
var a=xRuns[y];
var i=a.length;
do while(i--){
var run=a[i];
if(x>=run.start && x<=run.end){return(true);}
}
}
if(yRuns[x]){
var a=yRuns[x];
var i=a.length;
do while(i--){
var run=a[i];
if(y>=run.start && y<=run.end){return(true);}
}
}
return(false);
}
This should require very few tests because the x & y exactly specify which array of xRuns and yRuns need to be tested.
It may (or may not) be faster than testing the "flat" model because there is overhead getting to the specified element of the flat model. You'd have to perf test using both methods.
The wall-run method would likely require much less memory.
Hope this helps...Keep in mind the wall-run alternative is just off the top of my head and probably requires tweaking ;-)
I've got an issue with an experiment I'm working on.
My plan is to have a beautiful and shining stars Background on a whole page.
Using that wondeful tutorial (http://timothypoon.com/blog/2011/01/19/html5-canvas-particle-animation/) I managed to get the perfect background.
I use a static canvas to display static stars and an animated canvas for the shining ones.
The fact is it's very memory hungry! On chrome and opera it runs quite smoothly, but on firefox IE or tablet, it was a total mess 1s to render each frame etc... It is worse on pages where HEIGHT is huge.
So i went into some optimisations:
-Using a buffer canvas, the problem was createRadialGradient which was called 1500 times each frame
-Using a big buffer canvas, and 1 canvas for each stars with an only call to createRadialGradient at init.
-Remove that buffer canvas and drawing every stars canvas to the main one
That last optimisation was the best i could achieve so i wrote a fiddle displaying how is the code right now.
//Buffering the star image
this.scanvas = document.createElement('canvas');
this.scanvas.width=2*this.r;
this.scanvas.height=2*this.r;
this.scon=this.scanvas.getContext('2d');
g = this.scon.createRadialGradient(this.r,this.r,0,this.r,this.r,this.r);
g.addColorStop(0.0, 'rgba(255,255,255,0.9)');
g.addColorStop(this.stop, 'rgba('+this.color.r+','+this.color.g+','+this.color.b+','+this.stop+')');
g.addColorStop(1.0, 'rgba('+this.color.r+','+this.color.g+','+this.color.b+',0)');
this.scon.fillStyle = g;
this.scon.fillRect(0,0,2*this.r,2*this.r);
That's the point where I need you:
-A way to adjust the number of shining stars according to the user perfomance
-Optimisation tips
Thanks in advance to everyone minding to help me and I apologize if I made grammar mistakes, my english isn't perfect.
EDIT
Thanks for your feedbacks,
Let me explains the whole process,
Every stars has it's own different gradient and size, that's why I stored it into a personal canvas, the shining effect is only done by scaling that canvas on the main one with drawImage.
I think the best would be to prerender 50 or 100 different stars in a buffer canvas then picking and drawing a random one, don't you think?
EDIT2
Updated fiddle according to Warlock great advises, one prerendered star, scaled to match the current size. The stars are less pretty, but the whole thing runs a lot smoother.
EDIT3
Updated fiddle to use a sprite sheet. Gorgeous!!!!
//generate the star strip
var len=(ttlm/rint)|0;
scanvas = document.createElement('canvas');
scanvas.width=len*2*r;
scanvas.height=2*r;
scon=scanvas.getContext('2d');
for(var i=0;i<len;i++){
var newo = (i/len);
var cr = r*newo;
g = scon.createRadialGradient(2*r*i+r,r,0,2*r*i+r,r,(cr <= 2 ? 2 : cr));
g.addColorStop(0.0, 'rgba(200,220,255,'+newo+')');
g.addColorStop(0.2, 'rgba(200,220,255,'+(newo*.7)+')');
g.addColorStop(0.4, 'rgba(150,170,205,'+(newo*.2)+')');
g.addColorStop(0.7, 'rgba(150,170,205,0)');
scon.fillStyle = g;
scon.fillRect(2*r*i,0,2*r,2*r);
}
EDIT 4(Final)
Dynamic stars creations
function draw() {
frameTime.push(Date.now());
con.clearRect(0,0,WIDTH,HEIGHT);
for(var i = 0, len = pxs.length; i < len; i++) {
pxs[i].fade();
pxs[i].draw();
}
requestAnimationFrame(draw);
if(allowMore === true && frameTime.length == monitoredFrame)
{
if(getAvgTime()<threshold && pxs.length<totalStars )
{
addStars();
}
else
{
allowMore=false;
static=true;
fillpxs(totalStars-pxs.length,pxss);
drawstatic();
static=false;
}
}
}
Here is the updated and final fiddle, with spritesheet, dynamic stars creation and several optimisations. If you see anything else i should update don't hesitate.
POST EDIT Reenabled shooting stars/Prototyped object/got rid of Jquery
http://jsfiddle.net/macintox/K8YTu/32/
Thanks everyone who helped me, that was really kind and instructive, and I hope it will help somebody sometimes.
Aesdotjs.
PS: I'm so happy. After testing, that script run smoothly on every browser even IE9. Yatta!!
Adopting to browser performance
To measure capability of the user's setup you can implement a dynamic star creator which stops at a certain threshold.
For example, in your code you define a minimum number of stars to draw. Then in your main loop you measure the time and if the time spent drawing the stars are less than your max threshold you add 10 more stars (I'm just throwing out a number here).
Not many are aware of that requestAnimationFrame gives an argument (DOMHighResTimeStamp) to the function it calls with time in milliseconds spent since last request. This will help you keep track of load and as we know that 60 fps is about 16.7 ms per frame we can set a threshold a little under this to be optimal and still allow some overhead for other browser stuff.
A code could look like this:
var minCount = 100, /// minimum number of stars
batchCount = 10, /// stars to add each frame
threshold= 14, /// milliseconds for each frame used
allowMore = true; /// keep adding
/// generate initial stars
generateStarts(minCount);
/// timeUsed contains the time in ms since last requestAnimationFrame was called
function loop(timeUsed) {
if (allowMore === true && timeUsed < threshold) {
addMoreStars(batchNumber);
} else {
allowMore = false;
}
/// render stars
requestAnimationFrame(loop);
}
Just note that this is a bit simplified. You will need to run a few rounds first and measure the average to have this work better as you can and will get peak when you add stars (and due to other browser operations).
So add stars, measure a few rounds, if average is below threshold add stars and repeat.
Optimizations
Sprite-sheets
As to optimizations sprite-sheets are the way to go. And they don't have to just be the stars (I'll try to explain below).
The gradient and arc is the costly part of this applications. Even when pre-rendering a single star there is cost in resizing so many stars due to interpolation etc.
When there becomes a lot of costly operations it is better to do a compromise with memory usage and pre-render everything you can.
For example: render the various sizes by first rendering a big star using gradient and arc.
Use that star to draw the other sizes as a strip of stars with the same cell size.
Now, draw only half of the number of stars using the sprite-sheet and draw clipped parts of the sprite-sheet (and not re-sized). Then rotate the canvas 90 degrees and draw the canvas itself on top of itself in a different position (the canvas becoming a big "sprite-sheet" in itself).
Rotating 90 degrees is not so performance hungry as other degrees (0, 90, 180, 270 are optimized). This will give you the illusion of having the actual amount of stars and since it's rotated we are not able to detect a repetitive pattern that easy.
A single drawImage operation of canvas is faster than many small draw operations of all the stars.
(and of course, you can do this many times instead of just once up to a point right before where you start see patterns - there is no key answer to how many, what size etc. so to find the right balance is always an experiment).
Integer numbers
Other optimizations can be using only integer positions and sizes. When you use float numbers sub-pixeling is activated which is costly as the browser need to calculate anti-alias for the offset pixels.
Using integer values can help as sub-pixeling isn't needed (but this doesn't mean the image won't be interpolated if not 1:1 dimension).
Memory bounds
You can also help the underlying low-lowel bitmap handling a tiny bit by using sizes and positions dividable on 4. This has to do with memory copy and low-level clipping. You can always make several sprite-sheet to variate positions within a cell that is dividable on 4.
This trick is more valuable on slower computers (ie. typical consumer spec'ed computers).
Turn off anti-aliasing
Turn off anti-aliasing for images. This will help performance but will give a little more rough result of the stars. To turn off image anti-aliasing do this:
ctx.webkitEnableImageSmoothing = false;
ctx.mozEnableImageSmoothing = false;
ctx.enableImageSmoothing = false;
You will by doing this see a noticeable improvement in performance as long as you use drawImage to render the stars.
Cache everything
Cache everything you can cache, being the star image as well as variables.
When you do this stars.length the browser's parser need to first find stars and then traverse that tree to find length - for each round (this may be optimized in some browsers).
If you first cache this to a variable var len = stars.length the browser only need to traverse the tree and branch once and in the loop it will only need to look up the local scope to find variable len which is faster.
Resolution reduction
You can also reduce resolution in half, ie. do everything at half the target size. In the final step draw your render enlarged to full size. This will save you initially 75% render area but give you a bit low-res look as a result.
From the professional video world we often use low-resolution when things are animated (primarily moving) as the eye/brain patch up (or can't detect) so much details when objects are moving and therefor isn't so noticeable. If this can help here must be tested - perhaps not since the stars aren't actually moving, but worth a try for the second benefit: increased performance.
How about just creating a spritesheet of a star in its various stages of radial glow.
You could even use canvas to initially create the spritesheet.
Then use context.drawImage(spritesheet,spriteX,spriteY,starWidth,starHeight) to display the star.
Spritesheet images can be drawn to the screen very quickly with very little overhead.
You might further optimize by breaking the spritesheet into individual star images.
Good luck on your project :)
1. Minimize operations, related to the DOM;
In the LINE 93 you are creating canvas:
this.scanvas = document.createElement('canvas');
You need only one canvas instead of this. Move canvas creation to the initialization step.
2. Use integer coordinates for canvas;
3. Use Object Pool design pattern to improve performance.
4. In for loops cache the length variable:
for(var i = 0; i < pxs.length; i++) {...
}
Better:
for(var i = 0, len = pxs.length; i < len; i++) {
...
}
Note: don't mix jquery with native js.