Monte Carlo methods in graphics - JavaScript

I was reading through this interesting post about using JavaScript to generate an image of a rose. However, I'm a bit confused, as this article claims that the author used Monte Carlo methods to reduce the code size.
It's my understanding that the author was using Monte Carlo methods to do something like GIF interlacing, so that the image would appear to load more quickly. Have I missed something?

The Monte Carlo (MC) method used by the author has nothing to do with the resulting image file type; it has everything to do with how the image was generated in the first place. Since the point of JS1K is to write compact code, the author defines the rose by mathematical forms that are to be filled in with tiny dots (so they look like a solid image) by a basic renderer.
How do you fill those forms in? One method is to sample the surface uniformly: that is, place a dot at every point of a set interval. As #Jordan quoted, this works if and only if the interval is set correctly. Make it too small and it takes too long; make it too large and the image is patchwork. However, you can bypass the whole problem by sampling the surface randomly. This is where the MC method comes in.
I've seen this confusion over MC before, since it is often thought of as a tool for numerical simulation. While it is widely used as such, the core idea is to randomly sample an interval with a bias that weights each step appropriately (depending on the problem). For example, a physics simulation might use a weight of e^(-E/kT), whereas a numerical integrator might use a weight proportional to the derivative at the sample point. The Wikipedia entry (and the refs. therein) is a good starting place for more detail.
You can think of the complete rose as a function that is fully computed. As the MC algorithm runs, it samples this function as it converges toward the correct answer.
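To make the idea concrete, here is a minimal sketch (not the author's actual code) of Monte Carlo surface filling on a canvas; the surface() function is a made-up, petal-like placeholder:

// Minimal sketch of Monte Carlo surface filling (not the rose from the article).
// surface(u, v) maps two random parameters in [0, 1) to an (x, y) point;
// the petal-like profile below is purely illustrative.
var canvas = document.createElement('canvas');
canvas.width = canvas.height = 400;
document.body.appendChild(canvas);
var ctx = canvas.getContext('2d');

function surface(u, v) {
  var angle = u * Math.PI;                 // sweep half a turn
  var radius = 150 * v * Math.sin(angle);  // made-up petal profile
  return {
    x: 200 + radius * Math.cos(angle),
    y: 200 + radius * Math.sin(angle)
  };
}

// Instead of stepping over (u, v) at a fixed interval, sample it at random.
// Every frame adds more dots, so the image converges toward the filled form.
function renderStep() {
  for (var i = 0; i < 2000; i++) {
    var p = surface(Math.random(), Math.random());
    ctx.fillRect(p.x, p.y, 1, 1);
  }
  requestAnimationFrame(renderStep);
}
renderStep();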

The author writes in the article that he uses Monte Carlo sampling to overcome the limitations of interval based sampling because the latter "requires the setting of a proper interval for each surface. If the interval is big, it render fast but it can end with holes in the surface that has not been filled. On the other hand, if the interval is too little, the time for rendering increments up to prohibitive quantities." I believe that the WebMonkey article's conclusion re: code size is incorrect.

Related

How to convert a wavetable for use with `OscillatorNode.setPeriodicWave`?

I would like to use a custom waveform with a WebAudio OscillatorNode. I'm new to audio synthesis and still struggle quite a lot with the mathematics (I can, at least, program).
The waveforms are defined as functions, so I have the function itself, and can sample the wave. However, the OscillatorNode.createPeriodicWave method requires two arrays (real and imag) that define the waveform in the frequency domain.
The AnalyserNode has FFT methods for computing an array (of bytes or floats) in the frequency domain, but it works with a signal from another node.
I cannot think of a way to feed a wavetable into the AnalyserNode correctly, but if I could, it only returns a single array, while OscillatorNode.createPeriodicWave requires two.
TLDR Starting with a periodic function, how do you compute the corresponding arguments for OscillatorNode.createPeriodicWave?
Since you have a periodic waveform defined by a function, you can compute the Fourier Series for this function. If the series has an infinite number of terms, you'll need to truncate it.
This is a bit of work, but this is exactly how the pre-defined Oscillator types are computed. For example, see the definition of the square wave for the OscillatorNode. The PeriodicWave coefficients for the square wave were computed in exactly this way.
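For a concrete case, the Fourier series of a square wave contains only odd sine harmonics with amplitude 4/(kπ), so the imag array can be filled directly; a minimal sketch (truncating at 32 harmonics, an arbitrary choice):

// Sketch: build a square-wave PeriodicWave from its (truncated) Fourier series.
// Square wave: sum over odd k of (4 / (k * PI)) * sin(k * w * t).
var audioCtx = new AudioContext();
var numHarmonics = 32;                      // arbitrary truncation point
var real = new Float32Array(numHarmonics);  // cosine terms: all zero
var imag = new Float32Array(numHarmonics);  // sine terms

for (var k = 1; k < numHarmonics; k++) {
  imag[k] = (k % 2 === 1) ? 4 / (k * Math.PI) : 0;
}

var wave = audioCtx.createPeriodicWave(real, imag);
var osc = audioCtx.createOscillator();
osc.setPeriodicWave(wave);
osc.frequency.value = 220;
osc.connect(audioCtx.destination);
osc.start();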
If you know the bandwidth of your waveform, you can simplify the work a lot by not having to do the messy integrals. Just sample the waveform uniformly and fast enough, and then use an FFT to get the coefficients you need for the PeriodicWave. Additional details are in the sampling theorem.
Or you can just assume that the sample rate of the AudioContext (typically 44.1 kHz or 48 kHz) is high enough, sample your waveform every 1/44100 or 1/48000 s, and compute the FFT of the resulting samples.
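A minimal sketch of that sampling approach, assuming the waveform is given as a function f(phase) with period 1; the direct DFT below is O(N²), which is slow but fine for building a table once:

// Sketch: turn a periodic function f(phase), phase in [0, 1), into the
// real/imag arrays expected by createPeriodicWave.
// Uses a direct DFT; N and numHarmonics are arbitrary choices.
function functionToPeriodicWave(audioCtx, f, numHarmonics) {
  var N = 2048;                         // samples over one period
  var samples = new Float32Array(N);
  for (var n = 0; n < N; n++) {
    samples[n] = f(n / N);
  }

  var real = new Float32Array(numHarmonics);
  var imag = new Float32Array(numHarmonics);
  for (var k = 1; k < numHarmonics; k++) {
    var a = 0, b = 0;
    for (var n = 0; n < N; n++) {
      a += samples[n] * Math.cos(2 * Math.PI * k * n / N);
      b += samples[n] * Math.sin(2 * Math.PI * k * n / N);
    }
    real[k] = 2 * a / N;                // cosine coefficients
    imag[k] = 2 * b / N;                // sine coefficients
  }
  return audioCtx.createPeriodicWave(real, imag);
}

// Usage with a hypothetical sawtooth-like function of period 1:
// var wave = functionToPeriodicWave(new AudioContext(), function (p) {
//   return 2 * p - 1;
// }, 64);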
I just wrote an implementation of this. To use it, drag and drop the squares to form a waveform and then play the piano that appears afterwards. Watch the video in this tweet for a usage example. The live demo is in alpha, so the code and UI are a little rough. You can check out the source here.
I didn't write any documentation, but I recorded some videos (Video 1) (Video 2) (Video 3) of me coding the project live. They should be pretty self-explanatory. There are a couple of bugs in there that I fixed later; for the working version, please refer to the GitHub link.

Fast implementation of an 8-point 1d DCT for modified JPEG decoder

I've been working on a custom video codec for use on the web. The custom codec will be powered by JavaScript and the HTML5 Canvas element.
There are several reasons for me wanting to do this that I will list at the bottom of this question, but first I want to explain what I have done so far and why I am looking for a fast DCT transform.
The main idea behind all video compression is that adjacent frames share a large amount of similarity. So what I'm doing is: I send the first frame compressed as a JPEG. Then I send another JPEG image that is 8 times as wide as the first frame, holding the "differences" between the first frame and the next 8 frames after it.
This large JPEG image holding the "differences" is much easier to compress because it contains only the differences.
I've done many experiments with this large jpeg and I found out that when converted to a YCbCr color space the "chroma" channels are almost completely flat, with a few stand out exceptions. In other words there are few parts of the video that change much in the chroma channels, but some of the parts that do change are quite significant.
With this knowledge I looked up how JPEG compression works and saw that, among other things, it uses the DCT to compress each 8x8 block. This really interested me, and I thought: what if I could modify this so that it not only compresses each 8x8 block, but also checks whether the "next" 8x8 block is similar to the first one? If it is close enough, just send the first block and use the same data for both blocks.
This would increase decoding speed and improve the bitrate, because there would be less data to work with.
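One simple way to express that "close enough" test is a sum-of-absolute-differences threshold between two 8x8 blocks; a rough sketch, where the threshold is a hypothetical tuning parameter (how tight it needs to be depends heavily on the quantization applied later):

// Sketch: decide whether two 8x8 blocks (each a flat array of 64 values)
// are "close enough" to reuse the first block's data for both.
// The threshold is a made-up tuning parameter, not a standard value.
function blocksAreSimilar(blockA, blockB, threshold) {
  var sad = 0; // sum of absolute differences
  for (var i = 0; i < 64; i++) {
    sad += Math.abs(blockA[i] - blockB[i]);
  }
  return sad <= threshold;
}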
I thought that this should be a simple task to accomplish, so I tried to build my own "modified" JPEG encoder/decoder. I built the RGB to YCbCr converter, I left "gzip" compression to do the Huffman encoding, and now the only main part I have left is the DCT transforms.
However, this has me stuck. I cannot find a fast 8-point 1D DCT transform. I am looking for this specific transform because, according to many articles I've read, the 2D 8x8 DCT can be separated into several 1x8 1D transforms. This is the approach many JPEG implementations use because it's faster to process.
So I figured that, with JPEG being such an old, well-known standard, a fast 8-point 1D DCT should just jump out at me, but after weeks of searching I have yet to find one.
I have found many algorithms that use the O(N^2) approach. However, that's bewilderingly slow. I have also found algorithms that use the Fast Fourier Transform, and I've modified them to compute the DCT, such as the one at this link:
https://www.nayuki.io/page/free-small-fft-in-multiple-languages
In theory this should have the "fast" complexity of O(N log2 N), but when I run it, it takes my i7 computer about 12 seconds to encode/decode the "modified" JPEG.
I don't understand why it's so slow. There are other JavaScript JPEG decoders that can do it much faster, but when I look through their source code, I can't pull out which part is doing the DCT/IDCT transforms.
https://github.com/notmasteryet/jpgjs
The only thing I can think of is that maybe the math behind the DCT has already been precomputed and is being stored in a lookup table or something. However, I have looked hard on Google and I can't find anything (that I understand, at least) that talks about this.
So my question is: where can I find, or how can I build, a fast way to compute an 8-point 1D DCT transform for this "modified" JPEG encoder/decoder? Any help with this would be greatly appreciated.
Okay, as for why I want to do this: the main reason is that I want to have "interactive" video for mobile phones on my website. This cannot be done because of things like iOS loading up its "native" QuickTime player every time it starts playing a video. Also, it's hard to make the transition to another point in time of the video seem "smooth" when you have so little control over how videos are rendered, especially on mobile devices.
Thank you again very much for any help that anyone can provide!
So my question is: where can I find, or how can I build, a fast way to compute an 8-point 1D DCT transform for this "modified" JPEG encoder/decoder? Any help with this would be greatly appreciated.
Take a look into the Flash world and the JPEG encoder there (before it was integrated into the engine).
Here, for example: http://www.bytearray.org/?p=1089 (sources provided). This code contains a function called fDCTQuant() that does the DCT, first for the rows, then for the columns, and then quantizes the block (so basically there you have your 8x1 DCT).
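If you just want something working before tracking down a factored (AAN- or Loeffler-style) version, a direct 8-point DCT-II with a precomputed cosine table is only 64 multiply-adds per call and is easy to apply first to rows and then to columns; a sketch, not taken from any particular decoder:

// Sketch: direct 8-point 1D DCT-II using a precomputed cosine table.
// Not as fast as factored algorithms (AAN, Loeffler), but simple, and it
// avoids recomputing cosines on every call.
var DCT_TABLE = (function () {
  var t = new Float64Array(64);
  for (var k = 0; k < 8; k++) {
    var scale = (k === 0) ? Math.sqrt(1 / 8) : Math.sqrt(2 / 8);
    for (var n = 0; n < 8; n++) {
      t[k * 8 + n] = scale * Math.cos(((2 * n + 1) * k * Math.PI) / 16);
    }
  }
  return t;
})();

// input and output are arrays of 8 numbers.
function dct8(input, output) {
  for (var k = 0; k < 8; k++) {
    var sum = 0;
    for (var n = 0; n < 8; n++) {
      sum += DCT_TABLE[k * 8 + n] * input[n];
    }
    output[k] = sum;
  }
}

// For the 2D case, run dct8 over each row of the 8x8 block,
// then over each column of the intermediate result.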
So what I'm doing is I send the first frame compressed as a jpg. Then I send another Jpeg image ...
Take a look at progressive JPEG. I think some of how it works, and how the data stream is built, will sound familiar given this description (not the same, but they go in related directions, IMO).
what if I could modify this so that it not only compresses "each" 8x8 block, but it also checks to see if the "next" 8x8 block is similar to the first one. If it is close enough then just send the first block and use the same data for both blocks.
The expressions "similar" and "close enough" got my attention here. Take a look at the commonly used quantization tables. A change of a coefficient by 1 can easily result in a brightness change of 15% (for the chroma channels usually even more) at that point, depending on its position in the 8x8 block and therefore the quantizer applied to it.
A calculation with a quantizer of 40 (a value that may appear even at the lowest compression settings; at higher compression some quantizers go up to 100 and beyond):
Changing the input by 1 changes the output by 40. Since we are working on a 1-byte value range, that is a change of 40/255, which is about 15% of the total possible range.
So you should be really thoughtful about what you call "close enough".
To sum this up: a video codec based on JPEG that uses the differences between frames to reduce the amount of data. That also sounded kind of familiar to me.
Got it: MPEG https://github.com/phoboslab/jsmpeg
*No connection to the referenced code or its author.
I implemented separable integer 2D DCTs of various sizes (as well as other transforms) here: https://github.com/flanglet/kanzi/tree/master/java/src/kanzi/transform. The code is in Java but really for this kind of algorithm, it is pretty much the same in any language. The most interesting part IMO is the rescaling you do after computing each direction. Depending on your goals (max precision, 16 bit computation, no scaling, ...), you may want to change the scaling factors for each step. Using bigger blocks in areas where the image is very uniform saves bits.
This book shows how the DCT matrix can be factored to Gaussian Normal Form. That would be the fastest way to do a DCT.
http://www.amazon.com/Compressed-Image-File-Formats-JPEG/dp/0201604434/ref=pd_sim_14_1?ie=UTF8&dpID=41XJBED6RCL&dpSrc=sims&preST=_AC_UL160_SR127%2C160_&refRID=1Q0G2H5EFYCQW2TJCCJN

Word Cloud for Other Languages

I am using Jason Davies's word cloud for my project, but there is a problem: I am using Persian (Farsi) strings, and the words overlap in the SVG.
This is my project's output:
What happened to the Farsi words?
As explained on the About page for the project, the generator needs to retrieve the shape of a glyph to be able to compute where it is "safe" to put other words. The About page explains the process in much more detail, but here's what we care about:
Glyphs are rendered individually to a hidden <canvas> element.
Pixel data is retrieved.
Bounding boxes are derived.
The word cloud is generated.
Now, the critical insight is that in Western (and many other) scripts, glyphs don't change shape based on context often. Yes, there are such things as ligatures, but they are generally rare, and definitely not necessary for the script.
In Persian, however, the glyph shape will change based on context. For non-Persian readers, look at ی and س which, when combined, become یس. Yes, that last one is two glyphs!
The algorithm actually has no problem dealing with Persian characters, as you can see by hacking the demo on the about page, putting a breakpoint just after the d.code is generated, to be able to modify it:
Replacing it with 1740, which is the charCode for the first Persian glyph above, and letting the algorithm run, shows beautiful and perfectly correct bounding boxes around the glyph:
The issue is that when the word cloud is actually rendered, the glyph is placed in context and... changes shape. The generator doesn't know this, though, and continues to use the old bounding data to place other words, thus creating the overlapping you witnessed. In addition, there is probably also an issue around right-to-left handling of text, which certainly would not help.
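You can even see the shape change without the word cloud: a quick check in a browser console, measuring the two isolated glyphs against the combined (shaped) string with canvas text metrics; the exact numbers depend on the font, so treat this as a hedged illustration:

// Sketch: compare the width of two Persian glyphs measured in isolation
// with the width of the same two glyphs measured as one shaped string.
// The shaped width is typically not the sum of the isolated widths,
// which is exactly what invalidates the per-glyph bounding boxes.
var ctx = document.createElement('canvas').getContext('2d');
ctx.font = '32px sans-serif';

var isolated = ctx.measureText('\u06CC').width + ctx.measureText('\u0633').width;
var shaped = ctx.measureText('\u06CC\u0633').width;
console.log('isolated widths:', isolated, 'shaped width:', shaped);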
I would encourage you to take this up with the author of the generator directly. The project has a GitHub page: https://github.com/jasondavies/d3-cloud so opening an issue there (and maybe referring back to this answer) would help!

Moving object performance - ThreeJs

Context :
I'm working on a pretty simple THREE.JS project, and it is, I believe, optimized in a pretty good way.
I'm using a WebGLRenderer to display lots of Bode plots extracted from an audio signal every 50 ms. This is pretty cool but, obviously, the more Bodes I display, the laggier it gets. In addition, Bodes move at constant speed, leaving new ones some space to be displayed.
I'm now at a point where I have implemented every "basic" optimization I found on the Internet, and I manage to get a constant 30 fps with about 10,000,000 lines displayed, on a fairly weak computer (NVIDIA GT 210 and Core i3 2100...).
Note also that I'm not using any lights or reflections... only basic lines =)
As it is a project for work, I'm not allowed to show screenshots or code, sorry...
Current implementation :
I'm using an array to store all my Bodes, which are each displayed via a THREE.Line.
FYI, 2000 THREE.Line objects are currently used.
When a Bode has been displayed and moved for 40 s, it is deleted and its THREE.Line is reused for another one. Note that to move these, I'm modifying the THREE.Line.position property.
Note also that I already disabled matrix autoUpdate on my scene and objects, as I'm doing it manually. (Thanks for pointing that out, Volune.)
My Question :
Does modifying THREE.Line.position induce heavy calculations that the renderer has already done? Or is three.js aware that my object did not change, and does it avoid these?
In other words, I'd like to know whether rendering/updating the same object that was just translated is heavier in the rendering process than just leaving it alone, without updating its matrix, etc.
Is there any sort of low-level optimization in three.js for rendering the same objects many times? Is this optimization cancelled when I move my object?
If so, I have another way to do this in mind: using only two big Meshes that follow each other, but this implies merging/deleting parts of their geometries each frame... Might that be better?
Thanks in advance.
I found in the sources (here and here) that the meshes' matrices are updated each frame, whether the position changed or not.
This means that the position modification does not induce heavy calculation by itself. It also means that a lot of matrices are updated and a lot of uniforms are sent to the graphics card each frame.
I would suggest trying your idea with one or two big meshes. This should reduce the JavaScript computations internal to THREE.js, and the only big communications with the graphics card will be for the big buffers.
Also note that there is a WebGL function, bufferSubData (MSDN documentation), to update parts of a buffer, but it does not seem to be usable in THREE.js yet.
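For reference, the raw WebGL call looks roughly like this; a minimal sketch of updating only the beginning of an existing vertex buffer (the data and sizes are placeholders, and gl is assumed to be a WebGLRenderingContext obtained elsewhere):

// Sketch: update part of an existing WebGL buffer with bufferSubData.
var positions = new Float32Array(3 * 10000);   // placeholder vertex data
var buffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
gl.bufferData(gl.ARRAY_BUFFER, positions, gl.DYNAMIC_DRAW);

// Later, when only the first 300 floats have changed:
var changed = positions.subarray(0, 300);
gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
gl.bufferSubData(gl.ARRAY_BUFFER, 0, changed); // byte offset 0, new data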

asynchronous / variable framerate in javascript game

This may be a stupid/previously answered question, but it is something that has been stumping me and my friends for a little while, and I have been unable to find a good answer.
Right now, I make all my JS Canvas games run in ticks. For example:
function tick() {
    // calculate character position
    // clear canvas
    // draw sprites to canvas
    if (gameOn == true)
        t = setTimeout(tick(), timeout);
}
This works fine for CPU-cheap games on high-end systems, but when I try to draw a little more every tick, it starts to run in slow motion. So my question is: how can I keep the x, y position and hit-detection calculations going at full speed while allowing a variable framerate?
Side Note: I have tried to use the requestAnimationFrame API, but to be honest it was a little confusing (not all that many good tutorials on it) and, while it might speed up your processing, it doesn't entirely fix the problem.
Thanks guys -- any help is appreciated.
RequestAnimationFrame makes a big difference. It's probably the solution to your problem. There are two more things you could do. First, set up a second tick system which handles the model side of it, e.g. hit detection; a good example of this is how PhysiJS does it. PhysiJS goes one step further and uses a feature of some newer browsers called a web worker, which allows you to utilise a second CPU core. John Resig has a good tutorial. But be warned: it's complicated, not very well supported, and hence buggy; it tends to crash a lot.
Really, requestAnimationFrame is very simple: it's just a couple of lines which, once you've set them up, you can forget about. It shouldn't change any of your existing code. It is a bit of a challenge to understand what the code does, but you can pretty much cut-and-replace your setTimeout code with the examples out there. If you ask me, setTimeout is just as complicated! They do virtually the same thing, except setTimeout has a delay time, whereas requestAnimationFrame doesn't - it just calls your function when it's ready, rather than after a set period of time.
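Here is one common way to combine the two ideas: game state (positions, hit detection) advances in fixed steps, while drawing happens whenever the browser is ready. A rough sketch, where update() and draw() are placeholders for your own logic and gameOn is the flag from the question:

// Sketch: fixed-timestep updates decoupled from rendering.
// update(STEP) advances game logic by exactly STEP ms; draw() only renders.
var STEP = 1000 / 60;        // logic runs at a fixed 60 updates per second
var accumulator = 0;
var last = performance.now();

function frame(now) {
  accumulator += now - last;
  last = now;

  // Catch up on as many fixed steps as real time allows, so hit
  // detection stays consistent even if rendering lags behind.
  while (accumulator >= STEP) {
    update(STEP);
    accumulator -= STEP;
  }

  draw();
  if (gameOn) {
    requestAnimationFrame(frame);
  }
}
requestAnimationFrame(frame);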
You're not actually using the ticks. What's happening is that you are repeatedly calling tick() over and over again. You need to remove the () and just leave setTimeout(tick, timeout); Personally, I like to use arguments.callee to explicitly state that a function calls itself (and thereby remove the dependency of knowing the function name).
With that being said, what I like to do when implementing a variable frame rate is to simplify the underlying engine as much as possible. For instance, to make a ball bounce against a wall, I check if the line from the ball's previous position to the next one hits the wall and, if so, when.
That being said, you need to be careful, because some browsers halt all JavaScript execution when a context menu (or any other menu) is opened, so you could end up with a gap of several seconds or even minutes between two "frames". Personally, I think frame-based timing is the way to go in most cases.
As Kolink mentioned, the setTimeout looks like a bug. Assuming it's only a typo and not actually a bug, I'd say that it is unlikely that it's the animation itself (that is to say, DOM updates) that's really slowing down your code.
How much is a little more? I've animated hundreds of elements on screen at once with good results on IE7 in VMWare on a 1.2GHz Atom netbook (slowest browser I have on the slowest machine I have, the VMWare is because I use Linux).
In my experience, hit detection, if not done properly, causes the most slowdown as the number of elements you're animating increases. That's because a naive implementation is essentially quadratic (it will try to do on the order of n^2 compares). The way around this is to filter the comparisons to avoid unnecessary ones.
One of the most common ways of doing this in game engines (regardless of language) is to segment your world map into a grid of cells. Then you only do hit detection between items in the same cell (and adjacent cells if you want to be more accurate). This greatly reduces the number of comparisons you need to make, especially if you have lots of characters.
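A rough sketch of that grid idea; the cell size is a made-up tuning parameter and objects are assumed to have x and y properties:

// Sketch: broad-phase hit detection with a uniform grid.
var CELL = 64; // cell size in pixels, a tuning parameter

function buildGrid(objects) {
  var grid = {};
  objects.forEach(function (obj) {
    var key = Math.floor(obj.x / CELL) + ',' + Math.floor(obj.y / CELL);
    (grid[key] = grid[key] || []).push(obj);
  });
  return grid;
}

// Only compare each object against others in its own cell
// (extend to the 8 neighbouring cells for accuracy near cell borders).
function candidatePairs(grid) {
  var pairs = [];
  Object.keys(grid).forEach(function (key) {
    var bucket = grid[key];
    for (var i = 0; i < bucket.length; i++) {
      for (var j = i + 1; j < bucket.length; j++) {
        pairs.push([bucket[i], bucket[j]]);
      }
    }
  });
  return pairs;
}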
