I realize this is not strictly related to programming problems, but as SO is the best resource for programming-related problems, I decided to try it out. :)
I have a project where I need to do 3D pathfinding with JavaScript, inside a building. Dijkstra's algorithm is probably the best fit for this, as it handles irregular shapes quite nicely.
However, the problem is this:
Dijkstra requires a node structure to work with. But how do I create that data? Obviously some sort of conversion needs to be done from the base data, but how do I create that base data? Going through the blueprint, getting x & y values for each possible path node and calculating the distances by hand seems a bit excessive... and prone to swearwords...
I was even thinking of using Google SketchUp for this: drawing lines for each possible path, but then the problem is getting the path data out of it. :/
I can't be the first person to have this problem... Any ideas? Are there any ready-made tools for creating path data?
I could not find any ready-made tools, so I ended up creating the path data as lines in Google SketchUp, exporting them as Collada files, and writing my own converter for the Collada XML data.
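In case it helps anyone doing something similar, the conversion can be sketched roughly as below. This is a minimal sketch, not my exact converter: it assumes the exporter writes vertex positions as x/y/z triples in a <float_array> and each line segment as a pair of vertex indices in a <lines>/<p> element; real Collada files vary between exporters (extra inputs and offsets are common), so the selectors will need adjusting.

```javascript
// Sketch: turn Collada line geometry into a weighted adjacency list for Dijkstra.
// Assumes positions live in a <float_array> (x y z triples) and <lines>/<p>
// lists pairs of vertex indices - adjust the selectors for your exporter.
function colladaToGraph(xmlText) {
  const doc = new DOMParser().parseFromString(xmlText, "application/xml");
  const floats = doc.querySelector("float_array")
    .textContent.trim().split(/\s+/).map(Number);

  // Rebuild the vertex list as [x, y, z] triples.
  const vertices = [];
  for (let i = 0; i < floats.length; i += 3) {
    vertices.push([floats[i], floats[i + 1], floats[i + 2]]);
  }

  // Adjacency list: graph[a][b] = Euclidean length of the edge a-b.
  const graph = vertices.map(() => ({}));
  const indices = doc.querySelector("lines p")
    .textContent.trim().split(/\s+/).map(Number);

  for (let i = 0; i < indices.length; i += 2) {
    const a = indices[i], b = indices[i + 1];
    const [ax, ay, az] = vertices[a];
    const [bx, by, bz] = vertices[b];
    const d = Math.hypot(ax - bx, ay - by, az - bz);
    graph[a][b] = d; // undirected edge
    graph[b][a] = d;
  }
  return { vertices, graph };
}
```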
This can all be done in code by constructing a 3D grid and removing cubes that intersect with 3D objects.
I would then layer multiple 3D grids (doubling the cell size each time) to give a more general idea of reachability, each constructed from the smaller grids below it. By sheer virtue of how pathfinding algorithms work, you will always find the most efficient path from A to B, and it will automatically be routed through the largest cells (and therefore the fewest calculation steps). Note: give the larger 3D grids a slightly lower weighting so that their paths are favoured.
This can be used for many applications. For example, if you can only walk on the ground, simply remove blocks in unreachable areas.
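As a rough illustration of the basic grid construction (just a sketch, assuming axis-aligned bounding boxes for the obstacles and a uniform cell size; the hierarchical layering would be built on top of this):

```javascript
// Sketch: voxelize a bounding volume into a 3D grid and mark cells blocked when
// they overlap an obstacle's axis-aligned bounding box (AABB). The unblocked
// cells plus their face neighbours form the node graph for Dijkstra/A*.
function buildGrid(bounds, cellSize, obstacles) { // bounds/obstacles: {min:[x,y,z], max:[x,y,z]}
  const size = [0, 1, 2].map(i => Math.ceil((bounds.max[i] - bounds.min[i]) / cellSize));
  const [nx, ny, nz] = size;
  const blocked = new Uint8Array(nx * ny * nz);
  const idx = (x, y, z) => x + nx * (y + ny * z);

  for (let z = 0; z < nz; z++)
    for (let y = 0; y < ny; y++)
      for (let x = 0; x < nx; x++) {
        const cellMin = [x, y, z].map((c, i) => bounds.min[i] + c * cellSize);
        const cellMax = cellMin.map(v => v + cellSize);
        // AABB-vs-AABB overlap test against every obstacle.
        const hit = obstacles.some(o =>
          o.min.every((m, i) => m <= cellMax[i]) &&
          o.max.every((m, i) => m >= cellMin[i]));
        if (hit) blocked[idx(x, y, z)] = 1;
      }
  return { size, blocked, idx };
}
```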
I am currently looking for an efficient way to visualise a lot of data in JavaScript. The data is geospatial and I have approximately 2 million data points.
Now I know that I cannot hand that many data points to the browser directly, otherwise it would just crash most of the time (or the response time would be very slow anyway).
I was thinking of having a JavaScript front end communicating with a Python script, which would do all the operations on the data and stream JSON back to the JavaScript app.
My idea was to have the JavaScript side send the bounding box of the map in real time (lat and lng of the north-east and south-west points) so that the Python script could go through all the entries and send back JSON for only the viewable objects.
I just wrote a very simple script that can do that, which basically:
Reads the whole CSV and stores the data in a list with lat, lng, and a few other attributes (2 or 3)
Uses a naive implementation to check whether each point is within the bounding box sent by the JavaScript side
Currently, going through all the data points takes approximately 15 seconds, which is way too long, since I also have to transform them into a GeoJSON object before streaming them to my JavaScript application.
Now of course, I could sort my points in ascending order of lat and lng so that the function checking whether a point is within the bounding box sent by the JavaScript side would be an order of magnitude faster. However, the processing time would probably still be too slow.
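For illustration, here is a minimal sketch of another option: bucketing the points into a coarse lat/lng grid so that a bounding-box query only scans the buckets it overlaps instead of all ~2M points. I've written it in JavaScript to match the rest of this post, but the same idea applies directly to the Python side; the cell size is just a tuning knob.

```javascript
// Sketch: coarse lat/lng grid index. A bounding-box query only visits the
// buckets it overlaps rather than every point. CELL is a hypothetical value.
const CELL = 0.5; // degrees per bucket

function buildIndex(points) {                     // points: [{lat, lng, ...}, ...]
  const buckets = new Map();
  for (const p of points) {
    const k = `${Math.floor(p.lat / CELL)}:${Math.floor(p.lng / CELL)}`;
    if (!buckets.has(k)) buckets.set(k, []);
    buckets.get(k).push(p);
  }
  return buckets;
}

function queryBBox(buckets, sw, ne) {             // sw/ne: {lat, lng} corners
  const result = [];
  for (let i = Math.floor(sw.lat / CELL); i <= Math.floor(ne.lat / CELL); i++) {
    for (let j = Math.floor(sw.lng / CELL); j <= Math.floor(ne.lng / CELL); j++) {
      for (const p of buckets.get(`${i}:${j}`) || []) {
        if (p.lat >= sw.lat && p.lat <= ne.lat &&
            p.lng >= sw.lng && p.lng <= ne.lng) {
          result.push(p);
        }
      }
    }
  }
  return result;
}
```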
But even assuming it were fast enough, I would still have the problem that at very low zoom levels I would get too many points. Constraining the min_zoom_level is not really an option for me, so I was thinking that I should probably try to cluster the data points.
My question is therefore: do you think this approach is the right one? If so, how does one compute the clusters? It seems to me that I would have to generate a lot of possible clusters (different zoom levels, different places on the map...), and I am not sure whether that is an efficient and smart way to do it.
I would very much like to have your input on that, with possible adjustments or completely different solutions if you have some.
This is almost language-agnostic, but I will tag it as Python since my server is currently running a Python script, and I believe Python is quite efficient for large datasets.
Final note:
I know that it is possible to pre-compute tiles that I could just feed to my JavaScript visualization, but as I want interactive control over what is being displayed, this is not really an option for me.
Edit:
I know that, for instance, Mapbox provides clustering of data points to facilitate displaying something like a million data points.
However, I think (and this is related to an open question here) that while I can easily display clusters of points, I cannot make a data-driven style for my clusters.
For instance, taking the now-famous example of ethnicity maps: if I use Mapbox to cluster data points and a cluster gathers 50 people, I cannot make the cluster the color of the most represented ethnicity among those 50 people.
Edit 2:
I also learned about supercluster, but I am quite unsure whether this tool could support several million data points without crashing either.
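For reference, a minimal supercluster sketch based on its documented load/getClusters API; the map/reduce options are what would make a data-driven cluster style possible (for example, carrying an ethnicity count per cluster). The property names and the input variable geojsonPoints are placeholders.

```javascript
// Sketch: supercluster with map/reduce so each cluster aggregates custom
// properties (here, hypothetical per-ethnicity counts) that can drive its style.
import Supercluster from 'supercluster';

const index = new Supercluster({
  radius: 60,
  maxZoom: 16,
  map: props => ({ counts: { [props.ethnicity]: 1 } }),  // per-point initial props
  reduce: (accumulated, props) => {                       // merge into the cluster
    for (const [k, v] of Object.entries(props.counts)) {
      accumulated.counts[k] = (accumulated.counts[k] || 0) + v;
    }
  },
});

index.load(geojsonPoints); // array of GeoJSON Point features (placeholder variable)

// For the current viewport: [westLng, southLat, eastLng, northLat] plus the zoom level.
const clusters = index.getClusters([-10, 35, 30, 60], 5);
// clusters[i].properties.counts can now pick the dominant ethnicity's colour.
```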
I'm working with audio, but I'm a newbie in this area. I would like to match sound from the microphone against my source audio (just one sound), like the Coke ads from Shazam. Example video (0:45). However, I want to do it on a website with JavaScript. Thank you.
Building something similar to the backend of Shazam is not an easy task. We need to:
Acquire audio from the user's microphone (easy)
Compare it to the source and identify a match (hmm... how do... )
How can we perform each step?
Acquire Audio
This one is definitely no biggie. We can use the Web Audio API for this. You can google around for good tutorials on how to use it. This link provides some good fundamental knowledge that you may want to understand before using it.
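A minimal sketch of the acquisition step, using only standard getUserMedia and Web Audio calls (the fftSize is just a reasonable starting value):

```javascript
// Sketch: capture microphone audio and expose an AnalyserNode we can read
// FFT frames from later on.
async function acquireMicrophone() {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const ctx = new AudioContext();
  const source = ctx.createMediaStreamSource(stream);
  const analyser = ctx.createAnalyser();
  analyser.fftSize = 2048;   // gives 1024 frequency bins per frame
  source.connect(analyser);
  return { ctx, analyser };
}
```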
Compare Samples to Audio Source File
Clearly this piece is going to be an algorithmic challenge in a project like this. There are probably various ways to approach this part, and not enough time to describe them all here, but one feasible technique (which happens to be what Shazam actually uses), and which is also described in greater detail here, is to create and compare against a sort of fingerprint for smaller pieces of your source material, which you can generate using FFT analysis.
This works as follows:
Look at small sections of the sample, no more than a few seconds long, at a time (note that this is done using a sliding window, not discrete partitioning)
Calculate the Fourier Transform of the audio selection. This decomposes our selection into many signals of different frequencies. We can analyze the frequency domain of our sample to draw useful conclusions about what we are hearing.
Create a fingerprint for the selection by identifying critical values in the FFT, such as peak frequencies or magnitudes
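To make that concrete, here is one possible shape for a fingerprint: just the indices of the strongest FFT bins in a frame. This is my own simplification for illustration, not Shazam's actual scheme (they pair peaks and use time offsets), but it conveys the idea.

```javascript
// Sketch: fingerprint one analysis frame as the indices of its N strongest
// frequency bins, in a canonical order so fingerprints can be compared.
function fingerprintFrame(analyser, topN = 8) {
  const bins = new Uint8Array(analyser.frequencyBinCount);
  analyser.getByteFrequencyData(bins);          // magnitude per bin, 0-255
  return [...bins.keys()]
    .sort((a, b) => bins[b] - bins[a])          // strongest bins first
    .slice(0, topN)
    .sort((a, b) => a - b);                     // sort indices for comparison
}
```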
If you want to be able to match multiple samples like Shazam does, you should maintain a dictionary of fingerprints, but since you only need to match one source material, you can just keep them in a list. Since your keys are going to be arrays of numerical values, another possible data structure for quickly querying your dataset would be a k-d tree. I don't think Shazam uses one, but the more I think about it, the closer their system seems to an n-dimensional nearest-neighbor search, provided you can keep the number of critical points consistent. For now though, keep it simple and use a list.
Now we have a database of fingerprints primed and ready for use. We need to compare them against our microphone input now.
Sample our microphone input in small segments with a sliding window, the same way we did our sources.
For each segment, calculate the fingerprint, and see if it matches close to any from storage. You can look for a partial match here and there are lots of tweaks and optimizations you could try.
This is going to be a noisy and inaccurate signal, so don't expect every segment to get a match. If lots of them match (you will have to figure out what "lots" means experimentally), then assume you have a match. If there are relatively few matches, then figure you don't.
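A sketch of that matching loop, building on the hypothetical fingerprintFrame above; the similarity threshold and the vote count are exactly the kind of numbers you would have to tune experimentally:

```javascript
// Sketch: compare live fingerprints against the stored source fingerprints
// and count votes. All thresholds are placeholders to tune.
function similarity(a, b) {
  const setB = new Set(b);
  return a.filter(bin => setB.has(bin)).length / a.length;  // shared-bin ratio
}

function matchesSource(liveFp, sourceFps, minSimilarity = 0.6) {
  return sourceFps.some(fp => similarity(liveFp, fp) >= minSimilarity);
}

// Run this on a timer while listening; declare a match once enough segments agree.
let votes = 0;
function onFrame(analyser, sourceFps) {
  if (matchesSource(fingerprintFrame(analyser), sourceFps)) votes++;
  if (votes > 20) console.log('Probably hearing the source audio');
}
```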
Conclusions
This is not going to be a super easy project to do well. The amount of tuning and optimization required will prove to be a challenge. Some microphones are inaccurate, and most environments have other sounds, and all of that will mess with your results, but it's also probably not as bad as it sounds. I mean, this is a system that from the outside seems unapproachably complex, and we just broke it down into some relatively simple steps.
As a final note, you mention JavaScript several times in your post, and you may notice that I mentioned it zero times in my answer. That's because the language of implementation is not an important factor here. This system is complex enough that the hardest pieces of the puzzle are the ones you solve on paper, so you don't need to think in terms of "how can I do X in Y"; just figure out an algorithm for X, and the Y should come naturally.
I have somewhere between 2M and 10M static objects which I would like to overlay on Google Maps. I've previously tried HeatmapLayer successfully on much smaller sets. Due to the sheer volume I'm a bit concerned, and I will have to lump the objects together to avoid performance problems. The target platform is Chrome on a standard desktop.
What is the best way to space partition and merge objects in close proximity? Should I try some type of loose quad tree to lump the objects together, and then display each node with its respective weight using the HeatmapLayer? Or should I try to dynamically build some type of triangle mesh where vertices can be dynamically merged and triangles gain weight as more objects are added to them and then display the triangles on top of Google Maps? HeatmapLayer is pretty fast (looks like it's implemented in GL shaders), but I doubt Polygon is.
I've tried searching for open-source loose quadtree JavaScript implementations and other fast space-partitioning JavaScript implementations, but found nothing. Is my best bet to port some C++ implementation? Any answers/comments from someone who has built something similar would be helpful!
I settled on preprocessing my data in the backend using a space-partitioning implementation. I recommend it for anybody who has the luxury of doing so.
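For anyone looking for a starting point, here is a very small aggregation sketch (my own illustration, not the implementation I used): bucket the points into a fixed-depth grid, keep a count and a centroid per cell, and feed those to HeatmapLayer as weighted points. A proper loose quadtree refines the same idea with adaptive cell sizes.

```javascript
// Sketch: aggregate raw points into a fixed-resolution grid and emit one
// weighted point (centroid + count) per occupied cell for HeatmapLayer.
function aggregate(points, depth = 8) {        // points: [{lat, lng}, ...]
  const n = 1 << depth;                        // n x n cells over the whole world
  const cells = new Map();
  for (const p of points) {
    const x = Math.min(n - 1, Math.floor(((p.lng + 180) / 360) * n));
    const y = Math.min(n - 1, Math.floor(((p.lat + 90) / 180) * n));
    const k = `${x}:${y}`;
    const c = cells.get(k) || { lat: 0, lng: 0, weight: 0 };
    c.lat += p.lat; c.lng += p.lng; c.weight += 1;
    cells.set(k, c);
  }
  return [...cells.values()].map(c => ({
    lat: c.lat / c.weight,                     // centroid of the cell's points
    lng: c.lng / c.weight,
    weight: c.weight,
  }));
}
```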
I am making a game with a Node server which uses pathfinding for the enemies. With a 100x100 grid map I did not see any performance slowdowns, but when I raised the size to 1000x1000, there is now a one-second delay on the server each time a path is generated.
Currently I am using PathFinding.js with A* pathfinding. Is there a better pathfinding library or algorithm that will allow the use of a 1000x1000 grid without a delay, or am I out of luck?
Any help is appreciated, thank you.
What do you mean by "delay"? Like, it took longer to process a larger grid when nothing else was happening? Or, the processing "froze" while the path was calculated and then continued on?
Taking longer to process is natural for a larger search space: more cells means more compute power needed. There's no way around that, other than using other CPU cores or some sort of processing service. That might be an answer to your question right there.
Node.js is single-threaded, so all that processing will hold up the other actions that are going on. There might be ways to run chunks of the path processing so that it doesn't noticeably affect other things - I'm unsure how the lib is built. Or chunk the grid into more manageable segments for the pathing algorithm (would four 500x500 grids be almost the same? That kind of thing). Or have two different servers on the same machine - pathing and everything else - and split your requests between them.
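To sketch the "other CPU cores" option: Node's worker_threads module can run the search off the main thread so the game loop keeps ticking. This assumes PathFinding.js's documented PF.Grid / PF.AStarFinder usage; everything else is illustrative.

```javascript
// Sketch: run the A* search in a worker thread so a 1000x1000 grid doesn't
// block the Node event loop. The file spawns itself as its own worker.
const { Worker, isMainThread, parentPort, workerData } = require('worker_threads');

if (isMainThread) {
  // Main thread: spawn this same file as a worker for each path request.
  module.exports.findPathAsync = (start, end, matrix) =>
    new Promise((resolve, reject) => {
      const worker = new Worker(__filename, { workerData: { start, end, matrix } });
      worker.once('message', resolve);         // resolves with the path array
      worker.once('error', reject);
    });
} else {
  // Worker thread: do the heavy search and post the result back.
  const PF = require('pathfinding');
  const { start, end, matrix } = workerData;   // matrix: 0 = walkable, 1 = blocked
  const grid = new PF.Grid(matrix);
  const finder = new PF.AStarFinder();
  parentPort.postMessage(finder.findPath(start.x, start.y, end.x, end.y, grid));
}
```

Spawning a worker per request has its own overhead, so a small pool of long-lived workers would be the next refinement.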
I'm currently writing an application that displays a lot, and I mean a lot, of 2D paths (made of hundreds or thousands of tiny segments) on an HTML5 canvas. Typically, a few million points. These points are downloaded from a server into a binary ArrayBuffer.
I probably won't be using that many points in the real world, but I'm kinda interested in how I could improve the performance. You can call it curiosity if you want ;)
Anyway, I've tested the following solutions:
Using gl.LINES or gl.LINE_STRIP with WebGL, and compute everything in shaders on the GPU. Currently the fastest, can display up to 10M segments without flinching on my Macbook Air. But there are very strict constraints for the binary format if you want to avoid processing things in JavaScript, which is slow.
Using Canvas2D, draw a huge path with all the segments in one stroke() call. When I'm getting past 100k points, the page freezes for a few seconds before the canvas is updated. So, not working here.
Using Canvas2D, but draw each path with its own stroke() call. Despite what others have been saying on the internet, this is much faster than drawing everything in one call, but still a lot slower than WebGL. Things start to get bad when I reach about 500k segments.
The two Canvas2D solutions require looping through all the points of all the paths in JavaScript, so this is quite slow. Do you know of any method(s) that could improve JavaScript's iteration speed in an ArrayBuffer, or processing speed in general?
But what's strange is that the screen isn't updated immediately after all the canvas draw calls have finished. When I start getting close to the performance limit, there is a noticeable delay between the end of the draw calls and the update of the canvas. Do you have any idea where that comes from, and is there a way to reduce it?
First, WebGL was a nice and hyped idea, but the amount of processing required to decode and display the binary data simply doesn't work in shaders, so I ruled it out.
Here are the main bottlenecks I've encountered. Some of them are quite common in general programming, but it's a good idea to remember them:
It's best to use multiple, small for loops
Create variables and closures at the highest level possible, don't create them inside the for loops
Render your data in chunks, and use setTimeout to schedule the next chunk after a few milliseconds: that way, the user will still be able to use the UI (see the sketch after this list)
JavaScript objects and arrays are fast and cheap, use them. It's best to read/write them in sequential order, from the beginning to the end.
If you don't write data sequentially in an array, use objects (because non-sequential read-writes are cheap for objects) and push the indexes into an index array. I used a SortedList implementation to keep the indexes sorted, which I found here. Overhead was minimal (about 10-20% of the rendering time), and in the end it was well worth it.
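To illustrate the chunking point from the list above (a sketch only; the chunk size and the flat x1,y1,x2,y2 layout are assumptions about your data):

```javascript
// Sketch: render a huge Float32Array of segments in chunks, yielding to the
// browser between chunks so the UI stays responsive.
function renderInChunks(ctx, segments, chunkSize = 50000) {  // segments: x1,y1,x2,y2,...
  let offset = 0;
  function renderChunk() {
    const end = Math.min(offset + chunkSize * 4, segments.length);
    ctx.beginPath();
    for (let i = offset; i < end; i += 4) {
      ctx.moveTo(segments[i], segments[i + 1]);
      ctx.lineTo(segments[i + 2], segments[i + 3]);
    }
    ctx.stroke();                              // one stroke() per chunk
    offset = end;
    if (offset < segments.length) setTimeout(renderChunk, 10);  // let the UI breathe
  }
  renderChunk();
}
```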
That's about everything I remember. If I do find something else, I'll update this answer!