I was wondering how to detect camera motion in a YouTube video.
I want to read in a YouTube link, process the video, and tell the user whether it was filmed using a tripod or whether it was super shaky.
Does anyone know where I would even start? Is it even possible?
Just spitballing here, but I'd start by capturing frames that are close together at various points throughout the video.
You would take the frames from each section and compare them to each other for variations in composition. I'm not sure of the best way to go about that; I'd probably start with something simple like sampling colours at various spots. Either way, start building a "difference score".
Once you've gone through the frames for each section you sampled, you'll have a "difference score" per section, and you can then start trying to figure out what the cut-off point is for calling a video shaky.
You probably couldn't do this anywhere close to real time, so be prepared for a bit of a wait while the video processes.
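To make that concrete, here is a minimal sketch of a "difference score", assuming the two sampled frames have already been drawn onto two same-sized canvases (the element ids and the threshold are made up for illustration):

// Mean absolute luma difference between two same-sized canvases.
function differenceScore(canvasA, canvasB) {
  var a = canvasA.getContext('2d').getImageData(0, 0, canvasA.width, canvasA.height).data;
  var b = canvasB.getContext('2d').getImageData(0, 0, canvasB.width, canvasB.height).data;
  var total = 0;
  for (var i = 0; i < a.length; i += 4) {
    // Compare luma rather than raw RGB to be a little more robust to noise.
    var lumaA = 0.299 * a[i] + 0.587 * a[i + 1] + 0.114 * a[i + 2];
    var lumaB = 0.299 * b[i] + 0.587 * b[i + 1] + 0.114 * b[i + 2];
    total += Math.abs(lumaA - lumaB);
  }
  return total / (a.length / 4); // average difference per pixel, 0..255
}

// Hypothetical usage with two frames sampled close together from one section:
var score = differenceScore(document.getElementById('frameA'),
                            document.getElementById('frameB'));
if (score > 20) { // the cut-off would have to be tuned empirically
  console.log('this section looks shaky');
}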
Take some frames every second or so (avoid sampling at a perfectly fixed frequency, because the footage itself may have a periodic motion - e.g. if it was filmed on a boat during high waves).
Then you could convert the frames to black and white (not grey levels) and compare them, for example using the position of the less common colour (but that is not going to work very well).
What is usually used is edge detection: http://fr.mathworks.com/discovery/edge-detection.html - you compare some of the edges to see which parts of the scene move together and which do not. You must find "interest points" and calculate the vector between their positions in two frames. Some of these vectors will move together: that is an object. Then you have to work out which of the objects is the scene itself (the background); its motion between frames is the camera motion.
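In the browser you won't have a ready-made interest-point detector, but a crude stand-in for the "vector between two frames" idea is simple block matching over greyscale pixel buffers. A sketch, with all names and window sizes made up for illustration:

// Estimate the motion vector of one block of pixels between two frames.
// frameA and frameB are greyscale buffers (one byte per pixel) of size width*height;
// (bx, by) is the top-left corner of the block in frameA.
function blockMotion(frameA, frameB, width, height, bx, by, blockSize, searchRadius) {
  var best = { dx: 0, dy: 0, cost: Infinity };
  for (var dy = -searchRadius; dy <= searchRadius; dy++) {
    for (var dx = -searchRadius; dx <= searchRadius; dx++) {
      var cost = 0;
      for (var y = 0; y < blockSize && cost < Infinity; y++) {
        for (var x = 0; x < blockSize; x++) {
          var ax = bx + x, ay = by + y;   // pixel in frame A
          var tx = ax + dx, ty = ay + dy; // candidate pixel in frame B
          if (tx < 0 || ty < 0 || tx >= width || ty >= height) { cost = Infinity; break; }
          cost += Math.abs(frameA[ay * width + ax] - frameB[ty * width + tx]);
        }
      }
      if (cost < best.cost) best = { dx: dx, dy: dy, cost: cost };
    }
  }
  return best; // blocks that report the same (dx, dy) are probably moving together
}

If most blocks report roughly the same (dx, dy), that shared vector is essentially the camera motion between the two frames, and how much it jumps around from sample to sample is a reasonable shakiness measure.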
I have an animated character (half pig half woman) that plays an integral role in my client's brand. She performs several movements/actions such as walking/running/climbing in place, dancing in several different ways and gesturing with facial and body movements.
On the website, she will be displayed floating near the user's scroll position and will perform different actions (i.e. play a specified segment of this motion/action) based on what the user is doing at the time. For example, while scrolling, the page may play the climbing loop; when focus is given to a lead-capture form, the character starts dancing; and when the user is typing in the 'email' field of said form, we jump to the super-fun part of the dance. I'm not sure if it will be exactly those, but something along those lines.
With the exception of the dance, which will be around 60 seconds in its non-looped state, everything else will only be a couple of seconds max. So I'm trying to figure out the most efficient way to rig this - by that I mean the best way to use JavaScript to control the character's actions based on the user's actions.
I'm considering using animated GIFs for everything except the dancing and just switching out their src when appropriate (i.e. $pigImg.src='pig-smile.gif' when she is clicked on and $pigImg.src='pig-climb.gif' for scrolling...) and then hiding the GIF (or displaying something blank) and streaming the video at its appropriate timestamp when it's time for her to dance.
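Roughly what I'm picturing for the src swapping (the element id, the form selector and the dance-video handling here are just placeholders):

// Hypothetical element id; the GIF names are the ones mentioned above.
var pigImg = document.getElementById('pig');

window.addEventListener('scroll', function () {
  pigImg.src = 'pig-climb.gif';
});

pigImg.addEventListener('click', function () {
  pigImg.src = 'pig-smile.gif';
});

document.querySelector('#lead-form [name="email"]').addEventListener('focus', function () {
  // At this point I would hide the GIF and start the dance video
  // at the appropriate timestamp instead of swapping in another GIF.
  pigImg.style.display = 'none';
  // danceVideo.currentTime = 42; danceVideo.play();  // hypothetical <video> element
});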
I think I'll do all this within a <canvas> element to maximize the flexibility-to-simplicity ratio, if for no other reason than to be able to use clips with a green-screen background to do things I haven't even planned yet.
I know this is a pretty broad intro, so I'll try to focus this post on the question of whether there are any potential obstacles I need to consider with my <canvas>-plus-mix-of-video-and-GIFs approach. I know that switching between the two media types may cause some mis-registration issues (i.e. the character not lining up 100% on the nose).
I'm sure this will end up being FAR more complex in reality than it is in my head right now, so I guess making sure I'm not dreaming up something filled with numerous technical holes of which I am unaware is a good starting point.
I am using node.js & node-ar-drone to program my AR.Drone 2.0 to perform some basic flight maneuvers indoors. From what I can tell, the drone seems to never fly straight. It will always sway to the left and right, hover for a few seconds, or crash into a wall regardless of where I set the takeoff point from. In other words, if I run the same program to fly down a hallway 10 times, each time it will do something different.
If it does make it down the hallway, it lands somewhere different each time. I have built in counter-moves to adjust for the random swaying - for example, if it sways to the right, I tell it to shift to the left - but it never seems to be enough. No amount of counter-moves seems to get it to fly straight. I am using the latest firmware on the drone.
I was told that there is nothing on board the drone that corrects errors during flight, such as a feedback loop. I was also told that these drones were primarily made for use outdoors or in very wide open spaces, so that they won't crash.
I wanted to see if this holds true for anyone else, or if anyone has any suggestions to get it to fly straight. Any input or comments would be helpful.
The AR.Drone does use feedback from its combination of sensors to improve its flight, as shown in the control diagram from "The Navigation and Control technology inside the AR.Drone micro UAV".
For your situation, probably the most important thing is how well attitude and speed estimation is working, which uses the accelerometers, gyrometers and cameras. There are a few things you can do to help those systems work:
1. Make sure you take off from a completely level surface.
2. Call ftrim to set the flat trim level before taking off.
3. The vision algorithms are designed to try to do a good job even if the surface under the downward-facing camera doesn't have much texture, but they can still get confused if the floor/ground is too featureless. Try flying over something with more texture and contrast.
For #3, flying over something like a uniformly colored carpet or a concrete floor can make it harder for the drone to see what it's doing - very similar to the problem of using an optical mouse on a smooth, featureless surface. When you see Parrot showing off the AR.Drone's abilities, you'll notice they often fly over a surface that is obviously chosen to make navigation easier, e.g. in https://www.youtube.com/watch?v=IcxBf-kegKo and https://www.youtube.com/watch?v=pEMD6P_j5uQ#t=8m25s.
That said, with my drone I've sometimes experienced situations where immediately on takeoff, the drone veers off to the side until it crashes even though I called ftrim and thought I took off from a flat surface. You may need to use trial and error to find a good takeoff point.
The drone is designed to be able to fly indoors (e.g. the styrofoam hull with the propeller protectors is recommended for indoor flight but not recommended for outdoor flight, and the FreeFlight app has indoor & outdoor flight modes), but in my experience the drone still wanders a bit and so you'll have the best results in a larger room.
Here's a demo where my drone flies in a very stable manner indoors, in a large room, with well textured carpet, from a very flat location: https://www.youtube.com/watch?v=uhBa11gdbeU
Even then you can see the drone make a small, quick correction at 0:23.
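For what it's worth, here is roughly how a short scripted test flight could look with node-ar-drone, applying the flat-trim advice above. Treat it as a sketch: whether flat trim is exposed as client.ftrim() depends on the library version you have (the underlying command is AT*FTRIM), so that call is an assumption.

var arDrone = require('ar-drone');
var client = arDrone.createClient();

// Assumption: your node-ar-drone version exposes flat trim on the client.
// If it doesn't, you'd have to send the AT*FTRIM command yourself.
client.ftrim();

client.takeoff();

client
  .after(4000, function () {
    this.front(0.1);   // drift slowly forward down the hallway
  })
  .after(5000, function () {
    this.stop();       // hover
  })
  .after(2000, function () {
    this.land();
  });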
I am very happy that I got the opportunity to work on a website that is gesture-based.
I have a few inspirations for this: link
I visited a lot of websites and googled it; Wikipedia and GitHub also didn't help much. There is not much information available, as these technologies are at a nascent stage.
I think I will have to use some JS for this project:
gesture.js (our custom JavaScript code)
reveal.js (framework for the slideshow)
My questions are: how do gestures generate events, and how does my JavaScript interact with my webcam? Do I have to use some API or particular algorithms?
I am not asking for code, just for the mechanism; some links providing the vital info will do. I seriously believe that if the accuracy of this technology can be improved, it can do wonders in the near future.
To enable gestural interactions in a web app, you can use navigator.getUserMedia() to get video from your local webcam, periodically put video frame data into a canvas element and then analyse changes between frames.
There are several JavaScript gesture libraries and demos available (including a nice slide controller). For face/head tracking you can use libraries like headtrackr.js: example at simpl.info/headtrackr.
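A minimal sketch of that capture loop (shown here with the newer navigator.mediaDevices.getUserMedia; analyseChanges is a placeholder for whatever frame comparison you implement):

var video = document.createElement('video');
var canvas = document.createElement('canvas');
var ctx = canvas.getContext('2d');
var previousFrame = null;

navigator.mediaDevices.getUserMedia({ video: true }).then(function (stream) {
  video.srcObject = stream;
  video.play();

  setInterval(function () {
    if (!video.videoWidth) return; // stream not ready yet
    canvas.width = video.videoWidth;
    canvas.height = video.videoHeight;
    ctx.drawImage(video, 0, 0);
    var frame = ctx.getImageData(0, 0, canvas.width, canvas.height);

    if (previousFrame) {
      analyseChanges(previousFrame, frame); // your comparison logic goes here
    }
    previousFrame = frame;
  }, 100); // ~10 frames per second is plenty for coarse gestures
});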
I'm playing around a little bit with this at the moment, so here is what I've understood.
The most basic technique is:
You request access to the user's webcam to capture video.
When permission is given, create a canvas into which to put the video.
You apply a filter (black and white) to the video.
You put some control points in the canvas frame (small areas in which all the pixel colours are registered).
You start attaching a function to run on each frame (for the purposes of this explanation, I'll only demonstrate left/right gestures).
At each frame:
If the frame is the first one (F0), continue.
If not, subtract the current frame's pixels (Fn) from the previous frame's (F(n-1)).
If there was no movement between F(n-1) and Fn, all the pixels will be black.
If there was, you will see the difference Delta = Fn - F(n-1) as white pixels.
Then you can test your control points to see which areas are lit up, and store them:
( ** )x = DeltaN
Repeat the same operations until you have two or more Delta values, then subtract the control points of Delta(N-1) from those of DeltaN and you'll have a vector:
( **)x = DeltaN
( ** )x = Delta(N-1)
( +2 )x = DeltaN - Delta(N-1)
You can now test whether the vector is positive or negative, or whether its values exceed some threshold of your choosing:
if positive on x and value > 5
then trigger an event, and listen to it:
$(document).trigger('MyPlugin/MoveLeft', values)
$(document).on('MyPlugin/MoveLeft', doSomething)
You can greatly improve the precision by caching the vectors, or by accumulating them and only triggering an event when their values become significant.
You can also look for a shape in your first subtractions and try to map a "hand" or a "box",
and listen to changes in the shape's coordinates - but remember that the gestures are in 3D while the analysis is 2D, so the same shape can change as it moves.
Here's a more precise explanation. Hope my explanation helped.
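Here is a rough sketch of those steps over two captured ImageData frames; the control-point layout, the 5000 threshold and the event name are all just illustrative, and the jQuery trigger matches the snippet above:

// Control points: small areas whose activity (pixel change) we track each frame.
var controlPoints = [
  { x: 60,  y: 100, size: 40 },   // left area of the frame
  { x: 220, y: 100, size: 40 }    // right area of the frame
];
var previousActive = null;        // index of the most active point last frame

// Sum of per-pixel differences inside one control point between two ImageData frames.
function activityAt(p, prev, curr, width) {
  var total = 0;
  for (var y = p.y; y < p.y + p.size; y++) {
    for (var x = p.x; x < p.x + p.size; x++) {
      var i = (y * width + x) * 4;
      total += Math.abs(curr.data[i] - prev.data[i]); // red channel is enough for B/W frames
    }
  }
  return total;
}

function onFrame(prev, curr, width) {
  var activity = controlPoints.map(function (p) { return activityAt(p, prev, curr, width); });
  var active = activity[0] > activity[1] ? 0 : 1;

  // If the motion was over the right point last frame and the left point now,
  // treat it as a right-to-left gesture and fire an event.
  if (previousActive === 1 && active === 0 && activity[active] > 5000) {
    $(document).trigger('MyPlugin/MoveLeft', [activity]);
  }
  previousActive = active;
}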
How can I select a range of frames across different layers on the timeline and convert them into a symbol (MovieClip), such that once converted the arrangement of layers and frames remains intact - much the same as After Effects' 'pre-compose' for layers?
The default behaviour is to put all of the separate frames on the same layer once converted, which is extremely annoying.
Could this be possible with a custom flash command? (jsfl)
Don't know about a "custom flash command", but you can use the Flash Player virtual machine API (read: ActionScript) to sort of achieve the desired effect. What you need to do is loop through each frame with gotoAndStop. Then you need to ask yourself: do your individual frames exhibit animation, or are they static? If they are each animated as well, then you need to either ignore the animation and just take a snapshot at a "random" time, or traverse that animation too - call gotoAndStop on the sub-MovieClip as well. Let's assume your frames do not animate themselves, as that makes the whole method easier.
You simply use BitmapData.draw on each frame's content, copying its visual pixel data and thus caching a frame of animation. You store your BitmapData objects in an indexed array and create a timer which displays each such bitmap in succession. Essentially, you are caching your timeline and reproducing the animation with your own "engine".
Alternatively, you can experiment with the DisplayObject.cacheAsBitmap property, setting it to true for whatever is displayed in your frames. Mind you, that would probably NOT be a smart thing to do if your individual frames exhibit animation, but try it out nevertheless - Flash Player could be smart enough to ignore your setting, as caching a snapshot of a rapid animation as a bitmap may waste more memory than it does any good.
1 - Select all layers within the timeline and choose right-click → Copy Frames ("Bilder kopieren" in the German UI), NOT Copy Layers.
2 - Create a new MovieClip (Ctrl+F8).
3 - Click in the first frame and right-click → Paste Frames.
Now you can delete the original layers and use the new MovieClip in their place.
Try the "New Anim Clip" extension from ToonMonkey: http://toonmonkey.com/extensions.html
I think it'll do what you need.
I'm new to HTML5/Canvas/game programming, but have been tinkering around with it after reading a couple of books. I THINK I have a fairly good idea of how things work. This question asks several smaller questions, but in general is basically a "structural approach" question. I'm not expecting verbose responses, but hopefully small pointers here and there :) Here is a link to a non-scrolling, and currently rather boring, Super Mario World.
Super Mario World Test
NOTE: Controls are Left/Right and Spacebar to jump. This is only set up for Firefox right now as I'm just learning.
Did I Do Something Wrong at This Point?
Currently I've just focused on how Mario runs and jumps, and I think I've gotten it down fairly well. The coin box doesn't do anything and the background is just an image loaded in for looks. Here's my approach; please let me know if anything is entirely wrong with it:
Mario jumps by acting on two Y velocities (gravity and jump variables) - see the sketch just below this list.
Mario runs by acting on one X velocity (left or right "friction" plus acceleration).
Sprites are used and positioned according to keypress/keydown
I'm not sure if this is right, but I'm using a constructor function to build an object, then inside the main animation loop I'm calling that object's prototype draw function to update all its variables and redraw it.
I'm clearing the entire canvas each Frame
Should I be splitting this into more than just a draw function, like Mario.move()?
I've set up a GroundLevel and a JumpLevel variable to create 2 planes of gameplay. JumpLevel is there to allow controlling how high Mario can jump on the fly. The 2 planes would allow the ground to rise like a hill, keeping the point at which gravity overrules Mario's jumping force at the same distance from the ground.
For clarity's sake, everything is separated into different JS files, but I would obviously consolidate.
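For reference, a minimal sketch of the gravity-plus-jump-velocity idea from the list above (variable names and constants here are illustrative, not taken from my actual test page):

var mario = { x: 50, y: 0, vx: 0, vy: 0, onGround: true };
var GRAVITY = 0.5;       // added to vy every frame
var JUMP_VELOCITY = -10; // negative = up, since canvas y grows downward
var GROUND_LEVEL = 400;

function update(keys) {
  if (keys.space && mario.onGround) {
    mario.vy = JUMP_VELOCITY;
    mario.onGround = false;
  }
  mario.vy += GRAVITY;
  mario.y += mario.vy;

  if (mario.y >= GROUND_LEVEL) { // landed
    mario.y = GROUND_LEVEL;
    mario.vy = 0;
    mario.onGround = true;
  }
}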
Moving Forward:
Now that I've finished setting up how Mario moves around, there are a couple of other minor things I might still do, like the mushroom up/down and shooting fireballs, but I think I can figure those out. I'm really lost, however, when it comes to visualizing the following and how HTML5/Canvas can handle it:
Scrolling background: I've tried setting up ground tiles and using screen wrapping, but that seemed to cause a lot of uneven issues since I was moving the tiles in the opposite direction. Unfortunately, since I'm trying to account for acceleration, this threw off the count and was causing gaps in the ground, so I ditched that idea. Would a DIV beneath the canvas with a large background image be the best solution?
Enemies: Would I create enemies the same way and run a loop for collision detection on every enemy during each frame?
Background boxes: I'm trying to allow Mario to stand on the boxes in the background, but am unsure how to approach this. I currently have boundaries set up to keep Mario on the canvas; do I keep expanding these conditions to set up different boundaries based on the boxes? I can see that having several boxes on screen and doing it this way would get kind of crazy, especially if I'd be doing the same hit testing for enemies. I know I'm missing something here....
Level movement: This is somewhat related. When the Right key is pressed, basically everything in the level needs to move to the left. Would I need to track the positions of everything that could touch Mario (boxes for him to stand on and enemies for him to collide with) during every animation frame? That seems like it would get kind of inefficient.
Thanks to all! I'll keep this updated with results/solutions :)
Wow, okay. I really like your question because you've obviously done a lot of thinking on this, but partially because of that it's incredibly broad and conversational. You'd do better to find a forum to ask this question.
...That being said, I'm gonna answer the handful of points I'm qualified to, in no particular order. :)
Level Movement: That's a weird (read: inefficient) way to do it. I wouldn't do any calculations based on on-screen positions: track a canonical, camera-agnostic set of coordinates for everything in your level and update the visuals to match. This will stop you from running into weird niggling problems where framerate impacts what you can and can't walk through, or where slower computers let Mario run through enemies without taking damage. Tracking positions this way will incidentally fix a lot of your other problems.
You should absolutely be splitting this into multiple functions. Having movement code and rendering code in the same place is going to screw you, particularly by interacting badly with your update/refresh rate. It essentially means that every time the player does a tricky jump the game does more updates than usual, which makes animation, hit detection, etc. much less consistent.
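A minimal sketch of that separation, with fixed-timestep updates and a single render call per frame (update() and draw() are placeholders, not anything from the linked test page):

var STEP = 1000 / 60;          // fixed update step, in milliseconds
var last = performance.now();
var accumulator = 0;

function loop(now) {
  accumulator += now - last;
  last = now;

  // Movement, physics and hit detection run in fixed-size steps...
  while (accumulator >= STEP) {
    update(STEP);              // placeholder for your movement/collision code
    accumulator -= STEP;
  }

  // ...while drawing happens once per animation frame, however many updates ran.
  draw();                      // placeholder for your rendering code
  requestAnimationFrame(loop);
}
requestAnimationFrame(loop);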
Enemies: I'd suggest rolling this in with everything else. Do one hit-detection pass against everything, and if you hit something, check what it was. You could try to optimize this by only checking any given entity against objects within 100 pixels of itself, but done that way you'd need to run a separate collision-detection pass for every enemy. Letting the enemies clip through each other would be computationally cheaper.
Edit: I'd like to clarify my first point about 'level movement'. Essentially, what you don't want to do is move every entity on screen every time the camera moves, or store all entity locations as offsets from the camera location (in which case you're still effectively having to move everything every time the camera moves).
Your ideal approach is to store your enemy, block and terrain locations with X/Y coordinates that are offsets from the absolute top-left of the level (its very beginning). In order to render a frame, you'd do essentially this (pseudocode, because we're talking about a hypothetical level format!):
// Return the entities whose x coordinate falls inside the camera window.
function getVisible(cameraX, width, levelEntities) {
  var visibleElements = [];
  for (var i = 0; i < levelEntities.length; i++) {
    if (levelEntities[i].x > cameraX && levelEntities[i].x < cameraX + width) {
      visibleElements.push(levelEntities[i]);
    }
  }
  return visibleElements;
}
Boom, you've got everything that should be inside the window. Now you subtract the camera's x offset from the entity's x location and ZAP, you've got its position on the canvas. Pose as a team, 'cause things just got real.
You'll note that I'm not bothering to cull on the Y axis. This can be rectified by extrapolation, which I'm guessing you can handle because you've made it this far. This will be necessary if you want to do any Mario-style vertical exploration.
Yes, I know the code above is still a rough sketch. I'm sorry, that's just how I roll at 11:30 at night. ;)