I want to find the text lines in a page of text (like from a book).
Sample image:
One of the problems is that I want to implement this in Javascript and this is the best computer vision library that I found:
http://inspirit.github.io/jsfeat/#imgproc
Therefore I am limited to the algorithms implemented in JSFeat (or another JS library).
I thought of doing feature detection on the page and then doing statistics on the plotted points to find the lines. I'm not sure that's a good idea or how this can be done.
For example this is the output of FAST when applied on that image.
It should work regardless of the font used. Also slight rotation tolerance would be even better.
Help much appreciated!
My approach would be to count the number of vertical edges on each horizontal scanline. Each letter will produce two or more edges.
First, use the sobel operator to calculate x derivative:
Now we have positive and negative edges, but we want to count them both as positive. So take the absolute value:
Now count the edges on each line. This can be done by summing the pixels up, or simply by scaling the image to a width of 1px, leaving the height unchanged. For easy viewing I've plotted the result:
Now you'll need to threshold this result somehow, or maybe find the maxima after running a blur on the 1px-width image. If the font size and the letters per line stay roughly the same, this is easy.
You may want to re-run on different rotations of the original image and then use the result with the highest contrast.
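The steps above can be sketched in plain JavaScript without any library. This is only a minimal sketch under some assumptions: `gray` is a row-major grayscale pixel array (however you obtain it), a central difference stands in for the Sobel x derivative, and a fixed threshold replaces the blur-and-find-maxima step. The names `rowEdgeProfile` and `findTextLines` are made up for illustration and are not part of JSFeat.

```javascript
// Average absolute horizontal gradient for every scanline.
// `gray` is assumed to be a row-major array of width*height grayscale values.
function rowEdgeProfile(gray, width, height) {
  const profile = new Float64Array(height);
  for (let y = 0; y < height; y++) {
    let sum = 0;
    for (let x = 1; x < width - 1; x++) {
      // Central difference: a simplified stand-in for the Sobel x derivative.
      const dx = gray[y * width + x + 1] - gray[y * width + x - 1];
      sum += Math.abs(dx); // count positive and negative edges alike
    }
    profile[y] = sum / width; // same idea as scaling the image to 1px width
  }
  return profile;
}

// Group consecutive rows whose score exceeds `threshold` into text-line bands.
function findTextLines(profile, threshold) {
  const lines = [];
  let start = -1;
  for (let y = 0; y < profile.length; y++) {
    const isText = profile[y] > threshold;
    if (isText && start < 0) start = y;
    if (!isText && start >= 0) {
      lines.push({ top: start, bottom: y - 1 });
      start = -1;
    }
  }
  if (start >= 0) lines.push({ top: start, bottom: profile.length - 1 });
  return lines;
}
```

The threshold will depend on your scans, so treat it as something to tune rather than a fixed value.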
Related
Good Day everyone,
I am working on a project that represents data on a colored line.
For example if you give me 300 samples of the color of the sky throughout the day. I would want to plot them on a line like this:
The bright colors on the left show the daytime and the dark on the right show the night time.
That's easy enough, however for my real project I have about a million samples instead of 300. I want to use this data to make a line similar to above but 1,000,000 pixels wide, by 16 high.
It has to be in an image format because I will then run this image through different image compression algorithms (e.g. jpg, jpg2000, webp, etc.) and search for patterns within the results.
I thought to use HTML Canvas for this, however the max width of an HTML canvas in every major browser is much less than a million.
Chrome:
Maximum height/width: 32,767 pixels
Firefox:
Maximum height/width: 32,767 pixels
IE:
Maximum height/width: 8,192 pixels
The total area of the image is very normal, it's just the width that Canvas doesn't like.
Is there a way to turn these limits off in the browsers? Or is there another programming environment that could easily be set up to build a picture like this?
Thank you for your help!
If this is for display purposes only and you are restricted to JavaScript, then the ideal solution would be to split your data into chunks and add navigation buttons. Show only one part of the image at any given time, and re-render it when the user navigates prev/next. See mockup below
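A minimal sketch of the chunking idea, assuming each sample is a CSS color string and each sample becomes a 1px-wide column. `chunkRanges` and `drawChunk` are hypothetical helper names, and `ctx` is assumed to be a regular 2D canvas context whose canvas is at least as wide as one chunk.

```javascript
// Split `totalSamples` columns into ranges no wider than `maxWidth` pixels.
// `end` is exclusive.
function chunkRanges(totalSamples, maxWidth) {
  const ranges = [];
  for (let start = 0; start < totalSamples; start += maxWidth) {
    ranges.push({ start, end: Math.min(start + maxWidth, totalSamples) });
  }
  return ranges;
}

// Draw one chunk onto a canvas context; x positions restart at 0 per chunk.
// `colors` is assumed to be an array of CSS color strings, one per sample.
function drawChunk(ctx, colors, range, height) {
  for (let i = range.start; i < range.end; i++) {
    ctx.fillStyle = colors[i];
    ctx.fillRect(i - range.start, 0, 1, height);
  }
}
```

With 1,000,000 samples and the 32,767px limit quoted in the question, `chunkRanges(1000000, 32767)` gives 31 chunks, and the prev/next buttons would simply step through that array.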
I read somewhere that JavaScript handles whole numbers better than decimals etc.
Since I work with SVG a lot, I thought: SVG is about vectors, so its coordinate system could be whatever we want.
Now I built this naive performance test here:
https://jsperf.com/svg-whole-numbers-or-not
Question - the test shows 50% faster processing for the case with whole numbers. Can someone explain whether avoiding non-whole numbers actually gives any real benefit?
I basically want to know should I (even if the performance win is small) just by default avoid non-whole numbers when I work with SVG?
The speed difference is largely accounted for by the need to "anti-alias" stroke and fill operations by differentially shading pixels which are adjacent to a path that does not run between adjacent pixels.
While this can occur for horizontal lines, vertical lines and rectangles, it would not generally apply to diagonal lines, curves and arbitrary shapes.
Avoiding non-integer values will not benefit general-purpose SVG drawings. It may speed up a subclass of drawings, such as very simple flow diagrams that make use of horizontal and vertical joiners and boxes. Whether the care or editing needed to use whole-number coordinates is worth the time saved is (imho) doubtful.
I am looking to achieve something like this. A HTML view has a finite number of images (shown as red boxes in the image below). Are there any browser/jQuery APIs available today (cross-browser) which will let me calculate the dimensions of the remaining space (shown in green boxes) quickly? In the example shown below, it is easy to calculate the green area dimensions using simple geometry given the dimensions of the red boxes. But I am talking about very complex scenarios and complicated combination of images.
Appreciate any help. Thanks.
If all of your images are absolutely positioned, you can calculate their positions from the top and left offsets, e.g. $('#elementID').offset().top and $('#elementID').offset().left
From my experience working with DOM element dimensions, you cannot rely on them for exact values, and certainly can't rely on them for the same values cross-browser. You can get OK results, but if you have complex scenarios then you will probably come undone at some point.
One way I have achieved similar things in the past is by drawing images to HTML5 Canvas. Using canvas you can have very fine-grained control. I have even iterated canvases pixel-by-pixel to get pixel perfect measurements of items on the canvas.
Check out this tutorial for a brief overview of drawing an image.
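To give a rough idea of what iterating pixel-by-pixel looks like, here is a small sketch. It assumes you already have an object with the same layout as the result of ctx.getImageData(), i.e. `width`, `height` and a flat RGBA `data` array. The function name `opaqueBounds` and the `alphaThreshold` parameter are made up for this example.

```javascript
// Bounding box of all pixels whose alpha is above `alphaThreshold`.
// `imageData` is assumed to have { data, width, height } with RGBA bytes,
// the same layout that ctx.getImageData() returns.
function opaqueBounds(imageData, alphaThreshold = 0) {
  const { data, width, height } = imageData;
  let minX = width, minY = height, maxX = -1, maxY = -1;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const alpha = data[(y * width + x) * 4 + 3]; // 4 bytes per pixel, alpha last
      if (alpha > alphaThreshold) {
        if (x < minX) minX = x;
        if (x > maxX) maxX = x;
        if (y < minY) minY = y;
        if (y > maxY) maxY = y;
      }
    }
  }
  if (maxX < 0) return null; // nothing drawn
  return { x: minX, y: minY, width: maxX - minX + 1, height: maxY - minY + 1 };
}
```

Note that reading pixels back from a canvas only works if the images are same-origin or served with suitable CORS headers, otherwise the canvas is tainted.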
UPDATE
There is no easy way to do it. This method is low-level and will require you to use mathematics, and possibly byte-level image data from the canvas. However, if your problem is as complex as you suggest then you will have to get stuck in. When I did something similar I was also looking for an easy way to achieve what I wanted in the browser, then spent a month getting to grips with the canvas API, learning about byte-level colour data etc, but in the end I got what I needed, and ended up with something quite unique as it was difficult to achieve in a browser.
To get started, first I would say look at implementing a layered canvas by absolutely positioning multiple canvases on top of each other, then drawing a single image on each one. You already know the sizes of the images, and you can decide the coordinates of where to draw the image, so that's a start. In fact that may be all you need, you can track each image as you draw them by storing coords and dimensions, and you should be able to build up an accurate picture in numbers of where all your images are in 2D space.
Using those numbers you should then be able to calculate any empty spaces on there. However, that is beyond me and probably a question for Mathematics Stack Exchange (which is actually down at the moment :D).
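For axis-aligned boxes, one possible way to get the empty spaces from the stored coords and dimensions is to cut the container along every box edge and keep the grid cells that no box covers. This is just a sketch under that assumption (no rotation, no transparency); `freeRectangles` is a hypothetical name, and the cells it returns are not merged into larger rectangles.

```javascript
// Return the uncovered grid cells of a container, given occupied boxes.
// Each box is { x, y, w, h } in container coordinates (axis-aligned).
function freeRectangles(containerW, containerH, boxes) {
  const clamp = (v, max) => Math.max(0, Math.min(max, v));
  const xs = new Set([0, containerW]);
  const ys = new Set([0, containerH]);
  for (const b of boxes) {
    xs.add(clamp(b.x, containerW));
    xs.add(clamp(b.x + b.w, containerW));
    ys.add(clamp(b.y, containerH));
    ys.add(clamp(b.y + b.h, containerH));
  }
  const sx = [...xs].sort((a, b) => a - b);
  const sy = [...ys].sort((a, b) => a - b);
  const free = [];
  for (let i = 0; i < sx.length - 1; i++) {
    for (let j = 0; j < sy.length - 1; j++) {
      // A cell is either fully covered or fully free, so test its midpoint.
      const cx = (sx[i] + sx[i + 1]) / 2;
      const cy = (sy[j] + sy[j + 1]) / 2;
      const covered = boxes.some(
        (b) => cx > b.x && cx < b.x + b.w && cy > b.y && cy < b.y + b.h
      );
      if (!covered) {
        free.push({ x: sx[i], y: sy[j], w: sx[i + 1] - sx[i], h: sy[j + 1] - sy[j] });
      }
    }
  }
  return free;
}
```

Summing `w * h` over the result gives the total free area; merging neighbouring cells into bigger green boxes would be a further step.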
I am working on this browser-based experiment where i am given N specific circles (let's say they have a unique picture in them) and need to position them together, leaving as little space between them as possible. It doesn't have to be arranged in a circle, but they should be "clustered" together.
The circle sizes are customizable and a user will be able to change the sizes by dragging a javascript slider, changing some circles' sizes (for example, in 10% of the slider the circle 4 will have radius of 20px, circle 2 10px, circle 5 stays the same, etc...). As you may have already guessed, i will try to "transition" the resizing-repositioning smoothly when the slider is being moved.
The approach i have tried so far: instead of manually trying to position them i've tried to use a physics engine.
The idea:
place some kind of gravitational pull in the center of the screen
use a physics engine to take care of the balls collision
during the "drag the time" slider event i would just set different ball sizes and let the engine take care of the rest
For this task i have used "box2Dweb". i placed a gravitational pull to the center of the screen, however, it took a really long time until the balls were placed in the center and they floated around. Then i put a small static piece of ball in the center so they would hit it and then stop. It looked like this:
The results were a bit better, but the circles still moved for some time before they went static. Even after playing around with variables like the ball friction and different gravitational pulls, the whole thing just floated around and felt very "wobbly", while i wanted the balls to move only when i drag the time slider (when they change sizes). Plus, box2d doesn't allow changing the sizes of the objects and i would have to hack my way to a workaround.
So, the box2d approach made me realize that maybe to leave a physics engine to handle this isn't the best solution for the problem. Or maybe i have to include some other force i haven't thought of. I have found this similar question to mine on StackOverflow. However, the very important difference is that it just generates some n unspecific circles "at once" and doesn't allow for additional specific ball size and position manipulation.
I am really stuck now, does anyone have any ideas how to approach this problem?
update: it's been almost a year now and i totally forgot about this thread. what i did in the end is to stick to the physics model and reset forces/stop in almost idle conditions. the result can be seen here http://stateofwealth.net/
the triangles you see are inside those circles. the remaining lines are connected via "delaunay triangulation algorithm"
I recall seeing a d3.js demo that is very similar to what you're describing. It's written by Mike Bostock himself: http://bl.ocks.org/mbostock/1747543
It uses quadtrees for fast collision detection and uses a force based graph, which are both d3.js utilities.
In the tick function, you should be able to add a .attr("r", function(d) { return d.radius; }) which will update the radius each tick for when you change the nodes data. Just for starters you can set it to return random and the circles should jitter around like crazy.
(Not a comment because it wouldn't fit)
I'm impressed that you've brought in Box2D to help with the heavy-lifting, but it's true that unfortunately it is probably not well-suited to your requirements, as Box2D is at its best when you are after simulating rigid objects and their collision dynamics.
I think if you really consider what it is that you need, it isn't quite so much a rigid body dynamics problem at all. You actually want none of the complexity of Box2D, as all of your geometry consists of circles (which I assure you are vastly simpler to model than arbitrary convex polygons, which is where IMO Box2D's complexity arises from). And like you mention, Box2D's inability to smoothly change the geometric parameters isn't helping, as it will bog down the browser with unnecessary geometry allocations and deallocations and fail to apply any sort of smooth animation.
What you are probably looking for is an algorithm or method to evolve the positions of a set of coordinates (each with a radius that is also potentially changing) so that they stay separated by their radii and also minimize their distance to the center position. If this has to be smooth, you can't just apply the minimal solution every time, as you may get "warping" as the optimal configuration might shift dramatically at particular points along your slider's movement. Suffice it to say there is a lot of tweaking for you to do, but not really anything scarier than what one must contend with inside of Box2D.
How important is it that your circles do not overlap? I think you should just do a simple iterative "solver" that first tries to bring the circles toward their target (center of screen?), and then tries to separate them based on radii.
I believe if you try to come up with a simplified mathematical model for the motion that you want, it will be better than trying to get Box2D to do it. Box2D is magical, but it's only good at what it's good at.
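Here is roughly what I mean by a simple iterative "solver", as a sketch rather than a finished implementation. It assumes each circle is a plain object `{ x, y, r }`; `relaxStep` and the `pull` factor are names I made up. One step pulls every circle a little toward the target and then pushes each overlapping pair apart along the line between their centers.

```javascript
// One relaxation step over circles of the form { x, y, r }.
// `pull` (0..1) is the fraction of the distance to `center` covered per step.
function relaxStep(circles, center, pull) {
  // 1. Attraction: move every circle a little toward the target point.
  for (const c of circles) {
    c.x += (center.x - c.x) * pull;
    c.y += (center.y - c.y) * pull;
  }
  // 2. Separation: resolve each overlapping pair by moving both halves apart.
  for (let i = 0; i < circles.length; i++) {
    for (let j = i + 1; j < circles.length; j++) {
      const a = circles[i];
      const b = circles[j];
      const dx = b.x - a.x;
      const dy = b.y - a.y;
      const dist = Math.hypot(dx, dy);
      const minDist = a.r + b.r;
      if (dist >= minDist) continue; // not overlapping
      // Unit direction from a to b; arbitrary direction if centers coincide.
      const ux = dist > 0 ? dx / dist : 1;
      const uy = dist > 0 ? dy / dist : 0;
      const push = (minDist - dist) / 2;
      a.x -= ux * push;
      a.y -= uy * push;
      b.x += ux * push;
      b.y += uy * push;
    }
  }
}
```

You would call this a few times per animation frame, and update `r` directly from the slider. Resolving one pair can reintroduce overlap in another, so a single step does not guarantee a fully separated result; that is what the repeated iterations are for. A small `pull` keeps the motion calm, and you can stop iterating once nothing moves more than some tiny amount.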
To me at least, it seems like the easiest solution is to first set up the circles in a cluster. So first set the largest circle in the center, then put the second circle next to the first one. For the third one you can just put it next to the first circle, and then move it along the edge until it hits the second circle.
All the other circles can follow the same method: place it next to an arbitrary circle, and move it along the edge until it is touching, but not intersecting, another circle. Note that this won't make it the most efficient clustering, but it works. After that, when you expand, say, circle 1, you'd move all the adjacent circles outward, and shift them around to re-cluster.
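Instead of literally sliding a circle along the edge, you can compute where it ends up directly: the new center lies at distance r1 + r from the first circle and r2 + r from the second, which is a circle-circle intersection. A sketch of that calculation, with circles as `{ x, y, r }` objects and `tangentPosition` being a name I made up:

```javascript
// Position for a new circle of radius `r` so that it touches both `a` and `b`.
// Returns one of the two mirror-image solutions, or null if none exists
// (for example when a and b are too far apart for the new circle to reach both).
function tangentPosition(a, b, r) {
  const ra = a.r + r; // required distance from a's center
  const rb = b.r + r; // required distance from b's center
  const dx = b.x - a.x;
  const dy = b.y - a.y;
  const d = Math.hypot(dx, dy);
  if (d === 0 || d > ra + rb || d < Math.abs(ra - rb)) return null;
  // Distance from a's center to the foot of the perpendicular on the a-b line.
  const t = (ra * ra - rb * rb + d * d) / (2 * d);
  const h = Math.sqrt(Math.max(0, ra * ra - t * t));
  const mx = a.x + (dx * t) / d;
  const my = a.y + (dy * t) / d;
  // Offset perpendicular to the a-b line; flip the signs for the other solution.
  return { x: mx - (dy * h) / d, y: my + (dx * h) / d, r };
}
```

You would still need to check the candidate against the circles already placed and try another pair if it intersects any of them.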
I'm trying to implement collision detection for SVG text elements using client side JavaScript. The hit-test should check if any glyph of a text overlaps any glyph of another text element. Since getBBox and getExtentOfChar are anything but accurate I need a custom solution.
My first approach was to get the colour of each coordinate/pixel of an element and do the hit-testing manually, but this does not work because it isn't possible to get the colour of a coordinate. It would require an additional canvas to get pixel colours -> awful workaround.
Now I'm thinking about converting the text or the glyphs to polygons for hit testing. Is it possible? Or does anyone have another approach for glyph based hit testing?
Best Regards
You are really entering a world of pain and cross browser problems. I ended up doing custom path-rendering of fonts only to get the total text length reliable and consistent. I don't even want to think about glyph-hitting.
One problem, for example, is that Firefox (at least 3.6) and iirc also some versions of Opera have a rounding error when scaling: when you scale the parent element holding the text and scale the text by the inverse of that scale, the letter-spacing will be slightly different compared to having no scale at all. (This is because each letter must begin on an even number or something like that. The problem can be solved by multiplying both the upscale and downscale by something like 10000, but that's another story.)
The performance impact by using path compared to text is unfortunately quite noticeable. If your canvas does any form of animated panning or zooming you should switch to pure text-elements during the animation and once static, turn on path rendering for accuracy.
Fortunately converting svg-fonts to paths is very easy: it is plaintext and uses the exact same format as the path-element. (Beware of font-embedding licenses though! Also keep file size in mind, as you cannot use the fonts from the user's system.)
As for the pixel-based hit-testing – if you switch to HTML5 Canvas, then this will become possible. Several projects provide easy transition from SVG to Canvas, e.g. fabric.js. See a comparison table here.
As for the polygon-based approach – possible, but difficult. You can convert text or glyphs to polygons (paths) using some tool (Inkscape's text-to-path for instance). And then there'll be calculations. Making a general solution for any text will require a lot of work. However, if the text doesn't change, then drawing your text manually using paths can be a quick and dirty solution.
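To give an idea of the calculations involved, here is a sketch of an overlap test for two simple polygons. It assumes each glyph outline has already been flattened into an array of `{ x, y }` points (curves approximated by line segments, holes ignored), which is itself part of the work. All function names are made up for this example. Two polygons overlap if any pair of edges crosses, or if one lies entirely inside the other.

```javascript
// Signed area test: > 0 if r is to the left of the line p->q, < 0 if right.
function orient(p, q, r) {
  return (q.x - p.x) * (r.y - p.y) - (q.y - p.y) * (r.x - p.x);
}

// True if segments ab and cd properly cross (touching endpoints and
// collinear overlaps are deliberately not counted in this sketch).
function segmentsCross(a, b, c, d) {
  return orient(a, b, c) * orient(a, b, d) < 0 &&
         orient(c, d, a) * orient(c, d, b) < 0;
}

// Ray-casting point-in-polygon test.
function pointInPolygon(p, poly) {
  let inside = false;
  for (let i = 0, j = poly.length - 1; i < poly.length; j = i++) {
    const a = poly[i];
    const b = poly[j];
    const straddles = (a.y > p.y) !== (b.y > p.y);
    if (straddles && p.x < ((b.x - a.x) * (p.y - a.y)) / (b.y - a.y) + a.x) {
      inside = !inside;
    }
  }
  return inside;
}

// Overlap test for two simple polygons given as arrays of { x, y }.
function polygonsOverlap(p, q) {
  for (let i = 0; i < p.length; i++) {
    for (let j = 0; j < q.length; j++) {
      if (segmentsCross(p[i], p[(i + 1) % p.length], q[j], q[(j + 1) % q.length])) {
        return true;
      }
    }
  }
  // No edges cross: overlap is only possible if one contains the other.
  return pointInPolygon(p[0], q) || pointInPolygon(q[0], p);
}
```

For whole text elements you would run this for every pair of glyph polygons, ideally after a cheap bounding-box check to skip pairs that are nowhere near each other.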