contentEditable insert br when new line occurs - javascript

A contentEditable has automatic word wrapping, creating a new line when you reach the width of the editable area. This is great, but I am parsing the contents afterwards and I need it to add a <br> wherever it wraps. I have tried everything I can think of and I can't achieve this. Any help gratefully received.

This is not possible, the word wrapping point is 'browser discretion' and as such susceptible to font size differences, fonts not being installed, font render engines, anti-aliasing settings etc. etc. The line-wrap point is, so to speak, 'not your problem' from the browser's perspective, and as such it doesn't give this info away.
Theoretically you could rebuild the content word-for-word in JS in a dynamically sized and similarly styled div, and monitor for when the height changes - that's where the newlines occur. It'd be a crap load of crappy code to achieve a dodgy result though.
I can't help but feel that this is an XY problem: if you need newlines at particular points, let the end user enter them when they want to. Simply adding overflow:auto;white-space:nowrap to the editable element forces them to, because the text then no longer wraps on its own. Example here.
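The rebuild-and-measure idea described in this answer could be sketched roughly as follows. This is only an illustration under stated assumptions: findWrapIndexes, insertBreaks and makeDomMeasurer are hypothetical names, the content is treated as plain text, and the probe element is assumed to pick up the same styling as the editable element through its classes or inline styles.

```javascript
// Grow the text one word at a time and record the word indexes at which
// the measured height increases, i.e. where a new visual line starts.
// `measureHeight` is a hypothetical callback: (text) => rendered height.
function findWrapIndexes(words, measureHeight) {
  const wrapIndexes = [];
  let previousHeight = null;
  for (let i = 0; i < words.length; i++) {
    const height = measureHeight(words.slice(0, i + 1).join(' '));
    if (previousHeight !== null && height > previousHeight) {
      wrapIndexes.push(i); // word i is the first word on a new line
    }
    previousHeight = height;
  }
  return wrapIndexes;
}

// Rejoin the words, using <br> instead of a space at each wrap point.
function insertBreaks(text, measureHeight) {
  const words = text.split(/\s+/).filter(Boolean);
  const wraps = new Set(findWrapIndexes(words, measureHeight));
  return words
    .map((word, i) => (i === 0 ? word : (wraps.has(i) ? '<br>' : ' ') + word))
    .join('');
}

// Browser-only measurer (assumption: a hidden, attribute-for-attribute
// clone of the editable element wraps text the same way it does).
function makeDomMeasurer(editable) {
  const probe = editable.cloneNode(false);
  probe.removeAttribute('contenteditable');
  probe.style.position = 'absolute';
  probe.style.visibility = 'hidden';
  probe.style.height = 'auto';
  probe.style.width = getComputedStyle(editable).width;
  document.body.appendChild(probe);
  return (text) => {
    probe.textContent = text;
    return probe.offsetHeight;
  };
}
```

In a page this might be called as insertBreaks(editable.textContent, makeDomMeasurer(editable)). It measures the text once per word, so it is slow on long content, and the result is only valid for the current width and font settings, which is exactly the fragility the answer warns about.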

Related

Dynamic text height plus wrapping to fill textarea in CSS

I am new to coding so I hope this is not too unclear:
I am learning Javascript, CSS and HTML through making a simple HTML calculator of which the display is a textarea (unless there is a better option for what I describe below).
I would like the contents of the textarea to be the same height as it at first, becoming smaller as the numbers require more width. However, I would like there to be a particular text size at which it begins to wrap (break onto the next line) without overflow. Ideally, the text would continue to get smaller and wrap onto more lines in order to fit in the textarea (this may not be practical for a real calculator, but this is also an exercise in design).
I have tried a few ways with word-break, word-wrap and overflow-wrap but none have had any effect. I imagine it might require some script as well, and maybe a more nuanced definition of the textarea size.
This may be complex or too ambitious but I would like to see if it is possible first!
Thanks a lot in advance.

Javascript retrieve linebreaks from dom [duplicate]

I need to add line breaks in the positions that the browser naturally adds a newline in a paragraph of text.
For example:
<p>This is some very long text \n that spans a number of lines in the paragraph.</p>
This is a paragraph that the browser chose to break at the position of the \n
I need to find this position and insert a <br />
Does anyone know of any JS libraries or functions that are able to do this?
The only solution that I have found so far is to remove tokens from the paragraph and observe the clientHeight property to detect a change in element height. I don't have time to finish this and would like to find something that's already tested.
Edit:
The reason I need to do this is that I need to accurately convert HTML to PDF. Acrobat renders text narrower than the browser does. This results in text that breaks in different positions. I need an identical ragged edge and the same number of lines in the converted PDF.
Edit:
@dtsazza: Thanks for your considered answer. It's not impossible to produce a layout editor that almost exactly replicates HTML. I've written 99% of one ;)
The app I'm working on allows a user to create a product catalogue by dragging on 'tiles'. The tiles are fixed-width, absolutely positioned divs that contain images and text. All elements are styled so that the font size is fixed. My solution for finding \n in a paragraph is OK 80% of the time, and when it works with a given paragraph the resulting PDF is so close to the on-screen version that the differences do not matter. Paragraphs are the same height (to the pixel), images are replaced with high-res versions and all bitmap artwork is replaced with SVGs generated server side.
The only slight difference between my HTML and PDF is that Acrobat renders text slightly more narrowly, which results in a slightly shorter line length.
Diodeus's solution of adding spans and finding their coords is a very good one and should give me the location of the BRs. Please remember that the user will never see the HTML with the inserted BRs - these are added so that the PDF conversion produces a paragraph that is exactly the same size.
There are lots of people that seem to think this is impossible. I already have a working app that creates extremely accurate HTML->PDF conversions of our docs - I just need a better solution for adding BRs because my solution sometimes misses a BR. BTW when it does work my paragraphs are the same height as the HTML equivalents, which is the result we are after.
If anyone is interested in the type of doc I'm converting then you can check out this screencast:
http://www.localsa.com.au/brochure/brochure.html
Edit: Many thanks to Diodeus - your suggestion was spot on.
Solution:
For my situation it made more sense to wrap the words in spans instead of the spaces.
var text = paragraphElement.innerHTML.replace(/ /g, '</span> <span>');
text = "<span>"+text+"</span>"; //wrap first and last words.
This wraps each word in a span. I can now query the document to get all the words, iterate and compare y position. When y pos changes add a br.
This works flawlessly and gives me the results I need - Thank you!
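The "iterate and compare y position" step is not shown in the post. A minimal sketch of how it might look is below; groupIntoLines and addLineBreaks are hypothetical names, the one-pixel tolerance is an assumption, and the browser part assumes the paragraph holds plain text only (any inline markup would be discarded).

```javascript
// words: array of { text, top } in document order, where `top` is the
// word's vertical position. A new line starts whenever `top` moves by
// more than `tolerance` pixels compared with the previous word.
function groupIntoLines(words, tolerance = 1) {
  const lines = [];
  let previousTop = null;
  for (const word of words) {
    if (previousTop === null || Math.abs(word.top - previousTop) > tolerance) {
      lines.push([]);
    }
    lines[lines.length - 1].push(word.text);
    previousTop = word.top;
  }
  return lines.map((line) => line.join(' '));
}

// Browser-only: wrap each word in a span, measure it, then rebuild the
// paragraph with a <br> element between the detected lines.
function addLineBreaks(paragraphElement) {
  const tokens = paragraphElement.textContent.split(/\s+/).filter(Boolean);
  paragraphElement.textContent = '';
  const spans = tokens.map((token, i) => {
    if (i > 0) paragraphElement.appendChild(document.createTextNode(' '));
    const span = document.createElement('span');
    span.textContent = token;
    paragraphElement.appendChild(span);
    return span;
  });
  const words = spans.map((span) => ({
    text: span.textContent,
    top: span.getBoundingClientRect().top,
  }));
  paragraphElement.textContent = '';
  groupIntoLines(words).forEach((line, i) => {
    if (i > 0) paragraphElement.appendChild(document.createElement('br'));
    paragraphElement.appendChild(document.createTextNode(line));
  });
}
```

Building the result from text nodes rather than an HTML string avoids having to escape characters such as < or & in the paragraph text.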
I would suggest wrapping all spaces in a span tag and finding the coordinates of each tag. When the Y-value changes, you're on a new line.
I don't think there's going to be a very clean solution to this one, if any at all. The browser will flow a paragraph to fit the available space, linebreaking where needed. Consider that if a user resizes the browser window, all the paragraphs will be rerendered and almost certainly will change their break positions. If the user changes the size of the text on the page, the paragraphs will be rerendered with different line break points. If you (or some script on your page) change the size of another element on the page, this will change the amount of space available to a floating paragraph and again - different line break points.
Besides, changing the actual markup of your page to mimic something that the browser does for you (and does very well) seems like the wrong approach to whatever you're doing. What's the actual problem you're trying to solve here? There's probably a better way to achieve it.
Edit: OK, so you want to render to PDF the same as "the screen version". Do you have a specific definitive screen version nominated - in terms of browser window dimensions, user stylesheets, font preferences and adjusted font size? The critical thing about HTML is that it deliberately does not specify a specific layout. It simply describes what is on the page, what those things are and where they are in relation to one another.
I've seen several misguided attempts before to produce some HTML that will exactly replicate a printed creative, designed in something like a DTP application where a definitive absolute layout is essential. Those efforts were doomed to failure because of the nature of HTML, and doing it the other way round (as you're trying to) will be even worse because you don't even have a definitive starting point to work from.
On the assumption that this is all out of your hands and you'll have to do it anyway, my suggestion would be to give up on the idea of mangling the HTML. Look at the PDF conversion software - if it's any good it should give you some options for font kerning and similar settings. Playing around with the details here should get you something that approximates the font rendering in the browser and thus breaks lines at the same places.
Failing that, all I can suggest is taking screenshots of the browser and parsing these with OCR to work out where the lines break (it shouldn't require a very accurate OCR since you know what the raw text is anyway, it essentially just has to count spaces). Or perhaps just embed the screenshot in the PDF if text search/selection isn't a big deal.
Finally, doing it by hand is likely the only way to make this work definitively and reliably.
But really, this is still just wrong and any attempts to revise the requirements would be better. Keep going up one step in the chain - why does the PDF have to have the exact same ragged edge as some arbitrary browser rendering? Can you achieve that purpose in another (better) way?
Sounds like a bad idea when you account for user-set font sizes, MS Windows accessibility mode, and the hundreds of different mobile devices. Let the browser do its thing - trying to have exact control over the rendering will only cause you hours of frustration.
I don't think you'll be able to do this with any kind of accuracy without embedding Gecko/WebKit/Trident or essentially recreating them.
Maybe an alternative: do all line-breaks yourself, instead of relying on the browser. Place all text in pre tags, and add your own linebreaks. Now at least you don't have to figure out where the browser put them.
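A rough sketch of the "do all line-breaks yourself" alternative is below. It assumes a monospace font (the default for pre), so that a character count stands in for line width; wrapText and maxChars are hypothetical names, and a word longer than maxChars is simply left on its own line.

```javascript
// Greedy word wrap: keep adding words to the current line until the next
// word would push it past `maxChars`, then start a new line.
function wrapText(text, maxChars) {
  const lines = [];
  let current = '';
  for (const word of text.split(/\s+/).filter(Boolean)) {
    if (current === '') {
      current = word;
    } else if (current.length + 1 + word.length <= maxChars) {
      current += ' ' + word;
    } else {
      lines.push(current);
      current = word;
    }
  }
  if (current !== '') lines.push(current);
  return lines.join('\n');
}
```

The returned string can be assigned to a pre element's textContent, and the same explicit breaks can then be handed to the PDF converter, so both outputs start from identical line breaks.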

Extract all content from PDF file (not just text, but also tables/diagrams)?

I'd like to reformat PDF main content, so I need to extract its main content, not just text, but also tables, diagrams, etc. with their layout information. I'm only interested in the main part of the content, for example, for technical paper, I'm only interested in the columns of text, tables, and diagrams. The headers, footers, and text on the margin can be ignored.
The idea is to scan the content stream of each PDF page and recognize whether each piece is a text paragraph or something else.
If it is a text paragraph, I may apply certain formatting to it.
If it is something else, such as a table, a diagram, or anything that is not a paragraph, I'll just keep it as is, or shrink or enlarge it to fit the new display.
For example, from the following stream I'd collect the text and make a note of its starting point relative to the page:
stream
BT
/F1 20 Tf
120 120 Td
(Hello from Steve) Tj
ET
endstream
I would continue decomposing the stream content in this way, organizing it into an array of document elements with relative position information and a note of whether each one is a paragraph (so that the associated text can be reformatted).
I guess that even just decomposing a stream, telling whether a piece is a paragraph of text and noting down its relative position may not be trivial.
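For the simple stream shown above, the "collect the text and note its starting point" step might be sketched like this. It is a toy under heavy assumptions: extractText is a hypothetical name, the stream is assumed to be already decompressed, and only the BT, Tf, Td and Tj operators with simple literal strings are handled. Real content streams also use Tm, TJ, hex strings, nested parentheses and font encodings, which is where most of the difficulty lies.

```javascript
// Walk a decoded content stream and collect each shown string together
// with the text position and font that were in effect at that point.
function extractText(content) {
  const tokenPattern =
    /\((?:\\.|[^\\()])*\)|\/[^\s()<>\[\]\/]+|[-+]?(?:\d+\.?\d*|\.\d+)|[A-Za-z'"][A-Za-z0-9*]*/g;
  const items = [];
  let operands = [];
  let font = null;
  let size = null;
  let x = 0;
  let y = 0;
  for (const [token] of content.matchAll(tokenPattern)) {
    if (token[0] === '(') {
      // Literal string: strip the parentheses and the simplest escapes.
      operands.push(token.slice(1, -1).replace(/\\([()\\])/g, '$1'));
    } else if (token[0] === '/') {
      operands.push(token.slice(1)); // name object, e.g. /F1
    } else if (/^[A-Za-z'"]/.test(token)) {
      // Operator: consume the operands collected so far.
      if (token === 'BT') {
        x = 0;
        y = 0;
      } else if (token === 'Tf') {
        [font, size] = operands.slice(-2);
      } else if (token === 'Td') {
        x += operands[operands.length - 2];
        y += operands[operands.length - 1];
      } else if (token === 'Tj') {
        items.push({ text: operands[operands.length - 1], x, y, font, size });
      }
      operands = [];
    } else {
      operands.push(Number(token));
    }
  }
  return items;
}
```

The positions are in PDF user space, which by default is measured in points from the bottom-left corner of the page. pdf.js also exposes page.getTextContent(), which reports text items together with their transforms and may be a more practical starting point than parsing streams by hand.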
I found that pdf.js's page.render() might help me achieve this goal, but I haven't figured out how it could be adapted.
pdf2htmlEx might also have a similar mechanism, as it can convert a PDF file to HTML.
But I am not sure at what level these tools do the rendering/conversion; if they render directly to an image, they may not serve my purpose.
Adobe's PDF viewer on Android can re-flow PDF content for a mobile phone's small screen. It may use some mechanism of full content capture and transformation of the kind I'd like to have.
So my question is: are there any pointers on how my requirements could be achieved?
Thanks a lot

How to determine font-weight bold without using getComputedStyle

I'm in the process of making an HTML text parser and I would like to be able to determine when a text node appears as a header (visually, not HTML headers).
One thing that can usually be said about headers is that they are emphasized - usually in one of two ways: bold font or larger font size.
I could get both corresponding CSS values using getComputedStyle(), but I want to avoid this because the parser needs high performance (has to run smoothly on, for example, Chromebooks) and getComputedStyle() is not particularly fast when looking through hundreds or thousands of nodes.
Figuring out a font size isn't too hard - I can just select the node with a range and check its client rects from range.getClientRects(). I haven't figured out a smart way to check font weight though, which is why I'm here.
Can anyone think of a higher-performance way of doing this than by using getComputedStyle()?
I'm aware this might not be possible - just looking to see if someone can think of an ingenious way to solve this problem.
Edit
This is for a Google Chrome extension only.
What you're aiming to do here is really messy. Since you want to determine if text is bold visually, on some devices, depending on how they render text, the whole system may just break!
A suggestion I have is to use the HTML5 data attributes - find out more here - and use them like so:
<div class="header" data-bold="yes">This will appear bold?</div>
Then, using JavaScript you can just go over all div elements with the data-bold attribute.
I hope this helped ;)
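The "go over all div elements with the data-bold attribute" step could look something like the sketch below. textOfBoldMarked is a hypothetical name, and the approach only works on pages whose markup you control, since the attribute has to be written by the page author.

```javascript
// Collect the text of every div explicitly marked with data-bold="yes".
// `root` is a document or element; no computed styles are consulted.
function textOfBoldMarked(root) {
  return Array.from(root.querySelectorAll('div[data-bold="yes"]'))
    .map((element) => element.textContent);
}
```

In a page this would be called as textOfBoldMarked(document).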

Constructing a web-reading aid: row selector

I've been trying to make a row-marking script to help me read text online. When I read books I always use a ruler or paper. Online, I don't have this option and usually get lost in text.
W - move div up
S - move div down
JS fiddle for my current effort (S = down, W = up)
The best thing I could come up with is to have equal row height for all text elements, but that makes styling quite hard. Also, I would like to be able to run it in the console to enable it on any website (or install it as an add-on).
Is there some better way to design a tool like this, that makes it more capable and adaptive to unknown content?
I can of course select text and I do that a lot. Usually, it results in me selecting only some words, so my attention goes to those words, which is kind of what I want to avoid.
I could also use a secondary window, which I do sometimes. But it's a bit wobbly and as soon as you click it disappears.
A div that follows mouse-pointer is a possibility, but it's too shaky and feels like something got stuck on your finger.
EDIT: I updated the fiddle with the changes. I kept the javascript because without it you won't be able to move the ruler to the lower parts of the text.
Updated reading-ruler
Instead of JavaScript you could set a position: fixed for the .ruler. It will stay in the same place all the time so you can read the text with the .ruler element below it.
Here is the jsFiddle. http://jsfiddle.net/RxDpP/1/
