innerText on content editable doubling some line breaks - javascript

I am trying to read multi-line user input from a contenteditable div, and I don't get the right number of line breaks when I read the input with contentEditableDiv.innerText.
I tried textContent, but it doesn't return any line breaks, while innerText sometimes returns too many. innerHTML doesn't seem appropriate since I don't want any HTML code, just text.
If my div contains:
a
b
It returns "a↵b" (97 10 98 in the example)
But if my <div> contains:
a

b
innerText returns "a↵↵↵b" (one ↵ too many; 97 10 10 10 98 in the example)
var input = document.getElementById("input");
var button = document.getElementById("button");
var result = document.getElementById("result");
button.addEventListener("click", (event) => {
  var charCodes = "";
  for (var i = 0; i < input.innerText.length; ++i) {
    charCodes += input.innerText.charCodeAt(i) + " ";
  }
  result.innerText = charCodes;
});
<div id="input" contenteditable="true" spellcheck="true" style="border:1px #000 solid"></div>
<button id="button">check</button>
<div id="result"></div>

The standard is a bit vague:
UAs should offer a way for the user to request an explicit line break at the caret position without breaking the paragraph, e.g. as the default action of a keydown event whose identifier is the "Enter" key and that has a shift modifier set. Line separators are typically found within a poem verse or an address. To insert a line break, the user agent must insert a br element.
If the caret is positioned somewhere where phrasing content is not allowed (e.g. in an empty ol element), then the user agent must not insert the br element directly at the caret position. In such cases the behavior is UA-dependent, but user agents must not, in response to a request to insert a line separator, generate a DOM that is less conformant than the DOM prior to the request.
To conform to this definition, it is safe to wrap the <br> element in a <div>, which is what Chrome does, but this is UA-dependent, so you should not rely on it. The side effect of this behavior, and the cause of your problem, is that both the div and the br elements produce a line break in the innerText property.
The innerHTML of a↵↵b looks like this in Chrome:
a
<div>
<br>
<div>
<br>
<div>
b
</div>
</div>
</div>
But if you paste it (instead of typing it character by character), it looks like this:
<div>a</div>
<div>
<br>
</div>
<div>
<br>
</div>
<div>b</div>
To reduce the line breaks, you need to post-process the innerHTML after every input change, treating each <div><br></div> as a single line break, and then read the innerText (take care in places where phrasing content such as br is not allowed, although in practice browsers handle these well).
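A minimal sketch of that post-processing step, working on the innerHTML string alone so it can be tried outside a browser. The function name editableHtmlToText is made up for this example, and it assumes markup built only from plain <div> and <br> elements, like the Chrome samples above; other browsers may produce different markup, and HTML entities such as &amp; are not decoded here.

```javascript
// Hypothetical helper: turn the innerHTML of a contenteditable div into plain text.
// Assumes lines are separated by <div> and <br> elements only.
function editableHtmlToText(html) {
  return html
    // An empty line is stored as <div><br></div>: drop the <br> so the pair counts once.
    .replace(/(<div[^>]*>)\s*<br\s*\/?>/gi, "$1")
    // Each remaining <div> start tag or <br> stands for exactly one line break.
    .replace(/<div[^>]*>|<br\s*\/?>/gi, "\n")
    // Remove closing tags and any other markup.
    .replace(/<[^>]+>/g, "")
    // A leading <div> does not separate anything, so drop the break it produced.
    .replace(/^\n/, "");
}

// Assumed markup for the lines "a", "" and "b", once typed and once pasted:
const typed = "a<div><br></div><div>b</div>";
const pasted = "<div>a</div><div><br></div><div>b</div>";
console.log(JSON.stringify(editableHtmlToText(typed)));  // "a\n\nb"
console.log(JSON.stringify(editableHtmlToText(pasted))); // "a\n\nb"
```

In a browser this would be called as editableHtmlToText(input.innerHTML), with input being the contenteditable div from the question.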

Related

Use innerText to retrieve text from hidden HTML elements

I need to get the text from an HTML element that is hidden with CSS. I know that, per the specs, innerText abides by the CSS rules. But the alternative, textContent, ignores the line breaks and tabs, which I need to keep in the string.
Is there any way around this?
For simplicity please see the following example:
const inntxt = document.querySelector('.expandable').innerText
console.log(inntxt) // Here we don't get the hidden div's text.
const txtct = document.querySelector('.expandable').textContent
console.log(txtct) // Here the result removes the line break.
.hidden{
display: none;
}
<div class='expandable'>
<span class='visib'>
Red balloon
</span>
<br>
<span class='hidden'>
Yellow badge<br>Green ribbon
</span>
</div>
I guess one way around it would be to replace the <br> with my own char like # by appending it instead of the <br>, but there must be a better way no?
UPDATE
To be clearer, the end result should be as follows.
For example, if you console.log() the string in Node, the string from innerText or textContent should be:
'Red balloon\nYellow badge\nGreen ribbon'
It is very unintuitive that textContent doesn't retrieve <br> new lines.
I suggest creating a dummy element in JavaScript, setting its innerHTML, and replacing every <br> with \n<br>.
Take a look at this code, I think it solves your issue:
const expandable = document.querySelector('.expandable')
const auxEl = document.createElement('div')
auxEl.innerHTML = expandable.innerHTML.replace(/<br>/g, '\n$&') // g flag: replace every <br>, not only the first
const textContent = auxEl.textContent
console.log({ textContent })
/* Yields
{
"textContent": "\n \n Red balloon\n \n \n Yellow badge\nGreen ribbon\n \n"
}
*/
.hidden {
display: none;
}
<div class='expandable'>
<div class='visib'>
Red balloon
</div>
<div class='hidden'>
Yellow badge<br>Green ribbon
</div>
</div>
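If the goal is exactly the string from the update, the indentation and the whitespace-only lines have to be dropped as well. A rough sketch that works on the markup string itself; htmlToLines is a name invented here, and stripping tags with a regular expression is only reasonable for simple, trusted markup like the example (entities are not decoded).

```javascript
// Hypothetical helper: markup string in, one trimmed line of text per <br> or newline out.
function htmlToLines(html) {
  return html
    .replace(/<br\s*\/?>/gi, "\n")          // a <br> becomes a real newline
    .replace(/<[^>]+>/g, "")                // drop every remaining tag
    .split("\n")
    .map((line) => line.trim())             // remove indentation around each line
    .filter((line) => line.length > 0)      // drop lines that held only markup or whitespace
    .join("\n");
}

const markup = `
<span class='visib'>
Red balloon
</span>
<br>
<span class='hidden'>
Yellow badge<br>Green ribbon
</span>
`;
console.log(JSON.stringify(htmlToLines(markup)));
// "Red balloon\nYellow badge\nGreen ribbon"
```

In a browser the input would be document.querySelector('.expandable').innerHTML, which includes the hidden element because innerHTML does not look at CSS. Note that this sketch also removes intentionally empty lines.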
Not sure about what you're trying to achieve here, but what you might want to do is change innerText to innerHTML and then the line breaks are preserved.
Unlike innerText, though, innerHTML lets you work with HTML rich
text and doesn't automatically encode and decode text. In other words,
innerText retrieves and sets the content of the tag as plain text,
whereas innerHTML retrieves and sets the content in HTML format.
Quoted from https://stackoverflow.com/a/19030857/4073621.
Kindly use the following script to get innerText
const inntxt = document.getElementsByClassName('expandable')[0].innerText;
console.log(inntxt);

Is there any HTML element that can preserve windows CRLF, i.e \r\n

In my program, I need to render an HTML page where the user can pick specific words from a text area; JavaScript then captures the selection's position indexes (startIndex, endIndex) and uses them to process the source data further.
However, in one use case the source data contains Windows CRLF line endings. When it is rendered into a text box (I used a span), the CRLF is lost, so the index is off by one for every line.
I know we could potentially fix this in the backend, but it would be nicer and more consistent if we could preserve the CRLF on the web page.
Example code:
var input = "Hello\r\nJavaScript!"
document.getElementById("demo").innerHTML = input;
var test = document.getElementById("demo").innerHTML
console.log('length of input:' + input.length)
console.log("length of test: " + test.length)
span.demo {
white-space: pre-wrap;
}
<span class="demo" id="demo"></span>
I want to see input.length == test.length.
I have tried different white-space values (including pre), but no luck.
The console.log output might not make this very clear, but you can use the Firefox debugger to watch the input variable and see the \r\n.
Thanks a lot
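The pre element displays line breaks as they are written, but the loss of \r\n happens earlier: a string assigned through innerHTML goes through the HTML parser, which normalizes \r\n to \n. Assigning through textContent stores the string unchanged:
<pre id="test"></pre>
<script>
document.getElementById("test").textContent = "Hello\r\nJavaScript!";
</script>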

How can I get HTML-Code in a contentEditable-DIV

In a contentEditable div, I am trying to get the HTML code from start position 0 up to the position where the user has clicked.
<div id="MyEditableId" contentEditable="true">
1. Some text 123. <span style="background-color: #0CF;">text 123</span> 456 <span style="background-color: #9F3;">2-> abc </span>
<br />
<p> E.g. here is clicked: "click" Text after click </p>
<p></p>
<br />
end of text.
</div>
Something like the code snippet below, which delivers the text from 0 to the end of the clicked node. But I also need the HTML code in the contentEditable div.
$('#MyEditableId').on('mouseup', function(event) {
  var MyEditable = document.getElementById('MyEditableId');
  MyEditable.focus();
  range = document.createRange();
  // endOffset: It will be better the length of where actually was clicked, e.g. after 15-characters. But this.length will be also ok.
  endOffset = $(this).length;
  range.setStart(MyEditable.firstChild,0);
  range.setEnd(event.target,endOffset);
  var selection = window.getSelection();
  selection.addRange(range);
  // Below I get the selected text from 0 to end of clicked node. But I need the selected HTML-Code from 0 to end of clicked position.
  alert( window.getSelection() );
});
I expect a result like the following:
1. Some text 123. <span style="background-color: #0CF;">text 123</span> 456 <span style="background-color: #9F3;">2-> abc </span>
<br />
<p> E.g. here is clicked: "click"
How can I get the HTML-Code instead of text in my contentEditable-DIV?
Thanks In Advance.
You can select the div and use its property innerHTML
http://jsfiddle.net/at917rss/
<div id="MyEditableId" contentEditable="true">
1. Some text 123. <span style="background-color: #0CF;">text 123</span> 456 <span style="background-color: #9F3;">2-> abc </span>
<br />
<p> E.g. here is clicked: "click" Text after click </p>
<p></p>
<br />
end of text.
</div>
$('#MyEditableId').on('mouseup', function(event) {
  var MyEditable = document.getElementById('MyEditableId');
  MyEditable.focus();
  range = document.createRange();
  // endOffset: It will be better the length of where actually was clicked, e.g. after 15-characters. But this.length will be also ok.
  endOffset = $(this).length;
  range.setStart(MyEditable.firstChild,0);
  range.setEnd(event.target,endOffset);
  var selection = window.getSelection();
  selection.addRange(range);
  // get html for your div
  var myDiv = document.getElementById('MyEditableId');
  alert(myDiv.innerHTML);
});
Just change the alert line in your code to the one below; it works well:
alert($.trim($('<div>').append(range.cloneContents()).html()));
I was just going through the documentation for Range and Selection. In your case you could use the extractContents() or cloneContents() method of the Range object, like so:
var fragment = window.getSelection().getRangeAt(0).extractContents();
This automatically gets the first range that the user has selected (users can select multiple ranges by holding down the "Ctrl" key). For the simpler cases it gives an exact match up to the cursor position.
There are some caveats, though. extractContents() moves the selected nodes out of the document, whereas cloneContents() copies them and leaves the document untouched. Both methods return a DocumentFragment, and the documentation clearly states that:
Event Listeners added using DOM Events are not copied during cloning. HTML attribute events are duplicated as they are for the DOM Core cloneNode method. HTML id attributes are also cloned, which can lead to an invalid document through cloning.
In essence, document fragments can contain invalid HTML, and therefore in some cases you cannot use all the usual methods, such as .html(), on them.
I found a relevant SO post on getting the cursor position on a contentEditable element and came across a TextRange Object (supported in IE < 9) and it has an htmlText property which returns the HTML source of the selection as a 'valid' HTML fragment. So in that case you would do something like:
var fragment = document.selection.createTextRange().htmlText;
However since most of the modern browsers support Window.getSelection(), it's good practice that you use it and build upon the suitable methods you have at your disposal. Hope it gets you started in the right direction.
Sidenote - Also from the docs:
using a selection object as the argument to window.alert will call the object's toString method
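Putting these pieces together, one way to get the markup from the start of the editable div up to the clicked position is to read the caret from the selection instead of computing an end offset by hand. A hedged sketch: htmlBeforeCaret and its doc parameter are names invented here (doc defaults to the global document and exists only so the function can be exercised with a stand-in), and it assumes the selection has already moved to the clicked position by the time mouseup fires.

```javascript
// Hypothetical helper: HTML of everything between the start of `editable` and the caret.
function htmlBeforeCaret(editable, doc = document) {
  const selection = doc.getSelection();
  if (!selection || selection.rangeCount === 0) return ""; // nothing selected yet
  const caret = selection.getRangeAt(0);
  // Build a separate range so the user's own selection is left untouched.
  const range = doc.createRange();
  range.setStart(editable, 0);
  range.setEnd(caret.endContainer, caret.endOffset);
  // cloneContents() copies the nodes into a DocumentFragment without changing the page.
  const holder = doc.createElement("div");
  holder.appendChild(range.cloneContents());
  return holder.innerHTML;
}

// Browser usage, with the id from the question:
// var editable = document.getElementById("MyEditableId");
// editable.addEventListener("mouseup", function () {
//   console.log(htmlBeforeCaret(editable));
// });
```

Because the range is cloned rather than extracted, the content of the editable div stays as it was.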

Rangy and IE8 - positioning caret after element at end of paragraph

I'm using Rangy to perform several operations in a rich text editor (designmode = "on"). One of these functions is pasting formatted content which can represent certain pre-defined characters the user has created before-hand. All of the text content is held in paragraph elements. The user may start with this:
<p>The following is a special character: |</p>
where the pipe (|) is the caret position. They then choose to paste one of the 'special' characters via a button on the editor to end up with this:
<p>The following is a special character: <span class="read-only" contenteditable="false">SPECIAL</span>|</p>
The action uses Rangy behind the scenes to maintain the position of the caret (SelectionSaveRestoreModule) during the internal paste process which could be post-paste-processing the text in the editor and which likely messes up the position of the cursor otherwise.
However, in IE8 the caret cannot be placed after the <span> since there appears to be a bug which makes it an invalid position. As a result the cursor appears before the <span> element and it is not even possible to move the cursor after the span with the keyboard cursor controls. In fact, it even prevents the cursor moving on to any following paragraphs.
I have experimented with several techniques over recent days, including placing extra characters after the <span>s with some success. However those extra characters obviously cause confusion for the user when they appear and are not ideal. Using the zero-width space is visually better but attempting to tidy them up after the paste operation causes issues.
I need a 'tidy' method of supporting this user requirement for the special characters and I freely accept I may be approaching this in the wrong way.
I have a solution which seems to be working in my tests so far, but when I look at it it still fills me with a feeling that there must be a better way (not to mention a sense of dread).
What this code tries to do is place a zero-width-space after any read-only span which is at the end of a paragraph. It does this by inspecting the nodes after these elements to determine if there is actually text in them. At the same time it removes any zero-width-spaces which may still be in the text from previous inspections which are now no longer needed.
var ZWS = '\ufeff'; // U+FEFF (zero-width no-break space), used as the invisible filler
jQuery(_doc.body).find('p').each(function () {
  var lastContentEditable = undefined;
  // Look through the root contents of each paragraph to remove ZWS fixes that are no longer required
  jQuery(this).contents().each(function () {
    if (this.nodeType === 3) {
      if (this.nodeValue.indexOf(ZWS) != -1) {
        // Text node containing a ZWS - remove for now
        this.nodeValue = this.nodeValue.replace(new RegExp(ZWS, 'g'), '');
      }
      // Does this node now contain text?
      if (this.nodeValue.length > 0 && lastContentEditable) {
        // Found text after a read-only node, ergo we do not need to modify that read-only node at the end
        lastContentEditable = undefined;
      }
    } else if (this.nodeType === 1 && this.getAttribute('contenteditable') === "false") {
      // Indicate that this is currently the last read-only node
      lastContentEditable = this;
    }
  });
  if (lastContentEditable) {
    // It appears that there is a read-only element at the end of the paragraph.
    // Add the IE8 fix ZWS after it, created in the editor's own document (_doc).
    var node = _doc.createTextNode(ZWS);
    jQuery(lastContentEditable).after(node);
  }
});
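The decision the loop makes can be separated from the DOM, which makes it easy to check without IE8 at hand. In this sketch the child nodes of a paragraph are modelled as plain objects; the function name and the { kind, text } shape are assumptions made for illustration, not part of Rangy or jQuery.

```javascript
const FILLER = "\ufeff"; // the same character the code above stores in ZWS

// Returns true when the last meaningful child of a paragraph is a read-only
// element, i.e. when a filler character has to be appended after it.
function needsTrailingFiller(children) {
  let endsWithReadOnly = false;
  for (const child of children) {
    if (child.kind === "text") {
      // Fillers left over from an earlier pass do not count as real text.
      const realText = child.text.split(FILLER).join("");
      if (realText.length > 0) endsWithReadOnly = false;
    } else if (child.kind === "readonly") {
      endsWithReadOnly = true;
    }
  }
  return endsWithReadOnly;
}

console.log(needsTrailingFiller([
  { kind: "text", text: "The following is a special character: " },
  { kind: "readonly" },
])); // true
console.log(needsTrailingFiller([
  { kind: "readonly" },
  { kind: "text", text: FILLER + " and more text" },
])); // false
```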

Applying DIV/Span tag to a word by specific co-ordinates

Sample HTML Data
<body style="width:300px;">
<h3>Long-Text</h3>
A simple tool to store and display texts longer than a few lines.
The search button will highlight all the words matching the name of objects that are members of the classes listed in searchedClasses, itself a member of the KeySet class. The highlighted words are hypertext.
Edit invokes wscripts/acedb.editor, which by default launches emacs. Edit that file to start another editor in its place.
Save will recover from the emacs but will not destroy it.
Read will read a text file, so you could Search it.
**general** grep is a way to annotate a set of longtexts versus the searchedClasses. It outputs an ace file that you can then hand check and read back in acedb to create XREF from longTexts to genes etc.
<h3>World Wide NotePad</h3>
World wide notepad is a small text editor similar to Microsoft's notepad but has some more useful features like an auto typer to make typing the same sentence or word more easy, also World Wide NotePad has a text to speech feature which reads all text in the current open document and speaks it out load to you.
<h3>Etelka Wide Text Pro Bold Italic</h3>
</body>
For example, the word "general" (between the ** markers) is at x=0 and y=465. I know the x,y position, but how do I highlight a word at a specific location?
Let me explain once again: I want to highlight a word by its location.
For example, I have a location value (x,y)=(0,625) and want to extract the first word at that location (assume the word there is "World"). How do I then highlight that word?
Edit:
Here the Y coordinate is the absolute position within the entire HTML document.
The only method I can think of involves wrapping every word in a span element, and then using document.elementFromPoint(x,y) to get the span element at the given location. Something like this:
function highlightWordAtXY(x, y) {
  // Get the element containing the text
  var par = document.elementFromPoint(x, y),
      // textContent or innerText ?
      t = "textContent" in par ? "textContent" : "innerText",
      // Get the text of the element. No pun intended on the par[t].
      text = par[t],
      result;
  // Wrap a span around every word
  par.innerHTML = text.replace(/\b(\w+)\b/g, "<span>$1</span>");
  // Get the elementFromPoint again, should be a span this time
  result = document.elementFromPoint(x, y);
  // Check that we actually clicked on a word
  if (result == par)
    return false;
  // Wrap HTML text around the text at x, y
  result[t] = '<span class="highlight">' + result[t] + '</span>';
  // Restore the content with the wrapped text
  par.innerHTML = par[t];
}
Example at http://jsfiddle.net/BSHYp/1/show/light/ - click a word and watch it highlight.
Some important caveats here:
Each block of text must be wrapped in an element (such as <p> or <div>). You should be wrapping paragraphs in <p> tags anyway,
The element at the given location (x, y) must only have text in it, no child HTML elements. Text nodes with sibling HTML elements will have them removed (e.g. Clicking "Some" or "here" in Some <b>text</b> here will remove the <b> tags). Dividing them into separate <span> elements would be the only solution without building a much more complex routine,
IE will throw an "Unknown runtime error" if you try and add a block level element to a <p> tag,
On very, very, very large blocks of text you might run into performance issues. Break them up where applicable.
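One detail from the question's edit: the given y is measured from the top of the whole document, while document.elementFromPoint() expects coordinates relative to the viewport and returns null for points that are currently scrolled out of view. A small sketch of the conversion; pageToViewport and isInViewport are hypothetical helper names, and the scroll offsets and viewport size are passed in explicitly so the arithmetic can be checked on its own.

```javascript
// Convert document (page) coordinates into viewport coordinates.
function pageToViewport(pageX, pageY, scrollX, scrollY) {
  return { x: pageX - scrollX, y: pageY - scrollY };
}

// True when the viewport point is currently visible.
function isInViewport(point, viewportWidth, viewportHeight) {
  return point.x >= 0 && point.y >= 0 &&
         point.x < viewportWidth && point.y < viewportHeight;
}

// Page scrolled down by 400px in an 800x600 viewport:
const point = pageToViewport(0, 625, 0, 400);
console.log(point);                         // { x: 0, y: 225 }
console.log(isInViewport(point, 800, 600)); // true

// In a browser (sketch):
// var p = pageToViewport(x, y, window.pageXOffset, window.pageYOffset);
// if (!isInViewport(p, window.innerWidth, window.innerHeight)) {
//   window.scrollTo(0, y);
//   p = pageToViewport(x, y, window.pageXOffset, window.pageYOffset);
// }
// highlightWordAtXY(p.x, p.y);
```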
