Chrome inserts non-breaking spaces into copy and pasted content

I'm talking about content from inside a contenteditable div, and the target is the same contenteditable div. So no external programs involved.
The structure of the HTML in this div is that each individual word is inside a span with some data we need to track. Then the whitespace is left as text nodes between the spans. This works fine for the most part (screw you newlines) but I've encountered a strange problem when copy and pasting.
Chrome turns this
<span attrs="stuff">word</span> <span attrs="stuff">another</span>
into this:
<span attrs="stuff">word&nbsp;</span><span attrs="stuff">another</span>
or this:
<span attrs="stuff">word</span><span style="line-height: 16.79999">&nbsp;</span><span attrs="stuff">another</span>
This obviously means that if the user copies and pastes more than one line, the formatting is completely screwed up, and the content of the span has changed, which invalidates the data we need to track.
The core problem is that other content in the div may contain non-breaking spaces for legitimate reasons, so if I start swapping them out globally, I might break that.
For my own spans with my attrs, I know what should be in them, so it's easy to strip out the non-breaking spaces and restore them to how they should be. But for these strange spans with the odd line-height, I've no idea how to clean them out without nuking everything.
Right now, I strip all the inserted spans that contain just a non-breaking space. But what I'd really like is either to stop Chrome from doing this in the first place, or an unambiguous way to identify the problematic extra spans so that I can clean them up safely without breaking any similar spans that exist for legitimate reasons. I could use this strange line-height, I guess, but that feels pretty brittle and unsafe.
How can I prevent the spans from appearing or identify them unambiguously?

This is not only a Chrome problem. Whenever you copy HTML from one place to another, something like this can happen.
This is why editors like CKEditor exist. They have advanced filtering techniques to remove such bad HTML.
I recommend using a clipboard viewer to see what the HTML looks like when you copy from different places: https://softwarerecs.stackexchange.com/questions/17710/see-clipboard-contents-hex-text
But implementing this on your own would be a waste of time, in my opinion.
CKEditor can be configured in detail to prevent the bad HTML.
Recent versions of CKEditor have a very sophisticated content filtering approach called "Advanced Content Filter".
Basically, "Advanced Content Filter" means that the whole HTML is parsed and checked. Any HTML that is not matched by a rule gets filtered out.
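The rule-based idea can be tried on a small scale without a full editor. The following is a minimal sketch, not CKEditor's API: it assumes the pasted content has already been read into a flat list of plain objects (a hypothetical model; cleanPasted and trackedAttr are made-up names), and it assumes that any untracked span holding only whitespace is a paste artifact, which is exactly the ambiguity the question worries about.

```javascript
// Hypothetical flat model of pasted content (not a browser or CKEditor API):
// each node is { type: 'text', text } or { type: 'span', attrs: {...}, text }.
const EDGE_SPACE = /^[\s\u00A0]+|[\s\u00A0]+$/g;

function cleanPasted(nodes, trackedAttr) {
  const out = [];
  for (const node of nodes) {
    if (node.type !== 'span') {
      out.push(node); // plain text nodes pass through unchanged
      continue;
    }
    const tracked = Object.prototype.hasOwnProperty.call(node.attrs, trackedAttr);
    if (tracked) {
      // Rule 1: a tracked span holds exactly one word, so whitespace at its
      // edges is moved back out into a separate text node.
      const word = node.text.replace(EDGE_SPACE, '');
      if (/^[\s\u00A0]/.test(node.text)) out.push({ type: 'text', text: ' ' });
      out.push({ ...node, text: word });
      if (/[\s\u00A0]$/.test(node.text)) out.push({ type: 'text', text: ' ' });
    } else if (/^[\s\u00A0]+$/.test(node.text)) {
      // Rule 2 (assumption): an untracked span with only whitespace is an artifact.
      out.push({ type: 'text', text: ' ' });
    } else {
      out.push(node); // untracked span with real content is kept as-is
    }
  }
  return out;
}
```

Adjacent text nodes produced this way could be merged in a second pass. If legitimate whitespace-only spans exist in the document, Rule 2 would need a stricter test than the one shown.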

Related

Contenteditable - Editing <code> element is inconsistent in Chrome

I have a WYSIWYG Editor using contenteditable that allows users to insert "code snippets" using a <code> element. For instance:
<div contenteditable="true">
<p>
This is a paragraph with an <code>inline snippet</code>!
</p>
</div>
https://jsfiddle.net/wyeack/pyta77zd/2/
In Firefox, if you place the caret directly before the first character of the <code> element and type, the new text is prepended to the element's content.
However, if you do the same thing in Chrome, the text is appended to the previous element instead.
This means that in Chrome there is no way to add content to the beginning of this element.
What's going on here? Is there a way for me to make the behavior more consistent?

Firefox has a superior selection implementation in this case. You can place the caret inside an inline element like <code> if you move it from right to left using the arrow keys. If you move it from left to right, it will stick to the left. This is so-called gravity.
What's going on here? Is there a way for me to make the behavior more consistent?
First of all – don't use bare contentEditable. Use a good RTE.
Second (and last :P) of all – I don't know about any RTE which normalises this specific behaviour. It's an expensive thing to do and few users would notice it. It is possible, though, but you would need to use an RTE with a proper data model (where the selection is fully abstracted and all input intercepted) and based on that handle the input accordingly.
I could give you more details on how to do it with CKEditor 5, but it's not production ready yet. I've got no idea how to do that with other RTEs, but I know one thing for sure: I would never attempt to fix this on a native contentEditable.

cross-browser way to get contenteditable's content

Different browsers generate the elements inside a contenteditable differently. It's especially obvious when you have line breaks, or paste multi-line content in.
In the old days with textarea, you could simply do $('textarea').val() to retrieve the content inside, and it was reliable and cross-browser compatible.
I wonder whether there is a similarly universally agreed method to retrieve the content inside a contenteditable, such that it's stripped of HTML tags and lines are properly separated by \n. If not, how does Facebook Messenger do it reliably? Do you need a complicated algorithm with browser detection?

One way to do it would be this:
var content = document.querySelector('[contenteditable]').textContent
The jury's still out on whether it's universally agreed upon, though: textContent simply concatenates the text nodes, so you would probably still have to account for differences between browsers in how they represent newlines and whatnot.
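One way to account for those differences is to walk the DOM yourself and decide what counts as a line boundary. The sketch below is only an illustration under stated assumptions: it treats <br> and a small set of block elements (DIV, P, LI) as line breaks, and collectLines and BLOCK_TAGS are illustrative names, not built-ins. It only relies on the standard nodeType, nodeName, nodeValue and childNodes properties.

```javascript
// Assumption: these tags start and end a line; extend the set as needed.
const BLOCK_TAGS = new Set(['DIV', 'P', 'LI']);
const TEXT_NODE = 3;    // same value as Node.TEXT_NODE in browsers
const ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE in browsers

function collectLines(root) {
  const lines = [''];
  const lastIsEmpty = () => lines[lines.length - 1] === '';

  function walk(node) {
    if (node.nodeType === TEXT_NODE) {
      lines[lines.length - 1] += node.nodeValue;
      return;
    }
    if (node.nodeType !== ELEMENT_NODE) return; // skip comments etc.
    if (node.nodeName === 'BR') {
      lines.push('');
      return;
    }
    const isBlock = BLOCK_TAGS.has(node.nodeName);
    // A block starts a new line unless we are already at the start of one.
    if (isBlock && !lastIsEmpty()) lines.push('');
    for (const child of node.childNodes) walk(child);
    // A block also ends the current line.
    if (isBlock && !lastIsEmpty()) lines.push('');
  }

  for (const child of root.childNodes) walk(child);
  // Drop the empty line left behind by a closing block or a trailing <br>.
  if (lines.length > 1 && lastIsEmpty()) lines.pop();
  return lines.join('\n');
}
```

In a browser this could be called as collectLines(document.querySelector('[contenteditable]')). The element's innerText property is another option, but its output depends on how the content is rendered.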

Browser automatically adds <a>-Tag without JavaScript

I just ran into a problem on a somewhat bigger site than the example below, which had me investigating for several hours until I found the cause of the bug. My problem was that the browser automatically spawned <a> tags for no apparent reason.
Let's say I've some HTML code that looks like this:
<a href="#">Link<a/>
<!-- Some code-magic; after a while you have something like: -->
<div>Not a link</div>
Of course the problem is pretty obvious in this case. But if the page is a bit more complex and you don't notice the wrong close of the <a>-Tag above, you're gonna have a bad time.
Why? Well, that's easy to show. You might expect that everything that follows is clickable. That is in fact true. But, and this is something I did not know, the browser adds <a> tags while building the page. This means that in the inspector (tested in Chrome and Firefox) you'll find something like this:
Link
<a>
<!-- Some code-magic; after a while you have something like: -->
<div>Not a link</div>
</a>
Interesting, huh? The browser closes the <a> tag properly on the first line and opens a new one around the div. You can probably guess that I started debugging all the JavaScript on the page (and there was a lot of it), because I thought JavaScript was the only thing that changes the code after the page has been fetched from the server.
Well, now I know, it's not the only thing. While trying to debug this problem I haven't found any information on the internet about it so I thought I'll share my knowledge with the guy having the exact same problem in the future (and we all know: He'll be there soon).
But, there is still one unanswered question: Why? I can't see a reason why the Browser should autofix this and create new tags. That doesn't make sense to me.

The actual question should be, why would you intentionally feed the browser with invalid HTML?
But back on-topic. Historically, HTML has been a jerky markup language. People without notion of DOM would write HTML such as:
<B><U>hi, </B> shall I be underlined or not?</U>
The above is clearly invalid HTML. However, the browser won't vomit if you feed it with invalid HTML. It will attempt to recover the document in the way it thinks the author intended.
Elements in the DOM (which is what the inspector shows you) can only have one parent element. So logically, the <U> must be closed before <B> is closed. But <U> hasn't been closed by the author, so the browser assumes the rest of the text shall too be underlined. Hence, the invalid HTML is recovered to approximately the following DOM structure:
<B><U>hi, </U></B><U> shall I be underlined or not?</U>
And in your specific case, <a> tags cannot be self-closed in HTML, so:
<a>...<a/>
The trailing / is ignored (treated as mere syntactic sugar), so the browser thinks you've opened a second <a> element without closing the first. It will most likely extend the anchor element through the rest of the document until it encounters the necessary closing </a> tags during recovery.

Two things:
- <a/> may be interpreted as an attempt at a self-closing anchor tag, so in fact it opens a new anchor.
- Nested anchors are illegal, so the browser closes the first one before opening the second.
However, I have no proof of that, it's pretty much a guess.
More speculations: It is probably a browser-dependent behavior and related to why browsers automatically create <tbody> tags in tables. They just already know, if there is no <thead>, the website author intends to use <tbody>.
So once the self-closing <a/> appeared, it ended the unclosed anchor and started a new one. Though since <a> is meant to be a container, it can't be self-terminating. The anchor would then end according to XHTML rules, i.e. closing the contained tags before the containers.
This would require a lot of inquiries on both the HTML specifications and individual browser behavior.

Adding html/any tags to either side of selection - Javascript

The problem:
After creating a textarea box in my PHP/HTML file, I wanted to add a little more functionality and decided to make a textarea that can use formatting, for example:
<textarea>
This is text that was inserted. <b>this text was selected and applied a style
via a button</b>
</textarea>
It doesn't matter what the tags are (they could be bubbles for all I care, because the PHP script, on receiving the $_POST data, will automatically apply the correct tags with the tag as the style ID; not relevant here).
The Question/s
How can I create this feature using javascript?
Are there any links that may help?
And if there is any information, can you explain it?
EDIT: Other close example but not quite is stackoverflow's editor and note that I do not wish to use 3rd party scripts, this is a learning process for me.
The tags that are inserted in the text are saved to a database and then when the page is requested the PHP replaces the tags with the style ID. If there is a work around not involving 3rd party scripts please suggest
And for the anti-research skeptics: a Google search turned up little that made sense, and this was my previous research on SO:
- https://stackoverflow.com/questions/8752123/how-to-make-an-online-html-editor
- Adding tags to selection
Thanks in Advance

<textarea> elements cannot contain special markup, only values. You can't apply any styling in a textarea.
What you'll need to do is fake everything that a text box would normally do, including drawing a cursor. This is a lot of work, as hackattack said.
You can do a lot if you grab jQuery and start poking around. Toss a <div> tag out there with an ID for ease and start hacking away.
I've never made one personally, but there is a lot to it. HTML5's contentEditable can maybe get you a good chunk of the way there: http://html5demos.com/contenteditable/
If you want to pass this data back to the server, you'll need to grab the innerHTML of the container and slap that into a hidden input upon submission of your form.
Here are some other things you can check out if you're just messing around:
- tabindex HTML attribute, to get focus in your box from tabbing
- jQuery.focus() http://api.jquery.com/focus/, to determine when someone clicks in your box
- cursor: text in CSS for looks http://wap.w3schools.com/cssref/pr_class_cursor.asp
- jQuery.keypress() http://api.jquery.com/keypress/, or similar for grabbing keystrokes
Edit: I think I completely misunderstood
If you're not looking for a rich text editor and just want some helper buttons for code, maybe selectionStart and selectionEnd are what you're after. I don't know what the browser support is, but it's working in Chrome:
http://jsfiddle.net/5yXsd/
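As a rough sketch of that approach (not the code from the fiddle above), the wrapping logic can be kept as a pure function and then wired to a textarea. wrapSelection and applyTag are illustrative names; value, selectionStart, selectionEnd and setSelectionRange() are the standard textarea members.

```javascript
// Pure helper: returns the new textarea value plus the selection range that
// keeps the originally selected text highlighted after the tags are added.
function wrapSelection(value, start, end, tag) {
  const open = `<${tag}>`;
  const close = `</${tag}>`;
  const selected = value.slice(start, end);
  return {
    value: value.slice(0, start) + open + selected + close + value.slice(end),
    selectionStart: start + open.length,
    selectionEnd: end + open.length,
  };
}

// Browser wiring; assumes `textarea` is an HTMLTextAreaElement.
function applyTag(textarea, tag) {
  const result = wrapSelection(
    textarea.value,
    textarea.selectionStart,
    textarea.selectionEnd,
    tag
  );
  textarea.value = result.value;
  textarea.focus();
  textarea.setSelectionRange(result.selectionStart, result.selectionEnd);
}
```

A button's click handler could then call applyTag(document.querySelector('textarea'), 'b'). Keeping the string manipulation separate from the DOM makes it easy to check on its own.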

You cannot do anything beyond basic formatting inside a textarea. If you want complex formatting, look into setting a div's contentEditable attribute to true. Or you can make a WYSIWYG editor, but that is a big project. I strongly suggest using 3rd party code on this one.

I suggest using an iframe to implement the WYSIWYG effect.
There is a property on the iframe's document called designMode.
See here for more:
https://developer.mozilla.org/en/Rich-Text_Editing_in_Mozilla
Also there is a lightweight example maybe you would like to take a look:
http://code.google.com/p/rte-light/source/browse/trunk/jquery.rte.js

Is there a NO-OP tag in HTML?

I am looking for a tag that I can use to mark out a position in the HTML, which I can then find later using jQuery. However, I need the tag to be as useless as possible: even empty divs and spans can cause the layout to change depending on the CSS rules you set. For that matter, even rubbish tags that HTML doesn't understand seem to acquire styles from CSS, and I don't think there is any way to find comments via DOM traversal?
This tag will be used to mark out the start and end of a chunk of HTML to be Ajaxed. I do not want to wrap the whole chunk in a div or span (which is what I'm doing now), because this can affect how the CSS cascades, and I want the fact that the HTML is marked out as a chunk to be completely transparent to the programmer (me).
Any ideas?
edit: I just thought of using empty script tags. Those should be completely inert and invisible. I shall look into it
edit: How could i forget about display: none? stupid stupid stupid

Script tags
Anchor tags <a name...>

Can you use comment tags: <!-- whatever -->? The parser keeps them as separate nodes, so you can distinguish them.
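Comments are reachable through DOM traversal, since they are ordinary nodes with nodeType 8. A minimal sketch, assuming the marker text is something like "chunk-start" (a made-up marker; findMarkerComments is an illustrative name):

```javascript
const COMMENT_NODE = 8; // same value as Node.COMMENT_NODE in browsers

// Returns every comment node under `root` whose trimmed text equals `marker`,
// in document order.
function findMarkerComments(root, marker) {
  const found = [];
  (function walk(node) {
    for (const child of node.childNodes) {
      if (child.nodeType === COMMENT_NODE) {
        if (child.nodeValue.trim() === marker) found.push(child);
      } else {
        walk(child); // text nodes have an empty childNodes list
      }
    }
  })(root);
  return found;
}
```

In a browser, findMarkerComments(document.body, 'chunk-start') would return the matching comment nodes, whose nextSibling can then be followed up to the end marker. document.createTreeWalker(root, NodeFilter.SHOW_COMMENT) is a built-in alternative to the manual walk.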

Given that you're talking about trying to use comments or <script> tags it seems that you don't want the content of your "chunk" to be visible to the user? If so, why can't you just wrap it like this:
<div style="display:none;" id="myChunk1">...your content...</div>
That won't interfere with the layout. If you have multiple "chunks" on the page use class="chunkClass" instead of setting the style inline.
Using jQuery you can easily get access to the content, delete the whole chunk, replace it, make it visible, etc.
If one extra <div> or <span> is screwing up your layout there's probably something else going on with your CSS.

[Responding to the title, not the actual scenario] If PHP is involved, <?php  ?> makes a dandy no-op tag; e.g.,
<p>No space between this<?php
?>that.</p>
will render as
No space between thisthat.
(except this facility does not do the Right Thing for embedded PHP multi-line tags, so the preceding was coded without any embedded newlines).
