Parsing plain text Markdown from a ContentEditable div

I know there are other questions on editable divs, but I couldn't find one specific to the Markdown-related issue I have.
The user will be typing inside a ContentEditable div, and they may use any number of Markdown features such as code blocks, headers, and so on.
I am having issues extracting the source properly and storing it into my database to be displayed again later by a standard Markdown parser. I have tried two ways:
$('.content').text()
With this method, the problem is that all the line breaks are stripped out, which of course is not acceptable.
$('.content').html()
With this method, I can get the line breaks working by using a regex to replace <br> with \n before inserting into the database. But the browser also wraps lines such as ## Heading Here in divs, like this: <div>## Heading Here</div>. This is a problem because when I display the text afterwards, I don't get the proper Markdown formatting.
What's the best (most simple and reliable) way to solve this problem as of 2015?
EDIT: Found a potential solution here: http://www.davidtong.me/innerhtml-innertext-textcontent-html-and-text/

If you check the documentation of jQuery's .text() method, it says:
The result of the .text() method is a string containing the combined text of all matched elements. (Due to variations in the HTML parsers in different browsers, the text returned may vary in newlines and other white space.)
So preserving whitespace is not guaranteed in all browsers.
Try using the innerText property of the element:
document.getElementsByClassName('content')[0].innerText
This returns the text with all whitespace intact, but it is not cross-browser compatible. It works in IE and Chrome, but not in Firefox.
The closest equivalent in Firefox is textContent, but it concatenates the raw text nodes and emits no line breaks for <br> or block elements, so the visual line structure is lost.
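A minimal sketch of the feature check suggested above. The helper name readVisibleText is hypothetical, and falling back to textContent is only an assumption about what to do when innerText is missing; that fallback will not reproduce line breaks.

```javascript
// Hypothetical helper: prefer innerText when the browser implements it,
// otherwise fall back to textContent (which ignores <br> and block boundaries).
function readVisibleText(el) {
  // Use a typeof check rather than truthiness, so an empty editor ('')
  // is still treated as a valid innerText result.
  if (typeof el.innerText === 'string') {
    return el.innerText;
  }
  return el.textContent;
}

// In a page this would be called as:
// readVisibleText(document.getElementsByClassName('content')[0]);
```

The function only reads two properties, so it can be exercised with plain objects as well as real elements.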

This is what I've been able to come up with using that link I posted above in my edit. It's in Coffeescript.
div = $('.content')[0]
if div.innerText
  text = div.innerText
else
  escapedText = div.innerHTML
    .replace(/(?:\r\<br\>|\r|\<br\>)/g, '\n')
    .replace(/(\<([^\>]+)\>)/gi, "")
  text = _.unescape(escapedText)
Basically, I'm checking whether or not innerText works, and if it doesn't then we do this other thing where we:
Take the HTML, which has escaped text.
Replace all the <br> tags with line breaks.
Strip out any tags (escaped ones won't be stripped, i.e. the stuff the user types).
Unescape the escaped text.
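The steps above can be sketched in plain JavaScript, without jQuery or Underscore. This version also turns the <div> wrappers mentioned in the question into line breaks, which the CoffeeScript above does not do. The function name editableHtmlToText is hypothetical, and the assumption that each <div> starts a new line (with <div><br></div> meaning an empty line) reflects Chrome-like behavior; other browsers may produce different markup.

```javascript
// Entities that innerHTML produces for text the user typed.
const ENTITY_MAP = {
  '&amp;': '&', '&lt;': '<', '&gt;': '>',
  '&quot;': '"', '&#39;': "'", '&nbsp;': ' ',
};

// Hypothetical helper: convert contenteditable innerHTML to plain text.
function editableHtmlToText(html) {
  return html
    // The <br> inside an otherwise empty <div> is only a placeholder.
    .replace(/<div\b[^>]*>\s*<br\s*\/?>\s*<\/div>/gi, '<div></div>')
    // Assumption: every <div> opened by the browser starts a new line.
    .replace(/<div\b[^>]*>/gi, '\n')
    // Remaining <br> tags are explicit line breaks.
    .replace(/<br\s*\/?>/gi, '\n')
    // Strip every other tag; escaped user text such as &lt;b&gt; survives.
    .replace(/<[^>]+>/g, '')
    // Unescape in a single pass so "&amp;lt;" becomes "&lt;", not "<".
    .replace(/&(?:amp|lt|gt|quot|#39|nbsp);/g, (entity) => ENTITY_MAP[entity])
    // Drop the newline created when the very first line is wrapped in a <div>.
    .replace(/^\n/, '');
}
```

The result can be stored as-is and handed to a Markdown parser later; headings typed on their own line stay on their own line.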

Related

how to display whitespace characters.. but omit when text is selected

Setup:
I'd like to output some text that shows visible spaces, line breaks, etc.
(for displaying strings while debugging, or in, say, a rich-text editor).
That is, I'd like to make the following kinds of substitutions:
" " -> "<span class="whitespace">·</span>"
"\r" -> "<span class="whitespace">\\r</span>"
"\n" -> "<span class="whitespace">\\n</span>"
perhaps the following CSS rule could be defined
/*display whitespace chars as a light grey*/
.whitespace { color:#CCC; }
so that
this two line
string
would be displayed as
this·two·lined\n
\t string
The Question:
Is it possible that when the above "visual-whitespace" text is selected or copied to the clipboard, it is copied without the whitespace markup?
Is there some CSS property to display x, but copy y?
javascript hack?
special whitespace-font?
other?

<style>.paragraph-marker:after { content: "\B6" }</style>
<p>Foo<span class="paragraph-marker"></span></p>
<p>Bar<span class="paragraph-marker"></span></p>
The :after is a pseudo-element: it generates a pseudo-node at the very end of the affected element's content.
The content property can be used with these pseudo-nodes to specify the textual content of them. It comes in handy when specifying quotation marks before and after quoted sections, or list separators like commas in semantic HTML <ol> which you don't want to display in bullet format.
It should come in handy for your use case since browsers don't deal with pseudo-nodes when converting a DOM selection stored in the clipboard to plain text on paste.
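A rough sketch of how the pseudo-element idea could be applied to the substitutions in the question: the real whitespace character stays in the DOM, and the visible marker is supplied through CSS generated content. The function name markWhitespace, the class name ws, and the data-ws attribute are all hypothetical, and the container is assumed to use white-space: pre-wrap so the real characters still render.

```javascript
// Assumed companion CSS (not part of the original answer):
//   .ws::before { content: attr(data-ws); color: #CCC; }
const MARKERS = { ' ': '·', '\t': '\\t', '\r': '\\r', '\n': '\\n' };
const HTML_ESCAPES = { '&': '&amp;', '<': '&lt;', '>': '&gt;' };

// Hypothetical helper: wrap each whitespace character in a span whose
// marker is drawn by CSS, and escape HTML-special characters in the text.
function markWhitespace(text) {
  return text.replace(/[&<> \t\r\n]/g, (c) =>
    c in MARKERS
      ? `<span class="ws" data-ws="${MARKERS[c]}">${c}</span>`
      : HTML_ESCAPES[c]
  );
}
```

Because the marker lives only in generated content, a copied selection should contain the original whitespace rather than the marker, in browsers that behave as the answer describes.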

http://codepen.io/msvbg/pen/ebgrj
Works fine in the latest version of Chrome. Flip the showWhitespace variable to try it both ways. It works by sticking a visible whitespace layer underneath the text layer, and only the top-most layer is copied by default.

Use JS to replace text in Gmail message body

I want to write a GnuPG extension for Google Chrome. So far, everything works as expected: if I detect ASCII-armored ciphertext, I parse it with my extension and then replace it (after the password has been entered).
Gmail, however, litters the message body with an insane number of tags, so my simple JS approach doesn't work anymore. Is there something that can select a certain amount of visible text, no matter how many tags it contains, and replace it with some other text? (The tags don't need to survive.) That is, I want to decrypt the mail body in place.

What you need is something like this:
/<[^>]+>/g
This regexp matches every tag, so replacing the matches with nothing leaves plain text. Something like this:
"<p>text <b>full</b> of <i>junk</i> and <u>unwanted</u> tags</p>".replace(/<[^>]+>/g, "");
As for selecting a specific part, you can use substring, I guess!

What I really needed to do was a little different:
1) expand my regex so it doesn't care about tags:
var re = /-----[\s\S]+?-----[\s\S]+?-----[\s\S]+?-----/gm;
2) store all the matches, with tags
3) use the regex provided by gibatronic to remove the tags, then further process the cleaned text using gpg
4) use body.innerHTML.replace() to replace the matches from 1) with the processed text from 3)
It works now; the only problem is that it breaks Gmail. The site layout stays intact, but all buttons and links become defunct. The only solution is to reload the page. Gotta fix this :S
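The steps above can be sketched as one string transformation. The names replaceArmoredBlocks and processArmored are hypothetical (processArmored stands in for whatever calls gpg), and turning <br> into newlines before stripping tags is an assumption about how Gmail breaks the armored lines.

```javascript
// Escape text before putting it back into HTML.
function escapeHtml(text) {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Hypothetical helper: find armored blocks even when tags are mixed in,
// clean them, process them, and substitute the result in place.
function replaceArmoredBlocks(html, processArmored) {
  // Same pattern as in the answer; created per call so no regex state is shared.
  const armoredBlock = /-----[\s\S]+?-----[\s\S]+?-----[\s\S]+?-----/g;
  return html.replace(armoredBlock, (match) => {
    const armoredText = match
      .replace(/<br\s*\/?>/gi, '\n') // assumption: <br> separates armored lines
      .replace(/<[^>]+>/g, '');      // drop all remaining tags
    return escapeHtml(processArmored(armoredText));
  });
}
```

Assigning the result back to document.body.innerHTML makes the browser re-parse the whole page and discard the event listeners attached to the old nodes, which would explain the dead buttons and links. Applying the same replacement only to the element that holds the message text may avoid that, though which element that is in Gmail is not something the original post states.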

Need to color the tags in an xml, displayed in a textarea

I need to color the tags in an XML string that is displayed in the textarea of an HTML page.
Say, for example, I have an XML string stored in a variable 'xmldata'.
The textarea tag in the HTML is as below:
<textarea id="xmlfile" cols="20" rows="30"></textarea>
Using the JavaScript statement below, I display the XML string in the textarea:
document.getElementById("xmlfile").value=xmldata;
But the XML string is displayed as plain text in the textarea.
Is there a JavaScript function to color the tags in XML?
I don't want any external JavaScript or CSS code such as "google-code-prettify".
All I need is a simple JavaScript function that colors the tags in an XML string displayed in the textarea.
Please help me with a solution.
-Dinesh

Since the contents of your text area are not separate DOM elements I don't believe you'll be able to individually set their attributes (since they don't have individual attributes). You might find some variation on a rich text editor that you can plug in. This may or may not violate your stipulation that you don't want external javascript libraries.
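If showing the XML in a regular element such as a <pre> instead of the textarea is acceptable (an assumption, since the question asks for a textarea), a small function can wrap each tag in a span for styling. The name highlightXmlTags and the class xml-tag are hypothetical, and the sketch does not handle a literal ">" inside an attribute value.

```javascript
// Assumed companion CSS: .xml-tag { color: blue; }
// Hypothetical helper: escape the XML, then wrap every tag in a span.
function highlightXmlTags(xml) {
  const escaped = xml
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
  // After escaping, a tag is everything from "&lt;" to the next "&gt;".
  return escaped.replace(
    /&lt;[\s\S]*?&gt;/g,
    (tag) => `<span class="xml-tag">${tag}</span>`
  );
}

// In a page (element id is an assumption):
// document.getElementById('xmlview').innerHTML = highlightXmlTags(xmldata);
```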

As replied here, have a look at a self-contained prettifier that works for most cases, does nice indenting for long lines, and colorizes the output if needed. Nevertheless, I guess it might not help if you need it inside a textarea.
function formatXml(xml,colorize,indent) {
  function esc(s){return s.replace(/[-\/&<> ]/g,function(c){ // Escape special chars
    return c==' '?'&nbsp;':'&#'+c.charCodeAt(0)+';';});} // Spaces become &nbsp; so indentation survives in HTML
  var se='<p class="xel">',tb='<div class="xtb">',d=0,i,re='',ib,
      sd='<p class="xdt">',tc='<div class="xtc">',ob,at,sz='</p>',
      sa='<p class="xat">',tz='</div>',ind=esc(indent||' ');
  if (!colorize) se=sd=sa=sz='';
  xml.match(/(?<=<).*(?=>)|$/s)[0].split(/>\s*</).forEach(function(nd){
    ob=nd.match(/^([!?\/]?)(.*?)([?\/]?)$/s); // Split outer brackets
    ib=ob[2].match(/^(.*?)>(.*)<\/(.*)$/s)||['',ob[2],'']; // Split inner brackets
    at=ib[1].match(/^--.*--$|=|('|").*?\1|[^\t\n\f \/>"'=]+/g)||['']; // Split attributes
    if (ob[1]=='/') d--; // Decrease indent
    re+=tb+tc+ind.repeat(d)+tz+tc+esc('<'+ob[1])+se+esc(at[0])+sz;
    for (i=1;i<at.length;i+=3) re+=esc(' ')+sa+esc(at[i])+sz+"="+sd+esc(at[i+2])+sz;
    re+=ib[2]?esc('>')+sd+esc(ib[2])+sz+esc('</')+se+ib[3]+sz:'';
    re+=esc(ob[3]+'>')+tz+tz;
    if (ob[1]+ob[3]+ib[2]=='') d++; // Increase indent
  });
  return re;
}
For demo see https://jsfiddle.net/dkb0La16/

Content inside CDATA is not displayed properly when processed through JavaScript

I have an XML document with some sample content like this:
<someTag>
<![CDATA[Hello World]]>
</someTag>
I'm parsing the above XML in JavaScript. When I try to access and render the Hello World text using xmldoc.getElementsByTagName("someTag")[0].childNodes[0].textContent, all I get is blank text on screen.
The code is not returning undefined or any error messages, so I guess it is accessing the message properly, but because of the CDATA it is not rendering properly on screen.
Is there any way to fix the issue and get the Hello World out of this XML file?

Note that Firefox's behaviour is absolutely correct. someTag has three children:
A Text node containing the whitespace between the <someTag> and <![CDATA. This is one newline and one space;
the CDATASection node itself;
another whitespace Text node containing the single newline character between the end of the CDATA and the close-tag.
It's best not to rely closely on what combination of text and CDATA nodes might exist in an element if all you want is the text value inside it. Just call textContent on <someTag> itself and you'll get all the combined text content: '\n Hello World\n'. (You can .trim() this if you like.)
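To illustrate the point without a browser, here is a simplified model of the three child nodes described above, using plain objects rather than a real DOM. The textContentOf function is a hypothetical stand-in that imitates how textContent concatenates text and CDATA nodes.

```javascript
const TEXT_NODE = 3;          // same numeric values as the DOM constants
const CDATA_SECTION_NODE = 4;

// Simplified model of the parsed <someTag> element (not a real DOM node).
const someTag = {
  nodeName: 'someTag',
  childNodes: [
    { nodeType: TEXT_NODE, data: '\n ' },                  // whitespace before the CDATA
    { nodeType: CDATA_SECTION_NODE, data: 'Hello World' }, // the CDATA section
    { nodeType: TEXT_NODE, data: '\n' },                   // whitespace after the CDATA
  ],
};

// Imitates textContent: text-like nodes return their data,
// elements return the concatenation of their children.
function textContentOf(node) {
  if (node.nodeType === TEXT_NODE || node.nodeType === CDATA_SECTION_NODE) {
    return node.data;
  }
  return (node.childNodes || []).map(textContentOf).join('');
}

const firstChildText = textContentOf(someTag.childNodes[0]); // whitespace only
const fullText = textContentOf(someTag).trim();              // the CDATA text
```

With a real parsed document, the equivalent call would be xmldoc.getElementsByTagName("someTag")[0].textContent.trim().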

If you're running Firefox, maybe this is the issue you're having. The behavior looks very similar. The following might do the trick:
xmldoc.getElementsByTagName("someTag")[0].childNodes[1].textContent;

problem with IE which removes new lines from $("#content").text()

HTML Code
<div id="content">
<p>hello world</p>
<p>this is a paragraph</p>
</div>
jQuery Code
alert($("#content").text());
IE removes the new lines (\n). How can I fix this problem?
result (IE)
hello worldthis is a paragraph
result (FF)
hello world
this is a paragraph
take a look :
http://jsfiddle.net/vB3bx/

It works fine if you use the div's innerText property. Try replacing
alert($("#content").text());
with
alert( document.getElementById( "content" ).innerText );

Internet Explorer normalizes all white-space and new-line characters into the SPACE character. As far as I know, there is nothing you can do about it.
btw, it seems that IE9 beta changed this behavior. I get new-lines in it.

I'm pretty sure Sime Vidas is correct. I've tried just about everything and whatever I try (innerText, innerHTML, jQuery methods, TextRange, cloning element and putting it inside a pre element etc etc) white space is removed. I'm guessing IE will just remove it on any type of call. It clearly is there during rendering since a white-space:pre will show it, but retrieving it through javascript will always remove the white space except for pre and textarea content.
This behavior has changed in IE9. The only solution in older versions would be to replace new line characters with tags (or anything really, semicolon etc) on the server if possible and then replace them back into \n in javascript after retrieving the text content.
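A minimal sketch of the placeholder round trip described above. Both halves are shown in JavaScript for illustration, although the encoding step would run on the server in whatever language it uses. The function names and the choice of U+2424 as the placeholder are assumptions; any token that cannot occur in the real text would do.

```javascript
// Hypothetical placeholder: U+2424 (SYMBOL FOR NEWLINE).
const NEWLINE_TOKEN = '\u2424';

// Server side (conceptually): hide line breaks before the text reaches the page.
function encodeNewlines(text) {
  return text.replace(/\r\n|\r|\n/g, NEWLINE_TOKEN);
}

// Client side: restore line breaks after reading the text back from the DOM.
function decodeNewlines(text) {
  return text.split(NEWLINE_TOKEN).join('\n');
}
```

Note that all line-ending styles come back as \n, so the round trip normalizes \r\n and \r.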

Not sure you're going to find a solution using text() if you consider the following:
Due to variations in the HTML parsers in different browsers, the text returned may vary in newlines and other white space.
(from http://api.jquery.com/text/)
