How to implement a safe text editor for editing content - javascript

The site has different kinds of content that users can create and edit: news, articles, etc.
How do I transfer data from the editor to the database correctly and safely?
I'd like to use a WYSIWYG editor, because its potential users are not very experienced (Markdown and BBCode would be difficult for them; they want something like MS Word =) ).
I'd also like to restrict the editor, for example: no images, only 5 colors, only 3 fonts, etc. (This can be done by limiting the editor's controls.)
My question: how do I make this editor safer? How do I prevent users from adding extra HTML or <script> tags? Do I have to run an HTML filter on the data coming from the database (the saved content that users wrote in the editor) when rendering the template page for that content (a news item or article)?
Should I store the content as HTML in the database (since I want a WYSIWYG editor, and it outputs HTML on save)? Or maybe I should convert the editor's HTML to BBCode or Markdown (with all my limitations and restrictions), stripping all extra HTML, and then convert the BBCode/Markdown back to HTML when reading the content from the database?
Or maybe there are easier and faster ways to make this safe?

If you are populating the text into the innerHTML of, let's say, a div, the user can write HTML and have it displayed as HTML later. However, if you don't want to let people inject HTML, you can use innerText instead. innerText is assigned the same way as innerHTML, but the value does not go through the HTML parser.
If you plan on using BBCode or Markdown, you would parse the text for the codes that need to be converted and leave the rest as text.
You could also use a regex to convert special characters to their HTML entity equivalents, and then convert the BBCode or Markdown to HTML.
Try this:
When saving to the database:
Replace known, well-formed HTML with BBCode, e.g. replace <b> with [b]. Ill-formed HTML stays as typed: <b > will stay <b >. Then do a regex replace on all remaining HTML special characters (i.e. < and >).
Then, when retrieving from the database, you replace the BBCode with HTML and you are all set.
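The retrieval step described above could be sketched roughly as follows. This is only an illustration, not a complete sanitizer: the names escapeHtml, ALLOWED_TAGS and bbcodeToHtml are hypothetical, the allowlist of three tags is an assumption, and mis-nested BBCode would still produce mis-nested (though not executable) HTML.

```javascript
// Escape the HTML-significant characters so user input is displayed as text.
function escapeHtml(text) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return text.replace(/[&<>"']/g, (ch) => entities[ch]);
}

// Hypothetical allowlist: only these BBCode tags are turned back into HTML.
const ALLOWED_TAGS = { b: 'strong', i: 'em', u: 'u' };

// Convert stored BBCode to HTML: escape everything first, then re-introduce
// only the allowlisted tags. Anything else (including <script>) stays text.
function bbcodeToHtml(stored) {
  let html = escapeHtml(stored);
  for (const [bb, tag] of Object.entries(ALLOWED_TAGS)) {
    // Match [b]...[/b] pairs lazily, across line breaks, case-insensitively.
    const pair = new RegExp(`\\[${bb}\\]([\\s\\S]*?)\\[/${bb}\\]`, 'gi');
    html = html.replace(pair, `<${tag}>$1</${tag}>`);
  }
  return html;
}

console.log(bbcodeToHtml('[b]Hi[/b] <script>alert(1)</script>'));
```

Because escaping happens before the BBCode replacement, the only angle brackets in the output are the ones this function writes itself.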

Related

Showing rich text from database

I have stored rich text in my db, and now I would like to show it to the website viewers, but when I echo the content I got this:
<p>sometext</p><strong>text</strong>
I would like to remove the <p> tags and any other tags from the displayed text.
I used CKEditor to store the rich text in the DB.
I could use CKEditor to show the rich text to the website viewers, but CKEditor is an editor and I only want to display the rich text.
Is there a built-in PHP function to convert the stored text into rich text and display it on my website?
Well, "rich text" has its own format. It's not XML-like. For example, take a simple file whose formatting I will try to indicate:
Hello
This is bold
This italic
Looks like this in "rich text":
{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0 Calibri;}}
{\*\generator Msftedit 5.41.21.2510;}\viewkind4\uc1\pard\sa200\sl276\slmult1\lang9\f0\fs22 Hello\par
\b This is bold\b0\par
\i This italic\i0\par
\par
}
So it is not so simple to get this into HTML.
The steps to get it into HTML are straightforward (some text parsing and a loop), but from your question it seems that you (1) aren't aware of its format and (2) haven't tried to write code to convert it to HTML.
I can add to this answer with how the parsing steps might go once you have actually tried. I could add it now, but I want more information first so as not to provide useless code, say if you are already using an API that does the job.
I use Draft.js with React.js, but I hope this helps if you're still facing this issue. You can see the solution in this video.
Basically you have to call convertToRaw, then JSON.stringify the result. This can then be sent to your backend as a string. To display it, make a GET request for that particular data, JSON.parse it, and then call convertFromRaw. Pass the result into another RichTextEditor as the editorState, but set readOnly={true}.
Find the methods in your particular editor that let you convert states and display them.

How to insert a string of plain text from a webpage into my own site's HTML?

Apologies, I'm sure there is a simple solution for this. But after 2 days searching, I can't find the right answer.
The basics: I want to pull a string from a plain text file (.txt) that is live and online, uploaded to my server (src="example.txt") and insert that string into a webpage on that same site ("page.html").
More details: My site is very basic. It allows readers (language learners) to listen to audio of a reader, while the site highlights the passages being read. So the plain text files in question are text/audio timing codes for dialog subtitles, and I'm trying to automatically generate the HTML from them so I don't have to insert the text + span codes into each HTML page individually (there are a lot of them). I have a very simple script that turns the subtitle timing codes into simple tags for HTML, and those text strings are what need to be read from the txt file, and inserted into a p tag in the HTML page. There is nothing else in the txt file.
The closest result I have found to my query is here. The and solutions are no good, as I don't want the content embedded — I need the unicode text string inserted in my HTML so it can be read by the script for the read-along highlighting. The Javascript suggested here, upon testing, doesn't work.
Your linked question makes use of innerHTML, which can cause issues with XML escaping or XML tags, e.g. <test>, since they will be treated as HTML. (valid XML tags are also valid HTML tags)
Assuming that is the issue you're stumbling on (there isn't much else to go on), you can use innerText instead, which prevents the text from being interpreted as HTML.
If that isn't your problem, you'll have to be more specific about the exact issue you encountered in your code. It also isn't fully clear whether your .txt files contain HTML tags, e.g. <p>, or purely text.
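A minimal sketch of the innerText approach, assuming the text file is served from the same site. The function name loadTextInto and the fetchImpl parameter are hypothetical and exist only to keep the example self-contained; if the file is meant to contain markup that should be rendered, innerHTML would be needed instead.

```javascript
// Hypothetical helper: fetch a text file and insert its content into an
// element as plain text, so any tags in the file are shown, not parsed.
async function loadTextInto(element, url, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Request for ${url} failed with status ${response.status}`);
  }
  // Assigning to innerText bypasses the HTML parser.
  element.innerText = await response.text();
  return element;
}

// Possible usage in a page (the selector is an assumption):
// loadTextInto(document.querySelector('p'), 'example.txt');
```' });
loadTextInto(el, 'example.txt', fakeFetch).then(() => {
  if (el.innerText !== '<test>hi</test>') throw new Error('text was not inserted verbatim');
});
</test>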

In a content management system, how do I display content (text that includes code) without executing it on the HTML page?

I'm creating a content management system for blogging as a project. When creating a new post with the TinyMCE text editor, if we insert text containing images or code (HTML, JavaScript, PHP) into the database, how do I display it after fetching it from the database without executing that code on the HTML page?
You have to write a custom parser to parse your text. That way you can decide which HTML tags to keep. To begin with, you can consider using https://github.com/paquettg/php-html-parser (so we don't have to reinvent the wheel).
Take a look at: http://php.net/manual/en/function.htmlspecialchars.php
The htmlspecialchars function converts special characters (such as < and >) to HTML entities (such as &lt; and &gt;), so the browser will not interpret them as HTML but will display them as literal characters.

get original dom element innerHTML without javascript processing

Background - in an article editor powered by TinyMCE for an enterprise in-house CMS behind large media site/s
HTML
<p>non-breaking-space: &nbsp; pound: &pound; copyright: &copy;</p>
JS
console.log($('p').html());
console.log(document.getElementsByTagName('p').item(0).innerHTML);
both return
non-breaking-space: &nbsp; pound: £ copyright: ©
when I'm expecting
non-breaking-space: &nbsp; pound: &pound; copyright: &copy;
some elements get their entities reversed (like pound and copyright), and some are preserved (non-breaking space). I need a way to get the original inner HTML, all preserved, not one that is processed by the browser; is that possible?
This is for a TinyMCE plugin which processes input using jQuery and puts it back. The content is loaded from a database; the plugin only processes image tags and should not modify the text content at all. The automatic change of some entities back to the raw characters wouldn't be too much of a problem, but:
We cannot modify editorial's input, even if it were minor
We enforce that these must be entities before they save due to some browser compatibility issues on our sites
I would use this answer - https://stackoverflow.com/a/4404544/830171 - however cannot as my HTML code is within a textarea that the user needs to edit and that I need to run jQuery DOM manipulation on (via the plugin).
One way I can think of is to not use jQuery/DOM to process the image tags I need to change, but to use regex like a lot of TinyMCE plugins do. However, since I was shot down in "regex to pull all attributes out of all meta tags" for attempting any regex on HTML, I was hoping for a better way!
TinyMCE uses a contenteditable iframe to edit the content. That's the reason why console.log($('p').html()); will log something else.
Use the following code to get the pure editor content:
tinymce.get('your_editor_id').getBody().innerHTML
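If the content read back this way still contains raw characters where entities are required, one possible workaround (an assumption, not something the answer above proposes) is to re-encode non-ASCII characters just before saving. The helper name encodeNonAscii is hypothetical, and it produces numeric references such as &#163; rather than named ones such as &pound;.

```javascript
// Hypothetical helper: re-encode every non-ASCII character as a numeric
// character reference, leaving tags and ASCII text untouched.
function encodeNonAscii(html) {
  // Array.from iterates by code point, so astral characters stay intact.
  return Array.from(html, (ch) => {
    const code = ch.codePointAt(0);
    return code > 127 ? `&#${code};` : ch;
  }).join('');
}

console.log(encodeNonAscii('<p>pound: £ copyright: ©</p>'));
```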

Need a web-embedded editor that doesn't save in HTML

Our clients are able to edit some text on our "admin" web site, which is then displayed to their customers on another "client" web site. They now want the ability to add mark-up like bold, italic, underline (and combinations of the above) plus links to web pages. Unfortunately, because we use a web framework that passes the stored text through an XSLT decoder (don't ask), we can't just save their changes as HTML because otherwise it will screw up the XSLT step.
I was thinking that what I need is something like a Markdown or BBCode editor that stores the text in the database with the Markdown or BBCode markup, and then some javascript on the client side that interprets the markup into HTML. Is there such a thing?
Here are two:
http://attacklab.net/showdown/
http://github.com/openlibrary/wmd (similar to, or the one that SO's Markdown editor is based on)
