Removing BBCode from a textarea with JavaScript

I'm creating a small JavaScript snippet for a phpBB3 forum that counts how many characters you have typed.
But I need to remove the special characters (which I managed to do) and one BBCode tag: quote.
My problem lies with the quote, and the fact that I don't know much about regex.
This is what I managed to do so far, but I'm stuck:
http://jsfiddle.net/emjkc/
var text = '';
var char = 0;
text = $('textarea').val();
text = text.replace(/[&\/\\#,+()$~%.'":*?<>{}!?(\r\n|\n|\r)]/gm, '');
char = text.length;
$('div').text(char);
$('textarea').bind('input propertychange', function () {
text = $(this).val();
text = text.replace(/[&\/\\#,+()$~%.'":*?<>{}!?\-\–_;(\r\n|\n|\r)]/gm, '');
char = text.length;
$('div').text(char);
});

You'd be better off writing a parser for that; however, if you want to try regexes, this should do the trick:
text = $('textarea').val();
while (text.match(/\[quote.*\[\/quote\]/i) != null) {
// on each pass, remove the innermost quote block found on a line
text = text.replace(/^(.*)\[quote.*?\[\/quote\](.*)$/gmi, '\$1\$2');
}
// now strip anything non-character
text = text.replace(/[^a-z0-9]/gmi, '');

I'm not sure if this would work, but I think you can replace all bbcodes with a regex like this:
var withoutBBCodes = message.replace(/\[[^\]]*\]/g,"");
It just replaces everything like [any char != ']' goes here]
EDIT: sorry, didn't see that you only want to replace [quote] and not all bbcodes:
var withoutBBQuote = message.replace(/\[[\/]*quote[^\]]*\]/g,"");
EDIT: ok, you also want quoted content removed:
while (message.indexOf("[quote") != -1) {
message = message.replace(/\[quote[^\]]*\]((?!\[[[\/]*quote).)*\[\/quote\]/g,"");
}
I know you already got a solution thanks to @guido, but I didn't want to leave this answer wrong.
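The loop-based answers above can be sketched as a pair of small functions. This is only an illustration under a few assumptions: the names stripQuotes and countLetters are hypothetical, [\s\S] is used instead of "." so that quoted text may span several lines, and the loop stops as soon as a pass removes nothing, so an unclosed [quote tag cannot cause an endless loop.

```javascript
// Hypothetical helper: removes [quote]...[/quote] blocks, innermost first.
function stripQuotes(message) {
  // Matches a quote block that contains no other quote tag inside it.
  var innermost = /\[quote[^\]]*\](?:(?!\[\/?quote)[\s\S])*?\[\/quote\]/gi;
  var previous;
  do {
    previous = message;
    message = message.replace(innermost, '');
  } while (message !== previous); // stop when a pass removes nothing
  return message;
}

// Hypothetical helper: counts letters and digits outside quote blocks.
function countLetters(message) {
  return stripQuotes(message).replace(/[^a-z0-9]/gi, '').length;
}

var sample = 'Hi [quote="bob"]outer [quote]inner[/quote] text[/quote] there';
console.log(stripQuotes(sample));
console.log(countLetters(sample));
```

In the page itself, countLetters could be called with the textarea value inside the existing input handler.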

Related

javascript regex to remove whitespace fails, why?

I use text.replace(/\s/g, '') to remove all whitespace characters from a String.
I'm trying this on a Russian text. I do an alert(text), which shows me the correct string, but the replace function throws this error: Bad Argument /\s/g
I'm creating .jsx files for Adobe InDesign scripting. The replace method works for some strings but fails sometimes. Any idea why?
Thanks.
EDIT
for (var i=0; i<arr.length; i++) {
// If there is no text for the current entry, remove it
alert(arr[i].text);
if (arr[i].text == undefined || arr[i].text === "") {
arr.splice(i,1);
i--;
continue;
}
var trimmed = arr[i].text.replace(/\s/g, '');
if (trimmed.text === "") {
entries.splice(i,1);
i--;
}
.
.
.
}
You need to escape ("\\") any regex special characters such as $, ^, etc. in your text.
Try to post a fiddle or paste the failing text so we can check the issue.
My bad - this is my edited answer.
var str = "Hello this is my test string";
var newStr = str.replace(/ /g, '');
alert(newStr) // "Hellothisismyteststring";
I was using .text() to populate the text objects. I learnt that this function converts a space to a non-breaking space (character 160).
I had to strip that too:
text = text.replace(/\u00A0|\s+/g, '');
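A hidden character such as the non-breaking space is easier to spot by looking at character codes than by using alert. The sketch below is illustrative only: charCodes and stripAllSpaces are hypothetical names, and whether \s already covers character 160 depends on the scripting engine, which is why \u00A0 is listed explicitly.

```javascript
// Hypothetical helper: lists the character code of every character.
function charCodes(text) {
  var codes = [];
  for (var i = 0; i < text.length; i++) {
    codes.push(text.charCodeAt(i));
  }
  return codes;
}

// Hypothetical helper: strips whitespace and non-breaking spaces.
function stripAllSpaces(text) {
  return text.replace(/[\s\u00A0]+/g, '');
}

var sample = 'a\u00A0b c\n';
console.log(charCodes(sample));      // 160 marks the non-breaking space
console.log(stripAllSpaces(sample));
```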

Use JavaScript string operations to cut out exact text

I'm trying to cut out some text from a scraped site and I'm not sure what functions or libraries I can use to make this easier.
Here is an example of the code I run from PhantomJS:
var latest_release = page.evaluate(function () {
// everything inside this function is executed inside our
// headless browser, not PhantomJS.
var links = $('[class="interesting"]');
var releases = {};
for (var i=0; i<links.length; i++) {
releases[links[i].innerHTML] = links[i].getAttribute("href");
}
// its important to take note that page.evaluate needs
// to return simple object, meaning DOM elements won't work.
return JSON.stringify(releases);
});
Class interesting has what I need, surrounded by new lines and tabs and whatnot.
here it is:
{"\n\t\t\t\n\t\t\t\tI_Am_Interesting\n\t\t\t\n\t\t":null,"\n\t\t\t\n\t\t\t\tI_Am_Interesting\n\t\t\t\n\t\t":null,"\n\t\t\t\n\t\t\t\tI_Am_Interesting\n\t\t\t\n\t\t":null}
I tried string.slice("\n"); and nothing happened. I really want an effective way to cut out strings like this, based on their position relative to those \n's and \t's.
By the way this was my split code:
var x = latest_release.split('\n');
Cheers.
It's a simple case of stripping out all whitespace, a job that regexes do beautifully.
var s = " \n\t\t\t\n\t\t\t\tI Am Interesting\n\t\t \t \n\t\t";
s = s.replace(/[\r\t\n]+/g, ''); // remove all non space whitespace
s = s.replace(/^\s+/, ''); // remove all space from the front
s = s.replace(/\s+$/, ''); // remove all space at the end :)
console.log(s);
Further reading: https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/RegExp
var interesting = {
"\n\t\t\t\n\t\t\t\tI_Am_Interesting1\n\t\t\t\n\t\t":null,
"\n\t\t\t\n\t\t\t\tI_Am_Interesting2\n\t\t\t\n\t\t":null,
"\n\t\t\t\n\t\t\t\tI_Am_Interesting3\n\t\t\t\n\t\t":null
}
var found = [];
for (var x in interesting) {
found.push(x.match(/\w+/g));
}
alert(found);
Could you try "\\n" as the pattern? Your \n may be interpreted as a plain string rather than as a special character.
new_string = string.replace("\n", "").replace("\t", "");
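One detail worth checking in the answers above: when replace is given a string pattern it replaces only the first occurrence, whereas a regex with the g flag replaces all of them. The sketch below compares the two on one of the scraped keys; cleanKeys is a hypothetical helper and String.prototype.trim is assumed to be available in the scraping environment.

```javascript
var raw = '\n\t\t\t\n\t\t\t\tI_Am_Interesting\n\t\t\t\n\t\t';

// String patterns: only the first "\n" and the first "\t" are removed.
var firstOnly = raw.replace('\n', '').replace('\t', '');

// Regex with the g flag: every newline and tab is removed.
var all = raw.replace(/[\n\t]+/g, '');

// trim() removes leading and trailing whitespace only.
var trimmed = raw.trim();

// Hypothetical helper: returns a copy of the object with trimmed keys.
function cleanKeys(obj) {
  var out = {};
  for (var key in obj) {
    if (Object.prototype.hasOwnProperty.call(obj, key)) {
      out[key.trim()] = obj[key];
    }
  }
  return out;
}

console.log(JSON.stringify(firstOnly));
console.log(all, trimmed);
console.log(JSON.stringify(cleanKeys({ '\n\tI_Am_Interesting\n': null })));
```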

How to detect line breaks in a text area input?

What is the best way to check the text area value for line breaks and then calculate the number of occurrences, if any?
I have a text area on a form on my webpage. I am using JavaScript to grab the value of the text area and then checking its length.
Example
enteredText = textareaVariableName.val();
characterCount = enteredText.length; // One line break entered returns 1
If a user enters a line break in the text area my calculation above gives the line break a length of 1. However I need to give line breaks a length of 2. Therefore I need to check for line breaks and the number of occurrences and then add this onto the total length.
Example of what I want to achieve
enteredText = textareaVariableName.val();
characterCount = enteredText.length + numberOfLineBreaks;
My solution before asking this question was the following:
enteredText = textareaVariableName.val();
enteredTextEncoded = escape(enteredText);
linebreaks = enteredTextEncoded.match(/%0A/g);
(linebreaks != null) ? numberOfLineBreaks = linebreaks.length : numberOfLineBreaks = 0;
I could see that encoding the text and checking for %0A was a bit long-winded, so I was after some better solutions. Thank you for all the suggestions.
You can use match on the string containing the line breaks, and the number of elements in that array should correspond to the number of line breaks.
enteredText = textareaVariableName.val();
numberOfLineBreaks = (enteredText.match(/\n/g)||[]).length;
characterCount = enteredText.length + numberOfLineBreaks;
/\n/g is a regular expression meaning 'look for the character \n (line break), and do it globally (across the whole string)'.
The ||[] part is just in case there are no line breaks. Match will return null, so we test the length of an empty array instead to avoid errors.
Here's one way:
var count = text.length + text.replace(/[^\n]/g, '').length;
Alternatively, you could replace all the "naked" \n characters with \r\n and then use the overall length.
I'd do this using a regular expression:
var inTxt = document.getElementById('txtAreaId').value;
var charCount = inTxt.length + inTxt.match(/\n/gm).length;
where /\n/ matches line breaks and g is the global flag. m is the multi-line flag; it only changes how ^ and $ match, so it is not strictly needed here. Alternatively, though as I recall this is a tad slower:
var charCount = inTxt.length + (inTxt.split("\n").length - 1);
Edit
Just realized that, if no line breaks are matched, this will spit an error, so best do:
charCount = inTxt.length + (inTxt.match(/\n/) !== null ? inTxt.match(/\n/gm).length : 0);
Or something similar...
For new JS, use encodeURI(), because escape() is deprecated.
Instead use:
enteredText = textareaVariableName.val();
enteredTextEncoded = encodeURI(enteredText);
linebreaks = enteredTextEncoded.match(/%0A/g);
(linebreaks != null) ? numberOfLineBreaks = linebreaks.length : numberOfLineBreaks = 0;
You can split the text based on new lines:
let textArray = text.split(/^/gm)
console.log(textArray.length)
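The counting approaches above can be combined into one function that never fails when there are no line breaks. This is a sketch rather than a definitive implementation: lengthWithCrlf is a hypothetical name, and normalising \r\n and lone \r to \n first is an added assumption so that each line break is counted exactly once.

```javascript
// Hypothetical helper: length of the text with every line break counted as 2.
function lengthWithCrlf(text) {
  // Normalise "\r\n" and lone "\r" to "\n" so each break is a single character.
  var normalized = text.replace(/\r\n?/g, '\n');
  // match() returns null when nothing matches, hence the "|| []" fallback.
  var lineBreaks = (normalized.match(/\n/g) || []).length;
  return normalized.length + lineBreaks;
}

console.log(lengthWithCrlf('abc'));
console.log(lengthWithCrlf('ab\ncd'));
console.log(lengthWithCrlf('a\r\nb'));
```

With jQuery, the function could be called as lengthWithCrlf(textareaVariableName.val()).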

Can I use a regex and foo.replace() to substitute occurrences of a string that aren't in an anchor tag?

I'm trying to use JavaScript to replace target text with a hyperlinked version of the target text. Generally speaking, this is the function in question:
function replace_text_in_editor(target_text, target_type, target_slug) {
// if target_text was "Google", then replacement_text might be "Google" wrapped in an anchor tag
var replacement_text = get_replacement_text(target_text, target_type, target_slug);
if(typeof replacement_text != 'undefined' && replacement_text != '') {
var content = '';
content = jQuery( "#content" ).val();
content = content.replace(target_text,replacement_text);
if(content != '') {
jQuery( "#content" ).val(content);
}
}
}
I've tried a couple of permutations of the following line, which I'd like to alter so that it only replaces text that's not already hyperlinked.
var regex = "/" + target_text + "/";
content = content.replace(regex,replacement_text);
Example attempt:
var regex = "/^(<a.*?>)" + target_text + "^(<\/a>)/";
Can someone please correct me with a regex showing how I should be doing this? No need to explain what the regex does step by step, as I can infer that from the design. Thank you!
I think you want this, but I'm not sure that I completely understand.
If you use a regex to find the text, you can capture it as group $1 by putting () around it (in this case "Google").
Then, when you replace it, you build the replacement using that group reference, $1:
\<a href="$1\.com"\>$1\<\/a\>
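One possible way to skip text that is already hyperlinked is sketched below. It is not taken from the answer above: escapeRegExp and linkUnlinked are hypothetical names, the pattern is built with the RegExp constructor (a string such as "/Google/" is treated by replace as literal text, slashes included), and existing anchors are matched first so they can be returned unchanged. A regex cannot parse arbitrary HTML, so this assumes simple, well-formed anchors.

```javascript
// Hypothetical helper: escapes characters that are special in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: replaces targetText everywhere except inside <a>...</a>.
function linkUnlinked(content, targetText, replacementText) {
  var pattern = new RegExp(
    '<[aA]\\b[^>]*>[\\s\\S]*?</[aA]>|' + escapeRegExp(targetText),
    'g'
  );
  return content.replace(pattern, function (match) {
    // Whole anchors are kept as they are; bare occurrences are swapped.
    return /^<a\b/i.test(match) ? match : replacementText;
  });
}

var link = '<a href="https://google.com">Google</a>';
console.log(linkUnlinked('Google and ' + link, 'Google', link));
```

Matching the anchors in the same pattern avoids lookbehind, which older JavaScript engines do not support.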

Converting HTML to its safe entities with Javascript

I'm trying to convert characters like < and > into &lt; and &gt; etc.
User input is taken from a text box, and then copied into a DIV called changer.
here's my code:
function updateChanger() {
var message = document.getElementById('like').value;
message = convertHTML(message);
document.getElementById('changer').innerHTML = message;
}
function convertHTML(input)
{
input = input.replace('<', '&lt;');
input = input.replace('>', '&gt;');
return input;
}
But it doesn't seem to replace >, only <. Also tried like this:
input = input.replace('<', '&lt;').replace('>', '&gt;');
But I get the same result.
Can anyone point out what I'm doing wrong here? Cheers.
A more robust way to do this is to create an HTML text node; that way all of the other potentially invalid content (there's more than just < and >) is converted. For example:
var message = document.getElementById('like').value;
document.getElementById('changer').appendChild(document.createTextNode(message));
UPDATE
You mentioned that your event was firing upon each key press. If that's what's triggering this code, you'll want to remove what was previously in the div before appending the text. An easy way to do that is like this:
var message = document.getElementById('like').value;
var changer = document.getElementById('changer');
changer.innerHTML = '';
changer.appendChild(document.createTextNode(message));
Try something like this:
function convertHTML(input)
{
input = input.replace(/>/g, '&gt;');
input = input.replace(/</g, '&lt;');
return input;
}
When the search pattern is a string, replace only replaces the first occurrence of > or <. To replace every occurrence, use a regular expression with the g flag so that the entire string is searched.
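As the first answer notes, there is more to escape than just < and >. A minimal sketch of a fuller version follows; escapeHtml is a hypothetical name, and the ampersand is replaced first so that the entities produced by the later steps are not escaped a second time.

```javascript
// Hypothetical helper: escapes the characters that are special in HTML.
function escapeHtml(input) {
  return input
    .replace(/&/g, '&amp;')   // must run first
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

console.log(escapeHtml('<a href="x">T&C</a>'));
```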
