As part of a custom WYSIWYG editor, we've been asked to implement automatic emoticon parsing if enabled. To do this, we use Regular Expressions to replace character combinations with their associated PNG files.
Here is the relevant part of the code which handles this (it's triggered by an onkeyup event on a contenteditable element; I've trimmed it back to the relevant parts):
// Parse emoji:
this.parseEmoji = function()
{
if( ! this.settings.parseSmileys )
{
return;
}
var _self = this,
url = 'http://cdn.jsdelivr.net/emojione/assets/png/',
$html = this.$editor.html();
// Loop through:
for( var i in _self.emoji )
{
var re = new RegExp( '\\B' + _self.regexpEscape(i) + '\\B', 'g' ),
em = _self.emoji[i];
if( re.test($html) )
{
var replace = '<img class="lw-emoji" height="16" src="'+(url + em[0] + '.png')+'" alt="'+em[1]+'" />';
this.insertAtCaret( replace );
_self.$editor.html(function() { return $(this).html().replace(re, ''); });
}
}
};
And here is the regexpEscape() function:
// Escape a string so that it's RegExp safe!
this.regexpEscape = function( txt )
{
return txt.replace(/[-[\]{}()*+?.,\\^$|#\s]/g, "\\$&");
};
We define all of the emoticons used in the system inside of an object which is referenced by the char combination itself as follows:
this.emoji = {
':)' : [ '1F642', 'Smiling face' ],
':-)' : [ '1F642', 'Smiling face' ],
':D' : [ '1F601', 'Happy face' ],
':-D' : [ '1F601', 'Happy face' ],
':\'(': [ '1F622', 'Crying face' ],
':(' : [ '1F614', 'Sad face' ],
':-(' : [ '1F614', 'Sad face' ],
':P' : [ '1F61B', 'Cheeky' ],
':-P' : [ '1F61B', 'Cheeky' ],
':/' : [ '1F615', 'Unsure face' ],
':-/' : [ '1F615', 'Unsure face' ],
'B)' : [ '1F60E', 'Too cool face' ],
'B-)' : [ '1F60E', 'Too cool face' ]
};
Now, the odd thing is that any of the character combinations which contain an alphabetical character do not get replaced, and fail the re.test() function. For example: :), :-), :( and :'( all get replaced without issue. However, :D and B) do not.
Can anyone explain why the alpha chars are causing issues inside of the RegExp?
Pared-back jsFiddle Demo
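The problem is that \B is context-dependent: it matches only where there is no word boundary. If the pattern starts with a word character (as B) does), a word character must appear right before it in the input string for a match. The same applies at the end: if the pattern ends with a word character (as :D does), \B requires another word character right after it. Patterns that start and end with non-word characters, such as :), only need a non-word character (or the start or end of the string) on each side, which is why they work.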
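To make this concrete, here is a small sketch of how \B behaves for three of the emoticons. The test strings are made up for illustration and are not taken from the original fiddle.

```javascript
// \B matches only where there is NO word boundary, i.e. where the characters
// on both sides are of the same kind (both word or both non-word characters).
var smile = /\B:\)\B/;  // starts and ends with non-word characters
var grin  = /\B:D\B/;   // ends with the word character "D"
var cool  = /\BB\)\B/;  // starts with the word character "B"

var results = {
  smileAlone: smile.test('hi :)'),  // true
  grinAlone:  grin.test('hi :D'),   // false: "D" followed by end of string is a boundary
  grinInWord: grin.test('hi :Dx'),  // true: "D" followed by "x" is not a boundary
  coolAlone:  cool.test('hi B)'),   // false: space followed by "B" is a boundary
  coolInWord: cool.test('hiB)')     // true: "i" followed by "B" is not a boundary
};
console.log(results);
```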
To avoid that issue, a lookaround-based solution is usually used: (?<!\w)YOUR_PATTERN(?!\w). However, lookbehind was not supported in JS when this was written (it was only added in ES2018). It can be worked around with a capturing group and a backreference in the replace function later.
So, to replace those cases correctly, you need to change that part of code to
var re = new RegExp( '(^|\\W)' + _self.regexpEscape(i) + '(?!\\w)' ),
em = _self.emoji[i]; // match the pattern when not preceded and not followed by a word character
if( re.test($html) )
{
var replace = '<img class="lw-emoji" height="16" src="'+(url + em[0] + '.png')+'" alt="'+em[1]+'" />';
this.insertAtCaret( replace );
_self.$editor.html(function() { return $(this).html().replace(re, '$1'); }); // restore the matched symbol (the one \W matched) with $1
}
Here is the updated fiddle.
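For reference, the same capturing-group idea can be sketched on a plain string, outside the editor. The helper names escapeRegExp and replaceEmoticon and the bracketed replacement text are assumptions made for this illustration; they are not part of the original code.

```javascript
// Hypothetical helper: escape regex metacharacters in a literal string.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Replace `emoticon` only when it is not preceded and not followed by a word
// character. The preceding character is captured and put back by the callback.
function replaceEmoticon(text, emoticon, replacement) {
  var re = new RegExp('(^|\\W)' + escapeRegExp(emoticon) + '(?!\\w)', 'g');
  return text.replace(re, function (match, before) {
    return before + replacement;
  });
}

var out1 = replaceEmoticon('so happy :D today', ':D', '[grin]');
var out2 = replaceEmoticon('B) is cool', 'B)', '[cool]');
var out3 = replaceEmoticon('subB) stays', 'B)', '[cool]');
console.log(out1); // so happy [grin] today
console.log(out2); // [cool] is cool
console.log(out3); // subB) stays
```

A callback is used instead of a '$1' replacement string so that a replacement containing $ characters is inserted literally.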
I have to validate a string for the jQuery textcomplete plugin.
My strings are arranged on multiple lines; a single line may have one key = value pair or several separated by commas,
like:
key1=value1, mkey2 = value2;
input3 = value3
The expression I need must match the left operand of the expression
What I did:
/(?:,\s*|^\s|^|,)(?:\s*)([^=]\w*)/
My issue is that instead of just returning
"key1" "mkey2" "input3"
it returns
["key1", "key1"], [", mkey2", "mkey2"] , [" input3", "input3"]
But I'm expecting (actually jQuery.textcomplete is expecting)
["key1"], ["mkey2"] , ["input3"]
JSFIDDLE
EDIT: JS code (previously it was only on jsFiddle)
var items = [
"key_1",
"key_2",
"key",
"halt",
"keybone",
"klingon",
"kabum"
];
$('#myTextArea1').textcomplete(
[{
match: /[^=](\w*)$/,
index: 1, // default = 2
search: function (term, callback) {
term = term.toLowerCase();
callback($.map(items, function (item) {
return item.toLowerCase().indexOf(term) >= 0 ? item : null;
}));
},
replace: function (item) {
return '' + item + ' = ';
}
}]);
I'm not sure how you're calling it, but try this out:
\s*(\w+)\s*=
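As a rough sketch of how this pattern could be applied outside of textcomplete, the snippet below loops over all matches and collects capture group 1. The sample input is the one from the question; the variable names are made up for this example.

```javascript
var input = 'key1=value1, mkey2 = value2;\ninput3 = value3';

// The g flag lets exec() walk through every match; group 1 is the key itself,
// without the surrounding whitespace or the "=".
var keyRe = /\s*(\w+)\s*=/g;
var keys = [];
var m;
while ((m = keyRe.exec(input)) !== null) {
  keys.push(m[1]);
}
console.log(keys); // [ 'key1', 'mkey2', 'input3' ]
```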
This is the RegEx I was looking for:
/(?<=^|,\s|,)(\w+(?=[=]|$))/m
What does it do?
• (?<=^|,\s|,) is a lookbehind that checks that the match is preceded by the start of the line ^, by a comma followed by whitespace ,\s or by a bare comma ,
• (\w+(?=[=]|$)) can be decomposed into two sub-expressions:
  • \w+ matches one or more word characters (letters, digits or underscore); POSIX [:alnum:] is not supported in JS
  • (?=[=]|$) is a lookahead that keeps the match only if the next character is a = or we're at the end of the line $
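A quick way to check this regex is sketched below. It assumes an engine with lookbehind support (ES2018 or later), and the sample input is made up with no whitespace before the = signs, because the lookahead requires = to follow the key immediately.

```javascript
// Same pattern as above, with the g flag added so match() returns every key.
var keyRe = /(?<=^|,\s|,)(\w+(?=[=]|$))/gm;

var sample = 'key1=value1, key2=value2\ninput3=value3';
var found = sample.match(keyRe);
console.log(found); // [ 'key1', 'key2', 'input3' ]

// With whitespace before "=", as in "mkey2 = value2", the lookahead fails
// and that key is skipped.
var spaced = 'key1=value1, mkey2 = value2'.match(keyRe);
console.log(spaced); // [ 'key1' ]
```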
I have this object that maps emoticon symbols to the image file for each one.
Working demo of my code on JSFIDDLE
But using this code, only :) returns the correct smile.png image; the rest of the emoticons are not working.
How do I write a regex that matches each of these symbols and picks the appropriate file for each emoticon?
//Replace Emo with images
function replaceEmoticons(text) {
var emoticons = {
':)' : 'smile.png',
': )' : 'smile.png',
':D' : 'laugh.png',
':->' : 'laugh.png',
':d' : 'laugh-sm.png',
':-)': 'smile-simple.png',
':p': 'tounge.png',
':P': 'tounge-lg.png',
': P': 'tng1.png',
'>-)': 'evil.png',
':(': 'sad.png',
':-(': 'sadd.png',
':-<': 'sadd.png',
':-O': 'surprise.png',
':O': 'sur2.png',
':o': 'sur3.png',
':-o': 'sur3.png',
':-*': 'kiss.png',
':*': 'kiss.png',
':-#': 'angry.png',
':#': 'angry.png',
':$': 'con2.png',
':-$': 'con1.png',
'O.o': 'con2.png',
'o.O': 'con2.png',
':/': 'weird.png',
':x': 'x.png',
':X': 'x.png',
':!': 's.png',
'(:I': 'egg.png',
'^.^': 'kit.png',
'^_^': 'kit.png',
';)': 'wink.png',
';-)': 'wink.png',
":')": 'hc.png',
":'(": 'cry.png',
"|-O": 'yawn.png',
"-_-": 'poke.png',
":|": 'p1.png',
"$_$": 'he.png'
}, url = "images/emo/";
// a simple regex to match the characters used in the emoticons
return text.replace(/[:\-)D]+/g, function (match) {
return typeof emoticons[match] != 'undefined' ?
'<img class="emo" src="'+url+emoticons[match]+'"/>' :
match;
});
}
replaceEmoticons("Hi this is a test string :) with all :P emos working :D");
This regex:
[:\-)D]+
Does not match many of the emoticons in your list. Any character other than :, -, ), or D will prevent it from being recognized.
If you have a list of strings you want to match, you can easily build a regex to match any of them (and nothing else) by escaping each one and joining them together with |. Something like this:
// given a string, return the source for a regular expression that will
// match only that exact string, by escaping any regex metacharacters
// contained within.
RegExp.escape = function(text) {
return text.replace(/[-[\]{}()*+?.,\\^$|#\s]/g, "\\$&");
}
// build a regex that will match all the emoticons. Written out, it looks
// like this: /:\)|: \)|:D|:->|:d|:-\)|:p|:P|..../g
var emoticonRegex =
new RegExp(Object.keys(emoticons).map(RegExp.escape).join('|'), 'g');
Then use that in place of your literal regex:
return text.replace(emoticonRegex, function (match) { ...
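One detail worth keeping in mind with this approach: alternatives are tried from left to right, so if one emoticon is a prefix of another, the shorter one can win. The sketch below sorts the keys by length first. The :)) entry, the file names and the helper names are hypothetical and only serve to show the effect.

```javascript
// Hypothetical helper: escape regex metacharacters in a literal string.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build one alternation from the keys, longest first, so that ":))" is tried
// before its prefix ":)".
function buildEmoticonRegex(emoticons) {
  var keys = Object.keys(emoticons).sort(function (a, b) {
    return b.length - a.length;
  });
  return new RegExp(keys.map(escapeRegExp).join('|'), 'g');
}

var emoticons = { ':)': 'smile.png', ':))': 'laugh.png', ':D': 'grin.png' };
var emoticonRegex = buildEmoticonRegex(emoticons);

function replaceEmoticons(text) {
  // Every match is guaranteed to be a key of `emoticons`.
  return text.replace(emoticonRegex, function (match) {
    return '<img class="emo" src="images/emo/' + emoticons[match] + '"/>';
  });
}

var replaced = replaceEmoticons('ok :) great :)) bye :D');
console.log(replaced);
```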
I have a string in javascript like
"some text #[14cd3:+Seldum Kype] things are going good for #[7f8ef3:+Kerry Williams] so its ok"
From this I want to extract the name and id of the two people, so data like:
[ { id: 14cd3, name : Seldum Kype},
{ id: 7f8ef3, name : Kerry Williams} ]
How can I use a regex to extract this?
please help
var text = "some text #[14cd3:+Seldum Kype] things are going " +
"good for #[7f8ef3:+Kerry Williams] so its ok"
// match() returns null when nothing matches, so fall back to an empty array
var data = (text.match(/#\[.+?\]/g) || []).map(function(m) {
var match = m.substring(2, m.length - 1).split(':+');
return {id: match[0], name: match[1]};
})
// => [ { id: '14cd3', name: 'Seldum Kype' },
// { id: '7f8ef3', name: 'Kerry Williams' } ]
// For demo
document.getElementById('output').innerText = JSON.stringify(data);
<pre id="output"></pre>
Get the id from Group index 1 and name from group index 2.
#\[([a-z\d]+):\+([^\[\]]+)\]
DEMO
Explanation:
# Matches a literal # symbol.
\[ Matches a literal [ symbol.
([a-z\d]+) Captures one or more lowercase letters or digits.
:\+ Matches :+ literally.
([^\[\]]+) Captures one or more characters other than [ or ].
\] A literal ] symbol.
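Put together, the pattern explained above could be used roughly like this. The loop and the variable names are an illustrative sketch; only the regex and the sample string come from the thread.

```javascript
var text = 'some text #[14cd3:+Seldum Kype] things are going ' +
           'good for #[7f8ef3:+Kerry Williams] so its ok';

// Group 1 is the id, group 2 is the name; the g flag lets exec() find each mention.
var mentionRe = /#\[([a-z\d]+):\+([^\[\]]+)\]/g;
var people = [];
var m;
while ((m = mentionRe.exec(text)) !== null) {
  people.push({ id: m[1], name: m[2] });
}
console.log(JSON.stringify(people));
// [{"id":"14cd3","name":"Seldum Kype"},{"id":"7f8ef3","name":"Kerry Williams"}]
```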
Try the following; the key is to properly escape the reserved special symbols:
#\[([\d\w]+):\+([\s\w]+)\]
I want to alter a text string with a regular expression, removing every non-digit character except a + sign, and keeping the + only if it's the first character.
+23423424dfgdgf234 --> +23423424234
2344 234fsdf 9 --> 23442349
4+4 --> 44
etc
The replacing of 'everything but' is pretty simple:
/[^\+|\d]/gi but that also removes the +-sign as a first character.
how can I alter the regexp to get what I want?
If it matters: I'm using the regexp in javascript's str.replace() function.
I would do it in two steps: first remove everything that must be removed apart from the +, then remove every + that isn't the first character:
var str2 = str1.replace(/[^\d\+]+/g,'').replace(/(.)\++/g,"$1")
You'd have to do this in two steps:
// pass one - remove all non-digits or plus
var p1 = str.replace(/[^\d+]+/g, '');
// remove plus if not first
var p2 = p1.length ? p1[0] + p1.substr(1).replace(/\+/g, '') : '';
console.log(p2);
You can replace the following regex
[^\d+] with ''
and then on the resulting string, replace
(.)\+ with '$1'
Demo: http://regex101.com/r/eT6uF6
Updated: http://jsfiddle.net/QVd7R/2/
You could combine the above suggested 2 replaces in a single RegExp:
var numberWithSign = /(^\+)|[^\d]+/g;
var tests =
[
{"input" : "+23423424dfgdgf234", "output" : "+23423424234"},
{"input" : "2344 234fsdf 9" , "output" : "23442349"},
{"input" : "4+4" , "output" : "44"},
{"input" : "+a+4" , "output" : "+4"},
{"input" : "+a+b" , "output" : "+"},
{"input" : "++12" , "output" : "+12"}
];
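// Run every test case; returns true if all pass, otherwise a string with the
// first actual output and expected output that differ.
function runTests() {
  for (var index = 0; index < tests.length; index++) {
    var test = tests[index];
    var testResult = test.input.replace(numberWithSign,"$1");
    if (testResult != test.output) {
      return testResult + "\n" + test.output;
    }
  }
  return true;
}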
Basically the first part (^\+) matches only a + sign at the beginning of the string and captures it as $1, so when you replace this match with $1 it keeps the plus sign at the beginning of the string. If it does not match, the next part of the regexp, [^\d]+, takes effect and replaces all non-digits with an empty string (as there is nothing in the value of $1).
Try this:
var newString = Yourstring.match(/(^\+)?\d*/g).join("");
I have an object with:
emoticons = {
':-)' : 'smile1.gif',
':)' : 'smile2.gif',
':D' : 'smile3.gif'
}
Then I have a variable with the text:
var text = 'this is a simple test :)';
and a variable with the url of the website
var url = "http://www.domain.com/";
How do I write a function that replaces the symbols with their images?
The <img> tag result should be:
<img src="http://www.domain.com/smile2.gif" />
(I have to concatenate the url variable to the name of the image).
Thank you very much!
Another approach:
function replaceEmoticons(text) {
var emoticons = {
':-)' : 'smile1.gif',
':)' : 'smile2.gif',
':D' : 'smile3.gif'
}, url = "http://www.domain.com/";
// a simple regex to match the characters used in the emoticons
return text.replace(/[:\-)D]+/g, function (match) {
return typeof emoticons[match] != 'undefined' ?
'<img src="'+url+emoticons[match]+'"/>' :
match;
});
}
replaceEmoticons('this is a simple test :)');
// "this is a simple test <img src="http://www.domain.com/smile2.gif"/>"
Edit: @pepkin88 made a really good suggestion: build the regular expression based on the property names of the emoticons object.
It can be easily done, but we have to escape meta-characters if we want this to work properly.
The escaped patterns are stored in an array, which is later used to build the regular expression with the RegExp constructor by joining all the patterns with the | metacharacter.
function replaceEmoticons(text) {
var emoticons = {
':-)' : 'smile1.gif',
':)' : 'smile2.gif',
':D' : 'smile3.gif',
':-|' : 'smile4.gif'
}, url = "http://www.domain.com/", patterns = [],
metachars = /[[\]{}()*+?.\\|^$\-,&#\s]/g;
// build a regex pattern for each defined property
for (var i in emoticons) {
if (emoticons.hasOwnProperty(i)){ // escape metacharacters
patterns.push('('+i.replace(metachars, "\\$&")+')');
}
}
// build the regular expression and replace
return text.replace(new RegExp(patterns.join('|'),'g'), function (match) {
return typeof emoticons[match] != 'undefined' ?
'<img src="'+url+emoticons[match]+'"/>' :
match;
});
}
replaceEmoticons('this is a simple test :-) :-| :D :)');
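// Replace every occurrence of each emoticon; text.replace() with a string
// pattern would only replace the first one.
for ( var smile in emoticons )
{
text = text.split(smile).join('<img src="' + url + emoticons[smile] + '" />');
}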
Using a regex with an array of find replace elements works well.
var emotes = [
[':\\)', 'happy.png'], // '\\)' in the string becomes \) in the regex
[':\\(', 'sad.png']
];
function applyEmotesFormat(body){
for(var i = 0; i < emotes.length; i++){
body = body.replace(new RegExp(emotes[i][0], 'gi'), '<img src="emotes/' + emotes[i][1] + '">');
}
return body;
}