JavaScript - Why does this code alert a message? - javascript

I don't know much about JavaScript, but I found this code as part of some game engine code. I tried to inspect it because I noticed that it alerts a message, and I really cannot figure out how. Here is the minimal code (I reduced it, extracted it from the original script, and changed the variable names to single letters):
var a = '͏‪͏‪‪‪‪‪͏͏‪‪‪‪͏‪͏͏‪͏͏‪‪‪͏‪͏‪‪͏‪‪͏‪‪‪‪‪‪͏͏‪͏‪‪͏‪‪͏͏‪͏‪͏͏͏͏‪‪‪͏͏͏͏͏‪‪͏‪‪͏‪͏‪‪‪͏͏͏‪͏‪‪‪͏‪‪‪͏‪‪‪͏‪͏͏͏‪‪‪‪͏‪‪͏‪‪͏‪‪‪͏͏‪‪‪‪͏‪‪͏‪‪‪‪‪͏͏͏‪‪‪‪‪͏‪͏‪‪‪‪‪͏͏͏‪‪‪‪͏‪‪͏‪‪‪͏‪͏͏͏‪‪‪‪‪͏‪͏‪‪‪‪͏͏‪͏‪‪‪͏͏͏͏͏‪‪‪‪‪͏͏͏‪‪‪‪‪͏‪͏‪‪͏‪‪͏‪͏‪‪‪͏͏͏‪͏‪‪‪͏‪‪‪͏‪‪‪‪‪͏͏͏‪‪‪‪͏‪‪͏‪͏‪‪‪͏‪͏‪͏‪‪‪͏‪͏͏‪͏‪͏͏͏͏͏‪͏‪͏͏͏͏‪‪‪͏‪͏‪͏‪‪‪͏͏͏‪͏‪‪͏‪‪‪͏͏‪‪‪͏͏‪͏͏‪‪‪‪‪͏͏͏‪‪‪‪‪͏‪͏‪‪‪‪͏͏‪͏‪‪‪͏‪‪͏͏‪‪‪‪͏‪͏͏‪‪‪͏‪‪‪͏‪͏‪‪‪͏‪͏͏‪͏‪͏‪‪͏‪‪‪͏͏͏‪͏‪‪‪͏͏‪‪͏‪‪‪͏͏‪‪͏‪‪‪‪͏‪‪͏‪‪‪͏͏‪‪͏‪‪‪‪͏‪‪͏‪‪‪‪͏‪‪͏‪‪‪͏͏‪‪͏‪‪‪‪͏‪‪͏‪‪‪‪͏‪‪͏‪‪‪‪͏‪‪͏‪‪͏‪‪͏‪͏‪‪‪‪͏‪͏͏‪‪‪͏‪‪‪͏‪͏‪‪‪͏‪͏͏‪͏‪͏͏‪͏͏‪͏‪͏͏͏͏͏‪͏‪͏͏‪͏‪‪‪‪‪‪͏͏‪͏‪‪‪͏‪͏‪͏‪‪͏‪‪͏͏‪͏‪͏͏͏͏‪‪‪͏‪͏‪͏‪͏‪‪͏‪‪͏‪‪‪‪‪͏‪͏‪‪‪‪͏͏‪͏‪͏‪‪͏‪‪͏‪‪‪‪͏͏͏͏‪͏‪‪‪͏‪͏‪‪‪‪‪͏‪͏‪͏‪‪͏‪‪͏‪‪͏‪‪‪‪͏‪͏‪‪‪͏‪͏͏‪͏‪͏͏‪͏‪‪‪͏͏͏‪͏‪‪‪͏‪‪‪͏‪‪͏‪‪‪͏͏‪‪‪‪‪͏͏͏‪‪‪‪‪͏‪͏‪‪‪͏͏‪͏͏‪͏‪‪‪͏‪͏‪͏‪‪‪͏‪͏‪‪‪‪‪͏‪͏͏‪͏‪͏‪‪͏‪‪͏‪‪‪‪͏‪‪‪‪‪͏͏͏‪‪‪‪͏͏‪͏͏‪͏͏͏͏‪͏‪‪‪͏‪͏‪͏‪‪‪͏͏͏‪͏͏‪͏‪͏‪‪͏͏‪͏͏͏͏‪͏‪‪‪‪‪͏‪͏‪‪‪‪͏‪͏͏‪͏‪‪͏‪‪͏͏‪͏͏͏͏‪͏‪‪‪‪‪͏‪͏‪‪‪‪͏‪‪͏‪‪‪‪͏͏‪͏‪͏‪‪‪͏‪͏͏‪͏‪͏‪‪͏͏‪͏‪͏͏͏͏‪‪‪‪͏͏͏͏͏‪͏͏͏͏‪͏‪‪͏‪‪‪‪͏‪‪‪͏‪‪‪͏͏‪͏‪͏‪‪͏‪͏‪‪͏‪‪͏‪‪‪‪‪͏‪͏‪͏‪‪‪͏‪͏‪‪‪͏‪‪͏͏‪‪‪‪͏͏͏͏͏‪͏‪͏‪‪͏‪͏‪‪͏‪‪͏‪‪‪‪‪‪͏͏͏‪͏‪͏͏‪͏‪‪‪͏‪͏‪͏‪‪‪‪͏͏͏͏‪‪‪‪‪͏‪͏͏‪͏‪͏͏‪͏‪‪‪͏‪͏͏͏‪‪‪‪͏͏‪͏‪‪͏‪‪‪‪͏‪͏‪‪͏‪‪͏‪‪͏‪‪‪‪͏‪͏‪‪‪͏‪͏‪‪‪‪‪͏‪͏͏‪͏‪͏͏‪͏‪͏‪‪͏‪‪͏‪‪‪͏͏‪‪͏‪͏‪‪‪͏‪͏͏‪͏‪͏‪‪͏‪‪‪‪͏‪͏͏‪‪‪͏͏‪͏͏‪‪‪‪‪͏͏͏͏‪͏͏͏͏‪͏‪‪‪‪‪͏‪͏‪‪‪‪͏‪‪͏‪‪‪‪͏‪‪͏‪‪‪‪‪͏͏͏‪‪‪‪͏‪‪͏‪‪‪͏͏͏‪͏‪͏‪‪͏‪‪͏‪‪‪‪͏‪‪͏‪‪‪͏͏͏͏͏‪͏‪‪͏‪‪͏‪‪‪‪‪‪͏͏‪‪‪‪‪͏‪͏͏‪͏‪͏͏‪͏‪‪‪‪͏‪͏͏‪‪‪‪͏‪‪͏‪‪͏‪‪‪‪͏‪͏‪‪͏‪‪͏‪͏‪‪‪͏‪͏‪‪͏‪‪‪͏͏‪‪‪‪͏͏‪͏‪‪‪͏‪͏‪͏‪‪‪‪͏‪͏͏‪‪͏‪‪͏‪͏‪‪‪‪͏͏‪͏͏‪͏͏͏͏‪͏‪‪‪͏‪‪͏͏͏‪͏͏‪‪‪͏͏‪‪‪͏‪‪͏‪‪͏͏‪‪͏͏‪‪͏‪‪‪‪͏‪‪‪͏͏‪͏͏͏‪͏‪͏͏͏͏‪‪͏‪͏͏‪͏͏‪͏͏͏͏͏͏‪‪͏‪‪‪‪͏‪‪͏͏‪‪͏͏͏‪͏͏‪‪‪͏‪‪͏‪‪͏‪͏‪‪͏‪‪‪͏͏‪‪͏‪‪‪‪͏‪‪‪͏͏͏͏͏‪‪‪͏͏͏‪͏‪‪‪͏͏‪͏͏‪‪‪͏͏‪‪͏‪‪‪͏‪͏͏͏‪‪‪͏‪͏‪͏‪‪‪͏‪‪͏͏‪‪‪͏‪‪‪͏‪‪‪‪͏͏͏͏‪‪‪‪͏͏‪͏‪‪‪‪͏‪͏͏‪‪‪‪͏‪‪͏‪‪‪‪‪͏͏͏‪‪‪‪‪͏‪͏‪‪‪‪‪‪͏͏͏‪͏͏‪‪‪͏͏‪͏‪͏͏‪͏‪‪‪͏‪‪‪͏‪‪͏‪͏͏‪͏‪‪‪͏‪͏͏͏‪‪͏‪͏͏͏͏͏‪͏‪͏͏͏͏‪͏‪‪‪‪‪͏͏‪͏‪‪‪͏͏‪‪‪͏͏‪‪͏‪‪‪͏͏͏͏͏‪‪͏‪‪͏͏͏‪‪͏‪͏͏‪͏‪‪‪͏‪͏͏͏͏‪͏‪͏͏͏͏‪‪͏‪͏͏‪͏͏‪͏‪͏͏‪͏͏‪͏‪͏͏‪͏‪͏‪‪‪‪‪͏͏‪‪‪‪͏‪͏‪‪͏‪͏‪͏͏‪‪͏‪‪‪‪͏‪‪͏‪͏͏‪͏‪‪͏‪‪‪͏͏͏‪͏‪͏͏͏͏‪‪‪͏͏͏͏͏‪‪͏‪‪‪‪͏‪‪‪͏͏͏͏͏͏‪͏‪͏͏͏͏͏‪͏‪͏͏‪͏͏‪͏‪͏͏‪͏͏‪‪‪͏‪‪͏‪‪͏͏‪͏‪͏‪‪‪͏‪‪͏͏‪‪͏͏͏͏‪͏‪‪͏‪‪͏͏͏͏‪͏‪͏͏͏͏‪͏‪‪‪‪‪͏͏‪͏‪͏͏‪';
var b = a.match(/.{8}/g);
var c = b.map(a => [...a].map(a => a == '‪' | 0));
var d = c.map(a => parseInt(a.join``, 2).toString(16));
var e = d.map(a => eval(`'\\x${a.padStart(2, 0)}'`));
var f = eval(e.join``);
I'm trying to understand how they manage to alert a message. It alerts the number 12345, but how? I see some evals here, so I suppose they are generating code on the fly and executing it, but even with a debugger I couldn't find an explanation.
I tried this code in jsFiddle and it still works. In Node.js it throws the error alert is not defined, so I am pretty sure that all this code does is alert a message.
What trick did they use here? How are they building and evaling the code, and how do they manage to alert a message? Is this some sort of encryption or what?
My question has absolutely nothing to do with this question.

The code is all there, hidden in the variable a. No, it's not an empty string; it's a string consisting of 1888 invisible characters, each either \u034f or \u202a to be precise. So this is in fact just a disguised binary encoding.
The code part
var b = a.match(/.{8}/g);
var c = b.map(a => [...a].map(a => a == '‪' | 0));
var d = c.map(a => parseInt(a.join``, 2).toString(16));
breaks them in chunks of 8, then converts each chunk from an array of characters to an array of booleans (or rather, the integers 0 and 1) - notice that it compares the character against the invisible \u202a, and then converts each array-of-8-booleans (oh look, an octet!) into an actual byte and gets a hex representation of it. Here's the hex string (d.join('')):
5f3d275b7e5b28706d7177747b6e7b7c7d7c7b747d79707c7d6d71777c7b5d5d282875716e727c7d79767a775d2b7173737b737b7b737b7b7b6d7a775d2928297e5d5b28755b7d795b785d7d5b6f5d2971776e7c7d725d5d7d2b6f7c792175712b217d7a5b217d7b795d2b2878216f772b5b7d5d76782b5b7e2975787d2974796f5b6f5d7d295b735d2b7a727c217d7b7b7c7b715b7b705b7e7d297a7b6f5b5d6e79757a6d792176273b666f722869206f66276d6e6f707172737475767778797a7b7c7d7e272977697468285f2e73706c6974286929295f3d6a6f696e28706f702829293b6576616c285f29
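The decoding steps above can be reproduced on a small scale. The following is only a sketch under stated assumptions: the helper names encodeInvisible and decodeInvisible are hypothetical (they do not appear in the original script), the sample payload is made up, and only characters with codes below 256 are handled.

```javascript
// Hypothetical helpers illustrating the 8-bits-per-character scheme.
const ZERO = '\u034f'; // treated as bit 0 (anything that is not \u202a)
const ONE = '\u202a';  // treated as bit 1

// Assumption: every character of `text` has a char code below 256.
function encodeInvisible(text) {
  return [...text]
    .map(ch => ch.charCodeAt(0).toString(2).padStart(8, '0'))
    .join('')
    .split('')
    .map(bit => (bit === '1' ? ONE : ZERO))
    .join('');
}

function decodeInvisible(hidden) {
  const chunks = hidden.match(/.{8}/g) || []; // groups of 8 "bits"
  return chunks
    .map(chunk => {
      const bits = [...chunk].map(ch => (ch === ONE ? '1' : '0')).join('');
      return String.fromCharCode(parseInt(bits, 2)); // one byte per chunk
    })
    .join('');
}

const hidden = encodeInvisible('alert(1)');
console.log(hidden.length);           // 64 invisible characters
console.log(decodeInvisible(hidden)); // alert(1)
```

The decoder here goes straight from the bit string to a character with String.fromCharCode, skipping the intermediate hex string and the inner eval that the original uses.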
The part
d.map(a => eval(`'\\x${a.padStart(2, 0)}'`));
has each of them parsed into a character, using a backslash escape. String.fromCharCode would have been the simpler choice. Also the padStart is not even required here, given that none of the bytes is a control character with a byte value less than 16. Maybe this would've been more familiar:
"\x5f\x3d\x27\x5b\x7e\x5b\x28\x70\x6d\x71\x77\x74\x7b\x6e\x7b\x7c\x7d\x7c\x7b\x74\x7d\x79\x70\x7c\x7d\x6d\x71\x77\x7c\x7b\x5d\x5d\x28\x28\x75\x71\x6e\x72\x7c\x7d\x79\x76\x7a\x77\x5d\x2b\x71\x73\x73\x7b\x73\x7b\x7b\x73\x7b\x7b\x7b\x6d\x7a\x77\x5d\x29\x28\x29\x7e\x5d\x5b\x28\x75\x5b\x7d\x79\x5b\x78\x5d\x7d\x5b\x6f\x5d\x29\x71\x77\x6e\x7c\x7d\x72\x5d\x5d\x7d\x2b\x6f\x7c\x79\x21\x75\x71\x2b\x21\x7d\x7a\x5b\x21\x7d\x7b\x79\x5d\x2b\x28\x78\x21\x6f\x77\x2b\x5b\x7d\x5d\x76\x78\x2b\x5b\x7e\x29\x75\x78\x7d\x29\x74\x79\x6f\x5b\x6f\x5d\x7d\x29\x5b\x73\x5d\x2b\x7a\x72\x7c\x21\x7d\x7b\x7b\x7c\x7b\x71\x5b\x7b\x70\x5b\x7e\x7d\x29\x7a\x7b\x6f\x5b\x5d\x6e\x79\x75\x7a\x6d\x79\x21\x76\x27\x3b\x66\x6f\x72\x28\x69\x20\x6f\x66\x27\x6d\x6e\x6f\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x27\x29\x77\x69\x74\x68\x28\x5f\x2e\x73\x70\x6c\x69\x74\x28\x69\x29\x29\x5f\x3d\x6a\x6f\x69\x6e\x28\x70\x6f\x70\x28\x29\x29\x3b\x65\x76\x61\x6c\x28\x5f\x29"
This string is the one evaled in the last line. But surprise, the contents of that string are just
_='[~[(pmqwt{n{|}|{t}yp|}mqw|{]]((uqnr|}yvzw]+qss{s{{s{{{mzw])()~][(u[}y[x]}[o])qwn|}r]]}+o|y!uq+!}z[!}{y]+(x!ow+[}]vx+[~)ux})tyo[o]})[s]+zr|!}{{|{q[{p[~})z{o[]nyuzmy!v';for(i of'mnopqrstuvwxyz{|}~')with(_.split(i))_=join(pop());eval(_)
So what does - still obfuscated - code do?
var _='[~[(pmqwt{n{|}|{t}yp|}mqw|{]]((uqnr|}yvzw]+qss{s{{s{{{mzw])()~][(u[}y[x]}[o])qwn|}r]]}+o|y!uq+!}z[!}{y]+(x!ow+[}]vx+[~)ux})tyo[o]})[s]+zr|!}{{|{q[{p[~})z{o[]nyuzmy!v';
for (var i of 'mnopqrstuvwxyz{|}~')
  with (_.split(i))
    _=join(pop());
eval(_)
Removing the with magic, we get
for (var i of 'mnopqrstuvwxyz{|}~') {
  let temp = _.split(i);
  _ = temp.join(temp.pop());
}
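To see the rewritten loop at work on something readable, here is a minimal sketch on a made-up packed string. The function name unpack, the sample string and the key characters Q and Z are all assumptions for illustration; they are not taken from the original code.

```javascript
// Hypothetical miniature of the split/pop/join unpacking loop.
function unpack(packed, keys) {
  let text = packed;
  for (const key of keys) {
    const parts = text.split(key);   // cut the string at every occurrence of key
    const replacement = parts.pop(); // the last piece is what key stands for
    text = parts.join(replacement);  // glue the rest back with the replacement
  }
  return text;
}

// Q stands for "ha", Z stands for "ha ha" (stored as "Q Q" until Q is expanded).
console.log(unpack('Z, Z!ZQ QQha', 'QZ')); // ha ha, ha ha!
```

Each key character must not occur in the final text, which is presumably why the original uses the characters m through ~ as keys.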
So for each of these characters from m to ~, it splits _ by that character, takes the last part out, and joins the rest back together using that part as the separator, effectively
replacing m by y!v,
replacing n by yuz,
replacing o by [],
replacing p by [~})z{,
replacing q by [{,
replacing r by |!}{{|{,
replacing s by ]+z,
replacing t by y[][[]]})[,
replacing u by x}),
replacing v by x+[~),
replacing w by +[}],
replacing x by ![],
replacing y by ]+(,
replacing z by [!}{,
replacing { by +!},
replacing | by ]+(!![]})[,
replacing } by +[],
replacing ~ by ][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]
and after all that we get for _ to be evaled the code
[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]][([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+([][[]]+[])[+!+[]]+(![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[+!+[]]+([][[]]+[])[+[]]+([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(!![]+[])[+!+[]]]((![]+[])[+!+[]]+(![]+[])[!+[]+!+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]+(!![]+[])[+[]]+(![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[!+[]+!+[]+[+[]]]+[+!+[]]+[!+[]+!+[]]+[!+[]+!+[]+!+[]]+[!+[]+!+[]+!+[]+!+[]]+[!+[]+!+[]+!+[]+!+[]+!+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[!+[]+!+[]+[+[]]])()
Now doesn't that look familiar? It's good old JSFuck!
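For readers who have not met this style before, the sketch below decodes the first few building blocks of the expression above by hand. The variable names are mine, and the final call uses 'return 12345' instead of an alert so that it also runs outside a browser; that substitution is an assumption for illustration only.

```javascript
// Each letter is picked out of the string form of false, true or undefined.
const f = (![] + [])[+[]];                    // "false"[0]
const i = ([![]] + [][[]])[+!+[] + [+[]]];    // "falseundefined"[10]
const l = (![] + [])[!+[] + !+[]];            // "false"[2]
const t = (!![] + [])[+[]];                   // "true"[0]
const e = (!![] + [])[!+[] + !+[] + !+[]];    // "true"[3]
const r = (!![] + [])[+!+[]];                 // "true"[1]

const word = f + i + l + t + e + r;
console.log(word); // filter

// [][word] is Array.prototype.filter, and its constructor is Function,
// which turns a string of source code into a callable function.
const make = [][word]['constructor'];
console.log(make === Function); // true
console.log(make('return 12345')()); // 12345
```

The full expression follows the same pattern: it spells "filter", then "constructor", then the source text to run, and finally calls the resulting function.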
I found this code as a part of some game engine code
I doubt it. It looks much more like a submission to a code obfuscation contest. However, it doesn't appear to be hand-crafted; more likely someone just blindly chained multiple obfuscation tools together.

Related

Line break javascript doesn't insert new line "\n"

I've been trying to wrap my head around this whole line break thing, and I've searched and researched my soul out here. I can't seem to find an answer to my specific problem here. I want to fetch the input from a textarea and put it in an array with new lines. All it does is put a comma between the words, and it seems it only adds multiple commas to where the line breaks are supposed to be. When I add < br / >, all it does is exclude the letter b from the text.
function Wordscount() {
  var pText = document.getElementById("myTextarea").value.split(/[\n <>.,\?]/);
  document.getElementById("text").innerHTML = pText;
}
It basically just looks like this when I test it :
I am new to Javascript, and I wouldn't have gone for this solution unless this was the method our professor told us to use. I'm really frustrated here, and I'm just trying to get the hang of this.
Splitting a string turns it into an array. Treating an array as a string is equivalent to calling yourArray.join(','). Since you don't want to add commas, don't just treat the array as a string.
If you want to put HTML line breaks in, then you need to do so explicitly.
var array_of_lines = document.getElementById("myTextarea").value.split("\n");
var string_of_html = array_of_lines.join("<br>");
document.getElementById("text").innerHTML = string_of_html;
If you don't want HTML special characters to be treated as having special meaning, then convert each line to a text node and append it instead.
var array_of_lines = document.getElementById("myTextarea").value.split("\n");
var target = document.getElementById("text");
target.innerHTML = ""; // clear any previous output
array_of_lines.forEach(function (text) {
  // A text node is inserted literally, so < and & are not parsed as HTML
  target.appendChild(document.createTextNode(text));
  target.appendChild(document.createElement("br"));
});

JavaScript remove ZERO WIDTH SPACE (unicode 8203) from string

I'm writing some javascript that processes website content. My efforts are being thwarted by SharePoint text editor's tendency to put the "zero width space" character in the text when the user presses backspace.
The character's unicode value is 8203, or B200 in hexadecimal. I've tried to use the default "replace" function to get rid of it. I've tried many variants, none of them worked:
var a = "o​m"; //the invisible character is between o and m
var b = a.replace(/\u8203/g,'');
= a.replace(/\uB200/g,'');
= a.replace("\\uB200",'');
and so on and so forth. I've tried quite a few variations on this theme. None of these expressions work (tested in Chrome and Firefox). The only thing that works is typing the actual character in the expression:
var b = a.replace("​",''); //it's there, believe me
This poses potential problems. The character is invisible, so that line in itself doesn't make sense. I can get around that with comments. But if the code is ever reused and the file is saved using a non-Unicode encoding (or when it's deployed to SharePoint, where there's no guarantee it won't mess up the encoding), it will stop working. Is there a way to write this using the Unicode notation instead of the character itself?
[My ramblings about the character]
In case you haven't met this character, (and you probably haven't, seeing as it's invisible to the naked eye, unless it broke your code and you discovered it while trying to locate the bug) it's a real a-hole that will cause certain types of pattern matching to malfunction. I've caged the beast for you:
[​] <- careful, don't let it escape.
If you want to see it, copy those brackets into a text editor and then iterate your cursor through them. You'll notice you'll need three steps to pass what seems like 2 characters, and your cursor will skip a step in the middle.
The number in a unicode escape should be in hex, and the hex for 8203 is 200B (which is indeed a Unicode zero-width space), so:
var b = a.replace(/\u200B/g,'');
Live Example:
var a = "o​m"; //the invisible character is between o and m
var b = a.replace(/\u200B/g,'');
console.log("a.length = " + a.length); // 3
console.log("a === 'om'? " + (a === 'om')); // false
console.log("b.length = " + b.length); // 2
console.log("b === 'om'? " + (b === 'om')); // true
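The decimal-to-hex conversion behind that fix can be checked directly. This is a small sketch that builds the character from its code, so no invisible character needs to be typed; the variable names are only illustrative.

```javascript
const codePoint = 8203;                       // decimal value from the question
const hex = codePoint.toString(16);           // hexadecimal digits of 8203
const zwsp = String.fromCharCode(codePoint);  // the zero width space itself

console.log(hex);                    // 200b
console.log(zwsp === '\u200B');      // true
console.log(/\u8203/.test(zwsp));    // false, \u8203 names a different character
console.log(('o' + zwsp + 'm').replace(/\u200B/g, '')); // om
```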
The accepted answer didn't work for my case.
But this one did:
text.replace(/(^[\s\u200b]*|[\s\u200b]*$)/g, '')

JS / RegEx to remove characters grouped within square braces

I hope I can explain myself clearly here and that this is not too much of a specific issue.
I am working on some javascript that needs to take a string, find instances of chars between square brackets, store any returned results and then remove them from the original string.
My code so far is as follows:
parseLine : function(raw)
{
var arr = [];
var regex = /\[(.*?)]/g;
var arr;
while((arr = regex.exec(raw)) !== null)
{
console.log(" ", arr);
arr.push(arr[1]);
raw = raw.replace(/\[(.*?)]/, "");
console.log(" ", raw);
}
return {results:arr, text:raw};
}
This seems to work in most cases. If I pass in the string [id1]It [someChar]found [a#]an [id2]excellent [aa]match then it returns all the chars from within the square brackets and the original string with the bracketed groups removed.
The problem arises when I use the string [id1]It [someChar]found [a#]a [aa]match.
It seems to fail when only a single letter (and space?) follows a bracketed group and starts missing groups as you can see in the log if you try it out. It also freaks out if i use groups back to back like [a][b] which I will need to do.
I'm guessing this is my RegEx - begged and borrowed from various posts here as I know nothing about it really - but I've had no luck fixing it and could use some help if anyone has any to offer. A fix would be great but more than that an explanation of what is actually going on behind the scenes would be awesome.
Thanks in advance all.
You could use the replace method with a function to simplify the code and run the regexp only once:
function parseLine(raw) {
  var results = [];
  var parsed = raw.replace(/\[(.*?)\]/g, function(match,capture) {
    results.push(capture);
    return '';
  });
  return { results : results, text : parsed };
}
The problem is that the lastIndex property of the regex /\[(.*?)]/g is not reset between calls, since the regex is declared as global. When a regex has the global flag g, its lastIndex property marks the position at which the next match attempt starts, and you are expected to feed the same string to RegExp.exec() (explicitly, or implicitly via RegExp.test() for example) until no more matches can be found. Either that, or you reset lastIndex to 0 before feeding in a new input.
Since your code reassigns the variable raw on every iteration, the next match attempt starts from a lastIndex that belongs to the previous string.
The problem will be solved when you remove the g flag from your regex. Or you could use the solution proposed by Tibos, where you supply a function to String.replace() to do the replacement and extract the capturing group at the same time.
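The stale lastIndex can be observed in isolation. The snippet below is a minimal sketch using the back-to-back groups [a][b] mentioned in the question; the variable names are assumptions for illustration.

```javascript
const re = /\[(.*?)\]/g;
let raw = '[a][b]text';

const first = re.exec(raw);          // matches "[a]"
const indexAfterFirst = re.lastIndex; // search position is now just past "[a]"

// Removing the match shortens the string, but lastIndex is left untouched.
raw = raw.replace(/\[(.*?)\]/, '');  // raw is now "[b]text"

const second = re.exec(raw);         // starts past "[b]", so nothing is found

console.log(first[1]);        // a
console.log(indexAfterFirst); // 3
console.log(second);          // null
```

Setting re.lastIndex = 0 after each replacement, or dropping the g flag, makes the second call find "[b]".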
You need to escape the last bracket: \[(.*?)\].

Can someone explain what this regex do?

It's part of some code where JavaScript should watch a price and match if it's lower than required, but I don't understand regex very well and it's obvious that the error is in there.
So on a website I have a price like
<div class="item_price_now"> $ 1,34 </div>
And on javascript part code looks like this
var maxprice = '0.98';
var itemprice = document.getElementByClassName('item_price_now');
var i = 0;
var currentprice = itemprice[i].innerHTML.replace(/\s+/g, ' ');
currentprice = currentprice.substring(2);
if (currentprice > maxprice)
{ do some code }
else
{ do some other code }
But this doesn't work. I assume that part of the error is in the regex; with this I don't get any values. I tried to change it to something like this
(\S+\w)
and it outputs something (actually I get an output of 1,34), but I still can't match it against the maxprice variable.
Can someone explain to me what the regex above means, or at least point me in some direction? Thanks.
/\s+/g means "match any run of one or more whitespace characters (spaces, tabs, line breaks), everywhere in the string".
Hence it replaces every run of whitespace with a single space.
It seems that your problem is that you use locale-formatted strings to describe your values. Both operands are strings, so > compares them character by character rather than numerically, and even an explicit numeric conversion would only work for 0.98: 1,34 cannot be converted by JS, because a comma is not accepted as a decimal separator.
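One way to make the comparison numeric is sketched below. The helper name parsePrice is hypothetical, and the sketch assumes that the comma is always the decimal separator and that the price contains no thousands separators.

```javascript
// Hypothetical helper: turn text such as " $ 1,34 " into the number 1.34.
// Assumption: the comma is the decimal separator; no thousands separators.
function parsePrice(text) {
  const cleaned = text
    .replace(/[^0-9,.]/g, '') // drop the currency sign and whitespace
    .replace(',', '.');       // use a dot as the decimal separator
  return parseFloat(cleaned);
}

const maxprice = parseFloat('0.98');
const currentprice = parsePrice(' $ 1,34 ');

console.log(currentprice);            // 1.34
console.log(currentprice > maxprice); // true
```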

Find longest repeating substring in JavaScript using regular expressions

I'd like to find the longest repeating string within a string, implemented in JavaScript and using a regular-expression based approach.
I have a PHP implementation that, when directly ported to JavaScript, doesn't work.
The PHP implementation is taken from an answer to the question "Find longest repeating strings?":
preg_match_all('/(?=((.+)(?:.*?\2)+))/s', $input, $matches, PREG_SET_ORDER);
This will populate $matches[0][X] (where X is the length of $matches[0]) with the longest repeating substring to be found in $input. I have tested this with many input strings and am confident the output is correct.
The closest direct port in JavaScript is:
var matches = /(?=((.+)(?:.*?\2)+))/.exec(input);
This doesn't give correct results:
input             Expected result    matches[0][X]
======================================================
inputinput        input              input
7inputinput       input              input
inputinput7       input              input
7inputinput7      input              7
XXinputinputYY    input              XX
I'm not familiar enough with regular expressions to understand what the regular expression used here is doing.
There are certainly algorithms I could implement to find the longest repeating substring. Before I attempt to do that, I'm hoping a different regular expression will produce the correct results in JavaScript.
Can the above regular expression be modified such that the expected output is returned in JavaScript? I accept that this may not be possible in a one-liner.
Javascript matches only return the first match -- you have to loop in order to find multiple results. A little testing shows this gets the expected results:
function maxRepeat(input) {
  var reg = /(?=((.+)(?:.*?\2)+))/g;
  var sub = ""; //somewhere to stick temp results
  var maxstr = ""; // our maximum length repeated string
  reg.lastIndex = 0; // because reg previously existed, we may need to reset this
  sub = reg.exec(input); // find the first repeated string
  while (!(sub == null)){
    if ((!(sub == null)) && (sub[2].length > maxstr.length)){
      maxstr = sub[2];
    }
    sub = reg.exec(input);
    reg.lastIndex++; // start searching from the next position
  }
  return maxstr;
}
// I'm logging to console for convenience
console.log(maxRepeat("aabcd")); //a
console.log(maxRepeat("inputinput")); //input
console.log(maxRepeat("7inputinput")); //input
console.log(maxRepeat("inputinput7")); //input
console.log(maxRepeat("7inputinput7")); //input
console.log(maxRepeat("xxabcdyy")); //x
console.log(maxRepeat("XXinputinputYY")); //input
Note that for "xxabcdyy" you only get "x" back, as it returns the first string of maximum length.
It seems JS regexes are a bit weird. I don't have a complete answer, but here's what I found.
Although I thought they did the same thing re.exec() and "string".match(re) behave differently. Exec seems to only return the first match it finds, whereas match seems to return all of them (using /g in both cases).
On the other hand, exec seems to work correctly with ?= in the regex whereas match returns all empty strings. Removing the ?= leaves us with
re = /((.+)(?:.*?\2)+)/g
Using that
"XXinputinputYY".match(re);
returns
["XX", "inputinput", "YY"]
whereas
re.exec("XXinputinputYY");
returns
["XX", "XX", "X"]
So at least with match you get inputinput as one of your values. Obviously, this neither pulls out the longest, nor removes the redundancy, but maybe it helps nonetheless.
One other thing, I tested in firebug's console which threw an error about not supporting $1, so maybe there's something in the $ vars worth looking at.
