Template literal, weird output - javascript

I'm working with some lab equipment using ASTM over TCP/IP. Getting some weird behavior. Using just Node and the net package.
socket.on('data', data => {
  let str = data.toString('ascii');
  console.log(`the string ---- ${str}`);
  if (str === ENQ) {
    socket.write(ACK);
  } else {
    console.log(str);
  }
});
outputs (given correct input):
E1 string ---- 1H|\^&|||1^Analyzer 1^6.0|||||||P||20201216150358
E1|\^&|||1^Analyzer 1^6.0|||||||P||20201216150358
I need the stuff on the top line after the dashes, but "The" becomes E1, then E1 moves down to the next line and replaces 1H. What's going on here? I'm hoping it just has something to do with console.log so I can still get to the results I'm looking for.

So it looks like some of the control characters are making the output weird. Near the end of the line there is a CR and an ETX, followed by a checksum of the line. The carriage return sends the cursor back to the start of the line, so the ETX and checksum are printed over "the".
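A minimal sketch of one way to confirm this: replace each control character with a visible name before logging, so the terminal never acts on it. The helper name showControlChars, the lookup table, and the sample frame are illustrative assumptions, not taken from the original code or from any ASTM library.

```javascript
// ASCII control codes that commonly appear in ASTM-style framing (assumed subset).
const CONTROL_NAMES = {
  0x02: 'STX', 0x03: 'ETX', 0x04: 'EOT', 0x05: 'ENQ',
  0x06: 'ACK', 0x0a: 'LF', 0x0d: 'CR', 0x15: 'NAK', 0x17: 'ETB',
};

// Replace every control character with a readable tag such as <CR>.
function showControlChars(str) {
  return str.replace(/[\x00-\x1f\x7f]/g, ch => {
    const code = ch.charCodeAt(0);
    const name = CONTROL_NAMES[code] || '0x' + code.toString(16).padStart(2, '0');
    return `<${name}>`;
  });
}

// Hypothetical frame shaped like the one in the question: STX, text, CR, ETX, checksum, CR LF.
const frame = '\x021H|\\^&|||1^Analyzer 1\r\x03E1\r\n';
console.log(showControlChars(frame));
// <STX>1H|\^&|||1^Analyzer 1<CR><ETX>E1<CR><LF>
```

Logging JSON.stringify(str) gives a similar view using escape sequences instead of names.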

Related

Replace new line characters with \r

I'm using hl7parser to parse ADM files.
The documentation states that to create a new Message object, a string should be passed:
var message = hl7parser.create("MSH|^~\&|||||20121031232617||ADT^A04|20381|P|2.3||||NE\rEVN|A04|20121031162617||01\rPID|1|16194|16194||Jones^Bob");
Notice that the string uses '\r' to separate segments (MSH, EVN, PID).
I'm fetching the data from a server, which returns for instance the following data.
MSH|^~\&|EPICADT|DH|LABADT|DH|201301011226||ADT^A01|HL7MSG00001|P|2.3.1|
EVN|A01|201301011223||
PID|||MRN12345^5^M11||APPLESEED^JOHN^A^III||19710101|M||C|1 CATALYZE STREET^^MADISON^WI^53005-1020|GL|(414)379-1212|(414)271-3434||S||MRN12345001^2^M10|123456789|987654^NC|
NK1|1|APPLESEED^BARBARA^J|WIFE||||||NK^NEXT OF KIN
PV1|1|I|2000^2012^01||||004777^GOOD^SIDNEY^J.|||SUR||||ADM|A0|
Replacing the \n with \r with replace() doesn't make the parsing work, neither does split('\n') and join('\r').
I noticed that there is a difference when logging the string passed in the example and the string after replacing with \r
With string in example:
PID|1|16194|16194||Jones^BobADT^A04|20381|P|2.3||||NE
It's only printing the last segment apparently because of the \r characters
With my replacement method:
PID|||MRN12345^5^M11||APPLESEED^JOHN^A^III||19710101|M||C|1 CATALYZE STREET^^MADISON^WI^53005-1020|GL|(414)379-1PV1|1|I|2000^2012^01||||004777^GOOD^SIDNEY^J.|||SUR||||ADM|A0|
The entire string is printed, not just the last segment.
I'm not sure why there is a difference when printing them. Is there a difference between passing a literal string with \r character and "adding" \r to a string?
Doing this should work:
const lines = "A\nB\nC";
const result = lines.split("\n").join("\r");
console.log(result);
The confusion probably comes from the fact that it looks like it didn't, since it looks like it just output ABC.
However, if we check out the length of the string produced:
const lines = "A\nB\nC";
const result = lines.split("\n").join("\r");
console.log(result);
console.log(result.length);
Notice that it is 5 characters long, not 3. The \r is there. It's just that most consoles don't render a \r as a visible character, so it looks as if it were missing.
It is a "carriage return", and only MacOS (before X) used it on its own as a newline character. Windows uses the combination \r\n for a newline, and Linux (and MacOSX) uses \n.
If the parser wanted a literal backslash followed by r in the string, you'd need to use an escaped one (though this is almost certainly not what it expects):
const lines = "A\nB\nC";
const result = lines.split("\n").join("\\r");
console.log(result);
console.log(result.length);
function replaceLfWithCr(text) {
  return text.replace(/\n/g, '\r');
}
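If the server actually sends Windows-style \r\n line endings (an assumption, since the question does not show the raw bytes), replacing only \n would leave \r\r between segments. A hedged sketch that handles both cases and uses JSON.stringify to make the result inspectable; the function name toCarriageReturns and the sample data are illustrative:

```javascript
// Convert both \r\n and bare \n to a single \r.
function toCarriageReturns(text) {
  return text.replace(/\r\n|\n/g, '\r');
}

// Hypothetical fetched data with mixed line endings.
const fetched = 'MSH|^~\\&|A\r\nEVN|A01\nPID|||1';
const normalized = toCarriageReturns(fetched);

// JSON.stringify shows the control characters as escapes instead of letting
// the console act on them.
console.log(JSON.stringify(normalized));
// "MSH|^~\\&|A\rEVN|A01\rPID|||1"
console.log(normalized.split('\r').length);
// 3
```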

JavaScript - Why does this code alert a message?

I don't know much about JavaScript, but I found this code as a part of some game engine code. I tried to inspect it because I noticed that this part of the code alerts a message, and I really cannot figure out how. Here is the minimal code (I reduced it, extracted it from the original script, and changed the variable names to single letters):
var a = '͏‪͏‪‪‪‪‪͏͏‪‪‪‪͏‪͏͏‪͏͏‪‪‪͏‪͏‪‪͏‪‪͏‪‪‪‪‪‪͏͏‪͏‪‪͏‪‪͏͏‪͏‪͏͏͏͏‪‪‪͏͏͏͏͏‪‪͏‪‪͏‪͏‪‪‪͏͏͏‪͏‪‪‪͏‪‪‪͏‪‪‪͏‪͏͏͏‪‪‪‪͏‪‪͏‪‪͏‪‪‪͏͏‪‪‪‪͏‪‪͏‪‪‪‪‪͏͏͏‪‪‪‪‪͏‪͏‪‪‪‪‪͏͏͏‪‪‪‪͏‪‪͏‪‪‪͏‪͏͏͏‪‪‪‪‪͏‪͏‪‪‪‪͏͏‪͏‪‪‪͏͏͏͏͏‪‪‪‪‪͏͏͏‪‪‪‪‪͏‪͏‪‪͏‪‪͏‪͏‪‪‪͏͏͏‪͏‪‪‪͏‪‪‪͏‪‪‪‪‪͏͏͏‪‪‪‪͏‪‪͏‪͏‪‪‪͏‪͏‪͏‪‪‪͏‪͏͏‪͏‪͏͏͏͏͏‪͏‪͏͏͏͏‪‪‪͏‪͏‪͏‪‪‪͏͏͏‪͏‪‪͏‪‪‪͏͏‪‪‪͏͏‪͏͏‪‪‪‪‪͏͏͏‪‪‪‪‪͏‪͏‪‪‪‪͏͏‪͏‪‪‪͏‪‪͏͏‪‪‪‪͏‪͏͏‪‪‪͏‪‪‪͏‪͏‪‪‪͏‪͏͏‪͏‪͏‪‪͏‪‪‪͏͏͏‪͏‪‪‪͏͏‪‪͏‪‪‪͏͏‪‪͏‪‪‪‪͏‪‪͏‪‪‪͏͏‪‪͏‪‪‪‪͏‪‪͏‪‪‪‪͏‪‪͏‪‪‪͏͏‪‪͏‪‪‪‪͏‪‪͏‪‪‪‪͏‪‪͏‪‪‪‪͏‪‪͏‪‪͏‪‪͏‪͏‪‪‪‪͏‪͏͏‪‪‪͏‪‪‪͏‪͏‪‪‪͏‪͏͏‪͏‪͏͏‪͏͏‪͏‪͏͏͏͏͏‪͏‪͏͏‪͏‪‪‪‪‪‪͏͏‪͏‪‪‪͏‪͏‪͏‪‪͏‪‪͏͏‪͏‪͏͏͏͏‪‪‪͏‪͏‪͏‪͏‪‪͏‪‪͏‪‪‪‪‪͏‪͏‪‪‪‪͏͏‪͏‪͏‪‪͏‪‪͏‪‪‪‪͏͏͏͏‪͏‪‪‪͏‪͏‪‪‪‪‪͏‪͏‪͏‪‪͏‪‪͏‪‪͏‪‪‪‪͏‪͏‪‪‪͏‪͏͏‪͏‪͏͏‪͏‪‪‪͏͏͏‪͏‪‪‪͏‪‪‪͏‪‪͏‪‪‪͏͏‪‪‪‪‪͏͏͏‪‪‪‪‪͏‪͏‪‪‪͏͏‪͏͏‪͏‪‪‪͏‪͏‪͏‪‪‪͏‪͏‪‪‪‪‪͏‪͏͏‪͏‪͏‪‪͏‪‪͏‪‪‪‪͏‪‪‪‪‪͏͏͏‪‪‪‪͏͏‪͏͏‪͏͏͏͏‪͏‪‪‪͏‪͏‪͏‪‪‪͏͏͏‪͏͏‪͏‪͏‪‪͏͏‪͏͏͏͏‪͏‪‪‪‪‪͏‪͏‪‪‪‪͏‪͏͏‪͏‪‪͏‪‪͏͏‪͏͏͏͏‪͏‪‪‪‪‪͏‪͏‪‪‪‪͏‪‪͏‪‪‪‪͏͏‪͏‪͏‪‪‪͏‪͏͏‪͏‪͏‪‪͏͏‪͏‪͏͏͏͏‪‪‪‪͏͏͏͏͏‪͏͏͏͏‪͏‪‪͏‪‪‪‪͏‪‪‪͏‪‪‪͏͏‪͏‪͏‪‪͏‪͏‪‪͏‪‪͏‪‪‪‪‪͏‪͏‪͏‪‪‪͏‪͏‪‪‪͏‪‪͏͏‪‪‪‪͏͏͏͏͏‪͏‪͏‪‪͏‪͏‪‪͏‪‪͏‪‪‪‪‪‪͏͏͏‪͏‪͏͏‪͏‪‪‪͏‪͏‪͏‪‪‪‪͏͏͏͏‪‪‪‪‪͏‪͏͏‪͏‪͏͏‪͏‪‪‪͏‪͏͏͏‪‪‪‪͏͏‪͏‪‪͏‪‪‪‪͏‪͏‪‪͏‪‪͏‪‪͏‪‪‪‪͏‪͏‪‪‪͏‪͏‪‪‪‪‪͏‪͏͏‪͏‪͏͏‪͏‪͏‪‪͏‪‪͏‪‪‪͏͏‪‪͏‪͏‪‪‪͏‪͏͏‪͏‪͏‪‪͏‪‪‪‪͏‪͏͏‪‪‪͏͏‪͏͏‪‪‪‪‪͏͏͏͏‪͏͏͏͏‪͏‪‪‪‪‪͏‪͏‪‪‪‪͏‪‪͏‪‪‪‪͏‪‪͏‪‪‪‪‪͏͏͏‪‪‪‪͏‪‪͏‪‪‪͏͏͏‪͏‪͏‪‪͏‪‪͏‪‪‪‪͏‪‪͏‪‪‪͏͏͏͏͏‪͏‪‪͏‪‪͏‪‪‪‪‪‪͏͏‪‪‪‪‪͏‪͏͏‪͏‪͏͏‪͏‪‪‪‪͏‪͏͏‪‪‪‪͏‪‪͏‪‪͏‪‪‪‪͏‪͏‪‪͏‪‪͏‪͏‪‪‪͏‪͏‪‪͏‪‪‪͏͏‪‪‪‪͏͏‪͏‪‪‪͏‪͏‪͏‪‪‪‪͏‪͏͏‪‪͏‪‪͏‪͏‪‪‪‪͏͏‪͏͏‪͏͏͏͏‪͏‪‪‪͏‪‪͏͏͏‪͏͏‪‪‪͏͏‪‪‪͏‪‪͏‪‪͏͏‪‪͏͏‪‪͏‪‪‪‪͏‪‪‪͏͏‪͏͏͏‪͏‪͏͏͏͏‪‪͏‪͏͏‪͏͏‪͏͏͏͏͏͏‪‪͏‪‪‪‪͏‪‪͏͏‪‪͏͏͏‪͏͏‪‪‪͏‪‪͏‪‪͏‪͏‪‪͏‪‪‪͏͏‪‪͏‪‪‪‪͏‪‪‪͏͏͏͏͏‪‪‪͏͏͏‪͏‪‪‪͏͏‪͏͏‪‪‪͏͏‪‪͏‪‪‪͏‪͏͏͏‪‪‪͏‪͏‪͏‪‪‪͏‪‪͏͏‪‪‪͏‪‪‪͏‪‪‪‪͏͏͏͏‪‪‪‪͏͏‪͏‪‪‪‪͏‪͏͏‪‪‪‪͏‪‪͏‪‪‪‪‪͏͏͏‪‪‪‪‪͏‪͏‪‪‪‪‪‪͏͏͏‪͏͏‪‪‪͏͏‪͏‪͏͏‪͏‪‪‪͏‪‪‪͏‪‪͏‪͏͏‪͏‪‪‪͏‪͏͏͏‪‪͏‪͏͏͏͏͏‪͏‪͏͏͏͏‪͏‪‪‪‪‪͏͏‪͏‪‪‪͏͏‪‪‪͏͏‪‪͏‪‪‪͏͏͏͏͏‪‪͏‪‪͏͏͏‪‪͏‪͏͏‪͏‪‪‪͏‪͏͏͏͏‪͏‪͏͏͏͏‪‪͏‪͏͏‪͏͏‪͏‪͏͏‪͏͏‪͏‪͏͏‪͏‪͏‪‪‪‪‪͏͏‪‪‪‪͏‪͏‪‪͏‪͏‪͏͏‪‪͏‪‪‪‪͏‪‪͏‪͏͏‪͏‪‪͏‪‪‪͏͏͏‪͏‪͏͏͏͏‪‪‪͏͏͏͏͏‪‪͏‪‪‪‪͏‪‪‪͏͏͏͏͏͏‪͏‪͏͏͏͏͏‪͏‪͏͏‪͏͏‪͏‪͏͏‪͏͏‪‪‪͏‪‪͏‪‪͏͏‪͏‪͏‪‪‪͏‪‪͏͏‪‪͏͏͏͏‪͏‪‪͏‪‪͏͏͏͏‪͏‪͏͏͏͏‪͏‪‪‪‪‪͏͏‪͏‪͏͏‪';
var b = a.match(/.{8}/g);
var c = b.map(a => [...a].map(a => a == '‪' | 0));
var d = c.map(a => parseInt(a.join``, 2).toString(16));
var e = d.map(a => eval(`'\\x${a.padStart(2, 0)}'`));
var f = eval(e.join``);
I'm trying to understand how they manage to alert a message. It alerts the number 12345, but how? I see some evals here, so I suppose they are generating code on the fly, but even with the debugger I couldn't find an explanation.
I tried this code in jsFiddle and it still works. In Node.js it throws the error "alert is not defined", so I am pretty sure that all this code does is alert a message.
What trick did they use here? How are they building and evaling the code, and how do they manage to alert a message? Is this some sort of encryption or what?
My question has absolutely nothing to do with this question.
The code is all there, hidden in the variable a. No, it's not an empty string, it's a string consisting of 1888 invisible characters - either \u034f or \u202a to be precise. So this is in fact just a disguised binary encoding.
The code part
var b = a.match(/.{8}/g);
var c = b.map(a => [...a].map(a => a == '‪' | 0));
var d = c.map(a => parseInt(a.join``, 2).toString(16));
breaks them in chunks of 8, then converts each chunk from an array of characters to an array of booleans (or rather, the integers 0 and 1) - notice that it compares the character against the invisible \u202a, and then converts each array-of-8-booleans (oh look, an octet!) into an actual byte and gets a hex representation of it. Here's the hex string (d.join('')):
5f3d275b7e5b28706d7177747b6e7b7c7d7c7b747d79707c7d6d71777c7b5d5d282875716e727c7d79767a775d2b7173737b737b7b737b7b7b6d7a775d2928297e5d5b28755b7d795b785d7d5b6f5d2971776e7c7d725d5d7d2b6f7c792175712b217d7a5b217d7b795d2b2878216f772b5b7d5d76782b5b7e2975787d2974796f5b6f5d7d295b735d2b7a727c217d7b7b7c7b715b7b705b7e7d297a7b6f5b5d6e79757a6d792176273b666f722869206f66276d6e6f707172737475767778797a7b7c7d7e272977697468285f2e73706c6974286929295f3d6a6f696e28706f702829293b6576616c285f29
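The decoding described above can be sketched on a small input. This is a hedged illustration, not the original script: it uses String.fromCharCode in place of the eval-based escape, and the names encodeInvisible and decodeInvisible are made up for the example. It only handles characters with codes below 256, and it deliberately does not eval the result.

```javascript
const ZERO = '\u034f'; // invisible character standing for bit 0
const ONE = '\u202a';  // invisible character standing for bit 1

// Hide text as a run of invisible characters, 8 per byte.
function encodeInvisible(text) {
  return [...text]
    .map(ch => ch.charCodeAt(0).toString(2).padStart(8, '0'))
    .join('')
    .replace(/0/g, ZERO)
    .replace(/1/g, ONE);
}

// Reverse it: chunks of 8, one bit per character, one byte per chunk.
function decodeInvisible(hidden) {
  const chunks = hidden.match(/.{8}/g) || [];
  return chunks
    .map(chunk => [...chunk].map(ch => (ch === ONE ? 1 : 0)).join(''))
    .map(bits => String.fromCharCode(parseInt(bits, 2)))
    .join('');
}

const hidden = encodeInvisible('alert(1)');
console.log(hidden.length);           // 64
console.log(decodeInvisible(hidden)); // alert(1)
```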
The part
d.map(a => eval(`'\\x${a.padStart(2, 0)}'`));
has each of them parsed into a character, using a backslash escape. String.fromCharCode would have been the simpler choice. Also the padStart is not even required here, given that none of the bytes is a control character with a byte value less than 16. Maybe this would've been more familiar:
"\x5f\x3d\x27\x5b\x7e\x5b\x28\x70\x6d\x71\x77\x74\x7b\x6e\x7b\x7c\x7d\x7c\x7b\x74\x7d\x79\x70\x7c\x7d\x6d\x71\x77\x7c\x7b\x5d\x5d\x28\x28\x75\x71\x6e\x72\x7c\x7d\x79\x76\x7a\x77\x5d\x2b\x71\x73\x73\x7b\x73\x7b\x7b\x73\x7b\x7b\x7b\x6d\x7a\x77\x5d\x29\x28\x29\x7e\x5d\x5b\x28\x75\x5b\x7d\x79\x5b\x78\x5d\x7d\x5b\x6f\x5d\x29\x71\x77\x6e\x7c\x7d\x72\x5d\x5d\x7d\x2b\x6f\x7c\x79\x21\x75\x71\x2b\x21\x7d\x7a\x5b\x21\x7d\x7b\x79\x5d\x2b\x28\x78\x21\x6f\x77\x2b\x5b\x7d\x5d\x76\x78\x2b\x5b\x7e\x29\x75\x78\x7d\x29\x74\x79\x6f\x5b\x6f\x5d\x7d\x29\x5b\x73\x5d\x2b\x7a\x72\x7c\x21\x7d\x7b\x7b\x7c\x7b\x71\x5b\x7b\x70\x5b\x7e\x7d\x29\x7a\x7b\x6f\x5b\x5d\x6e\x79\x75\x7a\x6d\x79\x21\x76\x27\x3b\x66\x6f\x72\x28\x69\x20\x6f\x66\x27\x6d\x6e\x6f\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x27\x29\x77\x69\x74\x68\x28\x5f\x2e\x73\x70\x6c\x69\x74\x28\x69\x29\x29\x5f\x3d\x6a\x6f\x69\x6e\x28\x70\x6f\x70\x28\x29\x29\x3b\x65\x76\x61\x6c\x28\x5f\x29"
This string is the one evaled in the last line. But surprise, the contents of that string are just
_='[~[(pmqwt{n{|}|{t}yp|}mqw|{]]((uqnr|}yvzw]+qss{s{{s{{{mzw])()~][(u[}y[x]}[o])qwn|}r]]}+o|y!uq+!}z[!}{y]+(x!ow+[}]vx+[~)ux})tyo[o]})[s]+zr|!}{{|{q[{p[~})z{o[]nyuzmy!v';for(i of'mnopqrstuvwxyz{|}~')with(_.split(i))_=join(pop());eval(_)
So what does this still-obfuscated code do?
var _='[~[(pmqwt{n{|}|{t}yp|}mqw|{]]((uqnr|}yvzw]+qss{s{{s{{{mzw])()~][(u[}y[x]}[o])qwn|}r]]}+o|y!uq+!}z[!}{y]+(x!ow+[}]vx+[~)ux})tyo[o]})[s]+zr|!}{{|{q[{p[~})z{o[]nyuzmy!v';
for (var i of 'mnopqrstuvwxyz{|}~')
  with (_.split(i))
    _=join(pop());
eval(_)
Removing the with magic, we get
for (var i of 'mnopqrstuvwxyz{|}~') {
  let temp = _.split(i);
  _ = temp.join(temp.pop());
}
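The loop above can be tried on a toy input to see the effect. This is a sketch with made-up data; the function name expand and the packed string are illustrative only.

```javascript
// For each key character: split on it, take the last piece as the replacement,
// and join the remaining pieces back together with that replacement.
function expand(packed, keys) {
  let text = packed;
  for (const key of keys) {
    const parts = text.split(key);
    const replacement = parts.pop();
    text = parts.join(replacement);
  }
  return text;
}

// 'Z' stands for ' and ', 'Y' stands for 'the '. The dictionary entries are
// appended to the end of the string, each introduced by its key character.
const packed = 'YcatZYdogZYbirdYthe Z and ';
console.log(expand(packed, 'ZY'));
// the cat and the dog and the bird
```

The keys have to be processed in the reverse of the order in which the entries were appended, because each pop() takes whatever currently sits at the end of the string.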
So for each of these characters from m to ~, it splits _ by that character, takes the last part out, and joins the rest back together with that last part as the separator, effectively
replacing m by y!v,
replacing n by yuz,
replacing o by [],
replacing p by [~})z{,
replacing q by [{,
replacing r by |!}{{|{,
replacing s by ]+z,
replacing t by y[][[]]})[,
replacing u by x}),
replacing v by x+[~),
replacing w by +[}],
replacing x by ![],
replacing y by ]+(,
replacing z by [!}{,
replacing { by +!},
replacing | by ]+(!![]})[,
replacing } by +[],
replacing ~ by ][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]
and after all that, the code in _ that gets evaled is
[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]][([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+([][[]]+[])[+!+[]]+(![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[+!+[]]+([][[]]+[])[+[]]+([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(!![]+[])[+!+[]]]((![]+[])[+!+[]]+(![]+[])[!+[]+!+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]+(!![]+[])[+[]]+(![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[!+[]+!+[]+[+[]]]+[+!+[]]+[!+[]+!+[]]+[!+[]+!+[]+!+[]]+[!+[]+!+[]+!+[]+!+[]]+[!+[]+!+[]+!+[]+!+[]+!+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[!+[]+!+[]+[+[]]])()
Now doesn't that look familiar? It's good old jsfuck!
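For readers who have not met it, a short sketch of the building blocks this style relies on: only the characters []()!+ are used, and type coercion turns them into numbers and letters. The variable names are mine, and the final call uses a harmless expression rather than alert.

```javascript
const zero = +[];                               // unary plus on an empty array gives 0
const one = +!+[];                              // !0 is true, +true is 1
const f = (![] + [])[+[]];                      // "false"[0]
const i = ([![]] + [][[]])[+!+[] + [+[]]];      // "falseundefined"["10"]
const l = (![] + [])[!+[] + !+[]];              // "false"[2]
const t = (!![] + [])[+[]];                     // "true"[0]
const e = (!![] + [])[!+[] + !+[] + !+[]];      // "true"[3]
const r = (!![] + [])[+!+[]];                   // "true"[1]

const methodName = f + i + l + t + e + r;
console.log(methodName); // filter

// [].filter is a function, and its constructor is Function, which compiles a
// string into code. The obfuscated script builds "constructor" and its payload
// letter by letter in the same way.
const compile = [][methodName].constructor;
console.log(compile === Function);     // true
console.log(compile('return 6 * 7')()); // 42
```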
"I found this code as a part of some game engine code"
I doubt it. It looks much more like a submission to a code obfuscation contest. However, it doesn't appear to be hand-crafted; more likely someone just blindly chained multiple obfuscation tools together.

Parse JSON but preserve \n in strings

I have this JSON string:
{\"text\":\"Line 1\\nLine 2\",\"color\":\"black\"}
I can parse it when I do this:
pg = JSON.parse(myJSONString.replace(/\\/g, ""));
But when I access pg.text the value is:
Line 1nLine 2.
But I want the value to be exactly:
Line 1\nLine 2
The JSON string is valid in terms of the target program which interprets it as part of a larger command. It's Minecraft actually. Minecraft will render this as you would expect with Line 1 and Line 2 on separate lines.
But I'm making an editor that needs to read the \n back in as is, so that it can be displayed in an HTML input field.
For context, here is the full command, which contains some JSON code.
/summon zombie ~ ~1 ~ {HandItems:[{id:"minecraft:written_book",Count:1b,tag:{title:"",author:"",pages:["{\"text\":\"Line 1\\nLine 2\",\"color\":\"black\"}"]}},{}]}
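You could try adding [1], as in /\\[1]/g, but note that [1] is a character class matching the digit 1, so this pattern only matches a backslash followed by 1 and changes nothing here. The replace isn't needed anyway: in a quoted JavaScript string literal, \" is just a quote and \\n is a backslash followed by n, so the string already holds valid JSON and can be parsed directly. After parsing, the \n in text becomes a real newline character.
var myString = '{\"text\":\"Line 1\\nLine 2\",\"color\":\"black\"}';
// The replace is a no-op here: /\\[1]/g matches a backslash followed by the digit 1
console.log(JSON.parse(myString.replace(/\\[1]/g, "")));
// Parsing the string directly gives the same result
var parsed = JSON.parse(myString);
console.log(parsed.text);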
Your string is not valid JSON, and ideally you should fix the code that generates it, or contact the provider of it.
If the issue is that there is always one backslash too many, then you could do this:
// Need to escape the backslashes in this string literal to get the actual input:
var myJSONString = '{\\"text\\":\\"Line 1\\\\nLine 2\\",\\"color\\":\\"black\\"}';
console.log(myJSONString);
// Only replace backslashes that are not preceded by another:
var fixedJSON = myJSONString.replace(/([^\\])\\/g, "$1");
console.log(fixedJSON);
var pg = JSON.parse(fixedJSON);
console.log(pg);
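Once the JSON parses, pg.text holds a real newline character. To show it in an input field as the two characters \ and n, one option is to re-escape it after parsing. This is a sketch under that assumption; the helper name escapeNewlinesForDisplay is hypothetical, and it does not distinguish a real newline from text that already contained a literal backslash followed by n.

```javascript
// Turn real CR/LF characters back into visible escape sequences.
function escapeNewlinesForDisplay(text) {
  return text.replace(/\r/g, '\\r').replace(/\n/g, '\\n');
}

// Valid JSON: the string literal below contains a backslash followed by n.
const json = '{"text":"Line 1\\nLine 2","color":"black"}';
const page = JSON.parse(json);

const shown = escapeNewlinesForDisplay(page.text);
console.log(shown);
// Line 1\nLine 2
```

JSON.stringify(page.text).slice(1, -1) is a possible alternative that also escapes quotes and backslashes.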

Regex to remove white spaces, blank lines and final line break in Javascript

Ok guys, I'm having a hard time with regex..
Here's what I need... get a text file, remove all blank lines and white spaces in the beginning and end of these lines, the blank lines to be removed also include a possible empty line at the end of the file (a \n in the end of the whole text)
So my script was:
quotes.replace(/^\s*[\r\n]/gm, "");
This replaces fairly well, but leaves one white space at the end of each line and doesn't remove the final line break.
So I thought using something like this:
quotes.replace(/^\s*[\r\n]/gm, "").replace(/^\n$/, "");
The second "replace" would remove a final \n from the whole string if present.. but it doesn't work..
So I tried this:
quotes.replace(/^\s*|\s*$|\n\n+/gm, "")
Which removes line breaks but joins some lines when there is a line break in the middle:
so that
1
2
3
4
Would return the following lines:
["1", "2", "34"]
Can you guys help me out?
Since it sounds like you have to do this all in a single regex, try this:
quotes.replace(/^(?=\n)$|^\s*|\s*$|\n\n+/gm,"")
What we are doing is adding a lookahead, (?=\n): a group that captures and consumes nothing, but prevents a newline by itself from getting consumed.
Split, replace, filter:
quotes.split('\n')
  .map(function(s) { return s.replace(/^\s*|\s*$/g, ""); })
  .filter(function(x) { return x; });
With input value " hello \n\nfoo \n bar\nworld \n",
the output is ["hello", "foo", "bar", "world"].
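If the result is needed as a single cleaned string rather than an array, the same split, trim, filter idea can be followed by a join. A minimal sketch; the function name cleanLines is illustrative, and splitting on /\r?\n/ is an added assumption to cope with Windows line endings.

```javascript
// Trim every line, drop the empty ones, and rejoin without a trailing line break.
function cleanLines(text) {
  return text
    .split(/\r?\n/)
    .map(line => line.trim())
    .filter(line => line !== '')
    .join('\n');
}

const cleaned = cleanLines(' hello \n\nfoo \r\n bar\nworld \n');
console.log(JSON.stringify(cleaned));
// "hello\nfoo\nbar\nworld"
```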

parsing key/value pairs from string

I'm parsing the body text from incoming emails, looking for key/value pairs.
Example Email Body
First Name: John
Last Name:Smith
Email : john@example.com
Comments = Just a test comment that
may span multiple lines.
I tried using a RegEx ([\w\d\s]+)\s?[=|:]\s?(.+) in multiline mode. This works for most emails, but fails when there's a line break that should be part of the value. I don't know enough about RegEx to go any further.
I have another parser that goes line-by-line looking for the key/value pairs and simply folds a line into the last matched value if a key/value pair is NOT found. It's implemented in Scala.
import scala.collection.mutable
val lines = text.split("\\r?\\n").toList
var lastLabelled: Int = -1
val linesBuffer = mutable.ListBuffer[(String, String)]()
// only parse lines until the first blank line
// the null_? method checks for empty strings and nulls
lines.takeWhile(!_.null_?).foreach(line => {
  line.splitAt(delimiter) match {
    case Nil if line.nonEmpty => {
      // no key/value pair on this line: fold it into the last matched value
      val l = linesBuffer(lastLabelled)
      linesBuffer(lastLabelled) = (l._1, l._2 + "\n" + line)
    }
    case pair :: Nil => {
      lastLabelled = linesBuffer.length
      linesBuffer += pair
    }
    case _ => // skip this line
  }
})
I'm trying to use RegEx so that I can save the parser to the db and change it on a per-sender basis at runtime (implement different parsers for different senders).
Can my RegEx be modified to match values that contain newlines?
Do I need to just forget about using RegEx and use some JavaScript? I already have a JavaScript parser that lets me store the JS in the DB and essentially do everything that I want to do with the RegEx parser.
I think this should work...
((.+?)((\s*)(:|=)(\s*)))(((.|\n)(?!((.+?)(:|=))))+)
...as tested here http://regexpal.com/. If you loop through the matches you should be able to pull out the key and value.
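Since the question mentions that a JavaScript parser is also an option, here is a hedged sketch of the line-folding approach in JavaScript. It is not the regex above; it uses a simpler per-line pattern of my own, and the function name parseKeyValues is hypothetical. A continuation line that itself contains ":" or "=" after a word would be treated as a new key, which is a known limitation of this sketch.

```javascript
// Parse "key: value" / "key = value" lines, folding lines without a
// delimiter into the previous value. Stops at the first blank line.
function parseKeyValues(body) {
  const pairs = [];
  for (const line of body.split(/\r?\n/)) {
    if (line.trim() === '') break;
    const match = line.match(/^\s*([\w ]+?)\s*[:=]\s*(.*)$/);
    if (match) {
      pairs.push([match[1], match[2]]);
    } else if (pairs.length > 0) {
      pairs[pairs.length - 1][1] += '\n' + line;
    }
  }
  return pairs;
}

const emailBody = [
  'First Name: John',
  'Last Name:Smith',
  'Email : john@example.com',
  'Comments = Just a test comment that',
  'may span multiple lines.',
].join('\n');

console.log(parseKeyValues(emailBody));
```

With the sample body, this yields four pairs, and the Comments value contains both lines joined by a newline.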
