Is it possible to have a comment inside an ES6 template string? - javascript

Let's say we have a multiline es6 Template-String to describe e.g. some URL params for a request:
const fields = `
id,
message,
created_time,
permalink_url,
type
`;
Is there any way to have comments inside that backtick Template-String? Like:
const fields = `
// post id
id,
// post status/message
message,
// .....
created_time,
permalink_url,
type
`;

Option 1: Interpolation
We can create interpolation blocks that return an empty string, and embed the comments inside them.
const fields = `
id,${ /* post id */'' }
message,${ /* post status/message */'' }
created_time,
permalink_url,
type
`;
console.log(fields);
Option 2: Tagged Templates
Using tagged templates we can clear the comments and reconstruct the strings. Here is a simple commented function that uses Array.map(), String.replace(), and a regular expression (which needs some work) to clear comments and return the clean string:
const commented = (strings, ...values) => {
const pattern = /\/{2}.+$/gm; // basic idea
return strings
.map((str, i) =>
`${str}${values[i] !== undefined ? values[i] : ''}`)
.join('')
.replace(pattern, '');
};
const d = 10;
const fields = commented`
${d}
id, // post ID
${d}
message, // post/status message
created_time, // ...
permalink_uri,
type
`;
console.log(fields);

I know it's an old answer, but seeing the answers above I feel compelled to both answer the pure question, and then to answer the spirit of the asker's question.
Can you use comments in template literal strings?
Yes. Yes you can. But it's not pretty.
const fields = `
id, ${/* post ID */''}
message, ${/* post/status message */''}
created_time, ${/*... */''}
permalink_url,
type
`;
Note that you have to put '' (an empty string) in the ${ } braces so that JavaScript has an expression to insert. Leaving the braces with only a comment in them is a syntax error. The quotes can go anywhere outside of the comment.
I'm not a huge fan of this. It's pretty ugly and makes commenting cumbersome, never mind that toggling comments becomes difficult in most IDEs.
Personally, I use template strings wherever possible, as they are a fraction more efficient than regular Strings, and they capture literally all the text you want, mostly without escaping. You can even put function calls in there!
The string in the example above will be a little odd, and potentially useless for what you're looking for, however, as there will be an initial line-break, extra space between the comma and the comment, as well as an extra final line-break. Removing that unwanted space could be a small performance hit. You could use a regex for that, for speed and efficiency, though... more on that below...
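To see exactly which whitespace ends up in the string, here is a minimal sketch. The shortened field list is only for illustration; JSON.stringify is used because it makes the line breaks and the space left before each comment visible:

```javascript
const fields = `
id, ${/* post ID */''}
type
`;

// JSON.stringify shows the leading and trailing line breaks,
// plus the space left behind where the comment used to be.
console.log(JSON.stringify(fields)); // "\nid, \ntype\n"

// Stripping every run of whitespace gives the compact list.
const cleaned = fields.replace(/\s+/g, '');
console.log(cleaned); // "id,type"
```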
Now to answer the intent of the question:
How do I write a comma-delimited list string, with comments on every line?
const fields = [
"id", // post ID
"message", // post/status message
"created_time", //...
"permalink_url",
"type"
].join(",\n");
Joining an Array is one way... (as suggested by @jared-smith)
However, in this case, you are creating an array and then immediately discarding the organized data when you only assign the return value of the join() function. Not only that, but you are creating a memory pointer for each string in the array, which won't be garbage collected till end of scope. In that case, it might be more useful to capture the array, joining on the fly as use dictates, or to use a template literal and differently comment your implementation, like ghostDoc style.
It seems that you are only using template literals in order to satisfy a desire to not have quote marks on each line, minimizing cognitive dissonance between the 'string' query parameters as they look in the url and the code. You should be aware that this preserves line breaks, and I doubt you want that. Consider instead:
/****************
* Fields:
* id : post ID
* message : post/status message
* created_time : some other comment...
*/
const fields = `
id,
message,
created_time,
permalink_uri,
type
`.replace(/\s/g,'');
This uses a regex to filter out all the whitespace, while keeping the list readable and rearrangeable. All the regex literal is doing is capturing the whitespace and then the replace method replaces the captured text with '' (the g on the end just tells the regex not to stop at the first match it finds, in this case, the first newline char.)
or, most nastily, you could just put the comments directly in your template literal, and then strip them with a regex:
const fields = `
id, // post ID
message, // post/status message
created_time, // ...
permalink_uri,
type
`.replace(/\s+\/\/.*\n/g,'').replace(/\s/g,'');
That first regex will find and replace with an empty string ('') all instances of: one or more whitespace characters that precede a double slash (each slash is escaped by a backslash), followed by any characters up to and including the newline character. If you wanted to use /* multiline */ comments, this becomes a little more complex; you'll have to add another .replace() on the end:
.replace(/\/\*.*\*\//g,'')
That replace can only go after you strip the \n newlines out, because . does not match a newline, so the pattern cannot span a comment that still runs over several lines. That would look something like this:
const fields = `
id, // post ID
message, /* post/
status message */
created_time, // ...
permalink_uri,
type
`.replace(/\s+\/\/.*\n/g,'').replace(/\s/g,'').replace(/\/\*.*\*\//g,'');
All of the above will result in this string:
"id,message,created_time,permalink_uri,type"
There's probably a way to do that with only one regex, but it's beyond the scope here, really. And besides, I'd encourage you to fall in love with regexes by playing with them yourself!
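For what it's worth, one possible single-regex version is sketched below. The name stripCommentsAndSpace is made up for this example, and it assumes that no field ever contains // or /* as real data (a URL would break it):

```javascript
// One pass: block comments first, then line comments, then any whitespace.
// Assumption: no field contains "//" or "/*" as actual data.
const stripCommentsAndSpace = /\/\*[\s\S]*?\*\/|\/\/.*|\s+/g;

const fields = `
id, // post ID
message, /* post/
status message */
created_time, // ...
permalink_uri,
type
`.replace(stripCommentsAndSpace, '');

console.log(fields); // "id,message,created_time,permalink_uri,type"
```

The [\s\S]*? part is lazy and matches across line breaks, which is why the block comment no longer needs the newlines stripped first.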
I'll try to get a https://jsperf.com/ up on this later. I'm super curious now!

No.
That syntax is valid, but will just return a string containing \n// post id\nid, rather than removing the comments and creating a string without them.
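A quick check of that behaviour, as a minimal sketch with a shortened field list:

```javascript
const fields = `
// post id
id,
`;

// The "comment" is ordinary text inside the literal, so it stays in the string.
console.log(JSON.stringify(fields)); // "\n// post id\nid,\n"
console.log(fields.includes('// post id')); // true
```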
If you look at §11.8.6 of the spec, you can see that the only token recognized between the backtick delimiters is TemplateCharacters, which accepts escape sequences, line breaks, and normal characters. In §A.1, SourceCharacter is defined to be any Unicode code point (except the ones excluded in 11.8.6).

Just don't use template strings:
const fields = [
'id', // comment blah blah
'message',
'created_time',
'permalink_url',
'type'
].join(',');
You pay the cost of the array and method call on initialization (assuming the JIT isn't smart enough to optimize it away entirely).
As pointed out by ssube, the resulting string will not retain the linebreaks or whitespace. It depends on how important that is, you can manually add ' ' and '\n' if necessary or decide you don't really need inline comments that badly.
UPDATE
Note that storing programmatic data in strings is generally held to be a bad idea: store them as named vars or object properties instead. Since your comment reflects you're just converting a bunch of stuff into a URL query string:
const makeQueryString = (url, data) => {
return url + '?' + Object.keys(data)
.map(k => `${k}=${encodeURIComponent(data[k])}`)
.join('&');
};
let qs = makeQueryString(url, {
id: 3,
message: 'blah blah',
// etc.
});
Now you have stuff that is easier to change, understand, reuse, and more transparent to code analysis tools (like those in your IDE of choice).
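As an alternative sketch, the built-in URLSearchParams class (available in modern browsers and Node.js) does the same encoding job. The endpoint and values below are placeholders, not anything from the question:

```javascript
// Placeholder endpoint and data, for illustration only.
const url = 'https://example.com/posts';
const params = new URLSearchParams({ id: '3', message: 'blah blah' });

const qs = `${url}?${params.toString()}`;
console.log(qs); // "https://example.com/posts?id=3&message=blah+blah"
```

Note that URLSearchParams encodes a space as + rather than %20, which is the form-encoding convention and may or may not be what a given API expects.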

Yes it is possible
Use <!-- content here -->

Related

How to write a parser using javascript?

In our product we are trying to parse the following different formats from a given piece of text -
${{node::123456}}
${{node:123456}}
$fn{{#functionName('abcd',',',' somethingWithASpace')}}
$fn{{#functionName('abcd','#','${{node::123456}}')}}
${{rmtrqst:someText[]->abcd}}
Sample of the text is like -
Hi, how are you ${{node::123456}}? Your order id is ${{node::636636}}.
or
Your order was placed on $fn{{#dateConverterFunction('abcd','#','${{node::123456}}')}}
I tried with Regex /\$((fn)\{{2}(\#|)(\w*)((\(.*\))|([^\$]*))\}{2})/gi - but this is not helping much. Can anyone suggest me how to write a parser for this?
A grammar could be like this -
Every expression starts with $ followed by either fn{{ or {{
After that there will be a string like node or #functionName or something else
that might be followed by a parenthesis-enclosed string (this may contain a whole expression like ${{node::1234}} inside it; we should ignore whatever is inside the parentheses)
Finally it will be closed by }}
Use a tokenizer and let it break the strings down to a meaningful structure.
The nearley.js library is a popular choice for parsing non-linear structures like yours. You can choose to keep your expressions simple or, if you choose otherwise, the library can create an abstract syntax tree for a complicated grammar.
To write a parser using the library, define your vocabulary in a separate file and use it for parsing.
Or you can use the tokenizer directly to get your string tokenized.
@{%
const moo = require("moo");
const lexer = moo.compile({
ws: /[ \t]+/,
number: /[0-9]+/,
word: /[a-z]+/,
times: /\*|x/
});
%}
# Pass your lexer object using the @lexer option:
@lexer lexer
# Use %token to match any token of that type instead of "token":
multiplication -> %number %ws %times %ws %number {% ([first, , , , second]) => first * second %}
# Literal strings now match tokens with that text:
trig -> "sin" %number
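If a full grammar feels like too much, a small hand-written scanner is another way to pull out the expressions described in the question. This is only a sketch, not a nearley grammar: the function name findExpressions and the result shape are made up here, and it assumes that {{ and }} are always balanced and never appear inside quoted arguments on their own.

```javascript
// Scan for "${{" or "$fn{{" and track nesting depth to find the matching "}}".
function findExpressions(text) {
  const results = [];
  let i = 0;
  while (i < text.length) {
    const opener = text.startsWith('$fn{{', i) ? '$fn{{'
      : text.startsWith('${{', i) ? '${{'
      : null;
    if (opener === null) {
      i += 1;
      continue;
    }
    let depth = 1;
    let j = i + opener.length;
    while (j < text.length && depth > 0) {
      if (text.startsWith('{{', j)) {
        depth += 1; // a nested expression opens
        j += 2;
      } else if (text.startsWith('}}', j)) {
        depth -= 1; // an expression closes
        j += 2;
      } else {
        j += 1;
      }
    }
    if (depth !== 0) break; // unterminated expression: stop scanning
    results.push({
      kind: opener === '$fn{{' ? 'function' : 'reference',
      body: text.slice(i + opener.length, j - 2),
    });
    i = j; // nested expressions stay inside the outer body
  }
  return results;
}

const simple = findExpressions(
  'Hi, how are you ${{node::123456}}? Your order id is ${{node::636636}}.'
);
console.log(simple.map((e) => e.body)); // ["node::123456", "node::636636"]

const nested = findExpressions(
  "Your order was placed on $fn{{#dateConverterFunction('abcd','#','${{node::123456}}')}}"
);
console.log(nested[0].kind); // "function"
console.log(nested[0].body);
```

The body of each function expression can then be scanned again with the same function if the nested references are needed.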

Match between simple delimiters, but not delimiters themselves

I was looking at JSON data that was just in a text file. I don't want to do anything aside from just use regex to get the values in between quotes. I'm just using this as a way to help practice regex and got to this point that seems like it should be simple, but it turns out it's not (at least to me and a few other people at the office). I've matched complicated urls with ease in regex so I'm not completely new to regex. This just seems like a weird case for me.
I've tried:
/(?:")(.*?)(?:")/
/"(.*?)"/
and several others but these got me the closest.
Basically we can forget that it's JSON and just say I want to match the words value and stuff out of "value" and "stuff". Everything I try includes the quotes, so I'd have to clean the strings afterwards of the delimiters or else the string is literally "value" with the quotes.
Any help would be much appreciated, whether this is simple or complicated, I'd love to know! Thanks
Update: Alright so I think I'll go with (?<=")(.*?)(?=") and read things by line without the global setting on so I just get the first match on each line. In my code I was just plopping in a huge string into a var in the code instead of actually opening a file with ajax/filereader or having a form setup to input data. I think I'll mark this as solved, much appreciated!
You have two choices to solve this problem:
Use capturing groups
You can match the delimiters and use capturing groups to get the text within. In this case your two regexes will work, but you need to access capturing group 1 to get the results (demo). See How do you access the matched groups in a JavaScript regular expression? for how to do that.
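A minimal sketch of reading group 1, using String.prototype.matchAll (which needs the g flag) on a made-up sample string:

```javascript
const text = '"value" and "stuff"';

// Each match object holds the full match at index 0 and group 1 at index 1.
const values = [...text.matchAll(/"(.*?)"/g)].map((m) => m[1]);
console.log(values); // ["value", "stuff"]
```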
Use zero-width assertions
You can use zero-width assertions to match only the text within, require delimiters around them without actually matching them (demo):
(?<=")(.*?)(?=")
but now since I'm not consuming the quotes it'll find instances between each quote, not just between pairs of quotes: e.g., a"b"c" would find b and c.
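The unpaired-quote behaviour can be checked directly. This sketch assumes a JavaScript engine with lookbehind support (current browsers and Node.js have it):

```javascript
const unpaired = 'a"b"c"';

// The quotes are asserted but not consumed, so the middle quote
// serves as the closing quote for b and the opening quote for c.
const found = unpaired.match(/(?<=")(.*?)(?=")/g);
console.log(found); // ["b", "c"]
```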
As for getting just the first match, I think that'll happen by default in JavaScript. You'd have to ask for repeated matching before you see the subsequent ones. So if you process your file one line at a time, you should get what you want.
get the values in between quotes
One thing to keep in mind is that valid JSON accepts escaped quotes inside the quoted values. Therefore, the RegEx should take this into account when capturing the groups which is done with the “unrolling-the-loop” pattern.
var pattern = /"[^"\\]*(?:\\.[^"\\]*)*"/g;
var data = {
"value": "This is \"stuff\".",
"empty": "",
"null": null,
"number": 50
};
var dataString = JSON.stringify(data);
console.log(dataString);
var matched = dataString.match(pattern);
matched.map(item => console.log(JSON.parse(item)));

Matching a JS string with regex

I have a long xml raw message that is being stored in a string format. A sample is as below.
<tag1>val</tag><tag2>val</tag2><tagSomeNameXYZ/>
I'm looking to search this string and find out if it contains an empty html tag such as <tagSomeNameXYZ/>. The thing is, the value of SomeName can change depending on context. I've tried using Str.match(/tagSomeNameXYZ/g) and Str.match(/<tag.*.XYZ\/>/g) to find out if it contains exactly that string, but am unable to get it to return anything. I'm having trouble writing a regex that matches something like <tag*XYZ/>, where * is going to be SomeName (which I'm not interested in)
Tl;dr : How do I filter out <tagSomeNameXYZ/> from the string. Format being : <constant variableName constant/>
Example patterns that it should match:
<tagGetIndexXYZ/>
<tagGetAllIndexXYZ/>
<tagGetFooterXYZ/>
The issue you have with Str.match(/<tag.*.XYZ\/>/g) is the .* takes everything it sees and does not stop at the XYZ as you wish. So you need to find a way to stop (e.g. the [^/]* means keep taking until you find a /) and then work back from there (the slice).
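Does this help?
const testString = "<tagGetIndexXYZ/>";
// Capture everything between "<tag" and "/>", then drop the trailing "XYZ".
const res = testString.match(/<tag([^/]*)\/>/)[1].slice(0, -3);
console.log(res); // "GetIndex"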
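To test whether the whole message contains such a tag, one option is to restrict the variable part to word characters. This is a sketch that assumes the variable name only ever contains letters, digits, or underscores:

```javascript
const message = '<tag1>val</tag><tag2>val</tag2><tagSomeNameXYZ/>';

// \w* cannot run past ">" or "/", so the match stays inside one tag.
const pattern = /<tag\w*XYZ\/>/g;

console.log(message.match(pattern)); // ["<tagSomeNameXYZ/>"]
console.log('<tag1>val</tag1>'.match(pattern)); // null
```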

es6 multiline template strings with no new lines and allow indents

Been using es6 more and more for most work these days. One caveat is template strings.
I like to limit my line character count to 80. So if I need to concatenate a long string, it works fine because concatenation can be multiple lines like this:
const insert = 'dog';
const str = 'a really long ' + insert + ' can be a great asset for ' +
insert + ' when it is a ' + insert;
However, trying to do that with template literals would just give you a multi-line string with ${insert} placing dog in the resulting string. Not ideal when you want to use template literals for things like url assembly, etc.
I haven't yet found a good way to maintain my line character limit and still use long template literals. Anyone have some ideas?
The other question that is marked as an accepted is only a partial answer. Below is another problem with template literals that I forgot to include before.
The problem with using new line characters is that it doesn't allow for indentation without inserting spaces into the final string. i.e.
const insert = 'dog';
const str = `a really long ${insert} can be a great asset for\
${insert} when it is a ${insert}`;
The resulting string looks like this:
a really long dog can be a great asset for dog when it is a dog
Overall this is a minor issue but would be interesting if there was a fix to allow multiline indenting.
Two answers for this problem, but only one may be considered optimal.
Inside template literals, JavaScript can be used inside of expressions like ${}. It's therefore possible to have indented multiline template literals such as the following. The caveat is that some valid JS expression must be present inside the braces, such as an empty string or a variable.
const templateLiteral = `abcdefgh${''
}ijklmnopqrst${''
}uvwxyz`;
// "abcdefghijklmnopqrstuvwxyz"
This method makes your code look like crap. Not recommended.
The second method was recommended by @SzybkiSasza and seems to be the best option available. For some reason concatenating template literals didn't occur to me as possible. I'm derp.
const templateLiteral = `abcdefgh` +
`ijklmnopqrst` +
`uvwxyz`;
// "abcdefghijklmnopqrstuvwxyz"
Why not use a tagged template literal function?
function noWhiteSpace(strings, ...placeholders) {
let withSpace = strings.reduce((result, string, i) => (result + placeholders[i - 1] + string));
let withoutSpace = withSpace.replace(/$\n^\s*/gm, ' ');
return withoutSpace;
}
Then you can just tag any template literal you want to have line breaks in:
let myString = noWhiteSpace`This is a really long string, that needs to wrap over
several lines. With a normal template literal you can't do that, but you can
use a template literal tag to allow line breaks and indents.`;
The provided function will replace each line break, together with any tabs and spaces that lead the following line, with a single space, yielding the following:
> This is a really long string, that needs to wrap over several lines. With a normal template literal you can't do that, but you can use a template literal tag to allow line breaks and indents.
I published this as the compress-tag library.

Parsing malformed JSON in JavaScript

Thanks for looking!
BACKGROUND
I am writing some front-end code that consumes a JSON service which is returning malformed JSON. Specifically, the keys are not surrounded with quotes:
{foo: "bar"}
I have NO CONTROL over the service, so I am correcting this like so:
var scrubbedJson = dirtyJson.replace(/(['"])?([a-zA-Z0-9_]+)(['"])?:/g, '"$2": ');
This gives me well formed JSON:
{"foo": "bar"}
Problem
However, when I call JSON.parse(scrubbedJson), I still get an error. I suspect it may be because the entire JSON string is surrounded in double quotes but I am not sure.
UPDATE
This has been solved--the above code works fine. I had a rogue single quote in the body of the JSON that was returned. I got that out of there and everything now parses. Thanks.
Any help would be appreciated.
You can avoid using a regexp altogether and still output a JavaScript object from a malformed JSON string (keys without quotes, single quotes, etc), using this simple trick:
var jsonify = (function(div){
return function(json){
div.setAttribute('onclick', 'this.__json__ = ' + json);
div.click();
return div.__json__;
}
})(document.createElement('div'));
// Let's say you had a string like '{ one: 1 }' (malformed, a key without quotes)
// jsonify('{ one: 1 }') will output a good ol' JS object ;)
Here's a demo: http://codepen.io/csuwldcat/pen/dfzsu (open your console)
something like this may help to repair the json ..
$str='{foo:"bar"}';
echo preg_replace('/({)([a-zA-Z0-9]+)(:)/','$1"$2"${3}',$str);
Output:
{"foo":"bar"}
EDIT:
var str='{foo:"bar"}';
str.replace(/({)([a-zA-Z0-9]+)(:)/,'$1"$2"$3')
There is a project that takes care of all kinds of invalid cases in JSON https://github.com/freethenation/durable-json-lint
I was trying to solve the same problem using a regEx in Javascript. I have an app written for Node.js to parse incoming JSON, but wanted a "relaxed" version of the parser (see following comments), since it is inconvenient to put quotes around every key (name). Here is my solution:
var objKeysRegex = /({|,)(?:\s*)(?:')?([A-Za-z_$\.][A-Za-z0-9_ \-\.$]*)(?:')?(?:\s*):/g;// look for object names
var newQuotedKeysString = originalString.replace(objKeysRegex, "$1\"$2\":");// all object names should be double quoted
var newObject = JSON.parse(newQuotedKeysString);
Here's a breakdown of the regEx:
({|,) looks for the beginning of the object, a { for flat objects or , for embedded objects.
(?:\s*) finds but does not remember white space
(?:')? finds but does not remember a single quote (to be replaced by a double quote later). There will be either zero or one of these.
([A-Za-z_$\.][A-Za-z0-9_ \-\.$]*) is the name (or key). Starts with any letter, underscore, $, or dot, followed by zero or more alpha-numeric characters or underscores or dashes or dots or $.
the last character : is what delimits the name of the object from the value.
Now we can use replace() with some dressing to get our newly quoted keys:
originalString.replace(objKeysRegex, "$1\"$2\":")
where the $1 is either { or , depending on whether the object was embedded in another object. \" adds a double quote. $2 is the name. \" another double quote. and finally : finishes it off.
Test it out with
{keyOne: "value1", $keyTwo: "value 2", key-3:{key4:18.34}}
output:
{"keyOne": "value1","$keyTwo": "value 2","key-3":{"key4":18.34}}
Some comments:
I have not tested this method for speed, but from what I gather from reading some of these entries, using a regex is faster than eval()
For my application, I'm limiting the characters that names are allowed to have with ([A-Za-z_$\.][A-Za-z0-9_ \-\.$]*) for my 'relaxed' version JSON parser. If you wanted to allow more characters in names (you can do that and still be valid), you could instead use ([^'":]+) to mean anything other than double or single quotes or a colon. You can have all sorts of stuff in here with this expression, so be careful.
One shortcoming is that this method actually changes the original incoming data (but I think that's what you wanted?). You could program around that to mitigate this issue - depends on your needs and resources available.
Hope this helps.
-John L.
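A short sketch of the approach above, wrapped in a helper (the name quoteKeys is made up for this example), together with one input where it goes wrong:

```javascript
const objKeysRegex = /({|,)(?:\s*)(?:')?([A-Za-z_$\.][A-Za-z0-9_ \-\.$]*)(?:')?(?:\s*):/g;
const quoteKeys = (s) => s.replace(objKeysRegex, '$1"$2":');

const relaxed = '{keyOne: "value1", $keyTwo: "value 2", key-3:{key4:18.34}}';
const parsed = JSON.parse(quoteKeys(relaxed));
console.log(parsed['key-3'].key4); // 18.34

// Caveat: the pattern cannot tell a key from similar-looking text inside a string value.
const tricky = quoteKeys('{a: "x, y: z"}');
console.log(tricky); // {"a": "x,"y": z"}

let failed = false;
try {
  JSON.parse(tricky);
} catch (e) {
  failed = e instanceof SyntaxError;
}
console.log(failed); // true
```

So the regex is fine for simple payloads, but values that contain a comma followed by a word and a colon will be corrupted.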
How about?
function fixJson(json) {
var tempString, tempJson, output;
tempString = JSON.stringify(json);
tempJson = JSON.parse(tempString);
output = JSON.stringify(tempJson);
return output;
}
