Normal String assignment:
var str1 = "\320";
console.log(str1); // "Ð"
Raw String assignment:
var str2 = String.raw`\320`;
console.log(str2); // "\320"
In a raw string, the backslashes are not interpreted. I need to interpret them so that "\320" becomes "Ð". Do I have to convert the raw string to a normal string? If so, how? If not, what else should I do, and how?
The thing is, \320 is an octal escape, and since octal codes map to characters, JavaScript interprets it when the string literal is defined.
What you can do is build a map of all the symbols you need, with the raw text as the key and the actual symbol as the value.
For example:
var map = {
"\\320": "\320"
}
console.log(map);
Now you can look up your text in the map and get the required value.
var str2 = String.raw`\320`;
var s = map[str2];
console.log(s);
To build the map, try this:
visit https://brajeshwar.github.io/entities/
and run this code in the console.
// for latin
var tbody = document.getElementById("latin");
var trs = tbody.children;
var map = {};
for (var i = 1; i < trs.length; i++) {
console.log(trs[i].children[6].innerText);
// Declare key and value locally instead of leaking implicit globals.
var key = trs[i].children[6].innerText;
var value = trs[i].children[1].innerText;
map[key] = value;
}
Now log the map to the console, stringify it, paste the string into your code, and parse it.
I have done this only for the Latin table; do the same for the other tables.
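If the raw strings only contain octal escapes, a lookup table may not be necessary. The following is a minimal sketch, assuming the legacy octal escape range (\0 to \377); the helper name decodeOctalEscapes is hypothetical. It converts each escape with parseInt and String.fromCharCode:

```javascript
// Hypothetical helper: decodes legacy octal escapes (\0 .. \377) in a raw string.
// Up to three digits are taken when the first digit is 0-3, otherwise up to two.
const OCTAL_ESCAPE = /\\([0-3][0-7]{0,2}|[4-7][0-7]?)/g;

function decodeOctalEscapes(raw) {
  return raw.replace(OCTAL_ESCAPE, (match, digits) =>
    String.fromCharCode(parseInt(digits, 8))
  );
}

const rawText = String.raw`\320`;
console.log(decodeOctalEscapes(rawText)); // "Ð" (octal 320 is code point 208)
```

Other escape types (\n, \x.., \u....) are left untouched by this sketch.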
The question is a couple of months old, but I think this answer is your best bet yet. Transforming escape sequences from raw strings is very much doable with ES6 String.fromCodePoint(<hex-value>). I'm in the middle of writing an NPM package which deals with this exact scenario.
First, you need a regular expression which matches all escape sequences in your string. I've used this as a reference for all the different ones. (I use a raw string for this to avoid spamming backslashes)
let [single, ...hex] = String.raw`
\\[bfnrtv0'"\\]
\\x[a-fA-F0-9]{2}
(\\u[a-fA-F0-9]{4}){1,}
\\u\{([0-9a-fA-F]{1,})\}`
.split("\n").slice(1).map(cur => cur.trim());
let escapes = new RegExp(`(${[single].concat(hex).join("|")})`, "gm"),
// We need these for later when differentiating how we convert the different escapes.
uniES6 = new RegExp(`${hex.pop()}`);
single = new RegExp(`${single}`);
Now you can match all the escapes; reserved single characters, extended ASCII range, ES6 "Astral" unicode hexadecimals and surrogate pairs. (except octals because they're deprecated, but you can always add it back). The next step is writing a function which can replace the code points with the corresponding symbols. First a switch-like function for singles:
const singleEscape = seq =>
(() => ({
"\\b" : "\b",
"\\f" : "\f",
"\\n" : "\n",
"\\r" : "\r",
"\\t" : "\t",
"\\v" : "\v",
"\\0" : "\0",
"\\'" : "\'",
"\\\"" : "\"",
"\\\\" : "\\"
}[seq]))();
Then we can rely on ES6 String.fromCodePoint to deal with the rest, which are all hexadecimal.
const convertEscape = seq => {
if (single.test(seq))
return singleEscape(seq);
else if (uniES6.test(seq))
return String.fromCodePoint(`0x${seq.split("").slice(3, -1).join("")}`);
else
return String.fromCodePoint.apply(
String, seq.split("\\").slice(1).map(pt => `0x${pt.substr(1)}`)
);
}
Lastly, we tie it all together with a tagged template literal function named normal. I do not know why you need a raw string, but here you can have access to the raw string and put any additional logic while still resulting in a string where escape sequences are properly parsed.
const normal = (strings, ...values) => strings.raw
.reduce((acc, cur, i) => acc += (values[i-1] || "") + cur, "")
.replace(escapes, match => convertEscape(match));
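For comparison, the same idea can be sketched more compactly with a single regular expression whose capture groups tell the escape types apart. This is an illustrative variant under the same assumptions (no octal escapes), not the package mentioned above; the names decodeEscapes and decoded are hypothetical.

```javascript
// Replacement table for the single-character escapes.
const SIMPLE = {
  b: "\b", f: "\f", n: "\n", r: "\r", t: "\t", v: "\v",
  "0": "\0", "'": "'", '"': '"', "\\": "\\"
};

// Groups: 1 = \u{...}, 2 = \uXXXX, 3 = \xXX, 4 = single-character escape.
const ESCAPE = /\\(?:u\{([0-9a-fA-F]+)\}|u([0-9a-fA-F]{4})|x([0-9a-fA-F]{2})|([bfnrtv0'"\\]))/g;

function decodeEscapes(raw) {
  return raw.replace(ESCAPE, (match, braced, u4, x2, simple) => {
    if (simple !== undefined) return SIMPLE[simple];
    // Exactly one of the hexadecimal groups is defined at this point.
    return String.fromCodePoint(parseInt(braced || u4 || x2, 16));
  });
}

// Tagged template: build the raw text first, then decode the escapes in it.
const decoded = (strings, ...values) =>
  decodeEscapes(String.raw(strings, ...values));

console.log(decoded`\x41\u0042\u{43}`); // "ABC"
```

Note that String.fromCodePoint throws a RangeError for values above 0x10FFFF, so untrusted input may need a try/catch around the call.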
How do I convert strings with underscores into spaces and converting it to proper case?
CODE
const string = 'sample_orders'
console.log(string.replace(/_/g, ' '))
Expected Output
Sample Orders
With a string pattern, .replace replaces only the first occurrence of the target. To replace all occurrences, use .replaceAll (or .replace with a global regular expression).
var string = 'sample_orders'
string = string.replaceAll('_', ' ')
Further converting it to the proper case could be accomplished through regEx
string = string.replace(/(^\w|\s\w)/g, firstCharOfWord => firstCharOfWord.toUpperCase());
Where:
^\w : First char of the string
| : or
\s\w : First char after whitespace
g : global flag (match all occurrences)
The second argument is a function that accepts the first character of each word computed using the regular expression and converts it to an upper case.
Binding the above logic into a function:
function formatString(string){
return string.replaceAll('_', ' ').replaceAll(/(^\w|\s\w)/g, firstCharOfWord => firstCharOfWord.toUpperCase());
}
let formattedString = formatString('sample_orders')
console.log(formattedString) //Sample Orders
Caveat: You might encounter an error saying .replaceAll is not a function if you are running on an older browser or runtime environment.
The .replaceAll method was added in ES2021 (ES12). On runtimes that do not support it, the simplest workaround is to use .replace with a global regular expression instead:
function formatString(string){
return string.replace(/_/g, ' ').replace(/(^\w|\s\w)/g, firstCharOfWord => firstCharOfWord.toUpperCase());
}
let formattedString = formatString('sample_orders')
console.log(formattedString) //Sample Orders
Note that we're using the replace method with regular expressions that carry the global flag. This also works on strings that do not contain underscores.
This function should do the job :
function humanize(str) {
var i, frags = str.split('_');
for (i=0; i<frags.length; i++) {
frags[i] = frags[i].charAt(0).toUpperCase() + frags[i].slice(1);
}
return frags.join(' ');
}
humanize('sample_orders');
// > Sample Orders
I am designing a regular expression tester in HTML and JavaScript. The user will enter a regex, a string, and choose the function they want to test with (e.g. search, match, replace, etc.) via radio button and the program will display the results when that function is run with the specified arguments. Naturally there will be extra text boxes for the extra arguments to replace and such.
My problem is getting the string from the user and turning it into a regular expression. If I say that they don't need to have //'s around the regex they enter, then they can't set flags, like g and i. So they have to have the //'s around the expression, but how can I convert that string to a regex? It can't be a literal since it's a string, and I can't pass it to the RegExp constructor since it's not a valid pattern string unless the //'s are removed. Is there any other way to make a user input string into a regex? Will I have to parse the string and flags of the regex with the //'s and then construct it another way? Should I have them enter a string, and then enter the flags separately?
Use the RegExp object constructor to create a regular expression from a string:
var re = new RegExp("a|b", "i");
// same as
var re = /a|b/i;
var flags = inputstring.replace(/.*\/([gimy]*)$/, '$1');
var pattern = inputstring.replace(new RegExp('^/(.*?)/'+flags+'$'), '$1');
var regex = new RegExp(pattern, flags);
or
var match = inputstring.match(new RegExp('^/(.*?)/([gimy]*)$'));
// sanity check here
var regex = new RegExp(match[1], match[2]);
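The sanity check mentioned above can be made explicit. The following is a minimal sketch, assuming the input is expected in /pattern/flags form; the function name parseRegexLiteral is hypothetical. It returns null instead of throwing when the delimiters are missing or when the pattern or flags are invalid:

```javascript
// Hypothetical helper: turns "/pattern/flags" into a RegExp, or returns null.
function parseRegexLiteral(input) {
  const lastSlash = input.lastIndexOf("/");
  // Require a leading slash and a distinct closing slash.
  if (input[0] !== "/" || lastSlash === 0) return null;

  const pattern = input.slice(1, lastSlash);
  const flags = input.slice(lastSlash + 1);
  try {
    // The RegExp constructor throws a SyntaxError on a bad pattern or bad flags.
    return new RegExp(pattern, flags);
  } catch (e) {
    return null;
  }
}

console.log(parseRegexLiteral("/a|b/gi")); // /a|b/gi
console.log(parseRegexLiteral("a|b"));     // null (no delimiters)
```

Splitting at the last slash means slashes inside the pattern do not need special handling here.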
Here is a one-liner: str.replace(/[|\\{}()[\]^$+*?.]/g, '\\$&')
I got it from the escape-string-regexp NPM module.
Trying it out:
escapeStringRegExp.matchOperatorsRe = /[|\\{}()[\]^$+*?.]/g;
function escapeStringRegExp(str) {
return str.replace(escapeStringRegExp.matchOperatorsRe, '\\$&');
}
console.log(new RegExp(escapeStringRegExp('example.com')));
// => /example\.com/
Using tagged template literals with flags support:
function str2reg(flags = 'u') {
return (...args) => new RegExp(escapeStringRegExp(evalTemplate(...args))
, flags)
}
function evalTemplate(strings, ...values) {
let i = 0
return strings.reduce((str, string) => `${str}${string}${
i < values.length ? values[i++] : ''}`, '')
}
console.log(str2reg()`example.com`)
// => /example\.com/u
Use the JavaScript RegExp object constructor.
var re = new RegExp("\\w+");
re.test("hello");
You can pass flags as a second string argument to the constructor. See the documentation for details.
In my case the user input was sometimes surrounded by delimiters and sometimes not, so I added another case:
var regParts = inputstring.match(/^\/(.*?)\/([gim]*)$/);
if (regParts) {
// the parsed pattern had delimiters and modifiers. handle them.
var regexp = new RegExp(regParts[1], regParts[2]);
} else {
// we got pattern string without delimiters
var regexp = new RegExp(inputstring);
}
Try using the following function:
const stringToRegex = str => {
// Main regex
const main = str.match(/\/(.+)\/.*/)[1]
// Regex options
const options = str.match(/\/.+\/(.*)/)[1]
// Compiled regex
return new RegExp(main, options)
}
You can use it like so:
"abc".match(stringToRegex("/a/g"))
//=> ["a"]
Here is my one liner function that handles custom delimiters and invalid flags
// One liner
var stringToRegex = (s, m) => (m = s.match(/^([\/~#;%#'])(.*?)\1([gimsuy]*)$/)) ? new RegExp(m[2], m[3].split('').filter((i, p, s) => s.indexOf(i) === p).join('')) : new RegExp(s);
// Readable version
function stringToRegex(str) {
const match = str.match(/^([\/~#;%#'])(.*?)\1([gimsuy]*)$/);
return match ?
new RegExp(
match[2],
match[3]
// Filter redundant flags, to avoid exceptions
.split('')
.filter((char, pos, flagArr) => flagArr.indexOf(char) === pos)
.join('')
)
: new RegExp(str);
}
console.log(stringToRegex('/(foo)?\/bar/i'));
console.log(stringToRegex('#(foo)?\/bar##gi')); //Custom delimiters
console.log(stringToRegex('#(foo)?\/bar##gig')); //Duplicate flags are filtered out
console.log(stringToRegex('/(foo)?\/bar')); // Treated as string
console.log(stringToRegex('gig')); // Treated as string
I suggest you also add separate checkboxes or a textfield for the special flags. That way it is clear that the user does not need to add any //'s. In the case of a replace, provide two textfields. This will make your life a lot easier.
Why? Because otherwise some users will add //'s while others will not. And some will make a syntax error. Then, after you have stripped the //'s, you may end up with a syntactically valid regex that is nothing like what the user intended, leading to strange behaviour (from the user's perspective).
This will work also when the string is invalid or does not contain flags etc:
function regExpFromString(q) {
let flags = q.replace(/.*\/([gimuy]*)$/, '$1');
if (flags === q) flags = '';
let pattern = (flags ? q.replace(new RegExp('^/(.*?)/' + flags + '$'), '$1') : q);
try { return new RegExp(pattern, flags); } catch (e) { return null; }
}
console.log(regExpFromString('\\bword\\b'));
console.log(regExpFromString('\/\\bword\\b\/gi'));
Thanks to earlier answers, this block serves well as a general-purpose solution for turning a configurable string into a RegExp for filtering text:
var permittedChars = '^a-z0-9 _,.?!#+<>';
permittedChars = '[' + permittedChars + ']';
var flags = 'gi';
var strFilterRegEx = new RegExp(permittedChars, flags);
log.debug ('strFilterRegEx: ' + strFilterRegEx);
strVal = strVal.replace(strFilterRegEx, '');
// this replaces the hard-coded solution:
// strVal = strVal.replace(/[^a-z0-9 _,.?!#+]/ig, '');
You can ask for flags using checkboxes then do something like this:
var userInput = formInput;
var flags = '';
if(formGlobalCheckboxChecked) flags += 'g';
if(formCaseICheckboxChecked) flags += 'i';
var reg = new RegExp(userInput, flags);
Safer, but not safe. (A version of Function that didn't have access to any other context would be good.)
const regexp = Function('return ' + string)()
I found @Richie Bendall's solution very clean. I added a few small modifications because it falls apart and throws an error (maybe that's what you want) when passed non-regex strings.
const stringToRegex = (str) => {
const re = /\/(.+)\/([gim]?)/
const match = str.match(re);
if (match) {
return new RegExp(match[1], match[2])
}
}
Using [gim]? in the pattern will ignore any match[2] value if it's invalid. You can omit the [gim]? pattern if you want an error to be thrown when the regex options are invalid.
I use eval to solve this problem.
For example:
function regex_exec() {
// Important! As @Samuel Faure mentioned, eval on user input is a serious security risk, so assess that risk before using this method.
var userInput = $("#regex").val();
// eval() turns a string such as "/a|b/gi" into a RegExp object
var patt = eval(userInput);
$("#result").val(patt.exec($("#textContent").val()));
}
I have a string that contains key-value pairs separated by different kinds of characters.
I need pure JavaScript (no libraries such as jQuery, and no ECMAScript 5 or 6 features): a regex, or whatever logic is faster, to extract the key-value pairs and create a JavaScript object.
The string looks like the following and will not be long; mostly it will have 2 or 3 key-value pairs.
"key!value~key!value"
"c!XXXXXXX~e!YYYYY~k!YYXXXX~d!" where "~" separates the key-value pairs and "!"
separates a key from its value.
The output after parsing the string should be
{c:"XXXXXXX",e:"YYYYY",k:"YYXXXX",d:''}
Is a regex faster, and what would the pattern be?
Or would a normal for loop with the split function be faster?
You don't need a regex to separate the key-value pairs; just use the split method of the string object:
const KV_SEP = "!";
const ENTITY_SEP = "~";
"c!XXXXXXX~e!YYYYY~k!YYXXXX~d!".split(ENTITY_SEP).map(function(val){
return [val.split(KV_SEP)];
});
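The snippet above yields nested arrays rather than the object the question asks for. A minimal sketch of building the object with split and indexOf follows; the function name parsePairs and its parameters are hypothetical, and it uses modern syntax, so it would need var and a classic for loop on very old engines.

```javascript
// Hypothetical helper: "c!X~e!Y~d!" -> { c: "X", e: "Y", d: "" }
function parsePairs(input, pairSep, kvSep) {
  const result = {};
  for (const pair of input.split(pairSep)) {
    if (pair === "") continue; // skip empty segments, e.g. a trailing separator
    const idx = pair.indexOf(kvSep);
    // A segment without a key-value separator becomes a key with an empty value.
    const key = idx === -1 ? pair : pair.slice(0, idx);
    const value = idx === -1 ? "" : pair.slice(idx + kvSep.length);
    result[key] = value;
  }
  return result;
}

console.log(parsePairs("c!XXXXXXX~e!YYYYY~k!YYXXXX~d!", "~", "!"));
// { c: 'XXXXXXX', e: 'YYYYY', k: 'YYXXXX', d: '' }
```

Splitting each pair at the first separator only means a value may itself contain "!".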
This is regex version
function splitString(str) {
const KEY_INDEX = 1
const VALUE_INDEX = 2
const myKeyValue = {}
// [a-zA-Z] (not [a-zA-z], which also matches punctuation between "Z" and "a")
const myRegex = /(?:([a-z])!([a-zA-Z]*)~?)/g
while(1) {
const match = myRegex.exec(str)
if (match === null) break
myKeyValue[match[KEY_INDEX]] = match[VALUE_INDEX]
}
return myKeyValue
}
console.log('result:', splitString('c!XXXXXXX~e!YYYYY~k!YYXXXX'))
I am parsing some key value pairs that are separated by colons. The problem I am having is that in the value section there are colons that I want to ignore but the split function is picking them up anyway.
sample:
Name: my name
description: this string is not escaped: i hate these colons
date: a date
On the individual lines I tried this line.split(/:/, 1) but it only matched the value part of the data. Next I tried line.split(/:/, 2) but that gave me ['description', 'this string is not escaped'] and I need the whole string.
Thanks for the help!
a = line.split(/:/);
key = a.shift();
val = a.join(':');
Use a capture group with a greedy quantifier, (.+), to split only at the first instance; split includes the captured remainder in its result.
line.split(/: (.+)?/, 2);
If you prefer an alternative to regexp consider this:
var split = line.split(':');
var key = split[0];
var val = split.slice(1).join(":");
Reference: split, slice, join.
Slightly more elegant:
a = line.match(/(.*?):(.*)/);
key = a[1];
val = a[2];
Maybe this approach is the best for this purpose:
var a = line.match(/([^:\s]+)\s*:\s*(.*)/);
var key = a[1];
var val = a[2];
This way you can use tabs in config/data files with this structure and not worry about spaces before or after your name-value delimiter ':'.
Or you can use primitive and fast string functions indexOf and substr to reach your goal in, I think, the fastest way (by CPU and RAM)
for ( ... line ... ) {
var delimPos = line.indexOf(':');
if (delimPos <= 0) {
continue; // Something wrong with this "line"
}
var key = line.substr(0, delimPos).trim();
var val = line.substr(delimPos + 1).trim();
// Do all you need with this key: val
}
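Applied to the sample data from the question, the indexOf approach can be sketched as a complete function. The name parseLines is hypothetical, and slice is used in place of substr; lines without a colon, or with an empty key, are skipped.

```javascript
// Hypothetical helper: parses "key: value" lines, splitting at the first colon only.
function parseLines(text) {
  const result = {};
  for (const line of text.split(/\r?\n/)) {
    const delimPos = line.indexOf(":");
    if (delimPos <= 0) continue; // no colon, or nothing before it
    const key = line.slice(0, delimPos).trim();
    const val = line.slice(delimPos + 1).trim();
    result[key] = val;
  }
  return result;
}

const sample = [
  "Name: my name",
  "description: this string is not escaped: i hate these colons",
  "date: a date"
].join("\n");

console.log(parseLines(sample).description);
// "this string is not escaped: i hate these colons"
```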
Split string in two at first occurrence
To split a string that contains multiple colons (:) only at the first colon,
use a positive lookbehind (?<=):
const a = "Description: this: is: nice";
const b = "Name: My Name";
console.log(a.split(/(?<=^[^:]*):/)); // ["Description", " this: is: nice"]
console.log(b.split(/(?<=^[^:]*):/)); // ["Name", " My Name"]
It basically consumes, from the start of the string ^, everything that is not a colon [^:], zero or more times *. Once the positive lookbehind is satisfied, it finally matches the colon :.
If you additionally want to remove whitespace following the colon,
use /(?<=^[^:]*): */
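function splitOnce(str, sep) {
  const idx = str.indexOf(sep);
  // No separator found: return the whole string as the only part.
  if (idx === -1) return [str];
  // Skip the full separator, which may be longer than one character.
  return [str.slice(0, idx), str.slice(idx + sep.length)];
}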
splitOnce("description: this string is not escaped: i hate these colons", ":")
I'm trying to extract a substring from a file with JavaScript Regex. Here is a slice from the file :
DATE:20091201T220000
SUMMARY:Dad's birthday
the field I want to extract is "Summary". Here is the approach:
extractSummary : function(iCalContent) {
/*
input : iCal file content
return : Event summary
*/
var arr = iCalContent.match(/^SUMMARY\:(.)*$/g);
return(arr);
}
function extractSummary(iCalContent) {
var rx = /\nSUMMARY:(.*)\n/g;
var arr = rx.exec(iCalContent);
return arr[1];
}
You need these changes:
Put the * inside the parentheses, as suggested above. Otherwise your matching group will contain only one character.
Get rid of the ^ and $. Without the multiline (m) flag they match only at the start and end of the full string, not at the start and end of each line. Match on explicit newlines instead.
I suppose you want the matching group (what's inside the parentheses) rather than the full array? arr[0] is the full match ("\nSUMMARY:...") and the following indexes contain the group matches.
With the g flag, String.match(regexp) returns an array of the full matches only, without the groups, whereas RegExp.exec(string) returns the full match followed by the groups, so use exec here.
You need to use the m flag:
multiline; treat beginning and end characters (^ and $) as working
over multiple lines (i.e., match the beginning or end of each line
(delimited by \n or \r), not only the very beginning or end of the
whole input string)
Also put the * in the right place:
"DATE:20091201T220000\r\nSUMMARY:Dad's birthday".match(/^SUMMARY\:(.*)$/gm);
// Changes: the * is now inside the group, (.*), and the m flag is added.
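Because match with the g flag returns only the full matches, the value after "SUMMARY:" still has to be separated out. A minimal sketch that combines the m flag with a capture group and exec, and that returns null when the field is absent (an assumption about the desired behaviour), could look like this:

```javascript
// Sketch: extract the SUMMARY value from iCal text; returns null if absent.
function extractSummary(iCalContent) {
  // m flag: ^ and $ match at line boundaries; no g flag, so exec exposes the group.
  const match = /^SUMMARY:(.*)$/m.exec(iCalContent);
  return match ? match[1] : null;
}

const content = "DATE:20091201T220000\r\nSUMMARY:Dad's birthday\r\nEND:VEVENT";
console.log(extractSummary(content)); // "Dad's birthday"
```

Since . does not match \r or \n in JavaScript, the captured value excludes the line ending for both \n and \r\n files.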
Your regular expression most likely wants to be
/\nSUMMARY:(.*)$/g
A helpful little trick I like to use is to default assign on match with an array.
var arr = iCalContent.match(/\nSUMMARY:(.*)$/g) || [""]; //could also use null for empty value
return arr[0];
This way you don't get annoying type errors when you go to use arr
This code works:
let str = "governance[string_i_want]";
let res = str.match(/[^governance\[](.*)[^\]]/g);
console.log(res);
res will equal ["string_i_want"]. Note that res is an array, so do not treat it like a string.
Be aware that [^governance\[] is a negated character class, not a grouped string: it matches any single character that is not one of those letters or "[". The pattern happens to work for this input, but it is not a general way to extract the text between the brackets.
You can try it out here: https://www.w3schools.com/jsref/tryit.asp?filename=tryjsref_match_regexp
Good luck.
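A more general way to take the text between square brackets is a capture group. This is a minimal sketch; the function name extractBracketed is hypothetical, and it returns only the first bracketed segment, or null if there is none.

```javascript
// Hypothetical helper: returns the text inside the first [...] pair, or null.
function extractBracketed(str) {
  // \[ and \] match literal brackets; ([^\]]*) captures everything up to the first "]".
  const match = /\[([^\]]*)\]/.exec(str);
  return match ? match[1] : null;
}

console.log(extractBracketed("governance[string_i_want]")); // "string_i_want"
```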
(.*) instead of (.)* would be a start. The latter will only capture the last character on the line.
Also, no need to escape the :.
You should use this :
var arr = iCalContent.match(/^SUMMARY\:(.)*$/g);
return(arr[0]);
This is how you can parse iCal files with JavaScript:
function calParse(str) {
function parse() {
var obj = {};
while(str.length) {
var p = str.shift().split(":");
var k = p.shift(); p = p.join(":"); // re-join with ":" so values that contain colons are preserved
switch(k) {
case "BEGIN":
obj[p] = parse();
break;
case "END":
return obj;
default:
obj[k] = p;
}
}
return obj;
}
str = str.replace(/\n /g, " ").split("\n");
return parse().VCALENDAR;
}
example =
'BEGIN:VCALENDAR\n'+
'VERSION:2.0\n'+
'PRODID:-//hacksw/handcal//NONSGML v1.0//EN\n'+
'BEGIN:VEVENT\n'+
'DTSTART:19970714T170000Z\n'+
'DTEND:19970715T035959Z\n'+
'SUMMARY:Bastille Day Party\n'+
'END:VEVENT\n'+
'END:VCALENDAR\n'
cal = calParse(example);
alert(cal.VEVENT.SUMMARY);