Understanding this regular expression - JavaScript

function compress(source) {
var keys = {};
source.replace(
/([^=&]+)=([^&]*)/g,
function(full, key, value) {
keys[key] =
(keys[key] ? keys[key] + "," : "") + value;
return "";
}
);
var result = [];
for (var key in keys) {
result.push(key + "=" + keys[key]);
}
return result.join("&");
}
alert(compress("foo=1&foo=2&blah=a&blah=b&foo=3"));
I'm still confused by this regex, /([^=&]+)=([^&]*)/g. What are the + and * used for?

The ^ means NOT these, the + means one or more matching characters, and the () are groups. The * means any amount of matches (zero or more).
http://www.cheatography.com/davechild/cheat-sheets/regular-expressions/
So by looking at it, I'm guessing it's replacing anything that's NOT =&=& or &=& or ==, which is weird.

+ and * are called quantifiers. They determine how many times the element immediately preceding them (usually a set of characters grouped with [] or ()) may repeat.
/ start of regex
( group 1 starts
[^ anything that does not match
=& equals or ampersand
]+ one or more of above
) group 1 ends
= followed by equals sign followed by
( group 2 starts
[^ anything that does not match
& ampersand
]* zero or more of above
) group 2 ends
/g end of regex; the g flag makes it match every key=value pair, not just the first
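To see the two quantifiers in action, here is a small sketch that runs the same pattern with matchAll and lists what each group captures. The helper name listPairs and the second input string are made up for illustration. The second input shows that * lets the value be empty, while + requires at least one character for the key.

```javascript
const pairPattern = /([^=&]+)=([^&]*)/g;

// Collect [key, value] for every match in the input.
function listPairs(source) {
  return Array.from(source.matchAll(pairPattern), (m) => [m[1], m[2]]);
}

// Five pairs: the key is group 1, the value is group 2.
console.log(listPairs("foo=1&foo=2&blah=a&blah=b&foo=3"));

// "a=" matches with an empty value (the * allows zero characters),
// but "=b" does not match because the key needs one or more characters.
console.log(listPairs("a=&=b"));
```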

Related

Calculating mixed numbers and chars and concatenating it back again in JS/jQuery

I need to manipulate drawing of a SVG, so I have attribute "d" values like this:
d = "M561.5402,268.917 C635.622,268.917 304.476,565.985 379.298,565.985"
What I want is to "purify" all the values (to strip the chars from them), to calculate them (for the sake of simplicity, let's say to add 100 to each value), to deconstruct the string, calculate the values inside and then concatenate it all back together so the final result is something like this:
d = "M661.5402,368.917 C735.622,368.917 404.476,665.985 479.298,665.985"
Have in mind that:
some values can start with a character
values are delimited by comma
some values within comma delimiter can be delimited by space
values are decimal
This is my try:
let arr1 = d.split(',');
arr1 = arr1.map(element => {
let arr2 = element.split(' ');
if (arr2.length > 1) {
arr2 = arr2.map(el => {
let startsWithChar = el.match(/\D+/);
if (startsWithChar) {
el = el.replace(/\D/g,'');
}
el = parseFloat(el) + 100;
if (startsWithChar) {
el = startsWithChar[0] + el;
}
})
}
else {
let startsWithChar = element.match(/\D+/);
if (startsWithChar) {
element = element.replace(/\D/g,'');
}
element = parseFloat(element) + 100;
if (startsWithChar) {
element = startsWithChar[0] + element;
}
}
});
d = arr1.join(',');
I tried with regex replace(/\D/g,'') but then it strips the decimal dot from the value also, so I think my solution is full of holes.
Maybe another solution would be to modify each of the path values/commands directly. I'm open to that solution too, but I don't know how.
const s = 'M561.5402,268.917 C635.622,268.917 304.476,565.985 379.298,565.985'
console.log(s.replaceAll(/[\d.]+/g, m=>+m+100))
You might use a pattern to match the format in the string with 2 capture groups.
([ ,]?\b[A-Z]?)(\d+\.\d+)\b
The pattern matches:
( Capture group 1
[ ,]?\b[A-Z]? Match an optional space or comma, a word boundary and an optional uppercase char A-Z
) Close group 1
( Capture group 2
\d+\.\d+ Match 1+ digits, a dot and 1+ digits
) Close group 2
\b A word boundary to prevent a partial word match
First capture the optional delimiter followed by an optional uppercase char in group 1, and the decimal number in group 2.
Then add 100 to the decimal value and join back the 2 group values.
const d = "M561.5402,268.917 C635.622,268.917 304.476,565.985 379.298,565.985";
const regex = /([ ,]?\b[A-Z]?)(\d+\.\d+)\b/g;
const res = Array.from(
d.matchAll(regex), m => m[1] + (+m[2] + 100)
).join('');
console.log(res);
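If the path can also contain whole numbers, the same idea can be written with a single replace call. This is only a sketch under assumptions: the helper name shiftPathNumbers is hypothetical, and the pattern handles plain unsigned numbers only (no negative values, no exponents).

```javascript
// Add `delta` to every unsigned number in an SVG path string.
// Command letters, commas and spaces are left untouched because
// only the numeric substrings are matched and replaced.
function shiftPathNumbers(d, delta) {
  return d.replace(/\d+(?:\.\d+)?/g, (num) => String(Number(num) + delta));
}

console.log(shiftPathNumbers("M10,20 L30.5,40", 100));
console.log(
  shiftPathNumbers(
    "M561.5402,268.917 C635.622,268.917 304.476,565.985 379.298,565.985",
    100
  )
);
```

Because the arithmetic is done in floating point, some results may print with extra digits; rounding with toFixed before converting back to a string is one option if that matters.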

Extract JPA Named Parameters in Javascript

I am trying to extract JPA named parameters in JavaScript. This is the algorithm I came up with:
const notStrRegex = /(?<![\S"'])([^"'\s]+)(?![\S"'])/gm
const namedParamCharsRegex = /[a-zA-Z0-9_]/;
/**
* @returns array of named parameters which
* 1. always begin with :
* 2. have remaining characters guaranteed to match {@link namedParamCharsRegex}
*
* @example
* 1. "select * from a where id = :myId3;" -> [':myId3']
* 2. "to_timestamp_tz(:FROM_DATE, 'YYYY-MM-DD\"T\"HH24:MI:SS')" -> [':FROM_DATE']
* 3. "TO_CHAR(ep.CHANGEDT,'yyyy=mm-dd hh24:mi:ss')" -> []
*/
export function extractNamedParam(query: string): string[] {
return (query.match(notStrRegex) ?? [])
.filter((word) => word.includes(':'))
.map((splittedWord) => splittedWord.substring(splittedWord.indexOf(':')))
.filter((splittedWord) => splittedWord.length > 1) // ignore ":"
.map((word) => {
// i starts from 1 because word[0] is :
for (let i = 1; i < word.length; i++) {
const isAlphaNum = namedParamCharsRegex.test(word[i]);
if (!isAlphaNum) return word.substring(0, i);
}
return word;
});
}
I got inspired by the solution in
https://stackoverflow.com/a/11324894/12924700
to filter out all characters that are enclosed in single/double quotes.
The code above handles the 3 use cases listed in the comment, but it fails when a user inputs
const testStr = '"user input invalid string \' :shouldIgnoreThisNamedParam \' in a string"'
extractNamedParam(testStr) // should return [] but it returns [":shouldIgnoreThisNamedParam"] instead
I did visit the source code of hibernate to see how named parameters are extracted there, but I couldn't find the algorithm that is doing the work. Please help.
You can use
/"[^\\"]*(?:\\[\w\W][^\\"]*)*"|'[^\\']*(?:\\[\w\W][^\\']*)*'|(:\w+)/g
Get the Group 1 values only. The regex matches strings between single/double quotes and captures : + one or more word chars in all other contexts.
See the JavaScript demo:
const re = /"[^\\"]*(?:\\[\w\W][^\\"]*)*"|'[^\\']*(?:\\[\w\W][^\\']*)*'|(:\w+)/g;
const text = "to_timestamp_tz(:FROM_DATE, 'YYYY-MM-DD\"T\"HH24:MI:SS')";
let matches=[], m;
while (m=re.exec(text)) {
if (m[1]) {
matches.push(m[1]);
}
}
console.log(matches);
Details:
"[^\\"]*(?:\\[\w\W][^\\"]*)*" - a ", then zero or more chars other than " and \ ([^"\\]*), and then zero or more repetitions of any escaped char (\\[\w\W]) followed with zero or more chars other than " and \, and then a "
| - or
'[^\\']*(?:\\[\w\W][^\\']*)*' - a ', then zero or more chars other than ' and \ ([^'\\]*), and then zero or more repetitions of any escaped char (\\[\w\W]) followed with zero or more chars other than ' and \, and then a '
| - or
(:\w+) - Group 1 (this is the value we need to get, the rest is just used to consume some text where matches must be ignored): a colon and one or more word chars.
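The pattern described above can be wrapped in a function and checked against the examples from the question. This is a minimal sketch: the function name extractNamedParams is chosen here for illustration, and it assumes quoted literals are the only context in which a colon should be ignored.

```javascript
// Quoted strings are matched (and thereby skipped); only a match
// from the third alternative fills capture group 1.
const namedParamRe =
  /"[^\\"]*(?:\\[\w\W][^\\"]*)*"|'[^\\']*(?:\\[\w\W][^\\']*)*'|(:\w+)/g;

function extractNamedParams(query) {
  const params = [];
  for (const m of query.matchAll(namedParamRe)) {
    if (m[1] !== undefined) params.push(m[1]);
  }
  return params;
}

console.log(extractNamedParams("select * from a where id = :myId3;"));
console.log(
  extractNamedParams("to_timestamp_tz(:FROM_DATE, 'YYYY-MM-DD\"T\"HH24:MI:SS')")
);
console.log(
  extractNamedParams(
    '"user input invalid string \' :shouldIgnoreThisNamedParam \' in a string"'
  )
);
```

Note that a double colon followed by a word (for example a cast written as a::int) would still be captured as :int, so that case needs extra handling if it can occur in your queries.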

Regex replace all character except last 5 character and whitespace with plus sign

I want to replace all characters except the last 5 characters and the whitespace with +
var str = "HFGR56 GGKDJ JGGHG JGJGIR"
var returnstr = str.replace(/\d+(?=\d{4})/, '+');
The result should be "++++++ +++++ +++++ JGJGIR", but with the code above I don't know how to exclude whitespace.
You need to match each character individually, and you need to allow a match only if at least six characters of that type follow.
I'm assuming that you want to replace alphanumeric characters. Those can be matched by \w. All other characters will be matched by \W.
This gives us:
returnstr = str.replace(/\w(?=(?:\W*\w){6})/g, "+");
Test it live on regex101.com.
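The number of trailing characters to keep can be made a parameter by building the pattern with the RegExp constructor. A sketch, assuming the helper name maskAllButLast (not from the question) and that "characters" means word characters as matched by \w:

```javascript
// Replace every word character that still has at least `keep`
// word characters somewhere after it; non-word characters such
// as spaces are never matched, so they stay in place.
function maskAllButLast(text, keep) {
  const re = new RegExp(`\\w(?=(?:\\W*\\w){${keep}})`, "g");
  return text.replace(re, "+");
}

console.log(maskAllButLast("HFGR56 GGKDJ JGGHG JGJGIR", 6));
// → "++++++ +++++ +++++ JGJGIR"
console.log(maskAllButLast("HFGR56 GGKDJ JGGHG JGJGIR", 5));
// → "++++++ +++++ +++++ +GJGIR"
```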
The pattern \d+(?=\d{4}) does not match in the example string because it matches 1+ digits only where 4 more digits follow on the right.
Another option is to match the space and 5+ word characters till the end of the string or match a single word character in group 1 using an alternation.
In the callback of replace, return a + if you have matched group 1, else return the match.
\w{5,}$|(\w)
let pattern = / \w{5,}$|(\w)/g;
let str = "HFGR56 GGKDJ JGGHG JGJGIR"
.replace(pattern, (m, g1) => g1 ? '+' : m);
console.log(str);
Another way is to replace a group at a time where the number of +
replaced is based on the length of the characters matched:
var target = "HFGR56 GGKDJ JGGHG JGJGIR";
var target = target.replace(
/(\S+)(?!$|\S)/g,
function( m, g1 )
{
var len = parseInt( g1.length ) + 1;
//return "+".repeat( len ); // Non-IE (quick)
return Array( len ).join("+"); // IE (slow)
} );
console.log ( target );
You can use negative lookahead with string end anchor.
\w(?!\w{0,5}$)
Match any word character which is not followed by 0 to 5 word characters and the end of the string.
var str = "HFGR56 GGKDJ JGGHG JGJGIR"
var returnstr = str.replace(/\w(?!\w{0,5}$)/g, '+');
console.log(returnstr)

Replace all “?” by “&” except first

I’d like to replace all “?” by “&” except the first one in JavaScript. I found some regular expressions but they didn’t work.
I have something like:
home/?a=1
home/?a=1?b=2
home/?a=1?b=2?c=3
And I would like:
home/?a=1
home/?a=1&b=2
home/?a=1&b=2&c=3
Does someone know how I can do it?
Thanks!
I don't think it's possible with regex but you can split the string and then join it back together, manually replacing the first occurrence:
var split = 'home/?a=1?b=2'.split('?'); // [ 'home/', 'a=1', 'b=2' ]
var replaced = split[0] + '?' + split.slice(1).join('&') // 'home/?a=1&b=2'
console.log(replaced);
You could match from the start of the string not a question mark using a negated character class [^?]+ followed by matching a question mark and capture that in the first capturing group. In the second capturing group capture the rest of the string.
Use replace and pass a function as the second parameter where you return the first capturing group followed by the second capturing group where all the question marks are replaced by &
let strings = [
"home/?a=1",
"home/?a=1?b=2",
"home/?a=1?b=2?c=3"
];
strings.forEach((str) => {
let result = str.replace(/(^[^?]+\?)(.*)/, function(match, group1, group2) {
return group1 + group2.replace(/\?/g, '&')
});
console.log(result);
});
You can split it by "?" and then rewrap the array:
var string = "home/?a=1?b=2";
var str = string.split('?');
var result = str[0] + '?'; // text before the first '?', plus the first '?' itself ("new" is a reserved word, so use another name)
for( var x = 1; x < str.length; x++ ) {
result = result + str[x];
if( x != ( str.length - 1 ) ) result = result + '&'; // avoid placing a '&' after the last part
}
You can use /([^\/])\?/ as the pattern: it matches any ? character that does not directly follow a / character.
var str = str.replace(/([^\/])\?/g, "$1&");
var str = "home/?a=1\nhome/?a=1?b=2\nhome/?a=1?b=2?c=3\n".replace(/([^\/])\?/g, "$1&");
console.log(str);
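Another way to keep the first "?" regardless of which character precedes it is to locate it with indexOf and only run the replacement on the rest of the string. A minimal sketch; the function name fixQueryString is hypothetical:

```javascript
// Keep everything up to and including the first "?" as is,
// then turn every later "?" into "&".
function fixQueryString(url) {
  const first = url.indexOf("?");
  if (first === -1) return url; // no "?" at all, nothing to do
  const head = url.slice(0, first + 1);
  const tail = url.slice(first + 1).replace(/\?/g, "&");
  return head + tail;
}

["home/?a=1", "home/?a=1?b=2", "home/?a=1?b=2?c=3"].forEach((s) => {
  console.log(fixQueryString(s));
});
```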

Remove Any Non-Digit And Check if Formatted as Valid Number

I'm trying to figure out a regex pattern that allows a string but removes anything that is not a digit, a ., or a leading -.
I am looking for the simplest way of removing any non "number" variables from a string. This solution doesn't have to be regex.
This means that it should turn
1.203.00 -> 1.20300
-1.203.00 -> -1.20300
-1.-1 -> -1.1
.1 -> .1
3.h3 -> 3.3
4h.34 -> 4.34
44 -> 44
4h -> 4
The rule would be that the first period is a decimal point, and every following one should be removed. There should only be one minus sign in the string and it should be at the front.
I was thinking there should be a regex for it, but I just can't wrap my head around it. Most regex solutions I have figured out allow the second decimal point to remain in place.
You can use this replace approach:
In the first replace we remove all non-digit and non-DOT characters. The only exception is a leading hyphen, which we exclude using a negative lookahead.
In the second replace with a callback we are removing all the DOT after first DOT.
Code & Demo:
var nums = ['..1', '1..1', '1.203.00', '-1.203.00', '-1.-1', '.1', '3.h3',
'4h.34', '4.34', '44', '4h'
]
document.writeln("<pre>")
for (i = 0; i < nums.length; i++)
document.writeln(nums[i] + " => " + nums[i].replace(/(?!^-)[^\d.]+/g, "").
replace(/^(-?\d*\.\d*)([\d.]+)$/,
function($0, $1, $2) {
return $1 + $2.replace(/[.]+/g, '');
}))
document.writeln("</pre>")
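The two replace calls can be wrapped in a function, and the "valid number" check from the title can be done as a separate step afterwards. This is a sketch only: the names sanitizeNumber and isValidNumber are made up here, and the validity rule (optional leading minus, digits, at most one dot, at least one digit) is an assumption about what "valid" should mean.

```javascript
// Step 1: drop everything except digits, dots and a leading "-".
// Step 2: keep the first dot and remove all later ones.
function sanitizeNumber(input) {
  return input
    .replace(/(?!^-)[^\d.]+/g, "")
    .replace(/^(-?\d*\.\d*)([\d.]+)$/, (match, head, rest) =>
      head + rest.replace(/\./g, "")
    );
}

// Assumed rule: "-" optional, then "12", "12.", "12.5" or ".5".
function isValidNumber(text) {
  return /^-?(?:\d+(?:\.\d*)?|\.\d+)$/.test(text);
}

["1.203.00", "-1.203.00", "-1.-1", ".1", "3.h3", "4h.34", "44", "4h", "-", "abc"]
  .forEach((s) => {
    const cleaned = sanitizeNumber(s);
    console.log(s + " => " + cleaned + " (valid: " + isValidNumber(cleaned) + ")");
  });
```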
A non-regex solution, implementing a trivial single-pass parser.
Uses ES5 Array features because I like them, but will work just as well with a for-loop.
function generousParse(input) {
var sign = false, point = false;
return input.split('').filter(function(char) {
if (char.match(/[0-9]/)) {
return sign = true;
}
else if (!sign && char === '-') {
return sign = true;
}
else if (!point && char === '.') {
return point = sign = true;
}
else {
return false;
}
}).join('');
}
var inputs = ['1.203.00', '-1.203.00', '-1.-1', '.1', '3.h3', '4h.34', '4.34', '4h.-34', '44', '4h', '.-1', '1..1'];
console.log(inputs.map(generousParse));
Yes, it's longer than multiple regex replaces, but it's much easier to understand and see that it's correct.
I can do it with a regex search-and-replace. num is the string passed in.
num.replace(/[^\d\-\.]/g, '').replace(/(.)-/g, '$1').replace(/\.(\d*\.)*/, function(s) {
return '.' + s.replace(/\./g, '');
});
OK weak attempt but seems fine..
var r = /^-?\.?\d+\.?|(?=[a-z]).*|\d+/g,
str = "1.203.00\n-1.203.00\n-1.-1\n.1\n3.h3\n4h.34\n44\n4h",
sar = str.split("\n").map(s=> s.match(r).join("").replace(/[a-z]/,""));
console.log(sar);
