Backward capture group concatenated with forward capture group - javascript

I think the title says it all. I'm trying to get groups and concatenate them together.
I have this text:
GPX 10.802.123/3843­ 1 -­ IDENTIFIER 48
And I want this output:
IDENTIFIER 10.802.123/3843-48
To be explicit: I want to capture one group before this word and one after it, then concatenate the two, using only regex. Is this possible?
I can already extract the 48 like this:
var text = 'GPX 10.802.123/3843­ 1 -­ IDENTIFIER 48';
var reg = new RegExp('IDENTIFIER' + '.*?(\\d\\S*)', 'i');
var match = reg.exec(text);
Output:
48
Can it be done?
I'm offering 200 points.

You must precisely define the groups that you want to extract before and after the word. If you define the group before the word as four or more non-whitespace characters, and the group after the word as one or more non-whitespace characters, you can use the following regular expression.
var re = new RegExp('(\\S{4,})\\s+(?:\\S{1,3}\\s+)*?' + word + '.*?(\\S+)', 'i');
var groups = re.exec(text);
if (groups !== null) {
var result = groups[1] + groups[2];
}
Let me break down the regular expression. Note that we have to escape the backslashes because we're writing a regular expression inside a string.
(\\S{4,}) captures a group of four or more non-whitespace characters
\\s+ matches one or more whitespace characters
(?: indicates the start of a non-capturing group
\\S{1,3} matches one to three non-whitespace characters
\\s+ matches one or more whitespace characters
)*? makes the non-capturing group match zero or more times, as few times as possible
word matches whatever was in the variable word when the regular expression was compiled
.*? matches any character zero or more times, as few times as possible
(\\S+) captures one or more non-whitespace characters
the 'i' flag makes this a case-insensitive regular expression
Observe that our use of the ? modifier allows us to capture the nearest groups before and after the word.
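One caveat with the concatenation above: word is inserted into the pattern as-is, so a word containing regex metacharacters (such as . or parentheses) would change the meaning of the pattern. Below is a minimal sketch of one way to guard against that. The helper escapeRegExp and the wrapper joinAround are hypothetical names, not part of the original answer.

```javascript
// Hypothetical helper: escape regex metacharacters so the word is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical wrapper around the pattern from the answer above.
// Returns the group before the word joined with the group after it,
// or null when the pattern does not match.
function joinAround(word, text) {
  var re = new RegExp(
    '(\\S{4,})\\s+(?:\\S{1,3}\\s+)*?' + escapeRegExp(word) + '.*?(\\S+)',
    'i'
  );
  var groups = re.exec(text);
  return groups === null ? null : groups[1] + groups[2];
}

console.log(joinAround('IDENTIFIER', 'GPX 10.802.123/3843- 1 -- IDENTIFIER 48'));
// "10.802.123/3843-48"
```

The sample text here uses plain ASCII hyphens, as in the sampleText of the answer.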
You can match the regular expression globally in the text by adding the g flag. The snippet below demonstrates how to extract all matches.
function forward_and_backward(word, text) {
var re = new RegExp('(\\S{4,})\\s+(?:\\S{1,3}\\s+)*?' + word + '.*?(\\S+)', 'ig');
// Find all matches and make an array of results.
var results = [];
while (true) {
var groups = re.exec(text);
if (groups === null) {
return results;
}
var result = groups[1] + groups[2];
results.push(result);
}
}
var sampleText = " GPX 10.802.123/3843- 1 -- IDENTIFIER 48 A BC 444.2345.1.1/99x 28 - - Identifier 580 X Y Z 9.22.16.1043/73+ 0 *** identifier 6800";
var results = forward_and_backward('IDENTIFIER', sampleText);
for (var i = 0; i < results.length; ++i) {
document.write('result ' + i + ': "' + results[i] + '"<br><br>');
}

You can do:
var text = 'GPX 10.802.123/3843­ 1 -­ IDENTIFIER 48';
var match = /GPX\s+(.+?) \d .*?(IDENTIFIER).*?(\d\S*)/i.exec(text);
var output = match[2] + ' ' + match[1] + '-' + match[3];
//=> "IDENTIFIER 10.802.123/3843­-48"

This is possible with the replace function.
var s = 'GPX 10.802.123/3843­ 1 -­ IDENTIFIER 48'
s.replace(/.*?(\S+)\s+\d+\s*-\s*(IDENTIFIER)\s*(\d+).*/, "$2 $1-$3")

^\s*\S+\s*\b(\d+(?:[./]\d+)+)\b.*?-.*?\b(\S+)\b\s*(\d+)\s*$
You can try this. Replace with $2 $1-$3. See the demo.
https://regex101.com/r/sS2dM8/38
var re = /^\s*\S+\s*\b(\d+(?:[.\/]\d+)+)\b.*?-.*?\b(\S+)\b\s*(\d+)\s*$/gm;
var str = 'GPX 10.802.123/3843­ 1 -­ IDENTIFIER 48';
var subst = '$2 $1-$3';
var result = str.replace(re, subst);

You can use split too:
var text = 'GPX 10.802.123/3843­ 1 -­ IDENTIFIER 48';
var parts = text.split(/\s+/);
if (parts[4] == 'IDENTIFIER') {
var result = parts[4] + ' ' + parts[1] + '-' + parts[5];
console.log(result);
}

Related

Adding a whitespace in front of the first number in a string

I need a whitespace to be added in front of the first number of a string, unless there is one already and unless the number is the first character in the string.
So I wrote this JS code:
document.getElementById('billing:street1').addEventListener('blur', function() {
var value = document.getElementById('billing:street1').value;
var array = value.match(/\d{1,}/g);
if (array !== null) {
var number = array[0];
var index = value.indexOf(number);
if(index !== 0){
var street = value.substring(0, index);
var housenumber = value.substring(index);
if (street[street.length - 1] !== ' ') {
document.getElementById('billing:street1').value = street + ' ' + housenumber;
}
}
}
});
It works fine, but I feel like this can probably be done in a smarter, more compact way.
Also, jQuery suggestions are welcome; I am just not very familiar with it.
Try this one :
const addSpace = (str) => {
return str.replace(/(\D)(\d)/, "$1 $2")
}
console.log(addSpace("12345")) // "12345"
console.log(addSpace("city12345")) // "city 12345"
console.log(addSpace("city")) // "city"
(\D) captures a non-digit
(\d) captures a digit
so (\D)(\d) means: a non-digit followed by a digit,
which we replace with "$1 $2" = first capture + space + second capture
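Note that (\D)(\d) also matches a space followed by a digit, so "city 12345" would gain a second space. Below is a minimal sketch of a stricter variant, assuming the goal stated in the question: only the first number is considered, and nothing is added if it already has whitespace before it or starts the string. The function name is hypothetical.

```javascript
// Hypothetical variant: insert a space before the first number only,
// and only when the character before it is not already whitespace.
function addSpaceBeforeFirstNumber(str) {
  // ^(\D*[^\s\d]) : everything before the first digit, ending in a
  //                 character that is neither whitespace nor a digit
  // (\d)          : the first digit in the string
  return str.replace(/^(\D*[^\s\d])(\d)/, '$1 $2');
}

console.log(addSpaceBeforeFirstNumber('12345'));      // "12345"
console.log(addSpaceBeforeFirstNumber('city12345'));  // "city 12345"
console.log(addSpaceBeforeFirstNumber('city 12345')); // "city 12345"
console.log(addSpaceBeforeFirstNumber('city'));       // "city"
```

Anchoring at ^ and using \D* ensures that only the first digit in the string can take part in the match.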
You can do it by using only regular expressions. For example:
var s = "abcde45";
if(!s.match(/\s\d/)){
s = s.replace(/(\d)/, ' $1');
}
console.log(s); // "abcde 45"
UPD: Of course, if the string has the wrong syntax (e.g. no numbers inside), this code won't work.

Regex with dynamic requirement

Suppose I have string:
var a = '/c/Slug-A/Slug-B/Slug-C'
I have 3 possibilities:
var b = 'Slug-A' || 'Slug-B' || 'Slug-C'
My expectation:
if (b === 'Slug-A') return 'Slug B - Slug C';
if (b === 'Slug-B') return 'Slug A - Slug C';
if (b === 'Slug-C') return 'Slug A - Slug B';
What I've done so far:
a.replace(b,'')
.replace(/\/c\//,'') // clear /c/
.replace(/-/g,' ')
.replace(/\//g,' - ')
Then, I'm stuck
Any help would be appreciated. Thanks
Try this:
var a = '/c/Slug-A/Slug-B/Slug-C'
var b = 'Slug-A' || 'Slug-B' || 'Slug-C'
var reg = new RegExp(b.replace(/([A-Za-z]$)/,function(val){return '[^'+val+']'}),'g');
a.match(reg).map(function(val){return val.replace('-',' ')}).join(' - ');
Explanation:
The replace call on b captures the last letter of the string and wraps it in a negated character class, [^...]. Instead of matching that letter, the pattern now matches any character except it.
This means the regex matches every Slug-X whose last character differs from the one in b.
All that is left is to join the matches with whatever separator you like.
Try this, I made it as simple as possible.
var a = '/c/Slug-A/Slug-B/Slug-C';
var b = 'Slug-A';
var regex = new RegExp(b+'|\/c\/|-|\/','g');
alert(a.replace(regex, " ").trim().replace(/(\s.*?)\s+/,'$1 - '));
//OR
alert(a.replace(regex, " ").trim().match(/\w+\s\w/g).join(' - '));
Explanation
1) b+'|\/c\/|-|\/','g' = matches the value of b, /c/, - and /
2) a.replace(regex, " ") = replaces every matched part with a space. After trim(), with b = 'Slug-A', the string is "Slug B Slug C".
3) .replace(/(\s.*?)\s+/,'$1 - ') = matches a whitespace character, the shortest possible run of characters after it, and the whitespace that follows. It then replaces the match with the captured part plus ' - '.
Note that the part (\s.*?) is grouped in the regex (\s.*?)\s+ so that it can be reused in the replacement text. $1 holds the grouped part of the matched text, so here $1 = " B". The regex matches " B " and replaces it with " B" + " - " = " B - ".
4) .match(/\w+\s\w/g).join(' - ') = matches every part made of word characters followed by a space followed by a word character. match returns an array of the matched parts, which join then combines.
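For comparison, the same result can be sketched without building a regex from b, by splitting the path on / and filtering. This is a different technique from the regex-based answers here. The function name otherSlugs is hypothetical, and the sketch assumes the path always starts with /c/ and that - is used only as a word separator inside a slug.

```javascript
// Hypothetical helper: list every slug in `path` except `exclude`,
// with hyphens turned into spaces and the slugs joined by " - ".
function otherSlugs(path, exclude) {
  return path
    .split('/')   // ['', 'c', 'Slug-A', 'Slug-B', 'Slug-C']
    .slice(2)     // drop the empty string and the leading 'c' (assumed prefix)
    .filter(function (part) { return part !== exclude; })
    .map(function (part) { return part.replace(/-/g, ' '); })
    .join(' - ');
}

console.log(otherSlugs('/c/Slug-A/Slug-B/Slug-C', 'Slug-A')); // "Slug B - Slug C"
console.log(otherSlugs('/c/Slug-A/Slug-B/Slug-C', 'Slug-B')); // "Slug A - Slug C"
```

Because exclude is compared as a plain string, no regex escaping is needed for it.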
Do it this way
var a = '/c/Slug-A/Slug-B/Slug-C'
var b = 'Slug-A'
//var b = 'Slug-B'
//var b = 'Slug-C'
var k = a.replace(b,'')
.replace(/\/c\//,'') // clear /c/
.replace(/-/g,' ')
.match(/[a-zA-Z -]+/g).join(" - ")
console.log(k)

From 'one behind' to 'one before' (between)

In the string below, I want to find, for example, everything between '[' and 'A'.
Here these are 'match1', 'match2', 'match3'. I then want to replace every match with, for example, 'check'.
var str = "+dkdele*[match1Ayesses ss [match2Aevey[match3A";
var pattern = /(?=\[)(.*?)([]?A)/g; // includes '[' and 'A'
var res = str.replace(pattern, "check"); // could be '[checkA'
console.log(res);
The pattern includes '[' and 'A', which I don't want. What would a pattern look like that matches only what lies between any two chosen characters?
You can use this regex:
/\[(\w+)(?=A)/g
and grab captured group #1
If there can be more than just word characters, then use:
/\[([^A]+)(?=A)/g
Code:
var re = /\[([^A]+)(?=A)/g;
var str = '+dkdele*[match1Ayesses ss [match2Aevey[match3A';
var m;
while ((m = re.exec(str)) !== null) {
if (m.index === re.lastIndex)
re.lastIndex++;
console.log(m[1]);
}
Output:
match1
match2
match3
EDIT: Based on edited question and comment below:
For replacement you can use:
var repl = str.replace(/(\[)[^A]+(?=A)/g, '$1check');
//=> +dkdele*[checkAyesses ss [checkAevey[checkA
PS: If you want A also to be replaced then use:
var repl = str.replace(/(\[)[^A]+A/g, '$1check');
//=> +dkdele*[checkyesses ss [checkevey[check
You can use:
var pattern = /(\[).*?(A)/g;
var replacement = "check";
var res = str.replace( pattern, '$1' + replacement + '$2' );
The regular expression is:
(\[) - match an open square bracket and capture the match in the first group.
.*? - match the minimum possible of zero-or-more of any characters
(A) - then match an A and capture it in the second group.
The replacement will put in the first and second capture groups instead of $1 and $2 (respectively).
With any start and end matches you can do:
var start = "\\[";
var end = "A";
var pattern = new RegExp( "(" + start + ").*?(" + end + ")", "g" );
var replacement = "check";
var res = str.replace( pattern, '$1' + replacement + '$2' );
If you don't want to include the start and end match characters then don't include the capture groups in the replacement:
var res = str.replace( pattern, replacement );
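When start and end come from arbitrary input, they need escaping before they go into new RegExp, and a $ in the replacement text would otherwise be read as a group reference. Below is a minimal sketch under those assumptions. escapeRegExp and replaceBetween are hypothetical names, not part of the answer above.

```javascript
// Hypothetical helper: escape regex metacharacters in a literal delimiter.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical wrapper: replace whatever lies between `start` and `end`,
// keeping the delimiters themselves.
function replaceBetween(str, start, end, replacement) {
  var pattern = new RegExp(
    '(' + escapeRegExp(start) + ').*?(' + escapeRegExp(end) + ')',
    'g'
  );
  // A function replacement inserts `replacement` literally,
  // so "$1" or "$&" inside it are not interpreted.
  return str.replace(pattern, function (match, open, close) {
    return open + replacement + close;
  });
}

var str = '+dkdele*[match1Ayesses ss [match2Aevey[match3A';
console.log(replaceBetween(str, '[', 'A', 'check'));
// "+dkdele*[checkAyesses ss [checkAevey[checkA"
```

With this wrapper the caller can pass "[" directly instead of the pre-escaped "\\[".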

Javascript SUBSTR

I have a dynamic string value "radio_3_*".
Like:
1 - radio_3_5
2 - radio_3_8
3 - radio_3_78
4 - radio_3_157
5 - radio_3_475
How can I pick the radio_3 part.
Basic regular expression
var str = "radio_3_5";
console.log(str.match(/^[a-z]+_\d+/i));
And how the regular expression works:
/ Start of the regular expression
^ Match start of line
[a-z]+ Match one or more letters, a through z (either case, because of the i flag)
_ Match underscore character
\d+ Match one or more digits
/ End of the regular expression
i Ignores case
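Since match returns an array (or null when nothing matches), a small wrapper can return just the matched prefix. This is a minimal sketch; the function name radioPrefix is hypothetical.

```javascript
// Hypothetical wrapper: return the "letters_digits" prefix of an id,
// or null when the id does not start with that shape.
function radioPrefix(id) {
  var m = /^[a-z]+_\d+/i.exec(id);
  return m === null ? null : m[0];
}

console.log(radioPrefix('radio_3_5'));   // "radio_3"
console.log(radioPrefix('radio_3_475')); // "radio_3"
console.log(radioPrefix('3_radio'));     // null
```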
Or with split
var str = "radio_334_1234";
var parts = str.split("_");
var result = parts[0] + "_" + parts[1];
Or even crazier (would not do)
var str = "radio_334_1234";
var result = str.split("_").slice(0,2).join("_");
You could just take your string and use javascript method match
myString = "radio_334_1234"
myString.match("[A-Za-z]*_[0-9]*")
//output: radio_334
[A-Za-z]* Will take any number of letters in upper or lower case (including none)
_ Will take the underscore
[0-9]* Will take any number of digits from 0 to 9 (including none)
Try this:
var str = 'radio_3_78';
var splitStr = str.split('_');
var result = splitStr[0] + '_' + splitStr[1];
http://jsfiddle.net/7faag7ug/2/
Use split and pop, like below. Note that this returns the last part of the string rather than the radio_3 prefix.
"radio_3_475".split("_").pop(); // = "475"

Regexp search not surrounded by

I want to find all occurrences of % that are not within quotation characters.
Example: "test% testing % '% hello' " would return ["%","%"]
Looking at another Stack Overflow thread, this is what I found:
var patt = /!["'],*%,*!['"]/g
var str = "testing 123 '%' % '% ' "
var res = str.match(patt);
However, this gives me null. Do you have any tips on what I should do?
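You could try the regex below, which is based on a positive lookahead assertion. It matches a % only when an even number of single quotes follows it in the rest of the string, that is, when the % is not inside a quoted part.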
> var s = "test% testing % '% hello' "
> s.match(/%(?=(?:[^']*'[^']*')*[^']*$)/g)
[ '%', '%' ]
> var str = "testing %"
undefined
> str.match(/%(?=(?:[^']*'[^']*')*[^']*$)/g)
[ '%' ]
> var str1 = "testing '%'"
undefined
> str1.match(/%(?=(?:[^']*'[^']*')*[^']*$)/g)
null
Try this:
var patt= /[^"'].*?(%).*?[^'"]/g ;
var str = "testing 123 '%' % '% ' "
var res = str.match(patt);
console.dir(res[1]); // result will be in the 1st match group: res[1]
Explanation:
[^"'] - any character except " or '
.*? - any characters (except newline), zero or more times, non-greedy
Update
Actually, you must check that there are no quotes both behind and ahead of the %.
But:
JavaScript regular expressions did not support lookbehinds when this answer was written (they were added later, in ES2018).
Without lookbehind, you have no way to identify a " or ' preceding the % sign unless more restrictions are applied.
I'd suggest doing the search in PHP or another language where lookbehind is supported, or imposing more conditions.
Since I'm not a big fan of regular expressions, here's my approach.
What matters in my answer: if the string contains a trailing, unclosed quote, the other answers won't work. In other words, only my answer handles an odd number of quotes.
function countUnquoted(str, charToCount) {
var i = 0,
len = str.length,
count = 0,
suspects = 0,
char,
flag = false;
for (; i < len; i++) {
char = str.charAt(i);
if ("'" === char) {
flag = !flag;
suspects = 0;
} else if (charToCount === char && !flag) {
count++;
} else if (charToCount === char) {
suspects++;
}
}
//this way we can also count occurrences in the situation
//where a quotation mark has been opened but not closed by the end of the string
if (flag) {
count += suspects;
}
return count;
}
As far as I understand, you wanted to count those percent signs, so there's no need to put them in an array.
In case you really, really need to fill this array, you can do it like this:
function matchUnquoted(str, charToMatch) {
var res = [],
i = 0,
count = countUnquoted(str, charToMatch);
for (; i < count; i++) {
res.push(charToMatch);
}
return res;
}
matchUnquoted("test% testing % '% hello' ", "%");
Trailing quote
Here's a comparison of a case when there is a trailing ' (not closed) in the string.
> var s = "test% testing % '% hello' asd ' asd %"
> matchUnquoted(s, '%')
['%', '%', '%']
>
> // Avinash Raj's answer
> s.match(/%(?=(?:[^']*'[^']*')*[^']*$)/g)
['%', '%']
Use this regex: (['"]).*?\1|(%) and the second capture group will have all the % signs that are not inside single or double quotes.
Breakdown:
(['"]).*?\1 captures a single or double quote, followed by anything (lazy) up to a matching single or double quote
|(%) captures a % only if it wasn't slurped up by the first part of the alternation (i.e., if it's not in quotes)
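In JavaScript, the second capture group has to be collected match by match. Below is a minimal sketch of how that might look, using exec in a loop. The function name unquotedPercents is hypothetical.

```javascript
// Hypothetical helper: collect every % that is not inside a pair of
// matching single or double quotes.
function unquotedPercents(str) {
  var re = /(['"]).*?\1|(%)/g;
  var found = [];
  var m;
  while ((m = re.exec(str)) !== null) {
    // Group 2 is set only when the match came from the (%) alternative;
    // for a quoted section it is undefined and the match is skipped.
    if (m[2] !== undefined) {
      found.push(m[2]);
    }
  }
  return found;
}

console.log(unquotedPercents("test% testing % '% hello' ")); // [ '%', '%' ]
console.log(unquotedPercents("testing 123 '%' % '% ' "));    // [ '%' ]
```

Both alternatives consume at least one character, so the loop cannot stall on an empty match.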
