Regular expression match all except first occurrence - javascript

I need a regular expression to match all occurrences of a dot (.) except the first one.
For example if the source is:
aaa.bbb.ccc..ddd
the expression should match the dots after bbb and ccc but not the dot after aaa. In other words, it should match all dots except the first one.
I need it for javascript regex.

With PCRE (PHP, R) you can do this:
\G(?:\A[^.]*\.)?+[^.]*\K\.
details:
\G # anchor for the start of the string or the position after a previous match
(?:\A[^.]*\.)?+ # optionally, at the start of the string, everything up to and including the first dot (possessive quantifier)
[^.]* # all that is not a dot
\K # remove all that has been matched on the left from the match result
\. # the literal dot
With .NET: (easy since you can use a variable-length lookbehind)
(?<!^[^.]*)\.
With JavaScript engines that lack lookbehind support, there is no way to do it with a single pattern.
using a placeholder:
var result = s.replace('.', 'PLACEHOLDER')
.replace(/\./g, '|')
.replace('PLACEHOLDER', '.');
(or replace all dots with | and then replace the first occurrence of | with a dot).
using split:
var parts = s.split('.');
var result = parts.shift() + (parts.length ? '.': '') + parts.join('|');
with a counter:
var counter = 0;
var result = s.replace(/\./g, (_) => counter++ ? '|' : '.');
With NodeJS (or any other implementation that allows lookbehinds):
var result = s.replace(/((?:^[^.]*\.)?(?<=.)[^.]*)\./g, "$1|");

One-line solution for JavaScript using arrow function (ES6):
'aaa.bbb.ccc..ddd'
.replace(/\./g, (c, i, text) => text.indexOf(c) === i ? c : '|')
-> 'aaa.bbb|ccc||ddd'
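For reuse, the ideas above can be wrapped in small helpers. This is only a minimal sketch: `replaceAllButFirst` and `replaceDotsButFirst` are hypothetical names, and the second function assumes an engine that supports lookbehind assertions (ES2018), where the lookbehind may be of variable length.

```javascript
// Hypothetical helper: replace every occurrence of `needle` except the first.
// Uses plain string methods, so `needle` needs no regex escaping.
function replaceAllButFirst(s, needle, replacement) {
  var first = s.indexOf(needle);
  if (first === -1) return s; // nothing to replace
  var cut = first + needle.length;
  // Keep everything up to and including the first occurrence,
  // then swap the remaining occurrences.
  return s.slice(0, cut) + s.slice(cut).split(needle).join(replacement);
}

// Assumes lookbehind support: a dot matches only if
// another dot appears somewhere before it in the string.
function replaceDotsButFirst(s, replacement) {
  return s.replace(/(?<=\.[\s\S]*)\./g, replacement);
}

console.log(replaceAllButFirst('aaa.bbb.ccc..ddd', '.', '|')); // aaa.bbb|ccc||ddd
console.log(replaceDotsButFirst('aaa.bbb.ccc..ddd', '|'));     // aaa.bbb|ccc||ddd
```

The string-based version avoids regex escaping and shared counter state; the lookbehind version stays a single pattern but only runs where lookbehind is available.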

Related

regex to extract numbers starting from second symbol

Sorry for one more to the tons of regexp questions but I can't find anything similar to my needs. I want to output the string which can contain number or letter 'A' as the first symbol and numbers only on other positions. Input is any string, for example:
---INPUT--- -OUTPUT-
A123asdf456 -> A123456
0qw#$56-398 -> 056398
B12376B6f90 -> 12376690
12A12345BCt -> 1212345
What I tried is replace(/[^A\d]/g, '') (I use JS), which almost does the job except the case when there's A in the middle of the string. I tried to use ^ anchor but then the pattern doesn't match other numbers in the string. Not sure what is easier - extract matching characters or remove unmatching.
I think you can do it like this using a negative lookahead and then replace with an empty string.
In a non-capturing group (?:, use a negative lookahead (?! to assert that what follows is neither an A at the beginning of the string (^A) nor a digit (\d). If the assertion holds, match any character with the dot (.).
(?:(?!^A|\d).)+
var pattern = /(?:(?!^A|\d).)+/g;
var strings = [
"A123asdf456",
"0qw#$56-398",
"B12376B6f90",
"12A12345BCt"
];
for (var i = 0; i < strings.length; i++) {
console.log(strings[i] + " ==> " + strings[i].replace(pattern, ""));
}
You can match and capture desired and undesired characters within two different sides of an alternation, then replace those undesired with nothing:
^(A)|\D
JS code:
var inputStrings = [
"A-123asdf456",
"A123asdf456",
"0qw#$56-398",
"B12376B6f90",
"12A12345BCt"
];
console.log(
inputStrings.map(v => v.replace(/^(A)|\D/g, "$1"))
);
You can use the following regex: /(^A|\d)/g
var arr = ['A123asdf456','0qw#$56-398','B12376B6f90','12A12345BCt', 'A-123asdf456'],
// match() returns null when nothing matches, so fall back to an empty array
result = arr.map(s => (s.match(/(^A|\d)/g) || []).join(''));
console.log(result);
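To check an approach against the table in the question, the alternation-based replacement can be wrapped in a function and run over the sample inputs. This is a sketch under the assumption that only a leading uppercase A is to be kept; `keepLeadingAAndDigits` is a hypothetical name.

```javascript
// Hypothetical helper: keep a leading "A" (if present) and all digits,
// drop every other character.
function keepLeadingAAndDigits(input) {
  // ^(A) captures a leading A so "$1" puts it back;
  // \D matches any other non-digit, for which $1 is empty.
  return input.replace(/^(A)|\D/g, "$1");
}

// Sample inputs and expected outputs taken from the question.
var samples = {
  "A123asdf456": "A123456",
  "0qw#$56-398": "056398",
  "B12376B6f90": "12376690",
  "12A12345BCt": "1212345"
};

Object.keys(samples).forEach(function (input) {
  console.log(input + " -> " + keepLeadingAAndDigits(input));
});
```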

replacing all String in javascript using regex

I have a dynamic string expression
var expression = "count+count1+12-(count3+count4)";
I want to wrap each variable name in v[...], like this output
Output:-
v[count]+v[count1]+12-(v[count3]+v[count4]);
I have tried this regex expression,
expression = expression.replace(/[a-z]+|[A-Z]+/g, "v["/$1/"]").replace(/[\(|\|\.)]/g, "");
Is it possible to do this with a regular expression?
You may use
var expression = "count+count1+12-(count3+count4)";
var res = expression.replace(/\b[a-z]\w*/ig, "v[$&]");
console.log(res);
Details:
\b - a leading word boundary
[a-z] - an ASCII letter
\w* - 0+ word chars ([a-zA-Z0-9_]).
The replacement contains $&, a backreference to the whole match.
Another solution splits on the math operators and parentheses, and wraps with v[...] only those substrings that are not a number, an operator, a parenthesis, or an empty string:
var expression = "count+count1+12+234.56-(count3+count4)";
var res = expression.split(/([-+\/*()])/).map(function(x) {
// Leave numbers, operators, parentheses and empty pieces untouched
return /^(\d*\.?\d+|[-*\/+()])?$/.test(x) ? x : "v["+x+"]";
}).join("");
console.log(res);
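The word-boundary approach can also be written with a replacer function, which keeps the wrapper name out of the replacement string (where `$` sequences have special meaning). A minimal sketch; `wrapIdentifiers` and its `arrayName` parameter are hypothetical and not part of the original answers.

```javascript
// Hypothetical helper: wrap each identifier in `arrayName[...]`.
// An identifier here is an ASCII letter followed by word characters,
// the same definition as /\b[a-z]\w*/ig above.
function wrapIdentifiers(expression, arrayName) {
  return expression.replace(/\b[a-z]\w*/gi, function (name) {
    return arrayName + "[" + name + "]";
  });
}

console.log(wrapIdentifiers("count+count1+12-(count3+count4)", "v"));
// v[count]+v[count1]+12-(v[count3]+v[count4])
```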

Inverting a rather complex set of regexes

I'm sort of new to regular expressions, and none of the solutions I found online helped/worked.
I'm dealing with a one-line String in JavaScript, it'll contain five types of data mixed in.
A "#" followed by six numbers/letters (HTML color) (/#....../g)
A forward slash followed by any of a few specific characters (/\/(\+|\^|\-|#|!\+|_|#|\*|%|&|~)/g)
A "$" followed by a sequence of letters and a "|" (/\$([^\|]+)/g)
A "|" alone (/\|/g)
Alphanumeric characters that do not fall under any of these categories
The thing is, I have regexes to match the first four categories, that are important.
The problem is that I need a single Regex that I'll use to replace all the characters that DO NOT match for the first four regexes with a single character, such as "§".
Example:
This#00CC00 is green$Courier| and /^mono|spaced
§§§§#00CC00§§§§§§§§§$Courier|§§§§§/^§§§§|§§§§§§
I know I may be attacking this problem the wrong way, I'm rather new to regular expressions.
Essentially, how do I make a regex that means "anything that doesn't have any matches for regexes x, y, or z"?
Thank you for your time.
Use this pattern:
((#\w{6}|\/[\/\(\+\^\-]|\$\w+\||\|)*).
and replace with $1§
The downside is that your preserved pattern has to be followed by at least one character.
( # Capturing Group (1)
( # Capturing Group (2)
# # "#"
\w # <ASCII letter, digit or underscore>
{6} # (repeated {6} times)
| # OR
\/ # "/"
[\/\(\+\^\-] # Character Class [\/\(\+\^\-]
| # OR
\$ # "$"
\w # <ASCII letter, digit or underscore>
+ # (one or more)(greedy)
\| # "|"
| # OR
\| # "|"
) # End of Capturing Group (2)
* # (zero or more)(greedy)
) # End of Capturing Group (1)
. # Any character except line break
Code copied from Regex101
var re = /((#\w{6}|\/[\/\(\+\^\-]|\$\w+\||\|)*)./gm;
var str = 'This#00CC00 is green$Courier| and /^mono|spaced|\n';
var subst = '$1§';
var result = str.replace(re, subst);
This isn't as efficient as a single regular expression, but it works. Basically it collects all of the matches and fills the parts between them with § characters. One nice thing is that you don't have to be a regular expression genius to update it, so hopefully more people can use it.
var str = 'This#00CC00 is green$Courier| and /^mono|spaced';
var patt=/#(\d|\w){6}|\/(\+|\^|\-|#|!\+|_|#|\*|%|&|~)|\$([^\|]+)\||\|/g;
var ret = "";
var pos = [];
var match;
while (match = patt.exec(str)) {
pos.push(match.index);
pos.push(patt.lastIndex);
console.log(match.index + ' ' + patt.lastIndex);
}
for (var i=0; i<pos.length; i+=2) {
ret += Array(1+pos[i]- (i==0 ? 0 : pos[i-1])).join("§");
ret += str.substring(pos[i], pos[i+1]);
}
ret += Array(1+str.length-pos[pos.length-1]).join("§");
document.body.innerHTML = str +"<br>"+ret;
console.log(str);
console.log(ret);
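A common way to express "replace everything that does not match x, y or z" is to put the tokens to keep in a capturing group, add a catch-all alternative, and decide in a replacer function. The sketch below assumes the token definitions from the answers above (`#` plus six word characters, `/` plus one of the listed characters, `$...|`, and a lone `|`); `maskExceptTokens` and `KEEP_OR_ANY` are hypothetical names.

```javascript
// Group 1 captures a token that must be preserved;
// the final alternative [\s\S] matches any other single character.
var KEEP_OR_ANY = /(#\w{6}|\/(?:!\+|[+^\-#_*%&~])|\$[^|]+\||\|)|[\s\S]/g;

// Hypothetical helper: replace every character outside the tokens with `mask`.
function maskExceptTokens(str, mask) {
  return str.replace(KEEP_OR_ANY, function (whole, token) {
    // `token` is undefined when the catch-all alternative matched.
    return token !== undefined ? token : mask;
  });
}

console.log(maskExceptTokens('This#00CC00 is green$Courier| and /^mono|spaced', '§'));
// §§§§#00CC00§§§§§§§§§$Courier|§§§§§/^§§§§|§§§§§§
```

Because the token alternative is tried first at every position, a token is never partially masked, and a token at the very end of the string needs no trailing character.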

How to add white space in regular expression in Javascript

I have a string {{my name}} and I want to allow white space in the regular expression
var str = "{{my name}}";
var patt1 = /\{{\w{1,}\}}/gi;
var result = str.match(patt1);
console.log(result);
But the result does not match.
Is there any solution for this?
Put the word character \w and the space character \s inside a character class []:
> var patt1 = /\{\{[\w\s]+\}\}/gi;
undefined
> var result = str.match(patt1);
undefined
> console.log(result);
[ '{{my name}}' ]
The above regex is the same as /\{\{[\w\s]{1,}\}\}/gi
Explanation:
\{ - Matches a literal { symbol.
\{ - Matches a literal { symbol.
[\w\s]+ - word character and space character are given inside Character class. It matches one or more word or space character.
\} - Matches a literal } symbol.
\} - Matches a literal } symbol.
Try this one:
^\{\{[a-z]*\s[a-z]*\}\}$
Explanation:
\{ - Matches a literal { symbol.
\{ - Matches a literal { symbol.
[a-z]* - will match zero or more lowercase letters
\s - will match exact one space
\} - Matches a literal } symbol.
\} - Matches a literal } symbol.
If you want to require at least one letter, use + instead of *.
To match this pattern, use this simple regex:
{{[^}]+}}
The demo shows you what the pattern matches and doesn't match.
In JS:
match = subject.match(/{{[^}]+}}/);
To do a replacement around the pattern, use this:
result = subject.replace(/{{[^}]+}}/g, "Something$&Something_else"); // $& is the whole match in JavaScript ($0 is not supported)
Explanation
{{ matches your two opening braces
[^}]+ matches one or more chars that are not a closing brace
}} matches your two closing braces
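If the goal is to read the text between the braces rather than the braces themselves, a capturing group around the inner part can be combined with an exec loop. A minimal sketch; `extractPlaceholders` is a hypothetical helper name.

```javascript
// Hypothetical helper: return the text inside every {{...}} pair.
function extractPlaceholders(text) {
  // The regex is created per call, so lastIndex always starts at 0.
  var pattern = /\{\{([^}]+)\}\}/g;
  var names = [];
  var match;
  while ((match = pattern.exec(text)) !== null) {
    names.push(match[1]); // group 1 is the content without the braces
  }
  return names;
}

console.log(extractPlaceholders("Hello {{my name}}, meet {{other_user}}"));
// [ 'my name', 'other_user' ]
```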

RegEx needed to split javascript string on "|" but not "\|"

We would like to split a string on instances of the pipe character |, but not if that character is preceded by an escape character, e.g. \|.
For example, we would like to see the following string split into the following components:
1|2|3\|4|5
1
2
3\|4
5
I'm expecting to be able to use the following javascript function, split, which takes a regular expression. What regex would I pass to split? We are cross platform and would like to support current and previous versions (1 version back) of IE, FF, and Chrome if possible.
Instead of a split, do a global match (the same way a lexical analyzer would):
match anything other than \\ or |
or match any escaped char
Something like this:
var str = "1|2|3\\|4|5";
var matches = str.match(/([^\\|]|\\.)+/g);
A quick explanation: ([^\\|]|\\.) matches either any character except '\' and '|' (pattern: [^\\|]) or (pattern: |) any escaped character (pattern: \\.). The + after it means one or more repetitions, so the pattern ([^\\|]|\\.) is matched one or more times. The g at the end of the regex literal tells the JavaScript regex engine to match the pattern globally instead of matching it just once.
What you're looking for is a "negative look-behind matching regular expression".
This isn't pretty, but it should split the list for you:
var output = input.replace(/(\\)?\|/g, function($0, $1) { return $1 ? $0 : '\n'; });
This takes your input string and replaces every '|' character that is NOT immediately preceded by a '\' character with a '\n' character; escaped pipes (\|) are left unchanged.
A regex solution was posted as I was looking into this. So I just went ahead and wrote one without it. I did some simple benchmarks and it is -slightly- faster (I expected it to be slower...).
Without using Regex, if I understood what you desire, this should do the job:
function doSplit(input) {
var output = [];
var currPos = -1, // start at -1 so that a pipe at index 0 is also found
prevPos = -1;
while ((currPos = input.indexOf('|', currPos + 1)) != -1) {
if (input[currPos-1] == "\\") continue;
var recollect = input.substr(prevPos + 1, currPos - prevPos - 1);
prevPos = currPos;
output.push(recollect);
}
var recollect = input.substr(prevPos + 1);
output.push(recollect);
return output;
}
doSplit('1|2|3\\|4|5'); //returns [ '1', '2', '3\\|4', '5' ]
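The approaches above differ on edge cases such as empty fields (`a||b`) and an escaped backslash directly before a pipe. The following sketch scans the string once and treats a backslash plus the next character as one unit. `splitUnescaped` and `splitWithLookbehind` are hypothetical names, and the second function assumes an engine with lookbehind support (ES2018), which older browsers do not have.

```javascript
// Hypothetical helper: split on `sep` unless it is escaped with a backslash.
// Escape pairs are kept verbatim in the output.
function splitUnescaped(input, sep) {
  var parts = [];
  var current = "";
  for (var i = 0; i < input.length; i++) {
    var ch = input[i];
    if (ch === "\\" && i + 1 < input.length) {
      current += ch + input[i + 1]; // copy the backslash and the escaped char
      i++;                          // skip the escaped char
    } else if (ch === sep) {
      parts.push(current);          // unescaped separator ends the field
      current = "";
    } else {
      current += ch;
    }
  }
  parts.push(current);
  return parts;
}

// Assumes lookbehind support. Simpler, but a pipe preceded by an
// escaped backslash (\\|) is still treated as escaped.
function splitWithLookbehind(input) {
  return input.split(/(?<!\\)\|/);
}

console.log(splitUnescaped('1|2|3\\|4|5', '|'));  // [ '1', '2', '3\\|4', '5' ]
console.log(splitWithLookbehind('1|2|3\\|4|5'));  // [ '1', '2', '3\\|4', '5' ]
```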
