How to replace part of a string using regex - javascript

I need to replace part of a string in JavaScript.
The following example should clarify what I mean:
var str = "asd[595442/A][30327][0]";
var strToReplace = "30333";
var strDesiredResult = "asd[595442/A][30333][0]";
Basically, the second bracketed section should be replaced with another string.
How can I do that?
What I have so far is something like this:
var str = "asd[595442/A][30327][0]";
var regex = /asd\[(.*)\]\[(.*)\]\[(.*)\]/;
var arrMatches = regex.exec(str);
The string appears in arrMatches[2] correctly, and I could replace it. But what happens if arrMatches[1] contains the same string?
Only the value in the second bracketed section should be replaced.

You may use a regex that matches the first [....] followed by [ and captures that part into a group (which you can refer to via a backreference), and then matches 0+ chars other than ] to replace them with your replacement:
var str = "asd[595442/A][30327][0]";
var strToReplace = "30333";
console.log(str.replace(/(\[[^\]]*]\[)[^\]]*/, "$1" + strToReplace));
var strDesiredResult = "asd[595442/A][30333][0]";
console.log(strDesiredResult);
The /(\[[^\]]*]\[)[^\]]*/ has no g modifier, so it will look for one match only.
Since regex engine searches a string for a match from left to right, you will get the first match from the left.
The \[[^\]]*]\[ matches [, then any 0+ chars other than ] and then ][. The (...) forms a capturing group #1, it will remember the value that you will be able to get into the replacement with $1 backreference. [^\]]* matches 0+ chars other than ] and this will be replaced.
Details:
( - a capturing group start
\[ - a literal [ symbol (if unescaped, it starts a character class)
[^\]]* - a negated character class that matches zero or more (due to the * quantifier) chars other than ]
] - a literal ] (outside a character class, it does not have to be escaped)
\[ - a literal [
) - end of capturing group #1 (its value can be accessed with $1 backreference from the replacement pattern)
[^\]]* - 0+ (as the * quantifier matches zero or more occurrences, replace with + if you need to only match where there is 1 or more occurrences) chars other than ] (inside a character class in JS regex, ] must be escaped in any position).
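As a minimal sketch, the same pattern can be wrapped in a function (the name replaceSecondBracket is hypothetical, not from the answer) that rebuilds the result with a replacer callback. This avoids writing digits directly after $1 in a replacement string, and it shows that an identical value in the first bracket is left alone:

```javascript
// Hypothetical helper; the regex is the one from the answer above.
function replaceSecondBracket(input, replacement) {
  // prefix is capture group 1: the first [...] plus the "[" that opens the second one.
  return input.replace(/(\[[^\]]*]\[)[^\]]*/, (match, prefix) => prefix + replacement);
}

console.log(replaceSecondBracket("asd[595442/A][30327][0]", "30333"));
// asd[595442/A][30333][0]
console.log(replaceSecondBracket("asd[30327][30327][0]", "30333"));
// asd[30327][30333][0]
```

Because the callback returns a plain string, sequences such as $& in the replacement value are inserted literally rather than being interpreted.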

Use this pattern:
'asd[595442/A][30327][0]'.replace(/^(asd\[[^\[\]]+\]\[)([^\[\]]+)(\]\[0\])$/, '$130333$3')
^ - match beginning of string
first group - match "asd[", any chars except [ and ], "]["
second group - match any chars except [ and ]
third group - match exactly: "][0]"
$ - match end of string

There are many ways to do this. One possible pattern is
str.replace(/^(.+)(\[.+\])(\[.+\])(\[.+\])$/, `$1$2[${strToReplace}]$4`)
You can see that $<number> refers to a string captured by the regex (the groups in parentheses). We can refer to those and rearrange them however we want.

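You can use a regular expression like /\[[0-9]+\]/, as below. Note that it replaces the first bracket group made up only of digits, so it works here because the first group contains "/A".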
var str = "asd[595442/A][30327][0]";
var strToReplace = "30333";
var strDesiredResult = str.replace(/\[[0-9]+\]/, '[' + strToReplace + ']');
console.log(strDesiredResult); //"asd[595442/A][30333][0]";
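If the first group could also be all digits, one option is to count the bracket groups and replace by position. This is a sketch of a different technique from the answer above; replaceBracketAt is a hypothetical name and the zero-based index is an assumption:

```javascript
// Hypothetical helper: replaces the contents of the bracket group at a zero-based position.
function replaceBracketAt(input, index, replacement) {
  let seen = 0; // number of bracket groups visited so far
  return input.replace(/\[[^\]]*]/g, (group) =>
    seen++ === index ? "[" + replacement + "]" : group
  );
}

console.log(replaceBracketAt("asd[595442/A][30327][0]", 1, "30333"));
// asd[595442/A][30333][0]
console.log(replaceBracketAt("asd[123][30327][0]", 1, "30333"));
// asd[123][30333][0]
```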

Related

Validate text with javascript RegEX

I'm trying to validate text with JavaScript but can't find out why it's not working.
I have been using https://regex101.com/ for testing, where it works, but in my script it fails
var check = "test"
var pattern = new RegExp('^(?!\.)[a-zA-Z0-9._-]+$(?<!\.)','gmi');
if (!pattern.test(check)) validate_check = false;else validate_check = true;
What I'm looking for is: first and last char not a dot, and the string may only contain [a-zA-Z0-9._-]
But the above check always fails, even on the word test
+$(?<!\.) is invalid in your RegEx
$ will match the end of the text or line (with the m flag)
Negative lookbehind → (?<!Y)X will match X, but only if Y is not before it
What about a simpler RegEx?
var checks = ["test", "1-t.e_s.t0", ".test", "test.", ".test."];
checks.forEach(check => {
var pattern = new RegExp('^[^.][a-zA-Z0-9\._-]+[^.]$','gmi');
console.log(check, pattern.test(check))
});
Your code should look like this:
var check = "test";
var pattern = new RegExp('^[^.][a-zA-Z0-9\._-]+[^.]$','gmi');
var validate_check = pattern.test(check);
console.log(validate_check);
A few notes about the pattern:
You are using the RegExp constructor, where you have to double escape the backslash. With a single backslash, the pattern becomes ^(?!.)[a-zA-Z0-9._-]+$(?<!.) and the negative lookahead at the start makes the pattern fail whenever there is a character other than a newline to the right, which is why it does not match test
If you use the /i flag for a case insensitive match, you can shorten [A-Za-z] to just one of the ranges like [a-z], or use \w to match a word character (letters, digits and underscore), which covers most of your character class
This part (?<!\.) using a negative lookbehind is not invalid in your pattern, but it is not always supported
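The escaping note above can be checked directly. This sketch builds the asker's pattern once with single and once with doubled backslashes; the variable names are made up for illustration and the flags are left out to keep the comparison simple:

```javascript
// In a string literal, "\." is just ".", so the dot is no longer escaped in the regex.
const singleEscaped = new RegExp('^(?!\.)[a-zA-Z0-9._-]+$(?<!\.)');
// Doubling the backslash keeps "\." in the pattern the regex engine receives.
const doubleEscaped = new RegExp('^(?!\\.)[a-zA-Z0-9._-]+$(?<!\\.)');

console.log(singleEscaped.source);        // ^(?!.)[a-zA-Z0-9._-]+$(?<!.)
console.log(singleEscaped.test("test"));  // false
console.log(doubleEscaped.test("test"));  // true
console.log(doubleEscaped.test(".test")); // false
console.log(doubleEscaped.test("test.")); // false
```

The lookbehind in this pattern needs an engine that supports it, as noted above.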
For your requirements, you don't have to use lookarounds. If you also want to allow a single char, you can use:
^[\w-]+(?:[\w.-]*[\w-])?$
^ Start of string
[\w-]+ Match 1+ occurrences of a word character or -
(?: Non capture group
[\w.-]*[\w-] Match optional word chars, dots or hyphens, followed by a word char or hyphen
)? Close non capture group and make it optional
$ End of string
const regex = /^[\w-]+(?:[\w.-]*[\w-])?$/;
["test", "abc....abc", "a", ".test", "test."]
.forEach((s) =>
console.log(`${s} --> ${regex.test(s)}`)
);

Javascript how to identify a combination of letters and strip a portion of it

I'm very new to regex. Right now I'm trying to use regex to prepare my markup string before sending it to the database.
Here is an example string:
#[admin](user:3) Testing this string #[hellotessginal](user:4) Hey!
So far I am able to identify the entire term #[admin](user:3) using /#\[(.*?)]\((.*?):(\d+)\)/g
The next step is that I wish to remove the (user:3), leaving me with #[admin].
Hence the result of passing through the stripper function would be:
#[admin] Testing this string #[hellotessginal] Hey!
Please help!
You may use
s.replace(/(#\[[^\][]*])\([^()]*?:\d+\)/g, '$1')
Details:
(#\[[^\][]*]) - Capturing group 1: #[, 0 or more chars other than [ and ], as many as possible, and then ]
\( - a ( char
[^()]*? - 0 or more (but as few as possible) chars other than ( and )
: - a colon
\d+ - 1+ digits
\) - a ) char.
The $1 in the replacement pattern refers to the value captured in Group 1.
See the JavaScript demo:
const rx = /(#\[[^\][]*])\([^()]*?:\d+\)/g;
const remove_parens = (string, regex) => string.replace(regex, '$1');
let s = '#[admin](user:3) Testing this string #[hellotessginal](user:4) Hey!';
s = remove_parens(s, rx);
console.log(s);
Try this:
var str = "#[admin](user:3) Testing this string #[hellotessginal](user:4) Hey!";
str = str.replace(/ *\([^)]*\) */g, ' ');
console.log(str);
You can replace matches of the following regular expression with empty strings.
str.replace(/(?<=\#\[(.*?)\])\(.*?:\d+\)/g, '');
I've assumed the strings for which "admin" and "user" are placeholders in the example cannot contain the characters in the string "()[]". If that's not the case please leave a comment and I will adjust the regex.
I've kept the first capture group on the assumption that it is needed for some unstated purpose. If it's not needed, remove it:
(?<=\#\[.*?\])\(.*?:\d+\)
There is of course no point creating a capture group for a substring that is to be replaced with an empty string.
Javascript's regex engine performs the following operations.
(?<= : begin positive lookbehind
\#\[ : match '#['
(.*?) : match 0+ chars, lazily, save to capture group 1
\] : match ']'
) : end positive lookbehind
\(.*?:\d+\) : match '(', 0+ chars lazily, ':', 1+ digits, ')'
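A runnable sketch of the lookbehind pattern without the capture group, replacing each match with an empty string as described (stripUserRefs is a hypothetical name). It needs an engine with lookbehind support:

```javascript
// Hypothetical helper: removes "(name:digits)" only when it directly follows "#[...]".
function stripUserRefs(markup) {
  return markup.replace(/(?<=#\[.*?\])\(.*?:\d+\)/g, "");
}

console.log(stripUserRefs("#[admin](user:3) Testing this string #[hellotessginal](user:4) Hey!"));
// #[admin] Testing this string #[hellotessginal] Hey!
console.log(stripUserRefs("#[a](user:3) keep (x:1)"));
// #[a] keep (x:1)
```

The second call shows the effect of the lookbehind: a parenthesized part that does not follow a #[...] term is kept.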

Javascript regular expression between brackets

Let's say in the following text
I want [this]. I want [this too]. I don't want \[this]
I want the contents of anything between [] but not \[]. How would I go about doing that? So far I've got /\[([^\]]+)\]/gi, but it matches everything.
Use this one: /(?:^|[^\\])\[(.*?)\]/gi
Here's a working example: http://regexr.com/3clja
?: Non-capturing group
^|[^\\] Beginning of string or anything but \
\[(.*?)\] Match anything between []
Here's a snippet:
var string = "[this i want]I want [this]. I want [this too]. I don't want \\[no]";
var regex = /(?:^|[^\\])\[(.*?)\]/gi;
var match = null;
document.write(string + "<br/><br/><b>Matches</b>:<br/> ");
while(match = regex.exec(string)){
document.write(match[1] + "<br/>");
}
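One thing to be aware of: (?:^|[^\\]) consumes the character before the bracket, so two groups written back to back, like [a][b], yield only the first one. Where lookbehind is supported, a negative lookbehind checks for the backslash without consuming anything. This is a sketch of that alternative, not the answer's own pattern; bracketContents is a hypothetical name:

```javascript
// Hypothetical helper: returns the contents of every [...] not preceded by a backslash.
function bracketContents(text) {
  const pattern = /(?<!\\)\[(.*?)\]/g;
  // matchAll yields one match object per occurrence; index 1 is the capture group.
  return Array.from(text.matchAll(pattern), (m) => m[1]);
}

console.log(bracketContents("I want [this]. I want [this too]. I don't want \\[this]"));
// [ 'this', 'this too' ]
console.log(bracketContents("[a][b]"));
// [ 'a', 'b' ]
```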
Use this regexp, which first matches the \[] version (but doesn't capture it, thereby "throwing it away"), then the [] cases, capturing what's inside:
var r = /\\\[.*?\]|\[(.*?)\]/g;
//       ^^^^^^^^^            MATCH \[this]
//                 ^^^^^^^^^  MATCH [this]
Loop with exec to get all the matches:
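var match;
while ((match = r.exec(str)) !== null) {
  // match[1] is undefined when the escaped \[...] alternative matched, so skip it
  if (match[1] !== undefined) console.log(match[1]);
}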
/(?:^|[^\\])\[([^\]]*)\]/g
The content is in the first capture group, $1
(?:^|[^\\]) matches the beginning of the string or any character that's not a backslash, non-capturing.
\[ matches an open bracket.
([^\]]*) captures any number of consecutive characters that are not closing brackets
\] matches a closing bracket

JS regexp to match special characters

I'm trying to find a JavaScript regexp for this string: ![](). It needs to be an exact match, though, so:
`!()[]` // No match
hello!()[] // No match
!()[]hello // No Match
!()[] // Match
!()[] // Match (with a whitespace before and/or after)
I tried this: \b![]()\b. It works for words, like \bhello\b, but not for those characters.
The characters specified are special characters in a regex and need to be escaped. Also use \s if you want to match whitespace. Try the following
\s?!(?:\[\]\(\)|\(\)\[\])\s?
EDIT: Added a capture group to extract ![]() if needed
EDIT2: I missed that you wanted the order of [] and () to be independent; I've added it in this fiddle http://jsfiddle.net/MfFAd/3/
This matches your example:
\s*!\[\]\(\)\s*
Though the match also includes the spaces before and after !()[].
I think \b does not work here because ![]() is not a word. Check out this quote from MDN:
\b - Matches a word boundary. A word boundary matches the position where a word character is not followed or preceeded by another word-character. Note that a matched word boundary is not included in the match. In other words, the length of a matched word boundary is zero.
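A short sketch of what that quote means in practice. The asker's pattern is used here with its brackets and parentheses escaped, so that only the effect of \b is being tested; the variable names are made up for illustration:

```javascript
const wordPattern = /\bhello\b/;
const symbolPattern = /\b!\[\]\(\)\b/; // the asker's pattern with the special characters escaped

// "hello" starts and ends with word characters, so both boundaries exist.
console.log(wordPattern.test("hello"));     // true
// "!" and ")" are not word characters, so there is no boundary at either end.
console.log(symbolPattern.test("![]()"));   // false
// With word characters on both sides, the boundaries exist again.
console.log(symbolPattern.test("a![]()b")); // true
```

In other words, \b around ![]() matches only when the string is glued to word characters, which is the opposite of what the question asks for.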
Let's create a function for convenience :
function find(r, s) {
return (s.match(r) || []).slice(-1);
}
The following regular expression accepts only the searched string and whitespaces :
var r = /^\s*(!\[\]\(\))\s*$/;
find(r, '![]() '); // ["![]()"]
find(r, '!()[] '); // []
find(r, 'hello ![]()'); // []
This one searches a sub-string surrounded by whitespaces or string boundaries :
var r = /(?:^|\s)(!\[\]\(\))(?:\s|$)/;
find(r, '![]() '); // ["![]()"]
find(r, 'hello ![]()'); // ["![]()"]
find(r, 'hello![]()'); // []
To match all characters except letters and numbers you can use this regex
/[^A-Z0-9]/gi
g - global search (the whole text, not just the first match)
i - case insensitive
To also leave other characters unmatched, for example . and , add them to the class:
/[^A-Z0-9\.\,]/gi
To match the exact string, escape each special character and add the global flag:
/(\!\[\]\(\))/g
so it will search for all matches

how to regex a string between two tokens in Javascript?

Asked many times, but I can't get it to work...
I have strings like:
"text!../tmp/widgets/tmp_widget_header.html"
and am trying like this to extract widget_header:
var temps[i] = "text!../tmp/widgets/tmp_widget_header.html";
var thisString = temps[i].regexp(/.*tmp_$.*\.*/) )
but that does not work.
Can someone tell me what I'm doing wrong here?
Thanks!
This prints widget_header:
var s = "text!../tmp/widgets/tmp_widget_header.html";
var matches = s.match(/tmp_(.*?)\.html/);
console.log(matches[1]);
var s = "text!../tmp/widgets/tmp_widget_header.html",
re = /\/tmp_([^.]+)\./;
var match = re.exec(s);
if (match)
alert(match[1]);
This will match:
a / character
the characters tmp_
one or more of any character that is not the . character. These are captured.
a . character
If a match was found, it will be at index 1 of the resulting Array.
In your code:
var temps[i] = "text!../tmp/widgets/tmp_widget_header.html";
var thisString = temps[i].regexp(/.*tmp_$.*\.*/) )
You are saying:
"Match any string that starts with any number of any characters, followed by "tmp_", followed by the end of input, followed by any number of any characters, followed by any number of periods."
.* : Any number of any character (except newline)
tmp_ : Literally "tmp_"
$ : End of input. Placed here, it only succeeds if nothing follows "tmp_", which is not the case for your input
.* : Any number of any character (except newline)
\.* : Any number of periods
Also, strings have no regexp() method; call match() on the string or exec() on the regular expression instead. Regex literals are written between slashes; if you use the RegExp constructor you pass a string instead, like var re = new RegExp("ab+c") or var re = new RegExp('ab+c'). You also have either an extra or a missing parenthesis, and no characters are actually being captured.
What you want to do is:
"Find a string that is preceded by the beginning of input, one or more of any character, and "tmp_"; that is followed by a single period, one or more of any character, and the end of input; and that contains one or more of any character. Capture that string."
So:
var string = "text!../tmp/widgets/tmp_widget_header.html";
var re = /^.+tmp_(.+)\..+$/; // regex literal (the simpler slash notation)
var out = re.exec(string); // execute the regex
console.log(out[1]); // out is an array; the first (here only) captured string is at index 1
This regex /^.+tmp_(.+)\..+$/ means:
^ : Match beginning of input/line
.+ : One or more of any character (except newline), "+" is one or more
tmp_ : Constant "tmp_"
(.+) : One or more of any character, captured in group 1
\. : A single period
.+ : As above
$ : End of input/line
You could also write this as new RegExp('^.+tmp_(.+)\\..+$'); note that with the RegExp constructor we do not use the slash delimiters. Instead we pass the pattern as a string in quotes (single or double will work), and each backslash must be doubled inside the string.
Now this would also match var string = "Q%$#^%$^%$^%$^43etmp_ebeb.45t4t#$^g" and out[1] == 'ebeb'. So depending on the specific use you may wish to replace any " . " used to signify any character (except newline) with bracketed "[ ]" character lists, as this may filter out unwanted results. Your mileage may vary.
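As a sketch of that tightening, the version below swaps the dots for a character list and anchors on the file extension. It assumes the widget name consists only of letters, digits and underscores and that the file always ends in .html; widgetName is a hypothetical helper name:

```javascript
// Hypothetical helper: extracts the part between "/tmp_" and ".html", or null if absent.
function widgetName(path) {
  const match = /\/tmp_([A-Za-z0-9_]+)\.html$/.exec(path);
  return match ? match[1] : null;
}

console.log(widgetName("text!../tmp/widgets/tmp_widget_header.html"));
// widget_header
console.log(widgetName("Q%$#^%$^%$^%$^43etmp_ebeb.45t4t#$^g"));
// null
```

Returning null instead of indexing the result directly avoids an error when the input does not match.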
For more information visit: https://developer.mozilla.org/en-US/docs/JavaScript/Guide/Regular_Expressions
