JavaScript - return string between square brackets

I need to return just the text contained within square brackets in a string. I have the following regex, but this also returns the square brackets:
var matched = mystring.match("\\[.*]");
A string will only ever contain one set of square brackets, e.g.:
Some text with [some important info]
I want matched to contain 'some important info', rather than the '[some important info]' I currently get.

Use grouping. I've added a ? to make the matching "ungreedy", as this is probably what you want.
var matches = mystring.match(/\[(.*?)\]/);
if (matches) {
  var submatch = matches[1];
}
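As a minimal sketch, the snippet above can be wrapped in a small helper that returns null when the string has no brackets. The name textInBrackets is an assumption made for illustration and is not part of the original answer.

```javascript
// Hypothetical helper: returns the text inside the first [...] pair, or null.
function textInBrackets(str) {
  // (.*?) is lazy, so it stops at the first closing bracket.
  const matches = str.match(/\[(.*?)\]/);
  return matches ? matches[1] : null;
}

console.log(textInBrackets("Some text with [some important info]"));
console.log(textInBrackets("no brackets here"));
```

Returning null instead of indexing the result directly avoids a TypeError when match finds nothing.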

JavaScript does support capture groups, but if you would rather avoid one, consider this alternative, which takes the opposite approach. Rather than capturing what is inside the brackets, remove what is outside them. Since there will only ever be one set of brackets, it should work just fine. I usually use this technique for stripping leading and trailing whitespace.
mystring.replace( /(^.*\[|\].*$)/g, '' );
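A short sketch of how the replace-based approach behaves, including the case where the string has no brackets at all (the string then comes back unchanged rather than as an empty result). The helper name stripOutsideBrackets is hypothetical.

```javascript
// Hypothetical helper around the replace-based approach.
function stripOutsideBrackets(str) {
  // First alternative: everything from the start up to and including the last "[".
  // Second alternative: the first "]" found after that point and everything following it.
  return str.replace(/(^.*\[|\].*$)/g, "");
}

console.log(stripOutsideBrackets("Some text with [some important info]"));
console.log(stripOutsideBrackets("a [b] c"));
console.log(stripOutsideBrackets("plain text"));
```

Because a string without brackets is returned as-is, a caller that needs to tell "no brackets" apart from "brackets found" would probably have to check for a bracket separately.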

To match any text in between two adjacent open and close square brackets, you can use the following pattern:
\[([^\][]*)]
(?<=\[)[^\][]*(?=])
See the regex demo #1 and regex demo #2. NOTE: The second regex with lookarounds is supported in JavaScript environments that are ECMAScript 2018 compliant. In case older environments need to be supported, use the first regex with a capturing group.
Details:
(?<=\[) - a positive lookbehind that matches a location that is immediately preceded with a [ char (i.e. this requires a [ char to occur immediately to the left of the current position)
[^\][]* - zero or more (*) chars other than [ and ] (note that ([^\][]*) version is the same pattern captured into a capturing group with ID 1)
(?=]) - a positive lookahead that matches a location that is immediately followed with a ] char (i.e. this requires a ] char to occur immediately to the right of the current regex index location).
Now, in code, you can use the following:
const text = "[Some text] ][with[ [some important info]";
console.log( text.match(/(?<=\[)[^\][]*(?=])/g) );
console.log( Array.from(text.matchAll(/\[([^\][]*)]/g), x => x[1]) );
// Both return ["Some text", "some important info"]
Here is a legacy way to extract captured substrings using RegExp#exec in a loop:
var text = "[Some text] ][with[ [some important info]";
var regex = /\[([^\][]*)]/g;
var results=[], m;
while ( m = regex.exec(text) ) {
  results.push(m[1]);
}
console.log( results );

Did you try capturing parens:
("\\[(.*)]");
This should return the pattern within the brackets as a captured match in the returned array
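A small sketch of what the returned array looks like with the string pattern above, and of how the greedy .* differs from a lazy .*? if a string ever did contain two bracket pairs. The sample strings and variable names are assumptions for illustration.

```javascript
const single = "Some text with [some important info]";
const m = single.match("\\[(.*)]");
console.log(m[0]); // whole match, brackets included
console.log(m[1]); // captured group, brackets excluded

// With two bracket pairs, the greedy capture runs to the last "]".
const twoPairs = "x [a] y [b] z";
const greedyCapture = twoPairs.match("\\[(.*)]")[1];
const lazyCapture = twoPairs.match("\\[(.*?)]")[1];
console.log(greedyCapture); // "a] y [b"
console.log(lazyCapture);   // "a"
```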

Just use replace and map
"blabla (some info) blabla".match(/\((.*?)\)/g).map(b=>b.replace(/\(|(.*?)\)/g,"$1"))

Lookbehinds are only supported in JavaScript environments that are ECMAScript 2018 compliant, so you can't rely on one in older engines.
There, you'll have to either use a capture group or trim off the brackets.
By the way, you probably don't want a greedy .* in your regex. Try this:
"\\[.*?]"
Or better, this:
"\\[[^\\]]*]"

Related

Regex to match characters after the last colon that is not within curly brackets

I need a regex that matches (and lists) all 'modifiers' of a string. Modifiers are individual letters behind the last : in the string. Modifiers can have variables which would be written in curly brackets, e.g. a{variable}. Variables may contain the character : -- which makes it a bit tricky, because we must look for the last : that is NOT between { and }. This is currently my biggest problem, see Example 6 below.
(If it matters, the target language for this will be javascript.)
I already have this working for most cases, but there are a few edge cases that I cannot get to work.
My regex so far is:
/(?!.*:)([a-z](\{.*?\})*)/g
Example 1: Single modifier
something:a should match a - working fine
Example 2: Multiple modifiers
something:abc should match a, b, and c - working fine
Example 3: Single modifier with variable
something:a{something} should match a{something} - working fine
Example 4: Single modifier with multiple variables
something:a{something}{something} should match a{something}{something} - working fine
Example 5: Multiple modifiers with variables
something:ab{something}cd{something}{something}efg should match a, b{something}, c, d{something}{something}, e, f, g - working fine
Example 6: Variable containing :
something:a{something:2} - should match a{something:2} - does NOT work. I probably need to modify the negative lookahead somehow to ignore colons in curly brackets, but I couldn't find out how to do that.
Example 7: String not containing a :
something - should match nothing, but matches each letter individually. This may or may not be easy to fix, but my brain currently can't work this out.
Here is a link to test / play around with this regex and the examples: https://regexr.com/6h4h0
If anyone can help me to figure out how to make the regex work for example 6 and 7, I'd be very grateful!
You can use
const regex = /.*:((?:[a-zA-Z](?:{[^{}]*})*)+)$/;
const extract_rx = /[a-zA-Z](?:{[^{}]*})*/g;
const texts = ['something:a','something:abc','something:a{something}','something:a{something}{something}','something:ab{something}cd{something}{something}efg','something:a{something:2}','something:a{something:2}b{something:3}','something'];
for (const text of texts) {
  const m = text.match(regex);
  if (m) {
    const matches = m[1].match(extract_rx);
    console.log(text, '=>', matches);
  } else {
    console.log(text, '=> NO MATCH');
  }
}
See the main regex demo. Details:
.*: - matches any zero or more chars other than line break chars as many as possible and then a : followed with...
((?:[a-zA-Z](?:{[^{}]*})*)+) - Group 1: one or more sequences of
[a-zA-Z] - an ASCII letter
(?:{[^{}]*})* - zero or more sequences of a {, zero or more chars other than { and } and then a } char
$ - end of string.
Once there is a match, Group 1 is parsed again to extract all sequences of a letter and then any zero or more {...} substrings right after from it.
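The two-step approach described above could be packaged as a single function, sketched here under the assumption that an empty array is an acceptable result for strings without modifiers. The name extractModifiers is hypothetical, and the braces are escaped only for readability.

```javascript
// Hypothetical helper combining the two regexes from the answer above.
function extractModifiers(text) {
  // Step 1: find the rightmost ":" after which the rest of the string
  // consists only of modifiers (a letter plus optional {...} variables).
  const m = text.match(/.*:((?:[a-zA-Z](?:\{[^{}]*\})*)+)$/);
  if (!m) return [];
  // Step 2: split Group 1 into the individual modifiers.
  return m[1].match(/[a-zA-Z](?:\{[^{}]*\})*/g);
}

console.log(extractModifiers("something:ab{something}cd{something}{something}efg"));
console.log(extractModifiers("something:a{something:2}"));
console.log(extractModifiers("something"));
```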
What you could do instead is make sure there is a colon somewhere before the matched string with a positive lookbehind.
Essentially switching (?!.*:) for (?<=:.*).
const regex = /(?<=:.*)([a-z](\{.*?\})*)/g;
const strings = [
  "something:a",
  "something:abc",
  "something:a{something}",
  "something:a{something}{something}",
  "something:ab{something}cd{something}{something}efg",
  "something:a{something:2}",
  "something",
];
for (const string of strings) {
  console.log(string.match(regex));
}
Not sure if this is what you want:
:([a-z\{.*?\}0-9])*
I would try longer, but have to go catch a flight.

Remove text between square brackets at the end of string

I need a regex to remove the last expression between brackets (including the brackets themselves).
source: input[something][something2]
target: input[something]
I've tried this, but it removes both:
"input[something][something2]".replace(/\[.*?\]/g, '');
Note that \[.*?\]$ won't work as it will match the first [ (because a regex engine processes the string from left to right), and then will match all the rest of the string up to the ] at its end. So, it will match [something][something2] in input[something][something2].
You may specify the end of string anchor and use [^\][]* (matching zero or more chars other than [ and ]) instead of .*?:
\[[^\][]*]$
See the JS demo:
console.log(
"input[something][something2]".replace(/\[[^\][]*]$/, '')
);
Details:
\[ - a literal [
[^\][]* - zero or more chars other than [ and ]
] - a literal ]
$ - end of string
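To make the difference concrete, here is a small comparison of the lazy-dot pattern and the negated character class on the question's input. The variable names are chosen for illustration only.

```javascript
const source = "input[something][something2]";

// Lazy dot: the match starts at the first "[" and is stretched until "$" fits.
const lazyResult = source.replace(/\[.*?\]$/, "");

// Negated class: the match cannot cross another bracket,
// so only the final [...] group qualifies.
const negatedResult = source.replace(/\[[^\][]*]$/, "");

console.log(lazyResult);    // "input"
console.log(negatedResult); // "input[something]"
```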
Another way is to use .* at the start of the pattern to grab the whole line, capture it, and then let it backtrack to get the last [...]:
console.log(
"input[something][something2]".replace(/^(.*)\[.*]$/, '$1')
);
Here, $1 is the backreference to the value captured with (.*) subpattern. However, it will work a bit differently, since it will return all up to the last [ in the string, and then all after that [ including the bracket will get removed.
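A sketch of where the two patterns diverge, using a nested-bracket input that is an assumption for illustration (the question itself only shows flat groups).

```javascript
const flat = "input[something][something2]";
const nested = "input[x[y]]";

// Backtracking version: keeps everything before the last "[".
const flatBacktrack = flat.replace(/^(.*)\[.*]$/, "$1");
const nestedBacktrack = nested.replace(/^(.*)\[.*]$/, "$1");

// Negated-class version: needs a bracket-free [...] right at the end.
const nestedNegated = nested.replace(/\[[^\][]*]$/, "");

console.log(flatBacktrack);   // "input[something]"
console.log(nestedBacktrack); // "input[x"
console.log(nestedNegated);   // "input[x[y]]" (no match, so unchanged)
```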
Do not use the g modifier, and use the $ anchor:
"input[something][something2]".replace(/\[[^\]]*\]$/, '');
try this code
var str = "Hello, this is Mike (example)";
alert(str.replace(/\s*\(.*?\)\s*/g, ''));

jquery get value in the last instance parenthesis [duplicate]

I want to extract the text between the last () using javascript
For example
var someText="don't extract(value_a) but extract(value_b)";
alert(someText.match(regex));
The result should be
value_b
Thanks for the help
Try this
\(([^)]*)\)[^(]*$
See it here on regexr
var someText="don't extract(value_a) but extract(value_b)";
alert(someText.match(/\(([^)]*)\)[^(]*$/)[1]);
The part inside the brackets is stored in capture group 1, therefore you need to use match()[1] to access the result.
An efficient solution is to let .* consume everything before the last (
var str = "don't extract(value_a) but extract(value_b)";
var res = str.match(/.*\(([^)]+)\)/)[1];
console.log(res);
.*\( matches any amount of any character until the last literal (
([^)]+) captures one or more characters that are not )
[1] grab captures of group 1 (first capturing group).
Use [\s\S] instead of the . dot for multiline strings.
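A minimal sketch that combines the points above into a helper: it uses [\s\S] so line breaks are allowed before the last parenthesis, and it returns null instead of throwing when nothing matches. The name lastParenthesized is hypothetical.

```javascript
// Hypothetical helper: text inside the last (...) pair, or null if there is none.
function lastParenthesized(str) {
  // [\s\S]* consumes as much as possible (including newlines),
  // then backtracks to the last "(" that is followed by a non-empty group.
  const m = str.match(/[\s\S]*\(([^)]+)\)/);
  return m ? m[1] : null;
}

console.log(lastParenthesized("don't extract(value_a) but extract(value_b)"));
console.log(lastParenthesized("first(x)\nsecond(y)"));
console.log(lastParenthesized("no parentheses"));
```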
Here is a demo at regex101
/\([^()]+\)(?=[^()]*$)/
The lookahead, (?=[^()]*$), asserts that there are no more parentheses before the end of the input.
If the last closing bracket is always at the end of the sentence, you can use Jonathan's answer. Otherwise something like this might work:
/\((\w+)\)(?:(?!\(\w+\)).)*$/

JS Regex: Remove anything (ONLY) after a word

I want to remove all of the symbols (The symbol depends on what I select at the time) after each word, without knowing what the word could be. But leave them in before each word.
A couple of examples:
!!hello! my! !!name!!! is !!bob!! should return...
!!hello my !!name is !!bob ; for !
and
$remove$ the$ targetted$# $$symbol$$# only $after$ a $word$ should return...
$remove the targetted# $$symbol# only $after a $word ; for $
You need to use capture groups and replace:
"!!hello! my! !!name!!! is !!bob!!".replace(/([a-zA-Z]+)(!+)/g, '$1');
Which works for your test string. To work for any generic character or group of characters:
var stripTrailing = trail => {
  let regex = new RegExp(`([a-zA-Z0-9]+)(${trail}+)`, 'g');
  return str => str.replace(regex, '$1');
};
Note that this fails on any characters that have meaning in a regular expression: []{}+*^$. etc. Escaping those programmatically is left as an exercise for the reader.
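One possible way to do that escaping, sketched here as a variant of stripTrailing. The names escapeRegExp and stripTrailingEscaped are assumptions for illustration. The symbol is also wrapped in a non-capturing group so that the + applies to the whole symbol rather than only its last character.

```javascript
// Escapes characters that have special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical variant of stripTrailing that escapes the symbol first.
const stripTrailingEscaped = trail => {
  const symbol = escapeRegExp(trail);
  // (?:...)+ repeats the whole symbol, even if it is longer than one character.
  const regex = new RegExp(`([a-zA-Z0-9]+)(?:${symbol})+`, "g");
  return str => str.replace(regex, "$1");
};

console.log(stripTrailingEscaped("$")("$remove$ the$ targetted$# $$symbol$$# only $after$ a $word$"));
console.log(stripTrailingEscaped("!")("!!hello! my! !!name!!! is !!bob!!"));
```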
UPDATE
Per your comment I thought an explanation might help you, so:
First, there's no way in this case to replace only part of a match, you have to replace the entire match. So we need to find a pattern that matches, split it into the part we want to keep and the part we don't, and replace the whole match with the part of it we want to keep. So let's break up my regex above into multiple lines to see what's going on:
First we want to match any number of sequential alphanumeric characters, that would be the 'word' to strip the trailing symbol from:
( // denotes capturing group for the 'word'
[ // [] means 'match any character listed inside brackets'
a-z // list of alpha character a-z
A-Z // same as above but capitalized
0-9 // list of digits 0 to 9
]+ // plus means one or more times
)
The capturing group means we want to have access to just that part of the match.
Then we have another group
(
! // I used ES6's string interpolation to insert the arg here
+ // match that exclamation (or whatever) one or more times
)
Then we add the g flag so the replace will happen for every match in the target string, without the flag it returns after the first match. JavaScript provides a convenient shorthand for accessing the capturing groups in the form of automatically interpolated symbols, the '$1' above means 'insert contents of the first capture group here in this string'.
So, in the above, if you replaced '$1' with '$1$2' you'd see the same string you started with, if you did 'foo$2' you'd see foo in place of every word trailed by one or more !, etc.
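The three replacement strings mentioned above can be tried side by side. This is a small sketch on a shortened version of the test string, with variable names chosen for illustration.

```javascript
const greeting = "!!hello! my! !!name!!!";
const pattern = /([a-zA-Z]+)(!+)/g;

const keepWord = greeting.replace(pattern, "$1");      // word only
const keepBoth = greeting.replace(pattern, "$1$2");    // word plus trailing symbols
const swapWord = greeting.replace(pattern, "foo$2");   // "foo" plus trailing symbols

console.log(keepWord); // "!!hello my !!name"
console.log(keepBoth); // "!!hello! my! !!name!!!"
console.log(swapWord); // "!!foo! foo! !!foo!!!"
```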

Why my regex not working in this particular case

I want to get the number after "pw=" out of text like this
For example
"blablabla pw=0.5 alg=blablalbala"
would get me 0.5
The regex that I used was:
/.*pw=+(.*)\s+alg=.*/g
In the context of javascript, I would then use that regex in match function to get the number:
str.match(/.*pw=+(.*)\s+alg=.*/g)
But on regex101.com, the match result and the highlighting do not agree at all (the result suggests that the regex is correct, while the highlighted part does not).
You should remove the /g global modifier, and I suggest making your value-matching pattern more precise, e.g. [\d.]*.
The point is that when a global modifier is used with String#match, all captured group values are discarded.
Use a regex like
str.match(/\bpw=([\d.]*)\s+alg=/)
^^^^^^ ^
Note that you do not need the .* at the start and end of the pattern, String#match does not require the full string match (unlike String#matches() in Java).
var str = 'blablabla pw=0.5 alg=blablalbala';
var m = str.match(/\bpw=([\d.]*)\s+alg=/);
if (m) {
  console.log(m[1]);
}
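A short sketch of the behaviour described above: with the g flag, String#match returns only the full matches, while without it the capture group is available at index 1. String#matchAll is shown as one way to keep both the g flag and the groups. The variable names are assumptions for illustration.

```javascript
const line = "blablabla pw=0.5 alg=blablalbala";

// With /g: an array of full matches, capture groups are dropped.
const withGlobal = line.match(/\bpw=([\d.]*)\s+alg=/g);

// Without /g: index 0 is the full match, index 1 is the first capture group.
const withoutGlobal = line.match(/\bpw=([\d.]*)\s+alg=/);

// matchAll requires /g and yields one match object (with groups) per match.
const viaMatchAll = Array.from(line.matchAll(/\bpw=([\d.]*)\s+alg=/g), m => m[1]);

console.log(withGlobal);       // ["pw=0.5 alg="]
console.log(withoutGlobal[1]); // "0.5"
console.log(viaMatchAll);      // ["0.5"]
```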
