Regex to select all spaces between a special enclosure


I am trying to write a regex for JavaScript that selects all whitespace between AMPscript brackets. The syntax of the language looks something like this:
%%[
set #truth = 'this is amp'
IF #truth == 'this is amp' THEN
set #response = 'amp rocks'
ELSE
set #response = '️...'
ENDIF
]%%
So far, I am able to select all the characters inside the given brackets using this expression:
%%\[(\s|.)*?\]%%
But this selects every character inside the enclosure. Is there a way to select only the spaces, newlines, and tab indents?
Thanks in advance!

I believe this is what you want:
(?<=%%\[(\s|.)*)\s*(?=(\s|.)*\]%%)
https://regex101.com/r/EWhbuX/1
Edit
Here's a breakdown of the regular expression:
1.
First we start with a Positive Lookbehind (?<= )
This makes sure that the pattern following it is preceded by the pattern inside the lookbehind, without including that inner pattern in the match.
In this case we want our matching pattern to be preceded by %%[ and any other characters: %%\[(\s|.)*
So, our resulting code for the Positive Lookbehind is
(?<=%%\[(\s|.)*)
2.
Next comes the pattern that we actually want to match after our Lookbehind and (spoiler alert) before a Lookahead that we'll define later.
In this case, that's just any whitespace character, so our pattern will be
\s
(Yes, I just noticed that we don't even need the * in my original answer)
3.
Similarly to what we did at the beginning of the expression with the Lookbehind, we now need a Positive Lookahead (?= )
This makes sure that our whitespace is followed by any characters and then ]%%: (\s|.)*\]%%
So this is our resulting Lookahead:
(?=(\s|.)*\]%%)
4.
Put everything together and you have your regular expression!
(?<=%%\[(\s|.)*)\s(?=(\s|.)*\]%%)
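For reference, here is a small sketch of how this pattern might be used from JavaScript. It assumes an engine with lookbehind support (ES2018 or later); the sample strings and the helper name whitespaceInBlocks are made up for illustration. Because the lookbehind and lookahead only check that some %%[ appears earlier and some ]%% appears later, whitespace sitting between two separate blocks would also match. The second half shows one possible two-step alternative (not the method from the answer) that finds each block first and then scans only its body.

```javascript
// Pattern from the answer, with the g flag so every whitespace character is reported.
const insideEnclosure = /(?<=%%\[(\s|.)*)\s(?=(\s|.)*\]%%)/g;

// Made-up sample with a single block.
const single = "x %%[ a\nb ]%% y";
const singleHits = [...single.matchAll(insideEnclosure)].map((m) => m.index);

// Hypothetical two-step alternative: locate each %%[ ... ]%% block,
// then collect the positions of whitespace inside its body only.
function whitespaceInBlocks(source) {
  const blockPattern = /%%\[([\s\S]*?)\]%%/g;
  const positions = [];
  for (const block of source.matchAll(blockPattern)) {
    const bodyStart = block.index + 3; // skip the opening "%%["
    for (const ws of block[1].matchAll(/\s/g)) {
      positions.push(bodyStart + ws.index);
    }
  }
  return positions;
}

// Made-up sample with two blocks and plain text between them.
const multi = "a b %%[\nset #x = 1\n]%% c d %%[ y ]%%";
const multiHits = whitespaceInBlocks(multi);

console.log(singleHits, multiHits);
```

The two-step version trades a single expression for two simple ones, which also avoids the ambiguous (\s|.)* alternation inside the lookarounds.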

Related

how to negate a capture group?

Using a javascript regexp, I would like to find strings like "/foo" or "/foo d/" but not "/foo /"; ie, "annotation character", then either word with no terminating annotation, or multiple words, where the termination comes at the end of the phrase (with no space). Complicating the situation, there are three possible annotation symbols: /, \ and |.
I've tried something like:
/(?:^|\s)([\\\/|])((?:[\w_-]+(?![^\1]+[\w_-]\1))|(?:[\w\s]+[\w](?=\1)))/g
That is, start with space, then annotation, then
word not followed by (anything but annotation) then letter and annotation... or
possibly multiple words, immediately followed by annotation character.
The problem is the [^\1]: inside the square brackets this doesn't read as "anything but the annotation character".
I could repeat the whole phrase three times, one for each annotation character. Any better ideas?

As you've mentioned, [^\1] doesn't work: inside a character class \1 is not a backreference, so the class matches anything that is not the character with code 1 (\x01). In JavaScript, you can negate \1 by using a lookahead: (?:(?!\1).)* . This is not as efficient, but it works.
Your pattern can be written as:
([\\\/|])([\w\-]+(?:(?:(?!\1).)*[\w\-]\1)?)
Working example at Regex101
\w already contains underscore.
Instead of alternation (a|ab) I'm using an optional group (a(?:b)?) - we always match the first word, with optional further words and tags.
You may still want to include (?:^|\s) at the beginning.
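To make the difference concrete, here is a minimal sketch comparing the character-class attempt with the lookahead version. The test strings are made up for illustration, and the two short patterns (withClass, withLookahead) are simplified stand-ins rather than the full expression from the question.

```javascript
// Inside a character class, \1 is not a backreference, so this class
// happily consumes the delimiter itself.
const withClass = /([\\\/|])([^\1]*)\1/;

// Lookahead version: take one character at a time, as long as it is
// not the delimiter captured in group 1.
const withLookahead = /([\\\/|])((?:(?!\1).)*)\1/;

const text = "|a|b|";
const classBody = withClass.exec(text)[2];
const lookaheadBody = withLookahead.exec(text)[2];

// The pattern suggested above, applied to a made-up sentence.
const annotation = /([\\\/|])([\w\-]+(?:(?:(?!\1).)*[\w\-]\1)?)/;
const phrase = annotation.exec("say /foo d/ now")[0];

console.log(classBody, lookaheadBody, phrase);
```

The class-based pattern runs through to the last delimiter, while the lookahead version stops at the first one.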

Select a character if some character from a list is before the character

I have this regular expression:
/([a-záäéěíýóôöúüůĺľŕřčšťžňď])-$\s*/gmi
This regex selects č- from my text:
sme! a Želiezovce 2015: Spoloíč-
ne pre Európu. Oslávili aj 940.
But I want to select only - (without č) (if some character from the list [a-záäéěíýóôöúüůĺľŕřčšťžňď] is before the -).

In other languages you would use a lookbehind
/(?<=[a-záäéěíýóôöúüůĺľŕřčšťžňď])-$\s*/gmi
This matches -$\s* only if it's preceded by one of the characters in the list.
However, JavaScript engines that predate ES2018 don't support lookbehind, so the workaround there is to use a capturing group for the part of the regular expression after it.
var match = /[a-záäéěíýóôöúüůĺľŕřčšťžňď](-$\s*)/gmi.exec(string);
When you use this, match[1] will contain the part of the string beginning with the hyphen.
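As a rough sketch, both approaches can be tried on the text from the question. The lookbehind variant assumes an ES2018+ engine, and the last step (rejoining the hyphenated word) is only one guess at what the match might be used for.

```javascript
const letters = "a-záäéěíýóôöúüůĺľŕřčšťžňď";
const text = "sme! a Želiezovce 2015: Spoloíč-\nne pre Európu. Oslávili aj 940.";

// Capturing-group workaround: group 1 holds the hyphen and the whitespace after it.
const withGroup = new RegExp("[" + letters + "](-$\\s*)", "mi");
const groupMatch = withGroup.exec(text);

// Lookbehind version: the letter is checked but is not part of the match.
const withLookbehind = new RegExp("(?<=[" + letters + "])-$\\s*", "mi");
const lookbehindMatch = withLookbehind.exec(text);

// One possible use: rejoin a word that was hyphenated across a line break.
const rejoin = new RegExp("([" + letters + "])-$\\s*", "gmi");
const joined = text.replace(rejoin, "$1");

console.log(JSON.stringify(groupMatch[1]), JSON.stringify(lookbehindMatch[0]), joined);
```

Note that the captured part includes the line break, because \s* consumes it after $ has matched at the end of the line.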

First, everything you put in parentheses in a regex is captured during matching, so the match array contains the full matched string at its 0 position, followed by the contents of each pair of parentheses from left to right.
/[a-záäéěíýóôöúüůĺľŕřčšťžňď](-)$\s*/gmi
For your string, this would return a match whose first element is the full match (č- plus any trailing whitespace) and whose second element is "-", so you can extract the specific data you need from your match.
Also, the $ character matches at the end of a line because you are using the multiline flag, so the trailing \s* can only consume the line break and any whitespace at the start of the next line.
If you don't need that whitespace in the match, the regex can be reduced to /[a-záäéěíýóôöúüůĺľŕřčšťžňď](-)$/gmi

How to make a JS RegEx not to match if a certain string appear?

I am trying to build a RegEx in JavaScript that does not match if a certain string appears. I want to match the curly bracket { but only if the string else does not come right before it.
I tried ^ *[^else]* *{.*$ but this fails to match whenever any character of the string else appears. For example, it also does not match this:
erai {
I want to match every case where { appears except for the case else {.
Could you please help me? Here is my DEMO

You can use a negative lookahead. This is supported by JavaScript:
(?!\s*else).+ *({).*$|
DEMO
JavaScript RegEx doesn't support ifs but we can use a trick for it to work:
(?!RegExp)
That's the first part, if RegExp (which is a regex) doesn't appear, then we do the code after that:
.+ *({).*$
That's the RegEx we run. Broken down, it is:
.+ Match anything
* Until 0 - unlimited spaces
({) Capture the {
.*$ Match anything till the end
Now this won't work unless we add a | at the end, or an OR. This will trick it into working like an if statement
Debuggex Demo

What you are looking for is called negative lookbehind.
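A minimal sketch of that idea, assuming an engine with lookbehind support (ES2018 or later). The sample lines are made up, apart from erai { and else { from the question.

```javascript
// Match "{" unless the word "else" (plus optional whitespace) comes right before it.
const braceNotAfterElse = /(?<!\belse\s*)\{/;

const lines = ["if (a) {", "else {", "erai {", "} else {", "function f() {"];
const matching = lines.filter((line) => braceNotAfterElse.test(line));

console.log(matching);
```

Unlike the character class [^else], which excludes the individual letters e, l and s, the lookbehind checks for the whole word.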

Can it be done with regex?

Given the following regex: ([a-zA-Z0-9//._-]{3,12}[^//._-]) used as pattern="([a-zA-Z0-9/._-]{3,12}[^/._-])" to validate an HTML text input for a username, I wonder if there is any way of telling it to check that the string contains only one of the following: ., -, _
By that I mean that I need a regex that would accomplish the following (if possible):
alex-how => Valid
alex-how. => Not valid, because finishing in .
alex.how => Valid
alex.how-ha => Not valid, contains already a .
alex-how_da => Not valid, contains already a -
The problem with my current regex is that, for some reason, it accepts any character at the end of the string that is not ., _ or -, and I can't figure out why.
The other problem is that it doesn't check that the string contains only one of the allowed special characters.
Any ideas?

Try this one out:
^(?!(.*[.|_|-].*){2})(?!.*[.|_|-]$)[a-zA-Z0-9//._-]{3,12}$
Regexpal link. The regex above allows at most one of ., _ or -.

What you want is one or more alphanumeric characters (upper case, lower case, or digits), followed by at most one of "-", ".", or "_", followed by at least one more alphanumeric character. Note that | is a literal character inside a character class, so it is left out here:
^[a-zA-Z0-9]+[-_.]?[a-zA-Z0-9]+$

Hope this will work for you:
It says: start with word characters, optionally followed by -, . or _, and end with a word character.
^[\w\d]*[-_\.\w\d]*[\w\d]$

Seems to me you want:
^[A-Za-z0-9]+(?:[\._-][A-Za-z0-9]+)?$
Breaking it down:
^: beginning of line
[A-Za-z0-9]+: one or more alphanumeric characters
(?:[\._-][A-Za-z0-9]+)?: (optional, non-captured) one of your allowed special characters followed by one or more alphanumeric characters
$: end of line
It's unclear from your question if you wanted one of your special characters (., -, and _) to be optional or required (e.g., zero-or-one versus exactly-one). If you actually wanted to require one such special character, you would just get rid of the ? at the very end.
Here's a demonstration of this regular expression on your example inputs:
http://rubular.com/r/SQ4aKTIEF6
As for the length requirement (between 3 and 12 characters): This might be a cop-out, but personally I would argue that it would make more sense to validate this by just checking the length property directly in JavaScript, rather than over-complicating the regular expression.
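A sketch of that split, with the shape checked by the regular expression above and the length checked separately. The function name isValidUsername is made up for illustration; the sample inputs are the ones from the question.

```javascript
// Shape: alphanumerics, optionally one separator followed by more alphanumerics.
const shape = /^[A-Za-z0-9]+(?:[._-][A-Za-z0-9]+)?$/;

// Hypothetical helper: length is checked in plain JavaScript, shape by the regex.
function isValidUsername(name) {
  return name.length >= 3 && name.length <= 12 && shape.test(name);
}

const samples = ["alex-how", "alex-how.", "alex.how", "alex.how-ha", "alex-how_da"];
const results = samples.map(isValidUsername);

console.log(results);
```

Keeping the two checks apart also makes it easier to report which rule a username broke.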

^(?=[a-zA-Z0-9/._-]{3,12}$)[a-zA-Z0-9]+(?:[/._-][a-zA-Z0-9]+)?$
or, as a JavaScript regex literal:
/^(?=[a-zA-Z0-9\/._-]{3,12}$)[a-zA-Z0-9]+(?:[\/._-][a-zA-Z0-9]+)?$/
The lookahead, (?=[a-zA-Z0-9/._-]{3,12}$), does the overall-length validation.
Then [a-zA-Z0-9]+ ensures that the name starts with at least one non-separator character.
If there is a separator, (?:[/._-][a-zA-Z0-9]+)? ensures that there's at least one non-separator following it.
Note that / has no special meaning in a regex. You only have to escape it if you're using a regex literal (because / is the regex delimiter), and you escape it by prefixing with a backslash, not another forward-slash. And inside a character class, you don't need to escape the dot (.) to make it match a literal dot.
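The $ inside the lookahead is what enforces the upper length bound. A short sketch of the difference, using made-up test names:

```javascript
// Lookahead anchored with $: the whole string must be 3 to 12 allowed characters.
const anchored = /^(?=[a-zA-Z0-9\/._-]{3,12}$)[a-zA-Z0-9]+(?:[\/._-][a-zA-Z0-9]+)?$/;

// Without the $, the lookahead only requires that the first 3 characters are allowed.
const unanchored = /^(?=[a-zA-Z0-9\/._-]{3,12})[a-zA-Z0-9]+(?:[\/._-][a-zA-Z0-9]+)?$/;

const tooLong = "abcdefghijklmnop"; // 16 characters
const fine = "alex-how";

const checks = {
  anchoredTooLong: anchored.test(tooLong),
  unanchoredTooLong: unanchored.test(tooLong),
  anchoredFine: anchored.test(fine),
  unanchoredFine: unanchored.test(fine),
};

console.log(checks);
```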

The dot in regex has a special meaning: "any character here".
If you mean a literal dot, you should escape it to tell the regex parser so.
Escape dot in a regex range

Javascript lookahead regular expression

I'm trying to write a regular expression to parse the following string out into three distinct parts. This is for a highlighting engine I'm writing:
"\nOn and available after solution."
I have a regular expression that's dynamically created for any word a user might input. In the above example, the word is "on".
The regular expression expects any amount of white space ([\s]*) followed by the search word, with no -\w following it (e.g. on-time and on-wards should not be valid results). To complicate this, there can be a -, $, < or > symbol following the search word, so on-, on> or on$ are valid. This is why there is a negative lookahead after the search word in my regular expression below.
There's a complicated reason for this, but it's not relevant to the question. The last part should be the rest of the sentence. In this example, " and available after solution."
So,
p1 = "\n"
p2 = "On"
p3 = " and available after solution"
I currently have the following regular expression.
test = new RegExp('([\\s]*)(on(?!\\-\\w))([$\\-><]*?\\s(?=[.]*))',"gi")
The first part of this regular expression ([\\s]*)(on(?!\\-\\w))[$\\-><]*? works as expected. The last part does not.
In the last part, what I'm trying to do is force the regular expression engine to match whitespace before matching additional characters. If it can not match a space, then the regular expression should end. However, when I run this regular expression, I get the following results
str1 = "\nOn ly available after solution."
test.exec(str1)
["\n On ", "\n ", "On"]
So it would appear to me that the last positive look ahead is not working. Thanks for any suggestions, and if anyone needs some clarification, let me know.
EDIT:
It would appear that my regular expression was not matching because I didn't realize the following caveat:
You can use any regular expression inside the lookahead. (Note that this is not the case with lookbehind. I will explain why below.) Any valid regular expression can be used inside the lookahead. If it contains capturing parentheses, the backreferences will be saved. Note that the lookahead itself does not create a backreference. So it is not included in the count towards numbering the backreferences. If you want to store the match of the regex inside a backreference, you have to put capturing parentheses around the regex inside the lookahead, like this: (?=(regex)). The other way around will not work, because the lookahead will already have discarded the regex match by the time the backreference is to be saved.

The dot in the character class [.] means a literal dot. Change it to just . if you wish to match any character.
The lookahead (?=.*) will always match and is completely pointless. Change it to (.*) if you just want to capture that part of the string.
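Applied to the expression from the question, that suggestion might look like the sketch below. The helper name splitAroundWord is made up, the search word is hard-coded as an assumption, and the g flag is dropped so that repeated calls do not depend on lastIndex.

```javascript
const word = "on"; // assumed search term, as in the question

// Leading whitespace, the search word (not followed by -\w), then optional
// symbols, one whitespace character, and the rest of the line in one group.
const pattern = new RegExp("(\\s*)(" + word + "(?!-\\w))([$\\-><]*?\\s.*)", "i");

// Hypothetical helper returning the three parts, or null when there is no match.
function splitAroundWord(text) {
  const m = pattern.exec(text);
  return m ? { p1: m[1], p2: m[2], p3: m[3] } : null;
}

const parts = splitAroundWord("\nOn and available after solution.");
const hyphenated = splitAroundWord("on-time is fine");
const longerWord = splitAroundWord("only available later");

console.log(parts, hyphenated, longerWord);
```

Because the third group requires a whitespace character right after the word (and any symbols), a longer word such as only is not split.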

I think the problem is that your negative lookahead on(?!\-\w) only rejects an on that is followed by - and then \w. I think what you want instead is on(?!\-|\w), which matches an on that is not followed by - OR \w
