Regex - Only detect if all single digits in all four octets [duplicate] - javascript

Overview:
I am trying to combine two REGEX queries into one:
\d+\.\d+\.\d+\.\d+
^(?!(10\.|169\.)).*$
I wrote this as a two-part query. The first part isolates IPs in a block of text; after I copy and paste the results, the second part selects everything that does not begin with 10 or 169.
Questions:
It seems like I am over complicating this:
Can anybody see a better way to do this?
Is there a way to combine these two queries?

Sure. Just put the anchored negative look ahead at the start:
^(?!10\.|169\.)\d+\.\d+\.\d+\.\d+$
Note: Unnecessary brackets have been removed.
To match within a line, remove the anchors and use a word boundary (\b) as the leading anchor:
\b(?!10\.|169\.)\d+\.\d+\.\d+\.\d+
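As a rough illustration of the word-boundary version in JavaScript, the sketch below adds the g flag so that every match in a block of text is collected in one pass. The sample text and variable names are made up for this example.

```javascript
// Word-boundary version from the answer, with the g flag to collect every match.
const ipNot10Or169 = /\b(?!10\.|169\.)\d+\.\d+\.\d+\.\d+/g;

// Hypothetical sample text, invented for illustration.
const sample = "hosts: 10.0.0.1, 192.168.1.20, 169.254.3.4 and 8.8.8.8";

// match() returns null when nothing is found, so fall back to an empty array.
const found = sample.match(ipNot10Or169) || [];
console.log(found); // [ '192.168.1.20', '8.8.8.8' ]
```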

A quick-and-gimme-regex style answer
Basic one (whole string looks like an IP): ^\d+\.\d+\.\d+\.\d+$
Lite (four period-separated digit chunks, as a whole word): \b\d+\.\d+\.\d+\.\d+\b
Medium (excluding junk like 1.2.4.6.7.9.0): (?<!\d\.)\b\d+\.\d+\.\d+\.\d+\b(?!\.\d+)
Advanced 1 (not starting with 10 or 169): (?<!\d\.)\b(?!(?:1(?:0|69))\.)\d+\.\d+\.\d+\.\d+\b(?!\.\d+)
Advanced 2 (not ending with 8 or 10): (?<!\d\.)\b\d+\.\d+\.\d+\.(?!(?:8|10)\b)\d+\b(?!\.\d+)
Details for the curious
The \b is a word boundary that makes it possible to match exact "words" (entities consisting of [a-zA-Z0-9_] characters) inside a longer text. So, if we do not want to match 12.12.23.56 inside g12.12.23.56g, we use the Lite version.
The lookarounds together with the word boundary, make it possible to further restrict the matches. (?<!\d\.) - a negative lookbehind - and a (?!\.\d+) - a negative lookahead - will fail a match if the IP-resembling substring is preceded with a digit+. or followed with a .+digit. So, we do not match 12.12.34.56.78.90899-like entities with this regex. Choose Medium regex for that case.
Now, you need to restrict the matches to those that do not start with some numeric value. You can use either a lookbehind or a lookahead. When choosing between the two, prefer the lookahead, because 1) it is less resource consuming, and 2) more flavors support it. Thus, to fail all matches where the first number of the IP is 10 or 169, we can use a negative lookahead anchored after the leading word boundary: (?!(?:1(?:0|69))\.). The syntax is (?!...), and inside it we match either 1 followed by 0 and then a literal dot, or 1 followed by 69 and then a literal dot. Note that we could write (?!10\.|169\.), but then there is some redundant backtracking overhead, as the leading 1 is repeated in both branches. Best practice is to "contract" alternations so that the beginning of each branch does not repeat, which makes the alternation group more linear. So, use the Advanced 1 regex version to get those IPs.
A similar case is the Advanced 2 regex for getting some IPs that do not end with some value.
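A small sketch of how the Advanced 1 pattern might behave in JavaScript, assuming an engine with lookbehind support (ES2018 or later). The input string is invented to cover the cases discussed above: excluded prefixes, a too-long dotted sequence, and an IP glued to letters.

```javascript
// "Advanced 1" from the list above. The lookbehind (?<!\d\.) needs an
// ES2018-capable engine.
const advanced1 = /(?<!\d\.)\b(?!(?:1(?:0|69))\.)\d+\.\d+\.\d+\.\d+\b(?!\.\d+)/g;

// Hypothetical input, made up for this example.
const text = "a 10.1.2.3 b 169.254.0.1 c 172.16.0.5 d 1.2.4.6.7.9.0 e g12.12.23.56g";

const ips = text.match(advanced1) || [];
console.log(ips); // [ '172.16.0.5' ]
```

Only 172.16.0.5 survives here: the first two candidates start with an excluded number, the fourth has more than four chunks, and the last one is not a whole word.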

Related

Trying to exclude match when surrounded on both sides by a certain string

What I'm looking to do is to modify a regex (JS flavor) to not match if the pattern is both preceded and followed by the same string.
By way of a simple analogy, say I want to match all instances of n that are not both preceded and followed by e. So, for example, the regex should not match the n in alkene, but it should still match the n in pen or nest, which only have the e directly adjacent to n on one side, not both.
Most older threads I've seen trying to find an answer basically say "just use negative lookarounds", but the problem is that (?<!e)n(?!e) doesn't match any of those inputs - because the lookbehind and lookahead are processed by the regex engine separately, so it considers either condition to be sufficient to exclude the match.
(The real regex is (?<!¸ª)()(ɣʷ|h₂|r₂|r₃|w|j)(?:e|o|ø|ɑ|i|ɚ|y|u|a)(?!¸ª) and it's failing to match the ɣʷ in t͡ʃe:h₁dɣʷo¸ªh₂¸ª, but that makes the problem look a lot harder to explain than it needs to be)
How do you modify a regex to only exclude patterns when they're nested?
The (?<!b)a(?!b) pattern here must be replaced with (?<!b(?=ab))a or a(?!(?<=ba)b). The point is to call a reverse lookahead or lookbehind from lookbehind or lookahead.
See your pattern fix (without any optimizations), where I took the lookahead, pasted it inside the lookbehind after ¸ª, reversed the lookahead (i.e. made it positive) and added the whole pattern before ¸ª in the lookahead to be able to get to the right-hand ¸ª:
(?<!¸ª(?=(ɣʷ|h₂|r₂|r₃|w|j)(?:e|o|ø|ɑ|i|ɚ|y|u|a)¸ª))()(ɣʷ|h₂|r₂|r₃|w|j)(?:e|o|ø|ɑ|i|ɚ|y|u|a)
Or, if you put the lookbehind into lookahead:
()(ɣʷ|h₂|r₂|r₃|w|j)(?:e|o|ø|ɑ|i|ɚ|y|u|a)(?!(?<=¸ª(ɣʷ|h₂|r₂|r₃|w|j)(?:e|o|ø|ɑ|i|ɚ|y|u|a))¸ª)
See the regex demo (and regex demo #2).
Whenever your pattern is simple, it is best not to repeat the pattern in the lookarounds. You can usually just use . or .{x}, where x stands for the number of chars your consuming pattern part can match. Here, it is not clear how many chars the pattern can actually match. You could probably use (?<!¸ª(?=.{1,2}¸ª))()(ɣʷ|h₂|r₂|r₃|w|j)(?:e|o|ø|ɑ|i|ɚ|y|u|a), but I do not have any edge cases to test against.
Enhancing this further may yield (?<!¸ª(?=.{1,2}¸ª))()(ɣʷ|[hr]₂|r₃|w|j)([eoøɑiɚyua]) (demo).
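To make the nesting idea concrete, here is a minimal sketch using the simplified n/e analogy from the question rather than the real pattern. It assumes lookbehind support (ES2018 or later); the variable names are illustrative only.

```javascript
// Naive version from the question: rejects "n" if EITHER neighbour is "e".
const naive = /(?<!e)n(?!e)/;
// Lookahead nested in the lookbehind: rejects "n" only if BOTH neighbours are "e".
const nestedBehind = /(?<!e(?=ne))n/;
// The mirror image: lookbehind nested in the lookahead.
const nestedAhead = /n(?!(?<=en)e)/;

for (const word of ["alkene", "pen", "nest"]) {
  console.log(word, naive.test(word), nestedBehind.test(word), nestedAhead.test(word));
}
// alkene false false false
// pen false true true
// nest false true true
```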

Javascript RegEx match 1-1-1 and 1-1-1-1-1 but not -1-1-1-1 or 1-1-1-1-

I haven't found anything using Google or Stack Overflow.
I need to match 1-1-1 but not -1-1-1 or 1-1-1- with a JavaScript RegEx.
So it has to start with a number, end with a number, and be separated with "-".
I can't figure out, how to do it.
Is it even possible?
Unfortunately, JavaScript regex historically had no look-behind (it was only added in ES2018, so older engines lack it; see javascript regex - look behind alternative?). To exclude a preceding - without it, the regex has to match the preceding character too (as long as it's not a -).
Since there might not be a preceding character (input starts with 1), you have to also match on beginning of input (^).
So, this regex will do it: (?:[^-]|^)(1(?:-1)+)(?!-)
See regex101.com.
Whether it should match a standalone 1, or only on 1-1 (and longer), is up to you. The regex above will not match standalone 1. Change + to * to change that.
I also added capturing of the actual text you wanted to match, i.e. without the leading character. You can remove the extra () around 1(?:-1)+ if that's not needed.
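A short sketch of how the capture group could be used from JavaScript. The helper name extractRun is hypothetical and only wraps the regex from this answer.

```javascript
// Regex from the answer: group 1 holds the run without the leading character.
const runPattern = /(?:[^-]|^)(1(?:-1)+)(?!-)/;

// Hypothetical helper: returns the captured run, or null when nothing matches.
function extractRun(input) {
  const m = runPattern.exec(input);
  return m ? m[1] : null;
}

console.log(extractRun("1-1-1"));     // 1-1-1
console.log(extractRun("1-1-1-1-1")); // 1-1-1-1-1
console.log(extractRun("-1-1-1"));    // null
console.log(extractRun("1-1-1-"));    // null
```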

Regex exact match on number, not digit

I have a scenario where I need to find and replace a number in a large string using JavaScript. Let's say I have the number 2 and I want to replace it with 3. It sounds pretty straightforward until I get occurrences like 22, 32, etc.
The string may look like this:
"note[2] 2 2_ someothertext_2 note[32] 2finally_2222 but how about mymomsays2."
I want to turn it into this:
"note[3] 3 3_ someothertext_3 note[32] 3finally_2222 but how about mymomsays3."
Obviously this means .replace('2','3') is out of the picture so I went to regex. I find it easy to get an exact match when I am dealing with string start to end ie: /^2$/g. But that is not what I have. I tried grouping, digit only, wildcards, etc and I can't get this to match correctly.
Any help on how to exactly match a number (where 0 <= number <= 500 is possible, but no constraints needed in regex for range) would be greatly appreciated.
The task is to find (and replace) "single" digit 2, not embedded in
a number composed of multiple digits.
In regex terms, this can be expressed as:
Match digit 2.
Previous char (if any) can not be a digit.
Next char (if any) can not be a digit.
The regex for the first condition is straightforward - just 2.
In other flavours of regex, e.g. PCRE, to forbid the previous
char you could use negative lookbehind, but JavaScript regex did
not support it before ES2018.
So, to circumvent this, we must:
Put a capturing group matching either start of text or something
other than a digit: (^|\D).
Then put regex matching just 2: 2.
The last condition, fortunately, can be expressed as negative lookahead,
because even Javascript regex support it: (?!\d).
So the whole regex is:
(^|\D)2(?!\d)
Having found such a match, you have to replace it with the content
of the first capturing group and 3 (the replacement digit).
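The replacement step described above could look roughly like this in JavaScript. The function name replaceWholeNumber is a hypothetical helper; it builds the same (^|\D)…(?!\d) pattern for any non-negative integer and uses a replacer function so that the captured prefix is put back explicitly.

```javascript
// Hypothetical helper: replace the standalone number `from` with `to`.
// `from` is assumed to be a non-negative integer, so it needs no regex escaping.
function replaceWholeNumber(text, from, to) {
  const re = new RegExp("(^|\\D)" + from + "(?!\\d)", "g");
  // `before` is group 1: either the empty string (start of text) or one non-digit.
  return text.replace(re, (match, before) => before + to);
}

const input = "note[2] 2 2_ someothertext_2 note[32] 2finally_2222 but how about mymomsays2.";
console.log(replaceWholeNumber(input, 2, 3));
// note[3] 3 3_ someothertext_3 note[32] 3finally_2222 but how about mymomsays3.
```

A replacer function sidesteps any ambiguity about how a replacement string such as $1 followed by a digit is parsed.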
You can use negative look-ahead:
(\D|^)2(?!\d)
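Replace with: ${1}3 (in JavaScript's replace(), ${1} is not a recognized placeholder; write $13 as in the snippet further down, or use a replacer function)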
If look behind is supported:
(?<!\d)2(?!\d)
Replace with: 3
See regex in use here
(\D|\b)2(?!\d)
(\D|\b) Capture either a non-digit character or a position that matches a word boundary
(?!\d) Negative lookahead ensuring what follows is not a digit
Alternatives:
(^|\D)2(?!\d) # Thanks to @Wiktor in the comments below
(?<!\d)2(?!\d) # At the time of writing works in Chrome 62+
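// Group 1 captures the character before the 2 so it can be restored via $1.
const regex = /(\D|\b)2(?!\d)/g
const str = `note[2] 2 2_ someothertext_2 note[32] 2finally_2222 but how about mymomsays2.`
// With only one capture group, "$13" is read as $1 followed by a literal 3.
const subst = "$13"
console.log(str.replace(regex, subst))
// note[3] 3 3_ someothertext_3 note[32] 3finally_2222 but how about mymomsays3.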

Regular Expressions - Match all alphanumeric characters except individual numbers

I would like to create a RegEx to match only english alphanumeric characters but ignore (or discard) isolated numbers in Ruby (and if possible in JS too).
Examples:
1) I would like the following to be matched:
4chan
9gag
test91323432
asf5asdfaf35edfdfad
afafaffe
But not:
92342424
343424
34432
and so on..
The above is exactly what I would want.
Edit: I deleted the second sub-question. Just focus on the first one, thank you very much for your answers!!
Sorry, my regex skills aren't that great (hence this question!)
Thank you.
You can try the following expression (written for Ruby; JavaScript does not support POSIX bracket classes such as [[:alnum:]], so there you would need an explicit class like [a-zA-Z0-9]):
^(?!^\d+$)[[:alnum:]]+$
This first ensures the string is not just digits by using a negative lookahead, (?!^\d+$), then it matches one or more alphanumeric characters. Unicode characters are supported, which means this works with French letters too.
EDIT: If you only want the English alphabet (note that \w also matches the underscore):
^(?!^\d+$)\w+$
Rubular Demo
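Since the question also asks about JS, here is a minimal JavaScript sketch of the same idea. It assumes an explicit ASCII class [a-zA-Z0-9] in place of the Ruby-only [[:alnum:]], and runs it against the examples from the question.

```javascript
// Same structure as the Ruby pattern, with an explicit ASCII class
// (an assumption for this sketch) instead of [[:alnum:]].
const alnumNotAllDigits = /^(?!\d+$)[a-zA-Z0-9]+$/;

// Examples taken from the question.
const samples = [
  "4chan", "9gag", "test91323432", "asf5asdfaf35edfdfad", "afafaffe",
  "92342424", "343424", "34432",
];

const kept = samples.filter((s) => alnumNotAllDigits.test(s));
console.log(kept);
// [ '4chan', '9gag', 'test91323432', 'asf5asdfaf35edfdfad', 'afafaffe' ]
```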
For any Latin letters:
/(?=.*\p{Alpha})\p{Alnum}+/
I'm pretty sure that you can't do what you want to do with one regex. A single alpha character, anywhere in a group of numbers, will make it a valid match, and there is no way to represent that in regex, because what you are really saying is something along the lines of "a letter is required at the front of this word, but only if there isn't a letter in the middle or at the end", and regex won't do that.
Your best bet is to do two passes:
one that matches your alphanumeric, plus special "French" characters (pattern: TBD, based on what special characters you want to accept), and
one that matches numbers only (pattern: would include [0-9]+ . . . need more information about the specific situation to give you a final, complete regex)
The values that you want in the end would need to pass the first regex and fail the second one.
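The two-pass approach described above might be sketched like this in JavaScript. Both patterns are placeholders, since the answer leaves the exact character set open; plain ASCII alphanumerics are assumed here.

```javascript
// Pass 1: placeholder pattern for "acceptable characters only" (ASCII assumed).
const isAlnum = (s) => /^[a-zA-Z0-9]+$/.test(s);
// Pass 2: numbers only.
const isDigitsOnly = (s) => /^[0-9]+$/.test(s);

// A value is wanted when it passes the first regex and fails the second.
const isWanted = (s) => isAlnum(s) && !isDigitsOnly(s);

console.log(isWanted("4chan"));    // true
console.log(isWanted("343424"));   // false
console.log(isWanted("afafaffe")); // true
```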
Also . . .
To give you a better answer, we'll need to know a couple of things:
Are you testing that an entire string matches the pattern?
Are you trying to capture a single instance of the pattern in a bigger string?
Are you trying to capture all of the instances of the pattern in a bigger string?
The answers to those questions have a big impact on the final regex pattern that you will need.
And, finally . . .
A note on the "French" characters . . . you need to be very specific about which special characters are acceptable and which aren't. There are three main approaches to special character matching in regex: groups, additive, and subtractive.
groups - these are characters that represent a preset group of characters in the version of regex that you are using. For example, \s matches all whitespaces
additive - this is the process of listing out each acceptable character (or range of characters) in your regex. This is better when you have a small group of acceptable characters
subtractive - this is the process of listing out each UNacceptable character (or range of characters) in your regex. This is better when you have a large group of acceptable characters
If you can clear up some of these questions, we should be able to give you a better answer.
Maybe this ^(?![0-9]+$)[a-zA-Z0-9\x80-\xa5]+$
Edit - fixed cut&paste error and added Extended character range \x80-\xa5
which includes the accent chars (depending on locale set, the figures may be different)

RegEx in JS to find No 3 Identical consecutive characters

How can I check for a sequence of 3 identical characters in JS using a regex, so that 'abb' is valid while 'abbb' is not? (The characters could be letters, digits, or non-alphanumerics.)
This question is a variation of the question that I have asked in here : How to combine these regex for javascript.
This is wrong : /(^([0-9a-zA-Z]|[^0-9a-zA-Z]))\1\1/ , so what is the right way to do it?
This depends on what you actually mean. If you only want to match three non-identical characters (that is, if abb is valid for you), you can use this negative lookahead:
(?!(.)\1\1).{3}
It first asserts, that the current position is not followed by three times the same character. Then it matches those three characters.
If you really want to match 3 different characters (only stuff like abc), it gets a bit more complicated. Use these two negative lookaheads instead:
(.)(?!\1)(.)(?!\1|\2).
First match one character. Then we assert that this is not followed by the same character. If so, we match another character. Then we assert that these are followed neither by the first nor the second character. Then we match a third character.
Note that those negative lookaheads ((?!...)) do not consume any characters. That is why they are called lookaheads. They just check what is coming next (or in this case what is not coming next) and then the regex continues from where it left off. Here is a good tutorial.
Note also that this matches anything but line breaks, or really anything if you use the DOTALL or SINGLELINE option. Since you are using JavaScript, you can just activate the option by appending s after the regex's closing delimiter. If (for some reason) you don't want to use this option, replace each . with [\s\S] (this always matches any character).
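The two patterns above can be compared side by side with a quick JavaScript sketch. The firstMatch helper and the test strings are made up for illustration.

```javascript
// Any 3-character window that is not one character repeated three times.
const notThreeSame = /(?!(.)\1\1).{3}/;
// Three consecutive characters that are all different from each other.
const threeDifferent = /(.)(?!\1)(.)(?!\1|\2)./;

// Hypothetical helper: the first matched substring, or null.
const firstMatch = (re, s) => {
  const m = s.match(re);
  return m ? m[0] : null;
};

console.log(firstMatch(notThreeSame, "aaab"));    // aab
console.log(firstMatch(notThreeSame, "aaa"));     // null
console.log(firstMatch(threeDifferent, "aabca")); // abc
console.log(firstMatch(threeDifferent, "abab"));  // null
```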
Update:
After clarification in the comments, I realised that you do not want to find three non-identical characters, but instead you want to assert that your string does not contain three identical (and consecutive) characters.
This is a bit easier, and closer to your former question, since it only requires one negative lookahead. What we do is this: we search the string from the beginning for three consecutive identical characters. But since we want to assert that these do not exist we wrap this in a negative lookahead:
^(?!.*(.)\1\1)
The lookahead is anchored to the beginning of the string, so this is the only place where we will look. The pattern in the lookahead then tries to find three identical characters from any position in the string (because of the .*; the identical characters are matched in the same way as in your previous question). If the pattern finds these, the negative lookahead fails, and so the string is invalid. If no three identical characters can be found, the inner pattern never matches, so the negative lookahead succeeds.
To find three characters that are not all identical, use this regex pattern:
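As a quick check of the anchored negative lookahead in JavaScript, the following sketch validates a few invented strings; hasNoTriple is a hypothetical name.

```javascript
// Succeeds only when the string contains no three identical consecutive characters.
const noTriple = /^(?!.*(.)\1\1)/;

// Hypothetical wrapper for readability.
const hasNoTriple = (s) => noTriple.test(s);

console.log(hasNoTriple("abb"));    // true
console.log(hasNoTriple("abbb"));   // false
console.log(hasNoTriple("aabbcc")); // true
console.log(hasNoTriple("a111b"));  // false
```

Because . does not match line breaks by default, a multi-line input would need the s flag or [\s\S] in place of the dots.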
([\s\S])(?!\1\1)[\s\S]{2}
