Catching start number and final number - javascript

I'm trying to create a regex that catches the first number in a line and the last one, but I'm having a problem with the last one:
The lines look like this:
00005 SALARIO MENSAL 17030 36.397.291,92 36.397.291,92
00010 HORAS TRABALHADAS 0798 19.731,93 19.731,93
And this is my regex:
(^\d+).*(\d)
As you can see here: http://regexr.com/3crbt it is not working as expected. I can get the first number, but the second group captures only the last digit.
Thanks!

You can use
/^(\d+).*?(\d+(?:[,.]\d+)*)$/gm
See the regex demo
The regex matches:
^ - start of the line
(\d+) - captures into Group 1 one or more digits
.*? - matches any characters but a newline, as few as possible up to
(\d+(?:[,.]\d+)*) - one or more digits followed with zero or more sequences of , or . followed with one or more digits (Group 2)
$ - end of the line
The /g modifier ensures we get all matches and /m modifier makes the ^ and $ match start and end of a line respectively.
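As a quick check, here is a minimal sketch that runs the regex above over the two sample lines from the question. The variable names (text, re, pairs) are only illustrative.

```javascript
// Sample lines copied from the question, joined into one multi-line string.
const text = [
  "00005 SALARIO MENSAL 17030 36.397.291,92 36.397.291,92",
  "00010 HORAS TRABALHADAS 0798 19.731,93 19.731,93",
].join("\n");

const re = /^(\d+).*?(\d+(?:[,.]\d+)*)$/gm;

// matchAll requires the g flag; each match exposes both capture groups.
const pairs = [...text.matchAll(re)].map((m) => [m[1], m[2]]);

console.log(pairs);
// → [ [ '00005', '36.397.291,92' ], [ '00010', '19.731,93' ] ]
```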

I tried the following one:
(^(\d+))|(\d+$)
And it seems to work on regexr.com. But pairing the matches up might require assuming that each line has at least two numbers.
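A small sketch of what this alternation returns, assuming the g and m flags are added (they are not part of the pattern as posted). The result is a flat list, so the pairing has to be done by hand, and the second alternative picks up only the trailing run of digits rather than the whole formatted number.

```javascript
const text = [
  "00005 SALARIO MENSAL 17030 36.397.291,92 36.397.291,92",
  "00010 HORAS TRABALHADAS 0798 19.731,93 19.731,93",
].join("\n");

// g: find every match; m: let ^ and $ work per line.
const flat = text.match(/(^(\d+))|(\d+$)/gm);

console.log(flat);
// → [ '00005', '92', '00010', '93' ]

// Pairing relies on every line contributing exactly two matches.
const pairs = [];
for (let i = 0; i + 1 < flat.length; i += 2) {
  pairs.push([flat[i], flat[i + 1]]);
}
console.log(pairs);
// → [ [ '00005', '92' ], [ '00010', '93' ] ]
```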

You need to make the .* non-greedy by changing it to .*? and add + to the second digit sequence match.
^(\d+).*?(\d+)$
If you want to match the full last number, use this:
^(\d+).*?([\d\.,]+)$
Example
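To make the difference concrete, here is a short sketch comparing the original greedy pattern with the two lazy variants on one of the sample lines. The helper name lastGroup is made up for this illustration.

```javascript
const line = "00010 HORAS TRABALHADAS 0798 19.731,93 19.731,93";

// Returns the second capture group of the first match.
const lastGroup = (re) => line.match(re)[2];

// Greedy .* swallows everything and gives back a single digit.
const greedy = lastGroup(/(^\d+).*(\d)/);
// Lazy .*? plus \d+ stops at the last run of plain digits.
const lazyDigits = lastGroup(/^(\d+).*?(\d+)$/);
// Allowing . and , in the class captures the whole formatted number.
const lazyFull = lastGroup(/^(\d+).*?([\d\.,]+)$/);

console.log(greedy);     // → 3
console.log(lazyDigits); // → 93
console.log(lazyFull);   // → 19.731,93
```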

Related

I want a regex that will select all characters including special characters, except the Last 4 Characters. Vanilla JS

let str = "AAAAA-0000021111-1111";
let res = str.match(/\d(?=\d{4})/g);
document.write(res);
// This didn't work, the output is given below:
Output:
0,0,0,0,0,2
// It only selects the digits 000002 in AAAAA-0000021111-1111
And this is what I am expecting:
A,A,A,A,A,-,0,0,0,0,0,2,1,1,1,1,-
Basically, I want all characters to be selected including - or any other special characters except the last 4.
I am listing a couple of extra samples for a better understanding
Sample1: ABC-101010-1111
Expected output is: A,B,C,-,1,0,1,0,1,0,-
Sample2: ABCD101010-11111
Expected output is: A,B,C,D,1,0,1,0,1,0,-,1
I am using Vanilla JS.
Really appreciate your involvement in this.
Thanks in Advance Sir/Ma'am!!!
You can use
/(?!\d{1,4}$)./g
See the regex demo.
NOTE: If you need to match all chars except the last four chars of any kind, replace \d with . in the lookahead. It can also be written in a slightly more succinct way:
/.(?!.{0,3}$)/g
See this regex demo.
Details:
. - any single char other than line break char
(?!.{0,3}$) - a negative lookahead that matches a location not immediately followed with zero to three chars other than line break chars, till end of string.
The first regex, /(?!\d{1,4}$)./g, matches any char other than a line break char (to match line breaks too, replace . with [^], [\s\S], [\d\D] or [\w\W], or add the s flag) that does not start a sequence of one, two, three or four digits at the end of the string ((?!\d{1,4}$)).
See a JavaScript demo:
console.log("AAAAA-0000021111-1111".match(/(?!\d{1,4}$)./g))
console.log("AAAAA-0000021111-1111".match(/.(?!.{0,3}$)/g))
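The two patterns agree on the samples from the question but diverge when the tail of the string is not all digits. A minimal sketch of that difference follows; the helper keep and the input "ID-42-7" are made up for illustration.

```javascript
const exceptTrailingDigits = /(?!\d{1,4}$)./g; // skips up to four digits at the very end
const exceptLastFour = /.(?!.{0,3}$)/g;        // skips the last four chars, whatever they are

// Joins all matched chars back into one string ("" when nothing matches).
const keep = (str, re) => (str.match(re) || []).join("");

console.log(keep("ABC-101010-1111", exceptTrailingDigits));  // → ABC-101010-
console.log(keep("ABC-101010-1111", exceptLastFour));        // → ABC-101010-
console.log(keep("ABCD101010-11111", exceptTrailingDigits)); // → ABCD101010-1
console.log(keep("ABCD101010-11111", exceptLastFour));       // → ABCD101010-1

// Hypothetical input whose last four chars are not all digits:
console.log(keep("ID-42-7", exceptTrailingDigits)); // → ID-42-
console.log(keep("ID-42-7", exceptLastFour));       // → ID-
```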

Why does my Regular Expression ignore the last character in each match

Below regex works properly except that it ignores the last character in each match.
\d{4}\b.*?(?=[^:]\d{4}(?! ml| kg)( [A-Za-z]{2}| \d{1}-| 1H-| [A-Za-z0-9],[A-Za-z0-9]| \D{1}-)|$)
My question is:
How can this be updated to also include the last character in each match
Below an example of the data:
https://regex101.com/r/XRlr4Q/1
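The [^:] pattern in the lookahead requires a char other than : before the four digits that start the next match. Since the lazy .*? has to stop before that char for the lookahead to succeed, the char is left out of the match.
You need to use a negative lookbehind, (?<!:), there instead, because it checks the preceding char without requiring it to sit after the current position: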
\d{4}\b.*?(?=(?<!:)\d{4}(?! ml| kg)(?: [A-Za-z]{2}| \d-| 1H-| [A-Za-z0-9],[A-Za-z0-9]| \D-)|$)
See the regex demo.
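The effect can be reproduced with a stripped-down version of the two patterns. This is only a sketch: the shortened regexes and the input strings are invented for illustration and are not the data behind the regex101 link. It also assumes a JavaScript engine with lookbehind support (ES2018 or later).

```javascript
const input = "1111aaX2222bb";

// [^:] inside the lookahead needs one extra char in front of the next
// four digits, so the lazy .*? stops one char early.
const withConsumingClass = /\d{4}.*?(?=[^:]\d{4}|$)/g;
// (?<!:) only inspects the previous char, so nothing is lost.
const withLookbehind = /\d{4}.*?(?=(?<!:)\d{4}|$)/g;

console.log(input.match(withConsumingClass)); // → [ '1111aa', '2222bb' ]
console.log(input.match(withLookbehind));     // → [ '1111aaX', '2222bb' ]

// Digits preceded by a colon still do not end the match.
console.log("1111aa:2222bb".match(withLookbehind)); // → [ '1111aa:2222bb' ]
```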
I rewrote your regex a bit so that it matches:
4 digits
Followed by your special char and unlimited [A-a]
Then a positive lookahead to match everything until it sees the start sequence again or the end of file
I removed some things which you can add again if needed, but it works with your dataset.
\d{4}( [A-a]+).+?(?=\d{4}( [A-a]+) | $)
Here is an example based on your dataset:
https://regex101.com/r/HWYJEf/1

Capture a string (from a certain point) with regex not starting with certain letters

I am in the process of writing a regex that captures everything from a certain point if the string doesn't start with certain letters.
More precisely I want to capture everything from - up until a comma, only IF this string doesn't start with pt.
en-GB should capture -GB
But if the word starts with pt I simply want to skip the capture:
pt-BR should capture nothing.
I created this regex:
-[^,]*
Which works nicely except that this also captures strings beginning with pt.
Unfortunately I can't use lookbehinds since they're not supported by JS, so I tried using a negative lookahead like this:
^(?!pt).*
The problem is that this captures the entire string, and not just the part from - onwards. I tried replacing .* with something that starts capturing at - but I haven't been successful so far.
I am kinda new to regex so any guidance would be helpful.
To match pt- and any two letters at the start of the string or any two other letters, you may use
text.match(/^(?:pt-[a-zA-Z]{2}|[a-zA-Z]{2})/)
See the regex demo. Details:
^ - start of string
(?:pt-[a-zA-Z]{2}|[a-zA-Z]{2}) - either of the two alternatives:
pt-[a-zA-Z]{2} - pt- and any two ASCII letters
| - or
[a-zA-Z]{2} - any two ASCII letters
It looks like you need to use a .replace method for some reason. Then, you may use
text.replace(/\b(?!pt-)([A-Za-z]{2})-[a-zA-Z]{2}\b/, '$1')
See this regex demo. Details:
\b - a word boundary
(?!pt-) - no pt- allowed immediately to the right of the current location
([A-Za-z]{2}) - Group 1: any two ASCII letters
- - a hyphen
[a-zA-Z]{2} - any two ASCII letters
\b - a word boundary
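A short sketch of both suggestions applied to the two tags from the question. The function names stripRegion and leadingTag are only for illustration.

```javascript
// Drops the "-XX" region part unless the tag starts with "pt-".
const stripRegion = (tag) =>
  tag.replace(/\b(?!pt-)([A-Za-z]{2})-[a-zA-Z]{2}\b/, "$1");

// Returns "pt-XX" in full, or just the two leading letters otherwise.
const leadingTag = (tag) =>
  (tag.match(/^(?:pt-[a-zA-Z]{2}|[a-zA-Z]{2})/) || [""])[0];

console.log(stripRegion("en-GB")); // → en
console.log(stripRegion("pt-BR")); // → pt-BR
console.log(leadingTag("en-GB"));  // → en
console.log(leadingTag("pt-BR"));  // → pt-BR
```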

RegEx for matching all chars except for comma separated digits

I have an input that I want to apply validation to. User can type any integer (positive or negative) numbers separated with a comma. I want to
Some examples of allowed inputs:
1,2,3
-1,2,-3
3
4
22,-33
Some examples of forbidden inputs:
1,,2
--1,2,3
-1,2,--3
asdas
[]\%$1
I know a little about regex and have tried lots of approaches, but they're not working very well. See this inline regex checker:
^[-|\d][\d,][\d]
You can use
^(?:-?[0-9]+(?:,(?!$)|$))+$
https://regex101.com/r/PAyar7/2
-? - Lead with optional -
[0-9]+ - Repeat digits
(?:,(?!$)|$) - After the digits, match either a comma, or the end of the string. When matching a comma, make sure you're not at the end of the string with (?!$)
As per your requirements I'd use something simple like
^-?\d+(?:,-?\d+)*$
at start ^ an optional minus -? followed by \d+ one or more digits.
followed by (?:,-?\d+)* a quantified non capturing group containing a comma, followed by an optional hyphen, followed by one or more digits until $ end.
See your updated demo at regex101
Another perhaps harder to understand one which might be a bit less efficient:
^(?:(?:\B-)?\d+,?)+\b$
The quantified non capturing group contains another optional non capturing group with a hyphen preceded by a non word boundary, followed by 1 or more digits, followed by optional comma.
\b the word boundary at the $ end ensures, that the string must end with a word character (which can only be a digit here).
You can test this one here at regex101
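For a side-by-side check, the sketch below runs the three patterns from the answers against the allowed and forbidden inputs listed in the question. The object keys and the helper agrees are made-up names.

```javascript
const patterns = {
  lookahead: /^(?:-?[0-9]+(?:,(?!$)|$))+$/,
  simple: /^-?\d+(?:,-?\d+)*$/,
  boundary: /^(?:(?:\B-)?\d+,?)+\b$/,
};

// Inputs copied from the question; the backslash is escaped for the JS string literal.
const allowed = ["1,2,3", "-1,2,-3", "3", "4", "22,-33"];
const forbidden = ["1,,2", "--1,2,3", "-1,2,--3", "asdas", "[]\\%$1"];

// True when a pattern accepts every allowed input and rejects every forbidden one.
const agrees = (re) =>
  allowed.every((s) => re.test(s)) && forbidden.every((s) => !re.test(s));

for (const [name, re] of Object.entries(patterns)) {
  console.log(name, agrees(re));
}
// → lookahead true
// → simple true
// → boundary true
```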

Regex not to allow to consecutive dot characters and more

I am trying to make a JavaScript Regex which satisfies the following conditions
a-z are possible
0-9 are possible
dash, underscore, apostrophe, period are possible
ampersand, bracket, comma, plus are not possible
consecutive periods are not possible
period cannot be located in the start and the end
max 64 characters
Till now, I have come to following regex
^[^.][a-zA-Z0-9-_\.']+[^.]$
However, this allows consecutive dot characters in the middle and does not check for length.
Could anyone guide me how to add these 2 conditions?
You can use this regex
^(?!^[.])(?!.*[.]$)(?!.*[.]{2})[\w.'-]{1,64}$
Regex Breakdown
^ #Start of string
(?!^[.]) #Dot should not be in start
(?!.*[.]$) #Dot should not be at the end
(?!.*[.]{2}) #No consecutive two dots
[\w.'-]{1,64} #Match a character from the set at least once and at most 64 times.
$ #End of string
Correction in your regex
- shouldn't be in the middle of a character class, where it denotes a range. Put it at the start or the end of the class, or escape it
[a-zA-Z0-9_] is equivalent to \w
Here is a pattern which seems to work:
^(?!.*\.\.)[a-zA-Z0-9_'-](?:[a-zA-Z0-9_'.-]{0,62}[a-zA-Z0-9_'-])?$
Demo
Here is an explanation of the regex pattern:
^ from the start of the string
(?!.*\.\.) assert that two consecutive dots do not appear anywhere
[a-zA-Z0-9_'-] match an initial character (not dot)
(?: do not capture
[a-zA-Z0-9_'.-]{0,62} match up to 62 characters, including dot
[a-zA-Z0-9_'-] ending with a character, excluding dot
)? zero or one time
$ end of the string
Here comes my idea. Used \w (short for word character).
^(?!.{65})[\w'-]+(?:\.[\w'-]+)*$
^ at start (?!.{65}) look ahead for not more than 64 characters
followed by [\w'-]+ one or more of [a-zA-Z0-9_'-]
followed by (?:\.[\w'-]+)* any amount of non capturing group containing a period . followed by one or more [a-zA-Z0-9_'-] until $ end
And the demo at regex101 for trying
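Here is a sketch that checks the three patterns from the answers against a handful of inputs. The test strings and the names in the patterns object are invented for illustration. Note that \w also accepts uppercase letters and the underscore.

```javascript
const patterns = {
  lookaheads: /^(?!^[.])(?!.*[.]$)(?!.*[.]{2})[\w.'-]{1,64}$/,
  explicitEnds: /^(?!.*\.\.)[a-zA-Z0-9_'-](?:[a-zA-Z0-9_'.-]{0,62}[a-zA-Z0-9_'-])?$/,
  dotSeparated: /^(?!.{65})[\w'-]+(?:\.[\w'-]+)*$/,
};

// Hypothetical inputs covering each rule from the question.
const valid = ["a", "john.doe", "o'neil-x_1", "a".repeat(64)];
const invalid = [
  ".john",        // leading period
  "john.",        // trailing period
  "jo..hn",       // consecutive periods
  "a&b",          // ampersand
  "a,b",          // comma
  "a".repeat(65), // longer than 64 characters
  "",             // empty
];

// True when a pattern accepts every valid input and rejects every invalid one.
const agrees = (re) =>
  valid.every((s) => re.test(s)) && invalid.every((s) => !re.test(s));

for (const [name, re] of Object.entries(patterns)) {
  console.log(name, agrees(re));
}
// → lookaheads true
// → explicitEnds true
// → dotSeparated true
```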
