Regex that validates there is a space after a period in JavaScript

Consider the following strings:
I am happy.And I know it.
I am happy. And I know it.
I am happy.
I am happy
The rule is simple: there must be a space after the period. Only one of these should fail. I have tried:
new RegExp(/(\.\s|^)(?!\S)/)
The result is:
new RegExp(/(\.\s|^)(?!\S)/).test('I am happy.And I know it.')
false
new RegExp(/(\.\s|^)(?!\S)/).test('I am happy. And I know it.')
false
new RegExp(/(\.\s|^)(?!\S)/).test('I am happy.')
false
new RegExp(/(\.\s|^)(?!\S)/).test('I am happy')
false
Only the first one should fail. The rest should pass.
I think I am close; I just need to adjust it to say: "Is there any character after the period? If so, require a space."
Thoughts?

You could assert that the string does not contain a dot followed by a non-whitespace character.
^(?!.*?\.\S)
See the matched positions at the regex 101 demo.
console.log(/^(?!.*?\.\S)/.test('I am happy.And I know it.'));
console.log(/^(?!.*?\.\S)/.test('I am happy. And I know it.'));
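As a minimal sketch, the lookahead can be wrapped in a helper and run against all four strings from the question. The name hasSpaceAfterPeriods is made up for illustration and is not part of the original answer.

```javascript
// Hypothetical helper: returns true when no period is directly followed
// by a non-whitespace character.
function hasSpaceAfterPeriods(str) {
  return /^(?!.*?\.\S)/.test(str);
}

const samples = [
  'I am happy.And I know it.',  // period followed by "A": should fail
  'I am happy. And I know it.', // should pass
  'I am happy.',                // trailing period: should pass
  'I am happy'                  // no period at all: should pass
];

for (const s of samples) {
  console.log(JSON.stringify(s), hasSpaceAfterPeriods(s));
}
```

One caveat: without the s flag, . does not match line terminators, so on multi-line input this pattern only inspects the first line.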

Just test for a . followed by a word character and negate the result:
!(/\.\w/.test('I am happy.And I know it.'))
Or test for a . followed by a non-whitespace character and negate:
!(/\.\S/.test('I am happy.And I know it.'))

Have you considered testing for periods without a space after them using the regex \.[^ ]? Note that search returns -1 when nothing matches, so compare the result against -1, e.g.
str = "I am happy."
str.search(/\.[^ ]/) === -1
true

Related

JavaScript - regex to check whether the user wrote correctly formatted input

In my CLI, users can specify what they want to use.
A user command can look like this:
include=name1,name2,name3
category=name1,name2
category=name1
In other words, a command always consists of 3 parts:
command name: can be just include or category
=: is in every command
name or names of things they want to use, split by ,
How can I test this so that these commands always return true and everything else returns false?
I am really bad at regex, but I tried something like this:
/\category|include=\w/.test(str);
to simply test, at least, the most easy alternative which would be category=name1 but without success.
Can someone help me with this?
You were on the right path. Here's a fixed regex:
/^(category|include)=\w+(,\w+)*$/.test(str);
Note:
the parens around the alternative parts
the + after the \w so that you can have several characters
the optional (,\w+)*
the start and end of string marks (^ and $) in order to check the whole string
You can use this regex for your requirement:
/^(category|include)=(\w+(?:,\w+)*)$/
RegEx Demo
(\w+(?:,\w+)*) in the value part after = allows one or more comma-separated words.
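Because the second pattern captures both the command name and the value list, it can validate and parse in one step. The following is a minimal sketch; parseCommand and the shape of its return value are assumptions made for illustration.

```javascript
// Hypothetical helper: returns { command, names } for a valid command,
// or null for anything that does not match the expected format.
function parseCommand(str) {
  const m = /^(category|include)=(\w+(?:,\w+)*)$/.exec(str);
  if (!m) return null;
  return { command: m[1], names: m[2].split(',') };
}

console.log(parseCommand('include=name1,name2,name3'));
console.log(parseCommand('category=name1'));
console.log(parseCommand('category='));     // no names: rejected
console.log(parseCommand('exclude=name1')); // unknown command: rejected
```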

Why does this regexp return a match?

http://jsfiddle.net/sqee98xr/
var reg = /^(?!managed).+\.coffee$/
var match = '20150212214712-test-managed.coffee'.match(reg)
console.log(match) // prints '20150212214712-test-managed.coffee'
I want the regexp to match only if the word "managed" is not present in the string. How can I do that?
Negative lookaheads are weird. You have to match more than just the word you are looking for. It's weird, I know.
var reg = /^(?!.*managed).+\.coffee$/
http://jsfiddle.net/sqee98xr/3/
EDIT: It seems I really got under some people's skin with the "weird" descriptor and lay description. It's weird because, on a surface level, the term "negative lookahead" implies "look ahead and make sure the stuff in these parentheses isn't up there, then come back and continue matching". As a lover of regex, I still proclaim this naming is weird, especially to first-time users of the assertion. To me it's easier to think of it as a "not" operator as opposed to something which actually crawls forward and "looks ahead". In order to get behavior that resembles an actual "look ahead", you have to match everything before the search term, hence the .*.
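Note that simply removing the start-of-string (^) assertion does not help: without the anchor, the match can begin at any position where "managed" does not immediately follow, so the pattern below still matches the original string. Again, to me it's easier to read ?! as "not".
var reg = /(?!managed).+\.coffee$/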
While @RyanWheale's solution is correct, the explanation isn't. The reason, essentially, is that a string that contains the word "managed" (such as "test-managed") can still count as not "managed". To get an idea of this, first let's look at the regular expression:
/^(?!managed).+\.coffee$/
// (Not "managed")(one or more characters)(".")("coffee")
So first the string cannot begin with the text "managed", then we can have one or more characters, then a dot, followed by the text "coffee". Here is an example that fulfills this.
"Hello.coffee" [ PASS ]
Makes sense, "Hello" certainly is not "managed". Here is another example that works from your string:
"20150212214712-test-managed.coffee" [ PASS ]
Why? Because "20150212214712-test-managed" is not the string "managed". Even though it contains that string, the computer does not know that's what you mean. It treats "20150212214712-test-managed" as a string that isn't "managed", in the same way "andflaksfj" isn't "managed". So the only way it fails is if "managed" is at the start of the string:
"managed.coffee" [ FAIL ]
This isn't just because the text "managed" is there. Suppose the computer treated "managed." as not being "managed". It would indeed pass the (?!managed) part, but the rest of the string would just be "coffee" and the match would fail because there is no "." left.
Finally the solution to this is as suggested by the other answer:
/^(?!.*managed).+\.coffee$/
Now the string "20150212214712-test-managed.coffee" fails because, no matter how it's looked at ("test-managed", "-managed", "st-managed", etc.), the text is still caught by (?!.*managed) and rejected. As in the example above, it could try taking a substring from ".coffee", but as explained this would cause the string to fail in the rest of the regexp ( .+\.coffee$ ).
Hopefully this long explanation shows that negative lookaheads are not weird; they just take your request very literally.
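The difference between the two patterns can be checked side by side. This is a small sketch; the variable names startsOnly and anywhere are made up for illustration.

```javascript
// Rejects only strings that BEGIN with "managed".
const startsOnly = /^(?!managed).+\.coffee$/;
// Rejects strings that contain "managed" anywhere.
const anywhere = /^(?!.*managed).+\.coffee$/;

const names = [
  'Hello.coffee',
  'managed.coffee',
  '20150212214712-test-managed.coffee'
];

for (const name of names) {
  console.log(name, startsOnly.test(name), anywhere.test(name));
}
```

Only the last file name distinguishes the two: the first pattern accepts it, the second rejects it.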

Regex removing extension and numbers

I'm close to becoming a level 3 Regex Sorcerer (where I can find hidden traps and have a pet owl or bat), but I still need some help getting there...
The following works for the first two cases but fails for the third. I tried making the digits greedy but the whole thing fell over and I don't know where I'm going wrong.
Can you please help?
alert(removeNumberAndExtension("file 01.txt")) // works
alert(removeNumberAndExtension("file_01.txt")) // works
alert(removeNumberAndExtension("file.txt")) // fails
function removeNumberAndExtension(fname)
{
    var rexp = new RegExp(/\s*\d+\.[a-zA-Z]+/g)
    return fname.replace(rexp, "")
}
It's because of \d+: "one or more digits".
You need \d*: "zero or more digits".
File extensions can also contain digits (e.g. ".mp3"), so use [a-zA-Z0-9].
You should add the "end of the string" anchor ($), which makes the global flag (g) useless.
All these together: /\s*\d*\.[a-zA-Z0-9]+$/ :)
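Putting those changes into the function from the question gives something like the sketch below. The function name is the asker's; the extra sample file name "song 02.mp3" is an assumption added to exercise the digit-in-extension case.

```javascript
// Strips optional whitespace, optional digits, and the extension
// from the end of a file name.
function removeNumberAndExtension(fname) {
  return fname.replace(/\s*\d*\.[a-zA-Z0-9]+$/, '');
}

console.log(removeNumberAndExtension('file 01.txt'));
console.log(removeNumberAndExtension('file_01.txt'));
console.log(removeNumberAndExtension('file.txt'));
console.log(removeNumberAndExtension('song 02.mp3'));
```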

Breaking a String into Chunks based on Pattern

I have one string, that looks like this:
a[abcdefghi,2,3,jklmnopqr]
The beginning "a" is fixed and non-changing; however, the content within the brackets varies but follows a pattern. It will always be an alphabetical string, possibly followed by numbers separated by commas, or by more strings and/or numbers.
I'd like to be able to break it into chunks of the string and any numbers that follow it until the "]" or another string is met.
Probably best explained through examples and expected ideal results:
a[abcdefghi] -> "abcdefghi"
a[abcdefghi,2] -> "abcdefghi,2"
a[abcdefghi,2,3,jklmnopqr] -> "abcdefghi,2,3" and "jklmnopqr"
a[abcdefghi,2,3,jklmnopqr,stuvwxyz] -> "abcdefghi,2,3" and "jklmnopqr" and "stuvwxyz"
a[abcdefghi,2,3,jklmnopqr,1,9,stuvwxyz] -> "abcdefghi,2,3" and "jklmnopqr,1,9" and "stuvwxyz"
a[abcdefghi,1,jklmnopqr,2,stuvwxyz,3,4] -> "abcdefghi,1" and "jklmnopqr,2" and "stuvwxyz,3,4"
Ideally a malformed string would be partially caught (but this is a nice extra):
a[2,3,jklmnopqr,1,9,stuvwxyz] -> "jklmnopqr,1,9" and "stuvwxyz"
I'm using JavaScript and I realize a regex won't bring me all the way to the solution I'd like, but it could be a big help. The alternative is to do a lot of manual string parsing, which I can do, but it doesn't seem like the best answer.
Advice, tips appreciated.
UPDATE: Yes, I did mean alphabetical (A-Za-z) instead of alphanumeric. Edited to reflect that. Thanks for letting me know.
You'd probably want to do this in 2 steps. First, match against:
a\[([^[\]]*)\]
and extract group 1. That'll be the stuff in the square brackets.
Next, repeatedly match against:
[a-z]+(,[0-9]+)*
That'll match things like "abcdefghi,2,3". After the first match you'll need to see if the next character is a comma and if so skip over it. (BTW: if you really meant alphanumeric rather than alphabetic like your examples, use [a-z0-9]*[a-z][a-z0-9]* instead of [a-z]+.)
Alternatively, split the string on commas and reassemble into your word with number groups.
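The two-step approach might look like the following sketch. The function name splitChunks is hypothetical, and two details are assumptions rather than part of the answer: the i flag (added because the question mentions A-Za-z) and the use of String.prototype.match with the g flag, which finds successive matches and so skips the separating commas without an explicit check.

```javascript
// Hypothetical helper implementing the two steps described above.
function splitChunks(input) {
  // Step 1: extract the text between the square brackets.
  const outer = /a\[([^[\]]*)\]/.exec(input);
  if (!outer) return [];
  // Step 2: collect each word together with the numbers that follow it.
  return outer[1].match(/[a-z]+(?:,[0-9]+)*/gi) || [];
}

console.log(splitChunks('a[abcdefghi]'));
console.log(splitChunks('a[abcdefghi,2,3,jklmnopqr,1,9,stuvwxyz]'));
console.log(splitChunks('a[2,3,jklmnopqr,1,9,stuvwxyz]')); // malformed input
```

Leading numbers in the malformed case are simply skipped, because every match has to start with a letter.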
Why wouldn't a regex bring you all the way to a solution?
The following regex works against the given data, but it makes a few assumptions (at least two alphas followed by comma separated single digits).
([a-z]{2,}(?:,\d)*)
Example:
re = new RegExp('[a-z]{2,}(?:,\\d)*', 'g')
matches = "a[abcdefghi,2,3,jklmnopqr,1,9,stuvwxyz]".match(re)
Assuming you can easily break out the string between the brackets, something like this might be what you're after:
> re = new RegExp('[a-z]+(?:,\\d)*(?:,?)', 'gi')
> while (match = re.exec("abcdefghi,2,3,jklmnopqr,1,9,stuvwxyz")) { print(match[0]) }
abcdefghi,2,3,
jklmnopqr,1,9,
stuvwxyz
This has the advantage of working partially in your malformed case:
> while (match = re.exec("2,3,jklmnopqr,1,9,stuvwxyz")) { print(match[0]) }
jklmnopqr,1,9,
stuvwxyz
The first character class [a-z] can be modified if you meant for it to be truly alphanumeric.

How to search csv string and return a match by using a Javascript regex

I'm trying to extract the first user-right from semicolon separated string which matches a pattern.
Users rights are stored in format:
LAA;LA_1;LA_2;LE_3;
String is empty if user does not have any rights.
My best solution so far is to use the following regex in regex.replace statement:
.*?;(LA_[^;]*)?.*
(The question mark at the end of group is for the purpose of matching the whole line in case user has not the right and replace it with empty string to signal that she doesn't have it.)
However, it doesn't work correctly in case the searched right is in the first position:
LA_1;LA_2;LE_3;
It is easy to fix it by just adding a semicolon at the beginning of line before regex replace but my question is, why doesn't the following regex match it?
.*?(?:(?:^|;)(LA_[^;]*))?.*
I have tried numerous other regular expressions to find the solution but so far without success.
I am not sure I get your question right, but in regards to the regular expressions you are using, you are overcomplicating them for no clear reason (at least not to me). You might want something like:
function getFirstRight(rights) {
    var m = rights.match(/(^|;)(LA_[^;]*)/);
    return m ? m[2] : "";
}
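As a quick check, the function above can be run against the cases from the question: the right in a later position, the right in the first position, no matching right, and an empty string. The third input, "LAA;LE_3;", is an assumption added to cover the no-match case.

```javascript
// Same function as above, repeated so the sketch is self-contained.
function getFirstRight(rights) {
  const m = rights.match(/(^|;)(LA_[^;]*)/);
  return m ? m[2] : '';
}

console.log(JSON.stringify(getFirstRight('LAA;LA_1;LA_2;LE_3;'))); // right in a later position
console.log(JSON.stringify(getFirstRight('LA_1;LA_2;LE_3;')));     // right in the first position
console.log(JSON.stringify(getFirstRight('LAA;LE_3;')));           // no matching right
console.log(JSON.stringify(getFirstRight('')));                    // user has no rights
```

The (^|;) alternation is what handles the first-position case: the right must either start the string or follow a semicolon.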
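You could also split the string first and pick the first entry that starts with LA_:
function getFirstRight(rights)
{
    var found = rights.split(";").find(function (r) { return /^LA_/.test(r); });
    return found || "";
}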
To answer the specific question "why doesn't the following regex match it?", one problem is the mix of this at the beginning:
.*?
eventually followed by:
^|;
Which might be like saying, skip over any extra characters until you reach either the start or a semicolon. But you can't skip over anything and then later arrive at the start (unless it involves newlines in a multiline string).
Something like this works:
.*?(\bLA_[^;]*).*
Meaning, skip over characters until a word boundary followed by "LA_".
