Regex for matching certain url - javascript

It should match these URLs:
https://example.com/id/username/
http://example.com/id/username/
https://www.example.com/id/username
http://example.com/id/username/
Basically, it should start with http or https, then optionally www, then example.com, then /id, and finally the username. The trailing / is not always present.
The username could be anything.
I got this so far:
if (input.match(/http\:\/\/example\.com/i)) {
console.log('-');
}
Also, how can I check with a regex whether the URL ends with a 7-digit number, like 1234567/ or 3523173? The trailing / is not always present.

Use the following regular expression:
https?:\/\/(www\.)?example\.com\/id\/[a-zA-Z0-9]+
You can change [a-zA-Z0-9] to suit your username format if required. For example:
[a-zA-Z0-9]+ ==> Username contains uppercase letters, lowercase letters, and digits. (john008)
[a-zA-Z]+ ===> Username contains uppercase and lowercase letters. (john)
[0-9]+ ===> Username contains only digits. (123456)
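As a rough check, a pattern of this shape can be anchored and run against the URLs from the question. This is only a sketch: the names profileUrl and sevenDigitUrl are made up for illustration, and the second pattern assumes the 7-digit number sits in the same position as the username.

```javascript
// Hypothetical names; ^ and $ force the whole input to match.
const profileUrl = /^https?:\/\/(?:www\.)?example\.com\/id\/[a-zA-Z0-9]+\/?$/i;
// Same shape, but the last segment must be exactly 7 digits.
const sevenDigitUrl = /^https?:\/\/(?:www\.)?example\.com\/id\/\d{7}\/?$/i;

const samples = [
  'https://example.com/id/username/',
  'http://example.com/id/username/',
  'https://www.example.com/id/username',
  'https://example.com/id/1234567/',
  'https://example.com/id/3523173',
];

// Print each URL with the result of both checks.
for (const url of samples) {
  console.log(url, profileUrl.test(url), sevenDigitUrl.test(url));
}
```

Neither regex uses the g flag, so calling test() repeatedly does not carry state between calls.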

https?\:\/\/(www\.)?example\.com\/id\/([a-zA-Z]+)\/?

Without further specification you could use
\bhttps?:.+?example\.com\/[a-zA-Z]+\/\w+\/?(?=\s|$)
See a demo on regex101.com.
This is
\b # a word boundary
https?: # http/https:
.+? # anything else afterwards, lazily
example\.com # what it says
\/[a-zA-Z]+\/\w+\/? # /id/username with / optional
(?=\s|$) # followed by a whitespace or the end of the string
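The lookahead at the end is mainly useful when the URLs are embedded in longer text. A minimal sketch, assuming a slightly stricter variant of the pattern (explicit // and a literal /id/ segment) and an invented sample sentence:

```javascript
// Hypothetical variant: g flag so match() returns every hit in the text.
// JavaScript has no \Z, so $ stands in for "end of the string".
const urlInText = /\bhttps?:\/\/(?:www\.)?example\.com\/id\/\w+\/?(?=\s|$)/g;

// Made-up input for illustration.
const text = 'see https://example.com/id/alice/ and http://www.example.com/id/bob42';

// match() returns null when nothing is found, so fall back to an empty array.
const found = text.match(urlInText) || [];
console.log(found);
```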

Related

The file name must only contain ASCII 32-126 characters except some special character

I have a file and I require some Regex validation on that.
The validation is that the file name must only contain ASCII 32-126 characters,
except: 34 ["], 39 ['], 59 [;], 60 [<], 61 [=], 62 [>], 92 [\]
Additionally, the file name cannot include the following sequence of characters: %00
let filename = "filename"
let regex = ""
console.log(filename);
could someone take a look and let me know the solution?
Thanks
You can use the following Regex expression:
^(?:(?!["';<=>\\])[\x20-\x7E])+$
Regex Demo
Explanation:
^ # start of line
(?: # non-capturing group
(?!["';<=>\\]) # negative lookahead - do not match if the next character is one of the given symbols
[\x20-\x7E] # match in range from ASCII 32-126
) # close non-capturing group
+ # match 1-unlimited times
$ # end of line
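The expression above covers the character rules but not the %00 sequence from the question. One possible way to cover both is to add a second negative lookahead at the start. This is a sketch only; the function name isValidFilename is invented here.

```javascript
// (?!.*%00) rejects any name containing the sequence %00;
// the rest is the character check from the answer above.
const validFilename = /^(?!.*%00)(?:(?!["';<=>\\])[\x20-\x7E])+$/;

// Hypothetical helper wrapping the test.
function isValidFilename(name) {
  return validFilename.test(name);
}

console.log(isValidFilename('report 2024.txt'));
console.log(isValidFilename('bad;name.txt'));
console.log(isValidFilename('a%00b.txt'));
```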

exclude full word with javascript regex word boundary

I'm looking to exclude matches that contain a specific word or phrase. For example, how could I match only the lines marked "match" below? The \b word boundary does not work as intuitively as I expected.
foo.js # match
foo_test.js # do not match
foo.ts # match
fun_tset.js # match
fun_tset_test.ts # do not match
UPDATE
What I want to exclude is strings ending explicitly with _test before the extension. At first I had something like [^_test], but that also excludes any combination of those characters (like line 3).
Regex: ^(?!.*_test\.).*$
Working examples: https://regex101.com/r/HdGom7/1
Why it works: uses negative lookahead to check if _test. exists somewhere in the string, and if so doesn't match it.
Adding to @pretzelhammer's answer, it looks like you want to grab strings that are file names ending in ts or js:
^(?!.*_test)(.*\.[jt]s)
The expression in the first parentheses is a negative lookahead that excludes any strings with _test, the second parentheses matches any strings that end in a period, followed by [jt] (j or t), followed by s.
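To see the lookahead in action, the two answers can be combined and applied to the file names from the question. A small sketch, where the name notTestFile and the added $ anchor are choices made here, not part of either answer:

```javascript
// Reject names containing "_test." and require a .js or .ts extension.
const notTestFile = /^(?!.*_test\.).*\.[jt]s$/;

const files = [
  'foo.js',
  'foo_test.js',
  'foo.ts',
  'fun_tset.js',
  'fun_tset_test.ts',
];

// Keep only the names that pass the check.
const kept = files.filter((name) => notTestFile.test(name));
console.log(kept);
```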

Regex matching a string pattern and number ( url format )

I have a string that follows this url pattern as
https://www.examples.org/activity-group/8712/activity/202803
// note: the ending of the URL can differ
https://www.examples.org/activity-group/8712/activity/202803?ref=bla
https://www.examples.org/activity-group/8712/activity/202803/something
I'm trying to write a regex that matches
https://www.examples.org/activity-group/{number}/activity/{number}*
Where {number} is an integer of length 1 to 10.
How do I define a regex that checks the string pattern and checks that the numbers are at the right positions in the string?
Background: in a Google Form, in order to validate an answer, I want to force people to enter a URL in this format. Hence the use of this regular expression.
For Urls not matching that format, the regex should return false. For example : https://www.notthesite.org/group/8712/activity/astring
I went through several examples, but they match only if the number is present in the string.
Examples sources :
How to find a number in a string using JavaScript?
Get the first Int(s) in a string with javascript
^https:\/\/www\.examples\.org\/activity-group\/[0-9]{1,10}\/activity\/[0-9]{1,10}(\/[a-z]+)*((\?[a-z]+=[a-zA-Z0-9]+)(\&[a-z]+=[a-zA-Z0-9]+)*)*$
^ - start of string
\ - escape character
[0-9] - a digit
{1,10} - between one and ten of the previous items
(\/[a-z]+)* - Allow additional URL segments
((\?[a-z]+=[a-zA-Z0-9]+)(\&[a-z]+=[a-zA-Z0-9]+)*)* - Allow query parameters with first parameter using a ? and all others using &
$ - end of string
This is assuming the URL segment and query parameter keys are lowercase letters only. The query parameter values can be lowercase letters, uppercase letters, or digits.
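For a quick sanity check outside the form, the same expression can be tried in JavaScript against the URLs from the question. This is a sketch; the helper name isActivityUrl is made up.

```javascript
// The pattern from the answer above, written as a regex literal.
const activityUrl = /^https:\/\/www\.examples\.org\/activity-group\/[0-9]{1,10}\/activity\/[0-9]{1,10}(\/[a-z]+)*((\?[a-z]+=[a-zA-Z0-9]+)(&[a-z]+=[a-zA-Z0-9]+)*)*$/;

// Hypothetical helper: true when the whole string has the expected shape.
function isActivityUrl(url) {
  return activityUrl.test(url);
}

const base = 'https://www.examples.org/activity-group/8712/activity/202803';
console.log(isActivityUrl(base));
console.log(isActivityUrl(base + '?ref=bla'));
console.log(isActivityUrl(base + '/something'));
console.log(isActivityUrl('https://www.notthesite.org/group/8712/activity/astring'));
```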
You could use
https?:\/\/(?:[^/]+\/){2}(\d+)\/[^/]+\/(\d+)
See a demo on regex101.com.
Broken down, this says:
https?:\/\/ # http:// or https://
(?:[^/]+\/){2} # not "/", followed by "/", twice
(\d+) # 1+ digits
\/[^/]+\/ # same pattern as above
(\d+) # the other number
You'll need to use group 1 and 2, respectively.
If this is too permissive, use
https:\/\/[^/]+\/activity-group\/(\d+)\/activity\/(\d+)
Which reads
https:\/\/[^/]+ # https:// + some domain name
\/activity-group\/ # /activity-group/
(\d+) # first number
\/activity\/ # /activity/
(\d+) # second number
See another demo on regex101.com.
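Reading the two numbers back out of the capture groups could look roughly like this. The function name extractIds and the returned property names are assumptions for illustration.

```javascript
// Stricter pattern from above; groups 1 and 2 hold the two numbers.
const activityPattern = /https:\/\/[^\/]+\/activity-group\/(\d+)\/activity\/(\d+)/;

// Hypothetical helper: returns both ids as strings, or null if the URL does not fit.
function extractIds(url) {
  const m = url.match(activityPattern);
  if (m === null) {
    return null;
  }
  return { groupId: m[1], activityId: m[2] };
}

console.log(extractIds('https://www.examples.org/activity-group/8712/activity/202803?ref=bla'));
console.log(extractIds('https://www.notthesite.org/group/8712/activity/astring'));
```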
You probably need something like:
(https?:\/\/)?www\.examples\.org\/activity-group\/(\d{1,10})\/activity\/(\d{1,10})(\S*)$
Where:
(https?:\/\/)? matches an optional http:// or https:// part.
www\.examples\.org is your domain name, with the dots escaped so that they match literally.
(\d{1,10}) will match the first integer with a max length of 10 (after activity-group).
Second (\d{1,10}) will match the second integer after activity.
And finally (\S*)$ will match any optional data after the second number until the end of the line, assuming that you use the multiline flag m.
Check it at http://regexr.com/3h448
Hope it helps!

JavaScript Regex does not match exact string

In the example below the output is true. It matches cookie, but it also matches cookie14214; I'm guessing that's because cookie is contained in the string cookie14214. How do I narrow this match down to only cookie?
var patt1=new RegExp(/(biscuit|cookie)/i);
document.write(patt1.test("cookie14214"));
Is this the best solution?
var patt1=new RegExp(/(^biscuit$|^cookie$)/i);
The answer depends on your allowance of characters surrounding the word cookie. If the word is to appear strictly on a line by itself, then:
var patt1=new RegExp(/^(biscuit|cookie)$/i);
If you want to allow symbols (spaces, ., ,, etc), but not alphanumeric values, try something like:
var patt1=new RegExp(/(?:^|[^\w])(biscuit|cookie)(?:[^\w]|$)/i);
Second regex, explained:
(?: # non-capturing group
^ # beginning-of-string
| [^\w] # OR, a non-word character (anything other than a letter, digit, or underscore)
)
(biscuit|cookie) # match desired text/words
(?: # non-capturing group
[^\w] # a non-word character
| $ # OR, end-of-string
)
Yes, or use word boundaries. Note that this will match great cookie but not greatcookie.
var patt1=new RegExp(/(\bbiscuit\b|\bcookie\b)/i);
If you want to match the exact string cookie, then you don't even need regular expressions, just use ==, since /^cookie$/i.test(s) is basically the same as s.toLowerCase() == "cookie".
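The difference between the anchored and the word-boundary versions shows up on a few sample strings. A short sketch, with variable names chosen here for illustration:

```javascript
// Whole input must be one of the words.
const exact = /^(biscuit|cookie)$/i;
// Word may appear anywhere, as long as it is not glued to other word characters.
const wholeWord = /\b(biscuit|cookie)\b/i;

const inputs = ['cookie', 'Cookie', 'cookie14214', 'great cookie', 'greatcookie'];

// Print each input with the result of both checks.
for (const s of inputs) {
  console.log(JSON.stringify(s), exact.test(s), wholeWord.test(s));
}
```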

Javascript multiple regex pattern

I'm trying to exclude some internal IP addresses and some internal IP address formats from viewing certain logos and links on the site. I have multiple ranges of IP addresses (sample given below). Is it possible to write a regex that could match all the IP addresses in the list below using JavaScript?
10.X.X.X
12.122.X.X
12.211.X.X
64.X.X.X
64.23.X.X
74.23.211.92
and 10 more
Quote the periods, replace the X's with \d+, and join them all together with pipes:
const allowedIPpatterns = [
"10.X.X.X",
"12.122.X.X",
"12.211.X.X",
"64.X.X.X",
"64.23.X.X",
"74.23.211.92" //, etc.
];
const allowedRegexStr = '^(?:' +
allowedIPpatterns.
join('|').
replace(/\./g, '\\.').
replace(/X/g, '\\d+') +
')$';
const allowedRegexp = new RegExp(allowedRegexStr);
Then you're all set:
'10.1.2.3'.match(allowedRegexp) // => ['10.1.2.3']
'100.1.2.3'.match(allowedRegexp) // => null
How it works:
First, we have to turn the individual IP patterns into regular expressions matching their intent. One regular expression for "all IPs of the form '12.122.X.X'" is this:
^12\.122\.\d+\.\d+$
^ means the match has to start at the beginning of the string; otherwise, 112.122.X.X IPs would also match.
12 etc: digits match themselves
\.: a bare period in a regex matches any character (other than a line terminator); we want literal periods, so we put a backslash in front.
\d: shorthand for [0-9]; matches any digit.
+: means "1 or more" - 1 or more digits, in this case.
$: similarly to ^, this means the match has to end at the end of the string.
So, we turn the IP patterns into regexes like that. For an individual pattern you could use code like this:
const regexStr = `^` + ipXpattern.
replace(/\./g, '\\.').
replace(/X/g, '\\d+') +
`$`;
Which just replaces all .s with \. and Xs with \d+ and sticks the ^ and $ on the ends.
(Note the doubled backslashes; both string parsing and regex parsing use backslashes, so wherever we want a literal one to make it past the string parser to the regular expression parser, we have to double it.)
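The effect of the doubled backslash is easy to check directly. In this small sketch (variable names are illustrative), the single-backslash string loses its backslash before the regex parser ever sees it:

```javascript
// '\\d+' is the three-character string \d+ ; the regex means "one or more digits".
const doubled = new RegExp('\\d+');
// '\d+' is the two-character string d+ ; the regex means "one or more letter d".
const single = new RegExp('\d+');

console.log(doubled.source, doubled.test('123'));
console.log(single.source, single.test('123'));
```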
In a regular expression, the alternation this|that matches anything that matches either this or that. So we can check for a match against all the IPs at once if we turn the list into a single regex of the form re1|re2|re3|...|relast.
Then we can do some refactoring to make the regex matcher's job easier; in this case, since all the regexes are going to have ^...$, we can move those constraints out of the individual regexes and put them on the whole thing: ^(10\.\d+\.\d+\.\d+|12\.122\.\d+\.\d+|...)$. The parentheses keep the ^ from being only part of the first pattern and $ from being only part of the last. But since plain parentheses capture as well as group, and we don't need to capture anything, I replaced them with the non-capturing version (?:...).
And in this case we can do the global search-and-replace once on the giant string instead of individually on each pattern. So the result is the code above:
const allowedRegexStr = '^(?:' +
allowedIPpatterns.
join('|').
replace(/\./g, '\\.').
replace(/X/g, '\\d+') +
')$';
That's still just a string; we have to turn it into an actual RegExp object to do the matching:
const allowedRegexp = new RegExp(allowedRegexStr);
As written, this doesn't filter out illegal IPs - for instance, 10.1234.5678.9012 would match the first pattern. If you want to limit the individual byte values to the decimal range 0-255, you can use a more complicated regex than \d+, like this:
(?:\d{1,2}|1\d{2}|2[0-4]\d|25[0-5])
That matches "any one or two digits, or '1' followed by any two digits, or '2' followed by any of '0' through '4' followed by any digit, or '25' followed by any of '0' through '5'". Replacing the \d+ with that turns the full string-munging expression into this:
const allowedRegexStr = '^(?:' +
allowedIPpatterns.
join('|').
replace(/\./g, '\\.').
replace(/X/g, '(?:\\d{1,2}|1\\d{2}|2[0-4]\\d|25[0-5])') +
')$';
And makes the actual regex look much more unwieldy:
^(?:10\.(?:\d{1,2}|1\d{2}|2[0-4]\d|25[0-5])\.(?:\d{1,2}|1\d{2}|2[0-4]\d|25[0-5])\.(?:\d{1,2}|1\d{2}|2[0-4]\d|25[0-5])|12\.122\....
but you don't have to look at it, just match against it. :)
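Put together, the approach might be wrapped in a small helper like the one below. This is a sketch under assumptions: the names OCTET and buildIpRegex are invented, and the octet pattern is a variant that also rejects leading zeros such as 01.

```javascript
// One octet in the range 0-255, without leading zeros (a variant of the pattern above).
const OCTET = '(?:25[0-5]|2[0-4]\\d|1\\d{2}|[1-9]?\\d)';

// Hypothetical helper: turns patterns like "12.122.X.X" into one anchored RegExp.
function buildIpRegex(patterns) {
  const body = patterns
    .map((p) => p.replace(/\./g, '\\.').replace(/X/g, OCTET))
    .join('|');
  return new RegExp('^(?:' + body + ')$');
}

const allowed = buildIpRegex([
  '10.X.X.X',
  '12.122.X.X',
  '12.211.X.X',
  '64.X.X.X',
  '74.23.211.92',
]);

console.log(allowed.test('10.1.2.3'));
console.log(allowed.test('100.1.2.3'));
console.log(allowed.test('10.1234.5678.9012'));
```

Escaping and substitution are done per pattern before joining, which keeps each step easy to inspect; doing it once on the joined string, as in the answer, gives the same result.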
You could do it in regex, but it's not going to be pretty, especially since JavaScript doesn't even support verbose regexes, which means that it has to be one humongous line of regex without any comments. Furthermore, regexes are ill-suited for matching ranges of numbers. I suspect that there are better tools for dealing with this.
Well, OK, here goes (for the samples you provided):
var myregexp = /\b(?:74\.23\.211\.92|(?:12\.(?:122|211)|64\.23)\.(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])|(?:10|64)\.(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9]))\b/g;
As a verbose ("readable") regex:
\b # start of number
(?: # Either match...
74\.23\.211\.92 # an explicit address
| # or
(?: # an address that starts with
12\.(?:122|211) # 12.122 or 12.211
| # or
64\.23 # 64.23
)
\. # .
(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\. # followed by 0..255 and a dot
(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9]) # followed by 0..255
| # or
(?:10|64) # match 10 or 64
\. # .
(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\. # followed by 0..255 and a dot
(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\. # followed by 0..255 and a dot
(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9]) # followed by 0..255
)
\b # end of number
/^(X|\d{1,3})(\.(X|\d{1,3})){3}$/ should do it.
If you don't actually need to match the "X" character you could use this:
\b(?:\d{1,3}\.){3}\d{1,3}\b
Otherwise I would use the solution cebarrett provided.
I'm not entirely sure what you're trying to achieve here (it doesn't look like anyone else is either).
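However, if it's validation, then here's a solution to validate an IP address that doesn't use RegEx. First, split the input string at the dots and check that there are exactly four segments. Then, using parseInt on each segment, make sure it is a whole number no higher than 255.
function ipValidator(ipAddress) {
  var ipSegments = ipAddress.split('.');
  // An IPv4 address has exactly four dot-separated segments.
  if (ipSegments.length !== 4) {
    return 'fail';
  }
  for (var i = 0; i < ipSegments.length; i++) {
    var value = parseInt(ipSegments[i], 10);
    // Reject non-numeric segments (NaN, stray characters, leading zeros) and values outside 0-255.
    if (String(value) !== ipSegments[i] || value < 0 || value > 255) {
      return 'fail';
    }
  }
  return 'match';
}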
Running the following returns 'match':
document.write(ipValidator('10.255.255.125'));
Whereas this will return 'fail':
document.write(ipValidator('10.255.256.125'));
Here's an annotated version in a jsfiddle with some examples: http://jsfiddle.net/VGp2p/2/
