Node.JS - How would I do this regex?


Well I have this:
var regex = /convertID\s(\d+)/
var match = regex.exec(message);
if(match != null)
{
//do stuff here
}
That works fine and it recognizes if someone writes "convertID NumbersHere".
However I want to have another one under it as well checking if there's a specific link, for example:
var regex = /convertID\shttp://anysitehere dot com/id/[A-Z]
var match = regex.exec(message);
if(match != null)
{
//do stuff here
}
So how would I make it check for a specific site with any letters after /id/?

You can use this:
var regex = /convertID\shttp:\/\/thesite\.com\/id\/[A-Za-z]+/;
The slashes must be escaped because the slash delimits the pattern, and the dot is escaped so that it matches only a literal dot. You can avoid escaping the slashes by explicitly creating an instance of the RegExp class:
var regex = new RegExp("convertID\\shttp://thesite\\.com/id/[A-Za-z]+");
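As a rough sketch of how the two forms behave, the snippet below adds a capture group so the letters after /id/ can be read back. The function name extractId and the sample messages are made up for illustration, and thesite.com is only the placeholder domain used in the answer.

```javascript
// Literal form: forward slashes and the dot are escaped.
const literal = /convertID\shttp:\/\/thesite\.com\/id\/([A-Za-z]+)/;

// Constructor form: slashes need no escaping, but every backslash is doubled
// because the pattern is written inside a string literal.
const built = new RegExp("convertID\\shttp://thesite\\.com/id/([A-Za-z]+)");

// Hypothetical helper: returns the letters after /id/, or null if no match.
function extractId(message) {
  const match = literal.exec(message);
  return match === null ? null : match[1];
}

console.log(extractId("convertID http://thesite.com/id/abcDEF")); // "abcDEF"
console.log(extractId("convertID 12345")); // null
console.log(built.test("convertID http://thesite.com/id/abcDEF")); // true
```

Returning null instead of throwing keeps the calling code in the same shape as the original `if (match != null)` check.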

Related

Javascript match URL with wildcards - Chrome Extension

I'm writing a chrome extension which allows the user to modify content on specific websites. I'd like the user to be able to specify these websites using wildcards, for example http://*.google.com or http://google.com/*
I found the following code
currentUrl = "http://google.com/";
matchUrl = "http://*.google.com/*";
match = RegExp(matchUrl.replace(/\*/g, "[^]*")).test(currentUrl);
But there are a few problems with it.
http://test.google.com/ is a match
http://google.com/ is not a match
http://test.google.com is not a match
http://.google.com/ is a match
Clarification:
http://google.com isn't a match, and that is the real problem.
So how can I create a JavaScript code snippet that checks correctly whether there is a match?

I suggest parsing the URL into protocol, base part and the rest, and then re-build the validation regex replacing * inside the base part with (?:[^/]*\\.)* and otherwise with (?:/[^]*)?. Also, you must escape all other special chars with .replace(/[?()[\]\\.+^$|]/g, "\\$&"). You will also need anchors (^ for start of string and $ for the end of string position) to match the entire string. A case insensitive /i modifier is just a bonus to make the pattern case insensitive.
So, for this exact matchUrl, the regex will look like
/^http:\/\/(?:[^\/]*\.)*google\.com(?:\/[^]*)?$/
var rxUrlSplit = /((?:http|ftp)s?):\/\/([^\/]+)(\/.*)?/;
var strs = ['http://test.google.com/', 'http://google.com/','http://test.google.com', 'http://.google.com/','http://one.more.test.google.com'];
var matchUrl = "http://*.google.com/*";
var prepUrl = "";
var m; // declared explicitly to avoid creating an implicit global
if ((m = matchUrl.match(rxUrlSplit)) !== null) {
prepUrl = m[1]+"://"+m[2].replace(/[?()[\]\\.+^$|]/g, "\\$&").replace(/\*\\./g,'(?:[^/]*\\.)*').replace(/\*$/,'[^/]*');
if (m[3]) {
prepUrl+= m[3].replace(/[?()[\]\\.+^$|]/g, "\\$&").replace(/\/\*(?=$|\/)/g, '(?:/[^]*)?');
}
}
if (prepUrl) {
// console.log(prepUrl); // http://(?:[^/]*\.)*google\.com(?:/[^]*)?
var rx = RegExp("^" + prepUrl + "$", "i");
for (var s of strs) {
if (s.match(rx)) {
console.log(s + " matches!");
} else {
console.log(s + " does not match!");
}
}
}

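With this matchUrl
matchUrl = "http://*.google.com/*";
the RegExp source, written as a string for the RegExp constructor, is something like this:
"http://.*\\.google\\.com/.*"
So replace each * entered by the user with .* in the pattern, and escape the literal dots so that they do not match arbitrary characters.
You can check the resulting pattern in an online regex tester.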
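A minimal sketch of this simpler approach is shown here; the helper name naiveWildcardToRegExp is hypothetical. It also shows that a plain * to .* substitution still rejects the bare domain, which is the problem the question describes and the reason the other answer treats the subdomain wildcard separately.

```javascript
// Hypothetical helper: escape regex metacharacters, then turn each * into .*
function naiveWildcardToRegExp(pattern) {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$", "i");
}

const rx = naiveWildcardToRegExp("http://*.google.com/*");

console.log(rx.test("http://test.google.com/")); // true
console.log(rx.test("http://xgoogle.com/"));     // false, the dot is literal
console.log(rx.test("http://google.com/"));      // false, "*." still requires a dot
```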

Matching css selectors with RegExp doesn't work in browser

I try to match css selectors as can be seen here:
https://regex101.com/r/kI3rW9/1
. It matches the teststring as desired, however when loading a .js file to test it in the browser it fails both in firefox and chrome.
The .js file:
window.onload = function() {
main();
}
main = function() {
var regexSel = new RegExp('([\.|#][a-zA-Z][a-zA-Z0-9.:_-]*) ?','g');
var text = "#left_nav .buildings #rfgerf .rtrgrgwr .rtwett.ww-w .tw:ffwwwe";
console.log(regexSel.exec(text));
}
In the browser it returns:["#left_nav ", "#left_nav", index: 0, input: "#left_nav .buildings #rfgerf .rtrgrgwr .rtwett.ww-w .tw:ffwwwe"]
So it appears it only captures the first selector with and without the whitespace, despite the whitespace being outside the () and the global flag being set.
Edit:
So either looping over RegExp.exec(text) or just using String.match(str) will lead to the correct solution. Thanks to Wiktor's answer I was able to implement a convenient way of calling this functionality:
function Selector(str){
this.str = str;
}
with(Selector.prototype = new String()){
toString = valueOf = function () {
return this.str;
};
}
Selector.prototype.constructor = Selector;
Selector.prototype.parse = function() {
return this.match(/([\.|#][a-zA-Z][a-zA-Z0-9.:_-]*) ?/g);
}
//Using it the following way:
var text = new Selector("#left_nav .buildings #rfgerf .rtrgrgwr .rtwett.ww-w .tw:ffwwwe");
console.log(text.parse());
However, I decided to use
/([\.|#][a-zA-Z][a-zA-Z0-9.:_-]*) ?/g over the suggested
/([.#][a-zA-Z][a-zA-Z0-9.:_-]*)(?!\S)/g because it matches in 44 vs. 60 steps on regex101.com on my test string.

You ran exec once, so you got one match object. You'd need to run it inside a loop.
var regexSel = new RegExp('([\.|#][a-zA-Z][a-zA-Z0-9.:_-]*) ?','g');
var text = "#left_nav .buildings #rfgerf .rtrgrgwr .rtwett.ww-w .tw:ffwwwe";
while((m=regexSel.exec(text)) !== null) {
console.log(m[1]);
}
A regex with a (?!\S) negative lookahead at the end (it fails the match if a non-whitespace character immediately follows your main consuming pattern) will allow simpler code:
var text = "#left_nav .buildings #rfgerf .rtrgrgwr .rtwett.ww-w .tw:ffwwwe";
console.log(text.match(/[.#][a-zA-Z][a-zA-Z0-9.:_-]*(?!\S)/g));
Note that you should consider using regex literal notation when defining your static regexps. Only prefer constructor notation with RegExp when your patterns are dynamic, have some variables or too many / that you do not want to escape.
Look also at [.#]: the dot does not have to be escaped and | inside is treated as a literal pipe symbol (not alternation operator).
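The two points above can be checked with a short sketch. It reuses the test string from the question; the variable names selectorRx and selectors are only illustrative.

```javascript
const text = "#left_nav .buildings #rfgerf .rtrgrgwr .rtwett.ww-w .tw:ffwwwe";
const selectorRx = /[.#][a-zA-Z][a-zA-Z0-9.:_-]*(?!\S)/g;

// With the g flag, match() returns every full match at once,
// so no exec() loop is needed.
const selectors = text.match(selectorRx);
console.log(selectors.length); // 6
console.log(selectors[4]);     // ".rtwett.ww-w"

// Inside a character class, | is a literal pipe, so [.|#] also accepts "|".
console.log(/^[.|#]$/.test("|")); // true
console.log(/^[.#]$/.test("|"));  // false
```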

regex: any string between two slashes first of them is prefixed with a defined string

I'd like to get the talker name from MP3 file paths such as the following:
/assets/audio/James_Lee/001.mp3
/assets/audio/Marc_Smith/001.mp3
/aasets/audio/blahblah/001.mp3
In the previous example we note that each talker name is surrounded by two slashes where the first of them is prefixed with the word audio. I need a pattern that matches names like the example above using javascript.
I tried at http://regexpal.com/ :
audio/.*/
but it matches *audio/The_name/* whereas I need only *The_name*. The other thing I don't know is how to use such patterns with JavaScript's replace().

This will get you the name: (?<=\/assets\/audio\/).*(?=\/)
Note that it relies on a lookbehind, which JavaScript engines only support from ES2018 onward.
Here's the regex in use: http://regexr.com?34747
Considering Javascript, you could do this:
var string = "/assets/audio/James_Lee/001.mp3";
var name = string.replace(/^.*\/audio\/|\/[\d]+\..*$/g, '');

Try this:
var str = "/assets/audio/James_Lee/001.mp3\n/assets/audio/Marc_Smith/001.mp3";
var pattern = /audio\/(.+?)\//g;
var match;
var matches = [];
while ((match = pattern.exec(str)) !== null){
matches.push(match[1]);
}
console.log(matches);
// If you want a string with only the names, you can re-combine the matches
str = matches.join('\n');

How about this?
str.replace(/.*audio\/([^\/]*)\/.*/,"$1")
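Since the question also asks about replace(), here is a small sketch that covers both reading the name and swapping it. The function names talkerName and renameTalker and the replacement name Jane_Doe are invented for the example.

```javascript
// Read the path segment that follows /audio/; null when there is none.
function talkerName(path) {
  const match = /\/audio\/([^\/]+)\//.exec(path);
  return match === null ? null : match[1];
}

// Swap the talker name while keeping the surrounding path intact.
// $1 and $2 refer to the two capture groups around the name.
function renameTalker(path, newName) {
  return path.replace(/(\/audio\/)[^\/]+(\/)/, "$1" + newName + "$2");
}

console.log(talkerName("/assets/audio/James_Lee/001.mp3")); // "James_Lee"
console.log(talkerName("/assets/images/logo.png"));         // null
console.log(renameTalker("/assets/audio/Marc_Smith/001.mp3", "Jane_Doe"));
// "/assets/audio/Jane_Doe/001.mp3"
```

Using [^\/]+ instead of .* keeps the match inside a single path segment, so deeper paths do not leak into the name.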

how do I capture something after something else? like a referer=someString

I have ref=Apple
and my current regex is
var regex = /ref=(.+)/;
var ref = regex.exec(window.location.href);
alert(ref[0]);
but that includes the ref=
Now, I also want to stop capturing characters if a & follows the ref param, because ref may not always be the last param in the URL.

You'll want to split the url parameters, rather than using a regular expression.
Something like:
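var query = window.location.href.split('?')[1] || ''; // empty when there is no query string
var params = query.split('&');
// Use an index loop: for...in on an array also visits inherited enumerable properties
for (var i = 0; i < params.length; i++) {
var pair = params[i].split('=');
var key = pair[0];
var value = pair[1];
if (key == 'ref') {
alert('ref is ' + value);
}
}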

Use ref[1] instead.
This accesses what is captured by group 1 in your pattern.
Note that there's almost certainly a better way to do key/value parsing in Javascript than regex.
References
regular-expressions.info/Brackets for Capturing
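One such alternative, sketched here under the assumption that the standard URL and URLSearchParams classes are available (current browsers and Node.js provide them as globals). The function name getRef and the example.com URLs are placeholders.

```javascript
// Let the URL parser split and decode the query string.
function getRef(href) {
  const url = new URL(href);
  return url.searchParams.get("ref"); // null when the parameter is absent
}

console.log(getRef("http://example.com/page?ref=Apple&x=1")); // "Apple"
console.log(getRef("http://example.com/page?x=1"));           // null
```

In a browser the argument would typically be window.location.href.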

You are using the ref wrong, you should use ref[1] for the (.+), ref[0] is the whole match.
If & is at the end, modify the regexp to /ref=([^&]+)/, to exclude &s.
Also, make sure you URL-decode the match (decodeURIComponent in JavaScript; the older unescape is deprecated).
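Putting these three points together might look like the sketch below. The function name refFromQuery and the sample URLs are assumptions, and the leading [?&] is an addition of mine so that a parameter such as preref is not mistaken for ref.

```javascript
function refFromQuery(href) {
  // Group 1 stops at the next & (or at a # fragment), so ref may appear anywhere.
  const match = /[?&]ref=([^&#]*)/.exec(href);
  return match === null ? null : decodeURIComponent(match[1]);
}

console.log(refFromQuery("http://example.com/?a=1&ref=Green%20Apple&b=2")); // "Green Apple"
console.log(refFromQuery("http://example.com/?ref=Apple"));                 // "Apple"
console.log(refFromQuery("http://example.com/?preref=x"));                  // null
```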

Capture only word characters (letters, digits and _):
var regex = /ref=(\w+)/;
var ref = regex.exec(window.location.href);
alert(ref[1]);
Capture word characters and - (the _ is already covered by \w):
var regex = /ref=([\w-]+)/;
var ref = regex.exec(window.location.href);
alert(ref[1]);

Try this regex pattern: ref=(.*?)(?:&|$)
This pattern matches anything after ref= and stops before the next '&', or at the end of the string when ref is the last parameter.
To get the value of ref, use the following code:
var regex = /ref=(.*?)(?:&|$)/;
var ref = regex.exec(window.location.href);
alert(ref[1]);

Javascript substring() trickery

I have a URL that looks like http://mysite.com/#id/Blah-blah-blah, it's used for Ajax-ey bits. I want to use substring() or substr() to get the id part. ID could be any combination of any length of letters and numbers.
So far I have got:
var hash = window.location.hash;
alert(hash.substring(1)); // remove #
Which removes the front hash, but I'm not a JS coder and I'm struggling a bit. How can I remove all of it except the id part? I don't want anything after and including the final slash either (/Blah-blah-blah).
Thanks!
Jack

Now, this is a case where regular expressions will make sense. Using substring here won't work because of the variable lengths of the strings.
This code assumes that the id part won't contain any slashes.
var hash = "#asdfasdfid/Blah-blah-blah";
hash.match(/#(.+?)\//)[1]; // asdfasdfid
The . matches any character, and together with the + it matches one or more characters.
The ? makes the match non-greedy so that it stops at the first occurrence of a / in the string.
If the id part can contain additional slashes and the final slash is the separator this regex will do your bidding
var hash = "#asdf/a/sdfid/Blah-blah-blah";
hash.match(/#(.+?)\/[^\/]*$/)[1]; // asdf/a/sdfid
Just for fun here are versions not using regular expressions.
No slashes in id-part:
var hash = "#asdfasdfid/Blah-blah-blah",
idpart = hash.substring(1, hash.indexOf("/")); // asdfasdfid
With slashes in id-part (last slash is separator):
var hash = "#asdf/a/sdfid/Blah-blah-blah",
lastSlash = hash.lastIndexOf("/"), // Finding the last slash
idPart = hash.substring(1, lastSlash); // asdf/a/sdfid

var hash = window.location.hash;
var matches = hash.match(/#(.+?)\//);
if (matches !== null) { // match() returns null when nothing matches
alert(matches[1]);
}

Perhaps a regex:
window.location.hash.match(/[^#\/]+/)

Use indexOf to determine the position of the / after the id, and then use string.substr(start, length) to get the id value.
var hash = window.location.hash;
var posSlash = hash.indexOf("/", 1);
var id = hash.substr(1, posSlash - 1);
You need to include some validation code to check for the absence of /
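A possible shape for that validation is sketched below. The function name idFromHash is invented, and treating a hash without any slash as consisting entirely of the id is an assumption about the desired behaviour, not something the question specifies.

```javascript
function idFromHash(hash) {
  if (hash.charAt(0) !== "#") {
    return null; // no fragment at all
  }
  const posSlash = hash.indexOf("/", 1);
  // Assumption: without a slash, everything after # is the id.
  return posSlash === -1 ? hash.substring(1) : hash.substring(1, posSlash);
}

console.log(idFromHash("#id/Blah-blah-blah")); // "id"
console.log(idFromHash("#id"));                // "id"
console.log(idFromHash(""));                   // null
```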

This is not a good approach, but you can use it if you want:
var relUrl = "http://mysite.com/#id/Blah-blah-blah";
var urlParts = relUrl.split("/"); // array is 0-indexed, so
var idpart = urlParts[3]; // your id will be in the 4th element
var id = idpart.substring(1); // skip the # and read the rest

The most foolproof way to do it is probably the following:
function getId() {
var m = document.location.href.match(/\/#([^\/&]+)/);
return m && m[1];
}
This code does not assume anything about what comes after the id (if at all). The id it will catch is anything except for forward slashes and ampersands.
If you want it to catch only letters and numbers you can change it to the following:
function getId() {
var m = document.location.href.match(/\/#([a-z0-9]+)/i);
return m && m[1];
}
