Bookmarklet - Verify URL format and extract substring - javascript

I'm trying to build a bookmarklet that performs a service client side, but I'm not really fluent in JavaScript. In my code below I want to take the current page URL and first verify that it follows a specific format after the domain, which is...
/photos/[any alphanumeric string]/[any numeric string]
After that 3rd "/" there should always be the numeric string that I need to extract into a var. Also, I can't just start from the end and work backwards, because sometimes there is another "/" after the numeric string, followed by other things I don't need.
Is indexOf() the right function to verify if the url is the specific format and how would I write that expression? I've tried several things related to indexOf() and Regex(), but had no success. I seem to always end up with an unexpected character or it just doesn't work.
And of course the second part of my question is once I know the url is the right format, how do I extract the numeric string into a variable?
Thank you for any help!
javascript:(function(){
// Retrieve the url of the current page
var photoUrl = window.location.pathname;
if(photoUrl.indexOf(/photos/[any alphanumeric string]/[any numeric string]) == true) {
// Extract the numeric substring into a var and do something with it
} else {
// Do something else
}
})();

var id = window.location.pathname.match(/\/photos\/(\w+)\/(\d+)/i);
if (id) alert(id[1]); // use 1 or 2 depending on what you want
else alert('url did not fit expected format');
(EDIT: changed first \d* to \w+ and second \d* to \d+ and dig to id.)

To test strings for patterns and get their parts, you can use regular expressions. The expression for your criteria would look like this:
/^\/photos\/\w+\/(\d+)\/?$/
It matches any string that starts with /photos/, followed by one or more alphanumeric characters (or underscores), a /, one or more digits wrapped in a capture group, and an optional / at the end of the string.
So, if we do this:
"/photos/abc123/123".match(/^\/photos\/\w+\/(\d+)\/?$/)
the result will be ["/photos/abc123/123", "123"]. As you might have noticed, the capture group is the second array element.
Ready to use function:
var extractNumeric = function (string) {
var exp = /^\/photos\/\w+\/(\d+)\/?$/,
out = string.match(exp);
return out ? out[1] : false;
};
So, the answers:
Is indexOf() the right function to verify if the url is the specific
format and how would I write that expression? I've tried several
things related to indexOf() and Regex(), but had no success. I seem to
always end up with an unexpected character or it just doesn't work.
indexOf isn't the best choice for this job. You were right to reach for a regular expression; you just lacked the experience to write it.
And of course the second part of my question is once I know the url is
the right format, how do I extract the numeric string into a variable?
A regular expression together with the match function lets you test the string for the desired format and get its portions at the same time.
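Since the question mentions that the numeric string may be followed by another "/" and further path segments, a variant that is not anchored to the end of the string may fit better. A minimal sketch, assuming the numeric ID always ends either at a "/" or at the end of the path (extractPhotoId is a hypothetical helper name, not part of the answers above):

```javascript
// Returns the numeric ID as a string, or null if the path does not fit
// /photos/[alphanumeric]/[numeric] optionally followed by "/..." .
function extractPhotoId(pathname) {
  // (?:\/|$) requires the digits to be followed by "/" or the end of the
  // path, so something like "/photos/abc/12x" is rejected.
  var m = pathname.match(/^\/photos\/\w+\/(\d+)(?:\/|$)/);
  return m ? m[1] : null;
}
```

Inside the bookmarklet this could be called as extractPhotoId(window.location.pathname), branching on whether the result is null.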

Related

How to use Regular Expression Extractor while correlating using javascript in Webload PT tool?

I am able to extract a value from a given expression by using the Left and Right boundary in Webload, but I am unable to extract a particular value (for example, index) from the following expression using regular expression extractor:
index=2&Roll_ID=95372&NAME=ANDY&LastName=MURRAY&birthday
Suppose we receive the following 3 responses in a page:
index=2&Roll_ID=9572&NAME=ANDY&LastName=MURRAY&amp&
index=1&Roll_ID=7875&NAME=TOM&LastName=SHAW&amp&
index=7&Roll_ID=8343&NAME=EMA&LastName=WINSTON&amp&Birthday
So what regular expression would catch the index value (7) from the last response, which has the additional tag Birthday? I believe that in this case we also have to include expressions for Roll_ID, NAME and LastName, since we don't know in advance which response contains Birthday, even though we are only extracting the value of index.
Like in LoadRunner we write the regular expression to capture the index as following:
index=(.*?)&Roll_ID=.*?&NAME=.*?&LastName=.*?&Birthday
In Webload, how can we extract this value?
Is there any inbuilt function available in Webload to use regular expression extractor?
Or how can we extract this value using JavaScript code?
Here is a quick JavaScript example:
var str = "index=2&Roll_ID=95372&NAME=ANDY&LastName=MURRAY&";
var match_result = str.match(/index=([^&]*)/);
var index_val = match_result[1]; // "2"
For this example, I am assuming that this is going to be a standard URI query string. So, in the match() regex I am looking for index= explicitly and then grabbing the value: [^&]* matches anything that is not a "&" character, so the capture stops at the first "&". (A lazy [^&]*? with nothing after it would be satisfied with zero characters and capture an empty string.)
For HTML encoded query strings, you would need to do a lookahead for "&" to determine the end of your value match.
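To pick the index of the record that carries the extra Birthday tag, one option is to look at the records one at a time. This is a plain JavaScript sketch that uses no Webload-specific API; it assumes each record sits on its own line of the response text, and indexOfRecordWithBirthday is a hypothetical helper name:

```javascript
// Returns the index value of the first record containing "Birthday",
// or null if no such record exists.
function indexOfRecordWithBirthday(responseText) {
  var lines = responseText.split("\n");
  for (var i = 0; i < lines.length; i++) {
    if (lines[i].indexOf("Birthday") !== -1) {
      // Capture everything after "index=" up to the next "&".
      var m = lines[i].match(/index=([^&]*)/);
      if (m) return m[1];
    }
  }
  return null;
}
```

Checking one line at a time avoids a lazy .*? running across record boundaries, which could otherwise pair the first index= with the last Birthday.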

Matching a JS string with regex

I have a long xml raw message that is being stored in a string format. A sample is as below.
<tag1>val</tag><tag2>val</tag2><tagSomeNameXYZ/>
I'm looking to search this string and find out if it contains an empty tag such as <tagSomeNameXYZ/>. The thing is, the value of SomeName can change depending on context. I've tried using Str.match(/tagSomeNameXYZ/g) and Str.match(/<tag.*.XYZ\/>/g) to find out if it contains exactly that string, but I am unable to get it to return anything. I'm having trouble writing a regex that matches something like <tag*XYZ/>, where * is going to be SomeName (which I'm not interested in).
Tl;dr : How do I filter out <tagSomeNameXYZ/> from the string. Format being : <constant variableName constant/>
Example patterns that it should match:
<tagGetIndexXYZ/>
<tagGetAllIndexXYZ/>
<tagGetFooterXYZ/>
The issue with Str.match(/<tag.*.XYZ\/>/g) is that .* is greedy: it takes everything it can and does not stop at the first XYZ as you wish. So you need a way to stop (e.g. [^/]* means keep taking characters until you find a /) and then work back from there (the slice).
Does this help?
testString = "<tagGetIndexXYZ/>"
res = testString.match(/<tag([^/]*)\/\>/)[1].slice(0,-3)
console.log(res)
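An alternative that avoids the slice is to let the regex itself require the XYZ suffix. A minimal sketch, assuming the variable part consists only of word characters (letters, digits, underscore); findEmptyTagName is a hypothetical helper name:

```javascript
// Returns the variable part of the first <tag...XYZ/> in the string,
// or null if the string contains no such empty tag.
function findEmptyTagName(xml) {
  // \w+? cannot cross ">" or "/", so <tag1> and <tag2> are skipped.
  var m = xml.match(/<tag(\w+?)XYZ\/>/);
  return m ? m[1] : null;
}
```

A null result means the message contains no tag of that shape, so the same function doubles as the "does it contain one" test.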

if pathname starts with, as well as contains . Regex

I am trying to test the pathname of the url, checking if pathname starts with privmsg as well as contains one of the words in the selection. And my quantifier is selecting that at least one word must be found.
New RegExp thanks to one of the answers and I extended it more.
var post = /(^\/privmsg\?).+(post|reply){1}(.*)?/;
My urls will look like
/privmsg?mode=post
/privmsg?mode=reply
/privmsg?mode=reply&p=2 //another way
Though we have other modes that I do not want. I need to just get the constant url beginning with privmsg and having at least post or reply in it. Can someone explain what is wrong with my regex string and if I used the quantifier incorrectly.
Problem now is that it is still coming out false...
You need to allow for arbitrary characters between ? and (post|reply) (i.e. mode=). E.g.:
var post = /^\/privmsg\?.+(post|reply){1}/g;
Here .+ matches any sequence of 1 or more characters.
Your regex is missing something to match mode=.
With your regex you will match strings like /privmsg?post.
So alter your regex to include mode=:
^\/privmsg\?.*(post|reply)$
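A slightly stricter variant checks the mode parameter explicitly and also accepts further parameters after it, such as &p=2. This is a sketch under the assumption that the string being tested contains both the path and the query string; isPostOrReply is a hypothetical helper name:

```javascript
// True for /privmsg URLs whose mode parameter is exactly "post" or "reply".
function isPostOrReply(url) {
  // (?:.*&)? allows other parameters before mode=,
  // (?:&|$) requires the mode value to end there.
  return /^\/privmsg\?(?:.*&)?mode=(?:post|reply)(?:&|$)/.test(url);
}
```

Two things that may explain a persistent false result: in a browser, location.pathname does not include the query string (that part is in location.search), so the input would need to be built as location.pathname + location.search; and a regex with the g flag keeps state in lastIndex, so repeated test() calls on the same object can alternate between true and false.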

Regex validation rules

I'm writing a database backup function as part of my school project.
I need to write a regex rule so the database backup name can only contain legal characters.
By 'legal' I mean a string that doesn't contain ANY symbols or spaces. Only letters from the alphabet and numbers.
An example of a valid string would be '31Jan2012' or '63927jkdfjsdbjk623' or 'hello123backup'.
Here's my JS code so far:
// Check if the input box contains the characters a-z, A-Z, or 0-9 with a regular expression.
function checkIfContainsNumbersOrCharacters(elem, errorMessage){
var regexRule = new RegExp("^[\w]+$");
if(regexRule.test( $(elem).val() ) ){
return true;
}else{
alert(errorMessage);
return false;
}
}
//call the function
checkIfContainsNumbersOrCharacters("#backup-name", "Input can only contain the characters a-z or 0-9.");
I've never really used regular expressions before. After a quick bit of googling I found this tool, from which I wrote the following regex rule:
^[\w]+$
^ = start of string
[\w] = a-z/A-Z/0-9
'+' = characters after the string.
When running my function, whatever string I input seems to return false :( Is my code wrong, or am I not using regex rules correctly?
The problem here is, that when writing \w inside a string, you escape the w, and the resulting regular expression looks like this: ^[w]+$, containing the w as a literal character. When creating a regular expression with a string argument passed to the RegExp constructor, you need to escape the backslash, like so: new RegExp("^[\\w]+$"), which will create the regex you want.
There is a way to avoid that, using the shorthand notation provided by JavaScript: var regex = /^[\w]+$/; which does not need any extra escaping.
It can be simpler. This works:
function checkValid(name) {
return /^\w+$/.test(name);
}
/^\w+$/ is the literal notation for new RegExp(). Since the .test function returns a boolean, you only need to return its result. This also reads better than new RegExp("^\\w+$"), and you're less likely to goof up (thanks @x3ro for pointing out the need for two backslashes in strings).
In JavaScript, \w is equivalent to [A-Za-z0-9_]: it matches a single ASCII letter, digit, or underscore. (JavaScript regular expressions do not support POSIX classes such as [[:alnum:]].) Note that \w therefore also accepts the underscore, which may or may not be what you want. If what you really intend to match is [0-9A-Za-z], then that's what you should use.
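The difference between the string form and the literal form can be checked directly, along with a stricter letters-and-digits check. A small sketch; isValidBackupName is a hypothetical name, and the strict character class is an assumption based on the question's "only letters and numbers" requirement:

```javascript
// In a string literal "\w" is just "w", so this regex is ^[w]+$.
var fromString = new RegExp("^[\w]+$");
// Doubling the backslash keeps it: this regex is ^[\w]+$.
var fromEscapedString = new RegExp("^[\\w]+$");

// Letters and digits only; unlike \w this rejects the underscore.
function isValidBackupName(name) {
  return /^[0-9A-Za-z]+$/.test(name);
}
```

Inspecting fromString.source shows the pattern the constructor actually received, which is a quick way to spot lost backslashes.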
When you declare the regex as a string parameter to the RegExp constructor, you need to escape it. Both
var regexRule = new RegExp("^[\\w]+$");
...and...
var regexRule = new RegExp(/^[\w]+$/);
will work.
Keep in mind, though, that client-side validation of database data will never be enough: it is easily bypassed by disabling JavaScript in the browser, so invalid or malicious data can still reach your DB. You need to validate the data on the server side. Validating on the client side as well is still good practice, since it stops requests with invalid data from being sent at all.
This is the official spec: http://dev.mysql.com/doc/refman/5.0/en/identifiers.html but it's not very easily converted to a regular expression. Just a regular expression won't do it as there are also reserved words.
Why not just put it in the query (don't forget to escape it properly) and let MySQL give you an error? There might for instance be a bug in the MySQL version you're using, and even though your check is correct, MySQL might still refuse.

How to search csv string and return a match by using a Javascript regex

I'm trying to extract the first user-right from semicolon separated string which matches a pattern.
Users rights are stored in format:
LAA;LA_1;LA_2;LE_3;
String is empty if user does not have any rights.
My best solution so far is to use the following regex in regex.replace statement:
.*?;(LA_[^;]*)?.*
(The question mark at the end of group is for the purpose of matching the whole line in case user has not the right and replace it with empty string to signal that she doesn't have it.)
However, it doesn't work correctly in case the searched right is in the first position:
LA_1;LA_2;LE_3;
It is easy to fix it by just adding a semicolon at the beginning of line before regex replace but my question is, why doesn't the following regex match it?
.*?(?:(?:^|;)(LA_[^;]*))?.*
I have tried numerous other regular expressions to find the solution but so far without success.
I am not sure I get your question right, but in regards to the regular expressions you are using, you are overcomplicating them for no clear reason (at least not to me). You might want something like:
function getFirstRight(rights) {
var m = rights.match(/(^|;)(LA_[^;]*)/)
return m ? m[2] : "";
}
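You could just split the string first and take the first entry that starts with LA_:
function getFirstRight(rights)
{
var matches = rights.split(";").filter(function (r) { return r.indexOf("LA_") === 0; });
return matches[0] || "";
}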
To answer the specific question "why doesn't the following regex match it?", one problem is the mix of this at the beginning:
.*?
eventually followed by:
^|;
This is like saying: skip over any extra characters until you reach either the start or a semicolon. But you can't skip over anything and then later arrive at the start (unless it involves newlines in a multiline string).
Something like this works:
.*?(\bLA_[^;]*).*
Meaning, skip over characters until a word boundary followed by "LA_".
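To see how the pattern from the question behaves, it can help to run it on both inputs. As far as I can tell, the lazy .*? is satisfied with zero characters and the optional group may be skipped entirely, so when the string does not start with LA_ the trailing .* simply consumes everything. A sketch with hypothetical helper names:

```javascript
// The pattern from the question, used with replace as described there.
var questionPattern = /.*?(?:(?:^|;)(LA_[^;]*))?.*/;
function firstRightViaReplace(rights) {
  // "$1" becomes the empty string when the group did not participate.
  return rights.replace(questionPattern, "$1");
}

// A match-based alternative: the group is mandatory, so the engine
// has to keep scanning until it finds a start or ";" followed by LA_.
function firstRightViaMatch(rights) {
  var m = rights.match(/(?:^|;)(LA_[^;]*)/);
  return m ? m[1] : "";
}
```

With these definitions, firstRightViaReplace("LA_1;LA_2;LE_3;") gives "LA_1" but firstRightViaReplace("LAA;LA_1;LA_2;LE_3;") gives "", while firstRightViaMatch returns "LA_1" for both.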
