What does this JavaScript regular expression code do when dealing with URLs?

I am looking at someone else's codebase and, as a JavaScript noob and doubly so a regular expression noob, I can't figure out what the following lines do:
var url = sel.anchorNode.parentNode.href;
var match = self.location.href.replace(/\/$/i, '');
var replaced = url.replace(match,'');
I read it as:
1. Set the var url to the href value of the parent node of the currently selected node.
2. Set the var match to the browser's current URL with the trailing '/' removed (if it exists).
3. Set the var replaced to the string returned in 1. with the string returned in 2. removed from it.
If I am reading it correctly, I just can't figure out how it would ever do anything. I can't think of any situation where the parent node of the currently selected node would have an href value pointing to the current URL.
So I think I am reading it incorrectly.

Because the href property of an anchor is a fully-resolved URL (even if the href attribute is relative), what that does is remove the current page's path and get you back to a relative URL. E.g., on the page:
http://example.com/foo/bar/
with a link like
...
...you get the href from the anchor which is:
http://example.com/foo/bar/nifty.html
...and then remove http://example.com/foo/bar from it, giving you:
/nifty.html
In this case, of course, that's probably not what you actually want. :-) I have to admit I fail to see how the code is useful, out of context, but then context is king sometimes...
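The same steps can be reproduced outside the browser with the standard URL constructor, which resolves a relative reference against a base in much the same way the href property does. This is only a sketch: the function name stripCurrentPage is made up for illustration, and its two arguments stand in for the anchor's href attribute and self.location.href.

```javascript
// Hypothetical helper that mirrors the three lines from the question.
// hrefAttribute: the (possibly relative) href attribute of the link.
// pageUrl: stand-in for self.location.href.
function stripCurrentPage(hrefAttribute, pageUrl) {
  // What anchor.href would report: the attribute resolved against the page.
  const url = new URL(hrefAttribute, pageUrl).href;
  // The current URL with a single trailing slash removed, if present.
  const match = pageUrl.replace(/\/$/i, "");
  // Remove the current URL from the resolved link.
  return url.replace(match, "");
}

console.log(stripCurrentPage("nifty.html", "http://example.com/foo/bar/"));
// prints "/nifty.html"
```

A link that points somewhere outside the current path is left untouched, because the replace finds nothing to remove.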

Related

Optimising regex for matching domain name in url

I have a regex that matches iframe urls, and captures various components. The regex is given below
/(<iframe.*?src=['|"])((?:https?:\/\/|\/\/)[^\/]*)(?:.*?)(['|"][^>]*some-token:)([a-zA-Z0-9]+)(.*?>)/igm
To be clear, my actual requirement is to transform, within an HTML string, strings such as
<iframe src="http://somehost.com/somepath1/path2" class="some-token:abc123">
to
<iframe src="http://somehost.com/newpath?token=abc123" class="some-token:abc123">
The regex works as it is supposed to, but for normal-length HTML it takes around 2 seconds to execute, which I think is very high.
I would really appreciate it if someone could point out how to optimise this regex. I am sure I am doing something terribly wrong, because before this I used the regex
/(<iframe.*?src=['|"])(?:.*?)(['|"][^>]*some-token:)([a-zA-Z0-9]+)(.*?>)/igm
to completely replace the source URL and just add the parameter, and it was taking just 100 ms.

You do not need to (and should not) parse the iframe element as a string; you just need to access its attributes, and retrieve information from them and rewrite them.
function fix_iframe_src(iframe) {
  var src = iframe.getAttribute('src');
  var klass = iframe.getAttribute('class');
  var token = get_token(klass);
  src = fix_src(src, token);
  iframe.setAttribute('src', src);
}
Writing get_token and fix_src is left as an exercise.
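For reference, one possible shape for those two helpers is sketched below. It assumes the token is the alphanumeric run after some-token: in the class attribute and that the target path is /newpath?token=..., both taken from the question's example; adjust the patterns if your markup differs.

```javascript
// Sketch only: pulls the token out of a class value such as "some-token:abc123".
// Returns null when no token is present.
function get_token(klass) {
  var m = /some-token:([a-zA-Z0-9]+)/.exec(klass || '');
  return m ? m[1] : null;
}

// Sketch only: keeps the scheme and host of src and swaps the rest for
// the new path. Handles "http://", "https://" and protocol-relative "//".
function fix_src(src, token) {
  var m = /^((?:https?:)?\/\/[^\/]+)/i.exec(src || '');
  if (!m || token === null) {
    return src; // nothing recognisable to rewrite
  }
  return m[1] + '/newpath?token=' + encodeURIComponent(token);
}

console.log(fix_src('http://somehost.com/somepath1/path2', get_token('some-token:abc123')));
// prints "http://somehost.com/newpath?token=abc123"
```

Both patterns run against one short attribute value at a time, so backtracking over the whole document is not a concern.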
If you want to find a bunch of iframes and fix them all up, then
var iframes = document.querySelectorAll('iframe');
for (var i = 0; i < iframes.length; i++) {
  fix_iframe_src(iframes[i]);
}
By the way, the value of your class attribute seems to be broken. I doubt if it will match any CSS rules, if that's the intent. Are you using it for something other than to provide the token? In that case, you would be best off using a data attribute such as data-token.
Minor point about regexp flags: once you work on one attribute value at a time, the g and m flags are going to do nothing for you. m is about matching anchors like ^ and $ to the beginning and end of lines within the source string, which is not an issue for you. g is about matching multiple times, which is also not an issue when each string holds a single URL.
The reason your regexp is taking so long is most likely that you are throwing the entire DOM at it. Hard to tell unless you show us the code from which you are calling it.

HTML Append Variable to Query String

I have http://localhost/?val=1
When I click on a link, is there a way this link can append a query variable to the current string, for example:
Link
so when I click it the url would be:
http://localhost/?val=1&var2=2
but when I click the link it removes the first query string and looks like
http://localhost/&var2=2
Is such a thing possible with normal HTML?

You can't do that using only HTML, but you can do it with JS or PHP:
Using JS:
<a onclick="window.location += (window.location.href.indexOf('?') === -1 ? '?' : '&') + 'var2=2'">Link</a>
Using PHP:
Link
Note 1: make sure the new variable is not already in the current URL, or it will be appended again on every click.
Note 2: this is not a professional way to do it, but it could work if you need something fast.

Basically you want to get your current URL via JavaScript with:
var existingUrl = window.location.href; // http://localhost/?val=1
Then append any Query Strings that are applicable using:
window.location.href = existingUrl + '&var2=2';
or some other similar code. Take a look at this post about Query Parameters.
Note: A link would already have to exist with an OnClick event that calls a function with the above code in it for it to work appropriately.
Now obviously this isn't very useful information on its own, so you are going to want to do some work either in JavaScript or in server code (NodeJS, PHP, or some other server-side language) to pass those variable names and their values down so that the link can do what you want it to do.
You will need some logic to make sure the query parameters are put in the URL correctly, though. That is, if there is only one query param it is going to look like '?var1=1', and any subsequent parameter is going to look like '&var#=#'.
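The separator logic described above does not have to be written by hand: the standard URL and URLSearchParams objects take care of '?' versus '&'. A minimal sketch follows, where appendQueryParam is a made-up helper name and the URL is the one from the question.

```javascript
// Hypothetical helper: returns currentUrl with name=value added to its
// query string. Using set() replaces an existing parameter of the same
// name instead of appending a duplicate.
function appendQueryParam(currentUrl, name, value) {
  const url = new URL(currentUrl);
  url.searchParams.set(name, value);
  return url.href;
}

console.log(appendQueryParam("http://localhost/?val=1", "var2", "2"));
// prints "http://localhost/?val=1&var2=2"

// In a page, a click handler could then navigate with something like:
//   window.location.href = appendQueryParam(window.location.href, "var2", "2");
```

Because set() overwrites rather than appends, clicking the link twice does not stack the same variable.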

Extracting a specific part of a URL using regex in JavaScript

I'm fairly new to any kind of programming language, but I need to modify some code at work because the person who maintained it left and there is no replacement.
I basically would like to put a specific part of a URL in a variable.
The URLs look like this:
http://www.test.com/abc/hhhhhh/a458/example
I need to extract the a458 part and put it in a variable. This part is always at the same place but can be of variable length.
The URLs always have the same structure. I tried /hhhhhh\/{1}[a-z0-9]+\/{1}/g but it doesn't fully work. It keeps the hhh and the /.

No need for regex, just split it:
var link = "http://www.test.com/abc/hhhhhh/a458/example";
var linkParts = link.split("/");
// If the link is always in that format, then a458 (or whatever
// replaces it) will be at index 5
console.log(linkParts[5]);
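If you would rather stay with a regex, the original attempt keeps the hhhhhh and the slashes because the whole match is used. Wrapping only the wanted part in a capture group and reading group 1 avoids that. The sketch below assumes the segment before the value is literally hhhhhh, as in the question; extractId is a made-up name.

```javascript
// Hypothetical helper: returns the path segment that follows "/hhhhhh/",
// or null when the URL does not have that shape.
function extractId(link) {
  // The parentheses capture only the part between the two slashes.
  const m = /\/hhhhhh\/([a-z0-9]+)\//.exec(link);
  return m ? m[1] : null;
}

console.log(extractId("http://www.test.com/abc/hhhhhh/a458/example"));
// prints "a458"
```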

What does the hash (#) mean after a .js file?

What is the significance of the hash (#) here, and how does it relate to the .js file?
<script src="foo.js#bar=1"></script>

The hash after the script URL is used by the embedded script for configuration. For example, have a look at the provided example (Facebook):
1. window.setTimeout(function () {
2. var a = /(connect.facebook.net|facebook.com\/assets.php).*?#(.*)/;
3. FB.Array.forEach(document.getElementsByTagName('script'), function (d) {
4. if (d.src) {
5. var b = a.exec(d.src); //RegExp.exec on the SRC attribute
6. if (b) {
7. var c = FB.QS.decode(b[2]); //Gets the information at the hash
8. ...
In the script, each <script> tag (line 3) is checked for occurrences (line 5) of the hash pattern (line 2) in its src attribute. Then, if the hash exists (line 6), the hash data is extracted (line 7), and the function continues.

It doesn't do anything in terms of loading the script. What I am guessing is that the script itself looks for its own script tag, picks out the piece after the hash (bar=1), and uses it to configure its behavior somehow. To do this, it probably has to loop through all script tags and match against the src attribute.
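A rough sketch of that idea is below. The function name parseScriptHash is invented for illustration, and treating the fragment as key=value pairs is an assumption based on the bar=1 example; a real script may use any format it likes.

```javascript
// Hypothetical helper: turns the part after "#" in a script URL into a
// plain object, e.g. "foo.js#bar=1" becomes { bar: "1" }.
function parseScriptHash(src) {
  const hashIndex = src.indexOf("#");
  if (hashIndex === -1) {
    return {}; // no fragment, so no configuration
  }
  // URLSearchParams parses "a=1&b=2" style strings into key/value pairs.
  return Object.fromEntries(new URLSearchParams(src.slice(hashIndex + 1)));
}

console.log(parseScriptHash("foo.js#bar=1").bar);
// prints "1"

// Inside foo.js, running in a browser, the script could read its own tag:
//   var config = parseScriptHash(document.currentScript.src);
```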

It is probably used within the referenced .js file, which reads the raw URL and extracts the parameter (using something like window.location, for example, and parsing out what is after the #).

The part after the hash in a URL is known as a fragment identifier. If present, it specifies a part or a position within the overall resource or document. When used with HTTP, it usually specifies a section or location within the page, and the browser may scroll to display that part of the page.
In relation to the JavaScript file, the author of the program is in all probability using it as a method to pass arguments to the file. However, this method should not be used. URLs may contain query strings which serve the same purpose.
Nevertheless, it's never a good idea to embed arguments in the URL of a JavaScript file, because for every different set of parameters the URL is cached again, which is a waste of memory. Instead, it's better to set the query string on the URL of the HTML page which contains the script. This is because JavaScript has a built-in property to access the query string of the web page: location.search. You may read more about it here.

REGEX / replace only works once

I'm using a regex and JS replace to dynamically populate a variable in an href. The following code works the first time it is used on the page, but if a different variable is passed to the function, it does not replace ANYTHING.
function change(fone){
  $("a[href^='/application']").each(function(){
    this.href = this.href.replace(/device=.*/,"device="+ fone);
  });
}

The problem is that this.href actually returns a full absolute URL. So even if your HTML is <a href="/foo">, the .href property will return http://mydomain.com/foo.
So your href attribute is being populated with a full absolute URL, and then the a[href^='/application'] selector doesn't match anymore, because the href attribute now starts with the domain name instead of /application.

.href returns a fully qualified URL, e.g. 'http://www.mydomain.com/application/...'. So the selector doesn't work the second time around, since your domain-relative URL has been replaced with a full URL, and it's looking for things that start with "/application".
Use $(this).attr('href').replace... instead.
Fiddle here: http://jsfiddle.net/pcm5K/3/

As Squeegy says, you're changing the href the first time around so it no longer begins with /application - the second time around it begins with http://.
You can use the jQuery Attribute Contains Selector to get the links, and it's probably also better practice to use a capture group to do the replacement. Like so:
$("a[href*='/application']").each(function(){
  this.href = this.href.replace(/(device=)\w*/, "$1" + fone);
});
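Putting the two suggestions together, the replacement itself can live in a small function that works on the attribute string, so the value written back stays relative. This is only a sketch: setDevice is a made-up name, and stopping the match at & or # is an assumption about how the links are built. A replacer function is used instead of "$1" + fone so that a value starting with a digit or containing $ is inserted literally.

```javascript
// Hypothetical helper: replaces the value of the device parameter in an
// href string and leaves everything else, including later parameters, alone.
function setDevice(href, fone) {
  return href.replace(/(device=)[^&#]*/, function (whole, prefix) {
    return prefix + fone;
  });
}

console.log(setDevice("/application?device=abc&x=1", "42"));
// prints "/application?device=42&x=1"

// With jQuery on the page, reading and writing the attribute (not the
// resolved .href property) keeps the selector matching on later calls:
//   $("a[href^='/application']").each(function () {
//     $(this).attr("href", setDevice($(this).attr("href"), fone));
//   });
```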

You'll need to add the g flag to match all instances of the pattern, so your regular expression will look like this:
/device=.*/g
and your code will look like:
this.href = this.href.replace(/device=.*/g,"device="+ fone);

The reason is that unless all your links start with "/device=.", the regex won't work.
You need to use /.*device=.*/
The lack of the global flag is not the problem; it's the backslash in your pattern.
