JavaScript: Refactoring looping regex

How might this be improved, as it pertains to the loop and the regex replace?
var properties = { ... };
var template = element.innerHTML;
for (var name in properties) {
template = template.replace
(new RegExp('\\${' + name + '}', 'gm'), properties[name]);
}
element.innerHTML = template;
Is there a way I could get all the matches for /\$\{\w+\}/gm and just use those to build a new string once for the entire operation?

I agree with Jason and Hans WRT not bothering with this from a performance perspective.
But I would have written it differently in the first place:
element.innerHTML
= template.replace(/[$][{](\w+)[}]/g, function(x,y){return properties[y]||x;})
Some things to keep in mind
1. If at all possible, avoid creating a RegExp on every iteration of a loop. Compiling them is generally considered costly. The same goes for any object creation, though not at the cost of readability/maintainability.
2. If you're creating a RegExp dynamically, be sure the result is a RegExp; otherwise see #1, as you'll likely be able to apply it.
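The single-pass idea above can be sketched as a standalone function. This is only a minimal sketch, not the answerer's code: the name renderTemplate and the sample data are hypothetical, and it uses an own-property check instead of || so that falsy values such as 0 or an empty string are still substituted.

```javascript
// Hypothetical helper: fills ${name} placeholders in one pass over the template.
function renderTemplate(template, properties) {
  // One regex literal, created once; the capture group holds the placeholder name.
  return template.replace(/\$\{(\w+)\}/g, function (match, name) {
    // Leave unknown placeholders untouched; substitute known ones, even falsy values.
    return Object.prototype.hasOwnProperty.call(properties, name)
      ? String(properties[name])
      : match;
  });
}

console.log(renderTemplate('Hi ${user}, you have ${count} new messages ${unknown}',
  { user: 'Ann', count: 0 }));
// prints: Hi Ann, you have 0 new messages ${unknown}
```

The callback runs once per match, so the template is scanned a single time no matter how many properties there are.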

I'll bite ;-)
var properties = { ... };
var template = element.innerHTML;
element.innerHTML = template.replace(
    // one regexp matching any known tag; the g flag replaces every occurrence
    new RegExp('\\$\\{(' + getTags(properties).join('|') + ')\\}', 'g'),
    function (m0, tag) { return properties[tag]; });
function getTags(obj) {
    var tags = [];
    for (var t in obj)
        obj.hasOwnProperty(t) && tags.push(t); // skip inherited properties
    return tags;
}
Still loops through the tags of properties (in call on getTags) but creates only
one regexp object and scans the template only once.
Note that the tag names in properties should not contain special regexp characters (such as . or ( etc.), because they are inserted into the pattern unescaped.
I'd agree with Jason though, probably not worth the effort unless there are lots of tags or the template is very large.

Inefficient as it seems, I don't think you're going to do much better than that. So long as you're not replacing more than a few dozen tokens, I'd be surprised if this was actually a bottleneck.
If your profiler hasn't identified this as being a bottleneck, I definitely wouldn't spend time rewriting it. If nothing else, it's a lot more readable than your other idea, and in the end it's probably just as fast.

Related

What is the best way to access an element through a data-attribute whose value is an object (JSON)?

Say I have the following element:
<div class='selector' data-object='{"primary_key":123, "foreign_key":456}'></div>
If I run the following, I can see the object in the console.
console.log($('.selector').data('object'));
I can even access data like any other object.
console.log($('.selector').data('object').primary_key); // returns 123
Is there a way to select this element based on data in this attribute? The following does not work.
$('.selector[data-object.foreign_key=456]');
I can loop over all instances of the selector
var foreign_key = 456;
$('.selector').each(function () {
if ($(this).data('object').foreign_key == foreign_key) {
// do something
}
});
but this seems inefficient. Is there a better way to do this? Is this loop actually slower than using a selector?

You can try the contains selector:
var key_var = 456;
$(".selector[data-object*='foreign_key:" + key_var + "']");
I think that you may gain a little speed here over the loop in your example because in your example jQuery is JSON parsing the value of the attribute. In this case it's most likely using the JS native string.indexOf(). The potential downside here would be that formatting will be very important. If you end up with an extra space character between the colon and the key value, the *= will break.
Another potential downside is that the above will also match the following:
<div class='selector' data-object="{primary_key:123, foreign_key:4562}"></div>
The key is clearly different but will still match the pattern. You can include the closing bracket } in the matching string:
$(".selector[data-object*='foreign_key:" + key_var + "}']");
But then again, formatting becomes a key issue. A hybrid approach could be taken:
var results = $(".selector[data-object*='" + foreign_key + "']").filter(function () {
return ($(this).data('object').foreign_key == foreign_key)
});
This will narrow the result to only elements that have the number sequence then make sure it is the exact value with the filter.
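The false-positive problem described above can be shown without a DOM at all. This is a minimal sketch, assuming the attribute holds valid JSON as in the question. The helper names substringMatch and parsedMatch are hypothetical and only compare the two strategies on raw attribute strings.

```javascript
// Hypothetical helpers comparing substring matching with parsing the value.
function substringMatch(attrValue, key) {
  // Roughly what an attribute *= selector does: a plain substring test.
  return attrValue.indexOf('"foreign_key":' + key) !== -1;
}

function parsedMatch(attrValue, key) {
  try {
    return JSON.parse(attrValue).foreign_key === key;
  } catch (e) {
    return false; // not usable JSON, so treat it as no match
  }
}

var exact = '{"primary_key":123, "foreign_key":456}';
var longer = '{"primary_key":123, "foreign_key":4562}';

console.log(substringMatch(exact, 456), parsedMatch(exact, 456));   // true true
console.log(substringMatch(longer, 456), parsedMatch(longer, 456)); // true false
```

The substring test accepts 4562 when looking for 456, which is why the hybrid approach still filters on the parsed value.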

With a "contains" attribute selector.
$('selector[data-object*="foreign_key:456"]')

using javascript to replace onpage javascript

I'm fairly new to javascript so please go easy on me,
I have this code on a webpage:
<script type="text/javascript"> bb1 = "oldcode"; bb2 = "morecodehgere"; bb3 = 160000;</script>
On 1% of all page loads, I want to replace oldcode with newcode.
There are multiple instances of this code on the same page and I want to replace them all.
window.onload = replaceScript;
function replaceScript() {
var randomNumber = Math.floor(Math.random()*101);
var toReplace = 'oldcode';
var replaceWith ='newcode';
if randomNumber == 1 {
document.body.innerHTML = document.body.innerHTML.replace(/toReplace/g, replaceWith);
}
}
This is the current code I've got but it doesn't work.
Is JavaScript the best way to achieve what I'm looking to do? If so, what's the best way to do this?

The regular expression literal:
/toReplace/g
will create a regular expression object that matches the string "toReplace". If you want to create a regular expression to match the (string) value of the variable toReplace, you must use the RegExp constructor:
var re = new RegExp(toReplace, 'g');
It is not a good idea to replace the innerHTML of the body with a copy of itself. The innerHTML property doesn't necessarily reflect all the nuances of the DOM and will not include things like dynamically added listeners. It also varies from browser to browser.
Using a regular expression to replace parts of innerHTML is almost certain to produce unpredictable results; it may work well on trivial pages but will not be reliable on complex pages.
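Building the pattern from a variable, as described above, usually also means escaping any characters that are special in a regular expression. Here is a minimal sketch that works on a plain string rather than on innerHTML. The helper names escapeRegExp and replaceAllLiteral are hypothetical.

```javascript
// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Replace every literal occurrence of `search` in `source`.
function replaceAllLiteral(source, search, replacement) {
  var re = new RegExp(escapeRegExp(search), 'g');
  // A function replacement keeps `$` sequences in `replacement` literal.
  return source.replace(re, function () { return replacement; });
}

var script = 'bb1 = "oldcode"; bb2 = "oldcode";';
console.log(replaceAllLiteral(script, 'oldcode', 'newcode'));
// prints: bb1 = "newcode"; bb2 = "newcode";
```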

Detect if string contains javascript tags using jQuery/JavaScript

I am trying to create a very simplistic XSS detection system for a system I am currently developing. The system as it stands, allows users to submit posts with javascript embedded within the message. Here is what I currently have:-
var checkFor = "<script>";
alert(checkFor.indexOf("<script>") !== -1);
This doesn't really work that well at all. I need to write code that incorporates an array which contains the terms I am searching for [e.g - "<script>","</script>","alert("]
Any suggestions as to how this could be achieved using JavaScript/jQuery.
Thanks for checking this out. Many thanks :)

Replacing characters is a very fragile way to avoid XSS. (There are dozens of ways to get < in without typing the character, such as the character reference &#60;.) Instead, HTML-encode your data. I use these functions:
// Encode: treat data as plain text and read it back as escaped HTML.
var encode = function (data) {
    var result = data;
    if (data) {
        result = $("<div />").text(data).html();
    }
    return result;
};
// Decode: parse data as HTML and read back its text content.
var decode = function (data) {
    var result = data;
    if (data) {
        result = $("<div />").html(data).text();
    }
    return result;
};

As Explosion Pills said, if you're looking for cross–site exploits, you're probably best to either find one that's already been written or someone who can write one for you.
Anyway, to answer the question, regular expressions are not appropriate for parsing markup. If you have an HTML parser (client side is easy, server a little more difficult) you could insert the text as the innerHTML of an new element, then see if there are any child elements:
function mightBeMarkup(s) {
var d = document.createElement('div');
d.innerHTML = s;
return !!(d.getElementsByTagName('*').length);
}
Of course there still might be markup in the text, just that it's invalid so doesn't create elements. But combined with some other text, it might be valid markup.

The most effective way to prevent XSS attacks is by replacing all <, > and & characters with &lt;, &gt;, and &amp;.
There is a javascript library from OWASP. I haven't worked with it yet so can't tell you anything about the quality. Here is the link: https://www.owasp.org/index.php/ESAPI_JavaScript_Readme
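The character replacement described above can be sketched as a small function. This is a minimal sketch, assuming the text is destined for HTML element content or a quoted attribute value. The name escapeHtml is hypothetical, and the function also escapes quotes, which the answer does not mention.

```javascript
// Hypothetical helper: replaces HTML-significant characters with entities.
function escapeHtml(text) {
  var entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return String(text).replace(/[&<>"']/g, function (ch) {
    return entities[ch];
  });
}

console.log(escapeHtml('<script>alert("x")</script>'));
// prints: &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;
```

Because & is handled in the same pass as the other characters, already-produced entities are not escaped a second time.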

Advantages of createElement over innerHTML?

In practice, what are the advantages of using createElement over innerHTML? I am asking because I'm convinced that using innerHTML is more efficient in terms of performance and code readability/maintainability but my teammates have settled on using createElement as the coding approach. I just wanna understand how createElement can be more efficient.

There are several advantages to using createElement instead of modifying innerHTML (as opposed to just throwing away what's already there and replacing it) besides safety, like Pekka already mentioned:
Preserves existing references to DOM elements when appending elements
When you append to (or otherwise modify) innerHTML, all the DOM nodes inside that element have to be re-parsed and recreated. If you saved any references to nodes, they will be essentially useless, because they aren't the ones that show up anymore.
Preserves event handlers attached to any DOM elements
This is really just a special case (although common) of the last one. Setting innerHTML will not automatically reattach event handlers to the new elements it creates, so you would have to keep track of them yourself and add them manually. Event delegation can eliminate this problem in some cases.
Could be simpler/faster in some cases
If you are doing lots of additions, you definitely don't want to keep resetting innerHTML because, although faster for simple changes, repeatedly re-parsing and creating elements would be slower. The way to get around that is to build up the HTML in a string and set innerHTML once when you are done. Depending on the situation, the string manipulation could be slower than just creating elements and appending them.
Additionally, the string manipulation code may be more complicated (especially if you want it to be safe).
Here's a function I use sometimes that makes it more convenient to use createElement.
function isArray(a) {
return Object.prototype.toString.call(a) === "[object Array]";
}
function make(desc) {
if (!isArray(desc)) {
return make.call(this, Array.prototype.slice.call(arguments));
}
var name = desc[0];
var attributes = desc[1];
var el = document.createElement(name);
var start = 1;
if (typeof attributes === "object" && attributes !== null && !isArray(attributes)) {
for (var attr in attributes) {
el[attr] = attributes[attr];
}
start = 2;
}
for (var i = start; i < desc.length; i++) {
if (isArray(desc[i])) {
el.appendChild(make(desc[i]));
}
else {
el.appendChild(document.createTextNode(desc[i]));
}
}
return el;
}
If you call it like this:
make(["p", "Here is a ", ["a", { href:"http://www.google.com/" }, "link"], "."]);
you get the equivalent of this HTML:
<p>Here is a <a href="http://www.google.com/">link</a>.</p>

User bobince puts a number of cons very, very well in his critique of jQuery.
... Plus, you can make a div by saying $('<div>'+message+'</div>') instead of having to muck around with document.createElement('div') and text nodes. Hooray! Only... hang on. You've not escaped that HTML, and have probably just created a cross-site-scripting security hole, only on the client side this time. And after you'd spent so long cleaning up your PHP to use htmlspecialchars on the server-side, too. What a shame. Ah well, no-one really cares about correctness or security, do they?
jQuery's not wholly to blame for this. After all, the innerHTML property has been about for years, and already proved more popular than DOM. But the library certainly does encourage that style of coding.
As for performance: I first assumed innerHTML would be slower, because the string has to be parsed and internally converted into DOM elements (maybe using the createElement method). However, innerHTML is faster in all browsers according to the quirksmode benchmark provided by @Pointy.
As for readability and ease of use, you will find me choosing innerHTML over createElement any day of the week in most projects. But as you can see, there are many points speaking for createElement.

While innerHTML may be faster, I don't agree that it is better in terms of readability or maintenance. It may be shorter to put everything in one string, but shorter code is not always necessarily more maintainable.
String concatenation just does not scale when dynamic DOM elements need to be created, as the plus signs and the opening and closing quotes become difficult to track. Consider these examples:
The resulting element is a div with two inner spans whose content is dynamic. One of the class names (warrior) inside the first span is also dynamic.
<div>
<span class="person warrior">John Doe</span>
<span class="time">30th May, 2010</span>
</div>
Assume the following variables are already defined:
var personClass = 'warrior';
var personName = 'John Doe';
var date = '30th May, 2010';
Using just innerHTML and mashing everything into a single string, we get:
someElement.innerHTML = "<div><span class='person " + personClass + "'>" + personName + "</span><span class='time'>" + date + "</span></div>";
The above mess can be cleaned up by using string replacements to avoid opening and closing strings every time. Even for simple text replacements, I prefer using replace instead of string concatenation.
This is a simple function that takes an object of keys and replacement values and replaces them in the string. It assumes the keys are prefixed with $ to denote they are a special value. It does not do any escaping or handle edge cases where $ appears in the replacement value etc.
function replaceAll(string, map) {
for (var key in map) {
string = string.replace("$" + key, map[key]);
}
return string;
}
var string = '<div><span class="person $type">$name</span><span class="time">$date</span></div>';
var html = replaceAll(string, {
type: personClass,
name: personName,
date: date
});
someElement.innerHTML = html;
This can be improved by separating the attributes, text, etc. while constructing the object to get more programmatic control over the element construction. For example, with MooTools we can pass object properties as a map. This is certainly more maintainable, and I would argue more readable as well. jQuery 1.4 uses a similar syntax to pass a map for initializing DOM objects.
var div = new Element('div');
var person = new Element('span', {
'class': 'person ' + personClass,
'text': personName
});
var when = new Element('span', {
'class': 'time',
'text': date
});
div.adopt([person, when]);
I wouldn't call the pure DOM approach below any more readable than the ones above, but it's certainly more maintainable because we don't have to keep track of opening/closing quotes and numerous plus signs.
var div = document.createElement('div');
var person = document.createElement('span');
person.className = 'person ' + personClass;
person.appendChild(document.createTextNode(personName));
var when = document.createElement('span');
when.className = 'time';
when.appendChild(document.createTextNode(date));
div.appendChild(person);
div.appendChild(when);
The most readable version would most likely result from using some sort of JavaScript templating.
<div id="personTemplate">
<span class="person <%= type %>"><%= name %></span>
<span class="time"><%= date %></span>
</div>
var div = $("#personTemplate").create({
name: personName,
type: personClass,
date: date
});

You should use createElement if you want to keep references in your code. InnerHTML can sometimes create a bug that is hard to spot.
HTML code:
<p id="parent">sample <span id='test'>text</span> about anything</p>
JS code:
var test = document.getElementById("test");
test.style.color = "red"; //1 - it works
document.getElementById("parent").innerHTML += "whatever";
test.style.color = "green"; //2 - oooops
1) you can change the color
2) you can no longer change the color, or anything else, through that reference: the line above appended to innerHTML, so everything inside the parent was re-created and test now points to an element that is no longer in the document. To change it you have to call getElementById again.
You need to remember that this also affects events: handlers have to be re-applied.
innerHTML is great, because it is faster and most of the time easier to read, but you have to use it with caution. If you know what you are doing you will be OK.

Template literals (template strings) are another option.
const container = document.getElementById("container");
const item_value = "some Value";
const item = `<div>${item_value}</div>`
container.innerHTML = item;
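Interpolating a value straight into a template literal carries the same injection risk as string concatenation. One way to keep the syntax but escape the values is a tagged template. The following is a minimal sketch, and the tag name safeHtml is hypothetical.

```javascript
// Hypothetical tag function: escapes interpolated values, leaves literal parts alone.
function safeHtml(strings, ...values) {
  // Turn HTML-significant characters into numeric character references.
  const escape = (value) =>
    String(value).replace(/[&<>"']/g, (ch) => '&#' + ch.charCodeAt(0) + ';');
  return strings.reduce(
    (out, part, i) => out + part + (i < values.length ? escape(values[i]) : ''),
    ''
  );
}

const itemValue = '<b>some Value</b>';
const item = safeHtml`<div>${itemValue}</div>`;
console.log(item); // prints: <div>&#60;b&#62;some Value&#60;/b&#62;</div>
```

Only the interpolated values are escaped. The markup written in the literal itself is trusted and passed through unchanged.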

refactor HTML-generating JavaScript

Unfortunately on my project, we generate a lot of the HTML code in JavaScript like this:
var html = new StringBuffer();
html.append("<td class=\"gr-my-deals\">").append(deal.description).append("</td>");
I have 2 specific complaints about this:
The use of escaped double quotes (\") within the HTML string. These should be replaced by single quotes (') to improve readability.
The use of .append() instead of the JavaScript string concatenation operator "+"
Applying both of these suggestions produces the following equivalent line of code, which I consider to be much more readable:
var html = "<td class='gr-my-deals'><a href='" + deal.url + "' target='_blank'>" + deal.description + "</a></td>";
I'm now looking for a way to automatically transform the first line of code into the second. All I've come up with so far is to run the following find and replace over all our Javascript code:
Find: ).append(
Replace: +
This will convert the line of code shown above to:
html.append("<td class=\"gr-my-deals\">" + deal.description + "</td>");
This should safely remove all but the first 'append()' statement. Unfortunately, I can't think of any safe way to automatically convert the escaped double-quotes to single quotes. Bear in mind that I can't simply do a find/replace because in some cases you actually do need to use escaped double-quotes. Typically, this is when you're generating HTML that includes nested JS, and that JS contains string parameters, e.g.
function makeLink(stringParam) {
var sb = new StringBuffer();
sb.append("<a href=\"JavaScript:myFunc('" + stringParam + "');\">");
}
My questions (finally) are:
Is there a better way to safely replace the calls to 'append()' with '+'?
Is there any way to safely replace the escaped double quotes with single quotes, for example with a regex?
Cheers,
Don

Consider switching to a JavaScript template processor. They're generally fairly light-weight, and can dramatically improve the clarity of your code... as well as the performance, if you have a lot of re-use and choose one that precompiles templates.

Here is a stringFormat function that helps eliminate concatenation and ugly replacement values.
function stringFormat(str) {
    // arguments[0] is the template; {0} maps to arguments[1], and so on
    for (var i = 0; i < arguments.length - 1; i++) {
        var r = new RegExp('\\{' + i + '\\}', 'gm');
        str = str.replace(r, arguments[i + 1]);
    }
    return str;
}
Use it like this:
var htmlMask = "<td class='gr-my-deals'><a href='{0}' target='_blank'>{1}</a></td>";
var x = stringFormat( htmlMask, deal.url, deal.description );

As Shog9 implies, there are several good JavaScript templating engines out there. Here's an example of how you would use mine, jQuery Simple Templates:
var tmpl, vals, html;
tmpl = '<td class="gr-my-deals">';
tmpl += '<a href="#{href}">#{text}</a>';
tmpl += '</td>';
vals = {
href : 'http://example.com/example',
text : 'Click me!'
};
html = $.tmpl(tmpl, vals);

There is a good reason why you should be using the StringBuffer() instead of string concatenation in JavaScript. The StringBuffer() and its append() method use Array and Array's join() to bring the string together. If you have a significant number of partial strings you want to join, this is known to be a faster method of doing it.
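StringBuffer is not built into JavaScript, so the project presumably defines its own. The array-and-join technique described above might look like the following minimal sketch. This implementation is an assumption, not the project's actual class, and whether it outperforms the + operator in a given engine is something to measure rather than assume.

```javascript
// Hypothetical StringBuffer: collects parts in an array and joins them once.
function StringBuffer() {
  this.parts = [];
}

StringBuffer.prototype.append = function (text) {
  this.parts.push(String(text));
  return this; // returning `this` allows chained append() calls
};

StringBuffer.prototype.toString = function () {
  return this.parts.join('');
};

var html = new StringBuffer();
html.append('<td class="gr-my-deals">').append('Great deal').append('</td>');
console.log(html.toString()); // prints: <td class="gr-my-deals">Great deal</td>
```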

Templating? Templating sucks! Here's the way I would write your code:
TD({ "class" : "gr-my-deals" },
A({ href : deal.url,
target : "_blank"},
deal.description ))
I use a 20-line library called DOMination, which I will send to anyone who asks, to support such code.
The advantages are manifold but some of the most obvious are:
legible, intuitive code
easy to learn and to write
compact code (no close-tags, just close-parentheses)
well-understood by JavaScript-aware editors, minifiers, and so on
resolves some browser-specific issues (like the difference between rowSpan and rowspan on IE)
integrates well with CSS
(Your example, BTW, highlights the only disadvantage of DOMination: any HTML attributes that are also JavaScript reserved words, class in this case, have to be quoted, lest bad things happen.)
