Performance comparison with V8 - javascript

I'm currently testing multiple cases for parsing lines.
Each line is formatted like this:
"dHdX5jOa7ww9cGsW7jQF=dHdX5jOa7ww9cGsW7jQF=dHdX5jOa7ww9cGsW7jQF=dHdX5jOa7ww9cGsW7jQF"
There are a lot of lines of course, and I need to extract the key, and the value.
The key is delimited by the first "=" found.
There is never a "=" char in the key.
The value is the rest of the string after the first "=" sign.
So for this example the result should be:
{
key: "dHdX5jOa7ww9cGsW7jQF",
value: "dHdX5jOa7ww9cGsW7jQF=dHdX5jOa7ww9cGsW7jQF=dHdX5jOa7ww9cGsW7jQF"
}
From here we can iterate on multiple solutions:
// the first one is not very efficient with split splice join method
function first(line) {
const lineSplit = line.split('='),
key = lineSplit[0],
value = lineSplit.splice(1, lineSplit.length).join('=');
return {
key,
value
};
}
// the second one executes only what I want to do,
// using the built-in String prototype functions
function optimized(line) {
const index = line.indexOf("="),
key = line.substr(0, index),
value = line.substr(index + 1, line.length);
return {
key,
value
};
}
// i tried to code the logic myself
function homemade(line) {
const len = line.length;
let value = "", key = "", valued = false;
for (let i = 0; i < len; ++i) {
const char = line[i];
if (valued === false) {
if (char !== '=') {
key += char;
} else {
valued = true;
}
} else {
value += char;
}
}
return {
key,
value
};
}
// next, reimplement the substr and indexOf built-ins and use them
// in the same function, but with the homemade substr2 and indexOf2
String.prototype.substr2 = function(from, to){
let str = "";
for (let i = from; i < to; ++i) {
str += this[i];
}
return str;
};
String.prototype.indexOf2 = function(occ){
const len = this.length;
for (let i = 0; i < len; ++i) {
if (this[i] === occ) {
return i;
}
}
return -1;
};
function overload(line) {
const index = line.indexOf2("="),
key = line.substr2(0, index),
value = line.substr2(index + 1, line.length);
return {
key,
value
};
}
And voila the results with jsBench:
[I'm using Google Chrome Version 59.0.3071.104 (Official Build) (64-bit)]
You can check the results of these functions in your own browser with this jsBench.
I don't understand what is going on. I thought this could not be possible, since I wrote only the code I needed, with a native for() loop and the like...
My questions are:
Why are the built-in string operations so much faster?
Why is this repeated string concatenation inefficient?
Is there an alternative to it?

Why are the built-in string operations so much faster?
Because they are optimized, and use internal implementation tricks that are not available to JavaScript code. For example, they avoid repeated string concatenation by building the result in one go.
Why is this repeated string concatenation inefficient?
Because it creates many strings as intermediate results.
Is there an alternative to it ?
Use the builtin string operations :-)
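As a minimal sketch of "use the built-ins", the parsing from the question can be done with indexOf and slice, so the key and the value are each produced in a single call instead of character by character. The name parseLine and the handling of a line without any "=" are assumptions for illustration; the question does not say what should happen in that case.

```javascript
// Hypothetical helper: split a line at the first "=" using built-in string methods.
function parseLine(line) {
  const index = line.indexOf("=");
  // Assumption: a line without "=" is treated as a key with an empty value.
  if (index === -1) {
    return { key: line, value: "" };
  }
  return {
    key: line.slice(0, index),    // everything before the first "="
    value: line.slice(index + 1), // everything after it, later "=" signs included
  };
}

console.log(parseLine("dHdX5jOa7ww9cGsW7jQF=dHdX5jOa7ww9cGsW7jQF=dHdX5jOa7ww9cGsW7jQF"));
```

slice is used here rather than substr only because substr is a legacy method; for this input both return the same result.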

Related

How to find an unknown pattern, that is present in multiple strings?

I'm trying to find a part in multiple strings, that all strings share in common. For example:
const string1 = '.bold[_ngcontent="_kjhafh-asda-qw"] {background:black;}';
const string2 = '[_ngcontent="_kjhafh-asda-qw"] {background-color:hotpink;}';
const string3 = 'div > p > span[_ngcontent="_kjhafh-asda-qw"] {background:hotpink;}'
I don't know in advance what exactly the string is that I'm looking for, so I have to loop over the strings and find out. In the example above, the pattern would be [_ngcontent="_kjhafh-asda-qw"].
Is this even possible? Also, it would have to understand that maybe no such pattern exists. And are there methods for that or do I need to implement such an algorithm myself?
EDIT (context): We are building a validator that checks a micro-frontend for global CSS rules (rules that are not prefixed and live outside a shadow DOM). It loads the micro-frontend in isolation in a headless browser (within a Jenkins pipeline) and validates that its global rules cannot break anything else on the same page outside the micro-frontend's context. Using a headless browser, we can rely on the document.styleSheets property and not miss any styles that are loaded. This finds <style> tags and their contents, as well as the contents of external stylesheets.
Leveraging the BLAST algorithm, the following code snippet seeks successively matching substrings.
//
// See https://stackoverflow.com/questions/13006556/check-if-two-strings-share-a-common-substring-in-javascript/13007065#13007065
// for the following function...
//
String.prototype.subCompare = function(needle, haystack, minLength) {
var i,j;
haystack = haystack || this.toLowerCase();
minLength = minLength || 5;
for (i=needle.length; i>=minLength; i--) {
for (j=0; j <= (needle.length - i); j++) {
var substring = needle.substr(j,i);
var k = haystack.indexOf(substring);
if (k !== -1) {
return {
found : 1,
substring : substring,
needleIndex : j,
haystackIndex : k
};
}
}
}
return {
found : 0
}
}
//
// Iterate through the array of strings, seeking successive matching substrings...
//
const strings = [
'.bold[_ngcontent="_kjhafh-asda-qw"] {background:black;}',
'[_ngcontent="_kjhafh-asda-qw"] {background-color:hotpink;}',
'div > p > span[_ngcontent="_kjhafh-asda-qw"] {background:hotpink;}'
]
let check = { found: 1, substring: strings[ 0 ] };
let i = 1;
while ( check.found && i < strings.length ) {
check = check.substring.subCompare( strings[ i++ ] );
}
console.log( check );
Note that without seeing a larger sampling of string data, it's not clear whether this algorithm satisfies the objective...
Thanks to Trentium's answer, I was able to do it. I adapted the code a little bit, as it was doing too much and also, the substr didn't yield a consistent result (it depended on the order of input strings).
The code could obviously be further minified/simplified.
const findCommonPattern = (base, needle, minLength = 5) => {
const haystack = base.toLowerCase();
for (let i = needle.length; i >= minLength; i--) {
for (let j = 0; j <= needle.length - i; j++) {
let prefix = needle.substr(j, i);
let k = haystack.indexOf(prefix);
if (k !== -1) {
return {
found: true,
prefix,
};
}
}
}
return {
found: false,
};
};
const checkIfCssIsPrefixed = (strings) => {
let check = { found: true };
let matchingStrings = [];
for (let i = 1; check.found && i < strings.length; ++i) {
check = findCommonPattern(strings[0], strings[i]);
matchingStrings.push(check.prefix);
}
// Sort by length and take the shortest string, which will be the pattern that all of the strings share in common.
check.prefix = matchingStrings.sort((a, b) => a.length - b.length)[0];
return check;
};
console.log(
checkIfCssIsPrefixed([
".spacer[_ngcontent-wdy-c0]",
"[_nghost-wdy-c0]",
".toolbar[_ngcontent-wdy-c0]",
"p[_ngcontent-wdy-c0]",
".spacer[_ngcontent-wdy-c0]",
".toolbar[_ngcontent-wdy-c0] img[_ngcontent-wdy-c0]",
"h1[_ngcontent-wdy-c0], h2[_ngcontent-wdy-c0], h3[_ngcontent-wdy-c0], h4[_ngcontent-wdy-c0], h5[_ngcontent-wdy-c0], h6[_ngcontent-wdy-c0]",
".toolbar[_ngcontent-wdy-c0] #twitter-logo[_ngcontent-wdy-c0]",
".toolbar[_ngcontent-wdy-c0] #youtube-logo[_ngcontent-wdy-c0]",
".toolbar[_ngcontent-wdy-c0] #twitter-logo[_ngcontent-wdy-c0]:hover, .toolbar[_ngcontent-wdy-c0] #youtube-logo[_ngcontent-wdy-c0]:hover",
])
);
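For comparison, here is a minimal sketch of a brute-force alternative that does not depend on the order of the input strings. It is not the code from either answer above: the function name longestCommonSubstring and the minLength parameter are made up for illustration. It relies on the fact that any substring shared by all strings must also occur in the shortest one, so it tries the substrings of the shortest string from longest to shortest and returns the first one that every string contains, or null when no such pattern exists.

```javascript
// Hypothetical helper: longest substring that occurs in every string of the list.
function longestCommonSubstring(strings, minLength = 1) {
  if (strings.length === 0) return null;
  // A common substring cannot be longer than the shortest input.
  const shortest = strings.reduce((a, b) => (b.length < a.length ? b : a));
  for (let len = shortest.length; len >= minLength; len--) {
    for (let start = 0; start + len <= shortest.length; start++) {
      const candidate = shortest.slice(start, start + len);
      if (strings.every((s) => s.includes(candidate))) {
        return candidate;
      }
    }
  }
  return null; // no shared substring of at least minLength characters
}

console.log(longestCommonSubstring(["xabcy", "zabcw", "abc"])); // abc
console.log(longestCommonSubstring(["abc", "def"]));            // null
```

This is fine for a handful of short selectors, but the running time grows quickly with the length of the shortest string, so it is only a sketch for small inputs.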

Swapping consecutive/adjacent characters of a string in JavaScript

Example string: astnbodei, the actual string must be santobedi. Here, my system starts reading a pair of two characters from the left side of a string, LSB first and then the MSB of the character pair. Therefore, santobedi is received as astnbodei. The strings can be a combination of letters and numbers and even/odd lengths of characters.
My attempt so far:
var attributes_ = [Name, Code,
Firmware, Serial_Number, Label
]; //the elements of 'attributes' are the strings
var attributes = [];
for (var i = 0; i < attributes_.length; i++) {
attributes.push(swap(attributes_[i].replace(/\0/g, '').split('')));
}
function swap(array_attributes) {
var tmpArr = array_attributes;
for (var k = 0; k < tmpArr.length; k += 2) {
do {
var tmp = tmpArr[k];
tmpArr[k] = tmpArr[k+1]
tmpArr[k+1] = tmp;
} while (tmpArr[k + 2] != null);
}
return tmpArr;
}
msg.Name = attributes; //its to check the code
return {msg: msg, metadata: metadata,msgType: msgType}; //its the part of system code
While running above code snippet, I received the following error:
Can't compile script: javax.script.ScriptException: :36:14 Expected : but found ( return {__if(); ^ in at line number 36 at column number 14
I'm not sure what the error says. Is my approach correct? Is there a direct way to do it?
Did you try going through the array in pairs and swapping using ES6 syntax?
You can swap variables like this in ES6:
[a, b] = [b, a]
Below is one way to do it. The code you have is not valid because return is not allowed outside a function.
let string = "astnbodei";
let myArray = string.split('');
let outputArray = [];
for (let i=0; i<myArray.length; i=i+2) {
outputArray.push(myArray[i+1]);
outputArray.push(myArray[i]);
}
console.log(outputArray.join(''));
Swapping consecutive pairs of characters within a string can be solved quite easily with reduce:
function swapCharsPairwise(value) {
return String(value)
.split('')
.reduce((result, antecedent, idx, arr) => {
if (idx % 2 === 0) {
// access the Adjacent (the Antecedent's next following char).
const adjacent = arr[idx + 1];
// aggregate result while swapping `antecedent` and `adjacent`.
result.push(adjacent, antecedent);
}
return result;
}, []).join('');
}
console.log(
'swapCharsPairwise("astnbodei") ...',
swapCharsPairwise("astnbodei")
);
The reason for the error in the question was the placement of the function swap. However, switching its placement gave me another error:
java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: javax.script.ScriptException: delight.nashornsandbox.exceptions.ScriptCPUAbuseException: Script used more than the allowed [8000 ms] of CPU time.
@dikuw's answer helped me partially. The following code worked for me:
var attributes_ = [Name, Code,
Firmware, Serial_Number, Label
]; //the elements of 'attributes' are the strings
var attributes = [];
for (var i = 0; i < attributes_.length; i++) {
attributes.push(swap(attributes_[i].replace(/\0/g, '').split('')));
}
return {msg: msg, metadata: metadata,msgType: msgType};
function swap(array_attributes) {
var tmpArr = [];
for (var k = 0; k < array_attributes.length; k = k + 2) {
if (array_attributes[k + 1] != null) {
tmpArr.push(array_attributes[k + 1]);
}
// always keep the current character, so the last one of an odd-length string is not dropped
tmpArr.push(array_attributes[k]);
}
return tmpArr.join('');
}
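As for a more direct way, one possible sketch is a single regular-expression replace that captures two characters at a time and emits them in reverse order. The name swapPairs is made up for illustration, and this assumes the script engine supports the standard String.prototype.replace with `$1`/`$2` substitutions; it uses no ES6 syntax. A trailing character of an odd-length string has no partner, so it is never matched and stays in place.

```javascript
// Hypothetical helper: swap every pair of adjacent characters in a string.
function swapPairs(text) {
  return String(text)
    .replace(/\0/g, "")                     // drop NUL padding, as in the question
    .replace(/([\s\S])([\s\S])/g, "$2$1");  // swap each complete pair of characters
}

console.log(swapPairs("astnbodei")); // santobedi
```

`[\s\S]` is used instead of `.` so that line breaks inside a value are swapped like any other character.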

How to get odd and even position characters from a string?

I'm trying to figure out how to remove every second character (starting from the first one) from a string in Javascript.
For example, the string "This is a test!" should become "hsi  etTi sats!" (note the two consecutive spaces after "hsi").
I also want to save every deleted character into another array.
I have tried using replace method and splice method, but wasn't able to get them to work properly. Mostly because replace only replaces the first character.
function encrypt(text, n) {
if (text === "NULL") return n;
if (n <= 0) return text;
var encArr = [];
var newString = text.split("");
var j = 0;
for (var i = 0; i < text.length; i += 2) {
encArr[j++] = text[i];
newString.splice(i, 1); // this line doesn't work properly
}
}
You could reduce the characters of the string and group them to separate arrays using the % operator. Use destructuring to get the 2D array returned to separate variables
let str = "This is a test!";
const [even, odd] = [...str].reduce((r,char,i) => (r[i%2].push(char), r), [[],[]])
console.log(odd.join(''))
console.log(even.join(''))
Using a for loop:
let str = "This is a test!",
odd = [],
even = [];
for (var i = 0; i < str.length; i++) {
i % 2 === 0
? even.push(str[i])
: odd.push(str[i])
}
console.log(odd.join(''))
console.log(even.join(''))
It would probably be easier to use a regular expression and .replace: capture two characters in separate capturing groups, add the first character to a string, and replace with the second character. Then, you'll have first half of the output you need in one string, and the second in another: just concatenate them together and return:
function encrypt(text) {
let removedText = '';
const replacedText1 = text.replace(/(.)(.)?/g, (_, firstChar, secondChar) => {
// in case the match was at the end of the string,
// and the string has an odd number of characters:
if (!secondChar) secondChar = '';
// remove the firstChar from the string, while adding it to removedText:
removedText += firstChar;
return secondChar;
});
return replacedText1 + removedText;
}
console.log(encrypt('This is a test!'));
Pretty simple with .reduce() to create the two arrays you seem to want.
function encrypt(text) {
return text.split("")
.reduce(({odd, even}, c, i) =>
i % 2 ? {odd: [...odd, c], even} : {odd, even: [...even, c]}
, {odd: [], even: []})
}
console.log(encrypt("This is a test!"));
They can be converted to strings by using .join("") if you desire.
I think you were on the right track. What you missed is that replace accepts either a string or a RegExp as its pattern.
The replace() method returns a new string with some or all matches of a pattern replaced by a replacement. The pattern can be a string or a RegExp, and the replacement can be a string or a function to be called for each match. If pattern is a string, only the first occurrence will be replaced.
Source: String.prototype.replace()
If you are replacing a value (and not a regular expression), only the first instance of the value will be replaced. To replace all occurrences of a specified value, use the global (g) modifier
Source: JavaScript String replace() Method
So my suggestion is to keep using replace and pass it the right RegExp. You can probably work it out from this example, which removes every second occurrence of the character 't':
let count = 0;
let testString = 'test test test test';
console.log('original', testString);
// global modifier in RegExp
let result = testString.replace(/t/g, function (match) {
count++;
return (count % 2 === 0) ? '' : match;
});
console.log('removed', result);
like this?
var text = "This is a test!"
var result = ""
var rest = ""
for(var i = 0; i < text.length; i++){
if( (i%2) != 0 ){
result += text[i]
} else{
rest += text[i]
}
}
console.log(result+rest)
Maybe with split, filter and join:
const remaining = myString.split('').filter((char, i) => i % 2 !== 0).join('');
const deleted = myString.split('').filter((char, i) => i % 2 === 0).join('');
You could take an array and splice and push each second item to the end of the array.
function encrypt(string) {
var array = [...string],
i = 0,
l = (array.length + 1) >> 1; // number of characters at even positions
while (i < l) array.push(array.splice(i++, 1)[0]);
return array.join('');
}
console.log(encrypt("This is a test!"));
function encrypt(text) {
text = text.split("");
var removed = []
var encrypted = text.filter((letter, index) => {
if(index % 2 == 0){
removed.push(letter)
return false;
}
return true
}).join("")
return {
full: encrypted + removed.join(""),
encrypted: encrypted,
removed: removed
}
}
console.log(encrypt("This is a test!"))
Splice does not work here because removing an element inside a for loop shifts every later element one position to the left, so the indexes used for the following removals no longer point at the intended characters.
I don't know how much you care about performance, but the regex approach is not very efficient.
A simple test on a fairly long string shows that the filter version is on average about 3 times faster, which can make quite a difference on very long strings or on many short ones.
function test(func, n){
var text = "";
for(var i = 0; i < n; ++i){
text += "a";
}
var start = new Date().getTime();
func(text);
var end = new Date().getTime();
var time = (end-start) / 1000.0;
console.log(func.name, " took ", time, " seconds")
return time;
}
function encryptREGEX(text) {
let removedText = '';
const replacedText1 = text.replace(/(.)(.)?/g, (_, firstChar, secondChar) => {
// in case the match was at the end of the string,
// and the string has an odd number of characters:
if (!secondChar) secondChar = '';
// remove the firstChar from the string, while adding it to removedText:
removedText += firstChar;
return secondChar;
});
return replacedText1 + removedText;
}
function encrypt(text) {
text = text.split("");
var removed = "";
var encrypted = text.filter((letter, index) => {
if(index % 2 == 0){
removed += letter;
return false;
}
return true
}).join("")
return encrypted + removed
}
var timeREGEX = test(encryptREGEX, 10000000);
var timeFilter = test(encrypt, 10000000);
console.log("Using filter is faster ", timeREGEX/timeFilter, " times")
Storing the removed letters in an array and joining them at the end is much more efficient than concatenating each letter onto a string.
In the filter solution above I switched from an array to a string to match the regex solution, so the two are more comparable.
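The index-shifting problem with splice mentioned above can be made concrete with a small sketch. Both function names are made up for illustration. The first function mimics the loop from the question and removes the wrong characters; the second walks the array from the end, so each removal only shifts elements that have already been processed.

```javascript
// Forward loop, as in the question: each splice shifts the remaining
// elements left, so later indexes point at the wrong characters.
function removeEvenForwards(text) {
  const chars = text.split("");
  for (let i = 0; i < chars.length; i += 2) {
    chars.splice(i, 1);
  }
  return chars.join("");
}

// Backward loop: start at the last even index and move towards 0,
// so the elements still to be visited never change position.
function removeEvenBackwards(text) {
  const chars = text.split("");
  const lastEven = chars.length - 1 - ((chars.length - 1) % 2);
  for (let i = lastEven; i >= 0; i -= 2) {
    chars.splice(i, 1);
  }
  return chars.join("");
}

console.log(removeEvenForwards("abcdef"));  // bcef (wrong)
console.log(removeEvenBackwards("abcdef")); // bdf
```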

Counting the frequency of elements in an array in JavaScript

How do I count the frequency of the elements in an array? I'm new to JavaScript and completely lost. I have looked at other answers here but can't get them to work for me. Any help is much appreciated.
function getText() {
var userText;
userText = document.InputForm.MyTextBox.value; //get text as string
alphaOnly(userText);
}
function alphaOnly(userText) {
var nuText = userText;
//result = nuText.split("");
var alphaCheck = /[a-zA-Z]/g; //using RegExp create variable to have only alphabetic characters
var alphaResult = nuText.match(alphaCheck); //get object with only alphabetic matches from original string
alphaResult.sort();
var result = freqLet(alphaResult);
document.write(countlist);
}
function freqLet(alphaResult) {
count = 0;
countlist = {
alphaResult: count
};
for (i = 0; i < alphaResult.length; i++) {
if (alphaResult[i] in alphaResult)
count[i] ++;
}
return countlist;
}
To count frequencies you should use an object whose properties correspond to the letters occurring in your input string.
Also, before incrementing the value of a property, you should first check whether that property already exists.
function freqLet (alphaResult) {
var count = {};
var countlist = {alphaResult: count};
for (var i = 0; i < alphaResult.length; i++) {
// index access works whether alphaResult is a string or the array returned by match()
var character = alphaResult[i];
if (count[character]) {
count[character]++;
} else {
count[character] = 1;
}
}
return countlist;
}
If you can use a third party library, underscore.js provides a function "countBy" that does pretty much exactly what you want.
_.countBy(userText, function(character) {
return character;
});
This should return an associative array of characters in the collection mapped to a count.
Then you could filter the keys of that object to the limited character set you need, again, using underscore or whatever method you like.
Do as below:
var __arr = [6,7,1,2,3,3,4,5,5,5]
function __freq(__arr){
var a = [], b = [], prev
__arr.sort((a,b)=>{return a- b} )
for(let i = 0; i<__arr.length; i++){
if(__arr[i] !== prev){
a.push(__arr[i])
b.push(1)
}else{
b[b.length - 1]++
}
prev = __arr[i]
}
return [a , b]
}
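Another option, shown here only as a minimal sketch, is to count with a Map, which needs no sorting and works for any kind of element (letters, numbers, and so on). The function name countFrequencies is made up for illustration and is not taken from any of the answers above.

```javascript
// Hypothetical helper: map each distinct element to the number of times it occurs.
function countFrequencies(items) {
  const counts = new Map();
  for (const item of items) {
    // Start from 0 when the element has not been seen yet.
    counts.set(item, (counts.get(item) || 0) + 1);
  }
  return counts;
}

// Usage with letters only, similar to the question; match() returns null
// when nothing matches, hence the fallback to an empty array.
const letters = "Hello, World!".match(/[a-zA-Z]/g) || [];
const frequencies = countFrequencies(letters);
console.log(frequencies.get("l")); // 3
console.log(frequencies.get("o")); // 2
```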

Multiple special characters replacement optimization

I need to replace all the special characters in a string with JavaScript or jQuery.
I am sure there is a better way to do this.
But I currently have no clue.
Anyone got an idea?
function Unaccent(str) {
var norm = new Array('À','Á','Â','Ã','Ä','Å','Æ','Ç','È','É','Ê','Ë','Ì','Í','Î','Ï', 'Ð','Ñ','Ò','Ó','Ô','Õ','Ö','Ø','Ù','Ú','Û','Ü','Ý','Þ','ß', 'à','á','â','ã','ä','å','æ','ç','è','é','ê','ë','ì','í','î','ï','ð','ñ', 'ò','ó','ô','õ','ö','ø','ù','ú','û','ü','ý','ý','þ','ÿ');
var spec = new Array('A','A','A','A','A','A','A','C','E','E','E','E','I','I','I','I', 'D','N','O','O','O','O','O','O','U','U','U','U','Y','b','s', 'a','a','a','a','a','a','a','c','e','e','e','e','i','i','i','i','d','n', 'o','o','o','o','o','o','u','u','u','u','y','y','b','y');
for (var i = 0; i < spec.length; i++) {
str = replaceAll(str, norm[i], spec[i]);
}
return str;
}
function replaceAll(str, search, repl) {
while (str.indexOf(search) != -1) {
str = str.replace(search, repl);
}
return str;
}
Here's a version using a lookup map that works a little more efficiently than nested loops:
function Unaccent(str) {
var map = Unaccent.map; // shortcut
var result = "", srcChar, replaceChar;
for (var i = 0, len = str.length; i < len; i++) {
srcChar = str.charAt(i);
// use hasOwnProperty so we never conflict with any
// methods/properties added to the Object prototype
if (map.hasOwnProperty(srcChar)) {
replaceChar = map[srcChar]
} else {
replaceChar = srcChar;
}
result += replaceChar;
}
return(result);
}
// assign this here so it is only created once
Unaccent.map = {'À':'A','Á':'A','Â':'A'}; // you fill in the rest of the map
Working demo: http://jsfiddle.net/jfriend00/rRpcy/
FYI, a Google search for "accent folding" returns many other implementations (many similar, but also some using regex).
Here's a bit higher performance version (2.5x faster) that can do a direct indexed lookup of the accented characters rather than having to do an object lookup:
function Unaccent(str) {
var result = "", code, lookup, replaceChar;
for (var i = 0, len = str.length; i < len; i++) {
replaceChar = str.charAt(i);
code = str.charCodeAt(i);
// see if code is in our map
if (code >= 192 && code <= 255) {
lookup = Unaccent.map.charAt(code - 192);
if (lookup !== ' ') {
replaceChar = lookup;
}
}
result += replaceChar;
}
return(result);
}
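// covers chars from 192-255, 16 characters per row so the string is exactly 64 long
// blank means no mapping for that char
Unaccent.map =
"AAAAAAACEEEEIIII" + // 192-207
"DNOOOOO OUUUUY  " + // 208-223
"aaaaaaaceeeeiiii" + // 224-239
"onooooo ouuuuy y"; // 240-255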
Working demo: http://jsfiddle.net/jfriend00/Jxr9u/
In this jsperf, the string lookup version (the 2nd example) is about 2.5x faster.
Using an object as a map is a good idea, but given the number of characters you're replacing, it's probably a good idea to pre-initialize the object so that it doesn't have to be re-initialized each time the function gets run (assuming you're running the function more than once):
var Unaccent = (function () {
var charMap = {'À':'A','Á':'A','Â':'A','Ã':'A','Ä':'A' /** etc. **/};
return function (str) {
var i, modified = "", cur;
for(i = 0; i < str.length; i++) {
cur = str.charAt(i);
modified += (charMap[cur] || cur);
}
return modified;
};
}());
This will front-load the heavy lifting of the function to page load time (you can do some modifications to delay it until the first call to the function if you like). But it will take some of the processing time out of the actual function call.
It's possible some browsers will actually optimize this part anyway, so you might not see a benefit. But on older browsers (where performance is of greater concern), you'll probably see some benefit to pre-processing your character map.
You can prepare an object of key/value pairs and traverse it with jQuery's each.
Example:
function Unaccent(str) {
var replaceString = {'À':'A','Á':'A','Â':'A'}; // add more
$.each(replaceString, function(k, v) {
var regX = new RegExp(k, 'g');
str = str.replace(regX,v);
});
return str; // return the modified string to the caller
}
Working Demo
Good Luck !!
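In engines that support String.prototype.normalize (ES2015 and later), another approach is to decompose each accented letter into a base letter plus combining marks and then strip the marks. This is only a sketch under that assumption, and the name stripDiacritics is made up for illustration. It does not cover characters that have no Unicode decomposition, such as Æ, Ð, Ø, Þ or ß, so those would still need a lookup map like the ones above.

```javascript
// Hypothetical helper: remove combining diacritical marks after NFD decomposition.
function stripDiacritics(str) {
  return str
    .normalize("NFD")                 // e.g. "é" becomes "e" followed by U+0301
    .replace(/[\u0300-\u036f]/g, ""); // drop the combining marks
}

console.log(stripDiacritics("Crème Brûlée")); // Creme Brulee
```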
