How does the indexOf method on strings work in JavaScript?

I was doing a question on Codility and came across this problem, for which I wrote something like this:
function impact(s) {
let imp = 4; // max possible impact
for (let i = 0; i < s.length; i++) {
if (s[i] === 'A') return 1;
else if (s[i] === 'C') imp = Math.min(imp, 2);
else if (s[i] === 'G') imp = Math.min(imp, 3);
else if (s[i] === 'T') imp = Math.min(imp, 4);
}
return imp;
}
function solution(S, P, Q) {
const A = new Array(P.length);
for (let i = 0; i < P.length; i++) {
const s = S.slice(P[i], Q[i] + 1);
A[i] = impact(s);
}
return A;
}
And it failed all the performance tests.
Then I changed it to the following code, which I thought would be slower, but to my surprise it scored 100%:
function solution(S, P, Q) {
let A = []
for (let i = 0; i < P.length; i++) {
let s = S.slice(P[i], Q[i] + 1)
if (s.indexOf('A') > -1) A.push(1)
else if (s.indexOf('C') > -1) A.push(2)
else if (s.indexOf('G') > -1) A.push(3)
else if (s.indexOf('T') > -1) A.push(4)
}
return A
}
This made no sense to me, because I was using four indexOf calls, which should be slower than one linear iteration over the same string. But it's not.
So, how does String.indexOf() work, and why are four .indexOf calls so much faster than one iteration?

In your first solution you have two loops. The second loop is in impact. That second loop corresponds roughly to the four indexOf calls you have in the second solution.
One iteration of the second loop does at most 4 comparisons, and there are at most n iterations, so this makes at most 4n comparisons. The same can be said of the indexOf solution: each of the four indexOf calls may need to scan the whole string, which represents n comparisons, so that also amounts to a worst case of 4n comparisons.
The main difference, however, is that the scanning that indexOf performs is not implemented in JavaScript but in highly efficient pre-compiled code, while the first solution does this scanning with (slower) JavaScript code. As a rule of thumb, it is usually more efficient to use native String/Array methods (such as indexOf, slice, includes, ...) than to implement similar functionality with an explicit for loop.
Another thing to consider is that if there is an "A" in the data at position i, then the second solution will find it after about i comparisons (internal to the indexOf implementation), while the first solution will find it after up to 4i comparisons, because it also makes the comparisons for the other three letters during the same iterations in which it looks for an "A". This extra cost decreases when there is no "A" but there is a "C" somewhere, and so on.
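To make the comparison counts above concrete, here is a minimal sketch that counts character comparisons for both strategies. It is only a model: the helper names loopComparisons and scanComparisons are made up for illustration, and a real engine's indexOf does not necessarily compare one character at a time.

```javascript
// Hypothetical helpers for illustration; they model the two solutions,
// they do not measure real engine behaviour.
const LETTERS = ['A', 'C', 'G', 'T'];

// Comparisons made by the explicit loop with its if / else-if chain.
function loopComparisons(s) {
  let count = 0;
  for (const ch of s) {
    for (const letter of LETTERS) {
      count++;
      if (ch === letter) break; // the else-if chain stops at the first match
    }
    if (ch === 'A') return count; // the loop returns as soon as it sees an 'A'
  }
  return count;
}

// Comparisons made by up to four sequential scans (indexOf-style).
function scanComparisons(s) {
  let count = 0;
  for (const letter of LETTERS) {
    for (const ch of s) {
      count++;
      if (ch === letter) return count; // the first successful scan ends the chain
    }
  }
  return count;
}

console.log(loopComparisons('TTTA'), scanComparisons('TTTA'));
console.log(loopComparisons('TTTT'), scanComparisons('TTTT'));
```

In this model the two approaches stay within the same 4n bound, so the large difference seen in practice comes mostly from where the scanning runs (native code versus JavaScript), not from the number of comparisons.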

Related

How to improve performance of this Javascript/Cracking the code algorithm?

Here is the question below, with my answer to it. I know that because of the double nested for loop the efficiency is O(n^2), so I was wondering if there is a way to improve my algorithm/function's big O.
// Design an algorithm and write code to remove the duplicate characters in a string without using any additional buffer. NOTE: One or two additional variables are fine. An extra copy of the array is not.
function removeDuplicates(str) {
let arrayString = str.split("");
let alphabetArray = [["a", 0],["b",0],["c",0],["d",0],["e",0],["f",0],["g",0],["h",0],["i",0],["j",0],["k",0],["l",0],["m",0],["n",0],["o",0],["p",0],["q",0],["r",0],["s",0],["t",0],["u",0],["v",0],["w",0],["x",0],["y",0],["z",0]]
for (let i=0; i<arrayString.length; i++) {
findCharacter(arrayString[i].toLowerCase(), alphabetArray);
}
removeCharacter(arrayString, alphabetArray);
};
function findCharacter(character, array) {
for (let i=0; i<array.length; i++) {
if (array[i][0] === character) {
array[i][1]++;
}
}
}
function removeCharacter(arrString, arrAlphabet) {
let finalString = "";
for (let i=0; i<arrString.length; i++) {
for (let j=0; j<arrAlphabet.length; j++) {
if (arrAlphabet[j][1] < 2 && arrString[i].toLowerCase() == arrAlphabet[j][0]) {
finalString += arrString[i]
}
}
}
console.log("The string with removed duplicates is:", finalString)
}
removeDuplicates("Hippotamuus")
The ASCII/Unicode character codes of all letters of the same case are consecutive. This allows for an important optimization: You can find the index of a character in the character count array from its ASCII/Unicode character code. Specifically, the index of the character c in the character count array will be c.charCodeAt(0) - 'a'.charCodeAt(0). This allows you to look up and modify the character count in the array in O(1) time, which brings the algorithm run-time down to O(n).
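As a rough sketch of that idea, the 26-entry count array can be indexed directly by character code. The function name removeRepeatedLetters is an assumption for illustration; it mirrors the behaviour of the code in the question (letters that occur more than once are dropped entirely, and characters outside a-z are skipped) rather than prescribing what the exercise wants.

```javascript
// Illustrative sketch: O(1) count lookup via character codes.
function removeRepeatedLetters(str) {
  const base = 'a'.charCodeAt(0);
  const counts = new Array(26).fill(0);

  // First pass: count each letter, case-insensitively.
  for (const ch of str) {
    const idx = ch.toLowerCase().charCodeAt(0) - base;
    if (idx >= 0 && idx < 26) counts[idx]++;
  }

  // Second pass: keep only letters that occurred exactly once.
  let result = '';
  for (const ch of str) {
    const idx = ch.toLowerCase().charCodeAt(0) - base;
    if (idx >= 0 && idx < 26 && counts[idx] < 2) result += ch;
  }
  return result;
}

console.log(removeRepeatedLetters('Hippotamuus'));
```

Both passes are linear in the length of the string, and each lookup is a single array access.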
There's a little trick to "without using any additional buffer," although I don't see a way to improve on O(n^2) complexity without using a hash map to determine if a particular character has been seen. The trick is to traverse the input string buffer (assume it is a JavaScript array since strings in JavaScript are immutable) and overwrite the current character with the next unique character if the current character is a duplicate. Finally, mark the end of the resultant string with a null character.
Pseudocode:
i = 1
pointer = 1
while string[i]:
    if not seen(string[i]):
        string[pointer] = string[i]
        pointer = pointer + 1
    i = i + 1
mark string end at pointer
The function seen could either take O(n) time and O(1) space or O(1) time and O(|alphabet|) space if we use a hash map.
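A minimal JavaScript sketch of this pseudocode, assuming the input is an array of characters and using the O(n) time, O(1) space variant of seen (a scan over the already-kept prefix). The name dedupeInPlace is made up for illustration, and the array is truncated instead of being terminated with a null character.

```javascript
// Illustrative sketch: in-place de-duplication of a character array.
function dedupeInPlace(chars) {
  let pointer = 0; // next write position; chars[0..pointer-1] are unique
  for (let i = 0; i < chars.length; i++) {
    // "seen": scan the unique prefix for the current character.
    let seen = false;
    for (let j = 0; j < pointer; j++) {
      if (chars[j] === chars[i]) {
        seen = true;
        break;
      }
    }
    if (!seen) {
      chars[pointer] = chars[i];
      pointer++;
    }
  }
  chars.length = pointer; // "mark string end at pointer"
  return chars;
}

console.log(dedupeInPlace('bcabcc'.split('')).join(''));
```

This variant keeps the first occurrence of each character and runs in O(n^2) time in the worst case.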
Based on your description, I'm assuming the input is a string (which is immutable in JavaScript). I'm not sure what exactly "one or two additional variables" means, so based on your implementation I'm going to assume it's OK to use O(N) space. To improve the time complexity, I think implementations differ according to the requirements for the output string.
Assumption 1: the characters of the output string appear in the order of their first occurrence, e.g. "bcabcc" -> "bca".
Suppose the length of s is N; the following implementation uses O(N) space and O(N) time.
function removeDuplicates(s) {
const set = new Set(); // use set so that insertion and lookup time is o(1)
let res = "";
for (let i = 0; i < s.length; i++) {
if (!set.has(s[i])) {
set.add(s[i]);
res += s[i];
}
}
return res;
}
Assumption 2: the output string has to be in ascending order.
You may use quicksort to sort in place and then loop through the sorted array, adding each element that differs from the last-seen one to the result. Note that you may need to split the string into an array first. So the implementation would use O(N) space and the average time complexity would be O(N log N).
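A short sketch of this sort-then-scan approach. It assumes the built-in Array.prototype.sort is acceptable in place of a hand-written quicksort (the engine's sort algorithm may differ, and the default order is by UTF-16 code unit); the name uniqueSorted is made up for illustration.

```javascript
// Illustrative sketch: sort the characters, then skip repeats of the
// previous character while building the result.
function uniqueSorted(s) {
  const chars = s.split('').sort(); // default sort: UTF-16 code unit order
  let res = '';
  for (let i = 0; i < chars.length; i++) {
    if (i === 0 || chars[i] !== chars[i - 1]) {
      res += chars[i];
    }
  }
  return res;
}

console.log(uniqueSorted('bcabcc'));
```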
Assumption 3: the result is the smallest in lexicographical order among all possible results, e.g. "bcabcc" -> "abc".
The following implementation uses O(N) space and O(N) time.
const removeDuplicates = function(s) {
const stack = []; // stack and set are in sync
const set = new Set(); // use set to make lookup faster
const lastPos = getLastPos(s);
let curVal;
let lastOnStack;
for (let i = 0; i < s.length; i++) {
curVal = s[i];
if (!set.has(curVal)) {
while(stack.length > 0 && stack[stack.length - 1] > curVal && lastPos[stack[stack.length - 1]] > i) {
set.delete(stack[stack.length - 1]);
stack.pop();
}
set.add(curVal);
stack.push(curVal);
}
}
return stack.join('');
};
const getLastPos = (s) => {
// get the last index of each unique character
const lastPosMap = {};
for (let i = 0; i < s.length; i++) {
lastPosMap[s[i]] = i;
}
return lastPosMap;
}
I was unsure what was meant by:
...without using any additional buffer.
So I thought I would have a go at doing this in one loop, and let you tell me if it's wrong.
I have worked on the basis that the function you provided gives the correct output and that you were just looking for it to run faster. The function below gives the correct output and runs a lot faster on any large string with lots of duplication that I throw at it.
function removeDuplicates(originalString) {
let outputString = '';
let lastChar = '';
let lastCharOccurences = 1;
for (let char = 0; char < originalString.length; char++) {
outputString += originalString[char];
if (lastChar === originalString[char]) {
lastCharOccurences++;
continue;
}
if (lastCharOccurences > 1) {
outputString = outputString.slice(0, outputString.length - (lastCharOccurences + 1)) + originalString[char];
lastCharOccurences = 1;
}
lastChar = originalString[char];
}
console.log("The string with removed duplicates is:", outputString)
}
removeDuplicates("Hippotamuus")
Again, sorry if I have misunderstood the post...

Combine an array with other arrays, push each combination Javascript

I'm trying to take an array and compare each value of that array to the next value in the array. When I run my code, components that should match with more than one array only return one match, instead of all of them. I'm probably doing something wrong somewhere, but for the life of me I can't figure it out.
This is my code:
INPUT
minterms = [["4",[0,1,0,0]],
["8",[1,0,0,0]],
["9",[1,0,0,1]],
["10",[1,0,1,0]],
["12",[1,1,0,0]],
["11",[1,0,1,1]],
["14",[1,1,1,0]],
["15",[1,1,1,1]]];
Function
function combineMinterms(minterms) {
var match = 0;
var count;
var loc;
var newMin = [];
var newMiny = [];
var used = new Array(minterms.length);
//First Component
for (x = 0; x < minterms.length; x++) {
if(minterms[x][1][minterms[x][1].length - 1] == "*") {
newMin.push(minterms[x].slice());
continue;
};
//Second Component
for (y = x + 1; y < minterms.length; y++) {
count = 0;
//Compare each value
for (h = 0; h < minterms[x][1].length; h++) {
if (minterms[x][1][h] != minterms[y][1][h]) {
count++;
loc = h;
}
if (count >= 2) {break; };
}
//If only one difference, push to new
if (count === 1) {
newMin.push(minterms[x].slice());
newMiny = minterms[y].slice();
newMin[match][1][loc] = "-";
while(newMin[match][0].charAt(0) === 'd') {
newMin[match][0] = newMin[match][0].substr(1);
}
while(newMiny[0].charAt(0) === 'd') {
newMiny[0] = newMiny[0].substr(1);
}
newMin[match][0] += "," + newMiny[0];
used[x] = 1;
used[y] = 1;
match++;
continue;
}
}
//If never used, push to new
if(used[x] != 1) {
newMin.push(minterms[x].slice());
newMin[match][1].push("*");
match++;
}
}
return newMin;
}
Desired Output
newMin = [["4,12",[-,1,0,0]],
["8,9",[1,0,0,-]],
["8,10",[1,0,-,0]],
["8,12",[1,-,0,0]],
["9,11",[1,0,-,1]],
["10,11",[1,0,1,-]],
["10,14",[1,-,1,0]],
["12,14",[1,1,-,0]],
["11,15",[1,-,1,1]],
["14,15",[1,1,1,-]]];
It will combine term 8 with 9, but won't continue to combine term 8 with 10 and 12.
Thanks in advance for the help.
Array.prototype.slice performs a shallow copy.
Each entry in minterms is an array of a string and a nested array.
When you slice the entry, you get a new array with a copy of the string and a copy of the Array object reference. But that copy of the Array reference still points to the array contained in an element of minterms.
When you update the nested array
newMin[match][1][loc] = "-";
you are updating the nested array within the input. I haven't fully followed the logic of what you are doing, but I believe this is the problem; the solution is to clone the nested array as well when cloning an input array element.
A secondary issue you will probably wish to fix is that not all variables were declared: var x,y,h; or equivalent inline declarations are missing.
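The shallow-copy behaviour described above can be seen in a few lines. This is a small sketch using one entry shaped like those in minterms; the helper name cloneMinterm is hypothetical and only illustrates copying the nested array as well.

```javascript
// A shallow copy shares the nested array with the original entry.
const entry = ["8", [1, 0, 0, 0]];
const shallow = entry.slice();
shallow[1][3] = "-";
console.log(entry[1][3]); // the original's nested array was modified too

// Hypothetical helper: copy the outer array AND the nested array.
function cloneMinterm(minterm) {
  return [minterm[0], minterm[1].slice()];
}

const fresh = ["9", [1, 0, 0, 1]];
const deep = cloneMinterm(fresh);
deep[1][3] = "-";
console.log(fresh[1][3]); // the original's nested array is left untouched
```

One level of extra copying is enough here because the nested array contains only primitive values.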
let minterms = [4,8,9,10,12,11,14,15];
let newMin = [];
minterms.map((value, index) =>{
minterms.reduce((accumulator, currentValue, currentIndex, array) => {
accumulator = value;
let out = (accumulator ^ currentValue).toString(2);
if(out.split('').filter(n=>n==="1").length == 1) newMin.push([value, currentValue]);
}, value);
});
console.log(newMin);
There is a better approach (in 10 lines of code). Since you're working with binary representations, you might want to consider using bitwise operators. Coupled with array methods, they make most of this straightforward.
For instance:
A match means that only a single bit differs between two binary numbers.
The bitwise XOR operator returns 1 for each bit that doesn't match. So:
0100 XOR 1100 results in 1000
Now we need to count the number of '1' digits in the binary number returned. We can use the length property of an array to do this. To turn 1000 into an array, we first turn the binary number into a string.
The binary representation of an integer num is easily retrieved with:
num.toString(2)
So if num === 4, the output above is the string "100" (leading zeros are not included).
Now we use str.split('') to turn the string into an array of characters, and remove everything from the array that is not a '1'. Then simply read the length property: if the length === 1, it is a match.
I put together a solution like this for you. It is close to your use case. I didn't use the dash style in the output because that was not part of your question.
https://jsbin.com/xezuwax/edit?js,console
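The steps described above can be sketched as follows. This is not the code behind the link; the function names differsByOneBit and combinePairs are assumptions for illustration, and the terms are assumed to be non-negative integers.

```javascript
// True when a and b differ in exactly one bit position.
function differsByOneBit(a, b) {
  const diff = (a ^ b).toString(2); // XOR marks the differing bits with '1'
  return diff.split('').filter((bit) => bit === '1').length === 1;
}

// Collect every pair (earlier term, later term) that differs by one bit.
function combinePairs(terms) {
  const pairs = [];
  for (let i = 0; i < terms.length; i++) {
    for (let j = i + 1; j < terms.length; j++) {
      if (differsByOneBit(terms[i], terms[j])) {
        pairs.push([terms[i], terms[j]]);
      }
    }
  }
  return pairs;
}

console.log(combinePairs([4, 8, 9, 10, 12, 11, 14, 15]));
```

Starting the inner loop at i + 1 reports each pair once, so term 8 is combined with 9, 10 and 12 without also reporting the reversed pairs.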

Longest Common Subsequence, required length on contiguous parts

I think I have enough grasp of the LCS algorithm from this page. Specifically this pseudo-code implementation (m and n are the lengths of A and B):
int lcs_length(char * A, char * B) {
allocate storage for array L;
for (i = m; i >= 0; i--)
for (j = n; j >= 0; j--) {
if (A[i] == '\0' || B[j] == '\0') L[i,j] = 0;
else if (A[i] == B[j]) L[i,j] = 1 + L[i+1, j+1];
else L[i,j] = max(L[i+1, j], L[i, j+1]);
}
return L[0,0];
}
The L array is later backtracked to find the specific subsequence like so:
sequence S = empty;
i = 0;
j = 0;
while (i < m && j < n) {
if (A[i]==B[j]) {
add A[i] to end of S;
i++; j++;
}
else if (L[i+1,j] >= L[i,j+1]) i++;
else j++;
}
I have yet to rewrite this into JavaScript, but for now I know that the implementation at Rosetta Code works just fine. So to my questions:
1. How do I modify the algorithm to only return the longest common subsequence where the parts of the sequence are of a given minimum length?
For example, "thisisatest" and "thimplestesting" returns "thistest", with the contiguous parts "thi", "s" and "test". Let's define 'limit' as a minimum requirement of contiguous characters for it to be added to the result. With a limit of 3 the result would be "thitest" and with a limit of 4 the result would be "test". For my uses I would like to not only get the length, but the actual sequence and its indices in the first string. It doesn't matter if that needs to be backtracked later or not.
2. Would such a modification reduce the complexity or increase it?
From what I understand, analysing the entire suffix tree might be a way to find a subsequence that fits a limit? If that is correct, is it significantly more complex than the original algorithm?
3. Can you optimize the LCS algorithm, modified or not, with the knowledge that the same source string is compared to a huge amount of target strings?
Currently I'm just iterating through the target strings finding the LCS and selecting the string with the longest subsequence. Is there any significant preprocessing that could be done on the source string to reduce the time?
Answers to any of my questions are welcome, or just hints on where to research further.
Thank you for your time! :)
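For reference, the pseudo-code above translates fairly directly to JavaScript. The following is a minimal sketch of the unmodified algorithm only (it does not implement the minimum-length limit); the function name lcs and the shape of the returned object are assumptions made for illustration.

```javascript
// Illustrative sketch of the plain LCS table plus backtracking.
function lcs(a, b) {
  const m = a.length;
  const n = b.length;

  // L[i][j] = LCS length of a.slice(i) and b.slice(j).
  // The extra row and column of zeros play the role of the '\0' checks.
  const L = Array.from({ length: m + 1 }, () => new Array(n + 1).fill(0));
  for (let i = m - 1; i >= 0; i--) {
    for (let j = n - 1; j >= 0; j--) {
      L[i][j] = a[i] === b[j]
        ? 1 + L[i + 1][j + 1]
        : Math.max(L[i + 1][j], L[i][j + 1]);
    }
  }

  // Backtrack from the top-left corner to recover one subsequence
  // and the indices of its characters in the first string.
  let i = 0;
  let j = 0;
  let sequence = '';
  const indices = [];
  while (i < m && j < n) {
    if (a[i] === b[j]) {
      sequence += a[i];
      indices.push(i);
      i++;
      j++;
    } else if (L[i + 1][j] >= L[i][j + 1]) {
      i++;
    } else {
      j++;
    }
  }
  return { length: L[0][0], sequence, indices };
}

console.log(lcs('thisisatest', 'thimplestesting'));
```

Both the table fill and its storage are O(m·n), which is the baseline any modified version would be compared against.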

Most efficient way to generate a really long string (tens of megabytes) in JS

I find myself needing to synthesize a ridiculously long string (like, tens of megabytes long) in JavaScript. (This is to slow down a CSS selector-matching operation to the point where it takes a measurable amount of time.)
The best way I've found to do this is
var really_long_string = (new Array(10*1024*1024)).join("x");
but I'm wondering if there's a more efficient way - one that doesn't involve creating a tens-of-megabytes array first.
For ES6:
'x'.repeat(10*1024*1024)
The previously accepted version uses String.prototype.concat(), which is vastly slower than the optimized string concatenation operator, +. MDN also recommends avoiding it in speed-critical code.
I have made three versions of the above code to show the speed differences in a JsPerf. The version using only concat is only a third as fast as the one using only the string concatenation operator (in Chrome; your mileage will vary). The edited version below runs twice as fast in Chrome:
var x = '1234567890'
var iterations = 14
for (var i = 0; i < iterations; i++) {
x += x + x
}
This is a more efficient algorithm for generating very long strings in JavaScript:
function stringRepeat(str, num) {
num = Number(num);
var result = '';
while (true) {
if (num & 1) { // (1)
result += str;
}
num >>>= 1; // (2)
if (num <= 0) break;
str += str;
}
return result;
}
More info here: http://www.2ality.com/2014/01/efficient-string-repeat.html.
Alternatively, in ES6 you can use the String.prototype.repeat() method.
Simply accumulating is vastly faster in Safari 5:
var x = "1234567890";
var iterations = 14;
for (var i = 0; i < iterations; i++) {
x += x.concat(x);
}
alert(x.length); // 47829690
Essentially, you'll get x.length * 3^iterations characters.
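The growth factor can be checked on a smaller scale: each pass replaces x with x + x + x, so the length triples per iteration. A minimal sketch, using a made-up helper name tripleRepeatedly and the plain + operator:

```javascript
// Illustrative sketch: the same accumulation loop wrapped in a function.
function tripleRepeatedly(seed, iterations) {
  let x = seed;
  for (let i = 0; i < iterations; i++) {
    x += x + x; // x becomes three copies of itself
  }
  return x;
}

const small = tripleRepeatedly('1234567890', 5);
console.log(small.length === 10 * Math.pow(3, 5)); // true
```

With 14 iterations the same formula gives 10 * 3^14 = 47829690 characters, matching the alert above.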
Not sure if this is a great implementation, but here's a general function based on @oligofren's solution:
function repeat(ch, len) {
var result = ch;
var halfLength = len / 2;
while (result.length < len) {
if (result.length <= halfLength) {
result += result;
} else {
return result + repeat(ch, len - result.length);
}
}
return result;
}
This assumes that concatenating a large string is faster than a series of small strings.

Javascript regular expressions problem

I am creating a small Yahtzee game and I have run into some regex problems. I need to verify whether certain criteria are met. The fields one to six are very straightforward; the problem comes after that, for example when trying to create a regex that matches the ladder (the straight). The straight should contain each of the characters 1-5. It must contain exactly one of each to pass, but I can't figure out how to check for it. I was thinking /1{1}2{1}3{1}4{1}5{1}/g; but that only matches if they come in order. How can I check when they don't come in the correct order?
If I understood you right, you want to check if a string contains the numbers from 1 to 5 in random order. If that is correct, then you can use:
var s = '25143';
var valid = s.match(/^[1-5]{5}$/);
for (var i=1; i<=5; i++) {
if (!s.match(i.toString())) valid = false;
}
Or:
var s = '25143';
var valid = s.split('').sort().join('').match(/^12345$/);
Although this definitely can be solved with regular expressions, I find it quite interesting and educational to provide a "pure" solution based on simple arithmetic. It goes like this:
function yahtzee(comb) {
if(comb.length != 5) return null;
var map = [0, 0, 0, 0, 0, 0];
for(var i = 0; i < comb.length; i++) {
var digit = comb.charCodeAt(i) - 48;
if(digit < 1 || digit > 6) return null;
map[digit - 1]++;
}
var sum = 0, p = 0, seq = 0;
for(var i = 0; i < map.length; i++) {
if(map[i] == 2) sum += 20;
if(map[i] >= 3) sum += map[i];
p = map[i] ? p + 1 : 0;
if(p > seq) seq = p;
}
if(sum == 5) return "Yahtzee";
if(sum == 23) return "Full House";
if(sum == 3) return "Three-Of-A-Kind";
if(sum == 4) return "Four-Of-A-Kind";
if(seq == 5) return "Large Straight";
if(seq == 4) return "Small Straight";
return "Chance";
}
For reference: Yahtzee rules
For simplicity, I'd go with indexOf.
string.indexOf(searchstring, start)
Loop from 1 to 5 like Max, but just check indexOf for i and break out on any miss.
This will also help for the small straight, which is only four in a row (1234, 2345 or 3456).
You could even have a generic function to check for straights of an arbitrary length, passing in the maximum loop index as well as the string to check.
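One possible shape for such a generic function, as a sketch only: the name hasStraight and its parameters are assumptions, the dice are assumed to be passed as a string of digits 1-6, and a run is accepted if every face in some window of the requested length is present.

```javascript
// Illustrative sketch: does the roll contain `length` consecutive faces?
function hasStraight(dice, length) {
  // Try every possible starting face: 1..(6 - length + 1).
  for (let start = 1; start + length - 1 <= 6; start++) {
    let complete = true;
    for (let face = start; face < start + length; face++) {
      if (dice.indexOf(String(face)) === -1) {
        complete = false; // a face of this run is missing
        break;
      }
    }
    if (complete) return true;
  }
  return false;
}

console.log(hasStraight('25143', 5)); // large straight
console.log(hasStraight('12346', 4)); // small straight
```

Because indexOf only asks whether a face is present, duplicates elsewhere in the roll (for example '11234') do not prevent a small straight from being detected.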
"12543".split('').sort().join('') == '12345'
With regex:
return /^([1-5])(?!\1)([1-5])(?!\1|\2)([1-5])(?!\1|\2|\3)([1-5])(?!\1|\2|\3|\4)[1-5]$/.test("15243");
(Not that it's recommended...)
A regexp is likely not the best solution for this problem, but for fun:
/^(?=.*1)(?=.*2)(?=.*3)(?=.*4)(?=.*5).{5}$/.test("12354")
That matches every string that contains exactly five characters, being the numbers 1-5, with one of each.
(?=.*1) is a positive lookahead, essentially saying "somewhere to the right of here, there should be anything or nothing, followed by 1".
Lookaheads don't "consume" any part of the input string, so each number check starts at the beginning of the string.
Then there's .{5} to actually consume the five characters, to make sure there's the right number of them.
