Sort all characters in a string - javascript

I'm trying to solve a problem where I need to sort the characters in a string.
Problem:
Sort an array of characters (ASCII only, not UTF8).
Input: A string of characters, like a full English sentence, delimited by a newline or NULL. Duplicates are okay.
eg: This is easy
Output: A string of characters, in sorted order of their ASCII values. You can overwrite the existing array.
eg: Taehiisssy
Solution Complexity: Aim for linear time and constant additional space.
I know that in JavaScript you can do something like
const sorted = str.split('').sort().join('')
But this would be O(n log n), not linear (plus O(n) extra space for split).
EDIT: I'm trying to see whether the charCodeAt(i) method could help here.
But in constant space, how would we sort an array of characters?

Go through the string character by character and build up a count for each character.
const s = "This is easy";
// Create an array which will hold the counts of each character, from 0 to 255 (although strictly speaking ASCII is only up to 127)
let count = Array(256).fill(0);
// Look at each character in the input and increment the count for that character in the array.
for (let i = 0; i < s.length; i++) {
  const c = s.charCodeAt(i);
  count[c]++;
}
let out = "";
// Now scan through the character count array ...
for (let i = 0; i <= 255; i++) {
  // And for each character, e.g. "T", append it as many times as you saw it in the input
  for (let rep = 0; rep < count[i]; rep++) {
    out += String.fromCharCode(i);
  }
}
console.log(out);
This only uses a constant table size, 256 numbers long (or whatever number of different symbols you wish to allow).
And the time it takes is linearly dependent on the number of characters in the input string (assuming almost no time is spent on the inner FOR loop when the count is zero for that character).
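Since JavaScript strings are immutable, "overwriting the existing array" only makes sense once the characters live in an array. Here is a minimal sketch of the same counting idea applied in place to an array of character codes. The helper name sortAsciiCodesInPlace is made up for illustration, and it assumes every code is in the ASCII range 0 to 127.

```javascript
// Hypothetical helper: sorts an array of ASCII codes (0-127) in place
// using a fixed-size count table. Codes outside 0-127 are not handled.
function sortAsciiCodesInPlace(codes) {
  const counts = new Array(128).fill(0); // table size does not depend on input length
  for (let i = 0; i < codes.length; i++) {
    counts[codes[i]]++;
  }
  let write = 0;
  // Overwrite the input with each code repeated as often as it was seen
  for (let code = 0; code < counts.length; code++) {
    for (let rep = 0; rep < counts[code]; rep++) {
      codes[write++] = code;
    }
  }
  return codes;
}

// Usage: convert the string to codes once, sort in place, convert back.
const input = "This is easy";
const codes = Uint8Array.from(input, (ch) => ch.charCodeAt(0));
sortAsciiCodesInPlace(codes);
const sortedText = String.fromCharCode(...codes);
console.log(JSON.stringify(sortedText)); // "  Taehiisssy"
```

Note that the two spaces sort first because their code (32) is lower than any letter. The conversion to and from a string still costs O(n) space, so the constant-space claim only applies to the sort itself.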

Related

Space complexity of finding non-repeating character in string

Here is a simple algorithm exercise. The problem is to return the first non-repeating character. For example, I have this string: 'abbbcdd' and the answer is 'a' because 'a' appears before 'c'. In case it doesn't find any non-repeating characters, it will return '_'.
My solution works correctly, but my question is about the performance. The problem statement says: "Write a solution that only iterates over the string once and uses O(1) additional memory."
Here is my code:
console.log(solution('abbbcdd'))
function solution(str) {
let chars = buildCharMap(str)
for (let i in chars) {
if (chars[i] === 1) {
return i
}
}
return '_'
}
function buildCharMap(str) {
const charMap = {}
for (let i = 0; i < str.length; i++) {
!charMap[str[i]] ? charMap[str[i]] = 1 : charMap[str[i]]++
}
return charMap
}
Does my answer meet the requirement for space complexity?
The time complexity is straightforward: you have a loop over a string of length n, and another loop over an object with strictly at most n keys. The operations inside the loops take O(1) time, and the loops are consecutive (not nested), so the running time is O(n).
The space complexity is slightly more subtle. If the input were a list of numbers instead of a string, for example, then we could straightforwardly say that charMap takes O(n) space in the worst case, because all of the numbers in the list might be different. However, for problems on strings we have to be aware that there is a limited alphabet of characters which those strings could be formed of. If that alphabet has size a, then your charMap object can have at most a keys, so the space complexity is O(min(a, n)).
That alphabet is often explicit in the problem - for example, if the input is guaranteed to contain only lowercase letters, or only letters and digits. Otherwise, it may be implicit in the fact that strings are formed of Unicode characters (or in older languages, ASCII characters). In the former case, a = 26 or 62. In the latter case, a = 65,536 or 1,112,064 depending on if we're counting code units or code points, because Javascript strings are encoded as UTF-16. Either way, if a is a constant, then O(a) space is O(1) space - although it could be quite a large constant.
That means that in practice, your algorithm does use O(1) space. In theory, it uses O(1) space if the problem statement specifies a fixed alphabet, and O(min(a, n)) space otherwise; not O(n) space. Assuming the former, then your solution does meet the space-complexity requirement of the problem.
This raises the question of why, when analysing algorithms on lists of numbers, we don't likewise say that Javascript numbers have a finite "alphabet" defined by the IEEE 754 specification for floating point numbers. The answer is a bit philosophical; we analyse running time and auxiliary space using abstract models of computation which generally assume numbers, lists and other data structures don't have a fixed limit on their size. But even in those models, we assume strings are formed from some alphabet, and if the alphabet isn't fixed in the problem then we let the alphabet size be a variable a which we assume is independent of n. This is a sensible way to analyse algorithms on strings, because alphabet size and string length are independent in the problems we're usually interested in.
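To make the fixed-alphabet case concrete, here is a rough sketch assuming the input is guaranteed to contain only lowercase letters a-z (so a = 26). The function name firstNonRepeatingLowercase is invented for this example. The two tables have 26 entries each no matter how long the string is, which is what O(1) auxiliary space means here.

```javascript
// Assumption: str contains only lowercase letters a-z (alphabet size 26).
function firstNonRepeatingLowercase(str) {
  const ALPHABET_SIZE = 26;
  const counts = new Array(ALPHABET_SIZE).fill(0);      // occurrences per letter
  const firstIndex = new Array(ALPHABET_SIZE).fill(-1); // first position per letter

  // Single pass over the string
  for (let i = 0; i < str.length; i++) {
    const k = str.charCodeAt(i) - 97; // 'a' has code 97
    if (counts[k] === 0) firstIndex[k] = i;
    counts[k]++;
  }

  // Pass over the alphabet (26 steps, independent of str.length)
  let best = -1;
  for (let k = 0; k < ALPHABET_SIZE; k++) {
    if (counts[k] === 1 && (best === -1 || firstIndex[k] < best)) {
      best = firstIndex[k];
    }
  }
  return best === -1 ? '_' : str[best];
}

console.log(firstNonRepeatingLowercase('abbbcdd')); // a
console.log(firstNonRepeatingLowercase('aabb'));    // _
```

Tracking the first index explicitly also avoids relying on the iteration order of object keys.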

How is this O(1) space and not O(n) space. firstNotRepeatingCharacter Challenge solution

I am having trouble understanding how the following solution is O(1) space and not O(n) space. The coding challenge is as follows:
Write a solution that only iterates over the string once and uses O(1) additional memory, since this is what you would be asked to do during a real interview.
Given a string s, find and return the first instance of a non-repeating character in it. If there is no such character then return '_'.
The following is a solution that is O(1) space.
function firstNotRepeatingCharacters(s: string) : string {
const chars: string[] = s.split('');
let duplicates = {};
let answer = '_';
let indexAnswer = Number.MAX_SAFE_INTEGER;
chars.forEach((element, index) => {
if(!duplicates.hasOwnProperty(element)) {
duplicates[element] = {
count: 1,
index
}
} else {
duplicates[element].count++;
duplicates[element].index = index;
}
});
for(const key in duplicates) {
if(duplicates[key].count === 1 && duplicates[key].index < indexAnswer) {
answer = key;
indexAnswer = duplicates[key].index;
}
}
return answer;
}
console.log(firstNotRepeatingCharacters('abacabad'));
console.log(firstNotRepeatingCharacters('abacabaabacaba'));
I do not understand how the above solution is O(1) space. Since we are iterating through our array, we are mapping each element to an entry in an object (duplicates). I would think this would be considered O(n). Could somebody clarify how this is O(1) for me? Thanks.
The memory usage is proportional to the number of distinct characters in the string. The number of distinct characters has an upper limit of 52 (or some other finite value) and the potential memory usage does not increase as n increases once each of the distinct characters has been seen.
Thus, there exists an upper limit on the memory usage that is constant (does not depend on n), so the memory usage is O(1).
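A quick way to see this plateau is to count distinct characters for inputs of growing length. This is only an illustrative sketch: the 52-letter alphabet and the helper name countDistinctChars are assumptions made for the demo.

```javascript
// Hypothetical helper: number of distinct characters, i.e. how many keys
// an object like `duplicates` would end up holding.
function countDistinctChars(str) {
  const seen = new Set();
  for (const ch of str) seen.add(ch);
  return seen.size;
}

// Assumed alphabet: 52 upper- and lowercase English letters
const alphabet = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

for (const n of [10, 52, 1000, 100000]) {
  // Build a string of length n by cycling through the alphabet
  const s = alphabet.repeat(Math.ceil(n / alphabet.length)).slice(0, n);
  console.log(n, countDistinctChars(s));
}
// prints: 10 10, then 52 52, then 1000 52, then 100000 52
```

The count grows with n only until every letter has appeared, then stays at 52.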
Indeed this is O(1) complexity, but only in terms of space, since we have an upper limit. This limit could be the UTF-16 range, or it could be the number of English letters.
This is a constraint given by the developer. That said, it's only O(1) in space if the code above runs on a finite set of possible characters.
A String is limited by the implementation to a 64 bit character "array". So the storage capacity of a "String" type is generally 2147483647 (2^31 - 1) characters. That's not really what O(1) represents. So virtually that's O(N) in space.
Now the situation here is totally different for time complexity. In the optimal scenario it should be O(N) + O(N - E) + O(N).
Explaining:
1. The first O(N): the first loop goes through all the elements.
2. The second O(N) is about the deletion. The code deletes elements from the array.
3. O(N - E): the second forEach loops over the final popped array, so we have a constant E.
And that's supposing that the data structure is an Array.
There's a lot to dig into here.
TL;DR
It's not O(1).
The algorithm has O(min(a,n)) space complexity (where a is the number of characters available in the text encoding, e.g. a > 1M for Unicode text encoded as UTF-8). In the worst case, a string of unique characters (in this case n <= a), e.g. abcdefgh, the duplicates object has the same number of keys as the input string has letters, and in that case the amount of memory used clearly depends on n.
The O(1) bound only holds when the string consists of a single repeated letter, e.g. aaaaaaa.
Bonus: Your code can be "compressed" in this way :)
function firstNotRepeatingCharacters(s, d={}, r="_") {
for(let i=0; i<s.length; i++) d[s[i]]=++d[s[i]]|0;
for(let i=s.length-1; i>=0; i--) if(!d[s[i]]) r=s[i];
return r;
}
console.log(firstNotRepeatingCharacters('abacabad'));
console.log(firstNotRepeatingCharacters('abacabaabacaba'));

How does this for loop and if statement work?

I'm trying to understand how the for loop and if statement in this function work. I found this code after googling the challenge; it is shorter than my initial attempt but gives the same result. My confusion is about the longest variable. It stores the longest lengths of the words greater than str.length(5) - or I may be wrong. For some reason I don't understand, the length of language (8) is not stored in the variable, although 5, 10 and 18 are.
function longestWord(str) {
str = str.split(" ");
var longest = 0;
var word = null;
for (var i = 0; i < str.length; i++) {
if (longest < str[i].length) {
console.log("str = " + str[i]);
longest = str[i].length;
console.log("longest = " + longest); //What happened to 8 for language?
word = str[i];
}
}
return word;
}
console.log(longestWord("Using the JavaScript language bademnostalgiastic"));
All this does is keep track of the longest word (and store its char count in longest). For each iteration, it tests whether the next string has more characters than the currently recorded longest string (determined by longest). If it does, it stores the new char count, as it is the new "winner of being the longest".
Here's what's happening here:
take a string and split it up into words (determined by spaces)
at this point you have a string array of all the individual strings divided by " "
loop through all of the strings in the array
if the current string that you are iterating through has character counter more than any other previous ones, then store this current character count in the variable longest
continue the loop and use the above logic in the previous point
So at the end of this all you have the actual string (stored in word) and the character count (stored in longest) of the word with the most characters.
JavaScript is 10 characters long and is tested before language, so the if test fails and it is skipped.
It stores the longest lengths of the words greater than str.length(5)
No. It stores the longest length seen so far. It gets 5 when Using is tested, but that is quickly overwritten.
The length of "JavaScript" is 10, which is longer than "language". Since "JavaScript" comes first, "language" won't be longer than the longest, so the if statement will result in false.
array str[] = {Using, the, JavaScript, language, bademnostalgiastic}
Iteration 1
str[i]=Using
str[i].length=5 (a)
longest=0 (b)
since (a)>(b)
longest=5
word=Using
Iteration 2
str[i]=the
str[i].length=3 (a)
longest=5 (b)
since (a)<(b)
longest and word remain the same
so, longest=5
and, word=Using
Iteration 3
str[i]=JavaScript
str[i].length=10 (a)
longest=5 (b)
since (a)>(b)
so, longest=10
and, word=JavaScript
Iteration 4
str[i]=language
str[i].length=8 (a)
longest=10 (b)
since (a)<(b)
longest and word remain the same
so, longest=10
and, word=JavaScript
Iteration 5
str[i]=bademnostalgiastic
str[i].length=18 (a)
longest=10 (b)
since (a)>(b)
so, longest=18
and, word=bademnostalgiastic
END OF LOOP
so the longest word is bademnostalgiastic
Here's a breakdown:
str = str.split(" ");
This is making an array of strings split on spaces.
for (var i = 0; i < str.length; i++)
We're starting here with i (the iterator variable) at 0. We're going to keep doing this loop while i is less than the length of str. We're going to increase i by 1 each time we go through this loop.
if (longest < str[i].length)
Here we check if the longest we've saved is less than the string's length we're looking at.
longest = str[i].length;
Here we record the new longest length, because this word is longer than anything seen so far.
word = str[i];
We also save the word, likely so we can use it later.
return word;
After the loop ends, we're going to send word back as the result.
console.log(longestWord("Using the JavaScript language bademnostalgiastic"));
This is your call and print statement.
The reason you're seeing 5, 10, and 18, is because you're only printing out values when the value is bigger than what you've already seen.
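To see this directly, here is a small sketch of the same loop that records every value assigned to longest instead of logging it. The function name longestWordTrace and the updates array are made up for this illustration.

```javascript
// Same logic as longestWord, but collects each value that `longest` takes.
function longestWordTrace(sentence) {
  const words = sentence.split(" ");
  let longest = 0;
  let word = null;
  const updates = []; // every length that replaced the previous maximum
  for (const w of words) {
    if (longest < w.length) {
      longest = w.length;
      word = w;
      updates.push(longest);
    }
  }
  return { word, updates };
}

const result = longestWordTrace("Using the JavaScript language bademnostalgiastic");
console.log(result.updates); // [ 5, 10, 18 ]
console.log(result.word);    // bademnostalgiastic
```

The 8 for "language" never shows up because 10 was already stored when that word was tested.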

Calculate real length of a string, like we do with the caret

What I want is to calculate how many times the caret moves when going from the beginning to the end of the string.
Explanations:
Look this string "" in this fiddle: http://jsfiddle.net/RFuQ3/
If you put the caret before the first quote then push the right arrow ► you will push 3 times to arrive after the second quote (instead of 2 times for an empty string).
The first way, and the easiest to calculate the length of a string is <string>.length.
But here, it returns 2.
The second way, from JavaScript Get real length of a string (without entities) gives 2 too.
How can I get 1?
1-I thought of putting the string in a text input and then doing a while loop with a try{setCaret}catch(){}
2-It's just for fun
The character in your question "󠀁" is the
Unicode Character 'LANGUAGE TAG' (U+E0001).
From the following Stack Overflow questions,
" Expressing UTF-16 unicode characters in JavaScript"
" How can I tell if a string contains multibyte characters in Javascript?"
we learn that
JavaScript strings are UCS-2 encoded but can represent Unicode code points outside the Basic Multilingual Plane (U+0000-U+D7FF and U+E000-U+FFFF) using two 16 bit numbers (a UTF-16 surrogate pair), the first of which must be in the range U+D800-U+DBFF and the second in the range U+DC00-U+DFFF.
The UTF-16 surrogate pair representing "󠀁" is U+DB40 and U+DC01. In decimal U+DB40 is 56128, and U+DC01 is 56321.
console.log("󠀁".length); // 2
console.log("󠀁".charCodeAt(0)); // 56128
console.log("󠀁".charCodeAt(1)); // 56321
console.log("\uDB40\uDC01" === "󠀁"); // true
console.log(String.fromCharCode(0xDB40, 0xDC01) === "󠀁"); // true
Adapting the code from https://stackoverflow.com/a/4885062/788324, we just need to count the number of code points to arrive at the correct answer:
var getNumCodePoints = function(str) {
var numCodePoints = 0;
for (var i = 0; i < str.length; i++) {
var charCode = str.charCodeAt(i);
if ((charCode & 0xF800) == 0xD800) {
i++;
}
numCodePoints++;
}
return numCodePoints;
};
console.log(getNumCodePoints("󠀁")); // 1
jsFiddle Demo
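In engines that support ES2015, string iteration already walks code points rather than 16 bit code units, so the same count can be sketched without any manual surrogate check. The name countCodePoints is just for illustration, and the character is written as the escape \u{E0001} so that it is visible in the source.

```javascript
// Counts code points by relying on the string iterator (ES2015+),
// which yields a surrogate pair as a single element.
function countCodePoints(str) {
  let count = 0;
  for (const _ch of str) count++;
  return count;
}

const languageTag = "\u{E0001}"; // LANGUAGE TAG, outside the BMP
console.log(languageTag.length);                       // 2
console.log(countCodePoints(languageTag));             // 1
console.log(countCodePoints("a" + languageTag + "b")); // 3
```

Keep in mind that code points are still not the same thing as caret positions in every case: a base letter followed by a combining mark is two code points but is usually one caret stop.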
function realLength(str) {
var i = 1;
while (str.substring(i,i+1) != "") i++;
return (i-1);
}
Didn't try the code, but it should work I think.
Javascript doesn't really support unicode.
You can try
yourstring.replace(/[\uD800-\uDFFF]{2}/g, "0").length
for what it's worth

Regex for extracting separate letters in a loop with javascript

I'm working on a script to create metrics for online author identification. One of the things I came across in the literature is to count the frequency of each letter (how many a's, how many b's, etc.) independent of upper or lower case. Since I don't want to write a separate statement for each letter, I'm trying to do it in a loop, but I can't figure it out. The best I have been able to come up with is converting the ASCII letter code into hex, and then...hopefully a miracle happens.
So far, I've got
element = id.toLowerCase();
var hex = 0;
for (k=97; k<122; k++){
hex = k.toString(16); //gets me to hex
letter = element.replace(/[^\hex]/g, "")//remove everything but the current letter I'm looking for
return letter.length // the length of the resulting string is how many times the letter came up
}
but of course, when I do that, it interprets hex as the letters h e x, not the hex code for the letter I want.
Not sure why you'd want to convert to hex, but you could loop through the string's characters and keep track of how many times each one has appeared with an object used as a hash:
var element = id.toLowerCase();
var keys = {};
for(var i = 0, len = element.length; i<len; i++) {
if(keys[element.charAt(i)]) keys[element.charAt(i)]++;
else keys[element.charAt(i)] = 1;
}
You could use an array to do the same thing but a hash is faster.
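If only the 26 English letters matter, the array variant could look roughly like this. It is a sketch under that assumption: the function name letterFrequencies is invented here, and anything that is not a-z after lowercasing (digits, punctuation, accented letters) is simply ignored.

```javascript
// Counts a-z case-insensitively; all other characters are skipped.
function letterFrequencies(text) {
  const counts = new Array(26).fill(0);
  const lower = text.toLowerCase();
  for (let i = 0; i < lower.length; i++) {
    const k = lower.charCodeAt(i) - 97; // 'a' has code 97
    if (k >= 0 && k < 26) counts[k]++;
  }
  // Convert to an object keyed by letter so every letter is present, even with count 0
  const result = {};
  for (let k = 0; k < 26; k++) {
    result[String.fromCharCode(97 + k)] = counts[k];
  }
  return result;
}

const freq = letterFrequencies("Hello, World!");
console.log(freq.l, freq.o, freq.z); // 3 2 0
```

Unlike the hash built on the fly, this always reports all 26 letters, which can be convenient when comparing frequency profiles between authors.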
