Append array element only if it is not already there in Javascript - javascript

I need to add an element to an array only if it is not already there in Javascript. Basically I'm treating the array as a set.
I need the data to be stored in an array, otherwise I'd just use an object which can be used as a set.
I wrote the following array prototype and wanted to hear if anyone knew of a better way. This is an O(n) insert. I was hoping to do O(ln(n)) insert, however, I didn't see an easy way to insert an element into a sorted array. For my applications, the array lengths will be very small, but I'd still prefer something that obeyed accepted rules for good algorithm efficiency:
Array.prototype.push_if_not_duplicate = function(new_element){
for( var i=0; i<this.length; i++ ){
// Don't add if element is already found
if( this[i] == new_element ){
return this.length;
}
}
// add new element
return this.push(new_element);
}

If I understand correctly, you already have a sorted array (if you do not have a sorted array then you can use Array.sort method to sort your data) and now you want to add an element to it if it is not already present in the array. I extracted the binary insert (which uses binary search) method in the google closure library. The relevant code itself would look something like this and it is O(log n) operation because binary search is O(log n).
function binaryInsert(array, value) {
var index = binarySearch(array, value);
if (index < 0) {
array.splice(-(index + 1), 0, value);
return true;
}
return false;
};
function binarySearch(arr, value) {
var left = 0; // inclusive
var right = arr.length; // exclusive
var found;
while (left < right) {
var middle = (left + right) >> 1;
var compareResult = value > arr[middle] ? 1 : value < arr[middle] ? -1 : 0;
if (compareResult > 0) {
left = middle + 1;
} else {
right = middle;
// We are looking for the lowest index so we can't return immediately.
found = !compareResult;
}
}
// left is the index if found, or the insertion point otherwise.
// ~left is a shorthand for -left - 1.
return found ? left : ~left;
};
Usage is binaryInsert(array, value). This also maintains the sort of the array.

Deleted my other answer because I missed the fact that the array is sorted.
The algorithm you wrote goes through every element in the array and if there are no matches appends the new element on the end. I assume this means you are running another sort after.
The whole algorithm could be improved by using a divide and conquer algorithm. Choose an element in the middle of the array, compare with new element and continue until you find the spot where to insert. It will be slightly faster than your above algorithm, and won't require a sort afterwards.
If you need help working out the algorithm, feel free to ask.

I've created a (simple and incomplete) Set type before like this:
var Set = function (hashCodeGenerator) {
this.hashCode = hashCodeGenerator;
this.set = {};
this.elements = [];
};
Set.prototype = {
add: function (element) {
var hashCode = this.hashCode(element);
if (this.set[hashCode]) return false;
this.set[hashCode] = true;
this.elements.push(element);
return true;
},
get: function (element) {
var hashCode = this.hashCode(element);
return this.set[hashCode];
},
getElements: function () { return this.elements; }
};
You just need to find out a good hashCodeGenerator function for your objects. If your objects are primitives, this function can return the object itself. You can then access the set elements in array form from the getElements accessor. Inserts are O(1). Space requirements are O(2n).

If your array is a binary tree, you can insert in O(log n) by putting the new element on the end and bubbling it up into place. Checks for duplicates would also take O(log n) to perform.
Wikipedia has a great explanation.

Related

2 Sum algorithm explantion?

I am a noobie in JavaScript algorithm and cannot understand this optimal solution of the 2-sum
function twoNumberSum(array, target) {
const nums = {};
for (const num of array) {
const potentialMatch = target - num;
console.log('potential', potentialMatch);
if (potentialMatch in nums) {
return [potentialMatch, num]
} else {
nums[num] = true;
}
}
}
So the 2-sum problem basically says "find two numbers in an array that sum to the given target, and return their index". Let's walk through this code and talk about what's happening.
First, we start the function; I'm going to assume this makes sense (a function that's called twoNumberSum that takes in two arguments; namely, array and target) - note that in JS, we don't annotate types, so there is no return type
Now, first thing we do is create a new object called nums. In JS, objects are effectively hash maps (with some very important differences - see my note below); they store a key and a corresponding value. In JS, a key can be any string or number
Next, we start our iteration. If we do for (const a of b), and b is an array, this iterates over all the values of the array, with each iteration having that value stored in a.
Next, we subtract our current value from the target. Then comes the key line: if (potentialMatch in nums). The in keyword checks for the existence of a key: 'hello' in obj returns true if obj has the key 'hello'.
In this case, if we find this potential match, then that means we have found another number that is equal to target - num, which of course means we've found the other partner for our sum! So in this case, we simply return the two numbers. If, on the other hand, we do not find this potentialMatch, that means we need to keep looking. But we do want to remember we've seen this number - thus, we add it as a key by doing nums[num] = true (this creates a new key-value pair; namely the key is num and the value is true).
As one of the comments explained, this is just trying to keep track of a list of numbers; however, the author is trying to be clever by using a Hash Table instead of a normal array. This way, lookups are O(1) instead of O(n). For eyes not used to JS semantics, another way of explaining this code is that we build up a Map of the numbers, and then we check that map for our target value.
I mentioned earlier that using objects as hash tables isn't the best idea; this is because if you aren't careful, if you use user-provided keys, you can accidentally mess with what's called the Prototype Chain. This is beyond this discussion, but a better way forward would be to use a Set:
function twoNumberSum(array, target) {
// Create a new Hash Set. Sets take in an iterable, so we could
// Do it this way. But to remain as close to your original solution
// as possible, we won't for now, and instead populate it as we go
// const nums = new Set(array);
const nums = new Set();
for (const num of array) {
const potentialMatch = target - num;
if (nums.has(potentialMatch)) {
return [potentialMatch, num];
} else {
nums.add(num);
}
}
Sometimes, the problem instead asks for you to return the indices; using a Map instead makes this relatively trivial. Just store the index as the value and you're good to go!
function twoNumberSum(array, target) {
// Create the new map instead
const nums = new Map();
for (let n = 0; n < array.length; ++n) {
const potentialMatch = target - array[n];
if (nums.has(potentialMatch)) {
return [nums.get(potentialMatch), n];
} else {
nums.set(array[n], n);
}
}
Let me explain to you what it's all is working-.
function twoNumberSum(array, target) {
// This is and object in Javascript
const nums = {};
for (const num of array) { // This is for of loop which iterates the array.
//For of Doc - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/for...of
// Here's its calculating the potential.
const potentialMatch = target - num;
console.log('potential - ' + potentialMatch);
/**
* Nowhere `in` is used which check if any property exists in an object or not.
* in Usage - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/in
*
* It checks whether potential exists in the `nums` object, If exist it returns the array
* with potentialMatch and num to which it is matched.
*
* If the number is not there in nums object. It's setting there in else block
* to match in net iteration.
*/
if (potentialMatch in nums) {
return [potentialMatch, num]
} else {
nums[num] = true;
/**
* It forms an object when the potential match doesn't exist in nums for checking in the next iteration
* {
* 1: true,
* 2: true
* }
*/
}
console.log(nums)
}
}
console.log(twoNumberSum([1, 2, 4, 5, 6, 7, 8], 3))
You can also Run it from JSBin

Allocation-free abstractions in Javascript

I have a general question which is about whether it is possible to make zero-allocation iterators in Javascript. Note that by "iterator" I am not married to the current definition of iterator in ECMAScript, but just a general pattern for iterating over user-defined ranges.
To make the problem concrete, say I have a list like [5, 5, 5, 2, 2, 1, 1, 1, 1] and I want to group adjacent repetitions together, and process it into a form which is more like [5, 3], [2, 2], [1, 4]. I then want to access each of these pairs inside a loop, something like "for each pair in grouped(array), do something with pair". Furthermore, I want to reuse this grouping algorithm in many places, and crucially, in some really hot inner loops (think millions of loops per second).
Question: Is there an iteration pattern to accomplish this which has zero overhead, as if I hand-wrote the loop myself?
Here are the things I've tried so far. Let's suppose for concreteness that I am trying to compute the sum of all pairs. (To be clear I am not looking for alternative ways of writing this code, I am looking for an abstraction pattern: the code is just here to provide a concrete example.)
Inlining the grouping code by hand. This method performs the best, but obscures the intent of the computation. Furthermore, inlining by hand is error-prone and annoying.
function sumPairs(array) {
let sum = 0
for (let i = 0; i != array.length; ) {
let elem = array[i++], count = 1
while (i < array.length && array[i] == elem) { i++; count++; }
// Here we can actually use the pair (elem, count)
sum += elem + count
}
return sum
}
Using a visitor pattern. We can write a reduceGroups function which will call a given visitor(acc, elem, count) for each pair (elem, count), similar to the usual Array.reduce method. With that our computation becomes somewhat clearer to read.
function sumPairsVisitor(array) {
return reduceGroups(array, (sofar, elem, count) => sofar + elem + count, 0)
}
Unfortunately, Firefox in particular still allocates when running this function, unless the closure definition is manually moved outside the function. Furthermore, we lose the ability to use control structures like break unless we complicate the interface a lot.
Writing a custom iterator. We can make a custom "iterator" (not an ES6 iterator) which exposes elem and count properties, an empty property indicating that there are no more pairs remaining, and a next() method which updates elem and count to the next pair. The consuming code looks like this:
function sumPairsIterator(array) {
let sum = 0
for (let iter = new GroupIter(array); !iter.empty; iter.next())
sum += iter.elem + iter.count
return sum
}
I find this code the easiest to read, and it seems to me that it should be the fastest method of abstraction. (In the best possible case, scalar replacement could completely collapse the iterator definition into the function. In the second best case, it should be clear that the iterator does not escape the for loop, so it can be stack-allocated). Unfortunately, both Chrome and Firefox seem to allocate here.
Of the approaches above, the custom-defined iterator performs quite well in most cases, except when you really need to put the pedal to the metal in a hot inner loop, at which point the GC pressure becomes apparent.
I would also be ok with a Javascript post-processor (the Google Closure Compiler perhaps?) which is able to accomplish this.
Check this out. I've not tested its performance but it should be good.
(+) (mostly) compatible to ES6 iterators.
(-) sacrificed ...GroupingIterator.from(arr) in order to not create a (imo. garbage) value-object. That's the mostly in the point above.
afaik, the primary use case for this is a for..of loop anyways.
(+) no objects created (GC)
(+) object pooling for the iterators; (again GC)
(+) compatible with controll-structures like break
class GroupingIterator {
/* object pooling */
static from(array) {
const instance = GroupingIterator._pool || new GroupingIterator();
GroupingIterator._pool = instance._pool;
instance._pool = null;
instance.array = array;
instance.done = false;
return instance;
}
static _pool = null;
_pool = null;
/* state and value / payload */
array = null;
element = null;
index = 0;
count = 0;
/* IteratorResult interface */
value = this;
done = true;
/* Iterator interface */
next() {
const array = this.array;
let index = this.index += this.count;
if (!array || index >= array.length) {
return this.return();
}
const element = this.element = array[index];
while (++index < array.length) {
if (array[index] !== element) break;
}
this.count = index - this.index;
return this;
}
return() {
this.done = true;
// cleanup
this.element = this.array = null;
this.count = this.index = 0;
// return iterator to pool
this._pool = GroupingIterator._pool;
return GroupingIterator._pool = this;
}
/* Iterable interface */
[Symbol.iterator]() {
return this;
}
}
var arr = [5, 5, 5, 2, 2, 1, 1, 1, 1];
for (const item of GroupingIterator.from(arr)) {
console.log("element", item.element, "index", item.index, "count", item.count);
}

Methods changing the object that calls them vs methods that return a changed object [duplicate]

Is there any JavaScript Array library that normalizes the Array return values and mutations? I think the JavaScript Array API is very inconsistent.
Some methods mutate the array:
var A = [0,1,2];
A.splice(0,1); // reduces A and returns a new array containing the deleted elements
Some don’t:
A.slice(0,1); // leaves A untouched and returns a new array
Some return a reference to the mutated array:
A = A.reverse().reverse(); // reverses and then reverses back
Some just return undefined:
B = A.forEach(function(){});
What I would like is to always mutate the array and always return the same array, so I can have some kind of consistency and also be able to chain. For example:
A.slice(0,1).reverse().forEach(function(){}).concat(['a','b']);
I tried some simple snippets like:
var superArray = function() {
this.length = 0;
}
superArray.prototype = {
constructor: superArray,
// custom mass-push method
add: function(arr) {
return this.push.apply(this, arr);
}
}
// native mutations
'join pop push reverse shift sort splice unshift map forEach'.split(' ').forEach(function(name) {
superArray.prototype[name] = (function(name) {
return function() {
Array.prototype[name].apply(this, arguments);
// always return this for chaining
return this;
};
}(name));
});
// try it
var a = new superArray();
a.push(3).push(4).reverse();
This works fine for most mutation methods, but there are problems. For example I need to write custom prototypes for each method that does not mutate the original array.
So as always while I was doing this, I was thinking that maybe this has been done before? Are there any lightweight array libraries that do this already? It would be nice if the library also adds shims for new JavaScript 1.6 methods for older browsers.
I don't think it is really inconsistent. Yes, they might be a little confusing as JavaScript arrays do all the things for which other languages have separate structures (list, queue, stack, …), but their definition is quite consistent across languages. You can easily group them in those categories you already described:
list methods:
push/unshift return the length after adding elements
pop/shift return the requested element
you could define additional methods for getting first and last element, but they're seldom needed
splice is the all-purpose-tool for removing/replacing/inserting items in the middle of the list - it returns an array of the removed elements.
sort and reverse are the two standard in-place reordering methods.
All the other methods do not modify the original array:
slice to get subarrays by position, filter to get them by condition and concat to combine with others create and return new arrays
forEach just iterates the array and returns nothing
every/some test the items for a condition, indexOf and lastIndexOf search for items (by equality) - both return their results
reduce/reduceRight reduce the array items to a single value and return that. Special cases are:
map reduces to a new array - it is like forEach but returning the results
join and toString reduce to a string
These methods are enough for the most of our needs. We can do quite everything with them, and I don't know any libraries that add similar, but internally or result-wise different methods to them. Most data-handling libs (like Underscore) only make them cross-browser-safe (es5-shim) and provide additional utility methods.
What I would like is to always mutate the array and always return the same array, so I can have some kind of consistency and also be able to chain.
I'd say the JavaScript consistency is to alway return a new array when elements or length are modified. I guess this is because objects are reference values, and changing them would too often cause side effects in other scopes that reference the same array.
Chaining is still possible with that, you can use slice, concat, sort, reverse, filter and map together to create a new array in only one step. If you want to "modify" the array only, you can just reassign it to the array variable:
A = A.slice(0,1).reverse().concat(['a','b']);
Mutation methods have only one advantage to me: they are faster because they might be more memory-efficient (depends on the implementation and its garbage collection, of course). So lets implement some methods for those. As Array subclassing is neither possible nor useful, I will define them on the native prototype:
var ap = Array.prototype;
// the simple ones:
ap.each = function(){ ap.forEach.apply(this, arguments); return this; };
ap.prepend = function() { ap.unshift.apply(this, arguments); return this; };
ap.append = function() { ap.push.apply(this, arguments; return this; };
ap.reversed = function() { return ap.reverse.call(ap.slice.call(this)); };
ap.sorted = function() { return ap.sort.apply(ap.slice.call(this), arguments); };
// more complex:
ap.shorten = function(start, end) { // in-place slice
if (Object(this) !== this) throw new TypeError();
var len = this.length >>> 0;
start = start >>> 0; // actually should do isFinite, then floor towards 0
end = typeof end === 'undefined' ? len : end >>> 0; // again
start = start < 0 ? Math.max(len + start, 0) : Math.min(start, len);
end = end < 0 ? Math.max(len + end, 0) : Math.min(end, len);
ap.splice.call(this, end, len);
ap.splice.call(this, 0, start);
return this;
};
ap.restrict = function(fun) { // in-place filter
// while applying fun the array stays unmodified
var res = ap.filter.apply(this, arguments);
res.unshift(0, this.length >>> 0);
ap.splice.apply(this, res);
return this;
};
ap.transform = function(fun) { // in-place map
if (Object(this) !== this || typeof fun !== 'function') throw new TypeError();
var len = this.length >>> 0,
thisArg = arguments[1];
for (var i=0; i<len; i++)
if (i in this)
this[i] = fun.call(thisArg, this[i], i, this)
return this;
};
// possibly more
Now you could do
A.shorten(0, 1).reverse().append('a', 'b');
IMHO one of the best libraries is underscorejs http://underscorejs.org/
You probably shouldn't use a library just for this (it's a not-so-useful dependency to add to the project).
The 'standard' way to do is to call slice when you want to exec mutate operations. There is not problem to do this, since the JS engines are pretty good with temporary variables (since that's one of the key point of the javascript).
Ex. :
function reverseStringify( array ) {
return array.slice( )
.reverse( )
.join( ' ' ); }
console.log( [ 'hello', 'world' ] );

Java Script - How to search object within array efficiently [duplicate]

This question already has answers here:
JS - jQuery inarray ignoreCase() and contains()
(2 answers)
Closed 9 years ago.
I need to find an object (a string to be precise) inside a large array. While the below code works, it's tediously scrolling through each element of the array, a brute method. Is there a more efficient method? possibly calling on .search or .match or equivalent? Also how to make the search object (string) case insensitive? i.e object might be "abc" while array element is "ABC".
Many thanks in advance
function SearchArray(array, object){ //need to modify code to become case insensitive.
for (var i= 1; i< array.length; i++){
if (array[i] == object.toString()){
return i;
}
}
return 0;
}
I also forgot to mention that the search returns the index / position of the matched object within the one dimensional array, rather than simple true / false.
The following function does just that:
function findWord(array, word) {
return -1 < array.map(function(item) { return item.toLowerCase(); }).indexOf(word.toLowerCase());
}
What it does is:
Converting every string to lower case using the map() function.
Searching for a certain word also in lower case mode.
If the word is found, the function returns a value greater than -1, therefore the return value is either true or false
If you are only going to search the large array once then the only likely optimization is to store the object string representation instead of generating it before each comparison:
function SearchArray(array, object) {
var len=array.length, str=object.toString().toLowerCase();
for (var i=0; i<len; i++) {
if (array[i].toLowerCase() == str) { return i; }
}
return -1; // Return -1 per the "Array.indexOf()" method.
}
However, if you will be searching for many objects within the array then you will save time by storing the lower-case version of the elements:
var lowerArray = array.map(function(x){return x.toString().toLowerCase();});
var lowerObject = object.toString().toLowerCase();
lowerArray.indexOf(lowerObject); // Simply use "Array.indexOf()".
Moreover, if you will be searching this array many times, plenty of memory is available, and performance is critical then you should consider using an object for O(1) lookup:
function makeLowerCaseArrayIndexLookupFunction(array) {
var lookup = array.reduce(function(memo, x, i) {
memo[x.toString().toLowerCase()] = i;
return memo;
}, {});
return function(obj) {
var idx = lookup[obj.toString().toLowerCase()];
return (typeof(idx)==='undefined') ? -1 : idx;
}
}
var findWeekdays = makeLowerCaseArrayIndexLookupFunction([
'Mon', 'Tues', 'Weds', 'Thurs', 'Fri', 'Sat', 'Sun'
]);
findWeekdays('mon'); // => 0
findWeekdays('FRI'); // => 4
findWeekdays('x'); // => -1

Three map implementations in javascript. Which one is better?

I wrote a simple map implementation for some task. Then, out of curiosity, I wrote two more. I like map1 but the code is kinda hard to read. If somebody is interested, I'd appreciate a simple code review.
Which one is better? Do you know some other way to implement this in javascript?
var map = function(arr, func) {
var newarr = [];
for (var i = 0; i < arr.length; i++) {
newarr[i] = func(arr[i]);
}
return newarr;
};
var map1 = function(arr, func) {
if (arr.length === 0) return [];
return [func(arr[0])].concat(funcmap(arr.slice(1), func));
};
var map2 = function(arr, func) {
var iter = function(result, i) {
if (i === arr.length) return result;
result.push(func(arr[i]));
return iter(result, i+1);
};
return iter([], 0);
};
Thanks!
EDIT
I am thinking about such function in general.
For example, right now I am going to use it to iterate like this:
map(['class1', 'class2', 'class3'], function(cls) {
el.removeClass(cls);
});
or
ids = map(elements, extract_id);
/* elements is a collection of html elements,
extract_id is a func that extracts id from innerHTML */
What about the map implementation used natively on Firefox and SpiderMonkey, I think it's very straight forward:
if (!Array.prototype.map) {
Array.prototype.map = function(fun /*, thisp*/) {
var len = this.length >>> 0; // make sure length is a positive number
if (typeof fun != "function") // make sure the first argument is a function
throw new TypeError();
var res = new Array(len); // initialize the resulting array
var thisp = arguments[1]; // an optional 'context' argument
for (var i = 0; i < len; i++) {
if (i in this)
res[i] = fun.call(thisp, this[i], i, this); // fill the resulting array
}
return res;
};
}
If you don't want to extend the Array.prototype, declare it as a normal function expression.
As a reference, map is implemented as following in jQuery
map: function( elems, callback ) {
var ret = [];
// Go through the array, translating each of the items to their
// new value (or values).
for ( var i = 0, length = elems.length; i < length; i++ ) {
var value = callback( elems[ i ], i );
if ( value != null )
ret[ ret.length ] = value;
}
return ret.concat.apply( [], ret );
}
which seems most similar to your first implementation. I'd say the first one is preferred as it is the simplest to read and understand. But if performance is your concern, profile them.
I think that depends on what you want map to do when func might change the array. I would tend to err on the side of simplicity and sample length once.
You can always specify the output size as in
var map = function(arr, func) {
var n = arr.length & 0x7fffffff; // Make sure n is a non-neg integer
var newarr = new Array(n); // Preallocate array size
var USELESS = {};
for (var i = 0; i < n; ++i) {
newarr[i] = func.call(USELESS, arr[i]);
}
return newarr;
};
I used the func.call() form instead of just func(...) instead since I dislike calling user supplied code without specifying what 'this' is, but YMMV.
This first one is most appropriate. Recursing one level for every array item may make sense in a functional language, but in a procedural language without tail-call optimisation it's insane.
However, there is already a map function on Array: it is defined by ECMA-262 Fifth Edition and, as a built-in function, is going to be the optimal choice. Use that:
alert([1,2,3].map(function(n) { return n+3; })); // 4,5,6
The only problem is that Fifth Edition isn't supported by all current browsers: in particular, the Array extensions are not present in IE. But you can fix that with a little remedial work on the Array prototype:
if (!Array.prototype.map) {
Array.prototype.map= function(fn, that) {
var result= new Array(this.length);
for (var i= 0; i<this.length; i++)
if (i in this)
result[i]= fn.call(that, this[i], i, this);
return result;
};
}
This version, as per the ECMA standard, allows an optional object to be passed in to bind to this in the function call, and skips over any missing values (it's legal in JavaScript to have a list of length 3 where there is no second item).
There's something wrong in second method. 'funcmap' shouldn't be changed to 'map1'?
If so - this method loses, as concat() method is expensive - creates new array from given ones, so has to allocate extra memory and execute in O(array1.length + array2.length).
I like your first implementation best - it's definitely easiest to understand and seems quick in execution to me. No extra declaration (like in third way), extra function calls - just one for loop and array.length assignments.
I'd say the first one wins on simplicity (and immediate understandability); performance will be highly dependent on what the engine at hand optimizes, so you'd have to profile in the engines you want to support.

Categories

Resources