Indexing (maintaining indices) in an array makes Array.prototype.shift and Array.prototype.unshift O(N) instead of O(1).
However, if we just want to use pop() / push() / shift() and unshift() and never use indices for lookup, is there a way to implement a JavaScript array that omits indexing?
I can't think of a way to do it.
The only way I can think of doing it would be with arrays, and only using pop() / push() (since those are O(1)) ... but even with multiple arrays, not sure if it's possible.
Looking to do this w/o a linked-list if possible. I implemented a solution to this with a doubly linked list, but wondering if it's possible to do this w/o a linked-list.
End goal: trying to create a FIFO queue where all operations are in constant time, without using a linked-list.
How about an ES2015 Map that you index with integers?
Let's call the map myFIFOMap.
You keep a first and last integer member as part of your FIFO class. Start them both at zero.
Every time you want to push() into your FIFO queue, you call myFIFOMap.set(last++, item), so the first item is stored under key 0, which is where pop() looks for it. And pop() looks something like:
const item = myFIFOMap.get(first);
myFIFOMap.delete(first++);
return item;
Should be O(1) to push or pop.
Don't forget to check for boundary conditions (e.g., don't let them pop() when first===last).
Given that JavaScript actually uses double precision floating point, you should be able to run ~2^53 objects through your FIFO before you have problems with the integer precision. So if you run 10,000 items through your FIFO per second, that should be good for around 28,000 years of run time.
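Putting the pieces above together, a minimal sketch of such a Map-backed FIFO might look like the following. The class name MapFifo and the size getter are illustrative assumptions rather than part of the suggestion itself, and an empty queue simply returns undefined instead of throwing.

```javascript
// Sketch of a FIFO queue backed by a Map with integer keys.
// MapFifo and size are hypothetical names chosen for this example.
class MapFifo {
  constructor() {
    this.items = new Map();
    this.first = 0; // key of the next item to remove
    this.last = 0;  // key that the next pushed item will get
  }

  // Number of items currently stored.
  get size() {
    return this.last - this.first;
  }

  // Add an item at the back of the queue.
  push(item) {
    this.items.set(this.last++, item);
  }

  // Remove and return the item at the front (undefined when empty).
  pop() {
    if (this.first === this.last) return undefined;
    const item = this.items.get(this.first);
    this.items.delete(this.first++);
    return item;
  }
}
```

Both operations touch a single Map entry, so neither depends on how many items are queued.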
If the data you are storing is primitive (strings, integers, floats, or combinations of primitives), you can allocate a JavaScript ArrayBuffer, wrap it in an appropriate typed array view, load it with data, and then keep track of the offset(s) yourself.
In your example, pop, shift, and unshift can all be implemented by incrementing/decrementing an integer index. push is more difficult, because a TypedArray has a fixed size: if the ArrayBuffer is full, the only two options are to truncate the data or to allocate a new, larger typed array and copy the contents over, since JS cannot store pointers.
If you are storing homogeneous objects (they have the same properties), you can save each value into a TypedArray using different views and offsets to mimic a C struct (see the MDN example), and then use a JS function to serialize/unserialize them from the TypedArray, basically converting the data from a binary representation, into a full-fledged JS object.
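As a rough illustration of the offset-tracking idea, here is a sketch of a fixed-capacity ring buffer over a Float64Array. The RingBuffer name, the choice of Float64Array, and the decision to throw when the buffer is full are all assumptions made for this example; growing the buffer would need a reallocation and copy.

```javascript
// Sketch: double-ended queue of numbers over a fixed-size typed array.
// head is the index of the first element; count is the number stored.
class RingBuffer {
  constructor(capacity) {
    this.buf = new Float64Array(capacity);
    this.head = 0;
    this.count = 0;
  }

  // Append at the back; throws when the fixed buffer is full.
  push(v) {
    if (this.count === this.buf.length) throw new RangeError("buffer full");
    this.buf[(this.head + this.count) % this.buf.length] = v;
    this.count++;
  }

  // Insert at the front by moving head one slot backwards (wrapping).
  unshift(v) {
    if (this.count === this.buf.length) throw new RangeError("buffer full");
    this.head = (this.head - 1 + this.buf.length) % this.buf.length;
    this.buf[this.head] = v;
    this.count++;
  }

  // Remove from the front (undefined when empty).
  shift() {
    if (this.count === 0) return undefined;
    const v = this.buf[this.head];
    this.head = (this.head + 1) % this.buf.length;
    this.count--;
    return v;
  }

  // Remove from the back (undefined when empty).
  pop() {
    if (this.count === 0) return undefined;
    this.count--;
    return this.buf[(this.head + this.count) % this.buf.length];
  }
}
```

All four operations only adjust head and count, so none of them moves existing elements.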
Going with @SomeCallMeTim's answer, which I think is on the right track, I have this:
export class Queue {
lookup = new Map<number, any>();
first = 0;
last = 0;
length = 0;
elementExists = false; // when first === last, and item exists there
peek() {
return this.lookup.get(this.first);
}
getByIndex(v: number) {
return this.lookup.get(v);
}
getLength() {
return this.length;
}
pop() {
const last = this.last;
if (this.elementExists && this.first === this.last) {
this.length--;
this.elementExists = false;
}
else if (this.last > this.first) {
this.length--;
this.last--;
}
const v = this.lookup.get(last);
this.lookup.delete(last);
return v;
}
shift() {
const first = this.first;
if (this.elementExists && this.first === this.last) {
this.length--;
this.elementExists = false;
}
else if (this.first < this.last) {
this.length--;
this.first++;
}
const v = this.lookup.get(first);
this.lookup.delete(first);
return v;
}
push(v: any) {
this.length++;
if (this.elementExists && this.first === this.last) {
this.last++;
}
else if (this.first === this.last) {
this.elementExists = true;
}
else {
this.last++;
}
return this.lookup.set(this.last, v);
}
enq(v: any) {
return this.push.apply(this, arguments);
}
enqueue(v: any) {
return this.push.apply(this, arguments);
}
deq() {
return this.shift.apply(this, arguments);
}
dequeue() {
return this.shift.apply(this, arguments);
}
unshift(v: any) {
this.length++;
if (this.elementExists && this.first === this.last) {
this.first--;
}
else if (this.first === this.last) {
this.elementExists = true;
}
else {
this.first--;
}
return this.lookup.set(this.first, v);
}
addToFront(v: any){
return this.unshift.apply(this,arguments);
}
removeAll() {
return this.clear.apply(this, arguments);
}
clear(): void {
this.length = 0;
this.elementExists = false;
this.first = 0;
this.last = 0;
this.lookup.clear();
}
}
Takeaways:
It turns out you can still look items up by key with getByIndex(), as Tim's suggestion points out.
Using Map is surprisingly ~10% faster than POJSO, possibly only because with a POJSO the integers need to get converted to strings for lookup.
The Map implementation is about 20% faster than doubly-linked list, so a doubly-linked list is not that much slower. It's probably slower mostly because we must create a container object with next/prev pointers for each item in the queue, whereas with the non-linked list implementation, we can insert primitives in the queue, etc.
The doubly-linked list allows us to remove/insert items from the middle of the queue in constant time; we cannot do the same with the non-linked list implementation as is.
All of the above are orders of magnitude more performant than a plain array when operating on an array with more than 10,000 elements or so.
I have some constant time queue implementations here:
https://github.com/ORESoftware/linked-queue
Tim had a good suggestion, to make getByIndex() easier to use - we can do this:
getByIndex(v: number) {
if(!Number.isInteger(v)){
throw new Error('Argument must be an integer.');
}
return this.lookup.get(v + this.first);
}
That way, to get the 5th element in the queue, all we need to do is:
getByIndex(4);
Languages such as Python and Java have special methods for sorting custom classes. In JavaScript, toString() can be overridden, but this does not work easily for numeric values.
As a workaround, I added a method called compareTo() to the class, although this still requires a function to call it.
class NumericSortable {
constructor(newVal) {
this.val = newVal
}
compareTo(other) {
return this.val - other.val
}
}
const objectList = [
new NumericSortable(3),
new NumericSortable(1),
new NumericSortable(20),
]
objectList.sort(
function(a, b) { return a.compareTo(b) })
console.log(objectList)
Is there a way to modify the class so it can be sorted without requiring a function to be defined inside sort()?
Perhaps there is a good way to override toString() that will work for numeric values. However, solutions such as localeCompare() or a collator require two arguments, and they would not be passed to the overridden toString() method at the same time.
You can add a static method to your NumericSortable class, and pass that into the sort call. This idea can be extended to any custom class that needs to define how two instances are to be compared for sorting.
class NumericSortable {
constructor(value) {
this.value = value;
}
static compare(a,b) {
return a.value - b.value;
}
}
const arr = [
new NumericSortable(3),
new NumericSortable(1),
new NumericSortable(20),
];
arr.sort(NumericSortable.compare);
console.log(arr);
This makes things more explicit, and easier for anyone else reading the code to reason about how the array is being sorted.
I like to make a function that returns a sort function for these cases.
function by(prop){
return function(a,b){return a[prop]-b[prop];};
}
This lets you specify the object's to-be-sorted property at call time, letting one generic function do a lot of heavy lifting.
objectList.sort(by("val"))
This avoids the need for a custom callback each sort, though with fat arrows that's not the burden it used to be anyway...
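The same idea can be taken one step further with a key function and an optional direction. The sketch below is only an illustration: byKey and its descending parameter are hypothetical names, and it assumes the keys are numbers.

```javascript
// Sketch: build a numeric comparator from a key-extracting function.
// byKey and descending are illustrative names, not an existing API.
function byKey(keyFn, descending = false) {
  const sign = descending ? -1 : 1;
  return (a, b) => sign * (keyFn(a) - keyFn(b));
}

const items = [{ val: 3 }, { val: 1 }, { val: 20 }];

// Copy before sorting so the original order is preserved.
const ascending = [...items].sort(byKey((o) => o.val));
const descendingOrder = [...items].sort(byKey((o) => o.val, true));

console.log(ascending.map((o) => o.val));       // [1, 3, 20]
console.log(descendingOrder.map((o) => o.val)); // [20, 3, 1]
```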
If no comparator is provided, each time two items in the array are compared, they'll be coerced to strings, and then those strings will be compared lexicographically to determine which object will come before the other in the sorted array. So, if you don't want to pass a comparator, adding a toString method to implement the desired sorting logic is the only other approach.
Unfortunately, for your situation, lexicographic comparison alone based on the .vals won't cut it; 1 will come before 20, and 20 will come before 3. If the numbers involved won't get so high as to have an e (exponent notation) in their string version, you could .padStart the returned string so that each compared numeric string will have the same number of characters (thereby allowing lexicographic comparison to work).
class NumericSortable {
constructor(newVal) {
this.val = newVal
}
// Unused now
compareTo(other) {
return this.val - other.val
}
toString() {
return String(this.val).padStart(15, '0');
}
}
const objectList = [
new NumericSortable(3),
new NumericSortable(1),
new NumericSortable(20),
]
objectList.sort()
console.log(objectList)
You may wish to account for negative numbers too.
Still, this whole approach is a bit smelly. When possible, I'd prefer a comparator instead of having to fiddle with the strings to get them to compare properly.
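To make the difference concrete, here is a small sketch comparing the default string ordering with the zero-padded ordering. It only covers non-negative integers, and the width of 15 is simply the value used in the answer above.

```javascript
const values = [3, 1, 20];

// Without a comparator, sort() compares the string forms "3", "1", "20".
const lexicographic = values.map(String).sort();

// Zero-padding every string to the same width makes string order
// agree with numeric order (for non-negative integers only).
const padded = values.map((n) => String(n).padStart(15, "0")).sort();

console.log(lexicographic);      // ["1", "20", "3"]
console.log(padded.map(Number)); // [1, 3, 20]
```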
From my above comment ...
"The OP needs to wrap the build-in sort method into an own/custom implementation of Array.prototype.sort. But why should one do it? Just for the sake of not writing a comparing sort-callback?"
Having said the above, I hereby nevertheless provide an implementation of the above-mentioned approach, just to prove to the OP that it's manageable (after all, it is exactly what the OP asked for), but also to show the audience that the effort (even though it's a one-time effort) of doing so is much greater than that of the other suggestions / solutions already provided.
class NumericSortable {
constructor(newVal) {
this.val = newVal;
}
compareTo(other) {
return this.val - other.val;
}
}
const objectList = [
new NumericSortable(3),
new NumericSortable(1),
new NumericSortable(20),
];
objectList
// - sorting by overwritten custom `sort` function with
// an already built-in `compareTo` based custom compare
// function and no additionally passed compare function.
.sort();
console.log({ objectList });
objectList
// - reverse sorting by overwritten custom `sort` function
// with an additionally passed compare function.
.sort((a, b) => (-1 * a.compareTo(b)));
console.log({ objectList });
console.log(
'... simple sort-implementation default-test ...\n[1,4,9,0,6,3].sort() ...',
[1,4,9,0,6,3].sort()
);
console.log(
'... simple sort-implementation default-test ...\n["foo", "biz", "baz", "bar"].sort() ...',
["foo", "biz", "baz", "bar"].sort()
);
<script>
(function (arrProto) {
// save the native `sort` reference.
const coreApiSort = arrProto.sort;
// type detection helper.
function isFunction(value) {
return (
'function' === typeof value &&
'function' === typeof value.call &&
'function' === typeof value.apply
);
}
// different comparison helper functionality.
function defaultCompare(a, b) {
return (a < b && -1) || (a > b && 1) || 0;
}
function localeCompare(a, b) {
return a?.localeCompare?.(b) ?? defaultCompare(a, b);
}
function customDefaultCompare(a, b) {
const isCustomComparableA = isFunction(a.compareTo);
const isCustomComparableB = isFunction(b.compareTo);
return (isCustomComparableA && isCustomComparableB)
? a.compareTo(b)
: localeCompare(a, b);
}
// the new `sort` functionality.
function customSort(customCompare) {
return coreApiSort
// - (kind of "super") delegation to the
// before saved native `sort` reference ...
.call(this, (
// ... by using a cascade of different
// comparison functionality.
isFunction(customCompare)
&& customCompare
|| customDefaultCompare
));
}
Object
// - overwrite the Array prototype's native `sort`
// method with the newly implemented custom `sort`.
.defineProperty(arrProto, 'sort', {
value: customSort,
});
}(Array.prototype));
</script>
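If overwriting Array.prototype.sort feels too invasive, a similar effect can be sketched with an Array subclass, which leaves every other array in the program untouched. This is a different technique from the prototype patch above, and it assumes an ES2015+ runtime; ComparableList is a hypothetical name, and the sketch assumes every element implements compareTo.

```javascript
// Same shape as the class from the question, repeated so the sketch runs on its own.
class NumericSortable {
  constructor(newVal) {
    this.val = newVal;
  }
  compareTo(other) {
    return this.val - other.val;
  }
}

// Sketch: an Array subclass whose sort() falls back to compareTo
// when no compare function is passed. ComparableList is an illustrative name.
class ComparableList extends Array {
  sort(compareFn) {
    return super.sort(compareFn ?? ((a, b) => a.compareTo(b)));
  }
}

// Use of() rather than the constructor, because `new ComparableList(3)`
// would create an empty list of length 3.
const list = ComparableList.of(
  new NumericSortable(3),
  new NumericSortable(1),
  new NumericSortable(20),
);

list.sort();
console.log(Array.from(list, (item) => item.val)); // [1, 3, 20]
```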
I was working on a Dynamic Programming Problem and was able to code up a Javascript solution:
function howSum(targetSum,numbers,memo = {}){
//if the targetSum key already in hashmap,return its value
if(targetSum in memo) return memo[targetSum];
if(targetSum == 0) return [];
if(targetSum < 0) return null;
for(let num of numbers){
let aWay = howSum(targetSum-num,numbers,memo);
if(aWay !== null){
memo[targetSum] = [...aWay,num];
return memo[targetSum];
}
}
//no way to generate the targetSum using any elements of input array
memo[targetSum] = null;
return null;
}
Now I was thinking about how I could translate this into C++ code.
I would have to use a reference to an unordered map for the memo object.
But how should I go about returning the empty array and null values as in the base conditions? Should I return an array pointer and realloc it when inserting an element? Wouldn't that be a C way of programming it?
Also, how should I go about passing the default parameter for the memo unordered map in C++? Currently I have overloaded the function with a version that creates the memo unordered map and passes its reference.
Any guidance will be appreciated so that I can solve future questions.
I was stuck in this problem too. This is how I made it work.
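#include <algorithm>
#include <iostream>
#include <unordered_map>
#include <vector>
using namespace std;
// howSum function
vector<int> howSum(int target, const vector<int> &numbers, unordered_map<int, vector<int>> &dp ){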
// base case 1 - for dp
if(dp.find(target)!=dp.end()) return dp[target];
// making a vector to return in the following base cases
vector<int> res;
// base case 2
if(target == 0) {
return res;
}
// base case 3
if(target<0) {
res.push_back(-1); // using -1 instead of NULL
return res;
}
// the actual logic for the question
for(int i=0;i<numbers.size();i++){
int remainder = target - numbers[i];
vector<int> result = howSum(remainder,numbers,dp); // recursion
// if result vector doesn't contain -1, push the current number onto it
if(find(result.begin(),result.end(),-1)==result.end()){
result.push_back(numbers[i]);
dp.emplace(target,result);
return result;
}
}
res.push_back(-1);
dp.emplace(target,res);
return res;
}
// main function
int main(){
vector<int>numbers = {20,50};
unordered_map<int, vector<int>> dp;
vector<int> res = howSum(300,numbers,dp);
for(int i=0;i<res.size();i++){
cout<<res[i]<<" ";
}
cout<<endl;
}
Here is my take at it:
#include <optional>
#include <vector>
#include <unordered_map>
using Nums = std::vector<int>;
using OptNums = std::optional<Nums>;
namespace detail {
using Memo = std::unordered_map<int, OptNums>;
OptNums const & howSum(int targetSum, Nums const & numbers, Memo & memo) {
if (auto iter = memo.find(targetSum); iter != memo.end()) {
return iter->second; // elements are std::pair<int, OptNums>
}
auto & cached = memo[targetSum]; // create an empty optional in the map
if (targetSum == 0) {
cached.emplace(); // create an empty Nums in the optional
}
else if (targetSum > 0) {
for (int num : numbers) {
if (auto const & aWay = howSum(targetSum-num, numbers, memo)) {
cached = aWay; // copy vector into optional
cached->push_back(num);
break; // the first solution found is enough, as in the JavaScript version
}
}
}
return cached;
}
} // detail
std::optional<Nums> howSum(int targetSum, Nums const & numbers) {
detail::Memo memo;
return detail::howSum(targetSum, numbers, memo);
}
Some comments:
using two functions, one that creates the memo and passes it into the real implementation function is a good pattern. It makes the user-facing interface clean.
the "detail" namespace is just a name, no magic meaning, but is often used to indicate implementation detail.
In the implementation, I return references to an optional. This is an optimization to avoid copying the return vectors in every call where the algorithm unwinds from the recursion. This does require some care, however, because you must be careful to return references to objects that will outlive the local scope (so no returning std::nullopt, or the reference binds to a temporary optional, for example.) That is also why I always create the element in the memo object--even in the negative case--so I can return a reference to it safely. Note, operator[] applied to an unordered_map will create the element if it does not exist, while find will not.
Since the reference returned by the detail function has a lifetime only as long as the memo declared in the caller, the caller itself must return a copy of the optional it gets back, to ensure that the data is not destroyed during the cleanup of the function call. Note, it does not return a reference.
Also, the "if" inside the for loop has a little bit going on. It declares a local reference, initializes it to the result of the recursive call. That whole expression is a reference to optional, which has an implicit conversion to bool that is true if the optional holds a value. This is a useful idiom worth pointing out, though to be more explicit this is equivalent:
if (auto const & aWay = howSum(targetSum-num, numbers, memo); aWay.has_value())
Here's a fleshed-out example, with a few test cases to show that it works.
https://godbolt.org/z/cWrdhvM1n
I have an array of ~11,000 JavaScript dictionaries, each representing 1 row in an Excel file.
I want to loop through this array and parse each element into a new datastructure. For example, I might have a function that will count for {"foo": true} or something.
As I have multiple of these functions, my question is would it be better to loop through this array for each function, or have one single loop with functions that parse each element and store it in a global variable?
Ex. I'm currently doing one single loop, and parsing each element into a global variable
const arr = [...]; // array of ~11,000 dictionaries
// example parsing function
let count = 0;
function countFoos(el) {
if (el["foo"] === true) count++;
}
let count2 = 0;
function countBars(el) {
if (el["bar"] === false) count2++;
}
arr.forEach(el => {
countFoos(el);
countBars(el);
});
But would it be better to do it this way?
class Parse {
constructor(arr) {
this.arr = arr;
this.count = 0;
this.count2 = 0;
}
countFoos() {
this.arr.forEach((el) => {
if (el["foo"] === true) this.count++;
});
}
countBars() {
this.arr.forEach((el) => {
if (el["bar"] === false) this.count2++;
});
}
}
const arr = [...]; // array of ~11,000 dictionaries
const x = new Parse(arr);
x.countFoos();
x.countBars();
EDIT: I should've clarified earlier: the examples shown above are just very simplified versions of the production code. Approximately 20 'parsing functions' are being run on each element, with each of the corresponding global variables being a large dictionary or array.
You should generally do just one iteration that calls both functions.
Iterating takes time, so doing two iterations will double the time taken to perform the iterations. How significant this is to the entire application depends on how much work is done in the body of the iteration. If the bodies are very expensive, the iteration time might fall into the noise. But if it's really simple, as in your examples of a simple test and variable increment, the iteration time will probably be significant.
If you are worried about performance, the first method is better as it only involves one iteration over the entire array while the second approach requires two.
If you think using classes is more readable, you could simply write that as one method in the class.
class Parse {
constructor(arr) {
this.arr = arr;
this.count = 0;
this.count2 = 0;
}
// Single pass over the array that applies every counting function.
// (Named countAll because this.count already holds a number.)
countAll() {
this.arr.forEach((el) => {
this.countFoos(el);
this.countBars(el);
});
}
countFoos(el) {
if (el.foo === true) this.count++;
}
countBars(el) {
if (el.bar === false) this.count2++;
}
}
I would approach this by using the Array.prototype.reduce function, which would only require a single pass over the given array. I also would not use a class here as it would not really make sense, but you can if you really want!
function count(arr) {
return arr.reduce(([fooCount, barCount], next) => {
if (next.foo === true) {
fooCount = fooCount + 1
}
if (next.bar === false) {
barCount = barCount + 1
}
return [fooCount, barCount]
}, [0, 0]);
}
const [fooCount, barCount] = count(...);
You can also use generators to accomplish this, which is even better because it doesn't require you to iterate the entire set of words in the dictionary, but it's a little more unwieldy to use.
This is actually easier to use than other examples that require if statements, because you could quite easily run a battery of functions over each result and add it to the accumulator.
Just remember though that you don't want to optimize before you prove something is a problem. Iterating 22000 objects is obviously more than iterating 11000, but it is still going to be quite fast!
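The "battery of functions" idea could be sketched roughly as follows: each parsing function becomes a small reducer, and a single reduce pass feeds every element to all of them. The names runAll and reducers are assumptions for this example, and every accumulator is assumed to start at 0.

```javascript
// Sketch: each entry maps a result name to a reducer (accumulator, element) => accumulator.
const reducers = {
  foo: (acc, el) => acc + (el.foo === true ? 1 : 0),
  bar: (acc, el) => acc + (el.bar === false ? 1 : 0),
};

// Run every reducer over the rows in one pass and collect the results by name.
function runAll(rows, fns) {
  const initial = Object.fromEntries(Object.keys(fns).map((name) => [name, 0]));
  return rows.reduce((acc, el) => {
    for (const [name, fn] of Object.entries(fns)) {
      acc[name] = fn(acc[name], el);
    }
    return acc;
  }, initial);
}

const rows = [
  { foo: true, bar: false },
  { foo: false, bar: true },
  { foo: true, bar: true },
];

console.log(runAll(rows, reducers)); // { foo: 2, bar: 1 }
```

Adding another parsing function then means adding one more entry to the reducers object rather than another loop.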
Restricting the number of loops is your best option as it requires less overhead.
Also, here is an idea: use forEach to do the processing with if statements, and use a single counter object to hold all of the values, so that they mean something and can be easily referenced later on.
const arr = [
{"foo" : true,"bar" : false},{"bar" : true,"foo" : false}
];
let counters = {};
function inc(field) {
if (counters[field] == undefined) {
counters[field] = 0;
}
counters[field]++;
}
arr.forEach(el => {
if (el["foo"] === true) {
inc("foo");
}
if (el["bar"] === true) {
inc("bar");
}
});
console.log(counters);
Is there any JavaScript Array library that normalizes the Array return values and mutations? I think the JavaScript Array API is very inconsistent.
Some methods mutate the array:
var A = [0,1,2];
A.splice(0,1); // reduces A and returns a new array containing the deleted elements
Some don’t:
A.slice(0,1); // leaves A untouched and returns a new array
Some return a reference to the mutated array:
A = A.reverse().reverse(); // reverses and then reverses back
Some just return undefined:
B = A.forEach(function(){});
What I would like is to always mutate the array and always return the same array, so I can have some kind of consistency and also be able to chain. For example:
A.slice(0,1).reverse().forEach(function(){}).concat(['a','b']);
I tried some simple snippets like:
var superArray = function() {
this.length = 0;
}
superArray.prototype = {
constructor: superArray,
// custom mass-push method
add: function(arr) {
return this.push.apply(this, arr);
}
}
// native mutations
'join pop push reverse shift sort splice unshift map forEach'.split(' ').forEach(function(name) {
superArray.prototype[name] = (function(name) {
return function() {
Array.prototype[name].apply(this, arguments);
// always return this for chaining
return this;
};
}(name));
});
// try it
var a = new superArray();
a.push(3).push(4).reverse();
This works fine for most mutation methods, but there are problems. For example I need to write custom prototypes for each method that does not mutate the original array.
So as always while I was doing this, I was thinking that maybe this has been done before? Are there any lightweight array libraries that do this already? It would be nice if the library also adds shims for new JavaScript 1.6 methods for older browsers.
I don't think it is really inconsistent. Yes, they might be a little confusing as JavaScript arrays do all the things for which other languages have separate structures (list, queue, stack, …), but their definition is quite consistent across languages. You can easily group them in those categories you already described:
list methods:
push/unshift return the length after adding elements
pop/shift return the requested element
you could define additional methods for getting first and last element, but they're seldom needed
splice is the all-purpose-tool for removing/replacing/inserting items in the middle of the list - it returns an array of the removed elements.
sort and reverse are the two standard in-place reordering methods.
All the other methods do not modify the original array:
slice to get subarrays by position, filter to get them by condition and concat to combine with others create and return new arrays
forEach just iterates the array and returns nothing
every/some test the items for a condition, indexOf and lastIndexOf search for items (by equality) - both return their results
reduce/reduceRight reduce the array items to a single value and return that. Special cases are:
map reduces to a new array - it is like forEach but returning the results
join and toString reduce to a string
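A quick sketch of the return values described in the list above; the variable names are only for illustration.

```javascript
const a = [3, 1, 2];

const newLength = a.push(4);            // returns the new length
const removed = a.splice(0, 1);         // returns an array of removed elements
const sameArray = a.sort();             // sorts in place, returns the same array
const copy = a.slice(0, 2);             // returns a new array, a is untouched
const nothing = a.forEach(() => {});    // returns undefined

console.log(newLength);       // 4
console.log(removed);         // [3]
console.log(sameArray === a); // true
console.log(copy, a);         // [1, 2] [1, 2, 4]
console.log(nothing);         // undefined
```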
These methods are enough for most of our needs. We can do nearly everything with them, and I don't know of any libraries that add methods which are similar but differ internally or in their results. Most data-handling libs (like Underscore) only make them cross-browser-safe (es5-shim) and provide additional utility methods.
What I would like is to always mutate the array and always return the same array, so I can have some kind of consistency and also be able to chain.
I'd say the JavaScript consistency is to always return a new array when elements or length are modified. I guess this is because objects are reference values, and changing them would too often cause side effects in other scopes that reference the same array.
Chaining is still possible with that, you can use slice, concat, sort, reverse, filter and map together to create a new array in only one step. If you want to "modify" the array only, you can just reassign it to the array variable:
A = A.slice(0,1).reverse().concat(['a','b']);
Mutation methods have only one advantage to me: they are faster because they might be more memory-efficient (depends on the implementation and its garbage collection, of course). So let's implement some methods for those. As Array subclassing is neither possible (before ES2015 classes) nor useful here, I will define them on the native prototype:
var ap = Array.prototype;
// the simple ones:
ap.each = function(){ ap.forEach.apply(this, arguments); return this; };
ap.prepend = function() { ap.unshift.apply(this, arguments); return this; };
ap.append = function() { ap.push.apply(this, arguments); return this; };
ap.reversed = function() { return ap.reverse.call(ap.slice.call(this)); };
ap.sorted = function() { return ap.sort.apply(ap.slice.call(this), arguments); };
// more complex:
ap.shorten = function(start, end) { // in-place slice
if (Object(this) !== this) throw new TypeError();
var len = this.length >>> 0;
start = start >> 0; // signed conversion keeps negative indices; actually should do isFinite, then floor towards 0
end = typeof end === 'undefined' ? len : end >> 0; // again
start = start < 0 ? Math.max(len + start, 0) : Math.min(start, len);
end = end < 0 ? Math.max(len + end, 0) : Math.min(end, len);
ap.splice.call(this, end, len);
ap.splice.call(this, 0, start);
return this;
};
ap.restrict = function(fun) { // in-place filter
// while applying fun the array stays unmodified
var res = ap.filter.apply(this, arguments);
res.unshift(0, this.length >>> 0);
ap.splice.apply(this, res);
return this;
};
ap.transform = function(fun) { // in-place map
if (Object(this) !== this || typeof fun !== 'function') throw new TypeError();
var len = this.length >>> 0,
thisArg = arguments[1];
for (var i=0; i<len; i++)
if (i in this)
this[i] = fun.call(thisArg, this[i], i, this)
return this;
};
// possibly more
Now you could do
A.shorten(0, 1).reverse().append('a', 'b');
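For comparison, an in-place filter can also be sketched as a standalone function that leaves Array.prototype alone. This uses a two-index compaction rather than the splice-based approach above, so unlike restrict it does modify the array while the predicate is being applied; filterInPlace is a hypothetical name.

```javascript
// Sketch: keep only the elements that satisfy predicate, reusing the same array.
// read scans every element; write marks where the next kept element goes.
function filterInPlace(arr, predicate) {
  let write = 0;
  for (let read = 0; read < arr.length; read++) {
    const value = arr[read];
    if (predicate(value, read, arr)) {
      arr[write++] = value;
    }
  }
  arr.length = write; // drop the leftover tail
  return arr;         // return the same array to allow chaining
}

const nums = [1, 2, 3, 4, 5, 6];
filterInPlace(nums, (n) => n % 2 === 0).reverse();
console.log(nums); // [6, 4, 2]
```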
IMHO one of the best libraries is underscorejs http://underscorejs.org/
You probably shouldn't use a library just for this (it's a not-so-useful dependency to add to the project).
The 'standard' way to do it is to call slice first when you want to run mutating operations. There is no problem in doing this, since JS engines are pretty good at handling temporary objects (that's one of the key points of JavaScript).
Ex. :
function reverseStringify( array ) {
return array.slice( )
.reverse( )
.join( ' ' ); }
console.log( reverseStringify( [ 'hello', 'world' ] ) ); // "world hello"
I need to add an element to an array only if it is not already there in Javascript. Basically I'm treating the array as a set.
I need the data to be stored in an array, otherwise I'd just use an object which can be used as a set.
I wrote the following array prototype and wanted to hear if anyone knew of a better way. This is an O(n) insert. I was hoping to do O(ln(n)) insert, however, I didn't see an easy way to insert an element into a sorted array. For my applications, the array lengths will be very small, but I'd still prefer something that obeyed accepted rules for good algorithm efficiency:
Array.prototype.push_if_not_duplicate = function(new_element){
for( var i=0; i<this.length; i++ ){
// Don't add if element is already found
if( this[i] == new_element ){
return this.length;
}
}
// add new element
return this.push(new_element);
}
If I understand correctly, you already have a sorted array (if you do not have a sorted array then you can use the Array.sort method to sort your data) and now you want to add an element to it if it is not already present in the array. I extracted the binary insert method (which uses binary search) from the Google Closure Library. The relevant code would look something like this. Finding the position is an O(log n) operation because binary search is O(log n); note that the splice which performs the actual insertion may still have to shift elements, which is O(n) in the worst case.
function binaryInsert(array, value) {
var index = binarySearch(array, value);
if (index < 0) {
array.splice(-(index + 1), 0, value);
return true;
}
return false;
};
function binarySearch(arr, value) {
var left = 0; // inclusive
var right = arr.length; // exclusive
var found;
while (left < right) {
var middle = (left + right) >> 1;
var compareResult = value > arr[middle] ? 1 : value < arr[middle] ? -1 : 0;
if (compareResult > 0) {
left = middle + 1;
} else {
right = middle;
// We are looking for the lowest index so we can't return immediately.
found = !compareResult;
}
}
// left is the index if found, or the insertion point otherwise.
// ~left is a shorthand for -left - 1.
return found ? left : ~left;
};
Usage is binaryInsert(array, value). This also maintains the sort of the array.
Deleted my other answer because I missed the fact that the array is sorted.
The algorithm you wrote goes through every element in the array and if there are no matches appends the new element on the end. I assume this means you are running another sort after.
The whole algorithm could be improved by using a divide and conquer algorithm. Choose an element in the middle of the array, compare with new element and continue until you find the spot where to insert. It will be slightly faster than your above algorithm, and won't require a sort afterwards.
If you need help working out the algorithm, feel free to ask.
I've created a (simple and incomplete) Set type before like this:
var Set = function (hashCodeGenerator) {
this.hashCode = hashCodeGenerator;
this.set = {};
this.elements = [];
};
Set.prototype = {
add: function (element) {
var hashCode = this.hashCode(element);
if (this.set[hashCode]) return false;
this.set[hashCode] = true;
this.elements.push(element);
return true;
},
get: function (element) {
var hashCode = this.hashCode(element);
return this.set[hashCode];
},
getElements: function () { return this.elements; }
};
You just need to find a good hashCodeGenerator function for your objects. If your objects are primitives, this function can return the object itself. You can then access the set elements in array form from the getElements accessor. Inserts are O(1). Space requirements are O(n), although each element is tracked twice: once as a key in set and once in elements.
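On runtimes with ES2015 support, the same pairing of a lookup structure with an array can be sketched using the native Set, which removes the need for a hash code function. This is a substitute technique rather than the code above; UniqueList is a hypothetical name, and objects are compared by identity rather than by a computed hash.

```javascript
// Sketch: array storage plus a native Set for O(1) membership checks.
class UniqueList {
  constructor() {
    this.seen = new Set(); // membership index
    this.elements = [];    // elements in insertion order
  }

  // Add element if absent; returns true when it was added.
  add(element) {
    if (this.seen.has(element)) return false;
    this.seen.add(element);
    this.elements.push(element);
    return true;
  }

  has(element) {
    return this.seen.has(element);
  }
}

const tags = new UniqueList();
tags.add("a");
tags.add("b");
tags.add("a"); // ignored, already present
console.log(tags.elements); // ["a", "b"]
```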
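If your array is organized as a binary heap, you can insert in O(log n) by putting the new element on the end and bubbling it up into place. Note that a heap on its own does not give you O(log n) duplicate checks; for those you would need a balanced binary search tree, or a sorted array with binary search.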
Wikipedia has a great explanation.