Performance differences between jquery.inArray() vs Object.hasOwnProperty()? - javascript

I have a situation where I can choose to implement a collection of string keys as an object:
$.each(objects, function (key, object) {
    collection[key] = "doesn't matter";
});
or an array:
$.each(objects, function (key, object) {
    collection.push(key);
});
I'd like to be able to quickly determine whether or not the collection contains a given key. If collection is an object, I can use:
if (collection.hasOwnProperty(key_to_find)) {
    // found it!...
} else {
    // didn't find it...
}
If collection is an array, I can use:
if ($.inArray(key_to_find, collection) !== -1) {
    // found it!...
} else {
    // didn't find it...
}
I'd imagine using JavaScript's built-in hasOwnProperty would be faster than jQuery's inArray(), but I'm not entirely sure. Does anyone know more about the performance differences between these two methods? Or, is there a more efficient alternative here that I am not aware of?

If we're talking just how long it takes to check, then there's really no contest:
http://jsperf.com/array-vs-obj
hasOwnProperty is way, way faster, for the reasons stated by others.

Mmmh, if the collection is an array you can also use the native indexOf on it, no? https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/Array/indexOf

The array method is slower, because it requires linear time to find an element in the array (the code must step through potentially every element). hasOwnProperty will be much faster as it can use a hash table lookup, which occurs in constant time.
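To make the gap concrete, here is a minimal micro-benchmark sketch (assumes jQuery is loaded; the keys and sizes are made up):
// Build a large collection both ways
var n = 100000;
var arr = [];
var obj = {};
for (var i = 0; i < n; i++) {
    arr.push('key' + i);
    obj['key' + i] = true;
}

var needle = 'key' + (n - 1); // worst case: last element

console.time('inArray');        // linear scan, O(n)
$.inArray(needle, arr);
console.timeEnd('inArray');

console.time('hasOwnProperty'); // hash lookup, roughly O(1)
obj.hasOwnProperty(needle);
console.timeEnd('hasOwnProperty');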

An object will have very fast access times for properties. However, you still have to account for overhead from calculating the hash, etc. (if the implementation uses hash tables to back objects). If the set of keys is relatively small, there shouldn't be much of a difference. If it's larger, I would go with the object/hash to store properties.
That said, it's a little easier to manage duplicate keys with the object, so I would personally go with the dictionary.
Unless this is a bottleneck in your application, you shouldn't overthink it.
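For instance, a quick sketch of the duplicate-key point:
// The object form deduplicates keys for free:
var objColl = {};
objColl['a'] = "doesn't matter";
objColl['a'] = "doesn't matter"; // still a single entry
// Object.keys(objColl).length === 1

// The array form happily stores duplicates:
var arrColl = [];
arrColl.push('a');
arrColl.push('a'); // now ['a', 'a']
// arrColl.length === 2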

Short Answer: .indexOf() (as @Claudio mentions)
Long Answer: Have a look at a quick speed test I coded up - http://jsfiddle.net/cvallance/4YdxJ/3/ - the difference is really negligible.

Related

Realm-JS: Performant way to find the index of an element in sorted results list

I am searching for a performant way to find the index of a given realm-object in a sorted results list.
I am aware of this similar question, which was answered with using indexOf, so my current solution looks like this:
const sortedRecords = realm.objects('mySchema').sorted('time', true) // 'time' property is a timestamp
// grab element of interest by id (e.g. 123)
const item = realm.objectForPrimaryKey('mySchema', '123')
// find index of that object in my sorted results list
const index = sortedRecords.indexOf(item)
My basic concern here is performance for larger datasets. Is the indexOf implementation of a realm list optimized for this in any way, or is it the same as a plain JavaScript array's? I know there is the possibility to create indexed properties; would indexing the time property improve the performance in this case?
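For reference, declaring the time property as indexed would look roughly like this (a sketch; the other schema fields are assumed for illustration):
// Sketch: schema with an indexed 'time' property
const mySchema = {
    name: 'mySchema',
    primaryKey: 'id',
    properties: {
        id: 'string',
        time: { type: 'date', indexed: true },
    },
};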
Note:
In the realm-js API documentation, the indexOf section does not refer to Array.prototype.indexOf, as other sections do. This made me optimistic that it's a separate implementation, but it's not stated clearly.
Realm query methods return a Results object, which is quite different from an Array object; the main difference is that the first one can change over time even without calling methods on it: adding and/or deleting records in the source schema can result in a change to the Results object.
The only common thing between Results.indexOf and Array.indexOf is the name of the method.
That said, it is easy to see that it makes little sense to compare the efficiency of the two methods.
In general, a problem common to all indexOf implementations is that they need a sequential scan, and in the worst case (i.e. the not-found case) a full scan is required. The worst-implemented indexOf executed against 10 elements has no impact on program performance, while the best-implemented indexOf executed against 1M elements can have a severe impact. When possible, it is always a good idea to avoid using indexOf on large amounts of data.
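If you do need positions repeatedly, one workaround (just a sketch, not a Realm-specific API; it simply iterates the results once) is to build an id-to-index map and rebuild it whenever the results change:
// Build the map once per change to sortedRecords: O(n)
const indexById = {};
sortedRecords.forEach((record, i) => {
    indexById[record.id] = i; // assumes each record has a unique 'id'
});

// Each subsequent lookup is O(1) instead of a linear scan
const index = indexById['123'];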
Hope this helps.

Is it a bad idea to use indexOf inside loops?

I was studying big O notation for a technical interview and realized that JavaScript's indexOf method may have a time complexity of O(N), as it traverses each element of an array and returns the index where it's found.
We also know that a time complexity of O(n^2) (n squared) is not a good performance measure for larger data.
So is it a bad idea to use indexOf inside loops? In JavaScript it's common to see code where indexOf is used inside loops, maybe to test membership or to prepare some object.
Rather than arrays, should we prefer objects wherever possible, as they provide lookups with constant time performance, O(1)?
Any suggestions will be appreciated.
It can be a bad idea to use indexOf inside loops, especially if the data structure you are searching through is quite large.
One workaround is to have a hash table or dictionary containing the index of every item, which you can generate in O(N) time by looping through the data structure once and updating it every time you add to the data structure.
If you push something onto the end of the data structure, keeping this table up to date takes O(1) time; the worst case is inserting something at the beginning, which takes O(N) to reindex.
In most scenarios it will be worth it, as getting the index becomes O(1).
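A minimal sketch of that lookup-table idea (the item values are illustrative):
var items = ['a', 'b', 'c', 'd'];

// Build the index table once: O(N)
var indexOfItem = {};
items.forEach(function (item, i) {
    indexOfItem[item] = i;
});

// Lookups inside a loop are now O(1) instead of O(N)
items.forEach(function (item) {
    var idx = indexOfItem[item];
    // ... use idx ...
});

// Appending keeps the table in sync in O(1)
items.push('e');
indexOfItem['e'] = items.length - 1;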
To be honest, tl;dr. But I did some speed tests of the various ways of checking for occurrences in a string (if that is your goal for using indexOf; if you are actually trying to get the position of the match, I personally don't know how to help you there). The ways I tested were:
.includes()
.match()
.indexOf()
(There are also the variants such as .search(), .lastIndexOf(), etc. Those I have not tested).
Here is the test:
var test = 'test string';
console.time('match');
console.log(test.match(/string/));
console.timeEnd('match');
console.time('includes');
console.log(test.includes('string'));
console.timeEnd('includes');
console.time('indexOf');
console.log(test.indexOf('string') !== -1);
console.timeEnd('indexOf');
I know they are not loops, but this shows that all are basically the same speed. And honestly, each does different things depending on what you need (do you want to search by RegEx? Do you need to be pre-ECMAScript 2015 compatible? etc.; I have not even listed all of them), so is it really necessary to analyze it this much?
From my tests, sometimes indexOf() would win, sometimes one of the other ones would win.
Depending on the browser, indexOf has different implementations (using graphs, trees, ...), so the time complexity of each indexOf also differs.
What is clear, though, is that implementing indexOf as a naive O(n) loop would be simplistic, and I don't think any browser implements it as a plain loop. Therefore, using indexOf in a for loop is not necessarily the same as using two nested for loops.
So, this one:
// could be O(n*m), where m is small,
// could be O(log n),
// or any other O(something) that is certainly smaller than O(n^2)
console.time('1');
firstArray.forEach(item => {
    secondArray.indexOf(item);
});
console.timeEnd('1');
is different than:
// has O(n^2)
console.time('2');
firstArray.forEach(item => {
    secondArray.forEach(secondItem => {
        // extra things to do here
    });
});
console.timeEnd('2');

Underscore reduce for adding prices

I have an object in which I am storing the selections of a shopping cart. Because this is a Backbone app, I decided that to derive the subtotal of the cart items I would use reduce to add up the item prices multiplied by quantities.
Here's a (contrived) example:
var App = {
    cart: {
        item: []
    }
};
//...
App.cart.item.push({ qty: 1, price: 34.95 });
App.cart.item.push({ qty: 3, price: 24.95 });
App.cart.item.push({ qty: 4, price: 1.99 });
App.cart.item.push({ qty: 13, price: 0.99 });
When the view is rendered the subtotal is calculated as follows:
_.reduce(App.cart.item,function(memo, num){ return memo + (num.price * num.qty); },0)
This sparked some debate:
Some colleagues argue that reduce is not the correct method to use here; rather, I should use each and pass the values to a summing function, possibly using a memoize pattern to cache the result.
Another argues that I shouldn't use reduce without passing it the output of a matching map.
Still others agree that while reduce is the correct method to use, I should use the foldl alias instead, as it makes the intention clearer to future devs.
Admittedly I know very little about FP, so my use of the reduce method in Underscore is purely a means to an end of deriving a subtotal without having to store it as a separate property in the JSON server-side.
Please explain to me why reduce is or is not the "correct" approach to produce sums of the object properties. I'm not looking for a pure functional approach here, obviously, as I'm only using reduce as a utility.
This is a pretty subjective question, so I suppose I'll give a subjective answer:
There are basically three things at issue here: actual code efficiency, code legibility, and code style. The last one doesn't really matter IMO except insofar as it affects legibility and the consistency of your code base.
In terms of efficiency, _.reduce is more efficient than _.map + _.reduce, so you can drop that idea. Using _.map creates a new array with the transformed values - there's no need for this.
The _.each + summation function approach may be marginally less efficient (you need to keep track of your running-total variable somewhere else) and to my mind it's less legible, because it takes the relevant code and moves it further from the loop where you're using it. You also don't get a nice clean return value for your operation - instead, you have a variable hanging out somewhere in the outer scope that you need to first create and then re-use.
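Side by side, the two approaches look like this (a sketch using the cart from the question):
// _.reduce: the running total is threaded through the callback
var subtotal = _.reduce(App.cart.item, function (memo, item) {
    return memo + item.price * item.qty;
}, 0);

// _.each: the running total lives in the enclosing scope
var total = 0;
_.each(App.cart.item, function (item) {
    total += item.price * item.qty;
});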
I don't think "memoization" makes much sense here. You might want to cache the return value for a given cart, and invalidate that on item changes, but that's not actually memoizing. Memoization makes sense when a) an easily-hashed input will always produce the same answer, and b) calculating that answer is expensive. In this case, calculating the sum is probably cheaper than hashing the input (your item list).
In terms of legibility, I tend strongly towards using the Underscore aliases that shadow the names of real JS 1.8 methods (the exception, for me, is .any vs. .some, because I find it so much more legible). In this case, that means .reduce, not .inject or .foldl. That makes it easier for a JS dev to understand what's intended, and JS devs are who you care about.
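Note that in Underscore these aliases are literally the same function, so the choice is purely about naming:
_.reduce === _.inject; // true
_.reduce === _.foldl;  // true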

Efficient Javascript Array Lookup

If I have a whitelist of strings against which I want to check everything the user inputs into my JavaScript program, what's the most efficient way to do that? I could just have an array and loop through it until a match is found, but that's O(N). Is there any way to do it that's better and doesn't involve any sort of key-value lookup, just checking to see if that value exists?
EDIT: I guess what I'm looking for is the equivalent of a set in C++ where I can just check to see if a value I'm given already exists in the set.
Just make it a simple js object instead of an array.
var whitelist = {
    "string1": true,
    "string2": true
};
and then you can just check if (whitelist[str]) to see if it's available.
Or use if (str in whitelist).
I expect that the first will have slightly better performance (I haven't verified that), but the second is more readable and makes the purpose clear. So it's your choice which is a better fit.
Sort the array, use binary search for the look-ups.
Or
Create an object where the key is the item and use the hash look-up whitelist[value] != undefined
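For the first option, a sketch of a binary search over a sorted whitelist (assumes the array is kept sorted):
var whitelist = ['apple', 'banana', 'cherry']; // must stay sorted

// Binary search: O(log n) per lookup
function contains(sorted, value) {
    var lo = 0, hi = sorted.length - 1;
    while (lo <= hi) {
        var mid = (lo + hi) >> 1;
        if (sorted[mid] === value) return true;
        if (sorted[mid] < value) lo = mid + 1;
        else hi = mid - 1;
    }
    return false;
}

contains(whitelist, 'banana'); // true
contains(whitelist, 'durian'); // false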
I think you'll find that key-value lookup is almost identical in performance to some kind of set implementation without values. (Many standard libraries actually just implement a set using a map)

Best way to store a huge list with hashes in Javascript

I have a list with 10,000 entries.
For example:
myList = {};
myList['hashjh5j4h5j4h5j4'] = true;
myList['hashs54s5d4s5d4sd'] = true;
myList['hash5as465d45ad4d'] = true;
// ...
I don't use an array (0, 1, 2, 3) because this way I can check very quickly whether a hash exists or not:
if (typeof myList['hashjh5j4h5j4h5j4'] == 'undefined') {
    alert('it is new');
} else {
    alert('old stuff');
}
But I am not sure: is this a good solution?
Is it a problem to handle an object with 10,000 entries?
EDIT:
I am trying to build an RSS feed reader which shows only new feeds, so I calculate a hash from the link (every news item has a unique link) and store it in the object (MongoDB). BTW: 10,000 entries is not the normal case (but it is possible).
My advice:
Use as small a hash as possible for the task at hand. If you are dealing with hundreds of hashable strings, as opposed to billions, then your hash length can be relatively small.
Store the hash as an integer, not a string, so it takes no more room than needed.
Don't store them as object keys; just store them in a simple binary tree log2(keySize) deep.
Further thoughts:
Can you come at this with a hybrid approach? Use hashes for recent feeds less than a month old, and don't bother showing items more than a month old. Store the hash and date together, and clean out old hashes each day?
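A rough sketch of that hybrid idea (function and variable names are illustrative):
// hash -> timestamp (ms) when the link was first seen
var seen = {};

function markSeen(hash) {
    seen[hash] = Date.now();
}

function isNew(hash) {
    return !seen.hasOwnProperty(hash);
}

// Run once a day: drop hashes older than ~30 days
function pruneOld() {
    var cutoff = Date.now() - 30 * 24 * 60 * 60 * 1000;
    for (var hash in seen) {
        if (seen.hasOwnProperty(hash) && seen[hash] < cutoff) {
            delete seen[hash];
        }
    }
}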
You can use the in operator:
if ('hashjh5j4h5j4h5j4' in myList) { .. }
However, this will also return true for members that are in the object's prototype chain:
Object.prototype.foo = function () {};
if ("foo" in myList) { /* will be true */ };
To fix this, you could use hasOwnProperty instead:
if (myList.hasOwnProperty('hashjh5j4h5j4h5j4')) { .. }
Whilst you yourself may not have added methods to Object.prototype, you cannot guarantee that the 3rd-party libraries you use haven't. Incidentally, extending Object.prototype is frowned upon, so you shouldn't really do it. Why? Because you shouldn't modify things you don't own.
10,000 is quite a lot. You might consider storing the hashes in a database and querying it using Ajax. It may take a bit longer to query one hash, but your page loads much faster.
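A sketch of what that could look like with jQuery (the '/feeds/seen' endpoint is made up):
// Ask the server whether this hash has been seen before
$.getJSON('/feeds/seen', { hash: 'hashjh5j4h5j4h5j4' }, function (res) {
    if (res.exists) {
        alert('old stuff');
    } else {
        alert('it is new');
    }
});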
It is not a problem in modern browsers on modern computers in any way.
10,000 entries that take up 50 bytes each would still take up less than 500 KB of RAM.
As long as the JS is served gzipped, bandwidth is no problem, but do try to serve the data as late as possible so it doesn't block perceived page-load performance.
All in all, unless you wish to cater to cellphones, your solution is fine.
