knockout.js compare 2 arrays without looping through them both - javascript

I have 2 observable arrays and within an if statement I want to do something only if the arrays are identical, is there any way to do this without looping through each?

Generally speaking, no. You cannot differentiate an apple from an orange if you don't see that the apple is an apple and the orange is an orange. You can, however, use heuristics/shortcuts, such as:
Is the length of them equal? If not, then sure as hell, they are different.
Does the order of elements matter? If yes, the first time you find a difference, you can terminate early with a negative answer.
Also a few things to note about the JSON approach posted in the other answer:
It will not work if the order of items does not matter
It will not work if any object in the array contains circular references
It will not work if there are objects with the same properties and values but the keys are in a different order. For example: { a: 1, b: 2 } and { b: 2, a: 1 } would normally be considered equal (unless your explicit purpose is object reference equality) but the JSON representation of these will be different.
It is slower, because just to produce the JSONs you need to iterate over all properties anyway and you need to iterate over the individual characters of the JSON to perform the comparison.
So to sum it up, while it might be a tiny bit shorter and easier to read, there are several caveats, which might or might not mean a problem depending on your requirements.
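The length shortcut and the early exit described above can be sketched as a single loop. This is a minimal sketch, assuming that order matters and that elements are compared with `===` (so objects are compared by reference); `arraysEqual` is a hypothetical helper name, not part of Knockout. With observable arrays you would pass the unwrapped arrays, for example `arraysEqual(obsA(), obsB())`, where `obsA` and `obsB` stand in for your own observables.

```javascript
// Hypothetical helper: order-sensitive, shallow comparison of two plain arrays.
function arraysEqual(a, b) {
  if (a === b) return true;                 // same reference, nothing to scan
  if (a.length !== b.length) return false;  // cheap shortcut: different lengths
  for (let i = 0; i < a.length; i += 1) {
    if (a[i] !== b[i]) return false;        // terminate at the first difference
  }
  return true;
}

console.log(arraysEqual([1, 2, 3], [1, 2, 3])); // true
console.log(arraysEqual([1, 2, 3], [1, 2]));    // false
console.log(arraysEqual([1, 2, 3], [1, 3, 2])); // false
```

There is still a loop, but it runs at most once over the arrays and stops as soon as the answer is known.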

You can call ko.toJSON on both arrays and then compare the json strings that are returned. Functionally that's probably just as much work if not more for the processor as looping through both arrays but it does look cleaner if that's all you're going for.
var isEqual = ko.toJSON(aryA) === ko.toJSON(aryB);
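Two of the caveats raised about the JSON approach are easy to reproduce. The sketch below uses plain `JSON.stringify` on ordinary arrays as a stand-in for `ko.toJSON`, so it runs without Knockout; the variable names are made up for illustration.

```javascript
// Same properties and values, different key order.
const left = [{ a: 1, b: 2 }];
const right = [{ b: 2, a: 1 }];

// The serialized strings differ, so the comparison reports "not equal".
const sameByJson = JSON.stringify(left) === JSON.stringify(right);
console.log(sameByJson); // false

// An object that refers to itself cannot be serialized at all.
const node = { name: 'loop' };
node.self = node;

let circularFailed = false;
try {
  JSON.stringify([node]);
} catch (err) {
  circularFailed = err instanceof TypeError;
}
console.log(circularFailed); // true
```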

Is it a bad idea to use indexOf inside loops?

I was studying big O notation for a technical interview when I realized that JavaScript's indexOf method may have a time complexity of O(N), as it traverses each element of an array and returns the index where the value is found.
We also know that a time complexity of O(n^2) (n squared) is not a good performance measure for larger data.
So is it a bad idea to use indexOf inside loops? In JavaScript, it's common to see code where the indexOf method is used inside loops, maybe to check equality or to prepare some object.
Rather than arrays, should we prefer objects wherever necessary, as they provide lookup with constant-time performance, O(1)?
Any suggestions will be appreciated.
It can be a bad idea to use indexOf inside loops, especially if the data structure you are searching through is quite large.
One workaround is to keep a hash table or dictionary containing the index of every item. You can generate it in O(N) time by looping through the data structure, and then update it every time you add to the data structure.
If you push something onto the end of the data structure, it takes O(1) time to update this table; the worst case is pushing something onto the beginning, which takes O(N) because every stored index shifts.
In most scenarios it will be worth it, as getting the index will then be O(1) time.
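The index table described above could look roughly like this. It is a sketch under a few assumptions: the items are primitives (or objects compared by reference), only appends are supported, and the names `indexOfItem` and `appendItem` are invented for the example.

```javascript
const items = ['apple', 'banana', 'cherry'];

// Build the table once in O(N): item -> index of its first occurrence.
const indexOfItem = new Map();
items.forEach((item, i) => {
  if (!indexOfItem.has(item)) indexOfItem.set(item, i);
});

// Appending keeps the table in sync in O(1).
function appendItem(item) {
  if (!indexOfItem.has(item)) indexOfItem.set(item, items.length);
  items.push(item);
}

// Lookups inside a loop no longer rescan the array.
const wanted = ['cherry', 'apple', 'kiwi'];
const positions = wanted.map((w) => (indexOfItem.has(w) ? indexOfItem.get(w) : -1));
console.log(positions); // [2, 0, -1]

appendItem('kiwi');
console.log(indexOfItem.get('kiwi')); // 3
```

Inserting at the front or removing items would require rewriting the stored indexes, which is the O(N) worst case mentioned above.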
To be honest, tl;dr. But, I did some speed tests of the various ways of checking for occurrences in a string (if that is your goal for using indexOf. If you are actually trying to get the position of the match, I personally don't know how to help you there). The ways I tested were:
.includes()
.match()
.indexOf()
(There are also the variants such as .search(), .lastIndexOf(), etc. Those I have not tested).
Here is the test:
var test = 'test string';
console.time('match');
console.log(test.match(/string/));
console.timeEnd('match');
console.time('includes');
console.log(test.includes('string'));
console.timeEnd('includes');
console.time('indexOf');
console.log(test.indexOf('string') !== -1);
console.timeEnd('indexOf');
I know these are not loops, but they show that all three run at basically the same speed. Honestly, each does something different, and which one you need depends on your situation (do you want to search by RegEx? Do you need to be compatible with pre-ECMAScript 2015 environments? etc.; I have not even listed all of them). Is it really necessary to analyze it this much?
From my tests, sometimes indexOf() would win, sometimes one of the other ones would win.
Depending on the browser, indexOf may have different implementations (using graphs, trees, ...), so the time complexity of indexOf may also differ.
Though what seems clear to me is that implementing indexOf as O(n) would be naive, and I don't think any browser implements it as a simple loop. Therefore, using indexOf in a for loop is not the same as using 2 nested for loops.
So, this one:
// could be O(n*m) where m is very small
// could be O(log n)
// or any other O(something) that is for sure smaller than O(n^2)
console.time('1')
firstArray.forEach(item => {
  secondArray.indexOf(item)
})
console.timeEnd('1')
is different than:
// has O(n^2)
console.time('2')
firstArray.forEach(item => {
  secondArray.forEach(secondItem => {
    // extra things to do here
  })
})
console.timeEnd('2')

how does javascript move to a specific index in an array?

This is more of a general question about the inner workings of the language. I was wondering how JavaScript gets the value at an index. For example, when you write array[index], does it loop through the array until it finds it, or does it use some other means? The reason I ask is that I have written some code where I loop through arrays to match values and find points on a grid, and I am wondering whether performance would improve by creating an array like array[gridX][gridY], or whether it would make no difference. What I am doing now is going through a flat array of objects with grid points as properties, like this:
var array = [{x:1,y:3}];
Then I loop through it and use the coordinates in the object properties to identify and use the values contained in the object.
My thought is that a multidimensional grid would let me access them more directly, since I can specify a grid point by saying array[1][3] instead of looping through and doing:
for ( var a = 0; a < array.length; a += 1 ){
  if( array[a].x === 1 && array[a].y === 3 ){
    return array[a];
  }
}
or something like that.
Any insight would be appreciated, thanks!
For example when you write array[index] does it loop through the array till it finds it? or by some other means?
This is implementation defined. Javascript can have both numeric and string keys and the very first Javascript implementations did do this slow looping to access things.
However, nowadays most browsers are more efficient and store arrays in two parts, a packed array for numeric indexes and a hash table for the rest. This means that accessing a numeric index (for dense arrays without holes) is O(1) and accessing string keys and sparse arrays is done via hash tables.
I am wondering if performance would be increased by creating an array like array[gridX][gridY] or if it will make a difference. what I am doing now is going through a flat array of objects with gridpoints as properties like this array[{x:1,y:3}]
Go with the 2-dimensional array. It's a much simpler solution and is most likely going to be efficient enough for you.
Another reason to do this is that when you use an object as an array index, what actually happens is that the object is converted to a string and then that string is used as a hash table key. So array[{x:1,y:3}] is actually array["[object Object]"]. If you really wanted, you could override the toString method so that not all grid points serialize to the same value, but I don't think it's worth the trouble.
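Both points can be checked with a few lines. The sketch below first shows object keys collapsing to "[object Object]", then builds a 2-dimensional grid from a flat list of points; `points`, `grid`, and `pointAt` are illustrative names, and the `value` property is made up for the example.

```javascript
// Using objects as keys: both assignments land on the same string key.
const byObject = [];
byObject[{ x: 1, y: 3 }] = 'first';
byObject[{ x: 9, y: 9 }] = 'second';
console.log(Object.keys(byObject));     // ['[object Object]']
console.log(byObject[{ x: 1, y: 3 }]);  // 'second'

// Building a 2-dimensional grid from the flat list instead.
const points = [
  { x: 1, y: 3, value: 'A' },
  { x: 2, y: 0, value: 'B' },
];

const grid = [];
for (const p of points) {
  if (!grid[p.x]) grid[p.x] = [];  // create the row on first use
  grid[p.x][p.y] = p;
}

// Direct lookup, no scan over the flat list.
function pointAt(x, y) {
  return grid[x] ? grid[x][y] : undefined;
}

console.log(pointAt(1, 3).value); // 'A'
console.log(pointAt(5, 5));       // undefined
```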
Whether it's an array or an object, the underlying structure in any modern javascript engine is a hashtable. Need to prove it? Allocate an array of 1000000000 elements and notice the speed and lack of memory growth. Javascript arrays are a special case of Object that provides a length property and restricts the keys to integers, but it's sparse.
So, you are really chaining hashtables together. When you nest tables, as in a[x][y], you are creating multiple hashtables, and it will require multiple visits to resolve an object.
But which is faster? Here is a jsperf testing the speed of allocation and access, respectively:
http://jsperf.com/hash-tables-2d-versus-1d
http://jsperf.com/hash-tables-2d-versus-1d/2
On my machines, the nested approach is faster.
Intuition is no match for the profiler.
Update: It was pointed out that in some limited instances, arrays really are arrays underneath. But since arrays are specialized objects, you'll find that in these same instances, objects are implemented as arrays as well (i.e., {0:'hello', 1:'world'} is internally implemented as an array). But this shouldn't frighten you from using arrays with trillions of elements, because that special case will be discarded once it no longer makes sense.
To answer your initial question, in JavaScript, arrays are nothing more than a specialized type of object. If you set up a new Array like this:
var someArray = new Array(1, 2, 3);
You end up with an Array object with a structure that looks more-or-less, like this (Note: this is strictly in regards to the data that it is storing . . . there is a LOT more to an Array object):
someArray = {
  0: 1,
  1: 2,
  2: 3
}
What the Array object does add to the equation, though, is built-in operations that allow you to interact with it in the [1, 2, 3] concept that you are used to. push(), for example, uses the array's length property to figure out where the next value should be added, creates the value in that position, and increments the length property.
However (getting back to the original question), there is nothing in the array structure that is any different when it comes to accessing the values that it stores than any other property. There is no inherent looping or anything like that . . . accessing someArray[0] is essentially the same as accessing someArray.length.
In fact, the only reason that you have to access standard array values using the someArray[N] format is that, the stored array values are number-indexed, and you cannot directly access object properties that begin with a number using the "dot" technique (i.e., someArray.0 is invalid, someArray[0] is not).
Now, admittedly, that is a pretty simplistic view of the Array object in JavaScript, but, for the purposes of your question, it should be enough. If you want to know more about the inner workings, there is TONS of information to be found here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array
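The "array as a specialized object" view above can be observed from the language side, whatever the engine does internally. A small sketch, reusing the `someArray` name from the answer:

```javascript
const someArray = new Array(1, 2, 3);

// The stored values are ordinary properties with string keys.
console.log(Object.keys(someArray));           // ['0', '1', '2']
console.log(someArray['1'] === someArray[1]);  // true

// push() writes at the current length and then bumps length.
someArray.push(4);
console.log(someArray.length);                 // 4
console.log(someArray[3]);                     // 4

// Still an object, but recognizably an Array.
console.log(typeof someArray);                 // 'object'
console.log(Array.isArray(someArray));         // true
```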
single dimension hash table with direct access:
var array = {
  x1y3: 999,
  x10y100: 0
};
function getValue(x, y) {
  return array['x' + x + 'y' + y];
}

Do undefined values in Javascript arrays use any memory or get iterated over in a for-in loop?

I'm building an Entity System for a game, and basically I'm not sure whether I should use simple objects (dictionaries) or arrays to store the entities/components by their id.
My biggest issue is that I didn't want a dynamic entity id. If the id was just a string (using the dictionary to store entities), then it would always be valid and I could use storage[id] to get to that entity.
If I used arrays, I thought, the id's of entities, which would represent an index in the storage array, would change. Consider this array of entities:
[
Entity,
Entity, //This one is being removed.
Entity
];
If I were to remove the second entity in that array, I thought that the id required to access the third entity would have to change to the id (index) of the (now removed) second entity. That's because I thought about deleting in terms of splice()ing.
But, I could use the delete expression to turn the element (an entity) into an undefined! And, if it's true that arrays in Javascript are actually just objects, and objects logically have infinitely many undefined values, does that mean that undefined values inside arrays don't use up memory?
Initially, I thought that arrays were implemented in a way that they were aligned in memory, and that an index was just an offset from the first element, and by this logic I thought that undefined values would use at least the memory of a pointer (because I, actually, thought that pointers to elements are aligned, not elements themselves).
So, if I stored 10k+ entities in this array, and deleted half of them, would the 5k undefined's use any memory at all?
Also, when I do a for entity in array loop, would these undefined elements be passed?
Also, where can I find resources to see how arrays are actually supposed to be implemented in Javascript? All I can find are general explanations of arrays and tutorials on how to use them, but I want to find out all about these little quirks that can prove important in certain situations. Something like a "Javascript quirks" site would be great.
Arrays are not just objects. In particular the length property is very magic.
Of course, a JavaScript engine is allowed to represent the array internally in any way it chooses, as long as the external API remains the same. For instance, if you set randomly separated values then they may be stored as a hash, but if you set consecutive values then they may be optimised into an array.
for ... in does not enumerate properties that are not set. This includes an array literal that skips values e.g. [true, , false], which will only enumerate indices 0 and 2.
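The enumeration behaviour described above can be checked directly. In this sketch the entity values are placeholder strings; it says nothing about how much memory an engine actually uses for the hole, only what the language lets you observe.

```javascript
const entities = ['e0', 'e1', 'e2'];

// delete leaves a hole: length and the other indexes are unchanged.
delete entities[1];
console.log(entities.length); // 3
console.log(1 in entities);   // false
console.log(entities[2]);     // 'e2'

// for...in skips the hole.
const visited = [];
for (const key in entities) visited.push(key);
console.log(visited);         // ['0', '2']

// An explicitly assigned undefined is a set property and is enumerated.
const explicit = ['e0', undefined, 'e2'];
const visitedExplicit = [];
for (const key in explicit) visitedExplicit.push(key);
console.log(visitedExplicit); // ['0', '1', '2']
```

So ids based on indexes stay stable after `delete`, unlike after `splice()`, which shifts the later elements down.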

What is the Python best practice concerning dicts vs objects for simple key-value storage?

After some time programming in Javascript I have grown a little fond of the duality there between objects and associative arrays (dictionaries):
//Javascript
var stuff = { a: 17, b: 42 };
stuff.a; //direct access (good sugar for basic use)
stuff['a']; //key based access (good for flexibility and for foreach loops)
In Python there are basically two ways to do this kind of thing (as far as I know)
Dictionaries:
stuff = { 'a': 17, 'b': 42 }
# no direct access :(
stuff['a'] #key based access
or Objects:
#use a dummy class since instantiating object does not let me set things
class O(object):
    pass
stuff = O()
stuff.a = 17
stuff.b = 42
stuff.a #direct access :)
getattr(stuff, 'a') #key based access
edit: Some responses also mention namedtuples as a built-in way to create lightweight classes for immutable objects.
So my questions are:
Are there any established best-practices regarding whether I should use dicts or objects for storing simple, method-less key-value pairs?
I can imagine there are many ways to create little helper classes to make the object approach less ugly (for example, something that receives a dict on the constructor and then overrides __getattribute__). Is it a good idea or am I over-thinking it?
If this is a good thing to do, what would be the nicest approach? Also, would there be any good Python projects using said approach that I might take inspiration from?
Not sure about "established best practices", but what I do is:
If the value types are homogeneous – i.e. all values in the mappings are numbers, use a dict.
If the values are heterogeneous, and if the mapping always has a given more or less constant set of keys, use an object. (Preferably use an actual class, since this smells a lot like a data type.)
If the values are heterogenous, but the keys in the mapping change, flip a coin. I'm not sure how often this pattern comes up with Python, dictionaries like this notably appear in Javascript to "fake" functions with keyword arguments. Python already has those, and **kwargs is a dict, so I'd go with dicts.
Or to put it another way, represent instances of data types with objects. Represent ad-hoc or temporary mappings with dicts. Swallow having to use the ['key'] syntax – making Python feel like Javascript just feels forced to me.
This would be how I decide between a dict and an object for storing simple, method-less key-value pairs:
1. Do I need to iterate over my key-value pairs?
   Yes: use a dict
   No: go to 2.
2. How many keys am I going to have?
   A lot: use a dict
   A few: go to 3.
3. Are the key names important?
   No: use a dict
   Yes: go to 4.
4. Do I wish to set these important key names in stone once and forever?
   No: use a dict
   Yes: use an object
It may also be interesting to take a look at the difference shown by dis:
>>> import dis
>>> def dictf(d):
...     d['apple'] = 'red'
...     return d['apple']
...
>>> def objf(ob):
...     ob.apple = 'red'
...     return ob.apple
...
>>> dis.dis(dictf)
  2           0 LOAD_CONST               1 ('red')
              3 LOAD_FAST                0 (d)
              6 LOAD_CONST               2 ('apple')
              9 STORE_SUBSCR
  3          10 LOAD_FAST                0 (d)
             13 LOAD_CONST               2 ('apple')
             16 BINARY_SUBSCR
             17 RETURN_VALUE
>>> dis.dis(objf)
  2           0 LOAD_CONST               1 ('red')
              3 LOAD_FAST                0 (ob)
              6 STORE_ATTR               0 (apple)
  3           9 LOAD_FAST                0 (ob)
             12 LOAD_ATTR                0 (apple)
             15 RETURN_VALUE
Well, if the keys are known ahead of time (or actually, even not, really), you can use named tuples, which are basically easily-created objects with whatever fields you choose. The main constraint is that you have to know all of the keys at the time you create the tuple class, and they are immutable (but you can get an updated copy).
http://docs.python.org/library/collections.html#collections.namedtuple
In addition, you could almost certainly create a class that allows you to create properties dynamically.
Well, the two approaches are closely related! When you do
stuff.a
you're really accessing
stuff.__dict__['a']
Similarly, you can subclass dict to make __getattr__ return the same as __getitem__ and so stuff.a will also work for your dict subclass.
The object approach is often convenient and useful when you know that the keys in your mapping will all be simple strings that are valid Python identifiers. If you have more complex keys, then you need a "real" mapping.
You should of course also use objects when you need more than a simple mapping. This "more" would normally be extra state or extra computations on the returned values.
You should also consider how others will use your stuff objects. If they know it's a simple dict, then they also know that they can call stuff.update(other_stuff) etc. That's not so clear if you give them back an object. Basically: if you think they need to manipulate the keys and values of your stuff like a normal dict, then you should probably make it a dict.
As for the most "pythonic" way to do this, then I can only say that I've seen libraries use both approaches:
The BeautifulSoup library parses HTML and hands you back some very dynamic objects where both attribute and item access have special meanings.
They could have chosen to give back dict objects instead, but there is a lot of extra state associated with each object and so it makes perfect sense to use a real class.
There are of course also lots of libraries that simply give back normal dict objects — they are the bread and butter of many Python programs.

How do modern browsers implement JS Array, specifically adding elements?

By this I mean when calling .push() on an Array object and JavaScript increases the capacity (in number of elements) of the underlying "array". Also, if there is a good resource for finding this sort of information for JS, that would be helpful to include.
edit
It seems that the JS Array is like an object literal with special properties. However, I'm interested in a lower level of detail--how browsers implement this in their respective JS engines.
There cannot be any single correct answer to this question. An array's mechanism for expanding is an internal implementation detail and can vary from one JS implementation to another. In fact, the Tamarin engine has two different implementations used internally for arrays, depending on whether it determines that the array is going to be sequential or sparse.
This answer is wrong. Please see @Samuel Neff's answer and the following resources:
http://news.qooxdoo.org/javascript-array-performance-oddities-characteristics
http://jsperf.com/array-popuplation-direction
Arrays in JavaScript don't have a capacity since they aren't real arrays. They're actually just object hashes with a length property and properties of "0", "1", "2", etc. When you do .push() on an array, it effectively does:
ary[ ary.length++ ] = the_new_element; // set via hash
Javascript does include a mechanism to declare the length of your array like:
var foo = new Array(3);
alert(foo.length); // alerts 3
But since arrays are dynamic in JavaScript there is no reason to do this; you don't have to allocate your arrays manually. The above example does not create a fixed-length array; it just sets the length to 3, and each of the 3 slots reads as undefined.
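What the language guarantees here can be observed directly, independent of how a given engine stores the array internally. A short sketch, reusing the `foo` name from above:

```javascript
const foo = new Array(3);
console.log(foo.length);  // 3
console.log(0 in foo);    // false: the slots are holes, not stored values
console.log(foo[0]);      // undefined

// The array is not fixed-length: push() appends after the holes.
foo.push('x');
console.log(foo.length);  // 4
console.log(foo[3]);      // 'x'

// Writing at index length has the same observable effect as push().
foo[foo.length] = 'y';
console.log(foo.length);  // 5
```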
// Edit: I either misread your question or you changed it, sorry I don't think this is what you were asking.
