So I have a Person object here with the following two attributes: firstname and mood. Assume the firstname attribute is unique.
If I have an array of 3 Person objects (assume each line is a single Person object):
Alison, Happy
Bob, Sad
Charles, Happy
If I have a second array, for example of 3 people (from a JSON array):
Alison, Sad
Bob, Happy
Jordan, Sad
I want the most efficient way to iterate through the second array and update items in the first array. The way I think of it is with two for loops. So, for example:
1st iteration, sees Alison, searches through first array for Alison, updates Alison to Sad. 2nd iteration, sees Bob, searches through...etc
3rd iteration, sees Jordan, searches through 1st array for Jordan, does not find it... pushes a new object onto the array with Jordan, Sad.
Now, I know that an array may not be the best way to do this; if there is a better way to accomplish this without the first collection being an array (it could be a map, etc.), that's fine. What I care about is performance, as the method I described is incredibly inefficient for many people, for example if the arrays were of size 100.
Please help; I would greatly appreciate it.
This depends somewhat on your browser and library environment. There is Array.prototype.indexOf('value'), but that won't work in older IE (IE8 and below), and it's probably just doing an (optimized) loop behind the scenes anyway.
If you have a lot of data, and it's likely to be sorted (which doesn't seem to be the case for you), it's better to do a binary search. In other words, take the value from the middle of the data set and compare it to your query. If the middle value is greater, your next comparison is against the middle value of the first half of the set, and so on, until you find it.
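For illustration, a minimal binary search sketch, assuming the first array is kept sorted by the unique firstname attribute from the question:
// Returns the index of `name` in `sorted` (an array of Person objects
// sorted by firstname), or -1 if it is absent.
function binarySearch(sorted, name) {
  var lo = 0, hi = sorted.length - 1;
  while (lo <= hi) {
    var mid = (lo + hi) >> 1;                        // middle of the current range
    if (sorted[mid].firstname === name) return mid;
    if (sorted[mid].firstname < name) lo = mid + 1;  // search upper half
    else hi = mid - 1;                               // search lower half
  }
  return -1; // not found
}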
Not necessarily a solution to the performance problem (though better than some), but a good alternative for this kind of work in terms of reducing coding overhead is the underscore.js library. With underscore you could use _.detect (an alias of _.find) to fetch your value, and then just use Array.prototype.push if it is not found.
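A sketch of that approach, assuming underscore is loaded as _ and the two arrays from the question are named people and updates:
updates.forEach(function (update) {
  // _.detect / _.find scans linearly for the first match
  var existing = _.detect(people, function (p) {
    return p.firstname === update.firstname;
  });
  if (existing) {
    existing.mood = update.mood; // found: update in place
  } else {
    people.push(update);         // not found: append
  }
});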
If each person has a unique id or reference, you can put them in an object and use the id as a key, then do:
if (sadBob.id in personCollection) {
  // sadBob is already there
  personCollection.update(sadBob);    // placeholder for your update logic
} else {
  // add sadBob
  personCollection.addPerson(sadBob); // placeholder for your insert logic
}
Alternatively, you might add a property to the person objects in array2 that is their index in array1 when you create them. So if they don't have an index, they aren't in array1. If they have an index, you can go straight there (which is essentially the same as the above with an object and using the array index as the reference or id).
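Putting the keyed-object idea together for the original question, a minimal sketch (assuming firstname is the unique key, per the question):
// Build a lookup keyed by the unique firstname: O(N), done once.
var byName = {};
people.forEach(function (p) { byName[p.firstname] = p; });

// Apply each update with an O(1) lookup instead of a scan: O(M) overall.
updates.forEach(function (u) {
  if (u.firstname in byName) {
    byName[u.firstname].mood = u.mood; // update the existing person
  } else {
    byName[u.firstname] = u;           // add the new person
    people.push(u);                    // keep the array in sync, if still needed
  }
});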
Let's say you want to search for a match between two arrays.
You could loop through both arrays and find a match. This is O(N*M), though.
I've found another option is to create a lookup list.
How is it that using a lookup list (a simple Object used as a dictionary) is so much faster for finding an element than using a search algorithm?
i.e.
const map = {
  // ... hundreds of keys mapping to hundreds of values ...
};
One can simply read the value for the key 'foo' by doing the below:
const bar = map['foo'];
as opposed to multiple loops or using something like _.filter or _.find.
I understand that the map approach would introduce more memory into the equation, but processing-wise it seems to scale nicely. What is it about hashes that makes searching so fast? Isn't a full scan of all keys still needed at the end of the day?
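To make the comparison concrete, a minimal sketch of the two approaches being contrasted (names are illustrative):
const people = [/* hundreds of {firstname, mood} objects */];

// Linear search: O(N) per query, scans elements until a match.
const slow = people.find(p => p.firstname === 'foo');

// Lookup object: built once in O(N). Each read then hashes the key
// straight to its bucket, so no scan of all keys is needed.
const byName = {};
for (const p of people) byName[p.firstname] = p;
const fast = byName['foo']; // O(1) on average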
I am searching for a performant way to find the index of a given Realm object in a sorted results list.
I am aware of this similar question, which was answered with using indexOf, so my current solution looks like this:
const sortedRecords = realm.objects('mySchema').sorted('time', true) // 'time' property is a timestamp
// grab element of interest by id (e.g. 123)
const item = realm.objectForPrimaryKey('mySchema','123')
// find index of that object in my sorted results list
const index = sortedRecords.indexOf(item)
My basic concern here is performance for larger datasets. Is the indexOf implementation of a Realm list improved for this in any way, or is it the same as a JavaScript array's? I know there is the possibility to create indexed properties; would indexing the time property improve the performance in this case?
Note:
In the realm-js API documentation, the indexOf section does not reference Array.prototype.indexOf, as other sections do. This made me optimistic that it's their own implementation, but it's not stated clearly.
Realm query methods return a Results object, which is quite different from an Array object. The main difference is that the former can change over time even without calling methods on it: adding and/or deleting records in the source schema can result in a change to the Results object.
The only common thing between Results.indexOf and Array.indexOf is the name of the method.
Having said that, it's easy to also say that it makes no sense to compare the efficiency of the two methods.
In general, a problem common to all indexOf implementations is that they need a sequential scan, and in the worst case (i.e. the not-found case) a full scan is required. The worst-implemented indexOf executed against 10 elements has no impact on program performance, while the best-implemented indexOf executed against 1M elements can have a severe impact on program performance. When possible, it's always a good idea to avoid using indexOf on large amounts of data.
Hope this helps.
This is more a general question about the inner workings of the language. I was wondering how JavaScript gets the value at an index. For example, when you write array[index], does it loop through the array until it finds it, or does it use some other means? The reason I ask is that I have written some code where I am looping through arrays to match values and find points on a grid, and I am wondering whether performance would be increased by creating an array like array[gridX][gridY], or if it will make a difference. What I am doing now is going through a flat array of objects with grid points as properties, like this:
var array = [{x:1,y:3}];
then looping through and using the coordinates in each object's properties to identify and use the values contained in the object.
My thought is that by implementing a multidimensional grid it would access them more directly, as I can specify a grid point by saying array[1][3] instead of looping through and doing:
for ( var a = 0; a < array.length; a += 1 ){
if( array[a].x === 1 && array[a].y === 3 ){
return array[a];
}
}
or something of the like.
any insight would be appreciated, thanks!
For example, when you write array[index], does it loop through the array until it finds it, or does it use some other means?
This is implementation-defined. JavaScript can have both numeric and string keys, and the very first JavaScript implementations did do this slow looping to access things.
However, nowadays most browsers are more efficient and store arrays in two parts, a packed array for numeric indexes and a hash table for the rest. This means that accessing a numeric index (for dense arrays without holes) is O(1) and accessing string keys and sparse arrays is done via hash tables.
I am wondering if performance would be increased by creating an array like array[gridX][gridY] or if it will make a difference. What I am doing now is going through a flat array of objects with grid points as properties, like this: array[{x:1,y:3}]
Go with the two-dimensional array. It's a much simpler solution and is most likely going to be efficient enough for you.
Another reason to do this is that when you use an object as an array index, what actually happens is that the object is converted to a string, and then that string is used as a hash table key. So array[{x:1,y:3}] is actually array["[object Object]"]. If you really wanted, you could override the toString method so not all grid points serialize to the same value, but I don't think it's worth the trouble.
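A minimal sketch of the two-dimensional approach, assuming the points come from a flat list like the one in the question:
var points = [{x: 1, y: 3}, {x: 2, y: 0}]; // flat list, as in the question

// Build the grid once: O(N).
var grid = [];
points.forEach(function (p) {
  if (!grid[p.x]) grid[p.x] = [];
  grid[p.x][p.y] = p;
});

// Direct access afterwards, no scanning.
var point = grid[1][3]; // {x: 1, y: 3}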
Whether it's an array or an object, the underlying structure in any modern JavaScript engine is a hashtable. Need to prove it? Allocate an array of 1000000000 elements and notice the speed and lack of memory growth. JavaScript arrays are a special case of Object that provides a length property and restricts the keys to integers, but it's sparse.
So, you are really chaining hashtables together. When you nest tables, as in a[x][y], you're creating multiple hashtables, and it will require multiple visits to resolve an object.
But which is faster? Here is a jsperf testing the speed of allocation and access, respectively:
http://jsperf.com/hash-tables-2d-versus-1d
http://jsperf.com/hash-tables-2d-versus-1d/2
On my machines, the nested approach is faster.
Intuition is no match for the profiler.
Update: It was pointed out that in some limited instances, arrays really are arrays underneath. But since arrays are specialized objects, you'll find that in these same instances, objects are implemented as arrays as well (i.e., {0:'hello', 1:'world'} is internally implemented as an array). But this shouldn't frighten you away from using arrays with trillions of elements, because that special case will be discarded once it no longer makes sense.
To answer your initial question: in JavaScript, arrays are nothing more than a specialized type of object. If you set up a new Array like this:
var someArray = new Array(1, 2, 3);
You end up with an Array object with a structure that looks, more or less, like this (note: this is strictly in regard to the data it is storing; there is a LOT more to an Array object):
someArray = {
0: 1,
1: 2,
2: 3
}
What the Array object does add to the equation, though, is built-in operations that allow you to interact with it in the [1, 2, 3] way that you are used to. push(), for example, will use the array's length property to figure out where the next value should be added, create the value in that position, and increment the length property.
However (getting back to the original question), there is nothing in the array structure that makes accessing the values it stores any different from accessing any other property. There is no inherent looping or anything like that; accessing someArray[0] is essentially the same as accessing someArray.length.
In fact, the only reason you have to access standard array values using the someArray[N] format is that the stored array values are number-indexed, and you cannot directly access object properties that begin with a number using the "dot" technique (i.e., someArray.0 is invalid, someArray[0] is not).
Now, admittedly, that is a pretty simplistic view of the Array object in JavaScript, but, for the purposes of your question, it should be enough. If you want to know more about the inner workings, there is TONS of information to be found here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array
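A minimal illustration of the push-via-length behavior described above (a sketch, not the engine's actual implementation):
var someArray = [1, 2, 3];

// What someArray.push(4) conceptually does:
someArray[someArray.length] = 4; // writes to index 3; length becomes 4

// Index access and named-property access go through the same mechanism.
console.log(someArray[0]);     // 1
console.log(someArray.length); // 4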
A single-dimension hash table with direct access:
var array = {
x1y3: 999,
x10y100: 0
};
function getValue(x, y) {
return array['x' + x + 'y' + y];
}
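For example, with the keys above, getValue(1, 3) returns 999, and a point that was never set, such as getValue(5, 5), returns undefined.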
Trying to plan out a function and I wanted to get some input. I am trying to find an efficient way to:
Double the frequency of numbers in an array
Randomize the location of the values in the array.
For example: let's say I have an array: [0,1,2,3].
First, I want to duplicate each number once in a new array. So now we would have something like this:
[0,0,1,1,2,2,3,3].
Lastly, I want to randomize these values, so: [1,0,3,2,0,2,3,1]
Eventually, the algorithm I write will need to handle an initial array of 18 digits (so the final, randomized array will be of size 36)
My initial thought is to just have a simple while loop that:
Randomly selects a spot in the new array
Checks to see if it is full
If it is full, it selects a new spot and checks again
If it is not full, it places the value in the new array and moves on to the next value
I am leaving out some details, etc. but I want this algorithm to be fairly quick so that the user doesn't notice anything.
My concern is that when there is only one digit left to be placed, the algorithm will take forever to place it because there will be a 1/36 chance that it will select the empty space.
In general terms, how can I make a smarter and faster algorithm to accomplish what I want to do?
Many thanks!
First, I want to duplicate each number once in a new array. So now we would have something like this: [0,0,1,1,2,2,3,3].
That would be rather complicated to accomplish. Since the positions are not relevant anyway, just build [0,1,2,3,0,1,2,3] with:
var newArray = arr.concat(arr);
Lastly, I want to randomize these values, so: [1,0,3,2,0,2,3,1]
Just use one of the recognized shuffle algorithms - see How to randomize (shuffle) a JavaScript array?. There are rather simple ones that run in linear time and do not suffer from the problem you described, because they don't need to randomly try.
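For reference, a minimal sketch of the Fisher-Yates shuffle, one of the linear-time algorithms meant here:
// Fisher-Yates: walk backwards, swapping each element with a randomly
// chosen element at or before it. Runs in O(n) with no retrying.
function shuffle(arr) {
  for (var i = arr.length - 1; i > 0; i--) {
    var j = Math.floor(Math.random() * (i + 1));
    var tmp = arr[i];
    arr[i] = arr[j];
    arr[j] = tmp;
  }
  return arr;
}

var doubled = [0, 1, 2, 3].concat([0, 1, 2, 3]);
shuffle(doubled); // e.g. [1, 0, 3, 2, 0, 2, 3, 1]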
Here is an alternative I came up with to the known methods, using two arrays. It gives relatively good randomization.
var array1 = [];
var array = [0, 1, 2, 3, 0, 1, 2, 3];

function rand() {
  var count = array.length;
  // Pick a random element, move it into array1, and remove it from array.
  for (var i = 0, remaining = count; i < count; i++) {
    var n = Math.floor(Math.random() * remaining);
    array1[i] = array[n];
    array.splice(n, 1); // note: splice makes this O(n^2) overall
    remaining--;
  }
  // Copy the shuffled values back into the original array.
  for (var j = 0; j < count; j++) {
    array[j] = array1[j];
  }
}
I'm building an Entity System for a game, and basically I'm not sure whether I should use simple objects (dictionaries) or arrays to store the entities/components by their id.
My biggest issue is that I didn't want a dynamic entity id. If the id was just a string (using the dictionary to store entities), then it would always be valid and I could use storage[id] to get to that entity.
If I used arrays, I thought, the id's of entities, which would represent an index in the storage array, would change. Consider this array of entities:
[
Entity,
Entity, //This one is being removed.
Entity
];
If I were to remove the second entity in that array, I thought that the id required to access the third entity would have to change to the id (index) of the (now removed) second entity. That's because I was thinking about deletion in terms of splice()ing.
But I could use the delete operator to turn the element (an entity) into undefined! And if it's true that arrays in JavaScript are actually just objects, and objects logically have infinitely many undefined properties, does that mean that undefined values inside arrays don't use up memory?
Initially, I thought that arrays were implemented in a way that they were laid out contiguously in memory, and that an index was just an offset from the first element; by this logic, I thought that undefined values would use at least the memory of a pointer (because I actually thought that pointers to the elements were laid out contiguously, not the elements themselves).
So, if I stored 10k+ entities in this array and deleted half of them, would the 5k undefineds use any memory at all?
Also, when I do a for (entity in array) loop, would these undefined elements be visited?
Also, where can I find resources to see how arrays are actually supposed to be implemented in JavaScript? All I can find are general explanations of arrays and tutorials on how to use them, but I want to find out about these little quirks that can prove important in certain situations. Something like a "JavaScript quirks" site would be great.
Arrays are not just objects. In particular, the length property is very magic.
Of course, a JavaScript engine is allowed to represent the array internally in any way it chooses, as long as the external API remains the same. For instance, if you set randomly separated values then they may be stored as a hash, but if you set consecutive values then they may be optimised into an array.
for ... in does not enumerate properties that are not set. This includes an array literal that skips values e.g. [true, , false], which will only enumerate indices 0 and 2.
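A small sketch demonstrating both points (deleted slots and skipped literal slots behave as holes):
var entities = ['a', 'b', 'c'];
delete entities[1];           // leaves a hole; length stays 3

console.log(entities.length); // 3
console.log(1 in entities);   // false: the property no longer exists

// for...in visits only the indices that are actually set.
for (var i in entities) {
  console.log(i, entities[i]); // logs 0 'a' and 2 'c'
}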