Time complexity of two iteration methods on an immutable collection - JavaScript

Immutable.js provides structural data as collections. Let's take a Map. There are also methods to work with it:
includes
filter
Let's consider these data:
const data = Map({
  id1: Map({id: 'id1'}),
  id2: Map({id: 'id2'}),
  id100: Map({id: 'id100'}),
});
const ids = List(['id1', 'id100']);
And two approaches to iterate this Map:
// Approach 1
function selectData() {
  return data.filter((item) => ids.includes(item.get("id")));
}

// Approach 2
function selectData() {
  let selected = Map();
  ids.forEach((id) => {
    selected = selected.set(id, data.get(id));
  });
  return selected;
}
So, the question is: are these two approaches equivalent, and do they have the same time complexity, in
general
this special case with the data in the Map above
From my POV they are not equivalent but time complexity should be the same.
Update: equivalent - do the same, provide the same result.

As you pointed out, the semantics are slightly different. In the example case they both provide an intersection of the ids, so
> JSON.stringify(s1())
'{"id1":{"id":"id1"},"id100":{"id":"id100"}}'
> JSON.stringify(s2())
'{"id1":{"id":"id1"},"id100":{"id":"id100"}}'
However, there are edge cases with the data structure which do not produce like-for-like results, such as the example you gave in the comment:
const data = Map({
  id1: Map({id: 'id1'}),
  id2: Map({id: 'id1'}),
  id100: Map({id: 'id100'}),
});
const ids = List(['id1', 'id100']);
...
> JSON.stringify(s1())
'{"id1":{"id":"id1"},"id2":{"id":"id1"},"id100":{"id":"id100"}}'
> JSON.stringify(s2())
'{"id1":{"id":"id1"},"id100":{"id":"id100"}}'
Note: the above case looks like a bug, as the id in the (value) map doesn't match the id of the key; but who knows?
In general
approach 1 produces 1 item for each item in the top level map (data) whose value has an id item that is contained in the list.
approach 2 produces 1 item for each value in the list that has an entry in the top level map (data)
As the two approaches differ in how they look up into the map (data) -- one goes by the key, the other by the value of id in the value map -- if there is an inconsistency between these two values, as in the second example, you will get a difference.
In general you may be better off with the second approach, as lookup into the Map will likely be cheaper than lookup into the List if both collections are of similar size. If the collections are of very different sizes, you would need to take that into account.
Lookup into the list will be O(N) whereas lookup into the Map is documented as O(log32 N) (so some kind of wide tree implementation). So for a map of size M and a list of size L, the cost of approach 1 (iterate the map, look up in the list) is O(M * L), whereas the cost of approach 2 (iterate the list, look up in the map) is O(L * log32 M). If M == L (or close), then of course approach 2 wins, on paper.
It's almost always best to profile these things, rather than worry about the theoretical time complexity.
Practically, there may be another nice approach that relies on the already sorted nature of the map. If you sort the list first, (O(L log L)), then you can just use a sliding window over elements of both to find the intersection...
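For illustration only, here is a minimal sketch of that sliding-window idea, assuming Immutable.js and materialising the map's keys in sorted order via keySeq().sort() (a plain Map makes no ordering promise, hence the explicit sort):
const sortedIds = ids.sort(); // O(L log L)
const sortedKeys = data.keySeq().sort().toList(); // the map's keys, in sorted order
let selected = Map();
let i = 0;
let j = 0;
while (i < sortedIds.size && j < sortedKeys.size) {
  const id = sortedIds.get(i);
  const key = sortedKeys.get(j);
  if (id === key) {
    selected = selected.set(key, data.get(key));
    i++;
    j++;
  } else if (id < key) {
    i++; // advance whichever side holds the smaller value
  } else {
    j++;
  }
}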

Related

Is this O(N) approach the only way of avoiding a while loop when walking this linked list in Javascript?

I have a data structure that is essentially a linked list stored in state. It represents a stream of changes (patches) to a base object. It is linked by key, rather than by object reference, to allow me to trivially serialise and deserialise the state.
It looks like this:
const latest = 'id4' // They're actually UUIDs, so I can't sort on them (text here for clarity)
const changes = {
  id4: {patch: {}, previous: 'id3'},
  id3: {patch: {}, previous: 'id2'},
  id2: {patch: {}, previous: 'id1'},
  id1: {patch: {}, previous: undefined},
}
At times, a user chooses to run an expensive calculation and the results get returned into state. We do not have results corresponding to every change, but only some. So results might look like:
const results = {
  id3: {performance: 83.6},
  id1: {performance: 49.6},
}
Given the changes array, I need to get the results closest to the tip of the changes list, in this case results.id3.
I've written a while loop to do this, and it's perfectly robust at present:
let id = latest
let referenceId = undefined
while (!!id) {
  if (!!results[id]) {
    referenceId = id
    id = undefined
  } else {
    id = changes[id].previous
  }
}
The approach is O(N) but that's the pathological case: I expect a long changelist but with fairly frequent results updates, such that you'd only have to walk back a few steps to find a matching result.
While loops can be vulnerable
Following the great work of Gene Kranz (read his book "Failure Is Not an Option" to understand why NASA never uses recursion!) I try to avoid using while loops in code bases: they tend to be susceptible to inadvertent mistakes.
For example, all that would be required to make an infinite loop here is to do delete changes.id1.
So, I'd like to avoid that vulnerability and instead fail to retrieve any result, because not returning a performance value can be handled; but the user's app hanging is REALLY bad!
Other approaches I tried
Sorted array O(N)
To avoid the while loop, I thought about sorting the changes object into an array ordered per the linked list, then simply looping through it.
The problem is that I have to traverse the whole changes list first to get the array in a sorted order, because I don't store an ordering key (it would violate the point of a linked list, because you could no longer do O(1) insert).
Pushing an id onto an array is not a heavy operation, but it is still O(N).
The question
Is there a way of traversing this linked list without using a while loop, and without an O(N) approach to convert the linked list into a normal array?
Since you only need to append at the end and possibly remove from the end, the required structure is a stack. In JavaScript the best data structure to implement a stack is an array -- using its push and pop features.
So then you could do things like this:
const changes = [];

function addChange(id, patch) {
  changes.push({id, patch});
}

function findRecentMatch(changes, constraints) {
  for (let i = changes.length - 1; i >= 0; i--) {
    const {id} = changes[i];
    if (constraints[id]) return id;
  }
}
// Demo
addChange("id1", { data: 10 });
addChange("id2", { data: 20 });
addChange("id3", { data: 30 });
addChange("id4", { data: 40 });
const results = {
  id3: {performance: 83.6},
  id1: {performance: 49.6},
}
const referenceId = findRecentMatch(changes, results);
console.log(referenceId); // id3
Depending on what you want to do with that referenceId you might want findRecentMatch to return the index in changes instead of the change-id itself. This gives you the possibility to still retrieve the id, but also to clip the changes list to end at that "version" (i.e. as if you popped all the entries up to that point, but then in one operation).
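A sketch of that variant (findRecentMatchIndex is a made-up name, not part of the code above):
function findRecentMatchIndex(changes, constraints) {
  for (let i = changes.length - 1; i >= 0; i--) {
    if (constraints[changes[i].id]) return i;
  }
  return -1; // no match found
}

const index = findRecentMatchIndex(changes, results);
if (index >= 0) {
  const referenceId = changes[index].id;
  const clipped = changes.slice(0, index + 1); // drop everything after that "version" in one operation
}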
While writing out the question, I realised that rather than avoiding a while-loop entirely, I can add an execution count and an escape hatch which should be sufficient for the purpose.
This solution uses Object.keys() which is strictly O(N) so not technically a correct answer to the question - but it is very fast.
If I needed it faster, I could restructure changes as a map instead of a general object and access changes.size as per this answer
let id = latest
let referenceId = undefined
const maxLoops = Object.keys(changes).length
let loop = 0
while (!!id && loop < maxLoops) {
  loop++
  if (!!results[id]) {
    referenceId = id
    id = undefined
  } else {
    id = changes[id].previous
  }
}
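For completeness, a sketch of that Map-based restructuring (assuming the same latest and results as above; a Map preserves insertion order and its size property is O(1)):
const changes = new Map([
  ['id1', {patch: {}, previous: undefined}],
  ['id2', {patch: {}, previous: 'id1'}],
  ['id3', {patch: {}, previous: 'id2'}],
  ['id4', {patch: {}, previous: 'id3'}],
]);

let id = latest
let referenceId = undefined
let loop = 0
while (!!id && loop < changes.size) { // no O(N) Object.keys() call needed
  loop++
  if (!!results[id]) {
    referenceId = id
    id = undefined
  } else {
    id = changes.get(id).previous
  }
}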

Intersect multiple arrays of objects

So first of all, I am not expecting a specific solution to my problem, but instead some insights from more experienced developers that could enlighten me and put me on the right track. As I am not yet experienced enough in algorithms and data structures and I take this as a challenge for myself.
I have n number of arrays, where n >= 2.
They all contain objects and in the end, I want an array that contains only the common elements between all these arrays.
array1 = [{ id: 1 }, { id: 2 }, { id: 6 }, { id: 10 }]
array2 = [{ id: 2 }, { id: 4 }, { id: 10 }]
array3 = [{ id: 2 }, { id: 3 }, { id: 10 }]
arrayOfArrays = [array1, array2, array3]
intersect = [{ id: 2 }, { id: 10 }]
How would one approach this problem? I have read solutions using Divide And Conquer, or Hash tables, and even using the lodash library but I would like to implement my own solution for once and not rely on anything external, and at the same time practice algorithms.
For efficiency, I would start by locating the shortest array. This should be the one you work with. You can run a reduce on the arrayOfArrays to iterate through and return the index of the shortest length.
const shortestIndex = arrayOfArrays.reduce((accumulator, currentArray, currentIndex) =>
  currentArray.length < arrayOfArrays[accumulator].length ? currentIndex : accumulator, 0);
Take the shortest array and call the reduce function on it; this will iterate through the array and allow you to accumulate a final value. The second parameter is the starting value, which is a new array.
shortestArray.reduce((accumulator, currentObject) => /*TODO*/, [])
For the code, we basically need to loop through the remaining arrays and make sure the object exists in all of them. You can use the every function since it fails fast, meaning the first array in which the object doesn't exist will cause it to return false.
Inside the every you can call some to check if there is at least one match.
const isMatch = remainingArrays.every(array => array.some(object => object.id === currentObject.id));
If it's a match, add it to the accumulator which will be your final result. Otherwise, just return the accumulator.
return isMatch ? [...accumulator, currentObject] : accumulator;
Putting all that together should get you a decent solution. I'm sure there are more optimizations that could be made, but that's where I would start.
reduce
every
some
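For reference, a sketch assembling the steps above (using the corrected shortestIndex; types and error handling omitted):
const arrayOfArrays = [array1, array2, array3];

const shortestIndex = arrayOfArrays.reduce((accumulator, currentArray, currentIndex) =>
  currentArray.length < arrayOfArrays[accumulator].length ? currentIndex : accumulator, 0);
const shortestArray = arrayOfArrays[shortestIndex];
const remainingArrays = arrayOfArrays.filter((_, i) => i !== shortestIndex);

const intersect = shortestArray.reduce((accumulator, currentObject) => {
  const isMatch = remainingArrays.every(array =>
    array.some(object => object.id === currentObject.id)
  );
  return isMatch ? [...accumulator, currentObject] : accumulator;
}, []);

console.log(intersect); // [{ id: 2 }, { id: 10 }]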
The general solution is to iterate through an input and check for each value whether it exists in all of the other inputs. (Time complexity: O(l * n * l), where n is the number of arrays and l is the average length of an array.)
Following the ideas of the other two answers, we can improve this brute-force approach a bit by
iterating through the smallest input
using a Set for efficient lookup of ids instead of iteration
so it becomes (with O(l * n + min_l * n) = O(n * l))
const arrayOfIdSets = arrayOfArrays.map(arr =>
  new Set(arr.map(val => val.id))
);
const smallestArray = arrayOfArrays.reduce((smallest, arr) =>
  smallest.length < arr.length ? smallest : arr
);
const intersection = smallestArray.filter(val =>
  arrayOfIdSets.every(set => set.has(val.id))
);
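With the three arrays from the question, this should yield:
> JSON.stringify(intersection)
'[{"id":2},{"id":10}]'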
A good way to approach these kinds of problems, both in interviews and in regular life, is to think of the most obvious approach you can come up with, no matter how inefficient, and then think about how you can improve it. This is usually called a "brute force" approach.
So for this problem, perhaps an obvious but inefficient approach would be to iterate through every item in array1 and check if it is in both array2 and array3, noting it down (in another array) if it is. Then repeat for each item in array2 and in array3, making sure to only note down items you haven't noted down before.
We can see that will be inefficient because we'll be looking for a single item in an array many times, which is quite slow for an array. But it'll work!
Now we can get to work improving our solution. One thing to notice is that finding the intersection of 3 arrays is the same as finding the intersection of the third array with the intersection of the first and second array. So we can look for a solution to the simpler problem of the intersection of 2 arrays, to build one of an intersection for 3 arrays.
This is where it's handy to know your data structures. You want to be able to ask the question "does this structure contain a particular element?" as quickly as possible. Think about what structures are good for that kind of lookup (known as search). More experienced engineers have this memorized/learned, but you can reference something like https://www.bigocheatsheet.com/ to see that sets are good at this.
I'll stop there to not give the full solution, but once you've seen that sets are fast at both insertion and search, think about how you can use that to solve your problem.
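As a tiny illustration of that speed difference (deliberately not the full solution), compare searching an array with searching a Set built from it:
const ids = new Set(array2.map(obj => obj.id)); // build once: O(n)
ids.has(2); // true -- each lookup is O(1) on average
ids.has(6); // false -- versus O(n) for array2.some(...)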

Prevent pushing to array if duplicate values are present

I'm mapping an array and, based on the data, I'm pushing Option elements into an array as follows
let make_children: any | null | undefined = [];
buyerActivityResult && buyerActivityResult.simulcastMyAccount.data.map((item: { make: {} | null | undefined; }, key: any) => {
  make_children.push(
    <Option key={key}>{item.make}</Option>
  );
});
Following data array has several objects and these objects have an attribute called model.
buyerActivityResult.simulcastMyAccount.data
I want to prevent pushing Options into my array if the attribute model has duplicate data. It only has to push once for all similar model values.
How can I do it?
I tried something like this
buyerActivityResult && buyerActivityResult.simulcastMyAccount.data.map((item: { model: {} | null | undefined; }, key: any) => {
  model_children.indexOf(item.model) === -1 && model_children.push(
    <Option key={key}>{item.model}</Option>
  );
});
But still duplicate values are being pushed into my array.
It's difficult to tell what you are trying to achieve, but it looks like map may not be the right tool for the job.
map returns an array of the same length as the original array you are calling it on.
If my assumptions are correct, your buyerActivityResult.simulcastMyAccount.data array has duplicate values, and you want to remove these duplicates based on the model property? One way to achieve this would be to use the lodash library, with its uniqBy function:
const uniqueResults = _.uniqBy(buyerActivityResult.simulcastMyAccount.data, (item) => item.model);
The Array.prototype.map() method is meant for transforming the data contained in the array it is called on. To manipulate data from other variables, I recommend using a for-loop block.
If item.model is an object, the function Array.prototype.indexOf() always returns -1 because it compares the memory addresses of the objects and does not do a deep comparison of all property values.
The usual solution to remove duplicate data from an array is converting the Array into a Set and then back to an Array. Unfortunately, this works only on primitive values (string, number, boolean, etc...) and not on objects.
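A quick illustration of that limitation:
[...new Set([1, 2, 2, 3])]; // [1, 2, 3] -- primitives deduplicate fine
[...new Set([{a: 1}, {a: 1}])]; // [{a: 1}, {a: 1}] -- two distinct object references both remain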
Starting here, I will review your source code, make some changes, and explain why I would apply them. First of all, assuming the make_children array is not reassigned later in your code, I would turn it into a constant. Given the initialization, I think the declaration is over-typed.
const make_children: any[] = [];
Then, I think you are trying to do too many things at once. That makes the source code hard to read for your colleagues, and for you too (maybe not today, but what about in a few weeks...), and it makes testing, debugging and improving it nearly impossible. Let's break it down into at least two steps: first, transform the data (for example, remove duplicates); second, create the Option elements based on the result of the previous step.
const data: { model: any }[] = buyerActivityResult?.simulcastMyAccount?.data || [];
let options = data.filter((item) => !!item.model); // removing items without a model.
// Here the hard part, removing duplicates:
// - if the models inside your items have a property with a unique value (like an ID), you can implement a function to do so yourself. Take a look at: https://stackoverflow.com/questions/2218999/remove-duplicates-from-an-array-of-objects-in-javascript
// - or you can use the Lodash library, as Rezaa91 suggested in their answer
options = _.uniqBy(options, (item) => item.model);
Now you only have to create the Option elements.
for (var i = 0; i < options.length; i++) {
  model_children.push(<Option key={i}>{options[i].model}</Option>);
}
// OR using the Array.prototype.map method (in this case, do not declare `model_children` at the beginning)
const model_children = options.map((opt: any, i: number) => <Option key={i}>{opt.model}</Option>);
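If you prefer to avoid Lodash entirely, a minimal hand-rolled alternative might look like this (uniqByModel is a made-up name; it keeps the first occurrence of each model value using a Set of already-seen values):
function uniqByModel(items: any[]) {
  const seen = new Set();
  return items.filter((item) => {
    if (seen.has(item.model)) return false; // duplicate model: drop it
    seen.add(item.model);
    return true;
  });
}

options = uniqByModel(options);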
Despite the lack of context around the execution of the code you provided, I hope my answer helps you find a solution and encourages you to write clearer source code (for the sake of your colleagues and your future self).
PS: I do not know anything about ReactJS, so forgive my syntax mistakes.

What's the best way to filter an array of objects to only show those objects which were added since the last time it was filtered?

My first function scrapes my employer's site for a list of users who have completed a task and outputs a json file containing the results. The json file is organized as follows:
{"Completed":[{"task":"TitleOfTaskAnd01/01/2019", "name":"UsersFullName"},{"task":"TitleOfTaskAnd01/01/2019", "name":"UsersFullName"}...]}
My second function uses the aforementioned json file to automatically generate receipts. On calling these two functions again, I would like to leave out all of the previously utilized data and only generate receipts for the tasks that were not in the results of any previous calls, thereby avoiding the generation of duplicates.
I tried to filter the first array by the elements of the second array, however as far as I can tell you cannot compare objects, or even arrays for that matter. Here is the function I tried to adjust to my needs:
myArray = myArray.filter((el) => !toRemove.includes(el));
I expect that my use case is not too uncommon and there is already a body of experience regarding best practices in this situation. I prefer solutions that use just javascript, so that I can understand how to navigate the situation better in the future. If however you have a library/module solution that is welcomed as well. Thanks in advance.
The problem is that two objects are never equal (except they are references to the same object). To check for structural equality, you have to manually compare their properties:
myArray.filter(el => !toRemove.some(el2 => el.task === el2.task && el.name === el2.name));
While that works, it will be quite slow for a lot of elements as you compare each object of myArray against all objects of toRemove. To improve that, you could generate a unique hash out of the properties and add that hash into a Set:
const hash = obj => JSON.stringify([obj.name, obj.task]);
const remove = new Set(toRemove.map(hash));
const result = myArray.filter(el => !remove.has(hash(el)));
This will be O(n + m), whereas the previous solution was O(n * m).
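For instance, with data shaped like the question's JSON (values invented for the sketch):
const myArray = [
  {task: 'TaskA01/01/2019', name: 'Alice'},
  {task: 'TaskB01/01/2019', name: 'Bob'},
];
const toRemove = [{task: 'TaskA01/01/2019', name: 'Alice'}];

const hash = obj => JSON.stringify([obj.name, obj.task]);
const remove = new Set(toRemove.map(hash));
myArray.filter(el => !remove.has(hash(el))); // [{task: 'TaskB01/01/2019', name: 'Bob'}]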

Find closest array index

I'm dealing with an array of "events" where the key of the array is the Unix Timestamp of the event. In other words let's assume we have the following array of event objects in JS:
var MyEventsArray=[];
MyEventsArray[1513957775]={lat:40.671978333333, lng:14.778661666667, eventcode:46};
MyEventsArray[1513957845]={lat:40.674568332333, lng:14.568661645667, eventcode:23};
MyEventsArray[1513957932]={lat:41.674568332333, lng:13.568661645667, eventcode:133};
and so on for thousands of rows...
Data are sent along with an Ajax call and encoded in JSON to be processed with JS. When the data set is received, I have another Unix Timestamp, let's say 1513957845, coming from another source, and I want to find the event that happened at that time... it's quite easy: I just need to take the element from the array having the given index (the second in the list above).
Now the question: imagine that the given index is not found (imagine we are looking for UXTimestamp=1513957855) and that this index does not exist in the array, but I want to take the closest index (in the example above I would take the element MyEventsArray[1513957845], as its index 1513957845 is the closest to 1513957855). What can I do to obtain this result?
My difficulty is in handling the array index, as when I receive the array I don't know where the indexes begin.
How will the machine handle situations like that?
Will the machine allocate (and waste) memory for dummy/empty elements placed between the rows, or does the engine have some way to build its own index and optimize the space? In other words: is it safe to play with indexes as we're doing, or is it better to allocate the array as:
var MyEventsArray=[];
MyEventsArray['1513957775']={lat:40.671978333333, lng:14.778661666667, eventcode:46};
MyEventsArray['1513957845']={lat:40.674568332333, lng:14.568661645667, eventcode:23};
MyEventsArray['1513957932']={lat:41.674568332333, lng:13.568661645667, eventcode:133};
and so on for thousands of rows...
In this case the key and the index are clearly different, so here it's possible to get the first element with MyArray[0] even though we don't know the key value. Is this approach more expensive in terms of memory (here we must save both index and key), or are the effects the same for the engine?
There is no difference between MyEventsArray[1513957775] and MyEventsArray['1513957775']. Deep down, array indexes are just property names, and property names are strings.
Regarding the question of whether these sparse indices will lead to millions of empty cells being allocated, no, that won't happen. Sparse arrays only store what you put in them, not empty space.
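You can verify that quickly:
const sparse = [];
sparse[1513957775] = {eventcode: 46};
console.log(sparse.length); // 1513957776 -- just a number, not allocated slots
console.log(Object.keys(sparse).length); // 1 -- only the single entry is actually stored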
If you want to find a key quickly, you can obtain an array of the keys, sort them, and then find the one you want:
var MyEventsArray=[];
MyEventsArray[1513957775]={lat:40.671978333333, lng:14.778661666667, eventcode:46};
MyEventsArray[1513957845]={lat:40.674568332333, lng:14.568661645667, eventcode:23};
MyEventsArray[1513957932]={lat:41.674568332333, lng:13.568661645667, eventcode:133};
var target = 1513957855;
var closest = Object.keys(MyEventsArray)
  .map(k => ({ k, delta: Math.abs(target - k) }))
  .sort((a, b) => a.delta - b.delta)[0].k;
console.log(closest);
You could use Array#some, which allows you to exit the iteration once the delta becomes greater than the previous delta.
var array = [];
array[1513957775] = { lat: 40.671978333333, lng: 14.778661666667, eventcode: 46 };
array[1513957845] = { lat: 40.674568332333, lng: 14.568661645667, eventcode: 23 };
array[1513957932] = { lat: 41.674568332333, lng: 13.568661645667, eventcode: 133 };
var key = 0,
    search = 1513957855;

Object.keys(array).some(function (k) {
  if (Math.abs(k - search) > Math.abs(key - search)) {
    return true;
  }
  key = k;
});
console.log(key);
You can use Object.keys(MyEventsArray) to get an array of the keys (which are strangely expressed as strings); you could then iterate through that and find the closest match.
var MyEventsArray=[];
MyEventsArray[1513957775]={lat:40.671978333333, lng:14.778661666667, eventcode:46};
MyEventsArray[1513957845]={lat:40.674568332333, lng:14.568661645667, eventcode:23};
MyEventsArray[1513957932]={lat:41.674568332333, lng:13.568661645667, eventcode:133};
> Object.keys(MyEventsArray)
["1513957775", "1513957845", "1513957932"]
Reference: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array
