I'm trying to create a simple array comparer that reports the differences between two arrays. I've gotten this far, and it mostly works. The problem is that I'm comparing an updated version of an array against an older version of the same array, so once an element has been inserted, every element after the insertion point is shifted up by however many insertions came before it. Because my loop compares elements at the same index in both arrays, after the first insertion it reports every subsequent element as different, even if that element was already present in the old version of the array.
This illustrates my problem. The first real difference is between 1 and 2; however, because of that one difference, every later value is bumped up:
0|0
1|2
2|3
3|4
4|5
5|6
The actual arrays would look something like this:
['Auto','Sniper','Citadel','Tank']
['Auto','Gunner','Sniper','Citadel','Tank']
As you can see, because of the addition of Gunner, every value after it is shifted up by one position. As a result, every single value after Gunner now differs from its counterpart at the same index in the old array, meaning that my original code would log everything from that point on.
async function c() {
  const fetchO = await fetch('https://a/data');
  const fetchN = await fetch('https://a/data');
  const O = await fetchO.json();
  const N = await fetchN.json();
  for (let count in N || O) {
    if (N[count] !== O[count]) {
      console.log('Old: ', O[count], 'New: ', N[count]);
    }
  }
}
c();
I've been trying to use an offset variable as a control so that this error doesn't occur, like this:
let countControl = 0;
async function c() {
  const fetchO = await fetch('https://a/data');
  const fetchN = await fetch('https://a/data');
  const O = await fetchO.json();
  const N = await fetchN.json();
  for (let count in N || O) {
    if (N[count] !== O[count + countControl]) {
      console.log('Old: ', O[count + countControl], 'New: ', N[count]);
      if (O.length > N.length) { countControl = countControl - 1; }
      if (O.length < N.length) { countControl = countControl + 1; }
    }
  }
}
c();
The problem with this is that count isn't an actual numeric variable, so when I do [count + countControl] the lookup comes out as undefined; but if I turn count into a separately defined variable, I'll have to write an updating function for it, which won't work in this situation. How can I add a working count monitor? Or is there some different way to do this?
If I follow your question correctly, you want to count the differences between the two arrays. If we can assume that the elements in each array are unique (i.e. there are no duplicates) and that order does not matter, it becomes a very simple problem to solve: it is equivalent to the set operation "symmetric difference", which gives the set of elements that appear in only one of two given sets. From there, it's just a matter of counting the elements which make up the symmetric difference. This can be implemented as follows:
const a = ['Auto', 'Sniper', 'Citadel', 'Tank'];
const b = ['Auto', 'Gunner', 'Sniper', 'Citadel', 'Tank'];

const symmetricDifference = (a, b) => {
  const difference = new Set(a);
  for (const element of new Set(b)) {
    if (difference.has(element)) {
      difference.delete(element);
    } else {
      difference.add(element);
    }
  }
  return Array.from(difference);
};

const difference = symmetricDifference(a, b);
const differenceCount = difference.length;
console.log({ difference, differenceCount });
(This implementation still deals with Arrays instead of Sets outside of symmetricDifference(). Depending on other constraints and the number of objects involved, it might be significantly better for performance to use Sets outside of symmetricDifference() too.)
If the assumption of uniqueness does not hold, the general algorithm above should be adaptable to instead deal with counts of each unique element. If, however, the assumption of order not mattering does not hold up, the problem becomes much harder in the general case. Depending on your specific case though, there might be shortcuts to be had.
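For reference, a minimal sketch of that count-based adaptation (my own illustration, not part of the original solution; it assumes elements compare with ===):

// Sketch: symmetric difference for arrays that may contain duplicates.
// Matching occurrences in the two arrays cancel out; whatever is left differs.
const symmetricDifferenceWithCounts = (a, b) => {
  const counts = new Map();
  for (const element of a) counts.set(element, (counts.get(element) || 0) + 1);
  for (const element of b) counts.set(element, (counts.get(element) || 0) - 1);
  const difference = [];
  for (const [element, count] of counts) {
    // |count| occurrences of this element are unmatched between the arrays
    for (let i = 0; i < Math.abs(count); i++) difference.push(element);
  }
  return difference;
};

console.log(symmetricDifferenceWithCounts(['a', 'a', 'b'], ['a', 'c']));
// -> ['a', 'b', 'c']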
I have a data structure that is essentially a linked list stored in state. It represents a stream of changes (patches) to a base object. It is linked by key, rather than by object reference, to allow me to trivially serialise and deserialise the state.
It looks like this:
const latest = 'id4' // They're actually UUIDs, so I can't sort on them (text here for clarity)

const changes = {
  id4: {patch: {}, previous: 'id3'},
  id3: {patch: {}, previous: 'id2'},
  id2: {patch: {}, previous: 'id1'},
  id1: {patch: {}, previous: undefined},
}
At some times, a user chooses to run an expensive calculation and results get returned into state. We do not have results corresponding to every change but only some. So results might look like:
const results = {
id3: {performance: 83.6},
id1: {performance: 49.6},
}
Given the changes object, I need to get the results closest to the tip of the changes list, in this case results.id3.
I've written a while loop to do this, and it's perfectly robust at present:
let id = latest
let referenceId = undefined
while (!!id) {
  if (!!results[id]) {
    referenceId = id
    id = undefined
  } else {
    id = changes[id].previous
  }
}
The approach is O(N) but that's the pathological case: I expect a long changelist but with fairly frequent results updates, such that you'd only have to walk back a few steps to find a matching result.
While loops can be vulnerable
Following the great work of Gene Kranz (read his book "Failure Is Not an Option" to understand why NASA never uses recursion!), I try to avoid using while loops in code bases: they tend to be susceptible to inadvertent mistakes.
For example, all that would be required to make an infinite loop here is to do delete changes.id1.
So, I'd like to avoid that vulnerability and instead fail to retrieve any result, because not returning a performance value can be handled; but the user's app hanging is REALLY bad!
Other approaches I tried
Sorted array O(N)
To avoid the while loop, I thought about sorting the changes object into an array ordered per the linked list, then simply looping through it.
The problem is that I have to traverse the whole changes list first to get the array in a sorted order, because I don't store an ordering key (it would violate the point of a linked list, because you could no longer do O(1) insert).
It's not a heavy operation to push an id onto an array, but it is still O(N).
The question
Is there a way of traversing this linked list without using a while loop, and without an O(N) approach to convert the linked list into a normal array?
Since you only need to append at the end and possibly remove from the end, the required structure is a stack. In JavaScript the best data structure to implement a stack is an array -- using its push and pop features.
So then you could do things like this:
const changes = [];

function addChange(id, patch) {
  changes.push({id, patch});
}

function findRecentMatch(changes, constraints) {
  for (let i = changes.length - 1; i >= 0; i--) {
    const {id} = changes[i];
    if (constraints[id]) return id;
  }
}

// Demo
addChange("id1", { data: 10 });
addChange("id2", { data: 20 });
addChange("id3", { data: 30 });
addChange("id4", { data: 40 });

const results = {
  id3: {performance: 83.6},
  id1: {performance: 49.6},
}

const referenceId = findRecentMatch(changes, results);
console.log(referenceId); // id3
Depending on what you want to do with that referenceId you might want findRecentMatch to return the index in changes instead of the change-id itself. This gives you the possibility to still retrieve the id, but also to clip the changes list to end at that "version" (i.e. as if you popped all the entries up to that point, but then in one operation).
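For illustration, here is a rough sketch of that index-returning variant (the function name is mine, not from your code base):

// Sketch: return the index of the most recent change that has a result
function findRecentMatchIndex(changes, constraints) {
  for (let i = changes.length - 1; i >= 0; i--) {
    if (constraints[changes[i].id]) return i;
  }
  return -1;
}

const index = findRecentMatchIndex(changes, results);
if (index !== -1) {
  const referenceId = changes[index].id;
  changes.length = index + 1; // clip everything after that "version" in one operation
}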
While writing out the question, I realised that rather than avoiding a while-loop entirely, I can add an execution count and an escape hatch which should be sufficient for the purpose.
This solution uses Object.keys() which is strictly O(N) so not technically a correct answer to the question - but it is very fast.
If I needed it faster, I could restructure changes as a map instead of a general object and access changes.size as per this answer
let id = latest
let referenceId = undefined
const maxLoops = Object.keys(changes).length
let loop = 0
while (!!id && loop < maxLoops) {
  loop++
  if (!!results[id]) {
    referenceId = id
    id = undefined
  } else {
    id = changes[id].previous
  }
}
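For reference, a rough sketch of that Map-based variant (my own illustration; the stored values keep the same shape):

// Sketch: the same escape-hatch loop, but with the changes held in a Map,
// so the size is available in O(1) via changesMap.size
const changesMap = new Map([
  ['id4', {patch: {}, previous: 'id3'}],
  ['id3', {patch: {}, previous: 'id2'}],
  ['id2', {patch: {}, previous: 'id1'}],
  ['id1', {patch: {}, previous: undefined}],
])

let id = latest
let referenceId = undefined
const maxLoops = changesMap.size
let loop = 0
while (!!id && loop < maxLoops) {
  loop++
  if (!!results[id]) {
    referenceId = id
    id = undefined
  } else {
    id = changesMap.get(id).previous
  }
}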
There is a simple function whose essence is to count down from a number (n) to 0.
But when using reduce, the function just doesn't work: no matter how I rewrite it, it returns either an empty array, or undefined, or the number 2 itself.
First, I created an array that holds n. Then I wrote a reduce callback in which currentValue takes n and subtracts 1 from it; the accumulator then takes the resulting number and, using the push method, adds it to the list array. But I don't understand how to add a condition so that when the accumulator reaches 0, the function stops.
const countBits = (n) => {
  let list = [n];
  let resultReduce = n.reduce((accumulator, currentValue) => {
    accumulator = currentValue - 1;
    list.push(accumulator);
  });
  return resultReduce;
};
console.log(countBits([2]));
Why isn't this working the way I intended it to?
reduce will run on each of the items in the array, with the accumulator (first argument to the callback function) being the value that is returned from the callback function's previous iteration. So if you don't return anything, accumulator will be undefined for the next iteration.
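For example, here is a minimal illustration of how the accumulator flows between iterations (my own example, not your code):

// Each returned value becomes the accumulator for the next call
const sum = [1, 2, 3].reduce((accumulator, currentValue) => {
  return accumulator + currentValue;
}, 0);
console.log(sum); // 6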
If you want to count from n to 0, reduce is not the way to go (not to mention that in your current implementation you never actually use list, which would contain all of your stored numbers from n to 0). I would advise that instead, you simply loop from n to 0 and push those values into an array, like so:
const countBits = (n) => {
  let list = [];
  for (let i = n; i > -1; i--) {
    list.push(i);
  }
  return list;
};
console.log(countBits(2));
Also note I've changed your syntax slightly in the function call - you were passing an array with a single element seemingly unnecessarily, so I just passed the element itself to simplify the code.
The answer by Jack Bashford is correct, but for completeness I would like to point out that generating a range of numbers is a common need. Libraries like Underscore, Lodash and Ramda provide a ready-to-use function for this purpose. You don’t have to write your own implementation every time you need something common and mundane like that; save the time and enjoy the fact that you can spend your time on something more groundbreaking instead.
console.log(_.range(2, -1, -1));
<script src="https://underscorejs.org/underscore-umd-min.js"></script>
Also for the sake of completeness, let’s consider how you might implement a downwards range function using reduce, anyway. reduce expects an input array, though it can also accept an object if using Underscore or Lodash. To make meaningful use of the input collection, we could generate a consecutive number for every element of the collection. For an array, we could of course just do _.range(collection.length - 1, -1, -1) instead, but for an object, or something that you don’t know the length of in advance, such as a generator, using reduce for this purpose might make sense. The mapDownwardsRange function below will do this:
function unshiftNext(array) {
  const front = array.length ? array[0] : -1;
  return [front + 1].concat(array);
}

function mapDownwardsRange(collection) {
  return _.reduce(collection, unshiftNext, []);
}

console.log(mapDownwardsRange(['a', 'b', 'c']));
<script src="https://underscorejs.org/underscore-umd-min.js"></script>
I've encountered this problem many times. The problem is that I have an array of elements that consist of a key and a value and I need to reduce the array to an object with the keys and the sum of values for each key (e.g. ["foo", 1], ["bar", 1], ["foo", 1] becomes {foo: 2, bar: 1}). Is it better to first initialize the object with all the keys set to 0 and then use that object or use an empty object and check if the property exists every time?
function juiceMapFromFormat(format) {
  return format
    .map(juiceString => {
      const [juiceName, quantityString] = juiceString.split(/\s*=>\s*/gim);
      const quantity = +quantityString;
      return [juiceName, quantity];
    })
    .reduce((juiceMap, juiceArr) => {
      const [juiceName, quantity] = juiceArr;
      const previousQuantity = juiceMap.get(juiceName) || 0;
      if (previousQuantity < 1000) {
        juiceMap.delete(juiceName);
      }
      juiceMap.set(juiceName, previousQuantity + quantity);
      return juiceMap;
    }, new Map());
}
This is my current function. My example uses a map, but that's because of the problem I'm solving; my question applies to both maps and objects. I have two conditions: does the property exist, and is the quantity smaller than 1000 (the second condition comes from the problem itself). My question is whether it's better to leave it like that, or to save the array of juice arrays in a constant, use it to initialize a Map with every single juice name, and give that Map to the reduce method as the initial value, removing the need for || 0 after juiceMap.get(juiceName). Which would be better?
Edit: The function takes an array of strings, which are mapped correctly. I need to return a map. My question is whether I should keep the 2 conditions, or remove the one checking for the existence of the property and make sure that every property exists by initializing each one to 0 beforehand.
Edit 2: If I initialize all properties beforehand, I will reduce the cyclomatic complexity, as the condition will be removed. That's at least how I understand it.
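For concreteness, the pre-initialization variant I'm asking about would look roughly like this (a sketch only, with a made-up function name; it ignores the < 1000 reordering condition for brevity):

// Sketch: seed the Map with every juice name set to 0 first,
// so the reduce step never needs the || 0 existence check
function juiceMapPreInitialized(format) {
  const pairs = format.map(juiceString => {
    const [juiceName, quantityString] = juiceString.split(/\s*=>\s*/gim);
    return [juiceName, +quantityString];
  });
  const initialMap = new Map(pairs.map(([juiceName]) => [juiceName, 0]));
  return pairs.reduce(
    (juiceMap, [juiceName, quantity]) =>
      juiceMap.set(juiceName, juiceMap.get(juiceName) + quantity),
    initialMap
  );
}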
I would take a single loop and reduce the data directly.
function juiceMapFromFormat(format) {
  return format.reduce((juiceMap, juiceString) => {
    const [juiceName, quantityString] = juiceString.split(/\s*=>\s*/gim);
    return juiceMap.set(juiceName, (juiceMap.get(juiceName) || 0) + +quantityString);
  }, new Map());
}
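For example, with made-up input strings in the name => quantity format that the split regex expects:

// Hypothetical sample input; the separator matches /\s*=>\s*/
const format = ['Orange => 500', 'Apple => 200', 'Orange => 700'];
console.log(juiceMapFromFormat(format));
// -> Map(2) { 'Orange' => 1200, 'Apple' => 200 }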
I know that map returns a new array, and that forEach does not return anything (the docs say it returns undefined).
For example, if I had some code like this:
let test;
values.forEach((value, idx) => {
  if (someNumber >= value) {
    test = value;
  }
});
Here I am just checking if someNumber is greater than or equal to some value, and if it is, setting test = value. Is there another array method I should use here, or is it fine to use .forEach?
Your example doesn't make sense because it finds the last value that is less than or equal to someNumber, repeatedly assigning to the test variable if more than one is found. Thus, your code is not truly expressing your intent well since other developers can be confused about what you're trying to achieve. In fact, other answers here have had differing opinions on your goal due to this ambiguity. You even said:
if the number is great than or equal to the value from the array at whatever index, stop, and set test equal to that value
But your code doesn't stop at the first value! It keeps going through the entire array and the result in test will be the last value, not the first one.
In general, making your loop refer to outside variables is not the best way to express your intent. It makes it harder for the reader to understand what you're doing. It's better if the function you use returns a value so that it's clear the variable is being assigned.
Here's a guide for you:
forEach
Use this when you want to iterate over all the values in order to do something with each of them. Don't use this if you are creating a new output value--but do use it if you need to modify existing items or run a method on each one, where the forEach has no logical output value. array.forEach at MDN says:
There is no way to stop or break a forEach() loop other than by throwing an exception. If you need such behavior, the forEach() method is the wrong tool, use a plain loop instead. If you are testing the array elements for a predicate and need a Boolean return value, you can use every() or some() instead. If available, the new methods find() or findIndex() can be used for early termination upon true predicates as well.
find
Use this when you want to find the first instance of something, and stop. What you said makes it sound like you want this:
let testResult = values.find(value => value <= someNumber);
This is far superior to setting the test value from inside the lambda or a loop. I also think that reversing the inequality and the variables is better because of the way we tend to think about lambdas.
some
This only gives you a Boolean as a result, so you have to misuse it slightly to get an output value. It will traverse the array until the condition is true or the traversal is complete, but you have to do something a bit hacky to get any array value out. Instead, use find as above, which is intended to output the found value rather than simply a true/false indicating whether the condition is met by any element in the array.
every
This is similar to some in that it returns a Boolean, but, as you would expect, it is only true if all the items in the array meet the condition. It will traverse the array until the condition is false or the traversal is complete. Again, don't misuse it by throwing away the Boolean result and setting a variable to a value. If you want to do something to every item in an array and return a single value, at that point you would want to use reduce. Also, notice that !arr.every(condition) is the same as arr.some(x => !condition(x)).
reduce
The way your code is actually written—finding the last value that matches the condition—naturally lends itself to reduce:
let testResult = values.reduce(
  (recent, value) => {
    if (value <= someNumber) {
      recent = value;
    }
    return recent;
  },
  undefined
);
This does the same job of finding the last value as your example code does.
map
map is for when you want to transform each element of an array into a new array of the same length. If you have any experience with C# it is much like the Linq-to-objects .Select method. For example:
let inputs = [1, 2, 3, 4];
let doubleInputs = inputs.map(value => value * 2);
// result: [2, 4, 6, 8]
New requirements
Given your new description of finding the adjacent values in a sorted array between which some value can be found, consider this code:
let sortedBoundaries = [10, 20, 30, 40, 50];
let inputValue = 37;
let interval = sortedBoundaries
  .map((value, index) => ({ prev: value, next: sortedBoundaries[index + 1] }))
  .find(pair => pair.prev < inputValue && inputValue <= pair.next);
// result: { prev: 30, next: 40 }
You can improve this to work on the ends so that a number > 50 or <= 10 will be found as well (for example, { prev: undefined, next: 10 }).
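A possible sketch of that end-handling improvement (my own variation on the code above, not the only way to do it):

// Sketch: pad the boundaries with undefined on both ends so values
// below the minimum or above the maximum still produce an interval
let padded = [undefined, ...sortedBoundaries, undefined];
let intervalWithEnds = padded
  .slice(0, -1)
  .map((value, index) => ({ prev: value, next: padded[index + 1] }))
  .find(pair =>
    (pair.prev === undefined || pair.prev < inputValue) &&
    (pair.next === undefined || inputValue <= pair.next));
// e.g. inputValue = 5  -> { prev: undefined, next: 10 }
//      inputValue = 60 -> { prev: 50, next: undefined }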
Final notes
By using this coding style of returning a value instead of modifying an outside variable, you not only communicate your intent better to other developers, you then get the chance to use const instead of let if the variable will not be reassigned afterward.
I encourage you to browse the documentation of the various Array prototype functions at MDN; doing this will help you sort them out.
I would suggest using Array#some instead of Array#forEach.
Array#forEach keeps iterating over the array even after the given condition has been fulfilled.
Array#some stops iterating as soon as the given condition is fulfilled.
One advantage is performance; another, depending on your purposes, is behavior: Array#forEach keeps overwriting the result with every value that passes the condition, while Array#some assigns the first found value and stops the iteration.
let test,
    values = [4, 5, 6, 7],
    someNumber = 5;

values.some((value, idx) => {
  if (someNumber >= value) {
    test = value;
    return true; // returning a truthy value stops the iteration
  }
});
console.log(test);
Another option would be to use the Array.some() method.
let test;
const someNumber = 10;

[1, 5, 10, 15].some(function (value) {
  if (value > someNumber) {
    return test = value; // the assignment evaluates to value, which is truthy here
  }
})
One advantage of the .some() method over your original solution is optimization, as it will return as soon as the condition has been met.
How about an Object? You can search it with for...of.
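Presumably something along these lines is meant (my own guess at an implementation, since this answer doesn't include one):

// Guess: keep the values keyed in an object, iterate them, stop early
const values = { a: 4, b: 5, c: 6, d: 7 };
const someNumber = 5;

let test;
for (const value of Object.values(values)) {
  if (someNumber >= value) {
    test = value;
    break; // unlike forEach, a plain loop can stop as soon as it matches
  }
}
console.log(test); // 4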
Preface
Notice: This question is about complexity. I use a complex design pattern here, which you don't need to understand in order to understand the question. I could have simplified it further, but I chose to keep it relatively untouched for the sake of preventing mistakes. The code is written in TypeScript, which is a superset of JavaScript.
The code
Regard the following class:
export class ConcreteFilter implements Filter {
  interpret() {
    // rows is a very large array
    return (rows: ReportRow[], filterColumn: string) => {
      return rows.filter(row => {
        // I've hidden the implementation for simplicity,
        // but it usually returns either an empty array or a very short one.
      }).map(row => <string>row[filterColumn]);
    };
  }
}
It receives an array of report rows, then it filters the array by some logic that I've hidden. Finally, it does not return the whole rows, but only one stringy column, the one named by filterColumn.
Now, take a look at the following function:
function interpretAnd(filters: Filter[]) {
  return (rows: ReportRow[], filterColumn: string) => {
    var runFilter = filters[0].interpret();
    var intersectionResults = runFilter(rows, filterColumn);
    for (var i = 1; i < filters.length; i++) {
      runFilter = filters[i].interpret();
      var results = runFilter(rows, filterColumn);
      intersectionResults = _.intersection(intersectionResults, results);
    }
    return intersectionResults;
  }
}
It receives an array of filters, and returns a distinct array of all the "filterColumn"s that the filters returned.
In the for loop, I get the results (string array) from every filter, and then make an intersection operation.
The problem
The report row array is large, so every runFilter operation is expensive (while, on the other hand, the filter array is pretty short). I want to iterate over the report row array as few times as possible. Additionally, the runFilter operation is very likely to return zero results or very few.
Explanation
Let's say that I have 3 filters and 1 billion report rows. The internal iteration, i.e. the iteration in ConcreteFilter, will happen 3 billion times, even if the first execution of runFilter returned 0 results, so I have 2 billion redundant iterations.
So I could, for example, check whether intersectionResults is empty at the beginning of every iteration and, if so, break the loop. But I'm sure that there are better solutions mathematically.
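For clarity, the early-exit version I mean would look roughly like this (my sketch, modifying the loop in interpretAnd above):

// Sketch: bail out as soon as the intersection becomes empty,
// since intersecting anything with an empty set stays empty
for (var i = 1; i < filters.length; i++) {
  if (intersectionResults.length === 0) break; // no point running more filters
  runFilter = filters[i].interpret();
  var results = runFilter(rows, filterColumn);
  intersectionResults = _.intersection(intersectionResults, results);
}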
Also, if the first runFilter execution returned, say, 15 results, I would expect the next execution to receive an array of only 15 report rows, meaning I want the intersection operation to influence the input of the next call to runFilter.
I can modify the report row array after each iteration, but I don't see how to do it in an efficient way that won't be even more expensive than now.
A good solution would be to remove the map operation and then pass the already-filtered array to each operation instead of the entire array, but I'm not allowed to do that because I must not change the result format of the Filter interface.
My question
I'd like to get the best solution you could think of as well as an explanation.
Thanks a lot in advance to everyone who spends their time trying to help me.
Not sure how effective this will be, but here's one possible approach you can take. If you preprocess the rows by the filter column, you'll have a way to retrieve the matched rows. If you typically have more than 2 filters, this approach may be more beneficial; however, it will be more memory intensive. You could branch the approach depending on the number of filters. There may be some TS constructs that are more useful; I'm not very familiar with it. There are some comments in the code below:
var map = {};
// Loop over every row, keeping a map from each filter value to the rows that have it.
allRows.forEach(row => {
  const v = row[filterColumn];
  let items;
  items = map[v] = map[v] || [];
  items.push(row)
});

let rows = allRows;
filters.forEach(f => {
  // Run the filter and keep the unique set of matched strings
  const matches = unique(f.execute(rows, filterColumn));
  // For each matched string, look up the rows with that value
  // and concat them together as the input for the next filter.
  rows = [].concat(...matches.map(m => map[m] || []));
});

// Loop over the rows that made it all the way through,
// extract the value and then unique() the collection.
return unique(rows.map(row => row[filterColumn]));
Thinking about it some more, you could use a similar approach but just do it on a per filter basis:
let rows = allRows;
filters.forEach(f => {
  const matches = f.execute(rows, filterColumn);
  let map = {};
  matches.forEach(m => {
    map[m] = true;
  });
  rows = rows.filter(row => !!map[row[filterColumn]]);
});
return unique(rows.map(row => row[filterColumn]));
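Both snippets assume a unique() helper that isn't shown above; a minimal sketch of one would be:

// Sketch of the assumed helper: de-duplicate an array, preserving order
function unique(array) {
  return Array.from(new Set(array));
}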