Underscore reduce for adding prices - JavaScript

I have an object in which I am storing the selections of a shopping cart. Because this is a Backbone app, I decided that to derive the subtotal of the cart items I would use reduce to sum the item prices multiplied by their quantities.
Here's a (contrived) example:
var App = {
    cart: {
        item: []
    }
};
//...
App.cart.item.push({ qty: 1, price: 34.95 });
App.cart.item.push({ qty: 3, price: 24.95 });
App.cart.item.push({ qty: 4, price: 1.99 });
App.cart.item.push({ qty: 13, price: 0.99 });
When the view is rendered the subtotal is calculated as follows:
_.reduce(App.cart.item,function(memo, num){ return memo + (num.price * num.qty); },0)
This sparked some debate:
Some colleagues argue that reduce is not the correct method to use here, and that I should instead use each to pass the items to a summing function, possibly with a memoize pattern to cache the result.
Another argues that I shouldn't use reduce without feeding it the output of a matching map.
Still others agree that reduce is the correct method, but say I should use the foldl alias instead, as it makes the intention clearer to future devs.
Admittedly I know very little about FP, so my use of the reduce method in Underscore is purely a means to an end: deriving a subtotal without having to store it as a separate property in the JSON on the server side.
Please explain to me why reduce is or is not the "correct" approach for producing sums of the object properties. I'm not looking for a pure functional approach here, obviously, as I'm only using reduce as a utility.

This is a pretty subjective question, so I suppose I'll give a subjective answer:
There are basically three things at issue here: actual code efficiency, code legibility, and code style. The last one doesn't really matter IMO except insofar as it affects legibility and the consistency of your code base.
In terms of efficiency, _.reduce is more efficient than _.map + _.reduce, so you can drop that idea. Using _.map creates a new array with the transformed values - there's no need for this.
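For illustration, the _.map + _.reduce version would look roughly like this sketch; it builds an intermediate array of line totals only to sum it immediately:
var lineTotals = _.map(App.cart.item, function (item) { return item.price * item.qty; });
var subtotal = _.reduce(lineTotals, function (memo, total) { return memo + total; }, 0);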
The _.each + summation function approach may be marginally less efficient (you need to keep track of your running-total variable somewhere else) and to my mind it's less legible, because it takes the relevant code and moves it further from the loop where you're using it. You also don't get a nice clean return value for your operation - instead, you have a variable hanging out somewhere in the outer scope that you need to first create and then re-use.
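Roughly, the _.each approach has to keep its running total in an enclosing scope:
var subtotal = 0;
_.each(App.cart.item, function (item) {
    subtotal += item.price * item.qty;
});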
I don't think "memoization" makes much sense here. You might want to cache the return value for a given cart, and invalidate that on item changes, but that's not actually memoizing. Memoization makes sense when a) an easily-hashed input will always produce the same answer, and b) calculating that answer is expensive. In this case, calculating the sum is probably cheaper than hashing the input (your item list).
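If you do want caching, a rough sketch of cache-and-invalidate (not memoization) could look like the following, assuming the items lived in a Backbone.Collection named cart; the wiring here is illustrative:
var cachedSubtotal = null;

cart.on('add remove change reset', function () {
    cachedSubtotal = null; // any change to the items invalidates the cached value
});

function subtotal() {
    if (cachedSubtotal === null) {
        cachedSubtotal = cart.reduce(function (memo, item) {
            return memo + item.get('price') * item.get('qty');
        }, 0);
    }
    return cachedSubtotal;
}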
In terms of legibility, I tend strongly towards using the Underscore aliases that shadow the names of real JS 1.8 methods (the exception, for me, is .any vs. .some, because I find it so much more legible). In this case, that means .reduce, not .inject or .foldl. That makes it easier for a JS dev to understand what's intended, and JS devs are who you care about.

Related

Realm-JS: Performant way to find the index of an element in sorted results list

I am searching for a performant way to find the index of a given Realm object in a sorted results list.
I am aware of this similar question, which was answered with using indexOf, so my current solution looks like this:
const sortedRecords = realm.objects('mySchema').sorted('time', true) // 'time' property is a timestamp
// grab element of interest by id (e.g. 123)
const item = realm.objectForPrimaryKey('mySchema','123')
// find index of that object in my sorted results list
const index = sortedRecords.indexOf(item)
My basic concern here is performance for larger datasets. Is the indexOf implementation of a Realm list optimized for this in any way, or is it the same as a plain JavaScript array's? I know there is the possibility to create indexed properties; would indexing the time property improve the performance in this case?
Note:
In the realm-js API documentation, the indexOf section does not reference Array.prototype.indexOf, as other sections do. This made me optimistic that it's a separate implementation, but it's not stated clearly.
Realm query methods return a Results object, which is quite different from an Array object. The main difference is that the former can change over time even without calling methods on it: adding and/or deleting records in the source schema can result in a change to the Results object.
The only common thing between Results.indexOf and Array.indexOf is the name of the method.
Given that, it's easy to also say that it makes no sense to compare the efficiency of the two methods.
In general, a problem common to all indexOf implementations is that they need a sequential scan, and in the worst case (i.e. the not-found case) a full scan is required. The worst-implemented indexOf executed against 10 elements has no impact on program performance, while the best-implemented indexOf executed against 1M elements can have a severe impact. When possible, it's always a good idea to avoid using indexOf on large amounts of data.
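If you find yourself calling indexOf repeatedly on the same large result set, one plain-JS workaround (not Realm-specific; the id property and the names below are illustrative) is to build a lookup table in a single pass and rebuild it whenever the results change:
function buildIndexById(records) {
    var indexById = new Map();
    for (var i = 0; i < records.length; i++) {
        indexById.set(records[i].id, i); // one O(N) pass
    }
    return indexById;
}

// O(1) lookups afterwards, but the table must be rebuilt whenever the
// sorted results change (e.g. from a change listener).
var indexById = buildIndexById(sortedRecords);
var index = indexById.has('123') ? indexById.get('123') : -1;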
Hope this helps.

Javascript: Efficiently move items in and out of a fixed-size array

If I have an array that I want to be of fixed size N for the purpose of caching the most recent N items, then once limit N is reached, I'll have to get rid of the oldest item while adding the newest item.
Note: I don't care if the newest item is at the beginning or end of the array, just as long as the items get removed in the order that they are added.
The obvious ways are either:
push() and shift() (so that cache[0] contains the oldest item), or
unshift() and pop() (so that cache[0] contains the newest item)
Basic idea:
var cache = [], limit = 10000;

function cacheItem( item ) {
    // In case we want to do anything with the oldest item
    // before it's gone forever.
    var oldest = [];
    cache.push( item );
    // Use WHILE and >= instead of just IF in case the cache
    // was altered by more than one item at some point.
    while ( cache.length >= limit ) {
        oldest.push( cache.shift() );
    }
    return oldest;
}
However, I've read about performance issues with shift and unshift, since they alter the beginning of the array and move everything else around, but unfortunately, one of those methods has to be used to do it this way!
Qs:
Are there other ways to do this that would be better performance-wise?
If the two ways I already mentioned are the best, are there specific advantages/disadvantages I need to be aware of?
Conclusion
After doing some more research into data structures (I've never programmed in other languages, so if it's not native to Javascript, I likely haven't heard of it!) and doing a bunch of benchmarking in multiple browsers with both small and large arrays as well as small and large numbers of reads / writes, here's what I found:
The 'circular buffer' method proposed by Bergi is hands-down THE best as far as performance goes (for reasons explained in the answer and comments), and hence it has been accepted as the answer. However, it's not as intuitive, and it makes it difficult to write your own 'extra' functions (since you always have to take the offset into account). If you're going to use this method, I recommend an already-created one like this circular buffer on GitHub.
The 'pop/unshift' method is much more intuitive and performs fairly well, except at the most extreme numbers.
The 'copyWithin' method is, sadly, terrible for performance (tested in multiple browsers), quickly creating unacceptable latency. It also has no IE support. It's such a simple method! I wish it worked better.
The 'linked list' method, proposed in the comments by Felix Kling, is actually a really good option. I initially disregarded it because it seemed like a lot of extra stuff I didn't need, but to my surprise....
What I actually needed was a Least Recently Used (LRU) Map (which employs a doubly-linked list). Now, since I didn't specify my additional requirements in my original question, I'm still marking Bergi's answer as the best answer to that specific question. However, since I needed to know if a value already existed in my cache, and if so, mark it as the newest item in the cache, the additional logic I had to add to my circular buffer's add() method (primarily indexOf()) made it not much more efficient than the 'pop/unshift' method. HOWEVER, the performance of the LRUMap in these situations blew both of the other two out of the water!
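For reference, a minimal LRU sketch can be built on a plain Map (which preserves insertion order) instead of a full doubly-linked-list implementation; the names here are illustrative and this is not the library I used:
function LRUCache(limit) {
    this.limit = limit;
    this.map = new Map();
}

LRUCache.prototype.put = function (key, value) {
    if (this.map.has(key)) this.map.delete(key);       // re-inserting marks it as newest
    this.map.set(key, value);
    if (this.map.size > this.limit) {
        this.map.delete(this.map.keys().next().value); // evict the oldest entry
    }
};

LRUCache.prototype.get = function (key) {
    if (!this.map.has(key)) return undefined;
    var value = this.map.get(key);
    this.map.delete(key);                              // touch: move it to the newest position
    this.map.set(key, value);
    return value;
};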
So to summarize:
Linked List -- most options while still maintaining great performance
Circular Buffer -- best performance for just adding and getting
Pop / Unshift -- most intuitive and simplest
copyWithin -- terrible performance currently, no reason to use
If I have an array that caches the most recent of N items, once limit N is reached, I'll have to get rid of the oldest while adding the newest.
You are not looking to copy stuff around within the array, which would take O(n) steps every time.
Instead, this is the perfect use case for a ring buffer. Just keep an offset to the "start" and "end" of the list, then access your buffer with that offset and modulo its length.
var cache = new Array(10000);
cache.offset = 0;

function cacheItem(item) {
    cache[cache.offset++] = item;
    cache.offset %= cache.length;
}

function cacheGet(i) { // backwards, 0 is most recent
    return cache[(cache.offset - 1 - i + cache.length) % cache.length];
}
You could use Array#copyWithin.
The copyWithin() method shallow copies part of an array to another location in the same array and returns it, without modifying its size.
Description
The copyWithin works like C and C++'s memmove, and is a high-performance method to shift the data of an Array. This especially applies to the TypedArray method of the same name. The sequence is copied and pasted as one operation; pasted sequence will have the copied values even when the copy and paste region overlap.
The copyWithin function is intentionally generic, it does not require that its this value be an Array object.
The copyWithin method is a mutable method. It does not alter the length of this, but will change its content and create new properties if necessary.
var array = [0, 1, 2, 3, 4, 5];
array.copyWithin(0, 1);
console.log(array); // [1, 2, 3, 4, 5, 5]
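A rough sketch of how copyWithin could back a fixed-size cache (the asker later found this approach slow in practice, so treat it as illustration only; the names are made up):
var cache = new Array(10000);
var size = 0;

function cacheItem(item) {
    if (size < cache.length) {
        cache[size++] = item;           // still filling up
    } else {
        cache.copyWithin(0, 1);         // drop the oldest by shifting everything left
        cache[cache.length - 1] = item; // newest item goes at the end
    }
}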
If the item already exists in your cache, you need to splice it out and put it at the front using unshift (as the newest item). If the item doesn't already exist in your cache, then you can pop the oldest and unshift the new one.
function cacheItem( item ) {
    var index = cache.indexOf( item );
    if ( index !== -1 ) {
        cache.splice( index, 1 );   // already cached: remove it so it can move to the front
    } else if ( cache.length >= limit ) {
        cache.pop();                // cache is full: drop the oldest item
    }
    cache.unshift( item );          // newest item goes to the front
}
item needs to be a String or Number; otherwise you'll need to write your own lookup using findIndex to locate the object (if item is an object).
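For example, if item were an object identified by an id property (hypothetical), the lookup could look like:
var index = cache.findIndex(function (cached) {
    return cached.id === item.id;
});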

Modify Key of the object by appending a field from Value into Key using Ramda

I am trying to apply Ramda to the following problem:
{
    "alpha": {
        "reg": "alpha1",
        "reg2": "alpha2"
    },
    "beta": {
        "reg": "beta1",
        "reg2": "beta2"
    }
}
Output
{
    "v:alpha|reg:alpha1": {
        "reg": "alpha1",
        "reg2": "alpha2"
    },
    "v:beta|reg:beta1": {
        "reg": "beta1",
        "reg2": "beta2"
    }
}
Basically, the output is the same object, but with each key modified by combining the original key and a field from its value to form the new key.
For example, if the key is "alpha" and its value is an object with reg="alpha1", the key should become v:alpha|reg:alpha1: v is a fixed string prepended to the start of every key, and reg:alpha1 from the value is appended after it.
Thanks
I think what people are saying in the comments is mostly right. (Disclaimer: I'm one of the authors of Ramda.)
Let's imagine Ramda has a renameBy function:
// Assumes the usual Ramda functions are in scope, e.g.:
// const { curry, pipe, toPairs, map, apply, fromPairs } = R;
const renameBy = curry((fn, obj) => pipe(
    toPairs,
    map(pair => [apply(fn, pair), pair[1]]),
    fromPairs
)(obj));
Ramda doesn't include this now, but there is a slightly less powerful version in Ramda's Cookbook. It's certainly conceivable that it will one day be included. But failing that, you could include such a function in your own codebase. (If we really tried, we could probably make that points-free, but as I'm trying to explain, that is probably unnecessary.)
How much does this gain you?
You could then write a transformation function like
const transform = renameBy((k, v) => `v:${k}|reg:${v.reg}`);
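Illustrative usage, assuming the input object above is bound to a variable named input:
transform(input);
//=> {
//     "v:alpha|reg:alpha1": { reg: "alpha1", reg2: "alpha2" },
//     "v:beta|reg:beta1":  { reg: "beta1",  reg2: "beta2"  }
//   }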
Granted, that is now simpler than the one in zerkms's comment.
But the fundamentally complex bit of that version is retained here: the function that performs string interpolation on your key and value to yield a new key. Any attempt to make that points-free is likely to be substantially uglier than this version.
Perhaps this is worth it to you. It does separate concerns between the action of renaming keys and your particular key-generation scheme. And it reduces visual complexity... but only if you move that renameBy out of sight into some utilities library. If you're going to have multiple uses of renameBy, or if you simply prefer building up a more complete utility library in order to keep your other code more focused, then this might be a reasonable approach.
But it doesn't reduce the essential complexity of the code.
I do look to points-free solutions when it's reasonable. But they are just another tool in my toolbox. They are only worth it if they make the code more readable and maintainable. If they add complexity, I would suggest you don't bother.
You can see this version in the Ramda REPL.

Is it a bad idea to use indexOf inside loops?

I was studying big O notation for a technical interview and realized that JavaScript's indexOf method may have a time complexity of O(N), as it traverses each element of an array and returns the index where the value is found.
We also know that O(n^2) (n squared) time complexity is not a good performance characteristic for larger data.
So is it a bad idea to use indexOf inside loops? In JavaScript, it's common to see code where the indexOf method is used inside loops, maybe to check equality or to prepare some object.
Rather than arrays, should we prefer objects wherever possible, since they provide lookup with constant-time performance, O(1)?
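For illustration, this is the kind of replacement I have in mind (a minimal sketch with made-up data):
const haystack = ['b', 'x', 'y', 'a'];
const needles = ['a', 'b', 'c'];

// O(n*m): indexOf rescans haystack for every needle
const slow = needles.filter(n => haystack.indexOf(n) !== -1);

// O(n + m): build a Set once, then each lookup is O(1) on average
const haystackSet = new Set(haystack);
const fast = needles.filter(n => haystackSet.has(n));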
Any suggestions will be appreciated.
It can be a bad idea to use indexOf inside loops, especially if the data structure you are searching through is quite large.
One workaround for this is to keep a hash table or dictionary containing the index of every item. You can generate it in O(N) time by looping through the data structure once, and update it every time you add to the data structure.
If you push something onto the end of the data structure, it will take O(1) time to update this table; the worst-case scenario is pushing something onto the beginning, which takes O(N) because every stored index shifts.
In most scenarios it will be worth it, as getting the index becomes an O(1) operation.
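A minimal sketch of that index table (the names are illustrative):
const items = [];
const indexByItem = new Map();

function append(item) {
    indexByItem.set(item, items.length); // O(1) update when appending to the end
    items.push(item);
}

function fastIndexOf(item) {
    return indexByItem.has(item) ? indexByItem.get(item) : -1; // O(1) lookup
}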
To be honest, tl;dr. But I did some speed tests of the various ways of checking for occurrences in a string (if that is your goal for using indexOf; if you are actually trying to get the position of the match, I personally don't know how to help you there). The ways I tested were:
.includes()
.match()
.indexOf()
(There are also variants such as .search(), .lastIndexOf(), etc., which I have not tested.)
Here is the test:
var test = 'test string';
console.time('match');
console.log(test.match(/string/));
console.timeEnd('match');
console.time('includes');
console.log(test.includes('string'));
console.timeEnd('includes');
console.time('indexOf');
console.log(test.indexOf('string') !== -1); // -1 means "not found"
console.timeEnd('indexOf');
I know they are not loops, but this shows you that all of them are basically the same speed. And honestly, each does different things; depending on what you need (do you want to search by RegEx? Do you need to be pre-ECMAScript 2015 compatible? etc. I have not even listed all of them), is it really necessary to analyze it this much?
From my tests, sometimes indexOf() would win, sometimes one of the other ones would win.
Based on the browser, indexOf has different implementations (using graphs, trees, ...), so the time complexity of each indexOf also differs.
What is clear, though, is that implementing indexOf as a naive O(n) scan would be simplistic, and I don't think any browser implements it as a simple loop. Therefore, using indexOf in a for loop is not the same as using two nested for loops.
So, this one:
// could be O(n*m), where m is very small
// could be O(log n)
// or any other O(something) that is for sure smaller than O(n^2)
console.time('1');
firstArray.forEach(item => {
    secondArray.indexOf(item);
});
console.timeEnd('1');
is different than:
// has O(n^2)
console.time('2');
firstArray.forEach(item => {
    secondArray.forEach(secondItem => {
        // extra things to do here
    });
});
console.timeEnd('2');

Should I filter my data before calling setProps on Reactjs?

Is it faster if I filter my data before calling setProps (or setState) for Reactjs?
var component = React.renderComponent(
    <MainApp />,
    document.getElementById("container")
);

var data = {
    name: "p",
    list: [
        { id: 1, name: "" },
        { id: 2, name: "" },
        { id: 3, name: "" },
        //...
    ]
};

component.setProps(data);
data.name = "w";
Now I want to update "p" to "w". Is it more efficient/faster if I do this:
component.setProps(data);
Or this:
component.setProps({name: "w"});
The latter doesn't require me to put the whole data object in again, but then I have to do my own filtering.
If I put in the whole object again with only one thing changed, does ReactJS need to process the whole object in setProps/setState, which slows it down, or will it need to process everything anyway inside render, so it makes no difference?
Edit 1 hour later:
I didn't get a satisfactory technical answer on which is faster.
Rather than get to the bottom of it, I just did a quick jsPerf test case to see which is faster, and will simply accept the result as a statistical truth rather than a technological truth.
http://jsperf.com/reactjs-setprops-big-or-setprops-small
The surprising thing is that giving it the large object is faster, by a negligible amount. (I thought it would be slower.) I suppose internally ReactJS has to process the whole object anyway, regardless of whether my input was small or big. It isn't able to skip over processing the large (unmodified) object, so there are no time savings in not passing it the large object.
Passing only the key you need to update is faster by a microscopic amount, but you should pass in whatever's most convenient -- doing your own filtering will cancel out any time advantage that you might gain from passing fewer keys.
Like I said in my other answer, these small changes won't make a large enough difference to worry about. Do whatever's easiest, then only if your app is slow, look at adding shouldComponentUpdate methods to reduce unnecessary updates.
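A minimal sketch of that shouldComponentUpdate idea, using the createClass API of the same era (NameLabel and its prop are made up for illustration):
var NameLabel = React.createClass({
    shouldComponentUpdate: function (nextProps) {
        // Only re-render when the one prop this component cares about has changed.
        return nextProps.name !== this.props.name;
    },
    render: function () {
        return <span>{this.props.name}</span>;
    }
});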
