I have read this article:
https://gamealchemist.wordpress.com/2013/05/01/lets-get-those-javascript-arrays-to-work-fast/
At the end of point 6 the author says:
Rq about shift/unshift : beware, those are always O(n) operations
(meaning : each operation will take a time proportionnal to the number
of the array length). Unless you really need to, you shouldn’t use
them. Rather build your own rotating array if you need such feature.
And in the 7th point:
Rq for shift/unshift users : apply the same principle, with two
indexes, to avoid copy/reallocation. One index on the left, one on the
right, both starting at the middle of the array. Then you’ll be again
in O(1) time. Better. Don’t forget to re-center the indexes when they
are ==.
I was wondering what the author means by "build your own rotating array" and "two indexes, ... One index on the left, one on the right, both starting at the middle of the array". How should these considerations be translated into code? (The author doesn't give an example for these use cases.)
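For what it's worth, my rough reading of the "rotating array with two indexes" idea is something like the sketch below (the names are my own, it doesn't handle running out of room at either end of the buffer, and I may well be misreading the author):

// Sketch of a deque-like "rotating array": head and tail indexes start in
// the middle of a preallocated buffer so both ends can grow/shrink in O(1).
function RotatingArray(capacity) {
  this.buffer = new Array(capacity);
  this.head = Math.floor(capacity / 2); // index of the first element
  this.tail = this.head;                // index one past the last element
}
RotatingArray.prototype.push = function (value) {    // like Array#push, O(1)
  this.buffer[this.tail++] = value;
};
RotatingArray.prototype.unshift = function (value) { // like Array#unshift, O(1)
  this.buffer[--this.head] = value;
};
RotatingArray.prototype.shift = function () {        // like Array#shift, O(1)
  if (this.head === this.tail) return undefined;
  var value = this.buffer[this.head];
  this.buffer[this.head++] = undefined;
  return value;
};
RotatingArray.prototype.recenter = function () {     // "re-center the indexes when they are =="
  if (this.head === this.tail) {
    this.head = this.tail = Math.floor(this.buffer.length / 2);
  }
};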
Could the principles applied to shift and unshift be applied to Array.prototype.splice too?
EDIT: I have an ordered array of x coordinates going from index 0 (lower x values) to n (higher x values). I need to call myArray.splice(index, 0, item); several times to insert new x coordinates between the existing ones whenever the new coordinate is < a higher one and > a lower one (I can easily find the position with a binary search), and I don't want the array to shift all the indexes every time I call splice, because myArray has thousands of elements.
Can it be improved using the principles mentioned by the author of the linked article?
Thanks for the attention.
All performance questions must be answered by coding a very specific solution and then measuring the performance of that solution compared to your alternative with representative data in the browsers you care about. There are very few performance questions that can be answered accurately with an abstract question that does not include precise code to be measured.
There are some common sense items like if you're going to put 1000 items in an array, then yes it is probably faster to preallocate the array to the final length and then just fill in the array values rather than call .push() 1000 times. But, if you want to know how much difference there is and whether it's actually relevant in your particular situation, then you will need to code up two comparisons and measure them in multiple browsers in a tool like http://jsperf.com.
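As a crude illustration of that kind of comparison (a quick sketch using console.time; a tool like jsPerf will give you far more reliable numbers than this):

var N = 100000;

console.time('push');
var a = [];
for (var i = 0; i < N; i++) {
  a.push(i);
}
console.timeEnd('push');

console.time('preallocate');
var b = new Array(N); // preallocated to the final length
for (var j = 0; j < N; j++) {
  b[j] = j;
}
console.timeEnd('preallocate');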
The recommendation in that article to create your own .splice() function seems suspect to me without measuring. It seems very possible that a good native code implementation of .splice() could be faster than one written entirely in Javascript. Again, if you really wanted to know, you would have to measure a specific test case.
If you have lots of array manipulations to do and you want to end up with a sorted array, it might be faster to just remove items, add new items onto the end of the array and then call .sort() with a custom comparison function when you're done, rather than inserting every new item in sorted order. But, again, which way is faster will depend entirely upon exactly what you are doing, how often you're doing it and which browsers you care about the most. Measure, measure, measure if you really want to know.
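To illustrate the "append everything, then sort once" idea, here's a minimal sketch of the two approaches (whether one actually beats the other still has to be measured):

// Variant A: keep the array sorted on every insert (each splice is O(n)).
function insertSorted(arr, value) {
  var lo = 0, hi = arr.length;
  while (lo < hi) {                 // binary search for the insertion point
    var mid = (lo + hi) >> 1;
    if (arr[mid] < value) lo = mid + 1; else hi = mid;
  }
  arr.splice(lo, 0, value);
}

// Variant B: push everything, then sort once at the end.
function insertBatch(arr, values) {
  for (var i = 0; i < values.length; i++) arr.push(values[i]);
  arr.sort(function (a, b) { return a - b; }); // custom numeric comparison
}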
As to whether your specific situation in your edit can be improved with a custom .splice(), you'd have to code it up both ways with a representative data set and then test in a tool like jsPerf in multiple browsers to answer the question. Since you haven't provided code or data to test, none of us can really answer that one for you. There is no generic answer that works for all possible uses of .splice() on all possible data sets in all possible browsers. The devil is in the details, and the details are in all the specifics of your situation.
My guess is that if you're really performance tweaking, you will often find bigger fish to fry in making your overall algorithm smarter (so it has less work to do in the first place) than by trying to rewrite array methods. The goal is to test/measure enough to understand where your performance bottlenecks really are so you can concentrate on the one area that makes the most difference and not spend a lot of time guessing about what might make things faster. I'm always going to code a smart use of the existing array methods and only consider a custom coded solution when I've proven to myself that I have a bottleneck in one particular operation that actually matters to the experience of my app. Premature optimization will just make the code more complicated and less maintainable and will generally be time poorly spent.
I was thinking about the same problem and ended up using a B+ tree data structure. It takes time and is not easy to implement, but the results are really good. It can be seen as combining the good aspects of both an array and a linked list:
In terms of search performance, it is comparable to binary search on an array (possibly even a bit better, though I'm not sure; at least it's close).
Modifying the set (insert, delete) does not affect the indexes of all the other elements (the affected range is a very small constant: the length of a block).
I would like to hear your thoughts; you can check this link for a visualization of a B+ tree in action.
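This is not a real B+ tree, but to make the "small affected range" point concrete, here is a toy sketch of the underlying idea: keep the data in fixed-size blocks so an insert only shifts elements within one block (the block lookup is a plain linear scan here, and deletion/merging is omitted):

var BLOCK_SIZE = 64;
var blocks = [[]];   // the data set is a list of small sorted blocks

function insert(value) {
  // Find the block this value belongs to (linear scan here; a real B+ tree
  // keeps an index over the blocks so this step is O(log n)).
  var b = 0;
  while (b < blocks.length - 1 && blocks[b + 1][0] <= value) b++;
  var block = blocks[b];

  // Insert inside that block: at most BLOCK_SIZE elements get shifted.
  var i = 0;
  while (i < block.length && block[i] < value) i++;
  block.splice(i, 0, value);

  // Split a full block in two; all other blocks are untouched.
  if (block.length > BLOCK_SIZE) {
    blocks.splice(b + 1, 0, block.splice(BLOCK_SIZE >> 1));
  }
}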
As the title says, I'm trying to research how to solve non-numeric, non-exponential linear algebraic expressions for a single variable and show the evaluation at each step until it's solved.
Ex: A+B(X+Z)/C = Y, solve for B
Ideally, I would like it to output in a similar manner to how symbolab.com does it here.
I have questions on a number of things:
1) How would one initialize variables/constants that have no numeric value to begin with, so that they can be algebraically manipulated? And what type would they have? (Chars? Or something else?)
2) Does anyone have an idea of the methods symbolab is using to solve the above example?
3) I'm a fairly inexperienced coder, so before I embark on my own path, is any of what I'm asking already possible via a C# math library like SmartMathLibrary, Math.NET, or NCalc? Can you cite any examples that use only alphabet variables?
4) Does the fact that I don't need exponential operation (nor square roots) simplify this at all -- so that I might be able to do it in a fairly simple way?
5) Might the method solution have something to do with ASTs like in this question?
And more background -- why I'm trying to learn this:
I'm working on creating a VR puzzle game in Unity (for Gear VR/Rift/Vive) that uses primitive math manipulatives (cubes) to teach very simple (but specific) algebra rules (isolation, substitution, and inequalities). The manipulatives pass between two tables (literally representing the LHS and RHS of an equation), with the intended goal of isolating one cube variable. The cubes are only manipulable using simplistic algebraic rules that the user discovers through the act of play, so nothing is spoonfed. (Whether this sounds pointless or would even be useful in the manner I've come up with is beside the point; it's really an exercise for me to gain a better grasp of programming, and because I feel VR could be an interesting means of math education. The program wouldn't necessarily just be for math-oriented people; it would also be for people who struggle with the way it's normally taught and are more visual and hands-on in their approach to learning.)
Relating to my initial question, I need to figure out how to map the user's interactions onto a pre-evaluated expression (or at least, that's what I think I need to do), in order to check their progress at each step (hence my interest in how symbolab.com evaluates non-numeric variables) and to allow them to step back and forward and check their answer at each point (which incurs various point deductions, unless they only check their answer when they think they're finished, at which point they'd get a bonus). It's my belief that I need to write my own behind-the-scenes algebra parser (hence the question), because I want the initial setup to be randomized given the constraints of the game tables (only six placements above and below each table on either side).
Here's a static walkthrough video I created that hopefully demonstrates what the actual game is meant to play like (though I realize it may still be confusing because it's entirely static and has no user interaction via a Gear VR/Rift/Vive controller).
Ultimately, I would like to make this an open-source project on GitHub that could be used in a classroom setting.
What is the current standard in JavaScript in 2017: for() loops vs .forEach?
I am currently working my way through Colt Steele's "Web Dev Bootcamp" on Udemy and he favours forEach over for in his teachings. I have, however, searched for various things during the exercises as part of the coursework and I find more and more recommendations to use a for loop rather than forEach. Most people seem to state the for loop is more efficient.
Is this something that has changed since the course was written (circa 2015), or are there really pros and cons to each that one will learn with more experience?
Any advice would be greatly appreciated.
for
for loops are much more efficient. A for loop is a looping construct specifically designed to iterate while a condition is true, at the same time offering a stepping mechanism (generally to increment the iterator). Example:
for (var i=0, n=arr.length; i < n; ++i ) {
...
}
This isn't to suggest that for-loops will always be more efficient, just that JS engines and browsers have optimized them to be so. Over the years there have been compromises as to which looping construct is more efficient (for, while, reduce, reverse-while, etc) -- different browsers and JS engines have their own implementations that offer different methodologies to produce the same results. As browsers further optimize to meet performance demands, theoretically [].forEach could be implemented in such a way that it's faster or comparable to a for.
Benefits:
efficient
early loop termination (honors break and continue)
condition control (i<n can be anything and not bound to an array's size)
variable scoping (var i leaves i available after the loop ends)
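For example, a plain for loop can stop as soon as it has found what it is looking for, which forEach cannot do directly (you would have to switch to some()/every() or throw):

// Find the index of the first negative number, then stop looking.
var nums = [4, 8, -2, 16, -5];
var firstNegative = -1;
for (var i = 0; i < nums.length; ++i) {
  if (nums[i] < 0) {
    firstNegative = i;
    break; // early termination: the rest of the array is never visited
  }
}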
forEach
.forEach is a method that primarily iterates over arrays (and is also available on other collections, such as Map and Set objects). It is newer and produces code that is subjectively easier to read. Example:
[].forEach((val, index)=>{
...
});
Benefits:
does not involve variable setup (iterates over each element of the array)
functions/arrow-functions scope the variable to the block
In the example above, val would be a parameter of the newly created function. Thus, any variable called val before the loop would hold its value after it ends.
subjectively more maintainable as it may be easier to identify what the code is doing -- it's iterating over an enumerable; whereas a for-loop could be used for any number of looping schemes
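A small sketch of the scoping point:

var val = 'outside';
[10, 20, 30].forEach(function (val, index) {
  console.log(index, val); // val here is the current array element
});
console.log(val); // still 'outside': the callback parameter shadowed it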
Performance
Performance is a tricky topic, which generally requires some experience when it comes to forethought or approach. In order to determine ahead of time (while developing) how much optimization may be required, a programmer must have a good idea of past experience with the problem case, as well as a good understanding of potential solutions.
Using jQuery in some cases may be too slow at times (an experienced developer may know that), whereas at other times it may be a non-issue, in which case the library's cross-browser compliance and ease of performing other functions (e.g., AJAX, event handling) would be worth the development (and maintenance) time saved.
Another example: if performance and optimization were everything, there would be no code other than machine code or assembly. Obviously that isn't the case, as there are many different high-level and low-level languages, each with their own tradeoffs. These tradeoffs include, but are not limited to, specialization, development ease and speed, maintenance ease and speed, optimized code, error-free code, etc.
Approach
If you don't have a good understanding of whether something will require optimized code, it's generally a good rule of thumb to write maintainable code first. From there, you can test and pinpoint what needs more attention when it's required.
That said, certain obvious optimizations should be part of general practice and not require any thought. For instance, consider the following loop:
for (var i=0; i < arr.length; ++i ){}
For each iteration of the loop, JavaScript retrieves arr.length, a key lookup that costs operations on every cycle. There is no reason why this shouldn't be:
for (var i=0, n=arr.length; i < n; ++i){}
This does the same thing, but only retrieves arr.length once, caching the variable and optimizing your code.
I was recently asked in an interview about the advantages and disadvantages of linked lists and arrays for implementing a dictionary of words, and also what the best data structure for implementing it would be. This is where I messed things up. After googling, I couldn't find an answer specific to dictionaries, only general linked-list-vs-array explanations. What is the best answer to the above question?
If you're just going to use it for lookups, then an array is the obvious best choice of the two. You can build the dictionary from a list of words in O(n log n)--just build an array and sort it. Lookups are O(log n) with a binary search.
Although you can build a linked list of words in O(n), lookups will require, on average, that you look at n/2 words. The difference is pretty large. Given an English dictionary of 128K words, a linked list lookup will take on average 64,000 string comparisons. A binary search will require at most 17.
In addition, a linked list of n words will occupy more memory than an array of n words, because you need the next pointer in the list.
If you need the ability to update the dictionary, you'll probably still want to use an array if updates are infrequent compared to lookups (which is almost certainly the case). I can't think of a real-world example of a dictionary of words that's updated more frequently than it's queried.
As others have pointed out, neither array nor linked list is the best choice for a dictionary of words. But of the two options you're given, array is superior in almost all cases.
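For concreteness, a sorted-array dictionary along those lines might look like this sketch (build once, binary-search every lookup; the function names are just for illustration):

// Build once: O(n log n).
function buildDictionary(words) {
  return words.slice().sort();
}

// Look up with a binary search: O(log n).
function contains(dict, word) {
  var lo = 0, hi = dict.length - 1;
  while (lo <= hi) {
    var mid = (lo + hi) >> 1;
    if (dict[mid] === word) return true;
    if (dict[mid] < word) lo = mid + 1; else hi = mid - 1;
  }
  return false;
}

var dict = buildDictionary(['pear', 'apple', 'orange', 'banana']);
console.log(contains(dict, 'orange')); // true
console.log(contains(dict, 'grape'));  // false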
There is no one answer.
The two obvious choices would be something based on a hash table if you only want to look up individual items, or something based on a balanced tree if you want to look up ranges of items.
A sorted array can work well if you do a lot of searching and relatively little insertion or deletion. Finding situations where linked lists are preferred is rather more difficult. Depending on the situation (especially such things as finding all the words that start with, say, "ste"), tries can also work extremely well (and often do well at minimizing the storage needed for a given set of data as well).
Those are really broad categories though, not specific implementations. There are also variations such as extensible hashing and distributed hash tables that can be useful in specific situations (and also have somewhat tree-like properties, so things like range-based searching can be reasonably efficient).
The best data structure for implementing dictionaries is a suffix tree. You can also have a look at tries.
Well, if you're building a dictionary, you'd want it to be a sorted structure. So you're going for a sorted-array or a sorted linked-list.
For a linked list retrieval is O(n) since you have to examine all words until you find the one you need. For a sorted array, you can use binary search to find the right location, which is O(log n).
For a sorted array, insertion is O(log n) to find the right location (binary search) and then O(n) to insert because you need to push everything down. For a linked list, it would be O(n) to find the location and then O(1) to insert because you only have to adjust pointers. The same applies for deletion.
Since you aren't going to be updating a dictionary much, you can just build and then sort the array in O(n log n) time (using quicksort, for example). After that, lookup is O(log n) using binary search. Furthermore, as delnan mentioned below, using an array has the advantage that everything you access is sequential in memory; i.e., the data are localized (locality of reference). This minimizes cache misses (which are expensive). With a linked list, the data are spread out all over and there is no guarantee that they are close together, which increases the chance of cache misses. With this in mind, given the two options, use the array.
You can do an even better job if you implement a sorted hashmap using a red-black tree (your tree entries, which are the keys, can be coupled with a hashmap); here search, insert, and delete are O(log n). But it really depends on your behavior profile; if you're only doing lookup, a simple hashmap would be best (O(1) retrieval).
Another interesting data-structure you can use is a Trie, where insertion and lookup are O(m); m being the length of the string.
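A minimal trie sketch for context (insert and lookup both walk the m characters of the word; the names are just for illustration):

function TrieNode() {
  this.children = Object.create(null);
  this.isWord = false;
}

function trieInsert(root, word) {   // O(m), m = length of the word
  var node = root;
  for (var i = 0; i < word.length; i++) {
    var ch = word[i];
    if (!node.children[ch]) node.children[ch] = new TrieNode();
    node = node.children[ch];
  }
  node.isWord = true;
}

function trieLookup(root, word) {   // O(m)
  var node = root;
  for (var i = 0; i < word.length; i++) {
    node = node.children[word[i]];
    if (!node) return false;
  }
  return node.isWord;
}

var root = new TrieNode();
trieInsert(root, 'steam');
console.log(trieLookup(root, 'steam')); // true
console.log(trieLookup(root, 'ste'));   // false (only a prefix)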
I am currently working on a project that requires me to iterate through a list of values and add a new value in between each value already in the list. This is going to be happening for every iteration so the list will grow exponentially. I decided that implementing the list as a Linked List would be a great idea. Now, JS has no default Linked List data structure, and I have no problem creating one.
But my question is, would it be worth it to create a simple Linked List from scratch, or would it be a better idea to just create an array and use splice() to insert each element? Would it, in fact, be less efficient due to the overhead?
Use a linked list. In fact, most custom implementations done well in user JavaScript will beat built-in implementations due to spec complexity and decent JITting. For example, see https://github.com/petkaantonov/deque
What george said is literally 100% false on every point, unless you take a time machine to 10 years ago.
As for implementation, do not create an external linked list that contains the values; instead, make the values themselves naturally linked-list nodes. You will otherwise use way too much memory.
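In other words (a sketch, assuming you control the shape of your value objects): put the next pointer on the value itself instead of allocating a separate node object per value.

// Wasteful: a separate wrapper node allocated for every value.
function Node(value) {
  this.value = value;
  this.next = null;
}

// Intrusive: the value objects themselves carry the link.
function makeItem(x) {
  return { x: x, next: null }; // the item *is* the list node
}

var head = makeItem(1);
head.next = makeItem(2);
head.next.next = makeItem(3);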
Inserting each element with splice() would be slower indeed (inserting n elements takes O(n²) time). But simply building a new array (appending new values and appending the values from the old one in lockstep) and throwing away the old one takes linear time, and most likely has better constant factors than manipulating a linked list. It might even take less memory (linked list nodes can have surprisingly large space overhead, especially if they aren't intrusive).
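A sketch of that rebuild-in-lockstep approach, assuming the new value to insert between two neighbours can be computed from them (here just their midpoint, as a placeholder):

// Build a new array with a generated value between every pair of old
// values in one linear pass, instead of calling splice() per insertion.
function interleave(oldArr, makeValue) {
  var out = [];
  for (var i = 0; i < oldArr.length; i++) {
    out.push(oldArr[i]);
    if (i + 1 < oldArr.length) {
      out.push(makeValue(oldArr[i], oldArr[i + 1]));
    }
  }
  return out; // the old array can now be thrown away
}

var next = interleave([0, 10, 20], function (a, b) { return (a + b) / 2; });
console.log(next); // [0, 5, 10, 15, 20]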
JavaScript is an interpreted language. If you want to implement a linked list then you will be looping a lot, and the interpreter will perform very slowly. The built-in functions provided by the interpreter are optimized and compiled with the interpreter, so they will run faster. I would choose to slice the array and then concatenate everything again; it should be faster than implementing your own data structure.
Also, JavaScript passes by value, not by pointer/reference, so how are you going to implement a linked list?
Due to the reasons outlined in this question I am building my own client-side search engine rather than using the ydn-full-text library, which is based on fullproof. What it boils down to is that fullproof spawns "too freaking many records", on the order of 300,000 records, whilst (after stemming) there are only about 7700 unique words. So my 'theory' is that fullproof is based on traditional assumptions which only apply to the server side:
Huge indices are fine
Processor power is expensive
(and the assumption of dealing with longer records, which is not applicable to my case, as my records are on average only 24 words¹)
Whereas on the client side:
Huge indices take ages to populate
Processing power is still limited, but relatively cheaper than on the server side
Based on these assumptions I started off with an elementary inverted index (giving just 7700 records, as IndexedDB is a document/NoSQL database). This inverted index has been stemmed using the Lancaster stemmer (the most aggressive of the two or three popular ones), and during a search I retrieve the index for each of the words, assign a score based on overlap of the different indices and on similarity of the typed word vs the original (Jaro-Winkler distance).
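For reference, the elementary inverted index amounts to roughly this sketch (the Lancaster stemmer and the Jaro-Winkler scoring are stubbed out, since those come from separate libraries):

// stem() stands in for the Lancaster stemmer used in the real code.
function stem(word) { return word.toLowerCase(); }

// Build one record per unique stem, holding the ids of the snippets that contain it.
function buildIndex(docs) {
  var index = Object.create(null);
  docs.forEach(function (doc, id) {
    doc.split(/\s+/).forEach(function (word) {
      var s = stem(word);
      if (!index[s]) index[s] = [];
      if (index[s].indexOf(id) === -1) index[s].push(id);
    });
  });
  return index;
}

// Score documents by how many query stems they contain (the Jaro-Winkler
// similarity step against the original words is omitted here).
function search(index, query) {
  var scores = Object.create(null);
  query.split(/\s+/).forEach(function (word) {
    (index[stem(word)] || []).forEach(function (id) {
      scores[id] = (scores[id] || 0) + 1;
    });
  });
  return scores;
}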
Problem of this approach:
Combination of "popular_word + popular_word" is extremely expensive
So, finally getting to my question: How can I alleviate the above problem with a minimal growth of the index? I do understand that my approach will be CPU intensive, but as a traditional full text search index seems unusably big this seems to be the only reasonable road to go down on. (Pointing me to good resources or works is also appreciated)
¹ This is a more or less artificial splitting of unstructured texts into small segments; however, this artificial splitting is standardized in the relevant field, so it has been used here as well. I have not studied the effect on the index size of keeping these 'snippets' together and throwing huge chunks of text at fullproof. I assume that this would not make a huge difference, but if I am mistaken then please do point this out.
This is a great question, thanks for bringing some quality to the IndexedDB tag.
While this answer isn't quite production ready, I wanted to let you know that if you launch Chrome with --enable-experimental-web-platform-features then there should be a couple features available that might help you achieve what you're looking to do.
IDBObjectStore.openKeyCursor() - value-free cursors, in case you can get away with the stem only
IDBCursor.continuePrimaryKey(key, primaryKey) - allows you to skip over items with the same key
I was informed of these via an IDB developer on the Chrome team and while I've yet to experiment with them myself this seems like the perfect use case.
My thought is that if you approach this problem with two different indexes on the same column, you might be able to get that join-like behavior you're looking for without bloating your stores with gratuitous indexes.
While consecutive writes are pretty terrible in IDB, reads are great. Good performance across 7700 entries should be quite tenable.
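A rough, untested sketch of how those two features might be combined (the 'postings' store, its 'stem' index, and the skipTo id are made up for illustration; this assumes db is an already-open IDBDatabase and that Chrome was launched with the experimental flag):

var tx = db.transaction('postings', 'readonly');
var index = tx.objectStore('postings').index('stem');

// Value-free cursor over one stem: only key and primaryKey are materialized.
var request = index.openKeyCursor(IDBKeyRange.only('run'));
var skipTo = 42; // e.g. a document id obtained from another posting list

request.onsuccess = function (event) {
  var cursor = event.target.result;
  if (!cursor) return; // finished

  console.log(cursor.key, cursor.primaryKey); // stem + document id, no value read

  if (indexedDB.cmp(cursor.primaryKey, skipTo) < 0) {
    // Jump straight to (same stem, skipTo) instead of stepping one record at a time.
    cursor.continuePrimaryKey(cursor.key, skipTo);
  } else {
    cursor.continue();
  }
};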