jstree: incremental loading - javascript

The jsTree plugin for jQuery lets you load data to feed the navigable tree GUI element provided by the library.
For small trees, just load them all in memory, and you're done. But for large trees, that might not be a good solution. The all-in-memory approach doesn't scale.
Think 6000 nodes (or 60,000), most of which will never be relevant to the user viewing the page. Wouldn't it then be better to load only the first level of branches and incrementally load more branches following the user's clicks along what's already displayed? It certainly would.
You'd mark the tree where it has missing branches, then you'd load the missing branches on demand, remove the mark from the tree, graft the branch onto the tree, and proceed in that manner recursively if necessary.
How do you do incremental loading? I found a question from 2009 pertaining to the same problem, but the API appears to have changed. Does anyone have a recipe for how to proceed with the current version of the library?
Note incremental loading is not the same as incremental rendering, which is another optimization already provided by the library.

Lumi, that is how the plugin works.
Have a look at the Demo page, about half way down, at the section "PHP & mySQL demo + event order". The example uses the JSON format for transmitting data, which is the de facto standard, but the plugin also supports other formats. When you expand a parent node, an AJAX request is made to load the next level of nodes.
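To show how that wiring looks in practice, here is a minimal sketch assuming a jsTree 1.x-style configuration like the one used on that demo page; the URL, the operation/id parameters and the way the id is read off the node are placeholders for your own server contract:

$("#tree").jstree({
    "json_data" : {
        "ajax" : {
            "url" : "server.php",                      // placeholder endpoint
            "data" : function (n) {
                // n is -1 for the initial (root) request, otherwise the node being opened
                return { "operation" : "get_children",
                         "id" : n.attr ? n.attr("id") : 0 };
            }
        }
    },
    "plugins" : [ "themes", "json_data", "ui" ]
});

Nodes returned with "state" : "closed" are drawn collapsed and trigger another AJAX request (with that node's id) the first time they are opened, which is exactly the mark-and-graft behaviour described in the question.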

Related

Which Method Do I Need To Choose For Fetching Large Data?

My problem is that I have a large amount of data (over 50k items) that I have to map to the DOM with vanilla JavaScript (ES). Sometimes the page crashes while the data is loading. What should I choose, async/await or promises? Also, which would be better, XHR or the Fetch API? Or should I use some third-party library? This is a big problem for me because sometimes the data shows up after an interval, but sometimes the page crashes. Can anyone explain?
Firstly, I would question the value gained from really mapping 50K items onto a single web page. Look at loading data on demand in sets as users scroll through the items, or at some filtering mechanism applied to the data before loading it.
If you really have to load that much data into a page, then look at loading it in chunks, or at optimizing your code so that less strain is put on the browser.
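As an illustration of the "load it in chunks" suggestion, here is a minimal sketch in vanilla JavaScript; the rows array, the chunk size and the makeRow() helper are assumptions for the example, not part of the question:

const CHUNK = 200;                                   // tune to taste
const container = document.getElementById("list");   // hypothetical target element

function makeRow(item) {                             // however you build one row
  const div = document.createElement("div");
  div.textContent = item.name;                       // hypothetical field
  return div;
}

function renderChunk(rows, start) {
  const frag = document.createDocumentFragment();
  const end = Math.min(start + CHUNK, rows.length);
  for (let i = start; i < end; i++) {
    frag.appendChild(makeRow(rows[i]));
  }
  container.appendChild(frag);                       // one DOM insertion per chunk
  if (end < rows.length) {
    setTimeout(() => renderChunk(rows, end), 0);     // yield so the page stays responsive
  }
}

renderChunk(data, 0);                                // data: your ~50k-item array

Whether the data arrives via XHR or fetch matters far less than not touching the DOM 50,000 times in one synchronous pass.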

Choosing DB model for an app similar to Notion, Block-based ("paragraphs") or document-based?

1. The problem
Lately, it seems that many note managers with "infinite" tree structure are choosing a block model (where each paragraph is an entry in the DB), instead of a document or file model.
Blocks: Notion, Workflowy, Remnote, Dynalist, Roam Research
Documents: Evernote, Obsidian, Bear app
If you find any errors in the table, please let me know.
We have been developing an app very similar to Notion for 8 months now, also using the block model, but we are considering making a radical change and switching to the document model. The structure of our blocks in MongoDB currently looks like this:
_id: "61fd3ede7f6d2cc7a53ca669"
children: Array
0: "61fd3ee87f6d2cc7a53ca66b"
1: "61fd3ef37f6d2cc7a53ca671"
2: "61fd3ef77f6d2cc7a53ca673"
backlinks: Array
type: "bullet"
parentPage: Array
_id: "61fd3ede7f6e2ccra53ca664"
userParent: "german-jablo"
permisionParent: "edit, comment, read"
parentParagraph: "61fd3ede7f6d2cc7a53ca668"
content: "<p>This is a paragraph</p>"
isCollapsed: false
createdAt: 2022-02-04T14:57:34.280+00:00
updatedAt: 2022-02-04T14:57:59.585+00:00
Many pages talk about the differences between the two approaches (example), although in a very vague way, so we decided to open this thread to find a more scientific answer to the question.
Features of our app

App | Blocks that can be opened as documents | Blocks that can collapse or expand their children
Notion | Page type blocks | Toggle type blocks
Workflowy | All | All
Evernote | Documents | None
Our app | Page type blocks | All others
Our app has two types of "blocks". The page type (which, like in Notion, can be inserted into any note and generate a document "inside" the current document), and the rest of the blocks, which are equivalent to the "toggle" block type in Notion (i.e. they can be collapsed or their nested children can be expanded).
2. What we have tried
In trying to answer our question (which DB model would work best for our application), we've realized that the answer is probably "it depends". Perhaps both models have strengths or weaknesses in different types of operations or situations. That is why we formulated this comparison table describing how we believe the performance of both models would be for each of these operations.
Operation: Fetch the contents of a page
Blocks: Find all of the page's paragraphs in the DB.
Documents: Fetch the document from the DB.
Apparent winner: Document

Operation: Render the content of a page**
Blocks: Build the tree from the paragraphs recursively; children of paragraphs whose isCollapsed property is true can be omitted (see the sketch right after this table).
Documents: Render the document.
Apparent winner: Document

Operation: Update the content of a paragraph in the DB
Blocks: Only the modified paragraph is rewritten.
Documents: The whole document is rewritten.
Apparent winner: Block

Operation: Alternatives for rendering very large documents*
Blocks: Blocks can be fetched or rendered as you scroll (as Workflowy does), or as you expand child paragraphs that were collapsed.
Documents: I thought GridFS could achieve similar behavior, breaking the document into smaller chunks and fetching them piecemeal, but it doesn't support updating an individual chunk, or even the whole file in place, and splitting the HTML into binary chunks could corrupt it.
Apparent winner: Block

Operation: Import or paste content
Blocks: In addition to converting the clipboard to HTML and/or sanitizing it, you must build the paragraph tree structure recursively. (Roam Research, for example, supports importing JSON, but users generally don't have their content in that format beforehand.)
Documents: Only convert the clipboard to HTML and/or sanitize it.
Apparent winner: Document

Operation: Copy content**
Blocks: The clipboard must be sanitized and/or transformed.
Documents: Correct by default**
Apparent winner: Document

Operation: Real-time collaboration
Blocks: At the document level, we could use some tree-based (JSON) library like Automerge, or combine it with some CRDT library at the paragraph level.
Documents: Could use TinyMCE's solution.
Apparent winner: Tie? Both seem to have their advantages and disadvantages.
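To make the "build the tree from the paragraphs recursively" row concrete, here is a minimal sketch assuming blocks shaped like the sample document above (_id, children as an array of child ids, isCollapsed, content); blocksById and the page's root id are assumptions for the example:

// Assemble a page's render tree from flat block records.
function buildTree(blockId, blocksById) {
  const block = blocksById[blockId];
  return {
    id: block._id,
    content: block.content,
    // children of collapsed paragraphs can be skipped entirely
    children: block.isCollapsed
      ? []
      : block.children.map(childId => buildTree(childId, blocksById))
  };
}

// usage (hypothetical): const tree = buildTree(pageRootId, blocksById);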
*Render very large documents: Most users probably do not use notes larger than 250 KB (considering that multimedia files are referenced in a separate collection). Still, in the document model the question arises: how can we load, render or edit large documents in manageable chunks? One idea we came up with is to split HTML documents that reach a large size into portions of a certain size in KB, instead of splitting them into paragraphs. (It would be like a kind of GridFS that allows you to modify the file in parts.) Could this be a good idea?
**Should the DOM be nested? In order to be able to collapse or expand nested child paragraphs, note managers with a block model structure the DOM in a nested way (paragraphs live in divs, inside their parent divs, and so on). An alternative in the document model, however, could be that when the user presses Tab, only that block (an HTML tag such as <p> or <li>) is assigned an attribute representing its nesting level relative to the previous block (a number of at most 1, since a block can only be nested one level deeper than the one before it). This way, when you press Tab to nest or Shift+Tab to un-nest, you only have to modify one attribute of one HTML element instead of many elements, and the DOM stays simple, without nested blocks.
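As an illustration of that flat-DOM idea, here is a rough sketch; the data-level attribute name and the collapse/indent helpers are our assumptions, not an existing implementation:

// Flat list of sibling blocks; nesting is expressed only via data-level, e.g.
//   <p data-level="0">Parent</p>
//   <p data-level="1">Child</p>
//   <p data-level="2">Grandchild</p>

// Collapsing a block hides every following sibling with a deeper level.
function collapse(block) {
  const level = Number(block.dataset.level);
  let el = block.nextElementSibling;
  while (el && Number(el.dataset.level) > level) {
    el.hidden = true;                 // expanding would set this back to false
    el = el.nextElementSibling;
  }
}

// Pressing Tab only touches the focused block's own attribute.
function indent(block) {
  block.dataset.level = Number(block.dataset.level) + 1;
}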
3. Our conclusions
We believe that for each of the rows in the comparison table, benchmarks could be done measuring the performance of both models.
Other people have done something similar here and here, comparing the performance of note managers using both models. The problem with those tests is that it is difficult to draw an accurate conclusion about the goodness of both models. Obsidian uses documents locally, so you don't have to sync notes. Roam Research is a very new and poorly optimized app. Standard Notes encrypts notes locally. In other words, it's not always apples to apples.
And even if tests could be done, we believe that the answer may even depend on how each user uses the application. Suppose user A usually organizes his notes in long documents using paragraph nesting (to collapse or expand them). On the other hand user B usually organizes his notes by creating new documents within documents. It is likely that a block model based manager would work better for user A while a document based one would work better for B.
So, we tried to push our doubt as far as we could, but we are still not sure of the answer. Which of the two models do you think would offer better performance for our app, and why?
4. Update
I just found some very interesting information. It seems that in both TinyMCE and CKEditor (up to version 4), the view and the model converge in the HTML, relying on contenteditable. CKEditor 5, however, switched to MVC [source 1], [source 2].
I've done a short test pasting a large clipboard of a few MB into TinyMCE 5, CKEditor 4 and CKEditor 5, and the latter was a bit slower. I hope to be able to do more tests soon with other things, like dragging blocks or rendering large documents.
In a GitHub thread about CKEditor 5's performance when working with large documents, one of the contributors said "It works slower than the native content-editable element obviously, but the typing experience is pretty well".
Looks like you've done your homework. Database modeling is sometimes a bit of an art rather than a science. I think that with both models, you can achieve good performance if you optimize them well. So I would recommend that you go for the one that requires less work. Since you've been working on the block model for 8 months already, that's probably the best option for you.
Love how thoroughly you've thought about the design of your application. I just want to add some suggestions you might look at:
Obsidian uses a hybrid approach. It started document-based, but now supports block links and embeds while still being super fast. Of all the programs tested in my benchmark that you linked above, Obsidian was the fastest.
One of the most important functions of a note-taking tool is effectively searching the probably thousands of notes. I created an amazingly simple test (the "Spaghetti Parmesan" test) at which all block-based approaches currently fail. It is about searching for two ingredients (spaghetti and parmesan) in a recipe. When the two ingredients are in different blocks, all common block-based applications ultimately fail to find the recipe. You can read more about this here. I also tried to start a discussion with some of the authors on Twitter but failed to get any serious results. If you want to continue with the block-based approach, you might start designing a search algorithm that can handle search terms spread across blocks (if you haven't yet). I tried to outline an algorithm in the thread linked above, but I'm not sure it will really work with hundreds of thousands of blocks.
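One simple (hypothetical) way to handle terms spread across blocks is to group blocks by the page they belong to and require every term to match somewhere in that page; the pageId and content field names are assumptions for the sketch:

// Page-level AND-search over block-based notes.
function searchPages(blocks, terms) {
  const pages = new Map();                        // pageId -> concatenated block text
  for (const b of blocks) {
    pages.set(b.pageId, (pages.get(b.pageId) || "") + " " + b.content.toLowerCase());
  }
  const needles = terms.map(t => t.toLowerCase());
  return [...pages.keys()].filter(id =>
    needles.every(t => pages.get(id).includes(t))
  );
}

// searchPages(allBlocks, ["spaghetti", "parmesan"]) then finds the recipe even when
// the two ingredients sit in different blocks of the same page.

Concatenating every block per page obviously won't scale to hundreds of thousands of blocks, which is exactly the open problem mentioned above; some page-keyed index would be the next step.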
You seem to be making your choice based on what is easier to model or develop.
If you intend to compete in this increasingly crowded market, you need to consider what will give you an edge, and what will be sustainable in the long run.
Two of the main issues Evernote (longtime user and former employee) faced were:
a very badly designed back-end that was very expensive to scale and made sharing and collaboration a nightmare (engineering-wise)
a too loose document model that made it very hard to add new features to the editor (real time collaboration) and made collaboration very finicky even for simple features (checkboxes would often generate so many note conflicts as to be unusable)
On the other hand, block-based tools usually have really bad web clippers, because it's very hard to go from HTML to blocks.
Solve some of these hard problems!

Dojox's JsonRestStore loading the same thing several times

I'm using a lazy-loading tree in a web app project; however, I've run into some strange behavior. It seems a simple tree with just 3 levels causes 7 requests for the root structure. After looking at the official JRS tree test, I'm not sure whether this is normal or not.
Have a look at this example:
http://download.dojotoolkit.org/release-1.6.1/dojo-release-1.6.1/dijit/tests/tree/Tree_with_JRS.html
When I visit it, my browser makes 5 requests for the root structure. My only question is why?
Edit: Worth mentioning is this doesn't happen with dojo 1.5 or below.
Here's what it looks like in the inspector (Chrome):
Finally I found a solution to this problem, thanks to this post on dojo-interest: thisisalink.
Basically, with dojo 1.6 the dijit.tree.ForestStoreModel was extended with a few new hook-like functions (I guess because of the work done on the TreeGrid). One of these, onSetItem, is called once a tree node is expanded (thus going from preLoaded to fully loaded when using a lazy-loading store). In the base implementation, this function calls _requeryTop(), which re-queries all root items.
For our application we could simply replace dijit.tree.ForestStoreModel with our own implementation, digicult.dijit.tree.ForestStoreModel, where onSetItem and onNewItem don't call this._requeryTop.
Sadly it's not enough to subclass the ForestStoreModel, as there are this.inherited(arguments); calls in the functions which can't be replaced easily, so we had to copy the whole class (copy class, rename, comment out two lines - easiest fix in a long time :-) ) - this may force us to redesign the class again once we update dojo to an even newer version.
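To show the shape of that change, here is a rough sketch (dojo 1.6-era API, reusing the class name from the answer); as noted above, the real fix ended up being a copy of the whole class because of the this.inherited(arguments) calls, so this is only illustrative:

dojo.provide("digicult.dijit.tree.ForestStoreModel");
dojo.require("dijit.tree.ForestStoreModel");

dojo.declare("digicult.dijit.tree.ForestStoreModel", dijit.tree.ForestStoreModel, {
    onSetItem: function(item, attribute, oldValue, newValue){
        // same logic as the stock model, minus the this._requeryTop() call
    },
    onNewItem: function(item, parentInfo){
        // likewise, without this._requeryTop()
    }
});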
I've also faced performance problems with the dijit Tree when a tree of 10,000+ nodes had to be loaded all at once, with ~3000 items at the very top level.
The tree had only one dummy root node which loads the whole tree on the first click via ajax call.
In this case, creating the tree took more than a minute and I got the 'Stop running this script' dialog popup on IE8.
After applying a few optimization steps, the tree now loads within 2 seconds on all major browsers (IE8-IE11 included).
The first optimization I made was using dijit/tree/ObjectStoreModel as the tree's model and dojo/store/Memory as the data store.
This sped up inserting the AJAX response's JSON nodes into the tree's data store.
The second optimization concerned the slow creation of the Tree's nodes. That took more efforts to fix:
I had to extend dijit/Tree and override the setChildItems() function (the part of it which calls _createTreeNode() function).
I kept the whole logic of the setChildItems() intact, just added parallelization of creating the tree nodes using this technique:
http://www.picnet.com.au/blogs/Guido/post/2010/03/04/How-to-prevent-Stop-running-this-script-message-in-browsers.aspx
Hope it helps. If needed, I can post the source code of my workaround.
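For reference, a generic sketch of the loop-chunking idea from the linked article (not the author's actual dijit override; the names are placeholders): process the items in timed slices so the browser never sees one huge blocking loop.

function processInSlices(items, handleItem, sliceSize, onDone) {
    var i = 0;
    (function nextSlice() {
        var end = Math.min(i + sliceSize, items.length);
        for (; i < end; i++) {
            handleItem(items[i]);          // e.g. create one tree node
        }
        if (i < items.length) {
            setTimeout(nextSlice, 0);      // yield to the browser before the next slice
        } else if (onDone) {
            onDone();
        }
    })();
}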

Save or destroy data/DOM elements? Which takes more resources?

I've been getting more and more into high-level application development with JavaScript/jQuery. I've been trying to learn more about the JavaScript language and dive into some of the more advanced features. I was just reading an article on memory leaks when I came across this section:
JavaScript is a garbage collected language, meaning that memory is allocated to objects upon their creation and reclaimed by the browser when there are no more references to them. While there is nothing wrong with JavaScript's garbage collection mechanism, it is at odds with the way some browsers handle the allocation and recovery of memory for DOM objects.
This got me thinking about some of my coding habits. For some time now I have been very focused on minimizing the number of requests I send to the server, which I feel is just good practice. But I'm wondering whether I sometimes go too far. I am largely unaware of the efficiency issues and bottlenecks that come with the JavaScript language.
Example
I recently built an impound management application for a towing company. I used the jQuery UI dialog widget and populated a datagrid with specific ticket data. Now, this sounds very simple on the surface... but there is a LOT of data being passed around here.
(and now for the question... drumroll please...)
I'm wondering what the pros/cons are for each of the following options.
1) Make only one request for a given ticket and store it permanently in the DOM, simply showing/hiding the modal window; this means only one request is sent out per ticket.
2) Make a request every time a ticket is opened and destroy it when it's closed.
My natural inclination was to store the tickets in the DOM - but I'm concerned that this will eventually start to hog a ton of memory if the application goes a long time without being reset (which it will).
I'm really just looking for pros/cons for both of those two options (or something neat I haven't even heard of =P).
The solution here depends on the specifics of your problem, as the 'right' answer will vary based on length of time the page is left open, size of DOM elements, and request latency. Here are a few more things to consider:
Keep only the newest n items in the cache (see the sketch at the end of this answer). This works well if you are only likely to redisplay items in a short period of time.
Store the data for each element instead of the DOM element, and reconstruct the DOM on each display.
Use HTML5 Storage to store the data instead of DOM or variable storage. This has the added advantage that data can be stored across page requests.
Any caching strategy will need to consider when to invalidate the cache and re-request updated data. Depending on your strategy, you will need to handle conflicts that result from multiple editors.
The best way is to get started using the simplest method, and add complexity to improve speed only where necessary.
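A minimal sketch of the first two suggestions combined, caching the newest n tickets' data rather than their DOM (MAX_CACHED, fetchTicket and renderTicket are placeholders, not part of the original answer):

var MAX_CACHED = 20;
var cache = new Map();                        // insertion order doubles as age

function showTicket(id) {
    if (cache.has(id)) {
        renderTicket(cache.get(id));          // rebuild the DOM from cached data
        return;
    }
    fetchTicket(id, function (data) {         // AJAX request for this ticket
        cache.set(id, data);
        if (cache.size > MAX_CACHED) {
            cache.delete(cache.keys().next().value);  // evict the oldest entry
        }
        renderTicket(data);
    });
}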
The third path would be to store the data associated with a ticket in JS, and create and destroy DOM nodes as the modal window is summoned/dismissed (jQuery templates might be a natural solution here.)
That said, the primary reason you avoid network traffic seems to be user experience (the network is slower than RAM, always). But that experience might not actually be degraded by making a request every time, if it's something the user intuits involves loading data.
I would say number 2 would be best. Because that way if the ticket changes after you open it, that change will appear the second time the ticket is opened.
One important factor is the number of redraws/reflows that are triggered by DOM manipulation. It's much more efficient to build up your content changes and insert them in one go than to do it incrementally, since each increment causes a redraw/reflow.
See: http://www.youtube.com/watch?v=AKZ2fj8155I to better understand this.

Strategies for rendering HTML with Javascript

I take a fat JSON array from the server via an AJAX call, then process it and render HTML with Javascript. What I want is to make it as fast as humanly possible.
Chrome leads over FF in my tests but it can still take 5-8 seconds for the browser to render ~300 records.
I considered lazy-loading such as that implemented in Google Reader but that goes against my other use cases, such as being able to get instantaneous search results (simple search being done on the client side over all the records we got in the JSON array) and multiple filters.
One thing I have noticed is that both FF and Chrome do not render anything until they loop over all items in the JSON array, even though I explicitly insert the newly created elements into DOM after every loop (as soon as I have the HTML). What I'd like to achieve would be just that: force the browser to render as soon as it can.
I tried deferring the calls (every item from the array would be processed by a deferred function) but ran into additional issues there, as it seems the order of execution is no longer guaranteed (some items further down the array would be processed before items earlier in it).
I'm looking for any hints and tips here.
Try:
Push the rows into an array of HTML strings, then simply:
el.innerHTML = array.join("");
Or use document fragments:
var frag = document.createDocumentFragment();
for (var i = 0; i < rows.length; i++) {      // rows: your data array
    frag.appendChild(buildRow(rows[i]));     // buildRow: however you create one row's element
}
parent.appendChild(frag);
If you don't need to display all 300 records at once you could try to paginate them 30 or 50 records at a time and only unroll the JSON array as those sub-parts are required to be displayed through a pager or a local search box. Once converted you could cache the content for subsequent display as users navigate up and down the pages.
Try creating the elements in a detached DOM node or a document fragment, then attaching the whole thing in one go.
300 isn't a lot.
I managed to create a tree of over 500 elements with data from JSON using jQuery, in a fraction of a second on Chrome.
300 isn't a big number.
If they render that slowly, it is probably being done the wrong way. Can you specify how you do it?
The slowest way would be to write the HTML into a string in JavaScript and then assign it via the innerHTML property. But even that would still be fast as hell for 300 rows.
Google Web Toolkit has BulkTableRenderers that are designed to render large tables quickly. Even if you choose not to use GWT, you might be able to pick up some techniques by looking through the source code, which is available under the Apache License, Version 2.0.
