I have a nested JSON object, similar to this.
In my case, I have a unique id field of type int (say, instead of name above). This is not a binary tree; it depicts a parent-child relationship. I want an easy way to look up the child tree (children) rooted at, say, id = 121. The brute-force way would be to compare all nodes until I find the match and return its children. But I was thinking of keeping a map of {id, node}, for example {"121" : root[1][10]..[1]}. This may be super wasteful of memory (unless I keep a reference into the tree rather than a copy). Not sure there is a better way.
I have control over what to send from the server, so I may augment the above data structure, but I need a quick way to get the child tree by node id on the client side.
EDIT:
I am considering keeping another data structure: a map of {id, []ids}, where []ids is the ordered path from the root. Any better way?
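Resolving such a path would then be cheap, e.g. (a sketch, assuming the path lists the ids below the root and each node has id and children fields):

```js
// Walk the ordered path of ids down from the root: at each level,
// pick the child whose id matches the next id in the path.
function resolve(root, path) {
  return path.reduce(
    (node, id) => node.children.find(c => c.id === id),
    root
  );
}
```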
Objects in JavaScript are true reference-based objects, meaning that you can keep multiple references to them without using much more memory. Why not do a single traversal to assign the sub-objects to a new id-based parent object? Unless your hierarchical object is simply enormous, this should be very fast.
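A minimal sketch of that traversal, assuming each node has an id and an optional children array:

```js
// Build an id -> node index in one pass over the tree. Each entry is
// a reference to the original node, not a copy, so the extra memory
// is just one map entry per node.
function buildIndex(root) {
  const index = {};
  const stack = [root];
  while (stack.length > 0) {
    const node = stack.pop();
    index[node.id] = node; // the key is coerced to a string
    if (node.children) stack.push(...node.children);
  }
  return index;
}

// Usage: the subtree rooted at id 121, children included, in O(1).
// const subtree = buildIndex(root)[121];
```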
In light of best practice, and of what would happen if the application you're building were to scale to millions of users, you might rethink whether you really want the server to do more work. The client's computer is sitting there, ready to provide you with remote computing power for FREE. Why move the workload to the server, causing it to process fewer client requests per second? That may not be a direction you want to go.
Here is a fiddle demonstrating this index-building technique. You run through it once, and use the index over and over as you please. It only takes 4 or 5 ms to build said index. There is no performance problem!
One more note: if you are concerned with bandwidth, one simple way to help with that is to trim down your JSON. Use one-letter key names, and drop whitespace and line breaks; dropping the quotes around key names saves even more, though the result is then a JavaScript object literal rather than strictly valid JSON. That will get you a very large improvement. Applying this change to your example JSON, it goes from 11,792 characters to 5,770, only 49% of the original size!
One minor note is that object keys in JavaScript are always strings. The numeric ids I added to your example JSON are coerced to strings when used as key names. This should be no impediment to usage, but it is a subtle difference that you may want to be aware of.
I don't assume that the ids are somehow ordered, but it might still help to prune at least parts of the tree if you add to each node the minimum and maximum id value of its children (and sub-children).
This can quite easily be achieved on the server side, and when searching the tree you can check whether the id you're looking for is within a node's id range before stepping inside and searching all of its children.
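A sketch of that pruned search, assuming the server precomputes minId and maxId on every node (field names assumed):

```js
// Search for a node by id, skipping any subtree whose [minId, maxId]
// range cannot contain the target id.
function findNode(node, id) {
  if (node.id === id) return node;
  if (id < node.minId || id > node.maxId) return null; // prune subtree
  for (const child of node.children || []) {
    const found = findNode(child, id);
    if (found) return found;
  }
  return null;
}
```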
Related
I have to access a value (direct access) many times in a very large 2D array. Is it better to assign it to a temporary variable, or should I use array[req.params.position.x][req.params.position.y].anyValue every time?
I know the "new variable" option makes it easier to read; I was wondering whether it also makes an impact on the performance of the code.
My hypothesis is that each access acts like some kind of forEach inside a forEach, and thus takes more time to reach the value every time. Is that right?
From your description, array[req.params.position.x][req.params.position.y], it sounds like, whilst this is a 2D array, you also know the indexes into each array up front. This is direct access to the array, which is extremely quick. It would be different if you needed to search for something in the array, but you aren't needing that here.
Internally, in browsers, this will be constant-time access no matter how big the array is. There is no need to "look up" anything: the passed indexes reference the value's location in memory, from which it is retrieved directly.
So there is no performance concern here.
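If you do cache it in a local variable, that's purely a readability choice, e.g. (grid and pos stand in for your array and req.params.position):

```js
// Both forms are O(1); the alias just avoids repeating the long path.
function readCell(grid, pos) {
  const cell = grid[pos.x][pos.y]; // direct index: constant time
  return cell.anyValue + cell.anyValue; // reuse the cached reference
}

console.log(readCell([[{ anyValue: 21 }]], { x: 0, y: 0 })); // 42
```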
From Java I'm returning a few Objects in a Map>> from a REST endpoint for the front-end.
The important thing to consider here is that the same MyObject may be present in different positions, hence updating one Object updates all its occurrences.
I needed to restore those connections by reference among all the occurrences of a MyObject in TypeScript.
I can distinguish all the occurrences of a MyObject by their IDs.
So I'm currently replacing all the occurrences of a MyObject with its first occurrence, previously persisted in a map by ID. This way I'm able to restore the connections by reference among all the occurrences of the same MyObject.
My solution seems to be ok, but I'm still wondering if I'm not reinventing the wheel. Is there any alternative way, possibly better than mine, of achieving this goal?
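Roughly, the replacement pass looks like this (a sketch; the nested-array shape and field names are assumed):

```js
// Replace every occurrence of an object with the first instance seen
// for that id, so identity (===) and shared updates work again.
function intern(rows) {
  const byId = new Map();
  return rows.map(row =>
    row.map(obj => {
      if (!byId.has(obj.id)) byId.set(obj.id, obj); // first one wins
      return byId.get(obj.id); // canonical instance for this id
    })
  );
}

// Usage: two occurrences of id 1 become the same reference.
const [a, b] = intern([[{ id: 1 }], [{ id: 1 }]]);
console.log(a[0] === b[0]); // true
```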
Sounds like you're serializing the object from Java to JSON over the REST service.
By the time the object is received by the client in TypeScript, it is going to treat each of those objects as separate. That's the nature of sending it out across the wire: everything is basically written out as a string, and things like memory addresses of individual objects are moot at that point.
I'm not sure there's much more you could do beyond what you are already doing by replacing all equal objects with your one singleton each.
To be honest, it sounds like a pretty unique use-case.
I am currently working on a project that requires me to iterate through a list of values and add a new value in between each pair of values already in the list. This is going to happen on every iteration, so the list will grow exponentially. I decided that implementing the list as a linked list would be a great idea. Now, JS has no built-in linked list data structure, but I have no problem creating one.
But my question is, would it be worth it to create a simple Linked List from scratch, or would it be a better idea to just create an array and use splice() to insert each element? Would it, in fact, be less efficient due to the overhead?
Use a linked list; in fact, most well-written custom implementations in user JavaScript will beat built-in implementations, due to spec complexity and decent JITting. For example, see https://github.com/petkaantonov/deque
What george said is literally 100% false on every point, unless you take a time machine to 10 years ago.
As for implementation: do not create an external linked list that contains the values; make the values themselves the linked-list nodes. You will otherwise use way too much memory.
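Something along these lines (an intrusive-list sketch; the shape of the value objects is assumed):

```js
// Intrusive singly linked list: each value object carries its own
// `next` pointer instead of being boxed in a separate node object.
function insertAfter(node, newNode) {
  newNode.next = node.next;
  node.next = newNode;
}

const head = { value: 1, next: null };
insertAfter(head, { value: 2, next: null });
insertAfter(head, { value: 1.5, next: null });
// The list is now 1 -> 1.5 -> 2.
```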
Inserting each element with splice() would be slower indeed (inserting n elements takes O(n²) time). But simply building a new array (appending new values and appending the values from the old one in lockstep) and throwing away the old one takes linear time, and most likely has better constant factors than manipulating a linked list. It might even take less memory (linked list nodes can have surprisingly large space overhead, especially if they aren't intrusive).
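A sketch of that linear rebuild, assuming the in-between value is computed from its two neighbours by some function (here called between, a name made up for the example):

```js
// Build a new array with a computed value between each pair of
// neighbours: O(n) per pass instead of O(n^2) with repeated splice().
function interleave(values, between) {
  const result = [];
  for (let i = 0; i < values.length; i++) {
    result.push(values[i]);
    if (i + 1 < values.length) {
      result.push(between(values[i], values[i + 1]));
    }
  }
  return result;
}

console.log(interleave([0, 10, 20], (a, b) => (a + b) / 2));
// [0, 5, 10, 15, 20]
```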
JavaScript is an interpreted language. If you want to implement a linked list then you will be looping a lot, and the interpreter will perform very slowly. The built-in functions provided by the interpreter are optimized and compiled with the interpreter, so they will run faster. I would choose to slice the array and then concatenate everything again; it should be faster than implementing your own data structure.
As well, JavaScript passes by value, not by pointer/reference, so how are you going to implement a linked list?
I've seen a lot of questions about the fastest way to access object properties (like using . vs []), but I can't seem to find whether it's faster to retrieve object properties that are declared earlier than others in object literal syntax.
I'm working with an object that could contain up to 40,000 properties, each of which is an Array of length 2. I'm using it as a lookup by value.
I know that maybe 5% of the properties will be the ones I need to retrieve most often. Is either of the following worth doing for increased performance (decreased lookup time)?
1. Set the most commonly needed properties at the top of the object literal syntax?
2. If #1 has no effect, should I create two separate objects, one with the most common 5% of properties, search that one first, and then, if the property isn't found there, look through the object with all the less-common properties?
Or, is there a better way?
I did a js perf here: http://jsperf.com/object-lookup-perf
I basically injected 40,000 props with random keys into an object, saved the "first" and "last" keys, and looked them up in different tests. I was surprised by the result: accessing the first entry was 35% slower than accessing the last.
Also, having an object of 5 or 40000 entries didn’t make any noticeable difference.
The test case can most likely be improved and I probably missed something, but there is a start for you.
Note: I only tested Chrome.
Yes; something like indexOf searches front to back, so placing common items earlier in the list will return them faster. Most "basic" search algorithms are simple top-down (linear) searches, at least for arrays.
If you have so many properties, they must be computed, no? So you could replace the (most probably string) computation with an integer hash computation, then use this hash as an index into a regular array.
You might even use one single array by putting the values in the 2*i-th and (2*i+1)-th slots.
If you can use a typed array here, do it; you could hardly go faster.
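A sketch of that flattened layout, assuming the keys are (or hash to) small integers:

```js
// Store each [a, b] pair flat in one typed array: pair i occupies
// slots 2*i and 2*i+1. Assumes keys map to integers below SIZE.
const SIZE = 40000;
const table = new Float64Array(2 * SIZE);

function setPair(i, a, b) {
  table[2 * i] = a;
  table[2 * i + 1] = b;
}

function getPair(i) {
  return [table[2 * i], table[2 * i + 1]];
}

setPair(121, 3.5, 7.25);
console.log(getPair(121)); // [3.5, 7.25]
```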
Set the most commonly needed properties at the top of the object literal syntax?
No. Choose readability over performance. If you've got few enough properties that you use a literal in the code, it won't matter anyway; and you should order the properties in a logical sequence.
Property lookup in objects is usually based on hash maps, and position should not make a substantial difference. Depending on the implementation of the hash, some lookups might be negligibly slower than others, but this is quite random and depends heavily on the applied optimisations. It should not matter.
If #1 has no effect, should I create two separate objects, one with the most common 5% of properties, search that one first, then if the property isn't found there, then look through the object with all the less-common properties?
Yes. If you've got really huge objects (with thousands of properties), this is a good idea. Depending on the used data structure, the size of the object might influence the lookup time, so if you've got a smaller object for the more frequent properties it should be faster. It's possible that different structures are chosen for the two objects, which could perform better than the single one - especially if you know beforehand in which object to look. However you will need to test this hypothesis with your actual data, and you should beware of premature [micro-]optimisation.
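For what it's worth, that two-object fallback is only a few lines (a sketch; measure it against your actual data before committing):

```js
// Look in the small "hot" object first, fall back to the big one.
// Assumes stored values are never undefined.
function lookup(hot, cold, key) {
  const value = hot[key];
  return value !== undefined ? value : cold[key];
}

const hot = { a: [1, 2] };
const cold = { b: [3, 4] /* ...thousands more entries... */ };
console.log(lookup(hot, cold, "b")); // [3, 4]
```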
I've started a new JavaScript project based on the example at:
http://bl.ocks.org/mbostock/4063570
Everything with the d3 Dendrogram is great so far except that my data will probably always contain duplicate leaves (terminal nodes). In my data only the leaves could ever contain duplicate data. All internal nodes (between root and leaves) are strictly distinct well before d3 comes into play.
I could add something to the node(s) name (d.name) to make each node totally unique, but I'd rather 'reuse' leaf nodes and make all internal nodes point to and share a single leaf if possible.
Does anyone out there know how to do this?
Many thanks in advance!
Drew Barfield
The D3 data join expects that each DOM node will correspond to a different element in the data array. However, there's nothing stopping 2 elements in the data array from referring to the same underlying object.
It comes down to whether you are OK with the default join key (which is the array index) or whether you want a sense of "object permanence" on data updates by mapping specific data elements to specific nodes. To make that happen you need to define a custom join key function, which by definition relies on some way to differentiate the data elements.
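A keyed join looks like this (a sketch, assuming d3 is loaded, an svg element exists on the page, and each datum carries a unique id):

```js
const svg = d3.select("svg");
const leaves = [{ id: "a" }, { id: "b" }];

// The key function matches elements to data by d.id rather than by
// array index, so each datum stays bound to "its" node on update.
const sel = svg.selectAll("circle").data(leaves, d => d.id);
sel.enter().append("circle").attr("r", 4);
sel.exit().remove();
```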
Personally, I think that if you're doing any amount of data updating involving enter/exit/update, life will be much easier if each data element is unique and has some kind of "id" or "key" property that you can use to identify it. Reusing data elements will likely be more headache than it's worth.
You didn't actually mention what you are trying to achieve by sharing data. Is it just a memory-saving optimization, or is there another reason? If it's just memory, I wouldn't bother.