Associate Data With HTML Element (without jQuery) - javascript

I need to associate some data with an HTML Element, hopefully avoiding any memory leaks. Obviously a simple solution would be to throw some kind of identifier on the element, and then create a dictionary mapping that identifier to the data I need. However, this is in a javascript library that users will add to their page, thus I don't have control over when elements are added or removed.
What I'm looking to do is associate data with an HTML element while it's on the page, while allowing for that data to be GC'd when/if the element is removed. Is there a way to do this that doesn't involve writing my own periodic GC to clean up orphaned data? Is it safe to add properties to HTML elements?

Attribute approach
You can store data in elements using custom data-* attributes.
This has the limitation that you can only store strings, but you can use JSON to store plain objects or arrays (not by reference).
To avoid conflicts with other code, it would be a good idea to include the name of your library in the attribute.
They can be set directly in the HTML:
<div data-mylibrary-foo="bar"></div>
And they can be read or written with JavaScript:
element.getAttribute('data-mylibrary-foo'); // read (old way)
element.setAttribute('data-mylibrary-foo', 'bar'); // write (old way)
element.dataset.mylibraryFoo; // read (new way)
element.dataset.mylibraryFoo = 'bar'; // write (new way)
They can also be read by CSS, using some attribute selector.
Property approach
This is much more flexible than the attribute approach, allowing to store arbitrary data in your element.
To avoid conflicts with other code, better wrap all properties in an object with the name of your library:
element.mylibrary = {};
element.mylibrary.foo = 'bar';
The problem is that a future HTML standard could define mylibrary as a native property, so there could be problems.
Symbol approach
ECMAScript 6 introduces symbols, which can be used as properties. The advantage is that each symbol has an unique identity, so you don't need to worry about some other library or a future standard using the same properties as your code.
var foo = Symbol("foo");
element[foo] = 'bar';
WeakMap approach
ECMAScript 6 introduces WeakMaps, which allow you to associate data with objects in a way that, if the objects are no longer referenced anywhere else, they will be garbage collected.
Like the property approach, they allow you to store arbitrary data.
Since the data is not stored in the elements themselves, there is no risk of conflicts.
var allData = new WeakMap();
var data1 = allData.get(element1) || {}; // read
data1.foo = "bar";
allData.set(element1, data1); // write

An HTML element in JavaScript is just a JavaScript object, so you can certainly add arbitrary properties to it. JavaScript will handle the garbage collection just fine.
The data- attribute is another option and might be a better choice if your properties are strings; their values will be visible in the DOM (which might be a good thing for debugging). If your properties are themselves objects, then you'd have to stringify them (and reverse the process to retrieve their values).

Related

JS: Using dataset vs using (adhoc) properties

I need to store data (a string) on a certain HTML node.
This can be done via the dataset object
document.body.dataset.foo = 'bar';
Given the nature of JS however, node instances are also (just) objects, and thus you can also simply define a new property implicitly on the fly like so:
document.body.bar = 'foo';
Both methods fulfill the requirement of storing and then retrieving data of simple types.
I'm aware of the following differences:
dataset writes the the information as attribute on the node (data-{key}="{value}")
by using the subobject (dataset) you are less likely to overwrite an existing property
In my case, I would prefer not to have the information written as data-attribute, so I lean towards the property approach. Most sources on the internet however want you to use the dataset method.
Am I missing something here? Are there any other particularities to consider?

What's the purpose of Symbol in terms of unique object identifiers?

Notes about 'Not a duplicate':
I've been told this is a duplicate of What is the use of Symbol in javascript ECMAScript 6?. Well, it doesn't seem right to me. The code they've given is this:
const door = {};
// library 1
const cake1 = Symbol('cake');
door[cake1] = () => console.log('chocolate');
// library 2
const cake2 = Symbol('cake');
door[cake2] = () => console.log('vanilla');
// your code
door[cake1]();
door[cake2]();
The only thing that makes this work is because cake1 and cake2 are different (unique) names. But the developer has explicitly given these; there is nothing offered by Symbol which helps here.
For example if you change cake1 and cake2 to cake and run it, it will error:
Uncaught SyntaxError: Identifier 'cake' has already been declared
If you're already having to manually come up with unique identifiers then how is Symbol helping?
If you execute this in your console:
Symbol('cake') === Symbol('cake');
It evaluates to false. So they're unique. But in order to actually use them, you're now having to come up with 2 key names (cake1 and cake2) which are unique. This has to be done manually by the developer; there's nothing in Symbol or JavaScript in general which will help with that. You're basically creating a unique identifier using Symbol but then having to assign it manually to...a unique identifier that you've had to come up with as a developer.
With regards to the linked post they cite this as an example which does not use Symbol:
const door = {};
// from library 1
door.cake = () => console.log('chocolate');
// from library 2
door.cake = () => console.log('vanilla');
// your code
door.cake();
They try to claim this is a problem and will only log "vanilla". Well clearly that's because door.cake isn't unique (it's declared twice). The "fix" is as simple as using cake1 and cake2:
door.cake1 = () => console.log('chocolate');
door.cake2 = () => console.log('vanilla');
door.cake1(); // Outputs "chocolate"
door.cake2(); // Outputs "vanilla"
That will now work and log both "chocolate" and "vanilla". In this case Symbol hasn't been used at all, and indeed has no bearing on that working. It's simply a case that the developer has assigned a unique identifier but they have done this manually and without using Symbol.
Original question:
I'm taking a course in JavaScript and the presenter is discussing Symbol.
At the beginning of the video he says:
The thing about Symbol's is that every single one is unique and this makes them very valuable in terms of things like object property identifiers.
However he then goes on to say:
They are not enumerable in for...in loops.
They cannot be used in JSON.stringify. (It results in an empty object).
In the case of point (2) he gives this example:
console.log(JSON.stringify({key: 'prop'})); // object without Symbol
console.log(JSON.stringify({Symbol('sym1'): 'prop'})); // object using Symbol
This logs {"key": "prop"} and {} to the console respectively.
How does any of this make Symbol "valuable" in terms of being unique object keys or identifiers?
In my experience two very common things you'd want to do with an object is enumerate it, or convert the data in them to JSON to send via ajax or some such method.
I can't understand what the purpose of Symbol is at all, but especially why you would want to use them for making object identifiers? Given it will cause things later that you cannot do.
Edit - the following was part of the original question - but is a minor issue in comparison to the actual purpose of Symbol with respect to unique identifiers:
If you needed to send something like {Symbol('sym1'): 'prop'} to a backend via ajax what would you actually need to do in this case?
I replied to your comment in the other question, but since this is open I'll try to elaborate.
You are getting variable names mixed up with Symbols, which are unrelated to one another.
The variable name is just an identifier to reference a value. If I create a variable and then set it to something else, both of those refer to the same value (or in the case of non-primitives in JavaScript, the same reference).
In that case, I can do something like:
const a = Symbol('a');
const b = a;
console.log(a === b); // true
That's because there is only 1 Symbol created and the reference to that Symbol is assigned to both a and b. That isn't what you would use Symbols for.
Symbols are meant to provide unique keys which are not the same as a variable name. Keys are used in objects (or similar). I think the simplicity of the other example may be causing the confusion.
Let us imagine a more complex example. Say I have a program that lets you create an address book of people. I am going to store each person in an object.
const addressBook = {};
const addPerson = ({ name, ...data }) => {
addressBook[name] = data;
};
const listOfPeople = [];
// new user is added in the UI
const newPerson = getPersonFromUserEntry();
listOfPeople.push(newPerson.name);
addPerson(newPerson);
In this case, I would use listOfPeople to display a list and when you click it, it would show the information for that user.
Now, the problem is, since I'm using the person's name, that isn't truly unique. If I have two "Bob Smith"'s added, the second will override the first and clicking the UI from "listOfPeople" will take you to the same one for both.
Now, instead of doing that, lets use a Symbol in the addPerson() and return that and store it in listOfPeople.
const addressBook = {};
const addPerson = ({ name, ...data }) => {
const symbol = Symbol(name);
addressBook[symbol] = data;
return symbol;
};
const listOfPeople = [];
// new user is added in the UI
const newPerson = getPersonFromUserEntry();
listOfPeople.push(addPerson(newPerson));
Now, every entry in listOfPeople is totally unique. If you click the first "Bob Smith" and use that symbol to look him up you'll get the right one. Ditto for the second. They are unique even though the base of the key is the same.
As I mentioned in the other answer, the use-case for Symbol is actually fairly narrow. It is really only when you need to create a key you know will be wholly unique.
Another scenario where you might use it is if you have multiple independent libraries adding code to a common place. For example, the global window object.
If my library exports something to window named "getData" and someone has a library that also exports a "getData" one of us is going to override the other if they are loaded at the same time (whoever is loaded last).
However, if I want to be safer, instead of doing:
window.getData = () => {};
I can instead create a Symbol (whose reference I keep track of) and then call my getData() with the symbol:
window[getDataSymbol]();
I can even export that Symbol to users of my library so they can use that to call it instead.
(Note, all of the above would be fairly poor naming, but again, just an example.)
Also, as someone mentioned in the comments, these Symbols are not for sharing between systems. If I call Symbol('a') that is totally unique to my system. I can't share it with anyone else. If you need to share between systems you have to make sure you are enforcing key uniqueness.
As a very practical example what kind of problem Symbols solve, take angularjs's use of $ and $$:
AngularJS Prefixes $ and $$: To prevent accidental name collisions with your code, AngularJS prefixes names of public objects with $ and names of private objects with $$. Please do not use the $ or $$ prefix in your code.
https://docs.angularjs.org/api
You'll sometimes have to deal with objects that are "yours", but that Angular adds its own $ and $$ prefixed properties to, simply as a necessity for tracking certain states. The $ are meant for public use, but the $$ you're not supposed to touch. If you want to serialise your objects to JSON or such, you need to use Angular's provided functions which strip out the $-prefixed properties, or you need to otherwise be aware of dealing with those properties correctly.
This would be a perfect case for Symbols. Instead of adding public properties to objects which are merely differentiated by a naming convention, Symbols allow you to add truly private properties which only your code can access and which don't interfere with anything else. In practice Angular would define a Symbol once somewhere which it shares across all its modules, e.g.:
export const PRIVATE_PREFIX = Symbol('$$');
Any other module now imports it:
import { PRIVATE_PREFIX } from 'globals';
function foo(userDataObject) {
userDataObject[PRIVATE_PREFIX] = { foo: 'bar' };
}
It can now safely add properties to any and all objects without worrying about name clashes and without having to advise the user about such things, and the user doesn't need to worry about Angular adding any of its own properties since they won't show up anywhere. Only code which has access to the PRIVATE_PREFIX constant can access these properties at all, and if that constant is properly scoped, that's only Angular-related code.
Any other library or code could also add its own Symbol('$$') to the same object, and it would still not clash because they're different symbols. That's the point of Symbols being unique.
(Note that this Angular use is hypothetical, I'm just using its use of $$ as a starting point to illustrate the issue. It doesn't mean Angular actually does this in any way.)
To expand on #samanime's excellent answer, I'd just like to really put emphasis on how Symbols are most commonly used by real developers.
Symbols prevent key name collision on objects.
Let's inspect the following page from MDN on Symbols. Under "Properties", you can see some built-in Symbols. We'll look at the first one, Symbol.iterator.
Imagine for a second that you're designing a language like JavaScript. You've added special syntax like for..of and would like to allow developers to define their own behavior when their special object or class is iterated over using this syntax. Perhaps for..of could check for a special function defined on the object/class, named iterator:
const myObject = {
iterator: function() {
console.log("I'm being iterated over!");
}
};
However, this presents a problem. What if some developer, for whatever reason, happens to name their own function property iterator:
const myObject = {
iterator: function() {
//Iterate over and modify a bunch of data
}
};
Clearly this iterator function is only meant to be called to perform some data manipulation, probably very infrequently. And yet if some consumer of this library were to think myObject is iterable and use for..of on it, JavaScript will go right ahead and call that function, thinking it's supposed to return an iterator.
This is called a name collision and even if you tell every developer very firmly "don't name your object properties iterator unless it returns a proper iterator!", someone is bound to not listen and cause problems.
Even if you don't think just that one example is worthy of this whole Symbol thing, just look at the rest of the list of well-known symbols. replace, match, search, hasInstance, toPrimitive... So many possible collisions! Even if every developer is made to never use these as keys on their objects, you're really restricting the set of usable key names and therefore developer freedom to implement things how they want.
Symbols are the perfect solution for this. Take the above example, but now JavaScript doesn't check for a property named "iterator", but instead for a property with a key exactly equal to the unique Symbol Symbol.iterator. A developer wishing to implement their own iterator function writes it like this:
const myObject = {
[Symbol.iterator]: function() {
console.log("I'm being iterated over!");
}
};
...and a developer wishing to simply not be bothered and use their own property named iterator can do so completely freely without any possible hiccups.
This is a pattern developers of libraries may implement for any unique key they'd like to check for on an object, the same way the JavaScript developers have done it. This way, the problem of name collisions and needing to restrict the valid namespace for properties is completely solved.
Comment from the asker:
The bit which confused me on the linked OP is they've created 2 variables with the names cake1 and cake2. These names are unique and the developer has had to determine them so I didn't understand why they couldn't assign the variable to the same name, as a string (const cake1 = 'cake1'; const cake2 = 'cake2'). This could be used to make 2 unique key names since the strings 'cake1' !== 'cake2'. Also the answer says for Symbol you "can't share it" (e.g. between libraries) so what use is that in terms of avoiding conflict with other libraries or other developers code?
The linked OP I think is misleading - it seems the point was supposed to be that both symbols have the value "cake" and thus you technically have two duplicate property keys with the name "cake" on the object which normally isn't possible. However, in practice the capability for symbols to contain values is not really useful. I understand your confusion there, again, I think it was just another example of avoiding key name collision.
About the libraries, when a library is published, it doesn't publish the value generated for the symbol at runtime, it publishes code which, when added to your project, generates a completely unique symbol different than what the developers of the library had. However, this means nothing to users of the library. The point is that you can't save the value of a symbol, transfer it to another machine, and expect that symbol reference to work when running the same code. To reiterate, a library has code to create a symbol, it doesn't export the generated value of any symbols.
What's the purpose of Symbol in terms of unique object identifiers?
Well,
Symbol( 'description' ) !== Symbol( 'description' )
How does any of this make Symbol "valuable" in terms of being unique object keys or identifiers?
In a visitor pattern or chain of responsibility, some logic may add additional metadata to any object and that's it (imagine some validation OR ORM metadata) attached to objects but that does not persist *.
If you needed to send something like {Symbol('sym1'): 'prop'} to a backend via ajax what would you actually need to do in this case?
If I may assure you, you won't need to do that. you would consider { sym1: 'prop' } instead.
Now, this page even has a note about it
Note: If you are familiar with Ruby's (or another language) that also has a feature called "symbols", please don’t be misguided. JavaScript symbols are different.
As I said, there are useful for runtime metadata and not effective data.

Best (most performant) way to declare (class) properties with unknown values in v8

So I learned a bit about the hidden class concept in v8. It is said that you should declare all properties in the constructor (if using prototype based "pseudo classes") and that you should not delete them or add new ones outside of the constructor. So far, so good.
1) But what about properties where you know the type (that you also shouldn't change) but not the (initial) value?
For example, is it sufficient to do something like this:
var Foo = function () {
this.myString;
this.myNumber;
}
... and assign concrete values later on, or would it be better to assign a "bogus" value upfront, like this:
var Foo = function () {
this.myString = "";
this.myNumber = 0;
}
2) Another thing is with objects. Sometimes I just know that an object wont have a fixed structure, but I want to use it as a hash map. Is there any (non verbose) way to tell the compiler I want to use it this way, so that it isn't optimized (and deopted later on)?
Update
Thanks for your input! So after reading your comments (and more on the internet) I consider these points as "best practices":
Do define all properties of a class in the constructor (also applies for defining simple objects)
You have to assign something to these properties, even if thats just null or undefined - just stating this.myString; is apparently not enough
Because you have to assign something anyways I think assigning a "bogus" value in case you can't assign the final value immediatly cannot hurt, so that the compiler does "know" ASAP what type you want to use. So, for example this.myString = "";
In case of objects, do assign the whole structure if you know it beforehand, and again assign dummy values to it's properties if you don't know them immediatly. Otherwise, for example when intending to use the Object as a hashmap, just do: this.myObject = {};. Think its not worth indicating to the compiler that this should be a hashmap. If you really want to do this, I found a trick that assigns a dummy property to this object and deletes it immediatly afterwards. But I won't do this.
As for smaller Arrays it's apparently recommended (reference: https://www.youtube.com/watch?v=UJPdhx5zTaw&feature=youtu.be&t=25m40s) to preallocate them especially if you know the final size, so for example: this.myArray = new Array(4);
Don't delete properties later on! Just null them if needed
Don't change types after assigning! This will add another hidden class and hurt performance. I think thats best practice anyways. The only case where I have different types is for certain function arguments anyways. In that case I usually convert them to the same target type.
Same applies if you keep adding additional properties later on.
That being said, I also think doing this will lean to cleaner and more organized code, and also helps with documenting.
Yeah, so one little thing I am unsure remains: What if I define properties in a function (for example a kind of configure() method) called within the constructor?
Re 1): Just reading properties, like in your first snippet, does not do anything to the object. You need to assign them to create the properties.
But for object properties it doesn't actually matter much what values you initialise them with, as long as you do initialise them. Even undefined should be fine.
The concrete values are much more relevant for arrays, where you want to make sure to create them with the right elements (and without any holes!) because the VM tries to keep them homogeneous. In particular, never use the Array constructor, because that creates just holes.
Re 2): There are ways to trick the VM into using a dictionary representation, but they depend on VM and version and aren't really reliable. In general, it is best to avoid using objects as maps altogether. Since ES6, there is a proper Map class.

JS object custom variables

I read somewhere that objects were basically hash tables, and you could assign values to them willy nilly. Well, I am hoping to take advantage of this, but I want to know if it is even possible, if it is considered "correct", and, if there are any unwanted situations.
My situation:
I have a serious of objects (the kind which CANNOT be stored in the DOM!) which I want to assign to DOM objects. My plan is to:
FInd a dom object (A div, or area of some form), and then assign that to the variable myVar
I will then call: myVar.customVal = value
customVal of course is not defined in the DOM specification. Will that even work, though? Will it show up in the DOM, or stay a virtual variable? Is there any way to assign custom values to members of the DOM for access later?
You can do it:
var foo = document.getElementById('sidebar');
foo.party = 3;
console.dir(foo);
But no, it's not considered good practice. Rather, consider using HTML5's custom data attributes, or better yet, jQuery's abstraction of them.

javascript: array of variables or separate variables?

I used to store variables separately, for example, if I have an old element I would save it in a variable called "ele_old", and then later if I have a new element I would save that in a variable called "ele_new". However, it just occurred to me that I can save the 2 variables in 1 array variable, so I could do something like this:
eles_arr['old'] = //old element;
eles_arr['new'] = //new element;
this way would allow me to put variables of the same type into the same array for better organization, e.g. elements together in 1 array, and then ids together in another array.
The problem is I am quite new to javascript (and any other programming languages, for that matter), so I'm wondering if this is an inferior way than just keeping each variable separate. Will doing this cause any problems? for example, poor runtime performance?
Thanks!
What you're doing is setting properties on the eles_arr object. It's akin to saying:
eles_arr.oldEl = //old element
eles_arr.newEl = //new element
It's not storing them in an array, so it's similar to having the variables separated -- they're just grouped in the eles_arr object. To put them in an array you'd do:
eles_arr.push(oldEl);
eles_arr.push(newEl);
That being said I wouldn't put these two variables in an array. I would keep them separated to increase readability. It might make sense to you to have the values in the same array, and you may remember their positioning. Other developers may not though, which could lead to problems in the future.
Finally, having an array of two values for 'old' and 'new' will in no way affect performance in your case, but I still would not recommend using an array for readability's sake.
Update
I changed the variables from 'old' and 'new' to 'oldEl' and 'newEl' to reflect Šime's comment on the reserved keyword 'new'.
You can use an object to create a namespace / module, and so reduce the number of global / local variables:
var eles = {};
eles.foo = document.getElementById("foo");
eles.bar = document.getElementById("bar");
But that probably makes sense only if you have like 20 or more elements and you need them in various event handlers / functions. In that case, it makes sense to create the eles namespace and populate it with all your element references.

Categories

Resources