How to store data correctly in a JSON-like object? From what I've seen, there are two ways to store data inside a JSON object. Each has a different way of accessing the data (examples in Python):
Option 1:
obj1 = [
{"id": 1, "payload": "a"},
{"id": 2, "payload": "b"},
{"id": 3, "payload": "c"},
]
Access a certain id's payload in option 1:
for element in obj1:
    if element["id"] == 2:
        print(element["payload"])
This means the whole list of elements may have to be scanned to find the right id and return its value.
On the other hand, there's Option 2:
obj2 = {
1: "a",
2: "b",
3: "c",
}
Accessing the payload for id 2 is just:
print(obj2[2])
My question now is, why is it more common to see option 1, even though that one seems more complicated to search through? When would I use option 1 and when option 2?
We should clarify some terminology.
When you put a list of values in [], you are creating an array; it is keyed by the array index, not by any element of the data.
When you put a list of keys and values in {}, you are creating an object; you can, as you note, locate a value if you know the corresponding key.
There are many differences between the two structures. In the particular use case you cite (wanting to find a data instance based on the value of one of its fields), an object that uses that field as the key makes sense.
But object keys are unordered. And arrays lend themselves more naturally to having you iterate over all their elements.
It depends what you're going to do with the values; that's why there isn't just one data structure.
Apart from technical artifacts that somehow evolved, option one makes sense in case the order of elements is important.
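The two layouts are not mutually exclusive. A minimal sketch, written in JavaScript like the other examples on this page: keep the array for ordered iteration and derive a Map from it for lookup by id. The names `records` and `byId` are illustrative assumptions, not taken from the question.

```javascript
// Option 1: an ordered array of records.
const records = [
  { id: 1, payload: "a" },
  { id: 2, payload: "b" },
  { id: 3, payload: "c" },
];

// Option 2, derived from option 1: a Map keyed by id for direct lookup.
const byId = new Map(records.map((record) => [record.id, record]));

console.log(byId.get(2).payload); // direct lookup, no scan of the array

for (const record of records) {
  console.log(record.id, record.payload); // iteration follows the array order
}
```

The Map holds references to the same record objects, so the extra cost is one entry per record rather than a second copy of the data.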
For the past few days I've been looking into JavaScript and how to build a ranked list out of comparisons. I found something about arrays, and while I knew this was what I needed, I couldn't understand exactly how it worked with my very limited knowledge of JavaScript.
I want to make a list. Say we have 5 items (let's take a, b, c, d, e as examples) and you're asked to compare one to another (do you like "a" or "d"?). Each time you choose an item it gets a point, and it keeps asking until every item has been compared with every other item. Then a list is made from those comparisons, and the items with the most points are ranked first.
If there is an article, a video, or some code that I can analyze, anything would help.
Thanks in advance
Here is an idea of how you could approach it.
So you could have the following setup:
const items = ["a", "b", "c", "d" /* ...and so on */];
let pointsRecord = {"a": 0, "b": 0, "c": 0 /* ...and so on */}; // `let`, because it is reassigned when points are updated
With these two pieces you can build pairs to compare, and keep track of the points of each item.
From your description it sounds like you want to compare every item with every other item on the list. You will probably have to loop through the list twice and build a comparison pair that you can output, or render on the screen.
You could write a function to build comparison pairs to look something like this:
const comparisonPairs = [
  [ "a", "b" ],
  [ "a", "c" ],
  [ "a", "d" ],
  [ "b", "c" ],
  // ...and so on
];
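One way such a pair-building function could look, as a sketch rather than the only option. The name `buildComparisonPairs` is an assumption made for this example; it pairs each item only with the items that come after it, so every pair appears exactly once.

```javascript
// Build every unordered pair exactly once: ["a", "b"] but not ["b", "a"] or ["a", "a"].
function buildComparisonPairs(items) {
  const pairs = [];
  for (let i = 0; i < items.length; i++) {
    // Start at i + 1 so an item is never paired with itself or with an earlier item again.
    for (let j = i + 1; j < items.length; j++) {
      pairs.push([items[i], items[j]]);
    }
  }
  return pairs;
}

const examplePairs = buildComparisonPairs(["a", "b", "c", "d"]);
console.log(examplePairs.length); // 6 pairs for 4 items
```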
With something like this you could iterate through comparisonPairs, render the pair on the screen or output to the terminal. Then take the input, and update the pointsRecord object. Something like:
pointsRecord = {
  ...pointsRecord, // copy over all the data from the original point record
  [selectedItem]: pointsRecord[selectedItem] + 1 // increment the points of the selected item
};
After updating pointsRecord you could move on to the next pair.
To sort them at the end you would just have to run a sort function on the items. Something like:
const sortedItems = items.slice().sort((a, b) => pointsRecord[b] - pointsRecord[a]);
Adding .slice() creates a new array so you don't alter the original. This is typically good practice, but you may want to sort the original in place depending on your use case.
This is just one approach. You will certainly want to think about how to organize the data, which it sounds like you are already doing, so you are on the right track. If, as you work through this, updating things and keeping track of the data starts getting messy, that may be a good sign to reassess and rethink whether there is a better way to store the information.
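Putting the pieces together, here is a rough end-to-end sketch. `rankByComparisons` and its `choose` callback are hypothetical names introduced for illustration; `choose` stands in for however you collect the user's answer (a click, a prompt, and so on) and is assumed to return one of the two items it was given.

```javascript
// Runs every pairwise comparison and returns the items ranked by points.
function rankByComparisons(items, choose) {
  // Start every item at zero points.
  const points = Object.fromEntries(items.map((item) => [item, 0]));

  for (let i = 0; i < items.length; i++) {
    for (let j = i + 1; j < items.length; j++) {
      const winner = choose(items[i], items[j]);
      points[winner] += 1;
    }
  }

  // Copy before sorting so the caller's array is left untouched.
  const ranked = items.slice().sort((a, b) => points[b] - points[a]);
  return { ranked, points };
}

// Stand-in for user input: always prefer the letter later in the alphabet.
const result = rankByComparisons(["a", "b", "c", "d"], (left, right) =>
  left > right ? left : right
);
console.log(result.ranked); // ["d", "c", "b", "a"]
```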
I've been reading all around the MDN, but I get stuff like:
keyPath
The key path for the index to use. Note that it is possible to create an index with an empty keyPath, and also to pass in a sequence (array) as a keyPath.
Well, no s#!t, keyPath is a key path. But what is that?
In all examples, it's the same thing as the name of the column (or index, as they call it):
objectStore.createIndex("hours", "hours", { unique: false });
objectStore.createIndex("minutes", "minutes", { unique: false });
objectStore.createIndex("day", "day", { unique: false });
objectStore.createIndex("month", "month", { unique: false });
objectStore.createIndex("year", "year", { unique: false });
They say that you can pass:
An empty string - ""
A valid JavaScript identifier (I assume this means valid JS variable name)
Multiple javascript identifiers separated by periods, eg: "name.name2.foo.bar"
An array containing any of the above, eg.: ["foo.bar","","name"]
I can't imagine what purpose this serves. I absolutely do not understand what keyPath is or what I can use it for. Can someone please provide an example where keyPath is something other than the name of the column, and explain what effect the value of keyPath has on the database?
Examples might help. If you use a key path with an object store, you can have keys plucked out of the objects being stored rather than having to specify them on each put() call. For example, with records that just have an id and name, you could use the record's id as a primary key for the object store:
store = db.createObjectStore('my_store', {keyPath: 'id'});
store.put({id: 987, name: 'Alice'});
store.put({id: 123, name: 'Bob'});
Which gives you this store:
key | value
----+-------------------------
123 | {id: 123, name: 'Bob'}
987 | {id: 987, name: 'Alice'}
But if you want to look a record up by name, you create an index:
name_index = store.createIndex('index_by_name', 'name');
Which gives you this index:
key     | primary key | value
--------+-------------+-------------------------
'Alice' | 987         | {id: 987, name: 'Alice'}
'Bob'   | 123         | {id: 123, name: 'Bob'}
(The index doesn't really store a copy of the value, just the primary key. But it's easier to visualize this way. This also explains the properties you'll see on a cursor if you iterate over the index.)
So now you can look up a record by name with:
req = name_index.get('Alice')
When records are added to the store, the key path is used to generate the key for the index.
Key paths with . separators can be used for lookups in more complex records. Key paths that are arrays can either produce compound keys (where the key is itself an array), or multiple index entries (if multiEntry: true is specified)
A great way to understand things is to think how you would design it yourself.
Let's take a step back, and look at how a very simple NoSQL database would store data, then add indices.
Each store would look like a JavaScript object (or a dictionary in Python, a Dictionary in C#, etc.):
{
"1": {"name": "Lara", "age": "20"},
"2": {"name": "Sara", "age": "22"},
"3": {"name": "Joe", "age": "22"},
}
This structure is basically a list of values, plus an index to retrieve the values, like so:
//Raw JS:
var sara = data["2"]
// IndexedDB equivalent:
store.get(2)
This is super fast for direct object retrieval, but sucks for filtering. If you want to get people of a specific age, you need to loop over every value.
If you knew in advance you would be doing your queries by age, you could really speed up such queries by creating a new object which indexes the value by age:
{
"20": [<lara>],
"22": [<joe>, <sara>],
...
}
But how would you query that? You can't use the default indexing syntax (e.g. data[22] or store.get(22)) because those expect the key.
One way would be to name that second index, e.g. by_age_index, and give our store object a way to access that index by name, so you could do this:
store.index('by_age_index').getAll(22) // Returns Joe and Sara (get() would return only the first match)
The last bit of the puzzle would be telling that index how to determine which records go against which key (because it has to keep itself updated when records are added/changed/removed)
In other words, how does it know Joe and Sara go against key 22, but Lara goes against key 20?
In our case, we want to use the field age from each record. This is what we mean by a keyPath.
So when defining an index, it makes sense that we would specify that as well as the name, e.g.
store.createIndex('by_age_index', 'age');
Of course, if you want to access your index like this:
store.index('age').getAll(22) // Returns Joe and Sara
Then you need to create your index like this:
store.createIndex('age', 'age');
That is what most people do, which is why we see it in examples. It gives the impression that the first argument is the keyPath (whereas it's actually just the arbitrary name we give that index), leaving us unsure what the second argument might be for.
I could have explained all this by saying:
The first parameter is the handle by which you access the index on the store, the second parameter is the name of the field on the record by which that index should group its records.
But maybe this rundown will help other people too :-)
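To make the thought experiment concrete, here is a toy in-memory store in plain JavaScript. It is not IndexedDB and does not use its API; `ToyStore`, `getAllByIndex` and the `id` field are names assumed for this sketch. It only shows how an index can use a key path (here a single field name) to keep itself up to date as records are added or replaced.

```javascript
// Toy in-memory store: NOT IndexedDB, just the same idea in plain JavaScript.
class ToyStore {
  constructor(keyPath) {
    this.keyPath = keyPath;   // field that supplies the primary key
    this.records = new Map(); // primary key -> record
    this.indexes = new Map(); // index name -> { keyPath, entries }
  }

  createIndex(name, keyPath) {
    // entries maps an index key (e.g. an age) to the set of primary keys that have it.
    const index = { keyPath, entries: new Map() };
    this.indexes.set(name, index);
    for (const [primaryKey, record] of this.records) {
      this.link(index, primaryKey, record); // backfill records stored earlier
    }
  }

  link(index, primaryKey, record) {
    const indexKey = record[index.keyPath]; // the key path decides where the record goes
    if (!index.entries.has(indexKey)) index.entries.set(indexKey, new Set());
    index.entries.get(indexKey).add(primaryKey);
  }

  put(record) {
    const primaryKey = record[this.keyPath];
    const old = this.records.get(primaryKey);
    for (const index of this.indexes.values()) {
      // Drop the stale entry first when an existing record is overwritten.
      if (old !== undefined) index.entries.get(old[index.keyPath]).delete(primaryKey);
      this.link(index, primaryKey, record);
    }
    this.records.set(primaryKey, record);
  }

  get(primaryKey) {
    return this.records.get(primaryKey);
  }

  getAllByIndex(indexName, indexKey) {
    const primaryKeys = this.indexes.get(indexName).entries.get(indexKey) || new Set();
    return [...primaryKeys].map((primaryKey) => this.records.get(primaryKey));
  }
}

const people = new ToyStore("id");
people.createIndex("by_age_index", "age");
people.put({ id: 1, name: "Lara", age: 20 });
people.put({ id: 2, name: "Sara", age: 22 });
people.put({ id: 3, name: "Joe", age: 22 });

const aged22 = people.getAllByIndex("by_age_index", 22).map((person) => person.name);
console.log(aged22); // ["Sara", "Joe"]
```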
A keypath is how you indicate to IndexedDB which properties of your object play a special role. It is similar to how you would indicate to an SQL database that a certain column in a table is the primary key of the table, or how you could tell a database to create an index on one or more particular columns in a table.
In other words, it is the path that the IndexedDB implementation should follow when determining which property should be used for some calculation, for example when searching for a value with a given key.
It is a path, and not a simple key, because it considers that object property values can also be objects. In other words, there is a hierarchy. For example, {a:{b:1}}. The "path" to the value 1 is "a.b". The path is the sequence of properties to visit to get to the value.
The key part of the name signifies that the properties play an important role, for example in identifying the primary key property, or a particular indexed property.
Properties that are not part of the keypath are ignored in the sense that the IndexedDB implementation just treats the whole object as a bag of properties, and only pays attention to those, or gains awareness of those, that are a part of a keypath.
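The idea of "following the path" can be sketched in a few lines of plain JavaScript. `evaluateKeyPath` is a hypothetical helper written for this explanation, not a function IndexedDB exposes, and it is deliberately simplified: a real implementation also validates key types and special-cases some built-in properties.

```javascript
// Simplified sketch of resolving a key path against a stored value.
function evaluateKeyPath(value, keyPath) {
  if (Array.isArray(keyPath)) {
    // An array key path yields a compound key: one part per sub-path.
    return keyPath.map((path) => evaluateKeyPath(value, path));
  }
  if (keyPath === "") {
    return value; // the empty key path means "the value itself"
  }
  let current = value;
  for (const property of keyPath.split(".")) {
    // Stop if the path cannot be followed any further.
    if (current === null || typeof current !== "object" || !(property in current)) {
      return undefined;
    }
    current = current[property];
  }
  return current;
}

const record = { a: { b: 1 }, name: "Alice" };
console.log(evaluateKeyPath(record, "a.b"));           // 1
console.log(evaluateKeyPath(record, ["name", "a.b"])); // ["Alice", 1]
console.log(evaluateKeyPath("plain", ""));             // "plain"
```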
There's a number of posts here about this issue, and they all contain a lot of assertions that can be summarized like this:
Object properties are never guaranteed to be ordered in any way.
JSON.parse() never sorts properties in any way.
Obviously we tend to have no doubt about #1 above, so we may reasonably expect that, for any operation, properties are processed merely in the order they appear.
[edit, following the #Bergi's comment: or at least they should appear in a random order]
Then from that we might especially infer that #2 should be true.
But look at this snippet:
(BTW note: to show the results, snippets below don't use console.log() which may itself change order of the output. Instead objects are iterated by for (key in obj) and the output displayed in the document)
var inputs = [
  '{"c": "C", "a": "A", "b": "B"}',
  '{"3": "C", "1": "A", "2": "B"}',
  '{"c": "C", "a": "A", "3": "C", "1": "A", "b": "B", "2": "B"}'
];
for (var i in inputs) {
  var json = inputs[i],
      parsed = JSON.parse(json),
      output = [];
  for (var j in parsed) {
    output.push(j + ': ' + parsed[j]);
  }
  document.write(`JSON: ${json}<br />Parsed: ${output.join(', ')}<hr />`);
}
It shows that, given a JSON string having unordered keys:
When the input has keys with non-numeric values, the parsed object has its properties in the same order as in the input. This is consistent with the #2 assumption above.
Conversely, when the input has keys with numeric values (though they're strings, so they don't trigger a parse error), the parsed object has its properties sorted. This now contradicts the #2 assumption.
Moreover, when there are mixed numeric and non-numeric key values, the numeric properties appear first, sorted, followed by the non-numeric properties in their original order.
From that I was first tempted to conclude that there is actually a (non-documented?) feature, so that JSON.parse() works following the "rules" described above.
But I had the idea to look further, so the snippet below shows how the properties of an object written directly in code are ordered:
var objects = [
  [
    '{"c": "C", "a": "A", "b": "B"}',
    {"c": "C", "a": "A", "b": "B"}
  ],
  [
    '{"3": "C", "1": "A", "2": "B"}',
    {"3": "C", "1": "A", "2": "B"}
  ],
  [
    '{"c": "C", "a": "A", "3": "C", "1": "A", "b": "B", "2": "B"}',
    {"c": "C", "a": "A", "3": "C", "1": "A", "b": "B", "2": "B"}
  ]
];
for (var i in objects) {
  var object = objects[i],
      output = [];
  for (var j in object[1]) {
    output.push(j + ': ' + object[1][j]);
  }
  document.write(`Code: ${object[0]}<br />Object: ${output.join(', ')}<hr />`);
}
It results in analogous observations, i.e. whichever order they're coded in, properties are stored following the 3rd rule above:
numerically named properties are all put first, sorted
other properties are set next, ordered as coded
So it means that JSON.parse() is not involved: in fact it seems to be a fundamental process of object building.
Again, this appears not to be documented, at least as far as I could find.
Any clue for a real, authoritative, rule?
[Edit, thanks to #Oriol's answer] In summary, it actually appears that:
This behaviour conforms to an ECMA specification rule.
This rule should apply to all methods where a specific order is guaranteed but is optional for other cases.
However it seems that modern browsers all choose to apply the rule whatever method is involved, hence the apparent contradiction.
The properties of an object have no order, so JSON.parse can't sort them. However, when you list or enumerate the properties of an object, the order may be well-defined or not.
Not necessarily for for...in loops nor Object.keys
As fully explained in Does ES6 introduce a well-defined order of enumeration for object properties?, the spec says
The mechanics and order of enumerating the properties is not specified
But yes for OrdinaryOwnPropertyKeys
Objects have an internal [[OwnPropertyKeys]] method, which is used for example by Object.getOwnPropertyNames and Object.getOwnPropertySymbols.
In the case of ordinary objects, that method uses the OrdinaryOwnPropertyKeys abstract operation, which returns properties in a well-defined order:
When the abstract operation OrdinaryOwnPropertyKeys is called with Object O, the following steps are taken:
1. Let keys be a new empty List.
2. For each own property key P of O that is an integer index, in ascending numeric index order
   Add P as the last element of keys.
3. For each own property key P of O that is a String but is not an integer index, in ascending chronological order of property creation
   Add P as the last element of keys.
4. For each own property key P of O that is a Symbol, in ascending chronological order of property creation
   Add P as the last element of keys.
5. Return keys.
Therefore, since an order is required by OrdinaryOwnPropertyKeys, implementations may decide to internally store the properties in that order, and use it too when enumerating. That's what you observed, but you can't rely on it.
Also be aware non-ordinary objects (e.g. proxy objects) may have another [[OwnPropertyKeys]] internal method, so even when using Object.getOwnPropertyNames the order could still be different.
so we may reasonably expect that, for any operation, properties are processed merely in the order they appear
That's where the flaw in the reasoning lies. Given that object properties aren't guaranteed to be ordered, we have to assume that any operation processes properties in any order that it sees fit.
And in fact engines evolved in a way that treats integer properties specially: they're like array indices, and are stored in a more efficient format than a lookup table.
I have some data which I originally stored in a generic JavaScript object, with the ID as a key:
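The ordering rule quoted above can be checked directly with Object.getOwnPropertyNames, which is specified to use [[OwnPropertyKeys]]. The sketch below reuses the third JSON string from the question; the Map at the end is an addition for contrast, showing a structure that keeps pure insertion order.

```javascript
const parsed = JSON.parse('{"c": "C", "a": "A", "3": "C", "1": "A", "b": "B", "2": "B"}');

// Integer-like keys come first in ascending numeric order,
// then the remaining string keys in the order they were created.
const names = Object.getOwnPropertyNames(parsed);
console.log(names); // ["1", "2", "3", "c", "a", "b"]

// A Map keeps insertion order, even for keys that look like integers.
const ordered = new Map([["c", "C"], ["3", "C"], ["1", "A"]]);
const mapKeys = [...ordered.keys()];
console.log(mapKeys); // ["c", "3", "1"]
```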
{
"7": {"id":"7","name":"Hello"},
"3": {"id":"3","name":"World"},
...
}
However, I discovered that browsers do not guarantee a particular object order when looping through them, so in the above "3" would come before "7". I switched to using an array format like this:
[
{"id":"7","name":"Hello"},
{"id":"3","name":"World"},
...
]
Now, I can loop in the correct order but cannot do fast lookups, e.g. data["3"] without having to loop through the array.
Is there a good way to combine both approaches? I would rather avoid using a separate object for each format, because the object is pretty large (hundreds of elements).
I have run across this problem as well. A solution is to keep an ordered array of keys in addition to the original object.
var objects = {
"7": {"id":"7","name":"Hello"},
"3": {"id":"3","name":"World"},
...
}
var order = [ "3", "7", ... ];
Now if you want the second element you can do this lookup:
var second_object = objects[order[1]];
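Both structures can be built in one pass from the array form in the question. This is a sketch with an assumed input name (`list`); note that `objects` stores references to the same records rather than copies, so the extra memory is roughly one key per record.

```javascript
const list = [
  { id: "7", name: "Hello" },
  { id: "3", name: "World" },
];

// One pass builds both structures; the records themselves are shared, not copied.
const objects = {};
const order = [];
for (const item of list) {
  objects[item.id] = item; // lookup by id
  order.push(item.id);     // remembers the intended order
}

console.log(objects["3"].name); // "World"

const namesInOrder = order.map((id) => objects[id].name);
console.log(namesInOrder); // ["Hello", "World"]
```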
Older versions of the ECMAScript standard (before ES2015) do not say anything about the order of the properties in an object, and Chrome in particular reorders the keys when they look like numbers.
Example:
var example = {
  "a": "a",
  "b": "b",
  "1": "1",
  "2": "2"
};
If you print this in Chrome you will get something like:
{
1: "1",
2: "2",
"a": "a",
"b": "b"
};
It's a little sour, but that's life.
You could use the solution Andy linked as well, basically wrapping these two together in one object.
An alternative that I use a lot is a custom map function that allows you to specify the order in which the object is traversed. Typically you will do sorting when you're printing your data to the user so while you loop and create your table rows (for instance) your iterator will pass the rows in the order your sort function specifies. I thought it was a nice idea :)
The signature looks like:
function map(object, callback, sort_function);
Example usage:
map(object, function (row) {
  table.add_row(row.header, row.value);
}, function (key1, key2) {
  return object[key1] - object[key2];
});
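The answer only gives the signature, so the body below is a guess at one possible implementation, not the author's code: it sorts the object's keys with the supplied comparator and then calls the callback once per key in that order. Passing the key as a second callback argument is an assumption added for the example.

```javascript
// Hypothetical implementation of the three-argument map described above.
function map(object, callback, sort_function) {
  const keys = Object.keys(object);
  if (sort_function) {
    keys.sort(sort_function); // the comparator receives two keys
  }
  // Visit the values in sorted key order and collect the callback results.
  return keys.map((key) => callback(object[key], key));
}

const scores = { x: 3, y: 1, z: 2 };
const visited = map(
  scores,
  (value, key) => key + "=" + value,
  (key1, key2) => scores[key1] - scores[key2]
);
console.log(visited); // ["y=1", "z=2", "x=3"]
```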
Rather than coding your own, there are off-the-shelf libraries available to provide "as provided" JSON parsing or "consistently sorted" JSON printing for display.
You might well consider either of these:
The 'json-order' package offers parsing, formatting & pretty-printing with stable ordering. This is based on having ordered input.
The 'fast-json-stable-stringify' package offers deterministic formatting based on sorting.