I need to store client side data temporarily. The data will be trashed on refresh or redirect. What is the best way to store the data?
using javascript by putting the data inside a variable
var data = {
    a: "longstring",
    b: "longstring",
    c: "longstring"
};
or
putting the data inside html elements (as data-attribute inside div tags)
<ul>
<li data-duri="longstring"></li>
<li data-duri="longstring"></li>
<li data-duri="longstring"></li>
</ul>
The amount of data to store temporarily could get large, because the data I need to keep are image data URIs, and a user who does not refresh for a whole day could stack up maybe 500+ images of 50kb-3mb each. (I am unsure whether that much data could crash the app through memory consumption; please correct me if I am wrong.)
What do you guys suggest is the most efficient way to keep the data?
I'd recommend storing the data in JavaScript and only updating the DOM when you actually want to display an image, assuming all the images are not displayed at the same time. Also note that the browser will keep its own copy of an image in memory while it is in the DOM.
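A minimal sketch of that approach, assuming jQuery as in the other snippets here (the cache object and element names are illustrative):
// Keep the data URIs in a plain JavaScript object, never in the DOM
var imageCache = {};

function storeImage(id, dataUri) {
    imageCache[id] = dataUri; // held in memory only, gone on refresh/redirect
}

function showImage(id) {
    // Touch the DOM only when the image actually needs to be displayed
    $('#preview').attr('src', imageCache[id]);
}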
Update: As comments have been added to the OP I believe you need to go back to customer requirements and design - caching 500 x 3MB images is unworkable - consider thumbnails etc? This answer only focuses on optimal client side caching if you really need to go that way...
Data URI efficiency
Data URIs use base64, which adds an overhead of around 33% when representing binary data in ASCII.
Although base64 is required when updating the DOM, the overhead can be avoided in storage by keeping the data as binary strings and converting with the atob() and btoa() functions - as long as you drop references to the original data so it can be garbage collected.
var dataAsBase64 = "iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==";
var dataAsBinary = atob(dataAsBase64);
console.log(dataAsBinary.length + " vs " + dataAsBase64.length);
// use it later
$('.foo').attr("src", "data:image/png;base64," + btoa(dataAsBinary));
String memory efficiency
How much RAM does each character in ECMAScript/JavaScript string consume? suggests strings take 2 bytes per character - although this could still be browser dependent.
This could be avoided by using ArrayBuffer for 1-to-1 byte storage.
var byteArray = new Uint8Array(dataAsBinary.length);
for (var i = 0; i < dataAsBinary.length; i++) {
    byteArray[i] = dataAsBinary.charCodeAt(i);
}
// allow the string copies to be garbage collected
dataAsBase64 = undefined;
dataAsBinary = undefined;
// use it later: rebuild the base64 string from the byte array
dataAsBase64 = btoa(String.fromCharCode.apply(null, byteArray));
$('.foo').attr("src", "data:image/png;base64," + dataAsBase64);
Disclaimer: Note that all this adds a lot of complexity, and I'd only recommend such optimisation if you actually find a performance problem.
Alternative storage
Instead of using browser memory
local storage - limited, typically around 10MB; it certainly won't allow 500 x 3MB without specific browser configuration (see the sketch after this list).
Filesystem API - not yet widely supported, but the ideal solution - it can create temp files to offload the data to disk.
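If local storage is attempted as a fallback, a minimal sketch of guarding against the quota (the key naming is illustrative; the exception name/code for a full store varies by browser):
function tryCacheImage(key, dataUri) {
    try {
        localStorage.setItem('img:' + key, dataUri);
        return true;
    } catch (e) {
        // quota exceeded - fall back to keeping the image in memory only
        return false;
    }
}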
If you really want to lose the data on a refresh, just use a JavaScript hash/object, var storage = {}, and you have a key->value store. If you would like to keep the data for the duration of the user's visit (until they close the browser window), you could use sessionStorage; to persist the data indefinitely (or until the user deletes it), use localStorage or WebSQL.
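A minimal sketch of those three scopes (keys and values are illustrative):
// 1. Lost on refresh or redirect: a plain in-memory object
var storage = {};
storage['img1'] = 'data:image/png;base64,...';

// 2. Survives refresh, cleared when the browser window/tab is closed
sessionStorage.setItem('img1', 'data:image/png;base64,...');

// 3. Persists indefinitely (or until the user or the code deletes it)
localStorage.setItem('img1', 'data:image/png;base64,...');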
Putting data into the DOM (as a data-attribute, hidden fields etc.) is not a good idea, as having JavaScript go into the DOM and pull that information out is very expensive (crossing the border between the JavaScript world and the DOM world (the website structure) doesn't come cheap).
Using a JavaScript variable is the best way to store your temp data. You may consider storing data inside a DOM attribute only if the data is related to a specific DOM element.
Regarding performance, storing your data directly in a JavaScript variable will probably be faster, since storing data in a DOM element also involves JavaScript on top of the DOM modifications. If the data isn't related to an existing DOM element, you'd also have to create a new element just to hold that value and make sure it isn't visible to the user.
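For illustration only (the element and property names here are made up):
// Data that belongs to a specific, existing element: a data-attribute is reasonable
document.querySelector('#thumb-1').dataset.duri = 'data:image/png;base64,...';

// Data with no natural DOM home: keep it in a plain JavaScript variable instead
var tempData = { duri: 'data:image/png;base64,...' };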
The OP mentions a requirement for the data to be forcibly transient i.e. (if possible) unable to be saved locally on the client - at least that is how I read it.
If this type of data privacy is a firm requirement for the application, there are multiple considerations when dealing with a browser environment. I am unsure whether the images in question are to be displayed to the user, or where the source data of the images comes from relative to the client. If the data arrives in the browser over the network, you might do well (or better than the alternative, at least) to use a socket or other raw data connection rather than HTTP requests, and consider something like a "sentinel" value in the stream of bytes to indicate the boundaries of the image data.
Once you have the bytes coming in, you could, I believe (or soon will be able to), pass the data via a generator function into a typed array via the iterator protocol; see: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Uint8Array
// From an iterable
var iterable = function*(){ yield* [1,2,3]; }();
var uint8 = new Uint8Array(iterable);
// Uint8Array[1, 2, 3]
And then perhaps integrate those arrays as private members of some class you use to manage their lifecycle? see:
https://www.nczonline.net/blog/2014/01/21/private-instance-members-with-weakmaps-in-javascript/
var Person = (function() {
    // module-private store, keyed by a per-instance id
    var privateData = {},
        privateId = 0;

    function Person(name) {
        // non-enumerable id used to look up this instance's private record
        Object.defineProperty(this, "_id", { value: privateId++ });

        privateData[this._id] = {
            name: name
        };
    }

    Person.prototype.getName = function() {
        return privateData[this._id].name;
    };

    return Person;
}());
I think you should be able to manage the size / wait problem to some extent with the generator method of creating the byte arrays as well - perhaps checking for sane lengths, time spent on the iterator, etc.
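As a rough illustration of that idea (entirely a sketch; the function name and byte limit are made up):
// Collect bytes from an incoming iterator, refusing anything over a sane limit
function collectBytes(byteIterator, maxBytes) {
    var bytes = [];
    for (var b of byteIterator) {
        if (bytes.length >= maxBytes) {
            throw new Error('image larger than ' + maxBytes + ' bytes, aborting');
        }
        bytes.push(b);
    }
    return new Uint8Array(bytes);
}

// e.g. cap a single image at 3MB
// var img = collectBytes(incomingBytes, 3 * 1024 * 1024);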
This is a general set of ideas more than an answer, and none of it is my own authorship, but it seems appropriate to the question.
Why not use @Html.Hidden?
@Html.Hidden("hId", ViewData["name"], new { @id = "hId" })
There are various ways to do this, depending upon your requirement:
1) We can make use of constant variables: create a file Constants.js which can be used to store data as
"KEY_NAME" : "someval"
eg:
var data = {
    a: "longstring",
    b: "longstring",
    c: "longstring"
};
CLIENT_DATA = data;
Careful: This data will be lost if you refresh the screen, as all the variables' memory is simply released.
2) Make use of the cookie store, using:
document.cookie = "key=someval";
For reference: http://www.w3schools.com/js/tryit.asp?filename=tryjs_cookie_username
Careful: Cookie data has an expiry period, and the cookie store also has a limited capacity: https://stackoverflow.com/a/2096803/1904479.
Use: Consistent long-term storage, but not recommended for storing huge data.
3) Using Local Storage:
localStorage.setItem("key","value");
localStorage.getItem("key");
Caution: This stores values as string key-value pairs; you will not be able to store JSON arrays or objects without calling JSON.stringify() on them (see the sketch after this list).
Reference: http://www.w3schools.com/html/tryit.asp?filename=tryhtml5_webstorage_local
4) The last option is to write the data to a file
Reference: Writing a json object to a text file in javascript
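A minimal sketch of option 3 with non-string data (the key name is illustrative):
var data = { a: "longstring", b: "longstring", c: "longstring" };

// localStorage only stores strings, so serialise on the way in...
localStorage.setItem("clientData", JSON.stringify(data));

// ...and parse on the way out
var restored = JSON.parse(localStorage.getItem("clientData"));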
Related
I would like to come straight to the point and show you my sample data, which is on average around 180,000 lines from a .csv file, so a lot of lines. I am reading in the .csv with papaparse and then saving the data as an array of objects, which looks like this:
I used a picture here so you can also see all the properties my objects have or should have. The data is from the Media Transparency Data, which is open source and shows the payments between institutions.
The array of objects is saved using localforage, which is basically IndexedDB or WebSQL with a localStorage-like API. So the data is never saved on a server, only on the client!
The Question:
So my question is this: the user can add the sourceHash and/or targetHash attributes in a client interface. For example, assume the user loaded the "Energie Steiermark Kunden GmbH" object and now adds the sourceHash "company" to it - basically a tag. This is already reflected and shown in the client, but I also need to get it into localforage and therefore rewrite the initial array of objects. So I would need to search my huge 180,000-line array for every object that has the name "Energie Steiermark Kunden GmbH" (there can be multiple), set the property sourceHash to "company" on each, and then save it all again in localforage.
The first question would be how to do this most efficiently. I can get the data out of localforage using the following method, and set it correspondingly.
Get:
localforage.getItem('data').then((value) => {
...
});
Set:
localforage.setItem('data', dataObject);
However, the question is how do I do this most efficiently? For example, if the sourceNode starts with "E", we shouldn't need to search all sourceNodes. The same of course goes for the targetNode.
Thank you in advance!
UPDATE:
Thanks for the answers already! And how would you do it the most efficient way in JavaScript? Is it possible in a few lines? Assume, for example, that I have the current sourceHash "company" and want to assign it to every node named "Energie Steiermark Kunden GmbH" that appears across all timeNodes. These could be 20151, 20152, 20153, 20154 and so on...
Localforage is only a localStorage/sessionStorage-like wrapper over the actual storage engine, and so it only offers you the key-value capabilities of localStorage. In short, there's no more efficient way to do this for arbitrary queries.
This sounds more like a case for IndexedDB, as you can define search indexes over the data, for instance for sourceNodes, and do more efficient queries that way.
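A rough sketch of the IndexedDB route (the database, store and index names are made up, and the queried field is assumed to be sourceNode):
var open = indexedDB.open('payments-db', 1);

open.onupgradeneeded = function (e) {
    var db = e.target.result;
    var store = db.createObjectStore('payments', { autoIncrement: true });
    // Index the field we query on, so lookups don't scan all 180,000 rows
    store.createIndex('bySourceNode', 'sourceNode');
};

open.onsuccess = function (e) {
    var db = e.target.result;
    var index = db.transaction('payments', 'readwrite')
                  .objectStore('payments')
                  .index('bySourceNode');

    // Walk only the matching records, tag them, and write them back
    index.openCursor(IDBKeyRange.only('Energie Steiermark Kunden GmbH'))
        .onsuccess = function (ev) {
            var cursor = ev.target.result;
            if (cursor) {
                var record = cursor.value;
                record.sourceHash = 'company';
                cursor.update(record);
                cursor.continue();
            }
        };
};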
In Node.js you can assign values to keys on the global object. This gives you the ability to "remember" something between requests, assuming the Node.js process doesn't die/hang and get restarted by a process manager like iisnode.
Does Vert.x have an equivalent? Essentially, I'm looking for the simplest possible cache for a piece of data so I do not have to fetch it on every request. I assume the solution on Vert.x may be able to work across threads?
The code:
{{id:1,name:"Yahoo"},{id:2,name:"Google"}}
breaks because it's not valid JSON; you can use
{companies: [{id:1,name:"Yahoo"},{id:2,name:"Google"}]} // notice that they are inside an array
Now, the docs say:
To prevent issues due to mutable data, vert.x only allows simple immutable types such as number, boolean and string or Buffer to be used in shared data
That means you will maybe need to use
var map = vertx.getMap('demo.mymap');
map.put('data', JSON.stringify({companies : [{id:1,name:"Yahoo"},{id:2,name:"Google"}]}))
and then in other verticle
var map = vertx.getMap('demo.mymap');
var yourJSON = JSON.parse(map.get('data'));
Now, maybe a good option would be to use Redis as a cache system, although the vert.x map seems to solve your needs so far...
I have been experimenting with PostgreSQL and PL/V8, which embeds the V8 JavaScript engine into PostgreSQL. Using this, I can query into JSON data inside the database, which is rather awesome.
The basic approach is as follows:
CREATE or REPLACE FUNCTION
json_string(data json, key text) RETURNS TEXT AS $$
var data = JSON.parse(data);
return data[key];
$$ LANGUAGE plv8 IMMUTABLE STRICT;
SELECT id, data FROM things WHERE json_string(data,'name') LIKE 'Z%';
Using V8, I can parse the JSON data into JS, return a field, and use this as a regular pg query expression.
BUT
On large datasets, performance can be an issue, as for every row I need to parse the data.
The parser is fast, but it is definitely the slowest part of the process and it has to happen every time.
What I am trying to work out (to finally get to an actual question) is whether there is a way to cache or pre-process the JSON ... even storing a binary representation of the JSON in the table that could be used by V8 automatically as a JS object might be a win. I've had a look at using an alternative format such as MessagePack or protobuf, but I don't think they will necessarily be as fast as the native JSON parser in any case.
THOUGHT
PG has blobs and binary types, so the data could be stored in binary, then we just need a way to marshall this into V8.
Postgres supports indexes on arbitrary function calls. The following index should do the trick :
CREATE INDEX json_idx ON things (json_string(data,'name'));
The short version appears to be that with Pg's new json support, so far there's no way to store json directly in any form other than serialised json text. (This looks likely to change in 9.4)
You seem to want to store a pre-parsed form that's a serialised representation of how v8 represents the json in memory, and that's not currently supported. It's not even clear that v8 offers any kind of binary serialisation/deserialisation of json structures. If it doesn't do so natively, code would need to be added to Pg to produce such a representation and to turn it back into v8 json data structures.
It also wouldn't necessarily be faster:
If json was stored in a v8 specific binary form, queries that returned the normal json representation to clients would have to format it each time it was returned, incurring CPU cost.
A binary serialised version of json isn't the same thing as storing the v8 json data structures directly in memory. You can't write a data structure that involves any kind of graph of pointers out to disk directly, it has to be serialised. This serialisation and deserialisation has a cost, and it might not even be much faster than parsing the json text representation. It depends a lot on how v8 represents JavaScript objects in memory.
The binary serialised representation could easily be bigger, since most json is text and small numbers, where you don't gain any compactness from a binary representation. Since storage size directly affects the speed of table scans, value fetches from TOAST, decompression time required for TOASTed values, index sizes, etc, you could easily land up with slower queries and bigger tables.
I'd be interested to see whether an optimisation like what you describe is possible, and whether it'd turn out to be an optimisation at all.
To gain the benefits you want when doing table scans, I guess what you really need is a format that can be traversed without having to parse it and turn it into what's probably a malloc()'d graph of javascript objects. You want to be able to give a path expression for a field and grab it out directly from the serialised form where it's been read into a Pg read buffer or into shared_buffers. That'd be a really interesting design project, but I'd be surprised if anything like it existed in v8.
What you really need to do is research how the existing json-based object databases do fast searches for arbitrary json paths and what their on-disk representations are, then report back on pgsql-hackers. Maybe there's something to be learned from people who've already solved this - presuming, of course, that they have.
In the mean time, what I'd want to focus on is what the other answers here are doing: Working around the slow point and finding other ways to do what you need. You could also look into helping to optimise the json parser, but depending on whether the v8 one or some other one is in use that might already be far past the point of diminishing returns.
I guess this is one of the areas where there's a trade-off between speed and flexible data representation.
Perhaps, instead of making the retrieval phase responsible for parsing the data, creating a new data type which could pre-parse the JSON data on input might be a better approach?
http://www.postgresql.org/docs/9.2/static/sql-createtype.html
I don't have any experience with this, but it got me curious so I did some reading.
JSON only
What about something like the following (untested, BTW)? It doesn't address your question about storing a binary representation of the JSON; rather, it's an attempt to parse all of the JSON at once, for all of the rows you're checking, in the hope that this yields higher performance by reducing the overhead of parsing each row individually. If it succeeds at that, I'm thinking it may result in higher memory consumption, though.
The CREATE TYPE...set_of_records() stuff is adapted from the example on the wiki where it mentions that "You can also return records with an array of JSON." I guess it really means "an array of objects".
Is the id value from the DB record embedded in the JSON?
Version #1
CREATE TYPE rec AS (id integer, data text, name text);
CREATE FUNCTION set_of_records() RETURNS SETOF rec AS
$$
var records = plv8.execute( "SELECT id, data FROM things" );
var data = [];
// Use for loop instead if better performance
records.forEach( function ( rec, i, arr ) {
data.push( rec.data );
} );
data = "[" + data.join( "," ) + "]";
data = JSON.parse( data );
records.forEach( function ( rec, i, arr ) {
rec.name = data[ i ].name;
} );
return records;
$$
LANGUAGE plv8;
SELECT id, data FROM set_of_records() WHERE name LIKE 'Z%'
Version #2
This one gets Postgres to aggregate / concatenate some values to cut down on the processing done in JS.
CREATE TYPE rec AS (id integer, data text, name text);
CREATE FUNCTION set_of_records() RETURNS SETOF rec AS
$$
var cols = plv8.execute(
    "SELECT " +
    "array_agg( id ORDER BY id ) AS id, " +
    "string_agg( data, ',' ORDER BY id ) AS data " +
    "FROM things"
)[0];
cols.data = JSON.parse( "[" + cols.data + "]" );
var records = cols.id;
// Use for loop if better performance
records.forEach( function ( id, i, arr ) {
arr[ i ] = {
id : id,
data : cols.data[ i ],
name : cols.data[ i ].name
};
} );
return records;
$$
LANGUAGE plv8;
SELECT id, data FROM set_of_records() WHERE name LIKE 'Z%'
hstore
How would the performance of this compare? Duplicate the JSON data into an hstore column at write time (or, if the performance somehow managed to be good enough, convert the JSON to hstore at select time) and use the hstore in your WHERE, e.g.:
SELECT id, data FROM things WHERE hstore_data -> 'name' LIKE 'Z%'
I heard about hstore from here: http://lwn.net/Articles/497069/
The article mentions some other interesting things:
PL/v8 lets you...create expression indexes on specific JSON elements and save them, giving you stored search indexes much like CouchDB's "views".
It doesn't elaborate on that and I don't really know what it's referring to.
There's a comment attributed as "jberkus" that says:
We discussed having a binary JSON type as well, but without a protocol to transmit binary values (BSON isn't at all a standard, and has some serious glitches), there didn't seem to be any point.
If you're interested in working on binary JSON support for PostgreSQL, we'd be interested in having you help out ...
I don't know if it would be useful here, but I came across this: pg-to-json-serializer. It mentions functionality for:
parsing JSON strings and filling postgreSQL records/arrays from it
I don't know if it would offer any performance benefit over what you've been doing so far though, and I don't really even understand their examples.
Just thought it was worth mentioning.
I have a list with 10,000 entries.
For example:
var myList = {};
myList['hashjh5j4h5j4h5j4'] = true;
myList['hashs54s5d4s5d4sd'] = true;
myList['hash5as465d45ad4d'] = true;
// ...
I don't use an array (0,1,2,3) because this way I can check very quickly whether a hash exists or not:
if (typeof myList['hashjh5j4h5j4h5j4'] == 'undefined')
{
alert('it is new');
}
else
{
alert('old stuff');
}
But I am not sure: is this a good solution? Could it be a problem to handle an object with 10,000 entries?
EDIT:
I'm trying to build an RSS feed reader which shows only new items. So I calculate a hash from the link (every news item has a unique link) and store it in the object (MongoDB). BTW: 10,000 entries is not the normal case (but it is possible).
My advice:
Use as small of a hash as possible for the task at hand. If you are dealing with hundreds of hashable strings, compared to billions, then your hash length can be relatively small.
Store the hash as an integer, not a string, to avoid it taking more room than needed.
Don't store them as objects; just store them in a simple binary tree log2(keySize) deep.
Further thoughts:
Can you come at this with a hybrid approach? Use hashes for recent feeds less than a month old, and don't bother showing items more than a month old. Store the hash and date together, and clean out old hashes each day?
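A rough sketch of that hybrid idea (the property names and the 30-day cutoff are illustrative):
// Map of hash -> timestamp of when the item was first seen
var seen = {};
var THIRTY_DAYS = 30 * 24 * 60 * 60 * 1000;

function isNew(hash) {
    return !(hash in seen);
}

function markSeen(hash) {
    seen[hash] = Date.now();
}

// Run once a day: drop hashes older than a month
function pruneOld() {
    var cutoff = Date.now() - THIRTY_DAYS;
    for (var h in seen) {
        if (seen[h] < cutoff) {
            delete seen[h];
        }
    }
}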
You can use the in operator:
if ('hashjh5j4h5j4h5j4' in myList) { .. }
However, this will also return true for members that are in the objects prototype chain:
Object.prototype.foo = function () {};
if ("foo" in myList) { /* will be true */ };
To fix this, you could use hasOwnProperty instead:
if (myList.hasOwnProperty('hashjh5j4h5j4h5j4')) { .. }
Whilst you yourself may not have added methods to Object.prototype, you cannot guarantee that other 3rd party libraries you use haven't. Incidentally, extending Object.prototype is frowned upon, so you shouldn't really do it. Why? Because you shouldn't modify things you don't own.
10,000 is quite a lot. You may consider storing the hashes in a database and querying it using Ajax. It may take a bit longer to query one hash, but your page loads much faster.
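A minimal sketch of that approach (the endpoint and the response shape are hypothetical):
// Ask the server whether a given hash has been seen before
function checkHash(hash, callback) {
    $.getJSON('/api/hash-exists', { hash: hash }, function (response) {
        callback(response.exists); // assumes the server replies { "exists": true|false }
    });
}

// usage
checkHash('hashjh5j4h5j4h5j4', function (exists) {
    alert(exists ? 'old stuff' : 'it is new');
});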
It is not a problem in modern browsers on modern computers in any way.
10k entries that take up 50 bytes each would still take up less than 500KB of RAM.
As long as the JS is served gzipped, bandwidth is no problem - but do try to serve the data as late as possible so it doesn't block perceived page-load performance.
All in all, unless you wish to cater to cellphones then your solution is fine.
I'm writing a jQuery plugin that works on a piece of JSON data object.
This data needs to be calculated by the plugin only once, so I want to calculate it on the first call to the plugin and store it to be used in every subsequent call.
My question is whether there's a standard and accepted methodology for storing data used by jQuery plugins.
Assuming my plugin is:
jQuery.fn.myPlugin = function(){...}
I was thinking of storing it's calculated data in:
jQuery.myPlugin.data = {...}
Is this the acceptable way of going about it?
I think storing it there is acceptable (or jQuery.fn.myPlugin.data, for that matter) - or instead use your own ID in $.cache, which jQuery uses for storage; since it uses integer IDs for jQuery events and data, you won't have any conflict. For example:
$.cache["myPlugin"] = myData;
//and to get:
var dataToUse = $.cache["myPlugin"];
The main reason I'd take this route is it eliminates the potential jQuery.something naming conflicts that could arise with future versions.
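A sketch of computing the data only on the first call and reusing it afterwards (untested; the calculation is a placeholder for whatever the plugin actually derives):
jQuery.fn.myPlugin = function () {
    // Compute the shared data once, on the first call, then reuse it
    if (!$.cache["myPlugin"]) {
        $.cache["myPlugin"] = expensiveCalculation(); // placeholder for the real work
    }
    var data = $.cache["myPlugin"];

    return this.each(function () {
        // ...use `data` on each matched element...
    });
};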