On my web app, the user can request different data lines. Each data line has a unique "statusID", say "18133". When the user requests the data, it is loaded either from the server or from IndexedDB (that's the part I'm trying to figure out). To make this as fast as possible, I want the index to be the timestamp of the data, since I will request ranges that are smaller than what is actually stored in IndexedDB. However, I'm struggling to figure out how to create the stores and store the data properly. I tried to dynamically create a store every time data with a new ID is requested, but creating stores is only possible in "onupgradeneeded". I also thought about storing everything in the same store, but I fear the performance would suffer. I don't really know how to approach this.
What I do know: if you index a value, the data is sorted on it, which is exactly what I want. I don't know if the following is possible, but it would solve my issue too: store everything in the same store, indexed both by "statusID" and by "timestamp". That way it would be fast too, I guess.
Note that I am talking about many, many data points, possibly in the millions.
You can index by multiple values with a compound index, allowing you to get everything for a statusID while restricting the timestamp to a range. So I'd go with the single-datastore solution. Performance should not be an issue.
This earlier post may be helpful: Javascript: Searching indexeddb using multiple indexes
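Here's a minimal sketch of that setup, assuming records shaped like { statusID, timestamp, value }; the database, store, and index names are made up for illustration:

var request = indexedDB.open("myDatabase", 1);

request.onupgradeneeded = function (event) {
    var db = event.target.result;
    // One store for all data lines, with a compound index so entries are
    // sorted by statusID first and timestamp second.
    var store = db.createObjectStore("dataLines", { autoIncrement: true });
    store.createIndex("statusAndTime", ["statusID", "timestamp"]);
};

request.onsuccess = function (event) {
    var db = event.target.result;
    var index = db.transaction("dataLines")
                  .objectStore("dataLines")
                  .index("statusAndTime");

    // All points for statusID "18133" between two timestamps:
    var tStart = 0, tEnd = Date.now();
    var range = IDBKeyRange.bound(["18133", tStart], ["18133", tEnd]);
    index.openCursor(range).onsuccess = function (e) {
        var cursor = e.target.result;
        if (cursor) {
            // cursor.value is one datapoint, delivered in timestamp order
            cursor.continue();
        }
    };
};

Because array keys compare element-wise, the bounded range only ever touches rows for that one statusID.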
I would like to persist a list of similarly-typed data on the client using localStorage. Each piece of data is a high-score and other information about a game session. Accessing by key isn't necessary here; I'm more likely going to iterate the list and display either the full list or sections of it.
It looks like there are two choices for me:
Stringify the whole array somehow and store it under a single key (perhaps "gamedata" or something similar).
Store each piece of data under a separate key (perhaps using a timestamp).
Here are the downsides for each choice:
Adding one item to the list (the most common operation) requires reading the whole list, appending the item, and writing everything back to localStorage. I'd expect about a hundred items in the list, but there may be outliers who play the game more often, and I don't want it to lag for them. The data might also get corrupted if the user closes the page or shuts down their computer while it is being written, losing all their game data. And it doesn't really feel right to read the whole thing only to write most of it back untouched.
I have to iterate over all of localStorage to find my data, which doesn't feel like the right use of an associative array. This method would probably also be less efficient in terms of storage space, because of all the stored keys.
Which method is preferred for storing my data?
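To make the trade-off concrete, the two options would look roughly like this (newScore stands in for one high-score entry, and the key names are just examples):

// Option 1: one key holding the whole stringified array.
var scores = JSON.parse(localStorage.getItem("gamedata") || "[]");
scores.push(newScore);
localStorage.setItem("gamedata", JSON.stringify(scores));

// Option 2: one key per item, keyed by timestamp.
localStorage.setItem("score-" + Date.now(), JSON.stringify(newScore));

// Reading option 2 back means scanning every key:
var all = [];
for (var i = 0; i < localStorage.length; i++) {
    var key = localStorage.key(i);
    if (key.indexOf("score-") === 0) {
        all.push(JSON.parse(localStorage.getItem(key)));
    }
}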
My scenario is this: I have a very small array in my js file. When the page loads, a function loops through the array and generates an li element for each item, displaying its name and price in the li. The array is constructed like this:
var gameList = [
    { name: "", value: 0.00 },
];
Secondly, I have a simple form on the page that allows me to add new items to the array, and using localStorage it's possible for me to keep a dynamically updated array. I push new items into the array (gameList), then at the end of the session I save it using localStorage.
localStorage.setItem("updatedGameList", JSON.stringify(gameList));
I have a couple of lines at the start of my code that set my original 'gameList' array to the locally stored, updated game list.
var retrievedData = localStorage.getItem("updatedGameList");
// getItem returns null on a first visit, so guard before parsing
if (retrievedData) {
    gameList = JSON.parse(retrievedData);
}
So this is fine for now, but the growing array - which I want to keep and maintain - is only available in this browser, on this machine.
So, my question is: can I send this locally stored data somewhere? Maybe to my personal domain? (Which is where I will host the app when it's finished.) That way I could then reference it properly in my js file so that the data is always available. Maybe the array could have its own js file?
I realise that this may not be the best way to be handling what is essentially a database. But I'm only part way through an online course and I'm using the tools that I have to make this work.
And lastly, in terms of maintenance of the array, is there any way to send it back to Sublime in the form of a .js file? I know this could be a crazy question. The updated array will become pretty big, maybe 200 items eventually, and it would be much easier to maintain from within Sublime.
Thanks for your time, and apologies if part of this request is ridiculous!! :)
I have just been reading about AJAX, and thought maybe there's a way to send the updated array as a JSON file to somewhere(!) on my website, and then request that same file at the start of each new session, so I'm always working with, and saving, the latest updated array.
Thanks for reading, and hopefully you have some answers! :)
Although not quite what I was looking for - essentially some way of automatically getting the new array, sending it somewhere more secure than local storage, then referencing the new array to give me the most up-to-date starting point each time (and all with just JavaScript) - the 'dirty' way suggested below turned out to be sufficient for now, until I start using databases.
From Kirupa, over at the forums:
Not a ridiculous question at all! You can send your own data anywhere you want, but it will require some level of server-related code. The easiest way to send data back and forth is through JSON, and you can convert your array into a JSON format easily using something like the following:
var jsonData = JSON.stringify(myArray);
From here, you can send this data to a database, to another web site, or to your e-mail server. If you want something really quick and dirty, you can literally just copy the contents of your JSON-ized array using the Chrome Dev Tools, save it on disk as a .js file, and reference it again in your app. That is a manual way of doing something that you don't really care about automating.
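As a rough sketch of the send-it-to-your-server route (the endpoint URL here is made up, and you'd need a matching server-side script to accept and store the data):

// Save the array at the end of a session:
fetch("https://yourdomain.com/gamelist", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(gameList)
});

// Load it back at the start of the next session:
fetch("https://yourdomain.com/gamelist")
    .then(function (res) { return res.json(); })
    .then(function (data) { gameList = data; });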
The best solution is to store this in a database. They've gotten easier to deal with as well. Firebase is my go-to for things like this, and this video might give you some ideas: https://www.youtube.com/watch?v=xAsvwy1-oxE
I am running MySQL 5.6. I have a number of "name" columns in the database, spread across various tables. These get imported every year by each customer as a CSV data dump. These names are displayed in a number of places throughout this website. The issue is, the names have almost no formatting (and until now, no sanitization was applied on import):
Phil Eaton, PHIL EATON, Phil EATON, etc.
Thus, the website sometimes looks like a mess when these names are involved. I can think of a number of ways to address this, but none are particularly appealing.
First, I can have a filter in JavaScript. However, as I said, these names appear in many places throughout this (large) site, and I may end up missing a page. The names are not already wrapped in nice "name"-classed divs/spans, etc.
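For illustration, such a JavaScript filter could be as simple as the title-casing helper below (a sketch only; real names like "McDonald" or "van der Berg" would need more care):

function formatName(name) {
    // Lowercase everything, then capitalize the first letter of each word.
    return name.toLowerCase().replace(/\b\w/g, function (c) {
        return c.toUpperCase();
    });
}

formatName("PHIL EATON"); // "Phil Eaton"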
Second, I could filter in PHP (the backend). This seems about as effective as doing it in JavaScript. I could do it in the API, but there is still no central method for pulling names from the database, so I could miss an API call anyway.
Finally, the obvious "best" way is to sanitize the existing data in place for each name column, and at the same time start sanitizing all names that get imported whenever we add a customer. The issue with the first part is that there are hundreds of millions of name rows in the database. Updating them could take a long time and be disruptive to the clients' daily routines.
So, the most appealing short-term fix is to invoke a function every time a name column is selected. That way I could "decorate" every name column with a formatting function so the names appear uniform on the frontend. Ultimately, my question is: is it possible to invoke a specific function in SQL to format each row every time a specific column is selected? In other words, can I call a stored procedure every time a column is selected? (The point being, I want to keep the formatting in SQL to avoid propagating it everywhere names are used.)
In MySQL you can't trigger something on SELECT, but I have an idea (it's only an idea; I don't have time to try it right now, sorry).
You could probably create a VIEW on this table, with the same structure but with a stored function applied to the name fields, and select from this view in your PHP.
But it has two drawbacks:
You have to modify all the SELECT statements in your PHP code.
The server will call that function on every query. Maybe you can store the formatted values and check for them (cache them).
On the other hand, I agree with HLGEM: I also suggest formatting the data on import, because it's very bad practice to import data you don't check into a DB (SQL injection?). The batch task is also a good idea to clean up the mess.
I presume names are queried frequently, so invoking a sanitization function every time they are selected could severely slow down your system. Further, you can't get this with a simple setting; you would have to change every bit of SQL code that runs against names.
Personally, how I would handle it is to fix the imports so they insert a sanitized version of new names. It is a bad idea to put any data directly into a database without some sort of staging and cleanup.
Then I would tackle the old names and fix them in batches in a nightly run, scheduled when the fewest people are using the system. You would have to do some testing on dev to determine how big a batch you can run without interfering with other things the database is doing. The larger the batch, the sooner you get through all the names; even though this will take time, it is the surest method of getting the data cleaned up, and over time the data will appear better to the users. If the design of your database allows you to identify the more active names (such as an is_active flag for a customer, or an order in the last year), I would prioritize the update by that. Alternatively, you could clean up one client at a time, starting with whichever one noticed the problem and is driving this change.
The other answers give some possible solutions, but the short answer to the specific option you are asking about is: no. There is no such thing as a "SELECT statement trigger", let alone one for a single column. Triggers come close to this kind of expectation, but only for INSERT, UPDATE, and DELETE operations.
The project requirements are odd for this one, but I'm looking to get some insight...
I have a CSV file with about 12,000 rows of data, approximately 12-15 columns. I'm converting that to a JSON array and loading it via JSONP (it has to run client-side). It takes many seconds to do any kind of querying on the data set to return a smaller, filtered data set. I'm currently using JLINQ to do the filtering, but I'm essentially just looping through the array and returning a smaller set based on conditions.
Would WebSQL or IndexedDB allow me to do this filtering significantly faster? Do you know of any tutorials/articles that tackle this particular type of issue?
http://square.github.com/crossfilter/ (no longer maintained, see https://github.com/crossfilter/crossfilter for a newer fork.)
Crossfilter is a JavaScript library for exploring large multivariate datasets in the browser. Crossfilter supports extremely fast (<30ms) interaction with coordinated views, even with datasets containing a million or more records...
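A minimal Crossfilter setup over the 12,000-row array might look like this (the field names "price" and "quantity" are placeholders for your CSV columns):

var cf = crossfilter(rows);  // rows: your parsed 12,000-element array
var byPrice = cf.dimension(function (d) { return d.price; });
var byQuantity = cf.dimension(function (d) { return d.quantity; });

byPrice.filterRange([10, 50]);                      // keep rows with 10 <= price < 50
byQuantity.filter(function (q) { return q > 2; });  // arbitrary predicate

var filtered = byPrice.top(Infinity);  // every row passing all active filters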
This reminds me of an article John Resig wrote about dictionary lookups (a real dictionary, not a programming construct).
http://ejohn.org/blog/dictionary-lookups-in-javascript/
He starts with server-side implementations, and then works toward a client-side solution. It should give you some ideas for ways to improve what you are doing right now:
Caching
Local Storage
Memory Considerations
If you need to load an entire data object into memory before applying some transform to it, I would leave IndexedDB and WebSQL out of the mix, as they typically add complexity and reduce the performance of apps.
For this type of filtering, a library like Crossfilter will go a long way.
Where IndexedDB and WebSQL can come into play in terms of filtering is when you don't need to load, or don't want to load, an entire dataset into memory. These databases are best utilized for their ability to index rows (WebSQL) and attributes (IndexedDB).
With in-browser databases, you can stream data into a database one record at a time and then cursor through it, one record at a time. The benefit for filtering is that this means you can leave your data on "disk" (a .leveldb in Chrome, a .sqlite database in Firefox) and filter out unnecessary records either as a pre-filtering step or as the filter itself.
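A bare-bones sketch of that cursor-based pre-filter with IndexedDB (db is an already-opened database, and the store name and predicate are made up):

var matches = [];
db.transaction("rows").objectStore("rows").openCursor().onsuccess = function (e) {
    var cursor = e.target.result;
    if (cursor) {
        if (cursor.value.price > 100) {  // keep only records passing the predicate
            matches.push(cursor.value);
        }
        cursor.continue();  // the full dataset never has to sit in memory at once
    }
};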
I'm designing a MongoDB database that works with a script that periodically polls a resource and stores the response in the database. Right now my database has one collection with four fields: id, name, timestamp, and data.
I need to be able to find out which names had changes in the data field between script runs, and which did not.
In pseudocode,
if (data[name][timestamp] == data[name][timestamp + 1]) {
    // data has not changed
    store data in collection 1
} else {
    // data has changed between script runs for this name
    store data in collection 2
}
Is there a query that can do this without iterating over each item in the collection and running JavaScript on it? There are millions of documents, so that would be pretty slow.
Should I create a new collection, named by timestamp, every time the script runs? Would that make it faster or more organized? Is there a better schema I could use?
The script runs once a day so I won't run into a namespace limitation any time soon.
OK, this is a neat question, because the short answer is basically: you will have to iterate and run JavaScript over each item.
The part where this gets "neat" is that it isn't really different from what an SQL solution would have to do. You're basically joining a table to itself on name = name and timestamp = timestamp + 1. Even if a relational DB can handle such a beast, it's definitely not going to be fast with millions of entries.
So the truth is, you're doing this the right way. Here are the extra details I would use to make this cleaner.
Ensure that you have an index on Name/Timestamp.
Run a db.mycollection.find().forEach() across the data set.
For each entry you're going to: a) perform the comparison; b) save it appropriately; c) update a flag indicating that this record has been processed (see the sketch below).
On future loads you should be able to add a query to your find: db.mycollection.find({flag: {$exists: false}}).forEach()
Use db.eval() to help with speed.
The reason for the "Name/Timestamp" index is that you're going to be looking up each "successor" by "Name/Timestamp", so you want to be quick here.
The reason for the "processed" flag is that you should never have to re-run the same item. If given timestamp 'n' you find 'n+1', then that's the only 'n+1' you're going to have.
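Putting those steps together, a rough mongo-shell sketch might look like this ("unchanged" and "changed" are stand-ins for your two target collections, and the comparison assumes data holds scalar values):

// Step 1: index for fast successor lookups by name/timestamp.
db.mycollection.ensureIndex({ name: 1, timestamp: 1 });

// Steps 2-4: walk unprocessed records, compare each to its successor,
// save to the right collection, and flag the record as done.
db.mycollection.find({ flag: { $exists: false } }).forEach(function (doc) {
    var next = db.mycollection.findOne({ name: doc.name, timestamp: doc.timestamp + 1 });
    if (next !== null) {
        if (next.data === doc.data) {  // use a deep comparison if data is a document
            db.unchanged.insert(doc);  // data did not change between runs
        } else {
            db.changed.insert(doc);    // data changed between runs
        }
        db.mycollection.update({ _id: doc._id }, { $set: { flag: true } });
    }
});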
Honestly, if you're only running this once a day, it's quite likely that the speed will be just fine, especially if you're only running it on new records. Just assume that it's going to take several minutes.