It seems Django, or the SQLite database, is storing datetimes with microsecond precision. However, when passing a time to JavaScript, the Date object only supports milliseconds:
var stringFromDjango = "2015-08-01 01:24:58.520124+10:00";
var time = new Date(stringFromDjango);
$('#time_field').val(time.toISOString()); //"2015-07-31T15:24:58.520Z"
Note that the 58.520124 has become 58.520.
This becomes an issue when, for example, I want to create a queryset for all objects with a datetime less than or equal to the time of an existing object (i.e. MyModel.objects.filter(time__lte=javascript_datetime)). Because the microseconds are truncated, the object no longer appears in the list, as the times are no longer equal.
How can I work around this? Is there a datetime javascript object that supports microsecond accuracy? Can I truncate times in the database to milliseconds (I'm pretty much using auto_now_add everywhere) or ask that the query be performed with reduced accuracy?
How can I work around this?
TL;DR: Store less precision, either by:
Coaxing your DB platform to store only milliseconds and discard any additional precision (difficult on SQLite, I think)
Only ever inserting values with the precision you want (difficult to ensure you've covered all cases)
Is there a datetime javascript object that supports microsecond accuracy?
If you encode your dates as Strings or Numbers, you can keep as much precision as you'd like. There are other options (some discussed in this thread). Unless you actually need this accuracy, though, it's probably not the best approach.
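For instance, as long as both sides use the same format and UTC offset, plain string comparison preserves the full precision, since lexicographic order matches chronological order for ISO-8601 strings:

var a = "2015-07-31T15:24:58.520124Z";
var b = "2015-07-31T15:24:58.520999Z";
console.log(a <= b); // true; the microsecond digits take part in the comparison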
Can I truncate times in the database to milliseconds..
Yes, but because you're on SQLite it's a bit weird. SQLite doesn't really have dates; you're actually storing the values in either a text, real or integer field. These underlying storage classes dictate the precision and range of the values you can store. There's a decent write-up of the differences here.
You could, for example, change your underlying storage class to integer. This would truncate dates stored in that field to a precision of 1 second. When performing your queries from JS, you could likewise truncate your dates using the Date.prototype.setMilliseconds() function. Eg..
MyModel.objects.filter(time__lte=javascript_datetime.setMilliseconds(0))
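On the JS side that truncation would look something like this (note that setMilliseconds() mutates the Date in place and returns the epoch time in milliseconds, not the Date itself):

var when = new Date("2015-08-01 01:24:58.520124+10:00");
when.setMilliseconds(0); // zero out all sub-second precision
console.log(when.toISOString()); // "2015-07-31T15:24:58.000Z"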
A more feature-complete DB platform would handle it better. For example, in PostgreSQL you can specify the stored precision exactly. This adds a timestamp column with precision down to milliseconds (matching that of JavaScript):
alter table "my_table" add "my_timestamp" timestamp (3) with time zone
MySQL will let you do the same thing.
.. or ask that the query be performed with reduced accuracy?
Yeah but this is usually the wrong approach.
If the criteria you're filtering by is too precise then you're OK; you can truncate the value and then filter (like in the ..setMilliseconds() example above). But if the values in the DB you're checking against are too precise, you're going to have a Bad Time.
You could write a query such that the stored values are formatted or truncated to reduce their precision before being compared to your criteria, but that operation would need to be performed for every value stored. That could be millions of values. What's more, because you're generating the values dynamically, you've just circumvented any indexes created against the stored values.
Related
In MongoDB, I only need to make date range queries. But the data set is huge (9 million documents), and converting a string to a DateTime object (I use a Perl script) and then inserting it into MongoDB is very time-consuming. If I just store the dates as strings "YYYY-MM-DD", wouldn't the range query $gt: "2013-06-01" and $lt: "2013-08-31" still give me the same results as if they were of DateTime type? Are they the same in this scenario? If so, what would be the advantage of storing them as a DateTime object?
Thanks.
If you don't care about time-zone support in your application, then using strings for basic queries in MongoDB should work fine (but if it does matter, you'll want a real Date type).
However, if you later want to do date math or use the Aggregation Framework with your date field, it's necessary that the field is actually a Date type:
http://docs.mongodb.org/manual/reference/aggregation/#date-operators
For example, you could use the $dayOfWeek function on the Date typed field.
You could likely do some simple things like grouping by year using $substr (doc) in MongoDB, but the resulting code will not be as clear (nor likely perform as well).
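For instance, a minimal shell sketch (assuming a hypothetical events collection with a Date-typed created field) that counts documents per weekday:

// $dayOfWeek only works on real Date values, not on strings
db.events.aggregate([
  { $group: { _id: { $dayOfWeek: "$created" }, count: { $sum: 1 } } }
]);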
While it's not a huge difference, I'd generally recommend storing them as Date types if possible.
I see in the docs for the Perl driver that developers are warned against using DateTime because it is very slow, so if you use Perl regularly and the Aggregation Framework isn't a big issue, you may be better off storing dates as either numbers or strings and converting them as needed in Perl.
If space is an issue, remove unnecessary characters (such as the -) and store e.g. 20130613. As a BSON string that takes:
4 bytes for the length of the string
8 bytes for the UTF-8 encoded characters
1 byte for the trailing NULL character
That's 13 bytes in total. A DateTime value in BSON/MongoDB, on the other hand, requires 8 bytes (as would the result of Perl's time function).
(I'd strongly recommend you do a bit of performance testing to find out if the performance impact of using a Date type in MongoDB with Perl will impact your typical workflows.)
The advantage of DateTime is a few bytes less on disk: BSON stores a DateTime as a 64-bit integer (8 bytes), whereas "2013-08-31" is stored as a string, which takes 15 bytes (4 for the length, 10 for the characters, 1 for the trailing NULL).
ISO-8601 (http://www.w3.org/QA/Tips/iso-date) is designed so that lexicographic order matches chronological order, which is what makes sorting and range queries on such strings work.
In this case, I would always store as datetime.
edit: How time-consuming is this string-to-datetime conversion, exactly? Are you sure it is your bottleneck? I have a hard time believing the conversion takes as long as you claim.
I'm currently modeling a database using MongoDB in which users can transfer funds between accounts and buy products, the values of which are debited from their current balances. I'm working with a precision of two decimal places, both for product values and for users' balances.
The problem is that when I add or subtract a value with decimal places using the $inc operator, in my user document I get some precision errors, like this:
{
"balance": 31513.210000000003,
}
I'm using Node and Mongoose to manipulate my DB, and I know about the floating-point inaccuracies of the language, but I'd like to know if there's any way to overcome this issue in my MongoDB database and force it to always work with two decimal places, so that when I query for a user with a positive balance, values like 0.00000000003 won't be detected, as they should be 0.
Is there any way to control this in mongodb?
Instead of using a double, which uses floating-point math, you've got several other methods to choose from, each offering different pros and cons.
Integer
Represent the balance not in whole units but in fractional units, so you store 123.45 as 12345. If the math is only addition and subtraction then this is easy; all you need to ensure is that any time you display a value you insert the . at the appropriate place, and you can write a single function for that.
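A minimal sketch with Mongoose (the User model and balanceCents field are hypothetical):

// Keep the balance in integer cents so $inc never introduces float drift
var priceCents = 1234; // 12.34 represented as 1234
User.updateOne({ _id: userId }, { $inc: { balanceCents: -priceCents } });

// Insert the "." only when displaying the value
function formatCents(cents) {
  return (cents / 100).toFixed(2); // 1234 -> "12.34"
}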
Object
Represent it as two integers: one for the whole part and one for the fractional part (123 and 45). This makes the math harder, but potentially makes the output easier to manage, and harder to accidentally output the wrong value.
String
With a string you'll have to write all your own math, but the output can be simple as the stored value would be '123.45'.
So let's say I have a sensor that's giving me a number, let's say the local temperature, or whatever really, every 1/100th of a second.
So in a second I've filled up an array with a hundred numbers.
What I want to do is, over time, create a statistical model, most likely a bell curve, of this streaming data so I can get the population standard deviation of this data.
Now on a computer with a lot of storage this won't be an issue, but on something small like a Raspberry Pi, or any microprocessor, storing all the numbers generated from multiple sensors over a period of months becomes very unrealistic.
When I looked at the math of getting the standard deviation, I thought of simply storing a few numbers:
The total running sum of all the numbers so far, the count of numbers, and lastly a running sum of (each number - the current mean)^2.
Using this, whenever I got a new number I would simply add one to the count, add the number to the running sum and get the new mean, then add (new number - new mean)^2 to the second running sum, divide that by the count and take the square root to get the new standard deviation.
There are a few problems with this approach, however:
It would take 476 years to overflow the running sum, assuming the values are temperatures averaging 60 degrees Fahrenheit and streaming in at 100 Hz.
The same confidence cannot be held for the running sum of (number - mean)^2, since it is a sum of squared numbers.
Most importantly, this approach is highly inaccurate, since a different mean is used for each number, which undermines the mathematical meaning of a standard deviation, especially a population standard deviation.
If you believe a population standard deviation is impossible to achieve, then how should I go about a sample standard deviation? Taking every nth number will still result in the same problems.
I also don't want to limit my data set to a time interval (i.e. a model built from only the last 24 hours of sensor data), since I want my statistical model to be representative of the sensor data over a long period of time, namely a year; and if I have to wait a year to do testing and debugging, or even to get a usable model, I won't be having fun.
Is there any sort of mathematical work around to get a population, or at least a sample standard deviation of an ever increasing set of numbers, without actually storing that set since that would be impossible, and still being able to accurately detect when something is multiple standard deviations away?
The closest answer I've seen is wikipedia.org/wiki/Algorithms_for_calculating_variance#Online_algorithm; however, I have no idea what it is saying or whether it requires storing the set of numbers.
Thank you!
The link shows code, and it is clear that you need to store only three variables: the number of samples so far, the current mean, and the running sum of squared differences from the mean.
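For reference, a minimal JavaScript sketch of that online (Welford) algorithm, using constant memory no matter how many samples stream in:

function createAccumulator() {
  var count = 0, mean = 0, m2 = 0; // m2 = running sum of squared differences
  return {
    add: function (x) {
      count += 1;
      var delta = x - mean;
      mean += delta / count;
      m2 += delta * (x - mean); // uses the updated mean, which keeps it numerically stable
    },
    populationStdDev: function () {
      return count > 0 ? Math.sqrt(m2 / count) : NaN;
    },
    sampleStdDev: function () {
      return count > 1 ? Math.sqrt(m2 / (count - 1)) : NaN;
    }
  };
}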
We have an existing Silverlight app which runs in the browser and on hardware.
We want to rewrite this app using AngularJS and HTML5.
One of the key requirements for the new system is support for internationalization and localization; the target countries are the USA, Brazil and Italy for now.
I'm new to this area and have a lot of basic questions.
Does the existing database need to be redesigned to support this? I mean, do we identify the columns (product_name/customer_name etc.) that need to hold locale-specific data, store data for each locale, and modify the sprocs and Web API to accept a language parameter and return content based on that?
I believe we need to use NVARCHAR for such columns.
What will happen to currency and datetime columns in the DB? Say there is a quantity column; what should the data type of that column be? If the current locale is Portuguese, will the quantity be stored in Portuguese number format?
What are the best practices for storing and retrieving currency columns based on locale?
What are the best practices for storing and retrieving date columns based on locale?
How do we handle string and numeric checks in the Web API methods?
How do we do comparisons and checks in JavaScript for strings, numbers and datetimes?
Please share links to some good pointers which could help.
So, in short: right from JavaScript, through the .NET Web API, to the database (SQL), how should we take care of locale-dependent logic and fields?
thanks.
A lot of questions, let's see if I can answer those.
If your existing application is properly internationalized, I don't think there is any need to modify the database. Just make sure it is able to handle international characters (NCHAR, NVARCHAR, NTEXT in MS SQL, valid character encodings in others).
As for DB design, it is good to keep things locale-independent as long as you can. For instance, it is better to store keys in the database and resolve them at runtime. However, if your data is dynamic (i.e. you have product names and descriptions that change often), the only way to go is to have a translation table and look the data up using the valid locale. It's quite complex in the relational world (i.e. joins), but it can be done.
2, 3. All numeric columns should be kept locale-independent and formatted on the UI side. The more problematic ones are prices and sales orders: you would need an additional column to store the currency code (e.g. 12.34 | USD). On the UI side you would pass the code to Angular's currency filter. The only gotcha here is that Angular does not support easy locale context switching, so you would need to use a hacky library like Angular Dynamic Locale to load the formats for you.
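For example (the field names are illustrative):

// Stored locale-independently, with the currency code alongside the amount
var order = { amount: 12.34, currency: 'USD' };

// In an AngularJS view, format only at display time with the currency filter:
//   {{ order.amount | currency }}        (symbol taken from the active locale)
//   {{ order.amount | currency:'US$' }}  (explicit symbol)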
4. Similar: keep it locale-independent. The DB's built-in types should handle that automatically and give you a nice DateTime/DateTimeOffset (in the .NET world) back. The only gotcha is the time zone; it may make sense to use the MS SQL DATETIMEOFFSET type, as the others do not store the time zone.
There is an alternative way to store dates and times in the database: you may decide to store them as the number of milliseconds since January 1, 1970 UTC, in a BIGINT column. Especially if you are going to read this directly into JS, you will easily be able to re-create a JS Date object (should you need it for calculations or something) in a valid time zone, and it works the other way round as well. To format the date, all you have to do is pass this number (not a Date, AFAIR) to Angular's date filter with UTC as a parameter.
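A quick sketch of that round trip (the value is illustrative):

var storedMillis = 1438356298000;  // BIGINT value read from the DB
var when = new Date(storedMillis); // JS Dates are UTC-based internally
var backToDb = when.getTime();     // the same number, ready to write back

// In an AngularJS template, format the raw number with the date filter:
//   {{ storedMillis | date:'yyyy-MM-dd HH:mm:ss':'UTC' }}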
I don't think I understand what you're asking exactly. I guess the question is about validation of user input rather than about the API. Beware of using regular expressions, because JavaScript doesn't handle Unicode well (at least in this area). You'd need to ask a more precise question.
For Numbers (i.e. typeof o == 'number') it is straightforward (as in num1 === num2). For Date objects, compare the underlying timestamps (date1.getTime() === date2.getTime()), since === on two Date objects compares object identity rather than value.
As far as strings are concerned... well, str1 === str2 will give you a valid answer if you want exact equality. If you want to sort them, modern web browsers (Chrome 14+, Firefox 29+, IE11+) implement the EcmaScript 402 Internationalization API, so you can do something like str1.localeCompare(str2, locale); see this article.
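For example:

var names = ['Ärger', 'zebra', 'apple'];
names.sort(function (a, b) { return a.localeCompare(b, 'de'); });
// -> ['apple', 'Ärger', 'zebra'] (German collation sorts 'Ä' together with 'A')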
The real problem occurs when you want to compare two strings case-insensitively and accent-insensitively for equality (as opposed to ordering, as in the case of sorting). Basically, there is no way to do it reliably (and this is true even in "big" programming languages like Java or C#).
How do you efficiently store very large and very small numbers, from say 10^-100 to 10^100, so that you can use them to calculate values in a programming language like JavaScript?
JavaScript displays 10^100 as 1e+100; is there a way to do that in the database? The numbers would not often be that large, but I would like to do calculations with data such as 10^-34 * 2^16 or whatever, so the database should (I think) be storing these as numbers...
How does this work? How do you store numbers of this scale such that you can run computations with them?
By "the database", I'm thinking in general. I am messing around with MongoDB and Neo4j currently.
Databases themselves don't support numbers of arbitrary size in a native numeric format. The general upper limit on numeric types is usually 8 bytes, which isn't anywhere near enough for a googol.
You'll have to store the number either as a string (least efficient, easiest to work with, can be as precise as needed), as a byte array of arbitrary length (more efficient, harder to work with, still arbitrary precision), or in scientific notation (most efficient, harder to work with, and limited precision).
The first two, unfortunately, do eliminate the possibility of doing any server-side computation, since there wouldn't be a native numeric type that could support the range of valid values. All of the computation would have to be done client-side using a suitable numeric type.
If I were you, I'd separate the numerical value from the exponent. I personally don't have experience with MongoDB or Neo4j, but in MySQL (I'm sure they have similar types) I'd create a table with a VARCHAR (text) column with whatever precision you'd like in your program (or however many significant digits), and another VARCHAR column of length 3 (for a max exponent of 999). You can tinker with the values as you see fit, but that's all I can think of. If you want more flexibly sized values, I'd store the numbers on the server's file system using PHP rather than in a database.
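A minimal client-side sketch of that value-plus-exponent idea (the names are illustrative; the math has to happen in the application, since the DB only sees two small fields):

// Keep numbers as { m: mantissa, e: exponent } pairs, i.e. scientific notation
function mul(a, b) {
  var m = a.m * b.m, e = a.e + b.e;
  while (Math.abs(m) >= 10) { m /= 10; e += 1; }          // renormalize to 1 <= |m| < 10
  while (m !== 0 && Math.abs(m) < 1) { m *= 10; e -= 1; }
  return { m: m, e: e };
}

console.log(mul({ m: 1, e: -34 }, { m: 6.5536, e: 4 })); // 10^-34 * 2^16 => { m: 6.5536, e: -30 }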
You could use the double type.
The MySQL DOUBLE[(M,D)]:
A normal-size (double-precision) floating-point number. Permissible values are -1.7976931348623157E+308 to -2.2250738585072014E-308, 0, and 2.2250738585072014E-308 to 1.7976931348623157E+308. These are the theoretical limits, based on the IEEE standard. The actual range might be slightly smaller depending on your hardware or operating system.