is there any workaround for broken v8 date parser? - javascript

V8 Date parser is broken:
> new Date('asd qw 101')
Sat Jan 01 101 00:00:00 GMT+0100 (CET)
I can use fragile regular expression like this:
\d{1,2} (jan|feb|mar|may|jun|jul|aug|sep|oct|nov|dec) \d{1,4}
but it is too fragile. I cannot rely on new Date (issue in V8) and also moment cant help me because moment is getting rid off date detection (github issue-thread).
is there any workaround for broken v8 date parser?
To be clear. We have Gecko and V8, both have Date. V8 has broken Date, Gecko has working one. I need the Date from in Gecko (Firefox).
Update: It’s definitely broken parser https://code.google.com/p/v8/issues/detail?id=2602
nope, Status: WorkingAsIntended

Date objects are based on a time value that is the number of milliseconds since 1 January, 1970 UTC and have the following constructors
new Date();
new Date(value);
new Date(dateString);
new Date(year, month[, day[, hour[, minutes[, seconds[, milliseconds]]]]]);
From the docs,
dateString in new Date(dateString) is a string value representing a date. The string should be in a
format recognized by the Date.parse() method (IETF-compliant RFC 2822
timestamps and also a version of ISO8601).
Now looking at the v8 sourcecode in date.js:
function DateConstructor(year, month, date, hours, minutes, seconds, ms) {
if (!%_IsConstructCall()) {
// ECMA 262 - 15.9.2
return (new $Date()).toString();
}
// ECMA 262 - 15.9.3
var argc = %_ArgumentsLength();
var value;
if (argc == 0) {
value = %DateCurrentTime();
SET_UTC_DATE_VALUE(this, value);
} else if (argc == 1) {
if (IS_NUMBER(year)) {
value = year;
} else if (IS_STRING(year)) {
// Probe the Date cache. If we already have a time value for the
// given time, we re-use that instead of parsing the string again.
var cache = Date_cache;
if (cache.string === year) {
value = cache.time;
} else {
value = DateParse(year); <- DOES NOT RETURN NaN
if (!NUMBER_IS_NAN(value)) {
cache.time = value;
cache.string = year;
}
}
}
...
it looks like DateParse() does not return a NaN for for a string like 'asd qw 101' and hence the error. You can cross-check the same with Date.parse('asd qw 101') in both Chrome(v8) [which returns -58979943000000] and Gecko (Firefox) [which returns a NaN]. Sat Jan 01 101 00:00:00 comes when you seed new Date() with a timestamp of -58979943000000(in both browsers)
is there any workaround for broken v8 date parser?
I wouldnt say V8 date parser is broken. It just tries to satisfy a string against RFC 2822 standard in the best possible way but so does gecko and both break gives different results in certain cases.
Try new Date('Sun Ma 10 2015') in both Chrome(V8) and Firefox(Gecko) for another such anomaly.
Here chrome cannot decide weather 'Ma' stands for 'March' or 'May' and gives an Invalid Date while Firefox doesnt.
Workaround:
You can create your own wrapper around Date() to filter those strings that V8's own parser cannot. However, subclassing built-ins in ECMA-5 is not feasible. In ECMA-6, it will be possible to subclass built-in constructors (Array, Date, and Error) - reference
However you can use a more robust regular expression to validate strings against RFC 2822/ISO 8601
^(?:(?:31(\/|-|\. |\s)(?:0?[13578]|1[02]|(?:Jan|Mar|May|Jul|Aug|Oct|Dec)))\1|(?:(?:29|30)(\/|-|\.|\s)(?:0?[1,3-9]|1[0-2]|(?:Jan|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec))\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})$|^(?:29(\/|-|\.|\s)(?:0?2|(?:Feb))\3(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$|^(?:0?[1-9]|1\d|2[0-8])(\/|-|\.|\s)(?:(?:0?[1-9]|(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep))|(?:1[0-2]|(?:Oct|Nov|Dec)))\4(?:(?:1[6-9]|[2-9]\d)?\d{2})$
Image generated from debuggex
So, seems like v8 aint broken, it just works differently.
Hope it helps!

You seem to be asking for a way to parse a string that might be in any particular format and determine what data is represented. There are many reasons why this is a bad idea in general.
You say moment.js is "getting rid of date detection", but actually it never had this feature in the first place. People just made the assumption that it could do that, and in some cases it worked, and in many cases it didn't.
Here's an example that illustrates the problem.
var s = "01.02.03";
Is that a date? Maybe. Maybe not. It could be a section heading in a document. Even if we said it was a date, what date is it? It could be interpreted as any of the following:
January 2nd, 2003
January 2nd, 0003
February 1st, 2003
February 1st, 0003
February 3rd, 2001
February 3rd, 0001
The only way to disambiguate would be with knowledge of the current culture date settings. Javascript's Date object does just that - which means you will get a different value depending on the settings of the machine where the code is running. However, moment.js is about stability across all environments. Cultural settings are explicit, via moment's own locale functionality. Relying on the browser's culture settings leads to errors in interpretation.
The best thing to do is to be explicit about the format you are working with. Don't allow random garbage input. Expect your input in a particular format, and use a regex to validate that format ahead of time, rather then just trying to construct a Date and seeing if it's valid after the fact.
If you can't do that, you'll have to find additional context to help decide. For example, if you are scraping some random bits of the web from a back-end process and you want to extract a date from the text, you'd have to have some knowledge about the language and locale of each particular web page. You could guess, but you'd likely be wrong a fair amount of the time.
See also: Garbage in, garbage out

ES5 15.9.4.2 Date.parse: /.../ If the String does not conform to
that format the function may fall back to any implementation-specific
heuristics or implementation-specific date formats. Unrecognizable
Strings or dates containing illegal element values in the format
String shall cause Date.parse to return NaN.
So that's all right and according to the citation above result of v8 date parser:
new Date('asd qw 101') : Sat Jan 01 101 00:00:00 GMT+0100
(CET)
new Date('asd qw') : Invalid Date

Related

Why is 2/8888/2016 a valid date in IE and Firefox?

If you take the following:
var s = "2/8888/2016";
var d = new Date(s);
alert(d);
In Chrome, you'll get:
Invalid Date
But in IE and Firefox, you'll get:
Fri Jun 01 2040 00:00:00 GMT-0500 (Central Daylight Time)
It appears to be just adding 8888 days to Feb 01. Instead, I would expect the date to be considered invalid. Is there a way I can make FireFox and IE think this date string is invalid?
Short answer:
It's a misbehaviuor of the browsers you're mentioning.
You have to check date is in correct format on your own. But it's quite trivial, I suggest this approach:
Split the date in year y, month m, day d and create the Date object:
var date = new Date( y, m - 1, d ); // note that month is 0 based
Then compare the original values with the logical values obtained using the Date methods:
var isValid = date.getDate() == d &&
date.getMonth() == m-1 &&
date.getFullYear() == y;
Before doing all of this you may want to check if the date string is valid for any browser:
Detecting an "invalid date" Date instance in JavaScript
Long answer:
Firefox (and IE) accepting "2/8888/2016" as a correct string sate format seem to be a bug / misbehaviour.
In fact according to ECMAScript 2015 Language Specification when Date() is invoked with a single string argument should behave just as Date.parse()
http://www.ecma-international.org/ecma-262/6.0/#sec-date-value
The latter
attempts to parse the format of the String according to the rules (including extended years) called out in Date Time String Format (20.3.1.16)
..that is specified here
http://www.ecma-international.org/ecma-262/6.0/#sec-date-time-string-format
where you can read
The format is as follows: YYYY-MM-DDTHH:mm:ss.sssZ
[...]
MM is the month of the year from 01 (January) to 12 (December).
DD is the day of the month from 01 to 31.
It seems that Firefox is interpreting the string value as when Date() is invoked with multiple arguments.
From
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date
Note: Where Date is called as a constructor with more than one argument, if values are greater than their logical range (e.g. 13 is provided as the month value or 70 for the minute value), the adjacent value will be adjusted. E.g. new Date(2013, 13, 1) is equivalent to new Date(2014, 1, 1), both create a date for 2014-02-01 (note that the month is 0-based). Similarly for other values: new Date(2013, 2, 1, 0, 70) is equivalent to new Date(2013, 2, 1, 1, 10) which both create a date for 2013-03-01T01:10:00.
This may explain how "2/8888/2016" turns into 2040-05-31T22:00:00.000Z
There's no way to make IE and FF think it's invalid, except:
you could change their javascript implementations
you use a library instead to deal with that.
We can also expect that Javascript, as a language, evolves and we can cross our fingers that browsers decide to follow a more strict specification. The problem of course is that every "fix" must be also backward compatible with previous versions of the language (does not happen always, Perl for example).
So the best thing by now is to use some library just like momentjs as suggested by Derek in the post comments.
You have stumbled across yet another reason why you should manually parse date strings.
When Date is provided a single string argument, it is treated as a date string and parsed according to the rules in Date.parse. The rules there first attempt to parse it as an ISO 8601 format string. If that doesn't work, it may fall back to any parsing algorithm it wants.
In the case of "2/8888/2016", browsers will first attempt to parse it as an ISO format and fail. It seems from experimentation that IE and Firefox determine that the string is in month/day/year format and effectively call the Date constructor with:
new Date(2016,1,8888);
However, other browsers may attempt to validate the values and decide that 8888 is not a valid date or month, so return an invalid date. Both responses are compliant with ECMA-262.
The best advice is to always manually parse date strings (a library can help, but generally isn't necessary as a bespoke parse function with validation is 3 lines of code) then you can be certain of consistent results in any browser or host environment.

JS Date() - Leading zero on days

A leading zero for the day within a string seems to break the Javascript Date object in Chrome. There are also some inconsistencies between browsers, since Firefox handles the leading zero correctly, but fails when the zero is not included. See this example: https://jsfiddle.net/3m6ovh1f/3/
Date('2015-11-01'); // works in Firefox, not in Chrome
Date('2015-11-1'); // works in Chrome, not in Firefox
Why? Is there a good way to work around/with the leading zero?
Please note, the strings are coming from MySQL via AJAX and all dates will contain the leading zero, and I can fix this by formating the dates server-side. What format would work the best?
EDIT
Just to specify what my problem was, it looks like Chrome is applying a time zone to the YYYY-MM-DD format, which reverts the Nov. 1st date back to the Oct. 31st date (because of my EDT local time).
According to ECMA-262 (5.1):
The function first attempts to parse the format of the String according to the rules called out in Date Time String Format (15.9.1.15). If the String does not conform to that format the function may fall back to any implementation-specific heuristics or implementation-specific date formats.
The date/time string format as described in 15.9.1.15 is YYYY-MM-DDTHH:mm:ss.sssZ. It can also be a shorter representation of this format, like YYYY-MM-DD.
2015-11-1 is not a valid date/time string for Javascript (note it's YYYY-MM-D and not YYYY-MM-DD). Thus, the implementation (browser) is able to do whatever it wants with that string. It can attempt to parse the string in a different format, or it can simply say that the string is an invalid date. Chrome chooses the former (see DateParser::Parse) and attempts to parse it as a "legacy" date. Firefox seems to choose the latter, and refuses to parse it.
Now, your claim that new Date('2015-11-01') doesn't work in Chrome is incorrect. As the string conforms to the date/time string format, Chrome must parse it to be specification compliant. In fact, I just tried it myself -- it works in Chrome.
So, what are your options here?
Use the correct date/time format (i.e. YYYY-MM-DD or some extension of it).
Use the new Date (year, month, date) constructor, i.e. new Date(2015, 10, 1) (months go from 0-11) in this case.
Whichever option is up to you, but there is a date/time string format that all specification compliant browsers should agree on.
As an alternative, why not use unix timestamps instead? In JavaScript, you would would multiply the timestamp value by 1000,
e.g
var _t = { time: 1446220558 };
var _d = new Date( _t.time*1000 );
Test in your browser console:
new Date( 14462205581000 );
// prints Fri Oct 30 2015 11:55:58 GMT-0400 (EDT)
There's a little benefit in it as well (if data comes via JS) - you'd save 2 bytes on every date element '2015-10-30' VS 1446220558 :)

momentJs parse Date: Unexpected behaviour

Using momentJs I've come across something I don't understand;
let's play around with one particular date, for example 2099-11-11T15:00
Convert Moment to Date:
> moment('2099-11-11T15:00').toDate()
// => Wed Nov 11 2099 15:00:00 GMT+0100 (CET)
Convert Date to Moment:
> var d = new Date('2099-11-11T15:00')
// => undefined
> moment(d)
// => { ... _d: Wed Nov 11 2099 16:00:00 GMT+0100 (CET) }
We have different dates, the first one is wednesday at 15:00 but the second one is wednesday at 16:00. Indeed if we compare them:
moment(d).isSame(moment('2099-11-11T15:00'))
// => false
At first look I thought it was something related with toDate() method but it's not; let's type the following:
new Date('2099-11-11T15:00').toISOString()
'2099-11-11T15:00:00.000Z'
moment('2099-11-11T15:00').toISOString()
'2099-11-11T14:00:00.000Z'
What's going on here?
A few things:
Do not pay attention to the _d field. Underscored fields are internal to moment's API, and may not always be what you expect. In many functions, it has to be evaluated in combination with other internal fields (such as _offset) in order to produce a valid output. Instead, use the various public functions, such as format, toDate, .valueOf, and others.
Recognize there are many difference between how the Date constructor parses strings and how moment's parsing functions work. Don't expect them to match.
When a string doesn't contain any time zone information, moment(...) will always treat it as local, while moment.utc(...) will treat it as UTC. (Your answer gives a good example of that.)
When the Date constructor is given a string, the format of the string can affect the interpretation significantly. The actual behavior can vary across implementations, but most current browsers will see the hyphens and the T as an indication that the string is in ISO8601 format. However without any trailing Z or offset, the ES5 spec says to interpret those as UTC. This has changed for ES6, which will treat those cases as local time - to better conform with the ISO8601 spec. Since it's not clear when various environments will start to implement this change, it's prudent to not rely on the Date constructor.
If you wanted the Date constructor to interpret your value as local time (with ES5), one approach is to use string replacements to remove the T and swap the hyphens (-) with slashes (/). This works in the majority of environments, though there is no specification around this. (I've been told it fails in some Safari browsers.) Really, I would just use Moment's parsing functions, and not rely on the Date constructor at all.
Finally I decided to use moment.utc(...) instead of moment() for every operation, doing so, both times are the same:
var d = new Date('2099-11-11T15:00')
var m = moment.utc('2099-11-11T15:00')
m.isSame(moment.utc(d))
// => true

Javascript: Difference between `new Date(dateString)` vs `new Date(year, month, day)`

Referencing to the accepted answer on this question How do I get the number of days between two dates in JavaScript?. I see, in the function parseDate:
function parseDate(str) {
var mdy = str.split('/')
return new Date(mdy[2], mdy[0]-1, mdy[1]);
}
He is doing this:
var mdy = str.split('/')
return new Date(mdy[2], mdy[0]-1, mdy[1]);
i.e. splitting the passed date into month, day and year and then passing it on to Date like new Date(year, month, day) while he could simply do new Date(str) and it would have returned the same result (Wouldn't it?). Can anyone please explain the difference between both the ways?
Update: Test results:
var str = '1/1/2000'
var mdy = str.split('/')
console.log( new Date(str) ) // Sat Jan 01 2000 00:00:00 GMT+0500 (Pakistan Standard Time)
console.log( new Date(mdy[2], mdy[0]-1, mdy[1]) ); // Sat Jan 01 2000 00:00:00 GMT+0500 (Pakistan Standard Time)
No, they're not the same (even assuming you'll subtract one month later: he's doing mdy[0] - 1) because new Date(str) is required (by standard, see §15.9.4.2) to accept only date in a specific format ISO 8601 (YYYY-MM-DDTHH:mm:ss.sssZ, see also this post, I won't repeat myself here):
If the String does not conform to that format [ISO 8601] the function may fall back to any implementation-specific heuristics or implementation-specific date formats.
Please note (as pointed out by Royi in comments) that also RFC 2822 should be supported (according to MDN) but it's not mentioned in JavaScript specifications and Internet Explorer doesn't officially support it (see MSDN, it can parse something similar but it's not the same).
In that code they're parsing using a specific locale rules (MM/DD/YYYY, I suppose en-US locale but it's not only one). To be honest I wouldn't even use that code for parsing (because yes, actually it'll be broken for a different locale: even separator used for splitting is not "locale safe"). Let me explain with an example:
You're using a proper configured date time picker (or <input type="date"/> when supported) you'll enter date according to your locale. For example in Italy (but in general in Europe) we write DD/MM/YYYY.
Now let's imagine that user picked 21 December 2014 (formatted as 21/12/2014 according to his locale).
With string splitting that code will fail (because it'll pick 21 as month number, obviously it's not valid). Even worse than that such errors may even go unnoticed (for example if user picks 1/2/2014 code will "think" it's 2nd Jan but user picked 1st Feb). Do you want to make it more complicate? Even new Date(str) may fail because it's browser dependent (and you can't really trust heuristic to be portable and safe).
If you're asking yourself "Then why they used such code?" I'd say that they used a quick workaround to support dates using en-US locale (probably because browser they used didn't support them with heuristic guess) but it's not something you should reuse.
Solution? Do not ever parse date by hand (unless you really and deep know what you're doing), use a good library (for example moment.js) for that because most assumption you may do about date formatting are...wrong.
I tried to enter your test code into jsperf.com, and the results on my machine are clear, and they say that you should not try to split the string.
I tried two tests for the test using a split string, and supprisingly, the split itself was not what was taking up the time.
Try for yourself at http://jsperf.com/date-from-string-or-dateparts

JavaScript: Difference between toString() and toLocaleString() methods of Date

I am unable to understand the difference between the toString() and toLocaleString() methods of a Date object in JavaScript. One thing I know is that toString() will automatically be called whenever the Date objects needs to be converted to string.
The following code returns identical results always:
​var d = new Date();
document.write( d + "<br />" );
document.write( d.toString() + "<br />" );
document.write( d.toLocaleString() );
​
And the output is:
Tue Aug 14 2012 08:08:54 GMT+0500 (PKT)
Tue Aug 14 2012 08:08:54 GMT+0500 (PKT)
Tue Aug 14 2012 08:08:54 GMT+0500 (PKT)
https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/Date/toLocaleString
Basically, it formats the Date to how it would be formatted on the computer where the function is called, e.g. Month before Day in US, Day before Month in most of the rest of the world.
EDIT:
Because some others pointed out that the above reference isn't necessary reliable, how's this from the ECMAScript spec:
15.9.5.2 Date.prototype.toString ( )
This function returns a String value. The contents of the String are implementation->> dependent, but are intended to represent the Date in the current time zone in a convenient, human-readable form.
15.9.5.5 Date.prototype.toLocaleString ( )
This function returns a String value. The contents of the String are implementation->>dependent, but are intended to represent the Date in the current time zone in a convenient, human-readable form that corresponds to the conventions of the host environment‘s current locale.
Since you can hopefully assume that most implementations will reflect the specification, the difference is that toString() is just required to be readable, toLocaleString() should be readable in a format that the should match the users expectations based on their locale.
Converts a date to a string, using the operating system's locale's
conventions.
https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/Date/toLocaleString
toLocaleString behaves similarly to toString when converting a year
that the operating system does not properly format.
I am just checked in console of the Chrome for date and found the difference in the presentation format. Hope this could help.
var d = new Date();
console.log(d.toLocaleString()); //"04.09.2016, 15:42:44"
console.log(d.toString()); //"Sun Sep 04 2016 15:42:44 GMT+0300 (FLE Daylight Time)"
Lots of references, but none are authoritative. Note that Mozilla's documentation is for JavaScript, which is their version of ECMAScript for browsers. Other browsers use other implementations and therefore, while the MDN documentation is useful, it is not authoritative (it is also a community wiki, so not even official Mozilla documentation) and does not necessarily apply to other browsers.
The definitive reference is the ECMAScript Language specification, where the behaviour of both Date.prototype.toString and Date.prototype.toLocaleString are explained in browser independent terms.
Notable is the for both methods, the string is implementation dependent, which means that different browsers will return different strings.
Just to add. Apart from Date, it also converts/formats the normal variable.
Both functions used to format/convert the passed parameter to string but how parameter is formatted is the point to look on.
toLocalestring() used to return the formatted string based on which geography the function is called.
For the sake of simplicity.
Take this example.
It shows how toString() won't format the variable but toLocaleSting() will format it based on locale setting of the geography.
let number = 1100;
console.log(number.toString()); // "1100"
console.log(number.toLocaleString()) // 1,100
let number = 1100;
console.log(number.toString());
console.log(number.toLocaleString());
It is a great help for programmer in order to avoid to write extra function to format the string or Date. toLocaleString() will take care of this.
Hope you would find it somewhat helpful & interesting.

Categories

Resources