Parse date(s) out of arbitrary string - javascript

So I need to write a module/function that would analyse a string and parse all possible date values given in any arbitrary form, eg:
bla-bla-bla today --> new Date()
january foo Mar 11th bar --> [new Date('2014/1/1'), new Date('2014/3/11')] // more than one date
lorem 11/08 ipsum --> new Date('2014/11/08')
monday --> new Date('2014/10/27')
How difficult would it be? Should I even bother trying, or is it a lot more difficult than I'm imagining? Maybe someone has already done something like that?
Only dates matter; time doesn't. Also, I need to do it on the client. Crazy, eh?

One solution is to use http://momentjs.com/docs/#/plugins/parseformat/
Good luck!

It really depends on how custom you want to make it. Your best bet is to exhaust the existing options first: http://momentjs.com/ or even the thing it "replaced", http://www.datejs.com/. Those libraries aren't meant to do that, but you might be able to either contort them to do so or fork them and modify them for your purposes.
If you can't get something custom enough out of those, I would recommend describing the date syntax you want as a grammar, then using a parser or parser generator to create something that can correctly parse what you want. Something in the realm of http://pegjs.majda.cz/ or http://marijnhaverbeke.nl/acorn/.
For some context I ran into a similar situation for time only and ended up using Scala's standard parser combinator library to achieve something of this sort. https://github.com/scala/scala-parser-combinators. That only worked out because I was able to express a fairly high level grammar and not get into the weeds of parsing particulars.
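To give a feel for how far a hand-rolled approach gets before a grammar becomes necessary, here is a minimal sketch that covers only three of the forms in the question: "today", month names with an optional day, and numeric month/day. The name extractDates and the month-first reading of 11/08 are assumptions made for illustration; weekday names, other locales and ambiguous words (the verb "may", for instance) are not handled.

```javascript
// Hypothetical helper, not part of any library: scans text for a few date-like tokens.
const MONTHS = ['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', 'dec'];

// Alternatives: "today" | month name (+ optional day such as "11th") | numeric "MM/DD".
const DATE_PATTERN = /\b(today)\b|\b(jan(?:uary)?|feb(?:ruary)?|mar(?:ch)?|apr(?:il)?|may|june?|july?|aug(?:ust)?|sep(?:t(?:ember)?)?|oct(?:ober)?|nov(?:ember)?|dec(?:ember)?)\b\.?(?:\s+(\d{1,2})(?:st|nd|rd|th)?\b)?|\b(\d{1,2})\/(\d{1,2})\b/gi;

function extractDates(text, now = new Date()) {
  const year = now.getFullYear();
  const results = [];
  for (const m of text.matchAll(DATE_PATTERN)) {
    if (m[1]) {
      // "today": the current day at local midnight
      results.push(new Date(year, now.getMonth(), now.getDate()));
    } else if (m[2]) {
      // Month name; the day defaults to 1 when none follows
      const month = MONTHS.indexOf(m[2].slice(0, 3).toLowerCase());
      results.push(new Date(year, month, m[3] ? Number(m[3]) : 1));
    } else {
      // Numeric form, read as month/day (assumption)
      results.push(new Date(year, Number(m[4]) - 1, Number(m[5])));
    }
  }
  return results;
}
```

Calling extractDates('january foo Mar 11th bar') returns two Date objects in the current year, one for January 1 and one for March 11. Every new form you add makes the pattern harder to maintain, which is the point where a grammar starts to pay off.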

Related

Date/Time on Articles <time>

This is a fairly simple question. While I'll certainly accept and appreciate a detailed answer, guidance in the right direction is all I'm looking for as I have no qualms about learning. I still consider myself an amateur so please forgive me if you find this trivial.
I'm sure you've all seen what I'm looking for here if you've read a blog or any type of news site. Articles usually have some type of heading with "1 Year Ago", "28 Minutes Ago", etc to reflect the difference in time from when an article was published to the current time you are looking at it. What I'm trying to figure out is how that is accomplished?
I learned today that a <time> tag exists, but so far I haven't been able to determine how the attributes you can assign to it (e.g. datetime="2015-04-27 20:00") turn into a readable "1 Year Ago". In my head, I'm imagining some ways I might be able to do this with JavaScript, but I'm wondering if this is how it's typically done.
Thanks in advance.
What you might have read is that the "special attributes" are actually pseudo-attributes in some front-end framework like Angular, React or Vue.
In Angular they are known as custom directives: you define a custom attribute to pass some data into Angular code and get something back, in this case the humanized form of a date.
What you probably want is moment.js and some way to pass the date into it for parsing, if you're not using Angular or another framework. Since you weren't very descriptive about your code, I'll leave it to you to work that part out.
A simple example to demonstrate Moment.js Time from now
moment([2007, 0, 29]).fromNow(); // 4 years ago
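Without a library, the same idea can be sketched with the browser's built-in Intl.RelativeTimeFormat: read the datetime attribute of each time element, compute the difference from now, and write the humanized text back. The function name timeAgo and the fixed unit lengths (a 30-day month, a 365-day year) are simplifying assumptions, not how any particular site does it.

```javascript
// Approximate unit lengths in milliseconds, largest first (assumption: 30-day month, 365-day year).
const UNITS = [
  ['year', 365 * 24 * 60 * 60 * 1000],
  ['month', 30 * 24 * 60 * 60 * 1000],
  ['day', 24 * 60 * 60 * 1000],
  ['hour', 60 * 60 * 1000],
  ['minute', 60 * 1000],
  ['second', 1000],
];

// Hypothetical helper: returns text such as "28 minutes ago" for a past date.
function timeAgo(date, now = new Date(), locale = 'en') {
  const formatter = new Intl.RelativeTimeFormat(locale);
  const diff = date.getTime() - now.getTime(); // negative when the date is in the past
  for (const [unit, ms] of UNITS) {
    // Use the largest unit that fits; fall back to seconds
    if (Math.abs(diff) >= ms || unit === 'second') {
      return formatter.format(Math.trunc(diff / ms), unit);
    }
  }
}

// In a browser, rewrite every <time datetime="..."> element on the page.
if (typeof document !== 'undefined') {
  document.querySelectorAll('time[datetime]').forEach((el) => {
    const date = new Date(el.getAttribute('datetime'));
    if (!Number.isNaN(date.getTime())) {
      el.textContent = timeAgo(date);
    }
  });
}
```

Note that a value like "2015-04-27 20:00" (with a space) is not guaranteed to parse the same way in every browser; an ISO 8601 value such as "2015-04-27T20:00:00Z" is the safer thing to put in the attribute.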

Grab dates from a website and put into a calendar

This is a long shot, and I'm writing because I have no idea where to start.
I want to write some code that can automatically and on a regular basis grab the 5 dates from this website and put them into my iCal calendar.
Where should I start and end to do this?
I'm pretty good in RoR and Javascript, but have absolutely no idea what technology I should use to accomplish this.
Hope you can shed some light on my question.
Thanks
Assuming the HTML page is always going to keep the same basic structure, you could use something like nokogiri to locate the nodes containing the dates.
You can then use the Date.strptime or DateTime.strptime methods to convert the date from the particular format, into a Date or DateTime object, as required.
As for then adding the dates to your calendar, it's not something I have had to do, but you might want to check out How to interact with a CalDAV server from Ruby?
Use an XMLHttpRequest object in Javascript to download the page that you need and then use a regular expression to parse out the dates. It seems that the dates all have a fixed format:
<b>Mon Day Hr:Min UTC+4</b>
so it should be easy to write the regular expression for this. I don't know the exact Javascript Regex format but here's the .NET equivalent, it should be easy to tweak this to Javascript - hope this helps:
<b>(?<date>(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) [0-9]{2} [0-9]{2}:[0-9]{2}) UTC[+-][0-9]+</b>
This finds all date fields in the page - once you have the date fields, each Regex match will have a sub-group named date that contains the actual date part.
If you go to this page: .NET Regex tester you can test the above expression to see how it returns the dates - just copy & paste your page's source with the dates. As I said, this is for .NET, not for Javascript but the differences are not terribly big.
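For reference, a JavaScript version of the same expression might look like the sketch below. Engines that support ES2018 accept the named group as written, so the only change is escaping the slash in the closing tag. The function name findDates and the sample markup are made up for illustration.

```javascript
// Same pattern as the .NET expression above, written as a JavaScript regex literal.
// The "g" flag is required so that matchAll can find every occurrence.
const DATE_IN_BOLD = /<b>(?<date>(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) [0-9]{2} [0-9]{2}:[0-9]{2}) UTC[+-][0-9]+<\/b>/g;

// Hypothetical helper: returns the captured "date" group of every match.
function findDates(html) {
  return Array.from(html.matchAll(DATE_IN_BOLD), (match) => match.groups.date);
}

// Made-up sample markup in the format described above
const sample = '<p><b>Mar 05 14:30 UTC+4</b></p><p><b>Apr 12 09:00 UTC+4</b></p>';
console.log(findDates(sample)); // [ 'Mar 05 14:30', 'Apr 12 09:00' ]
```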
Use a Ruby script. The Mechanize gem can scrape the dates from the web page. Then the ri_cal gem can add them to your calendar. A pure JavaScript approach like xxbbcc suggested may be possible but it will almost certainly be more involved. If you're already familiar with Ruby, I'd recommend taking advantage of the "magic" and let these gems do the dirty work for you.

Parsing a string into a custom object based on different criteria

As part of a small project I'm working on, I need to be able to parse a string into a custom object, which represents an action, date and a few other properties. The tricky part is that the input string can come in a variety of flavors that all need to be properly parsed.
Input strings may be in the following formats:
Go to work tomorrow at 9am
Wash my car on Monday, at 3 pm
Call the doctor next Tuesday at 10am
Fill out the rebate form in 3 days at 2:30pm
Wake me up every day at 7:00am
And the output object would look something like this:
{
  "Action": "Wash my car",
  "DateTime": "2011-12-26 3:00PM", // Format is irrelevant at this point
  "Recurring": false,
  "RecurranceType": ""
}
At first I thought of constructing some sort of tree to represent different states (On, In, Every, etc.) with different outcomes and further states (candidate for a state machine, right?). However, the more I thought about this, the more it started looking like a grammar parsing problem. Due to a (limited) number of ways the sentence could be formed, it looks like some sort of grammar parsing algorithm would need to be implemented.
In addition, I'm doing this on the front end, so JavaScript is the language of choice here. Back end will be written in Python and could be used by calling AJAX methods, if necessary, but I'd prefer to keep it all in JavaScript. (To be honest, I don't think the language is a big issue here).
So, am I in way over my head? I have a strong JavaScript background, but nothing beyond school courses when it comes to language design, parsing, etc. Is there a better way to solve this problem? Any suggestions are greatly appreciated.
I don't know a lot about grammar parsing, but something here might help.
My first thought is that your sentence syntax seems to be pretty consistent
1st 3-4 words are generally VERB text NOUN, followed by some form of time. If the total options are limited to what form the sentence can take, you can hard-code some parsing rules.
I also ran across a couple of js grammar parsers that might get you somewhere:
http://jscc.jmksf.com/
http://pegjs.majda.cz/
http://www.corion.net/perl-dev/Javascript-Grammar.html
This is an interesting problem you have. Please update this with your solutions later.
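As a rough illustration of the hard-coded-rules idea, the sketch below splits a sentence into an action, a day phrase and a time using three regular expressions. The function name parseReminder and the shape of the returned object are assumptions; it only understands the five example sentences and close variants, and it will misread an action that itself contains a keyword such as "on" or "at".

```javascript
// Day phrases seen in the examples: "tomorrow", "Monday", "next Tuesday", "in 3 days", "every day".
const DAY_PHRASE = /\b(tomorrow|today|every\s+day|in\s+\d+\s+days?|(?:next\s+)?(?:mon|tues|wednes|thurs|fri|satur|sun)day)\b/i;
// Times such as "at 9am", "at 3 pm", "at 2:30pm".
const TIME = /\bat\s+(\d{1,2})(?::(\d{2}))?\s*(am|pm)\b/i;
// The action is everything before the first scheduling keyword.
const ACTION_END = /,?\s+(?:tomorrow|today|on|next|in\s+\d+\s+days?|every|at)\b/i;

// Hypothetical helper: turns one sentence into a small description object.
function parseReminder(input) {
  const action = input.split(ACTION_END)[0].trim();
  const dayMatch = input.match(DAY_PHRASE);
  const timeMatch = input.match(TIME);

  let hour = null;
  let minute = null;
  if (timeMatch) {
    const isPm = timeMatch[3].toLowerCase() === 'pm';
    hour = (Number(timeMatch[1]) % 12) + (isPm ? 12 : 0); // 12am -> 0, 12pm -> 12
    minute = timeMatch[2] ? Number(timeMatch[2]) : 0;
  }

  const when = dayMatch ? dayMatch[1].toLowerCase() : null;
  const recurring = when !== null && when.startsWith('every');
  return { action, when, hour, minute, recurring };
}
```

Resolving the day phrase to a concrete date is left out on purpose; once the list of phrases grows beyond a handful, a grammar-based parser is likely to be easier to maintain than more regular expressions.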

date.js Parse method overrides Javascript Parse method

I'm including the date.js library in my site because I need its functionality.
I have just realized, though, that the standard Javascript parse() method is overwritten by it.
I'm trying to build a line chart in Highcharts, and the data series wants the first element to be in milliseconds (their demos show them using the Date.UTC() method to achieve this, but my data is returned in a different format).
Short of doing a bunch of string manipulation to put my data into a format that Date.UTC will recognize, is there another way of getting the standard Javascript parse() functionality while date.js is loaded?
I know this isn't a direct solution to your problem, but it may help anyway.
If you want a fully featured date library that doesn't modify the native Date object, I wrote one called Moment.js.
It provides a lot of the things that DateJS provides (formatting, parsing, manipulation, timeago, i18n, etc), but it's smaller, faster, and doesn't ruin the native date prototype.
https://github.com/timrwood/moment
Nope, this is the intended design of date.js. It adds to the "prototype" of the Date object. Some people hate that, some people like it - but you've uncovered one of the drawbacks of this design.
You can tell Highcharts to not use UTC date:
Highcharts.setOptions({
  global: {
    useUTC: false
  }
});
You should do this before you create the chart. Then you won't have to worry about converting your dates to UTC, it will be easier.
after I asked the question, I went ahead and did it this way:
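// With date.js loaded, Date.parse returns a Date object rather than milliseconds
var parsed = Date.parse(data);
var y = parsed.getFullYear();
var m = parsed.getMonth();
var day = parsed.getDate();
var dUTC = Date.UTC(y, m, day);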
but will now try your suggestions.
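If the code has to run both with and without date.js on the page, one defensive option is a small wrapper that accepts either return type. This is only a sketch: the name toMillis is made up, and it assumes (as the snippet above suggests) that the overridden parse returns a Date object, or null on failure, where the native one returns a number.

```javascript
// Hypothetical helper: returns milliseconds since the epoch whichever Date.parse is installed.
// The parse argument exists only so the behaviour can be exercised without loading date.js.
function toMillis(value, parse = (s) => Date.parse(s)) {
  const parsed = parse(value);
  if (parsed === null) {
    return NaN; // assumed failure value of the overridden parse
  }
  // Native Date.parse yields a number; an overriding library may yield a Date object
  return typeof parsed === 'number' ? parsed : parsed.getTime();
}
```

The result can be passed directly as the x value of a Highcharts point, since it is already in milliseconds.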

Localize dates on a browser?

Let's say I have a date that I can represent in a culture-invariant format (ISO 8601).
I'll pick July 6, 2009, 3:54 pm UTC, a.k.a. 5:54 pm local time in Paris, which observes daylight saving time.
2009-07-06T17:54:12.000+02:00
OK... is there any hidden gem of markup that will tell the browser to convert that string into a localized version of it?
The closest solution is using Javascript's Date.prototype.toLocaleString(). It certainly does a good job, but it can be slow to iterate over a lot of dates, and it relies on Javascript.
Is there any HTML, CSS, XSLT, or otherwise semantic markup that a browser will recognize and automatically render the correct localized string?
Edit:
The method I am currently using is replacing the text of an HTML element with a localized string:
Starting with:
<span class="date">2009/07/06 15:54:12 GMT</span>
Using Javascript (with jQuery):
var dates = $("span.date", context);
// use for loop instead of .each() for speed
for (var i = 0, len = dates.length; i < len; i++) {
  // parse the date
  var d = new Date(dates.eq(i).text());
  // set the text to the localized string
  dates.eq(i).text(d.toLocaleString());
}
From a practical point of view, it makes the text "flash" to the new value when the Javascript runs, and I don't like it.
From a principles point of view, I don't get why we need to do this - the browser should be able to localize standard things like currency, dates, numbers, as long as we mark it up as such.
A follow up question: Why do browsers/the Web not have such a simple feature - take a standard data item, and format it according to the client's settings?
I use toLocaleString() on my site, and I've never had a problem with the speed of it. How are you getting the server date into the Date object? Parsing?
I add a comment node right before I display the date as the server sees it. Inside the comment node is the date/time of that post as the number of milliseconds since epoch. In Rails, for example:
<!--<%= post.created_at.to_i * 1000 %>-->
If they have JS enabled, I use jQuery to grab those nodes, get the value of the comment, then:
var date = new Date();
date.setTime(msFromEpoch);
// output date.toLocaleString()
If they don't have JS enabled, they can feel free to do the conversion in their head.
If you're trying to parse the ISO time, that may be the cause of your slowness. Also, how many dates are we talking?
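In browsers that support the Intl API, a related option when there are many dates is to build one Intl.DateTimeFormat and reuse it for every timestamp instead of calling toLocaleString() each time. The sketch below assumes the timestamps are already available as milliseconds since the epoch, as in the comment-node trick above; the function name localizeDates and the chosen format options are illustrative only.

```javascript
// One formatter for the user's default locale, created once and reused.
const defaultFormatter = new Intl.DateTimeFormat(undefined, {
  year: 'numeric',
  month: 'short',
  day: 'numeric',
  hour: 'numeric',
  minute: '2-digit',
});

// Hypothetical helper: formats a list of epoch-millisecond timestamps.
function localizeDates(timestamps, formatter = defaultFormatter) {
  return timestamps.map((ms) => formatter.format(new Date(ms)));
}
```

The exact output depends on the visitor's locale and time zone, which is the intent here; pass a formatter built with an explicit locale if you need a fixed format.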
Unfortunately, there is not.
HTML & CSS are strictly used for presentation, as such, there is no "smarts" built in to change the way things are displayed.
Your best bet would be to use a server side language (like .NET, Python, etc.) to emit the dates into the HTML in the format you want them shown to your user.
It is not possible to do this with HTML, it has no smart tags that can make any kind of decisions like this. It is strictly presentational. I do wonder, though, if HTML5 perhaps has a tag for something like this...
Anyways, the way I see it, you have 3 options:
Stick to the Javascript way. There's questions with more details on it on this website, such as How do I display a date/time in the user’s locale format and time offset? and How can I determine a web user’s time zone?
Try to use geolocation. That is, your server side script fires off a request to one of the many geolocator services out there on the user's first page visit to try and guess where the user is. The downside of this is that it will be wrong about 10% of the time, so it's not that much better than the market share Javascript is going to get you.... (all in all, then, not a very good method...)
Ask the user! You will see that most websites that want to display a tailored experience for you will ask you this sort of thing because it's just not possible to know. As a neat fallback, you could wrap the question around <noscript> tags so you only ask those with Javascript disabled while offering the Javascript experience to those that have it.
Dojo has some pretty good localizations for dates and currencies. Using this method also allows you to pick different formats (e.g.: short date vs long date) and force locales.
The language and the user's locale should be sent on the HTTP header. You can use those to create the correct date format server-side to be displayed to the user. However, this is often undesirable because many users completely ignore their locale settings in their OS and/or browser. So, you may be feeding USA style timestamps to New Zealanders.
I liked the trick posted in the comment above, but it sounds like a QA headache, since you could be dealing with a large number of clients that implement timestamps in very different ways.
The most effective solution I have seen is to simply provide a panel that lets your users choose the time format they like. Some users even *gasp* like ISO formats. Then you do the time format conversion server side. If your application language does not have a good mapping from locale to timezone formatting, check your database. Many databases provide locale-based customized timezone formatting as well.
Because this answer still pops up in Google, I'll share that this is now possible using a readonly datetime-local input (see below), which you can then style the way you want:
<input type="datetime-local" value="2018-06-12T19:30" readonly />
For more information see: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/input/datetime-local
