Strip brackets and the word "The" from a string - javascript

I have been trying to work out a regex myself, but I cannot get the last bracket to disappear from the string.
For example:
[The Day the Earth Stood Still]
I can only get:
Day the Earth Stood Still]
with the following RegEx code:
/(\[|\](^The ))\2/
I'm aiming for just:
Day the Earth Stood Still
Any help would be greatly appreciated. I've spent 3 hours trying to figure it out on my own... This is me giving in. :3

You can try:
\[The\s(.*)]
If you need this to work to strip out brackets even when 'The' is not present you can try:
(?:\[The\s|\[)(.*)]
If you think you will run into a case where you may have 'the' or 'The' you can try:
(?:\[[Tt]he\s|\[)(.*)]
Here is some code to implement capturing text without the brackets and 'The':
var title = new Array();
title[0] = "[The Day the Earth Stood Still]";
title[1] = "[Independence Day]";
title[2] = "[the Day the Earth Stood Still]";
alert(title[0].match(/(?:\[[Tt]he\s|\[)(.*)]/)[1]);
alert(title[1].match(/(?:\[[Tt]he\s|\[)(.*)]/)[1]);
alert(title[2].match(/(?:\[[Tt]he\s|\[)(.*)]/)[1]);
Try it out here: http://jsfiddle.net/aSwYz/

Maybe you should think about what you want to keep instead of what you want to get rid of (note that the lookbehind below needs a JavaScript engine that supports ES2018 lookbehind assertions):
(?<=\[The\s*)[\w\s]*(?=\])
Otherwise, you are trying to match two separate parts of the string, which should be done with two different matches.
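A minimal sketch of the two-step idea, assuming the title always arrives as a single string and that a leading "The" should be dropped regardless of case. The function name stripTitle is hypothetical and is not taken from any of the answers above.

```javascript
// Hypothetical helper: strips the outer brackets first, then a leading "The".
function stripTitle(title) {
  return title
    .replace(/^\[|\]$/g, "")   // remove one leading "[" and one trailing "]"
    .replace(/^the\s+/i, "");  // remove a leading "The " or "the "
}

console.log(stripTitle("[The Day the Earth Stood Still]"));
console.log(stripTitle("[Independence Day]"));
```

Because the second pattern requires whitespace after "the", a title such as "[Theory of Everything]" would keep its first word intact.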

Related

JavaScript Regex to exclude specific characters which are stored in a string variable, spaces, and change remaining characters to underscores

This is for a hangman-type guessing game. I already figured out how to use a regex to display the letters as underscores on the page with appropriate spacing. Now I want to use a regex to do the following, all in one expression:
Check a string containing the correct answer this.answers[arraysIndex], against the string containing all of the user's correct guesses rightString
In the correct answer string: change only the letters that don't match the correct guesses string into underscores. This means I want to keep the spaces unchanged too.
I've tried this:
var regex = new RegExp("/(?![^"+rightString+"])/[\A-Za-z/])/","g");
newDisplay = (this.answers[arraysIndex]).replace(regex, "_");
...and this:
newDisplay = (this.answers[arraysIndex]).replace("/(?![^+rightString+])/[\A-Za-z/])/g", "_")
...and countless slight variations of each. I'm not married to the idea of using a string variable, I could use an array variable too, or maybe there's something that hasn't even occurred to me. I've researched exhaustively on here and many other resources (that's how I solved my first problem) but this one's got me beat. Any help is greatly appreciated.
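One possible approach, sketched under the assumption that rightString is a plain string of guessed letters and that matching should ignore case: build a negated character class from the guesses plus whitespace, so that everything else becomes an underscore. The name maskAnswer is hypothetical. Note that the RegExp constructor takes the pattern without surrounding slashes.

```javascript
// Hypothetical helper: replaces every character that is neither a correct
// guess nor whitespace with an underscore.
function maskAnswer(answer, rightString) {
  // Escape the characters that are special inside a character class.
  var escaped = rightString.replace(/[\\\]\[^-]/g, "\\$&");
  // [^...\s] matches anything that is not a guessed letter or whitespace.
  var regex = new RegExp("[^" + escaped + "\\s]", "gi");
  return answer.replace(regex, "_");
}

console.log(maskAnswer("Hello World", "lo")); // "__llo _o_l_"
```

With an empty rightString the class reduces to [^\s], so every non-space character is masked.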

Regex to match date with string as month

I have the following string:
var cur = "t+20d";
And I want to match it. That part I already did with
if(cur.match(/^[t]\+[0-9]{2,4}[dmyw]/i))
Now I also need to be able to match this string, preferably in the same regex:
var cur = "10may15+20d";
I have tried
cur.match(/^([t]|([0-9]{1,2}(jan|feb|march|apr|may|jun|jul|aug|sept|okt|nov|dec)))\+[0-9]{2,4}[dmyw]/i)
But it doesn't work as intended.
If I try the subpart on its own, I get an array with two elements instead of one:
cur.match(/[0-9]{1,2}(jan|feb|march|apr|may|jun|jul|aug|sept|okt|nov|dec)/i);
//yields ["10MaY", "MaY"]
And this worries me about false positives.
I'm really rusty at regex; the last time I tried to write a complicated one was 15 years ago, and that was in Perl, so I could really use some help with this one. I know alternation and grouped matches are possible, I just can't figure out how to do them anymore, so any help is appreciated.
You need to match the number which exists after the month.
^(t|[0-9]{1,2}(?:jan|feb|march|apr|may|jun|jul|aug|sept|okt|nov|dec)\d+)\+[0-9]{2,4}[dmyw]
With the help of @AvinashRaj, who pointed me to the non-capturing group operator ?: with his regex, I managed to compose this regex for my uses. I'm posting it here for future users who might need to match a date string like ddmmmyy:
cur = "10-apr-1115+20d";
cur.match(/^(?:t|[0-9]{1,2}(?:jan|feb|mar|apr|may|jun|jul|aug|sep|okt|nov|dec|[-\/](?:[0-9]{1,2}|jan|feb|mar|apr|may|jun|jul|aug|sep|okt|nov|dec)[-\/])(?:[0-9]{2}|[0-9]{4}))\+[0-9]{1,4}[dmyw]/igm);
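To illustrate the effect of ?:, here is a reduced sketch that covers only the two formats from the question ("t+20d" and "10may15+20d"). The variable name pattern is hypothetical, the month spellings follow the thread (including "okt"), and the two-digit year and trailing $ anchor are assumptions.

```javascript
// Both groups are non-capturing, so a successful match array holds only
// the full matched text and no extra entries for the month.
var pattern = /^(?:t|[0-9]{1,2}(?:jan|feb|mar|apr|may|jun|jul|aug|sep|okt|nov|dec)[0-9]{2})\+[0-9]{1,4}[dmyw]$/i;

console.log(pattern.test("t+20d"));               // true
console.log(pattern.test("10may15+20d"));         // true
console.log("10may15+20d".match(pattern).length); // 1
```

With a capturing group around the month names, the same match would return a second array element containing the month, which is what produced the two-element result in the question.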

JavaScript + RegEx Complications- Searching Strings Not Containing SubString

I am trying to use a RegEx to search through a long string, and I am having trouble coming up with an expression. I am trying to search through some HTML for a set of tags beginning with a tag containing a certain value and ending with a different tag containing another value. The code I am currently using to attempt this is as follows:
matcher = new RegExp(".*(<[^>]+" + startText + "((?!" + endText + ").)*" + endText + ")", 'g');
data.replace(matcher, "$1");
The strangeness around the middle ( ((\\?\\!endText).)* ) is borrowed from another thread, found here, that seems to describe my problem. The issue I am facing is that the expression matches the beginning tag, but it does not find the ending tag and instead includes the remainder of the data. Also, the lookaround in the middle slowed the expression down a lot. Any suggestions as to how I can get this working?
EDIT: I understand that parsing HTML in RegEx isn't the best option (makes me feel dirty), but I'm in a time-crunch and any other alternative I can think of will take too long. It's hard to say what exactly the markup I will be parsing will look like, as I am creating it on the fly. The best I can do is to say that I am looking at a large table of data that is collected for a range of items on a range of dates. Both of these ranges can vary, and I am trying to select a certain range of dates from a single row. The approximate value of startText and endText are \\#\\#ASSET_ID\\#\\#_<YYYY_MM_DD>. The idea is to find the code that corresponds to this range of cells. (This edit could quite possibly have made this even more confusing, but I'm not sure how much more information I could really give without explaining the entire application).
EDIT: Well, this was a stupid question. Apparently, I just forgot to add .* after the last paren. Can't believe I spent so long on this! Thanks to those of you that tried to help!
First of all, why is there a .* (dot asterisk) at the beginning? If you have text like the following:
This is my Text
And you want "my Text" pulled out, you do my\sText. You don't have to do the .*.
That being said, since all you'll be matching now is what you need, you don't need the main Capture Group around "Everything". This: .*(xxx) is a huge no-no, and can almost always be replaced with this: xxx. In other words, your regex could be replaced with:
<[^>]+xxx((?!zzz).)*zzz
From there I examine what it's doing.
You are looking for an HTML opening delimiter <. You consume it.
You consume at least one character that is NOT a closing HTML delimiter, but can consume many. This is important, because if your tag is <table border=2>, then you have, at minimum, so far consumed <t, if not more.
You are now looking for a StartText. If that StartText is table, you'll never find it, because you have consumed the t. So replace that + with a *.
The regex still succeeds if what follows is NOT the closing text, but it starts from the VERY END of the document, because the asterisk is being greedy. I suggest making it lazy by adding a ?.
When the backtracking fails, it will look for the closing text and gather it successfully.
The result of that logic:
<[^>]*xxx((?!zzz).)*?zzz
If you're going to use a dot anyway, which is okay for new regex writers but not suggested for seasoned ones, I'd go with this:
<[^>]*xxx.*?zzz
So for Javascript, your code would say:
matcher = new RegExp("<[^>]*" + startText + ".*?" + endText, 'gi');
I put the IgnoreCase "i" in there for good measure, but you may or may not want that.
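A hedged variation on that final expression, assuming startText and endText are literal strings rather than regex fragments, and that the markup may span several lines. The helper names escapeRegExp and buildMatcher and the sample data are hypothetical. Since a dot in JavaScript does not match line breaks by default, the sketch uses [\s\S] in its place.

```javascript
// Escapes characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Builds a lazy matcher from the tag containing startText up to endText.
function buildMatcher(startText, endText) {
  return new RegExp(
    "<[^>]*" + escapeRegExp(startText) + "[\\s\\S]*?" + escapeRegExp(endText),
    "gi"
  );
}

// Hypothetical sample row, invented for illustration only.
var data = '<tr><td id="A_2011_01_01">1</td>\n<td id="A_2011_01_02">2</td>\n<td id="A_2011_01_03">3</td></tr>';
var found = data.match(buildMatcher("A_2011_01_01", "A_2011_01_02"));
console.log(found[0]);
```

Using match to pull out the wanted span avoids the leading .* and the "$1" replacement entirely, so nothing outside the span has to be consumed.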

Grabbing the third fragment between square brackets

Still completely stuck with regex's and square brackets. Hopefully someone can help me out.
Say I have a string like this:
room_request[1][1][2011-08-21]
How would I grab the third fragment out of it?
I tried the following, but I'm not exactly sure what I'm doing so it's fairly hard to figure out where I'm going wrong.
.match(/\[(.*?)\]/);
But this returns the [1] fragment. (The first one, I guess).
So then, I asked here on SO and people told me to add a global flag:
.match(/\[(.*?)\]/g)[2];
In other cases that I've used this regex, this worked fine. However, in this case, I want the stuff INSIDE the square brackets. It returns:
[2011-08-21]
But I really want 2011-08-21.
How can I do this? Thanks a lot.
If anyone could recommend any decent resources about regular expressions, that'd be great as well. I'm starting to understand the very basics, but most of this stuff is far too confusing at the moment. Thanks.
Two possible methods. To grab the third bracketed expression:
.match(/\[.*?\]\[.*?\]\[(.*?)\]/);
Or, if you know that the expression you want is always at the end of the string:
.match(/\[(.*?)\]$/);
var str = "room_request[1][1][2011-08-21]"
var val = str.match(/\[[^\]]*\]\[[^\]]*\]\[([^\]]*)\]/);
alert(val[1]);
This is a little less messy I think:
var r = "room_request[1][1][2011-08-21]";
var match = r.match(/(?:\[([^\]]+)\]){3}/);
console.log(match[1]);
Basically, it picks out the third set of square brackets containing something. The result array has two entries: match[0] is the whole matched text, [1][1][2011-08-21], and match[1] is the date, 2011-08-21, because a repeated capture group keeps only its last iteration.
My regex is a little rusty, but this certainly works.
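Another option, sketched here as an assumption rather than as anyone's posted answer, keeps the global flag from the question but reads the capture group of each match through exec in a loop. The function name bracketFragments is hypothetical.

```javascript
// Hypothetical helper: returns the text inside every [...] pair, in order.
function bracketFragments(str) {
  var re = /\[([^\]]*)\]/g;
  var fragments = [];
  var m;
  // With the g flag, each exec call resumes after the previous match.
  while ((m = re.exec(str)) !== null) {
    fragments.push(m[1]); // m[1] is the content without the brackets
  }
  return fragments;
}

console.log(bracketFragments("room_request[1][1][2011-08-21]")[2]); // "2011-08-21"
```

String.prototype.match with the g flag returns only the full matches, which is why the brackets were still present in the question's result.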

Check that the user is entering a time format? eg 13:00

Basically, I'd like to have an input that when blur'd, will check the input to make sure it's in the format...
[24hour] : [minutes]
So for example 13:00, or 15:30.
So I guess I have to split it up into three parts: check the first bit is between 0 and 23, then check it has the colon, then check it has a number between 0 and 59.
Going more complicated than that, it'd be fantastic to have it so if the user enters 19 it'll complete it as 19:00 for example.
I am using jQuery fwiw, but regular code is fine, for example I'm using this little piece of code so far which works fine, to convert . inputs to :
tempval = $(this).val().replace(".", ":");
$(this).val(tempval);
Not sure where to start with this, if anyone could recommend me some reading that'd be fantastic, thank you!
([0-1 ]?[0-9]|2[0-3]):([0-5][0-9])
I think that's the regex you're looking for (not specifically for javascript though).
http://www.regular-expressions.info/
This site has an excellent amount of info for language-specific regular expressions! Cheers!
I suggest using masked input. That way the wrong input will be prevented in the first place.
Disclaimer: I haven't used that plugin myself, just found it by keywords "masked input"
There are a bunch of widgets that already deal with time validation - try googling for "jQuery time widget" - the first result doesn't look bad.
var re = /^(\d+)(:\d+)?$/;
var match = yourstring.match(re);
Now if the match has succeeded match is an array with the matched pieces: match[0] is the whole of yourstring (you don't care about that), match[1] has the digits before the colon (if any colon, else just digits), match[2] if it exists has the colon followed by the digits after it. So now you just need to perform your numeric tests on match[1], and possibly match[2] minus the leading colon, to ensure the numbers are correct.
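The numeric tests described above could look roughly like this. It is a sketch only: the function name normalizeTime is hypothetical, and it assumes hours run 0 to 23, minutes 0 to 59, and that a bare hour such as 19 should be completed to 19:00 as the question asks.

```javascript
// Hypothetical helper: returns the time as "HH:MM", or null if invalid.
function normalizeTime(input) {
  // One or two digits, optionally followed by a colon and two digits.
  var match = /^(\d{1,2})(?::(\d{2}))?$/.exec(input.trim());
  if (match === null) {
    return null;
  }
  var hours = parseInt(match[1], 10);
  // A missing minutes part is treated as ":00".
  var minutes = match[2] === undefined ? 0 : parseInt(match[2], 10);
  if (hours > 23 || minutes > 59) {
    return null;
  }
  return (hours < 10 ? "0" : "") + hours + ":" + (minutes < 10 ? "0" : "") + minutes;
}

console.log(normalizeTime("19"));    // "19:00"
console.log(normalizeTime("15:30")); // "15:30"
console.log(normalizeTime("24:00")); // null
```

In a blur handler, a null result could be used to flag the field, and any other result written back with $(this).val(...).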
