A simpler way to capture multiple variables in javascript regexps - javascript

Coming from the Perl/Python world, I was wondering if there is a simpler way to get at multiple captured variables from a regexp in JavaScript:
#!/usr/bin/env node
var data=[
"DATE: Feb 26,2015",
"hello this should not match"
];
for(var i=0; i<data.length; i++) {
var re = new RegExp('^DATE:\\s(.*),(.*)$');
if(data[i].match(re)) {
//match correctly, but how to get hold of the $1 and $2 ?
}
if(re.exec(data[i])) {
//match correctly, how to get hold of the $1 and $2 ?
}
var ret = '';
if(data[i].match(re) && (ret = data[i].replace(re,'$1|$2'))) {
console.log("line matched:" + data[i]);
console.log("return string:" + ret);
ret = ret.split(/\|/g);
if (typeof ret !== 'undefined') {
console.log("date:" + ret[0], "\nyear:" + ret[1]);
}
else {
console.log("match but unable to parse capturing parentheses");
}
}
}
The last condition works, but it needs a temporary variable and a split, and it needs the match test in front because replace() returns the input string unchanged when nothing matches.
Output is:
$ ./reg1.js
line matched:DATE: Feb 26,2015
return string:Feb 26|2015
date:Feb 26
year:2015
If I look up: mosdev regexp it says on (x):
The matched substring can be recalled from the resulting array's
elements 1, ..., [n] or from the predefined RegExp object's
properties $1, ..., $9.
How do I get hold of the RegExp objects' $1 and $2?
Thanks

MDN is a good resource for learning JavaScript. In this particular case, .match() and .exec() both return an array holding the match information (or null when nothing matches). That is where you'll find the captured groups.
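As a minimal sketch of what that looks like with the pattern from the question: exec() returns null when there is no match, otherwise an array whose element 0 is the whole match and whose elements 1, 2, ... are the capture groups. The helper name parseDateLine is made up for illustration.

```javascript
// Hypothetical helper, not part of the original question.
function parseDateLine(line) {
  var m = /^DATE:\s(.*),(.*)$/.exec(line);
  if (m === null) {
    return null; // the line did not match at all
  }
  // m[0] is the full match; m[1] and m[2] correspond to Perl's $1 and $2
  return { date: m[1], year: m[2] };
}

var data = ["DATE: Feb 26,2015", "hello this should not match"];
data.forEach(function (line) {
  console.log(parseDateLine(line));
});
```

This does the test and the extraction in a single call, so no temporary replace/split step is needed.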

Thanks for the answer. I found that they return an array, so the simpler blocks can look like this:
if((ret = data[i].match(re))!=null) {
//matches correctly; ret[0] is the whole match, ret[1] and ret[2] are $1 and $2
console.log("line matched:" + data[i]);
console.log("return string:" + ret[1] + "|" + ret[2]);
ret = null;
}
if((ret = re.exec(data[i]))!=null) {
//matches correctly; ret[0] is the whole match, ret[1] and ret[2] are $1 and $2
console.log("line matched:" + data[i]);
console.log("return string:" + ret[1] + "|" + ret[2]);
ret = null;
}
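If the numeric indexes are hard to read, named capture groups are another option. This is only a sketch and assumes an engine that supports ES2018 named groups (a current Node.js does); the names date and year and the helper parseNamed are chosen here for illustration.

```javascript
// Assumption: the runtime supports ES2018 named capture groups.
var namedRe = /^DATE:\s(?<date>.*),(?<year>.*)$/;

// Hypothetical helper: returns the named captures, or null on no match.
function parseNamed(line) {
  var m = namedRe.exec(line);
  return m ? { date: m.groups.date, year: m.groups.year } : null;
}

console.log(parseNamed("DATE: Feb 26,2015"));
```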

Using JavaScript's .test() and .match(), this can be very simple.
An example:
var input = "DATE: Feb 26, 2015",
regex = /^DATE:\s*(.*),\s*(.*)$/;
if (regex.test(input)) {
console.log('Matches Format!');
//.match() returns the whole match at index 0, so slice it off to keep only the groups. It becomes weirder with //g
var results = input.match(regex).slice(1);
console.log(results);
//Logs: ["Feb 26", "2015"]
}
Regex101 can be useful

Related

Match or search string by comma separated values by jquery or js

I want to partially match a string against the values in a comma-separated list. My list and search string are below:
var str = "host, gmail, yahoo";
var search = 'new#gmail.com';
I already tried the following:
if( str.split(',').indexOf(search) > -1 ) {
console.log('Found');
}
It should match gmail for the string new#gmail.com.
I am using this with reference to: https://stackoverflow.com/a/13313857/2384642
There are a few issues here. Firstly, your input string has spaces after the commas, yet you're splitting by just the comma, so you'd get ' gmail' as a value, which would break the indexOf() result. Either remove the spaces, or use split(', ').
Secondly, you need to loop through the resulting array from the split() operation and check each value in the search string individually. You're also currently using indexOf() backwards, ie. you're looking for new#gmail.com within gmail. With these issues in mind, try this:
var str = "host,gmail,yahoo";
var search = 'new#gmail.com';
str.split(',').forEach(function(host) {
if (search.indexOf(host) != -1) {
console.log('Found');
}
});
Also note that you could define the array of hosts explicitly and avoid the need to split():
var hosts = ['host', 'gmail', 'yahoo'];
var search = 'new#gmail.com';
hosts.forEach(function(host) {
if (search.indexOf(host) != -1) {
console.log('Found');
}
});
As the split method returns an array, you'd have to iterate through that array and check for matches.
Here's a demo:
// added gmail.com to the string so you can see more matched results(gmail and gmail.com).
var str = "host, gmail, yahoo, gmail.com",
search = 'new#gmail.com',
splitArr = str.replace(/(\,\s+)/g, ',').split(','),
/* the replace method above is used to remove whitespace(s) after the comma. The str variable stays the same as the 'replace' method doesn't change the original strings, it returns the replaced one. */
l = splitArr.length,
i = 0;
for(; i < l; i++) {
if(search.indexOf(splitArr[i]) > -1 ) {
console.log('Found a match: "' + splitArr[i] + '" at the ' + i + ' index.\n');
}
}
As you can see, no substring in str contains the search value, so you need to invert the logic and check whether search contains each element. Something like this:
var str = "host, gmail, yahoo";
var search = 'new#gmail.com';
var res = str.split(', ').filter(function(el) {
return search.indexOf(el) > -1;
});
console.log(res);
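The same idea can be wrapped in a small function that also tolerates irregular spacing around the commas. This is a sketch only; the name matchingHosts is hypothetical.

```javascript
// Hypothetical helper: returns every list entry that occurs inside `search`.
function matchingHosts(list, search) {
  return list
    .split(",")
    .map(function (s) { return s.trim(); })   // drop spaces around each entry
    .filter(function (host) {
      // skip empty entries, then keep the ones found inside the search string
      return host !== "" && search.indexOf(host) > -1;
    });
}

console.log(matchingHosts("host, gmail, yahoo", "new#gmail.com")); // [ 'gmail' ]
```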
Declare the array directly so you don't need to split on the ','
var str = new Array ("host","gmail","yahoo");
To find the element, use something like this
var res;
for (var i = 0; i < str.length; ++i)
{
var val = str[i];
if (val === "gmail")
{
res = val;
break;
}
}
//Use res (result) here
Note: This is my first answer, so please forgive me if there are some errors...
It should match with gmail for string new#gmail.com
In order to achieve your result you need to extract gmail from the search string.
You can achieve this with regex:
search.match( /\S+#(\S+)\.\S+/)[1]
var str = "host, gmail, yahoo, qwegmail";
var search = 'new#gmail.com';
if( str.split(', ').indexOf(search.match( /\S+#(\S+)\.\S+/)[1]) > -1 ) {
console.log(search + ': Found');
} else {
console.log(search + ': Not found');
}
search = 'asd#qwegmail.com';
if( str.split(', ').indexOf(search.match( /\S+#(\S+)\.\S+/)[1]) > -1 ) {
console.log(search + ': Found');
} else {
console.log(search + ': Not found');
}
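Note that search.match(...)[1] throws a TypeError when the string does not match at all, because match() returns null. A guarded variant might look like the following sketch; the helper names hostOf and isKnownHost are made up here.

```javascript
// Hypothetical helper: extracts the part between '#' and the last dot-suffix,
// or returns null when the address does not have that shape.
function hostOf(address) {
  var m = address.match(/\S+#(\S+)\.\S+/);
  return m ? m[1] : null;
}

// Hypothetical helper: true when the extracted host is an exact list entry.
function isKnownHost(list, address) {
  var host = hostOf(address);
  return host !== null && list.split(", ").indexOf(host) > -1;
}

console.log(isKnownHost("host, gmail, yahoo, qwegmail", "new#gmail.com"));
console.log(isKnownHost("host, gmail, yahoo, qwegmail", "not-an-address"));
```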

Recombine capture groups in single regexp?

I am trying to handle input groups similar to:
'...A.B.' and want to output '.....AB'.
Another example:
'.C..Z..B.' ==> '......CZB'
I have been working with the following:
'...A.B.'.replace(/(\.*)([A-Z]*)/g, "$1")
returns:
"....."
and
'...A.B.'.replace(/(\.*)([A-Z]*)/g, "$2")
returns:
"AB"
but
'...A.B.'.replace(/(\.*)([A-Z]*)/g, "$1$2")
returns
"...A.B."
Is there a way to return
".....AB"
with a single regexp?
I have only been able to accomplish this with:
'...A.B.'.replace(/(\.*)([A-Z]*)/g, "$1") + '...A.B.'.replace(/(\.*)([A-Z]*)/g, "$2")
==> ".....AB"
If the goal is to move all of the . to the beginning and all of the A-Z to the end, then I believe the answer to
with a single regexp?
is "no."
Separately, I don't think there's a simpler, more efficient way than two replace calls, though not the two you've shown. Instead:
var str = "...A..B...C.";
var result = str.replace(/[A-Z]/g, "") + str.replace(/\./g, "");
console.log(result);
(I don't know what you want to do with non-., non-A-Z characters, so I've ignored them.)
If you really want to do it with a single call to replace (e.g., a single pass through the string matters), you can, but I'm fairly sure you'd have to use the function callback and state variables:
var str = "...A..B...C.";
var dots = "";
var nondots = "";
var result = str.replace(/\.|[A-Z]|$/g, function(m) {
if (!m) {
// Matched the end of input; return the
// strings we've been building up
return dots + nondots;
}
// Matched a dot or letter, add to relevant
// string and return nothing
if (m === ".") {
dots += m;
} else {
nondots += m;
}
return "";
});
console.log(result);
That is, of course, incredibly ugly. :-)
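If a regular expression is not a hard requirement, a plain loop does the same partitioning in one pass. This sketch assumes that anything that is not a dot should be kept with the letters, which the question does not specify; dotsFirst is a hypothetical name.

```javascript
// Hypothetical helper: moves all dots to the front, keeping the relative
// order of the remaining characters.
function dotsFirst(str) {
  var dots = "";
  var rest = "";
  for (var i = 0; i < str.length; i++) {
    var ch = str.charAt(i);
    if (ch === ".") {
      dots += ch;
    } else {
      rest += ch; // assumption: every non-dot character is kept
    }
  }
  return dots + rest;
}

console.log(dotsFirst("...A.B."));   // .....AB
console.log(dotsFirst(".C..Z..B.")); // ......CZB
```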

Substring and regex

I have a SQL string:
var sql = "AND DT_FIM IS NULL AND ( CD_JOB, DT_INI_JOB ) IN (SELECT x.CD_JOB, x.DT_INI FROM PRS_JOBS_MAQUINA x WHERE x.CD_JOB = ':CD_JOB' AND TO_CHAR(x.DT_INI, 'YYYY-MM-DD') = ':DT_INI_JOB' AND x.DT_FIM IS NULL)";
and I need to extract the bound values (:CD_JOB, :DT_INI_JOB). The problem is that with
var bindFields=sql.substring(sql.lastIndexOf("':") + 1 , sql.lastIndexOf("'"));
it returns only one match, and I need both.
Is it possible with JavaScript? Lodash is fine too, if anybody finds it useful.
https://jsfiddle.net/oq0dmyjo/1/
I appreciate your help. Thanks.
You can use a very simple regex with a capturing group:
/':([^']+)/g
Explanation:
': - a literal sequence ':
([^']+) - Group 1 capturing 1 or more symbols other than '
g - a global modifier matching all instances.
Here are some resources to learn:
Capturing groups
How do you access the matched groups in a JavaScript regular expression?
JS code:
var re = /':([^']+)/g;
var str = "AND DT_FIM IS NULL AND ( CD_JOB, DT_INI_JOB ) IN (SELECT x.CD_JOB, x.DT_INI FROM PRS_JOBS_MAQUINA x WHERE x.CD_JOB = ':CD_JOB' AND TO_CHAR(x.DT_INI, 'YYYY-MM-DD') = ':DT_INI_JOB' AND x.DT_FIM IS NULL)";
var res = [];
var m; // declared explicitly to avoid creating an implicit global
while ((m = re.exec(str)) !== null) {
res.push(m[1]);
}
document.body.innerHTML = "<pre>" + JSON.stringify(res, 0, 4) + "</pre>";
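On engines that support String.prototype.matchAll (ES2020), the exec() loop can be written more compactly. This is a sketch under that assumption; bindNames is a hypothetical name, and the pattern includes the closing quote.

```javascript
// Hypothetical helper: collects every name written as ':NAME' in the SQL text.
// Assumes the runtime provides String.prototype.matchAll (ES2020).
function bindNames(sql) {
  return Array.from(sql.matchAll(/':([^']+)'/g), function (m) {
    return m[1]; // group 1 is the name without the leading colon
  });
}

var sample = "WHERE x.CD_JOB = ':CD_JOB' AND TO_CHAR(x.DT_INI, 'YYYY-MM-DD') = ':DT_INI_JOB'";
console.log(bindNames(sample)); // [ 'CD_JOB', 'DT_INI_JOB' ]
```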
Since it looks like you're binding the fields, you probably want to use replace with a replacement function. Use the regex from @WiktorStribiżew with the closing quote added, /':([^']+)'/g, and then use it like this:
var sql = "AND DT_FIM IS NULL AND ( CD_JOB, DT_INI_JOB ) IN (SELECT x.CD_JOB, x.DT_INI FROM PRS_JOBS_MAQUINA x WHERE x.CD_JOB = ':CD_JOB' AND TO_CHAR(x.DT_INI, 'YYYY-MM-DD') = ':DT_INI_JOB' AND x.DT_FIM IS NULL)";
var fields = {
CD_JOB: 1,
DT_INI_JOB: 2
};
sql = sql.replace(/':([^']+)'/g, function($0, $1) {
return fields[$1];
});
console.log(sql);
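One thing to watch with the snippet above: a placeholder that has no entry in fields is replaced by the text "undefined". A guarded variant could look like this sketch (bindFields is a hypothetical name). It is still plain string substitution, so where the database driver offers real bind parameters those are generally the safer route.

```javascript
// Hypothetical helper: substitutes ':NAME' placeholders and fails loudly
// when a value is missing instead of inserting "undefined".
function bindFields(sql, fields) {
  return sql.replace(/':([^']+)'/g, function (whole, name) {
    if (!Object.prototype.hasOwnProperty.call(fields, name)) {
      throw new Error("No value supplied for bind field " + name);
    }
    return String(fields[name]);
  });
}

console.log(bindFields("a = ':CD_JOB' AND b = ':DT_INI_JOB'", { CD_JOB: 1, DT_INI_JOB: 2 }));
// a = 1 AND b = 2
```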

search for keyword in extracted text javascript

So I managed to extract some text, and then I saved it as a variable for later use, how can I test for certain keywords within said text?
Here's an example, checkTitle is the text I extracted, and I want to search it for certain keywords, in this example, the words delimited by commas within compareTitle. I want to search for the strings '5' and 'monkeys'.
var checkTitle = "5 monkeys jumping on the bed";
var compareTitle = "5 , monkeys";
if (checkTitle === compareTitle) {
// ...
}
You can use regular expressions to search for strings, and the test() function to return true/false if the string contains the words.
/(?:^|\s)(?:(?:monkeys)|(?:5))\s+/gi.test("5 monkeys jumping on the bed");
will return true, because the string contains either (in this case, both) the words 5 and monkeys.
See my example here: http://regexr.com/39ti7 to use the site's tools to analyse each aspect of the regular expression.
If you need to change the words you are testing for each time, then you can use this code:
function checkForMatches(){
var words = compareTitle.split(" , "); // the keywords come from compareTitle
var patt = "";
for (var i = 0; i < words.length; i++){
// join the alternatives with "|" so there is no trailing separator to remove
patt = patt + (i > 0 ? "|" : "") + "(" + words[i] + ")";
}
patt = new RegExp(patt, "i");
return patt.test(checkTitle);
}
This will return true, and allows any words you want to be checked against. They must be separated like this:
compareTitle = "word1 , word2 , word3 , so on , ...";
// ^^^notice, space-comma-space.
And you can use it in an if statement like this:
var checkTitle = "5 monkeys jumping on the bed";
var compareTitle = "5 , monkeys";
if (checkForMatches()){
//a match was found!
}
Using the String.prototype.includes() method (called contains() in early drafts) you can search for substrings within strings. For example,
"Hi there everyone!".includes("Hi"); //true
"Hi there everyone!".includes("there e"); //true
"Hi there everyone!".includes("goodbye"); //false
You can also use String.prototype.indexOf() to a similar effect:
("Hi there everyone!".indexOf("Hi") > -1) //true
("Hi there everyone!".indexOf("Hmmn") > -1) //false
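Either substring check can be combined with Array.prototype.every() to test that all keywords are present. A minimal sketch, assuming comma-separated keywords and an engine with String.prototype.includes (ES2015); containsAll is a hypothetical name.

```javascript
// Hypothetical helper: true when every comma-separated keyword occurs in text.
function containsAll(text, keywordList) {
  var words = keywordList
    .split(",")
    .map(function (w) { return w.trim(); })
    .filter(function (w) { return w !== ""; }); // ignore empty entries
  return words.every(function (w) { return text.includes(w); });
}

console.log(containsAll("5 monkeys jumping on the bed", "5 , monkeys")); // true
console.log(containsAll("5 monkeys jumping on the bed", "5 , tigers"));  // false
```

Swapping every() for some() gives the "at least one keyword" behaviour instead.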
However, for this problem we don't need either of those, as the .match() method will do it all for us.
To return an array of the matched keywords, use .match(). It accepts a regular expression as an argument and, with the g flag, returns an array of the substrings that match it (or null if nothing matches).
var checkTitle = "5 monkeys jumping on the bed";
var compareTitle = "5 , monkeys";
/* ^^^
It is very important to separate each word with space-comma-space */
function getWords(str){ //returns an array containing key words
return str.split(" , ");
}
function matches(str, words){
var patt = "";
for (var i = 0; i < words.length; i++){
patt = (i > 0)? patt + "|" + "(" + words[i] + ")" : patt + "(" + words[i] + ")";
}
patt = new RegExp(patt, 'gi');
// g finds every match, i makes it case-insensitive
return str.match(patt);
}
matches(checkTitle, getWords(compareTitle)); // should return ['5','monkeys']
You can check whether a match exists by doing this:
if (matches(checkTitle, getWords(compareTitle))){
// matches found
}
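Building a pattern straight from user-supplied words breaks as soon as a keyword contains a regex metacharacter such as ( or +. A common remedy is to escape each word first. The sketch below shows that; escapeRegExp and findKeywords are names chosen here for illustration.

```javascript
// Escapes characters that have a special meaning inside a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: returns the matched keywords, or null if none match.
function findKeywords(text, keywordList) {
  var words = keywordList
    .split(",")
    .map(function (w) { return w.trim(); })
    .filter(function (w) { return w !== ""; });
  if (words.length === 0) {
    return null; // an empty pattern would match everywhere
  }
  var patt = new RegExp(words.map(escapeRegExp).join("|"), "gi");
  return text.match(patt);
}

console.log(findKeywords("5 monkeys jumping on the bed", "5 , monkeys")); // [ '5', 'monkeys' ]
```

Trimming each entry also removes the strict space-comma-space requirement.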
If this answer is better than the first, accept this one instead. Ask for any more help, or if it doesn't work.

Remove all dots except the first one from a string

Given a string
'1.2.3.4.5'
I would like to get this output
'1.2345'
(In case there are no dots in the string, the string should be returned unchanged.)
I wrote this
function process( input ) {
var index = input.indexOf( '.' );
if ( index > -1 ) {
input = input.substr( 0, index + 1 ) +
input.slice( index ).replace( /\./g, '' );
}
return input;
}
Live demo: http://jsfiddle.net/EDTNK/1/
It works but I was hoping for a slightly more elegant solution...
There is a pretty short solution (assuming input is your string):
var output = input.split('.');
output = output.shift() + '.' + output.join('');
If input is "1.2.3.4", then output will be equal to "1.234".
See this jsfiddle for a proof. Of course you can enclose it in a function, if you find it necessary.
EDIT:
Taking into account your additional requirement (to not modify the output if there is no dot found), the solution could look like this:
var output = input.split('.');
output = output.shift() + (output.length ? '.' + output.join('') : '');
which will leave eg. "1234" (no dot found) unchanged. See this jsfiddle for updated code.
It would be a lot easier with reg exp if browsers supported look behinds.
One way with a regular expression:
function process( str ) {
return str.replace( /^([^.]*\.)(.*)$/, function ( a, b, c ) {
return b + c.replace( /\./g, '' );
});
}
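For engines that do support lookbehind (it was standardised in ES2018), the one-regex version hinted at above can be sketched as follows. The function name keepFirstDot is hypothetical, and the snippet will not parse on older browsers.

```javascript
// Assumption: the runtime supports ES2018 lookbehind assertions.
function keepFirstDot(input) {
  // Remove every dot that has another dot somewhere before it.
  return input.replace(/(?<=\..*)\./g, "");
}

console.log(keepFirstDot("1.2.3.4.5")); // 1.2345
console.log(keepFirstDot("1234"));      // 1234
```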
You can try something like this:
str = str.replace(/\./,"#").replace(/\./g,"").replace(/#/,".");
But you have to be sure that the character # is not used in the string; or replace it accordingly.
Or this, without the above limitation:
str = str.replace(/^(.*?\.)(.*)$/, function($0, $1, $2) {
return $1 + $2.replace(/\./g,"");
});
You could also do something like this. I don't know if it is "simpler", but it uses just indexOf, replace and substr.
var str = "7.8.9.2.3";
var firstDot = str.indexOf(".");
if (firstDot > -1) { // leave strings without a dot unchanged
str = str.replace(/\./g,"");
// re-insert the dot at its original position
str = str.substr(0,firstDot)+"."+str.substr(firstDot);
}
document.write(str);
Shai.
Here is another approach:
function process(input) {
var n = 0;
return input.replace(/\./g, function() { return n++ > 0 ? '' : '.'; });
}
But one could say that this is based on side effects and therefore not really elegant.
This isn't necessarily more elegant, but it's another way to skin the cat:
var process = function (input) {
var output = input;
if (typeof input === 'string' && input !== '') {
input = input.split('.');
if (input.length > 1) {
output = [input.shift(), input.join('')].join('.');
}
}
return output;
};
Not sure what is supposed to happen if "." is the first character. I'd check for -1 in indexOf; also, if you use substr once, you might as well use it twice.
if ( index != -1 ) {
input = input.substr( 0, index + 1 ) + input.substr(index + 1).replace( /\./g, '' );
}
var i = s.indexOf(".");
var result = s.substr(0, i+1) + s.substr(i+1).replace(/\./g, "");
Somewhat tricky. Works using the fact that indexOf returns -1 if the item is not found.
Trying to keep this as short and readable as possible, you can do the following:
JavaScript
var match = string.match(/^[^.]*\.|[^.]+/g);
string = match ? match.join('') : string;
Requires a second line of code, because if match() returns null, we'll get an exception trying to call join() on null. (Improvements welcome.)
Objective-J / Cappuccino (superset of JavaScript)
string = [string.match(/^[^.]*\.|[^.]+/g) componentsJoinedByString:''] || string;
Can do it in a single line, because its selectors (such as componentsJoinedByString:) simply return null when sent to a null value, rather than throwing an exception.
As for the regular expression, I'm matching all substrings consisting of either (a) the start of the string + any potential number of non-dot characters + a dot, or (b) any existing number of non-dot characters. When we join all matches back together, we have essentially removed any dot except the first.
var input = '14.1.2';
var reversed = input.split("").reverse().join("");
// in the reversed string, drop every dot that still has a dot after it
reversed = reversed.replace(/\.(?=.*\.)/g, '');
input = reversed.split("").reverse().join("");
Based on @Tadek's answer above. This function takes other locales into consideration.
For example, some locales will use a comma for the decimal separator and a period for the thousand separator (e.g. -451.161,432e-12).
First we convert anything other than 1) numbers; 2) negative sign; 3) exponent sign into a period ("-451.161.432e-12").
Next we split by period (["-451", "161", "432e-12"]) and pop out the right-most value ("432e-12"), then join with the rest ("-451161.432e-12")
(Note that I'm tossing out the thousand separators, but those could easily be added in the join step (.join(',')).)
var ensureDecimalSeparatorIsPeriod = function (value) {
var numericString = value.toString();
var splitByDecimal = numericString.replace(/[^\d.e-]/g, '.').split('.');
if (splitByDecimal.length < 2) {
return numericString;
}
var rightOfDecimalPlace = splitByDecimal.pop();
return splitByDecimal.join('') + '.' + rightOfDecimalPlace;
};
let str = "12.1223....1322311..";
let finStr = str.replace(/(\d*\.)(.*)/, '$1') + str.replace(/(\d*\.)(.*)/, '$2').replace(/\./g,'');
console.log(finStr)
const [integer, ...decimals] = '233.423.3.32.23.244.14...23'.split('.');
const result = [integer, decimals.join('')].join('.')
The same solution as offered above, but using rest syntax in array destructuring.
It's a matter of opinion but I think it improves readability.
