RegExp to filter characters after the last dot - javascript

For example, I have a string "esolri.gbn43sh.earbnf", and I want to remove the last dot and every character after it (i.e. get "esolri.gbn43sh"). How can I do so with a regular expression?
I could of course do it without a RegExp, for example:
"esolri.gbn43sh.earbnf".slice(0, "esolri.gbn43sh.earbnf".lastIndexOf("."));
But I want a regular expression.
I tried /\..*?/, but that removes only the first dot instead.
I am using JavaScript. Any help is much appreciated.

I would use standard JS rather than a regex for this one, as it will be easier for others to understand your code:
var str = 'esolri.gbn43sh.earbnf'
console.log(
str.slice(0, str.lastIndexOf('.')) // "esolri.gbn43sh"
)

Pattern Matching
Match a dot followed by non-dots until the end of string
let re = /\.[^.]*$/;
Use this with String.prototype.replace to achieve the desired output
'foo.bar.baz'.replace(re, ''); // 'foo.bar'
Other choices
You may find it is more efficient to do a simple substring search for the last . and then use a string slicing method on this index.
let str = 'foo.bar.baz',
i = str.lastIndexOf('.');
if (i !== -1) // i = -1 means no match
str = str.slice(0, i); // "foo.bar"
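As a rough sketch, the two approaches above can be wrapped in small helpers and compared on a few inputs. The helper names `stripAfterLastDotRegex` and `stripAfterLastDotSlice` are made up for this illustration and do not come from the answers.

```javascript
// Hypothetical helper names, used only for this illustration.
function stripAfterLastDotRegex(str) {
  // Removes the last dot and everything after it; no dot means no change.
  return str.replace(/\.[^.]*$/, '');
}

function stripAfterLastDotSlice(str) {
  const i = str.lastIndexOf('.');
  // lastIndexOf returns -1 when there is no dot, so return the input untouched.
  return i === -1 ? str : str.slice(0, i);
}

console.log(stripAfterLastDotRegex('esolri.gbn43sh.earbnf')); // esolri.gbn43sh
console.log(stripAfterLastDotSlice('esolri.gbn43sh.earbnf')); // esolri.gbn43sh
console.log(stripAfterLastDotRegex('nodots'));                // nodots
```

Both helpers leave a string without any dot unchanged, which is the case the `i !== -1` check above guards against.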

Related

remove last part of string following '&&&' with JavaScript Regex

I'm trying to use a regex in JS to remove the last part of a string. This substring starts with &&&, is followed by something not &&&, and ends with .pdf.
So, for example, the final regex should take a string like:
parent&&&child&&&grandchild.pdf
and match
parent&&&child
I'm not that great with regexes, so my best effort has been something like:
.*?(?:&&&.*\.pdf)
Which matches the whole string. Can anyone help me out?
You may use this greedy regex either in replace or in match:
var s = 'parent&&&child&&&grandchild.pdf';
// using replace
var r = s.replace(/(.*)&&&.*\.pdf$/, '$1');
console.log(r);
//=> parent&&&child
// using match
var m = s.match(/(.*)&&&.*\.pdf$/)
if (m) {
console.log(m[1]);
//=> parent&&&child
}
By using the greedy pattern .* before &&&, we make sure to match the last instance of &&& in the input.
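To make the effect of greediness concrete, here is a small sketch that runs the same replacement twice, once with the greedy `(.*)` from the answer and once with a lazy `(.*?)`. The lazy variant is added here only for contrast.

```javascript
const s = 'parent&&&child&&&grandchild.pdf';

// Greedy: (.*) grabs as much as possible, so &&& matches the LAST separator.
const greedy = s.replace(/(.*)&&&.*\.pdf$/, '$1');

// Lazy: (.*?) grabs as little as possible, so &&& matches the FIRST separator.
const lazy = s.replace(/(.*?)&&&.*\.pdf$/, '$1');

console.log(greedy); // parent&&&child
console.log(lazy);   // parent
```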
You want to remove the last portion, so replace it
var str = "parent&&&child&&&grandchild.pdf"
var result = str.replace(/&&&[^&]+\.pdf$/, '')
console.log(result)
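Note that `[^&]+` assumes the last segment contains no `&` character at all. If that cannot be guaranteed, a non-regex sketch along the following lines may be more robust. The function name `stripLastSegment` and its `sep` parameter are hypothetical.

```javascript
// Hypothetical helper: drops the final "&&&..." segment of a .pdf name.
function stripLastSegment(str, sep = '&&&') {
  const i = str.lastIndexOf(sep);
  // Leave the input alone if there is no separator or it is not a .pdf name.
  if (i === -1 || !str.endsWith('.pdf')) return str;
  return str.slice(0, i);
}

console.log(stripLastSegment('parent&&&child&&&grandchild.pdf')); // parent&&&child
console.log(stripLastSegment('parent&&&child&&&tom&jerry.pdf'));  // parent&&&child
console.log(stripLastSegment('plain.pdf'));                       // plain.pdf
```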

Get last occurrence using RegEx

I have a huge string with this inside:
linha[0] = '12/2010 281R$ 272.139,05 ';
linha[0] = '13SL 1R$ 226.185,81 ';
Both lines are separate, and I need to get the last occurrence of the two. I'm using the following regex to match the first one:
/linha\[0]\s=\s'(.*)';/
I would like to get the second "linha..." too, but I don't know exactly how.
This is how I'm using the regex to get the first "linha...":
string.match(/linha\[0]\s=\s'(.*)';/);
output:
linha[0] = '12/2010 281R$ 272.139,05 ';
Also, I can't do extra work; I need to get the second occurrence using only a regex.
If you want to get the last occurrence of your regex in a string (and assuming it exists), you can do
var str = hugeString.match(/linha\[0]\s=\s'([^']*)';/g).pop();
(Yes, I changed .* to [^']* for better efficiency; ignore that if you have quotes in your inner string.)
Now, if you want to extract just the submatch, you can do
var regex = /linha\[0]\s=\s'([^']*)';/g,
arr,
str;
while (arr = regex.exec(hugeString)) str = arr[1];
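In engines that provide `String.prototype.matchAll` (ES2020), the same idea can be sketched without a manual `exec` loop. The sample `hugeString` below is assembled from the two lines quoted in the question and is only a stand-in for the real input.

```javascript
// Stand-in for the real input, built from the two lines in the question.
const hugeString = [
  "linha[0] = '12/2010 281R$ 272.139,05 ';",
  "linha[0] = '13SL 1R$ 226.185,81 ';"
].join('\n');

const regex = /linha\[0]\s=\s'([^']*)';/g;

// matchAll requires the g flag and yields one match object per occurrence.
const matches = [...hugeString.matchAll(regex)];
const last = matches.length > 0 ? matches[matches.length - 1] : null;

console.log(last ? last[0] : null); // full text of the last occurrence
console.log(last ? last[1] : null); // only the part between the quotes
```

Unlike `match(...).pop()`, this keeps the capture group of the last match and does not throw when nothing matches.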

How to split a string by a character not directly preceded by a character of the same type?

Let's say I have a string: "We.need..to...split.asap". What I would like to do is to split the string by the delimiter ., but I only wish to split by the first . and include any recurring .s in the succeeding token.
Expected output:
["We", "need", ".to", "..split", "asap"]
In other languages, I know that this is possible with a look-behind /(?<!\.)\./ but Javascript unfortunately does not support such a feature.
I am curious to see your answers to this question. Perhaps there is a clever use of look-aheads that presently evades me?
I was considering reversing the string, then re-reversing the tokens, but that seems like too much work for what I am after... plus controversy: How do you reverse a string in place in JavaScript?
Thanks for the help!
Here's a variation of the answer by guest271314 that handles more than two consecutive delimiters:
var text = "We.need.to...split.asap";
var re = /(\.*[^.]+)\./;
var items = text.split(re).filter(function(val) { return val.length > 0; });
It relies on the fact that if the split expression includes a capture group, the captured items are included in the returned array. These captures are actually the only thing we are interested in; the tokens between them are all empty strings (apart from the final segment, which has no trailing delimiter), and we filter the empty ones out.
EDIT: Unfortunately there's perhaps one slight bug with this. If the text to be split starts with a delimiter, that will be included in the first token. If that's an issue, it can be remedied with:
var re = /(?:^|(\.*[^.]+))\./;
var items = text.split(re).filter(function(val) { return !!val; });
(I think this regex is ugly and would welcome an improvement.)
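To see what the capture group contributes, the following sketch prints the raw result of `split` before filtering, then runs the second regex on an input that starts with a delimiter. The input `".We.need"` is a made-up example.

```javascript
const text = "We.need.to...split.asap";

// Captured groups are spliced into the result between the (empty) tokens.
const raw = text.split(/(\.*[^.]+)\./);
console.log(JSON.stringify(raw));
// ["","We","","need","","to","","..split","asap"]

const items = raw.filter(Boolean);
console.log(JSON.stringify(items));
// ["We","need","to","..split","asap"]

// Made-up input with a leading delimiter, using the second regex.
// The unmatched capture shows up as undefined, which filter(Boolean) drops.
const leading = ".We.need".split(/(?:^|(\.*[^.]+))\./).filter(Boolean);
console.log(JSON.stringify(leading));
// ["We","need"]
```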
You can do this without any lookaheads:
var subject = "We.need.to....split.asap";
var regex = /\.?(\.*[^.]+)/g;
var matches, output = [];
while(matches = regex.exec(subject)) {
output.push(matches[1]);
}
document.write(JSON.stringify(output));
It seemed like it'd work in one line, as it did on https://regex101.com/r/cO1dP3/1, but had to be expanded in the code above because the /g option by default prevents capturing groups from returning with .match (i.e. the correct data was in the capturing groups, but we couldn't immediately access them without doing the above).
See: JavaScript Regex Global Match Groups
An alternative solution with the original one liner (plus one line) is:
document.write(JSON.stringify(
"We.need.to....split.asap".match(/\.?(\.*[^.]+)/g)
.map(function(s) { return s.replace(/^\./, ''); })
));
Take your pick!
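Where `String.prototype.matchAll` is available (ES2020), the capture groups of a global regex can be collected in one expression. This is only a sketch of an alternative to the `exec` loop above, not part of the original answer.

```javascript
const subject = "We.need.to....split.asap";

// matchAll yields full match objects, so group 1 stays accessible with /g.
const output = Array.from(
  subject.matchAll(/\.?(\.*[^.]+)/g),
  function (m) { return m[1]; }
);

console.log(JSON.stringify(output));
// ["We","need","to","...split","asap"]
```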
Note: This answer can't handle more than 2 consecutive delimiters, since it was written according to the example in the revision 1 of the question, which was not very clear about such cases.
var text = "We.need.to..split.asap";
// split "." if followed by "."
var res = text.split(/\.(?=\.)/).map(function(val, key) {
// if `val[0]` does not begin with "." split "."
// else split "." if not followed by "."
return val[0] !== "." ? val.split(/\./) : val.split(/\.(?!.*\.)/)
});
// concat arrays `res[0]` , `res[1]`
res = res[0].concat(res[1]);
document.write(JSON.stringify(res));
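Lookbehind assertions were added to the language in ES2018, so on engines that implement them the pattern mentioned in the question can be used directly. The sketch below assumes such a runtime; older engines reject the regex literal with a SyntaxError.

```javascript
const text = "We.need..to...split.asap";

// Split on a dot only when it is NOT directly preceded by another dot.
const tokens = text.split(/(?<!\.)\./);

console.log(JSON.stringify(tokens));
// ["We","need",".to","..split","asap"]
```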

find and remove words matching a substring in a sentence

Is it possible to use regex to find all words within a sentence that contains a substring?
Example:
var sentence = "hello my number is 344undefined848 undefinedundefined undefinedcalling whistleundefined";
I need to find all words in this sentence which contains 'undefined' and remove those words.
Output should be "hello my number is ";
FYI - currently I tokenize (javascript) and iterate through all the tokens to find and remove, then merge the final string. I need to use regex. Please help.
Thanks!
You can use:
str = str.replace(/ *\b\S*?undefined\S*\b/g, '');
It certainly is possible.
Something like start of word, zero or more letters, "undefined", zero or more letters, end of word should do it.
A word boundary is \b outside a character class, so:
\b\w*?undefined\w*?\b
This uses non-greedy repetition so that the letter matching does not first consume "undefined" itself and then cause lots of backtracking.
Edit: switched [a-zA-Z] to \w because the example includes numbers in the "words".
\S*undefined\S*
Try this simple regex. Replace by an empty string. See the demo:
https://www.regex101.com/r/fG5pZ8/5
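As a quick check, here is a sketch that applies two of the patterns suggested above to the sentence from the question. The variable names are chosen for this illustration only.

```javascript
const sentence = "hello my number is 344undefined848 undefinedundefined undefinedcalling whistleundefined";

// This pattern also consumes the spaces in front of each removed word.
const withLeadingSpaces = sentence.replace(/ *\b\S*?undefined\S*\b/g, '');

// This pattern removes only the words, so the separating spaces remain
// and have to be trimmed afterwards.
const wordsOnly = sentence.replace(/\S*undefined\S*/g, '').trimEnd();

console.log(JSON.stringify(withLeadingSpaces)); // "hello my number is"
console.log(JSON.stringify(wordsOnly));         // "hello my number is"
```

If the removed words can appear in the middle of the sentence, the second pattern leaves doubled spaces behind, which may or may not matter for the use case.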
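You can use the str.replace function like this (note that this removes only the substring "undefined" itself, not the whole word that contains it):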
str = str.replace(/undefined/g, '');
Since there are enough solutions with regular expressions, here is another one, using arrays and a simple function that finds an occurrence of a string in a string :)
Even though the code looks more "dirty", it may run faster than a regular expression, so it might make sense to consider it when dealing with LARGE strings (measure on your own data to be sure).
var sentence = "hello my number is 344undefined848 undefinedundefined undefinedcalling whistleundefined";
var array = sentence.split(' ');
var sanitizedArray = [];
for (var i = 0; i < array.length; i++) {
if (array[i].indexOf('undefined') === -1) {
sanitizedArray.push(array[i]);
}
}
var sanitizedSentence = sanitizedArray.join(' ');
alert(sanitizedSentence);
Fiddle: http://jsfiddle.net/448bbumh/
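The same tokenize-and-filter idea can be sketched more compactly with `Array.prototype.filter` and `String.prototype.includes` (ES2015). This is a restatement of the loop above, with no claim about its relative speed.

```javascript
const sentence = "hello my number is 344undefined848 undefinedundefined undefinedcalling whistleundefined";

// Keep only the words that do not contain the substring "undefined".
const sanitizedSentence = sentence
  .split(' ')
  .filter(function (word) { return !word.includes('undefined'); })
  .join(' ');

console.log(JSON.stringify(sanitizedSentence)); // "hello my number is"
```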

RegExp match a single quoted text without quotes - JavaScript

I'm sorry if it is a confusing question. I was trying to find a way to do this but couldn't find it so, if it is a repeated question, my apologies!
I have a text something like this: something:"askjnqwe234"
I want to be able to get askjnqwe234 using a RegExp. Note that I want to omit the quotes. I tried /[^"]+(?=(" ")|"$)/g, but it returns an array. I want a RegExp that returns a single string, not an array.
I don't know if it's possible but I do not want to specify the position of the array; something like this:
var x = string.match(/[^"]+(?=(" ")|"$)/g)[0];
Thanks!
Try:
/"([^"]*)"/g
In English: look for a ", then match and record anything that isn't a " until you see another ".
match and exec always return an array or null, so, assuming you have a single double-quoted value and no newlines in the string, you could use
var x;
var str = 'something:"askjnqwe234"';
x = str.replace( /^[^"]*"|".*/g, '' );
// "askjnqwe234"
Or, if you may have other quoted values in the string
x = str.replace( /.*?something:"([^"]*)".*/, '$1' );
where $1 refers to the substring captured by the sub-pattern [^"]* between the ().
Further explanation on request.
Notwithstanding the above, I recommend that you tolerate the array indexing and just use match.
You can capture the information inside quotes like this, assuming it matches:
var x = string.match(/something:"([^"]*)"/)[1];
The memory capture at index 1 is the part inside the double quotes.
If you're not sure it will match:
var match = string.match(/something:"([^"]*)"/);
if (match) {
// use match[1] here
}
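If the array indexing is the main annoyance, it can be hidden inside a small helper. The sketch below assumes the `something:` prefix from the question, and the function name `getQuotedValue` is hypothetical.

```javascript
// Hypothetical helper: returns the quoted value after something:, or null.
function getQuotedValue(text) {
  const match = /something:"([^"]*)"/.exec(text);
  // exec returns null when there is no match, so guard before indexing.
  return match ? match[1] : null;
}

console.log(getQuotedValue('something:"askjnqwe234"')); // askjnqwe234
console.log(getQuotedValue('nothing quoted here'));     // null
```

The caller then gets a plain string (or null) and never touches the match array directly.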
