Replace multiple identical characters with a string - javascript

Using Javascript, I want to replace:
This is a test, please complete ____.
with:
This is a test, please complete %word%.
The number of underlines isn't consistent, so I cannot just use something like str.replace('_____', '%word%').
I've tried str.replace(/(_)*/g, '%word%') but it didn't work. Any suggestions?

Remove the capturing group, and make sure _ repeats with + (at least one occurrence, matches as many _s as possible):
const str = 'This is a test, please complete ____.';
console.log(
str.replace(/_+/g, '%word%')
);
The regular expression
/(_)*/
means, in plain language: match zero or more underscores, which of course isn't what you're looking for. That will match every position in the string (except positions in the string between underscores).

I'm going to suggest a slightly different approach to this. Instead of maintaining the sentence as you currently have it, instead maintain something like this:
This is the {$1} test, please complete {$2}.
When you want to render this sentence, use a regex replacement to replace the placeholders with underscores:
var sentence = "This is the {$1} test, please complete {$2}.";
var show = sentence.replace(/\{\$\d+\}/g, "____");
console.log(show);
When you want to replace a given placeholder, you may also use a targeted regex replacement. For example, to target the first placeholder you could use:
var sentence = "This is the {$1} test, please complete {$2}.";
var show = sentence.replace(/\{\$1\}/g, "first");
console.log(show);
This is a fairly robust and scalable solution, and is more accurate than just doing a single blanket replacement of all underscores.

Related

Regular expression to match a string which is NOT matched by a given regexp

I've been hoving around by some answers here, and I can't find a solution to my problem:
I have this regexp which matches everyting inside an HTML span tag, including contents:
<span\b[^>]*>(.*?)</span>
and I want to find a way to make a search in all the text, except for what is matched with that regexp.
For example, if my text is:
var text = "...for there is a class of <span class="highlight">guinea</span> pigs which..."
... then the regexp would match:
<span class="highlight">guinea</span>
and I want to be able to make a regexp such that if I search for "class", regexp will match "...for there is a class of..."
and will not match inside the tag, like in
"... class="highlight"..."
The word to be matched ("class") might be anywhere within the text. I've tried
(?!<span\b[^>]*>(.*?)</span>)class
but it keeps searching inside tags as well.
I want to find a solution using only regexp, not dealing with DOM nor JQuery. Thanks in advance :).
Although I wouldn't recommend this, I would do something like below
(class)(?:(?=.*<span\b[^>]*>))|(?:(?<=<\/span>).*)(class)
You can see this in action here
Rubular Link for this regex
You can capture your matches from the groups and work with them as needed. If you can, use a HTML parser and then find matches from the text element.
It's not pretty, but if I get you right, this should do what you wan't. It's done with a single RegEx but js can't (to my knowledge) extract the result without joining the results in a loop.
The RegEx: /(?:<span\b[^>]*>.*?<\/span>)|(.)/g
Example js code:
var str = '...for there is a class of <span class="highlight">guinea</span> pigs which...',
pattern = /(?:<span\b[^>]*>.*?<\/span>)|(.)/g,
match,
res = '';
match = pattern.exec(str)
while( match != null )
{
res += match[1];
match = pattern.exec(str)
}
document.writeln('Result:' + res);
In English: Do a non capturing test against your tag-expression or capture any character. Do this globally to get the entire string. The result is a capture group for each character in your string, except the tag. As pointed out, this is ugly - can result in a serious number of capture groups - but gets the job done.
If you need to send it in and retrieve the result in one call, I'd have to agree with previous contributors - It can't be done!

Matching invisible characters in JavaScript RegEx

I've got some string that contain invisible characters, but they are in somewhat predictable places. Typically the surround the piece of text I want to extract, and then after the 2nd occurrence I want to keep the rest of the text.
I can't seem to figure out how to both key off of the invisible characters, and exclude them from my result. To match invisibles I've been using this regex: /\xA0\x00-\x09\x0B\x0C\x0E-\x1F\x7F/ which does seem to work.
Here's an example: [invisibles]Keep as match 1[invisibles]Keep as match 2
Here's what I've been using so far without success:
/([\xA0\x00-\x09\x0B\x0C\x0E-\x1F\x7F]+)(.+)([\xA0\x00-\x09\x0B\x0C\x0E-\x1F\x7F]+)/(.+)
I've got the capture groups in there, but it's bee a while since I've had to use regex's in this way, so I know I'm missing something important. I was hoping to just make the invisible matches non-capturing groups, but it seems that JavaScript does not support this.
Something like this seems like what you want. The second regex you have pretty much works, but the / is in totally the wrong place. Perhaps you weren't properly reading out the group data.
var s = "\x0EKeep as match 1\x0EKeep as match 2";
var r = /[\xA0\x00-\x09\x0B\x0C\x0E-\x1F\x7F]+(.+)[\xA0\x00-\x09\x0B\x0C\x0E-\x1F\x7F]+(.+)/;
var match = s.match(r);
var part1 = match[1];
var part2 = match[2];

javascript regex to extract the first character after the last specified character

I am trying to extract the first character after the last underscore in a string with an unknown number of '_' in the string but in my case there will always be one, because I added it in another step of the process.
What I tried is this. I also tried the regex by itself to extract from the name, but my result was empty.
var s = "XXXX-XXXX_XX_DigitalF.pdf"
var string = match(/[^_]*$/)[1]
string.charAt(0)
So the final desired result is 'D'. If the RegEx can only get me what is behind the last '_' that is fine because I know I can use the charAt like currently shown. However, if the regex can do the whole thing, even better.
If you know there will always be at least one underscore you can do this:
var s = "XXXX-XXXX_XX_DigitalF.pdf"
var firstCharAfterUnderscore = s.charAt(s.lastIndexOf("_") + 1);
// OR, with regex
var firstCharAfterUnderscore = s.match(/_([^_])[^_]*$/)[1]
With the regex, you can extract just the one letter by using parentheses to capture that part of the match. But I think the .lastIndexOf() version is easier to read.
Either way if there's a possibility of no underscores in the input you'd need to add some additional logic.

Javascript string validation using the regex object

I am complete novice at regex and Javascript. I have the following problem: need to check into a textfield the existence of one (1) or many (n) consecutive * (asterisk) character/characters eg. * or ** or *** or infinite (n) *. Strings allowed eg. *tomato or tomato* or **tomato or tomato** or as many(n)*tomato many(n)*. So, far I had tried the following:
var str = 'a string'
var value = encodeURIComponent(str);
var reg = /([^\s]\*)|(\*[^\s])/;
if (reg.test(value) == true ) {
alert ('Watch out your asterisks!!!')
}
By your question it's hard to decipher what you're after... But let me try:
Only allow asterisks at beginning or at end
If you only allow an arbitrary number (at least one) of asterisks either at the beginning or at the end (but not on both sides) like:
*****tomato
tomato******
but not **tomato*****
Then use this regular expression:
reg = /^(?:\*+[^*]+|[^*]+\*+)$/;
Match front and back number of asterisks
If you require that the number of asterisks at the biginning matches number of asterisks at the end like
*****tomato*****
*tomato*
but not **tomato*****
then use this regular expression:
reg = /^(\*+)[^*]+\1$/;
Results?
It's unclear from your question what the results should be when each of these regular expressions match? Are strings that test positive to above regular expressions fine or wrong is on you and your requirements. As long as you have correct regular expressions you're good to go and provide the functionality you require.
I've also written my regular expressions to just exclude asterisks within the string. If you also need to reject spaces or anything else simply adjust the [^...] parts of above expressions.
Note: both regular expressions are untested but should get you started to build the one you actually need and require in your code.
If I understand correctly you're looking for a pattern like this:
var pattern = /\**[^\s*]+\**/;
this won't match strings like ***** or ** ***, but will match ***d*** *d or all of your examples that you say are valid (***tomatos etc).If I misunderstood, let me know and I'll see what I can do to help. PS: we all started out as newbies at some point, nothing to be ashamed of, let alone apologize for :)
After the edit to your question I gather the use of an asterisk is required, either at the beginning or end of the input, but the string must also contain at least 1 other character, so I propose the following solution:
var pattern = /^\*+[^\s*]+|[^\s*]+\*+$/;
'****'.match(pattern);//false
' ***tomato**'.match(pattern);//true
If, however *tomato* is not allowed, you'll have to change the regex to:
var pattern = /^\*+[^\s*]+$|^[^\s*]+\*+$/;
Here's a handy site to help you find your way in the magical world of regular expressions.

RegExp: how to exclude matched groups from $N?

I've made a working regexp, but i think it's not the best use-case:
el = '<div style="color:red">123</div>';
el.replace(/(<div.*>)(\d+)(<\/div>)/g, '$1<b>$2</b>$3');
// expecting result: <div style="color:red"><b>123</b></div>
After googling i've found that (?: ... ) in regexps - means ignoring group match, thus:
el.replace(/(?:<div.*>)(\d+)(?:<\/div>)/g, '<b>$1</b>');
// returns <b>123</b>
but i need an expecting result from 1st example.
Is there a way to exclude 'em? just to write replace(/.../, '<b>$1</b>')?
This is just a little case for understanding how to exclude groups in regexp. And i know, what we can't parse HTML with regexp :)
So you want to get the same result while only using the replacement <b>$1</b>?
In your case just replace(/\d+/, '<b>$&</b>') would suffice.
But if you want to make sure there are div tags around the number, you could use lookarounds and \K like in the following expression. Except that JS does not support lookbehind nor \K, so you're out of luck, you have to use a capturing group for that in JS.
<div[^>]*>\K\d+(?=</div>)
There nothing wrong with a replacement value of '$1<b>$2</b>$3'. I would just change your regex to this:
el = '<div style="color:red">123</div>';
el.replace(/(<div[^>]*>)(\d+)(<\/div>)/g, '$1<b>$2</b>$3');
Changing how it matches the first div keeps the full match on the div tags, but makes sure it matches the minimum possible before the closing > of the first div tag rather than the maximum possible.
With your regex, you would not get what you wanted with this input string:
el = '<div style="color:red">123</div><div style="color:red">456</div>';
The problem with using something like:
el.replace(/\d+/, '<b>$&</b>')
is that doesn't work properly with things like this:
el = '<div style="margin-left: 10px">123</div>'
because it picks up the numbers inside the div tag.

Categories

Resources