RegExp function not working with alternation - javascript

string=string.replace(RegExp(filter[a]+" | "+filter[a],"g"),filter[a])
For some reason, this isn't affecting both the filter followed by the space and the filter with a space in front. Assuming the filter is ",", it would take the second side and only replace " ," rather than " ," and ", ". The filter is user-specified, so I can't use a normal regular expression (which DOES work) such as string=string.replace(/, | ,/g,filter[a])
Can someone explain to me why it doesn't work and how to make it work?

It works for me:
s = 'abc, def,ghi ,klm'
a = ','
s = s.replace(RegExp(a + " | " + a, "g"), a)
"abc,def,ghi,klm"
Remember that your regular expression won't replace " , " with ",". You could try using this pattern instead:
" ?" + filter[a] + " ?"
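Because the filter is user-specified, it may contain characters such as . or | that the RegExp constructor would treat as regex syntax. The following is a minimal sketch of the pattern above with the filter escaped first; escapeRegExp and tightenAround are hypothetical helper names, not part of the original code.

```javascript
// Hypothetical helper: escape characters that have special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Remove one optional space on either side of every occurrence of `filter`.
function tightenAround(text, filter) {
  const pattern = new RegExp(" ?" + escapeRegExp(filter) + " ?", "g");
  // A function replacement keeps "$" in `filter` from being read as a replacement token.
  return text.replace(pattern, () => filter);
}

console.log(tightenAround("abc, def,ghi ,klm , nop", ",")); // "abc,def,ghi,klm,nop"
console.log(tightenAround("a . b.c", "."));                  // "a.b.c"
```

Without the escaping step, a filter of "." would match any character, so every character next to a space would be replaced.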

Related

RegExp to replace exact OR similar word JS

I'm struggling with a regex to replace the name of a variable inside a string...
I need to put a certain variable name outside the quotes, and, if that variable has a qualifier (like a property or method), this qualifier needs to be outside the quotes in the final string, too.
So, given this example:
cExp = new RegExp('oErro', 'g');
cMsg = "Error ocurred: oErro; please try again";
cMsg.replace(cExp, '\' + oErro + \'')
the output is exactly what I expect:
'Error ocurred: ' + oErro + '; please try again'
I searched for how to include any words after the variable name, and ended up with this piece of code:
cExp = new RegExp('oErro(\.[^\ |^\;|^\,|^\)|^\}]*)', 'g');
cMsg = "Error ocurred: oErro.message; please try again";
cMsg.replace(cExp, '\' + oErro$1 + \'')
and the result is exactly what I expected to see:
'Error ocurred: ' + oErro.message + '; please try again'
So far, so good.
But, if I mix variable name with variable.qualifier, things start to get messy:
cExp = new RegExp('oErro(\.[^\ |^\;|^\,|^\)|^\}]*)', 'g');
cMsg = "Error ocurred: oErro.message (complete message: oErro)";
cMsg.replace(cExp, '\' + oErro$1 + \'')
I GET this output
'Error ocurred: ' + oErro.message + ' (complete message: ' + oErro) + ''
while I EXPECTED this output (note the parenthesis INSIDE the quotes)
'Error ocurred: ' + oErro.message + ' (complete message: ' + oErro + ')'
In other words, every time "oErro" is used without a qualifier, the expression grabs the next word and joins it with oErro, outside the enclosing quotes.
I'm certainly doing something wrong, but I'm not very familiar with RegExp and may not be searching with the correct terms to find appropriate help.
What I need is an expression that works for both scenarios (moving the word "oErro" or the syntax "oErro.something" outside the quotes in the final string)...
Thanks in advance, and sorry for the poor English. I tried to include some examples, but feel free to ask if you need more details on what I need to achieve.
You may use
cExp=/oErro(?:\.[^\s;,)}]+)?/g
// Or, if the chars after `.` can only be letters/digits/underscores
cExp = /oErro(?:\.\w+)?/g
Then, you would need to use
cMsg.replace(cExp, '\' + $& + \'')
where $& is the backreference to the whole match value.
Pattern details
oErro - a literal string
(?:\.\w+)? - an optional (due to ? at the end) non-capturing group that matches 1 or 0 occurrences of:
  \. - a dot
  \w+ - 1+ letters/digits/underscores
[^\s;,)}]+ - 1 or more chars other than whitespace, ;, ,, ) and }.
I believe your requirement for capturing the property or method name is satisfied by using the \w character in your regex.
With oErro
cMsg = "Error ocurred: oErro; please try again";
cMsg.replace(/(oErro(\.\w+)?)/g, '\' + $1 + \'');
// Output: "Error ocurred: ' + oErro + '; please try again"
With oErro.message
cMsg = "Error ocurred: oErro.message (complete message: oErro)";
cMsg.replace(/(oErro(\.\w+)?)/g, '\' + $1 + \'');
// Output: "Error ocurred: ' + oErro.message + ' (complete message: ' + oErro + ')"
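A small runnable sketch of the approach from the answers, assuming the pattern is still built from a string with new RegExp as in the question. One thing it illustrates: inside a string literal "\." is just ".", so the question's pattern matched any character after oErro (including the closing parenthesis). Doubling the backslashes avoids that. Wrapping the result in outer quotes is an assumption made here to reproduce the expected output.

```javascript
// In a string literal, "\." is just ".", so the backslash never reaches the regex.
console.log("\." === "."); // true

// Backslashes are doubled so the regex engine sees \. and \w.
const cExp = new RegExp("oErro(?:\\.\\w+)?", "g");
const cMsg = "Error ocurred: oErro.message (complete message: oErro)";

// "$&" inserts the whole match; the outer quotes are added around the result.
const result = "'" + cMsg.replace(cExp, "' + $& + '") + "'";
console.log(result);
// 'Error ocurred: ' + oErro.message + ' (complete message: ' + oErro + ')'
```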

How to replace found regex sub string with spaces with equal length in javascript?

In javascript if I have something like
string.replace(new RegExp(regex, "ig"), " ")
this replaces all found regexes with a single space. But how would I do it if I wanted to replace all found regexes with spaces that matched in length?
so if regex was \d+, and the string was
"123hello4567"
it changes to
"   hello    "
Thanks
The replacement argument (the second one) to .replace can be a function. This function is called for each match, with the matched text as its first argument.
Knowing the length of the matched text, you can return the same number of spaces as the replacement value.
In the code below I use . as the replacement character to make the result easier to see.
Note: this uses String#repeat, which is not available in IE11 (but then, neither are arrow functions); you can always use a polyfill and a transpiler.
let regex = "\\d+";
console.log("123hello4567".replace(new RegExp(regex, "ig"), m => '.'.repeat(m.length)));
Internet Exploder friendly version
var regex = "\\d+";
console.log("123hello4567".replace(new RegExp(regex, "ig"), function (m) {
return Array(m.length+1).join('.');
}));
Thanks to @nnnnnn for the shorter IE-friendly version.
"123hello4567".replace(new RegExp(/[\d]/, "ig"), " ")
1 => " "
2 => " "
3 => " "
"   hello    "
"123hello4567".replace(new RegExp(/[\d]+/, "ig"), " ")
123 => " "
4567 => " "
" hello "
If you just want to replace every digit with a space, keep it simple:
var str = "123hello4567";
var res = str.replace(/\d/g,' ');
"   hello    "
This answers your example, but not exactly your question. What if the regex could match a different number of characters depending on the string, or it isn't as simple as \d repeated? You could do something like this:
var str = "123hello456789goodbye12and456hello12345678again123";
var regex = /(\d+)/;
var match = regex.exec(str);
while (match != null) {
// Create string of spaces of same length
var replaceSpaces = match[0].replace(/./g,' ');
str = str.replace(regex, replaceSpaces);
match = regex.exec(str);
}
"   hello      goodbye  and   hello        again   "
This loops, executing the regex once per match (instead of using /g for global).
Performance-wise, this could likely be sped up by building a string of spaces the same length as match[0] directly. This would remove the regex replace within the loop. If performance isn't a high priority, this should work fine.
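The speed-up described above can be sketched as a single pass that builds the run of spaces directly from the match length. blankMatches is a hypothetical helper name, and it assumes the caller passes a RegExp object; the "g" flag is added if it is missing.

```javascript
// Replace every match of `pattern` with the same number of spaces.
function blankMatches(str, pattern) {
  // Make sure the pattern is global so all matches are replaced in one pass.
  const flags = pattern.flags.includes("g") ? pattern.flags : pattern.flags + "g";
  const globalPattern = new RegExp(pattern.source, flags);
  return str.replace(globalPattern, (m) => " ".repeat(m.length));
}

console.log(JSON.stringify(blankMatches("123hello4567", /\d+/)));
// "   hello    "
```

Because the string length never changes, character positions in the result line up with positions in the input.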

indexOf for multiple options

Let's say I get the following using var content = this.innerHTML:
w here </div>
Using indexOf (or other ways), I want to check for the first position that has either "Space", "<" or "&nbsp".
In this case, it will be 1 (after "w").
What I am confused about is how do I check for the very first position that has either one of these three choices? Do I use Do...while to check for individual "options"?
You're probably looking for a Regular Expression (Regex) and the String#search method. Regex is a bit much to learn all at once, but I'll explain this example code.
You can use square brackets to denote a set of characters, so for example [ <] says "match either a space or a less-than sign."
You can use the pipe | to separate possibilities if you want to match one pattern or another, and that's how to account for matching a non-breaking space HTML entity.
var string = 'w here </div>',
index = string.search(/[ <]|&nbsp;/)
console.log(index) //=> 1
You can use a regular expression with alternations (|), which means "match one of these things". That will also tell you what you found, if that's useful:
function check(str) {
var m = / |<|&nbsp;/.exec(str);
if (!m) {
console.log("Not found in '" + str + "'");
return;
}
console.log("'" + m[0] + "' found at index " + m.index + " in '" + str + "'");
}
check("w here </div>");
check("where </div>");
check("where</div>");
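For comparison, here is a sketch of the plain indexOf approach the question asks about: look up each option separately and keep the smallest index found. firstIndexOfAny is a hypothetical helper name, and the "&nbsp;" entity in the sample string is an assumption about what innerHTML returned.

```javascript
// Return the lowest index at which any of `needles` occurs in `str`, or -1.
function firstIndexOfAny(str, needles) {
  let best = -1;
  for (const needle of needles) {
    const i = str.indexOf(needle);
    if (i !== -1 && (best === -1 || i < best)) {
      best = i;
    }
  }
  return best;
}

const options = [" ", "<", "&nbsp;"];
console.log(firstIndexOfAny("w&nbsp;here </div>", options)); // 1
console.log(firstIndexOfAny("where</div>", options));        // 5
console.log(firstIndexOfAny("where", options));              // -1
```

This scans the string once per option, so the regex versions are usually the more compact choice; the loop is mainly useful when the options are not known until run time.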

JavaScript equivalent to URLEncoder.encode("String", "UTF-8") of Java

I want to encode a string to UTF-8 in JavaScript. In java we use URLEncoder.encode("String", "UTF-8") to achieve this.
I know we can use encodeURI or encodeURIComponent but it is producing different output than URLEncoder.encode
Can anyone please suggest any available JS method that can be used to achieve same output as URLEncoder.encode.
NOTE:
Due to restrictions I cannot use jQuery.
I don't know if this javaURLEncode() function is a spot-on match for Java's URLEncoder.encode method, but it might be close to what you're looking for:
function javaURLEncode(str) {
return encodeURI(str)
.replace(/%20/g, "+")
.replace(/!/g, "%21")
.replace(/'/g, "%27")
.replace(/\(/g, "%28")
.replace(/\)/g, "%29")
.replace(/~/g, "%7E");
}
var testString = "It's ~ (crazy)!";
var jsEscape = escape(testString);
var jsEncodeURI = encodeURI(testString);
var jsEncodeURIComponent = encodeURIComponent(testString);
var javaURLEncoder = javaURLEncode(testString);
alert("Original: " + testString + "\n" +
"JS escape: " + jsEscape + "\n" +
"JS encodeURI: " + jsEncodeURI + "\n" +
"JS encodeURIComponent: " + jsEncodeURIComponent + "\n" +
"Java URLEncoder.encode: " + javaURLEncoder);
I found one more character that should be replaced:
.replace(/\$/g, "%24")
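The need for the extra $ replacement comes from encodeURI, which leaves reserved characters such as $ & = + / ? unescaped, whereas URLEncoder.encode escapes them. A variant sketch based on encodeURIComponent covers those without listing each one; javaStyleUrlEncode is a hypothetical name, and the result has only been reasoned through for the characters shown, not verified against every input Java accepts.

```javascript
// Approximate Java's URLEncoder.encode(str, "UTF-8"):
// letters, digits and . - * _ stay as-is, a space becomes "+",
// everything else is percent-encoded as UTF-8 bytes.
function javaStyleUrlEncode(str) {
  return encodeURIComponent(str)
    .replace(/%20/g, "+")
    // encodeURIComponent leaves ! ' ( ) ~ untouched; percent-encode them too.
    .replace(/[!'()~]/g, (ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase());
}

console.log(javaStyleUrlEncode("It's ~ (crazy)!")); // It%27s+%7E+%28crazy%29%21
console.log(javaStyleUrlEncode("a&b=c+d/e$"));      // a%26b%3Dc%2Bd%2Fe%24
```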

match with Regular Expressions

I want to use match and a regular expression to split a string into an array.
Example:
var strdoc = '<p>noi dung</p>bài viết đúng.Đó thực sự là, cuối cùng';
var arrdocobj = strdoc.match(/(<.+?>)|(\s)|(\w+)(.+?)/g);
When I do console.log arrdocobj, it results in
["<p>", "noi ", "dung<", "p>", "bà", "i ", "viế", "t ", "ng.", " ", "thự", "c ", "sự", " ", "là", " ", "cuố", "i ", "cù", "ng"]
How can I split the string to an array like this
["<p>", "noi"," ", "dung", "<p>","bài"," ","viết"," ","đúng",".","Đó"," ","thực"," ","sự"," ","là", "," ," ","cuối"," ","cùng"]
You could maybe use something like this:
var strdoc = '<p>noi dung</p>tiêu đề bài viết đúng';
var arrdocobj = strdoc.match(/<[^>]+>|\S+?(?= |$|<)/g);
I was looking into using \b with the Unicode flag, but I guess it isn't available in JS, so I used (?= |$|<) to emulate the word boundary.
EDIT: As per edit of question:
<[^>]+>|[^ .,!?:<]+(?=[ .,!?:<]|$)|.
might do the trick.
I just added a few more punctuation marks and the |. to match whatever remains.
I think the following regex does what you are asking in your edit:
var strdoc = '<p>noi dung</p>bài viết đúng.Đó thực sự là, cuối cùng';
var arrdocobj = strdoc.match(/<[^>]+>|[\s]+|[^\s<]+/g);
Unfortunately, without the u flag, JavaScript regular expressions do not support Unicode categories like \p{L} for any Unicode letter.
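Where the engine implements ES2018 Unicode property escapes (which require the u flag), the split can be sketched with \p{L} directly. tokenize is a hypothetical helper name; combining marks (\p{M}) and digits (\p{N}) are included in "word" tokens as an assumption, so that accented letters stay whole even if the text is stored in decomposed form.

```javascript
// Split markup into tags, single whitespace characters, words, and
// single punctuation characters.
function tokenize(html) {
  const pattern = /<[^>]+>|\s|[\p{L}\p{M}\p{N}]+|[^\s\p{L}\p{M}\p{N}]/gu;
  return html.match(pattern) || [];
}

console.log(tokenize("<b>a b</b>c.d"));
// [ '<b>', 'a', ' ', 'b', '</b>', 'c', '.', 'd' ]
console.log(tokenize("<p>noi dung</p>bài viết đúng.Đó thực sự là, cuối cùng"));
```

Every character of the input ends up in exactly one token, so joining the array gives back the original string.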
