ignore accent in regex [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
How to ignore acute accent in a javascript regex match?
I have some JavaScript like this:
var myString = 'préposition_preposition';
var regex = new RegExp("epo", "ig");
alert(myString.match(regex));
Is it possible to match both "épo" and "epo" if the regex contains only epo (or only épo)?

I had the same problem recently. JavaScript regular expressions compare characters literally and do no accent folding, so é and e are simply different characters as far as the pattern is concerned. You need to explicitly include the accented variants in your regex.
Use this:
var regex = /[ée]po/gi;
Hint: Prefer a regex literal over new RegExp() when the pattern is fixed. The literal is compiled when the script is loaded, and it avoids the quoting and double-escaping issues of building the pattern from a string.

No, a regex cannot do this on its own. It matches exactly the characters you provide and has no way of knowing that épo and epo should be treated alike.
But you can specify a class of characters that may match at that position: new RegExp("[eé]po", "ig");

Try this:
var str = 'préposition_preposition';
str.match(/(e|é)po/gi);
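If listing every accented variant by hand gets tedious, another option is to strip the accents from the text before matching. This is only a sketch: it assumes a runtime with String.prototype.normalize (ES2015), and stripAccents is a made-up helper name. It decomposes each accented letter into a base letter plus a combining mark (NFD) and then removes the marks in the U+0300 to U+036F range. Letters that have no such decomposition, like ß, are left alone.

```javascript
// Hypothetical helper: remove combining diacritical marks after NFD decomposition.
function stripAccents(text) {
  return text.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

const accented = 'préposition_preposition';
const folded = stripAccents(accented);  // 'preposition_preposition'
const hits = folded.match(/epo/gi);     // ['epo', 'epo']

console.log(folded, hits);
```

Note that the matches are found in the folded copy, so their positions may not line up with the original string if it contained decomposed characters.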

Related

String starting with backslash hacks my regex [duplicate]

This question already has answers here:
How can I use backslashes (\) in a string?
(4 answers)
Closed 3 years ago.
My string should be in the IRC command format : "/add john".
So I created this regex:
var regex = /^\/add ([A-Za-z0-9]+)$/
var bool = regex.test('\/add user1');
alert(bool);
The problem is that whether I use the /.../ literal syntax or the RegExp constructor, if I put a backslash at the beginning of my string (as in the example above), the alert shows "true", and I don't want that.
I am coding in JavaScript.

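In an ordinary string literal, '\/' is an escape sequence that produces a plain '/', so the backslash never reaches the regex. You can use String.raw to make sure that the backslash is not removed when testing your input: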
var regex = /^\/add ([A-Za-z0-9]+)$/
var bool = regex.test(String.raw`\/add user1`);
alert(bool);
You can play with this code here: https://jsbin.com/ziqecux/25/edit?js
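To see what is actually going on, it may help to compare the strings themselves. A small sketch (the variable names are mine): the ordinary literal loses the backslash before the regex ever runs, while String.raw and an explicit '\\' both keep it.

```javascript
const plain = '\/add user1';              // "\/" is just an escaped "/"
const raw = String.raw`\/add user1`;      // backslash is kept
const escaped = '\\/add user1';           // explicit backslash, same content as raw

const addCommand = /^\/add ([A-Za-z0-9]+)$/;

console.log(plain === '/add user1');      // true
console.log(plain.length, raw.length);    // 10 11
console.log(addCommand.test(plain));      // true
console.log(addCommand.test(raw));        // false
console.log(raw === escaped);             // true
```

Text that comes from real user input (a text field, a socket) is never run through string-literal escaping, so a typed backslash arrives intact, as in the raw case.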

How do I remove all Unicode from a string, but keep languages such as Japanese, Greek, Hindi, etc. [duplicate]

This question already has answers here:
How can I use Unicode-aware regular expressions in JavaScript?
(11 answers)
Closed 4 years ago.
How would I remove all Unicode from this string: 【Hello!】★ ああああ
I need to remove all the "weird" symbols (【, ★, 】) and keep "Hello!" and "ああああ". This needs to work for all languages not just Japanese.

You want to remove characters within the Unicode general categories Other Symbol, Modifier Symbol, and Enclosing Mark, but leave those from other categories.
Using regular expressions, those match the classes \p{So}, \p{Sk} and \p{Me}, respectively. You might for example use XRegExp.replace().
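In runtimes that support Unicode property escapes (ES2018, with the u flag), the same classes can be used without a library. A sketch under that assumption: note that 【 and 】 are not symbols but opening and closing punctuation (\p{Ps}, \p{Pe}), so the three categories above leave them in place. Adding \p{Ps} and \p{Pe} is my own assumption about what the asker wants, and it also strips ordinary parentheses and brackets.

```javascript
const input = '【Hello!】★ ああああ';

// Categories named above: Other Symbol, Modifier Symbol, Enclosing Mark.
const symbolsOnly = input.replace(/[\p{So}\p{Sk}\p{Me}]/gu, '');

// Assumption: also drop opening/closing punctuation (brackets of every kind).
const symbolsAndBrackets = input.replace(/[\p{So}\p{Sk}\p{Me}\p{Ps}\p{Pe}]/gu, '');

console.log(symbolsOnly);         // the star is gone, the lenticular brackets remain
console.log(symbolsAndBrackets);  // Hello! ああああ
```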

I have found a solution. Using XRegExp, I was able to use \p{Common} (which I knew from PHP) in Node.
const xreg = require('xregexp');
let str = '【Hello!】★ ああああ】';
let regex = new xreg('\\p{Common}', 'g');
let res = xreg.replace(str, regex, ' ');
console.log(res); // "Hello" and "ああああ" remain; every Common-script character (brackets, star, "!", spaces) becomes a space

replace '\n' in javascript [duplicate]

This question already has answers here:
How do I replace all occurrences of a string in JavaScript?
(78 answers)
Fastest method to replace all instances of a character in a string [duplicate]
(14 answers)
Closed 1 year ago.
I'm trying to do replace in JavaScript using:
r = "I\nam\nhere";
s = r.replace("\n"," ");
But instead of giving me "I am here" as the value of s, it returns the same string.
Where's the problem?

As the others have said, the global flag is missing, and to use it you need a regular expression. The correct expression should be something like this:
var r = "I\nam\nhere";
var s = r.replace(/\n/g,' ');
I would like to point out how this differs from what you started with.
You were using the following statements:
var r = "I\nam\nhere";
var s = r.replace("\n"," ");
These statements are valid and will replace one instance of the character \n, but they use a different algorithm. When given a string, replace() looks for the first occurrence and replaces it with the string given as the second argument. With a regular expression we are not limited to matching a single character: we can write more complex patterns, and with the g flag every match is replaced. More on regular expressions for JavaScript can be found at w3schools.
For instance, your code could be made more general so that it handles input from different kinds of files. Because operating systems differ, it is quite common to find files that use \n, \r, or \r\n where a new line is required. To handle these, your code could be rewritten with a character class:
var r = "I\ram\nhere";
var s = r.replace(/[\n\r]/g,' ');
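To make the difference between these variants concrete, here is a small sketch (the sample text and variable names are mine). It also shows one thing to watch for with the character class: a Windows-style \r\n is two characters, so it turns into two spaces unless the pattern treats it as a single line break.

```javascript
const text = 'I\nam\r\nhere\rnow';

const firstOnly = text.replace('\n', ' ');          // string pattern: first match only
const perChar = text.replace(/[\n\r]/g, ' ');       // "\r\n" becomes two spaces
const perBreak = text.replace(/\r\n|\r|\n/g, ' ');  // each line break becomes one space

console.log(JSON.stringify(firstOnly)); // "I am\r\nhere\rnow"
console.log(JSON.stringify(perChar));   // "I am  here now"
console.log(JSON.stringify(perBreak));  // "I am here now"
```

In the last pattern the order of the alternatives matters: \r\n has to come before \r so that the pair is consumed as one match.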

If the string contains a literal backslash followed by n (two characters) rather than a real newline, use s = r.replace(/\\n/g," ");
The "g" in the replace code stands for "global", which means every match is replaced rather than only the first one.

The problem is that you need to use the g flag to replace all matches, as, by default, replace() only acts on the first match it finds:
var r = "I\nam\nhere",
s = r.replace(/\n/g,' ');
To use the g flag, though, you'll have to use the regular expression approach.
Incidentally, when declaring variables please use var, otherwise the variables you create are all global, which can lead to problems later on.

.replace() needs the global match flag:
s = r.replace(/\n/g, " ");

This works for me when the string contains the two characters \ and n rather than a real newline:
var s = r.split('\\n').join(' ');
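Some answers here escape the backslash (/\\n/g or split('\\n')) and some do not, which can be confusing. A small sketch of why both camps report success: they are dealing with different inputs. The variable names are mine.

```javascript
const realNewlines = 'I\nam\nhere';     // 9 characters, contains actual line feeds
const literalSlashN = 'I\\nam\\nhere';  // 11 characters, contains "\" followed by "n"

const a = realNewlines.replace(/\n/g, ' ');      // 'I am here'
const b = literalSlashN.replace(/\\n/g, ' ');    // 'I am here'
const c = literalSlashN.split('\\n').join(' ');  // 'I am here'
const d = realNewlines.split('\\n');             // one element: nothing to split on

console.log(realNewlines.length, literalSlashN.length); // 9 11
console.log(a, b, c, d.length);
```

Literal backslash-n sequences typically show up when text has been escaped somewhere along the way, for example in JSON that was stringified twice. For the string in the question, which contains real newlines, /\n/g is the pattern to use.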

replaceAll() is relatively new (ES2021) and is not supported in older browsers:
r = "I\nam\nhere";
s = r.replaceAll("\n"," ");

You can use:
var s = r.replace(/\n/g,' ').replace(/\r/g,' ');
because different operating systems mark a new line in different ways (Unix uses \n, Windows \r\n, classic Mac OS \r). After this, you can use another function to normalize the white space.

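Using "\\\n" works only if each line break is preceded by a backslash: that string is a backslash followed by a real newline, and a string pattern replaces just the first occurrence.
r.replace("\\\n"," ");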

This solution, which covers all three newline styles, worked perfectly for me:
r.replace(/(\r\n|\n|\r)/gm," ");

javascript - split without losing the separator [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
JavaScript Split without losing character
I have a string:
"<foo>abcdefg</bar><foo>abcdefg</bar><foo>abcdefg</bar><foo>abcdefg</bar>"
I want to separate all instances of "abcdefg" into an array like this:
["<foo>abcdefg</bar>", "<foo>abcdefg</bar>", "<foo>abcdefg</bar>", "<foo>abcdefg</bar>"];
I tried:
var str="<foo>abcdefg</bar><foo>abcdefg</bar><foo>abcdefg</bar><foo>abcdefg</bar>";
var Array_Of_FooBars = str.split("</bar>");
alert(Array_Of_FooBars);
But it returns:
["<foo>abcdefg", "<foo>abcdefg", "<foo>abcdefg", "<foo>abcdefg", ""]
It is removing the separator '</bar>'. I don't want that.
How can I use split and not lose the separators from the string?
Thanks.
Ken

Try this. It's not a perfect solution, but it should work in most cases.
str.split(/(?=<foo>)/)
That is, split it in the position before each opening tag.
EDIT: You could also do it with match(), like so:
str.match(/<foo>.*?<\/bar>/g)

It seems that you would most likely want to use match:
var s = "<foo>abcd1efg</bar><foo>abc2defg</bar><foo>abc3defg</bar><foo>abc4defg</bar>"
s.match(/(<foo>.+?<\/bar>)/g)
// =>["<foo>abcd1efg</bar>", "<foo>abc2defg</bar>", "<foo>abc3defg</bar>", "<foo>abc4defg</bar>"]
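For comparison, here is a sketch of three ways to keep the separator, using a shortened sample string of my own. The lookahead splits at the empty position before each opening tag, a capture group in the split pattern returns the separators as separate items, and match() returns each whole element.

```javascript
const markup = '<foo>a</bar><foo>b</bar><foo>c</bar>';

// Lookahead matches the empty position before each "<foo>", so nothing is consumed.
const byLookahead = markup.split(/(?=<foo>)/);

// A capture group makes split() include the separators in the result.
const withSeparators = markup.split(/(<\/bar>)/);

// match() with the g flag returns every whole match.
const byMatch = markup.match(/<foo>.*?<\/bar>/g);

console.log(byLookahead);    // ['<foo>a</bar>', '<foo>b</bar>', '<foo>c</bar>']
console.log(withSeparators); // ['<foo>a', '</bar>', '<foo>b', '</bar>', '<foo>c', '</bar>', '']
console.log(byMatch);        // ['<foo>a</bar>', '<foo>b</bar>', '<foo>c</bar>']
```

The capture-group form still ends with an empty string because the input ends with a separator, so match() or the lookahead is usually the more convenient choice here.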

You could just iterate over a simple regular expression and build the array that way:
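var x = new RegExp('<foo>(.*?)</bar>', 'ig'),
    s = "<foo>abcdefg</bar><foo>abcdefg</bar><foo>abcdefg</bar><foo>abcdefg</bar>",
    matches = [],
    m;
// With the g flag, exec() resumes from lastIndex, so the loop visits each match once.
while ((m = x.exec(s)) !== null) {
    matches.push(m[0]); // m[1] would hold the text between the tags
}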
Just realized using String.match() would be better; this code would be more useful for matching the contents inside the tags.
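If the contents inside the tags are what you are after, a more recent alternative to the exec() loop is String.prototype.matchAll. This sketch assumes an ES2020 runtime, and the sample string is mine. Each result carries the whole match at index 0 and the first capture group at index 1.

```javascript
const source = '<foo>one</bar><foo>two</bar>';
const pattern = /<foo>(.*?)<\/bar>/g;   // matchAll requires the g flag

const whole = [];
const inner = [];
for (const m of source.matchAll(pattern)) {
  whole.push(m[0]);  // the full element
  inner.push(m[1]);  // only the text between the tags
}

console.log(whole); // ['<foo>one</bar>', '<foo>two</bar>']
console.log(inner); // ['one', 'two']
```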

Use positive lookahead so that the regular expression asserts that the special character exists, but does not actually match it:
string.split(/<br \/>(?=&#?[a-zA-Z0-9]+;)/g);

javascript - How to do replaceAll? [duplicate]

This question already has answers here:
How do I replace all occurrences of a string in JavaScript?
(78 answers)
Closed 6 years ago.
Hi I have a problem here. I am trying to replace all instances of + character in a string using javascript. What happens is that only the first instance is being changed.
Here is my code:
var keyword = "Hello+Word%+";
keyword = keyword.replace("+", encodeURIComponent("+"));
alert(keyword);
The output is Hello%2BWord%+ when it should be Hello%2BWord%%2B because there are 2 instances of +.
You can check this on : http://jsfiddle.net/Wy48Z/
Please help. Thanks in advance.

You need the global flag.
Fixed for you at http://jsfiddle.net/rtoal/Wy48Z/1/
var keyword = "Hello+Word%+";
keyword = keyword.replace(/\+/g, encodeURIComponent("+"));
alert(keyword);

A JavaScript regex literal is written by putting the expression between two forward slashes, like: /<expression>/
If you want to replace all matches, simply append a g after the last slash, like:
/<expression>/g
In your case, it would be /\+/g

The cross-browser approach is to use a regexp with the g (global) flag, which means "process all matches of the pattern, not just the first":
keyword = keyword.replace(/\+/g, encodeURIComponent("+"));
Notice I prefix the plus sign with a backslash because it would otherwise have the special meaning of "match one or more of the preceding thing".
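When the text to search for is not known in advance, escaping it by hand is not an option. One common approach is a small helper that escapes every regex metacharacter before building the pattern. This is only a sketch: escapeRegExp and replaceEvery are made-up names, not built-in functions.

```javascript
// Hypothetical helper: escape regex metacharacters so a literal string can be used as a pattern.
function escapeRegExp(literal) {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: replace every occurrence of a literal search string.
// The function form of the replacement avoids special handling of "$" sequences.
function replaceEvery(text, search, replacement) {
  return text.replace(new RegExp(escapeRegExp(search), 'g'), () => replacement);
}

const encoded = replaceEvery('Hello+Word%+', '+', encodeURIComponent('+'));
console.log(encoded); // Hello%2BWord%%2B
```

For a fixed single character like + the /\+/g literal is simpler. The helper is mainly useful when the search string comes from a variable.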
