Matching a Unicode "name" with a JavaScript Regular Expression - javascript

In JavaScript we can match individual Unicode codepoints or codepoint ranges by using the Unicode escape sequences, e.g.:
"A".match(/\u0041/) // => ["A"]
"B".match(/[\u0041-\u007A]/) // => ["B"]
But how could we create a regular expression to match a proper name which must include any Unicode "letter" using a JavaScript regular expression? Is there a range of letters? A special regex sequence or character class in JavaScript?
Say my website must validate names that could be in latin based languages as well as Hebrew, Cyrillic, Japanese (Katakana, Hiragana, etc.) is this feasible in JavaScript or is the only sane choice to delegate to a backend language with better Unicode support?

Here's a JS plugin that adds Unicode support to RegEx
http://xregexp.com/plugins/

I am using for defining unicode of a symbols this site http://www.fileformat.info.
Unicode Blocks (Basic Latin, .+, Cyrillic, .+, Arabic and other):
http://www.fileformat.info/info/unicode/block/index.htm
Unicode Character Categories (this does not work in JS):
http://www.fileformat.info/info/unicode/category/index.htm
Letters (A-я):
http://www.fileformat.info/info/unicode/char/a.htm
Fonts (which chars are supported in each font):
http://www.fileformat.info/info/unicode/font/index.htm
Index for all above
http://www.fileformat.info/info/unicode/index.htm

Related

JavaScript matching unicode letters [duplicate]

i have the following regex that allows only alphabets :
/[a-zA-Z]+/
a = "abcDF"
if (a.match(/[a-zA-Z]+/) == a){
//Match
}else{
//No Match
}
How can I do this using p{L} (universal - any language like german, english etc.. )
What I tried :
a.match(/[p{l}]+/)
a.match(/[\p{l}]+/)
a.match(/p{l}/)
a.match(/\p{l}/)
but all returned null for the letter a = "aB"
Starting with ECMAScript 2018, JavaScript finally supports Unicode property escapes natively.
For older versions, you either need to define all the relevant Unicode ranges yourself. Or you can use Steven Levithan's XRegExp package with Unicode add-ons and utilize its Unicode property shortcuts:
var regex = new XRegExp("^\\p{L}*$")
var a = "abcäöüéèê"
if (regex.test(a)) {
// Match
} else {
// No Match
}
If you are willing to use Babel to build your javascript then there's a babel-plugin I have released which will transform regular expressions like /^\p{L}+$/ or /\p{^White_Space}/ into a regular expression that browsers will understand.
This is the project page: https://github.com/danielberndt/babel-plugin-utf-8-regex
You may use \p{L} with the modern ECMAScript 2018+ compliant JavaScript environments, but you need to remember that the Unicode property classes are only supported when you pass u modifier/flag:
a.match(/\p{L}+/gu)
a.match(/\p{Alphabetic}+/gu)
will match all occurrences of 1 or more Unicode letters in the a string.
NOTE that \p{Alphabetic} (\p{Alpha}) includes all letters matched by \p{L}, plus letter numbers matched by \p{Nl} (e.g. Ⅻ – a character for the roman number 12), plus some other symbols matched with \p{Other_Alphabetic} (\p{OAlpha}).
There are some things to bear in mind though when using u modifier with a regex:
You can use Unicode code point escape sequences such as \u{1F42A} for specifying characters via code points. Normal Unicode escapes such as \u03B1 only have a range of four hexadecimal digits (which equals the basic multilingual plane) (source)
"Characters of 4 bytes are handled correctly: as a single character, not two 2-byte characters" (source)
Escaping requirements to patterns compiled with u flag are more strict: you can't escape any special characters, you can only escape those that can actually behave as special characters. See HTML input pattern not working.

JS regular expression is not working [duplicate]

i have the following regex that allows only alphabets :
/[a-zA-Z]+/
a = "abcDF"
if (a.match(/[a-zA-Z]+/) == a){
//Match
}else{
//No Match
}
How can I do this using p{L} (universal - any language like german, english etc.. )
What I tried :
a.match(/[p{l}]+/)
a.match(/[\p{l}]+/)
a.match(/p{l}/)
a.match(/\p{l}/)
but all returned null for the letter a = "aB"
Starting with ECMAScript 2018, JavaScript finally supports Unicode property escapes natively.
For older versions, you either need to define all the relevant Unicode ranges yourself. Or you can use Steven Levithan's XRegExp package with Unicode add-ons and utilize its Unicode property shortcuts:
var regex = new XRegExp("^\\p{L}*$")
var a = "abcäöüéèê"
if (regex.test(a)) {
// Match
} else {
// No Match
}
If you are willing to use Babel to build your javascript then there's a babel-plugin I have released which will transform regular expressions like /^\p{L}+$/ or /\p{^White_Space}/ into a regular expression that browsers will understand.
This is the project page: https://github.com/danielberndt/babel-plugin-utf-8-regex
You may use \p{L} with the modern ECMAScript 2018+ compliant JavaScript environments, but you need to remember that the Unicode property classes are only supported when you pass u modifier/flag:
a.match(/\p{L}+/gu)
a.match(/\p{Alphabetic}+/gu)
will match all occurrences of 1 or more Unicode letters in the a string.
NOTE that \p{Alphabetic} (\p{Alpha}) includes all letters matched by \p{L}, plus letter numbers matched by \p{Nl} (e.g. Ⅻ – a character for the roman number 12), plus some other symbols matched with \p{Other_Alphabetic} (\p{OAlpha}).
There are some things to bear in mind though when using u modifier with a regex:
You can use Unicode code point escape sequences such as \u{1F42A} for specifying characters via code points. Normal Unicode escapes such as \u03B1 only have a range of four hexadecimal digits (which equals the basic multilingual plane) (source)
"Characters of 4 bytes are handled correctly: as a single character, not two 2-byte characters" (source)
Escaping requirements to patterns compiled with u flag are more strict: you can't escape any special characters, you can only escape those that can actually behave as special characters. See HTML input pattern not working.

Add Unicode character support to a regexp expression [duplicate]

In JavaScript we can match individual Unicode codepoints or codepoint ranges by using the Unicode escape sequences, e.g.:
"A".match(/\u0041/) // => ["A"]
"B".match(/[\u0041-\u007A]/) // => ["B"]
But how could we create a regular expression to match a proper name which must include any Unicode "letter" using a JavaScript regular expression? Is there a range of letters? A special regex sequence or character class in JavaScript?
Say my website must validate names that could be in latin based languages as well as Hebrew, Cyrillic, Japanese (Katakana, Hiragana, etc.) is this feasible in JavaScript or is the only sane choice to delegate to a backend language with better Unicode support?
Here's a JS plugin that adds Unicode support to RegEx
http://xregexp.com/plugins/
I am using for defining unicode of a symbols this site http://www.fileformat.info.
Unicode Blocks (Basic Latin, .+, Cyrillic, .+, Arabic and other):
http://www.fileformat.info/info/unicode/block/index.htm
Unicode Character Categories (this does not work in JS):
http://www.fileformat.info/info/unicode/category/index.htm
Letters (A-я):
http://www.fileformat.info/info/unicode/char/a.htm
Fonts (which chars are supported in each font):
http://www.fileformat.info/info/unicode/font/index.htm
Index for all above
http://www.fileformat.info/info/unicode/index.htm

RegEx with extended latin alphabet (ä ö ü è ß)

I want to do some basic String testing in Node.js. Assume I have a form where users enter their name and I wanna check if it's just rubbish or a real name.
Happily (or sadly for my check) I get users from all around the world which means that their names contain non-english characters, like ä ö ü ß é. I was used to use /[A-Za-z -]{2,}/ but this doesn't match names like "Jan Buschtöns".
Do I have to manually add every possible non-english but latin character to my RegEx to work? I don't want a 100+ characters long RegEx like /[A-Za-z -äöüÄÖÜßéÉèÈêÊ...]{2,}/.
Check http://www.regular-expressions.info/unicode.html and http://xregexp.com/plugins/
You would need to use \p{L} to match any letter character if you want to include unicode.
Speaking unicode, alternative of \w is [\p{L}\p{N}_] then.
Update: As of ES2018, JavaScript supports Unicode property escapes such as \p{L}, which matches anything that Unicode considers to be a letter. All modern browsers support this feature, so that's probably the way to go as long as you don't care about ancient browsers.
Old answer for pre-ES2018 browsers:
The answer depends on exactly what you want to do.
As you have noticed, [A-Za-z] only matches Latin letters without diacritics.
If you only care about German diacritics and the ß ligature, then you can just replace that part with [A-Za-zÄÖÜäöüß], e.g.:
/[A-Za-zÄÖÜäöüß -]{2,}/
But that probably isn’t what you want to do. You probably want to match Latin letters with any diacritics, not just those used in German. Or perhaps you want to match any letters from any alphabet, not just Latin.
Other regular expression dialects have character classes to help you with problems like this, but unfortunately JavaScript’s regular expression dialect has very few character classes and none of them help you here.
(In case you don’t know, a “character class” is an expression that matches any character that is a member of a predefined group of characters. For example, \w is a character class that matches any ASCII letter, or digit, or an underscore, and . is a character class that matches any character.)
This means that you have to list out every range of UTF-16 code units that corresponds to a character that you want to match.
A quick and dirty solution might be to say [a-zA-Z\u0080-\uFFFF], or in full:
/[A-Za-z\\u0080-\\uFFFF -]{2,}/
This will match any letter in the ASCII range, but will also match any character at all that is outside the ASCII range. This includes all possible alphabetic characters with or without diacritics in any script. However, it also includes a lot of characters that are not letters. Non-letters in the ASCII range are excluded, but non-letters outside the ASCII range are included.
The above might be good enough for your purposes, but if it isn’t then you will have to figure out which character ranges you need and specify those explicitly.
If you want just Latin letters, including those with less common diacritics like åēį, but excluding e.g. Chinese, Devanagari, and Cyrillic characters, you can use \p{Script=Latin} with the u flag. This feature is called Unicode property escapes, and was introduced in ES2018.
For example, /\p{Script=Latin}+/u will match a word that only contains Latin characters.
This is a JavaScript/node.js question, but I barely see any actual JavaScript code which shows how to do it. Its a bit trickier, because it requires the Unicode "u" flag:
// Result: '_ {_} [_]'
'ulike {adj} [ubøyelig]'.replace(/\p{L}+/gu, '_')
I know this question is old but I'm working on some NPL software and I needed to match all words in most latin-like languages and I did this with the following piece of code.
let myString = "Whatever you want here, ex. Bân-lâm-gú or bokmål or Português or Română or Slovenčina or Slovenščina";
let wordchar = "-A-Za-zᴀⱯɐᵄⱭɑᵅꬰꭤⱰɒᶛʙᴃᴯꞖꞗꞴꞵᴄↃↄꞳꭓꭕꭔÐðᶞꟇꟈꝹꝺᴅᴆꝱẟᴇꬲꬳꬴƎᴲǝⱻƏəₔᵊƐɛᵋEɘꞫɜᶟɞʚᴈᵌɤꝻꝼꜰℲⅎꟻꬵꝽᵹꞬɡᶢꬶɢᵷ⅁ꝾꝿƔɣˠƢƣʜǶƕⱵⱶꟵꟶꜦꜧꭜıꞮɪᶦꟾꟷᴉᵎᵻᶧƖɩᶥᴊKᴋꞰʞʟᶫꝆꝇᴌꬸꬹꬷꭝꝲꞀꞁ⅃ᴍꬺꟽꟿꝳɴᶰᴎᴻꬻꝴŊŋᵑꬼᴏᴑꬽꬾƆɔᵓᴐꬿᴒᴖᵔᴗᵕꞶꞷɷȢȣᴕᴽᴘꟼɸᶲⱷĸꞯꞂꞃƦʀꝚꝛᴙꭆɹʴᴚʁʶꭉꭇꭈꭊꭋꭌꭅꝵꝶꝜꝝſꟉꟊꞄꞅƧƨꜱƩʃᶴꭍƪʅꞆꞇᴛꝷꞱʇᴜᶸᴝᵙᴞꭒꭟꭎꭏꞍɥᶣƜɯꟺᵚᴟƱʊᶷᴠỼỽɅʌᶺᴡꟂꟃʍꭩꭖꭗꭘꭙꭙ̆ʏꭚʎ⅄ƍᴢꝢꝣƷʒᶾᴣƸƹȜȝÞþǷƿꝨꝩꝪꝫꝬꝭꝮꝯꝰꝸꜪꜫꜬꜭꜮꜯƼƽƄƅɁɂʔꜢꜣꞋꞌꞏʕˤᴤᴥᵜꜤꜥʖǀǁǃǂʗʘʬʭꞚꞛꞜꞝꞞꞟẚÀàÁáÂâẦầẤấẪẫẨẩÃãÃ̀ã̀Ã́ã́Ã̂ã̂Ã̌ã̌Ã̍ã̍Ã̎ã̎ĀāĀ̀ā̀Ā́ā́Ā̂ā̂Ā̃ā̃Ā̃́ā̃́Ā̄ā̄Ā̆ā̆Ā̆́ā̆́Ā̈ā̈Ā̊ā̊Ā̌ā̌ĂăẰằẮắẴẵẲẳȦȧȦ́ȧ́ǠǡÄäÄ́ä́Ä̀ä̀Ä̂ä̂Ä̃ä̃ǞǟǞ̆ǟ̆Ä̆ä̆Ä̌ä̌ẢảÅåÅǺǻÅ̂å̂Å̃å̃Å̄å̄Å̄̆å̄̆Å̆å̆A̋a̋ǍǎA̍a̍A̎a̎ȀȁȂȃA̐a̐A̓a̓A̧a̧À̧à̧Á̧á̧Â̧â̧Ǎ̧ǎ̧A̭a̭A̰a̰À̰à̰Á̰á̰Ā̰ā̰Ä̰ä̰Ä̰́ä̰́ĄąĄ̀ą̀Ą́ą́Ą̂ą̂Ą̃ą̃Ą̄ą̄Ą̄̀ą̄̀Ą̄́ą̄́Ą̄̂ą̄̂Ą̄̌ą̄̌Ą̇ą̇Ą̈ą̈Ą̈̀ą̈̀Ą̈́ą̈́Ą̈̂ą̈̂Ą̈̌ą̈̌Ą̈̄ą̈̄Ą̊ą̊Ą̌ą̌Ą̋ą̋Ą̱ą̱Ą̱̀ą̱̀Ą̱́ą̱́A᷎a᷎A̱a̱À̱à̱Á̱á̱Â̱â̱Ã̱ã̱Ā̱ā̱Ā̱̀ā̱̀Ā̱́ā̱́Ā̱̂ā̱̂Ä̱ä̱Ä̱̀ä̱̀Ä̱́ä̱́Ä̱̂ä̱̂Ä̱̌ä̱̌Å̱å̱Ǎ̱ǎ̱A̱̥a̱̥ẠạẠ́ạ́Ạ̀ạ̀ẬậẠ̃ạ̃Ạ̄ạ̄ẶặẠ̈ạ̈Ạ̈̀ạ̈̀Ạ̈́ạ̈́Ạ̈̂ạ̈̂Ạ̈̌ạ̈̌Ạ̌ạ̌Ạ̍ạ̍A̤a̤À̤à̤Á̤á̤Â̤â̤Ä̤ä̤ḀḁḀ̂ḁ̂Ḁ̈ḁ̈A̯a̯A̩a̩À̩à̩Á̩á̩Â̩â̩Ã̩ã̩Ā̩ā̩Ǎ̩ǎ̩A̩̍a̩̍A̩̓a̩̓A͔a͔Ā͔ā͔ȺⱥȺ̀ⱥ̀Ⱥ́ⱥ́ᶏꞺꞻⱭ̀ɑ̀Ɑ́ɑ́Ɑ̂ɑ̂Ɑ̃ɑ̃Ɑ̄ɑ̄Ɑ̆ɑ̆Ɑ̇ɑ̇Ɑ̈ɑ̈Ɑ̊ɑ̊Ɑ̌ɑ̌ᶐB̀b̀B́b́B̂b̂B̃b̃B̄b̄ḂḃB̈b̈B̒b̒B̕b̕ḆḇḆ̂ḇ̂ḄḅB̤b̤B̥b̥B̬b̬ɃƀᵬᶀƁɓƂƃʙ̇ʙ̣C̀c̀ĆćĈĉC̃c̃C̄c̄C̄́c̄́C̆c̆ĊċC̈c̈ČčČ́č́Č͑č͑Č̓č̓Č̕č̕Č̔č̔C̋c̋C̓c̓C̕c̕C̔c̔C͑c͑ÇçḈḉÇ̆ç̆Ç̇ç̇Ç̌ç̌ꞔꟄC̦c̦C̭c̭C̱c̱C̮c̮C̣c̣Ć̣ć̣Č̣č̣C̥c̥C̬c̬C̯c̯C̨c̨ȻȼȻ̓ȼ̓ꞒꞓƇƈɕᶝꜾꜿD́d́D̂d̂D̃d̃D̄d̄ḊḋD̊d̊ĎďD̑d̑D̓d̓D̕d̕ḐḑD̦d̦ḒḓḎḏD̮d̮ḌḍḌ́ḍ́Ḍ̄ḍ̄D̤d̤D̥d̥D̬d̬D̪d̪ĐđĐ̣đ̣Đ̱đ̱ᵭᶁƉɖƊɗᶑƋƌȡꝹ́ꝺ́Ꝺ̇ꝺ̇ᴅ̇ᴅ̣Ð́ð́Ð̣ð̣ÈèÉéÊêỀềẾếỄễÊ̄ê̄Ê̆ê̆Ê̌ê̌ỂểẼẽẼ̀ẽ̀Ẽ́ẽ́Ẽ̂ẽ̂Ẽ̌ẽ̌Ẽ̍ẽ̍Ẽ̎ẽ̎ĒēḔḕḖḗĒ̂ē̂Ē̃ē̃Ē̃́ē̃́Ē̄ē̄Ē̆ē̆Ē̆́ē̆́Ē̌ē̌Ē̑ē̑ĔĕĔ̀ĕ̀Ĕ́ĕ́Ĕ̄ĕ̄ĖėĖ́ė́Ė̃ė̃Ė̄ė̄ËëË̀ë̀Ë́ë́Ë̂ë̂Ë̃ë̃Ë̄ë̄Ë̌ë̌ẺẻE̊e̊E̊̄e̊̄E̋e̋ĚěĚ́ě́Ě̃ě̃Ě̋ě̋Ě̑ě̑E̍e̍E̎e̎ȄȅȆȇE̓e̓E᷎e᷎ȨȩȨ̀ȩ̀Ȩ́ȩ́Ȩ̂ȩ̂ḜḝȨ̌ȩ̌Ẽ̦ẽ̦ĘęĘ̀ę̀Ę́ę́Ę̂ę̂Ę̃ę̃Ę̃́ę̃́Ę̄ę̄Ę̄̀ę̄̀Ę̄́ę̄́Ę̄̂ę̄̂Ę̄̃ę̄̃Ę̄̌ę̄̌Ę̆ę̆Ę̇ę̇Ę̇́ę̇́Ę̈ę̈Ę̈̀ę̈̀Ę̈́ę̈́Ę̈̂ę̈̂Ę̈̌ę̈̌Ę̈̄ę̈̄Ę̋ę̋Ę̌ę̌Ę̑ę̑Ę̱ę̱Ę̱̀ę̱̀Ę̱́ę̱́Ę̣ę̣Ę᷎ę᷎ḘḙḚḛE̱e̱È̱è̱É̱é̱Ê̱ê̱Ẽ̱ẽ̱Ē̱ē̱Ḕ̱ḕ̱Ḗ̱ḗ̱Ē̱̂ē̱̂Ë̱ë̱Ë̱̀ë̱̀Ë̱́ë̱́Ë̱̂ë̱̂Ë̱̌ë̱̌Ě̱ě̱E̮e̮Ē̮ē̮ẸẹẸ̀ẹ̀Ẹ́ẹ́ỆệẸ̃ẹ̃Ẹ̄ẹ̄Ẹ̄̀ẹ̄̀Ẹ̄́ẹ̄́Ẹ̄̃ẹ̄̃Ẹ̆ẹ̆Ẹ̆̀ẹ̆̀Ẹ̆́ẹ̆́Ẹ̈ẹ̈Ẹ̈̀ẹ̈̀Ẹ̈́ẹ̈́Ẹ̈̂ẹ̈̂Ẹ̈̌ẹ̈̌Ẹ̍ẹ̍Ẹ̌ẹ̌Ẹ̑ẹ̑E̤e̤È̤è̤É̤é̤Ê̤ê̤Ë̤ë̤E̥e̥E̯e̯E̩e̩È̩è̩É̩é̩Ê̩ê̩Ẽ̩ẽ̩Ē̩ē̩Ě̩ě̩E̩̍e̩̍E̩̓e̩̓È͕è͕Ê͕ê͕Ẽ͕ẽ͕Ē͕ē͕Ḕ͕ḕ͕E̜e̜E̹e̹È̹è̹É̹é̹Ê̹ê̹Ẽ̹ẽ̹Ē̹ē̹Ḕ̹ḕ̹ɆɇᶒⱸᶕᶓɚᶔɝƐ̀ɛ̀Ɛ́ɛ́Ɛ̂ɛ̂Ɛ̃ɛ̃Ɛ̃̀ɛ̃̀Ɛ̃́ɛ̃́Ɛ̃̂ɛ̃̂Ɛ̃̌ɛ̃̌Ɛ̃̍ɛ̃̍Ɛ̃̎ɛ̃̎Ɛ̄ɛ̄Ɛ̆ɛ̆Ɛ̇ɛ̇Ɛ̈ɛ̈Ɛ̈̀ɛ̈̀Ɛ̈́ɛ̈́Ɛ̈̂ɛ̈̂Ɛ̈̌ɛ̈̌Ɛ̌ɛ̌Ɛ̍ɛ̍Ɛ̎ɛ̎Ɛ̣ɛ̣Ɛ̣̀ɛ̣̀Ɛ̣́ɛ̣́Ɛ̣̂ɛ̣̂Ɛ̣̃ɛ̣̃Ɛ̣̈ɛ̣̈Ɛ̣̈̀ɛ̣̈̀Ɛ̣̈́ɛ̣̈́Ɛ̣̈̂ɛ̣̈̂Ɛ̣̈̌ɛ̣̈̌Ɛ̣̌ɛ̣̌Ɛ̤ɛ̤Ɛ̤̀ɛ̤̀Ɛ̤́ɛ̤́Ɛ̤̂ɛ̤̂Ɛ̤̈ɛ̤̈Ɛ̧ɛ̧Ɛ̧̀ɛ̧̀Ɛ̧́ɛ̧́Ɛ̧̂ɛ̧̂Ɛ̧̌ɛ̧̌Ɛ̨ɛ̨Ɛ̨̀ɛ̨̀Ɛ̨́ɛ̨́Ɛ̨̂ɛ̨̂Ɛ̨̄ɛ̨̄Ɛ̨̆ɛ̨̆Ɛ̨̈ɛ̨̈Ɛ̨̌ɛ̨̌Ɛ̰ɛ̰Ɛ̰̀ɛ̰̀Ɛ̰́ɛ̰́Ɛ̰̄ɛ̰̄Ɛ̱ɛ̱Ɛ̱̀ɛ̱̀Ɛ̱́ɛ̱́Ɛ̱̂ɛ̱̂Ɛ̱̃ɛ̱̃Ɛ̱̈ɛ̱̈Ɛ̱̈̀ɛ̱̈̀Ɛ̱̈́ɛ̱̈́Ɛ̱̌ɛ̱̌Ə̀ə̀Ə́ə́Ə̂ə̂Ə̄ə̄Ə̌ə̌Ə̏ə̏F̀f̀F́f́F̃f̃F̄f̄ḞḟF̓f̓F̧f̧ᵮᶂƑƒꞘꞙF̱f̱F̣f̣ꜰ̇Ꝼ́ꝼ́Ꝼ̇ꝼ̇Ꝼ̣ꝼ̣G̀g̀ǴǵǴ̄ǵ̄ĜĝG̃g̃G̃́g̃́ḠḡḠ́ḡ́ĞğĠġG̈g̈G̈̇g̈̇G̊g̊G̋g̋ǦǧǦ̈ǧ̈G̑g̑G̒g̒G̓g̓G̕g̕G̔g̔ĢģG̦g̦G̱g̱G̱̓g̱̓G̮g̮G̣g̣G̤g̤G̥g̥G̫g̫ꞠꞡǤǥᶃƓɠɢ̇ɢ̣ʛƔ̓ɣ̓H̀h̀H́h́ĤĥH̄h̄ḢḣḦḧȞȟH̐h̐H̓h̓H̕h̕ḨḩH̨h̨H̭h̭H̱ẖḪḫḤḥḤ̣ḥ̣H̤h̤H̥h̥H̬h̬H̯h̯ĦħꟸĦ̥ħ̥ꞪɦʱⱧⱨꞕh̢ʜ̇ɧÌìÍíÎîÎ́î́ĨĩĨ́ĩ́Ĩ̀ĩ̀Ĩ̂ĩ̂Ĩ̌ĩ̌Ĩ̍ĩ̍Ĩ̎ĩ̎ĪīĪ́ī́Ī̀ī̀Ī̂ī̂Ī̌ī̌Ī̃ī̃Ī̄ī̄Ī̆ī̆Ī̆́ī̆́ĬĭĬ̀ĭ̀Ĭ́ĭ́İiIıİ́i̇́ÏïÏ̀ï̀ḮḯÏ̂ï̂Ï̃ï̃Ï̄ï̄Ï̌ï̌Ï̑ï̑I̊i̊I̋i̋ǏǐỈỉI̍i̍I̎i̎ȈȉI̐i̐ȊȋI᷎i᷎ĮįĮ̀į̀Į́į́į̇́Į̂į̂Į̃į̃į̇̃Į̄į̄Į̄̀į̄̀Į̄́į̄́Į̄̂į̄̂Į̄̆į̄̆Į̄̌į̄̌Į̈į̈Į̈̀į̈̀Į̈́į̈́Į̈̂į̈̂Į̈̌į̈̌Į̈̄į̈̄Į̋į̋Į̌į̌Į̱į̱Į̱́į̱́Į̱̀į̱̀I̓i̓I̧i̧Í̧í̧Ì̧ì̧Î̧î̧I̭i̭Ī̭ī̭ḬḭḬ̀ḭ̀Ḭ́ḭ́Ḭ̄ḭ̄Ḭ̈ḭ̈Ḭ̈́ḭ̈́I̱i̱Ì̱ì̱Í̱í̱Î̱î̱Ǐ̱ǐ̱Ĩ̱ĩ̱Ï̱ï̱Ḯ̱ḯ̱Ï̱̀ï̱̀Ï̱̂ï̱̂Ï̱̌ï̱̌Ī̱ī̱Ī̱́ī̱́Ī̱̀ī̱̀Ī̱̂ī̱̂I̮i̮ỊịỊ̀ị̀Ị́ị́Ị̂ị̂Ị̃ị̃Ị̄ị̄Ị̈ị̈Ị̈̀ị̈̀Ị̈́ị̈́Ị̈̂ị̈̂Ị̈̌ị̈̌Ị̌ị̌Ị̍ị̍I̤i̤Ì̤ì̤Í̤í̤Î̤î̤Ï̤ï̤I̥i̥Í̥í̥Ï̥ï̥I̯i̯Í̯í̯Ĩ̯ĩ̯I̩i̩I͔i͔Ī͔ī͔ƗɨᶤƗ̀ɨ̀Ɨ́ɨ́Ɨ̂ɨ̂Ɨ̌ɨ̌Ɨ̃ɨ̃Ɨ̄ɨ̄Ɨ̈ɨ̈Ɨ̧ɨ̧Ɨ̧̀ɨ̧̀Ɨ̧̂ɨ̧̂Ɨ̧̌ɨ̧̌Ɨ̱ɨ̱Ɨ̱̀ɨ̱̀Ɨ̱́ɨ̱́Ɨ̱̂ɨ̱̂Ɨ̱̈ɨ̱̈Ɨ̱̌ɨ̱̌Ɨ̯ɨ̯ᶖꞼꞽı̣ı̥Ɩ̀ɩ̀Ɩ́ɩ́Ɩ̂ɩ̂Ɩ̃ɩ̃Ɩ̈ɩ̈Ɩ̌ɩ̌ᵼJ́j́ĴĵJ̃j̃j̇̃J̄j̄J̇J̈j̈J̈̇j̈̇J̊j̊J̋j̋J̌ǰJ̌́ǰ́J̑j̑J̓j̓J᷎j᷎J̱j̱J̣j̣J̣̌ǰ̣J̥j̥ɈɉɈ̱ɉ̱ꞲʝᶨȷɟᶡʄK̀k̀ḰḱK̂k̂K̃k̃K̄k̄K̆k̆K̇k̇K̈k̈ǨǩK̑k̑K̓k̓K̕k̕K̔k̔K͑k͑ĶķK̦k̦K̨k̨ḴḵḴ̓ḵ̓ḲḳK̮k̮K̥k̥K̬k̬K̫k̫ᶄƘƙⱩⱪꝀꝁꝂꝃꝄꝅꞢꞣᴋ̇ĿŀL̀l̀ĹĺL̂l̂L̃l̃L̄l̄L̇l̇L̈l̈L̋l̋ĽľL̐l̐L̑l̑L̓l̓L̕l̕ĻļĻ̂ļ̂Ļ̃ļ̃L̦l̦ḼḽḺḻḺ̓ḻ̓L̮l̮ḶḷḶ̀ḷ̀Ḷ́ḷ́ḸḹḸ́ḹ́Ḹ̆ḹ̆Ḷ̓ḷ̓Ḷ̕ḷ̕Ḷ̣ḷ̣L̤l̤L̤̄l̤̄L̥l̥L̥̀l̥̀Ĺ̥ĺ̥L̥̄l̥̄L̥̄́l̥̄́L̥̄̆l̥̄̆L̥̕l̥̕L̩l̩L̩̀l̩̀L̩̓l̩̓L̯l̯ŁłŁ̇ł̇Ł̓ł̓Ł̣ł̣Ł̱ł̱ꝈꝉȽƚⱠⱡⱢɫꭞꞭɬᶅᶪɭᶩꞎȴʟ̇ʟ̣ƛƛ̓λ̴λ̴̓M̀m̀ḾḿM̂m̂M̃m̃M̄m̄M̆m̆ṀṁṀ̇ṁ̇M̈m̈M̋m̋M̍m̍M̌m̌M̐m̐M̑m̑M̓m̓M̕̕m̕M͑m͑ᵯM̧m̧M̨m̨M̦m̦M̱m̱Ḿ̱ḿ̱M̮m̮ṂṃṂ́ṃ́Ṃ̄ṃ̄Ṃ̓ṃ̓M̥m̥Ḿ̥ḿ̥M̥̄m̥̄M̥̄́m̥̄́M̥̄̆m̥̄̆M̬m̬M̩m̩M̩̀m̩̀M̩̓m̩̓M̯m̯ᶆm̢Ɱɱᶬᴍ̇ᴍ̣ǸǹŃńN̂n̂ÑñÑ̈ñ̈N̄n̄N̆n̆ṄṅṄ̇ṅ̇N̈n̈N̋n̋ŇňN̐n̐N̑n̑N̍n̍N̓n̓N̕n̕ꞤꞥᵰŅņŅ̂ņ̂Ņ̃ņ̃N̦n̦N̨n̨ṊṋN̰n̰ṈṉṈ́ṉ́N̮n̮ṆṇṆ́ṇ́Ṇ̄ṇ̄Ṇ̄́ṇ̄́Ṇ̓ṇ̓N̤n̤N̥n̥Ǹ̥ǹ̥Ń̥ń̥Ñ̥ñ̥Ñ̥́ñ̥́N̥̄n̥̄N̥̄́n̥̄́N̥̄̆n̥̄̆N̥̄̑n̥̄̑Ṅ̥ṅ̥N̥̑n̥̑N̥̑́n̥̑́N̥̑̄n̥̑̄N̯n̯N̩n̩Ǹ̩ǹ̩N̩̓n̩̓N̲n̲ƝɲᶮȠƞꞐꞑŊ̀ŋ̀Ŋ́ŋ́Ŋ̂ŋ̂Ŋ̄ŋ̄Ŋ̈ŋ̈Ŋ̈̇ŋ̈̇Ŋ̊ŋ̊Ŋ̑ŋ̑Ŋ̨ŋ̨Ŋ̣ŋ̣Ŋ̥ŋ̥Ŋ̥́ŋ̥́Ŋ̥̄ŋ̥̄Ŋ̥̄́ŋ̥̄́ᶇɳᶯȵɴ̇ɴ̣ÒòÓóÔôỐốỒồỖỗÔ̆ô̆ỔổÕõÕ̍õ̍Õ̎õ̎Õ̀õ̀ṌṍÕ̂õ̂Õ̌õ̌ṎṏȬȭŌōṒṓṐṑŌ̂ō̂Ō̃ō̃Ō̃́ō̃́Ō̄ō̄Ō̆ō̆Ō̆́ō̆́Ō̈ō̈Ō̌ō̌ŎŏŎ̀ŏ̀Ŏ́ŏ́Ŏ̈ŏ̈ȮȯȮ́ȯ́ȰȱO͘o͘Ó͘ó͘Ò͘ò͘Ō͘ō͘O̍͘o̍͘ÖöÖ́ö́Ö̀ö̀Ö̂ö̂Ö̌ö̌Ö̃ö̃ȪȫȪ̆ȫ̆Ö̆ö̆ỎỏO̊o̊ŐőǑǒO̍o̍O̎o̎ȌȍO̐o̐ȎȏO̓o̓ØøØ̀ø̀ǾǿØ̂ø̂Ø̃ø̃Ø̄ø̄Ø̄́ø̄́Ø̄̆ø̄̆Ø̆ø̆Ø̇ø̇Ø̇́ø̇́Ø̈ø̈Ø̋ø̋Ø̌ø̌Ø᷎ø᷎Ø̨ø̨Ǿ̨ǿ̨Ø̨̄ø̨̄Ø̣ø̣Ø̥ø̥Ø̰ø̰Ǿ̰ǿ̰ظø¸Ǿ¸ǿ¸ƟɵᶱƠơỚớỜờỠỡƠ̆ơ̆ỞởO᷎o᷎Ó᷎ó᷎O̧o̧Ó̧ó̧Ò̧ò̧Ô̧ô̧Ǒ̧ǒ̧ǪǫǪ̀ǫ̀Ǫ́ǫ́Ǫ̂ǫ̂Ǫ̃ǫ̃ǬǭǬ̀ǭ̀Ǭ́ǭ́Ǭ̂ǭ̂Ǭ̃ǭ̃Ǭ̆ǭ̆Ǭ̌ǭ̌Ǫ̆ǫ̆Ǫ̆́ǫ̆́Ǫ̇ǫ̇Ǫ̇́ǫ̇́Ǫ̈ǫ̈Ǫ̈̀ǫ̈̀Ǫ̈́ǫ̈́Ǫ̈̂ǫ̈̂Ǫ̈̃ǫ̈̃Ǫ̈̄ǫ̈̄Ǫ̈̌ǫ̈̌Ǫ̋ǫ̋Ǫ̌ǫ̌Ǫ̑ǫ̑Ǫ̣ǫ̣Ǫ̱ǫ̱Ǫ̱́ǫ̱́Ǫ̱̀ǫ̱̀Ǫ᷎ǫ᷎O̭o̭O̰o̰Ó̰ó̰O̱o̱Ò̱ò̱Ó̱ó̱Ô̱ô̱Ǒ̱ǒ̱Õ̱õ̱Ō̱ō̱Ṓ̱ṓ̱Ṑ̱ṑ̱Ō̱̂ō̱̂Ö̱ö̱Ö̱́ö̱́Ö̱̀ö̱̀Ö̱̂ö̱̂Ö̱̌ö̱̌O̮o̮ỌọỌ̀ọ̀Ọ́ọ́ỘộỌ̃ọ̃Ọ̄ọ̄Ọ̄̀ọ̄̀Ọ̄́ọ̄́Ọ̄̃ọ̄̃Ọ̄̆ọ̄̆Ọ̆ọ̆Ọ̈ọ̈Ọ̈̀ọ̈̀Ọ̈́ọ̈́Ọ̈̂ọ̈̂Ọ̈̄ọ̈̄Ọ̈̌ọ̈̌Ọ̌ọ̌Ọ̑ọ̑ỢợỌọO̤o̤Ò̤ò̤Ó̤ó̤Ô̤ô̤Ö̤ö̤O̥o̥Ō̥ō̥O̬o̬O̯o̯O̩o̩Õ͔õ͔Ō͔ō͔O̜o̜O̹o̹Ó̹ó̹O̲o̲ᴓᶗꝌꝍⱺꝊꝋƆ́ɔ́Ɔ̀ɔ̀Ɔ̂ɔ̂Ɔ̌ɔ̌Ɔ̃ɔ̃Ɔ̃́ɔ̃́Ɔ̃̀ɔ̃̀Ɔ̃̂ɔ̃̂Ɔ̃̌ɔ̃̌Ɔ̃̍ɔ̃̍Ɔ̃̎ɔ̃̎Ɔ̄ɔ̄Ɔ̆ɔ̆Ɔ̇ɔ̇Ɔ̈ɔ̈Ɔ̈̀ɔ̈̀Ɔ̈́ɔ̈́Ɔ̈̂ɔ̈̂Ɔ̈̌ɔ̈̌Ɔ̌ɔ̌Ɔ̍ɔ̍Ɔ̎ɔ̎Ɔ̣ɔ̣Ɔ̣̀ɔ̣̀Ɔ̣́ɔ̣́Ɔ̣̂ɔ̣̂Ɔ̣̃ɔ̣̃Ɔ̣̈ɔ̣̈Ɔ̣̈̀ɔ̣̈̀Ɔ̣̈́ɔ̣̈́Ɔ̣̈̂ɔ̣̈̂Ɔ̣̈̌ɔ̣̈̌Ɔ̣̌ɔ̣̌Ɔ̤ɔ̤Ɔ̤̀ɔ̤̀Ɔ̤́ɔ̤́Ɔ̤̂ɔ̤̂Ɔ̤̈ɔ̤̈Ɔ̱ɔ̱Ɔ̱̀ɔ̱̀Ɔ̱́ɔ̱́Ɔ̱̂ɔ̱̂Ɔ̱̌ɔ̱̌Ɔ̱̃ɔ̱̃Ɔ̱̈ɔ̱̈Ɔ̱̈̀ɔ̱̈̀Ɔ̱̈́ɔ̱̈́Ɔ̧ɔ̧Ɔ̧̀ɔ̧̀Ɔ̧́ɔ̧́Ɔ̧̂ɔ̧̂Ɔ̧̌ɔ̧̌Ɔ̨ɔ̨Ɔ̨́ɔ̨́Ɔ̨̀ɔ̨̀Ɔ̨̂ɔ̨̂Ɔ̨̌ɔ̨̌Ɔ̨̄ɔ̨̄Ɔ̨̆ɔ̨̆Ɔ̨̈ɔ̨̈Ɔ̨̱ɔ̨̱Ɔ̰ɔ̰Ɔ̰̀ɔ̰̀Ɔ̰́ɔ̰́Ɔ̰̄ɔ̰̄P̀p̀ṔṕP̃p̃P̄p̄P̆p̆ṖṗP̈p̈P̋p̋P̑p̑P̓p̓P̕p̕P̔p̔P͑p͑P̱p̱P̣p̣P̤p̤P̬p̬ⱣᵽꝐꝑᵱᶈƤƥꝒꝓꝔꝕᴘ̇Q́q́Q̃q̃Q̄q̄Q̇q̇Q̈q̈Q̋q̋Q̓q̓Q̕q̕Q̧q̧Q̣q̣Q̣̇q̣̇Q̣̈q̣̈Q̱q̱ꝖꝗꝖ̃ꝗ̃ꝘꝙʠɊɋR̀r̀ŔŕR̂r̂R̃r̃R̄r̄R̆r̆ṘṙR̋r̋ŘřR̍r̍ȐȑȒȓR̓r̓R̕r̕ŖŗR̦r̦R̨r̨R̨̄r̨̄ꞦꞧR̭r̭ṞṟṚṛṚ̀ṛ̀Ṛ́ṛ́ṜṝṜ́ṝ́Ṝ̃ṝ̃Ṝ̆ṝ̆R̤r̤R̥r̥R̥̀r̥̀Ŕ̥ŕ̥R̥̂r̥̂R̥̃r̥̃R̥̄r̥̄R̥̄́r̥̄́R̥̄̆r̥̄̆Ř̥ř̥R̬r̬R̩r̩R̯r̯ɌɍᵲꭨɺᶉɻʵⱹɼⱤɽɾᵳɿʀ̇ʀ̣Ꝛ́ꝛ́Ꝛ̣ꝛ̣S̀s̀ŚśŚ̀ś̀ŚśṤṥŜŝS̃s̃S̄s̄S̄̒s̄̒S̆s̆ṠṡṠ̃ṡ̃S̈s̈S̋s̋ŠšŠ̀š̀Š́š́ṦṧŠ̓š̓S̑s̑S̒s̒S̓s̓S̕s̕ŞşȘșS̨s̨Š̨š̨ꞨꞩS̱s̱Ś̱ś̱S̮s̮ṢṣṢ́ṣ́Ṣ̄ṣ̄ṨṩṢ̌ṣ̌Ṣ̕ṣ̕Ṣ̱ṣ̱S̤s̤Š̤š̤S̥s̥Ś̥S̬s̬S̩s̩S̪s̪ꜱ̇ꜱ̣ſ́ẛſ̣ᵴᶊʂᶳꟅⱾȿẜẝᶋᶘʆT̀t̀T́t́T̃t̃T̄t̄T̆t̆T̆̀t̆̀ṪṫT̈ẗŤťT̑t̑T̓t̓T̕t̕T̔t̔T͑t͑ŢţȚțT̨t̨T̗t̗ṰṱT̰t̰ṮṯT̮t̮ṬṭṬ́ṭ́T̤t̤T̥t̥T̬t̬T̯t̯T̪t̪ƾŦŧȾⱦᵵƫᶵƬƭƮʈȶᴛ̇ᴛ̣ÙùÚúÛûŨũŨ̀ũ̀ṸṹŨ̂ũ̂Ũ̊ũ̊Ũ̌ũ̌Ũ̍ũ̍Ũ̎ũ̎ŪūŪ̀ū̀Ū́ū́Ū̂ū̂Ū̌ū̌Ū̃ū̃Ū̄ū̄Ū̆ū̆Ū̆́ū̆́ṺṻŪ̊ū̊ŬŭŬ̀ŭ̀Ŭ́ŭ́U̇u̇U̇́u̇́U̇̄u̇̄ÜüǛǜǗǘÜ̂ü̂Ü̃ü̃ǕǖǕ̆ǖ̆Ü̆ü̆ǙǚỦủŮůŮ́ů́Ů̃ů̃ŰűǓǔU̍u̍U̎u̎ȔȕȖȗU̓u̓U᷎u᷎ỦủƯưỨứỪừỮữƯ̆ư̆ỬửỰựU̧u̧Ú̧ú̧Ù̧ù̧Û̧û̧Ǔ̧ǔ̧ŲųŲ̀ų̀Ų́ų́Ų̂ų̂Ų̌ų̌Ų̄ų̄Ų̄́ų̄́Ų̄̀ų̄̀Ų̄̂ų̄̂Ų̄̌ų̄̌Ų̄̌ų̄̌Ų̈ų̈Ų̈́ų̈́Ų̈̀ų̈̀Ų̈̂ų̈̂Ų̈̌ų̈̌Ų̈̄ų̈̄Ų̋ų̋Ų̱ų̱Ų̱́ų̱́Ų̱̀ų̱̀ṶṷṴṵṴ̀ṵ̀Ṵ́ṵ́Ṵ̄ṵ̄Ṵ̈ṵ̈U̱u̱Ù̱ù̱Ú̱ú̱Û̱û̱Ũ̱ũ̱Ū̱ū̱Ū̱́ū̱́Ū̱̀ū̱̀Ū̱̂ū̱̂Ü̱ü̱Ǘ̱ǘ̱Ǜ̱ǜ̱Ü̱̂ü̱̂Ǚ̱ǚ̱Ǔ̱ǔ̱ỤụỤ̀ụ̀Ụ́ụ́Ụ̂ụ̂Ụ̃ụ̃Ụ̄ụ̄Ụ̈ụ̈Ụ̈̀ụ̈̀Ụ̈́ụ̈́Ụ̈̂ụ̈̂Ụ̈̌ụ̈̌Ụ̌ụ̌Ụ̍ụ̍ṲṳṲ̀ṳ̀Ṳ́ṳ́Ṳ̂ṳ̂Ṳ̈ṳ̈U̥u̥Ü̥ü̥U̯u̯Ũ̯ũ̯Ü̯ü̯U̩u̩U͔u͔Ũ͔ũ͔Ū͔ū͔ɄʉᶶɄ̀ʉ̀Ʉ́ʉ́Ʉ̂ʉ̂Ʉ̃ʉ̃Ʉ̄ʉ̄Ʉ̈ʉ̈Ʉ̌ʉ̌Ʉ̧ʉ̧Ʉ̰ʉ̰Ʉ̰́ʉ̰́Ʉ̱ʉ̱Ʉ̱́ʉ̱́Ʉ̱̀ʉ̱̀Ʉ̱̂ʉ̱̂Ʉ̱̈ʉ̱̈Ʉ̱̌ʉ̱̌Ʉ̥ʉ̥ꞸꞹᵾᶙꞾꞿʮʯɰᶭƱ̀ʊ̀Ʊ́ʊ́Ʊ̃ʊ̃ᵿV̀v̀V́v́V̂v̂ṼṽṼ̀ṽ̀Ṽ́ṽ́Ṽ̂ṽ̂Ṽ̌ṽ̌V̄v̄V̄̀v̄̀V̄́v̄́V̄̂v̄̂V̄̃v̄̃V̄̄v̄̄V̄̆v̄̆V̄̌v̄̌V̆v̆V̆́v̆́V̇v̇V̈v̈V̈̀v̈̀V̈́v̈́V̈̂v̈̂V̈̄v̈̄V̈̌v̈̌V̊v̊V̋v̋V̌v̌V̍v̍V̏v̏V̐v̐V̓v̓V̧v̧V̨v̨V̨̀v̨̀V̨́v̨́V̨̂v̨̂V̨̌v̨̌V̨̄v̨̄V̨̄́v̨̄́V̨̄̀v̨̄̀V̨̄̂v̨̄̂V̨̄̌v̨̄̌V̨̈v̨̈V̨̈́v̨̈́V̨̈̀v̨̈̀V̨̈̂v̨̈̂V̨̈̌v̨̈̌V̨̈̄v̨̈̄V̨̋v̨̋V̨̱v̨̱V̨̱́v̨̱́V̨̱̀v̨̱̀V̨̱̂v̨̱̂V̨̱̌v̨̱̌V̱v̱V̱̀v̱̀V̱́v̱́V̱̂v̱̂V̱̌v̱̌Ṽ̱ṽ̱V̱̈v̱̈V̱̈́v̱̈́V̱̈̀v̱̈̀V̱̈̂v̱̈̂V̱̈̌v̱̈̌ṾṿV̥v̥ꝞꝟᶌƲʋᶹƲ̀ʋ̀Ʋ́ʋ́Ʋ̂ʋ̂Ʋ̃ʋ̃Ʋ̈ʋ̈Ʋ̌ʋ̌ⱱⱴꝨ́ꝩ́Ꝩ̇ꝩ̇Ꝩ̣ꝩ̣ẀẁẂẃŴŵW̃w̃W̄w̄W̆w̆ẆẇẄẅW̊ẘW̋w̋W̌w̌W̍w̍W̓w̓W̱w̱ẈẉW̥w̥W̬w̬ⱲⱳX̀x̀X́x́X̂x̂X̃x̃X̄x̄X̆x̆X̆́x̆́ẊẋẌẍX̊x̊X̌x̌X̓x̓X̕x̕X̱x̱X̱̓x̱̓X̣x̣X̣̓x̣̓X̥x̥ᶍỲỳÝýŶŷỸỹȲȳȲ̀ȳ̀Ȳ́ȳ́Ȳ̃ȳ̃Ȳ̆ȳ̆Y̆y̆Y̆̀y̆̀Y̆́y̆́ẎẏẎ́ẏ́ŸÿŸ́ÿ́Y̊ẙY̋y̋Y̌y̌Y̍y̍Y̎y̎Y̐y̐Y̓y̓ỶỷY᷎y᷎Y̱y̱ỴỵỴ̣ỵ̣Y̥y̥Y̯y̯ɎɏƳƴỾỿZ̀z̀ŹźẐẑZ̃z̃Z̄z̄ŻżZ̈z̈Z̋z̋ŽžŽ́ž́Ž̏ž̏Z̑z̑Z̓z̓Z̕z̕Z̨z̨Z̗z̗ẔẕZ̮z̮ẒẓẒ́ẓ́Ẓ̌ẓ̌Ẓ̣ẓ̣Z̤z̤Z̥z̥ƵƶᵶᶎꟆȤȥʐᶼʑᶽⱿɀⱫⱬƷ́ʒ́Ʒ̇ʒ̇ǮǯǮ́ǯ́Ʒ̥ʒ̥ᶚƺʓÞ́þ́Þ̣þ̣ꝤꝥꝦꝧƻꜮꜯʡʢꜲꜳꜲ́ꜳ́Ꜳ̋ꜳ̋Ꜳ̇ꜳ̇Ꜳ̈ꜳ̈Ꜳ̣ꜳ̣ÆæᴭÆ̀æ̀ǼǽÆ̂æ̂Æ̌æ̌Æ̃æ̃Æ̃́æ̃́Æ̃̀æ̃̀Æ̃̂æ̃̂Æ̃̌æ̃̌ǢǣǢ́ǣ́Ǣ̂ǣ̂Ǣ̃ǣ̃Ǣ̆ǣ̆Æ̆æ̆Æ̇æ̇Æ̈æ̈Æ̈̀æ̈̀Æ̈́æ̈́Æ̈̂æ̈̂Æ̈̌æ̈̌Æ̊æ̊Æ̋æ̋Æ᷎æ᷎Æ̨æ̨Æ̨̀æ̨̀Ǽ̨ǽ̨Æ̨̂æ̨̂Æ̨̈æ̨̈Ǣ̨ǣ̨Æ̨̌æ̨̌Æ̨̱æ̨̱Æ̱æ̱Æ̱̃æ̱̃Æ̱̈æ̱̈Æ̣æ̣Æ͔̃æ͔̃ᴁᴂᵆꬱꜴꜵꜴ́ꜵ́Ꜵ̋ꜵ̋Ꜵ̣ꜵ̣ꜶꜷꜶ́ꜷ́Ꜷ̣ꜷ̣ꜸꜹꜺꜻꜸ́ꜹ́Ꜹ̋ꜹ̋Ꜹ̨ꜹ̨Ꜹ̣ꜹ̣Ꜻ́ꜻ́ꜼꜽꜼ̇ꜽ̇Ꜽ̣ꜽ̣ȸDZDzdzʣDŽDždžꭦʥʤffffifflfiflʩIJijꭡLJLjljỺỻʪʫɮNJNjnjŒœꟹŒ̀œ̀Œ́œ́Œ̂œ̂Œ̃œ̃Œ̄œ̄Œ̄́œ̄́Œ̄̃œ̄̃Œ̄̆œ̄̆Œ̋œ̋Œ̌œ̌Œ̨œ̨Œ̨̃œ̨̃Œ̣œ̣Œ̯œ̯ɶᴔꭂꭁꭢꝎꝏꝎ́ꝏ́Ꝏ̈ꝏ̈Ꝏ̋ꝏ̋Ꝏ̣ꝏ̣ꭃꭄȹẞßstſtʨᵺʦꭧʧꜨꜩꭀᵫꭐꭑꭣꝠꝡꝠ̈ꝡ̈Ꝡ̋ꝡ̋ꭠ";
let re = new RegExp(`(?<=[^${wordchar}]*)[${wordchar}]+(?=[${wordchar}]*)`, "g");
console.log(myString.match(re)); // ["Whatever", "you", "want", "here", "ex", "Bân-lâm-gú", "or", "bokmål", "or", "Português", "or", "Română", "or", "Slovenčina", "or", "Slovenščina"]
For russian and latin alphabet I've used
[\\wа-яА-Я]

Regular expression to allow all alphabet characters plus unicode characters

I need a regular expression to allow all alphabet characters plus Greek/German alphabet in a string but replace those symbols ?,&,^,". with *
I skipped the list with characters to escape to made the question simple.
I really want to see how to construct this and afterwards include alphabet sets using ASCII codes.
if you have a finite and short set of elements to replace you could just use a class e.g.
string.replace(/[?\^&]/g, '*');
and add as many symbols as you want to reject. you could also add ranges of unicode symbols you want to replace (e.g. \u017F-\036F\u0400-\uFFFF )
otherwise use a a class to specify what symbols don't need to be replaced, like a-z, accented/diacritic letters and greek symbols
string.replace(/[^a-z\00C0-\017E\u0370-\03FF]/gi, '*');
You have to use the XRegexp plugin, along with the Unicode add-on.
Once you have that, you can use modern regexes like /[\p{L}\p{Nl}]/, which necessarily also includes those \p{Greek} code points which are letters or letter-numbers. But you could also match /[\p{Latin}\p{Greek}]/ if you wanted.
Javascript’s own regexes are terrible. Use XRegexp.
So something like: /^[^?&\^"]*$/ (that means the string is composed only of characters outside the five you listed)...
But if you want to have the greek characters and the unicode characters (what are unicode characters? àèéìòù? Japanese?) perhaps you'll have to use http://xregexp.com/ It is a regex library for javascript that includes character classes for the various unicode character classes (I know I'm repeating myself) plus other "commands" for unicode handling.

Categories

Resources