Capture only letter followed by letter, excluding some words - Regex - javascript

I need to capture each letter in a string that is followed by another letter, excluding some specific words. I have the following LaTeX string:
22+2p+p^{pp^{2p+pp}}+\delta+\pi+\sqrt(2p)+\\frac{2}+{2p}+ppp+2P+\sqrt(9)+xx+\to+p2+\pi+px+ab+\alpha
I want to add * between the letters, but the following words should be left untouched:
\frac
\delta
\pi
\sqrt
\alpha
The output should be as follows:
22+2p+p^{p*p^{2p+p*p}}+\delta+\pi+\sqrt(2p)+\\frac{2}+{2p}+p*p*p+2P+\sqrt(9)+x*x+\to+p2+\pi+p*x+a*b+\alpha
The letters are dynamic entries, which can be any of the alphabet. I thought about using "positive lookbehind" but its support is limited.

You can achieve the result you want with a string replace with callback, using a regex:
(delta|frac|pi|sqrt|alpha|to)|([a-z](?=[a-z]))
This matches either one of the excluded words (group 1) or a letter that is followed by another letter (group 2). In the callback, if group 1 is present it is returned unchanged; otherwise group 2 is returned followed by a *:
let str = '22+2p+p^{pp^{2p+pp}}+\\delta+\\pi+\\sqrt(2p)+\\\\frac{2}+{2p}+ppp+2P+\\sqrt(9)+xx+\\to+p2+\\pi+px+ab+\\alpha';
const replacer = (m, p1, p2) => {
  return p1 ? p1 : (p2 + '*');
}
console.log(str.replace(/(delta|frac|pi|sqrt|alpha|to)|([a-z](?=[a-z]))/gi, replacer));
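One caveat with the pattern above: because the excluded words are matched without their leading backslash, a plain run of letters such as top is also treated as containing the excluded word to. The following is a minimal sketch, not part of the original answer, that ties each excluded word to its backslash. The function name insertStars and the EXCLUDED list are assumptions for illustration, and the words are assumed to contain only letters (so no regex escaping is needed).

```javascript
// Hypothetical helper: the name and the word list are illustrative assumptions.
const EXCLUDED = ['delta', 'frac', 'pi', 'sqrt', 'alpha', 'to'];

function insertStars(str, excluded = EXCLUDED) {
  // Group 1: a backslash followed by an excluded word (kept as-is).
  // Group 2: a letter that is followed by another letter (gets a '*').
  const re = new RegExp(
    '(\\\\(?:' + excluded.join('|') + '))|([a-z](?=[a-z]))',
    'gi'
  );
  return str.replace(re, (m, command, letter) => (command ? command : letter + '*'));
}

const input = '22+2p+p^{pp^{2p+pp}}+\\delta+\\pi+\\sqrt(2p)+\\\\frac{2}+{2p}+ppp+2P+\\sqrt(9)+xx+\\to+p2+\\pi+px+ab+\\alpha';
console.log(insertStars(input));
console.log(insertStars('top+\\to')); // the bare word "top" is no longer skipped
```

Commands that are not in the list (for example \beta) would still get stars inserted, just as with the original pattern.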

You can do something like this:
const str = "22+2p+p^{pp^{2p+pp}}+\\delta+\\pi+\\sqrt(2p)+\\\\frac{2}+{2p}+ppp+2P+\\sqrt+xx+\\to+p2+\\pi+px+ab+\\alpha";
const result = str.replace(/\\?[a-zA-Z]{2,}/g, (v) => {
  if (v.startsWith('\\')) {
    return v;
  }
  return v.split("").join("*");
});
console.log(result);
This matches every run of two or more consecutive letters, optionally preceded by a \. In the replacement function, a match that starts with \ is returned unchanged; any other match is split into single letters and joined with *.

You could use negative lookbehind to solve this.
const regex = /(?<!\\{1,})(\b[a-zA-Z]{2,}\b)/g;
const str = `22+2p+p^{pp^{2p+pp}}+\\delta+\\pi+\\sqrt(2p)+\\\\frac{2}+{2p}+ppp+2P+\\sqrt+xx+\\to+p2+\\pi+px+ab+\\alpha`;
// Split each matched letter run and rejoin it with "*".
const result = str.replace(regex, function(match) {
  return match.split("").join("*");
});
console.log("Match: ", str.match(regex).toString());
console.log(result);
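Since lookbehind is not available in every JavaScript engine (the concern raised in the question), a pattern like the one above can be guarded with a feature check. The sketch below is an illustration under assumptions, not part of the original answer: supportsLookbehind and starLetters are hypothetical names, the lookbehind is simplified to (?<!\\), and the fallback reuses the optional-backslash idea. The two branches agree on input like the sample, but they can differ when letters directly follow a digit (for example 2pp), because \b does not see a boundary there.

```javascript
// Hypothetical helpers; the names are illustrative assumptions.
function supportsLookbehind() {
  try {
    // Built from a string so an unsupported engine throws at run time
    // instead of failing to parse the whole script.
    new RegExp('(?<!a)b');
    return true;
  } catch (e) {
    return false;
  }
}

function starLetters(str) {
  const join = (letters) => letters.split('').join('*');
  if (supportsLookbehind()) {
    // Whole words of 2+ letters that are not preceded by a backslash.
    const re = new RegExp('(?<!\\\\)\\b[a-zA-Z]{2,}\\b', 'g');
    return str.replace(re, (m) => join(m));
  }
  // Fallback: also match the backslash, then skip commands in the callback.
  return str.replace(/\\?[a-zA-Z]{2,}/g, (m) => (m.startsWith('\\') ? m : join(m)));
}

console.log(starLetters('p^{pp}+\\delta+ppp+2P+\\to+px'));
```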

Related

Calculating mixed numbers and chars and concatenating it back again in JS/jQuery

I need to manipulate drawing of a SVG, so I have attribute "d" values like this:
d = "M561.5402,268.917 C635.622,268.917 304.476,565.985 379.298,565.985"
What I want is to strip the command letters from the values, do a calculation on each number (for the sake of simplicity, let's say add 100 to each value), and then concatenate everything back together, so the final result is something like this:
d = "M661.5402,368.917 C735.622,368.917 404.476,665.985 479.298,665.985"
Keep in mind that:
some values can start with a character
values are delimited by comma
some values within comma delimiter can be delimited by space
values are decimal
This is my try:
let arr1 = d.split(',');
arr1 = arr1.map(element => {
  let arr2 = element.split(' ');
  if (arr2.length > 1) {
    arr2 = arr2.map(el => {
      let startsWithChar = el.match(/\D+/);
      if (startsWithChar) {
        el = el.replace(/\D/g,'');
      }
      el = parseFloat(el) + 100;
      if (startsWithChar) {
        el = startsWithChar[0] + el;
      }
    })
  }
  else {
    let startsWithChar = element.match(/\D+/);
    if (startsWithChar) {
      element = element.replace(/\D/g,'');
    }
    element = parseFloat(element) + 100;
    if (startsWithChar) {
      element = startsWithChar[0] + element;
    }
  }
});
d = arr1.join(',');
I tried a regex replace(/\D/g,''), but it also strips the decimal point from the value, so I think my solution is full of holes.
Maybe another solution would be to modify each of the path values/commands directly. I'm open to that too, but I don't know how.
const s = 'M561.5402,268.917 C635.622,268.917 304.476,565.985 379.298,565.985'
console.log(s.replaceAll(/[\d.]+/g, m=>+m+100))
You might use a pattern to match the format in the string with 2 capture groups.
([ ,]?\b[A-Z]?)(\d+\.\d+)\b
The pattern matches:
( Capture group 1
[ ,]?\b[A-Z]? Match an optional space or comma, a word boundary and an optional uppercase char A-Z
) Close group 1
( Capture group 2
\d+\.\d+ Match 1+ digits, a dot and 1+ digits
) Close group 2
\b A word boundary to prevent a partial word match
Regex demo
First capture the optional delimiter followed by an optional uppercase char in group 1, and the decimal number in group 2.
Then add 100 to the decimal value and join back the 2 group values.
const d = "M561.5402,268.917 C635.622,268.917 304.476,565.985 379.298,565.985";
const regex = /([ ,]?\b[A-Z]?)(\d+\.\d+)\b/g;
const res = Array.from(
  d.matchAll(regex), m => m[1] + (+m[2] + 100)
).join('');
console.log(res);
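Both answers boil down to a regex replace with a callback. As a rough sketch (the function name shiftPathNumbers and its parameters are made up for illustration), this can be wrapped in a reusable function that also rounds each result, since adding floats can otherwise produce long binary-rounding tails. It assumes unsigned decimal numbers, as in the question, and does not handle signs or exponents, which SVG path data may contain.

```javascript
// Hypothetical helper: adds `delta` to every unsigned number in a path string.
function shiftPathNumbers(d, delta, decimals = 4) {
  // Match an integer part with an optional fractional part.
  return d.replace(/\d+(?:\.\d+)?/g, (num) => {
    const shifted = parseFloat(num) + delta;
    // toFixed rounds to `decimals` places; Number() drops trailing zeros.
    return String(Number(shifted.toFixed(decimals)));
  });
}

const path = 'M561.5402,268.917 C635.622,268.917 304.476,565.985 379.298,565.985';
console.log(shiftPathNumbers(path, 100));
```

Everything that is not a number (command letters, commas, spaces) is left exactly where it was, so nothing needs to be split and joined again.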

Regex match apostrophe inside, but not around words, inside a character set

I'm counting how many times different words appear in a text using Regular Expressions in JavaScript. My problem is when I have quoted words: 'word' should be counted simply as word (without the quotes, otherwise they'll behave as two different words), while it's should be counted as a whole word.
(?<=\w)(')(?=\w)
This regex can identify apostrophes inside, but not around words. Problem is, I can't use it inside a character set such as [\w]+.
(?<=\w)(')(?=\w)|[\w]+
Will count it's a 'miracle' of nature as 7 words, instead of 5 (it, ', s becoming 3 different words). Also, the third word should be selected simply as miracle, and not as 'miracle'.
To make things even more complicated, I need to capture diacritics too, so I'm using [A-Za-zÀ-ÖØ-öø-ÿ] instead of \w.
How can I accomplish that?
1) You can simply use /[^\s]+/g regex
const str = `it's a 'miracle' of nature`;
const result = str.match(/[^\s]+/g);
console.log(result.length);
console.log(result);
2) If you are calculating total number of words in a string then you can also use split as:
const str = `it's a 'miracle' of nature`;
const result = str.split(/\s+/);
console.log(result.length);
console.log(result);
3) If you want each word without a quote at the start and at the end, you can do:
const str = `it's a 'miracle' of nature`;
const result = str.match(/[^\s]+/g).map((s) => {
  s = s[0] === "'" ? s.slice(1) : s;
  s = s[s.length - 1] === "'" ? s.slice(0, -1) : s;
  return s;
});
console.log(result.length);
console.log(result);
You might use an alternation with 2 capture groups, and then check for the values of those groups.
(?<!\S)'(\S+)'(?!\S)|(\S+)
(?<!\S)' Negative lookbehind, assert a whitespace boundary to the left and match '
(\S+) Capture group 1, match 1+ non whitespace chars
'(?!\S) Match ' and assert a whitespace boundary to the right
| Or
(\S+) Capture group 2, match 1+ non whitespace chars
See a regex demo.
const regex = /(?<!\S)'(\S+)'(?!\S)|(\S+)/g;
const s = "it's a 'miracle' of nature";
Array.from(s.matchAll(regex), m => {
  if (m[1]) console.log(m[1])
  if (m[2]) console.log(m[2])
});
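If the goal is a word-frequency count that also covers the accented range from the question, another option is to describe a word directly: one or more letters, optionally followed by groups of an apostrophe plus letters. This is only a sketch under assumptions: the function name countWords is made up, words are lowercased before counting, and only the straight apostrophe ' is treated as word-internal.

```javascript
// Letter class taken from the question; a word may contain inner apostrophes.
const LETTER = "[A-Za-zÀ-ÖØ-öø-ÿ]";
const WORD = new RegExp(`${LETTER}+(?:'${LETTER}+)*`, 'g');

// Hypothetical helper: returns a Map of lowercased word -> occurrences.
function countWords(text) {
  const counts = new Map();
  for (const word of text.match(WORD) || []) {
    const key = word.toLowerCase();
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

const counts = countWords("it's a 'miracle' of nature, and it's déjà vu");
console.log([...counts.keys()]);
console.log(counts.get("it's"));
```

Because a surrounding quote is never followed or preceded by a letter on both sides, 'miracle' is counted as miracle, while it's stays one word.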

JS : Remove all strings which are starting with specific character

I have an array that contains names. Some of them start with a dot (.), and some have a dot in the middle or elsewhere. I need to remove only the names that start with a dot. I'm looking for a better way to do this in JavaScript.
var myarr = 'ad, ghost, hg, .hi, jk, find.jpg, dam.ark, haji, jive.pdf, .find, home, .war, .milk, raj, .ker';
var refinedArr = ??
You can use the filter function and you can access the first letter of every word using item[0]. You do need to split the string first.
var myarr = 'ad, ghost, hg, .hi, jk, find.jpg, dam.ark, haji, jive.pdf, .find, home, .war, .milk, raj, .ker'.split(", ");
var refinedArr = myarr.filter(function(item) {
  return item[0] != "."
});
console.log(refinedArr)
Use filter and startsWith:
let myarr = ['ad', 'ghost', 'hg', '.hi', 'jk'];
let res = myarr.filter(e => ! e.startsWith('.'));
console.log(res);
You can use the RegEx \B\.\w+,? ? and replace with an empty String.
\B matches a position that is not a word boundary (so the dot must not be directly preceded by a word character)
\. matches a dot
\w+ matches one or more word char
,? matches 0 or 1 ,
[space]? matches 0 or 1 [space]
Demo:
const regex = /\B\.\w+,? ?/g;
const str = `ad, ghost, hg, .hi, jk, find.jpg, dam.ark, haji, jive.pdf, .find, home, .war, .milk, raj, .ker`;
const subst = ``;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log('Substitution result: ', result);
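Note that with this input the regex replacement leaves a trailing ", " behind, because the last entry (.ker) is removed but the separator before it is not. If the result should stay a clean comma-separated string, a split/filter/join round trip avoids that. A minimal sketch, with removeDotNames as a made-up name and assuming entries are separated by commas with optional whitespace:

```javascript
// Hypothetical helper: drops entries that start with a dot and rebuilds the string.
function removeDotNames(list) {
  return list
    .split(/\s*,\s*/)                        // tolerate "a,b" and "a , b"
    .filter((name) => !name.startsWith('.')) // keep "find.jpg", drop ".find"
    .join(', ');
}

const names = 'ad, ghost, hg, .hi, jk, find.jpg, dam.ark, haji, jive.pdf, .find, home, .war, .milk, raj, .ker';
console.log(removeDotNames(names));
```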

JavaScript: Amend the Sentence

I am having trouble with the JavaScript problem below.
Question:
You have been given a string s, which is supposed to be a sentence. However, someone forgot to put spaces between the different words, and for some reason they capitalized the first letter of every word. Return the sentence after making the following amendments:
Put a single space between the words.
Convert the uppercase letters to lowercase.
Example
"CodefightsIsAwesome", the output should be "codefights is awesome";
"Hello", the output should be "hello".
My current code is:
Right now, my second for-loop just manually slices the parts from the string.
How can I make this dynamic and insert "space" in front of the Capital String?
You can use String.prototype.match() with the RegExp /[A-Z][^A-Z]*/g to match an A-Z letter followed by zero or more characters that are not A-Z (that is, up to the next capital or the end of the string). Then chain Array.prototype.map() to call .toLowerCase() on each matched word, and .join(" ") to put a space between the words in the resulting string.
var str = "CodefightsIsAwesome";
var res = str.match(/[A-Z][^A-Z]*/g).map(word => word.toLowerCase()).join(" ");
console.log(res);
Alternatively, as suggested by @FissureKing, you can use String.prototype.replace() with .trim() and .toLowerCase() chained:
var str = "CodefightsIsAwesome";
var res = str.replace(/[A-Z][^A-Z]*/g, word => word + ' ').trim().toLowerCase();
console.log(res);
Rather than coding a loop, I'd do it in one line with a (reasonably) simple string replacement:
function amendTheSentence(s) {
  return s.replace(/[A-Z]/g, function(m) { return " " + m.toLowerCase() })
    .replace(/^ /, "");
}
console.log(amendTheSentence("CodefightsIsAwesome"));
console.log(amendTheSentence("noCapitalOnFirstWord"));
console.log(amendTheSentence("ThereIsNobodyCrazierThanI"));
That is, match any uppercase letter with the regular expression /[A-Z]/, replace the matched letter with a space plus that letter in lowercase, then remove any space that was added at the start of the string.
Further reading:
String .replace() method
Regular expressions
We can loop through once.
The below assumes the very first character should always be capitalized in our return array. If that is not true, simply remove the first if block from below.
For each character after that, we check to see if it is capitalized. If so, we add it to our return array, prefaced with a space. If not, we add it as-is into our array.
Finally, we join the array back into a string and return it.
const sentence = "CodefightsIsAwesome";
const amend = function(s) {
  const ret = []; // declared locally instead of leaking an implicit global
  for (let i = 0; i < s.length; i++) {
    const char = s[i];
    if (i === 0) {
      ret.push(char.toUpperCase());
    } else if (char.toUpperCase() === char) {
      ret.push(` ${char.toLowerCase()}`);
    } else {
      ret.push(char);
    }
  }
  return ret.join('');
};
console.log(amend(sentence));
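One thing to watch in the loop above: char.toUpperCase() === char is also true for digits, spaces, and punctuation, so those would get a space in front of them as well. Here is a small sketch that only reacts to A-Z and uses the match offset that replace() passes to its callback, so no leading space has to be trimmed afterwards (amendSentence is just an illustrative name):

```javascript
// Hypothetical helper: lowercases capitals and puts a space before each one,
// except when the capital is the very first character.
function amendSentence(s) {
  return s.replace(/[A-Z]/g, (letter, offset) =>
    (offset === 0 ? '' : ' ') + letter.toLowerCase()
  );
}

console.log(amendSentence('CodefightsIsAwesome')); // codefights is awesome
console.log(amendSentence('Hello'));               // hello
console.log(amendSentence('Route66IsLong'));       // route66 is long
```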

Javascript validation regex for names

I am looking to accept names in my app with letters and hyphens (dashes). I based my code on an answer I found here and came up with this:
function validName(n){
  var nameRegex = /^[a-zA-Z\-]+$/;
  if(n.match(nameRegex) == null){
    return "Wrong";
  }
  else{
    return "Right";
  }
}
The only problem is that it accepts a hyphen as the first character (even several of them), which I don't want.
Thanks.
Use a negative lookahead assertion to avoid matching a string that starts with a hyphen. There is no need to escape - in a character class when it is placed at the end of the class. To also reject a trailing -, either end the pattern with a character class that does not contain -, or add a second lookahead assertion.
var nameRegex = /^(?!-)[a-zA-Z-]*[a-zA-Z]$/;
// or
var nameRegex = /^(?!-)(?!.*-$)[a-zA-Z-]+$/;
var nameRegex = /^(?!-)[a-zA-Z-]*[a-zA-Z]$/;
// or
var nameRegex1 = /^(?!-)(?!.*-$)[a-zA-Z-]+$/;
function validName(n) {
  if (n.match(nameRegex) == null) {
    return "Wrong";
  } else {
    return "Right";
  }
}
function validName1(n) {
  if (n.match(nameRegex1) == null) {
    return "Wrong";
  } else {
    return "Right";
  }
}
console.log(validName('abc'));
console.log(validName('abc-'));
console.log(validName('-abc'));
console.log(validName('-abc-'));
console.log(validName('a-b-c'));
console.log(validName1('abc'));
console.log(validName1('abc-'));
console.log(validName1('-abc'));
console.log(validName1('-abc-'));
console.log(validName1('a-b-c'));
FYI: You can use the RegExp#test method to check for a match; it returns a boolean.
if(nameRegex.test(n)){
  return "Right";
}
else{
  return "Wrong";
}
UPDATE: If you want only a single optional - between words, use a group repeated zero or more times that starts with -, as in @WiktorStribiżew's answer.
var nameRegex = /^[a-zA-Z]+(?:-[a-zA-Z]+)*$/;
You need to split your single character class in two, moving the hyphen outside of it, and use a grouping construct to match sequences of a hyphen followed by letters:
var nameRegex = /^[a-zA-Z]+(?:-[a-zA-Z]+)*$/;
See the regex demo
This matches one or more letters at the start of the string, followed by zero or more occurrences of - plus one or more letters, up to the end of the string.
If there can be only 1 hyphen in the string, replace * at the end with ? (see the regex demo).
If you also want to allow whitespace between the letter groups, replace the - with [\s-] (demo).
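The three variants described above can be put side by side. This is a sketch rather than a definitive validator; the constant names and isValidName are assumptions made for the example:

```javascript
// Letters separated by single hyphens, any number of parts.
const multiHyphen = /^[a-zA-Z]+(?:-[a-zA-Z]+)*$/;
// Same, but at most one hyphen in the whole string (* replaced with ?).
const singleHyphen = /^[a-zA-Z]+(?:-[a-zA-Z]+)?$/;
// Same as the first, but a whitespace character may be used instead of a hyphen.
const hyphenOrSpace = /^[a-zA-Z]+(?:[\s-][a-zA-Z]+)*$/;

// Hypothetical helper: returns a boolean instead of "Right"/"Wrong".
const isValidName = (name, pattern = multiHyphen) => pattern.test(name);

console.log(isValidName('a-b-c'));                   // true
console.log(isValidName('-abc'));                    // false
console.log(isValidName('a-b-c', singleHyphen));     // false
console.log(isValidName('Mary Ann', hyphenOrSpace)); // true
```

None of the patterns uses the g flag, so calling test() repeatedly on the same regex object does not carry state between calls.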
You can either use a negative lookahead like Pranav C Balan proposed or just use this simple expression:
^[a-zA-Z]+[a-zA-Z-]*$
Live example: https://regex101.com/r/Dj0eTH/1
The below regex is useful for surnames if one wants to forbid leading or trailing non-alphabetic characters, while permitting a small set of common word-joining characters in between two names.
^[a-zA-Z]+[- ']{0,1}[a-zA-Z]+$
Explanation
^[a-zA-Z]+ must begin with at least one letter
[- ']{0,1} allow zero or at most one of any of -, a space, or '
[a-zA-Z]+$ must end with at least one letter
Test cases
(The double-quotes have been added purely to illustrate the presence of whitespace.)
"Blair" => match
" Blair" => no match
"Blair " => no match
"-Blair" => no match
"- Blair" => no match
"Blair-" => no match
"Blair -" => no match
"Blair-Nangle" => match
"Blair--Nangle" => no match
"Blair Nangle" => match
"Blair -Nangle" => no match
"O'Nangle" => match
"BN" => match
"BN " => no match
" O'Nangle" => no match
"B" => no match
"3Blair" => no match
"!Blair" => no match
"van Nangle" => match
"Blair'" => no match
"'Blair" => no match
Limitations include:
No single-character surnames
No surnames composed of more than two words
Check it out on regex101.
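The two limitations follow directly from the pattern: it requires a run of letters on both sides of at most one separator. If they matter, one possible relaxation (a sketch, not the author's pattern) is to make the separator-plus-letters part a repeatable group. The names strictSurname and relaxedSurname are made up for the example.

```javascript
// Pattern from the answer above: at least two letters, at most one separator.
const strictSurname = /^[a-zA-Z]+[- ']{0,1}[a-zA-Z]+$/;
// Assumed relaxation: one letter run, then any number of "separator + letters".
const relaxedSurname = /^[a-zA-Z]+(?:[- '][a-zA-Z]+)*$/;

const samples = ['Blair', 'B', "O'Nangle", 'van der Nangle', 'Blair--Nangle', 'Blair '];
for (const name of samples) {
  // Prints the quoted name, then the strict and the relaxed result.
  console.log(JSON.stringify(name), strictSurname.test(name), relaxedSurname.test(name));
}
```

With the relaxed pattern, "B" and "van der Nangle" are accepted, while doubled separators and leading or trailing separators are still rejected.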
