RegEx loop behaviour in TypeScript - Angular - javascript

I have a strange problem with a regex. I'm trying to check the user input in a contentEditable div after each keydown, and if it matches, for example, "hello" or "status", the function should return the text with the match wrapped as <span style="color: purple">hello</span>. It works properly with unique words or phrases, and on paste, but when I declare both "hello" and "hello world" as keywords and type them into the contentEditable, the regex matches only "hello", even if "hello world" comes first in my array of strings.
Here is the code of my function:
searchByRegEx(wordsArr: string[], sentence: string): string {
  let matchingWords = []; // matching words array
  wordsArr.forEach((label) => {
    const regEx = new RegExp(label, 'gi');
    regEx.lastIndex = 0
    let match = regEx.exec(sentence);
    while (match) {
      // console.log(match) - results of this console.log below
      matchingWords.push(match[0]);
      match = regEx.exec(sentence);
    }
  });
  matchingWords = matchingWords.sort(function (a, b) {
    return b.length - a.length;
  });
  matchingWords.forEach((word) => {
    sentence = sentence.replaceAll(
      word,
      `<span style='color:${InputColorsHighlightValue.PURPLE}'>${word}</span>`
    );
  });
  return sentence
}
}
And here is how I use it:
if (this.labels) {
  textToShow = this.searchByRegEx(['hello', 'hello world'], textToShow)
}
This is how it looks in DevTools: the regex matches properly, but ONLY on paste:
And here is what happens when I type it manually: it checks on every keydown, but can't match both hello and hello world. As you can see, the input to the regex is the same as above:
I am struggling with this functionality and would appreciate any helpful advice.
Live version on StackBlitz

There are a few things wrong with the actual regex in your example. If /hello?/gi is your input for a regex, consider the following things:
Your regex function already adds the gi flags (new RegExp(label, 'gi');).
There are two ways to create a regex in JavaScript: new RegExp('regex') or a literal /regex/. Writing a regex between slashes, as in /hello/, is the literal form. Don't combine the two: in a string passed to new RegExp, the slashes and the trailing gi are treated as literal characters to match, so the pattern won't match your text.
The question mark applies only to the character directly before it, so hello? matches "hell" followed by an optional "o". That is a valid regex, but probably not what you intended; if you want to match the literal word, remove the ?.
If you just want to match a plain string, not a pattern, don't use regexes. You can, but it's unnecessary obfuscation and probably computationally more demanding (i.e. lower performance).
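If a regex is still wanted, one way to sidestep these problems is to escape each keyword before handing it to the RegExp constructor and to combine all keywords into a single alternation with the longer keywords first, so that "hello world" is tried before "hello". The following is a minimal sketch of that idea, not the asker's implementation; the helper names escapeRegExp and highlightKeywords and the default color are assumptions made for illustration.

```javascript
// Escape characters that have a special meaning in a regex,
// so that a keyword such as "hello?" is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Wrap every keyword occurrence in a colored span (hypothetical helper).
function highlightKeywords(words, sentence, color = 'purple') {
  if (words.length === 0) return sentence;
  const pattern = [...words]
    .sort((a, b) => b.length - a.length) // longest keyword first
    .map(escapeRegExp)
    .join('|');
  const regEx = new RegExp(pattern, 'gi');
  // A single pass over the sentence, so inserted markup is never re-scanned.
  return sentence.replace(
    regEx,
    (match) => `<span style='color:${color}'>${match}</span>`
  );
}

console.log(highlightKeywords(['hello', 'hello world'], 'say hello world'));
```

Because the replacement happens in one pass, the shorter keyword cannot match again inside a span that was already inserted for the longer one.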

Related

Finding punctuation marks in text with string-methods

How can I find out when a punctuation mark (?!;.) or a "<" character occurs in a string? I don't want to use an array or compare letter by letter, but would like to solve it with string methods. Something like this:
var text = corpus.substr(0, corpus.indexOf("."));
OK, if I explicitly specify a character such as a period, it works fine. The problem is that when parsing a long text in a loop, I no longer know how a sentence ends, whether with a question mark or an exclamation point. I tried the following, but it doesn't work:
var text = corpus.substr(0, corpus.indexOf(corpus.search(".")));
I want to loop through a long string and treat every punctuation mark found as an end-of-sentence character.
Do you know how I can solve my problem?
You can start with RegExp and weigh it against going character by character and essentially comparing ASCII codes. Split is another way (see the other answer).
RegExp solution
function getTextUpToPunc( text ) {
  // ^ with the m flag anchors at each line start; the greedy .+ runs to the last !, ? or . on that line
  const regExp = /^.+(\!|\?|\.)/mg;
  for (let match; (match = regExp.exec( text )) !== null;) {
    console.log(match);
  }
}
getTextUpToPunc(
  "what a chunky funky monkey! this is really something else"
)
The key advantage here is that you do not have to process the entire string up front: you keep control over the iteration by calling regExp.exec( text ) one match at a time.
The split solution in the other answer would work, but split will loop over the entire string. Typically that would not be an issue, but if your strings are thousands upon thousands of characters long and you do this operation a lot, then it would make sense to think about performance.
And if this function will be run many, many times, a small performance improvement might be to memoize the RegExp creation:
const regExp = /^.+(\!|\?|\.)/mg;
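into something like this:
function getTextUpToPunc( text ) {
  // Cache the RegExp on the function object; a bare `this` is undefined here in strict mode
  if( !getTextUpToPunc._regExp ) getTextUpToPunc._regExp = /^.+(\!|\?|\.)/mg;
  const regExp = getTextUpToPunc._regExp;
  for (let match; (match = regExp.exec( text )) !== null;) {
    console.log(match);
  }
}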
Use a regular expression whose character class lists the end-of-sentence characters:
var sentences = corpus.split(/[?!;.<]/);
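Note that split drops the punctuation it splits on. If the end-of-sentence character should stay attached to its sentence, a global regex with matchAll is one option. The sketch below assumes the delimiter set from the question (? ! ; . and <); the function name splitSentences is hypothetical.

```javascript
// Split a text into sentences, keeping the closing punctuation.
function splitSentences(corpus) {
  // One or more non-delimiter characters, followed by any run of delimiters.
  const sentenceRe = /[^?!;.<]+[?!;.<]*/g;
  const sentences = [];
  for (const match of corpus.matchAll(sentenceRe)) {
    const sentence = match[0].trim();
    if (sentence) sentences.push(sentence); // skip whitespace-only pieces
  }
  return sentences;
}

console.log(splitSentences('What a day! Is it over? Yes.'));
```

Trailing text without a closing delimiter is returned as a final piece, so nothing from the input is silently lost.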

Test fails when I include an array reference in regex (array with index in regex) JavaScript

I am doing a challenge on freeCodeCamp. I am passed an array with 2 strings, the instructions are to test to see if the letters in the second string are in the first string.
Here's what I have:
return /[arr\[1\]]/gi.test(arr[0]);
This passes all the tests except where it tries to match with a capital letter.
mutation(["hello", "Hello"]) should return true.
It's the only test that fails, I have tested my regex on regexr.com with:
/[Hello]/gi and it matches with 'hello'.
Yes, there are other ways to do it, but why does it fail when I pass the string into the regex from the array?
EDIT: https://learn.freecodecamp.org/javascript-algorithms-and-data-structures/basic-algorithm-scripting/mutations
Keep in mind that with return /[arr\[1\]]/gi.test(arr[0]) the regex literal contains the literal text arr[1], not the value of the variable; inside the brackets it becomes a character class that matches any one of a, r, [, 1 or ]. test() is a method of RegExp, so to use a variable in a regex, or to build the regex from a string, you should use the RegExp constructor, as in the example below.
See this for browser compatibility of flags.
function mutation(str){
  var r = new RegExp(str[0].toLowerCase(), "gi")
  return r.test(str[1].toLowerCase());
}
console.log(mutation(["hello", "Hello"]))
The fact that your code passes the test for ["Mary", "Army"] shows that the problem is not one of case sensitivity. The only reason your code passes any of the tests is that /[arr\[1\]]/ looks for matches against the set of characters a, r, 1, [ and ], which coincidentally happens to give the correct result for 8 of the 9 tests. The other, perhaps biggest, issue is that you are not testing all of the characters in arr[1] against arr[0]; if you run @Emeeus's answer, it returns false positives for many of the tests. So, to test all of the characters in arr[1] against arr[0], you need something like this:
function mutation(arr) {
  return arr[1].split('').reduce((t, c) => t && new RegExp(c, 'i').test(arr[0]), true);
}
let tests = [
  ['hello', 'hey'],
  ["hello", "Hello"],
  ["zyxwvutsrqponmlkjihgfedcba", "qrstu"],
  ["Mary", "Army"],
  ["Mary", "Aarmy"],
  ["Alien", "line"],
  ["floor", "for"],
  ["hello", "neo"],
  ["voodoo", "no"]
];
tests.forEach(arr => console.log(arr[0] + ", " + arr[1] + " => " + (mutation(arr) ? 'match' : 'no match')));
JavaScript has a special syntax for Regular Expressions. Those two lines are essentially the same:
return /[arr\[1\]]/gi.test(arr[0]);
return new RegExp('[arr\\[1\\]]', 'gi').test(arr[0]);
but what you probably want is this:
new RegExp('['+arr[1]+']', 'gi').test(arr[0]);
However, you should be careful, since this approach does not work if arr[1] contains special characters such as '[' or ']'.
Whenever you have a JavaScript variable in a regular expression, you should construct a new RegExp object. Taken from your question, it should look like this:
return new RegExp(arr[1], "gi").test(arr[0]);
As one hint on freeCodeCamp.org says, you can solve the problem easier if you transform the strings into arrays, using the spread operator. No need for regular expressions.
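A minimal sketch of that regex-free approach, assuming the challenge's input shape (an array of two strings) and a case-insensitive comparison:

```javascript
// Returns true if every letter of arr[1] also occurs in arr[0], ignoring case.
function mutation(arr) {
  const haystack = arr[0].toLowerCase();
  // Spread the second string into an array of characters and check each one.
  return [...arr[1].toLowerCase()].every((ch) => haystack.includes(ch));
}

console.log(mutation(['hello', 'Hello'])); // true
console.log(mutation(['hello', 'hey']));   // false
```

Since no pattern is built from user data here, special characters such as '[' or ']' need no escaping.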
Instead of:
return /[arr\[1\]]/gi.test(arr[0]);
you can do:
return new RegExp(arr[1], 'gi').test(arr[0]);
Your code uses a character class ([ ]) rather than a string match, and the class contains the literal characters a, r, [, 1 and ], not the contents of arr[1]. It therefore matches any string containing one of those characters, which is why ["hello", "Hello"] fails even though you specified 'i'.
The new expression uses the string in arr[1] itself as the pattern, not just a set of characters.

regex to remove certain characters at the beginning and end of a string

Let's say I have a string like this:
...hello world.bye
But I want to remove the first three dots and replace .bye with !
So the output should be
hello world!
It should only match if both conditions apply (... at the beginning and .bye at the end).
I'm trying to use the JS replace method. Could you please help? Thanks
First match the dots, capture and lazy-repeat any character until you get to .bye, and match the .bye. Then, you can replace with the first captured group, plus an exclamation mark:
const str = '...hello world.bye';
console.log(str.replace(/\.\.\.(.*?)\.bye/, '$1!'));
The lazy-repeat is there to ensure you don't match too much, for example:
const str = `...hello world.bye
...Hello again! Goodbye.`;
console.log(str.replace(/\.\.\.(.*?)\.bye/g, '$1!'));
You don't actually need a regex to do this. Although it's a bit inelegant, the following should work fine (obviously the function can be called whatever makes sense in the context of your application):
function manipulate(string) {
  if (string.slice(0, 3) === "..." && string.slice(-4) === ".bye") {
    // Drop the three leading dots and the trailing ".bye"
    return string.slice(3, -4) + "!";
  }
  return string;
}
(Apologies if I made any stupid errors with indexing there, but the basic idea should be obvious.)
This, to me at least, has the advantage of being easier to reason about than a regex. Of course if you need to deal with more complicated cases you may reach the point where a regex is best - but I personally wouldn't bother for a simple use-case like the one mentioned in the OP.
Your regex would be
const rx = /\.\.\.([\s\S]*?)\.bye/g
const out = '\n\nfoobar...hello world.bye\nfoobar...ok.bye\n...line\nbreak.bye\n'.replace(rx, `$1!`)
console.log(out)
In English: find three dots, then lazily capture anything (including line breaks) in a group, ending with .bye.
The replacement uses the first captured group $1 and appends !, written here as a template literal.
An arguably simpler solution:
const str = '...hello world.bye'
const newStr = /\.\.\.(.+)\.bye/.exec(str)
const formatted = newStr ? newStr[1] + '!' : str
console.log(formatted)
If the string doesn't match the regex it will just return the string.
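None of the regexes above is anchored, so they also match when the dots and .bye appear in the middle of a longer string. If the replacement should apply only when the string starts with ... and ends with .bye, as the question asks, the pattern can be anchored with ^ and $. A minimal sketch, with the hypothetical function name tidy:

```javascript
// Replace only when the whole string is "..." + something + ".bye".
function tidy(str) {
  // ^ and $ anchor the match to the start and end of the string;
  // [\s\S]* also spans line breaks, which a plain . would not.
  return str.replace(/^\.\.\.([\s\S]*)\.bye$/, '$1!');
}

console.log(tidy('...hello world.bye')); // hello world!
console.log(tidy('hello world.bye'));    // hello world.bye
```

A string that fails either condition is returned unchanged, because replace leaves the input alone when there is no match.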

AngularJS filter to remove a certain regular expression

I am attempting to make an angularJS filter which will remove timestamps that look like this: (##:##:##) or ##:##:##.
This is a filter to remove all letters:
.filter('noLetter', function() {
  //this filter removes all letters
  return function removeLetters(string){
    return string.replace(/[^0-9]+/g, " ");
  }
})
This is my attempt to make a filter that removes the timestamps. However, it is not working; help is much appreciated.
.filter('noStamps', function () {
  return function removeStamps(item) {
    return item.replace(/^\([0-9][0-9]:[0-9][0-9]:[0-9][0-9]\)$/i, "");
  }
})
My goal is for it to delete the timestamps it finds and leave nothing in their place.
edit based on question in comments:
The time stamps are in the text so it would say "this is an example 21:20:19 of what I am 21:20:20 trying to do 21:20:22"
I would want this to be converted into "this is an example of what I am trying to do" by the filter.
You may use
/\s*\(?\b\d{2}:\d{2}:\d{2}\b\)?/g
See regex demo
The main points:
The ^ (start of string) and $ (end of string) anchors should be removed so that the expression becomes unanchored and can match part of the input text.
The global flag is needed to match all occurrences.
The limiting quantifier {2} shortens the regex (and the shorthand class \d helps shorten it, too).
\)? and \(? use the ? quantifier to match 1 or 0 occurrences of the round brackets.
\s* at the beginning "trims" the result (as the leading whitespace is matched).
JS snippet:
var str = 'this is an example (21:20:19) of what I am 21:20:20 trying to do 21:20:22';
var result = str.replace(/\s*\(?\b\d{2}:\d{2}:\d{2}\b\)?/g, '');
document.getElementById("r").innerHTML = result;
<div id="r"/>
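Applied to the filter from the question, the same regex could be wrapped roughly as follows. This is a sketch under assumptions: the module name 'app' in the commented registration is hypothetical, and the type check is added on the assumption that the filter may be called before the bound value is a string.

```javascript
// Optional leading whitespace, optional "(", a ##:##:## stamp, optional ")".
const STAMP_RE = /\s*\(?\b\d{2}:\d{2}:\d{2}\b\)?/g;

function removeStamps(item) {
  // Pass non-string input through unchanged (assumption: value may not be loaded yet).
  if (typeof item !== 'string') return item;
  return item.replace(STAMP_RE, '');
}

// Hypothetical registration, mirroring the question's filter:
// angular.module('app').filter('noStamps', function () { return removeStamps; });

console.log(removeStamps('this is an example (21:20:19) of what I am 21:20:20 trying to do 21:20:22'));
```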

Javascript/Jquery - how to replace a word but only when not part of another word?

I am currently doing a regex comparison to remove words (rude words) from a text field when written by the user. At the moment it performs the check when the user hits space and removes the word if matches. However it will remove the word even if it is part of another word. So if you type apple followed by space it will be removed, that's ok. But if you type applepie followed by space it will remove 'apple' and leave pie, that's not ok. I am trying to make it so that in this instance if apple is part of another word it will not be removed.
Is there any way I can perform the comparison on the whole word only or ignore the comparison if it is combined with other characters?
I know that this allows people to write many rude things with no space. But that is the desired effect by the people that give me orders :(
Thanks for any help.
function rude(string) {
  var regex = /apple|pear|orange|banana/ig;
  // example words because I'm sure you don't need to read profanity
  var updatedString = string.replace( regex, function(s) {
    var blank = "";
    return blank;
  });
  return updatedString;
}
$(input).keyup(function(event) {
  var text;
  if (event.keyCode == 32) {
    text = rude($(this).val());
    $(this).val(text);
    $("someText").html(text);
  }
});
You can use word boundaries (\b), which match 0 characters, but only at the beginning or end of a word. I'm also using grouping (the parentheses), so it's easier to read and write such expressions.
var regex = /\b(apple|pear|orange|banana)\b/ig;
BTW, in your example you don't need to use a function. This is sufficient:
function rude(string) {
  var regex = /\b(apple|pear|orange|banana)\b/ig;
  return string.replace(regex, '');
}
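If the word list lives in an array rather than being hard-coded in the pattern, the same word-boundary regex can be built at runtime. The sketch below uses the hypothetical helper names escapeRegExp, buildWordFilter and removeWords. Keep in mind that \b treats only ASCII letters, digits and underscore as word characters, so words with accented letters may need a different boundary check.

```javascript
// Escape regex metacharacters so each word is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build /\b(?:word1|word2|...)\b/gi from a list of words.
function buildWordFilter(words) {
  const pattern = words.map(escapeRegExp).join('|');
  return new RegExp(`\\b(?:${pattern})\\b`, 'gi');
}

// Remove the listed words, but only where they stand alone as whole words.
function removeWords(text, words) {
  return text.replace(buildWordFilter(words), '');
}

console.log(removeWords('apple applepie pear', ['apple', 'pear']));
```

Here "applepie" survives because there is no word boundary between "apple" and "pie".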
