How to search strings with brackets using Regular expressions


I have a case wherein I want to search for all Hello (World) in an array. Hello (World) is coming from a variable and can change. I want to achieve this using RegExp and not indexOf or includes methods.
testArray = ['Hello (World', 'Hello (World)', 'hello (worlD)']
My match should return index 1 & 2 as answers.

Use the RegExp constructor after escaping the string (algorithm from this answer), and use some array methods:
const testArray = ['Hello (World', 'Hello (World)', 'hello (worlD)'];
const string = "Hello (World)".replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
const regex = new RegExp(string, "i");
const indexes = testArray.map((e, i) => e.match(regex) == null ? null : i).filter(e => e != null);
console.log(indexes);
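The same idea can be wrapped in a small reusable helper. This is only a sketch: the function names escapeRegExp and findIndexes are hypothetical and not part of the original answer, and it assumes a substring match that ignores case is what you want.

```javascript
// Escape every character that has a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

// Return the indexes of all items that contain `needle`, ignoring case.
function findIndexes(items, needle) {
  const regex = new RegExp(escapeRegExp(needle), 'i');
  return items.reduce((found, item, i) => {
    if (regex.test(item)) found.push(i);
    return found;
  }, []);
}

const testArray = ['Hello (World', 'Hello (World)', 'hello (worlD)'];
console.log(findIndexes(testArray, 'Hello (World)')); // [ 1, 2 ]
```

Because the regex is built without the g flag, regex.test keeps no state between calls, so the same object can safely be reused for every item.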

This expression might help you to do so:
(\w+)\s\((\w+)
You may not need to anchor it on the left and right, since your input strings are well structured. You can focus on the capturing groups you need; I have assumed each one is a single word, which you can easily change.
With a simple string replace you can match and capture both of them.
Performance Test
This JavaScript snippet shows the performance of that expression using a simple 1-million times for loop.
const repeat = 1000000;
const start = Date.now();
let match;
for (let i = 0; i < repeat; i++) {
    const string = "Hello (World";
    const regex = /(\w+)\s\((\w+)/g;
    match = string.replace(regex, "$1 & $2");
}
const end = Date.now() - start;
console.log("YAAAY! \"" + match + "\" is a match 💚💚💚 ");
console.log(end / 1000 + " is the runtime of " + repeat + " times benchmark test. 😳 ");
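To see what the two capturing groups actually hold, a short sketch with String.prototype.match may help. The helper name extractWords is hypothetical; the regex is the one from this answer, used without the g flag so that match returns the groups.

```javascript
// Return the two captured words, or null when the input does not match.
function extractWords(input) {
  const match = input.match(/(\w+)\s\((\w+)/);
  if (!match) return null;
  const [, first, second] = match; // match[0] is the whole matched text
  return [first, second];
}

console.log(extractWords('Hello (World)')); // [ 'Hello', 'World' ]
console.log(extractWords('HelloWorld'));    // null
```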

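const testArray = ['Hello (World', 'Hello (World)', 'hello (worlD)'];
const indexes = [];
// Note: this matches any string containing "(" followed later by ")"; it does not check the words.
testArray.forEach((word, i) => {
    if (/\(.*\)/.test(word)) {
        indexes.push(i);
    }
});
console.log(indexes);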

Related

String.replaceAll using regular expression in both parameters in JS

I need to replace all instances of a substring with a modified version of the substring, can I do something like:
const regex = /[0-9]{4}[A-Z]{3}/g; // format: 0000ABC
myString = myString.replaceAll(regex, regex + " I'm modified");
Abstract example
If myString is
5000ABC, 250XYZ, 1000DEF, GEN3000
and I want to modify certain 4 digit - 3 letter patterns, my expected output is
5000ABC I'm modified, 250XYZ, 1000DEF I'm modified, GEN3000

Not sure I understand the output format you want, but guessing from your example it looks like you want to append some string after each match. Here we go:
const input = 'Some text with 1234ABC and 5678XYZ';
const regex = /([0-9]{4}[A-Z]{3})/g; // format: 0000ABC
var result = input.replace(regex, "$1 I'm modified");
console.log('input: ' + input);
console.log('result: ' + result);
Output:
input: Some text with 1234ABC and 5678XYZ
result: Some text with 1234ABC I'm modified and 5678XYZ I'm modified
Explanation:
capture your pattern with parenthesis for later use
use a string .replace() where you can reference the captured pattern with $1, and prefix/append any text you want
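As a variation on the above, the replacement pattern $& refers to the whole match, so the capture group is optional. The sketch below is an illustration rather than the answerer's code; the function name appendToMatches is hypothetical, and the input is the asker's sample with 1000DEF assumed to be present.

```javascript
// Append a fixed note after every "4 digits + 3 capital letters" token.
function appendToMatches(text) {
  const regex = /[0-9]{4}[A-Z]{3}/g; // format: 0000ABC
  return text.replace(regex, "$& I'm modified"); // $& is the matched text
}

const input = '5000ABC, 250XYZ, 1000DEF, GEN3000';
console.log(appendToMatches(input));
// 5000ABC I'm modified, 250XYZ, 1000DEF I'm modified, GEN3000
```

String.prototype.replaceAll accepts the same arguments, provided the regex carries the g flag.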

The workaround I have at the moment is
const regex = /[0-9]{4}[A-Z]{3}/g;
const matches = myString.match(regex);
if (matches) {
    // De-duplicate so that a repeated match is not modified more than once.
    for (const match of new Set(matches)) {
        myString = myString.replaceAll(match, match + " I'm modified");
    }
}

Split and replace text by two rules (regex)

I am trying to split text by two rules:
Split by whitespace
Split words longer than 5 characters into two separate words (for example, aaaaawww into aaaaa- and www)
I created a regex that can detect these rules (https://regex101.com/r/fyskB3/2) but can't work out how to make both rules work in text.split(/REGEX/).
Currently regex - (([\s]+)|(\w{5})(?=\w))
For example initial text is hello i am markopollo and result should look like ['hello', 'i', 'am', 'marko-', 'pollo']

It would probably be easier to use .match: match up to 5 characters that aren't whitespace:
const str = 'wqerweirj ioqwejr qiwejrio jqoiwejr qwer qwer';
console.log(
str.match(/[^ ]{1,5}/g)
)

My approach would be to process the string before splitting (I'm a big fan of RegEx):
1- Search for every run of 5 consecutive characters that does not end a word, and replace it with $1- followed by a space.
The pattern (\w{5}\B) will do the trick: \w{5} matches exactly 5 word characters, and \B matches only if the last of them is not the final character of the word.
2- Split the string by spaces.
var text = "hello123467891234 i am markopollo";
var regex = /(\w{5}\B)/g;
var processedText = text.replace(regex, "$1- ");
var result = processedText.split(" ");
console.log(result)
Hope it helps!

Something like this should work:
const str = "hello i am markopollo";
const words = str.split(/\s+/);
const CHUNK_SIZE=5;
const out = [];
for(const word of words) {
if(word.length > CHUNK_SIZE) {
let chunks = chunkSubstr(word,CHUNK_SIZE);
let last = chunks.pop();
out.push(...chunks.map(c => c + '-'),last);
} else {
out.push(word);
}
}
console.log(out);
// credit: https://stackoverflow.com/a/29202760/65387
function chunkSubstr(str, size) {
const numChunks = Math.ceil(str.length / size)
const chunks = new Array(numChunks)
for (let i = 0, o = 0; i < numChunks; ++i, o += size) {
chunks[i] = str.substr(o, size)
}
return chunks
}
i.e., first split the string into words on spaces, and then find words longer than 5 chars and 'chunk' them. I popped off the last chunk to avoid adding a - to it, but there might be a more efficient way if you patch chunkSubstr instead.
Splitting with a regex doesn't work so well here because split removes whatever the pattern matches from the output. In your case, it appears you want to strip the whitespace but keep the words, so splitting on both won't work.
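The split-then-chunk idea can also be written more compactly with flatMap. This is a sketch under the same assumptions as the answer above (words are separated by whitespace, and every chunk except the last gets a trailing hyphen); the name splitWords is hypothetical.

```javascript
// Split on whitespace, then break long words into chunks of at most `size` characters.
function splitWords(text, size = 5) {
  const chunkRegex = new RegExp(`.{1,${size}}`, 'g');
  return text
    .split(/\s+/)
    .filter(Boolean) // drop empty strings caused by leading or trailing whitespace
    .flatMap((word) => {
      const chunks = word.match(chunkRegex);
      // Every chunk except the last one gets a trailing hyphen.
      return chunks.map((chunk, i) => (i < chunks.length - 1 ? chunk + '-' : chunk));
    });
}

console.log(splitWords('hello i am markopollo'));
// [ 'hello', 'i', 'am', 'marko-', 'pollo' ]
```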

This uses @CertainPerformance's regex ([^ ]{1,5} in the demo below), calls regex.exec repeatedly, and loops over all the matches, appending a hyphen whenever a 5-character chunk is followed by more of the same word:
const str = 'wqerweirj ioqwejr qiwejrio jqoiwejr qwer qwer'
let regex1 = RegExp('[^ ]{1,5}', 'g')
function customSplit(targetString, regexExpress) {
let result = []
let matchItem = null
while ((matchItem = regexExpress.exec(targetString)) !== null) {
result.push(
matchItem[0] + (
matchItem[0].length === 5 && targetString[regexExpress.lastIndex] && targetString[regexExpress.lastIndex] !== ' '
? '-' : '')
)
}
return result
}
console.log(customSplit(str, regex1))
console.log(customSplit('hello i am markopollo', regex1))

Find number of replacements when using a global Regular Expression

I have a sentence (string) containing words. I want to replace all occurrences of a word with another. I use newString = oldString.replace(/w1/gi, w2);, but now I need to report to the user how many words I actually replaced.
Is there a quick way to do it without resorting to:
Replacing one word at a time and counting.
Comparing oldString to newString word-by-word and tallying the differences?
(The easy case is if oldString === newString => 0 replacements, but beyond that, I'll have to run over both and compare).
Is there any RegEx "trickery" I can use here, or should I just avoid using the g flag?

Option 1: Using the replace callback
By using the callback, you can increment a counter, and then return the new word in the callback, this allows you to traverse the string only 1 time, and achieve a count.
var string = 'Hello, hello, hello, this is so awesome';
var count = 0;
string = string.replace(/hello/gi, function() {
count++;
return 'hi';
});
console.log('New string:', string);
console.log('Words replaced', count);
Option 2: Using split and join
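Split the string on the word, then join the pieces with the new word to create the new string; the number of replacements is the number of pieces minus one. The example splits on a regex to ignore case, but for a case-sensitive search you can split on a plain string and avoid regex entirely.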
var string = 'Hello, hello, hello, this is so awesome';
string = string.split(/hello/i);
var count = string.length - 1;
string = string.join('Hi');
console.log('New string:', string);
console.log('Words replaced', count);

You could split the string with the regex you're using and get the length.
oldString.split(/w1/gi).length - 1
Working example:
var string = "The is the of the and the";
var newString = string.replace(/the/gi, "hello");
var wordsReplaced = string.split(/the/gi).length - 1;
console.log("Words replaced: ", wordsReplaced);
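The callback approach from Option 1 can be packaged so that the new string and the count come back together. A minimal sketch, assuming the caller passes a regex with the g flag and a literal replacement string; the name replaceAndCount is hypothetical.

```javascript
// Replace every match of `regex` and report how many replacements were made.
function replaceAndCount(text, regex, replacement) {
  let count = 0;
  const result = text.replace(regex, () => {
    count++;
    return replacement; // returned as-is, so "$1" and similar patterns are not expanded
  });
  return { result, count };
}

const { result, count } = replaceAndCount('The is the of the and the', /the/gi, 'hello');
console.log(result); // hello is hello of hello and hello
console.log(count);  // 4
```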

Bold part of String

What is the best way to bold a part of string in Javascript?
I have an array of objects. Each object has a name. There is also an input parameter.
If, for example, you write "sa" in input, it automatically searches in array looking for objects with names that contain "sa" string.
When I print all the names, I want to bold the part of the name that coincide with the input text.
For example, if I search for "Ma":
Maria
Amaria
etc...
I need a solution that doesn't use jQuery. Help is appreciated.
PS: The final strings are in the tag. I create a list using angular ng-repeat.
This is the code:
$scope.users = data;
for (var i = data.length - 1; i >= 0; i--) {
data[i].name=data[i].name.replace($scope.modelCiudad,"<b>"+$scope.modelCiudad+"</b>");
};
modelCiudad holds the text of the input, and data is the array of objects.
With this code, if modelCiudad is "ma", for example, each result is displayed with the literal tags:
<b>Ma</b>ria
and not as Maria with "Ma" rendered in bold.

You can use Javascript's str.replace() method, where str is equal to all of the text you want to search through.
var str = "Hello";
var substr = "el";
str.replace(substr, '<b>' + substr + '</b>');
The above will only replace the first instance of substr. If you want to handle replacing multiple substrings within a string, you have to use a regular expression with the g modifier.
function boldString(str, substr) {
var strRegExp = new RegExp(substr, 'g');
return str.replace(strRegExp, '<b>'+substr+'</b>');
}
In practice, calling boldString would look something like:
boldString("Hello, can you help me?", "el");
// Returns: H<b>el</b>lo, can you h<b>el</b>p me?
When rendered by the browser, this displays "Hello, can you help me?" with each "el" in bold.
Here is a JSFiddle with an example: https://jsfiddle.net/1rennp8r/3/
A concise ES6 solution could look something like this:
const boldString = (str, substr) => str.replace(RegExp(substr, 'g'), `<b>${substr}</b>`);
Where str is the string you want to modify, and substr is the substring to bold.
ES2021 (ES12) introduces a new string method, str.replaceAll(), which obviates the need for regex when replacing all occurrences at once. Its usage in this case would look something like this:
const boldString = (str, substr) => str.replaceAll(substr, `<b>${substr}</b>`);
I should mention that in order for these latter approaches to work, your environment must support ES6/ES12 (or use a tool like Babel to transpile).
Another important note is that all of these approaches are case sensitive.
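If the search text can contain regex special characters, or should match regardless of case, one possible variation is to escape it first and let $& re-insert the text exactly as it was matched. This is a sketch, not the answerer's code; escapeRegExp and boldAll are hypothetical names.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Wrap every case-insensitive occurrence of `substr` in <b> tags, keeping the original case.
function boldAll(str, substr) {
  if (!substr) return str; // an empty search term would match everywhere
  return str.replace(new RegExp(escapeRegExp(substr), 'gi'), '<b>$&</b>');
}

console.log(boldAll('Hello, can you help me?', 'EL'));
// H<b>el</b>lo, can you h<b>el</b>p me?
console.log(boldAll('Price (USD)', '(usd)'));
// Price <b>(USD)</b>
```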

Here's a pure JS solution that preserves the original case of the string while ignoring the case of the query (it bolds only the first match):
const boldQuery = (str, query) => {
const n = str.toUpperCase();
const q = query.toUpperCase();
const x = n.indexOf(q);
if (!q || x === -1) {
return str; // bail early
}
const l = q.length;
return str.substr(0, x) + '<b>' + str.substr(x, l) + '</b>' + str.substr(x + l);
}
Test:
boldQuery('Maria', 'mar'); // "<b>Mar</b>ia"
boldQuery('Almaria', 'Mar'); // "Al<b>mar</b>ia"

I ran into a similar problem today, except I wanted to match whole words and not substrings. So if const text = 'The quick brown foxes jumped' and const word = 'foxes', then I want the result to be 'The quick brown <strong>foxes</strong> jumped'; however, if const word = 'fox', then I expect no change.
I ended up doing something similar to the following:
const pattern = `(\\s|\\b)(${word})(\\s|\\b)`;
const regexp = new RegExp(pattern, 'ig'); // ignore case (optional) and match all
const replaceMask = `$1<strong>$2</strong>$3`;
return text.replace(regexp, replaceMask);
First I get the exact word which is either before/after some whitespace or a word boundary, and then I replace it with the same whitespace (if any) and word, except the word is wrapped in a <strong> tag.
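A self-contained variant of this idea might look like the sketch below. It is a simplification rather than the code above: it relies on \b alone and escapes the word first. The names escapeRegExp and strongWord are hypothetical, and \b only behaves as expected when the word starts and ends with a letter, digit, or underscore.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Wrap whole-word, case-insensitive occurrences of `word` in <strong> tags.
function strongWord(text, word) {
  const regexp = new RegExp(`\\b${escapeRegExp(word)}\\b`, 'gi');
  return text.replace(regexp, '<strong>$&</strong>');
}

const text = 'The quick brown foxes jumped';
console.log(strongWord(text, 'foxes')); // The quick brown <strong>foxes</strong> jumped
console.log(strongWord(text, 'fox'));   // The quick brown foxes jumped
```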

Here is a version I came up with if you want to style words or individual characters at their index in react/javascript.
replaceAt( yourArrayOfIndexes, yourString/orArrayOfStrings )
Working example: https://codesandbox.io/s/ov7zxp9mjq
function replaceAt(indexArray, [...string]) {
const replaceValue = i => string[i] = <b>{string[i]}</b>;
indexArray.forEach(replaceValue);
return string;
}
And here is another alternate method
function replaceAt(indexArray, [...string]) {
const startTag = '<b>';
const endTag = '</b>';
const tagLetter = i => string.splice(i, 1, startTag + string[i] + endTag);
indexArray.forEach(tagLetter);
return string.join('');
}
And another...
function replaceAt(indexArray, [...string]) {
for (let i = 0; i < indexArray.length; i++) {
string = Object.assign(string, {
[indexArray[i]]: <b>{string[indexArray[i]]}</b>
});
}
return string;
}

The above solutions are great, but limited. Imagine a scenario where you want to match a case-insensitive query in a string that may contain multiple matches.
For example
Query: ma
String: The Amazing Spiderman
Expected Result: The A<b>ma</b>zing Spider<b>ma</b>n
For the above scenario, use this:
const boldMatchText = (text, searchInput) => {
    const query = searchInput.toLowerCase();
    if (!query) return text; // an empty query would never advance
    let str = text.toLowerCase();
    let result = "";
    let queryLoc = str.indexOf(query);
    while (queryLoc !== -1) {
        // Copy the text before the match, then the match itself in its original case.
        result += text.slice(0, queryLoc) + `<b>${text.slice(queryLoc, queryLoc + query.length)}</b>`;
        str = str.slice(queryLoc + query.length);
        text = text.slice(queryLoc + query.length);
        queryLoc = str.indexOf(query);
    }
    return result + text;
};

Javascript: highlight substring keeping original case but searching in case insensitive mode

I'm trying to write a "suggestion search box" and I cannot find a solution that allows to highlight a substring with javascript keeping the original case.
For example if I search for "ca" I search server side in a case insensitive mode and I have the following results:
Calculator
calendar
ESCAPE
I would like to view the search string in all the previous words, so the result should be:
Calculator
calendar
ESCAPE
I tried with the following code:
var reg = new RegExp(querystr, 'gi');
var final_str = 'foo ' + result.replace(reg, '<b>'+querystr+'</b>');
$('#'+id).html(final_str);
But obviously this way I lose the original case!
Is there a way to solve this problem?

Use a function for the second argument for .replace() that returns the actual matched string with the concatenated tags.
Try it out: http://jsfiddle.net/4sGLL/
reg = new RegExp(querystr, 'gi');
// The str parameter references the matched string
// --------------------------------------v
final_str = 'foo ' + result.replace(reg, function(str) {return '<b>'+str+'</b>'});
$('#' + id).html(final_str);
JSFiddle Example with Input: https://jsfiddle.net/pawmbude/

ES6 version
const highlight = (needle, haystack) =>
haystack.replace(
new RegExp(needle, 'gi'),
(str) => `<strong>${str}</strong>`
);

nice results with
function str_highlight_text(string, str_to_highlight){
var reg = new RegExp(str_to_highlight, 'gi');
return string.replace(reg, function(str) {return '<span style="background-color:#ffbf00;color:#fff;"><b>'+str+'</b></span>'});
}
and easier to remember...
thx to user113716: https://stackoverflow.com/a/3294644/2065594

While the other answers so far seem simple, they can't really be used in many real-world cases, as they don't handle HTML escaping of the text or RegExp escaping of the term. If you want to highlight every possible snippet while escaping the text properly, a function like this returns all the elements you should add to your suggestions box:
function highlightLabel(label, term) {
if (!term) return [ document.createTextNode(label) ]
// No g flag: match() must return a single match object that carries .index
const regex = new RegExp(term.replace(/[\\^$*+?.()|[\]{}]/g, '\\$&'), 'i')
const result = []
let left, match, right = label
while (match = right.match(regex)) {
const m = match[0], hl = document.createElement('b'), i = match.index
hl.innerText = m
left = right.slice(0, i)
right = right.slice(i + m.length)
result.push(document.createTextNode(left), hl)
if (!right.length) return result
}
result.push(document.createTextNode(right))
return result
}
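When the result has to be an HTML string rather than DOM nodes, the same two escaping steps can be applied by splitting on a capturing regex: split then keeps the matches at the odd indexes of the resulting array. This is a sketch under that assumption, not part of the answer above; escapeRegExp, escapeHtml and highlightHtml are hypothetical names.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Escape the characters that are significant in HTML text and attribute values.
function escapeHtml(text) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return text.replace(/[&<>"']/g, (ch) => entities[ch]);
}

// Return an HTML string with every case-insensitive match of `term` wrapped in <b>.
function highlightHtml(label, term) {
  if (!term) return escapeHtml(label);
  // The capture group makes split() keep the matched pieces (at odd indexes).
  const parts = label.split(new RegExp(`(${escapeRegExp(term)})`, 'gi'));
  return parts
    .map((part, i) => (i % 2 === 1 ? `<b>${escapeHtml(part)}</b>` : escapeHtml(part)))
    .join('');
}

console.log(highlightHtml('Calculator', 'ca'));        // <b>Ca</b>lculator
console.log(highlightHtml('ESCAPE', 'ca'));            // ES<b>CA</b>PE
console.log(highlightHtml('Tom & Jerry <cat>', 'c'));  // Tom &amp; Jerry &lt;<b>c</b>at&gt;
```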

string.replace fails in the general case. If you use .innerHTML, replace can replace matches in tags (like a tags). If you use .innerText or .textContent, it will remove any tags there were previously in the html. More than that, in both cases it damages your html if you want to remove the highlighting.
The true answer is mark.js (https://markjs.io/). I just found this - it is what I have been searching for for such a long time. It does just what you want it to.

I do the exact same thing.
You need to make a copy.
I store in the db a copy of the real string, in all lower case.
Then I search using a lower case version of the query string or do a case insensitive regexp.
Then use the resulting found start index in the main string, plus the length of the query string, to highlight the query string within the result.
You can not use the query string in the result since its case is not determinate. You need to highlight a portion of the original string.

With the i flag, .match() performs case-insensitive matching, and with the g flag it returns an array of all the matches with their case intact.
// Assumes queryString contains no regex special characters; escape it first otherwise.
var matches = str.match(new RegExp(queryString, 'gi')) || [],
    startHere = 0,
    nextMatch,
    resultStr = '';
for (var match of matches) {
    nextMatch = str.indexOf(match, startHere);
    resultStr += str.substring(startHere, nextMatch) + '<b>' + match + '</b>';
    startHere = nextMatch + match.length;
}
resultStr += str.substring(startHere); // append the text after the last match

I have found an easier way to achieve this. A JavaScript regular expression remembers the text it matched in its capturing groups, and that feature can be used here.
I have modified the code a bit.
reg = new RegExp("("+querystr.trim()+")", 'gi');
final_str = 'foo ' + result.replace(reg, "<b>$1</b>");
$('#'+id).html(final_str);

// Highlight search term and anchor to first occurrence - Start
function highlightSearchText(searchText) {
var innerHTML = document.documentElement.innerHTML;
var replaceString = '<mark>'+searchText+'</mark>';
var newInnerHtml = this.replaceAll(innerHTML, searchText, replaceString);
document.documentElement.innerHTML = newInnerHtml;
var elmnt = document.documentElement.getElementsByTagName('mark')[0]
elmnt.scrollIntoView();
}
function replaceAll(str, querystr, replace) {
var reg = new RegExp(querystr, 'gi');
var final_str = str.replace(reg, function(str) {return '<mark>'+str+'</mark>'});
return final_str
}
// Highlight search term and anchor to first occurrence - End
