Regex word search with apostrophe - javascript

highlightStr: function (body, searchString){
console.log(searchString);
var regex = new RegExp('(' + searchString + ')', 'gi');
console.log(regex)
return body.replace(regex, "<span class='text-highlight'>$1</span>");
}
Above is the code I'm using. I want to find and replace the searchString, which could be anything. It works fine for most words, but fails when finding words with apostrophes.
How can I modify the regex to include special characters like the apostrophe?
var body = "<br>I like that Apple’s.<br>";
var searchString = "Apple's";
Thank you

You should escape the search string to make sure the regex works even if the search string contains special regex metacharacters.
Besides, there is no need to wrap the whole pattern in a capturing group; you can always reference the whole match with the $& placeholder in the replacement pattern.
Here is an example:
var s = "I like that Apple's color";
var searchString = "Apple's";
var regex = new RegExp(searchString.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&'), "gi");
document.body.innerHTML = s.replace(regex, '<b>$&</b>');
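Note that the body in the question contains a typographic apostrophe (’, U+2019) while the search string uses a straight one ('), so escaping alone will not make the two match. A minimal sketch, assuming both apostrophe forms should be treated as equivalent (escapeRegExp and the character-class substitution are illustrative choices, not part of the original answer):

```javascript
// Escape regex metacharacters so the search string is matched literally.
function escapeRegExp(text) {
  return text.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

// Assumption: a straight (') and a typographic (\u2019) apostrophe are interchangeable.
function highlightStr(body, searchString) {
  var pattern = escapeRegExp(searchString).replace(/['\u2019]/g, "['\u2019]");
  var regex = new RegExp(pattern, 'gi');
  // $& inserts the text that was actually matched, so the original form is kept.
  return body.replace(regex, "<span class='text-highlight'>$&</span>");
}

var body = "<br>I like that Apple\u2019s.<br>";
console.log(highlightStr(body, "Apple's"));
```

The replacement keeps whichever apostrophe the body actually used, because $& refers to the matched text rather than to the search string.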

Related

How to put a variable in my JS regular expression? [duplicate]

I want to add a (variable) tag to values with regex, the pattern works fine with PHP but I have troubles implementing it into JavaScript.
The pattern is (value is the variable):
/(?!(?:[^<]+>|[^>]+<\/a>))\b(value)\b/is
I escaped the backslashes:
var str = $("#div").html();
var regex = "/(?!(?:[^<]+>|[^>]+<\\/a>))\\b(" + value + ")\\b/is";
$("#div").html(str.replace(regex, "" + value + ""));
But this does not seem to be right. I logged the pattern and it's exactly what it should be.
Any ideas?
To create the regex from a string, you have to use JavaScript's RegExp object.
If you also want to match/replace more than one time, then you must add the g (global match) flag. Here's an example:
var stringToGoIntoTheRegex = "abc";
var regex = new RegExp("#" + stringToGoIntoTheRegex + "#", "g");
// at this point, the line above is the same as: var regex = /#abc#/g;
var input = "Hello this is #abc# some #abc# stuff.";
var output = input.replace(regex, "!!");
alert(output); // Hello this is !! some !! stuff.
In the general case, escape the string before using as regex:
Not every string is a valid regex, though: some characters, like ( or [, have a special meaning. To work around this issue, simply escape the string before turning it into a regex. A utility function for that is shown in the sample below:
function escapeRegExp(stringToGoIntoTheRegex) {
return stringToGoIntoTheRegex.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}
var stringToGoIntoTheRegex = escapeRegExp("abc"); // this is the only change from above
var regex = new RegExp("#" + stringToGoIntoTheRegex + "#", "g");
// at this point, the line above is the same as: var regex = /#abc#/g;
var input = "Hello this is #abc# some #abc# stuff.";
var output = input.replace(regex, "!!");
alert(output); // Hello this is !! some !! stuff.
Note: the regex in the question uses the s modifier, which did not exist in JavaScript at the time of the question. JavaScript has since gained an s (dotAll) flag, added in ES2018.
If you are trying to use a variable value in the expression, you must use the RegExp "constructor".
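// backslashes must be doubled inside a string literal, otherwise "\b" becomes a backspace character
var regex = "(?!(?:[^<]+>|[^>]+<\\/a>))\\b(" + value + ")\\b";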
new RegExp(regex, "is")
I found I had to double the backslash in \b to get it working. For example, to remove "1x" words from a string using a variable, I needed to use:
str = "1x";
var regex = new RegExp("\\b"+str+"\\b","g"); // same as inv.replace(/\b1x\b/g, "")
inv=inv.replace(regex, "");
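The reason is that a string literal processes backslash escapes before the RegExp constructor ever sees the pattern. A small sketch of the difference (the sample string inv is made up for illustration; String.raw is shown as one alternative to doubling the backslashes):

```javascript
// In a string literal, "\b" is the backspace character (code 8), not a word boundary.
var wrong = new RegExp("\b1x\b", "g");

// Doubling the backslash keeps it in the pattern, so the regex engine sees \b.
var right = new RegExp("\\b1x\\b", "g");

// String.raw leaves the backslashes in the template untouched.
var raw = new RegExp(String.raw`\b1x\b`, "g");

var inv = "1x 21x 1x2 1x";
var untouched = inv.replace(wrong, ""); // nothing matches, the string is unchanged
var cleaned = inv.replace(right, "");   // only the standalone "1x" words are removed

console.log(JSON.stringify(untouched));
console.log(JSON.stringify(cleaned));
console.log(right.source === raw.source);
```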
You don't need quotation marks to define a regular expression; a regex literal is enough:
var regex = /(?!(?:[^<]+>|[^>]+<\/a>))\b(value)\b/is; // this is valid syntax
If value is a variable and you want a dynamic regular expression then you can't use this notation; use the alternative notation.
String.replace also accepts strings as input, so you can do "fox".replace("fox", "bear");
Alternative:
// no surrounding slashes in the string, and every backslash doubled
var regex = new RegExp("(?!(?:[^<]+>|[^>]+<\\/a>))\\b(value)\\b", "is");
var regex = new RegExp("(?!(?:[^<]+>|[^>]+<\\/a>))\\b(" + value + ")\\b", "is");
var regex = new RegExp("(?!(?:[^<]+>|[^>]+<\\/a>))\\b(.*?)\\b", "is");
Keep in mind that if value contains regular expression metacharacters like (, [ and ? you will need to escape them.
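Putting the two points together (RegExp constructor plus escaping), a minimal sketch could look like the following. It uses a simplified whole-word pattern rather than the lookahead from the question, and wrapWord and escapeRegExp are hypothetical helper names:

```javascript
// Escape regex metacharacters so the variable is matched literally.
function escapeRegExp(text) {
  return text.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

// Wrap every whole-word occurrence of value in <b> tags.
// Assumption: value starts and ends with a word character, so \b applies at both ends.
function wrapWord(text, value) {
  var regex = new RegExp("\\b(" + escapeRegExp(value) + ")\\b", "gi");
  return text.replace(regex, "<b>$1</b>");
}

console.log(wrapWord("one 1x two 1xy", "1x"));
console.log(wrapWord("a.c abc", "a.c"));
```

Without the escaping step, the dot in "a.c" would match any character and "abc" would be wrapped as well.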
I found this thread useful - so I thought I would add the answer to my own problem.
I wanted to edit a database configuration file (datastax cassandra) from a node application in javascript and for one of the settings in the file I needed to match on a string and then replace the line following it.
This was my solution.
var fs = require('fs'); // Node's file system module, needed for readFile/writeFile below
var dse_cassandra_yaml = '/etc/dse/cassandra/cassandra.yaml';
// a) find the searchString and grab all text on the following line to it
// b) replace all next line text with a newString supplied to function
// note - leaves searchString text untouched
function replaceStringNextLine(file, searchString, newString) {
fs.readFile(file, 'utf-8', function(err, data){
if (err) throw err;
// need to use double escape '\\' when putting regex in strings !
var re = "\\s+(\\-\\s(.*)?)(?:\\s|$)";
var myRegExp = new RegExp(searchString + re, "g");
var match = myRegExp.exec(data);
var replaceThis = match[1];
var writeString = data.replace(replaceThis, newString);
fs.writeFile(file, writeString, 'utf-8', function (err) {
if (err) throw err;
console.log(file + ' updated');
});
});
}
searchString = "data_file_directories:"
newString = "- /mnt/cassandra/data"
replaceStringNextLine(dse_cassandra_yaml, searchString, newString );
After running, it will change the existing data directory setting to the new one:
config file before:
data_file_directories:
- /var/lib/cassandra/data
config file after:
data_file_directories:
- /mnt/cassandra/data
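The same idea can be tried without touching the file system. The following is a sketch only, with the file contents passed in as a string; replaceNextLine and escapeRegExp are hypothetical names, and it assumes the setting key is followed directly by a line break:

```javascript
// Escape regex metacharacters so the key is matched literally.
function escapeRegExp(text) {
  return text.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

// Replace the whole line that follows the first occurrence of searchString.
function replaceNextLine(text, searchString, newString) {
  var regex = new RegExp("(" + escapeRegExp(searchString) + "\\r?\\n)[^\\r\\n]*");
  // A function replacement avoids special handling of "$" in newString.
  return text.replace(regex, function (match, head) {
    return head + newString;
  });
}

var before = "data_file_directories:\n- /var/lib/cassandra/data\nother_setting: 1\n";
console.log(replaceNextLine(before, "data_file_directories:", "- /mnt/cassandra/data"));
```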
Much easier way: use template literals.
var variable = 'foo'
var expression = `.*${variable}.*`
var re = new RegExp(expression, 'g')
re.test('fdjklsffoodjkslfd') // true
re.test('fdjklsfdjkslfd') // false
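One caveat with the snippet above: a regex created with the g flag remembers where the last match ended (lastIndex), so repeated test() calls on the same object can give surprising results. A short sketch of the effect, plus a hypothetical containsTerm helper that avoids it by leaving the g flag off and escaping the variable:

```javascript
const variable = 'foo';

// With the g flag, test() resumes from lastIndex on the next call.
const stateful = new RegExp(`.*${variable}.*`, 'g');
const first = stateful.test('xxfooxx');   // matches the whole string, lastIndex moves to the end
const second = stateful.test('xxfooxx');  // resumes at the end of the string, so it finds nothing

// Without g, every call starts from the beginning of the input.
function containsTerm(text, term) {
  const escaped = term.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
  return new RegExp(escaped, 'i').test(text);
}

console.log(first, second);
console.log(containsTerm('fdjklsffoodjkslfd', variable));
console.log(containsTerm('fdjklsfdjkslfd', variable));
```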
Using string variable(s) content as part of a more complex composed regex expression (es6|ts)
This example will replace all urls using my-domain.com to my-other-domain (both are variables).
You can build dynamic regexes by combining string values and other regex fragments within a raw template string. Using String.raw stops JavaScript from processing backslash escape sequences in the template text, so the backslashes reach the RegExp constructor unchanged.
// Strings with some data
const domainStr = 'my-domain.com'
const newDomain = 'my-other-domain.com'
// Make sure your string is regex friendly
// This replaces each dot with '\.'.
const regexUrl = /\./gm;
const substr = `\\\.`;
const domain = domainStr.replace(regexUrl, substr);
// domain is a regex friendly string: 'my-domain\.com'
console.log('Regex expression for domain', domain)
// HERE!!! You can assemble a complex regex using string pieces.
const re = new RegExp( String.raw `([\'|\"]https:\/\/)(${domain})(\S+[\'|\"])`, 'gm');
// now I'll use the regex expression groups to replace the domain
const domainSubst = `$1${newDomain}$3`;
// const page contains all the html text
const result = page.replace(re, domainSubst);
note: Don't forget to use regex101.com to create, test and export REGEX code.
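As a self-contained variation on the example above, here is a sketch that escapes every metacharacter in the domain (not only dots) and uses a function as the replacement. replaceDomain and escapeRegExp are hypothetical names, and the pattern assumes quoted https URLs without whitespace:

```javascript
// Escape regex metacharacters so the domain is matched literally.
function escapeRegExp(text) {
  return text.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

// Replace oldDomain with newDomain in every quoted https URL found in page.
function replaceDomain(page, oldDomain, newDomain) {
  const re = new RegExp(
    String.raw`(["']https:\/\/)${escapeRegExp(oldDomain)}(\S*?["'])`,
    'g'
  );
  // prefix is the opening quote plus scheme, suffix is the rest of the URL plus closing quote.
  return page.replace(re, (match, prefix, suffix) => prefix + newDomain + suffix);
}

const page = '<a href="https://my-domain.com/path">x</a>';
console.log(replaceDomain(page, 'my-domain.com', 'my-other-domain.com'));
```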
var string = "Hi welcome to stack overflow"
var toSearch = "stack"
// case-insensitive search; search() returns -1 when nothing matches, and 0 is a valid match index
var result = string.search(new RegExp(toSearch, "i")) !== -1 ? 'Matched' : 'notMatched'
https://jsfiddle.net/9f0mb6Lz/
Hope this helps

Remove string between two variables

I have a string which has some data with a few special characters. I need to remove the data between the desired special characters in JavaScript.
The special char would be obtained in a variable.
var desiredChar = "~0~";
And Imagine this to be the Input string:
~0~1|0|20170807|45|111.00|~0~~1~1|0|20170807|50|666.00|~1~~2~1|0|20170807|55|111.00|~2~
So I'm supposed to remove the first segment, from the opening ~0~ through the closing ~0~ (shown in bold in the original post).
The desired output is:
~1~1|0|20170807|50|666.00|~1~~2~1|0|20170807|55|111.00|~2~
I've tried using "Replace" and "Regex", but as the desired character is being passed in a variable and keeps changing I'm facing difficulties.
You can create your own regex based on whatever the bounding character(s) are that contain the text you want removed, and then replace any text that matches that regex with a blank string "".
The JS below should work for your use case (and it should work for multiple occurrences as well):
var originalText = "~0~1|0|20170807|45|111.00|~0~~1~1|0|20170807|50|666.00|~1~~2~1|0|20170807|55|111.00|~2~";
var desiredChar = "~0~";
var customRegex = new RegExp(desiredChar + ".*?" + desiredChar, "gi");
var processedText = originalText.replace(customRegex, "");
console.log(processedText);
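Since the delimiter keeps changing, it may contain characters that are special in a regex (for example |). A minimal sketch that escapes the delimiter first; removeBetween and escapeRegExp are hypothetical helper names:

```javascript
// Escape regex metacharacters so the delimiter is matched literally.
function escapeRegExp(text) {
  return text.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

// Remove every segment that starts and ends with marker, including the markers.
function removeBetween(text, marker) {
  var m = escapeRegExp(marker);
  // [\s\S]*? is lazy, so each marker pair is removed separately.
  return text.replace(new RegExp(m + "[\\s\\S]*?" + m, "g"), "");
}

var originalText = "~0~1|0|20170807|45|111.00|~0~~1~1|0|20170807|50|666.00|~1~~2~1|0|20170807|55|111.00|~2~";
console.log(removeBetween(originalText, "~0~"));
```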
You can build your regex from the constructor with a string input.
var desiredChar = "~0~";
// use the g flag in your regex if you want to remove all substrings between desiredChar
// (the quantifier is lazy so that each desiredChar pair is matched separately)
var myRegex = new RegExp(desiredChar + ".*?" + desiredChar, 'ig');
var testString = "~0~1|0|20170807|45|111.00|~0~~1~1|0|20170807|50|666.00|~1~~2~1|0|20170807|55|111.00|~2~";
testString = testString.replace(myRegex, "");
Given input string you can use .indexOf(), .lastIndexOf() and .slice().
Note that the | character must be escaped before the string is passed to the RegExp constructor; otherwise the resulting RegExp treats | as the alternation (OR) operator.
var desiredChar = "~0~";
var str = "~0~1|0|20170807|45|111.00|~0~~1~1|0|20170807|50|666.00|~1~~2~1|0|20170807|55|111.00|~2~";
var not = str.slice(str.indexOf(desiredChar), str.lastIndexOf(desiredChar) + desiredChar.length);
// escape OR `|`
var res = str.replace(new RegExp(not.replace(/[|]/g, "\\|")), "");
console.log(res)
You can use the RegExp object:
var regexstring = "whatever";
var regexp = new RegExp(regexstring, "gi");
var str = "whateverTest";
var str2 = str.replace(regexp, "other");
document.write(str2);
Then you can construct regexstring in any way you want.
You can read more about it at http://www.regular-expressions.info/javascript.html

javascript regex Numbers and letters only

This should automatically remove characters NOT in my regex, but if I put in the string asdf sd %$##$, it doesn't remove anything, and if I put in #sdf%#, it only removes the first character. I'm trying to make it remove any and all instances of those symbols/special characters (anything not in my regex), but it's not working all the time. Thanks for any help:
function ohno(){
var pattern = new RegExp("[^a-zA-Z0-9]+");
var str = "#sdf%#"; //"asdf sd %$##$" // Try both
str = str.replace(pattern,' ');
document.getElementById('msg').innerHTML = str;
}
You need the g flag to remove more than one match:
var pattern = new RegExp("[^a-zA-Z0-9]+", "g");
Note that it would be more efficient and readable to use a regex literal instead of the RegExp constructor:
var pattern = /[^a-zA-Z0-9]+/g;
You need to set the global flag using "g". The flag indicates that the regular expression should be tested against all possible matches in a string.
new RegExp("[^a-zA-Z0-9]+", "g")
var pattern = new RegExp("[^a-zA-Z0-9]+", "g");
var str = "#sdf%#"; //"asdf sd %$##$" // Try both
str = str.replace(pattern,' ');
alert(str)
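To see why the original code appeared to do nothing for one input and only part of the job for the other, the two variants can be compared side by side. A small sketch (cleanUp is a hypothetical helper name):

```javascript
// Without g, only the first run of non-alphanumeric characters is replaced.
var firstOnly = "#sdf%#".replace(/[^a-zA-Z0-9]+/, " ");

// In "asdf sd %$##$" the first such run is the single space after "asdf",
// which is replaced by a space, so the string looks unchanged.
var looksUnchanged = "asdf sd %$##$".replace(/[^a-zA-Z0-9]+/, " ");

// With g, every run is replaced.
function cleanUp(str) {
  return str.replace(/[^a-zA-Z0-9]+/g, " ");
}

console.log(JSON.stringify(firstOnly));
console.log(JSON.stringify(looksUnchanged));
console.log(JSON.stringify(cleanUp("#sdf%#")));
console.log(JSON.stringify(cleanUp("asdf sd %$##$")));
```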

How to ignore newline in regexp?

How to ignore newline in regexp in Javascript ?
for example:
data = "\
<test>11\n
1</test>\n\
#EXTM3U\n\
"
var reg = new RegExp( "\<" + "test" + "\>(.*?)\<\/" + "test" + "\>" )
var match = data.match(reg)
console.log(match[1])
result: undefined
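Historically, JavaScript had no flag to tell RegExp() that . should match newlines (the s/dotAll flag was only added in ES2018). So, you need to use a workaround, e.g. [\s\S].
Your RegExp would then look like this (note the doubled backslashes, which are required inside a string literal):
var reg = new RegExp( "<" + "test" + ">([\\s\\S]*?)<\\/" + "test" + ">" );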
You are missing a JS line-continuation character \ at the end of line 2.
Also, change regexp to:
var data = "\
<test>11\n\
1</test>\n\
#EXTM3U\n\
";
var reg = new RegExp(/<test>(.|\s)*<\/test>/);
var match = data.match(reg);
console.log(match[0]);
http://jsfiddle.net/samliew/DPc2E/
After reading this one: How to use JavaScript regex over multiple lines?
I came up with this, which works:
var data = "<test>11\n1</test>\n#EXTM3U\n";
reg = /<test>([\s\S]*?)<\/test>/;
var match = data.match(reg);
console.log(match[1]);
Here is a fiddle: http://jsfiddle.net/Rpkj2/
It is better to use [\s\S] instead of . for multiline matching.
It is the most common JavaScript idiom for matching everything including newlines. It's easier on the eyes and much more efficient than an alternation-based approach like (.|\n).
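The options can be compared on the data from the question. This is a sketch that assumes an ES2018-capable engine for the s flag; extractTagContent is a hypothetical helper, and tag is assumed to be a plain tag name without regex metacharacters:

```javascript
var data = "<test>11\n1</test>\n#EXTM3U\n";

// Build the pattern from a variable tag name; backslashes are doubled in the string.
function extractTagContent(text, tag) {
  var regex = new RegExp("<" + tag + ">([\\s\\S]*?)</" + tag + ">");
  var match = text.match(regex);
  return match ? match[1] : null;
}

// A plain dot stops at the newline, so nothing matches.
var plainDot = data.match(/<test>(.*?)<\/test>/);

// With the s (dotAll) flag, the dot also matches newlines.
var dotAll = data.match(/<test>(.*?)<\/test>/s);

console.log(JSON.stringify(extractTagContent(data, "test")));
console.log(plainDot);
console.log(JSON.stringify(dotAll[1]));
```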
EDIT: Got it:
I tried to use this regex in Notepad++, but the problem is that it matches the whole text from beginning to end.
MyRegex:
<hostname-validation>(.|\s)*<\/pathname-validation> (finds everything)
/<hostname-validation>(.|\s)*<\/pathname-validation>/ (finds nothing)
/\<hostname-validation\>([\s\S]*?)\<\/pathname-validation\>/ (finds nothing)
**<hostname-validation>([\s\S]*?)<\/pathname-validation> (my desired result)**
The text I use it on:
<hostname-validation>www.your-tag-name.com</hostname-validation>
<pathname-validation>pathname</pathname-validation> <response-validation nil="true"/>
<validate-absence type="boolean">false</validate-absence> (...) <hostname-validation>www.your-tag-name.com</hostname-validation>
<pathname-validation>pathname</pathname-validation> <response-validation nil="false"/>
<validate-absence type="boolean">false</validate-absence> (...) <hostname-validation>www.your-tag-name.com</hostname-validation>
<pathname-validation>pathname</pathname-validation> <response-validation nil="true"/>
<validate-absence type="boolean">false</validate-absence> (...)

Javascript: highlight substring keeping original case but searching in case insensitive mode

I'm trying to write a "suggestion search box" and I cannot find a solution that allows to highlight a substring with javascript keeping the original case.
For example if I search for "ca" I search server side in a case insensitive mode and I have the following results:
Calculator
calendar
ESCAPE
I would like to highlight the search string in all the previous words, so the result should be:
**Ca**lculator
**ca**lendar
ES**CA**PE
I tried with the following code:
var reg = new RegExp(querystr, 'gi');
var final_str = 'foo ' + result.replace(reg, '<b>'+querystr+'</b>');
$('#'+id).html(final_str);
But obviously in this way I lose the original case!
Is there a way to solve this problem?
Use a function for the second argument for .replace() that returns the actual matched string with the concatenated tags.
Try it out: http://jsfiddle.net/4sGLL/
reg = new RegExp(querystr, 'gi');
// The str parameter references the matched string
// --------------------------------------v
final_str = 'foo ' + result.replace(reg, function(str) {return '<b>'+str+'</b>'});
$('#' + id).html(final_str);
JSFiddle Example with Input: https://jsfiddle.net/pawmbude/
ES6 version
const highlight = (needle, haystack) =>
haystack.replace(
new RegExp(needle, 'gi'),
(str) => `<strong>${str}</strong>`
);
nice results with
function str_highlight_text(string, str_to_highlight){
var reg = new RegExp(str_to_highlight, 'gi');
return string.replace(reg, function(str) {return '<span style="background-color:#ffbf00;color:#fff;"><b>'+str+'</b></span>'});
}
and easier to remember...
thx to user113716: https://stackoverflow.com/a/3294644/2065594
While the other answers so far seem simple, they can't really be used in many real-world cases, as they don't handle HTML escaping of the text or RegExp escaping of the search term. If you want to highlight every occurrence while escaping the text properly, a function like this returns all the nodes you should add to your suggestions box:
function highlightLabel(label, term) {
if (!term) return [ document.createTextNode(label) ]
// no 'g' flag here: with 'g', match() returns plain strings without the .index used below
const regex = new RegExp(term.replace(/[\\^$*+?.()|[\]{}]/g, '\\$&'), 'i')
const result = []
let left, match, right = label
while (match = right.match(regex)) {
const m = match[0], hl = document.createElement('b'), i = match.index
hl.innerText = m
left = right.slice(0, i)
right = right.slice(i + m.length)
result.push(document.createTextNode(left), hl)
if (!right.length) return result
}
result.push(document.createTextNode(right))
return result
}
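For cases where an HTML string is needed instead of DOM nodes, the same approach can be sketched with plain string operations. This is an illustrative variant, not the answer's own code; highlightHtml, escapeHtml and escapeRegExp are hypothetical names:

```javascript
// Escape regex metacharacters so the term is matched literally.
function escapeRegExp(text) {
  return text.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

// Escape the characters that are significant in HTML text and attribute values.
function escapeHtml(text) {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;')
             .replace(/>/g, '&gt;').replace(/"/g, '&quot;');
}

// Return label as HTML with every case-insensitive occurrence of term wrapped in <b>.
function highlightHtml(label, term) {
  if (!term) return escapeHtml(label);
  var regex = new RegExp(escapeRegExp(term), 'gi');
  var out = '', last = 0, match;
  // exec() with the g flag walks through the matches and exposes match.index.
  while ((match = regex.exec(label)) !== null) {
    out += escapeHtml(label.slice(last, match.index)) + '<b>' + escapeHtml(match[0]) + '</b>';
    last = match.index + match[0].length;
  }
  return out + escapeHtml(label.slice(last));
}

console.log(highlightHtml("Calculator", "ca"));
console.log(highlightHtml("ESCAPE", "ca"));
console.log(highlightHtml("a<b & c", "<b"));
```

The text between matches is escaped piece by piece, so the inserted <b> tags are the only markup in the result.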
string.replace fails in the general case. If you use .innerHTML, replace can replace matches in tags (like a tags). If you use .innerText or .textContent, it will remove any tags there were previously in the html. More than that, in both cases it damages your html if you want to remove the highlighting.
The true answer is mark.js (https://markjs.io/). I just found this - it is what I have been searching for for such a long time. It does just what you want it to.
I do the exact same thing.
You need to make a copy.
I store in the db a copy of the real string, in all lower case.
Then I search using a lower case version of the query string or do a case insensitive regexp.
Then use the resulting found start index in the main string, plus the length of the query string, to highlight the query string within the result.
You can not use the query string in the result since its case is not determinate. You need to highlight a portion of the original string.
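With a case-insensitive regular expression, .match() returns an array of the matches with their case intact.
// assumes queryString contains no regex metacharacters; escape it first otherwise
var matches = str.match(new RegExp(queryString, 'gi')) || [],
startHere = 0,
nextMatch,
resultStr = '';
for (var i = 0; i < matches.length; i++) {
// locate this match in the original string, searching from the end of the previous one
nextMatch = str.indexOf(matches[i], startHere);
resultStr += str.substring(startHere, nextMatch) + '<b>' + matches[i] + '</b>';
startHere = nextMatch + matches[i].length;
}
resultStr += str.substring(startHere); // append the text after the last match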
I have found an easier way to achieve it. A JavaScript regular expression remembers the string it matched in a capturing group, which can be referenced in the replacement as $1. This feature can be used here.
I have modified the code a bit.
reg = new RegExp("("+querystr.trim()+")", 'gi');
final_str = 'foo ' + result.replace(reg, "<b>$1</b>");
$('#'+id).html(final_str);
// Highlight search term and anchor to first occurrence - Start
function highlightSearchText(searchText) {
var innerHTML = document.documentElement.innerHTML;
var replaceString = '<mark>'+searchText+'</mark>';
var newInnerHtml = this.replaceAll(innerHTML, searchText, replaceString);
document.documentElement.innerHTML = newInnerHtml;
var elmnt = document.documentElement.getElementsByTagName('mark')[0]
elmnt.scrollIntoView();
}
function replaceAll(str, querystr, replace) {
var reg = new RegExp(querystr, 'gi');
var final_str = str.replace(reg, function(str) {return '<mark>'+str+'</mark>'});
return final_str
}
// Highlight search term and anchor to first occurrence - End
