Search mechanism to include words as a whole [duplicate] - javascript

This question already has answers here:
Regex match entire words only
(7 answers)
Closed 4 years ago.
I have created a search mechanism that searches through an array of strings for an exact string match, however I want it to be a bit more intuitive.
I can also get it to search for a string within the string (for example, chicken in grilled chicken). However, this allows users to type ken or ill and still get grilled chicken back.
I would like it to return grilled chicken only if I typed a whole word such as chicken or grilled.
Does anyone have any suggestions on how to have a more intuitive search mechanism?
EDIT:
The correct answer below worked when typing one word, since it searches all the individual words in a string. However, I realised it fails when you search with two words (as it only compares against each word of the string individually).
I solved this by adding || search == string to the if, so that it accepts not just individual word matches but also whole string matches.
However I am still having an issue with it either searching for:
Whole string matches
OR
Matches with individual words.
This means it fails when search = green cup and string = big green cup. Is there a way to solve this by building a collection of phrases to search within? Perhaps something similar to
string.split(' '), but one that also adds big green and green cup to the array?

Try this simple code without regex. It matches a string only when every word of the query appears in it as a whole word:
var data = ["first string1 is here", "second string2 is here", "third string3 is here"];
var wordToSearch = "string2 is thanks";
// Split the query into individual words.
var broken = wordToSearch.split(' ');
var result = 'not found';
data.forEach(function (d) {
  // Split each candidate string into words as well.
  var d1 = d.split(' ');
  // Works for any number of query words, not just one or two.
  var allFound = broken.every(function (w) {
    return d1.includes(w);
  });
  if (allFound) {
    result = d;
  }
});
alert(result);

I'd use RegExp with word boundary anchor - \b.
function search(query, arr) {
var res = [];
var re = new RegExp('\\b' + query + '\\b');
arr.forEach(function (item) {
if (re.test(item)) res.push(item);
});
return res;
}
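Note that the function above puts the query into the pattern unescaped, so a query containing characters such as + or ( would change the pattern. The following is a minimal sketch of one way around that, assuming queries may contain spaces and regex metacharacters. The names escapeRegExp and searchPhrase are hypothetical, and the case-insensitive flag is an assumption rather than something the question asked for. Because the whole query sits between the two \b anchors, a multi-word query such as green cup matches inside big green cup, while a fragment such as een cup does not.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical variant of search(): the query is treated as one literal phrase.
function searchPhrase(query, arr) {
  // 'i' (ignore case) is an assumption; drop it for case-sensitive search.
  var re = new RegExp('\\b' + escapeRegExp(query.trim()) + '\\b', 'i');
  return arr.filter(function (item) {
    return re.test(item);
  });
}

var menu = ['big green cup', 'grilled chicken', 'green tea'];
console.log(searchPhrase('green cup', menu));
console.log(searchPhrase('ken', menu));
```

The \b anchors only behave as expected when the query starts and ends with a word character (a letter, digit, or underscore).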

It sounds like you only want to search by whole words. If that's the case, you could split each string on the space character and then search the resulting array for matches.
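A minimal sketch of that idea, assuming words are separated by single spaces and the query is a single word. The name searchWholeWords is hypothetical.

```javascript
// Hypothetical helper: keep only the strings that contain the query as a whole word.
function searchWholeWords(query, arr) {
  return arr.filter(function (item) {
    // Split on spaces, then look for an exact match among the words.
    return item.split(' ').includes(query);
  });
}

var dishes = ['grilled chicken', 'green salad'];
console.log(searchWholeWords('chicken', dishes));
console.log(searchWholeWords('ken', dishes));
```

This does not handle multi-word queries or punctuation; it only illustrates the split-then-compare step.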

Related

Checking the presence of multiple words in a variable using JavaScript

The code below checks for the presence of a single word in a sentence, and it's working fine.
var str ="My best food is beans and plantain. Yam is also good but I prefer yam porrage"
if(str.match(/(^|\W)food($|\W)/)) {
alert('Word Match');
//alert(' The matched word is' +matched_word);
}else {
alert('Word not found');
}
Here is my issue: I need to check for the presence of multiple words in a sentence (e.g. food, beans, plantain) and then also alert the matched word,
something like //alert(' The matched word is' +matched_word);
I guess I have to pass the searched words in an array, as below:
var words_checked = ["food", "beans", "plantain"];
You can construct a regular expression by joining the array of words by |, then surround it with word boundaries \b:
var words_checked = ['foo', 'bar', 'baz']
const pattern = new RegExp(String.raw`\b(?:${words_checked.join('|')})\b`);
var str = 'fooNotAStandaloneWord baz something';
console.log('Match:', str.match(pattern)[0]);
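The snippet above reports only the first match. To list every matched word, one option is to add the g flag and use matchAll. This is a sketch assuming an environment that supports String.prototype.matchAll (ES2020); the variable names wordsToFind, sentence, allWords and found are made up for the example, and the words are assumed to contain no regex metacharacters.

```javascript
var wordsToFind = ['food', 'beans', 'plantain'];
var sentence = 'My best food is beans and plantain. Yam is also good.';

// The g flag is required by matchAll; \b keeps the matches to whole words.
var allWords = new RegExp('\\b(?:' + wordsToFind.join('|') + ')\\b', 'g');

// Collect the text of every match, in order of appearance.
var found = Array.from(sentence.matchAll(allWords), function (m) {
  return m[0];
});
console.log(found); // [ 'food', 'beans', 'plantain' ]
```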
Here's a way to solve this. Simply loop through the list of words to check, build the regex as you go, and check whether there is a match. It is worth reading up on how to build RegExp objects with the RegExp constructor.
var str ="My best food is beans and plantain. Yam is also good but I prefer yam porrage"
var words = [
"food",
"beans",
"plantain",
"potato"
]
for (let word of words) {
let regex = new RegExp(`(^|\\W)${word}($|\\W)`)
if (str.match(regex)) {
console.log(`The matched word is ${word}`);
} else {
console.log('Word not found');
}
}
var text = "I am happy, We all are happy";
var count = countOccurences(text, "happy");
// count is 2
// Pass the whole line and the word whose occurrences you want to count.
// Splitting on the word yields one more piece than there are occurrences.
function countOccurences(string, word){
  return string.split(word).length - 1;
}

Finding multiple groups in one string

Consider the following string: it's a list of HTML a elements separated by commas. How can I get a list of {href,title} for the elements that are between 'start' and 'end'?
not thisstartfoo, barendnot this
The following regex gives only the last iteration of a.
/start((?:<a href="(?<href>.*?)" title="(?<title>.*?)">.*?<\/a>(?:, )?)+)end/g
How can I get the whole list?
This should give you what you need.
https://regex101.com/r/isYIeR/1
/(?:start)*(?:<a href=(?<href>.*?)\s+title=(?<title>.*?)>.*?<\/a>)+(?:,|end)
UPDATE
This does not meet the requirement, because the value returned for a repeated capture group is only the last one captured.
I do not think this can be done in one regex match. Here is a JavaScript solution with two regex matches to get a list of {href, title}:
var sample='startfoo, bar,barendstart<img> something end\n' +
'beginfoo, bar,barend\n'+
'startfoo again, bar again,bar2 againend';
var reg = /start((?:\s*<a href=.*?\s+title=.*?>.*?<\/a>,?)+)end/gi;
var regex2 = /href=(?<href>.*?)\s+title=(?<title>.*?)>/gi;
var step1, step2 ;
var hrefList = [];
while( (step1 = reg.exec(sample)) !== null) {
while((step2 = regex2.exec(step1[1])) !== null) {
hrefList.push({href:step2.groups["href"], title:step2.groups["title"]});
}
}
console.log(hrefList);
If the format is constant, i.e. only href and title for each tag, you can use this regex to find each run of characters other than " that is followed by a " and then whitespace or >, using a lookahead:
const str = 'startfoo, barend';
const result = str.match(/[^"]+(?="[\s>])/gi);
console.log(result);
This regex:
<.*?>
removes all html tags
so for example
<h1>1. This is a title </h1><ul><a href='www.google.com'>2. Click here </a></ul>
After using regex you will get:
1. This is a title 2. Click here
Not sure if this answers your question though.
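A short sketch of how that pattern would be applied with replace, using the example string above. The variable names html and textOnly are made up for the example. This is a quick approximation, not an HTML parser: it would also cut at a > that appears inside an attribute value.

```javascript
var html = "<h1>1. This is a title </h1><ul><a href='www.google.com'>2. Click here </a></ul>";

// The g flag replaces every tag; the lazy .*? stops at the first closing >.
var textOnly = html.replace(/<.*?>/g, '');
console.log(textOnly.trim());
```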

How to check if a string contains a WORD in javascript? [duplicate]

This question already has answers here:
How to check if a string contain specific words?
(11 answers)
Closed 3 years ago.
So, you can easily check if a string contains a particular substring using the .includes() method.
I'm interested in finding if a string contains a word.
For example, if I apply a search for "on" for the string, "phones are good", it should return false. And, it should return true for "keep it on the table".
You first need to convert the string into an array using split() and then use includes():
string.split(" ").includes("on")
You just need to pass a space " " to split() to get all the words.
This is a job for a regex (regular expression).
You can use the regex101 website when you need to work with them (it helps). The function below also handles words with custom separators.
function checkWord(word, str) {
const allowedSeparator = '\\s,;"\'|';
const regex = new RegExp(
`(^.*[${allowedSeparator}]${word}$)|(^${word}[${allowedSeparator}].*)|(^${word}$)|(^.*[${allowedSeparator}]${word}[${allowedSeparator}].*$)`,
// Case insensitive
'i',
);
return regex.test(str);
}
[
'phones are good',
'keep it on the table',
'on',
'keep iton the table',
'keep it on',
'on the table',
'the,table,is,on,the,desk',
'the,table,is,on|the,desk',
'the,table,is|the,desk',
].forEach((x) => {
console.log(`Check: ${x} : ${checkWord('on', x)}`);
});
Explanation:
Here I create one capturing group for each possibility:
(^.*\son$) on is the last word
(^on\s.*) on is the first word
(^on$) on is the only word
(^.*\son\s.*$) on is an in-between word
\s matches any whitespace character, such as a space, a tab, or a new line
const regex = /(^.*\son$)|(^on\s.*)|(^on$)|(^.*\son\s.*$)/i;
console.log(regex.test('phones are good'));
console.log(regex.test('keep it on the table'));
console.log(regex.test('on'));
console.log(regex.test('keep iton the table'));
console.log(regex.test('keep it on'));
console.log(regex.test('on the table'));
You can .split() your string by spaces (\s+) into an array, and then use .includes() to check if the array of strings has your word within it:
const hasWord = (str, word) =>
str.split(/\s+/).includes(word);
console.log(hasWord("phones are good", "on"));
console.log(hasWord("keep it on the table", "on"));
If you are worried about punctuation, you can remove it first using .replace() and then split():
const hasWord = (str, word) =>
str.replace(/[.,\/#!$%\^&\*;:{}=\-_`~()]/g,"").split(/\s+/).includes(word);
console.log(hasWord("phones are good son!", "on"));
console.log(hasWord("keep it on, the table", "on"));
You can split and then try to find:
const str = 'keep it on the table';
const res = str.split(/[\s,\?\,\.!]+/).some(f=> f === 'on');
console.log(res);
In addition, the some method is efficient here because it stops and returns true as soon as the predicate is true for any element.
You can use .includes() and check for the word. To make sure it is a word and not part of another word, verify that the place you found it in is followed by a space, comma, period, etc and also has one of those before it.
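A minimal sketch of that check, using indexOf rather than includes so that the position of each occurrence is known. The function name includesWord and the delimiter list are assumptions made for the example; extend the list to suit your input.

```javascript
// Hypothetical helper: true if `word` occurs in `sentence` with a delimiter
// (or the start/end of the string) on both sides.
function includesWord(sentence, word) {
  if (word === '') return false;
  var delimiters = ' ,.;:!?\n\t'; // assumed set of word delimiters
  var from = 0;
  while (true) {
    var at = sentence.indexOf(word, from);
    if (at === -1) return false;
    var end = at + word.length;
    // Treat the string boundaries as if they were delimiters.
    var before = at === 0 ? ' ' : sentence[at - 1];
    var after = end === sentence.length ? ' ' : sentence[end];
    if (delimiters.includes(before) && delimiters.includes(after)) return true;
    from = at + 1; // keep looking for a later occurrence
  }
}

console.log(includesWord('phones are good', 'on'));
console.log(includesWord('keep it on, the table', 'on'));
```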
A simple version could just be splitting on the whitespace and looking through the resulting array for the word:
"phones are good".split(" ").find(word => word === "on") // undefined
"keep it on the table".split(" ").find(word => word === "on") // "on"
This just splits on a single space, though. When you need to parse real text (depending on your input), you'll encounter more word delimiters than whitespace. In that case you could use a regex to account for those characters.
Something like:
"Phones are good, aren't they? They are. Yes!".split(/[\s,\?\,\.!]+/)
I would go with the following assumptions:
Words at the start of a sentence always have a trailing space.
Words at the end of a sentence always have a preceding space.
Words in the middle of a sentence always have a trailing and preceding space.
Therefore, I would write my code as follows:
function containsWord(word, sentence) {
return (
sentence.startsWith(word.trim() + " ") ||
sentence.endsWith(" " + word.trim()) ||
sentence.includes(" " + word.trim() + " "));
}
console.log(containsWord("test", "This is a test of the containsWord function."));
Try the following -
var mainString = 'codehandbook'
var substr = /hand/
var found = substr.test(mainString)
if(found){
console.log('Substring found !!')
} else {
console.log('Substring not found !!')
}

Searching string from every position with regex so it finds overlapping matches

My regex doesn't find overlapping matches. Is there a way to find them?
Example regex that illustrates the problem: /(\d\*x\^(\d))\+(\d\*x\^(\d))/g
String to perform this regex on: 3*x^2+2*x^3+6*x^2
Matches: 3*x^2+2*x^3
Desired matches: 3*x^2+2*x^3, 2*x^3+6*x^2
I want to use this in a replace function, and check if the first digit is greater than the second, if that is the case, it should switch them around. Because it doesn't match overlapping matches, it doesn't work.
I checked another answer about this, but don't know how to implement it with the replace function in JavaScript.
I've got this so far:
var string1 = "3*x^2+2*x^3+6*x^2";
var newString1 = string1.replace(/(\d\*x\^(\d))\+(\d\*x\^(\d))/g, function (piece, $1, $2, $3, $4) {
if ($2 > $4) {
return $3 + "+" + $1;
} else {
return piece;
}
});
Or, if you know a better tool to solve this problem instead of regexp, like mentioned in the comments, please provide a solution using that tool.
The snippet below just uses split, reverse, join.
It orders the pieces between the '+' based on the reversed pieces.
Thus sorts them by the exponent.
For example 3*x^2+2*x^3+6*x^2 becomes 3*x^2+6*x^2+2*x^3
var str = "3*x^2+2*x^3+6*x^2";
var result = str
.split('').reverse().join('') // reverse the string
.split('+') // split into an array by the +
.sort().reverse() // sort the array descending
.join('+') // join the array by + to a string
.split('').reverse().join(''); // reverse the string
console.log(result);
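The string-reversal trick compares the reversed pieces as text, so it depends on single-digit exponents. As an alternative sketch, the terms can be sorted by their parsed exponent. This assumes every term contains a ^ followed by a non-negative integer and that terms are joined only by +; the name sortByExponent is hypothetical. Since ES2019 the built-in sort is stable, so terms with equal exponents keep their original order.

```javascript
// Hypothetical helper: order the terms of a polynomial by ascending exponent.
function sortByExponent(expr) {
  return expr
    .split('+')
    .map(function (term) {
      // Everything after ^ is taken to be the exponent.
      return { term: term, exponent: Number(term.split('^')[1]) };
    })
    .sort(function (a, b) {
      return a.exponent - b.exponent; // numeric, not alphabetical
    })
    .map(function (entry) {
      return entry.term;
    })
    .join('+');
}

console.log(sortByExponent('3*x^2+2*x^3+6*x^2')); // 3*x^2+6*x^2+2*x^3
console.log(sortByExponent('1*x^10+4*x^9'));      // 4*x^9+1*x^10
```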
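As for how to use a regex to get each piece together with the piece that follows it:
that can be done with a positive lookahead that has a capture group in it. The lookahead inspects the next piece without consuming it, so the next match can start there.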
var str = "3*x^2+2*x^3+6*x^2";
var re = /(\d\*x\^(\d))(?=\+?(\d\*x\^(\d))?)/g;
var m;
while (m = re.exec(str)) {
console.log(m[1], "exponent:"+m[2], m[3], "exponent:"+m[4]);
}
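To collect the overlapping pairs listed in the question directly, the whole pair can be placed inside the lookahead. The sketch below assumes String.prototype.matchAll is available (ES2020); the names poly, pairPattern and pairs are made up for the example. Each match is zero-length, so the search resumes one character later and a term can take part in two pairs.

```javascript
var poly = '3*x^2+2*x^3+6*x^2';

// The pattern itself consumes nothing; the pair is captured inside the lookahead.
var pairPattern = /(?=(\d\*x\^\d\+\d\*x\^\d))/g;

var pairs = Array.from(poly.matchAll(pairPattern), function (m) {
  return m[1];
});
console.log(pairs); // [ '3*x^2+2*x^3', '2*x^3+6*x^2' ]
```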

JS: matching entries with capture groups, accounting for new lines

Given this text:
1/12/2011
I did something.
10/5/2013
I did something else.
Here is another line.
And another.
5/17/2014
Lalala.
More text on another line.
I would like to use regex (or maybe some other means?) to get this:
["1/12/2011", "I did something.", "10/5/2013", "I did something else.\n\nHere is another line.\n\nAnd another.", "5/17/2014", "Lalala.\nMore text on another line."]
The date part and content part are each separate entries, alternating.
I've tried using [^] instead of the dot since JS's .* does not match new lines (as Matching multiline Patterns says), but then the match is greedy and takes up too much, so the resulting array only has 1 entry:
var split_pattern = /\b(\d\d?\/\d\d?\/\d\d\d\d)\n([^]+)/gm;
var array_of_mems = contents.match(split_pattern);
// => ["1/12/2011↵I did something else..."]
If I add a question mark to get [^]+?, which according to How to make Regular expression into non-greedy? makes the match non-greedy, then I only get the first character of the content part.
What's the best method? Thanks in advance.
(\d{1,2}\/\d{1,2}\/\d{4})\n|((?:(?!\n*\d{1,2}\/\d{1,2}\/\d{4})[\s\S])+)
You can try this and grab the captures: group 1 holds a date and group 2 holds a block of content. See the demo.
https://regex101.com/r/sJ9gM7/126
var re = /(\d{1,2}\/\d{1,2}\/\d{4})\n|((?:(?!\n*\d{1,2}\/\d{1,2}\/\d{4})[\s\S])+)/gim;
var str = '1/12/2011\nI did something.\n\n10/5/2013\nI did something else.\n\nHere is another line.\n\nAnd another.\n\n5/17/2014\nLalala.\nMore text on another line.';
var m;
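var results = [];
// Loop (rather than a single if) so that every match is visited.
while ((m = re.exec(str)) !== null) {
  // Guard against an infinite loop on zero-length matches.
  if (m.index === re.lastIndex) {
    re.lastIndex++;
  }
  // Exactly one of the two groups is set for each match.
  results.push(m[1] !== undefined ? m[1] : m[2]);
}
console.log(results);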
You can use the exec() method in a loop to get your desired results.
var re = /^([\d/]+)\s*((?:(?!\s*^[\d/]+)[\S\s])+)/gm,
matches = [],
m;
while (m = re.exec(str)) {
matches.push(m[1]);
matches.push(m[2]);
}
Output
[ '1/12/2011',
'I did something.',
'10/5/2013',
'I did something else.\n\nHere is another line.\n\nAnd another.',
'5/17/2014',
'Lalala.\nMore text on another line.' ]
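As another option, split() keeps the text of capture groups in its result, so splitting on the date lines yields the alternating array directly. This is a sketch assuming each date sits alone on its own line in the d/m/yyyy shape shown in the question; the variable names journal and entries are made up for the example.

```javascript
var journal = '1/12/2011\nI did something.\n\n10/5/2013\nI did something else.\n\n' +
  'Here is another line.\n\nAnd another.\n\n5/17/2014\nLalala.\nMore text on another line.';

var entries = journal
  // The m flag lets ^ match at every line start; the captured date is kept.
  .split(/^(\d{1,2}\/\d{1,2}\/\d{4})\n/m)
  // Drop the blank lines that separate one entry from the next date.
  .map(function (part) {
    return part.trim();
  })
  // The piece before the first date is empty, so filter it out.
  .filter(function (part) {
    return part !== '';
  });

console.log(entries);
```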