How to find a word that is surrounded by indicator characters in JavaScript?

I have a string that uses marker characters to identify a specific word in it.
Example string: "I will c#hec*k on it"
The "#" marks the start and the "*" marks the end.
I want to get two strings:
check - the whole word that contains the "#" and "*", with the markers removed.
hec - the substring enclosed by the markers.
I started with the code below, but it does not seem to work.
sentence.split('#').pop().split('*')[0];
Does anybody know how to do this? I would appreciate it, thanks.

var s = "I will c#hec*k on it"
console.log(s.match(/(?<=#)[^*]*(?=\*)/)) // logs a match array whose first element is "hec"
console.log(s.match(/\w*#[^*]*\*\w*/).map(s => s.replace(/#(.*)\*/, "$1"))) // this will print ["check"]
where:
(?<=#) means "preceded by a #"
[^*]* matches zero or more characters that are not a *
(?=\*) means "followed by a *"
\w* matches zero or more word characters
(.*) is a capturing group (referenced by $1) matching any number of any kind of character (except for newlines)
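Both results can also be read from a single match by using capturing groups instead of lookarounds. The following is a minimal sketch: the function name extractMarked and the shape of the returned object are assumptions made for illustration, and it only handles the first marked word in the sentence.

```javascript
// Hypothetical helper: returns the marked word and the enclosed part,
// or null when no "#...*" pair is found.
function extractMarked(sentence) {
  // (\w*)     word characters before the "#"
  // ([^#*]*)  the enclosed text (no "#" or "*" inside)
  // (\w*)     word characters after the "*"
  const m = sentence.match(/(\w*)#([^#*]*)\*(\w*)/);
  if (!m) return null;
  const [, before, inner, after] = m;
  return { word: before + inner + after, inner };
}

console.log(extractMarked("I will c#hec*k on it")); // { word: 'check', inner: 'hec' }
```

This version works without lookbehind support, which may matter on older JavaScript engines.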

I would try something like this with Javascript,
there might be a better approach with regex though.
let sentence = "I will c#hec*k on it";
sentence.split(" ").forEach(word => {
  if (word.includes("#") && word.includes("*")) {
    let betweenChars = word.substring(
      word.lastIndexOf("#") + 1,
      word.lastIndexOf("*")
    );
    console.log('Between chars: ', betweenChars);
    let withoutChars = word.replace(/[#*]/g, "");
    console.log('Without chars: ', withoutChars);
  }
});

Related

Regex not finding two letter words that include Swedish letters

So I am very new with Regex and I have managed to create a way to check if a specific word exists inside of a string without just being part of another word.
Example:
I am looking for the word "banana".
banana == true, bananarama == false
This is all fine, however a problem occurs when I am looking for words containing Swedish letters (Å,Ä,Ö) with words containing only two letters.
Example:
I am looking for the word "på" in a string looking like this: "på påsk"
and it comes back as negative.
However if I look for the word "påsk" then it comes back positive.
This is the regex I am using:
const doesWordExist = (s, word) => new RegExp('\\b' + word + '\\b', 'i').test(s);
stringOfWords = "Färg på plagg";
console.log(doesWordExist(stringOfWords, "på"))
//Expected result: true
//Actual result: false
However if I were to change the word "på" to a three letter word then it comes back true:
const doesWordExist = (s, word) => new RegExp('\\b' + word + '\\b', 'i').test(s);
stringOfWords = "Färg pås plagg";
console.log(doesWordExist(stringOfWords, "pås"))
//Expected result: true
//Actual result: true
I have been looking around for answers and have found a few questions with similar issues involving Swedish letters, but none of them really look for only the word in its entirety.
Could anyone explain what I am doing wrong?
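The word boundary \b depends strictly on the characters matched by \w, which is shorthand for the character class [A-Za-z0-9_]. Because "å" is not in that class, there is no boundary between "å" and the following space in "på plagg" (both are non-word characters), so \bpå\b fails. "pås" ends in the word character "s", so the boundary exists and the match succeeds.
To get similar behaviour you must re-implement the boundary yourself, for example like this: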
const swedishCharClass = '[a-zäöå]';
const doesWordExist = (s, word) => new RegExp(
'(?<!' + swedishCharClass + ')' + word + '(?!' + swedishCharClass + ')', 'i'
).test(s);
console.log(doesWordExist("Färg på plagg", "på")); // true
console.log(doesWordExist("Färg pås plagg", "pås")); // true
console.log(doesWordExist("Färg pås plagg", "på")); // false
For more complex alphabets, I'd suggest taking a look at Concrete Javascript Regex for Accented Characters (Diacritics).
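For alphabets beyond Swedish, the same lookaround idea can be sketched with Unicode property escapes rather than a hand-written character class. This assumes an engine that supports the u flag with \p{...} and lookbehind (ES2018 or later). The names doesWordExistUnicode and escapeRegExp are hypothetical, introduced only for this example.

```javascript
// Escape regex metacharacters so the search word is matched literally.
const escapeRegExp = (text) => text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// A "word character" here is any Unicode letter, any Unicode digit, or "_".
const doesWordExistUnicode = (s, word) =>
  new RegExp(
    "(?<![\\p{L}\\p{N}_])" + escapeRegExp(word) + "(?![\\p{L}\\p{N}_])",
    "iu"
  ).test(s);

console.log(doesWordExistUnicode("Färg på plagg", "på"));  // true
console.log(doesWordExistUnicode("Färg pås plagg", "på")); // false
console.log(doesWordExistUnicode("på påsk", "på"));        // true
```

Escaping the word first keeps input such as "c++" from being interpreted as regex syntax.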

How to check if a string contains a WORD in javascript? [duplicate]

So, you can easily check if a string contains a particular substring using the .includes() method.
I'm interested in finding if a string contains a word.
For example, if I apply a search for "on" for the string, "phones are good", it should return false. And, it should return true for "keep it on the table".
You first need to convert the string into an array using split() and then use includes():
string.split(" ").includes("on")
Passing a single space " " to split() gives you all the words.
This is called a regex (regular expression).
The regex101 website helps when you need to work with them. The version below also handles words with custom separators.
function checkWord(word, str) {
  // Characters allowed to separate words: whitespace , ; " ' |
  const allowedSeparator = '\\s,;"\'|';
  const regex = new RegExp(
    `(^.*[${allowedSeparator}]${word}$)|(^${word}[${allowedSeparator}].*)|(^${word}$)|(^.*[${allowedSeparator}]${word}[${allowedSeparator}].*$)`,
    // Case insensitive
    'i',
  );
  return regex.test(str);
}
[
'phones are good',
'keep it on the table',
'on',
'keep iton the table',
'keep it on',
'on the table',
'the,table,is,on,the,desk',
'the,table,is,on|the,desk',
'the,table,is|the,desk',
].forEach((x) => {
console.log(`Check: ${x} : ${checkWord('on', x)}`);
});
Explanation:
I create one capturing group for each possibility:
(^.*\son$) on is the last word
(^on\s.*) on is the first word
(^on$) on is the only word
(^.*\son\s.*$) on is an in-between word
\s matches any whitespace character, such as a space, a tab or a newline
const regex = /(^.*\son$)|(^on\s.*)|(^on$)|(^.*\son\s.*$)/i;
console.log(regex.test('phones are good'));
console.log(regex.test('keep it on the table'));
console.log(regex.test('on'));
console.log(regex.test('keep iton the table'));
console.log(regex.test('keep it on'));
console.log(regex.test('on the table'));
You can .split() your string by spaces (\s+) into an array, and then use .includes() to check if the array of strings has your word within it:
const hasWord = (str, word) =>
str.split(/\s+/).includes(word);
console.log(hasWord("phones are good", "on"));
console.log(hasWord("keep it on the table", "on"));
If you are worried about punctuation, you can remove it first using .replace() (as shown in this answer) and then split():
const hasWord = (str, word) =>
str.replace(/[.,\/#!$%\^&\*;:{}=\-_`~()]/g,"").split(/\s+/).includes(word);
console.log(hasWord("phones are good son!", "on"));
console.log(hasWord("keep it on, the table", "on"));
You can split and then try to find:
const str = 'keep it on the table';
const res = str.split(/[\s,\?\,\.!]+/).some(f=> f === 'on');
console.log(res);
In addition, the some() method is efficient because it short-circuits: it returns true as soon as the predicate is true for any element.
You can use .includes() and check for the word. To make sure it is a word and not part of another word, verify that the place you found it in is followed by a space, comma, period, etc and also has one of those before it.
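The neighbour check described above could be sketched as follows. The function name includesWord and the default separator list are assumptions for illustration. The sketch uses indexOf rather than includes() because the position of each hit is needed to inspect the characters around it.

```javascript
// Hypothetical helper: true when `word` occurs in `str` with a separator
// (or the start/end of the string) on both sides.
function includesWord(str, word, separators = " ,.;:!?\t\n") {
  if (word === "") return false; // avoid matching the empty string everywhere
  let from = 0;
  while (true) {
    const at = str.indexOf(word, from);
    if (at === -1) return false; // no further occurrences
    const end = at + word.length;
    const okBefore = at === 0 || separators.includes(str[at - 1]);
    const okAfter = end === str.length || separators.includes(str[end]);
    if (okBefore && okAfter) return true;
    from = at + 1; // keep searching after this occurrence
  }
}

console.log(includesWord("phones are good", "on"));       // false
console.log(includesWord("keep it on, the table", "on")); // true
```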
A simple version could just be splitting on the whitespace and looking through the resulting array for the word:
"phones are good".split(" ").find(word => word === "on") // undefined
"keep it on the table".split(" ").find(word => word === "on") // "on"
This only splits on whitespace, though. When you need to parse text you will (depending on your input) encounter more word delimiters than whitespace. In that case you could use a regex to account for these characters.
Something like:
"Phones are good, aren't they? They are. Yes!".split(/[\s,\?\,\.!]+/)
I would go with the following assumptions:
Words at the start of a sentence always have a trailing space.
Words at the end of a sentence always have a preceding space.
Words in the middle of a sentence always have a trailing and preceding space.
Therefore, I would write my code as follows:
function containsWord(word, sentence) {
return (
sentence.startsWith(word.trim() + " ") ||
sentence.endsWith(" " + word.trim()) ||
sentence.includes(" " + word.trim() + " "));
}
console.log(containsWord("test", "This is a test of the containsWord function."));
Try the following -
var mainString = 'codehandbook'
var substr = /hand/
var found = substr.test(mainString)
if(found){
console.log('Substring found !!')
} else {
console.log('Substring not found !!')
}

A regex for removing certain characters only if there is a space character on either side

I have a Javascript string:
var myString= "word = another : more new: one = two";
I am trying to figure out a regex that would produce this:
var myString= "word another more new: one two";
So whenever a space is followed by an = sign and then by another space, the = sign should be removed.
Likewise for the : character.
It is fine whether the = or : character is removed outright or replaced by a space character.
In summary: replace every occurrence of = or : if and only if it is surrounded by space characters.
Whichever regex is easier to write.
Not with javascript... but you get the idea:
echo "word = another : more new: one = two" | sed 's/ [:=] / /g'
returns the desired string:
word another more new: one two
Explanation: the expression / [:=] / finds all "space followed by either colon or equals sign followed by space" and replaces with "space".
//save the appropriate RegEx in the variable re
//It looks for a space followed by either a colon or equals sign
// followed by another space
let re = /(\s(=|:)\s)/g;
//load test string into variable string
let string = "word = another : more new: one = two";
//parse the string and replace any matches with a space
let parsed_string = string.replace(re, " ");
//show result in the DOM
document.body.textContent = string + " => " + parsed_string;
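One edge case with the patterns above: each match consumes the spaces on both sides, so in input like "a = = b" the second = no longer has a leading space available and is left in place. A possible variation, sketched here under the assumption that lookbehind is available (ES2018 or later), checks the surrounding whitespace without consuming it and then collapses the leftover whitespace. The function name stripSpacedSeparators is hypothetical.

```javascript
// Hypothetical helper: removes "=" or ":" only when whitespace is on both sides.
function stripSpacedSeparators(text) {
  return text
    .replace(/(?<=\s)[=:](?=\s)/g, "") // lookarounds test the neighbours without consuming them
    .replace(/\s+/g, " ");             // collapse the runs of whitespace left behind
}

console.log(stripSpacedSeparators("word = another : more new: one = two"));
// "word another more new: one two"
console.log(stripSpacedSeparators("a = = b")); // "a b"
```

Note that the second replace also turns tabs and newlines into single spaces, which may or may not be wanted.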

Javascript Regex: count unescaped quotes in string

I'm trying to find a regex in Javascript for a seemingly simple problem, but I have been beating my head against the wall all morning about it. I'm trying to count the quotation marks that occur in a string with string.match. The catch is that escaped quotation marks should not be counted, but quotation marks which are preceded by an escaped backslash should be counted again.
As side information, I'm just trying to see if all strings present in the line are properly closed, and I'm reasoning there should be an equal number of quotes present in the line if this is the case.
A few examples:
'"I am string 1" "I am string 2"'
should obviously count 4 quotes
'"I am \"string 1\"" "I am string 2"'
should still count 4 quotes as the ones escaped inside string 1 should be skipped.
'"I am string 1\\" "I am string 2"'
should count 4 quotes, since the \ in front of the 2nd " is escaped by the \ before it.
I have found a regexp which does the job in Ruby (written in PCRE syntax), but it uses constructs which do not exist in Javascript, such as the negative lookbehind (?<! and \K, which resets the starting point of the match:
(?<!\\)(?:\\{2})*\K"
I've tried to translate it to a Javascript regex, but to no avail.
I reckoned something like
(?:\\(?="))|(")
(match either a backslash followed by a " or a " on its own)
should do the trick, but it doesn't work and doesn't even account for the \" problem. Can anyone give me a lead? Many thanks!
You need a small parser to deal with this task as there is no \G operator that could anchor the subsequent matches to the end of the previous successful match.
var s = "\"some text\" with 5 unescaped double quotes... \\\"extras\" \\some \\\"string \\\" right\" here \"";
var res = 0;
var in_entity = false;
for (var i = 0; i < s.length; i++) {
  if ((s[i] === '\\' && !in_entity) || in_entity) { // reverse the flag
    in_entity = !in_entity;
  } else if (s[i] === '"' && !in_entity) { // an unescaped "
    res += 1;
  }
}
console.log(s, ": ", res);
You can use this regex to grab the matches and count the length of resulting array:
var arr = ['"I am string 1" "I am string 2"',
  '"I am \\"string 1\\"" "I am string 2"',
  '"I am string 1\\\\" "I am string 2"'
];
for (let i = 0; i < arr.length; i++) {
  // match() returns null when nothing matches, so fall back to an empty array
  console.log((arr[i].match(/"[^"\\]*(?:\\.[^"\\]*)*"/g) || []).length * 2);
}
/"[^"\\]*(?:\\.[^"\\]*)*"/ matches a complete quoted string, consuming any escaped characters inside it.
Output:
4
4
4
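Because the pattern above only matches complete quoted strings, multiplying the match count by 2 always gives an even number, so it cannot reveal an unclosed string. A possible variation, sketched here with the hypothetical name countUnescapedQuotes, tokenises the line into escape pairs and bare quotes and counts only the quotes.

```javascript
// Hypothetical helper: counts double quotes that are not escaped by a backslash.
function countUnescapedQuotes(line) {
  // \\[\s\S] consumes a backslash together with the character it escapes,
  // so an escaped quote (or an escaped backslash) is never counted on its own.
  const tokens = line.match(/\\[\s\S]|"/g) || [];
  return tokens.filter((t) => t === '"').length;
}

// String.raw keeps the backslashes exactly as typed.
console.log(countUnescapedQuotes(String.raw`"I am string 1" "I am string 2"`));     // 4
console.log(countUnescapedQuotes(String.raw`"I am \"string 1\"" "I am string 2"`)); // 4
console.log(countUnescapedQuotes(String.raw`"I am string 1\\" "I am string 2"`));   // 4
console.log(countUnescapedQuotes(String.raw`"unclosed \" string`));                 // 1
```

An odd result then indicates a string that was opened but never closed.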
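A simplistic solution is to first unescape any escaped quotes and then re-escape every quote in the string.
let val = '"I am \\"string 1\\"" "I am string 2"'; // \\" in the literal is a backslash followed by a quote
val = val.replace(/\\"/g, '"'); // unescape: \" becomes "
val = val.replace(/"/g, '\\"'); // re-escape: every " becomes \"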
Result will be:
'\"I am \"string 1\"\" \"I am string 2\"'

regex for string to get customerID

I am looking for the customerID number in a string. The customer ID would be in this format: customerID{id}
Here are some of the different strings I would have:
myVar = "id: 1928763783.Customer Email: test#test.com.Customer Name:John Smith.CustomerID #123456.";
myVar = "id: 192783.Customer Email: test1#test.com.Customer Name:Rose Vil.CustomerID #193474.";
myVar = "id: 84374398.Customer Email: test2#test.com.Customer Name:James Yuem.";
Ideally I want to be able to check whether a CustomerID exists or not. If it does exist, I want to see what it is. I know we can use a regex, but I'm not sure how that would look.
Thanks
var match = myVar.match(/CustomerID #(\d+)/);
var id = match ? match[1] : null; // null when no CustomerID is present
I'm not 100% familiar with the syntax but I'd say: "(CustomerID #([0-9]+).)"
I think this is a valid regular expression for what you're looking for: it checks whether a string has 'CustomerID' followed by a space, a number sign and then a sequence of digits. Because the digits are surrounded by brackets, they can be captured by referencing bracket 2 if a match is found.
I'm not sure whether the brackets or the period need a \ before them in this syntax. Sorry I can't be of more help, but I hope this helps in some way.
Play around to get this to work for your needs:
// case-insensitive regular expression (the i flag makes the match case-insensitive)
// that looks for one or more spaces after customerid (if you want zero or more spaces, change + to *)
// an optional # character (remove ? if you don't want it to be optional)
// one or more digits thereafter (you can specify how long an id to expect by replacing + with {length} or {min,max})
var regex = /CustomerID\s+#?(\d+)/i;
var myVar1 = "id: 1928763783.Customer Email: test#test.com.Customer Name:John Smith.CustomerID #123456.";
var match = myVar1.match(regex);
if(match) { // if no match, this will be null
console.log(match[1]); // match[0] is the full matched text; match[1] is the first capture group (the digits)
}
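To cover both parts of the question (does a CustomerID exist, and what is it), the pattern can be wrapped in a small helper. This is only a sketch: the name findCustomerId is hypothetical, and the pattern assumes the "CustomerID #digits" layout seen in the sample strings.

```javascript
// Hypothetical helper: returns the id as a string, or null when the text
// contains no CustomerID.
function findCustomerId(text) {
  const match = text.match(/CustomerID\s*#?(\d+)/i);
  return match ? match[1] : null;
}

const samples = [
  "id: 1928763783.Customer Email: test#test.com.Customer Name:John Smith.CustomerID #123456.",
  "id: 192783.Customer Email: test1#test.com.Customer Name:Rose Vil.CustomerID #193474.",
  "id: 84374398.Customer Email: test2#test.com.Customer Name:James Yuem.",
];
samples.forEach((s) => console.log(findCustomerId(s))); // "123456", "193474", null
```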
