Regexp adds invisible dot character replacing \b - javascript

I want this to be my regex: /^word\b/ (word is dynamic)
When I set it up to be dynamic I have to use this:
var word='spoon';
'spoon .table .chair'.match(new RegExp('^'+word+'\b'));
However, this finds null, while this:
var word='spoon';
'spoon .table .chair'.match(/^spoon\b/);
finds ["spoon"].
The interesting part is when I examine the difference between the regex I wrote and the regex RegExp wrote:
console.log(/^spoon\b/,new RegExp('^'+word+'\b'))
It shows this:
/^spoon\b/ /^spoon/
If I then copy the second part of the log output (/^spoon/) into my code editor I see this character:
What is that? How do I match a word ending with RegExp? I can't rely on a space after the word, because the string might be a one-word string (spoon or another word).
I'd rather just do this without the invisible thing

You've got to escape the \ before the b in the regex string by adding an extra backslash:
var regex = new RegExp('^' + word + '\\b')
This is because the RegExp is expecting to see the two characters \ and b, but the string '\b' is one character, ascii 8, the backspace character (in the same way that '\n' is a single newline character).

In a JavaScript string literal, \b doesn't mean a \ followed by a b. It means the backspace character (ASCII code 8). To get a \ followed by a b, you need to escape the backslash so that JavaScript doesn't parse it as a backspace:
'^' + word + '\\b'
The same thing applies if you want to use \d or \s or anything else: You need to escape the \ with another one so that Javascript doesn't think it's a Javascript escape code and the RegExp can parse it as what you expect.
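A minimal sketch of the difference described in the answers above. The variable names (backspace, boundary, rawBoundary, broken, fixed) are only illustrative; String.raw is shown as one alternative way to keep the backslash, not something the original answers used.

```javascript
// '\b' in a string literal is one character: backspace (code 8).
const backspace = '\b';
// '\\b' is two characters: a backslash followed by the letter b.
const boundary = '\\b';
// String.raw leaves backslashes untouched, so this is also backslash + b.
const rawBoundary = String.raw`\b`;

const word = 'spoon';
const text = 'spoon .table .chair';

// Pattern ends with a literal backspace character, which the text lacks.
const broken = new RegExp('^' + word + backspace);
// Pattern ends with the regex token \b (word boundary).
const fixed = new RegExp('^' + word + boundary);

console.log(backspace.length, backspace.charCodeAt(0)); // 1 8
console.log(boundary.length, boundary === rawBoundary); // 2 true
console.log(text.match(broken)); // null
console.log(text.match(fixed)[0]); // spoon
```

If word can contain regex metacharacters (for example a + or a .), it would also need escaping before being concatenated into the pattern.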

Related

Regex to match string from the back

Let's say we have a string "text\t1\nText that has to be extracted". What regex can be used to check the string from the back, that is, from the last " back to the \n, because the start of the string can change? In this case, I need to get only Text that has to be extracted. What generic regex can we use here?
I used (?<=\\n1\\n)(.*)(?=") but this will not work if the pattern before the text changes to \n2 or \ntext.
Any help is appreciated.
You may use this regex:
/(?<=\\n)[^"\\]+(?="$)/
RegEx Details:
(?<=\\n): Lookbehind to make sure we have a \n before the current position
[^"\\]+: Match 1+ of any character that is not " and not \
(?="$): Make sure we have a " before line end ahead
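As a rough check of the regex above, here is a sketch that assumes the input holds a literal backslash followed by n (not a newline character) and ends with a double quote. The input value and the variable names are made up for illustration. The second approach, using lastIndexOf, is an alternative for engines without lookbehind support and is not from the original answer.

```javascript
// Assumed input: the quotes and the backslash sequences are literal characters.
const input = '"text\\t1\\nText that has to be extracted"';

// Regex approach (lookbehind requires an ES2018-capable engine).
const viaRegex = input.match(/(?<=\\n)[^"\\]+(?="$)/);

// String approach: slice between the last literal \n and the last double quote.
const start = input.lastIndexOf('\\n') + 2; // skip the two characters \ and n
const end = input.lastIndexOf('"');
const viaIndex = input.slice(start, end);

console.log(viaRegex[0]); // Text that has to be extracted
console.log(viaIndex);    // Text that has to be extracted
```

The string approach does not verify that a literal \n exists at all, so real code would want to check the result of lastIndexOf before slicing.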
Can't you just split and take the last element?
var item = "text\n1\nText that has to be extracted";
var last = item.split(/\n/g).reverse()[0];
console.log(last) // "Text that has to be extracted"
/^(\d+)\n([^\n"]+)"$/ may have some edge cases, but will find the number (one or more digits), followed by a newline, followed by any character that is neither newline nor a double quote, followed by a literal double quote.
This would require that the double quote occurs immediately before the end-of-line (EOL), but if that's not required (for example, if you have a semi-colon after the closing quote), remove $ from the end.
Edit
Just noticed that it's the literal text \n and not a newline character.
/(?<=\\n)(\d+)\\n((?:[^\\"]+|\\.)*)"/
Breakdown:
(?<=\\n) looks for a \ followed by the letter n.
(\d+) captures the 1-or-more digits.
\\n matches a literal \ followed by the letter n.
(...*) matches some text that repeats 0 or more times.
(?:...|...) matches any run of characters that are neither a literal \ character nor a double quote character... OR a literal \ character that is followed by "anything" so you can have \n or \" etc. The entire group is matched repeatedly.
" at the end ensures that you're inside (well, we hope) a double-quoted string on the same line.

Replace '\' with '-' in a string

I have seen all the questions asked previously, but they didn't help me out. I have a string that contains backslashes and I want to replace the backslashes with '-'.
var s="adbc\sjhf\fkjfh\af";
s = s.replace(/\\/g,'-');
alert(s);
I thought this was the proper way to do it, and of course I am wrong, because the alert shows adbcsjhffkjfhaf but I need it to be adbc-sjhf-fkjfh-af.
What mistake did I make here, what is the reason for it, and how do I achieve this?
Your s is initially adbcsjhffkjfhaf. You meant
var s="adbc\\sjhf\\fkjfh\\af";
You need to double-up the backslashes in your input string:
var s="adbc\\sjhf\\fkjfh\\af";
Prefixing a character with '\' in a string literal gives special meaning to that character (eg '\t' means a tab character). If you want to actually include a '\' in your string you must escape it with a second backslash: '\\'
JavaScript is dropping the \ in \s and \a (they are not recognized escapes, so only the letter remains) and reading \f as a form feed character. Do a console.log(s) after assigning and you will understand.
You need to escape \ with \\. Like: "adbc\\sjhf\\fkjfh\\af"
The string doesn't contain a backslash; it contains the sequences \a, \s and \f (the escape sequence for form feed).
If you change your string to adbc\\sjhf\\fkjfh\\af
var s="adbc\\sjhf\\fkjfh\\af";
s = s.replace(/\\/g,'-');
alert(s);
you will be able to replace it with -
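To see what each spelling of the literal actually stores, here is a small sketch. The names unescaped, escaped and raw are illustrative; String.raw is offered as an assumption-free alternative to doubling every backslash, though the answers above did not mention it.

```javascript
// Single backslashes: \s and \a collapse to s and a, \f becomes a form feed.
const unescaped = "adbc\sjhf\fkjfh\af";
// Doubled backslashes: each \\ stores one real backslash.
const escaped = "adbc\\sjhf\\fkjfh\\af";
// String.raw keeps the backslashes exactly as typed.
const raw = String.raw`adbc\sjhf\fkjfh\af`;

console.log(unescaped.includes('\\')); // false
console.log(unescaped.includes('\f')); // true
console.log(escaped === raw);          // true
console.log(escaped.replace(/\\/g, '-')); // adbc-sjhf-fkjfh-af
```

If the text arrives at runtime (from user input or a file) rather than from a literal in the source, the backslashes are already real characters and the original replace call works unchanged.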

Strange javascript regular expressions

I have found the following regular expression
new RegExp("(^|\\s)hello(\\s|$)");
I referred to http://www.javascriptkit.com/jsref/escapesequence.shtml for regular expressions, but I cannot see the \s escape sequence there. I know \s indicates a whitespace character.
But what does the preceding \ do? Which character is escaped?
I found similar regular expression in the Treewalker code in the following document http://ejohn.org/blog/getelementsbyclassname-speed-comparison/
The double \\ is to escape the backslash inside the string. In other words, \\ will be interpreted as \ for the regular expression.
The extra \ in this case is to escape the \ in the \s. Because we are inside a string declaration, you have to double up the \ to escape it. Once the string is processed and saved, it is reduced down to (^|\s)hello(\s|$)
The character immediately following the first \ is escaped. Normally \s escapes the s to mean "whitespace". In your example, the character which is escaped is \.
What you have is an expression which builds a regex (presumably to pass elsewhere) of (^|\s)hello(\s|$) — the word "hello" preceded either by whitespace or the start of the string, and followed by whitespace or the end of the string.
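A quick way to confirm this is to compare the regex built from the string with the equivalent regex literal. This is only a sketch; the variable names and the test strings are made up for illustration.

```javascript
// Built from a string: each \\ in the literal becomes one \ in the pattern.
const fromString = new RegExp("(^|\\s)hello(\\s|$)");
// The same pattern written as a regex literal needs no doubling.
const fromLiteral = /(^|\s)hello(\s|$)/;

console.log(fromString.source === fromLiteral.source); // true
console.log(fromString.test('say hello world'));       // true
console.log(fromString.test('hello'));                 // true
console.log(fromString.test('othello'));               // false
```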
Essentially what the reg ex is doing, is looking for the opening and closing items of text surrounding the word hello and literally interpreting the '\s' as string content at the same time.
In layman's terms it's looking for a string that exactly matches:
|\shello\s|
As others have said the double \ is to escape the single \ so that instead of the reg ex engine looking for white-space it actually looks for '\s' as a string.
The ^ means start of line, the $ means end of line and the 2 | are interpreted as actual chars to look for
Lastly your start and end markers are bracketed () which means they will be extracted and placed in matches, which for you using C# means you can get at them by using:
myRegex.Matches.Group[1].Value
myRegex.Matches.Group[2].Value
1 being the beginning grouping, and 2 being the end.

unterminated string literal in the variable

var MM = '\' + obj[0]['MM '] + '/';
I get two errors while using this code...
missing ; before statement and
unterminated string literal
The character \ is "special" because it's used to allow the use of all printable characters in strings. In your case '\' is not a string composed by the only character \, but the beginning of a string starting with the single quote character '.
For example, if you want the string Hello Andrea "6502" Griffini you can use single quotes
string1 = 'Hello Andrea "6502" Griffini';
and if you want single quotes in the string you can do the opposite
string2 = "Hello Andrea '6502' Griffini";
But what if you want both kinds of quotes in the same string? This is where the escape \ character comes in handy:
string3 = "'llo Andrea \"6502\" Griffini";
Basically \ before a quote or double quote in a string tells javascript that the following character is just a regular character, with no special meaning attached to it.
Note that the very same character is also used in regular expressions... for example if you want to look for an open bracket [ you must prefix it with a backslash because [ in a regular expression has a special meaning.
The escape is also used to do the opposite... in a string if you put a backslash in front of a normal character you are telling javascript that that character is indeed special... for example
alert("This is\na test");
In the above line the \n sequence means a newline code, so the message displayed will be on two lines ("This is" and "a test").
You may now wonder... what if I need a backslash character in my string? Just double it in that case. In your code for example just use '\\'.
Here is a table for the possible meanings of backslash in strings
\" just a regular double-quote character, it doesn't end the string
\' just a regular single-quote character, it doesn't end the string
\b a backspace character (ASCII code 0x08)
\t a tab character (ASCII code 0x09)
\n a newline character (ASCII code 0x0A)
\v a vertical tab character (ASCII code 0x0B)
\f a form feed character (ASCII code 0x0C)
\r a carriage return character (ASCII code 0x0D)
\033 the character with ASCII code 033 octal = 27 ("ESC" in this case)
\x41 the character with ASCII code 0x41 = 65 ("A" in this case)
\u05D0 the unicode character 0x05D0 (Aleph from the Hebrew charset)
\\ just regular backslash character, not an escape prefix
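The character codes in the table can be checked directly. This sketch leaves out the octal form \033 on purpose, because legacy octal escapes are a syntax error in strict mode and in template literals; the array name codes is illustrative.

```javascript
// Each literal below is a single character; map it to its character code.
const codes = ['\b', '\t', '\n', '\v', '\f', '\r', '\x41', '\u05D0', '\\']
  .map((ch) => ch.charCodeAt(0));

console.log(codes); // [8, 9, 10, 11, 12, 13, 65, 1488, 92]

// Escaped quotes are plain characters and do not end the string.
console.log("\"".length, '\''.length); // 1 1
```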
\ is an escape character. You'll have to double it to literally mean a backslash character, otherwise it'll augment the following character (In this case the next single quote)
You need to properly escape the backslash:
var lastMenstrualPeriod = '\\' + obj[0]['LastMenstrualPeriod'] + '/';
Because \ is an escape character, the JS "compiler" expects another character to follow it; for example, \n is the newline constant, \t is tab, etc. So \\ is one single backslash in a string.
It is also mentioned in Douglas Crockford's book.
You are forgetting to escape '\'
Do this:
var lastMenstrualPeriod = '\\' + obj[0]['LastMenstrualPeriod'] + '/';
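A short sketch of the corrected line in action. The contents of obj are invented here purely so the snippet runs; the real data is not shown in the question.

```javascript
// Hypothetical data standing in for the asker's obj.
const obj = [{ LastMenstrualPeriod: '05' }];

// '\\' is a one-character string holding a single backslash.
const lastMenstrualPeriod = '\\' + obj[0]['LastMenstrualPeriod'] + '/';

console.log(lastMenstrualPeriod);        // \05/
console.log(lastMenstrualPeriod.length); // 4
```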

Finding Plus Sign in Regular Expression

var string = 'abcd+1';
var pattern = 'd+1'
var reg = new RegExp(pattern,'');
alert(string.search(reg));
I found out last night that if you try and find a plus sign in a string of text with a Javascript regular expression, it fails. It will not find that pattern, even though it exists in that string. This has to be because of a special character. What's the best way to find a plus sign in a piece of text? Also, what other characters will this fail on?
Plus is a special character in regular expressions, so to express the character as data you must escape it by prefixing it with \.
var reg = /d\+1/;
\-\.\/\[\]\\ **always** need escaping
\*\+\?\)\{\}\| need escaping when **not** in a character class- [a-z*+{}()?]
But if you are unsure, it does no harm to include the escape before a non-word character you are trying to match.
A digit or letter is a word character. Escaping a digit refers to a previous match; escaping a letter can match an unprintable character, like a newline (\n) or tab (\t), a word boundary (\b), or a set of characters, like any word character (\w) or any non-word character (\W).
Don't escape a letter or digit unless you mean it.
Just a note,
\ should be \\ in RegExp pattern string, RegExp("d\+1") will not work and Regexp(/d\+1/) will get error.
var string = 'abcd+1';
var pattern = 'd\\+1'
var reg = new RegExp(pattern,'');
alert(string.search(reg));
//3
You should use the escape character \ in front of the + in your pattern. eg. \+
You probably need to escape the plus sign:
var pattern = /d\+1/
The plus sign is used in regular expressions to indicate 1 or more characters in a row.
It should be var pattern = 'd\\+1' (without the surrounding slashes, which belong only to regex literals).
The string literal turns '\\' into '\' ('\\+' --> '\+'), so the regex object is initialized as /d\+1/
If you want to match + (plus sign) or $ (dollar sign), then use \ (backslash) as a prefix. Like this:
\$ or \+
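When the pattern comes from a variable, one common approach is to escape every regex metacharacter before building the RegExp. The helper below, escapeRegExp, is a hypothetical name and not a built-in used by the answers above; the character list is a widely used one, offered here as a sketch.

```javascript
// Hypothetical helper: prefix every regex metacharacter with a backslash.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const string = 'abcd+1';

// Unescaped: d+1 means "one or more d, then 1", which is not in the string.
console.log(string.search(new RegExp('d+1'))); // -1
// Escaped at runtime: the pattern becomes d\+1.
console.log(string.search(new RegExp(escapeRegExp('d+1')))); // 3
// Regex literal with the plus escaped by hand.
console.log(string.search(/d\+1/)); // 3
```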
