Regex get all text from # to quotation - javascript

Okay so I currently have:
/(#([\"]))/g;
I want to be able to check for a string like:
#23ad23"
What's wrong with my regex?

Your regex (/(#([\"]))/g) breaks down like this.
Without the start/end delimiters, the flags, and the capturing parentheses, it is:
#[\"]
which just means # followed by ". The square brackets for the character class are unnecessary, as there is only one item in it, so this is equivalent to:
#"
I think you want to match all characters between # and ", inclusive, and capture the part between them.
Start with a regex like this:
#.+?"
This means # followed by any character (.) one or more times (+), non-greedily (?), followed by ".
So, with the capturing parentheses and the delimiters:
/(#(.+?)")/g
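As a quick check of the pattern above, here is a minimal sketch that pulls out the captured text with matchAll. The sample string and the variable names are made up for illustration.

```javascript
// Hypothetical sample input containing two #..." spans.
const input = 'color: #23ad23" and #ff0000"';

// # followed by one or more characters, lazily, up to the next ".
const pattern = /#(.+?)"/g;

// Group 1 holds the text between # and " (both delimiters excluded).
const captured = Array.from(input.matchAll(pattern), (m) => m[1]);

console.log(captured);
```

Group 0 of each match would include the # and the " as well, which is what the outer parentheses in /(#(.+?)")/g capture.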

Is this what you mean?
/(#([^\"]+))/g;
This will include everything until it reaches the " char.

For the fewest, longest matches (greedy): #(.+)\"
For the most, shortest matches (lazy): #(.+?)\"
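To see the difference between the variants suggested in these answers, here is a small sketch run against one made-up string that contains two #..." spans. The variable names are only for illustration.

```javascript
// Hypothetical input with two separate #..." spans.
const text = '#aa" middle #bb"';

// Greedy: .+ runs to the last " in the string, giving one long match.
const greedy = text.match(/#(.+)"/g);

// Lazy: .+? stops at the first " it can, giving two short matches.
const lazy = text.match(/#(.+?)"/g);

// Negated class: stops before the " and does not include it in the match.
const negated = text.match(/#([^"]+)/g);

console.log(greedy, lazy, negated);
```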

Related

Regular Expression to match text between # and only if # is not preceded by '

Hello I'm trying to find a regular expression that can help me find all matches inside a string when they're inside # and only if # are not preceded by an apostrophe "'".
Basically I need to bold the text just as here when we use double * to bold text like this, but the apostrophe should work as an escape character.
For example
#Hello my name is Noé# should look like Hello my name is Noé
#Hello this has an escape apostrophe '# so I'll match until here# should look like Hello this has an escape apostrophe '# so I'll match until here
Inside a long text there might or might not be several matches:
"Hello I'm a text #I'm bold#, and I need to know how to match my text that's inside two '#, and #I will not match either 'cause I got no end"
So I can print it like
"Hello I'm a text I'm bold, and I need to know how to match my text that's inside two '#, and #I will not match either 'cause I got no end"
If that's not possible with a RegExp I could program a finite state machine, but I was hoping it was possible. Thank you in advance, God bless you!
Note: I will handle the escape characters later; for now I just need to know how to match this:
/(?<!')#.*(?<!')#/gim
This was the only thing I could come up with, but honestly, I have no idea how negative lookbehind works :(. With this regexp the match is wrong. For example, if I type:
"I'm a text #and I should be a match# and this should not #But this should as well# and I'm just some random extra text"
matches from the first # occurrence until the last one, like so:
"I'm a text #and I should be a match# and this should not #But this should as well# and I'm just some random extra text"
I think this should work:
(?<!')#(.*?)(?<!')#
Here you can see the regexp working with your examples: https://regex101.com/r/wnguiA/1
(?<!') is Negative Lookbehind, it tells the regex engine to temporarily step backwards in the string, to check if the text inside the lookbehind can be matched there. (?<!a)b matches a b that is not preceded by an a.
Simpler is (.*?), which matches any characters (except line terminators); the ? makes the quantifier non-greedy, so it stops at the first occurrence of the token that follows.
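A minimal sketch of how this could be used for the bolding, assuming the output should be wrapped in <b> tags (the question does not say what markup is wanted, so that part and the function name boldHashes are assumptions). It needs a JavaScript engine with lookbehind support.

```javascript
// # not preceded by ', then as little as possible, then another # not preceded by '.
const boldPattern = /(?<!')#(.*?)(?<!')#/g;

// Hypothetical helper: wraps every matched span in <b>...</b>.
function boldHashes(text) {
  return text.replace(boldPattern, "<b>$1</b>");
}

console.log(boldHashes("I'm a text #and I should be a match# and this should not #But this should as well# and more"));
console.log(boldHashes("#has an escaped '# inside#"));
```

The escaped '# stays inside the captured text as-is; stripping the apostrophe would be a separate step, as the question notes.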
To avoid running the negative lookbehind at every position in the string, you can also match # first and do the assertion after it.
#(?<!'#)(.*?)#(?<!'#)
Regex demo
Another option instead of using the non greedy .*? is to use a negated character class matching any char except #
Then when you encounter # only match it if there is ' before it using a positive lookbehind.
#(?<!'#)([^#\n]*(?:#(?<='#)[^#\n]*)*)#(?<!'#)
#(?<!'#) Match # not directly preceded by '
( Capture group 1
[^#\n]* Optionally match any char except # or a newline
(?: Non capture group
#(?<='#) Match # only if it is directly preceded by '
[^#\n]* Match optional repetitions of any char except # or a newline
)* Close non capture group and optionally repeat it to match all occurrences
) Close group 1
#(?<!'#) Match # not directly preceded by '
Regex demo
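Here is a short sketch that runs the negated-character-class pattern from this answer against text similar to the question's example. The helper name capturedSpans and the shortened sample strings are made up for illustration.

```javascript
// The pattern from the answer above, unchanged, with the g flag added.
const unrolled = /#(?<!'#)([^#\n]*(?:#(?<='#)[^#\n]*)*)#(?<!'#)/g;

// Hypothetical helper: returns the text of capture group 1 for every match.
function capturedSpans(text) {
  return Array.from(text.matchAll(unrolled), (m) => m[1]);
}

// The escaped '# is skipped as a delimiter, and the unterminated #... is not matched.
console.log(capturedSpans("Hello I'm a text #I'm bold#, and two '#, and #I will not match"));

// An escaped '# inside a span is kept as part of the capture.
console.log(capturedSpans("#a '# b#"));
```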

Regex delimit the start of a string and the end

I've been having trouble with regex, which I don't understand at all.
I have a string '#anything#that#i#say' and want the regex to detect one word per #, so the result will be [#anything, #that, #i, #say].
It needs to work with spaces too :(
The closest I came is [#\w]+, but this only gets one word and I want them separated.
You're close; [#\w] will match anything that is either a # or a word character. But what you want is to match a single # followed by any number of word characters, like this: #\w+ without the brackets
var str = "#anything#that#i#say";
var regexp = /#\w+/gi;
console.log(str.match(regexp));
It's possible to have this deal with spaces as well, but I'd need to see an example of what you mean to tell you how; there are lots of ways that "need to work with spaces" can be interpreted, and I'd rather not guess.
Use the expression /#\s*(\w+)/g
\s* : matches zero or more whitespace characters between # and the word
This will match 4 words in your string '#anything#that#i#say',
even if your string contains spaces, as in '#anything# that#i# say'.
sample to check: http://www.regextester.com/?fam=97638
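A quick sketch of that expression on the spaced variant of the string, collecting only the captured words (without the # or the spaces). The variable names are illustrative.

```javascript
// The spaced variant of the string from this answer.
const tagged = "#anything# that#i# say";

// # then optional whitespace, then the word in capture group 1.
const words = Array.from(tagged.matchAll(/#\s*(\w+)/g), (m) => m[1]);

console.log(words);
```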

Javascript regex for tag in comments

I've been working on a web app in which users can comment and reply to comments; this uses a tagging system. The users are tagged by their name, which can contain several words, so I've decided to mark the tags like this:
&&John Doe&&
So a comment might look like this:
&&John Doe&&, are you sure that &&Alice Johnson&& is gone?
I'm trying to write a regex to use in a string.replace() JavaScript function, so the regex must match every single tag in the string.
So far I have:
^&&.+{2, 64}&&$
This isn't working so I'm sure something is wrong, in case you didn't understand what I meant, the regex from above is supposed to match strings like this:
&&anythingbetween2and64charslong&&.
Thanks in advance!
(.*?)&& means "everything until &&" :
var before = document.getElementById("before");
var after = document.getElementById("after");
var re = /&&(.*?)&&/g, b = "<b>$1</b>";
after.innerHTML = before.textContent.replace(re, b);
<p id="before">&&John Doe&&, are you sure that &&Alice Johnson&& is gone?</p>
<p id="after"></p>
Try &{2}.{2,64}&{2}
If you want to get the text in between, add parentheses for the match group:
&{2}(.{2,64})&{2}
Right now you are only checking strings where the entire line matches:
the ^ character means beginning of line
the $ character means end of line
In some other regex flavors, \A means beginning of the entire string and \Z means end of the entire string.
JavaScript does not support \A and \Z; without the m flag, ^ and $ already anchor to the whole string.
Here's what you need:
str.match(/&&.{2,64}?&&/g)
You need to remove ^ and $ from the start and the end, since they match the start and the end of the string.
Add the g flag at the end so that all the matches are returned.
The ? after the {} makes the match non-greedy, so it will match the shortest possible string between "&&" instead of the longest (it will give you "&&John Doe&&" instead of "&&John Doe&&, are you sure that &&Alice Johnson&&").
Read up on greediness: Repetition with Star and Plus
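To make the greedy versus non-greedy difference concrete, here is a small sketch using the comment from the question. The variable names are only for illustration.

```javascript
// The example comment from the question.
const comment = "&&John Doe&&, are you sure that &&Alice Johnson&& is gone?";

// Non-greedy: each tag is matched on its own.
const lazyTags = comment.match(/&&.{2,64}?&&/g);

// Greedy: one match that runs from the first && to the last && within 64 characters.
const greedyTags = comment.match(/&&.{2,64}&&/g);

console.log(lazyTags);
console.log(greedyTags);
```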
This regex will match words made of Unicode letters and digits between && signs:
str.match(/&&[\p{L}\p{N}]+(?:\s+[\p{L}\p{N}]+)*&&/gu);
Here,
\p{L} --> Any Unicode letter, so the names can be in any language
\p{N} --> Any Unicode numeric character
[\p{L}\p{N}]+ --> A word made of Unicode letters or digits
\s+ --> One or more whitespace characters between words
[\p{L}\p{N}]+(?:\s+[\p{L}\p{N}]+)* --> All word groups
Note that JavaScript only understands \p{...} when the u flag is set, and with that flag & must not be escaped.
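A minimal sketch of the Unicode-aware pattern, written with the u flag that JavaScript requires for \p{...} and with plain (unescaped) & characters. The sample comment and names are invented for illustration.

```javascript
// Words of Unicode letters/digits, separated by whitespace, between && markers.
const unicodeTag = /&&[\p{L}\p{N}]+(?:\s+[\p{L}\p{N}]+)*&&/gu;

// Hypothetical comment: two valid tags and one containing a hyphen.
const unicodeComment = "&&Алиса 2&& met &&John Doe&& and &&bad-name&&";

// The hyphenated tag is skipped because - is neither a letter nor a digit.
const unicodeTags = unicodeComment.match(unicodeTag);

console.log(unicodeTags);
```

Whether hyphens or apostrophes should be allowed in names is a design choice; they would have to be added to the character class explicitly.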

Match word ending with either one of two special characters in the string

I'm trying to check if word. or word: is part of the string.
if (/word\b/.test(str) )
This is the best solution I came up with, but I'd like to match only for word. or word:.
I was trying something in the following lines, but can't get it to work:
if (/word\/(.|:)/i.test(str)
How to go about this?
You may leverage a character class [.:] to match either a dot or a colon, and then add a non-word boundary \B to make sure there is a non-word char after the dot/colon, or the end of string:
if (/word[.:]\B/i.test(str))
As an alternative, you may require a whitespace or the end of string after the . or : character:
if (/word[.:](?=\s|$)/i.test(str))
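The two variants differ on a few inputs, which the following sketch tries to show. The constant names and sample strings are made up for illustration.

```javascript
// Variant 1: dot or colon followed by a non-word character or the end of string.
const withNonWordBoundary = /word[.:]\B/i;

// Variant 2: dot or colon followed by whitespace or the end of string.
const beforeSpaceOrEnd = /word[.:](?=\s|$)/i;

const samples = ["a word.", "word: next", "word:next", "word.)"];
for (const s of samples) {
  console.log(JSON.stringify(s), withNonWordBoundary.test(s), beforeSpaceOrEnd.test(s));
}
```

Both reject "word:next"; only the \B variant accepts "word.)", because ) is a non-word character but not whitespace.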

Match character but not when preceded by

I want to replace all line breaks but only if they're not preceded by these two characters {] (both, not one of them) using JavaScript. The following expression seems to do the job but it breaks other regex results so something must be wrong:
/[^\{\]]\n/g
What am I doing wrong?
Do you need to be able to strip out \n, \r\n, or both?
This should do the job:
/(^|^.|[^{].|.[^\]])\r?\n/gm
And would require that you place $1 at the beginning of your replacement string.
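As a rough illustration of that pattern, here is a sketch that replaces each qualifying line break with a single space. The space replacement, the sample text, and the variable names are assumptions made for the example.

```javascript
// The pattern from this answer: capture what precedes the newline unless it is "{]".
const lineBreak = /(^|^.|[^{].|.[^\]])\r?\n/gm;

// Hypothetical input: only the first line ends with "{]".
const source = "first{]\nsecond\nthird]\nfourth";

// $1 puts the captured preceding characters back; the space replaces the newline.
const joined = source.replace(lineBreak, "$1 ");

console.log(JSON.stringify(joined));
```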
To answer your question about why /[^\{\]]\n/ is wrong: this regex equates to "match any character that is neither { nor ], followed by \n", so it incorrectly fails to match the following:
here's a square]\n
see the following{\n
You're also missing the g flag at the end, but you may have noticed that.
When you're using [^\{\]] you're using a negated character class: it stands for "any character which is not { or ]". This means the match will fail on {\n or ]\n.
If you want to negate a pattern longer than one character you need a negative look-ahead:
/^(?!.*{]\n)([^\n]*)\n/mg
^(?! # from the beginning of the line (thanks to the m flag)
.*{]\n # negative lookahead condition: the line doesn't end with {]\n
)
([^\n]*) # select, in capturing group 1, everything up to the line break
\n
And replace it with
$1 + replacement_for_\n
What we do is check line by line that our line doesn't hold the unwanted pattern.
If it doesn't, we select everything up to the ending \n in capturing group 1, and we replace the whole line with that everything, followed by what you want to replace \n with.
Demo: http://regex101.com/r/nM2xE1
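A small sketch of the line-by-line lookahead approach, using the same pattern with the brace and bracket escaped and a space as the (assumed) replacement for the newline. The sample text and names are invented.

```javascript
// Same idea as the pattern above, with { and ] escaped for clarity.
const lineNotEndingInBraces = /^(?!.*\{\]\n)([^\n]*)\n/mg;

// Hypothetical input: only the first line ends with "{]".
const lines = "first{]\nsecond\nthird]\nfourth";

// Keep the line content ($1) and put a space where the newline was.
const merged = lines.replace(lineNotEndingInBraces, "$1 ");

console.log(JSON.stringify(merged));
```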
Lookbehind was not supported in JavaScript when this was written (it was added in ES2018); you can emulate it this way:
stringWhereToReplaceNewlines.replace(/(.{0,2})\n/g, function (_, behind) {
  // behind holds up to two characters before the newline; keep the newline only after "{]"
  return (behind || "") + ((behind === '{]') ? "\n" : "NEWLINE_REPLACE");
});
The callback is called for every "\n", with the (up to) two preceding characters as its second parameter. It must return the string that replaces the "\n" together with those preceding characters. If the two preceding characters are "{]", the newline should not be replaced, so we return exactly the string that was matched; otherwise we return the preceding characters (possibly empty) followed by whatever should replace the newline.
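Wrapped in a function, the callback approach could look roughly like this. The function name replaceNewlines and its replacement parameter are hypothetical, added only so the behaviour can be tried out.

```javascript
// Hypothetical wrapper around the callback technique described above.
function replaceNewlines(text, replacement) {
  return text.replace(/(.{0,2})\n/g, (_, behind) =>
    // Put the preceding characters back, then keep or replace the newline.
    behind + (behind === "{]" ? "\n" : replacement)
  );
}

console.log(JSON.stringify(replaceNewlines("first{]\nsecond\nthird]\nfourth", " ")));
```

Because the replacement is returned from a function, characters such as $ in it are inserted literally rather than being treated as replacement patterns.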
My solution would be this:
([^{].|.[^\]])\n
Your replacement string should be $1<replacement>
Since JavaScript didn't support lookbehind at the time (it was added in ES2018), we have to make do with capturing the two preceding characters and putting them back. This is how the regex works:
Anything but { and then anything - [^{].
Or anything and then anything but ] - .[^\]]
Put simply, [^{].|.[^\]] matches everything that .. matches except for {]
Finally a \n
The two chars before the \n are captured, so you can reinsert them into the replacement string using $1.
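For comparison, here is a sketch of the capture-and-reinsert pattern next to a negative lookbehind, which is not what the answers here use but is available in engines that implement ES2018. The sample text, the space replacement, and the names are assumptions.

```javascript
// Capture-based pattern from this answer, with the g flag added.
const captureBased = /([^{].|.[^\]])\n/g;

// Alternative (assumes ES2018 lookbehind): a newline not preceded by "{]".
const lookbehindBased = /(?<!\{\])\n/g;

const sampleText = "first{]\nsecond\nthird]\nfourth";

const viaCapture = sampleText.replace(captureBased, "$1 ");
const viaLookbehind = sampleText.replace(lookbehindBased, " ");

console.log(JSON.stringify(viaCapture), JSON.stringify(viaLookbehind));

// One difference: the capture-based pattern needs two characters before the newline.
console.log(JSON.stringify("a\nb".replace(captureBased, "$1 ")));
console.log(JSON.stringify("a\nb".replace(lookbehindBased, " ")));
```

On the sample text both give the same result; on very short lines such as "a\nb" only the lookbehind version replaces the newline.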
