Regex to find single word surrounded by square brackets - javascript

I'm stumped on the following regex problem.
I'm trying to find single words surrounded by square brackets without spaces. Like this:
[singleWord]
I don't want to find phrases, like this
[a series of words]
At the moment I'm using this regex:
/\[(.*?)\]/g
It's finding words and phrases. Can anyone suggest how to modify it so that it only finds words in square brackets without spaces?
Thanks!!

Try replacing .*? with \S* in your regex. . will match any character including spaces, whereas \S will match only non-space characters
/\[(\S*)\]/g

I'm trying to find single words surrounded by square brackets without spaces.
You can use this regex:
/\[([^\s\]]*)\]/g
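A quick way to check this pattern is to run it over text containing both kinds of brackets. This is only a sketch: the sample string and the variable names are made up for illustration, and only the regex comes from the answer.

```javascript
// Hypothetical sample text; only the regex is taken from the answer above.
const sample = 'Use [singleWord] but not [a series of words] or [another].';

// [^\s\]]* = any run of characters that are neither whitespace nor "]".
const bracketedWord = /\[([^\s\]]*)\]/g;

// matchAll yields one match object per hit; m[1] is the text inside the brackets.
const singleWords = Array.from(sample.matchAll(bracketedWord), (m) => m[1]);

console.log(singleWords); // ["singleWord", "another"]
```

Because the quantifier is *, an empty pair [] would also match with an empty capture; switching to + avoids that if it matters.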

Your current regex allows anything. You only want letters.
/\[([a-z]+)\]/gi

Related

Match exact word and remove leading space in regular expression

I'm looking for a regular expression.
Requirement:
I need to select a complete word from a string (the word might contain special characters or anything). And I'm pretty close to the solution.
Example:
character-set
Regular expression: (?:^|\s)(cent-er)(?=\s|$)
Result: " character-set" with a leading space.
But I want to remove the leading space from the selected word. The word should match exactly, i.e. if I say character or character- or -set or set, it should not get any result.
Any help is much appreciated. Thanks in advance.
It is not exactly what you seem to describe (as far as I could understand, that is), but maybe what you are looking for are word boundaries: \b. Try the regex (parentheses optional):
(\b)(cent-er)(\b)
Other than that, if you have to have a space before the word, then you will have to match it (and then use capturing groups to extract the word without the space), because JavaScript's regex had no lookbehinds when this was written (lookbehind assertions were only added in ES2018).
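The capturing-group approach can be sketched like this. The helper name findExact and the test strings are assumptions for illustration; the pattern is the one from the question.

```javascript
// Pattern from the question: the leading space is matched but not captured.
const exactWord = /(?:^|\s)(cent-er)(?=\s|$)/;

// Hypothetical helper: returns the word without the leading space, or null.
function findExact(text) {
  const m = exactWord.exec(text);
  return m ? m[1] : null; // m[0] would still include the leading space
}

console.log(findExact('a cent-er b'));   // "cent-er"
console.log(findExact('a cent-er-s b')); // null

// \b alone is looser: "-" is a non-word character, so a boundary exists after "er".
console.log(/\bcent-er\b/.test('a cent-er-s b')); // true
```

The last line shows why word boundaries may not be strict enough when the word itself contains characters like a hyphen.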

Split string on spaces except for in quotes, but include incomplete quotes

I am trying to split a string in JS on spaces except when the space is in a quote. However, an incomplete quote should be maintained. I'm not skilled in regex wizardry, and have been using the below regex:
var list = text.match(/[^\s"]+|"([^"]*)"/g)
However, if I provide input like sdfj "sdfjjk this will become ["sdfj","sdfjjk"] rather than ["sdfj",""sdfjjk"].
You can use
var re = /"([^"]*)"|\S+/g;
By using \S (=[^\s]) we just drop the " from the negated character class.
By placing the "([^"]*)" pattern before \S+, we make sure substrings in quotes are not torn if they come before. This should work if the string contains well-paired quoted substrings and the last is unpaired.
Demo:
var re = /"([^"]*)"|\S+/g;
var str = 'sdfj "sdfjjk';
document.body.innerHTML = JSON.stringify(str.match(re));
Note that to get the captured texts in-between quotes, you will need to use RegExp#exec in a loop (as String#match "drops" submatches).
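A minimal sketch of that exec loop follows. The input string and the tokens array are illustrative assumptions; the regex is the one above.

```javascript
const tokenRe = /"([^"]*)"|\S+/g;
const input = 'abc "def ghi" lmn "opq'; // hypothetical input with an unpaired quote

const tokens = [];
let match;
// With the g flag, exec resumes from lastIndex on every call.
while ((match = tokenRe.exec(input)) !== null) {
  // match[1] is set only when the quoted alternative matched.
  tokens.push(match[1] !== undefined ? match[1] : match[0]);
}

console.log(tokens); // ["abc", "def ghi", "lmn", "\"opq"]
```

Properly quoted substrings come back without their quotes, while the incomplete quote is kept as typed.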
UPDATE
No idea what downvoter thought when downvoting, but let me guess. The quotes are usually used around word characters. If there is a "wild" quote, it is still a quote right before/after a word.
So, we can utilize word boundaries like this:
"\b[^"]*\b"|\S+
See regex demo.
Here, "\b[^"]*\b" matches a " that is followed by a word character, then matches zero or more characters other than " and then is followed with a " that is preceded with a word character.
Moving further in this direction, we can make it as far as:
\B"\b[^"\n]*\b"\B|\S+
With \B" we require that " should be preceded with a non-word character, and "\B should be followed with a non-word character.
See another regex demo
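To see how the stricter pattern behaves, it can be tried on a couple of inputs. The sample strings here are assumptions chosen for illustration, not taken from the linked demos.

```javascript
// Quoted span only when the quotes sit right against word characters
// on the inside and non-word characters (or string edges) on the outside.
const tokenizer = /\B"\b[^"\n]*\b"\B|\S+/g;

const paired = 'say "hello world" now'.match(tokenizer);
const unpaired = 'sdfj "sdfjjk'.match(tokenizer);

console.log(paired);   // ["say", "\"hello world\"", "now"]
console.log(unpaired); // ["sdfj", "\"sdfjjk"]
```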
A lot depends on what specific issue you have with your specific input!
Try the following:
text.match(/".*?"|[^\s]+/g).map(s => s.replace(/^"(.*)"$/, "$1"))
This repeatedly finds either properly quoted substrings (first), OR other sequences of non-whitespace. The map part is to remove the quotes around the quoted substrings.
> text = 'abc "def ghi" lmn "opq'
< ["abc", "def ghi", "lmn", ""opq"]

Why do I have to add double backslash on javascript regex?

When I use a tool like regexpal.com it lets me use regex as I am used to. So, for example, I want to check a text for a match for a word that is at least 3 letters long and ends with a white space, so it will match 'now ', 'noww ' and so on.
On regexpal.com this regex works \w{3,}\s this matches both the words above.
But on javascript I have to add double backslashes before w and s. Like this:
var regexp = new RegExp('\\w{3,}\\s','i');
or else it does not work. I looked around for answers and searched for double backslash javascript regex but all I got was completely different topics about how to escape backslash and so on. Does someone have an explanation for this?
You could write the regex without double backslashes, but you need to put the regex inside forward slashes as delimiters.
/^\w{3,}\s$/.test('foo ')
The anchors ^ (start of the string) and $ (end of the string) make it an exact string match. You don't need the i modifier, since \w matches both upper- and lowercase letters.
Why? Because in a string literal, "\" is the escape character, so "\w" is seen as "w": \w is not a recognized string escape, and the backslash is simply dropped before the regular expression parser ever sees it.
To avoid that, the "\" must itself be escaped, so "\\w" reaches the regular expression parser as "\w".
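The difference is easy to observe by comparing what each form actually hands to the regex engine. This is a sketch; the variable names are made up, and String.raw is shown only as one alternative to doubling the backslashes.

```javascript
const fromString = new RegExp('\\w{3,}\\s', 'i'); // string literal, backslashes doubled
const fromLiteral = /\w{3,}\s/i;                  // regex literal, no doubling needed

// Both end up with the same pattern source.
console.log(fromString.source === fromLiteral.source); // true

// With single backslashes the string parser drops them first.
const naive = '\w{3,}\s';
console.log(naive);                          // w{3,}s
console.log(new RegExp(naive).test('now ')); // false
console.log(fromString.test('now '));        // true

// String.raw keeps backslashes untouched, so no doubling is required.
const fromRaw = new RegExp(String.raw`\w{3,}\s`, 'i');
console.log(fromRaw.source === fromLiteral.source); // true
```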

Can it be done with regex?

Having the following regex: ([a-zA-Z0-9//._-]{3,12}[^//._-]) used like pattern="([a-zA-Z0-9/._-]{3,12}[^/._-])" to validate an HTML text input for a username, I wonder if there is any way of telling it to check that the string has only one of the following: ., -, _
By that I mean, that I'm in need of regex that would accomplish the following (if possible)
alex-how => Valid
alex-how. => Not valid, because finishing in .
alex.how => Valid
alex.how-ha => Not valid, contains already a .
alex-how_da => Not valid, contains already a -
The problem with my current regex is that, for some reason, it accepts any character at the end of the string that is not ., _ or -, and I can't figure out why.
The other problem is that it doesn't check that the string contains only one of the allowed special characters.
Any ideas?
Try this one out:
^(?!(.*[.|_|-].*){2})(?!.*[.|_|-]$)[a-zA-Z0-9//._-]{3,12}$
Regexpal link. The regex above allows at most one of ., _ or -.
What you want is one or more upper-case letters, lower-case letters or digits,
followed by at most one of the characters "-", ".", or "_", followed by at least one more letter or digit:
^[a-zA-Z0-9]+[-_.]?[a-zA-Z0-9]+$
Hope this will work for you:-
It says: start with word characters, followed by any of (-, ., _) or word characters, and end with a word character.
^[\w\d]*[-_\.\w\d]*[\w\d]$
Seems to me you want:
^[A-Za-z0-9]+(?:[\._-][A-Za-z0-9]+)?$
Breaking it down:
^: beginning of line
[A-Za-z0-9]+: one or more alphanumeric characters
(?:[\._-][A-Za-z0-9]+)?: (optional, non-captured) one of your allowed special characters followed by one or more alphanumeric characters
$: end of line
It's unclear from your question if you wanted one of your special characters (., -, and _) to be optional or required (e.g., zero-or-one versus exactly-one). If you actually wanted to require one such special character, you would just get rid of the ? at the very end.
Here's a demonstration of this regular expression on your example inputs:
http://rubular.com/r/SQ4aKTIEF6
As for the length requirement (between 3 and 12 characters): This might be a cop-out, but personally I would argue that it would make more sense to validate this by just checking the length property directly in JavaScript, rather than over-complicating the regular expression.
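A sketch of that split, with the length checked in plain JavaScript and the shape checked by the regex. The function name isValidUsername is an assumption for illustration; the inputs are the examples from the question.

```javascript
// Hypothetical helper: length via .length, structure via the regex above.
function isValidUsername(name) {
  const shape = /^[A-Za-z0-9]+(?:[._-][A-Za-z0-9]+)?$/;
  return name.length >= 3 && name.length <= 12 && shape.test(name);
}

console.log(isValidUsername('alex-how'));    // true
console.log(isValidUsername('alex-how.'));   // false (ends with a separator)
console.log(isValidUsername('alex.how'));    // true
console.log(isValidUsername('alex.how-ha')); // false (two separators)
console.log(isValidUsername('alex-how_da')); // false (two separators)
```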
^(?=[a-zA-Z0-9/._-]{3,12}$)[a-zA-Z0-9]+(?:[/._-][a-zA-Z0-9]+)?$
or, as a JavaScript regex literal:
/^(?=[a-zA-Z0-9\/._-]{3,12}$)[a-zA-Z0-9]+(?:[\/._-][a-zA-Z0-9]+)?$/
The lookahead, (?=[a-zA-Z0-9/._-]{3,12}$), does the overall-length validation.
Then [a-zA-Z0-9]+ ensures that the name starts with at least one non-separator character.
If there is a separator, (?:[/._-][a-zA-Z0-9]+)? ensures that there's at least one non-separator following it.
Note that / has no special meaning in a regex. You only have to escape it if you're using a regex literal (because / is the regex delimiter), and you escape it by prefixing with a backslash, not another forward-slash. And inside a character class, you don't need to escape the dot (.) to make it match a literal dot.
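The lookahead version can be exercised on a few inputs to see the length limit and the separator rule working together. The variable name and the test strings are assumptions for illustration; the $ inside the lookahead is what enforces the upper bound of 12.

```javascript
// Regex literal form: / is escaped because it is the delimiter.
const usernamePattern = /^(?=[a-zA-Z0-9\/._-]{3,12}$)[a-zA-Z0-9]+(?:[\/._-][a-zA-Z0-9]+)?$/;

console.log(usernamePattern.test('alex-how'));      // true
console.log(usernamePattern.test('alex/how'));      // true  (/ is an allowed separator here)
console.log(usernamePattern.test('a-b'));           // true  (exactly 3 characters)
console.log(usernamePattern.test('ab'));            // false (too short)
console.log(usernamePattern.test('abcdefghijklm')); // false (13 characters)
console.log(usernamePattern.test('alex.how-ha'));   // false (two separators)
```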
The dot in regex has a special meaning: "any character here".
If you mean a literal dot, you should escape it to tell the regex parser so.
Escape dot in a regex range

simple regex to matching multiple word with spaces/multiple space or no spaces

I am trying to match all words separated by single or multiple spaces. My expression
(\w+\s*)* is not working.
Edit 1:
Let's say I have a sentence in this form:
[[do "hi i am bob"]]
[[do "hi i am Bob"]]
now I have to replace this with
cool("hi i am bob") or
cool("hi i am Bob")
I do not care about replacing multiple spaces with a single one.
I can achieve this for a single word like
\[\[do\"(\w+)\"\]\] and the replacement cool\(\"$1\"), but this does not look like an effective solution and does not match multiple words.
I apologize for the incomplete question.
Any help will be appreciated.
Find this Regular Expression:
/\[\[do\s+("[\w\s]+")\s*\]\]/
And do the following replacement:
'cool($1)'
The only special thing that's being done here is using character classes to our advantage with
[\w\s]+
Matches one or more word or space characters (a-z, A-Z, 0-9, _, and whitespace). That'll eat up your internal stuff no problem.
'[[do "hi i am Bob"]]'.replace(/\[\[do\s+("[\w\s]+")\s*\]\]/, 'cool($1)')
Spits out
cool("hi i am Bob")
Though - if you want to add punctuation (which you probably will), you should do it like this:
/\[\[do\s+("[^"]+")\s*\]\]/
Which will match any character that's not a double quote, preserving your substring. There are more complicated ones to allow you to deal with escaped quotation marks, but I think that's outside the scope of this question.
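As a sketch of the punctuation-friendly version in use: the input string is made up, and the g flag is an addition here (an assumption that every occurrence should be replaced, not just the first).

```javascript
// Hypothetical input with punctuation inside the quotes.
const source = '[[do "hi, i am Bob!"]] and [[do "hi i am bob"]]';

// [^"]+ accepts anything except a double quote; g replaces every occurrence.
const converted = source.replace(/\[\[do\s+("[^"]+")\s*\]\]/g, 'cool($1)');

console.log(converted); // cool("hi, i am Bob!") and cool("hi i am bob")
```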
To match "all words with single or multiple spaces", you cannot use \s*, as it will match even no spaces.
On the other hand, it looks like you want to match even "hi", which is one word with no spaces.
You probably want to match one or more words separated by spaces. If so, use regex pattern
(\w+(?:$|\s+))+
or
\w+(\s+\w+)*
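A small sketch of the second pattern in action. The inputs are illustrative assumptions, and the group is written as non-capturing, (?:...), which is a minor variation that does not change what is matched.

```javascript
// One or more words, each separated from the next by at least one whitespace character.
const wordsWithSpaces = /\w+(?:\s+\w+)*/g;

const spaced = 'hi  i am   bob'.match(wordsWithSpaces);
const single = 'hi'.match(wordsWithSpaces);
const punctuated = 'hi, bob'.match(wordsWithSpaces);

console.log(spaced);     // ["hi  i am   bob"]
console.log(single);     // ["hi"]
console.log(punctuated); // ["hi", "bob"]
```

The last case shows that punctuation ends a run, since a comma is neither a word character nor whitespace.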
I'm not sure, but maybe this is what you're trying to get:
"Hi I am bob".match(/\b\w+\b/g); // ["Hi", "I", "am", "bob"]
Use regex pattern \w+(\s+\w+)* as follows:
m = s.match(/\w+(\s+\w+)*/g);
Simple. Match all groups of characters that are not white spaces
var str = "Hi I am Bob";
var matches = str.match(/[^ ]+/g); // => ["Hi", "I", "am", "Bob"]
What your regex is doing is:
/([a-zA-Z0-9_]{1,}[ \r\v\n\t\f]{0,}){0,}/
That is, find the first match of one or more of A through Z, both lower and upper case, along with digits and underscore, followed by zero or more whitespace characters, which are:
A space character
A carriage return character
A vertical tab character
A new line character
A tab character
A form feed character
That whole group (word characters followed by optional whitespace) is then repeated zero or more times.
\s matches more than just simple spaces, you can put in a literal space, and it will work.
I believe you want:
/(\w+ +\w+)/g
Which finds all matches of one or more of A through Z, both lower and upper case, along with digits and underscore, followed by one or more spaces, then followed by one or more of A through Z, both lower and upper case, along with digits and underscore.
This will match pairs of words separated by spaces.
If you just want to find all clusters of word characters, without punctuation or spaces, then, you would use:
/(\w+)/g
Which will find all word-characters that are grouped together.
var regex=/\w+\s+/g;
Live demo: http://jsfiddle.net/GngWn/
[Update] I was just answering the question, but based on the comments this is more likely what you're looking for:
var regex=/\b\w+\b/g;
\b are word boundaries.
Demo: http://jsfiddle.net/GngWn/2/
[Update2] Your edit makes it a completely different question:
string.replace(/\[\[do "([\s\S]+)"\]\]/,'cool("$1")');
Demo: http://jsfiddle.net/GngWn/3/
