Match parts of code - javascript

I'm trying to match parts of code with regex. How can I match var, a, =, 2 and ; from
"var a = 2;"
?

I believe you want this regexp: /\S+/g
To break it down: \S matches any non-whitespace character, + makes it match a run of one or more of them together (e.g. 'var'),
and the 'g' flag makes it find every occurrence in the string instead of stopping at the first one, which is the default behavior.
This is a helpful link for playing around until you find the right regexp: https://regex101.com/#javascript

var str = "var a = 2;";
// clean the duplicate whitespaces
var no_duplicate_whitespace = str.replace(new RegExp("\\s+", "g"), " ");
// and split by space
var tokens = no_duplicate_whitespace.split(" ");
Or as @kuujinbo pointed out:
str.split(/\s+/);
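Note that both /\S+/g and split(/\s+/) keep "2;" together as one token, because there is no whitespace between the 2 and the semicolon. A minimal sketch of one way to separate them, assuming tokens are either runs of word characters or single punctuation characters (the tokenize helper is a made-up name for illustration):

```javascript
// Hypothetical helper: split a line of code into word-like tokens
// and single punctuation characters.
function tokenize(source) {
  // \w+      : identifiers, keywords, numbers
  // [^\s\w]  : exactly one character that is neither whitespace nor a word character
  return source.match(/\w+|[^\s\w]/g) || [];
}

console.log(tokenize("var a = 2;"));      // [ 'var', 'a', '=', '2', ';' ]
console.log("var a = 2;".match(/\S+/g)); // [ 'var', 'a', '=', '2;' ]
```

This is only a sketch: multi-character operators such as == or string literals would need extra alternatives in the pattern.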

Related

Replace multiple characters by one character with regex

I have this string :
var str = '#this #is____#a test###__'
I want to replace every run of the characters (#, _) with a single (#), so the expected output is:
'#this #is#a test#'
Note:
I don't know how many (#) or (_) characters occur in a row in the string.
What I tried:
var str = '#this #is__ __#a test###__'
str = str.replace(/[#_]/g,'#')
alert(str)
But the output was:
#this #is## ###a test#####
I tried using (*) to match a sequence, but it did not work:
var str = '#this #is__ __#a test###__'
str = str.replace(/[#_]*/g,'#')
alert(str)
So how can I get my expected output?
A well written RegEx can handle your problem rather easily.
Quoting Mohit's answer as a starting point:
var str = '#this #is__ __#a test###__';
var formattedStr = str.replace(/[#_,]+/g, '#');
console.log( formattedStr );
Line 2:
Stores in formattedStr the result of calling the replace method on str.
How does replace work? The first parameter is a string or a RegEx.
Note: RegExps in JavaScript are objects of type RegExp, not strings. So writing
/yourRegex/
or
new RegExp('yourRegex')
is equivalent syntax (the constructor takes the pattern without the surrounding slashes).
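A small check of the literal and constructor forms, as a sketch (the variable names are only for illustration). The pattern string passed to the constructor carries no slashes, and flags go in the second argument:

```javascript
// The same pattern written as a literal and through the constructor.
const literal = /[#_]+/g;
const constructed = new RegExp("[#_]+", "g");

console.log(literal.source === constructed.source); // true
console.log(literal.flags === constructed.flags);   // true

// Slashes inside the string become part of the pattern itself.
const withSlashes = new RegExp("/[#_]+/");
console.log(withSlashes.test("##"));   // false
console.log(withSlashes.test("/##/")); // true
```

When the pattern contains backslashes, they have to be doubled inside the string form, as in new RegExp("\\s+", "g").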
Now let's discuss this particular RegEx itself.
The leading and trailing slashes are used to surround the pattern, and the g at the end means "globally" - do not stop after the first match.
The square brackets describe a set of characters, any of which can match, while the + sign means "1 or more of this set".
Basically, ### will match, but so will # or #####_#, because _ and # belong to the same set.
A preferred behavior would be given by using (#|_)\1*
This means "# or _, then, once one is found, keep looking forward for more of that same character". (A plain (#|_)+ would not do this: it matches exactly the same strings as [#_]+.)
So ___ would match, as would ####, but __## would be 2 distinct matches (the former being __, the latter ##).
Another problem is not knowing whether to replace the pattern found with a _ or a #.
Luckily, using parentheses allows us to use a thing called capturing groups. You can basically store any pattern you found in temporary variables that can be used in the replacement pattern.
Invoking them is easy: prepend $ to the position of the matched (pattern).
/(foo)textnotgetting(bar)captured(baz)/ for example would fill the capturing groups "variables" this way:
$1 = foo
$2 = bar
$3 = baz
In our case, we want to replace a run of 1+ identical characters with a single occurrence, and the \1* is not included in the parentheses, so the group holds just one character!
So we can simply write (with a regex literal, not a quoted string)
str.replace(/(#|_)\1*/g, "$1");
in order to make it work.
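To see how a character class and a capturing group behave on the sample string, here is a rough comparison. The variable names are made up for illustration, and the third pattern uses a backreference (\1*) so that only runs of the same character are merged:

```javascript
const sample = '#this #is__ __#a test###__';

// Character class: any mix of # and _ collapses to a single '#'.
const toHash = sample.replace(/[#_]+/g, '#');
console.log(toHash); // #this #is# #a test#

// Repeated group: matches the same runs as the class; $1 holds the LAST
// character the group captured in each run.
const lastOfRun = sample.replace(/(#|_)+/g, '$1');
console.log(lastOfRun); // #this #is_ #a test_

// Group plus backreference: only runs of one identical character are merged.
const perCharacter = sample.replace(/(#|_)\1*/g, '$1');
console.log(perCharacter); // #this #is_ _#a test#_
```

Only the first variant produces the output the question asks for; the other two keep one of the original characters per run.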
Have a nice day!
Your regex replaces each single matched character with the character you specified, i.e. #. You need to add the + quantifier to tell it that any run of consecutive matching characters (_, #) should be replaced as a whole instead of each character individually. The + quantifier means that 1 or more occurrences of the preceding pattern are matched in one go. You can read more about quantifiers on this page:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp
var str = '#this #is__ __#a test###__';
var formattedStr = str.replace(/[#_,]+/g, '#');
console.log( formattedStr );
You should use the + to match one-or-more occurrences of the previous group.
var str = '#this #is__ __#a test###__'
str = str.replace(/[#_]+/g,'#')
alert(str)

Regexp replace all exact matching words/characters in string

I need to replace all matching words or special characters in a string, but can't figure out a way to do so.
For example, I have a string: "This - is a great victory"
I need to replace all - with + signs, or great with unpleasant: the user selects a word to be replaced and gives a replacement for it.
"\\b"+originalTex+"\\b"
was working fine until I realised that \b only works with word characters.
So the question is: what replacement for \b would let me replace any matching word that is enclosed by whitespace?
EDIT: I cannot remove the word boundaries, as that would result in inexact matches. For example, in "you are creator of your world", changing "you" would also change "your", as it contains "you".
You need to use the following code:
var s = "you are creator of your world";
var search = "you";
var torepl = "we";
var rx = new RegExp("(^|\\s)" + search.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&') + "(?!\\S)", "gi");
var res = s.replace(rx, "$1" + torepl);
console.log(res);
The (^|\\s) will match and capture into Group 1 start of string or a whitespace. The search.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&') will escape special chars (if any) inside the search word. The (?!\\S) lookahead will require a whitespace or end of string right after the search word.
The $1 backreference inserts the contents of Group 1 back into the string during replacement (no need to use any lookbehinds here).
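The same idea can be wrapped in a reusable function. This is a sketch under a few assumptions: escapeRegExp and replaceWholeWord are hypothetical names, matching is case-sensitive (the snippet above uses the i flag), and a replacer function is used instead of a "$1" string so that a replacement containing $ is inserted literally.

```javascript
// Hypothetical helper: escape characters that are special inside a regex.
function escapeRegExp(text) {
  return text.replace(/[-\/\\^$*+?.()|[\]{}]/g, "\\$&");
}

// Hypothetical helper: replace `search` only where it is delimited by
// whitespace or by the start/end of the string.
function replaceWholeWord(text, search, replacement) {
  const rx = new RegExp("(^|\\s)" + escapeRegExp(search) + "(?!\\S)", "g");
  // `lead` is group 1: the start of the string (empty) or the whitespace before the word.
  return text.replace(rx, (match, lead) => lead + replacement);
}

console.log(replaceWholeWord("This - is a great victory", "-", "+"));
// This + is a great victory
console.log(replaceWholeWord("you are creator of your world", "you", "we"));
// we are creator of your world
console.log(replaceWholeWord("costs $5 or $50", "$5", "$$"));
// costs $$ or $50
```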
How about two replaces
var txt = "This - is a great, great - and great victory"
var originalTex1 = "great",originalTex2 = "-",
re1 = new RegExp("\\b"+originalTex1+"\\b","g"),
re2 = new RegExp("\\s"+originalTex2+"\\s","g")
console.log(txt.replace(re1,"super").replace(re2," + "))

Reduce multiple occurrences of any non-alphanumeric characters in string down to one, **NOT REMOVE**

I've been searching for a solution but almost every one that I've come across with is about replacing a matching pattern with a previously known character.
For example:
var str = 'HMQ 2.. Elizabeth';
How do we catch multiple occurrences of those dots in the string and replace them with only one? It's also not specific to dots: it should work for any non-alphanumeric character, and we don't know in advance which ones will appear. Thank you.
Use a backreference. \1 in the regex refers to the first match group in the expression.
var str = 'HMQ 2.. Elizabetttth .';
var regex = /([^A-Za-z0-9])\1+/g;
var trimmed = str.replace(regex, "$1");
console.log( trimmed );
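A few more inputs make the behavior of the backreference clearer. This is only a sketch, with collapseRepeats as a made-up helper name: a run is collapsed only when the same non-alphanumeric character repeats, so mixed punctuation and repeated letters are left alone.

```javascript
// Hypothetical helper: collapse runs of one repeated non-alphanumeric character.
function collapseRepeats(text) {
  // ([^A-Za-z0-9]) captures one such character; \1+ requires at least one more copy of it.
  return text.replace(/([^A-Za-z0-9])\1+/g, "$1");
}

console.log(collapseRepeats("HMQ 2.. Elizabeth")); // HMQ 2. Elizabeth
console.log(collapseRepeats("a---b!!!c   d"));     // a-b!c d
console.log(collapseRepeats("wait.,.,"));          // wait.,.,
```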

Regex to not match when not in quotes

I'm looking to create a JS Regex that matches double spaces
([-!$%^&*()_+|~=`{}\[\]:";'<>?,.\w\/]\s\s[^\s])
The RegEx should match double spaces (not including the start or end of a line, when wrapped within quotes).
Any help on this would be greatly appreciated.
For example:
var x = 1,
Y = 2;
Would be fine, whereas
var x =  1;
would not (more than one space after the = sign).
Also, if it was
console.log("I am some console output");
it would be fine, as it is within double quotes.
This problem is a classic case of the technique explained in this question to "regex-match a pattern, excluding..."
We can solve it with a beautifully-simple regex:
(["']) \1|([ ]{2})
The left side of the alternation | matches complete ' ' and " ". We will ignore these matches. The right side matches and captures double spaces to Group 2, and we know they are the right ones because they were not matched by the expression on the left.
This program shows how to use the regex in JavaScript, where we will retrieve the Group 2 captures:
var the_captures = [];
var string = 'your_test_string';
var myregex = /(["']) \1|([ ]{2})/g;
var thematch = myregex.exec(string);
while (thematch != null) {
    // Group 2 is only set when the right side of the alternation matched
    if (thematch[2] !== undefined) {
        // add it to array of captures
        the_captures.push(thematch[2]);
        document.write(thematch[2], "<br />");
    }
    // match the next one
    thematch = myregex.exec(string);
}
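The alternation trick can also be applied to whole quoted strings. The sketch below is a variation, not the regex from this answer: the left branch is assumed to consume any complete single- or double-quoted string (with backslash escapes), findDoubleSpaces is a hypothetical helper name, and runs at the very start or end of the line are skipped as one reading of the question's requirement.

```javascript
// Left branch: a complete quoted string (ignored).
// Right branch: two or more spaces, captured in group 2.
const doubleSpacePattern = /(["'])(?:\\.|(?!\1)[^\\])*\1|( {2,})/g;

// Hypothetical helper: return the start index of every run of 2+ spaces
// that is outside quotes and not leading/trailing whitespace.
function findDoubleSpaces(line) {
  const hits = [];
  for (const m of line.matchAll(doubleSpacePattern)) {
    if (m[2] === undefined) continue;                    // a quoted string matched
    const atStart = m.index === 0;                       // indentation
    const atEnd = m.index + m[0].length === line.length; // trailing spaces
    if (!atStart && !atEnd) hits.push(m.index);
  }
  return hits;
}

console.log(findDoubleSpaces('var x =  1;'));                       // [ 7 ]
console.log(findDoubleSpaces('console.log("I am  some output");')); // []
console.log(findDoubleSpaces('    Y = 2;'));                        // []
```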
A Neat Variation for Perl and PCRE
In the original answer, I hadn't noticed that this was a JavaScript question (the tag was added later), so I had given this solution:
(["']) \1(*SKIP)(*FAIL)|[ ]{2}
Here, thanks to (*SKIP)(*FAIL) magic, we can directly match the spaces, without capture groups.
See demo.
Reference
How to match (or replace) a pattern except in situations s1, s2, s3...
Article about matching a pattern unless...
Simple solution:
/\s{2,}/
This matches a run of two or more consecutive whitespace characters. If you need to match the entire line, but only if it contains two or more consecutive whitespace characters:
/^.*\s{2,}.*$/
If the whitespaces don't need to be consecutive:
/^(.*\s.*){2,}$/

regex and javascript, some matches disappear !

Here is the code :
> var reg = new RegExp(" hel.lo ", 'g');
>
> var str = " helalo helblo helclo heldlo ";
>
> var mat = str.match(reg);
>
> alert(mat);
It alerts "helalo, helclo", but I expect it to be "helalo, helblo, helclo, heldlo".
Only half of them match; I guess that's because each space counts only once. So I tried doubling every space before processing, but in some cases that's not enough.
I'm looking for an explanation and a solution.
Thanks
"␣helalo␣helblo␣helclo␣heldlo␣"
// 11111111------22222222-------
When ␣helalo␣ was matched, the remaining string is helblo␣... without the leading space, because the match consumed it. But the regex requires a leading space, so it skips to ␣helclo␣.
To avoid the expression eating up the space, use a lookahead.
var reg = / hel.lo(?= )/g
(Or use \b as a word boundary.)
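A quick side-by-side of the three patterns on the string from the question, as a sketch (the variable names are only for illustration):

```javascript
const input = " helalo helblo helclo heldlo ";

// The trailing space is consumed, so every second word is skipped.
const consuming = input.match(/ hel.lo /g);
console.log(consuming); // [ ' helalo ', ' helclo ' ]

// The lookahead checks for the space without consuming it.
const withLookahead = input.match(/ hel.lo(?= )/g);
console.log(withLookahead); // [ ' helalo', ' helblo', ' helclo', ' heldlo' ]

// Word boundaries are zero-width, so nothing around the word is consumed.
const withBoundary = input.match(/\bhel.lo\b/g);
console.log(withBoundary); // [ 'helalo', 'helblo', 'helclo', 'heldlo' ]
```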
It matches the regex, advances to the next character after the matched string and goes on.
You can use \b to match word boundaries. You can add the whitespaces later, if you want to.
\bhel.lo\b
