Problem with newline in JavaScript regexp

I tried to strip "SPAM" from the string below using this regex:
alert("{SPAM\nSPAM} _1_ {SPAM} _2_".replace(/{[\s\S]+}/gm, ""));
I expected to see "_1_ _2_" (or something like that), but I got just "_2_". Why?

[\s\S] matches any character, including } and {, and + is greedy, so the match runs from the first { to the last } in the string. Avoid that by excluding } from the character class:
/{[^}]+}/g
so that each match stops at the first } it finds.
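
To make the difference concrete, here is a minimal sketch comparing the greedy pattern from the question with the negated class from this answer. The lazy variant ([\s\S]+?) is an addition for comparison, not part of the original answer.

```javascript
const input = "{SPAM\nSPAM} _1_ {SPAM} _2_";

// Greedy [\s\S]+ runs from the first "{" to the last "}" in the string.
const greedy = input.replace(/{[\s\S]+}/g, "");

// A negated class cannot cross a "}", so each block is matched on its own.
// [^}] also matches newlines, so the first block is still removed whole.
const negated = input.replace(/{[^}]+}/g, "");

// A lazy quantifier stops at the first "}" as well and gives the same result here.
const lazy = input.replace(/{[\s\S]+?}/g, "");

console.log(JSON.stringify(greedy));  // " _2_"
console.log(JSON.stringify(negated)); // " _1_  _2_"
console.log(JSON.stringify(lazy));    // " _1_  _2_"
```

The m flag from the question has no effect on either pattern, since it only changes how ^ and $ behave.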

Related

Or Condition in a regular expression

I have a string in which I need to get the value between either "[ValueToBeFetched]" or "{ValueToBeFetched}".
var test = "I am \"{now}\" doing \"[well]\"";
test.match(/"\[.*?]\"/g)
The above regex serves the purpose and gets the value between square brackets, and I can use the same approach for curly brackets:
test.match(/"\{.*?}\"/g)
Is there a way to do this with a single regex, using something like an "or" operator ({|[)?
I tried some variations, but they don't seem to work.
Thanks in advance.

You could try the following regex:
(?:{|\[).*?(?:}|\])
Details:
(?:{|\[): Non-capturing group, matches { or [
.*?: matches as few characters as possible
(?:}|\]): Non-capturing group, matches } or ]
Code in JavaScript:
var test = "I am \"{now}\" doing \"[well]\"";
var result = test.match(/"(?:{|\[).*?(?:}|\])"/g);
console.log(result);
Result (the double quotes are part of each match):
['"{now}"', '"[well]"']
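
If only the inner value is wanted, one possible variation is to use character classes instead of alternation and add a capturing group. This is a sketch, not the answerer's code; it assumes String.prototype.matchAll (ES2020) is available, and like the original pattern it does not check that the opening and closing bracket types agree.

```javascript
const test = 'I am "{now}" doing "[well]"';

// [{[] matches either opener, [}\]] matches either closer;
// the parenthesized group captures the value in between.
const pattern = /"[{[]([^"]*?)[}\]]"/g;

// matchAll yields one match object per hit; index 1 is the captured group.
const values = Array.from(test.matchAll(pattern), (m) => m[1]);

console.log(values); // [ 'now', 'well' ]
```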

As you said, there is an "or" operator: |.
For example, to capture every line that begins with an "a" or a "b":
/^((a|b).*)/gm
Here, if a line begins with a or b, the entire line is captured in the first group.
You can test your regex with an online regex tester.
For your specific case, try something like this, and use an online regex tester to see how it works:
((\[|\{)\w*(\]|\}))
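
As a small illustration of the first pattern, the sketch below runs it against a made-up three-line string (the sample text is an assumption, chosen only to show which lines match).

```javascript
// Hypothetical sample text: two lines start with "a" or "b", one does not.
const text = "apple pie\ncherry tart\nbanana bread";

// With the g flag, match() returns the full matches; the m flag lets ^
// match at the start of every line rather than only at the start of the string.
const lines = text.match(/^((a|b).*)/gm);

console.log(lines); // [ 'apple pie', 'banana bread' ]
```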

Regex for finding a specific pattern in a string

I want to write a regex to test whether a string contains ]][[, for example [[t]][[t]].
I am able to find a string which contains ]] using the pattern:
RegExp(/]]/)
But if I try to use the pattern:
RegExp(/]\]\(?=\[\[)/)
to test [[t]][[t]], the console displays the following error:
Uncaught SyntaxError: Invalid regular expression: /]\]\(?=\[\[)/: Unmatched ')'

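Why you get a syntax error:
You are escaping ( with \(, which makes it a literal parenthesis instead of the start of the lookahead group. The closing ) in \(?=\[\[) then has no opening ( to pair with, hence the Unmatched ')' error.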
How to fix it:
The best way to do this depends on exactly what you want.
If you just want to check that the string contains ]][[, don't use a regex:
if (yourString.indexOf(']][[') !== -1) {
    // do something
}
If you really want to use a regex, you need to escape [s but not ]s:
if (/]]\[\[/.test(yourString)) {
    // do something
}
If you really want to use a regex and not capture the [[:
if (/]](?=\[\[)/.test(yourString)) {
    // do something
}
If you want to check for matching [[ and ]] (like [[t]]):
if (/\[\[[^[\]]*]]/.test(yourString)) {
    // do something
}
If you want to check for two [[..]] strings back-to-back:
if (/(\[\[[^[\]]*]]){2}/.test(yourString)) {
    // do something
}
If you want to check for one [[..]] string, followed by exactly the same string ([[t]][[t]] but not [[foo]][[bar]]):
if (/(\[\[[^[\]]*]])\1/.test(yourString)) {
    // do something
}
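
The checks above can be run side by side with a sketch like the following. The names in checks, the run helper, and the sample strings are all hypothetical, introduced here only for illustration; includes() is used as a modern equivalent of the indexOf test.

```javascript
// Each entry wraps one of the checks listed above.
const checks = {
  contains: (s) => s.includes("]][["),
  containsRegex: (s) => /]]\[\[/.test(s),
  lookahead: (s) => /]](?=\[\[)/.test(s),
  onePair: (s) => /\[\[[^[\]]*]]/.test(s),
  twoPairs: (s) => /(\[\[[^[\]]*]]){2}/.test(s),
  repeatedPair: (s) => /(\[\[[^[\]]*]])\1/.test(s),
};

// Returns the names of all checks that pass for the given string.
function run(s) {
  return Object.keys(checks).filter((name) => checks[name](s));
}

for (const s of ["[[t]][[t]]", "[[foo]][[bar]]", "[[t]]", "plain"]) {
  console.log(s, "->", run(s).join(", ") || "(none)");
}
```

With these samples, [[t]][[t]] passes every check, [[foo]][[bar]] passes all but repeatedPair, [[t]] passes only onePair, and plain passes none.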
Here's a demo of all of the above, along with a bunch of unit tests to demonstrate how each of these works.

Use this:
if (/]]\[\[/.test(yourString)) {
    // It matches!
} else {
    // Nah, no match...
}
Note that we need to escape the two opening brackets with \[, but the closing brackets ] do not need escaping, as there is no potential confusion with a character class.

You need to escape the opening square brackets because [ has a special meaning in regex: it starts a character class.
]](?=\[\[)

Try this one; this regex should match everything you need:
(\w*[\[\]]+\w*)+
But if every opening bracket [ must have a matching closing bracket ], that's a different problem.
What exactly do you want?

Replace all besides the Regex group?

I was given a task that would take a long time to do by hand.
I have a block of code repeated about 100 times, and I need to extract only one value from each block.
How can I capture the value?
I managed it with this regex:
DbCommand.*?\("(.*?)"\);
It does match, and after replacing with $1 I do get the pure value, but the rest of the unmatched text is still there, and I need only the pure values.
In other words, how can I get a purified result like:
Eservices_Claims_Get_Pending_Claims_List
Eservices_Claims_Get_Pending_Claims_Step1
Here is my code at Online regexer
Is there any way of replacing "everything except the matched group"?
P.S. I know there are other ways of doing it, but I prefer a regex solution (which will also help me understand regex better).

At the time of this answer, JavaScript did not support lookbehind (it was added in ES2018). With lookbehind, you could change your regular expression to match .*? preceded (lookbehind) by DbCommand.*?\(" and followed (lookahead) by "\);.
Without it, I believe the cleanest solution is to perform two matches:
// you probably want to build the regexps dynamically
var regexG = /DbCommand.*?\("(.*?)"\);/g;
var regex = /DbCommand.*?\("(.*?)"\);/;
var matches = str.match(regexG).map(function(item){ return item.match(regex)[1] });
console.log(matches);
// ["Eservices_Claims_Get_Pending_Claims_List", "Eservices_Claims_Get_Pending_Claims_Step1"]
DEMO: http://jsbin.com/aqaBobOP/2/edit
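
In newer engines the two passes can be folded into one. The sketch below assumes String.prototype.matchAll (ES2020) is available. The sample input is hypothetical: the asker's source was only shown as an image, so the surrounding code (cmd1, db.GetStoredProcCommand) is made up to fit the regex.

```javascript
// Hypothetical input shaped to fit the asker's regex.
const str = [
  'DbCommand cmd1 = db.GetStoredProcCommand("Eservices_Claims_Get_Pending_Claims_List");',
  'DbCommand cmd2 = db.GetStoredProcCommand("Eservices_Claims_Get_Pending_Claims_Step1");',
].join("\n");

// matchAll keeps the capture groups for every global match,
// so index 1 of each match object is the quoted value.
const names = Array.from(
  str.matchAll(/DbCommand.*?\("(.*?)"\);/g),
  (m) => m[1]
);

console.log(names.join("\n"));
```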

You should be able to do a global replace of:
public static DataTable.*?{.*?DbCommand.*?\("(.*?)"\);.*?}
All I've done is change it to match the whole block, including the function definition, using a bunch of .*?s.
Note: Make sure your regex settings are such that the dot (.) matches all characters, including newlines (in JavaScript, that is the s flag, available since ES2018; otherwise use [\s\S] in place of the dot).
In fact, if you want to close up all whitespace, you can put a \s* on the front and replace with $1\n:
\s*public static DataTable.*?{.*?DbCommand.*?\("(.*?)"\);.*?}
Using your test case: http://regexr.com?37ibi

You can use this (without the ignore case and multiline option, with a global search):
pattern: (?:[^D]+|\BD|D(?!bCommand ))+|DbCommand [^"]+"([^"]+)
replace: $1\n

Try replacing across the whole document using this expression (with the global and multiline options, replacing with an empty string):
^(?: |\t)*(?:(?!DbCommand).)*$
You will then be left with only the lines that contain the string DbCommand.
You can then remove the blank lines in between by globally replacing:
\r?\n\s* with \n.
Here is an example of this working: http://regexr.com?37ic4

Remove line breaks from start and end of string

I noticed that trim() does not remove new line characters from the start and end of a string, so I am trying to accomplish this with the following regex:
return str.replace(/^\s\n+|\s\n+$/g,'');
This does not remove the new lines, and I fear I am out of my depth here.
EDIT
The string is being generated with ejs like so
go = ejs.render(data, {
    locals: {
        format() {
            //
        }
    }
});
And this is what go is, but with a few empty lines before. When I use go.trim() I still get the new lines before.
<?xml version="1.0"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
<fo:layout-master-set>
<fo:simple-page-master master-name="Out" page-width="8.5in" page-height="11in" margin-top="1in" margin-bottom="0.5in" margin-left="0.75in" margin-right="0.75in">
<fo:region-body margin-top="1in" margin-bottom="0.25in"/>
<fo:region-before extent="1in"/>
<fo:region-after extent="0.25in"/>
<fo:region-start extent="0in"/>
<fo:region-end extent="0in"/>
</fo:simple-page-master>
</fo:layout-master-set>
<fo:page-sequence master-reference="Out" initial-page-number="1" force-page-count="no-force">
<fo:static-content flow-name="xsl-region-before">
<fo:block font-size="14pt" text-align="center">ONLINE APPLICATION FOR SUMMARY ADVICE</fo:block>
<fo:block font-size="13pt" font-weight="bold" text-align="center">Re:
SDF, SDF
</fo:block>
</fo:static-content>
<fo:flow flow-name="xsl-region-body" font="10pt Helvetica">
.. removed this content
</fo:flow>
</fo:page-sequence>
</fo:root>

Try this:
str = str.replace(/^\s+|\s+$/g, '');

String.prototype.trim() does in fact remove newlines (and all other whitespace). Maybe it didn't use to? It definitely does at the time of writing. From the linked documentation (emphasis added):
The trim() method removes whitespace from both ends of a string. Whitespace in this context is all the whitespace characters (space, tab, no-break space, etc.) and all the line terminator characters (LF, CR, etc.).
If you want to trim all newlines plus other potential whitespace, you can use the following:
return str.trim();
If you want to only trim newlines, you can use a solution that targets newlines specifically.
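
As one possible way to target newlines specifically, the sketch below strips only leading and trailing \r and \n and leaves other whitespace alone. The sample string is made up for illustration.

```javascript
// Hypothetical input: blank lines and spaces around a two-line document.
const raw = '\r\n\n  <?xml version="1.0"?>\n<root/>  \n\n';

// Remove only line terminators at the very start and very end.
// Without the m flag, ^ and $ anchor to the whole string, not to each line.
const noEdgeNewlines = raw.replace(/^[\r\n]+|[\r\n]+$/g, "");

// trim() removes all leading and trailing whitespace, newlines included.
const trimmed = raw.trim();

console.log(JSON.stringify(noEdgeNewlines)); // "  <?xml version=\"1.0\"?>\n<root/>  "
console.log(JSON.stringify(trimmed));        // "<?xml version=\"1.0\"?>\n<root/>"
```

In both results the newline inside the string is kept; only the edges are affected.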

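/^\s+|\s+$/g should catch any leading or trailing whitespace. Your current regex requires exactly one whitespace character followed by one or more \n, so a lone newline is not matched, and line breaks that contain \r characters may be only partly removed.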

Try this (note that it removes every \n in the string, not only those at the start and end):
str.split('\n').join('');

Skipping over tags and spaces in regex html

I'm using this regex to find a string that starts with !?, ends with ?!, and has another variable in between (in this example "a891d050"). This is what I use:
var pattern = new RegExp(/!\\?.*\s*(a891d050){1}.*\s*\\?!/);
It matches correctly against this one:
!?v8qbQ5LZDnFLsny7VmVe09HJFL1/WfGD2A:::a891d050?!
But it fails when the string is broken up with HTML tags.
<span class="userContent"><span>!?v8qbQ5LZDnFLsny7VmVe09HJFL1/</span><wbr /><span class="word_break"></span>WfGD2A:::a891d050?!</span></div></div></div></div>
I tried adding \s and {space}*, but it still fails.
The question is: which (special?) characters do I need to account for if I want to ignore whitespace and HTML tags in my match?
Edit: this is how I use the regex:
var pattern = /!\?[\s\S]*a891d050[\s\S]*\?!/;
document.body.innerHTML = document.body.innerHTML.replace(pattern,"new content");
It appears to me that when it encounters the 'plain' string, it replaces it correctly. But when faced with a string that has classes around and inside it, it makes a mess of the classes or doesn't replace at all, depending on the context. So I decided to try jquery-replacetext-plugin (as it promises to leave tags as they were) like this:
$("body *").replaceText( pattern, "new content" );
But with no success, the results are the same as before.

Maybe this:
var pattern = /!\?[\s\S]*a891d050[\s\S]*\?!/;
[\s\S] should match any character. I have also removed {1}.
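
A short sketch of why [\s\S] helps where the dot does not: the dot stops at line breaks, while [\s\S] also crosses newlines and tag characters. The sample markup below is a shortened, made-up variant of the asker's string with a newline added between the tags, and the lazy quantifiers are an addition to keep the match as short as possible.

```javascript
// Hypothetical markup: the token is split by tags and a line break.
const html =
  '<span>!?v8qbQ5LZ/</span><wbr />\n<span class="word_break"></span>WfGD2A:::a891d050?!</span>';

// "." does not match "\n", so this pattern cannot reach a891d050.
const dotPattern = /!\?.*a891d050.*\?!/;

// [\s\S] matches any character at all, including "\n", "<" and ">".
const anyPattern = /!\?[\s\S]*?a891d050[\s\S]*?\?!/;

console.log(dotPattern.test(html)); // false
console.log(anyPattern.test(html)); // true

// The match spans the tags too, so replacing it also removes that markup.
const matched = html.match(anyPattern)[0];
console.log(matched.startsWith("!?") && matched.endsWith("?!")); // true
```

Because the match swallows the tags in between, replacing it inside innerHTML can leave unbalanced markup, which may explain the "mess of the classes" described in the question.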

The problem was apparently solved by using this regex:
var pattern = /(!\?)(?:<(?:"[^"]*"['"]*|'[^']*'['"]*|[^'">])*?>)?(.)*?(a891d050)(?:<(?:"[^"]*"['"]*|'[^']*'['"]*|[^'">])*?>)?(.)*?(\?!)/;
