Remove line breaks from start and end of string - javascript

I noticed that trim() does not remove new line characters from the start and end of a string, so I am trying to accomplish this with the following regex:
return str.replace(/^\s\n+|\s\n+$/g,'');
This does not remove the new lines, and I fear I am out of my depth here.
EDIT
The string is being generated with ejs like so
go = ejs.render(data, {
locals: {
format() {
//
}
}
});
And this is what go is, but with a few empty lines before. When I use go.trim() I still get the new lines before.
<?xml version="1.0"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
<fo:layout-master-set>
<fo:simple-page-master master-name="Out" page-width="8.5in" page-height="11in" margin-top="1in" margin-bottom="0.5in" margin-left="0.75in" margin-right="0.75in">
<fo:region-body margin-top="1in" margin-bottom="0.25in"/>
<fo:region-before extent="1in"/>
<fo:region-after extent="0.25in"/>
<fo:region-start extent="0in"/>
<fo:region-end extent="0in"/>
</fo:simple-page-master>
</fo:layout-master-set>
<fo:page-sequence master-reference="Out" initial-page-number="1" force-page-count="no-force">
<fo:static-content flow-name="xsl-region-before">
<fo:block font-size="14pt" text-align="center">ONLINE APPLICATION FOR SUMMARY ADVICE</fo:block>
<fo:block font-size="13pt" font-weight="bold" text-align="center">Re:
SDF, SDF
</fo:block>
</fo:static-content>
<fo:flow flow-name="xsl-region-body" font="10pt Helvetica">
.. removed this content
</fo:flow>
</fo:page-sequence>
</fo:root>

Try this:
str = str.replace(/^\s+|\s+$/g, '');
jsFiddle here.

String.trim() does in fact remove newlines (and all other whitespace). Maybe it didn't use to? It definitely does at the time of writing. From the linked documentation (emphasis added):
The trim() method removes whitespace from both ends of a string. Whitespace in this context is all the whitespace characters (space, tab, no-break space, etc.) and all the line terminator characters (LF, CR, etc.).
If you want to trim all newlines plus other potential whitespace, you can use the following:
return str.trim();
If you want to only trim newlines, you can use a solution that targets newlines specifically.
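As a rough sketch of what "targets newlines specifically" could look like: the helper name trimNewlines and the sample string below are made up for illustration and do not come from the answer. The character class lists only CR and LF, so spaces and tabs at the ends are left alone.

```javascript
// Hypothetical helper (not from the answer): strips only CR and LF
// characters from both ends and leaves other whitespace in place.
function trimNewlines(str) {
  return str.replace(/^[\r\n]+|[\r\n]+$/g, '');
}

// Illustrative input: leading/trailing line breaks around an indented line.
const sample = '\r\n\n  indented line  \n\r\n';

const onlyNewlinesTrimmed = trimNewlines(sample);
const allWhitespaceTrimmed = sample.trim();

console.log(JSON.stringify(onlyNewlinesTrimmed));  // "  indented line  "
console.log(JSON.stringify(allWhitespaceTrimmed)); // "indented line"
```

Without the m flag, ^ and $ only match at the very start and end of the string, so line breaks in the middle are untouched.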

/^\s+|\s+$/g should catch anything. Your current regex may have the problem that if your linebreaks contain \r characters they wouldn't be matched.
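To make the difference concrete, here is a small sketch comparing the regex from the question with /^\s+|\s+$/g. The two sample strings are assumptions chosen for illustration (one with a single LF at each end, one with CRLF pairs). The question's pattern needs one whitespace character followed by at least one \n, so a lone newline is not matched and a run of \r\n pairs is only partly removed.

```javascript
const original = /^\s\n+|\s\n+$/g;   // regex from the question
const suggested = /^\s+|\s+$/g;      // regex from the answers

// Illustrative inputs, not the asker's real data.
const singleNewline = '\n<?xml version="1.0"?>\n';
const crlf = '\r\n\r\n<?xml version="1.0"?>\r\n\r\n';

// A single leading or trailing \n cannot satisfy "\s then \n+".
const originalOnSingle = singleNewline.replace(original, '');

// With CRLF input only one \r\n pair is removed at each end.
const originalOnCrlf = crlf.replace(original, '');

const suggestedOnSingle = singleNewline.replace(suggested, '');
const suggestedOnCrlf = crlf.replace(suggested, '');

console.log(JSON.stringify(originalOnSingle));
console.log(JSON.stringify(originalOnCrlf));
console.log(JSON.stringify(suggestedOnSingle));
console.log(JSON.stringify(suggestedOnCrlf));
```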

Try this:
str.split('\n').join('');

Related

Regex find string and replace that line and following lines

I am trying to find a regex to achieve the following criteria which I need to use in javascript.
Input file
some string is here and above this line
:62M:C111111EUR1211498,00
:20:0000/11111000000
:25:1111111111
:28C:00001/00002
:60M:C170926EUR1211498,06
:61:1710050926C167,XXNCHKXXXXX 11111//111111/111111
Output has to be
some string is here and above this line
:61:1710050926C167,XXNCHKXXXXX 11111//111111/111111
Briefly, find :62M: and then replace (and delete) the lines starting with :62M: followed by lines starting with :20:, :25:, :28c: and :60M:.
Or, find :62M: and replace (and delete) until the line starting with :61:.
Each line has fixed length of 80 characters followed by newline (CR LF).
Is this really possible with regex?
I know how to find a string and replace the same line where the string is. But here multiple lines to be removed which is quite hard for me.
Please could someone help me out if it is possible with regex.
Here it is. First I'm finding the text to delete using a regex (note that I'm using [^]* instead of .* to match all the lines, because [^] also matches newlines). Then I'm replacing it with a newline.
var regex = /:62M:.*([^]*):61:.*/;
var text = `some string is here and above this line
:62M:C111111EUR1211498,00
:20:0000/11111000000
:25:1111111111
:28C:00001/00002
:60M:C170926EUR1211498,06
:61:1710050926C167,XXNCHKXXXXX 11111//111111/111111`;
var textToDelete = regex.exec(text)[1];
var result = text.replace(textToDelete, '\n');
console.log(result);
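Note that the captured group in the snippet above starts after the :62M: line, so that line itself appears to stay in the result. A possible variant, sketched here under the assumption that the block always ends right before a line starting with :61:, removes everything from the start of the :62M: line up to (but not including) the :61: line in a single replace. The sample is joined with CR LF to mirror the line endings described in the question.

```javascript
// Sample data from the question, joined with CR LF as described there.
const text = [
  'some string is here and above this line',
  ':62M:C111111EUR1211498,00',
  ':20:0000/11111000000',
  ':25:1111111111',
  ':28C:00001/00002',
  ':60M:C170926EUR1211498,06',
  ':61:1710050926C167,XXNCHKXXXXX 11111//111111/111111',
].join('\r\n');

// ^ with the m flag anchors at line starts; [\s\S]*? lazily spans lines;
// the lookahead stops the match just before the line that starts with :61:.
const block = /^:62M:[\s\S]*?(?=^:61:)/m;

const result = text.replace(block, '');
console.log(result);
```

If no :61: line follows a :62M: line, the pattern does not match and the text is returned unchanged.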

Match only the line which end with specific char

How to match the line which does not contain the final dot (full stop/period), in order to add it afterwards.
Someword someword someword.
Someword someword someword
Someword someword someword.
These are my unsuccessful attempts:
.+(?=\.)
.+[^.]
--- update
This works for me:
.+\w+(?:\n)
https://regex101.com/r/sR0aD7/1
The following should match a string that ends with anything but dot: [^.]$ - "anything but dot" and end-of-text marker.
How to match the line which does not contain the final dot (full stop/period),
You can use negative lookahead like this:
/^(?!.*\.$).+$/
OR else you can also inverse test:
if (!/\.$/.test(input)) { console.log("line is not ending with dot"); }
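A short sketch of both ideas applied line by line, using the three sample lines from the question. The lookahead is anchored at the start of the line here, which is an assumption about the intent: an unanchored negative lookahead can succeed at some other position in the line and would then report a match for every line.

```javascript
const lines = [
  'Someword someword someword.',
  'Someword someword someword',
  'Someword someword someword.',
];

// Anchored negative lookahead: the whole line must not end with a dot.
const noFinalDot = /^(?!.*\.$).+$/;

// Inverse test: the line ends with a dot, negated.
const viaLookahead = lines.map((line) => noFinalDot.test(line));
const viaInverse = lines.map((line) => !/\.$/.test(line));

console.log(viaLookahead); // [ false, true, false ]
console.log(viaInverse);   // [ false, true, false ]
```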
A regular expression is one way; I think you can also use a function like this:
function lastCharacter(sentence){
var length = sentence.length;
return sentence.charAt(length-1);
}
Example:
Input: Hey JavaScript is damn good.
Use: lastCharacter('Hey JavaScript is damn good.');
Output: '.'
For any line, you can then compare the returned character against a dot ('.') in an if condition.
Just use something like this: [^\.]$
$ - Matches the end of the line.
[^...] - A negated character class: matches any single character that is not one of "...".
\. - The escaped "." character. Outside a character class it needs to be escaped because . matches any character; inside [...] the escape is optional.
Pulling this together, you get the regular expression .+[^\.]$ which will match your line. You will need the m (multiline) flag so that $ matches at the end of each line rather than only at the end of the input.
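Since the goal in the question is to add the missing dot afterwards, here is a minimal sketch that does the match and the fix in one replace. It assumes the lines are separated by plain \n; the class [^.\n] additionally excludes the newline so that empty lines are skipped.

```javascript
// Sample lines from the question, joined with \n (an assumption).
const text = [
  'Someword someword someword.',
  'Someword someword someword',
  'Someword someword someword.',
].join('\n');

// Capture the last character of a line when it is not a dot,
// then write it back followed by a dot.
const fixed = text.replace(/([^.\n])$/gm, '$1.');

console.log(fixed);
```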

JavaScript: how to use a regular expression to remove blank lines from a string?

I need to use JavaScript to remove blank lines in an HTML text box. The blank lines can be anywhere in the textarea element. A blank line can be just a return, or white space plus a return.
I am expecting a regular expression solution to this. Here are some I tried, but they are not working and I cannot figure out why:
/^\s*\r?\n/g
/^\s*\r?\n$/g
Edit 1
It appears that the solution (I modified it a little) suggested by aaronman and m.buettner works:
string.replace(/^\s*\n/gm, "")
Can someone tell me why my first regular expression is not working?
Edit 2
After reading all useful answers, I came up with this:
/^[\s\t]*(\r\n|\n|\r)/gm
Is this going to be one that covers all situations?
Edit 3
This is the most concise one covering all spaces (white spaces, tabs) and platforms (Linux, Windows, Mac).
/^\s*[\r\n]/gm
Many thanks to m.buettner!
Your pattern seems alright, you just need to include the multiline modifier m, so that ^ and $ match line beginnings and endings as well:
/^\s*\n/gm
Without the m, the anchors only match string-beginnings and endings.
Note that you miss out on old Mac-style line endings (only \r). This would help in that case:
/^\s*[\r\n]/gm
Also note that (in both cases) you don't need to match the optional \r in front of the \n explicitly, because that is taken care of by \s*.
As Dex pointed out in a comment, this will fail to clear the last line if it consists only of spaces (and there is no newline after it). A way to fix that would be to make the actual newline optional but include an end-of-line anchor before it. In this case you do have to match the line ending properly though:
/^\s*$(?:\r\n?|\n)?/gm
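The effect of the m flag and of the end-of-line anchor can be sketched as follows. The sample text is made up for illustration, and the last pattern is a variant in which the line ending is made optional with a trailing ?, as the paragraph above describes; treat it as an assumption rather than the answer's exact pattern.

```javascript
// Illustrative input: two blank lines in the middle and a
// whitespace-only last line with no newline after it.
const text = 'one\n\n  \t\ntwo\n   ';

// Without m, ^ only matches at the very start of the string.
const withoutM = text.replace(/^\s*\n/g, '');

// With m, ^ matches at every line start, so blank lines are removed,
// but the whitespace-only last line survives (no \n follows it).
const withM = text.replace(/^\s*\n/gm, '');

// With $ and an optional line ending, the last line is cleared too.
const anchored = text.replace(/^\s*$(?:\r\n?|\n)?/gm, '');

console.log(JSON.stringify(withoutM)); // "one\n\n  \t\ntwo\n   "
console.log(JSON.stringify(withM));    // "one\ntwo\n   "
console.log(JSON.stringify(anchored)); // "one\ntwo\n"
```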
I believe this will work
searchText.replace(/(^[ \t]*\n)/gm, "")
This should do the trick, I think:
var el = document.getElementsByName("nameOfTextBox")[0];
el.value = el.value.replace(/(\r\n|\n|\r)/gm, ""); // assign the result back; replace() returns a new string
EDIT: Removes three types of line breaks.
Here's a reusable function that will trim each line's whitespace and remove any blank or space-only lines:
function trim_and_remove_blank_lines(string)
{
return string.replace(/^(?=\n)$|^\s*|\s*$|\n\n+/gm, "")
}
Usage example:
trim_and_remove_blank_lines("Line 1 \nLine2\r\n\r\nLine4\n")
//Returns 'Line 1\nLine2\nLine4'
function removeEmptyLine(text) {
return text.replace(/(\r?\n)\s*\1+/g, '$1');
}
test:
console.assert(removeEmptyLine('a\r\nb') === 'a\r\nb');
console.assert(removeEmptyLine('a\r\n\r\nb') === 'a\r\nb');
console.assert(removeEmptyLine('a\r\n \r\nb') === 'a\r\nb');
console.assert(removeEmptyLine('a\r\n \r\n \r\nb') === 'a\r\nb');
console.assert(removeEmptyLine('a\r\n \r\n 2\r\n \r\nb') === 'a\r\n 2\r\nb');
console.assert(removeEmptyLine('a\nb') === 'a\nb');
console.assert(removeEmptyLine('a\n\nb') === 'a\nb');
console.assert(removeEmptyLine('a\n \nb') === 'a\nb');
console.assert(removeEmptyLine('a\n \n \nb') === 'a\nb');
console.assert(removeEmptyLine('a\n \n2 \n \nb') === 'a\n2 \nb');

Javascript Regular expression to remove unwanted <br>,

I have a JS string like this
<div id="grouplogo_nav"><br> <ul><br> <li><a class="group_hlfppt" target="_blank" href="http://www.hlfppt.org/">&nbsp;</a></li><br> </ul><br> </div>
I need to remove all <br> and &nbsp; that are only between > and <. I tried to write a regular expression, but didn't get it right. Does anybody have a solution?
EDIT :
Please note I want to remove only the tags between > and <
Avoid using regex on html!
Try creating a temporary div from the string, and using the DOM to remove any br tags from it. This is much more robust than parsing html with regex, which can be harmful to your health:
var tempDiv = document.createElement('div');
tempDiv.innerHTML = mystringwithBRin;
var nodes = tempDiv.childNodes;
for(var nodeId=nodes.length-1; nodeId >= 0; --nodeId) {
if(nodes[nodeId].tagName === 'BR') { // tagName is uppercase for HTML elements
tempDiv.removeChild(nodes[nodeId]);
}
}
var newStr = tempDiv.innerHTML;
Note that we iterate in reverse over the child nodes so that the node IDs remain valid after removing a given child node.
http://jsfiddle.net/fxfrt/
myString = myString.replace(/^(&nbsp;|<br>)+/, '');
... where /.../ denotes a regular expression, ^ denotes start of string, (&nbsp;|<br>) denotes "&nbsp; or <br>", and + denotes "one or more occurrences of the previous expression". And then simply replace that full match with an empty string.
s.replace(/(>)(?:&nbsp;|<br>)+(\s?<)/g,'$1$2');
Don't use this in production. See the answer from Phil H.
Edit: I'll try to explain it a bit and hope my English is good enough.
Basically we have two different kinds of parentheses here. The first and third pairs () are normal (capturing) parentheses. They remember the characters matched by the enclosed pattern and group the characters together. For the second pair, we don't need to remember the characters for later use, so we disable the "remember" functionality by using the form (?:) and only group the characters to make the + work as expected. The + quantifier means "one or more occurrences", so &nbsp; or <br> must be there one or more times. The last part (\s?<) matches a whitespace character (\s), which can be missing or occur once (?), followed by the character <. $1 and $2 are a kind of variable that is replaced by the characters remembered by the first and third parentheses.
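As an illustration of the capturing and non-capturing groups described above, here is a small sketch run against a shortened version of the question's markup. It assumes the first alternative in the non-capturing group is the literal entity &nbsp;; the input string is abbreviated for readability and is not the asker's full data.

```javascript
// Shortened sample in the shape of the question's string.
const html = '<ul><br> <li><a>&nbsp;</a></li><br> </ul>';

// Group 1 keeps the ">", the non-capturing group consumes one or more
// "&nbsp;" or "<br>", and group 2 keeps the optional space plus "<".
const cleaned = html.replace(/(>)(?:&nbsp;|<br>)+(\s?<)/g, '$1$2');

console.log(cleaned); // <ul> <li><a></a></li> </ul>
```

Only runs that sit directly between a > and a < are touched, so a <br> inside text content would be left alone.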
MDN provides a nice table, which explains all the special characters.
You need to replace globally. Also don't forget that the <br> can be self-closed, as <br />. Try this:
myString = myString.replace(/(&nbsp;|<br>|<br \/>)/g, '');
This worked for me; please note the m flag for multi-line input:
myString = myString.replace(/(&nbsp;|<br>|<br \/>)/gm, '');
myString = myString.replace(/^(&nbsp;|<br>)+/, '');
hope this helps

Problem with newline in JavaScript regexp

I tried to hide "SPAM" in the string below using this regex:
alert("{SPAM\nSPAM} _1_ {SPAM} _2_".replace(/{[\s\S]+}/gm, ""));
What I expected to see was "_1_ _2_"
(or something like that), but I got just "_2_". Why?
} and { are also matched by the character class [\s\S], and + is greedy, so the match runs from the first { to the last }. You can avoid this with:
/{[^}]+}/g
so that the regex stops once the } is found.
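A quick sketch comparing the greedy pattern from the question with the negated character class, plus a lazy quantifier as a third option (the lazy variant is an addition for comparison, not part of the answer). The braces are escaped here only to keep the patterns unambiguous.

```javascript
const input = '{SPAM\nSPAM} _1_ {SPAM} _2_';

// Greedy: [\s\S]+ runs to the last "}" in the string.
const greedy = input.replace(/\{[\s\S]+\}/g, '');

// Negated class: stops at the first "}" after each "{".
const negated = input.replace(/\{[^}]+\}/g, '');

// Lazy quantifier: also stops at the first "}" it can.
const lazy = input.replace(/\{[\s\S]+?\}/g, '');

console.log(JSON.stringify(greedy));  // " _2_"
console.log(JSON.stringify(negated)); // " _1_  _2_"
console.log(JSON.stringify(lazy));    // " _1_  _2_"
```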
