regex to separate paragraphs by double new line


Making a tiny helper for authors to add/remove empty lines in textarea. The one that removes extra lines seems to work:
if(jQuery("#remove_line").prop("checked")){
conver = conver.replace(/^\s*[\r\n]/gm,'');
}
But there has to be a reverse one that makes any new line into double new line (no matter if there's just 1 or even 3 new lines in a row). E.g. this:
text
text
text
text
Should be processed into this:
text

text

text

text
Anyone can help with that? Thanks in advance!

You can match one or more newline characters with [\r\n]+ and replace each run with two newlines, \n\n:
const regex = /[\r\n]+/g;
const str = `text
text
text
text`;
console.log(str.replace(regex, '\n\n'));
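One caveat with [\r\n]+ is that it treats \r and \n as interchangeable, which is fine here because the whole run is replaced anyway. A minimal sketch that first normalizes Windows and old Mac line endings and then collapses each run, assuming a hypothetical helper name doubleSpace (not from the original post):

```javascript
// Hypothetical helper: turn every run of line breaks into exactly one blank line.
function doubleSpace(text) {
  return text
    .replace(/\r\n?/g, '\n')   // normalize CRLF and lone CR to LF
    .replace(/\n+/g, '\n\n');  // any run of newlines becomes exactly two
}

// One, two, or three line breaks in a row all end up as a double newline.
console.log(JSON.stringify(doubleSpace('a\nb\r\n\r\n\r\nc')));
```

In the textarea helper this could be applied as conver = doubleSpace(conver), mirroring the existing "remove" branch.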

Didn't test, but I think this could work: replace(/\n+/g, '\n\n')
\n+ matches one or more newlines and, being greedy, consumes the whole run up to the first non-newline character. Don't add a question mark (\n+?): that makes the quantifier lazy, so it matches one newline at a time and every newline in a run gets doubled instead of the run being collapsed.

The reverse?
addEventListener('load', ()=>{
const ta = document.querySelector('textarea');
function doubleSpaceAfter(fullText, text, insensitive = true){
const s = insensitive ? 'i' : '';
return fullText.replace(new RegExp('('+text+')\\s+', 'g'+s), '\n\n$1\n\n');
}
ta.value = doubleSpaceAfter(ta.value, 'text here');
}); // end load
<textarea>other stuff text here is this what you want? text here text here</textarea>

Related

Remove a specific character from a text file

I have a program that removes a character from a string, and it works fine. Now I want it to read a text file with multiple lines and remove a specific character when it appears at the start of a line. How can I implement this? Any help is appreciated. Thanks in advance.
Example:
22Orange
22Banana
would become:
Orange
Banana
//something like this but this only works if I have a single line string
var aStr = fs.readFileSync('filePath', 'utf-8')
var bStr = aStr.replace(/^22+/i, '');
console.log(bStr)

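Use the global (g) and multiline (m) flags: m makes ^ match at the start of every line instead of only at the start of the string, and g replaces every match instead of only the first.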
let str = `22Orange
22Banana`;
console.log(str.replace(/^22/gm, ''));
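If the prefix to remove is not hard-coded, the same idea can be wrapped in a small function. This is only a sketch: stripLinePrefix and escapeRegExp are hypothetical names, and the prefix is escaped so that characters such as . or + are matched literally.

```javascript
// Escape regex metacharacters so the prefix is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: remove `prefix` when it appears at the start of a line.
function stripLinePrefix(text, prefix) {
  const re = new RegExp('^' + escapeRegExp(prefix), 'gm');
  return text.replace(re, '');
}

// Only a leading "22" is removed; the trailing one on the last line stays.
console.log(stripLinePrefix('22Orange\n22Banana\nKiwi22', '22'));
```

With a file, the text argument would be the string returned by fs.readFileSync('filePath', 'utf-8'), as in the question.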

Split a huge text using regex delimiters

I'm working with giant text files that have more than one
document inside. These documents have a very similar interface, with fixed fields
and dynamic values. I need to separate these documents in arrays.
Example:
[
   [] <- Doc1
   [] <- Doc2
   [] <- Doc3
   [] <- Doc4
   ...
   ...
   ...
]
For this, I need to create a regular expression that defines the delimiter, where the doc starts and where ends.
Example:
DOC_START
TEXT
TEXT
TEXT
TEXT
DOC_FINAL
DOC_START
TEXT
TEXT
TEXT
TEXT
DOC_FINAL
REGEX: ((?:DOC_START)(?:[\S\S]+)(?:DOC_FINAL)?)
The catch: some documents may have peculiarities, starting or ending with something a bit different, so I need to be able to pass start and end options.
My question: how can I do this? And how can I also improve the regex?
Just to be clear, sometimes, the document may have the beginning or the ending a bit different. Example:
DOC_START
TEXT
TEXT
TEXT
TEXT
DOC_FINAL
DOC_START
TEXT
TEXT
TEXT
TEXT
DOC_FINAL
OTHER_START
TEXT
TEXT
TEXT
TEXT
DOC_FINAL
DOC_START
TEXT
TEXT
TEXT
TEXT
OTHER_FINAL
OTHER_START
TEXT
TEXT
TEXT
TEXT
OTHER_FINAL

It would be better not to use regex, especially with large documents. Use indexOf():
var hugeDoc = 'DOC_STARTxxDOC_ENDOTHER_STARTyyOTHER_END';
var result = [];
var start = 0; // current scan position in hugeDoc
var possibleDelimiters = [
  {start: 'OTHER_START', end: 'OTHER_END'},
  {start: 'DOC_START', end: 'DOC_END'}
];
function parseDoc(delimiter) {
  var end = hugeDoc.indexOf(delimiter.end, start);
  if (end === -1) return false; // indexOf returns -1 (not 0) when nothing is found
  result.push(hugeDoc.slice(start + delimiter.start.length, end));
  // add +1 here if you have a new line after DOC_END
  start = end + delimiter.end.length;
  return true;
}
do {
  var found = false;
  for (var ix = 0; ix < possibleDelimiters.length; ix++) {
    var delimiter = possibleDelimiters[ix];
    // check only the current position instead of scanning the rest of the document
    if (hugeDoc.startsWith(delimiter.start, start)) {
      found = parseDoc(delimiter) || found;
    }
  }
} while (found);
var node = document.getElementById('result');
node.textContent = JSON.stringify(result);
<html>
<body>
<div id="result"></div>
</body>
</html>

First, I believe you have a typo in your regex: it should be [\s\S] instead of [\S\S] (notice the lower-case s). This correctly matches across lines.
This regex could accomplish what you need for matching such a document; someone could probably make a more optimized version:
/(?:DOC_START|OTHER_START)([\s\S]*?)(?:DOC_FINAL|OTHER_FINAL)/g
On the other hand, I would suggest a different approach if possible. For example, if you're doing this within NodeJS, I'd strongly suggest checking each line for the DOC_START or DOC_FINAL delimiters, then filling the array with lines until the ending delimiter.
Assuming that you want an array of lines in each document, loose pseudo code follows:
create resulting object ({ doc1: null })
read line
  if start delimiter
    if current object property is null
      create array (doc#: [])
  else if end delimiter
    create new doc property (doc2: null)
  else
    add line to array
Another note: if you're doing this with HTML, I'd strongly suggest not using regex at all, as HTML is not a regular language :) You'll find many links on SO pointing to evil.
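The line-by-line approach described above could be sketched roughly as follows. The function name splitDocs and its starts/ends parameters are assumptions made for illustration, and the sketch assumes each delimiter sits alone on its own line, as in the question's examples.

```javascript
// Hypothetical helper: split text into documents, each an array of its body lines.
// `starts` and `ends` list the accepted delimiter lines.
function splitDocs(text, starts = ['DOC_START'], ends = ['DOC_FINAL']) {
  const docs = [];
  let current = null; // null while we are outside any document
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (starts.includes(line)) {
      current = [];                     // open a new document
    } else if (ends.includes(line)) {
      if (current) docs.push(current);  // close the open document
      current = null;
    } else if (current) {
      current.push(rawLine);            // body line of the open document
    }
  }
  return docs;
}

const sample = 'DOC_START\nA\nB\nDOC_FINAL\nOTHER_START\nC\nOTHER_FINAL';
console.log(splitDocs(sample, ['DOC_START', 'OTHER_START'], ['DOC_FINAL', 'OTHER_FINAL']));
```

Passing the delimiter lists as arguments covers the "start and end options" from the question. For truly huge files, the same loop could be fed from a line stream instead of one big string.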

Cannot split textarea on newline created from html entities

I have a textarea that needs a default value including new lines so I used the technique employing HTML entities found from this answer
<textarea rows=4 id="mytextarea">John,2&#13;&#10;Jane,3&#13;&#10;John,4&#13;&#10;Jane,5</textarea>
Then I have a button which launches a function for parsing the textarea. The first thing I want to accomplish is creating a rows array from the textarea value, but this is where I am having problems. There is another answer that goes into the details of which browsers represent line breaks as \r\n and which as \n only, and though this may not be strictly necessary in my case, I still decided to try it:
var textAreaValue = document.getElementById("mytextarea").value;
textAreaValue = textAreaValue.replace(/\s/g, ""); //remove white space
var rows = textAreaValue.replace(/\r\n/g, "\n").split("\n");
rows however is coming back as ["John,2Jane,3John,4Jane,5"]. So the split is having no effect; what am I doing wrong?

The \s in your regex is removing the line breaks. Try commenting out that replacement and check your rows value again, it should then be as you expect!
function parseText() {
var textAreaValue = document.getElementById("mytextarea").value;
//textAreaValue = textAreaValue.replace(/\s/g, ""); //remove white space
var rows = textAreaValue.replace(/\r\n/g, "\n").split("\n");
alert(rows);
}
See JSFiddle here: http://jsfiddle.net/xs2619cn/1/
I changed your commas to full stops just to make the alert output clearer.
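If the goal of the whitespace removal was to tidy up each row, one option is to split first and trim afterwards, so the line breaks survive long enough to be used as separators. A minimal sketch, assuming a hypothetical helper name toRows:

```javascript
// Hypothetical helper: split textarea text into trimmed, non-empty rows.
function toRows(value) {
  return value
    .split(/\r\n|\r|\n/)          // accept any line-break convention
    .map(row => row.trim())       // trim each row, keeping the line structure
    .filter(row => row !== '');   // drop blank rows
}

console.log(toRows('John,2\r\nJane,3\n John,4 \n\nJane,5'));
```

In the page this would be called with document.getElementById("mytextarea").value.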

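You don't use \n and \r in the markup, so you can split on &#13; or &#10; instead. Note that this only applies while the string still contains the literal entities (for example, raw markup); textarea.value returns them already decoded as line breaks.
var rows = textAreaValue.replace(/(?:&#(?:13|10);)+/g, "\n").split("\n");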
Demo: http://jsfiddle.net/rr1zvxko/

Create new line/carriage return on paragraph after more than 1 spaces

I would like to know how to create carriage return/start in new line after more than 3 spaces in a single line of paragraph in Javascript. Thank You

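It would help if you posted your code, but if you want to parse a block of text and reformat it, how about this:
var string = 'paragraph   text'; // your paragraph text
var newStr = string.replace(/ {3,}/g, '<br>'); // runs of 3 or more spaces
Then replace string with newStr. Use {4,} if you want strictly more than 3 spaces, and '\n' instead of '<br>' if the result is plain text rather than HTML.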
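Since the question is vague about the exact threshold, the minimum run length can be made a parameter. This is a sketch only: breakOnSpaces and minSpaces are hypothetical names, and minSpaces is assumed to be a positive integer.

```javascript
// Hypothetical helper: replace every run of at least `minSpaces` spaces with a line break.
function breakOnSpaces(text, minSpaces = 3) {
  const re = new RegExp(' {' + minSpaces + ',}', 'g');
  return text.replace(re, '\n');
}

// The 3-space and 4-space runs become line breaks; the 2-space run is kept.
console.log(JSON.stringify(breakOnSpaces('one   two  three    four')));
```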

Replace text in textarea using Javascript

I need to replace all the matches of a regular expression till the caret position in a textarea using Javascript.
For example, if the text in the textarea is: "6 students carry 2 books to 5 classes" and the cursor
is placed on books and the regular expression is /\d/, the numbers 6 and 2 should be replaced by, say, 4.
I know the replace function and I know how to get the caret position, but how do I solve this problem?
Thanks for any help in advance!

var textareaClicked = function (e) {
  var pos = e.target.selectionStart;
  // use value, not innerHTML: innerHTML does not reflect what the user has typed
  var beforeSelection = e.target.value.slice(0, pos);
  var afterSelection = e.target.value.slice(pos);
  // replace only in the part before the caret
  e.target.value = beforeSelection.replace(/\d/g, '4') + afterSelection;
  e.target.setSelectionRange(pos, pos);
};
document.getElementById('foo').onclick = textareaClicked;
see it in action in this jsfiddle.

There is probably a more elegant way, but I would just copy the text from the text area, split the string into two substrings at the caret position (which you said you know how to find), do the replace on the first substring and then concatenate it with the second substring. Copy it back into the text area making sure to update the caret position appropriately.
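The split, replace, and concatenate steps described above can be kept free of DOM code, which also makes the caret update explicit. A minimal sketch, assuming a hypothetical helper replaceBeforeCaret that returns the new text and the new caret position:

```javascript
// Hypothetical helper: apply `pattern` only to the text before `caret`.
function replaceBeforeCaret(text, caret, pattern, replacement) {
  const before = text.slice(0, caret).replace(pattern, replacement);
  const after = text.slice(caret);
  // The caret moves if the replacement changed the length of the first part.
  return { text: before + after, caret: before.length };
}

const input = '6 students carry 2 books to 5 classes';
const caret = input.indexOf('books'); // caret placed at the start of "books"
console.log(replaceBeforeCaret(input, caret, /\d/g, '4'));
```

In a page, the returned text would be assigned to the textarea's value and the returned caret passed to setSelectionRange.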
