How to add '>' to every new line in a string in javascript? - javascript

I have a text area on a UI and I need the user to type in Markdown. I need to make sure that each line they type starts with >, as I want everything they typed to be displayed as a blockquote when they preview it.
So for example if they type in:
> some text user <b>typed</b>
another line
When the markdown is rendered, only the first line is a blockquote. The rest is plain text outside the blockquote.
Is there a way I can check each line and add the > if it is missing?
Things I have tried:
I tried removing all > characters and replacing each \n with a \n>. This however messed up the markdown as the user can also type in <b>bold text</b>.
I have a loop that checks for the > character after every new line. I just don't know how to insert the > if it's missing.
Loop code:
var match = /\r|\n/.exec(theString);
if (match) {
if (theString.charAt(match.index)!='>'){
// don't know how to add the character
}
}
I also thought that maybe I could enforce the > in the textarea, but that research got me nowhere. As in, I don't think that is possible.
I also thought, what if the user types multiple >>>>. At that stage I was thinking about it too much and said I'd leave out cases like that as maybe that is the user's intention.
If anyone has any suggestions and/or alternative solutions it would be very much appreciated. Thank you :)

You can use a regular expression to insert > to the beginning of each line, if it doesn't exist:
const input = `> some text user <b>typed</b>
another line
another line 2
> another line 3`;
const output = input.replace(/^(?!>)/gm, '> ');
console.log(output);
The pattern ^(?!>) means: match the beginning of a line, which is not followed by >.
If you only want to insert >s where lines have text already, then also lookahead for non-whitespace in the line:
const input = `> some text user <b>typed</b>
another line
another line 2
> another line 3`;
const output = input.replace(/^(?!>)(?=[^\n]*\S)/gm, '> ');
console.log(output);
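For readers who find lookaheads hard to follow, the same idea can be sketched by splitting the text into lines and prefixing each one that needs it. The helper name quoteLines is hypothetical, and the sketch assumes that blank lines should be left alone and that a line counts as quoted only when > is its very first character.

```javascript
// Hypothetical helper: prefix every non-blank line that does not already start with '>'.
function quoteLines(text) {
  return text
    .split(/\r?\n/) // accept both \n and \r\n line endings
    .map((line) => (line.trim() === '' || line.startsWith('>') ? line : '> ' + line))
    .join('\n');
}

console.log(quoteLines('> some text user <b>typed</b>\nanother line\n\n> another line 3'));
```

Because only the start of each line is inspected, a > inside the line (for example in <b>bold text</b>) is never touched.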

I'd go with replace (first thing you tried). In order to insert literal > in HTML, you have to escape it.
Just replace \n with \n> and you're all set.

Related

How to detect sentences without comments and markdown using Javascript regex?

Problem
I have a piece of text. It can contain every character from ASCII 32 (space) to ASCII 126 (tilde) and including ASCII 9 (horizontal tab).
The text may contain sentences. Every sentence ends with dot, question mark or exclamation mark, directly followed by space.
The text may contain a basic markdown styling, that is: bold text (**, also __), italic text (*, also _) and strikethrough (~~). Markdown may occur inside sentences (e.g. **this** is a sentence.) or outside them (e.g. **this is a sentence!**). Markdown may not occur across sentences, that is, there may not be a situation like this: **sentence. sente** nce.. Markdown may include more than one sentence, that is, there may be a situation like this: **sentence. sentence.**.
It can also contain two sequences of characters: <!-- and -->. Everything between these sequences is treated as a comment (like in HTML). Comments can occur at every position in the text, but cannot contain newline characters (I hope that on Linux it is just ASCII 10).
I want to detect in Javascript all sentences, and for each of them put its length after this sentence in a comment, like this: sentence.<!-- 9 -->. Mainly, I do not care if their length includes the length of the markdown tags or not, but it would be nice if it does not.
What have I done so far?
So far, with help of this answer, I have prepared the following regex for detecting sentences. It mostly fits my needs – except that it includes comments.
const basicSentence = /(?:^|\n| )(?:[^.!?]|[.!?][^ *_~\n])+[.!?]/gi;
I have also prepared the following regex for detecting comments. It also works as expected, at least in my own tests.
const comment = /<!--.*?-->/gi;
Example
To better see what I want to achieve, let us have an example. Say, I have the following piece of text:
foo0
b<!-- comment -->ar.
foo1 bar?
<!-- comment -->
foo2bar!
(There is also a newline at the end of it, but I do not know how to add an empty line in Stackoverflow markdown.)
And the expected result is:
foo0
b<!-- comment -->ar.<!-- 10 -->
foo1 bar?<!-- 9 -->
<!-- comment -->
foo2bar!<!-- 12 -->
(This time, there is also no newline at the end.)
UPDATE: Sorry, I have corrected the expected result in the example.
Pass a callback to .replace that replaces all comments with the empty string, and then returns the length of the resulting trimmed match:
const input = `foo0
b<!-- comment -->ar.
foo1 bar?
<!-- comment -->
foo2bar!
`;
const output = input.replace(
/(?:^|\n| )(?:[^.!?]|[.!?][^ *_~\n])+[.!?]/g,
(match) => {
const matchWithoutComments = match.replace(/<!--.*?-->/g, '');
return `${match}<!-- ${matchWithoutComments.length} -->`;
}
);
console.log(output);
Of course, you can use a similar pattern to replace markdown notation with the inner text content as well, if you wish:
.replace(/([*_]{1,2}|~~)((.|\n)*?)\1/g, '$2')
(due to nested and possibly unbalanced tags, which regex is not very good at working with, you may have to repeat that line until no further replacements can be found)
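A minimal sketch of that "repeat until nothing changes" idea, reusing the pattern above. The names MARKDOWN_PAIR and stripMarkdown are made up for illustration, and the loop only handles balanced pairs; unbalanced markers are left in place.

```javascript
// Same pattern as above: an opening marker, lazily matched content, then the same marker again.
const MARKDOWN_PAIR = /([*_]{1,2}|~~)((.|\n)*?)\1/g;

// Hypothetical helper: keep unwrapping marker pairs until a pass changes nothing.
function stripMarkdown(text) {
  let previous;
  let current = text;
  do {
    previous = current;
    current = current.replace(MARKDOWN_PAIR, '$2');
  } while (current !== previous);
  return current;
}

console.log(stripMarkdown('**bold _nested_** and ~~gone~~')); // bold nested and gone
```

Each successful replacement removes at least two characters, so the loop always terminates.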
Also, per comment, your current regular expression is expecting every sentence to end in ., !, or ?. The comment's ! in <!-- is treated as the end of a (short) sentence. One option would be to lookahead for whitespace (a space, or a newline) or the end of the input at the very end of the regex:
const input = `foo0
b<!-- comment -->ar.
foo1 bar?
<!-- comment -->
foo2bar!
<!-- comment -->`;
const output = input.replace(
/(?:^|\n| )(?:[^.!?]|[.!?][^ *_~\n])+[.!?](?=\s|$|[*_~])/g,
(match) => {
const matchWithoutComments = match.replace(/<!--.*?-->/g, '');
return `${match}<!-- ${matchWithoutComments.length} -->`;
}
);
console.log(output);
https://regex101.com/r/RaTIOi/1

How to make new line from a xml response string

I get my data from an API that returns XML, which I already convert to JSON because I use AngularJS. The field I need stores song lyrics, and it uses the symbol ↵ wherever a new line should start.
for example :
You shout it loud↵But I can’t hear a word you say↵I’m talking loud, not saying much↵↵I’m criticized but all your bullets ricochet↵You shoot me down, but I get up
The example above is what I get when I use console.log(), but when I show this field on my HTML page, it is just a string with no ↵ in it. I don't know why it does not show in HTML, and if it is meant to produce a new line, that is not happening.
I was thinking of replacing ↵ with <br />. Is that possible? I would appreciate any help with that.
UPDATE :
I use AngularJS, fill the model with the lyric, and show it with {{lyric}} in my HTML.
But as you can see in the picture, when I use console.log($scope.lyric) the string is formatted well, but when I show the same model in HTML, it looks like this
A simple regex string replace should take care of it:
var str = 'You shout it loud↵But I can’t hear a word you say↵I’m talking loud, not saying much↵↵I’m criticized but all your bullets ricochet↵You shoot me down, but I get up';
var formatted = str.replace(/↵/ig, "<br/>\n");
console.log(formatted);
document.write(formatted);
The regex finds every occurrence of the character between the / signs and replaces it with an HTML line-break tag <br/> followed by a standard newline \n.
The i and g flags mean case-insensitive and global, respectively.
Case-insensitive matching makes no difference for a symbol like ↵, so the i flag can be dropped. The global flag makes replace act on every match in the string rather than only on the first one.
I just figured it out. Here is how it works, in case anyone else faces the same problem:
When I show the lyric like this:
<p>{{lyric}}</p>
it ignores my new lines. But when I use this:
<pre>{{lyric}}</pre>
it works!
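Since <pre> preserves the breaks, the ↵ shown in the console is presumably just how the devtools render an ordinary newline character, not a literal ↵ in the data. Under that assumption, a replacement would have to target \n (and possibly \r\n) rather than ↵. The sketch below uses a made-up helper name, newlinesToBreaks, and also shows what happens without the g flag.

```javascript
// Hypothetical helper: turn every line break (\r\n, \r or \n) into <br/> plus a newline.
function newlinesToBreaks(text) {
  return text.replace(/\r\n|\r|\n/g, '<br/>\n');
}

const lyric = "You shout it loud\nBut I can't hear a word you say\n\nI'm criticized";
console.log(newlinesToBreaks(lyric));

// Without the g flag only the first line break is replaced.
console.log(lyric.replace(/\n/, '<br/>'));
```

Note that AngularJS escapes markup in {{lyric}}, so the <br/> tags would be displayed as text unless the value is bound as HTML (for example with ng-bind-html). Keeping the newlines and styling the element with white-space: pre-line is an alternative to <pre> that avoids the replacement altogether.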

JavaScript RegExp - How to match a word based on conditions

I'm building a search results page (in Angular) and using regular expressions to highlight the searched 'keywords' based on a condition. I'm having problems getting the RegExp to express the correct condition, so apologies if my current syntax is messy; I've been playing about for hours.
Basically, for this test I'm highlighting the word 'midlands', and I want to highlight every occurrence of 'midlands' except those within the href="" attribute of an <a /> tag. So I do not want to highlight anything that is a part of the URL, as I'll be wrapping the keywords in a span and this would break the URL structure. Can anyone help? - I think I'm almost there.
Here's the current RegExp I'm using:
/(\b|^|)(\s|\()midlands(\b|$)(|\))/gi
Here's a link to test what I'm after.
https://regex101.com/r/wV4gC3/2
Further info: after the view has rendered, I grab the HTML content of the repeating results and then do a search on the rendered HTML with the condition above. - If this helps anyone.
You're going about this all wrong. Don't parse HTML with regular expressions - use the DOM's built in HTML parser and explicitly run the regex on text nodes.
First we get all the text nodes. With jQuery that's:
var texts = $(elem).contents().get().filter(function(el){
return el.nodeType === 3; // 3 is text
});
Otherwise - see the answer here for code for getting all text nodes in VanillaJS.
Then, iterate them and replace the relevant text only in the text nodes:
for (var text of texts) { // if old browser - angular.forEach(texts, fn(text)
text.textContent = text.textContent.replace(/midlands/g, function(m){
// note: textContent is plain text, so these tags will show up literally
return "<b>" + m + "</b>"; // surround with bs.
});
}
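Because a string assigned to textContent is inserted as plain text, real highlighting needs each text node to be replaced by a mix of text nodes and elements. The DOM-free part of that job, splitting a text node's content into plain and matching segments, can be sketched as follows. The name splitByKeyword and the segment shape are hypothetical, the keyword is assumed to be non-empty, and building the actual nodes (for example with document.createTextNode and document.createElement) is left to the caller.

```javascript
// Hypothetical helper: split text into segments and mark the ones that match the keyword.
function splitByKeyword(text, keyword) {
  // Escape regex metacharacters so the keyword is matched literally.
  const escaped = keyword.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const pattern = new RegExp('(' + escaped + ')', 'i');
  // split() with a capturing group keeps the matches at the odd indices.
  return text
    .split(pattern)
    .map((part, index) => ({ text: part, highlight: index % 2 === 1 }))
    .filter((segment) => segment.text !== '');
}

console.log(splitByKeyword('East Midlands and west midlands', 'midlands'));
```

Each segment with highlight set to true would become a <b> (or <span>) element, and every other segment a plain text node, so attribute values such as href are never touched.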

Remove multiple line breaks (\n) in JavaScript

We have an onboarding form for new employees with multiple newlines (4-5 between lines) that need to be stripped. I want to get rid of the extra newlines but still space out the blocks with one \n.
example:
New employee<br/>
John Doe
Employee Number<br/>
1234
I'm currently using text = text.replace(/(\r\n|\r|\n)+/g, '$1'); but that gets rid of all newlines without spacing.
text = text.replace(/(\r\n|\r|\n){2,}/g, '$1\n');
Use this. It matches any run of two or more line breaks and replaces it with one line break plus a \n.
Update:
To meet the OP's specific requirement, I have edited the answer a bit.
text = text.replace(/(\r\n|\r|\n){2}/g, '$1').replace(/(\r\n|\r|\n){3,}/g, '$1\n');
We can tidy up the regex as follows:
text = text.replace(/[\r\n]{2,}/g, "\n");
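The variants above differ mainly in how many line breaks they leave behind. A rough sketch that makes this explicit is shown below; the function name collapseNewlines and its keep parameter are hypothetical, and the sketch assumes that normalizing \r\n and \r to \n is acceptable for the form text.

```javascript
// Hypothetical helper: shrink every run of more than `keep` line breaks down to exactly `keep`.
function collapseNewlines(text, keep = 1) {
  const normalized = text.replace(/\r\n?/g, '\n'); // treat \r\n and \r as \n
  const run = new RegExp('\\n{' + (keep + 1) + ',}', 'g');
  return normalized.replace(run, '\n'.repeat(keep));
}

console.log(JSON.stringify(collapseNewlines('New employee\n\n\n\nJohn Doe')));    // "New employee\nJohn Doe"
console.log(JSON.stringify(collapseNewlines('New employee\n\n\n\nJohn Doe', 2))); // "New employee\n\nJohn Doe"
```

With keep set to 2, blocks stay separated by a single blank line instead of being joined directly.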

JavaScript + RegEx Complications- Searching Strings Not Containing SubString

I am trying to use a RegEx to search through a long string, and I am having trouble coming up with an expression. I am trying to search through some HTML for a set of tags beginning with a tag containing a certain value and ending with a different tag containing another value. The code I am currently using to attempt this is as follows:
matcher = new RegExp(".*(<[^>]+" + startText + "((?!" + endText + ").)*" + endText + ")", 'g');
data.replace(matcher, "$1");
The strangeness around the middle ( ((\\?\\!endText).)* ) is borrowed from another thread, found here, that seems to describe my problem. The issue I am facing is that the expression matches the beginning tag, but it does not find the ending tag and instead includes the remainder of the data. Also, the lookaround in the middle slowed the expression down a lot. Any suggestions as to how I can get this working?
EDIT: I understand that parsing HTML in RegEx isn't the best option (makes me feel dirty), but I'm in a time-crunch and any other alternative I can think of will take too long. It's hard to say what exactly the markup I will be parsing will look like, as I am creating it on the fly. The best I can do is to say that I am looking at a large table of data that is collected for a range of items on a range of dates. Both of these ranges can vary, and I am trying to select a certain range of dates from a single row. The approximate value of startText and endText are \\#\\#ASSET_ID\\#\\#_<YYYY_MM_DD>. The idea is to find the code that corresponds to this range of cells. (This edit could quite possibly have made this even more confusing, but I'm not sure how much more information I could really give without explaining the entire application).
EDIT: Well, this was a stupid question. Apparently, I just forgot to add .* after the last paren. Can't believe I spent so long on this! Thanks to those of you that tried to help!
First of all, why is there a .* Dot Asterisk in the beginning? If you have text like the following:
This is my Text
And you want "my Text" pulled out, you do my\sText. You don't have to do the .*.
That being said, since all you'll be matching now is what you need, you don't need the main Capture Group around "Everything". This: .*(xxx) is a huge no-no, and can almost always be replaced with this: xxx. In other words, your regex could be replaced with:
<[^>]+xxx((?!zzz).)*zzz
From there I examine what it's doing.
You are looking for an HTML opening Delimiter <. You consume it.
You consume at least one character that is NOT a Closing HTML Delimiter, but can consume many. This is important, because if your tag is <table border=2>, then you have, at minimum, so far consumed <t, if not more.
You are now looking for a StartText. If that StartText is table, you'll never find it, because you have consumed the t. So replace that + with a *.
The regex still succeeds as long as what follows is NOT the closing text, but it starts looking from the VERY END of the document, because the Asterisk is being Greedy. I suggest making it lazy by adding a ?.
When the backtracking fails, it will look for the closing text and gather it successfully.
The result of that logic:
<[^>]*xxx((?!zzz).)*?zzz
If you're going to use a dot anyway, which is okay for new Regex writers, but not suggested for seasoned, I'd go with this:
<[^>]*xxx.*?zzz
So for Javascript, your code would say:
matcher = new RegExp("<[^>]*" + startText + ".*?" + endText, 'gi');
I put the IgnoreCase "i" in there for good measure, but you may or may not want that.
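A small sketch of how that matcher could be built when startText and endText are plain strings rather than pre-escaped patterns. The helper names escapeRegExp and buildRangeMatcher, the sample markup, and the ids are all made up for illustration; [\s\S]*? is used instead of .*? on the assumption that the range may span several lines.

```javascript
// Escape regex metacharacters so the markers are matched literally.
function escapeRegExp(literal) {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: match from the tag containing startText up to the first endText after it.
function buildRangeMatcher(startText, endText) {
  return new RegExp(
    '<[^>]*' + escapeRegExp(startText) + '[\\s\\S]*?' + escapeRegExp(endText),
    'g'
  );
}

// Made-up sample markup: one table cell per date.
const html =
  '<td id="A_2020_01_01">1</td><td id="A_2020_01_02">2</td><td id="A_2020_01_03">3</td>';
const found = html.match(buildRangeMatcher('A_2020_01_01', 'A_2020_01_02'));
console.log(found); // [ '<td id="A_2020_01_01">1</td><td id="A_2020_01_02' ]
```

Because the middle part is lazy, the match stops at the first occurrence of endText instead of running on to the end of the data.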
