regex to match only quotes that aren't in links - javascript

Can you tell me how to use a regex in JavaScript to select quoted text, but not the quoted text that is inside a link?
so I don't want to select these quotes some text
I want to select only normal quoted text
I used
result = content.replace(/"(.*?)"/g, "<i>$1</i>");
to replace all quoted text with italic, but it replaces also href quotes
Thanks :)

If you need an ad hoc regex solution, you may match and capture tags, and only replace " symbols in other contexts. Defining a tag as < followed by non-< characters up to the first >, we may use
var s = '"replace this" but <div id="not-here"> "and here"</div>';
var re = /(<[^<]*?>)|"(.*?)"/g;
var result = s.replace(re, function (m,g1,g2) {
return g1? g1 : '<i>' + g2 + '</i>';
});
console.log(result);
The (<[^<]*?>)|"(.*?)" matches:
(<[^<]*?>) - Group 1 (g1 later in the callback) that captures <, 0+ symbols other than < as few as possible up to the first >
| - or
"(.*?)" - ", 0+ chars other than a newline as few as possible captured into Group 2 (g2 later) and a ".
In the callback, Group 1 is checked for a match. If it matched, we just put the tag back into the result; otherwise we wrap the quoted text (Group 2) in <i> tags.
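The same idea can be wrapped in a small helper. This is only a sketch: the name italicizeOutsideTags is made up for illustration, and it assumes that no tag contains a < or > inside an attribute value.

```javascript
// Hypothetical helper: italicize quoted text, leaving anything inside <...> untouched.
function italicizeOutsideTags(html) {
  // Alternative 1 captures a whole tag; alternative 2 captures the text between two quotes.
  return html.replace(/(<[^<]*?>)|"(.*?)"/g, function (match, tag, quoted) {
    // If the tag group matched, put the tag back unchanged.
    return tag ? tag : '<i>' + quoted + '</i>';
  });
}

console.log(italicizeOutsideTags('<a href="page.html">a link</a> and "a quote"'));
```

The href quotes are consumed as part of the tag match, so only the quote in the plain text gets wrapped.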

The simplest answer would be to use:
/[^=]"(.*)"/
instead of
/"(.*?)"/
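But that will also skip ordinary quotes that have an = sign before them. It also consumes the character in front of the opening quote, and the greedy (.*) runs on to the last quote on the line.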

Why not only work on the actual text of the element... Like:
var anchors = [],
idx;
anchors = Array.prototype.slice.call(document.getElementsByTagName("a"));
for(idx=0; idx<anchors.length; idx++) {
anchors[idx].innerHTML = anchors[idx].innerHTML.replace(/"([^"]*)"/g, '<i>$1</i>');
}
some text that contains a "quoted" part.
<br/>
more "text" that contains a "quoted" part.
Here we get all anchor elements as an array and replace the innerHTML of each with an italicized version of itself.

This pattern could be what you're looking for: <.+>.*(\".+\").*</.+>
Used in JavaScript, the following matches "text":
new RegExp('<.+>.*(\".+\").*</.+>', 'g').exec('some "text"')[1]

Related

Replace repeating set of characters from end of string using regex

I want to remove all <br> from the end of this string. Currently I am doing this (in javascript) -
const value = "this is an event. <br><br><br><br>"
let description = String(value);
while (description.endsWith('<br>')) {
description = description.replace(/<br>$/, '');
}
But I want to do it without using while loop, by only using some regex with replace. Is there a way?
To identify the end of the string in RegEx, you can use the special $ symbol to denote that.
To identify repeated characters or blocks of text, you can use the + (one or more) or * (zero or more) quantifier.
In your case, the final regex is: (<br>)*$
This will remove 0 or more occurrences of <br> from the end of the string.
Example:
const value = "this is an event. <br><br><br><br>"
let description = String(value);
description = description.replace(/(<br>)*$/, ''); // replace() returns a new string, so assign it
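As a self-contained sketch of the loop-free version (stripTrailingBr is a made-up name), a non-capturing group with + matches the whole run of trailing <br> tags at once. Strings are immutable, so the helper returns the new value rather than changing its argument.

```javascript
// Hypothetical helper: remove every <br> at the end of a string in a single pass.
function stripTrailingBr(text) {
  // (?:<br>)+ matches one or more consecutive <br> tags; $ anchors the run to the end.
  return text.replace(/(?:<br>)+$/, '');
}

console.log(stripTrailingBr('this is an event. <br><br><br><br>'));
```

A <br> in the middle of the string is left alone because it is not followed by the end of the string.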
You may try:
var value = "this is an event. <br><br><br><br>";
var output = value.replace(/(<.*?>)\1*$/, "");
console.log(output);
Here is the regex logic being used:
(<.*?>) match AND capture any HTML tag
\1* then match that same tag zero or more additional times
$ all tags occurring at the end of the string

JS conditional RegEx that removes different parts of a string between two delimiters

I have a string of text with HTML line breaks. Some of the <br> immediately follow a number between two delimiters «...» and some do not.
Here's the string:
var str = ("«1»<br>«2»some text<br>«3»<br>«4»more text<br>«5»<br>«6»even more text<br>");
I’m looking for a conditional regex that’ll remove the number and delimiters (ex. «1») as well as the line break itself without removing all of the line breaks in the string.
So for instance, at the beginning of my example string, when the script encounters »<br> it’ll remove everything between and including the first « to the left, to »<br> (ex. «1»<br>). However it would not remove «2»some text<br>.
I’ve had some help removing the entire number/delimiters (ex. «1») using the following:
var regex = new RegExp(UsedKeys.join('|'), 'g');
var nextStr = str.replace(/«[^»]*»/g, " ");
I sure hope that makes sense.
Just to be super clear, when the string is rendered in a browser, I’d like to go from this…
«1»
«2»some text
«3»
«4»more text
«5»
«6»even more text
To this…
«2»some text
«4»more text
«6»even more text
Many thanks!
Maybe I'm missing a subtlety here, if so I apologize. But it seems that you can just replace with the regex: /«\d+»<br>/g. This will replace all occurrences of a number between « & » followed by <br>
var str = "«1»<br>«2»some text<br>«3»<br>«4»more text<br>«5»<br>«6»even more text<br>"
var newStr = str.replace(/«\d+»<br>/g, '')
console.log(newStr)
To match letters and digits you can use \w instead of \d
var str = "«a»<br>«b»some text<br>«hel»<br>«4»more text<br>«5»<br>«6»even more text<br>"
var newStr = str.replace(/«\w+?»<br>/g, '')
console.log(newStr)
This snippet assumes that the input within the brackets will always be a number but I think it solves the problem you're trying to solve.
const str = "«1»<br>«2»some text<br>«3»<br>«4»more text<br>«5»<br>«6»even more text<br>";
console.log(str.replace(/(«(\d+)»<br>)/g, ""));
/(«(\d+)»<br>)/g
«(\d+)» Will match any brackets containing 1 or more digits in a row
If you would prefer to match alphanumeric you could use «(\w+)» or for any characters including symbols you could use «([^»]+)»
<br> Will match a line break
/g Matches globally so that it can find every instance of the substring
Basically we are only removing the bracketed numbers if they are immediately followed by a line break.
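Putting the pieces together, here is a minimal sketch using the «([^»]+)» variant so that any non-empty marker content is accepted. The name removeEmptyMarkers is hypothetical, and the sketch assumes the line break is always written exactly as <br>.

```javascript
// Hypothetical helper: drop «…» markers that are immediately followed by <br>.
function removeEmptyMarkers(html) {
  // «[^»]+» matches a marker with any non-empty content; <br> must follow directly.
  return html.replace(/«[^»]+»<br>/g, '');
}

var markerSample = '«1»<br>«2»some text<br>«3»<br>«4»more text<br>«5»<br>«6»even more text<br>';
console.log(removeEmptyMarkers(markerSample));
```

Markers followed by text, such as «2»some text<br>, do not match because » is not directly followed by <br>.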

javascript replace text at second occurrence of "/"

I have this string
"/mp3/mysong.mp3"
I need to make this string look like this with javascript.
"/mp3/myusername/mysong.mp3"
My guess would be to find second occurrence of "/", then append "myusername/" there or prepend "/myusername" but I'm not sure how to do this in javascript.
Just capture the characters up to the second / symbol and store them in a group. Then replace the matched characters with the contents of group 1 plus the string /myusername
Regex:
^(\/[^\/]*)
Replacement string:
$1/myusername
DEMO
> var r = "/mp3/mysong.mp3"
undefined
> r.replace(/^(\/[^\/]*)/, "$1/myusername")
'/mp3/myusername/mysong.mp3'
OR
Use a lookahead.
> r.replace(/(?=\/[^/]*$)/, "/myusername")
'/mp3/myusername/mysong.mp3'
This (?=\/[^/]*$) matches the position just before the last / symbol. Replacing that zero-width match with /myusername will give you the desired result.
This works -
> "/mp3/mysong.mp3".replace(/(.*?\/)(\w+\.\w+)/, "$1myusername\/$2")
"/mp3/myusername/mysong.mp3"
Use this:
var str = "/mp3/mysong.mp3";
// Group 1 captures everything up to and including the 2nd "/"
var res = str.replace(/^((?:[^\/]*\/){2})/, "$1myusername/");
console.log(res);
This will insert the text myusername/ after the 2nd /.
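The "everything up to the n-th slash" idea can be generalized. The sketch below is an illustration only: insertAfterNthSlash and its parameters are made-up names, and n is assumed to be a positive integer.

```javascript
// Hypothetical helper: insert `segment` plus a "/" right after the n-th "/" of `path`.
function insertAfterNthSlash(path, segment, n) {
  // Builds ^(?:[^/]*/){n} : n runs of "non-slashes followed by a slash" from the start.
  var prefix = new RegExp('^(?:[^/]*/){' + n + '}');
  // A function replacer avoids any special handling of "$" inside `segment`.
  return path.replace(prefix, function (matched) {
    return matched + segment + '/';
  });
}

console.log(insertAfterNthSlash('/mp3/mysong.mp3', 'myusername', 2));
```

If the path has fewer than n slashes, the pattern does not match and the path is returned unchanged.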

Regex replace text outside html tag

I'm working on an autocomplete component that highlights all occurrences of the searched text. What I do is split the input text into words and wrap every occurrence of those words in a <strong> tag.
My code looks like this
inputText = 'marriott st';
text = "Marriott east side";
textSearch = inputText.split(' ');
for (var i in textSearch) {
var regexSearch = new RegExp('(?!<\/?strong>)' + textSearch[i], "i");
var textReplaced = regexSearch.exec(text);
text = text.replace(regexSearch, '< strong>' + textReplaced + '< /strong>');
}
For example, given the result: "marriott east side"
And the input text: "marriott st"
I should get
<strong>marriott</strong> ea<strong>st</strong> side
And i'm getting
<<strong>st</strong>rong>marriot</<strong>st </strong>rong>ea<<strong>st</strong> rong>s</strong> side
Any ideas how I can improve my regex in order to avoid occurrences inside the HTML tags? Thanks
/(?!<\/?strong>)st/
I would process the string in one pass. You can create one regular expression out of the search string:
var search_pattern = '(' + inputText.replace(/\s+/g, '|') + ')';
// `search_pattern` is now `(marriott|st)`
text = text.replace(RegExp(search_pattern, 'gi'), '<strong>$1</strong>');
You could even split the search string first, sort the words by length and combine them, to give a higher precedence to longer matches.
You definitely should escape special regex characters inside the string: How to escape regular expression special characters using javascript?.
Before each search, I suggest getting (or saving) the original search string to work on each time. For example, in your current case that means you could replace all '<strong>' and '</strong>' tags with ''. This will help keep your regEx simple, especially if you decide to add other html tags and formatting in the future.
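The suggestions above (one pass, longer words first, escaped special characters) can be combined roughly as follows. This is a sketch under assumptions: highlight and escapeRegExp are illustrative names, and text is assumed to be plain text with no markup in it yet.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(word) {
  return word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: wrap every occurrence of any search word in <strong>.
function highlight(text, inputText) {
  var words = inputText.split(/\s+/)
    .filter(Boolean)                                       // drop empty entries
    .sort(function (a, b) { return b.length - a.length; }) // longer words first
    .map(escapeRegExp);
  if (words.length === 0) { return text; }
  // One alternation, one pass: the inserted tags are never searched again.
  var pattern = new RegExp('(' + words.join('|') + ')', 'gi');
  return text.replace(pattern, '<strong>$1</strong>');
}

console.log(highlight('Marriott east side', 'marriott st'));
```

Because the whole string is processed in a single replace call, the "st" inside an inserted <strong> tag is never matched.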

Javascript Regular expression to remove unwanted <br>,

I have a JS string like this
<div id="grouplogo_nav"><br> <ul><br> <li><a class="group_hlfppt" target="_blank" href="http://www.hlfppt.org/">&nbsp;</a></li><br> </ul><br> </div>
I need to remove all <br> and &nbsp; that are only between > and <. I tried to write a regular expression, but didn't get it right. Does anybody have a solution?
EDIT:
Please note I want to remove only the tags between > and <
Avoid using regex on html!
Try creating a temporary div from the string, and using the DOM to remove any br tags from it. This is much more robust than parsing html with regex, which can be harmful to your health:
var tempDiv = document.createElement('div');
tempDiv.innerHTML = mystringwithBRin;
var nodes = tempDiv.childNodes;
for(var nodeId=nodes.length-1; nodeId >= 0; --nodeId) {
if(nodes[nodeId].nodeName === 'BR') { // nodeName is upper-case for HTML elements
tempDiv.removeChild(nodes[nodeId]);
}
}
var newStr = tempDiv.innerHTML;
Note that we iterate in reverse over the child nodes so that the remaining indices stay valid after removing a given child node. This loop only checks the direct children of tempDiv.
http://jsfiddle.net/fxfrt/
myString = myString.replace(/^(&nbsp;|<br>)+/, '');
... where /.../ denotes a regular expression, ^ denotes start of string, (&nbsp;|<br>) denotes "&nbsp; or <br>", and + denotes "one or more occurrences of the previous expression". And then simply replace that full match with an empty string.
s.replace(/(>)(?:&nbsp;|<br>)+(\s?<)/g,'$1$2');
Don't use this in production. See the answer from Phil H.
Edit: I try to explain it a bit and hope my english is good enough.
Basically we have two different kinds of parentheses here. The first and third pairs () are normal capturing parentheses. They are used to remember the characters that are matched by the enclosed pattern and to group the characters together. For the second pair, we don't need to remember the characters for later use, so we disable the "remember" functionality by using the form (?:) and only group the characters to make the + work as expected. The + quantifier means "one or more occurrences", so &nbsp; or <br> must be there one or more times. The last part (\s?<) matches a whitespace character (\s), which can be missing or occur one time (?), followed by the character <. $1 and $2 are a kind of variable that is replaced by the remembered characters of the first and third parentheses.
MDN provides a nice table, which explains all the special characters.
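For reference, here is that replacement as a runnable sketch. The name stripBetweenTags is made up, and the <br\s*\/?> part is an added assumption so that <br/> and <br /> are also covered. As noted above, a DOM-based approach is more robust for real HTML.

```javascript
// Hypothetical helper: strip &nbsp; and <br> runs that sit directly between ">" and "<".
function stripBetweenTags(html) {
  // (>) keeps the closing bracket; (\s?<) keeps an optional space plus the next "<".
  return html.replace(/(>)(?:&nbsp;|<br\s*\/?>)+(\s?<)/g, '$1$2');
}

console.log(stripBetweenTags('<ul><br> <li>&nbsp;</li><br /></ul>'));
```

An &nbsp; that is surrounded by ordinary text is not touched, because it is not directly preceded by >.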
You need to replace globally. Also, don't forget that the <br> tag can be written self-closed as <br />. Try this:
myString = myString.replace(/(&nbsp;|<br>|<br \/>)/g, '');
This worked for me; note the m flag for multi-line input:
myString = myString.replace(/(&nbsp;|<br>|<br \/>)/gm, '');
myString = myString.replace(/^(&nbsp;|<br>)+/, '');
hope this helps
