I'm trying to remove all the characters between the characters <p and </p> (basically all the attributes in the p tags).
With the following block of code, it removes everything, including the text inside the <p>
MyString.replace(/<p.*>/, '<p>');
Example: <p style="test" class="test">my content</p> gives <p></p>
Thank you in advance for your help!
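Try this regex: /<p [^>]*>/. The only change is that the closing bracket is excluded from the accepted characters. In your pattern, . matches any character, including >, and * is greedy, so the match runs on to the last > on the line and swallows the text and the closing tag. With [^>]* the match stops at the first >.
Edit: You can add the global flag to replace every occurrence: /<p [^>]*>/g. (The multi-line flag m only changes how ^ and $ behave, so it has no effect on this pattern.) Also, as one of the comments pointed out, dropping the p makes the pattern applicable to every tag, although that makes the replacement a bit harder because the tag name has to be put back. That regex is: /<[^>]*>/g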
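To make the difference concrete, here is a minimal sketch, assuming the input is a plain string. The helper name stripPAttributes is made up for illustration, and using \b instead of a literal space is my own tweak so that <pre> and similar tags are left alone.

```javascript
// Hypothetical helper: removes every attribute from <p ...> opening tags.
function stripPAttributes(html) {
  // \b keeps the match from touching <pre>, <param>, and so on.
  // [^>]* cannot run past the first ">", unlike the greedy ".*".
  return html.replace(/<p\b[^>]*>/g, '<p>');
}

const input = '<p style="test" class="test">my content</p>';
console.log(input.replace(/<p.*>/, '<p>')); // greedy version: the text is lost
console.log(stripPAttributes(input));       // text and closing tag survive
```

Attribute values that themselves contain a > would still cut the match short, so this is only safe for markup you know reasonably well.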
MyString.replace(/\<p.*<\/p>/, '<p></p>');
Related
I have a tag like <span style="font-size:10.5pt;\nfont-family:\nKaiTi"> and I want to remove every \n inside the tag.
Note: The tag could be anything (not fixed).
I want a regular expression that does this replacement in JavaScript.
You should be able to strip out the \n character before applying this HTML to the page.
Having said that, try this (\\n)
You can see it here: regex101
Edit: A bit of refinement and I have this (\W\\n). It works with the example you provided. It breaks down if you have spaces in the body of the tags (<span> \n </span>).
I've tried everything I know to do. Perhaps someone with more regex experience can assist?
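One way around the problem with \n in the body might be to do the replacement in two passes: find each tag first, then clean only the text of that tag. A rough sketch, assuming the \n may be either a real line break or a literal backslash followed by n (the question does not say which); the name stripNewlinesInTags is hypothetical.

```javascript
// Hypothetical helper: strips line breaks that occur inside tags only.
function stripNewlinesInTags(html) {
  // Outer pass finds each tag; the callback cleans just that tag's text.
  return html.replace(/<[^>]*>/g, (tag) =>
    // Removes a literal backslash + "n" as well as real LF / CRLF breaks.
    tag.replace(/\\n|\r?\n/g, '')
  );
}

const sample = '<span style="font-size:10.5pt;\nfont-family:\nKaiTi"> \n </span>';
console.log(JSON.stringify(stripNewlinesInTags(sample)));
```

The line break between <span ...> and </span> is left alone because it is never part of a tag match. As with any regex over HTML, an attribute value containing > would end the tag match too early.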
I'm looking for a solution similar to
Regex to replace multiple spaces with a single space
but instead of space the question is about <span>. It doesn't contain additional attributes in it such as class. It's just exactly 6 symbols <span> (no spaces, no nothing).
As result, the string
"<span>The <span><span><span><span>dog <span><span>has</span> a long</span> tail, and it </span></span></span>is RED</span></span>!"
should be replaced to
"<span>The <span>dog <span>has</span> a long</span> tail, and it </span></span></span>is RED</span><span>!"
(please don't pay attention closing spans will be more, additional modifications are expected thereafter).
P.S. Yes, you're right, you may want to ask whether 2+ consecutive spans may have spaces, tabs or even new lines in between. Honestly - yes, but even without spaces, tabs and new lines the answer will be useful. Thank you.
Try the following two replace calls (you can chain them):
If <span> or </span> is repeated directly after itself (two or more times), replace the whole run with a single tag:
.replace(/(<span>){2,}/g, "<span>")
.replace(/(<\/span>){2,}/g, "</span>")
By the way, regexr.com is a great place if you want to try out regex!
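Since the P.S. in the question mentions spaces, tabs and new lines between the spans, here is a sketch of a chained version that tolerates them. The function name collapseSpans is made up, and collapsing the closing tags as well is an assumption; drop the second replace if only the opening tags should be merged.

```javascript
// Hypothetical helper: collapses runs of identical <span> or </span> tags.
function collapseSpans(html) {
  return html
    // Two or more <span>, optionally separated by whitespace, become one.
    .replace(/<span>(?:\s*<span>)+/g, '<span>')
    // Same for closing tags; "/" has to be escaped inside a regex literal.
    .replace(/<\/span>(?:\s*<\/span>)+/g, '</span>');
}

console.log(collapseSpans('<span> <span>\n<span>a</span></span>\t</span>b'));
```

Note that the whitespace between the merged tags is discarded along with the duplicates.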
I'm trying to build a regular expression to replace brackets with other content, in this case, a div tag..
So, the text:
This is a [sample] text
Would become:
This is a <div>sample</div> text
The problem is, it should only replace when both brackets are found, and keep the content inside. If only one bracket is found, it should ignore it..
So
This is a [sample text
or
This is a ] sample text
Both would remain as is.
Also, it should match more than one occurrence, so:
This [is] a [sample] [text]
Would become
This <div>is</div> a <div>sample</div> <div>text</div>
And one last thing, it should remove (or at least ignore) nested brackets, so
This [[is a ] sample [[[ tag]]]
Would become
This <div>is a</div> sample <div> tag </div>
This is what I got until now:
function highlightWords(string){
return string.replace(/(.*)\[+(.+)\]+(.*)/,"$1<div>$2</div>$3");
}
It works in simple cases, but won't handle multiple occurrences and won't remove the nested brackets. Any regex masters around?
No need to describe content before and after brackets. You must forbid brackets in the content description, so use [^\][]+ instead of .+. Don't forget to add g for a global replacement:
function highlightWords(string){
return string.replace(/\[+([^\][]+)]+/g,"<div>$1</div>");
}
Note: you don't need to escape the closing square bracket outside a character class, it isn't a special character.
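If the stray spaces inside the brackets should go as well (the nested example in the question suggests so, though it is not consistent about it), a replacement function can trim the captured text. A small sketch under that assumption; highlightWordsTrimmed is a made-up name.

```javascript
// Hypothetical variant of highlightWords that also trims the captured text.
function highlightWordsTrimmed(text) {
  // \[+ and \]+ absorb nested brackets; [^\][]+ forbids brackets inside.
  return text.replace(/\[+([^\][]+)\]+/g, (match, inner) => `<div>${inner.trim()}</div>`);
}

console.log(highlightWordsTrimmed('This [is] a [sample] [text]'));
console.log(highlightWordsTrimmed('This [[is a ] sample [[[ tag]]]'));
console.log(highlightWordsTrimmed('This is a [sample text')); // no closing bracket, unchanged
```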
I'm trying to get the first letter in a paragraph and wrap it with a <span> tag. Notice I said letter and not character, as I'm dealing with messy markup that often has blank spaces.
Existing markup (which I can't edit):
<p> Actual text starts after a few blank spaces.</p>
Desired result:
<p> <span class="big-cap">A</span>ctual text starts after a few blank spaces.</p>
How do I ignore anything but /[a-zA-Z]/ ? Any help would be greatly appreciated.
$('p').html(function (i, html)
{
return html.replace(/^[^a-zA-Z]*([a-zA-Z])/g, '<span class="big-cap">$1</span>');
});
Demo: http://jsfiddle.net/mattball/t3DNY/
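One thing to be aware of: the pattern above matches the leading blanks together with the letter, so they are replaced too and the space shown in the desired result disappears. A sketch of a variant that keeps them, written as a plain string function so it can be tried without a page; wrapFirstLetter is a hypothetical name.

```javascript
// Hypothetical string-only version of the callback passed to .html().
function wrapFirstLetter(html) {
  // $1 = leading non-letters (kept as-is), $2 = first letter (wrapped).
  return html.replace(/^([^a-zA-Z]*)([a-zA-Z])/, '$1<span class="big-cap">$2</span>');
}

console.log(wrapFirstLetter('  Actual text starts after a few blank spaces.'));
```

With jQuery it could presumably be plugged in as $('p').html(function (i, html) { return wrapFirstLetter(html); });. The g flag is dropped because a pattern anchored with ^ can only match once anyway.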
I would vote against using JS for this task. It'll make your page slower and also it's a bad practice to use JS for presentation purposes.
Instead, I suggest using the :first-letter pseudo-element to assign additional styles to the first letter of the paragraph. Here is the demo: http://jsfiddle.net/e4XY2/. It should work in all modern browsers except IE7.
Matt Ball's solution is good, but if your paragraph starts with an image, other markup, or quotes, the regex will not just fail but break the HTML.
For instance:
<p><strong>Important</strong></p>
or
<p>"Important"</p>
You can avoid breaking the HTML in these cases by adding "'< to the excluded initial characters, though in that case no span will be wrapped around the first character.
return html.replace(/^[^a-zA-Z'"<]*([a-zA-Z])/g, '<span class="big-cap">$1</span>');
I think optimally you may wish to wrap the first character after a ' or ".
I would, however, consider it best not to wrap the character if it is already inside markup, but that probably requires a second replace pass.
I do not seem to have permission to reply to an answer, so forgive me for doing it like this. The answer given by Matt Ball will not work if the P contains another element as its first child. Go to the fiddle and add an IMG (very common) as the first child of the P, and the I from IMG will turn into a drop cap.
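A possible workaround for the leading-element case is to let the pattern step over any tags at the start and wrap the first letter of the actual text instead. This is only a sketch with a made-up name (wrapFirstTextLetter); it assumes tags contain no > inside attribute values, and it deliberately wraps a letter even when it sits inside markup such as <strong>, which may or may not be what you want.

```javascript
// Hypothetical helper: wraps the first letter of the text, stepping over
// any tags that come before it.
function wrapFirstTextLetter(html) {
  // Group 1: leading tags (each optionally preceded by whitespace), then
  //          any non-letter text such as spaces or quotes.
  // Group 2: the first letter that is not part of a tag.
  const pattern = /^((?:\s*<[^>]*>)*[^a-zA-Z<]*)([a-zA-Z])/;
  return html.replace(pattern, '$1<span class="big-cap">$2</span>');
}

console.log(wrapFirstTextLetter('<img src="a.png"> Hello'));
console.log(wrapFirstTextLetter('<strong>Important</strong>'));
console.log(wrapFirstTextLetter('"Important"'));
```

If text is followed by a tag before any letter appears, the pattern does not match and the HTML is returned unchanged.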
If you use the x flag, you can have the regex engine ignore whitespace in the pattern (note that JavaScript's built-in RegExp does not support this flag, and it only affects whitespace in the pattern, not in the input). Then use something like this:
/^([a-zA-Z]).*$/
You know what format your first character should be, and it should grab only that character into a group. If you could have other characters other than whitespace before your first letter, maybe something like this:
/.*?([a-zA-Z]).*/
This lazily skips any other characters first and then captures the first letter into a group, which you can then wrap in a span tag.
I know that regex usually should not be used for parsing HTML content. In my special case I need it
(the reason is that I am using an RTE editor, and when pasting into the editor some attributes of paragraphs need to be replaced).
I have something like
<p attribute1="val1" attribute2="val2" attribut="val3" ...>text blah blah</p>
and I need all attributes stripped out so that I get
<p>text blah blah</p>
How can this be done using a regex?
A solution to strip out attributes from all possible html tags is appreciated too.
Something like this should work on all tags:
replace(/<\s*(\w+).*?>/, '<$1>')
For paragraphs only, just replace the \w:
replace(/<\s*p.*?>/, '<p>')
The \s* in the beginning allows for whitespace before the tag name, so if you for some reason have < p class="foo">, it works on that too.
Because an html tag cannot have spaces before the tag name and can continue over multiple lines I would recommend this instead:
replace(/<(\w+)(.|[\r\n])*?>/, '<$1>');
And for paragraphs only:
replace(/<p\s+?(.|[\r\n])*?>/, '<p>');
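For the all-tags case, here is a sketch that adds the g flag and consumes quoted attribute values whole, so a > inside quotes does not end the tag early. The name stripAllAttributes is hypothetical, and the pattern is my own variation on the ones above rather than a tested production solution.

```javascript
// Hypothetical helper: drops every attribute from every opening tag.
function stripAllAttributes(html) {
  // After the tag name, quoted values are consumed as a unit; the last
  // alternative also matches line breaks, so multi-line tags are covered.
  return html.replace(/<(\w+)(?:"[^"]*"|'[^']*'|[^>"'])*>/g, '<$1>');
}

console.log(stripAllAttributes('<p attribute1="val1"\n attribute2="val2">text blah blah</p>'));
console.log(stripAllAttributes('<a title="a > b" href=\'x\'>link</a>'));
```

Closing tags are not touched because the name must follow the < directly. A self-closing tag such as <br/> comes out as <br>, which may matter if the editor expects XHTML.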
perl -lpe 's/(<\w+)\s+[^>]*/$1/'