Regular Expression find match in text not after word - javascript

I'm trying to find all occurrences of the smiley text ":D" that do not follow "mailto".
The problem is that I'm detecting smileys in HTML text, and sometimes the match is incorrect.
For example, in "mailto:Denis#email.com" the smiley pattern ":D" should not be matched, because it appears inside an email address.
This is my attempt, which doesn't work: http://regexr.com/3a3at
Could somebody please help?

JavaScript (what a shame!) traditionally doesn't support lookbehinds; they only arrived with ES2018. The most portable option would therefore be something like
input
    .replace(/(mailto|https?):/g, "$1\x00")
    .replace(/:D/g, "smile")
    .replace(/\x00/g, ":")
Basically, replace "unwanted" colons with \x00 (or some other symbol that is unlikely to appear in the input), process smiles and then replace \x00s back to colons.
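Engines that implement ES2018 lookbehind assertions can skip the placeholder step. The sketch below puts both variants side by side. The function names and the "smile" replacement are illustrative, and the second function assumes a runtime that supports `(?<!...)`.

```javascript
// Placeholder approach from the answer above: hide the colons that belong to
// "mailto:" / "http(s):", replace the smileys, then restore the colons.
function replaceSmileysWithPlaceholder(input, replacement) {
  return input
    .replace(/(mailto|https?):/g, "$1\x00")
    .replace(/:D/g, () => replacement)
    .replace(/\x00/g, ":");
}

// Lookbehind approach (ES2018+): match ":D" only when it is not
// immediately preceded by "mailto", "http" or "https".
function replaceSmileysWithLookbehind(input, replacement) {
  return input.replace(/(?<!mailto|https?):D/g, () => replacement);
}

const sample = "Hi :D, write to mailto:Denis@email.com";
console.log(replaceSmileysWithPlaceholder(sample, "smile"));
console.log(replaceSmileysWithLookbehind(sample, "smile"));
```

Both calls leave the address untouched and replace only the free-standing smiley. The placeholder version still works in older engines, provided the input never contains \x00 itself.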

Related

Replace several consecutive <span> tags with one <span>

I'm looking for a solution similar to
Regex to replace multiple spaces with a single space
but instead of space the question is about <span>. It doesn't contain additional attributes in it such as class. It's just exactly 6 symbols <span> (no spaces, no nothing).
As a result, the string
"<span>The <span><span><span><span>dog <span><span>has</span> a long</span> tail, and it </span></span></span>is RED</span></span>!"
should be replaced with
"<span>The <span>dog <span>has</span> a long</span> tail, and it </span></span></span>is RED</span></span>!"
(please ignore the fact that there will be more closing spans than opening ones; additional modifications are expected afterwards).
P.S. Yes, you're right to ask whether 2+ consecutive spans may have spaces, tabs or even new lines in between. Honestly, yes, but even an answer that ignores spaces, tabs and new lines will be useful. Thank you.
Try the following two replace calls (you can chain them).
If <span> or </span> is repeated directly after itself (two or more times), replace the whole run with a single tag:
.replace(/(<span>){2,}/g, "<span>")
.replace(/(<\/span>){2,}/g, "</span>")
By the way, regexr.com is a great place if you want to try out regex!
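The P.S. in the question also asks about whitespace between the repeated tags. A minimal sketch of that variant follows; the function name is made up for illustration, and it collapses only opening tags, which is what the expected output in the question shows.

```javascript
// Collapse a run of two or more opening <span> tags, optionally separated by
// spaces, tabs or newlines, into a single <span>. Whitespace after the last
// tag of the run is left alone.
function collapseOpeningSpans(html) {
  return html.replace(/<span>(?:\s*<span>)+/g, "<span>");
}

const input =
  "<span>The <span><span><span><span>dog <span><span>has</span> a long</span>" +
  " tail, and it </span></span></span>is RED</span></span>!";
console.log(collapseOpeningSpans(input));
console.log(collapseOpeningSpans("<span>\n\t<span> <span>x"));
```

Closing tags can be handled the same way by swapping in <\/span> if that is wanted later.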

Use Javascript to get the Sentence of a Clicked Word

This is a problem I'm running into and I'm not quite sure how to approach it.
Say I have a paragraph:
"This is a test paragraph. I love cats. Please apply here"
And I want a user to be able to click any one of the words in a sentence, and then return the entire sentence that contains it.
First you have to split the paragraph into elements, since you can't (easily) detect clicks on text that isn't wrapped in an element:
$('p').each(function() {
    // The capture group keeps each terminator in the split result,
    // so the punctuation mark ends up in a span of its own.
    $(this).html($(this).text().split(/([\.\?!])(?= )/).map(
        function(v){return '<span class=sentence>'+v+'</span>'}
    ));
});
Note that it correctly splits paragraphs like this one:
<p>I love cats! Dogs are fine too... Here's a number : 3.4. Please apply here</p>
Then you would bind the click:
$('.sentence').click(function(){
    alert($(this).text());
});
Demonstration
I don't know whether : counts as a separator between sentences in English. If so, it can of course be added to the regex.
First of all, be prepared to accept a certain level of inaccuracy. This may seem simple on the surface, but trying to parse natural languages is an exercise in madness. Let us assume, then, that all sentences are punctuated by ., ?, or !. We can forget about interrobangs and so forth for the moment. Let's also ignore quoted punctuation like "!", which doesn't end the sentence.
Also, let's try to grab quotation marks after the punctuation, so that "Foo?" ends up as "Foo?" and not "Foo?.
Finally, for simplicity, let's assume that there are no nested tags inside the paragraph. This is not really a safe assumption, but it will simplify the code, and dealing with nested tags is a separate issue.
$('p').each(function() {
    var sentences = $(this)
        .text()
        .replace(/([^.!?]*[^.!?\s][.!?]['"]?)(\s|$)/g,
                 '<span class="sentence">$1</span>$2');
    $(this).html(sentences);
});
$('.sentence').on('click', function() {
    console.log($(this).text());
});
It's not perfect (for example, quoted punctuation will break it), but it works for most ordinary prose.
Live demo: http://jsfiddle.net/SmhV3/
Slightly amped-up version that can handle quoted punctuation: http://jsfiddle.net/pk5XM/1/
Match the sentences. You can use a regex along the lines of /[^!.?]+[!.?]/g for this.
Replace each sentence with a wrapping span that has a click event to alert the entire span.
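The two steps above can be sketched as a plain string function. The names wrapSentences and escapeHtml are illustrative, and the regex is a slight variation on the one suggested: it also accepts a final sentence that has no terminator.

```javascript
// Escape characters that would otherwise be parsed as markup.
function escapeHtml(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Wrap every sentence of a plain-text string in <span class="sentence">.
// A sentence is a run of non-terminators followed by one or more of . ! ?
// or by the end of the text.
function wrapSentences(text) {
  return text.replace(/[^!.?]+(?:[!.?]+|$)/g, function (sentence) {
    const leading = sentence.match(/^\s*/)[0];
    const body = sentence.slice(leading.length);
    if (body === "") return sentence; // whitespace only, nothing to wrap
    return leading + '<span class="sentence">' + escapeHtml(body) + "</span>";
  });
}

console.log(wrapSentences("This is a test paragraph. I love cats. Please apply here"));
```

In the browser the result can be assigned with .html() and a single delegated handler such as $('p').on('click', '.sentence', ...) can report the clicked sentence. Like any regex of this kind, it will split at the dot in a number such as 3.4.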
I suggest you take a look at Selection and ranges in JavaScript.
There is no built-in method that returns the currently selected sentence, so you have to code that on your own...
Rangy is a JavaScript library for working with selections and ranges across browsers.
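One way to "code that on your own" is to take the caret offset that the Selection API reports and scan outwards to the nearest sentence terminators. This is only a sketch: sentenceAt is a made-up helper, and it assumes the paragraph is a single text node and that sentences end with ., ! or ?.

```javascript
// Return the sentence of `text` that contains the character at `offset`.
function sentenceAt(text, offset) {
  const isTerminator = (ch) => ch === "." || ch === "!" || ch === "?";
  let start = offset;
  while (start > 0 && !isTerminator(text[start - 1])) start--;
  let end = offset;
  while (end < text.length && !isTerminator(text[end])) end++;
  if (end < text.length) end++; // include the terminator itself
  return text.slice(start, end).trim();
}

const paragraph = "This is a test paragraph. I love cats. Please apply here";
console.log(sentenceAt(paragraph, paragraph.indexOf("love")));

// Browser-only usage (assumes the click landed inside a single text node):
// document.querySelector("p").addEventListener("click", function () {
//   const sel = window.getSelection();
//   if (sel && sel.anchorNode) {
//     console.log(sentenceAt(sel.anchorNode.textContent, sel.anchorOffset));
//   }
// });
```

Compared with wrapping each sentence in a span, this leaves the markup untouched, at the cost of handling nested elements yourself.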
Not sure how to get the complete sentence, but you can try this to get the text word by word, by splitting it on spaces.
<div id="myDiv" onmouseover="splitToSpans(this)" onclick="alert(event.target.innerHTML)">This is a test paragraph. I love cats. Please apply here</div>
function splitToSpans(element){
    // Already split into spans on an earlier mouseover
    if($(element).children().length)
        return;
    var arr = [];
    $($(element).text().split(' ')).each(function(){
        arr.push($('<span>'+this+' </span>'));
    });
    $(element).text('');
    $(arr).each(function(){$(element).append(this);});
}

Find and Highlight Arabic with diacritics Text in UIWebView

I'm displaying Arabic text with diacritics (tashkel) in a UIWebView.
I also have a Search View.
I want to give the UIWebView a keyword and have it find and highlight that keyword,
but the search should ignore the diacritics.
The main text should keep its diacritics.
Example:
if the text is " الَلَهُم صَلِ عَلى مُحَمّد و آل مُحمد "
and I tell the UIWebView to search for " محمد "
it should highlight "مُحَمّد " and " مُحمد " regardless of all the diacritics.
I can think of two approaches:
1- Do the find and highlight in JavaScript after the UIWebView loads.
2- Edit the text in Objective-C before loading the UIWebView.
You first have to strip all the diacritics from the string; then you can compare without them. Use a regular expression to remove the characters you don't want. Check out this fiddle I made: inside the strip() function you need to add all the diacritics you want to remove.
Hope this helps.
The version below also removes sukoon.
I have converted the eight diacritics (fatha, kasrah, damma, shaddah, sukoon and the three tanween forms) into their Unicode escapes. Arabic written as escape sequences is much easier to manipulate in any text editor:
<html>
<head><meta charset="UTF-8"></head>
<body>
<script>document.write("حَرْفٌ".replace(/(\u0652)|(\u0650)|(\u064C)|(\u064E)|(\u064B)|(\u064F)|(\u064D)|(\u0651)/g,""));</script>
</body>
</html>
Resources used:
Arabic Keyboard with diacritics
Unicode Code Converter
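Stripping the diacritics is enough for comparison, but the question also wants the original text to keep them. One possible way, sketched here under the assumption that the search runs over plain text rather than markup, is to build a regex from the keyword that allows any of the eight diacritics after every letter. The function names and the "highlight" class are made up for illustration.

```javascript
// U+064B..U+0652: the three tanween forms, fatha, damma, kasra, shadda, sukoon.
const DIACRITICS = "\u064B-\u0652";

function stripDiacritics(text) {
  return text.replace(new RegExp("[" + DIACRITICS + "]", "g"), "");
}

// Build a regex that matches the keyword with optional diacritics after each letter.
function buildDiacriticInsensitiveRegex(keyword) {
  const optionalMarks = "[" + DIACRITICS + "]*";
  const pattern = Array.from(stripDiacritics(keyword))
    .map((ch) => ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&") + optionalMarks)
    .join("");
  return new RegExp(pattern, "g");
}

// Wrap every match in a span; the matched text keeps its diacritics.
function highlight(text, keyword) {
  if (stripDiacritics(keyword) === "") return text;
  return text.replace(
    buildDiacriticInsensitiveRegex(keyword),
    '<span class="highlight">$&</span>'
  );
}

// "muhammad" written twice with different diacritics, searched without any.
const arabicSample =
  "\u0645\u064F\u062D\u064E\u0645\u0651\u062F \u0648 \u0645\u064F\u062D\u0645\u062F";
console.log(highlight(arabicSample, "\u0645\u062D\u0645\u062F"));
```

Run over real HTML, this would also touch matches inside tags or attributes, so in a web view it is safer to apply it per text node.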

jQuery match first letter in a string and wrap with span tag

I'm trying to get the first letter in a paragraph and wrap it with a <span> tag. Notice I said letter and not character, as I'm dealing with messy markup that often has blank spaces.
Existing markup (which I can't edit):
<p> Actual text starts after a few blank spaces.</p>
Desired result:
<p> <span class="big-cap">A</span>ctual text starts after a few blank spaces.</p>
How do I ignore anything but /[a-zA-Z]/ ? Any help would be greatly appreciated.
$('p').html(function (i, html) {
    // Keep any leading non-letters ($1) and wrap only the first letter ($2)
    return html.replace(/^([^a-zA-Z]*)([a-zA-Z])/, '$1<span class="big-cap">$2</span>');
});
Demo: http://jsfiddle.net/mattball/t3DNY/
I would vote against using JS for this task. It'll make your page slower and also it's a bad practice to use JS for presentation purposes.
Instead I can suggest using :first-letter pseudo-class to assign additional styles to the first letter in paragraph. Here is the demo: http://jsfiddle.net/e4XY2/. It should work in all modern browsers except IE7.
Matt Ball's solution is good, but if your paragraph starts with an image, markup or quotes, the regex will not just fail but break the HTML,
for instance
<p><strong>Important</strong></p>
or
<p>"Important"</p>
You can avoid breaking the HTML in these cases by adding "'< to the excluded initial characters, though in that case no span is wrapped around the first character.
return html.replace(/^([^a-zA-Z'"<]*)([a-zA-Z])/, '$1<span class="big-cap">$2</span>');
I think that, optimally, you may wish to wrap the first character after a ' or ".
I would, however, consider it best not to wrap the character if it is already inside markup, but that probably requires a second replace pass.
I do not seem to have permission to reply to an answer, so forgive me for doing it like this. The answer given by Matt Ball will not work if the P contains another element as its first child. Go to the fiddle and add an IMG (very common) as the first child of the P, and the I from Img will turn into a drop cap.
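A regex-only variation that steps over leading tags and entities instead of stopping at them might look like the sketch below. It is a heuristic, not an HTML parser: wrapFirstLetter is a made-up name, and it assumes attribute values contain no > character. Unlike the suggestion above, it does wrap a letter that sits inside leading markup such as <strong>.

```javascript
// Skip leading tags, character entities and other non-letters, keep them as
// $1, and wrap the first real letter ($2) in a span.
function wrapFirstLetter(html) {
  const pattern = /^((?:<[^>]*>|&[a-zA-Z0-9#]+;|[^a-zA-Z<&])*)([a-zA-Z])/;
  return html.replace(pattern, '$1<span class="big-cap">$2</span>');
}

console.log(wrapFirstLetter(" Actual text starts after a few blank spaces."));
console.log(wrapFirstLetter("<strong>Important</strong>"));
console.log(wrapFirstLetter('<img src="a.png"> Text after an image'));
```

It can be passed to jQuery as $('p').html(function (i, html) { return wrapFirstLetter(html); }). If the markup is malformed (an unclosed tag or a stray &), the pattern does not match and the HTML is returned unchanged.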
The x flag, which makes the engine ignore whitespace in the pattern, is not supported by JavaScript's RegExp (and therefore not in jQuery either). Without it, you can use something like this:
/^([a-zA-Z]).*$/
You know what format your first character should be, and it should grab only that character into a group. If you could have other characters other than whitespace before your first letter, maybe something like this:
/.*?([a-zA-Z]).*/
This lazily matches any other characters first and then captures the first letter into a group, which you can then wrap in a span tag.

Use JS to replace text in Gmail message body

I want to write a GnuPG extension for Google Chrome. So far, everything works as expected: if I detect ASCII-armored ciphertext, I parse it with my extension and then replace it (after the password has been entered).
Gmail, however, litters the message body with an insane number of tags, so my simple JS approach doesn't work anymore. Is there something that can select a certain amount of visible text, no matter how many tags it contains, and replace it with some other text? (The tags don't need to survive.) That is, I want to decrypt the mail body in place.
What you need is something like this:
/<[^>]+>/g
This regexp matches every tag, so replacing the matches with nothing leaves plain text:
"<p>text <b>full</b> of <i>junk</i> and <u>unwanted</u> tags</p>".replace(/<[^>]+>/g, "");
As for selecting a specific part, you can use substring, I guess!
What I really needed to do was a little different:
1) expand my regex so it didn't care about tags:
var re = /-----[\s\S]+?-----[\s\S]+?-----[\s\S]+?-----/gm;
2) store all the matches, with tags
3) use the regex provided by gibatronic to remove tags, and then further process the cleaned text using gpg
4) use body.innerHTML.replace() to replace the matches from 1) with the processed text from 3)
It works now; the only problem is that it breaks Gmail. The site layout stays intact, but all buttons and links stop working. The only solution is to reload the page. Gotta fix this :S
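The steps above can be sketched as one string function. Everything here is illustrative: replaceArmoredBlocks is a made-up name, and processFn stands in for whatever the extension does with gpg. The sketch also turns <br> into newlines before stripping tags, on the assumption that the armor's line structure matters to the decryption step.

```javascript
// Same pattern as in step 1: three "-----" delimited segments, tags and all.
const ARMOR_RE = /-----[\s\S]+?-----[\s\S]+?-----[\s\S]+?-----/g;

function stripTags(html) {
  return html.replace(/<br\s*\/?>/gi, "\n").replace(/<[^>]+>/g, "");
}

function escapeHtml(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Find each armored block in `html`, hand its tag-free text to `processFn`
// (hypothetical, e.g. a decrypt call) and splice the escaped result back in.
function replaceArmoredBlocks(html, processFn) {
  return html.replace(ARMOR_RE, function (block) {
    return escapeHtml(processFn(stripTags(block)));
  });
}

const mail =
  "<div>Hello</div><div>-----BEGIN PGP MESSAGE-----<br>abc<b>def</b><br>" +
  "-----END PGP MESSAGE-----</div>";
console.log(replaceArmoredBlocks(mail, (text) =>
  "DECRYPTED(" + text.split("\n").length + " lines)"));
```

As for the dead buttons: assigning to body.innerHTML rebuilds every node in the page, which discards the event listeners attached to the old nodes, so that is a likely cause. Applying the replacement only to the innerHTML of the element that holds the message body should leave the rest of the page alone.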
