JavaScript RegExp - How to match a word based on conditions - javascript

I'm building a search results page (in Angular) and using regular expressions to highlight the searched keywords based on a condition. I'm having trouble getting the RegExp condition right, so apologies if my current syntax is messy; I've been playing about for hours.
Basically, for this test I'm highlighting the word 'midlands', and I want to highlight every occurrence except those inside the href="" attribute of an <a> tag. I don't want to highlight anything that's a part of the URL, because I'll be wrapping the keywords in a span and that would break the URL structure. Can anyone help? I think I'm almost there.
Here's the current RegExp I'm using:
/(\b|^|)(\s|\()midlands(\b|$)(|\))/gi
Here's a link to test what I'm after.
https://regex101.com/r/wV4gC3/2
Further info, if this helps anyone: after the view has rendered, I grab the HTML content of the repeating results and then run the search on the rendered HTML with the condition above.

You're going about this all wrong. Don't parse HTML with regular expressions - use the DOM's built in HTML parser and explicitly run the regex on text nodes.
First we get all the text nodes. With jQuery that's:
var texts = $(elem).contents().get().filter(function (el) {
  return el.nodeType === 3; // 3 is a text node
});
Otherwise - see the answer here for code for getting all text nodes in VanillaJS.
Then, iterate them and replace the relevant text only in the text nodes:
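for (var text of texts) { // in old browsers: angular.forEach(texts, function (text) { ... })
  var html = text.textContent.replace(/midlands/g, function (m) {
    return "<b>" + m + "</b>"; // surround each match with <b>
  });
  // Assigning markup to textContent would display the tags literally,
  // so swap the text node for the parsed HTML instead.
  // Assumes the text itself holds no literal < or & characters.
  $(text).replaceWith(html);
}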
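For the vanilla JS route mentioned above, here is a minimal sketch of how the text nodes could be collected and how a keyword could be split out of their text. The names collectTextNodes, splitOnKeyword and skipTags are made up for this illustration, and the walker only relies on nodeType, nodeName and childNodes, so it works on real DOM nodes as well as on plain objects shaped like them. Attribute values such as href are never text nodes, so they are left alone automatically.

```javascript
// Recursively collect every text node (nodeType 3) below `root`,
// skipping subtrees whose nodeName is listed in `skipTags` (e.g. "SCRIPT").
function collectTextNodes(root, skipTags) {
  var found = [];
  (function walk(node) {
    var children = node.childNodes || [];
    for (var i = 0; i < children.length; i++) {
      var child = children[i];
      if (child.nodeType === 3) {
        found.push(child);
      } else if (child.nodeType === 1 && skipTags.indexOf(child.nodeName) === -1) {
        walk(child);
      }
    }
  })(root);
  return found;
}

// Split `text` into plain and matched segments (case-insensitive), so each
// segment can later become either a text node or a highlighted element.
function splitOnKeyword(text, keyword) {
  // Escape regex metacharacters so the keyword is matched literally.
  var escaped = keyword.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  var pattern = new RegExp("(" + escaped + ")", "gi");
  return text
    .split(pattern)
    .filter(function (part) { return part !== ""; })
    .map(function (part) {
      return { text: part, match: part.toLowerCase() === keyword.toLowerCase() };
    });
}
```

In a browser, each segment with match set to true could then be turned into an element created with document.createElement("span"), and every other segment into a node from document.createTextNode, before replacing the original text node with them.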

Related

Jquery detect commands/Custom BBcode

So I want to be able to detect what a user on my forums writes in a post and change its CSS accordingly. For example,
[hl color:'yellow']example test[/hl]
should apply a highlight to the text:
style="background-color: yellow"
I want the jQuery code to detect [hl color: and, if successful, save the value between the ' ' in a variable, then test for the closing ]. I then want it to apply style="background-color: " + var to the text after the ] and before the [/hl].
Thanks in advance.
Current non-working code:
$('.post').each(function() {
if($(this:contains('[hl color:'))) {
var txt = [I have no idea];
$(select the text in between tags).attr('style', 'background-color:' + txt);
}
});
Option 1: Use a library
There are already plenty of JavaScript libraries that parse BBCode. This one looks promising, and is extensible so that you can add your own tags. You could also consider doing the parsing on the server side (in PHP or whatever you are using) using a library there (like jBBCode for PHP).
Option 2: Do it yourself
No actual jQuery is needed for this. Instead, for simple tags regex does the trick:
function bbcodeToHTML(bbcode) {
return bbcode.replace(
/\[\s*hl\s*color\s*:\s*(.*?)\s*\](.*?)\[\s*\/\s*hl\s*\]/gi,
'<span style="background-color:$1;">$2</span>'
);
}
So what does this regex do?
\[\s*hl\s*color\s*:\s*: Literal [, then any number of whitespaces, then hl, then any number of whitespaces, then color, then any number of whitespaces, then literal :, then any number of whitespaces.
(.*?): Captures (as $1) any characters lazily. This means that it tries to match as few characters as possible.
\s*\]: Ends the opening tag.
(.*?): Same as above, but captures as $2.
\[\s*\/\s*hl\s*\]: Ending tag, with any number of whitespaces thrown in.
g: Global flag. Replaces all matches and does not stop after the first.
i: Case-insensitive flag. Matches both HL and hl.
See it in action here.
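One thing to watch with the sample input from the question: in [hl color:'yellow'] the quotes are part of what (.*?) captures, so they end up inside the style attribute. Below is a hedged variant, not the original answer's code, that strips optional quotes and only accepts a plain colour name or a hex value. The function name bbcodeToSafeHTML and the accepted colour formats are assumptions made for this sketch.

```javascript
// Variant of bbcodeToHTML: optional quotes around the colour are dropped,
// and only simple colour names (letters) or #hex values are accepted.
// Tags with any other colour value are left untouched.
function bbcodeToSafeHTML(bbcode) {
  return bbcode.replace(
    /\[\s*hl\s+color\s*:\s*['"]?([a-z]+|#[0-9a-f]{3,6})['"]?\s*\]([\s\S]*?)\[\s*\/\s*hl\s*\]/gi,
    '<span style="background-color:$1;">$2</span>'
  );
}

// Example call with the input from the question.
var highlighted = bbcodeToSafeHTML("[hl color:'yellow']example test[/hl]");
```

Restricting $1 this way keeps arbitrary text out of the style attribute. The content in $2 is still inserted as it is, which matches the original function's behaviour.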
Replacing the content of forum posts
Now you will need some jQuery. I will assume that the elements that contain forum posts in BBCode that you want replaced with HTML all have a class named post so that they can be identified. Then this code would do the job:
// Iterate over all forum posts.
jQuery('.post').each(function() {
  // Get a reference to the current post.
  var currentPost = jQuery(this);
  // Get the content and turn it into HTML.
  var postHTML = bbcodeToHTML(currentPost.html());
  // Put the HTML into the post.
  currentPost.html(postHTML);
});
For more info on jQuery, you can always check out the documentation.

Slice text in two without breaking tags in jQuery

I have the following code that I managed to put together by combining different resources. It takes the HTML of some content and breaks it into two halves (for a "read more" application). The code is written so that it doesn't break a word (it cuts at the last space before the limit).
var minCharCount = 600;
var divcontent = $('#myDiv').html();
var firstHalf = divcontent.substr(0, minCharCount);
firstHalf = firstHalf.substr(0, Math.min(firstHalf.length, firstHalf.lastIndexOf(" ")));
var secondHalf = divcontent.substr(firstHalf.length, divcontent.length);
However, one last issue with this is that it can break HTML tags, resulting in bad markup. Is there a way to make sure that the code splits the content only after any open tag has ended?
Edit: maybe it was a little difficult to understand. What I want is:
long text comes here with tags like <b>bold</b> or even <i>italic</i>.
          ^1                             ^2    ^3
So my point is: if we break at 1 it's fine, but if we break at 2 and append the two parts somewhere, we get problems. So before breaking at 2 we need to check whether it is in the middle of a tag. If it is, then wait until the tag ends and then break, i.e. at 3.
WORKING DEMO
Instead of
$('#myDiv').html();
run the function on the string returned from
$('#myDiv').text();
This way you don't get any HTML tags in the input string.
http://api.jquery.com/text/
UPDATE:
(in response to comment)
Since you want the HTML tags, you can loop through the children() of the target, measure the length of their .text(), and add them to the output until you reach the minCharCount amount. Then do the same inside the last child you reached, until you get as close as possible to the target character count.
Note that children() excludes text nodes, so you have to use contents() instead.
However, this approach is cumbersome. I think your best bet is to create a Range object,
see here: https://developer.mozilla.org/en-US/docs/Web/API/Document.createRange
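As a rough sketch of the first step of the Range idea: walk the nodes in document order, count only the characters in text nodes, and report the text node and offset where the target count is reached. The helper name locateTextOffset is invented for this example, and it only reads nodeType, nodeValue and childNodes, so it can be tried on plain objects as well as on real DOM nodes.

```javascript
// Find the text node and the offset inside it at which `charCount`
// characters of text (tags excluded) have been consumed.
// Returns null when the tree contains fewer than `charCount` characters.
function locateTextOffset(root, charCount) {
  var remaining = charCount;
  var result = null;
  (function walk(node) {
    if (result) return;
    if (node.nodeType === 3) { // text node
      var length = node.nodeValue.length;
      if (remaining <= length) {
        result = { node: node, offset: remaining };
      } else {
        remaining -= length;
      }
      return;
    }
    var children = node.childNodes || [];
    for (var i = 0; i < children.length && !result; i++) {
      walk(children[i]);
    }
  })(root);
  return result;
}
```

In a browser, the returned pair could be handed to a range created with document.createRange(): call setStart(root, 0) and setEnd(result.node, result.offset), then cloneContents() or extractContents() should yield the first part as a fragment with its tags properly closed. Moving the offset back to the previous space first would keep the "don't break a word" behaviour.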

How to manipulate particular character in DOM in JavaScript?

Suppose I have the text "Hello World" inside a DIV in an HTML file. I want to manipulate the character at the sixth position of "Hello World" and write the result back to the DOM, using innerHTML or something like that.
The way I do it is
var text = document.getElementById("divID").innerText;
Now that I have the text, I manipulate it using charAt for the particular position and then update the HTML by replacing the whole string, not just the character at that position. What I want to ask is: do we have to replace the whole string every time, or is there a way to extract the character at a particular position and replace only that position, not the whole string or text inside the div?
If you just need to insert some text into an already existing string you should use replace(). You won't really gain anything by trying to replace only one character as it will need to make a new string anyway (as strings are immutable).
jsFiddle
var text = document.getElementById("divID").innerText;
// find and replace
document.getElementById("divID").innerText = text.replace('hello world', 'hello big world');
var newtext = text.replace(text[6], 'b'); should work here. Glad you asked, I didn't know that would work.
Note that text[6] is just a one-character string, and replace() with a string pattern swaps only the first occurrence of it. So this changes position 6 only when that character does not appear earlier in the text.
Yes, you have to replace the entire string by another, since strings are immutable in JavaScript. You can in various ways hide this behind a function call, but in the end what happens is construction of a new string that replaces the old one.
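A small sketch of what "hiding this behind a function call" could look like. The helper name replaceAt is made up for this example. It builds a new string from the parts before and after the index, and the comparison at the end shows why replace() with the character itself is not equivalent when that character occurs earlier in the string.

```javascript
// Return a copy of `text` in which the single character at `index`
// is replaced by `replacement`. The original string is not modified.
function replaceAt(text, index, replacement) {
  return text.slice(0, index) + replacement + text.slice(index + 1);
}

var greeting = "Hello World";
var byIndex = replaceAt(greeting, 7, "0");            // targets index 7 (the second "o")
var byCharacter = greeting.replace(greeting[7], "0"); // targets the first "o" it finds
```

The result could then be written back in one go, for example by assigning it to the element's textContent.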
Text within a div is actually held in text nodes, so we have to explicitly manipulate their content by replacing the older content with the newer one.
If you are using jQuery then you can refer to the link below for a possible technique:
Replacing Text Nodes With jQuery: http://www.bennadel.com/blog/2253-Replacing-Text-Nodes-With-jQuery.htm
Behind the scenes, I would guess that jQuery still replaces the entire string for that text node.

Unable to parse the JSON correctly

From a response of type application/x-javascript I am picking the required JSON portion into a variable. Below is the JSON:
{
"__ra":1,
"payload":null,
"data":[
[
"replace",
"",
true,
{
"__html": "\u003Cspan class=\"highlight fsm\" id=\"u_c_0\">I want this text only\u003C\/span>"
}
]
]
}
From the references I found on Stack Overflow, I am able to pick the content inside data in the following way:
var temp = JSON.parse(resp).data;
But my aim is to get only the text part of the __html value, which is "I want this text only". Can somebody help?
First you have to access the object you targeted:
var html = JSON.parse(resp).data[0][3].__html;
But then the output you want is I want this text only
The html variable doesn't contain just that text but some HTML, where the content you're looking for is the text inside a span.
If you accept including jQuery in your project, you can access that content this way:
var text = $(html).text();
To put it all together without jQuery:
var html = JSON.parse(resp).data[0][3].__html;
var div = document.createElement("div");
div.innerHTML = html;
var text = div.textContent || div.innerText || "";
Kudos to @Tim Down for this answer on cross-browser innerHTML: JavaScript: How to strip HTML tags from string?
First you'll need to be a bit more specific with that data to get to the string of text you want:
var temp = JSON.parse(resp).data[0][3]['__html'];
Next you'll need to search that string to extract the data you want. That will largely depend on the regularity of the response you are getting. In any case, you will probably need to use a regular expression to parse the string you get in the response.
In this case, you are trying to get the text within the <span> element in the string. If that was the case for all your responses, you could do something like:
var text = /<span[^>]*>([^<]*)<\/span>/.exec(temp)[1];
This very specifically looks for text within the opening and closing of one span tag that contains no other HTML tags.
The main part to look at in the expression here is the ([^<]*), which will capture any character that is not an opening angled bracket, <. Everything around this is looking for instances of <span> with optional attributes. The exec is the method you call on the expression, passing the temp string, to return a match, and the [1] will give you the first and only capture (i.e. the text between the <span> tags).
You would need to read up more about RegExp to find out how to do something more specific (or provide more specific information in your question about the pattern of response you are looking for). But it's generally well worth reading up on regular expressions if you're going to be doing this kind of work (parsing text, looking for patterns and matches), because they are a very concise and powerful way of doing it, if a little confusing at first.
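Putting the two steps of this answer together, a minimal sketch could look like the following. The function name extractSpanText is made up for this example, and it assumes the response always has the data[0][3].__html shape shown in the question. Because exec returns null when nothing matches, the sketch checks the result before reading [1].

```javascript
// Parse the response, pick the __html string, and return the text inside
// the first <span> that contains no nested tags (or null if there is none).
function extractSpanText(resp) {
  var html = JSON.parse(resp).data[0][3].__html;
  var match = /<span[^>]*>([^<]*)<\/span>/.exec(html);
  return match ? match[1] : null;
}

// Sample response rebuilt from the JSON in the question.
var sampleResp = JSON.stringify({
  __ra: 1,
  payload: null,
  data: [[
    "replace",
    "",
    true,
    { __html: '<span class="highlight fsm" id="u_c_0">I want this text only</span>' }
  ]]
});
var spanText = extractSpanText(sampleResp);
```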

How do I extract the title value from a string using Javascript regexp?

I have a string variable from which I would like to extract the title value of the id="resultcount" element. The output should be 2.
var str = '<table cellpadding=0 cellspacing=0 width="99%" id="addrResults"><tr></tr></table><span id="resultcount" title="2" style="display:none;">2</span><span style="font-size: 10pt">2 matching results. Please select your address to proceed, or refine your search.</span>';
I tried the following regex but it is not working:
/id=\"resultcount\" title=['\"][^'\"](+['\"][^>]*)>/
Since var str = ... is JavaScript syntax, I assume you need a JavaScript solution. As Peter Corlett said, you can't parse HTML using regular expressions, but if you are using jQuery you can take advantage of the browser's own parser without much effort:
$('#resultcount', '<div>'+str+'</div>').attr('title')
It will return undefined if resultcount is not found or has no title attribute.
To make sure it doesn't matter which attribute (id or title) comes first in the string, first take the entire HTML element with the required id:
var tag = str.replace(/^.*(<[^<]+?id=\"resultcount\".+?\/.+?>).*$/, "$1")
Then find the title in the resulting string:
var res = tag.replace(/^.*title=\"(\d+)\".*$/, "$1");
// res is 2
But, as people have previously mentioned, it is unreliable to use RegEx for parsing HTML: something as trivial as a different quote (single instead of double) or a space in the "wrong" place will break it.
Please see this earlier response, entitled "You can't parse [X]HTML with regex":
RegEx match open tags except XHTML self-contained tags
Well, since no one else is jumping in on this and I'm assuming you're just looking for a value and not trying to create a parser, I'll give you what works for me with PCRE. I'm not sure how to put it into the JavaScript format for you, but I think you'll be able to do that.
span id="resultcount" title="(\d+)"
The part you're looking to get is the capturing group $1, which is the '\d+' part. It will get one or more digits between the quote marks.
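For reference, a hedged sketch of how this could look in JavaScript. It follows the two-step idea from the earlier answer (grab the opening tag with the id first, then read title from it) so that attribute order and quote style matter less. The function name getResultCountTitle is made up, and the usual caveat about regular expressions and HTML still applies.

```javascript
// Step 1: find the opening <span ...> tag that carries id="resultcount".
// Step 2: read the title attribute from that tag, with either quote style.
// Returns the title value as a string, or null when either step fails.
function getResultCountTitle(html) {
  var tag = /<span\s[^>]*\bid="resultcount"[^>]*>/i.exec(html);
  if (!tag) return null;
  var title = /\btitle=(["'])(.*?)\1/i.exec(tag[0]);
  return title ? title[2] : null;
}

var sampleStr = '<table cellpadding=0 cellspacing=0 width="99%" id="addrResults"><tr></tr></table><span id="resultcount" title="2" style="display:none;">2</span><span style="font-size: 10pt">2 matching results. Please select your address to proceed, or refine your search.</span>';
var resultCount = getResultCountTitle(sampleStr);
```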
