Details about jQuery's .html() method - javascript

This is the description of the .html() method: "Get the HTML contents of the first element in the set of matched elements." My question is what exactly is the "HTML contents" it refers to?
Let's say I have a div whose HTML content I wish to store, but I also want to render it raw for viewing purposes. Let's say e is my div.
$(e).html() //this returns a string with what looks like regular HTML besides the occurrence of many HTML entities.
//if i search for greater than signs:
x.text().search("&gt") //I get 2273 which I'm assuming is the number of occurrences.
//However if I convert the HTML to a string I seem to get different content. Why?
$(output).val(e.html().toString()); //for viewing the output markup for a div
e.html().toString().search("&gt") //this now gives me 317, which is far fewer HTML entities than I had before
Would I be correct in saying that the .toString() method just corrupts my HTML content? Which one do I want? Whenever I render this to a raw div I get similar results for both even though they seem vastly different. Which one do I want in order to display and save the correct output information? Any knowledge is greatly appreciated!

&gt will match both > and > so when you run it on an html string will match all the tags as well as uses of > in text
It is hard to tell what you are comparing, since x in x.text() is not defined anywhere in your question.
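As a side note on the numbers in the question: String.prototype.search returns the index of the first match (or -1), not a count, and calling .toString() on a string returns the same string unchanged. So 2273 and 317 are most likely positions in two different strings (x.text() versus e.html()) rather than entity counts. A minimal sketch in plain JavaScript, where the sample markup is made up for illustration:

```javascript
// Made-up sample markup; in the page this would come from $(e).html().
const html = "<p>1 &gt; 0</p><p>2 &gt; 1</p>";

// search() returns the index of the first match (or -1), not a count.
const firstIndex = html.search("&gt");

// One way to count occurrences: collect all global matches.
const entityCount = (html.match(/&gt;/g) || []).length; // escaped entities
const rawCount = (html.match(/>/g) || []).length;       // literal > characters

// toString() on a string is a no-op, so it cannot change the content.
const sameString = html.toString() === html;

console.log(firstIndex, entityCount, rawCount, sameString);
```

Counting with a global regular expression as above is one option; splitting on the entity and subtracting one from the length would work as well.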

Related

Using jQuery to extract value from a row

I'm building a tic-tac-toe game with javascript. The issue is when I am checking to see if there are any winners on the board after each move.
When I run this jQuery function
$( "#row1")[0].innerHTML
The output is
"<span0>o</span0><span1>x</span1><span2>o</span2>"
Because each HTML element has a different span tag, I'm not quite sure how to check without writing out all the possibilities. I have looked on Stack Overflow and found this: Get array of values use JQuery?. It's quite similar, but it doesn't account for the different span tags (e.g. span0, span1, span2).
I'm trying to see how I can only get the 'o','x','o' from the list.
To get "oxo" as a string, which you can then process however you see fit, you can use:
// gets you "oxo"
$( "#row1").text();
If you want those characters in an array, you could do this:
// gets you ["o", "x", "o"]
$( "#row1").text().split("");
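Once the row is available as an array, the winner check itself no longer depends on the span names. A rough sketch, assuming every square always contains exactly one character (for example a placeholder such as "-" for an empty square); isWinningLine is a hypothetical helper name, not something from the question:

```javascript
// Hypothetical helper: true when all three cells hold the same player mark.
function isWinningLine(cells) {
  const first = cells[0];
  return (
    cells.length === 3 &&
    (first === "x" || first === "o") &&   // ignore placeholders for empty squares
    cells.every((cell) => cell === first)
  );
}

// In the page this string would come from $("#row1").text().
const rowText = "oxo";
const cells = rowText.split("");

console.log(cells, isWinningLine(cells));
```

The same helper could be reused for columns and diagonals by building the three-element array from the corresponding squares.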
I don't think those spans are valid HTML5 tags. Each of the span tags should just be <span>. If you are using the individual span names to insert text into them, it is better to do that by id, e.g. <span id="square1">. So wherever you are referencing $("span1") or $("#row1 span1"), you would instead reference the id like this: $("#square1") in order to insert the x and o text. There are other ways to do this, but for these purposes it is probably best to have 9 separate ids. This way the example in the link that you referenced to read them into an array is essentially what you need.
If you really don't want to do that, then give all of your span tags a class='box' class, e.g. <span class="box">. In this case the code to read into an array, based on the example in the link you provided, would have to change from $('#row1 span') to $('#row1 .box') (notice the period before "box", indicating that we are looking for classes rather than tag names). I don't like this second solution, because it doesn't fix the invalid HTML5 tags.
I suppose there may be a way to use a wildcard to search all elements that begin with "span" but that would just be way more ugly.
The code below will do the job:
var html = "<span0>o</span0><span1>x</span1><span2>o</span2>";
var values = $.map($(html), function (n, i) {
    return $(n).html();
});
console.log(values);
Here is how you can get it: loop through the child span elements of #row1 and alert the text of each one (which is its inner text):
$("#row1").children().each(function () {
    // "this" is the current child element, so wrap it directly rather than passing the string "this"
    alert($(this).text());
});

Replacing to HTML Character Entities and reverting back

When replacing things in my chat room, the text comes up in the box as the 'HTML Character Entities'. However, I want it to revert back and actually show the character that was typed when it is shown in the chat room. I am using the following code to stop any HTML from being entered and damaging the chat room, by replacing certain HTML characters with their entities (I want to get one or two working before I look at the others; I know there are many more):
Javascript
var str1 = this.value.replace(/>/g, '&#62;');
if (str1!=this.value) this.value=str1;
var str2 = this.value.replace(/</g, '&#60;');
if (str2!=this.value) this.value=str2;
The following code then displays the text after it has been entered into the database etc. When the chat box updates, it uses the following to add in the updated messages ...
Returned from php and then displayed through the following javascript
$('#chatroomarea').append($("<p>"+ data.text[i] +"</p>"));
I have messed around with this a few times changing it to val and using
.html(.append($("<p>"+ data.text[i] +"</p>")));
Etc. But I have had no luck. I'm not quite sure how to do this. I just need the HTML Character Entities to show up as their true characters instead of displaying something such as '&#62'
This might be something I need to put within the replacing code, where it will include code of its own when replacing, such as (this is just an example, I'm not exactly sure how I would write it) ....
var str1 = this.value.replace(/>/g, '.html(<)');
Any help on this would be much appreciated, Thank you.
$('#chatroomarea').append($("<xmp>"+ data.text[i] +"</xmp>"));
See the HTML xmp tag.
Its use is deprecated, but it is supported in most browsers.
Another option is to use a styled textarea. To my knowledge, these two are the tags that show HTML tags as typed instead of rendering them.
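An alternative that avoids the deprecated xmp tag is to leave the input untouched and escape the message only at output time, so the browser shows the typed characters instead of interpreting them as markup. A minimal sketch, assuming the message arrives as a plain string; escapeHtml is a hypothetical helper written for this illustration, not a jQuery function:

```javascript
// Hypothetical helper: escape the characters that are significant in HTML.
// The ampersand must be replaced first so later entities are not escaped twice.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Stand-in for data.text[i] from the question.
const message = '<b>hi</b> & "bye"';
const paragraph = "<p>" + escapeHtml(message) + "</p>";

console.log(paragraph);
```

With jQuery available, building the element and setting its content through .text(), for example $("<p>").text(data.text[i]), should give a similar result without a helper, because the string is inserted as text rather than parsed as HTML.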

Javascript: document innerHTML replace breaks forms

I'm currently trying to replace a piece of plain text in a page that also contains a form. I am aware that upon replacing code containing a form, the form elements get recreated. This can break forms (and it does on the webpage I'm manipulating).
Usually, I go about this by using the "getElementsByTagName" function, to make sure that I don't need to replace the code containing the form and this has always been possible so far. However at this point, I have arrived at a page where the smallest tagname is a div that contains the text I need to replace and a form. This div is further subdivided in tables so initially I thought "let's get elements by table", but exactly the piece that I need to replace is not subdivided in a table.
So I used this code to replace:
document.documentElement.innerHTML = document.documentElement.innerHTML.replace(RegEx, replaceString);
Of course, this breaks the form on the page, which is not wanted behavior.
Does anyone have any idea how to go about this without breaking the form? Is it possible to somehow get a reference to the part of the div that does not contain a table? Is it possible to alter just part of the code? Right now I take an instance of the code, replace the matches in the instance, and then overwrite the original code with the altered instance. I once remember trying document.documentElement.innerHTML.replace(RegEx, replaceString); on another page but this only returned an instance of altered code, it did not alter the original code.
This is part of the page:
<div class="BoxContent" style="background-image:url(http://static.tibia.com/images/global/content/scroll.gif);">
<TABLE></TABLE>
<BR>
Some text here.
<BR>
And some more.
<table></table>
<table></table>
</div>
I need to do some changes in the text between the tables.
I have looked around on SO and found similar question about adding things to a form with innerHTML, but this did not help my cause. So, all help is appreciated here!
Kenneth
Here - plain JS
window.onload = function () {
    var nodes = document.getElementsByClassName("BoxContent")[0].childNodes;
    for (var i = 0, n = nodes.length; i < n; i++) {
        if (nodes[i].nodeType == 3) {
            // console.log(nodes[i].textContent)
            nodes[i].textContent = nodes[i].textContent.replace(/some/gi, "Lots");
        }
    }
}
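The loop above works because it only rewrites text nodes (nodeType 3) that are direct children of the div, so the tables and any form elements inside them are never recreated. The same idea can be wrapped in a small function. This is only a sketch: replaceInDirectTextNodes is a made-up name, and the box object is a plain stand-in for document.getElementsByClassName("BoxContent")[0] so that the snippet also runs outside a browser.

```javascript
const TEXT_NODE = 3; // same value as Node.TEXT_NODE in the DOM

// Hypothetical helper: apply a replacement to the direct text-node children only.
function replaceInDirectTextNodes(parent, pattern, replacement) {
  for (const node of Array.from(parent.childNodes)) {
    if (node.nodeType === TEXT_NODE) {
      node.textContent = node.textContent.replace(pattern, replacement);
    }
  }
}

// Plain-object stand-in for the BoxContent div from the question.
const box = {
  childNodes: [
    { nodeType: 1, textContent: "some table text" }, // element: left untouched
    { nodeType: 3, textContent: "Some text here." },
    { nodeType: 1, textContent: "" },                // e.g. a <br>
    { nodeType: 3, textContent: "And some more." },
  ],
};

replaceInDirectTextNodes(box, /some/gi, "Lots");
console.log(box.childNodes.map((node) => node.textContent));
```

In the page itself, the first argument would be the real element instead of the stand-in object.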

Find certain text and get complete text

I'm using a proxy to scrape data from this URL: CNN Article
I would like to get the entire article text (heading not necessarily). So I tried this:
$(data).find("div:contains('Across the river from Cairo')");
This will find the piece of text, but when I do my thing with it, myThing = $(this).text();, it seems to get a lot more than just the article. This might have something to do with the way the HTML is constructed. If I look at the source, I see the article text is confined in p elements. However, changing div:contains to p:contains only gets me the first few lines (obviously).
So my question is: how do I get the article text regardless of its HTML construction? I'm looking for something (code) that will say:
find.('Across the river from Cairo') and get this text and all the text underneath this text();
I'm getting the desired results from that article with the selector p.cnn_storypgraphtxt. To get the whole article, you can use $("p.cnn_storypgraphtxt").text() or
$("p.cnn_storypgraphtxt").map(function(){return $(this).text();}).get().join("\n");
For getting the text that follows a certain expression, you might use .last() to get the last selected node (i.e. the lowermost in the DOM) and then .nextAll() like
$(":contains('Across the river from Cairo')").last().nextAll().text()
but that will contain a lot of unwanted stuff.
Try using
$someString = $(data).find("div:contains('Across the river from Cairo')").html();
use that string for manipulations or whatever.

Regex replace string but not inside html tag

I want to replace a string in HTML page using JavaScript but ignore it, if it is in an HTML tag, for example:
visit google search engine
you can search on google tatatata...
I want to replace google by <b>google</b>, but not here:
visit google search engine
you can search on <b>google</b> tatatata...
I tried with this one:
regex = new RegExp(">([^<]*)?(google)([^>]*)?<", 'i');
el.innerHTML = el.innerHTML.replace(regex,'>$1<b>$2</b>$3<');
but the problem: I got <b>google</b> inside the <a> tag:
visit <b>google</b> search engine
you can search on <b>google</b> tatatata...
How can fix this?
You'd be better using an html parser for this, rather than regex. I'm not sure it can be done 100% reliably.
You may or may not be able to do this with a regexp. It depends on how precisely you can define the conditions. Saying you want the string replaced except if it's in an HTML tag is not narrow enough, since everything on the page is presumably within some HTML tag (BODY if nothing else).
It would probably work better to traverse the DOM tree for this instead of trying to use a regexp on the HTML.
Parsing HTML with a regular expression is not going to be easy for anything other than trivial cases, since HTML isn't regular.
For more details see this Stackoverflow question (and answers).
I think you're all missing the question here...
When he says inside the tag, he means inside the opening tag, as in the <a href="google.com"> tag. This is quite different from text inside, say, a <p> </p> tag pair or <body> </body>. While I don't have the answer yet, I'm struggling with this same problem and I know it has to be solvable using regex. Once I figure it out, I'll come back and post.
WORKAROUND
If you can't use an HTML parser, or are quite confident about your HTML structure, try this:
do the "bad" changing
repeat replace (<[^>]*)(<[^>]+>) to $1 a few times (as many as you need)
It's a simple workaround, but it works for me.
Cons?
Well... you have to do the replace twice for the case ... ...> as it removes only the first unwanted tag from every tag on the page
[edit:]
SOLUTION
Why not use jQuery, put the html code into the page and do something like this:
$(containerOrSth).find('a').each(function(){
    if($(this).children().length==0){
        $(this).text($(this).text().replace('google','evil'));
    }else{
        //here You have to care about children tags, but You have to know where to expect them - before or after text. comment for more help
    }
});
I'm using
regex = new RegExp("(?=[^>]*<)google", 'i');
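To illustrate how this lookahead behaves: it accepts an occurrence of "google" only if, reading forward, a < is reached before any >, which is true for text between tags and false inside an opening tag such as the href attribute. The sketch below is a hedged illustration rather than a general solution. It adds the g flag (the line above uses only i, which replaces the first match), the sample markup is invented, and it assumes every text run is followed by a tag and that no attribute value, script, or style block contains < or >.

```javascript
// Lookahead: from this position, a "<" must come before the next ">".
const pattern = /(?=[^>]*<)google/gi;

// Invented sample markup for illustration.
const html = '<p><a href="http://google.com">visit google</a> search on google now</p>';

// $& inserts the matched text, so the original casing is kept.
const bolded = html.replace(pattern, "<b>$&</b>");

console.log(bolded);
```

Note that a trailing occurrence with no tag after it would not be matched, which is why the sample is wrapped in a p element.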
you can't really do that, your "google" is always in some tag, either replace all or none
Well, since everything is part of a tag, your request makes no real sense. If it's just the <a /> tag, you might just check for that part, mainly by making sure you don't have a trailing </a> tag before a fresh <a>.
You can do that using REGEX, but filtering blocks like STYLE, SCRIPT and CDATA will need more work, and not implemented in the following solution.
Most of the answers state that 'your data is always in some tags' but they are missing the point, the data is always 'between' some tags, and you want to filter where it is 'in' a tag.
Note that tag characters in inline scripts will likely break this, so if they exist, they should be processed separately with this method. Take a look here:
complex html string.replace function
I can give you a hacky solution…
Pick a non printable character that’s not in your string…. Dup your buffer… now overwrite the tags in your dup buffer using the non printable character… perform regex to find position and length of match on dup buffer … Now you know where to perform replace in original buffer
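The buffer trick described above can be sketched roughly as follows. This assumes the NUL character never occurs in the input and that no attribute value contains a > character; boldOutsideTags is a hypothetical name chosen for this example. Each tag in the duplicate is overwritten with mask characters of the same length, so match positions found in the duplicate line up with the original string.

```javascript
// Hypothetical helper: wrap occurrences of `word` in <b> tags, skipping text inside tags.
function boldOutsideTags(html, word) {
  if (!word) return html; // an empty pattern would never advance

  const MASK = "\u0000"; // assumed not to occur in the input

  // Duplicate buffer: each tag is overwritten with MASK characters of equal length.
  const masked = html.replace(/<[^>]*>/g, (tag) => MASK.repeat(tag.length));

  // Escape regex metacharacters so `word` is matched literally.
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(escaped, "gi");

  let result = "";
  let last = 0;
  let match;
  // Find positions in the masked copy, then cut the ORIGINAL string at those positions.
  while ((match = pattern.exec(masked)) !== null) {
    const start = match.index;
    const end = start + match[0].length;
    result += html.slice(last, start) + "<b>" + html.slice(start, end) + "</b>";
    last = end;
  }
  return result + html.slice(last);
}

console.log(boldOutsideTags('<a href="google.com">visit Google</a> search on google', "google"));
```

Script, style, and CDATA blocks would still need separate handling, as other answers point out.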
