Consider the following HTML:
<div>
text1
</div>
<div>
<span>
text2
</span>
</div>
<div>
text3
</div>
I need to select all the nodes with text1/text2/text3. When I use
/html/body/div[position() > 0]
I obviously don't get the span around text2, but the div around <span>text2</span>. How can I say: If there is a span following the div, then return the span; if the div is already the last element in a path, return the div? So the intended nodes would be:
div[0]
div[1]/span
div[2]
Update: This one works, but is there a shorter way to do it? (E.g. I am writing /html/body/div in both of them; is it possible to put the pipe symbol (or) at a later place?)
/html/body/div[position() > 0 and count(*) = 0] | /html/body/div[position() > 0]/span
In order to select the text content of a node, you can use the text() node test.
So if you want to select all text nodes below a root node, you can use this XPath expression:
//ROOT_NODE//text()
So, for your example and as you said in your comment:
/html/body/div//text()
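To check which elements the "leaf only" rule from the question should return, the selection can also be sketched outside XPath. This is a minimal Python sketch, assuming the markup is well-formed enough for the standard-library xml.etree.ElementTree parser. Its XPath subset has no text() node test, so the sketch walks each div and keeps the elements that have no child elements. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

html = """<html><body>
<div>text1</div>
<div><span>text2</span></div>
<div>text3</div>
</body></html>"""

root = ET.fromstring(html)  # root is the <html> element

# For every body/div, visit the div and its descendants and keep
# only the elements that contain no child elements (the "leaves").
leaves = [
    el
    for div in root.findall("./body/div")
    for el in div.iter()
    if len(el) == 0
]

tags = [el.tag for el in leaves]
texts = [el.text.strip() for el in leaves]
print(tags, texts)
```

The first and third div are kept as they are, while the second is replaced by its span, which matches the node list asked for in the question.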
Example HTML:
<p class="labels">
<span>Item1</span>
<span>Item2</span>
<time class="time">
<span>I dont want to get this span</span>
</time>
</p>
I am currently getting all the spans within the tag with the labels class, but I just want the two spans directly under it, not any span tags nested inside child elements.
Currently I am doing it like this:
First I get the labels HTML from a much bigger HTML document:
labels = html.findAll(class_="labels")
Then I extract the span tags from it:
spans = labels[0].findAll('span', {"class": None})
In my case the "class": None doesn't change anything because no span tag has a class.
So my question again is: how can I get just the first two span tags, without the ones inside child elements?
The BeautifulSoup docs briefly mention the recursive=False argument, which restricts the search to direct children.
So the answer to this problem was:
spans = labels[0].findAll('span', {"class": None}, recursive=False)
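The difference between searching direct children and searching all descendants can be sketched without BeautifulSoup as well. The following is a minimal illustration using the standard-library xml.etree.ElementTree as a stand-in (an assumption for the sake of a self-contained example, not the API used above): findall("span") looks only at direct children, while iter("span") visits every descendant.

```python
import xml.etree.ElementTree as ET

markup = """<p class="labels">
<span>Item1</span>
<span>Item2</span>
<time class="time">
<span>I dont want to get this span</span>
</time>
</p>"""

labels = ET.fromstring(markup)

# Direct children only, comparable to recursive=False
direct_spans = [span.text for span in labels.findall("span")]

# Every descendant span, comparable to the default recursive search
all_spans = [span.text for span in labels.iter("span")]

print(direct_spans)
print(all_spans)
```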
for container in html.findAll(class_="labels"):
    spans = container.findAll('span', {"class": None})
    # keep only the spans whose parent is the container itself
    spans = [span for span in spans if span.parent is container]
Alternatively iterate the .children:
for container in html.findAll(class_="labels"):
    # text nodes are not span tags, so the first test skips them
    is_plain_span = lambda c: c.name == 'span' and c.get('class') is None
    spans = [child for child in container.children if is_plain_span(child)]
To extract the first two span elements, try the following:
>>> [i.text for i in html.find('p', {"class": "labels"}).findAll('span', {"class": None})[0:2]]
[u'Item1', u'Item2']
If you want to grab all spans inside the labels class, remove the slice:
>>> [i.text for i in html.find('p', {"class": "labels"}).findAll('span', {"class": None})]
[u'Item1', u'Item2', u'I dont want to get this span']
I have a string that contains variable HTML content. The string can contain one, more or no p tags which may also have classes on them.
What is the best way to remove all p tags from this using jQuery while keeping the HTML content of each of them?
I first tried the following but of course this only works if I have the whole string wrapped in a paragraph and it would not cover if the paragraphs have classes or other attributes on them:
str.substring(3).slice(0, -4);
Edit
Here is an example but the number of p tags can vary and there can also be none at all.
Example before:
<p>Some text <p class="someClass"> Some other text</p> Some more text</p>
Example after:
"Some text Some other text Some more text"
Use unwrap: $('p').contents().unwrap()
It is the opposite of wrap in that it removes the parents of the selected elements. The p tags are the parents of their content, so selecting the content and then unwrapping it removes the p tags.
You could use a regular expression to do this. It only removes the p-tags and leaves all other tags in place.
JavaScript
var string = "<p>this is a test with <p class='bold'>multiple</p> p-tags.</p><span>THIS IS COOL</span>";
var result = string.replace(/<\/?p(?:\s[^<>]*)?>/ig, ""); // matches <p>, <p ...> and </p>, but not <pre> or <param>
console.log(result);
If you'd like to remove all tags, you could use /(<([^>]+)>)/ig instead as regex.
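The same regex idea can be tried outside the browser. Below is a minimal Python sketch, assuming the input is a plain string and that p tags contain no ">" inside attribute values; the names P_TAG and strip_p_tags are made up for this illustration. The pattern requires whitespace or ">" directly after the "p", so tags such as pre are left alone.

```python
import re

# Opening or closing p tag, optionally with attributes.
P_TAG = re.compile(r"</?p(?:\s[^<>]*)?>", re.IGNORECASE)


def strip_p_tags(html: str) -> str:
    """Remove p tags but keep everything between them."""
    return P_TAG.sub("", html)


before = '<p>Some text <p class="someClass"> Some other text</p> Some more text</p>'
after = strip_p_tags(before)
print(after)
```

Whitespace that surrounded the removed tags stays in the result, so consecutive spaces may need collapsing afterwards.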
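Try the following:
var str = "<p>Test</p>";
// the g flag replaces every occurrence; a string pattern would replace only the first one
var res = str.replace(/<p>/g, "").replace(/<\/p>/g, "");
If I understand correctly and you want the p tags removed but the content kept, then for plain <p> tags without attributes it should be as simple as:
str.replace(/<p>/g, '').replace(/<\/p>/g, '');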
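You can also use replaceWith, which is readable and works with parent / child tags. Pass a function so that each p is replaced with its own text rather than with the combined text of all p tags:
$('p').replaceWith(function() { return $(this).text(); });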
My HTML is like this. I can only identify the div's class; the spans have no ids. I need to replace one href text and one image with some other text within those spans.
<div class ="myclass">
<span style="vertical-align:middle;">
</span>
<span style="vertical-align:middle;">
</span>
<span style="vertical-align:middle">
<span class="myspan">
<a href="http://testlink3">
<img title="test" class="imglink"></a>
</span>
</span>
<span>
Text - *This text needs to be replaced*
</span>
</div>
In the above code, I need to replace the img within the third span with clickable text (which should take us to the URL) and the text within the fourth span with new text (keeping the URL the same).
How can I identify these specific spans when they are missing ids/classes?
We have 3 different things to do here:
How to replace the content inside a given element
This can be done very quickly:
$("selector").html("New text, same href");
Replace a given element with another
This can be done this way:
$("selector").replaceWith("<a href='somewhere.html'>I replaced an Img</a>");
Selecting the DOM elements
When you don't have an ID, nor a CSS class for your element, but you do know its position within another element plus some info about the element (like tagName), you can select the parent element and specify a relative position.
var myElement = $("parentElement").find("tagName:eq(position)");
Remember that this kind of selector ( "tagName:eq(position)") is zero indexed, so if you want to grab the third element, you need to tell jQuery tagName:eq(2).
So, let's say your parent element (not given in the question) is a div with a parent CSS class.
First thing you want to do is select this div.
var parent = $(".parent");
Then you want to find the Img within the third span.
var myImg = parent.find("span:eq(2)").find("img");
Now you can replace this element with the whatever you want
myImg.replaceWith("<a href='somewhere.html'>I replaced an Img</a>");
Note that jQuery allows you to pass HTML elements as a plain string.
Finally, you need to change the text inside the fourth span. This can be accomplished this way:
parent.find("span:eq(3)").find("a").html("New text, same href");
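The positional approach can be sketched outside jQuery too. This is a minimal Python illustration with the standard-library xml.etree.ElementTree, assuming the markup from the question is made well-formed (the img is self-closed here) and that only the direct child spans of the div are counted. Note that findall("span") returns direct children only, whereas jQuery's find() also counts nested spans, so the indexes are not interchangeable in general.

```python
import xml.etree.ElementTree as ET

markup = """<div class="myclass">
<span style="vertical-align:middle;"></span>
<span style="vertical-align:middle;"></span>
<span style="vertical-align:middle">
<span class="myspan">
<a href="http://testlink3"><img title="test" class="imglink"/></a>
</span>
</span>
<span>Text - *This text needs to be replaced*</span>
</div>"""

div = ET.fromstring(markup)
spans = div.findall("span")  # direct child spans, zero-indexed

# Third span: drop the image and put text inside the existing link,
# so the href stays the same and the text is clickable.
link = spans[2].find(".//a")
link.remove(link.find("img"))
link.text = "Clickable text"

# Fourth span: replace its text.
spans[3].text = "New text"

print(ET.tostring(div, encoding="unicode"))
```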
You could use document.querySelector to select an a based on the href:
document.querySelector("a[href='http://link4']").innerHTML = "The text you want to put in"
Since you're open to jQuery, this works too:
$("a[href='http://link4']").text("The text you want to put in")
var spans = document.getElementsByTagName('span');
var img = spans[2].querySelector('img'); // here you find your img
img.parentNode.appendChild(document.createTextNode('your new text')); // add the new text inside the link
img.parentNode.removeChild(img); // remove the image
var a = spans[3].firstChild; // here is your href
a.innerHTML = 'your new text';
You could use :nth-child() selector to select from the div you can identify.
More on :nth-child(): http://api.jquery.com/nth-child-selector/
Then select the img tag from the child span you found.
This is a continuation of What's a good way to show parts of an element but hide the rest?
<h1>
Let's say you had this <span class="safe">text</span>.
</h1>
How could one wrap all non-safe regions in an element with a disappear class (with jQuery)?
Final Output
<h1>
<span class="disappear">Let's say you had this </span>
<span class="safe">text</span>
<span class="disappear">.</span>
</h1>
That way, the parent node is still visible, but the non-safe regions disappear, leaving the safe.
I'm not sure how to do this, but surely it's possible.
Text nodes have a nodeType of 3. Iterate over the nodes and use wrap() to wrap the text nodes:
$someElement.contents().each(function() {
    if (this.nodeType == 3)
        $(this).wrap('<span class="disappear" />');
});
http://jsfiddle.net/SnjnJ/
Filter the text nodes and wrap them in spans:
$('h1').contents().filter(function() {
    return this.nodeType === 3;
}).wrap('<span class="disappear" />');
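The text-node wrapping above can be sketched without a browser. This is a minimal Python illustration using the standard-library xml.etree.ElementTree, where text nodes appear as an element's .text and its children's .tail. The helper names make_span and wrap_text_runs are made up for this example, and whitespace-only runs are deliberately left unwrapped (an assumption, since the jQuery versions wrap every text node).

```python
import xml.etree.ElementTree as ET


def make_span(text, cls):
    """Build <span class="cls">text</span>."""
    span = ET.Element("span", {"class": cls})
    span.text = text
    return span


def wrap_text_runs(parent, cls="disappear"):
    """Wrap each non-blank text run directly inside parent in a span."""
    children = list(parent)
    rebuilt = []
    # Text before the first child lives in parent.text
    if parent.text and parent.text.strip():
        rebuilt.append(make_span(parent.text, cls))
        parent.text = None
    for child in children:
        rebuilt.append(child)
        # Text after a child lives in that child's tail
        if child.tail and child.tail.strip():
            rebuilt.append(make_span(child.tail, cls))
            child.tail = None
    for child in children:
        parent.remove(child)
    parent.extend(rebuilt)
    return parent


h1 = ET.fromstring('<h1>Let\'s say you had this <span class="safe">text</span>.</h1>')
wrap_text_runs(h1)
print(ET.tostring(h1, encoding="unicode"))
```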
$('h1').children().not('.safe').hide();
http://jsfiddle.net/VeSCw/
Match all contents of the h1 and use .not() to remove the safe from your selection. Then use .wrap() to make them "disappear".
http://api.jquery.com/not/
I'm having trouble wrapping my head around what should be a simple solution. I want to replace text within a label tag, without affecting the other 'siblings', if they exist.
Sample markup:
<fieldset class="myFieldsetClass">
<legend>Sample Fieldset</legend>
<ol>
<li>
<label>
<span class="marker">*</span>
Some Label 1
</label>
</li>
<li>
<label>
Some Label 2
</label>
</li>
<li>
<label>
Text that doesn't match...
</label>
</li>
</ol>
</fieldset>
Goal:
To replace text Some Label X with Some Label (I.e. remove X from the label text).
span tags within label must stay intact if they exist, such as the <span class="marker"> above.
I do not know the value of X, which could be 1 or more characters long.
Current Script:
My jQuery script below works, but I know it is very inefficient. I can't seem to wrap my head around this one for some reason...
// for each label in the fieldset that contains text "Some Label "
$(".myFieldsetClass label:contains('Some Label ')").each(function() {
    if ($(this).has("span").length > 0) {
        // if the label has the span tag within, remove it and prepend it back to the replaced text
        $(this).find("span").remove().prependTo($(this).text('Some Label'));
    }
    else {
        // otherwise just replace the text
        $(this).text('Some Label');
    }
});
I thought at first I could simply do:
$(".myFieldsetClass label:contains('Some Label ')").text("Some Label");
But this clears all contents of the label and hence removes the span, which I don't want. I can't use any replace functions to replace Some Label X with Some Label because I don't know what X will be.
Can anyone suggest a more elegant/efficient approach to this problem?
Thanks.
EDIT
After trying multiple answers, I think the problem is that even if I select the right collection, its members are text nodes, which jQuery doesn't seem to want to modify. I've used Firebug to inspect the collection (many answers below all select correctly but in slightly different ways). In the Firebug console the resulting set is:
[<TextNode textContent="Some Label 1:">,
<TextNode textContent="Some Label 2:">,
<TextNode textContent="Some Label 3:">,
<TextNode textContent="Some Label 4:">,
<TextNode textContent="Some Label 5:">]
The problem seems to be that calling .replaceWith(), .replace(), .text(), etc. doesn't affect the text nodes in the jQuery collection. If I allow the above collection to contain one of the spans, then .replaceWith(), .replace(), etc. work correctly against the span, but the text nodes stay as they are.
Try:
$(".myFieldsetClass label:contains('Some Label ')").contents().filter(':last-child').text("Some Label");
This should work assuming the text to be replaced will always be at the end. The contents() function selects all nodes, including text nodes.
http://api.jquery.com/contents/
EDIT: I should have used filter() instead of find(). Corrected.
EDIT: Works now. Here's one way.
// Store proper labels in a variable for quick and accurate reference
var $labelCollection = $(".myFieldsetClass label:contains('Some Label ')");
// Call contents(), grab the last one and remove() it
$labelCollection.each(function() {
    $(this).contents().last().remove();
});
// Then simply append() to the label whatever text you wanted.
$labelCollection.append('some text');
As patrick points out, you can use contents() to select the text alone, and then do a replace on it all. Adapting the example given there, you could also try:
$(".myFieldsetClass label:contains('Some Label ')").contents().filter(function() {
    // keep only the text nodes whose text contains the label prefix
    return this.nodeType == 3 && this.nodeValue.indexOf('Some Label ') !== -1;
})
.replaceWith("Some Label");
However, if you know that "Some Label " will always be the last element in the <label> then patrick's method will be faster I believe.
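The idea of touching only the text nodes and leaving child elements alone can be sketched outside jQuery. This is a minimal Python illustration with the standard-library xml.etree.ElementTree, where a label's own text is stored in label.text and in each child's .tail. The helper name trim_label and the \S+ pattern (which treats X as a run of non-whitespace characters) are assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET

# "Some Label " followed by an unknown run of non-whitespace characters
PATTERN = re.compile(r"Some Label \S+")


def trim_label(label):
    """Rewrite only the text runs of a label; child elements stay intact."""
    if label.text:
        label.text = PATTERN.sub("Some Label", label.text)
    for child in label:
        if child.tail:
            child.tail = PATTERN.sub("Some Label", child.tail)


markup = """<ol>
<li><label><span class="marker">*</span> Some Label 1</label></li>
<li><label>Some Label 2</label></li>
<li><label>Text that doesn't match...</label></li>
</ol>"""

root = ET.fromstring(markup)
labels = root.findall(".//label")
for label in labels:
    trim_label(label)

print(ET.tostring(root, encoding="unicode"))
```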
Why not simply do an entire replace using regex?
$(".myFieldsetClass label:contains('Some Label ')")
    .each(function() {
        // \S+ covers an X of one or more non-whitespace characters
        $(this).html($(this).html().replace(/Some Label \S+/, "Some Label"));
    });
A one-liner:
$('.myFieldsetClass label').contents().last().remove().end().first().after('New Label')