How to get the text - javascript

I have some html to scrape.
<div class="content">
<strong> This is first content </strong> This is second content
<br />
<small>
<p>Something</p>
</small>
</div>
How can I get the text "This is second content" with cheerio?

Filtering on the nodeType property solves this, even if there is also text before the <strong> tag:
<div class="content">
Before first content
<strong> This is first content </strong> This is second content
<br />
<small>
<p>Something</p>
</small>
</div>
Then the code could be:
const cheerio = require("cheerio");
const $ = cheerio.load('<div class="content">Before first content<strong> This is first content </strong> This is second content<br /><small><p>Something</p></small></div>');
// Keep only the direct child nodes of div.content that are text nodes (nodeType 3).
const $outer = $("div.content").contents().filter(function () {
  return this.nodeType === 3;
});
console.log($outer.text()); // "Before first content This is second content"
// Print each text node separately.
$outer.each(function () {
  console.log($(this).text());
});
// "Before first content"
// " This is second content"

You can't directly select text nodes. I usually do something like:
$('.content strong')[0].nextSibling.data
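A slightly more defensive variant of the one-liner above, as a sketch: it checks that the node after the element really is a text node before reading its data. The helper name textAfter is hypothetical, and the plain object at the bottom only stands in for the node that $('.content strong')[0] would return, so the snippet runs without cheerio installed.

```javascript
// Hypothetical helper: return the trimmed text that directly follows `node`,
// or null when the next sibling is missing or is not a text node.
function textAfter(node) {
  const TEXT_NODE = 3; // the same constant the nodeType answer relies on
  const next = node ? node.nextSibling : null;
  if (!next || next.nodeType !== TEXT_NODE) {
    return null;
  }
  return next.data.trim();
}

// Stand-in for $('.content strong')[0]; with cheerio you would pass the real node.
const strong = {
  nodeType: 1,
  nextSibling: { nodeType: 3, data: ' This is second content\n' },
};

console.log(textAfter(strong)); // "This is second content"
```

Returning null instead of throwing keeps a scraper running when a page is missing the trailing text.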

Maybe this could help:
<div class="content">
<strong> This is first content </strong> <span class="toBeSelected">This is second content</span>
<br />
<small>
<p>Something</p>
</small>
</div>
After this you can select the text this way.
$('div .toBeSelected').html()

Ideally, you should wrap "This is second content" in a <span> or another element so that you can target it directly.
Do this:
<div class="content">
<strong>This is first content</strong><span>This is second content</span>
<br>
<small>
<p>Something</p>
</small>
</div>
Get it like this
console.log($('.content :nth-child(2)').text())
Working demo: https://jsfiddle.net/usmanmunir/vg1dqm3L/17/

I think you can use regex to get the second content.
const cheerio = require('cheerio');
const $ = cheerio.load(`<div class="content">
<strong> This is first content </strong> This is second content
<br />
<small>
<p> Something </p>
</small>
</div>
`);
console.log($('div').html().replace(/\n/g, '').match(/<\/strong>(.*)<br>/)[1])
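The same regex idea can be sketched on a plain HTML string, without depending on how the parser serializes <br />. The function name textBetweenStrongAndBr is an assumption for illustration; the pattern is non-greedy, spans newlines, and accepts <br>, <br/> and <br />.

```javascript
// Hypothetical helper: capture the text between </strong> and the next <br>.
function textBetweenStrongAndBr(html) {
  // [\s\S]*? is non-greedy and also matches newlines;
  // <br\s*\/?> accepts <br>, <br/> and <br />.
  const match = html.match(/<\/strong>([\s\S]*?)<br\s*\/?>/i);
  return match ? match[1].trim() : null;
}

const html = `<div class="content">
<strong> This is first content </strong> This is second content
<br />
<small>
<p> Something </p>
</small>
</div>`;

console.log(textBetweenStrongAndBr(html)); // "This is second content"
```

A regex like this only suits markup whose shape you already know; for anything less regular, the node-based answers are more robust.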

Related

HTMLCollection shows when replacing elements in js?

I'm trying to replace a text element with an HTML element (a div). First, I set up the following, as you can see here:
HTML:
<!-- Replace Text Here-->
<h6 class="idk1"> Replace This </h6>
<!-- Main Column -->
<div class="js-cp-main-wrp">
<div class="support-2">
<html>stuff here</html>
</div>
<div class="end-cp-wrp">
<html>stuff here</html>
<html>stuff here</html>
<html>stuff here</html>
</div>
</div>
JavaScript:
var woahtikcets = document.getElementsByClassName('js-cp-main-wrp');
const loadthetickets = document.getElementsByClassName('idk1');
Then I run this:
loadthetickets.replaceWith(woahtikcets)
It shows me
[object HTMLCollection]
Please help <3. I also tried the following, but it did not work either; it shows up as undefined:
const loadthetickets = document.getElementsByClassName('idk1')[0];
I am not at all certain what your intention is, but <html> is not just any tag in HTML. It must be the outermost tag (the root of the document) and can therefore appear only once in a document. If you replace the <html> elements with other (legal) elements, such as <span>, you will very likely get some results, as you can see below:
const woahtikcets = document.querySelectorAll('.js-cp-main-wrp span'),
newtext = document.getElementsByClassName('idk1')[0].textContent;
for (var el of woahtikcets){el.textContent=newtext}
<!-- Replace Text Here-->
<h6 class="idk1"> Replace This </h6>
<!-- Main Column -->
<div class="js-cp-main-wrp">
<div class="support-2">
<span>stuff here</span>
</div>
<div class="end-cp-wrp">
<span>stuff here</span>
<span>stuff here</span>
<span>stuff here</span>
</div>
</div>
In case you want the replace operation to work in the opposite direction, you could do the following:
const newtext = document.querySelectorAll('.js-cp-main-wrp')[0].textContent,
heading = document.getElementsByClassName('idk1');
for (var el of heading){el.textContent=newtext}
<!-- Replace Text Here-->
<h6 class="idk1"> Replace This </h6>
<!-- Main Column -->
<div class="js-cp-main-wrp">
<div class="support-2">
<span>stuff here</span>
</div>
<div class="end-cp-wrp">
<span>stuff here</span>
<span>stuff here</span>
<span>stuff here</span>
</div>
</div>
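As for the [object HTMLCollection] output itself: getElementsByClassName returns a collection, while replaceWith belongs to a single element and expects nodes or strings, so a collection passed to it is converted to that string. A minimal sketch, assuming the goal is to swap the first .idk1 element for the first .js-cp-main-wrp element; the helper name replaceFirstByClass is hypothetical.

```javascript
// Hypothetical helper: replace the first element of class `targetClass`
// with the first element of class `sourceClass`.
function replaceFirstByClass(doc, targetClass, sourceClass) {
  // getElementsByClassName returns an HTMLCollection, so pick one element with [0].
  const target = doc.getElementsByClassName(targetClass)[0];
  const source = doc.getElementsByClassName(sourceClass)[0];
  if (!target || !source) {
    return false; // nothing to replace, or nothing to replace it with
  }
  // replaceWith expects nodes (or strings), not a collection.
  target.replaceWith(source);
  return true;
}

// In a browser page:
if (typeof document !== 'undefined') {
  replaceFirstByClass(document, 'idk1', 'js-cp-main-wrp');
}
```

Note that replaceWith moves the source element: it ends up where the heading was and no longer sits at its old position.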

Wrap elements and content into div

I am loading data (with the jQuery load() function) from another site into my <div class="ajax-container"> to display it. The data comes with elements AND bare content (text only, like "born 2007"). I have tried to wrap these with the jQuery wrap() and wrapAll() methods, but they don't wrap everything correctly. I want to wrap all elements and text into a single div so I can style it more easily.
How can I achieve this? I need to wrap everything UNTIL a certain class is found. I've also tried the jQuery parentsUntil() method, but that didn't work either.
Example data (html that prints to my page):
<div class="container">
<b class="big_text">Title</b>
"some content after title "
<b>Subtitle content here</b>
<br />
<span class="no-wrap"><b>Price</b></span>
<br />
"made in 2007"
<div class="picture">
...
</div>
</div>
So that's the problem, I have to wrap everything into a div until <div class="picture"> appears.
This did not work:
$(".container").children().nextUntil(".picture").wrapAll("<div></div>");
This would be the desired result:
<div class="container">
<div class="content">
<b class="big_text">Title</b>
"some content after title "
<b>Subtitle content here</b>
<br />
<span class="no-wrap"><b>Price</b></span>
<br />
"made in 2007"
</div>
<div class="picture">
...
</div>
</div>
I believe the best approach is to wrap the inner contents then move "picture":
$(document).ready(function(){
$(".container").wrapInner("<div></div>");
$(".container").append($(".container .picture"));
})
.container > div ~ .picture {color:red;}
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<div class="container">
<b class="big_text">Title</b>
"some content after title "
<b>Subtitle content here</b>
<br />
<span class="no-wrap"><b>Price</b></span>
<br />
"made in 2007"
<div class="picture">
...
</div>
</div>
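For comparison, a plain-DOM sketch of the same idea that moves nodes into a wrapper until the stop element is reached, instead of wrapping everything and moving "picture" back out. The helper name wrapUntil is hypothetical, and the wrapper class "content" is taken from the desired result in the question.

```javascript
// Hypothetical helper: move every child node of `container` that comes before
// the first child matching `stopSelector` into a new <div class="content">.
function wrapUntil(doc, container, stopSelector) {
  const ELEMENT_NODE = 1;
  const wrapper = doc.createElement('div');
  wrapper.className = 'content';

  let node = container.firstChild;
  // Text nodes have no matches(), hence the nodeType check first.
  while (node && !(node.nodeType === ELEMENT_NODE && node.matches(stopSelector))) {
    const next = node.nextSibling; // remember it: appendChild moves `node` away
    wrapper.appendChild(node);     // works for text nodes as well as elements
    node = next;
  }
  // `node` is now the stop element, or null, in which case the wrapper goes last.
  container.insertBefore(wrapper, node);
  return wrapper;
}

// In a browser page:
if (typeof document !== 'undefined') {
  document.querySelectorAll('.container').forEach(function (container) {
    wrapUntil(document, container, '.picture');
  });
}
```

Because it walks childNodes rather than children, the bare text such as "made in 2007" is wrapped too, which is the part that children().nextUntil() skips.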

document.execCommand weird results in FireFox

WYSIWYG text format function
/*
* myEditor - iframe element
*/
function doFormat(what) {
var cWin = $('#myEditor', window.parent.document)[0].contentWindow;
cWin.$('body').attr({ 'contentEditable': true, 'designMode': 'On'} );
var what = 'justifyright';
cWin.document.execCommand(what, false, arguments[2]);
cWin.$('body').attr({'contentEditable': 'false'} );
return;
}
HTML:
<span id="obj_123" contenteditable="true">
text<br />
moretext<br />
alot of text <br />
</span>
If I click or highlight PART of the span contents (only when part of it is selected) and call doFormat (with any of the justify commands),
the results are:
<span id="obj_123" contenteditable="true">
text<br />
</span>
<div align="right">
<span id="obj_123" contenteditable="true">
moretext<br />
</span>
</div>
<span id="obj_123" contenteditable="true">
alot of text <br />
</span>
expected results:
<span id="obj_123" contenteditable="true">
text<br />
<div align="right">
moretext<br />
</div>
alot of text <br />
</span>
It is as if it copies the parent node and then replicates it each time formatting occurs.
Other execCommand calls, such as bold, italic and underline, work as expected.
I would appreciate any help on this topic.
Thank you very much.
edit:
added a fiddle test case
http://jsfiddle.net/n7SgJ/
Turns out this happens only if the rich text wrapper is a SPAN tag. Everything works if DIV tag is used.
Is there a reasonable explanation for this?

Search the document for text to remove more than the tag it's found in

Imagine you had html code like this:
<div id="2761421" class="..." data-attributionline="testtext wrote at..." data-created-at="1342689802000" data-updated-at="1342689802000" data-user-id="36847" >
<div class="subject"> <strong>
<a name="2761421" href="#2761421">wee</a></strong>
</div>
<div class="info">
<div class="author"> author:
<span class="name"> testtext (
we)
</span>
</div>
<div class="date"> date:
<time datetime="dfgdf">dfgdfg
</time>
</div>
</div>
<hr style="clear: both;" />
<div class="text gainlayout"> some text
</div>
<div class="foot gainlayout unselectable">
<span class="menuitem postmenuitem-report">
cvbcvb
</span>
</div>
</div>
What would the javascript code look like that searches the whole document for parts where the div with data-attributionline= contains testtext to replace the whole cited div from start to finish with "Filtered"?
or
What would the javascript code look like that searches the whole document for
<span class="name">
where the name contains testtext to replace the whole div starting from
<div id="<someid>" class="..." data-attributionline="testtext<some text>" data-created-at="<somedate>" data-updated-at="<somedate>" data-user-id="<someid>" >
to the last div, ie
</span>
</div>
</div>
with "Filtered"?
There are a whole bunch of attribute selectors (see also the CSS documentation) that will match elements whose attributes start with or contain testtext.
In JavaScript, you can use document.querySelector[All]() or a library function that supports those selectors to get the elements, then remove them.
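A minimal sketch of that approach, assuming the posts to hide are exactly the elements whose data-attributionline attribute starts with the name. The helper names attributionSelector and filterPosts are hypothetical.

```javascript
// Hypothetical helper: build a selector for elements whose data-attributionline
// attribute starts with `name`. Backslashes and double quotes are escaped so
// the name cannot break out of the quoted attribute value.
function attributionSelector(name) {
  const escaped = name.replace(/["\\]/g, '\\$&');
  return '[data-attributionline^="' + escaped + '"]';
}

// Hypothetical helper: replace every matching post with the text "Filtered"
// and return how many posts were matched.
function filterPosts(root, name) {
  const posts = root.querySelectorAll(attributionSelector(name));
  posts.forEach(function (post) {
    post.replaceWith('Filtered'); // a string argument is inserted as a text node
  });
  return posts.length;
}

console.log(attributionSelector('testtext')); // [data-attributionline^="testtext"]

// In a browser page:
if (typeof document !== 'undefined') {
  filterPosts(document, 'testtext');
}
```

Swapping ^= for *= would match the name anywhere in the attribute value rather than only at the start.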

How to limit jQuery searches to a certain area

Let's say I have the following.
<div class="foo">
<div>
some text
<div class="bar">
</div>
</div>
</div>
<div class="foo">
<div>
some text
<div class="bar">
some text
</div>
</div>
</div>
I want to return all the divs of class "foo" that have "some text" inside the div of class "bar". So the second would be returned, but not the first. How can I do that?
Try this
$("div.bar:contains('some text')").parents(".foo")
This will do it
$('.foo:has(.bar:not(:empty))')
Make sure there are no characters inside the empty .bar, not even spaces or newlines, because :empty does not match an element that contains whitespace text.
http://jsfiddle.net/Nk4FB/
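A plain-DOM sketch that limits the search to each .foo and tolerates whitespace inside an otherwise empty .bar, since it compares text rather than relying on :empty. The helper name fooWithBarText is hypothetical.

```javascript
// Hypothetical helper: return the .foo elements under `root` that have a .bar
// descendant whose text contains `text`.
function fooWithBarText(root, text) {
  return Array.from(root.querySelectorAll('.foo')).filter(function (foo) {
    // Search only inside this .foo, not the whole document.
    return Array.from(foo.querySelectorAll('.bar')).some(function (bar) {
      return bar.textContent.includes(text);
    });
  });
}

// In a browser page:
if (typeof document !== 'undefined') {
  console.log(fooWithBarText(document, 'some text').length);
}
```

Calling querySelectorAll on the .foo element itself is what restricts the lookup to that area of the page.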
