How should I parse complex element in Cheerio - javascript

I'm using the cheerio library as a scraper in my Node.js project. I want to parse the following structure:
<li class="sub menu-category-main">
<p>
<span class="price">$16.00</span>
ZESTAW DNIA + ZUPA
</p>
</li>
<li class=" ">
<p>
<span class="price">$12.00</span>
<img class="allergens" title="Vegerarian" src="/new_site/img/vegetarian_.png">
NALEŚNIKI AMERYKAŃSKIE Z SOSEM OWOCOWYM
<br>
american pancakes with fruit sauce
</p>
</li>
<li class=" ">
<p>
<span class="price">$11.00</span>
<img class="allergens" title="lactose free" src="/new_site/img/lactose_.png">
<img class="allergens" title="gluten free" src="/new_site/img/gluten_.png">
<img class="allergens" title="Vegerarian" src="/new_site/img/vegetarian_.png">
LECZO WEGETARIAŃSKIE
<br>
vegetables lecho
</p>
</li>
How can I parse this HTML to get the price, the name, and the list of images for each item? In the end I want to build a JSON object to reuse the data (I know how to build the JSON; I only have problems parsing the HTML above).
Note that the names appear in both English and Polish. I'm interested in the Polish strings. Also note that the structure of this document is very irregular (not consistent).
I should add that calling .text() on the "p" element does not give me the results I want.

Related

How to make pop up with reactjs

How do I make a pop-up information box appear when an element is clicked?
For example, clicking the "View all study equipment" element should open a pop-up with more information.
This is my code:
<>
<div className="description-container">
<div className="nav-tab">
<ul className="description-tab">
<li>
<a href="/" className="active">
Class Description
</a>
</li>
<li>
Testimony
</li>
<li>
FAQ
</li>
</ul>
</div>
</div>
<div className="description-info">
<div className="desc-list">
<h2>Description</h2>
<p>
Websites in today's era have become a major need that cannot be
ignored. All business or education sectors can use the website as a
tool for promotion, exchange of information, and others. Based on data
from the World Wide Web Technology Surveys, of all active websites,
88.2% use HTML, 95.6% use CSS and 95% use JavaScript. This class
thoroughly discusses the basics of HTML, CSS and JavaScript as the
three foundations of website creation.
</p>
<ul>
<li>
<p>The web is a platform that can be accessed through many kinds of devices. This is an advantage if you develop Web-based applications.</p>
</li>
<li>
<p>Web development does not require a computer/laptop that has high specifications, so it is not an obstacle for those of you who do not have a capable device.</p>
</li>
<li>
<p>The website is a platform that is reached by search engines such as Google Search, so the website is suitable as a medium for promoting business or content.</p>
</li>
<li>
<p>Developing a website includes development that is easy to maintain and easy to publish.</p>
</li>
</ul>
</div>
<div className="specs-description-info">
<h4>Learning Equipment</h4>
<h5><i className="fas fa-microchip"></i>Processor</h5>
<span>Intel Celeron (Core i3 and above Recommended)</span>
<h4>Tools needed for learning:</h4>
<span><i className="fad fa-window"></i>Web Browser (Google Chrome or Mozilla Firefox)</span>
<br />
<br />
<a className="equip_info" href="/">View all study equipment</a>
</div>
</div>
<div className="sub-border-description-info">
<h3>Student Targets and Goals :</h3>
<ul>
<li>
<p>This class is intended for beginners who want to start their career in the field of web development (web creation) and need a strong foundation or foundation before studying deeper in the web field, with reference to international standards owned by Google Developers.</p>
</li>
<li>
<p>Classes can be attended by students who are IT literate so it is mandatory to have and be able to operate a computer well.</p>
</li>
<li>
<p>This class is designed for beginners so there are no prerequisites in prior understanding of programming.</p>
</li>
<li>
<p>Students must be able to learn independently, be committed, really have curiosity, and be interested in the subject matter, because no matter how good this class material is, it will not be useful without students' seriousness to learn, practice, and try.</p>
</li>
</ul>
</div>
<div className="subdescription-info">
<h3>General and Specific Objectives of the Training :</h3>
<ul>
<li>
<p>At the end of the training, participants can create a simple website using programming code that complies with global standards.</p>
</li>
<li>
<p>Build a website using simple HTML, CSS, and JavaScript code.</p>
</li>
<li>
<p>Implement a good website structure using standard semantic HTML.</p>
</li>
<li>
<p>Demonstrating the preparation of website layouts using float or flexbox techniques.</p>
</li>
</ul>
</div>
</>
Could you show how to do this with React and CSS?
I'm using the latest version of React.
Thank you.
You can do this with the useState hook. Let me explain.
First, import the useState hook from React (see the React docs; if you're using VS Code, it will probably add the import for you once you type the next line).
This creates a state value (popup) with a default of false, which can be changed with the togglepopup function:
const [popup, togglepopup] = useState(false);
Then, on the button or element you want to use to toggle the popup, write:
onClick={() => togglepopup(!popup)}
This flips the current popup value: false => true, true => false. Note the arrow function: writing onClick={togglepopup(!popup)} would call the setter during every render instead of on click.
Then add a piece of JSX that looks like this:
{popup && <div className={'popup-bg'}>
<div className={'popup'}></div></div>}
Now set popup-bg in your stylesheet to position: fixed, width: 100vw, height: 100vh, display: flex, align-items: center, justify-content: center, backdrop-filter: blur(10px);
and it will center your popup div.
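The onClick step is easy to get wrong, because the prop needs a function, not the result of calling one. Here is a minimal sketch in plain JavaScript, with no React involved, of that difference. makeState is a hypothetical stand-in for useState that only models "a value plus a function that replaces it"; it is not React's API.

```javascript
// Hypothetical stand-in for React's useState: returns a getter and a setter.
// It does not re-render anything; it only holds a value.
function makeState(initial) {
  let value = initial;
  const get = () => value;
  const set = (next) => { value = next; };
  return [get, set];
}

const [getPopup, togglepopup] = makeState(false);

// Passing a function: nothing happens until the "click" calls it.
const handleClick = () => togglepopup(!getPopup());

console.log(getPopup()); // false, the handler is defined but not yet called
handleClick();           // simulated first click
console.log(getPopup()); // true
handleClick();           // simulated second click
console.log(getPopup()); // false

// Calling the setter inline runs it right away and hands over its return
// value (undefined), so there is no handler left to run on click.
const notAHandler = togglepopup(!getPopup());
console.log(typeof notAHandler); // "undefined"
```

In a real component the same idea applies: the arrow function is what gets stored as the click handler, and the setter only runs when the click happens.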

How to getElement of an "a" tag within a "class"?

I've been wrestling with vanilla javascript for GetElementsBy(ID/Tag/Span) for some time, and was wondering if any of you have encountered this or know a solution to this problem.
I'm trying to getElementBy(...) for 3 innerHTML texts in the DOM that looks like this:
<ul class ="main_Bucket">
<li class="id_category">
<span class="id_item">
<span class="id_device">
This is the Data I want to Grab
</span>
</li>
<ul class ="main_Bucket">
<li class="id_category">
<span class="id_item">
<span class="id_device">
This is the Data I want to Grab
</span>
</li>
<ul class ="main_Bucket">
<li class="id_category">
<span class="id_item">
<span class="id_device">
This is the Data I want to Grab
</span>
</li>
Ultimately, I want to grab all three texts inside the text tag using GetElementsBy(...). What is the right approach to get this data?
Your HTML is pretty messy. I cleaned it up in the example below and gave each item distinct text, to make it clearer that we're grabbing the text of 3 different links. The code iterates over the anchors it finds, using the same highly specific selector that @Jaromanda X wrote in his comment.
let anchors = document.querySelectorAll('ul.main_Bucket>li.id_category>span.id_device>a');
console.log(anchors.length, "anchors found");
anchors.forEach((anchor)=>console.log(anchor.innerText));
<ul class ="main_Bucket">
<li class="id_category">
<span class="id_item"></span>
<span class="id_device">
<a>ONE: This is the Data I want to Grab</a>
</span>
</li>
</ul>
<ul class ="main_Bucket">
<li class="id_category">
<span class="id_item"></span>
<span class="id_device">
<a>TWO: This is the Data I want to Grab</a>
</span>
</li>
</ul>
<ul class ="main_Bucket">
<li class="id_category">
<span class="id_item"></span>
<span class="id_device">
<a>THREE: This is the Data I want to Grab</a>
</span>
</li>
</ul>

get html content using jsoup

I am not able to get the data out of the content below using jsoup. Can you please suggest how to get the title, date, href, and description for each link? I am calling jsoup from JavaScript, so please suggest a solution I can use there. In the end I should have two rows with the columns listed above, taken from the content under the div tags. This is what I have tried:
Jsoup = org.jsoup.Jsoup;
Whitelist = org.jsoup.safety.Whitelist;
OutputSettings = org.jsoup.nodes.Document.OutputSettings;
EscapeMode = org.jsoup.nodes.Entities.EscapeMode;
doc = Jsoup.parse(html);
os = OutputSettings().escapeMode(EscapeMode.xhtml).charset("utf-8");
var title = doc.outputSettings(os).select("a[href]").text();
var links = doc.outputSettings(os).select("a").attr("href");
But with that code I get everything in one row. Also, is there any way to get the description of each hyperlink? I want two rows for the data below.
<div class="section">
<div class="row">
<div class="col-sm-12">
<p>Dec 18, 2014, 11:00 ET</p>
<ul>
<li><a title="SANS Honors People Who Made a Difference in Cybersecurity in 2014" href="http://www.prnewswire.com/news-releases/sans-honors-people-who-made-a-difference-in-cybersecurity-in-2014-300011928.html">SANS Honors People Who Made a Difference in Cybersecurity in 2014</a></li>
<li> SANS Institute is pleased to announce the winners of the SANS 2014 Difference Makers awards. While the headlines focus on security breaches, there are thousands of security practitioners out there who are quietly succeeding and keeping their companies and customers safe from attacks. The SANS... </li>
<li>More news about: <a title="SANS Institute">SANS Institute</a> </li>
</ul>
</div>
</div>
<div class="row">
<div class="col-sm-9">
<p>Dec 17, 2014, 07:00 ET</p>
<ul>
<li><a title="Microsemi's Ultra-secure SmartFusion2 SoC FPGAs and IGLOO2 FPGAs Recognized on EDN's List of Hot 100 Products of 2014" href="http://www.prnewswire.com/news-releases/microsemis-ultra-secure-smartfusion2-soc-fpgas-and-igloo2-fpgas-recognized-on-edns-list-of-hot-100-products-of-2014-300010576.html">Microsemi's Ultra-secure SmartFusion2 SoC FPGAs and IGLOO2 FPGAs Recognized on EDN's List of Hot 100 Products of 2014</a></li>
<li>  Microsemi Corporation (Nasdaq: MSCC), a leading provider of semiconductor solutions differentiated by power, security, reliability and performance, today announced its SmartFusion2® SoC FPGAs and IGLOO2® FPGAs were recognized among the Hot 100 Products of 2014 by EDN. Among the many... </li>
<li>More news about: <a title="Microsemi Corporation">Microsemi Corporation</a> </li>
</ul>
</div>
<div class="col-sm-3">
<a title="Microsemi's Ultra-secure SmartFusion2 SoC FPGAs and IGLOO2 FPGAs Recognized on EDN's List of Hot 100 Products of 2014" href="http://www.prnewswire.com/news-releases/microsemis-ultra-secure-smartfusion2-soc-fpgas-and-igloo2-fpgas-recognized-on-edns-list-of-hot-100-products-of-2014-300010576.html"> <img src="http://photos.prnewswire.com/prnthumb/20110909/MM66070LOGO" alt="Microsemi Corporation."> </a>
</div>
</div>

Perch CMS - Get first item out of region list

I'm using the Perch CMS to pull through some captions for a bxslider. I currently have 4 taglines. Because I'm using a different image on each bxslider list item I'm wondering if there's a way to pull out a specific index of the array itself.
<ul class="bxslider">
<li> <img src="assets/img/banners/hay-banner.jpg"/> </li>
<li> <img src="assets/img/banners/final-farmhouse-banner.jpg"/> </li>
<li> <img src="assets/img/banners/final-tractor-banner.jpg"/> </li>
<li> <img src="assets/img/banners/property-owners.jpg"/> </li>
</ul>
That's my HTML code currently. And I want to be able to pull the taglines out using
<?php perch_content('Taglines');?>
But obviously that will pull all of the taglines into the title, not just the first tagline for the first <li>, the second tagline for the second <li>, and so on.
Is there a way to do this within perch? (Ideal output below).
<li> <img src="assets/img/banners/hay-banner.jpg" title="perch_content('Taglines (1))"
Create a template for that region in perch/templates/content/bxslider.html
Assuming you want the images to be rendered to 1000x400 on upload:
<perch:before>
<ul class="bxslider">
</perch:before>
<li> <img src="<perch:content id="image" label="Image" type="image" width="1000" height="400" crop="true" />" title="<perch:content id="tagline" label="Tagline" type="text" />" /> </li>
<perch:after>
</ul>
</perch:after>
On your page, create the region with <?php perch_content('Slider'); ?>, reload the page in the browser, then go to the Perch admin, set the newly created "Slider" region on that page to allow multiple items, and choose the "Bxslider" template. Done.
Many more image options are documented at http://docs.grabaperch.com/docs/templates/attributes/type/image/

Change recipe ingredients as serving changes

I am working on a recipe site (which is almost finished except for this problem). On a single recipe page, I have a serving input field and the recipe ingredients.
My question: using jQuery, I want to change the ingredient amounts dynamically when the number of servings changes.
I have seen a similar closed thread, but it did not help me.
My code looks like this:
Serving: <input type="text" name="serving" class="serving" value="5"/> persons
<h3>ingredients</h3>
<ul class="ingredients">
<li class="ingredient">
<span class="amount">1</span> cups
<span class="name">yogurt</span>
</li>
<li class="ingredient">
<span class="amount">2</span> tbsp
<span class="name">chillies</span>
</li>
<li class="ingredient">
<span class="amount">3</span> pieces
<span class="name">butter</span>
</li>
</ul>
I would highly appreciate any help with this so I can finish the site.
Regards
Try this: http://jsfiddle.net/vansimke/tLWpr/
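The linked fiddle is not reproduced here. As a rough sketch of one way to approach it, assuming each amount scales linearly with the number of servings: record the amounts for the original serving count, then multiply them by the ratio of new to original servings. scaleAmounts and its parameters are hypothetical names, and the jQuery wiring in the trailing comment is an untested assumption based on the markup in the question.

```javascript
// Scale ingredient amounts linearly from baseServings to newServings.
// Returns null for invalid input so the caller can leave the page unchanged.
function scaleAmounts(baseAmounts, baseServings, newServings) {
  const ratio = Number(newServings) / Number(baseServings);
  if (!Number.isFinite(ratio) || ratio <= 0) {
    return null;
  }
  // Round to two decimals to avoid output such as 0.30000000000000004.
  return baseAmounts.map((amount) => Math.round(amount * ratio * 100) / 100);
}

// Amounts from the question's markup, written for 5 persons.
console.log(scaleAmounts([1, 2, 3], 5, 10));    // [ 2, 4, 6 ]
console.log(scaleAmounts([1, 2, 3], 5, 'abc')); // null

// Possible jQuery wiring (assumption, untested):
//   var base = $('.amount').map(function () {
//     return parseFloat($(this).text());
//   }).get();
//   $('.serving').on('input', function () {
//     var scaled = scaleAmounts(base, 5, $(this).val());
//     if (scaled) {
//       $('.amount').each(function (i) { $(this).text(scaled[i]); });
//     }
//   });
```

Keeping the original amounts in a separate array matters: scaling the numbers already shown on the page would compound rounding errors each time the input changes.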
