Puppeteer: Need help to extract the text from h2 and span

Absolute beginner here with JS. I need help extracting text from a DOM that looks like the one below.
The extraction itself can be done with querySelectorAll() or getElementsByTagName(), but what I'm looking for is to create an object with each h2 element's text as a key and the text of the spans that follow it as its value. I have no idea how this can be achieved. Any suggestions would be very helpful.
<div class ="product-list">
<div class="row column">
<div class="column medium-9 large-10">
<h2 class="product-name">Products List 1</h2>
</div>
</div>
<div class="row">
<span>First Product</span>
</div>
<div class="row">
<span> Second Product</span>
</div>
.
.
.
<div class="row">
<span>
Nth Product
</span>
</div>
<div class="row column">
<div class="column medium-9 large-10">
<h2 class="product-name">Products List 2</h2>
</div>
</div>
<div class="row">
<span>Third Product</span>
</div>
<div class="row">
<span> Fourth Product</span>
</div>
.
.
.
<div class="row">
<span>
Nth Product
</span>
</div>
</div>
From this DOM I need to store the data as:
{
"Products List 1": ["First Product", "Second Product", ... "Nth Product"],
"Products List 2": ["Third Product", "Fourth Product", ... "Nth Product"]
}
JS:
const products=await page.evaluate(()=>{
const productArray=[];
var pdName1=document.querySelectorAll('div.column > h2.product-name');
var pdName2=document.querySelectorAll("div.row > span")
pdName2.forEach(query=>{
productArray.push(query.innerText)
})
return productArray
})

You can try something like this:
import puppeteer from 'puppeteer';
const browser = await puppeteer.launch();
const html = `
<!doctype html>
<html>
<head><meta charset='UTF-8'><title>Test</title></head>
<body>
<div class ="product-list">
<div class="row column">
<div class="column medium-9 large-10">
<h2 class="product-name">Products List 1</h2>
</div>
</div>
<div class="row"><span>First Product</span></div>
<div class="row"><span> Second Product</span></div>
<div class="row"><span>Nth Product</span></div>
<div class="row column">
<div class="column medium-9 large-10">
<h2 class="product-name">Products List 2</h2>
</div>
</div>
<div class="row"><span>Third Product</span></div>
<div class="row"><span> Fourth Product</span></div>
<div class="row"><span>Nth Product</span></div>
</div>
</body>
</html>`;
try {
const [page] = await browser.pages();
await page.goto(`data:text/html,${html}`);
const data = await page.evaluate(() => {
const elements = document.querySelectorAll('h2, div.row span');
const list = {};
let currentKey = null;
for (const element of elements) {
if (element.tagName === 'H2') {
currentKey = element.innerText;
list[currentKey] = [];
} else {
list[currentKey].push(element.innerText);
}
}
return list;
});
console.log(data);
} catch (err) { console.error(err); } finally { await browser.close(); }
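A slightly more defensive variant of the same grouping idea is sketched below. It assumes the headings and spans arrive in document order (which is what querySelectorAll() and page.$$eval() provide), trims whitespace from every text, and ignores any span that appears before the first h2. The function name groupByHeading and the selector in the usage comment are illustrative, not part of the original answer.

```javascript
// Groups span texts under the most recent h2 text.
// `elements` is an array of elements in document order, e.g. the array that
// page.$$eval() passes to its callback. The function is self-contained so it
// can be serialized and run in the browser context.
function groupByHeading(elements) {
  const groups = {};
  let currentKey = null;
  for (const el of elements) {
    const text = el.textContent.trim();
    if (el.tagName === "H2") {
      // A new heading starts a new (empty) group.
      currentKey = text;
      groups[currentKey] = [];
    } else if (currentKey !== null) {
      // Spans seen before any heading are skipped instead of throwing.
      groups[currentKey].push(text);
    }
  }
  return groups;
}

// Possible usage inside the Puppeteer script (selector is an assumption):
// const data = await page.$$eval(
//   ".product-list h2, .product-list div.row span",
//   groupByHeading
// );
```

Passing the function to page.$$eval() keeps the grouping logic in one named place, and because it only relies on tagName and textContent it can be checked with plain objects outside the browser.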

Related

drop down search bar on click with history?

I want to add a feature to my current search bar so that when I click it, I see a dropdown of all the previous inputs. If I then click one of these previous inputs, it should run my code again. I am currently a boot camp student and I just need guidance on how to make this work. If someone could point me in the right direction or explain some sample functions, that would be really helpful. Thanks in advance.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<link
rel="stylesheet"
href="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css"
/>
<link rel="stylesheet" href="./assets/style.css" />
<title>Weather Dashboard</title>
</head>
<body onload="load()">
<div class="container">
<div class="card">
<div class="row">
<div class="col-9 left">
<nav class="row top">
<div class="col" id="cityName">City Name</div>
<form class="form-outline">
<input
type="search"
id="userInput"
class="form-control"
placeholder="search for a city"
aria-label="Search"
/>
</form>
<div class="col" id="date">Date</div>
</nav>
<div class="row">
<div class="col-7 temp" id="temperature">15°</div>
<div class="col-5 time">
<p id="time">11:00</p>
<h2 id="today"><b>Saturday</b></h2>
<p id="conditions">Cloudy</p>
</div>
</div>
<div class="row bottom">
<div class="col"><hr /></div>
<div class="col">
<div class="row" id="condition1">Condition</div>
<div class="row data"><img id="conditionIcon1" /></div>
</div>
<div class="col">
<div class="row" id="condition2">Condition</div>
<div class="row data"><img id="conditionIcon2" /></div>
</div>
<div class="col">
<div class="row" id="condition3">Condition</div>
<div class="row data"><img id="conditionIcon3" /></div>
</div>
<div class="col">
<div class="row" id="condition4">Condition</div>
<div class="row data"><img id="conditionIcon4" /></div>
</div>
<div class="col">
<div class="row" id="condition5">Condition</div>
<div class="row data"><img id="conditionIcon5" /></div>
</div>
<div class="col"><hr /></div>
</div>
<div class="row bottom">
<div class="col"><hr /></div>
<div class="col">
<div class="row" id="date1">Sun</div>
<div class="row data" id="date1Temp"><b>0°</b></div>
</div>
<div class="col">
<div class="row" id="date2">Mon</div>
<div class="row data" id="date2Temp"><b>0°</b></div>
</div>
<div class="col">
<div class="row" id="date3">Tue</div>
<div class="row data" id="date3Temp"><b>0°</b></div>
</div>
<div class="col">
<div class="row" id="date4">Wed</div>
<div class="row data" id="date4Temp"><b>0°</b></div>
</div>
<div class="col">
<div class="row" id="date5">Thu</div>
<div class="row data" id="date5Temp"><b>0°</b></div>
</div>
<div class="col"><hr /></div>
</div>
<div class="row bottom">
<div class="col"><hr /></div>
<div class="col">
<div class="row">Humidity</div>
<div class="row data" id="humidity1"><b>0°</b></div>
</div>
<div class="col">
<div class="row">Humidity</div>
<div class="row data" id="humidity2"><b>0°</b></div>
</div>
<div class="col">
<div class="row">Humidity</div>
<div class="row data" id="humidity3"><b>0°</b></div>
</div>
<div class="col">
<div class="row">Humidity</div>
<div class="row data" id="humidity4"><b>0°</b></div>
</div>
<div class="col">
<div class="row">Humidity</div>
<div class="row data" id="humidity5"><b>0°</b></div>
</div>
<div class="col"><hr /></div>
</div>
<div class="row bottom">
<div class="col"><hr /></div>
<div class="col">
<div class="row">Wind Speed</div>
<div class="row data" id="windSpeed1"><b>0°</b></div>
</div>
<div class="col">
<div class="row">Wind Speed</div>
<div class="row data" id="windSpeed2"><b>0°</b></div>
</div>
<div class="col">
<div class="row">Wind Speed</div>
<div class="row data" id="windSpeed3"><b>0°</b></div>
</div>
<div class="col">
<div class="row">Wind Speed</div>
<div class="row data" id="windSpeed4"><b>0°</b></div>
</div>
<div class="col">
<div class="row">Wind Speed</div>
<div class="row data" id="windSpeed5"><b>0°</b></div>
</div>
<div class="col"><hr /></div>
</div>
</div>
<div class="col-3 right">
<div class="row top" id="right-header">Today's Statistics</div>
<div class="timely">
<div class="row">Temp High:<b id="tempHigh">0°</b></div>
<div class="row">Temp Low:<b id="tempLow">0°</b></div>
<div class="row">Feels Like:<b id="feelslike">0°</b></div>
<div class="row">Wind Speed:<b id="windspeed">0°</b></div>
<div class="row">Humidity:<b id="humidity">0°</b></div>
<div class="row">Pressure:<b id="pressure">0°</b></div>
</div>
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.2.1/jquery.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/dayjs@1/dayjs.min.js"></script>
<script>
dayjs().format();
</script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/moment.js/2.24.0/moment.min.js"></script>
<script src="./assets/script.js"></script>
</body>
</html>
const date = document.querySelector("#date");
const time = document.querySelector("#time");
const dayOfWeek = document.querySelector("#today");
const input = document.querySelector("#userInput");
date.innerText = moment().format("MMMM Do YYYY");
time.innerText = moment().format("h:mm A");
dayOfWeek.innerText = moment().format("dddd");
// applies elements on page load with current position
function load() {
navigator.geolocation.getCurrentPosition((position) => {
let lat = position.coords.latitude;
let long = position.coords.longitude;
let fiveDayURL = `https://api.openweathermap.org/data/2.5/forecast?lat=${lat}&lon=${long}&appid=b169b31281ffa2a2b70b9e8ac22c3e88&units=imperial`;
fetch(fiveDayURL)
.then((res) => {
return res.json();
})
.then((data) => {
fiveDayWeather(data);
console.log(data);
localStorage.setItem("response", JSON.stringify(data.city.name));
loadUrl();
});
});
}
function loadUrl() {
let cityName = JSON.parse(localStorage.getItem("response"));
let requestURL = `https://api.openweathermap.org/data/2.5/weather?q=${cityName}&appid=b169b31281ffa2a2b70b9e8ac22c3e88&units=imperial`;
fetch(requestURL)
.then((res) => {
return res.json();
})
.then((data) => {
// console.log(data);
displayWeather(data);
})
.catch(() => {
alert("Unable to connect to OpenWeather");
});
}
// uses user input as parameter to getApi()
input.addEventListener("keypress", function (e) {
if (e.key === "Enter") {
e.preventDefault();
// let cityName = document.querySelectro("#userInput").value;
// let li = document.createElement("li")
// li.innerText = cityName;
// document.querySelector('ul');
// ul.appendChild(li);
getApi();
input.value = "";
}
});
// fetches api using the user input
function getApi() {
let cityName = document.querySelector("#userInput").value;
let requestURL = `https://api.openweathermap.org/data/2.5/weather?q=${cityName}&appid=b169b31281ffa2a2b70b9e8ac22c3e88&units=imperial`;
fetch(requestURL)
.then((res) => {
return res.json();
})
.then((data) => {
// console.log(data);
displayWeather(data);
})
.catch(() => {
alert("Unable to connect to OpenWeather");
});
}
// uses api data from getApi() and replaces text in html
let displayWeather = function (weatherData) {
document.querySelector("#cityName").innerText = weatherData.name;
document.querySelector("#temperature").innerText =
Math.floor(weatherData.main.temp) + "\u00B0";
document.querySelector("#conditions").innerText =
weatherData.weather[0].description;
document.querySelector("#tempHigh").innerText =
weatherData.main.temp_max + "\u00B0 F";
document.querySelector("#tempLow").innerText =
weatherData.main.temp_min + "\u00B0 F";
document.querySelector("#feelslike").innerText =
weatherData.main.feels_like + "\u00B0 F";
document.querySelector("#windspeed").innerText =
weatherData.wind.speed + " MPH";
document.querySelector("#humidity").innerText =
weatherData.main.humidity + "%";
document.querySelector("#pressure").innerText =
weatherData.main.pressure + " hPa";
let fiveDayURL = `https://api.openweathermap.org/data/2.5/forecast?lat=${weatherData.coord.lat}&lon=${weatherData.coord.lon}&appid=b169b31281ffa2a2b70b9e8ac22c3e88&units=imperial`;
fetch(fiveDayURL)
.then((res) => {
return res.json();
})
.then((data) => {
// console.log(data);
fiveDayWeather(data);
})
.catch(() => {
alert("Unable to connect to OpenWeather");
});
};
// obtains lon and lat from previous function then completes new fetch to display 5 day forecast
let fiveDayWeather = function (weatherValue) {
let todaysMonth = dayjs().$M;
for (let i = 1; i < 6; i++) {
document.querySelector("#date" + i).innerText = `${todaysMonth}/${
dayjs().$D + i
}`;
document.querySelector("#date" + i + "Temp").innerText =
weatherValue.list[i].main.temp + "\u00B0 F";
document.querySelector("#condition" + i).innerText =
weatherValue.list[i].weather[0].description;
document.querySelector("#conditionIcon" + i).src =
"http://openweathermap.org/img/wn/" +
weatherValue.list[i].weather[0].icon +
"@2x.png";
document.querySelector("#humidity" + i).innerText =
weatherValue.list[i].main.humidity + "%";
document.querySelector("#windSpeed" + i).innerText =
weatherValue.list[i].wind.speed + "MPH";
}
};
You can store the search history as an array in local storage and retrieve it this way:
// our array
var movies = ["Reservoir Dogs", "Pulp Fiction", "Jackie Brown",
"Kill Bill", "Death Proof", "Inglourious Basterds"];
// storing our array as a string
localStorage.setItem("quentinTarantino", JSON.stringify(movies));
// retrieving our data and converting it back into an array
var retrievedData = localStorage.getItem("quentinTarantino");
var movies2 = JSON.parse(retrievedData);
//making sure it still is an array
alert(movies2.length);
Retrieve the array from local storage and use it to populate the dropdown of the search box. When the user searches for another query string, push it onto the retrieved array and store the array in local storage again. Hope it helps!
Hello, I looked at the code and can suggest a few points:
Create an array and store the e.target.value from #userInput in it when the Enter key is pressed. If you want, you can also persist the array in local storage at a later stage.
Then, through an event listener on the input box, show the values stored in the array as a list (for example a simple ul/li combination that you can style later).
You also need to pick up the history value the user selects; one way is to attach an event listener to each item rendered from the array.
Finally, call the API (or do whatever you need) with the selected value from that event listener.
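The storage part of these suggestions could be sketched roughly as follows. The key name "searchHistory", the helper names loadHistory and addToHistory, and the limit of 10 entries are all assumptions made for illustration. The helpers take the storage object as a parameter, so in the browser you would pass window.localStorage.

```javascript
// Assumed key name under which the history array is stored as JSON.
const HISTORY_KEY = "searchHistory";

// Reads the history array from a Storage-like object (getItem/setItem).
function loadHistory(storage) {
  const raw = storage.getItem(HISTORY_KEY);
  return raw ? JSON.parse(raw) : [];
}

// Adds a search term to the front of the history, removing an earlier
// case-insensitive duplicate and keeping at most `limit` entries.
function addToHistory(storage, city, limit = 10) {
  const term = city.trim();
  if (!term) return loadHistory(storage); // ignore empty searches
  const history = loadHistory(storage).filter(
    (item) => item.toLowerCase() !== term.toLowerCase()
  );
  history.unshift(term);
  const trimmed = history.slice(0, limit);
  storage.setItem(HISTORY_KEY, JSON.stringify(trimmed));
  return trimmed;
}

// Possible browser usage inside the existing keypress handler:
// addToHistory(localStorage, input.value);
// loadHistory(localStorage).forEach((city) => { /* render one <li> per city */ });
```

Each rendered list item can then get a click listener that puts its text back into the input and calls the existing getApi() function.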

changing Div order in a div main container, javascript DOM manipulation

I want to move a div from the start to the end of the same container div: from 1-2-3 to 2-3-1.
My code:
const cards = document.querySelectorAll(".card");
const firstCard = document.querySelectorAll(".card")[0].innerHTML;
cards[0].remove();
document.getElementById("mainC").appendChild(firstCard);
<div id="mainC">
<div class="card"> 1 </div>
<div class="card"> 2 </div>
<div class="card"> 3 </div>
</div>
Based on your original code, you only need to remove .innerHTML; then it will work:
const cards = document.querySelectorAll(".card");
const firstCard = document.querySelectorAll(".card")[0];// remove .innerHTML and it will work
cards[0].remove();
document.getElementById("mainC").appendChild(firstCard);
<div id="mainC">
<div class="card"> 1 </div>
<div class="card"> 2 </div>
<div class="card"> 3 </div>
</div>
Another solution is to store the content into an array and change the array element order
let divs = []
document.querySelectorAll('#mainC .card').forEach(d =>{
divs.push(d.outerHTML)
})
divs.push(divs.shift())
document.querySelector('#mainC').innerHTML = divs.join('')
<div id="mainC">
<div class="card"> 1 </div>
<div class="card"> 2 </div>
<div class="card"> 3 </div>
</div>
You used document.querySelectorAll(".card")[0].innerHTML, which returns a string (the card's inner markup, here ' 1 ') rather than a Node, so appendChild() throws an error.
Remove .innerHTML and it will work.
Here is an example that takes the first child and appends it to the end:
const shuffle = () => {
const parent = document.querySelector("#mainContainer");
const childrens = [...parent.children];
parent.appendChild(childrens.splice(0,1)[0]);
};
<button type="button" onclick="shuffle()">shuffle</button>
<div id="mainContainer">
<div class="card">1</div>
<div class="card">2</div>
<div class="card">3</div>
</div>
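Because appendChild() moves a node that is already in the document instead of copying it, the rotation can also be written without an explicit remove() call. The following is a minimal sketch; the function name moveFirstToEnd is made up for illustration.

```javascript
// Moves the first child element of `parent` to the end of `parent`.
// appendChild() detaches an already-attached node from its current position
// before re-inserting it, so no separate remove() is needed.
// Returns the moved node, or null if the parent has no child elements.
function moveFirstToEnd(parent) {
  const first = parent.firstElementChild;
  if (!first) return null;
  return parent.appendChild(first);
}

// Possible browser usage with the markup from the question:
// moveFirstToEnd(document.getElementById("mainC")); // 1-2-3 becomes 2-3-1
```

Calling it repeatedly keeps cycling the cards, which may be handy for a simple carousel.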

Puppeteer; Get Values within an element

I'm stuck here.
I have multiple rows with the class row-content.
I get them like this:
const rows = await page.$$('.row-content');
Almost every row in rows contains several spans with the class cashspan.
I would like to get those values in an array called 'values'.
I've tried many things with no success:
for (let m = 0; m < rows.length; m++) {
const row = await rows[m];
const values = await row.evaluate(() => Array.from(row.getElementsByClassName('cashspan'), element => element.textContent));
console.log(values)
}
This was the latest thing I've tried.
With
const spancashs = await page.evaluate(() => Array.from(document.querySelectorAll('[class="cashspan"]'), element => element.textContent));
I get all the elements on the page, but I need them for every row. Hope that makes sense.
Update1
Example:
<div class="container">
<div class="row-content">
<div class="someclass1">
<div class="someclass2">
<span class="cashspan">1</span>
</div>
</div>
<div class="someclass3">
<div class="someclass4">
<span class="cashspan">2</span>
</div>
</div>
<div class="someclass5">
<div class="someclass6">
<span class="cashspan">3</span>
</div>
</div>
</div>
<div class="row-content">
<div class="someclass7">
<div class="someclass8">
<span class="cashspan">4</span>
</div>
</div>
<div class="someclass9">
<div class="someclass10">
<span class="cashspan">5</span>
</div>
</div>
<div class="someclass11">
<div class="someclass12">
<span class="cashspan">6</span>
</div>
</div>
</div>
<div class="row-content">
<div class="someclass13">
<div class="someclass14">
<span class="cashspan">7</span>
</div>
</div>
<div class="someclass15">
<div class="someclass16">
<span class="cashspan">8</span>
</div>
</div>
<div class="someclass17">
<div class="someclass18">
<span class="cashspan">9</span>
</div>
</div>
</div>
</div>
Code:
const rows = await page.$$('.row-content');
for (let i = 0; i < rows.length; i++) {
const row = await rows[i];
const values = await row.evaluate(() =>
Array.from(row.getElementsByClassName('cashspan'), element =>
element.textContent));
console.log(values)
}
I'm trying to get all cashspan values in every row-content container. The output for this example should be:
[ 1, 2, 3 ]
[ 4, 5, 6 ]
[ 7, 8 ,9 ]
Following up on the comments, the row variable inside of evaluate()'s callback was never defined in browser context. Adding that variable to the evaluate() callback parameter list worked for me on the provided example. This is the only non-cosmetic change below:
const puppeteer = require("puppeteer"); // ^13.5.1
const html = `
<body>
<div class="container">
<div class="row-content">
<div class="someclass1">
<div class="someclass2">
<span class="cashspan">1</span>
</div>
</div>
<div class="someclass3">
<div class="someclass4">
<span class="cashspan">2</span>
</div>
</div>
<div class="someclass5">
<div class="someclass6">
<span class="cashspan">3</span>
</div>
</div>
</div>
<div class="row-content">
<div class="someclass7">
<div class="someclass8">
<span class="cashspan">4</span>
</div>
</div>
<div class="someclass9">
<div class="someclass10">
<span class="cashspan">5</span>
</div>
</div>
<div class="someclass11">
<div class="someclass12">
<span class="cashspan">6</span>
</div>
</div>
</div>
<div class="row-content">
<div class="someclass13">
<div class="someclass14">
<span class="cashspan">7</span>
</div>
</div>
<div class="someclass15">
<div class="someclass16">
<span class="cashspan">8</span>
</div>
</div>
<div class="someclass17">
<div class="someclass18">
<span class="cashspan">9</span>
</div>
</div>
</div>
</div>
</body>
`;
let browser;
(async () => {
browser = await puppeteer.launch({headless: true});
const [page] = await browser.pages();
await page.setContent(html);
const rows = await page.$$('.row-content');
for (let i = 0; i < rows.length; i++) {
const row = await rows[i];
const values = await row.evaluate(row => Array.from(
row.getElementsByClassName('cashspan'),
element => element.textContent
));
console.log(values);
}
})()
.catch(err => console.error(err))
.finally(() => browser?.close())
;
If this isn't working on the live site, the problem could be due to any number of JS behaviors, visibility or timing issues, so more detail would be necessary to accurately reproduce the problem.
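As an alternative sketch, all rows can be collected in a single round trip with page.$$eval(), which passes the array of matched elements to a callback that runs in the browser. The helper name collectRowValues is an assumption for illustration. Note that textContent yields strings, so the result would be arrays of strings rather than numbers.

```javascript
// Runs in the browser context when passed to page.$$eval().
// `rows` is the array of elements matched by the selector; the function
// returns one array of trimmed cashspan texts per row.
function collectRowValues(rows) {
  return rows.map((row) =>
    Array.from(row.querySelectorAll(".cashspan"), (span) =>
      span.textContent.trim()
    )
  );
}

// Possible usage in the Puppeteer script:
// const valuesPerRow = await page.$$eval(".row-content", collectRowValues);
// for (const values of valuesPerRow) console.log(values);
```

This avoids one evaluate() call per row, which may matter on pages with many rows.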

JavaScript, HTML search engine - how to get parent of an element?

My problem is:
The search script works, but it only hides the h3 elements:
<h3 class="post-subtitle" style="display: flex;">Protokoły tunelowania VPN</h3>
<h3 class="post-subtitle" style="display: flex;">Certyfikat cyfrowy</h3>
I need the code to hide the whole div with the ID "post" instead of just the h3 element.
How do I do that?
HTML Code for Search Bar:
<div id="kontener" class="container">
<div style="text-align:center" id="search-bar">
<input type="text" id="searchbar" onkeyup="searchBar()" class="shadow-lg">
</div>
</div>
HTML Code on Website
<!-- First element -->
<div id="post">
<div class="row">
<div class="col-lg-8 col-md-10 mx-auto">
<div class="post-preview">
<a href="URL">
<h2 class="post-title"><i class="far fa-sticky-note fa-xs" aria-hidden="true"></i> ASO</h2>
<h3 class="post-subtitle" style="display: flex;">Protokoły tunelowania VPN</h3>
</a>
<p class="post-meta">11 Maj, 2021</p>
</div>
</div>
</div>
<hr>
</div>
<!-- End of First element -->
<!-- Second element -->
<div id="post">
<div class="row">
<div class="col-lg-8 col-md-10 mx-auto">
<div class="post-preview">
<a href="URL">
<h2 class="post-title"><i class="far fa-sticky-note fa-xs" aria-hidden="true"></i> ELSK</h2>
<h3 class="post-subtitle" style="display: flex;">Certyfikat cyfrowy</h3>
</a>
<p class="post-meta">26 Kwiecień, 2021</p>
</div>
</div>
</div>
<hr>
</div>
<!-- End of Second element -->
JavaScript code:
<script>
function searchBar() {
let input = document.getElementById('searchbar').value
input=input.toLowerCase();
let x = document.getElementsByClassName('post-subtitle');
for (i = 0; i < x.length; i++) {
if (!x[i].innerHTML.toLowerCase().includes(input)) {
x[i].style.display="none";
}
else {
x[i].style.display="flex";
}
}
}
</script>
You just need to target an ancestor of the h3, either by chaining .parentElement / .parentNode or by using the .closest() function.
Example:
<script>
function searchBar() {
// Lower-case the query once so the comparison is case-insensitive
let input = document.getElementById('searchbar').value
input=input.toLowerCase();
let x = document.getElementsByClassName('post-subtitle');
for (let i = 0; i < x.length; i++) {
if (!x[i].innerHTML.toLowerCase().includes(input)) {
// closest() walks up from the h3 to the nearest ancestor matching the selector
x[i].closest('#post').style.display="none";
// Or this below (note each parentElement moves one level up the tree)
// x[i].parentElement.parentElement.parentElement.parentElement.parentElement.style.display="none";
}
else {
// An empty string removes the inline style and restores the div's default display
x[i].closest('#post').style.display="";
}
}
}
</script>
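One thing worth noting about the markup is that ids are meant to be unique within a document, so repeating id="post" is invalid HTML even though closest('#post') still finds the nearest match. A sketch of the same filter that iterates over the post containers directly is shown below; it assumes the wrappers are marked with class="post" instead, and the function name filterPosts is made up for illustration.

```javascript
// Shows or hides each post container depending on whether its subtitle
// contains the query (case-insensitive). Returns the number of visible posts.
// `posts` is any iterable of elements, e.g. document.querySelectorAll(".post").
function filterPosts(posts, query) {
  const needle = query.trim().toLowerCase();
  let visible = 0;
  for (const post of posts) {
    const subtitle = post.querySelector(".post-subtitle");
    const text = subtitle ? subtitle.textContent.toLowerCase() : "";
    const matches = text.includes(needle);
    // "" clears the inline style so the stylesheet's display value applies.
    post.style.display = matches ? "" : "none";
    if (matches) visible++;
  }
  return visible;
}

// Possible browser usage (assumes class="post" on each wrapper div):
// filterPosts(
//   document.querySelectorAll(".post"),
//   document.getElementById("searchbar").value
// );
```

An empty query matches every post, so clearing the search box shows the full list again.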

Wrapping elements while looping through HTMLCollection causes problem

I want to wrap each item of the container in a div. When I loop through HTMLCollection, some elements are accessed multiple times while others are left out
HTML
<div class="container">
<div class="item_1"></div>
<div class="item_2"></div>
<div class="item_3"></div>
<div class="item_4"></div>
<div class="item_5"></div>
<div class="item_6"></div>
<div class="item_7"></div>
<div class="item_8"></div>
<div class="item_9"></div>
</div>
JS
const container = document.querySelector('.container');
const items = container.children;
for(let i = 0; i < items.length; i++) {
const wrapper = document.createElement('div');
wrapper.classList.add('wrapper');
wrapper.appendChild(items[i]);
container.appendChild(wrapper);
}
Looping directly through HTMLCollection gives this bizarre result
<div class="container">
<div class="item_2"></div>
<div class="item_4"></div>
<div class="item_6"></div>
<div class="item_8"></div>
<div class="wrapper">
<div class="item_1"></div>
</div>
<div class="wrapper">
<div class="item_5"></div>
</div>
<div class="wrapper">
<div class="item_9"></div>
</div>
<div class="wrapper">
<div class="wrapper">
<div class="item_7"></div>
</div>
</div>
<div class="wrapper">
<div class="wrapper">
<div class="wrapper">
<div class="wrapper">
<div class="item_3"></div>
</div>
</div>
</div>
</div>
</div>
The problem is solved when I convert the HTMLCollection to an Array:
const items = Array.from(container.children);
I can't understand what causes such behavior
You were iterating over the container.children list while also changing it during the iteration. container.children is a live HTMLCollection, so every appendChild() call immediately changes its contents and shifts the indexes, which is why some items are skipped and others are wrapped repeatedly. You can solve this, as you mentioned yourself, by converting container.children to an array, because then you iterate over a static copy rather than the live list. The copy still refers to the same child elements, so they are moved correctly by appendChild().
As an alternative, you can use querySelectorAll(), which returns a static NodeList, to retrieve all the elements you want to wrap.
const container = document.querySelector('.container');
const items = container.querySelectorAll('.container > *');
for(let i = 0; i < items.length; i++) {
const wrapper = document.createElement('div');
wrapper.classList.add('wrapper');
wrapper.appendChild(items[i]);
container.appendChild(wrapper);
}
.wrapper {
background-color: red;
}
<div class="container">
<div class="item_1">1</div>
<div class="item_2">2</div>
<div class="item_3">3</div>
<div class="item_4">4</div>
<div class="item_5">5</div>
<div class="item_6">6</div>
<div class="item_7">7</div>
<div class="item_8">8</div>
<div class="item_9">9</div>
</div>
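The snapshot approach can be wrapped in a small reusable helper, sketched below. The name wrapChildren and the optional doc parameter (which defaults to the global document and mainly exists so the function can be exercised outside a browser) are assumptions for illustration.

```javascript
// Wraps every child element of `container` in its own <div class="wrapper">.
// Array.from() takes a static snapshot of the live HTMLCollection first, so
// the loop is not affected by the DOM changes it makes.
// Returns the number of children that were wrapped.
function wrapChildren(container, doc = document) {
  const snapshot = Array.from(container.children);
  for (const item of snapshot) {
    const wrapper = doc.createElement("div");
    wrapper.classList.add("wrapper");
    wrapper.appendChild(item);      // moves the item out of the container
    container.appendChild(wrapper); // appends the wrapper at the end
  }
  return snapshot.length;
}

// Possible browser usage:
// wrapChildren(document.querySelector(".container"));
```

Since each item is moved to the end in its original order, the wrapped items end up in the same sequence as before.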
