Can I access elements from a web page with JavaScript? - javascript

I'm making a Discord bot in JavaScript and implementing a feature where when you ask a coding question it gives you a snippet. I'm using Grepper and returning the url with the search results. For example:
Hello World in JavaScript Search Results. I would like to access the div containing the snippet. Is this possible? And how would I do it?
Here's my code:
if (message.startsWith('programming')) {
// Command = programming
message = message.replace('programming ', ''); // Remove programming from the string
message = encodeURIComponent(message) // Encode the string for a url
msg.channel.send(`https://www.codegrepper.com/search.php?answer_removed=1&q=${message}`); // Place formatted string into url and send it to the discord server
// Here the program should access the element containing the snippet instead of sending the url:
}
I'm new to JavaScript so sorry if this is a stupid question.

As far as I know, the endpoint you are using returns HTML/text data, not JSON. Grepper has a lot more APIs if you look into them; for more information, you can check this Unofficial List of Grepper APIs. You can instead use this API, which returns JSON data:
https://www.codegrepper.com/api/get_answers_1.php?v=2&s=${SearchQuery}&u=98467
How Do I Access the div containing the snippet?
To access the div you would need to scrape the page (for example with Python web scraping) and read the innerHTML of the div, but I think it's easier to use the other API.
Or
You can put /api/ in the url like:
https://www.codegrepper.com/api/search.php?answer_removed=1&q=js%20loop

The easiest way for this is to send a GET request to the underlying API
https://www.codegrepper.com/api/search.php?q=hello%20world%20javascript&search_options=search_titles
This will return the answers in JSON format. Obviously you'd have to adjust the parameters.
How did I find out about this?
Simply look at the network tab of your browser's dev tools while loading the page. You'll see a GET request being sent out to the endpoint, returning mentioned answers as JSON.
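A minimal sketch of that GET request, assuming Node 18+ (where fetch is global) and assuming the endpoint still responds with an answers array as described in the other answer. The helper names buildSearchUrl and searchGrepper are made up for illustration.

```javascript
// Hypothetical helper: builds the search URL from a raw query string.
// The endpoint and the search_options parameter are the ones quoted above.
function buildSearchUrl(query) {
  const params = new URLSearchParams({
    q: query,
    search_options: 'search_titles',
  });
  return `https://www.codegrepper.com/api/search.php?${params}`;
}

// Hypothetical helper: fetches the JSON and returns the answers array.
// Assumes the response has an `answers` property; falls back to [].
async function searchGrepper(query) {
  const res = await fetch(buildSearchUrl(query));
  if (!res.ok) {
    throw new Error(`Grepper request failed with status ${res.status}`);
  }
  const data = await res.json();
  return Array.isArray(data.answers) ? data.answers : [];
}
```

URLSearchParams encodes the query for you (spaces become +), so no separate encodeURIComponent call is needed.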

The best way is to use the grepper api.
Install node-fetch:
npm i node-fetch
You need this package to make requests to the API. Note that node-fetch version 3 is ESM-only, so if you load it with require, install version 2 (npm i node-fetch@2).
To import it in your code, just type:
const fetch = require('node-fetch');
Then modify your code like this:
if (message.startsWith('programming')) {
  message = message.replace('programming ', '');
  message = encodeURIComponent(message);
  // Make the request
  fetch(`https://www.codegrepper.com/api/search.php?answer_removed=1&q=${message}`)
    .then(res => res.json())
    .then(response => {
      // response is a JSON object containing all the data you need;
      // now you need to parse it
      const answers = response.answers; // this is an array of objects
      const answers_limit = 3; // set a limit for the answers
      // Check whether there are any answers
      if (answers.length == 0) {
        return msg.channel.send("No answers were found!");
      }
      // Create the embed
      const embed = new Discord.MessageEmbed()
        .setTitle("Here are the answers to your question!")
        .setDescription("");
      // Append each answer to the description
      for (let i = 0; i < answers_limit; i++) {
        if (answers[i]) {
          embed.description += `**${i+1}° answer**:\n\`\`\`js\n${answers[i].answer}\`\`\`\n`;
        }
      }
      console.log(embed);
      msg.channel.send(embed);
    })
    .catch(err => console.error(err)); // log network or JSON parsing errors
}
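One thing the loop above does not handle is long snippets, since Discord limits how long an embed description can be. A rough sketch of building the description separately and stopping before it grows too long; the 2048-character default is an assumption (check Discord's current limit), and buildDescription is a made-up helper name.

```javascript
// Three backticks, built at runtime so they can sit inside a template literal.
const FENCE = '`'.repeat(3);

// Hypothetical helper: turns the first `limit` answers into one description
// string and stops before it would exceed `maxLength` characters.
// maxLength = 2048 is an assumed default, not a confirmed Discord limit.
function buildDescription(answers, limit = 3, maxLength = 2048) {
  let description = '';
  for (const [i, item] of answers.slice(0, limit).entries()) {
    const block = `**Answer ${i + 1}**:\n${FENCE}js\n${item.answer}\n${FENCE}\n`;
    if (description.length + block.length > maxLength) break;
    description += block;
  }
  return description;
}
```

The result can then be passed to setDescription instead of appending to embed.description inside the loop.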

Related

Cheerio web scraping Twitter loading different data

I'm new to Web Scraping, I'm using Axios to fetch the URL, and then access the data with Cheerio.
I want to scrape Twitter to get my account's number of followers. I inspected the element that holds the number of followers and tried to select it, but it doesn't return anything.
So I tried to execute each span tag in the page, and it returns the string "Something went wrong, but don’t fret — let’s give it another shot."
When I inspect the page, I can see the tag elements, but when I click on "view page source", it shows a totally different thing.
I found that the string "Something went wrong, but don’t fret — let’s give it another shot." is located in the page source here:
The element I want when inspecting my twitter page is:
This is my JS code:
const cheerio = require('cheerio');
const axios = require('axios');

axios('https://twitter.com/SaudAlghamdi97')
  .then(response => {
    run();
    async function run() {
      const html = await response.data;
      const $ = cheerio.load(html);
      $('span').each((i, el) => {
        console.log($(el).text());
      });
    }
  })
This is what I get in the terminal:
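Am I missing something here? I'm struggling to scrape the number of followers.
The data you want appears to be rendered by JavaScript in the browser, so it is not in the HTML that Axios downloads. That is also why "view page source" shows something different from the inspector. You'll need another library, for example Puppeteer, which drives a real browser and can see the rendered page the way you see it.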
"Puppeteer is a Node library which provides a high-level API to control Chrome or Chromium over the DevTools Protocol"

Extract From Cursor-Based Pagination URL from HTTP Request Response Headers - Google Apps Script

I am using Google Apps Script to make a series of HTTP Requests. The endpoint I have been using just switched to cursor-based pagination.
The response looks like this.
{...
Link=
<https://x.shopify.com/admin/api/2019-10/inventory_levels.json?limit=250&page_info=abc>;rel="previous",
<https://x.shopify.com/admin/api/2019-10/inventory_levels.json?limit=250&page_info=def>;rel="next"
}
I can use response['Link'] to get it down to
<https://x.shopify.com/admin/api/2019-10/inventory_levels.json?limit=250&page_info=abc>;rel="previous",
<https://x.shopify.com/admin/api/2019-10/inventory_levels.json?limit=250&page_info=def>;rel="next"
Is there a way to extract page_info reliably from the "next" URL without a regular expression? I am fine resorting to a regular expression, but I wondered if there was a specific method for getting it.
Thanks in advance for your help. I dabble and get that I still have a ton to learn.
You can use a regex to extract the URL and whether the link is the next or previous page. Note that the header in your example has no space after the semicolon, so the pattern allows optional whitespace there:
/<([^>]*)>;\s*rel="([^"]*)"/
To use this against your code, you could split the header on commas and match each part, something like this:
const urls = response['Link'].split(',').map(link => {
  const linkContents = link.match(/<([^>]*)>;\s*rel="([^"]*)"/)
  const url = linkContents[1]
  const type = linkContents[2] // next or previous
  return { url, type }
})
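If you would rather avoid a regular expression altogether, plain string operations are enough. A minimal sketch, assuming the header value looks like the one in the question and that the URLs themselves contain no commas; parseLinkHeader and getQueryParam are made-up names. It deliberately avoids the URL and URLSearchParams classes, which may not be available in Apps Script.

```javascript
// Hypothetical helper: splits a Link header into { rel: url } pairs.
// Assumes entries are comma-separated and URLs contain no commas.
function parseLinkHeader(header) {
  const links = {};
  header.split(',').forEach((part) => {
    const start = part.indexOf('<');
    const end = part.indexOf('>');
    const relIndex = part.indexOf('rel=');
    if (start === -1 || end === -1 || relIndex === -1) return;
    const url = part.slice(start + 1, end);
    // Drop the quotes and surrounding whitespace around the rel value.
    const rel = part.slice(relIndex + 4).split('"').join('').trim();
    links[rel] = url;
  });
  return links;
}

// Hypothetical helper: reads one query parameter from a URL string.
// Returns null when the parameter is not present.
function getQueryParam(url, name) {
  const queryStart = url.indexOf('?');
  if (queryStart === -1) return null;
  const pairs = url.slice(queryStart + 1).split('&');
  for (const pair of pairs) {
    const eq = pair.indexOf('=');
    const key = eq === -1 ? pair : pair.slice(0, eq);
    const value = eq === -1 ? '' : pair.slice(eq + 1);
    if (key === name) return decodeURIComponent(value);
  }
  return null;
}
```

With these, parseLinkHeader(response['Link']).next gives the next URL (or undefined on the last page), and getQueryParam(nextUrl, 'page_info') gives the cursor.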

JavaScript (node.js) How to get the URL of a random image from a subreddit?

Basically, I'm trying to get the URL of a random picture from a specific subreddit. I've tried using the raw JSON here, but I can't quite figure it out. I've been using snekfetch to get the JSON, since it's worked before on less complicated sites, but I've seen other methods like superagent and snoowrap that I don't have any idea how to use properly. Here's what I've tried using snekfetch (I'm trying to incorporate this into a discord bot):
case "pic":
if (!args[1]) return message.channel.send("Enter a title (Be VERY specific");
// pics was set equal to "https://www.reddit.com/r/pics.json" earlier
snekfetch.get(pics).then(r => {
let postTitle = args[1];
let img = r.preview.images[0].source.url;
let entry = r.find(post => post.title == postTitle);
let picture = new Discord.RichEmbed()
.setAuthor(entry.title)
.addField(entry)
.setImage(img);
message.channel.send(picture);
//message.channel.send(entry.preview.images[0].source.url);
});
break;
I'm new to JSON, so it wouldn't surprise me if this code is full of horrible mistakes.
Through some googling, I managed to find that apparently each reddit post has an ID in base 36. But these ID's aren't actually in order, so I'd need to store them all in an array and randomly select from that.
In short, how do I retrieve an image from reddit as a URL, and how do I put a certain amount of these images into an array?
Using the JSON data provided you can get all of the images using something like this:
async function getImages(url) {
  const images = [];
  const response = await snekfetch.get(url);
  response.body.data.children.forEach((child) => {
    // Skip posts that have no preview (for example text posts)
    if (!child.data.preview) return;
    child.data.preview.images.forEach((image) => {
      images.push(image.source.url);
    });
  });
  return images;
}
If you need to only gather N images, then you may want to use a standard for loop that breaks when N === images.length.
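A sketch of that loop, written against the already-parsed listing so it can be tried without a network call. It assumes the listing has the data.children[].data.preview.images[].source.url shape used above and skips posts with no preview; collectImages and pickRandom are made-up names.

```javascript
// Hypothetical helper: gathers at most `n` image URLs from a parsed listing.
function collectImages(listing, n) {
  const images = [];
  for (const child of listing.data.children) {
    // Some posts may have no preview, so fall back to an empty list.
    const previews = (child.data.preview && child.data.preview.images) || [];
    for (const image of previews) {
      if (images.length === n) return images;
      images.push(image.source.url);
    }
  }
  return images;
}

// Hypothetical helper: returns a random element, or undefined if empty.
function pickRandom(items) {
  return items[Math.floor(Math.random() * items.length)];
}
```

In the bot, something like pickRandom(collectImages(response.body, 25)) would then give one random image URL from the first 25 found.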

Login into a website with steam login using nodejs

I am trying to log in to a website (csgolounge, for this example) that requires Steam login authentication, using Node.js.
Even though I have tried a few things, none of them came close to working, so there is no point in including the code here.
I was wondering if there is any way of doing this.
EDIT: I think I worded my question incorrectly. I want the Node application to log in to csgolounge using Steam, NOT to build a website that is 'like' csgolounge with the login option.
To answer your question, yes. There is a way of doing this.
The first thing you'll need to do is get a steam api key which you can do by heading over here. Then as steam says:
Just download an OpenID library for your language and platform of choice and use http://steamcommunity.com/openid as the provider. The returned Claimed ID will contain the user's 64-bit SteamID. The Claimed ID format is: http://steamcommunity.com/openid/id/
If you're set on using Node.JS I suggest checking out node-jsonwebtoken or passport-openidconnect. If you choose to go with passport, someone has already developed a "strategy" for including steam. Check that out here.
I had the same issue. I don't know if this helps you, but I wrote some methods to get the user's steamID, which you can then use to get user info with this method. The only instructions I had were for PHP, which is why I wanted to rewrite it in JS.
1) method to build link
const http_build_query = (obj) => {
  // Encode each key and value so that URLs passed as values stay intact
  return Object.keys(obj)
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(obj[key])}`)
    .join('&');
}
2) method which returns the link you should go to in order to log in with Steam (you also can use in in )
const genUrl = (urlToReturnTo) => {
const params = {
'openid.ns' : 'http://specs.openid.net/auth/2.0',
'openid.mode' : 'checkid_setup',
'openid.return_to' : urlToReturnTo,
'openid.realm' : 'http://localhost:8080',
'openid.identity' : 'http://specs.openid.net/auth/2.0/identifier_select',
'openid.claimed_id' : 'http://specs.openid.net/auth/2.0/identifier_select',
};
const url = `${STEAM_LOGIN}?${http_build_query(params)}`
return url;
};
Also, you need to pass genUrl the URL you want to be redirected to after login. If the login is successful, you will be redirected to that URL with some parameters appended; it will look like "http://yoururl?here_is_params".
You then need to get [here_is_params] from the URL somehow. I used this:
const search = location.search.substring(1);
const urlObj = JSON.parse('{"' + decodeURI(search).replace(/"/g, '\\"').replace(/&/g, '","').replace(/=/g,'":"') + '"}')
So after that you will have an object with query params
3) Now all you need is to get the steamID from this object:
const getUserId = (response) =>
{
const str = response["openid.claimed_id"];
const res = decodeURIComponent(str)
const propsArr = res.split("\/");
console.log(propsArr);
return propsArr[propsArr.length-1];
}
const userId = getUserId(urlObj)
4) Now you have the userId, and all you need is to send a request with fetch or axios. It will return a JSON object with the user data:
http://api.steampowered.com/ISteamUser/GetPlayerSummaries/v0002/?key=${apiKey}&steamids=${userId}
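The same steps can be sketched with the built-in URL and URLSearchParams classes, which handle the encoding and decoding. The value of STEAM_LOGIN is an assumption (the answer above uses it without defining it), and buildLoginUrl, parseSteamId, and buildSummaryUrl are made-up names.

```javascript
// Assumed value: the answer above uses STEAM_LOGIN without defining it.
const STEAM_LOGIN = 'https://steamcommunity.com/openid/login';

// Builds the URL the user is sent to in order to log in with Steam.
function buildLoginUrl(returnTo, realm) {
  const params = new URLSearchParams({
    'openid.ns': 'http://specs.openid.net/auth/2.0',
    'openid.mode': 'checkid_setup',
    'openid.return_to': returnTo,
    'openid.realm': realm,
    'openid.identity': 'http://specs.openid.net/auth/2.0/identifier_select',
    'openid.claimed_id': 'http://specs.openid.net/auth/2.0/identifier_select',
  });
  return `${STEAM_LOGIN}?${params}`;
}

// Extracts the steamID from the URL Steam redirects back to.
// Returns null when openid.claimed_id is missing.
function parseSteamId(returnedUrl) {
  const claimedId = new URL(returnedUrl).searchParams.get('openid.claimed_id');
  if (!claimedId) return null;
  // The ID is the last path segment of the claimed_id URL.
  return claimedId.split('/').pop();
}

// Builds the GetPlayerSummaries request URL from step 4.
function buildSummaryUrl(apiKey, steamId) {
  const params = new URLSearchParams({ key: apiKey, steamids: steamId });
  return `http://api.steampowered.com/ISteamUser/GetPlayerSummaries/v0002/?${params}`;
}
```

Note that this sketch only reads the returned parameters and does not verify them. OpenID 2.0 defines a check_authentication step for that, and a server should run it before trusting the ID.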

Cache HTML using request-promise and Node.js

I'm looking for a simple way to cache HTML that I pull using the request-promise library.
The way I've done this in the past is to specify a time-to-live, say one day. I take the parameters passed into request and hash them. Whenever a request is made, I save the HTML contents on the file system in a specific folder, naming the file after the hash and the Unix timestamp. When a later request is made with the same parameters, I check via the timestamp whether the cache is still fresh and either read it or make a new request.
Is there any library that can help with this that can wrap around request? Does request have a method of doing this natively?
I went with the recommendation in the comments and used Redis. Note this only works for GET requests.
/* cached requests */
const crypto = require('crypto')
// client is a Redis client with promisified getAsync/setAsync methods
async function cacheRequest(options){
  let stringOptions = JSON.stringify(options)
  let optionsHashed = crypto.createHash('md5').update(stringOptions).digest('hex')
  let get = await client.getAsync(optionsHashed)
  if (get) return get
  let HTML = await request.get(options)
  await client.setAsync(optionsHashed, HTML)
  return HTML
}
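The Redis version above never expires entries, while the question asked for a time-to-live. A minimal in-memory sketch of that idea, using a plain Map instead of Redis (so this is a swapped-in technique, not the approach above). The names cacheKey, TtlCache, and cachedGet are made up, and fetchHtml stands for whatever function performs the request.

```javascript
const crypto = require('crypto');

// Hashes the request options into a cache key.
// The same options in the same key order give the same key.
function cacheKey(options) {
  return crypto.createHash('md5').update(JSON.stringify(options)).digest('hex');
}

// Minimal in-memory cache whose entries expire after ttlMs milliseconds.
class TtlCache {
  constructor(ttlMs) {
    this.ttlMs = ttlMs;
    this.entries = new Map();
  }

  // `now` is injectable so expiry can be checked without waiting.
  get(key, now = Date.now()) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (now - entry.storedAt >= this.ttlMs) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key, value, now = Date.now()) {
    this.entries.set(key, { value, storedAt: now });
  }
}

// Returns cached HTML when it is still fresh, otherwise calls fetchHtml(options).
async function cachedGet(cache, options, fetchHtml) {
  const key = cacheKey(options);
  const hit = cache.get(key);
  if (hit !== undefined) return hit;
  const html = await fetchHtml(options);
  cache.set(key, html);
  return html;
}
```

For a one-day lifetime you would create the cache with new TtlCache(24 * 60 * 60 * 1000) and pass a function that wraps request.get as fetchHtml.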
