I want to download several data files from this URL: https://pselookup.vrymel.com/
The site contains a date field and a download button. I want to download data for multiple years (which means a lot of requests), and I want to do it automatically.
I've created a JavaScript snippet; however, it keeps downloading the same file over and over again.
$dateField = document.getElementsByClassName('csv_download_input__Input-encwx-1 dDiqPH')[2]
$dlButton = document.getElementsByClassName('csv_download_input__Button-encwx-0 KLfyv')[2]
var now = new Date();
var daysOfYear = [];
for (var d = new Date(2016, 0, 1); d <= now; d.setDate(d.getDate() + 1)) {
daysOfYear.push(new Date(d).toISOString().substring(0,10));
}
(function theLoop (i) {
setTimeout(function () {
$dlButton.click()
$dateField.value = daysOfYear[i]
if (--i) { // If i > 0, keep going
theLoop(i); // Call the loop again, and pass it the current value of i
}
}, 3000);
})(daysOfYear.length-1);
How could I download all of the files automatically?
First off, JavaScript in the browser is probably neither the best language nor the best approach for this. It might work, but it helps to know the better options when choosing an approach to a problem. It will also save you from clicking ~800 times in the popup to accept each download.
You can get the files programmatically by learning what your browser does to fetch a file and then reproducing that in bulk.
After inspecting the network calls, you can see that the page calls an endpoint, and that endpoint returns a link to the file you can download.
That makes it easy: you just need a script, in any language, that retrieves those links and downloads the files.
I've chosen JavaScript, not client-side but Node.js, which means this has to run from your computer.
You could do the same with bash, Python, or any other language.
To run this, do the following:
1. Go to a new, empty directory.
2. Run npm install axios.
3. Create a file with the code below; let's call it crawler.js.
4. Run node crawler.js.
This has been tested using node v8.15.0
// NOTE: Require this to make a request and save the link as file 20190813:Alevale
const axios = require('axios');
const fs = require('fs');
let now = new Date();
let daysOfYear = [];
const baseUrl = 'https://a4dzytphl9.execute-api.ap-southeast-1.amazonaws.com/prod/eod/'
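// NOTE: Build the dates in UTC so toISOString() does not shift the day in timezones ahead of UTC
for (let d = new Date(Date.UTC(2016, 0, 1)); d <= now; d.setUTCDate(d.getUTCDate() + 1)) {
daysOfYear.push(d.toISOString().substring(0,10));
}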
const waitFor = (time) => {
return new Promise((resolve => setTimeout(resolve, time)))
}
const getUrls = async () => {
// NOTE: Declare day per iteration so the pending download below keeps its own date
for (const day of daysOfYear) {
console.log('getting day', baseUrl + day)
// NOTE: Throttle the calls to not overload the server 20190813:Alevale
await waitFor(4000)
await axios.get(baseUrl + day)
.then(response => {
console.log(response.data);
console.log(response);
if (response.data && response.data.download_url) {
return response.data.download_url
}
return Promise.reject('Could not retrieve response.data.download_url')
})
.then((url) =>{
// NOTE: Return the request so the loop waits for the download response before moving on
return axios({
method: 'get',
url,
responseType: 'stream'
})
.then(function (response) {
// NOTE: Save the file as 2019-08-13 20190813:Alevale
response.data.pipe(fs.createWriteStream(`${day}.csv`))
})
.catch(console.error)
})
.catch(error => {
console.log(error);
});
}
}
getUrls()
Instead of simulating the user, you can get the download link from:
https://a4dzytphl9.execute-api.ap-southeast-1.amazonaws.com/prod/eod/2019-08-07
Just change the date at the end to the date of the file you want to download, and use axios to GET this URL.
This will save you some time (in case you don't really need to simulate the user's click, etc.).
Then you will get a response like this:
{
"download_url":"https://d3u9ukmkxau9he.cloudfront.net/eod/2019-08-07.csv?Expires=1566226156&Signature=QRUk3tstuNX5KYVPKJSWrXsSXatkWS-eFBIGUufaTEMJ~rgpVi0iPCe1AXl5pbQVdBQxOctpixCbyNz6b9ycDgYNxEdZqPr2o2pDe8cRL655d3zXdICnEGt~dU6p35iMAJkMpPSH~jbewhRSCPUwWXQBfOiEzlHwxru9lPnDfsdSnk3iI3GyR8Oc0ZP50EdUMHF7MjWSBRbCIwnu6wW4Jh0bPmZkQDQ63ms5QxehsmtuGLOgcrC6Ky1OffVQj~ihhmBt4LGhZTajjK4WO18hCP3urKt03qpC4bOvYvJ3pxvRkae0PH1f-vbTWMDkaWHHVCrzqZhkAh3FlvMTWj8D4g__&Key-Pair-Id=APKAIAXOVAEOGN2AYWNQ"
}
and then you can use axios to GET this url and download your file.
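As a rough sketch of those two steps, the helpers below build the endpoint URL for a date and pull the signed link out of the parsed response. The names buildEodUrl and extractDownloadUrl are made up for illustration, and the response shape is assumed to match the example above.

```javascript
// Sketch only: buildEodUrl and extractDownloadUrl are hypothetical helper names.
const BASE_URL = 'https://a4dzytphl9.execute-api.ap-southeast-1.amazonaws.com/prod/eod/';

// Build the endpoint URL for one date given as 'YYYY-MM-DD'.
function buildEodUrl(day) {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(day)) {
    throw new Error('Expected a date like 2019-08-07, got: ' + day);
  }
  return BASE_URL + day;
}

// Pull the signed link out of the parsed JSON response, or throw if it is missing.
// Assumption: the response looks like { "download_url": "https://..." } as shown above.
function extractDownloadUrl(data) {
  if (!data || typeof data.download_url !== 'string') {
    throw new Error('Response has no download_url');
  }
  return data.download_url;
}
```

With axios this could be used as axios.get(buildEodUrl('2019-08-07')).then(res => extractDownloadUrl(res.data)), followed by a second GET on the returned link.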
Related
I'm pretty new to webdev and am trying to wrap my head around this. Presently I have a fairly simple express.js server running on a VPS that pulls and parses a JSON datafeed from a 3rd party source, then outputs that data in a readable way. That JSON feed refreshes every 15 seconds, but at the moment I'm doing all the work on receipt of a GET request, and then not checking again until a new request comes in.
My question then, I suppose, is can I pull and parse that datafeed, say, every 60 seconds, and then re-render the page if it's different? How would I go about doing that?
Here's the router.get function that's doing all the work at present:
router.get('/', async function(req, res) {
const json = await fetch("https://data.vatsim.net/v3/vatsim-data.json")
.then(jsonRes => jsonRes.json());
let isControlling = json.controllers.filter(function(controller) {
return controller.cid == cid &&
/(DEL|GND|TWR|APP|DEP|CTR|FSS)$/.test(controller.callsign) &&
controller.frequency != "199.998";
});
let atisArray = json.atis.filter(function(atis) {
return atis.cid == cid;
});
let atises = [];
for (var atis of atisArray) {
let atisAirport = atis.callsign.slice(0, 4);
if (!atises.includes(atisAirport))
{
atises.push(atisAirport);
}
}
if(isControlling.length > 0) {
res.render('controlling', {params:{callsign: isControlling[0].callsign, frequency: isControlling[0].frequency, atisList: atises}});
}
else {
res.render('not_controlling');
}
});
Thanks in advance
First of all, I would like to say that I'm a student learning programming for around a month, so expect to see many mistakes.
I'm working on a website where I use a chart from the ChartJs library. The data used for this chart is taken through requests to a server.
What I want to do is update the content of the studenGesamt variable every 20 seconds. My main idea was using a setInterval, but it didn't work. I am thinking that I could make a new request to the server every 20 seconds, but I am kind of lost on how to do that or if it is actually a good idea. If someone could help me I would really appreciate it!
let serverData;
let studenGesamt;
let date;
const http = new XMLHttpRequest();
const url = 'https://url.com/'; // I have hidden this URL as it is the actual server from my company
http.open("GET", url);
http.setRequestHeader('key', 'sample-key'); // I have hidden this key for the same reason as above
http.send();
const chart = document.getElementById("multie-pie-chart");
// Function that calculates the workdays passed up until today
const workdaysCount = () => [...new Array(new Date().getDate())]
.reduce((acc, _, monthDay) => {
const date = new Date()
date.setDate(1 + monthDay);
![0, 6].includes(date.getDay()) && acc++
return acc
}, 0)
http.onload = (e) => {
// Parsing the JSON file and storing it into a variable (Console.Log() to make sure it works)
serverData = JSON.parse(http.responseText);
console.log(serverData);
// Storing the value of total hours from the database in a variable
studenGesamt = serverData.abzurechnen.gesamt;
chartRender(); // Function that calls and renders the chart
setInterval(dataLoop, 5000); // 5 seconds to be able to test it, will be changed to 20 when finished
};
let dataLoop = () => {
studenGesamt = serverData.abzurechnen.gesamt;
console.log('test'); // Logging test to see if it is working
};
I am trying to set up a simple serverless function on Netlify just to test out usage of environment variables. I have defined the following two environment variables in Netlify for my site:
Variable Name       | Value
ALPHABET_SEPARATION | 2
CHARS_BETWEEN       | 3
I have also updated my functions directory as follows:
Functions directory: myfunctions
I am using continuous deployment from github. As I do not know the use of npm at present and finding it convenient to directly test the production deploy, I have defined a subdirectory called myfunctions inside my root directory and have placed my javascript file containing the "serverless" function inside it on my local machine. I have built in logic so that the "serverless" function gets called only when a "netlify" flag is set, otherwise, an alternate function gets executed client-side. Basically it works as follows:
const deploy = "netlify" //Possible valid values are "local" and "netlify"
async function postRandomString() {
const stringToUpdate = "THISISATESTSTRING"
var stringToPost = "DUMMYINITIALVALUE";
if (deploy === "local") {
stringToPost = updateString(stringToUpdate); //updateString is a function defined elsewhere and executes client-side;
}
else if (deploy === "netlify") {
const config = {
method: 'GET',
headers: {
'Accept': 'application/json',
}
};
const res = await fetch(`myfunctions/serverUpdateString?input=${stringToUpdate}`, config);
const data = await res.json();
stringToPost = data.retVal;
console.log(data.retVal);
}
else {
stringToPost = "##ERROR##";
}
postString(stringToPost); //postString is a function defined elsewhere and executes client-side;
}
The serverless function file serverUpdateString.js is coded as follows. It sets the character at a certain position in the string (determined by CHARS_BETWEEN) to the letter that is a certain number of places in the alphabet (determined by ALPHABET_SEPARATION) after the first character of the string. Don't ask why; the point is that it never even receives or handles the request:
exports.handler = async function (event) {
const { CHARS_BETWEEN, ALPHABET_SEPARATION } = process.env;
const charsBetween = CHARS_BETWEEN;
const alphabetSeparation = ALPHABET_SEPARATION;
const initString = event.queryStringParameters.input;
const rootUnicode = initString.charCodeAt(0);
const finalUnicode = "A".charCodeAt(0) + (rootUnicode - "A".charCodeAt(0) + alphabetSeparation) % 26;
const finalChar = String.fromCharCode(finalUnicode);
const stringArray = initString.split("");
stringArray[charsBetween + 1] = finalChar;
const stringToReturn = stringArray.join("");
const response = {
statusCode: 200,
retVal: stringToReturn,
}
return JSON.stringify(response);
}
When I run it, I get a 404 error for the GET request:
In the above image, script.js:43 is the line const res = await fetch(myfunctions/serverUpdateString?input=ATESTSTRIN, config); in the calling file, as shown in the first code block above.
What am I doing incorrectly? Surely Netlify should be able to pick up the serverless function file given that I have specified the folder alright and have placed it at the right place in the directory structure? I have given the whole code for completeness but the problem seems quite elementary. Look forward to your help, thanks.
I got assistance from Netlify forums. Basically the following changes needed to be made:
The fetch request -- line 43 in the calling code (script.js) -- needed to be changed to
const res = await fetch(`https://netlifytestserverless.netlify.app/.netlify/functions/serverUpdateString?input=${stringToUpdate}`, config);
The return statement in the lambda function needed to be changed to:
const response = {
statusCode: 200,
body: JSON.stringify(stringToReturn),
}
Other minor changes were also needed, such as using parseInt on the environment variables (they arrive as strings).
The code works now.
The idea is this: at 6:00 am Argentina time, I want an announcement (an image) to be displayed and stay visible for one hour, so that it is hidden again at 7:00 am. It should then stay hidden for 7 hours before the cycle repeats: it appears at 2:00 pm and hides at 3:00 pm; 7 hours pass; it reappears at 10:00 pm and hides at 11:00 pm; 7 hours pass and it appears again at 6:00 am.
I have written this code so that it accounts for time differences and runs at the same moment in all countries; that is, the ad comes out at 6:00 am in Argentina and is shown at that same moment in Los Angeles, even though it is 2:00 am there. But it's not working: it appears at 6:00 am local time in each country.
NOTE: there are two elements in the code; the second one is for another ad that appears at 0:00.
var offset = new Date().getTimezoneOffset() / 60;
var horarios1 = [6 + offset, 14 + offset, 22 + offset];
var elemento1 = document.getElementById("panel1");
var horarios2 = [0 + offset];
var elemento2 = document.getElementById("panel2");
setInterval(function() {
var hora = new Date().getHours();
if (horarios1.includes((hora + offset) % 24)) {
elemento1.style.display = 'block';
} else {
elemento1.style.display = 'none';
}
if (horarios2.includes((hora + offset) % 24)) {
elemento2.style.display = 'block';
} else {
elemento2.style.display = 'none';
}
}, 1000);
<div id="panel1" style="display: none;">PANEL 6, 14, 22</div>
<div id="panel2" style="display: none;">PANEL 0</div>
Thank you in advance.
Your code uses JavaScript time, and JavaScript takes the time from the user's machine. So when you visit your website, it uses your machine's time; when I visit, it uses mine. If you want one universal time for the whole world, i.e. show the ad at 06:00 Argentina time everywhere, then you can apply either of the following methods.
1. USE SERVER TIME
You need a bit of backend code here. Take the time from your server, and it is fixed for the whole world. The details depend on which backend technology (PHP/Java/Python) you are using.
2. USE A THIRD PARTY API
Use an API from another website, like worldtimeapi.org/. Make an ajax call and get the time of your desired location. You can use plain JavaScript or any ajax library to do that. Here I'm including two methods: 1) plain JavaScript and 2) axios (a popular ajax library).
Vanilla JS
function getTime(url) {
return new Promise((resolve, reject) => {
const req = new XMLHttpRequest();
req.open("GET", url);
req.onload = () =>
req.status === 200
? resolve(req.response)
: reject(Error(req.statusText));
req.onerror = (e) => reject(Error(`Network Error: ${e}`));
req.send();
});
}
Now use this function to make the ajax call:
let url = "http://worldtimeapi.org/api/timezone/America/Argentina/Buenos_Aires";
getTime(url)
.then((response) => { //the api will send this response which is a JSON
// you must parse the JSON to get an object using JSON.parse() method
let dateObj = JSON.parse(response);
let dateTime = dateObj.datetime;
console.log(dateObj);
console.log(dateTime);
})
.catch((err) => {
console.log(err);
});
AXIOS
Add axios library to your project.
axios({
url:"http://worldtimeapi.org/api/timezone/America/Argentina/Buenos_Aires",
method: "get",
})
// Here response is an object. The api will send you JSON, but axios automatically
// converts it to an object, so you don't need to parse it manually.
.then((response) => {
let dateObj = response.data;
let dateTime = dateObj.datetime;
console.log(dateObj);
console.log(dateTime);
})
.catch((err) => {
console.log(err);
});
(function () {
var url =
"http://worldtimeapi.org/api/timezone/America/Argentina/Buenos_Aires",
horarios1 = [6, 14, 22],
elemento1 = document.getElementById("panel1"),
horarios2 = [0],
elemento2 = document.getElementById("panel2");
function getTime(url) {
return new Promise((resolve, reject) => {
const req = new XMLHttpRequest();
req.open("GET", url);
req.onload = () =>
req.status === 200
? resolve(req.response)
: reject(Error(req.statusText));
req.onerror = (e) => reject(Error(`Network Error: ${e}`));
req.send();
});
}
setInterval(function () {
getTime(url)
.then((data) => {
var dateObj = JSON.parse(data);
var dateTime = dateObj.datetime;
var hora = Number(dateTime.slice(11, 13));
if (horarios1.includes(hora)) {
elemento1.style.display = "block";
} else {
elemento1.style.display = "none";
}
if (horarios2.includes(hora)) {
elemento2.style.display = "block";
} else {
elemento2.style.display = "none";
}
})
.catch((err) => {
console.log(err);
});
}, 1000);
})();
<div id="panel1" style="display: none;">PANEL 6, 14, 22</div>
<div id="panel2" style="display: none;">PANEL 0</div>
Hope that helps. A few things to remember, though:
1. worldtimeapi.org/ is a third party service. If they choose to terminate their service, your code will break. But if you use your server time, as long as your server is running, your code will run.
2. Because of the ajax call, this code will not work in stackoverflow. Copy and paste the code into your project to make it work.
3. If it still doesn't work, you are probably facing a CORS (cross-origin resource sharing) issue. Read this link and search the internet/SO; you will find your solution. Happy coding :)
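For completeness, here is a sketch of a different technique from the two above: asking the browser's built-in Intl.DateTimeFormat what hour it is in Argentina. It needs no network call, but it assumes the visitor's clock is roughly correct and that the browser knows IANA time zones. The names hourInTimeZone and isAdHour are made up for illustration.

```javascript
// Sketch only: hourInTimeZone and isAdHour are hypothetical helper names.
// Return the hour (0-23) that a clock in the given IANA time zone shows at `date`.
function hourInTimeZone(date, timeZone) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone: timeZone,
    hour: 'numeric',
    hourCycle: 'h23',
  }).formatToParts(date);
  const hour = parts.find((part) => part.type === 'hour').value;
  return Number(hour) % 24; // guard against engines that report midnight as 24
}

// True when the Argentina hour at `date` is one of the scheduled hours.
function isAdHour(date, hours) {
  return hours.includes(hourInTimeZone(date, 'America/Argentina/Buenos_Aires'));
}
```

Inside the setInterval callback you could then write elemento1.style.display = isAdHour(new Date(), horarios1) ? 'block' : 'none'; with horarios1 = [6, 14, 22] as in the snippet above.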
I'm trying to do what I 'thought' would be a simple task. I have an array of URLs that I'd like to loop through and download on to the client machine when the user clicks a button.
Right now I have a parent component that contains the button and an array of the urls (in the state) that I'd like to loop through and download. For some reason, the way I'm doing it now only downloads one of the files, not all of the contents of the array.
Any idea how to do this correctly within React?
handleDownload(event){
var downloadUrls = this.state.downloadUrls;
downloadUrls.forEach(function (value) {
console.log('yo '+value)
const response = {
file: value,
};
window.location.href = response.file;
})
}
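Assigning window.location.href several times in a row most likely leaves the browser acting on only one of the navigations, which would explain why only one file downloads. I would use setTimeout to wait a little bit between downloading each file.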
handleDownload(event){
var downloadUrls = this.state.downloadUrls.slice();
downloadUrls.forEach(function (value, idx) {
const response = {
file: value,
};
setTimeout(() => {
window.location.href = response.file;
}, idx * 100)
})
}
In Chrome, this will also trigger the permission prompt asking to allow multiple file downloads.
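The same staggering idea can be pulled out of the component into a small helper, sketched below. The names scheduleDownloads, trigger and schedule are made up for illustration; schedule defaults to setTimeout and exists only so the timing can be checked without a browser.

```javascript
// Sketch only: scheduleDownloads, trigger and schedule are hypothetical names.
// Calls trigger(url) for each URL, spacing the calls delayMs apart.
function scheduleDownloads(urls, delayMs, trigger, schedule) {
  const later = schedule || setTimeout; // allow a fake timer in tests
  urls.forEach(function (url, idx) {
    later(function () {
      trigger(url);
    }, idx * delayMs);
  });
}
```

In the component this could be called as scheduleDownloads(this.state.downloadUrls, 100, (url) => { window.location.href = url; }). The 100 ms gap comes from the answer above; a larger value may be needed if the browser still skips files.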