How do I make multiple fetch calls without getting a 429 error? - javascript

I came across a problem in a book which I can't seem to figure out. Unfortunately, I don't have a live link for it, so if anyone could help me with my approach to this theoretically, I'd really appreciate it.
The process:
I get from a fetch call an array of string codes (["abcde", "fghij", "klmno", "pqrst"]).
I want to make a call to a link with each string code.
example:
fetch('http://my-url/abcde').then(res => res.json()).then(res => res).catch(error => new Error(`Error: ${error}`)); // result: 12345
fetch('http://my-url/fghij').then(res => res.json()).then(res => res).catch(error => new Error(`Error: ${error}`)); // result: 67891
...etc
Each of the calls is going to give me a number code, as shown.
I need to find the highest of the returned numbers, get its corresponding string code, and make another call with that.
"abcde" => 1234
"fghij" => 5314
"klmno" => 3465
"pqrst" => 7234 <--- winner
fetch('http://my-url/pqrst').then(res => res.json()).then(res => res).catch(error => new Error(`Error: ${error}`));
What I tried:
let codesArr = []; // array of string codes
let promiseArr = []; // array of fetch using each string code in `codesArr`, meant to be used in Promise.all()
let codesObj = {}; // object mapping each string code to its corresponding number code, gotten from the Promise.all()
fetch('http://my-url/some-code')
.then(res => res.json())
.then(res => codesArr = res) // now `codesArr` is ["abcde", "fghij", "klmno", "pqrst"]
.catch(error => new Error(`Error: ${error}`));
for(let i = 0; i < codesArr.length; i++) {
promiseArr.push(
fetch(`http://my-url/${codesArr[i]}`)
.then(res => res.text())
.then(res => {
codesObj[codesArr[i]] = res;
// This is to get an object from which I can later get the highest number and its string code. Like this:
// codesObj = {
// "abcde": 12345,
// "fghij": 67891
// }
})
.catch(error => new Error(`Error: ${error}`));
// I am trying to make an array with fetch, so that I can use it later in Promise.all()
}
Promise.all(promiseArr) // I wanted this to go through all the fetches inside `promiseArr` and return all of the results at once.
.then(res => {
for(let i = 0; i < res.length; i++) {
console.log(res[i]);
// this should output the code number for each call (`12345`, `67891`...etc)
// this is where I get lost
}
})
One of the problems with my approach so far seems to be that it makes too many requests and I get a 429 error. I sometimes get the number codes all right, but not too often.

As you already found out, the 429 means that you sent too many requests:
429 Too Many Requests
The user has sent too many requests in a given amount of time ("rate
limiting").
The response representations SHOULD include details explaining the
condition, and MAY include a Retry-After header indicating how long to
wait before making a new request.
For example:
HTTP/1.1 429 Too Many Requests
Content-Type: text/html
Retry-After: 3600
<html>
<head>
<title>Too Many Requests</title>
</head>
<body>
<h1>Too Many Requests</h1>
<p>I only allow 50 requests per hour to this Web site per
logged in user. Try again soon.</p>
</body>
</html>
Note that this specification does not define how the origin server
identifies the user, nor how it counts requests. For example, an
origin server that is limiting request rates can do so based upon
counts of requests on a per-resource basis, across the entire server,
or even among a set of servers. Likewise, it might identify the user
by its authentication credentials, or a stateful cookie.
Responses with the 429 status code MUST NOT be stored by a cache.
To handle this issue you should reduce the number of requests made in a set amount of time: iterate your codes with a delay, spacing the requests out by a few seconds. If a delay isn't specified in the API documentation or the 429 response, you have to find one that works by trial and error. In the example below I've spaced them out by 2 seconds (2000 milliseconds).
This can be done by using setTimeout() to execute some piece of code later, combined with a Promise, to create a sleep function. When iterating over the initially returned array, await sleep(2000) to create a 2 second delay between each iteration.
An example could be:
const fetch = createFetchMock({
"/some-code": ["abcde", "fghij", "klmno", "pqrst"],
"/abcde": 12345,
"/fghij": 67891,
"/klmno": 23456,
"/pqrst": 78912,
});
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
(async function () {
try {
const url = "https://my-url/some-code";
console.log("fetching url", url);
const response = await fetch(url);
const codes = await response.json();
console.log("got", codes);
const codesObj = {};
for (const code of codes) {
await sleep(2000);
const url = `https://my-url/${code}`;
console.log("fetching url", url);
const response = await fetch(url);
const value = await response.json();
console.log("got", value);
codesObj[code] = value;
}
console.log("codesObj =", codesObj);
} catch (error) {
console.error(error);
}
})();
// fetch mocker factory
function createFetchMock(dataByPath = {}) {
const empty = new Blob([], {type: "text/plain"});
const status = {
ok: { status: 200, statusText: "OK" },
notFound: { status: 404, statusText: "Not Found" },
};
const blobByPath = Object.create(null);
for (const path in dataByPath) {
const json = JSON.stringify(dataByPath[path]);
blobByPath[path] = new Blob([json], { type: "application/json" });
}
return function (url) {
const path = new URL(url).pathname;
const response = (path in blobByPath)
? new Response(blobByPath[path], status.ok)
: new Response(empty, status.notFound);
return Promise.resolve(response);
};
}
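The example above stops once codesObj is filled. To finish the flow described in the question (pick the highest number and make one final call), a small sketch like this could follow, assuming the same hypothetical http://my-url endpoints:

```javascript
// Given the object built above, e.g. { abcde: 1234, ..., pqrst: 7234 },
// return the key whose value is highest.
function highestCode(codesObj) {
  return Object.entries(codesObj).reduce((best, entry) =>
    entry[1] > best[1] ? entry : best
  )[0];
}

const winner = highestCode({ abcde: 1234, fghij: 5314, klmno: 3465, pqrst: 7234 });
console.log(winner); // "pqrst"

// The final request from the question would then be:
// fetch(`http://my-url/${winner}`).then(res => res.json());
```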

In this case you should run each fetch sequentially, awaiting one before starting the next, using async/await:
const runFetch = async (codesArr) => {
const codesObj = {};
for (let i = 0; i < codesArr.length; i++) {
const rawResponse = await fetch(`http://my-url/${codesArr[i]}`);
const codeResponse = await rawResponse.json(); // json() returns a promise too
codesObj[codesArr[i]] = codeResponse;
}
return codesObj;
}
Hope that helps.
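If the service's 429 response includes a Retry-After header (as in the spec quoted above), another option is to wait exactly as long as the server asks before retrying, instead of guessing a delay. A minimal sketch, using the global Response (Node 18+/browsers) to mock one throttled attempt followed by a success:

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry a fetch, waiting as long as the 429 response's Retry-After
// header (in seconds) asks before trying again.
async function fetchWithRetry(url, fetchFn, maxRetries = 3) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const response = await fetchFn(url);
    if (response.status !== 429) return response;
    // Fall back to 1 second if the header is missing or unparsable.
    const retryAfter = Number(response.headers.get("Retry-After")) || 1;
    await sleep(retryAfter * 1000);
  }
  throw new Error(`Still rate limited after ${maxRetries} retries: ${url}`);
}

// Mock: the first call is throttled, the second one succeeds.
function makeMockFetch() {
  let calls = 0;
  return async () => {
    calls += 1;
    return calls === 1
      ? new Response("", { status: 429, headers: { "Retry-After": "0.1" } })
      : new Response(JSON.stringify(12345), { status: 200 });
  };
}

fetchWithRetry("http://my-url/abcde", makeMockFetch())
  .then((res) => res.json())
  .then((value) => console.log(value)); // 12345
```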

Related

How do I refactor this series of API calls to execute faster?

I have a series of API calls I need to make in order to render a grid of image tiles for selection by the user. Right now it takes 3-5 seconds for the page to load and I think it's because I've accidentally added some extra loops, but I'm struggling to discern where the wasted flops are. This is technically a question about NFT data, but the problem is algorithmic not crypto related.
The call sequence is:
Call "Wallet" API to get all assets associated with an address - API doc
On success, call "Asset Metadata" API to get further info about each asset API Doc
Loop step 2 until all assets have a metadata response
This is my code, which works (unless there are no assets associated with a wallet) but is just very slow. I'm sure there is a better way to handle this, but I'm struggling to see how. Thanks for your time!
// API Request
var myHeaders = new Headers();
myHeaders.append("X-API-Key", CENTER_API_KEY); //API Key in constants file
var requestOptions = {
method: 'GET',
headers: myHeaders,
redirect: 'follow'
};
const [nftData, updatenftData] = useState();
const [apiState, updateapiState] = useState("init");
const [renderNFT, updaterenderNFT] = useState([]);
useEffect(() => {
const getData = async () => {
let resp = await fetch(walletAPICall, requestOptions);
let json = await resp.json()
updatenftData(json.items);
updateapiState("walletSuccess");
}
const getRender = async () => {
let nftTemp = [];
for (let i=0;i<nftData.length;i++) {
let tempAddress = nftData[i].address;
let tempTokenId = nftData[i].tokenId;
let resp = await fetch(`https://api.center.dev/v1/ethereum-mainnet/${tempAddress}/${tempTokenId}`, requestOptions)
let json = await resp.json()
// console.log(json);
nftTemp.push(json);
}
updaterenderNFT(nftTemp);
updateapiState("NftDataSuccess");
}
if (apiState=="init") {
getData();
}
else if (apiState=="walletSuccess") {
getRender();
}
}, [requestOptions]);
getRender fetches data items sequentially.
You should do it in parallel using Promise.all or Promise.allSettled
Something like this...
async function fetchItem(item) {
const res = await fetch(item.url);
return res.json();
}
const results = await Promise.all(data.map(fetchItem));
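Note that with Promise.all a single failed metadata request rejects the whole batch. Promise.allSettled reports every outcome, so the grid can still render the tiles that succeeded. A sketch with a hypothetical fetchItem that sometimes fails:

```javascript
// Hypothetical per-item fetcher; any call may reject.
async function fetchItem(item) {
  if (item.bad) throw new Error(`failed: ${item.id}`);
  return { id: item.id, metadata: "..." };
}

async function fetchAllItems(items) {
  const results = await Promise.allSettled(items.map(fetchItem));
  // Surface the failures, keep the successful responses.
  results
    .filter((r) => r.status === "rejected")
    .forEach((r) => console.warn(r.reason.message));
  return results
    .filter((r) => r.status === "fulfilled")
    .map((r) => r.value);
}

fetchAllItems([{ id: 1 }, { id: 2, bad: true }, { id: 3 }])
  .then((ok) => console.log(ok.map((x) => x.id))); // [ 1, 3 ]
```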

how to detect file size / type while mid-download using axios or other requestor?

I have a scraper that looks for text on sites from a google search. However, occasionally the URLs for search are LARGE files without extension names (i.e. https://myfile.com/myfile/).
I do have a timeout mechanism in place, but by the time it times out, the file has already overloaded the memory. Is there any way to detect a file size or file type while it's being downloaded?
Here is my request function:
const getHtml = async (url, { timeout = 10000, ...opts } = {}) => {
const CancelToken = axios.CancelToken
const source = CancelToken.source()
try {
const timeoutId = setTimeout(() => source.cancel('Request cancelled due to timeout'), timeout)
let site = await axios.get(url, {
headers: {
'user-agent': userAgent().toString(),
connection: 'keep-alive', // self note: Isn't this prohibited on http/2?
},
cancelToken: source.token,
...opts,
})
clearTimeout(timeoutId)
return site.data
} catch (err) {
throw err
}
}
PS: I've seen similar questions, but none had an answer that would apply.
OK, so this isn't as easy to solve as one might expect. Ideally, the HTTP headers Content-Length and Content-Type would tell you what to expect up front, but they aren't required headers, and in practice they are often missing or inaccurate.
The solution I've found for this problem, which looks to be very reliable, involves two things:
Making the request as a stream
Reading the file signature that the first bytes of many file formats carry; these are commonly known as magic numbers/bytes
A great way to use these two things is to stream the response and read the first bytes to check for the file signature. Once you know whether the file is a format you support/want, you can either process it as you normally would or cancel the request before reading the next chunk of the stream. This prevents overloading your system, and the running byte count also lets you measure the file size more accurately, which I show in the following snippet.
Here's how I implemented the solution mentioned above:
const getHtml = async (url, { timeout = 10000, ...opts } = {}) => {
const CancelToken = axios.CancelToken
const source = CancelToken.source()
try {
const timeoutId = setTimeout(() => source.cancel('Request cancelled due to timeout'), timeout)
const res = await axios.get(url, {
headers: {
connection: 'keep-alive',
},
cancelToken: source.token,
// Use stream mode so we can read the first chunk before getting the rest(1.6kB/chunk(highWatermark))
responseType: 'stream',
...opts,
})
const stream = res.data;
let firstChunk = true
let size = 0
const sizeLimit = 5 * 1024 * 1024 // cap on total bytes read; pick a limit that suits your use case
// Not to be confused with arrayBuffer(the object) ;)
const bufferArray = []
// Async iterator syntax for consuming the stream. Iterating over a stream will consume it fully, but returning or breaking the loop in any way will destroy it
for await (const chunk of stream) {
if (firstChunk) {
firstChunk = false
// Only check the first 100(relevant, spaces excl.) chars of the chunk for html. This would possibly only fail in a raw text file which contains the word html at the very top(very unlikely and even then, wouldn't break anything)
const stringChunk = String(chunk).replace(/\s+/g, '').slice(0, 100).toLowerCase()
if (!stringChunk.includes('html')) return { error: `Requested URL is detected as a file. URL: ${url}\nChunk's magic 100: ${stringChunk}` };
}
size += Buffer.byteLength(chunk);
if (size > sizeLimit) return { error: `Requested URL is too large.\nURL: ${url}\nSize: ${size}` };
const buff = Buffer.from(chunk) // Buffer.from is a static method, not a constructor
bufferArray.push(buff)
}
// After the stream is fully consumed, we clear the timeout and create one big buffer to convert to str and return that
clearTimeout(timeoutId)
return { html: Buffer.concat(bufferArray).toString() }
} catch (err) {
throw err
}
}
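Since Content-Length and Content-Type are unreliable but sometimes present, it can still be worth checking them before reading any of the body, with the magic-byte check above as the fallback. A small helper as a sketch (the 5 MB default limit is an assumption, pick your own):

```javascript
// Decide from response headers alone whether to abort before streaming.
// Returns a reason string, or null to proceed to the magic-byte check.
function preflightReject(headers, { sizeLimit = 5 * 1024 * 1024 } = {}) {
  const length = Number(headers["content-length"]);
  if (Number.isFinite(length) && length > sizeLimit) {
    return `Declared size ${length} exceeds limit of ${sizeLimit} bytes`;
  }
  const type = headers["content-type"] || "";
  if (type && !/html|text/.test(type)) {
    return `Unwanted content type: ${type}`;
  }
  return null; // headers absent or plausible; still verify the first chunk
}

console.log(preflightReject({ "content-type": "application/pdf" }));
// "Unwanted content type: application/pdf"
console.log(preflightReject({ "content-type": "text/html", "content-length": "1024" }));
// null
```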

What is going wrong with my express call? I need an array of ID's but its returning an empty array

I'm guessing this problem is because I don't know how to use async/await effectively. I still don't get it, and I've been trying to understand for ages. Sigh.
Anyway, heres my function:
app.post("/declineTrades", async (request, response) => {
//---------------------------------------------
const batch = db.batch();
const listingID = request.body.listingID;
const tradeOfferQuery = db
//---------------------------------------------
//Get trade offers that contain the item that just sold
//(therefore it cannot be traded anymore, I need to cancel all existing trade offers that contain the item because this item isn't available anymore)
//---------------------------------------------
.collection("tradeOffers")
.where("status", "==", "pending")
.where("itemIds", "array-contains", listingID);
//---------------------------------------------
//Function that gets all trade offers that contain the ID of the item.
async function getIdsToDecline() {
let tempArray = [];
tradeOfferQuery.get().then((querySnapshot) => {
querySnapshot.forEach((doc) => {
//For each trade offer found
let offerRef = db.collection("tradeOffers").doc(doc.id);
//Change the status to declined
batch.update(offerRef, { status: "declined" });
//Get the data from the trade offer because I want to send an email
//to the who just got their trade offer declined.
const offerGet = offerRef.get().then((offer) => {
const offerData = offer.data();
//Check the items that the receiving person had in this trade offer
const receiverItemIds = Array.from(
offerData.receiversItems
.reduce((set, { itemID }) => set.add(itemID), new Set())
.values()
);
//if the receiver item id's array includes this item that just sold, I know that
//I can get the sender ID (users can be sender or receiver, so i need to check which person is which)
if (receiverItemIds.includes(listingID)) {
tempArray.push(offerData.senderID);
}
});
});
});
//With the ID's now pushed, return the tempArray
return tempArray;
}
//---------------------------------------------
//Call the above function to get the ID's of people that got declined
//due to the item no longer being available
const peopleToDeclineArray = await getIdsToDecline();
//Update the trade offer objects to declined
const result = await batch.commit();
//END
response.status(201).send({
success: true,
result: result,
idArray: peopleToDeclineArray,
});
});
I'm guessing that my return tempArray is in the wrong place? But I have tried putting it in other places and it still returns an empty array. Is my logic correct here? I need to run the forEach loop and add to the array before the batch.commit happens and before the response is sent.
TIA Guys!
As #jabaa pointed out in their comment, there are problems with an incorrectly chained Promise in your getIdsToDecline function.
Currently the function initializes an array called tempArray, starts executing the trade offer query and then returns the array (which is currently still empty) because the query hasn't finished yet.
While you could throw in await before tradeOfferQuery.get(), this won't solve your problem as it will only wait for the tradeOfferQuery to execute and the batch to be filled with entries, while still not waiting for any of the offerRef.get() calls to be completed to fill the tempArray.
To fix this, we need to make sure that all of the offerRef.get() calls finish first. To get all of these documents, you would use the following code to fetch each document, wait for all of them to complete and then pull out the snapshots:
const itemsToFetch = [ /* ... */ ];
const getAllItemsPromise = Promise.all(
itemsToFetch.map(item => item.get())
);
const fetchedItemSnapshots = await getAllItemsPromise;
For documents based on a query, you'd tweak this to be:
const querySnapshot = /* ... */;
const getSenderDocPromises = [];
querySnapshot.forEach((doc) => {
const senderID = doc.get("senderID");
const senderRef = db.collection("users").doc(senderID);
getSenderDocPromises.push(senderRef.get());
});
const getAllSenderDocPromise = Promise.all(getSenderDocPromises);
const fetchedSenderDataSnapshots = await getAllSenderDocPromise;
However neither of these approaches are necessary, as the document you are requesting using these offerRef.get() calls are already returned in your query so we don't even need to use get() here!
(doc) => {
let offerRef = db.collection("tradeOffers").doc(doc.id);
//Change the status to declined
batch.update(offerRef, { status: "declined" });
//Get the data from the trade offer because I want to send an email
//to the who just got their trade offer declined.
const offerGet = offerRef.get().then((offer) => {
const offerData = offer.data();
//Check the items that the receiving person had in this trade offer
const receiverItemIds = Array.from(
offerData.receiversItems
.reduce((set, { itemID }) => set.add(itemID), new Set())
.values()
);
//if the receiver item id's array includes this item that just sold, I know that
//I can get the sender ID (users can be sender or receiver, so i need to check which person is which)
if (receiverItemIds.includes(listingID)) {
tempArray.push(offerData.senderID);
}
});
}
could be replaced with just
(doc) => {
// Change the status to declined
batch.update(doc.ref, { status: "declined" });
// Fetch the IDs of items that the receiving person had in this trade offer
const receiverItemIds = Array.from(
doc.get("receiversItems") // <-- this is the efficient form of doc.data().receiversItems
.reduce((set, { itemID }) => set.add(itemID), new Set())
.values()
);
// If the received item IDs includes the listed item, add the
// sender's ID to the array
if (receiverItemIds.includes(listingID)) {
tempArray.push(doc.get("senderID"));
}
}
which could be simplified to just
(doc) => {
//Change the status to declined
batch.update(doc.ref, { status: "declined" });
// Check if any items that the receiving person had in this trade offer
// include the listing ID.
const receiversItemsHasListingID = doc.get("receiversItems")
.some(item => item.itemID === listingID);
// If the listing ID was found, add the sender's ID to the array
if (receiversItemsHasListingID) {
tempArray.push(doc.get("senderID"));
}
}
Based on this, getIdsToDecline actually queues declining the invalid trades and returns the IDs of those senders affected. Instead of using the batch and tradeOfferQuery objects that are outside of the function that make this even more unclear, you should roll them into the function and pull it out of the express handler. I'll also rename it to declineInvalidTradesAndReturnAffectedSenders.
async function declineInvalidTradesAndReturnAffectedSenders(listingID) {
const tradeOfferQuery = db
.collection("tradeOffers")
.where("status", "==", "pending")
.where("itemIds", "array-contains", listingID);
const batch = db.batch();
const affectedSenderIDs = [];
const querySnapshot = await tradeOfferQuery.get();
querySnapshot.forEach((offerDoc) => {
batch.update(offerDoc.ref, { status: "declined" });
const receiversItemsHasListingID = offerDoc.get("receiversItems")
.some(item => item.itemID === listingID);
if (receiversItemsHasListingID) {
affectedSenderIDs.push(offerDoc.get("senderID"));
}
});
await batch.commit(); // generally, the return value of this isn't useful
return affectedSenderIDs;
}
This then would change your route handler to:
app.post("/declineTrades", async (request, response) => {
const listingID = request.body.listingID;
const peopleToDeclineArray = await declineInvalidTradesAndReturnAffectedSenders(listingID);
response.status(201).send({
success: true,
idArray: peopleToDeclineArray,
});
});
Then adding the appropriate error handling, swapping out the incorrect use of HTTP 201 Created for HTTP 200 OK, and using json() instead of send(); you now get:
app.post("/declineTrades", async (request, response) => {
const listingID = request.body.listingID;
try {
const affectedSenderIDs = await declineInvalidTradesAndReturnAffectedSenders(listingID);
response.status(200).json({
success: true,
idArray: affectedSenderIDs, // consider renaming to affectedSenderIDs
});
} catch (error) {
console.error(`Failed to decline invalid trades for listing ${listingID}`, error);
if (!response.headersSent) {
response.status(500).json({
success: false,
errorCode: error.code || "unknown"
});
} else {
response.end(); // forcefully end corrupt response
}
}
});
Note: Even after all these changes, you are still missing any form of authentication. Consider swapping the HTTPS Event Function out for a Callable Function where this is handled for you but requires using a Firebase Client SDK.

Async sometimes returning undefined

When calling the following method:
getLyrics: async function(song) {
const body = await this.getSongBody(song);
const lyrics = await cheerio.text(body('.lyrics'));
return lyrics;
}
as such:
genius.getLyrics('What a wonderful')
.then((res) => console.log(res))
.catch((err) => console.log(err.message));
Everything works fine and the lyrics of "What a Wonderful World" by Louis Armstrong pop up in the console.
However, when I run the same code without "await" in front of "cheerio.text...", sometimes the lyrics are produced and other times "undefined" shows up in the console. What has been making me scratch my head for a while is that "cheerio.text..." does not return a promise (although "getSongBody" does), so to my understanding there should be no need to "wait" for it to finish.
I'm clearly missing something about async/await but have no idea what. Any help would be greatly appreciated!
Thanks
EDIT: Added a reproducible example as requested below:
const fetch = require('node-fetch');
const cheerio = require('cheerio');
// API
function geniusApi(token) {
this._token = token;
this._auth = {'Authorization': 'Bearer ' + this._token};
};
geniusApi.prototype = {
getSongURL : async function(search_keyword){
const res = await fetch('https://api.genius.com/search?q=' +
search_keyword,{headers: this._auth});
const body = await res.text();
const body_parsed = JSON.parse(body);
if (body_parsed.response.hits.length == 0){
console.log('No such song found');
throw Error('No such song found');
}
const url = body_parsed.response.hits[0].result.url;
return url;
},
getSongBody: async function (song){
const url = await this.getSongURL(song);
const response = await fetch(url);
const body = await response.text();
const body_parsed = cheerio.load(body);
return body_parsed;
},
getLyrics: async function(song) {
const body = await this.getSongBody(song);
const lyrics = cheerio.text(body('.lyrics'));
return lyrics;
}
}
// TEST EXAMPLE
const token =
'OTh1EYlsNdO1kELVwcevqLPtsgq3FrxfShIXg_w0EaEd8CHZrJWbWvN8Be773Cyr';
const genius = new geniusApi(token);
genius.getLyrics('What a wonderful')
.then((res) => console.log(res))
.catch((err) => console.log(err.message));
For anyone who ever stumbles upon the same issue, the problem in this case had nothing to do with async, promise or any other JS feature. It was merely a coincidence that the code had functioned correctly while using async, it later turned out that it didn't always work with async either.
The reason was simply that the Genius API that I was using to fetch the data, would return different source codes for identical API queries.
Two different source codes were returned, one contained a div called "lyrics" while the other did not. Therefore, sometimes the lyrics were found using cheerio, other times, they were not.

Nodejs: How to handle http request throttle error using async/await?

I'm trying to build a search API which takes a location as the argument and returns the latitude and longitude of that location.
I'm using http://geocode.xyz to get the details of the location.
For example :
https://geocode.xyz/?locate=Warsaw,Poland&json=1
This will return the required latitude and longitude of that location.
This is the code snippet I have:
const url = 'https://geocode.xyz/?locate=Warsaw,Poland&json=1';
const getData = async url => {
try {
const response = await fetch(url);
const json = await response.json();
console.log(json);
} catch (error) {
console.log(error);
}
};
getData(url);
With this, this is the error I am seeing:
node geocode.js
{success: false, error: { code: '006', message: 'Request Throttled.'}}
I guess the error is because I need to throttle the number of requests hitting the API https://geocode.xyz. I'm not sure how to use a rate limiter with async/await. Any help would be appreciated.
EDIT:
Based on the answers below, I still see 'Request Throttled'
var request = require('request');
const fetch = require("node-fetch");
const requester = {
lastRequest: new Date(),
makeRequest: async function(url) {
// first check when last request was made
var timeSinceLast = (new Date()).getTime() - this.lastRequest.getTime();
if (timeSinceLast < 2000) {
this.lastRequest = new Date(this.lastRequest.getTime() + (2000 - timeSinceLast));
await new Promise(resolve => setTimeout(resolve, 2000-timeSinceLast));
// await new Promise(resolve => setTimeout(resolve, timeSinceLast));
}
const response = await fetch(url);
const json = await response.json();
return json;
}
};
requester.makeRequest('https://geocode.xyz/?locate=Warsaw,Poland&json=1')
.then(console.log)
I'm still getting the same error:
{ success: false, error: { code: '006', message: 'Request Throttled.' } }
Is this because of geocode pricing? Is there any way to limit the rate of requests from the user side?
You could make an object that is used for sending all requests out, and that object can keep track of when the last time a request was made. For example
const requester = {
lastRequest: new Date(2000,0,1),
makeRequest: async function (url) {
// first check when last request was made
var timeSinceLast = (new Date()).getTime() - this.lastRequest.getTime();
this.lastRequest = new Date();
if (timeSinceLast < 1000) {
this.lastRequest = new Date(this.lastRequest.getTime() + (1000 - timeSinceLast));
await new Promise(resolve => setTimeout(resolve, 1000-timeSinceLast));
}
// make request here and return result
const response = await fetch(url);
const json = await response.json();
return json;
}
};
https://jsfiddle.net/3k7b0grd/
If you use requester.makeRequest, it only lets you make a request once a second, and waits long enough (even for successive requests) to obey that rate limit.
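One caveat with the timestamp arithmetic above: if two makeRequest calls start at nearly the same moment, both can read the same lastRequest before either updates it. Chaining every task through a single promise queue serializes them without that race. A sketch (the geocode usage at the end is illustrative, not executed here):

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs tasks one at a time with a minimum gap between them.
function createLimiter(gapMs) {
  let queue = Promise.resolve();
  return function schedule(task) {
    const result = queue.then(task);
    // The next task waits for this one plus the gap, even if it failed.
    queue = result.catch(() => {}).then(() => sleep(gapMs));
    return result;
  };
}

// Illustrative usage with the geocode endpoint from the question:
const limit = createLimiter(1000);
const geocode = (place) =>
  limit(async () => {
    const res = await fetch(`https://geocode.xyz/?locate=${place}&json=1`);
    return res.json();
  });
// geocode("Warsaw,Poland") and geocode("Berlin,Germany") would now run
// at least one second apart, however the calls interleave.
```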
