Is it possible (without an application layer cache of requests) to prevent sending an HTTP request for the same resource multiple times when it's cachable? And if yes, how?
E.g. instead of
at time 0: GET /data (request#1)
at time 1: GET /data (request#2)
at time 2: received response#1 for request#1 // headers indicate that the response can be cached
at time 3: received response#2 for request#2 // headers indicate that the response can be cached
the following would happen:
at time 0: GET /data (request#1)
at time 1: GET /data (will wait for the response of request#1)
at time 2: received response#1 for request#1 // headers indicate that the response can be cached
at time 3: returns response#1 for request#2
This would require that it's possible to indicate to the browser that the response will be cachable before the response headers are read. I am asking whether such a mechanism exists, e.g. with a preceding OPTIONS or HEAD request of some kind.
My question is whether there is a mechanism to signal to the browser that the response for a URI will be cachable
Yes, this is what the Cache-Control headers do.
and any subsequent requests for that URI can return the response of any in-flight request....Ideally this would be part of the HTTP spec
No, HTTP does not do this for you; you need to implement the caching yourself. This is what browsers do.
I did want to check if there is already something ready out-of-the-box
JavaScript libraries don't typically honour caching, as an AJAX request is usually for data, and any caching of data usually happens on the server. I don't know of any such library, and of course asking for JS libraries is out of scope on SO.
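As a rough illustration of "implement the caching yourself", the sketch below shares one pending promise per URL so that concurrent callers wait for the same in-flight request. The names createDeduper and fetcher are hypothetical and not part of any library; this deduplicates only in-flight requests and does not look at Cache-Control headers at all.

```javascript
// Hypothetical helper: share one in-flight promise per URL.
// `fetcher` is any function that returns a promise for a URL. It should
// resolve to already-parsed data (e.g. url => fetch(url).then(r => r.json())),
// because a fetch Response body can only be read once.
function createDeduper(fetcher) {
  const inFlight = new Map(); // url -> pending promise

  return function get(url) {
    if (inFlight.has(url)) {
      return inFlight.get(url); // reuse the request that is already on the wire
    }
    // Start the request and forget it once it settles, so that later
    // calls trigger a fresh request (or hit the browser's HTTP cache).
    const promise = fetcher(url).finally(() => inFlight.delete(url));
    inFlight.set(url, promise);
    return promise;
  };
}
```

With such a helper, two calls to get('/data') made before the first response arrives would result in a single call to fetcher; whether that is desirable depends on the endpoint actually being cachable.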
Depending on the browser the second request could be stalled and served if cachable, e.g. in Chromium for non range requests:
The cache implements a single writer - multiple reader lock so that only one network request for the same resource is in flight at any given time.
https://www.chromium.org/developers/design-documents/network-stack/http-cache
Here is an example where three concurrent requests result in only a single server call:
fetch('/data.json').then(async r => console.log(await r.json()));
fetch('/data.json').then(async r => console.log(await r.json()));
setTimeout(() => fetch('/data.json').then(async r => console.log(await r.json())), 2000);
The subsequent requests have 0B transferred and return the same random number, showing that only a single server call was made.
This behavior is not the same in, e.g., Firefox.
An interesting question that comes to mind is what would happen when a request for a resource is made while a H2 push for that resource was initiated before but not yet finished.
Here is the test code for reproducing this:
https://gist.github.com/nickrussler/cd74ac1c07884938b205556030414d34
Related
I've configured my server responses to include Cache-Control: max-age=<some number> on several endpoints. I'm using Axios on the front end to make AJAX requests on these endpoints. When I refresh the page a few requests are properly pulled from the browser cache but two of them always go to the server again.
It's always the same two requests which refuse to pull from the cache.
I checked the browser cache and the responses are indeed cached.
Axios adds max-age=0 in the headers of the two problematic requests but not the other three and if I add a custom header to the Axios request:
let payload = {params: {cik: 999}, headers: {'Cache-Control': 'max-age=9999'}};
axios.get('/api/13f-holdings/filer/historical', payload).then((resp) => {
// handle response
});
The request goes through with the following Cache-Control headers:
Cache-Control: max-age=9999, max-age=0
and it ignores the cached data again.
Given that the responses in question are in fact being cached by the browser it seems that the problem lies in the Axios request. But the requests hitting the cache look exactly the same as the requests missing the cache. Let me know if I can provide any additional information to help diagnose this.
Edit: I'm using VueJS. I noticed that the two requests that never hit the browser cache are fired after the Vue component has mounted. Is this significant? Does Vue not have access to the browser cache immediately following component mounting?
This behaviour has to do with the way browser developers choose to load data when a page is refreshed and is, to a great extent, out of the hands of the website developer.
If one is concerned that the request is not being cached according to the server's Cache-Control response headers one may paste the request URI in the address bar of a new tab and verify that the page is loaded from the browser cache.
See this question for a detailed explanation:
Why do AJAX GET requests sent from the mounted hook in Vue.js always ignore the browser cache?
I'm sending two requests:
let gameId = 'myGameId';
let chatRoomId = 'myChatRoomId';
requester.sendRequest(
'POST', '/createGame', {gameId: gameId});
requester.sendRequest(
'POST', '/createChatRoom', {gameId: gameId, chatRoomId: chatRoomId});
And as far as I know, those should get to the server in the same order I just sent them, because that's a guarantee of TCP.
However, when I send these out, they sometimes reach the server out of order. When I inspect Chrome's console, I see that a preflight OPTIONS request is sent out for each (always in the correct order), but then the server might respond to them out of order (it's a Google App Engine server), or they might arrive out of order.
Chrome waits for the OPTIONS response to come back before it sends out the real POST request, which means my POST requests are out of order.
The only solution I can see is to put .then()s on the requests, which is unacceptable because waiting for every dependent request would slow down my application too much.
Any ideas for how I can get around this frustrating preflight request, or order my requests somehow?
I have a site that sends XMLHTTPRequests to a php file that handles the HTTP POST Request and returns data in JSON format. The urls for the post_requests files are public information (since a user can just view the JS code for a page and find the URLs I'm sending HTTP requests to)
I mainly handle HTTP Post Requests in PHP by doing this:
//First verify XMLHTTPRequest, then get the post data
if (isset($_SERVER['HTTP_X_REQUESTED_WITH']) && strtolower($_SERVER['HTTP_X_REQUESTED_WITH']) === 'xmlhttprequest')
{
$request = file_get_contents('php://input');
$data = json_decode($request);
//Do stuff with the data
}
Unfortunately, I'm fairly sure that the headers can be spoofed and some devious user or click bot can just spam my post requests, repeatedly querying my database until either my site goes down or they go down fighting.
I'm not sure if their requests will play a HUGE role in freezing the server (as 20 requests per second isn't that much). Should I be doing something about this (especially in the case of a DDoS attack)? I've heard of rate-limiting, where you record an entry every time some IP requests data and then check whether the requests look spammy:
INSERT INTO logs (ip_address, page, date) values ('$ip', '$page', NOW())
//And then every time someone loads the php post request, check to see if they loaded the same one in the past second or 10 seconds
But that means every time there's a request by a normal user, I have to expend resources to log them. Is there a standard or better "practice" (maybe some server configuration?) for preventing or dealing with this concern?
Edit: Just for clarification. I'm referring to some person writing software (with a cookie, or while logged in) that just sends millions of requests per second to all the PHP POST request files on my site.
The solution for this is to rate-limit requests, usually per client IP.
Most web servers have modules which can do this, so use one of them. That way your application only receives requests it's supposed to handle.
nginx: ngx_http_limit_req
Apache: mod_evasive
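To make the idea of per-IP rate limiting concrete, here is a minimal in-memory sketch of a fixed-window counter. It is not how the nginx or Apache modules are implemented; createRateLimiter and its parameters are hypothetical names, and a real deployment would also need to evict old entries and would more likely rely on the server modules listed above.

```javascript
// Hypothetical fixed-window limiter: at most `limit` requests per
// `windowMs` milliseconds for each key (for example the client IP).
// Assumes limit >= 1. `now` is injectable so the logic can be tested.
function createRateLimiter(limit, windowMs, now = Date.now) {
  const windows = new Map(); // key -> { start, count }; never evicted in this sketch

  return function allow(key) {
    const t = now();
    const w = windows.get(key);
    if (!w || t - w.start >= windowMs) {
      windows.set(key, { start: t, count: 1 }); // first request of a new window
      return true;
    }
    w.count += 1;
    return w.count <= limit; // over the limit: the caller should reject the request
  };
}
```

A handler could call allow(clientIp) first and answer with status 429 (Too Many Requests) when it returns false, so the database is never touched for rejected requests.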
There are many things you can do:
Use tokens to authenticate requests. Save the token in the session and allow only a certain number of requests per token (e.g. 20). Also make tokens expire after some amount of time (e.g. 5 min). The exact values depend on your site's usage patterns. This of course will not stop an attacker, as he may refresh the site and grab a new token, but it is a small and almost costless obstacle.
Once you have tokens, require captcha after several token refresh requests. Also adjust it to your usage patterns to avoid displaying captcha to regular users.
Adjust your server's firewall rules. Use iptables connlimit and recent modules (see http://ipset.netfilter.org/iptables-extensions.man.html). This will reduce request ratio handled by your http server, so it will be harder to exhaust resources.
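The token idea from the first item can be sketched as follows. This is only an illustration under assumptions: the in-memory Map stands in for the session storage mentioned above, createTokenStore is a hypothetical name, and the limits (such as 20 uses and 5 minutes) are passed in by the caller.

```javascript
const crypto = require('node:crypto');

// Hypothetical token store: each token may be used `maxUses` times
// and expires `ttlMs` milliseconds after it was issued.
function createTokenStore(maxUses, ttlMs, now = Date.now) {
  const tokens = new Map(); // token -> { expires, uses }

  return {
    // Hand out a fresh random token (e.g. when the page is rendered).
    issue() {
      const token = crypto.randomBytes(16).toString('hex');
      tokens.set(token, { expires: now() + ttlMs, uses: 0 });
      return token;
    },
    // Returns true if the request carrying `token` may proceed.
    consume(token) {
      const entry = tokens.get(token);
      if (!entry) return false; // unknown or already discarded token
      if (now() >= entry.expires || entry.uses >= maxUses) {
        tokens.delete(token); // expired or used up: force a refresh
        return false;
      }
      entry.uses += 1;
      return true;
    },
  };
}
```

For the values suggested above, the store would be created as createTokenStore(20, 5 * 60 * 1000); counting how often one client calls issue() is where the captcha step from the second item could hook in.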
On my site I have an auto-suggest text input that suggests results as the user types. The results are provided by AJAX calls to an API on a different domain. This means I have to use CORS to allow the requests.
It is all working quite well, but every time the user types a new character, the browser sends a new OPTIONS request to ensure it is authorized.
Is there a way around all these repeated options requests?
My php script receiving the requests has
header("Access-Control-Allow-Origin: http://consent.example.com");
and the requests are all originating from consent.example.com. To be clear, the authorization works just fine, and the request completes successfully, but I don't know why it needs to keep making options calls. It would make sense to me that the browser would cache this.
According to RFC 2616 ("Hypertext Transfer Protocol -- HTTP/1.1"), section 9.2:
9.2 OPTIONS
...
Responses to this method are not cacheable.
The HTTP spec explicitly disallows caching OPTIONS responses.
It is worth noting that the GET responses do not employ caching either (I see that customers?search=alex is 200 each time). This is simply because the server chooses not to send 304 responses for that request, or your browser doesn't let the server know it has a cached copy, by an If-Modified-Since or If-None-Match request header.
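The RFC text quoted above concerns the general HTTP cache. CORS preflight results are kept in a separate, browser-internal preflight cache, and the server can ask the browser to reuse a preflight result for a while by sending an Access-Control-Max-Age response header (browsers cap the value they accept). The sketch below builds such headers; preflightHeaders is a hypothetical helper, and the allowed methods and headers are assumptions that would have to match the real API.

```javascript
// Hypothetical helper: response headers for a CORS preflight (OPTIONS) request.
// Returns null when the requesting origin is not the allowed one.
function preflightHeaders(requestOrigin, allowedOrigin, maxAgeSeconds) {
  if (requestOrigin !== allowedOrigin) {
    return null; // unknown origin: send no CORS headers at all
  }
  return {
    'Access-Control-Allow-Origin': allowedOrigin,
    'Access-Control-Allow-Methods': 'GET, POST, OPTIONS', // assumed method list
    'Access-Control-Allow-Headers': 'Content-Type',       // assumed header list
    // Lets the browser reuse this preflight result for up to maxAgeSeconds.
    'Access-Control-Max-Age': String(maxAgeSeconds),
    'Vary': 'Origin',
  };
}
```

In the PHP script from the question, the same header name could be sent with another header() call next to the existing Access-Control-Allow-Origin line.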
I'm implementing my own http module.
As I'm reading the official node.js http module api, I couldn't understand a few things:
If the user calls the response.writeHead(statusCode, [reasonPhrase], [headers]) function, should the headers be written immediately to the socket, or are they first supposed to be saved as a member of the object and then written only after the .end() function is called?
What is the meaning of the implicit headers that should be used whenever the user didn't call writeHead()? Are they supposed to be set ahead of time? And if the user didn't set them, what should the behavior be? Thanks.
Answers:
Anything that you write into the response, either headers with writeHead or body with write, is buffered and then sent. Socket buffers are used, and they can only hold a fixed amount of data before it is sent. The important fact to remember is that you can only set headers before you start writing the body. If you don't set them, some headers will be set for you by the HTTP server itself.
Implicit headers are ones which you don't write explicitly but which are still sent. Set up a simple HTTP server that responds to a request without setting any headers. Then view the response headers by opening the site in a browser. There will be headers such as Date, Connection, and Transfer-Encoding, which are added to every response automatically without the user's involvement.
I found an answer to the first question, but still don't understand the second one.
The first time response.write() is called, it will send the buffered header information and the first body to the client. The second time response.write() is called, Node assumes you're going to be streaming data, and sends that separately. That is, the response is buffered up to the first chunk of body.
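For someone writing their own module, the behavior described above could be modeled roughly like this. It is a simplified sketch, not Node's actual implementation: SketchResponse is a hypothetical class, the socket only needs a write(string) method, and body framing (Content-Length or chunked encoding) is left out by assuming the connection is closed after the response.

```javascript
const REASONS = { 200: 'OK', 404: 'Not Found' }; // tiny illustrative subset

class SketchResponse {
  constructor(socket) {
    this.socket = socket;
    this.statusCode = 200;    // implicit status when writeHead() is never called
    this.headers = {};        // filled by setHeader() and writeHead()
    this.head = null;         // serialized head, built once, not yet on the socket
    this.headFlushed = false;
  }

  setHeader(name, value) {
    if (this.head !== null) throw new Error('Head already built');
    this.headers[name.toLowerCase()] = String(value);
  }

  // Explicit head: fix status and headers now, but keep the bytes buffered.
  writeHead(statusCode, headers = {}) {
    for (const [name, value] of Object.entries(headers)) this.setHeader(name, value);
    this.statusCode = statusCode;
    this._buildHead();
  }

  // Defaults are listed first, so user-supplied headers override them.
  _buildHead() {
    const all = { date: new Date().toUTCString(), connection: 'close', ...this.headers };
    const lines = Object.entries(all).map(([n, v]) => `${n}: ${v}`);
    const reason = REASONS[this.statusCode] || '';
    this.head = `HTTP/1.1 ${this.statusCode} ${reason}\r\n${lines.join('\r\n')}\r\n\r\n`;
  }

  write(chunk) {
    if (this.head === null) this._buildHead(); // no writeHead(): use implicit headers
    if (!this.headFlushed) {
      this.socket.write(this.head + chunk);    // head travels with the first body chunk
      this.headFlushed = true;
    } else {
      this.socket.write(chunk);
    }
  }

  end(chunk = '') {
    if (chunk !== '' || !this.headFlushed) this.write(chunk);
  }
}
```

In this model, "implicit headers" are simply whatever was collected through statusCode and setHeader() plus the defaults, serialized at the moment the first body chunk is written.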