Trying to understand Google Cloud Platform functions - javascript

I want to create a REST API for a mobile application.
I wanted to try GCP Functions to see if it could fit my needs.
Now I have run into some problems and I think I have misunderstood something.
When I test a function locally with firebase-tools, the server is recreated every time a request reaches the function instance. I thought the instance would keep my server alive for some time.
I know that each instance can only process one request at a time, but I am worried that the time lost recreating the server on every request adds up.
I am sure there is something wrong in my understanding.
I just want to know how it works to make the best of it.
Thank you :)
Here is my main function with a NestJS server:
import { NestFactory } from '@nestjs/core'
import { AppModule } from './module'
import { loggerMiddleware } from './middlewares/logger.middleware'
import { ExpressAdapter } from '@nestjs/platform-express'
import * as functions from 'firebase-functions'
import * as express from 'express'

const server: express.Application = express()

export async function createNestServer(server) {
  const app = await NestFactory.create(AppModule, new ExpressAdapter(server))
  app.use(loggerMiddleware)
  return app.init()
}

createNestServer(server)
  .then(() => console.log('Nest ok..'))
  .catch(error => console.log('Nest broken', error))

export const api = functions.https.onRequest(server)
Edit:
Screenshot of the initialization logs when a request comes in the GCP function instance
As you can see, the function finishes before the NestJS server initialization completes, and this initialization is repeated each time I make a request to the URL. Even once the NestJS server is up, its state is not kept for the next call.

If your function takes a long time to execute, then Cloud Functions will spin up a new instance whenever there is no existing instance idle and ready to accept the connection. Eventually, when your function finishes, the instance that handled it will go idle.
Code that runs in the global scope of your module is executed as part of the first incoming request to a new instance. It does not come "for free" in any way.
For HTTP type connections, the concurrent access limit is based on the bandwidth required by your function, not its compute time. You could have hundreds of instances concurrently operating on longer-running functions, as long as you're willing to pay the cost of each one.
It would be good to understand the documented limits and quotas of Cloud Functions.
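A minimal sketch of how this is often handled (assuming the firebase-functions v1 HTTPS API used in the question; this is not the only approach): keep the bootstrap promise in global scope and await it inside the handler, so a cold start finishes the NestJS initialization before the first response is sent, and warm instances simply reuse it.

import { NestFactory } from '@nestjs/core'
import { ExpressAdapter } from '@nestjs/platform-express'
import * as functions from 'firebase-functions'
import * as express from 'express'
import { AppModule } from './module'

const server: express.Application = express()

// Started once per instance (on cold start); reused on warm invocations.
const ready = NestFactory.create(AppModule, new ExpressAdapter(server))
  .then(app => app.init())

export const api = functions.https.onRequest(async (req, res) => {
  await ready        // make sure the bootstrap has finished before routing
  server(req, res)   // hand the request to the Express/Nest application
})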

Related

Cloud Functions - How to instantiate global functions/variables only once?

I have a Firebase application that uses Cloud Functions to talk to a Google Cloud SQL instance. These Cloud Functions are used to perform CRUD actions. I would like to ensure that the database reflects the CRUD operations, so I run migration code every time I push new function code to keep the database up to date.
I do this in a global function
const functions = require('firebase-functions')
const pg = require('pg')

// Create the database if it does not exist.
// Note the leading semicolon: without it, the IIFE would be parsed as a call
// on the previous line's require('pg') result.
;(function () {
  console.log('create db...')
})()

exports.helloWorld = functions.https.onRequest((request, response) => {
  console.log('Hello from Firebase function log!')
  response.send('Hello from Firebase!')
})

exports.helloWorld2 = functions.https.onRequest((request, response) => {
  console.log('Hello from Firebase function log 2!')
  response.send('Hello from Firebase 2!')
})
This console log then runs twice when I deploy.
Now I understand that there is no way of knowing how many instances Cloud Functions will spin up for the functions, as stated in their docs:
The global scope in the function file, which is expected to contain the function definition, is executed on every cold start, but not if the instance has already been initialized.
If I add a third function, this console log is now shown 3 times in the logs instead of 2, one for each function. Would it be correct to say that there's a new instance for every single function uploaded? I am trying to understand what happens under the hood when I upload a set of cloud functions.
If so, is there no reliable way to run migration code inside a global function in Cloud Functions?
What you're doing isn't a supported use case for Cloud Functions. Cloud Functions code runs in response to events that occur in your project. There are no "one time" function invocations that happen on deployment. If you need to run code a single time, just run it from your desktop or some other server you control.
You should also strive to minimize the amount of work that happens in the global scope of your functions. Globals will be instantiated and run once for each allocated server instance running a function in your app, as each function runs in full isolation of each other, and each has its own copy of everything. Watch my video about function scaling and isolation to better understand this behavior.
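As a hedged illustration of keeping global work small (a minimal sketch; the pg connection pool and its lazy getter are assumptions for illustration, not part of the question's code): heavy setup can be created on first use, so it only runs on instances that actually serve a function needing it, rather than once per deployed function on every cold start.

const functions = require('firebase-functions')

let pool = null  // hypothetical pg connection pool, created on first use

function getPool () {
  if (!pool) {
    const { Pool } = require('pg')
    pool = new Pool()  // connection settings omitted; read from the environment
  }
  return pool
}

exports.helloWorld = functions.https.onRequest(async (request, response) => {
  const result = await getPool().query('SELECT 1')
  response.send('Hello from Firebase! rows: ' + result.rowCount)
})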

Does Cloud Spanner not manage sessions properly?

I have looked this issue up but could not find sufficient information about it.
The Google Cloud Spanner client libraries handle sessions automatically, and the limit is 10,000 sessions per node; no problem so far.
I have a microserviced application which also uses Google Cloud Functions. I am doing some specific database jobs in Cloud Functions and I'm calling those functions continuously. After a little while, Cloud Spanner starts to throw an error:
Too many active sessions in database, limit is 10000. Increase the node count to allow more sessions.
I know about the limits, but there is no operation that should cause my app to exceed them.
After noticing this, I have two questions for which I could not find any answer.
1- Does Cloud Functions create a new session for every call? (I am using an HTTP trigger)
Here is what I have done so far:
1- Here is an example Cloud Functions declaration of mine:
exports.myFunction = function myFunction(req, res) {}
I was declaring my database instance outside this scope before I realized this issue:
const db = Spanner({projectId: '[my-project]'}).instance('[my-cs-instance]').database('[my-database]');
exports.myFunction = function myFunction(req, res) {}
After this issue, I put it inside the scope like this, and closed the database session after I'm done:
exports.myFunction = function myFunction(req, res) {
  const db = Spanner({projectId: '[my-project]'}).instance('[my-cs-instance]').database('[my-database]');
  // codes
  db.close();
}
That didn't change anything; it still exceeds the session limit after a while.
Do you have any experience with what causes this? Is it related to Cloud Functions or to Cloud Spanner itself?
2- If every transaction object uses one connection at a time, what happens in this scenario?
I have a REST endpoint other than these Cloud Functions. It creates a database instance when it starts listening on its HTTP endpoints, and I do not create any other instance during its lifecycle. At that endpoint I perform CRUD operations and use transactions, and they all use the same instance created at the start of the process. My experience is:
Sometimes transactions or other CRUD operations run with a slight delay, which does not happen all the time.
My question is:
Is that because, when a transaction starts, it locks the connection and all other operations have to wait until it ends? If so, should I create independent database instances for transactions on that endpoint?
Thanks in advance
This has now been fixed per the issue opened at #89 and the fix at #91, and logged as #71987137 in the Google Issue Tracker.
If any issue persists, please report it at the Google issue tracker and they will re-open it to examine.

Node.js API to spawn off a call to another API

I created a Node.js API.
When this API gets called, I return to the caller fairly quickly, which is good.
But now I also want the API to call or launch a different API or function or something that will go off and run on its own, kind of like calling a child process with child.unref(). In fact, I would use child.spawn(), but I don't see how to have spawn() call another API. Maybe that alone would be my answer?
I don't care if this other process crashes or finishes without error.
So it doesn't need to be attached to anything, but if it does remain attached to the Node.js console, that's icing on the cake.
I'm still thinking about how to identify and what to do if the spawn somehow gets caught up running for a really long time, but I'm not ready to cross that bridge yet.
Your thoughts on what I might be able to do?
I guess I could child.spawn('node', [somescript])
What do you think?
I would have to explore if my cloud host will permit this too.
You need to specify exactly what the other spawned thing is supposed to do. If it is calling an HTTP API, you should not launch a new process in Node.js to do that; Node is built to run HTTP requests asynchronously.
The normal pattern, if you really need some work to happen in a different process, is to use something like a message queue, the cluster module, or some other messaging/queue mechanism between processes that a worker monitors; the worker is usually set up to handle some particular task or set of tasks this way. It is pretty unusual to spawn another process after receiving an HTTP request, since launching new processes is heavy-weight and can use up all of your server resources if you aren't careful, and thanks to Node's async capabilities it usually isn't necessary, especially for things mainly involving I/O.
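To illustrate the asynchronous approach (a minimal sketch, assuming an Express router like the one in the answer below; the target URL is a placeholder): the secondary HTTP call can simply be started and left to complete on its own while the response goes back to the caller immediately.

const https = require('https')

router.put('/test', function (req, res) {
  const u = req.body.u

  // Kick off the secondary HTTP call without waiting for it; errors are only
  // logged, the original caller never blocks on this request.
  https.get('https://example.com/other-api?u=' + encodeURIComponent(u))
    .on('error', err => console.error('secondary call failed', err))

  res.sendStatus(200)
})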
This is from a test API I built some time ago. Note I'm even passing a value into the script as a parameter.
router.put('/test', function (req, res, next) {
  var u = req.body.u;
  var cp = require('child_process');
  // Note: the spawn option is "detached" (not "detach").
  var c = cp.spawn('node', ['yourtest.js', '"' + u + '"'], { detached: true });
  c.unref();
  res.sendStatus(200);
});
The yourtest.js script can be just about anything you want it to be, but I found it helps to first treat the script as a Node.js console app. FIRST get your yourtest.js script to run without error by running/testing it manually from your console's command line (node yourtest.js yourparametervalue), THEN integrate it into the child.spawn() call.
var u = process.argv[2];
console.log('f2u', u);

function f1() {
  console.log('f1-hello');
}

function f2() {
  console.log('f2-hello');
}

// Wait 3 seconds before executing f2(). I do this just for troubleshooting:
// you can watch node.exe open and then close in Task Manager if it runs long enough.
setTimeout(f2, 3000);
f1();

nodejs with polling mechanism on aws crashes every x hours without a clear cause

The post might seem a bit lengthy, but please bear with me.
For a couple of days now I have been trying to figure out some very mysterious behavior of my Express/Node app.
My stack is:
nodejs/express (with setInterval polling an SNMP endpoint)
AWS (medium instance with 8GB EBS)
amazon-linux
https server running on port 3000 (the whole app runs on it)
pm2 (as a node process manager - tried foreverjs too with the same results)
The server looks like this:
let debug = require('debug')('server'),
    app = require('../app');

app.set('port', process.env.PORT || 3000);

process.on('uncaughtException', (exception) => {
  debug(`UncaughtException: ${exception}`);
});

process.on('unhandledRejection', (reason) => {
  debug(`UnhandledPromiseRejection: ${reason}`);
});

let server = app.listen(app.get('port'), function () {
  debug('Express server listening on port ' + server.address().port);
});
The app itself contains two parts: HTTP routes handling API calls, and a so-called poller, which is a class that looks like this:
class SnmpPoller {
  constructor () {
    this.snmpAdapter = null;
    this.site = config.get('system.site');
  }

  startPolling () {
    debug('Snmp poller started');
    timers.setInterval(
      this.poll(),
      config.get('poll.interval')
    );
  }

  poll () {
    return () => {
      if (dbConnection.get()) {
        debug('Polling data');
        this.doMagic(this.site);
      }
    };
  }

  // other super useful methods
}
The poller runs a function every poll.interval seconds.
The doMagic method calls a very complicated mechanism that polls data from different endpoints with a lot of promises and callbacks. It saves data to at least 4 different MongoDB collections, parsing and calculating different values along the way.
All is good here. The poller is working fine; all the promises are handled, all the errors are handled.
I added logs to each and every callback and promise.
Now, the situation is as follows:
When I leave the app running for several hours, it becomes unresponsive. When I try to reach it using Postman, I get "didn't send any data. ERR_EMPTY_RESPONSE". It is definitely not a 404 error; the request knows there is something there but cannot access it.
Also, pm2 is not restarting the app and there is nothing in the log files, so it seems that it's not caused by the app itself.
I was suspecting memory leaks and unhandled promises, but I checked and all is fine; the garbage collector is behaving properly, keeping the app memory at around 40-50 MB. I also got rid of all unhandled promise rejections in the process.
I also ruled out DB connection issues. I double-checked whether it happens when the app loses the connection with the DB; that's not the problem.
The QUESTION:
Why is this happening? I have not been able to find the cause for a couple of days now. I have exactly the same setup running in production and it's not "crashing" there (production is not an AWS server).
Might it be something specific to AWS or Amazon Linux?
Any help greatly appreciated.
Thanks!

Live updating Node.js server

I want to design a live updating Node.js Express server, perhaps having a particular route, say /update, which loads a new configuration file. The only concern I have with this right now is that the server could potentially be in any state when the update happens. If I load a new configuration file while a JS message is being processed for a user request, there could be one configuration at the beginning of the request and a second configuration before the request completes, once the new config file is loaded.
The only way I can think of to prevent this is to take down the server for at least one minute (keep the server live, but prevent any incoming requests altogether), then update it and put it back online, but that's not really the best form of hot reloading or live updating, is it?
How can I somehow trick the JS event loop so that the config file only gets loaded once all requests have completed and delay any new requests until after the config is loaded?
One algorithm would be (a sketch follows below the list):
1. set a flag "starting re-configuration"
2. the above flag prevents any new requests from being processed (using Express middleware)
3. check that all current requests are completed (can't think of anything better than a polling loop here)
4. once the above check is done, load the new configuration
5. once the configuration is loaded, switch the flag from step 1 back off
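A minimal sketch of that algorithm, assuming an Express app; the loadNewConfiguration helper and the 100 ms polling interval are placeholders:

let reconfiguring = false
let activeRequests = 0

// Steps 1-2: refuse new requests while a re-configuration is in progress,
// and count the requests currently in flight.
app.use((req, res, next) => {
  if (reconfiguring) return res.status(503).send('Updating, please retry shortly')
  activeRequests++
  res.on('finish', () => { activeRequests-- })
  next()
})

app.get('/update', async (req, res) => {
  reconfiguring = true
  // Step 3: wait until in-flight requests have drained (simple polling loop);
  // compare against 1 because this /update request is itself being counted.
  while (activeRequests > 1) {
    await new Promise(resolve => setTimeout(resolve, 100))
  }
  loadNewConfiguration()   // step 4 (hypothetical helper)
  reconfiguring = false    // step 5
  res.send('Configuration reloaded')
})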
Disclaimer: I have not tried this in production. In fact, I have not tried this at all. While I believe the idea is sane, there may be hidden pitfalls along the road which are not currently known to me.
There is one thing that many Node.js developers tend to forget/not fully realise:
There can always be only one JavaScript statement executed at a time.
It does not matter that you can do async I/O or execute a function later in time. No matter how hard you try, all the JS code that you write is executed in a single thread, no parallelism. Only the underlying implementation (which is completely out of our control) can do things in parallel.
This helps us, because as long as our update procedure is synchronous, no other JS code (i.e. client response) can be executed.
Configuration live-patching
The solution to prevent configuration change mid-request is rather simple:
Each request gets its own copy of the application's configuration.
If your application's configuration lives in a JavaScript object, you simply clone that object for each new request. This means that even if you suddenly change the configuration, it will only be applied to new incoming requests.
There are tons of modules for cloning (even deep cloning) objects, but since I believe mine is best I will use this opportunity for some small self-promotion - semantic-merge.
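A minimal sketch of the per-request snapshot, assuming the configuration is a plain JavaScript object; the ./config.json path is a placeholder, and structuredClone (Node 17+) stands in for whatever deep-clone module you prefer:

let currentConfig = require('./config.json')   // hypothetical config source

// Give every incoming request its own snapshot of the configuration, so a
// live config swap only affects requests that start after the swap.
app.use((req, res, next) => {
  req.config = structuredClone(currentConfig)
  next()
})

app.get('/update', (req, res) => {
  // Drop the cached module so require() re-reads the file, then swap the
  // shared reference; requests already in flight keep their own clones.
  delete require.cache[require.resolve('./config.json')]
  currentConfig = require('./config.json')
  res.send('Configuration reloaded')
})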
Code live-patching
This is a bit trickier, but should be generally possible with enough effort.
The trick here is to first remove/unregister current Express routes, clear Node's require cache, require the updated files again and re-register route handlers. Now Express will finish all pending requests using the old code (this is because Node cannot remove these old functions from memory as long as the code contains active object references - which it does - req and res) and use the newly required modules/routes for new incoming requests. Old code should get released from memory as soon as there are no more requests that started with the old code.
You must not use require anywhere during request processing, otherwise you risk the same problem as with changing configuration mid-request. You can of course use require in a module-level scope because that will be executed when the module itself is required, thus being synchronous.
Example:
// app/routes/users.js (or something)

// This is okay, because it is executed only once - when users.js
// itself is required
var path = require('path')

// This function is something you would put into app.use()
module.exports = function usersRoute (req, res, next) {
  // Do not use require() here! It will be executed per-request!
}
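And a minimal sketch of the code-swapping part, assuming the route module lives at ./app/routes/users.js (a placeholder path); instead of unregistering routes directly, a thin wrapper stays mounted and delegates to whichever router was built last, which is a slightly different but simpler take on the same idea:

var express = require('express')
var app = express()

var currentRouter = buildRouter()

// The wrapper stays mounted forever; swapping currentRouter never requires
// touching Express's internal route registry.
app.use('/users', function (req, res, next) {
  currentRouter(req, res, next)
})

function buildRouter () {
  var router = express.Router()
  router.use(require('./app/routes/users'))
  return router
}

// Call this from an /update route (or a signal handler) to hot-swap the code.
function reloadRoutes () {
  // Drop the cached module so require() reads the updated file from disk.
  delete require.cache[require.resolve('./app/routes/users')]
  currentRouter = buildRouter()
}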
I think that instead of repeatedly sending requests to the server, you can use a WebSocket.
That way, when there's a change in the config file that you mentioned, the server can 'emit' a message to the users, so they refresh their data.
If you are using nodeJS and Express, this will help you:
Socket.io with NodeJS
The server will wait for a signal from a user (or anything else) and emit it to all the users, so they get the new data.
Node.js:
var express = require('express');
var app = express();
var server = require('http').createServer(app);
// note: the module name is 'socket.io'
var io = require('socket.io')(server);
var port = process.env.PORT || 3000;

server.listen(port, function () {
  console.log('Server listening at port %d', port);
});

app.use(express.static("/PATH_TO_PROJECT"));

io.on('connection', function (socket) {
  socket.on('someone update data', function (data) {
    socket.to(socket.room).broadcast.emit('data updated', {params: "x"});
  });
});
Meanwhile, the client will be listening for any change:
View.js:
var socket = io();
// Listen for the event name the server emits above
socket.on('data updated', function (data) {
  liveUpdate(data);
});
I hope I understood correctly what you asked
This is a good problem to solve.
A possible solution could be:
Derive the controllers on each path from a parent controller. The parent controller can set a property ON (a flag / file) when a request arrives and turn it OFF when the response is sent back.
Now subclass this parent controller for each Express endpoint facing the front end. If you then make a request to '/update', the update controller would know through the FLAG whether the server is busy or not, and send back a reply indicating whether the update was successful.
For update failures, the front end could retry posting to the '/update' endpoint with some back-off scheme.
This scheme might work for you ...
