I have a question regarding SQL connection pools. My team is using the knex.js library in one of our Node applications to make database queries.
The application needs to switch databases from time to time, so my team created an initialization function that returns a Knex object configured for the correct database. That object is then used to run the query. To me this seems redundant and bad for performance, because we initialize a Knex object every time we need to run a query instead of reusing a single Knex object. I could ignore that if Knex already handles it when you switch databases (and if anyone could shed light on this question as well, that would be FANTASTIC!). Moreover, and this leads me to my question in the title, the connection pool properties are redefined each time. Does that mean we are creating new pools every time, or does the SQL server (SQL Server, in this case) reuse the connection pool you already defined? The question might not be Knex-specific: if I used a library like Knex for C# and called it in a similar way, would SQL Server know not to make more connection pools?
Example code:
/** db.js
 * @param {any} database
 * @returns db: Knex
 */
module.exports = (database) => {
    var knex = require('knex')({
        client: 'mssql',
        connection: {
            database: database,
            server: '127.0.0.1',
            user: 'your_database_user',
            password: 'your_database_password'
        },
        pool: {
            min: 0,
            max: 10,
            idleTimeoutMillis: 5000,
            softIdleTimeoutMillis: 2000,
            evictionRunIntervalMillis: 500
        }
    });
    return knex;
};
Index.js
var db = require('./db.js');
/**
 * @returns users: Array
 */
const getUsers = async () => {
    const users = await db('master')
        .select()
        .from('users_table')
        .orderBy('user_id');
    return users;
};
Short answer: the 'singleton' nature of the Node require() statement prevents reinitialization of multiple occurrences of Knex, so the initially created pool continues to be used for the duration of your process rather than being recreated, as long as you don't discard the db variable reference.
More discussion...
... my team created an initialization function that returns a Knex
object configured for the correct database. That object is then used
to run the query. To me this seems redundant and bad for performance,
because we initialize a Knex object every time we need to run a query
instead of reusing a single Knex object. I could ignore that if Knex
already handles it when you switch databases...
var db = require('./db.js');
The Node.js require statement creates a singleton object. (You probably already know this.) The first time your program requires the module, the module and its data are initialized, but successive identical require calls just reuse the same module reference and do not reinitialize the module.
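You can demonstrate this with a tiny experiment (counter.js is just an illustrative module name):
// counter.js -- this top-level code runs only on the first require
console.log('initializing counter module');
var count = 0;
module.exports = {
    increment: function () { return ++count; }
};

// main.js
var a = require('./counter');  // logs 'initializing counter module'
var b = require('./counter');  // no log -- the cached module is reused
a.increment();                 // 1
b.increment();                 // 2: a and b share the same state
console.log(a === b);          // true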
... the connection pool properties are redefined each time. Does that
mean we are creating new pools every time, or does the SQL server
(SQL Server, in this case) reuse the connection pool you already defined?
So, since the require()-ed module is not reinitialized, the originally created pool will not be re-created, unless you discard the db variable reference (discussed more below).
The question might not be Knex-specific: if I used a library like
Knex for C# and called it in a similar way, would SQL Server know
not to make more connection pools?
Generally speaking, you need to build or acquire some code to properly manage a pool of connections throughout the life of your process. Knex and most other database wrappers do this for us. (Under the covers, Knex delegates to a third-party pooling library; it switched libraries at v0.18.3.)
Properly initializing the pooling code once and then using that single instance throughout the life of your application process accomplishes this. Discarding the pool and recreating it within your process defeats the purpose of pooling. Often pooling is set up as part of process initialization.
Also, this was probably just a misstatement within your question, but your Node.js module is making the connection pools, not SQL Server.
... The application needs to switch databases from time to time. My
team created an initialization function that returns a Knex object
configured for the correct database.
From that statement, I would expect to see code like the following:
var db = require('./db.js');
var dbOther = require('./dbOther.js');
... each of which establishes a different database connection. If you are instead using:
var db = require('./db.js');
// ... do other stuff here in the same module ...
var db = require('./dbOther.js');
... then you are likely throwing away the original reference to your first database, and in that case, YES, you are discarding your DB connection and connection pool as you switch connections.
Or, you could do something like the following:
// initialize the 2 connection pools
const dbFirst = require('./db.js');
const dbOther = require('./dbOther.js');
// set the active connection
var db = dbFirst;
// change the active connection
db = dbOther;
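Or, if the set of databases isn't known up front, you can get the same effect by caching one Knex instance per database name, so each pool is created only once. This is a minimal sketch reusing the configuration from your db.js (the instances cache and the getDb name are illustrative, not part of Knex's API):
// dbCache.js -- one Knex instance (and therefore one pool) per database
var knex = require('knex');
var instances = {};  // cache keyed by database name

module.exports = function getDb(database) {
    if (!instances[database]) {
        instances[database] = knex({
            client: 'mssql',
            connection: {
                database: database,
                server: '127.0.0.1',
                user: 'your_database_user',
                password: 'your_database_password'
            },
            pool: { min: 0, max: 10 }
        });
    }
    return instances[database];  // later calls reuse the cached instance
};
With this, getDb('master') called from anywhere in the process always returns the same instance, so the pool survives across queries instead of being rebuilt.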
Related
I'm working on an application that uses Firebase Functions as an API interface between my web application and Google Cloud SQL (MySQL 5.7).
I have a process for importing records from the client app: basically, the client app reads a CSV file, then executes a function for every row in the file. The function executes three or four queries while processing each record (checking whether the main record exists, creating it and/or other needed records, and updating a stats record for this process).
The function is called sequentially for each row, so there's never more than one request (row) being processed at a time, executing 3 or 4 queries before returning data to the client app, which then processes the next row (async/await).
The process works great for CSV files with 1 to 100 rows. As soon as it goes above about 900 rows, Firebase Functions starts reporting ERROR Error: ER_CON_COUNT_ERROR: Too many connections.
My code, shown below, originally had a connection limit of 10; I bumped it up to 100 connections, but it still fails.
Here’s my code that executes the SQL queries:
import * as functions from "firebase-functions";
import * as mysql from 'mysql';

export async function executeQuery(cmd: string) {
    const mySQLConfig = {
        host: functions.config().sql.prodhost,
        user: functions.config().sql.produser,
        password: functions.config().sql.prodpswd,
        database: functions.config().sql.proddatabase,
        connectionLimit: 100,
    }
    var pool: any;
    if (!pool) {
        pool = mysql.createPool(mySQLConfig);
    }
    return new Promise(function (resolve, reject) {
        // @ts-ignore
        pool.query(cmd, function (error, results) {
            if (error) {
                return reject(error);
            }
            resolve(results);
        });
    });
}
As I understand it, with a pool like the one I think I've implemented above, each request will get a connection, up to the maximum number of connections. Each connection automatically returns to the pool once it's done processing the request. So even if it takes a while to release a connection, with the connection limit at 100 I should be able to process quite a few rows (20 or so at least) before there's contention for connections, at which point requests will queue up and wait for free connections before continuing. If that's right, what's happening here?
I found an article here: https://cloud.google.com/sql/docs/mysql/manage-connections that describes some additional settings I can use to tweak connection management:
// 'connectTimeout' is the maximum number of milliseconds before a timeout
// occurs during the initial connection to the database.
connectTimeout: 10000,
// 'acquireTimeout' is the maximum number of milliseconds to wait when
// checking out a connection from the pool before a timeout error occurs.
acquireTimeout: 10000,
// 'waitForConnections' determines the pool's action when no connections are
// free. If true, the request will be queued and a connection will be presented
// when ready. If false, the pool will call back with an error.
waitForConnections: true, // Default: true
// 'queueLimit' is the maximum number of requests for connections the pool
// will queue at once before returning an error. If 0, there is no limit.
queueLimit: 0, // Default: 0
I'm tempted to try bumping up the timeouts, but I'm not sure whether they're actually what's affecting me here.
Since I'm running this in Firebase Functions (Google Cloud Functions under the covers), do these settings even apply? Isn't my function's VM reset after every execution, or at least my function terminated after every execution? Does the pool even exist in this context? If not, how do I do this type of processing in Functions?
One option, of course, is to push all of my processing to the function: just send up a JSON object for the whole row array and let the function process the rows all at once. That, I think, should make proper use of pools, but I'm worried I'll bump up against the execution limits in Functions (5 minutes), which is why I built it the way I did.
Stupid developer trick: I was paying such close attention to my pool code that I missed that I was declaring the pool variable in the wrong place. Moving the pool declaration outside of the function fixed my problem. With the code the way it was, I was creating a new pool with every SQL query, which quickly used up all of my connections.
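For reference, the corrected shape looks roughly like this; the only change is hoisting the pool to module scope so it is created once per function instance rather than once per query:
import * as functions from "firebase-functions";
import * as mysql from 'mysql';

// Module scope: created once when the instance loads, reused by
// every invocation served by that instance.
const pool = mysql.createPool({
    host: functions.config().sql.prodhost,
    user: functions.config().sql.produser,
    password: functions.config().sql.prodpswd,
    database: functions.config().sql.proddatabase,
    connectionLimit: 10,
});

export function executeQuery(cmd: string) {
    return new Promise(function (resolve, reject) {
        pool.query(cmd, function (error, results) {
            if (error) {
                return reject(error);
            }
            resolve(results);
        });
    });
}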
I'm not that experienced with Node.js, but I'm developing something similar to how Uber displays their cars in real time on a map.
So I have a SQL database with a ton of cars and their GPS locations. The client sends its GPS coordinates and a radius to the following function. Some of it is pseudocode for now.
var mysql = require('mysql');
var express = require('express');

var connection = mysql.createConnection({
    host: "",
    port: "",
    user: "",
    password: "",
    database: ""
});

user.on('returnCars', function(gps, radius){
    connection.query({
        sql: "SELECT * FROM cars WHERE radius = ?",  // pseudo-SQL for now
        values: [radius] },
        function(error, results, fields)
        {
            if(results)
            {
                user.emit('returnCars', results);
            }
        }
    );
});
Since SQL queries aren't instant, if 1,000 people ran this function at once it would surely clog up. All my research tells me this is the right way to do it, so the only option would be for it to run asynchronously, right?
Also, would it just be the returnCars function that runs asynchronously? I'm not sure whether, because the connection / SQL details variable isn't inside a function, everything would try to read it at once; maybe it should go inside the function or something.
The code is far too fragmentary to really help you with, but in general:
If you're using Node's built-in HTTP serving or something layered on top of it like Express, your code for when a request is received is expected to run asynchronously and there's nothing special you need to do to make that happen.
If you're using any of the main npm modules for MySQL access, the functions you use to make queries will run asynchronously and there's nothing special you have to do to make that happen.
In your pseudocode example, you've shown a callback on your query call. That's likely how you would use whichever MySQL access module you choose, either with direct callbacks like that or with promises.
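For example, with the mysql2 package (a drop-in alternative to the mysql package with a built-in promise API, named here only as one option), the lookup might look like this, keeping your pseudo-SQL unchanged:
var mysql = require('mysql2/promise');

// Create the pool once at module load; every request shares it.
var pool = mysql.createPool({
    host: 'localhost',          // placeholder credentials, as in your snippet
    user: 'your_user',
    password: 'your_password',
    database: 'your_database'
});

// Runs asynchronously: while MySQL works, Node is free to serve other users.
async function getCars(gps, radius) {
    var [rows] = await pool.query(
        'SELECT * FROM cars WHERE radius = ?',  // your pseudo-SQL, unchanged
        [radius]
    );
    return rows;
}
A socket handler would then just do getCars(gps, radius).then(function (cars) { user.emit('returnCars', cars); });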
Here's the issue: when I use my local MySQL environment I get no issues without using MySQL pools; however, when I connect to the remote DB that I want to use for production, I get an error about too many connections (it says the current number of connections is 10).
So here's what I did to solve the issue:
if (typeof GLOBAL.connection === 'undefined') {
    GLOBAL.connection = mysql.createPool({
        connectionLimit : 10,
        host            : this.host,
        user            : this.user,
        password        : this.password,
        database        : this.database
    });
}
this.connection = GLOBAL.connection;
This solves the issue by creating one global pool that all queries must run through. The only "problem", as far as I can see, is that now I have this pool sitting in my global variable.
Each time my queries run, a new Query() object is instantiated, which contains the above code. I'm basically just trying to find out whether this has repercussions that I can't currently see, which may bite me in the butt later?
Thanks for your help!
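One way to avoid parking the pool in a global is to let Node's module cache hold it instead, since a require()-ed module is only initialized once per process. A minimal sketch (pool.js and the credential fields are illustrative):
// pool.js -- initialized once, because require() caches this module
var mysql = require('mysql');

module.exports = mysql.createPool({
    connectionLimit : 10,
    host            : 'your_host',
    user            : 'your_user',
    password        : 'your_password',
    database        : 'your_database'
});
Every require('./pool') then returns the same pool object, giving the same one-pool guarantee with no GLOBAL involved.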
I'm using the Node package pg for Postgres:
npm i pg
var pg = require('pg');
I'm querying a large cluster which is not owned by me, and under certain conditions it may fail. Failure may be a bad response, which is easy to handle, or an endless query.
Please note I cannot introduce changes [config or otherwise] on the DB side.
Is there any way to set a timeout for query time?
I'd like my client to give up after a set time and return a timeout error.
I couldn't find anything like that in the docs.
Thanks in advance!
You can set up statement_timeout in the client:
const { Client } = require('pg')

const client = new Client({
    statement_timeout: 10000
})

or in the pool:

const { Pool } = require('pg')

const pool = new Pool({
    statement_timeout: 10000
})
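A quick sketch of what hitting the limit looks like (pg_sleep is a built-in Postgres function; the exact error text can vary by server version):
const { Pool } = require('pg')

const pool = new Pool({ statement_timeout: 10000 })

async function demo() {
    try {
        // pg_sleep(30) runs for 30 s on the server, exceeding the 10 s limit.
        await pool.query('SELECT pg_sleep(30)')
    } catch (err) {
        // Typically: "canceling statement due to statement timeout"
        console.error('query timed out:', err.message)
    }
}

demo()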
Best practice is using an init query to set the query timeout for the session.
SET statement_timeout TO 15000; -- milliseconds, or 0 (turns off the limitation)
This takes the timeout in milliseconds as its argument and applies to the session.
Afterwards, when a query takes longer than the specified value, you will receive an error from the server. Note that this is reported as cancellation on the user's request:
ERROR: Query (150) cancelled on user's request
Also note this actually cancels the query on the server side, reducing load.
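With node-postgres you can issue that init query from the pool's connect event, which fires once for each new client the pool creates. A sketch:
const { Pool } = require('pg')

const pool = new Pool()

// Run the init query on every fresh connection, so the whole
// session it serves is covered by the timeout.
pool.on('connect', (client) => {
    client.query('SET statement_timeout TO 15000')
})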
I am using the node-mongodb-native driver and I am looking for a way to open a persistent database connection rather than opening/closing it each time.
A simplified connection might look like this...
var DB = new mongo.Db('vows', new mongo.Server("127.0.0.1", 27017, {})),
    connection = DB.open(function(err, db) {
        // Here we have access to db
    });
How can I make the db object accessible to any module in my application? Rather than having to open the connection for every module separately?
Can this be done using module.exports? Or a global variable?
My solution:
var client;  // cached connection, set after the first open()

var getClient = function(cb) {
    if (typeof client !== "undefined") {
        return cb(null, client);
    } else {
        db.open(function(err, cli) {
            if (err) return cb(err);
            client = cli;
            getClient(cb);
        });
    }
};
Now, instead of
db.open(function(err, client) {
...stuff...
});
Do:
getClient(function(err, client) {
...stuff...
});
Your first db call opens a connection; the others use that connection.
BTW: suggestions on checking that client is still alive?
Edit: Don't use Mongoose; use something like mongo-col or mongo-client. Then have a single client open in your application. I have a ./client.js file that exports a properly opened and configured mongo client.
Mongoose is a solid abstraction on top of MongoDB that will allow you to handle MongoDB more easily. It's worth a look.
What you really want to do, though, is re-open your client every time you do anything with Mongo.
You don't keep an open connection to any other database either.
Just place your DB in a module along with some helper / wrapper functions.
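The ./client.js file mentioned in the edit might look roughly like this, using the same node-mongodb-native API as the question (a sketch; the queue exists so concurrent callers during the initial open all receive the same client):
// client.js -- opened once; Node's require() cache shares it everywhere
var mongo = require('mongodb');

var DB = new mongo.Db('vows', new mongo.Server('127.0.0.1', 27017, {}));
var cached = null;   // the opened db, set after the first open()
var waiting = [];    // callbacks queued while open() is in flight

module.exports = function getDb(cb) {
    if (cached) return cb(null, cached);
    waiting.push(cb);
    if (waiting.length > 1) return;  // an open() is already in progress
    DB.open(function (err, db) {
        cached = db;
        var cbs = waiting;
        waiting = [];
        cbs.forEach(function (fn) { fn(err, db); });
    });
};
Any module can then require('./client') and call it with a callback to share the single connection.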