Why does my sequelize model instance lose its id?

I've got a node-based microservice built on top of postgres, using sequelize to perform queries. I've got a table of Pets, each with an id (uuid) and a name (string). And, I've got a function for fetching Pets from the database by name, which wraps the nasty-looking sequelize call:
async function getPetByName( petName ) {
  const sqlzPetInstance = await Database.Pet.findOne({
    where: { name: { [Sequelize.Op.iLike]: petName } }
  })
  if (!sqlzPetInstance) return undefined
  return sqlzPetInstance
}
It works great.
Later, to improve performance, I added some very short-lived caching to that function, like so:
async function getPetByName( petName ) {
  if( ramCache.get(petName) ) return ramCache.get(petName)
  const sqlzPetInstance = await Database.Pet.findOne({ ... })
  if (!sqlzPetInstance) return undefined
  return ramCache.set(petName, sqlzPetInstance) // persists for 5 seconds
}
Now I've noticed that items served from the cache sometimes have their id prop removed! WTF?!
I've added logging, and discovered that the ramCache entry is still being located reliably, and the value is still an instance of the sqlz Pet model. All the other attributes on the model are still present, but dataValues.id is undefined. I also noticed that _previousDataValues.id has the correct value, which suggests to me this really is the model instance I want it to be, but modified for some reason.
What can explain this? Is this what I would see if callers who obtain the model mutate it by assigning to id? What can cause _previousDataValues and dataValues to diverge? Are there cool sqlz techniques I can use to catch the culprit (perhaps by defining custom setters that log or throw)?
EDIT: experimentation shows that I can't overwrite the id by assigning to it. That's cool, but now I'm pretty much out of ideas. If it's not some kind of irresponsible mutation (which I could protect against), then I can't think of any sqlz instance methods that would result in removing the id.
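For reference, the kind of trap setter I had in mind looks roughly like this (a sketch only; the model definition is illustrative, and Sequelize appears to hydrate query results in raw mode, which bypasses custom setters, so a trap like this should fire only on outside writes):
const Pet = sequelize.define('Pet', {
  id: {
    type: Sequelize.UUID,
    primaryKey: true,
    set(value) {
      // log the call stack of whoever touches id, then refuse the write
      console.trace(`attempt to set Pet.id to ${value}`)
      throw new Error('Pet.id must not be reassigned')
    }
  },
  name: Sequelize.STRING
})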

I don't have a smoking gun, but I can describe the fix I wrote and the hypothesis that shaped it.
As I said, I was storing sequelize model instances in RAM:
ramCache[ cacheKey ] = sqlzModelInstance
My hypothesis is that, by providing the same instance to every caller, I created a situation in which naughty callers could mutate the shared instance.
I never figured out how that mutation was happening. I proved through experimentation that I could not modify the id attribute by overwriting it:
// this does not work
sqlzModelInstance.id = 'some-fake-id'
// unchanged
However, I read a few things in the sqlz documentation that suggested that every instance retains some kind of invisible link to a central authority, and so there's the possibility of "spooky action at a distance."
So, to sever that link, I modified my caching system to store the raw data, rather than sqlz model instances, and to automatically re-hydrate that raw data upon retrieval.
Crudely:
function saveInCache( cacheKey, sqlzModelInstance ) {
  cache[ cacheKey ] = sqlzModelInstance.get({ plain: true })
}

function getFromCache( cacheKey ) {
  let data = cache[ cacheKey ]
  if (!data) return undefined
  return MySqlzClass.build( data, { isNewRecord: false, raw: true } )
}
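With those helpers in place, the lookup function from the question becomes something like this (a sketch; it assumes MySqlzClass in getFromCache is Database.Pet):
async function getPetByName( petName ) {
  const cached = getFromCache(petName)
  if (cached) return cached
  const sqlzPetInstance = await Database.Pet.findOne({
    where: { name: { [Sequelize.Op.iLike]: petName } }
  })
  if (!sqlzPetInstance) return undefined
  saveInCache(petName, sqlzPetInstance)
  return sqlzPetInstance
}
Every caller now receives a freshly-built instance, so no caller can mutate an object shared with anyone else.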
I never located the naughty caller -- and my general practice is to avoid mutating arguments, so it's unlikely any straightforward mutation is happening -- but the change I describe has fixed the easily-reproducible bug I was encountering. So, I think my hypothesis, vague as it is, is accurate.
I will refrain for a while from marking my answer as correct, in the hopes that someone can shed some more light on the problem.

Related

Using Merge with a single Create call in FaunaDB is creating two documents?

Got a weird bug using FaunaDB with Node.js running on a Netlify Function.
I am building out a quick proof-of-concept and initially everything worked fine. I had a Create query that looked like this:
const faunadb = require('faunadb');
const q = faunadb.query;

const CreateFarm = (data) => (
  q.Create(
    q.Collection('farms'),
    { data },
  )
);
As I said, everything here works as expected. The trouble began when I tried to start normalizing the data FaunaDB sends back. Specifically, I want to merge the Fauna-generated ID into the data object, and send just that back with none of the other metadata.
I am already doing that with other resources, so I wrote a helper query and incorporated it:
const faunadb = require('faunadb');
const q = faunadb.query;

const Normalize = (resource) => (
  q.Merge(
    q.Select(['data'], resource),
    { id: q.Select(['ref', 'id'], resource) },
  )
);

const CreateFarm = (data) => (
  Normalize(
    q.Create(
      q.Collection('farms'),
      { data },
    ),
  )
);
This Normalize function works as expected everywhere else. It builds the correct merged object with an ID with no weird side effects. However, when used with CreateFarm as above, I end up with two identical farms in the DB!!
I've spent a long time looking at the rest of the app. There is definitely only one POST request coming in, and CreateFarm is definitely only being called once. My best theory was that since Merge copies the first resource passed to it, Create is somehow getting called twice on the DB. But reordering the Merge call does not change anything. I have even tried passing in an empty object first, but I always end up with two identical objects created in the end.
Your helper creates an FQL query with two separate Create expressions. Each is evaluated and creates a new Document. This is not related to the Merge function.
Merge(
  Select(['data'], Create(
    Collection('farms'),
    { data },
  )),
  { id: Select(['ref', 'id'], Create(
    Collection('farms'),
    { data },
  )) },
)
Use Let to create the document, then Update it with the id. Note that this increases the number of Write Ops required for your application: it will basically double the cost of creating Documents. But for what you are trying to do, this is how to do it.
Let(
  {
    newDoc: Create(Collection("farms"), { data }),
    id: Select(["ref", "id"], Var("newDoc")),
    data: Select(["data"], Var("newDoc"))
  },
  Update(
    Select(["ref"], Var("newDoc")),
    {
      data: Merge(
        Var("data"),
        { id: Var("id") }
      )
    }
  )
)
Aside: why store id in the document data?
It's not clear why you might need to do this. Indexes can be created on the ref values themselves. If your client receives a Ref, then that can be passed into subsequent queries directly. In my experience, if you need the plain id value directly in an application, transform the Document as close to that point in the application as possible (like using ids as keys for an array of web components).
There's even a slight Compute advantage for using Ref values rather than re-building Ref expressions from a Collection name and ID. The expression Ref(Collection("farms"), "1234") counts as 2 FQL functions toward Compute costs, but reusing the Ref value returned by queries is free.
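For example (a sketch; farmRef stands for a Ref value returned by an earlier query):
// reusing a Ref value from a previous result: free
client.query(q.Get(farmRef))

// rebuilding the Ref from a collection name and id: 2 FQL functions
client.query(q.Get(q.Ref(q.Collection('farms'), '1234')))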
Working with GraphQL, the _id field is abstracted out for you because working with Document types in GraphQL would be pretty awful. However, the best practice for FQL queries would be to use the Refs directly as much as possible.
Don't let me talk in absolute terms, though! Generally, I believe there's a reason for everything. If you believe you really need to duplicate the ID in the Document's data, then I would be interested in a comment explaining why.

How to deal with race conditions in event listeners and shared state?

I have 2 event listeners that operate on the same shared data/state. For instance:
let sharedState = {
  username: 'Bob',
  isOnline: false,
};

emitter.on('friendStatus', (status) => {
  sharedState.isOnline = status.isOnline;
});

emitter.on('friendData', (friend) => {
  if (sharedState.isOnline) {
    sharedState.username = friend.username;
  }
});
My problem is that these events can be emitted in any order. The friendData event might come in before friendStatus, yet friendData depends on the data set by friendStatus. In other words: I need the handler for friendData to execute after friendStatus, but I don't have this assurance from the event emitter, so I need to implement it somehow in my code.
Now of course I could simply remove the if (sharedState.isOnline) { check from the friendData listener and let it run its course, then have a function run after both handlers have finished to reconcile the shared-state dependencies:
emitter.on('friendStatus', (status) => {
  sharedState.isOnline = status.isOnline;
  reconcileStateBetweenUsernameAndIsOnline();
});

emitter.on('friendData', (friend) => {
  sharedState.username = friend.username;
  reconcileStateBetweenUsernameAndIsOnline();
});
The problem is that this reconciliation function knows about this specific use case's data dependencies, so it cannot be very generic. With large, interconnected data dependencies this gets a lot harder: I am already dealing with other subscriptions and other dependencies, and my reconciliation function is becoming quite large and complicated.
My question is: is there a better way to model this? For instance if I had the assurance that the handlers would run in a specific order I wouldn't have this issue.
EDIT: expected behavior is to use the sharedState and render a UI where I want the username to show ONLY if the status isOnline is true.
From #Bergi's answer in the comments, the solution I was hinting at seems the most appropriate for this case. Simply let the event handlers set their own independent state, then observe those values changing and write the appropriate logic based on what you need to do. For instance, I need to show a username; this function shouldn't care about ordering or have any knowledge of time: it should simply check whether the isOnline status is true and whether there's a username. The observer pattern can then be used to call this function whenever any of its dependencies change. In this case the function depends on status.isOnline and friend.username, so it will re-execute whenever those values change.
function showUsername() {
  return status.isOnline && friend.username !== '';
}
This function must observe the properties it depends on (status.isOnline and friend.username). You can have a look at RxJS or other libraries for achieving this in a more "standard" way.
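For illustration, a minimal sketch of that idea with RxJS (v6+ syntax; emitter is from the question, and renderUsername is a hypothetical render hook):
import { BehaviorSubject, combineLatest } from 'rxjs';
import { map } from 'rxjs/operators';

const isOnline$ = new BehaviorSubject(false);
const username$ = new BehaviorSubject('');

// each handler writes only its own independent slice of state
emitter.on('friendStatus', (status) => isOnline$.next(status.isOnline));
emitter.on('friendData', (friend) => username$.next(friend.username));

// the render logic re-executes whenever either dependency changes,
// regardless of the order the events arrive in
combineLatest([isOnline$, username$])
  .pipe(map(([isOnline, username]) => (isOnline && username !== '' ? username : null)))
  .subscribe((username) => renderUsername(username));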

Passing down arguments using Facebook's DataLoader

I'm using DataLoader for batching the requests/queries together.
In my loader function I need to know the requested fields to avoid having a SELECT * FROM query but rather a SELECT field1, field2, ... FROM query...
What would be the best approach using DataLoader to pass down the resolveInfo needed for it? (I use resolveInfo.fieldNodes to get the requested fields)
At the moment, I'm doing something like this:
await someDataLoader.load({ ids, args, context, info });
and then in the actual loaderFn:
const loadFn = async options => {
  const ids = [];
  let args;
  let context;
  let info;
  options.forEach(a => {
    ids.push(a.ids);
    if (!args && !context && !info) {
      args = a.args;
      context = a.context;
      info = a.info;
    }
  });
  return new DataProvider().get({ ...args, ids }, context, info);
};
but as you can see, it's hacky and doesn't really feel good...
Does anyone have an idea how I could achieve this?
I am not sure if there is a good answer to this question, simply because Dataloader is not made for this use case, but I have worked extensively with Dataloader, written similar implementations, and explored similar concepts in other programming languages.
Let's understand why Dataloader is not made for this use case and how we could still make it work (roughly as in your example).
Dataloader is not made for fetching a subset of fields
Dataloader is made for simple key-value lookups. That means given a key, like an ID, it will load the value behind it. For that it assumes that the object behind the ID will always be the same until it is invalidated. This is the single assumption that enables the power of Dataloader. Without it, the three key features of Dataloader (illustrated in the sketch after this list) won't work anymore:
Batching requests (multiple requests are done together in one query)
Deduplication (requests to the same key twice result in one query)
Caching (consecutive requests of the same key don't result in multiple queries)
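For example (a sketch; fetchUsersByIds is a hypothetical batch fetcher that must return one row per id, in the same order):
const userLoader = new Dataloader(async ids => fetchUsersByIds(ids));

// inside an async function: both distinct keys are batched into a
// single fetchUsersByIds(['1', '2']) call, and the duplicate '1' is
// served from the same cached promise
const [a, b, aAgain] = await Promise.all([
  userLoader.load('1'),
  userLoader.load('2'),
  userLoader.load('1'),
]);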
This leads us to the following two important rules if we want to maximise the power of Dataloader:
Two different entities cannot share the same key, otherwise we might return the wrong entity. This sounds trivial, but it is not in your example. Let's say we want to load a user with ID 1 and the fields id and name. A little bit later (or at the same time) we want to load the user with ID 1 and the fields id and email. These are technically two different entities, and they need different keys.
The same entity should have the same key all the time. Again, this sounds trivial, but it really is not in the example. The user with ID 1 and fields id and name should be the same as the user with ID 1 and fields name and id (notice the order).
In short a key needs to have all the information needed to uniquely identify an entity but not more than that.
So how do we pass down fields to Dataloader
await someDataLoader.load({ ids, args, context, info });
In your question you have provided a few more things to your Dataloader as a key. First, I would not put args and context into the key. Does your entity change when the context changes (e.g. you are querying a different database now)? Probably yes, but do you want to account for that in your dataloader implementation? I would instead suggest creating new dataloaders for each request, as described in the docs.
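With express-graphql, that per-request wiring could look roughly like this (a sketch; createLoaders is the factory defined further below):
const { graphqlHTTP } = require('express-graphql');

app.use('/graphql', graphqlHTTP(req => ({
  schema,
  // a fresh set of loaders per request, so caching never leaks
  // between requests (or between users)
  context: { ...createLoaders(db), req },
})));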
Should the whole request info be in the key? No, but we do need the fields that are requested. Apart from that, your provided implementation is wrong and would break when the loader is called with two different resolve infos: you only set the resolve info from the first call, but it really might be different on each object (think about the first user example above). Ultimately we could arrive at the following implementation of a dataloader:
// This function creates unique cache keys for different selected
// fields
function cacheKeyFn({ id, fields }) {
  const sortedFields = [...(new Set(fields))].sort().join(';');
  return `${id}[${sortedFields}]`;
}

function createLoaders(db) {
  const userLoader = new Dataloader(async keys => {
    // Create a set with all requested fields
    const fields = keys.reduce((acc, key) => {
      key.fields.forEach(field => acc.add(field));
      return acc;
    }, new Set());
    // Make sure we can map the rows back to their keys
    fields.add('id');
    // Get all our ids for the DB query
    const ids = keys.map(key => key.id);
    // Please be aware of possible SQL injection, don't copy + paste
    const result = await db.query(`
      SELECT
        ${[...fields].join(', ')}
      FROM
        user
      WHERE
        id IN (${ids.join()})
    `);
    // A batch function must return one result per key, in key order
    return keys.map(key => result.find(row => row.id === key.id));
  }, { cacheKeyFn });
  return { userLoader };
}

// now in a resolver
resolve(parent, args, ctx, info) {
  // https://www.npmjs.com/package/graphql-fields
  return ctx.userLoader.load({ id: args.id, fields: Object.keys(graphqlFields(info)) });
}
This is a solid implementation, but it has a few weaknesses. First, we are overfetching a lot of fields if we have different field requirements in the same batch request. Second, if we have cached an entity under the key 1[id;name], we could (at least in JavaScript) also answer the keys 1[id] and 1[name] with that object. Here we could build a custom map implementation to supply to Dataloader that is smart enough to know these things about our cache.
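A sketch of that custom map idea (the cacheMap option is part of Dataloader's documented API; the superset lookup below is illustrative and linear, so don't treat it as tuned):
function parseKey(key) {
  // inverse of cacheKeyFn above: '1[id;name]' -> { id: '1', fields: ['id', 'name'] }
  const [id, fieldPart] = key.split('[');
  return { id, fields: fieldPart.slice(0, -1).split(';') };
}

class FieldAwareCacheMap {
  constructor() {
    this.map = new Map();
  }
  get(key) {
    if (this.map.has(key)) return this.map.get(key);
    // answer the request from any cached entry for the same id
    // whose field set is a superset of the requested one
    const wanted = parseKey(key);
    for (const [cachedKey, value] of this.map) {
      const cached = parseKey(cachedKey);
      if (cached.id === wanted.id &&
          wanted.fields.every(field => cached.fields.includes(field))) {
        return value;
      }
    }
    return undefined;
  }
  set(key, value) { this.map.set(key, value); }
  delete(key) { return this.map.delete(key); }
  clear() { return this.map.clear(); }
}

const userLoader = new Dataloader(batchFn, {
  cacheKeyFn,
  cacheMap: new FieldAwareCacheMap(),
});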
Conclusion
We see that this is really a complicated matter. I know it is often listed as a benefit of GraphQL that you don't have to fetch all fields from a database for every query, but the truth is that in practice this is seldom worth the hassle. Don't optimise what is not slow. And even if it is slow, is it a bottleneck?
My suggestion is: write trivial Dataloaders that simply fetch all (needed) fields. If you have one client, it is very likely that for most entities the client fetches all fields anyway; otherwise they would not be part of your API, right? Then use something like query introspection to measure slow queries, and find out exactly which field is slow. Then you optimise only the slow thing (see for example my answer here that optimises a single use case). And if you are a big ecommerce platform, please don't use Dataloader for this. Build something smarter, and don't use JavaScript.

Shortest code to cache Rxjs http request while not complete?

I'm trying to create an observable flow that fulfills the following requirements:
1. Loads data from storage at subscribe time
2. If the data has not yet expired, return an observable of the stored value
3. If the data has expired, return an HTTP request observable that uses the refresh token to get a new value and store it
4. If this code is reached again before the request has completed, return the same request observable
5. If this code is reached after the previous request completed or with a different refresh token, start a new request
I'm aware that there are many different answers on how to perform step (3), but as I'm trying to perform these steps together, I am looking for guidance on whether the solution I've come up with is the most succinct it can be (which I doubt!).
Here's a sample demonstrating my current approach:
var cachedRequestToken;
var cachedRequest;

function getOrUpdateValue() {
  return loadFromStorage()
    .flatMap(data => {
      // data doesn't exist, shortcut out
      if (!data || !data.refreshtoken)
        return Rx.Observable.empty();

      // data still valid, return the existing value
      if (data.expires > new Date().getTime())
        return Rx.Observable.return(data.value);

      // if the refresh token is different or the previous request is
      // complete, start a new request, otherwise return the cached request
      if (!cachedRequest || cachedRequestToken !== data.refreshtoken) {
        cachedRequestToken = data.refreshtoken;
        var pretendHttpBody = {
          value: Math.random(),
          refreshToken: Math.random(),
          expires: new Date().getTime() + (10 * 60 * 1000) // set by server, expires in ten minutes
        };
        cachedRequest = Rx.Observable.create(ob => {
          // this would really be a http request that exchanges
          // the one-use refreshtoken for new data, then saves it
          // to storage for later use before passing on the value
          window.setTimeout(() => { // emulate slow response
            saveToStorage(pretendHttpBody);
            ob.onNext(pretendHttpBody.value);
            ob.onCompleted();
            cachedRequest = null; // clear the request now we're complete
          }, 2500);
        });
      }
      return cachedRequest;
    });
}

function loadFromStorage() {
  return Rx.Observable.create(ob => {
    var storedData = { // loading from storage goes here
      value: 15, // wrapped in observable to delay loading until subscribed
      refreshtoken: 63, // other process may have updated this between requests
      expires: new Date().getTime() - (60 * 1000) // pretend to have already expired
    };
    ob.onNext(storedData);
    ob.onCompleted();
  })
}

function saveToStorage(data) {
  // save goes here
}

// first request
getOrUpdateValue().subscribe(function(v) { console.log('sub1: ' + v); });

// second request, can occur before or after first request finishes
window.setTimeout(
  () => getOrUpdateValue().subscribe(function(v) { console.log('sub2: ' + v); }),
  1500);
First, have a look at a working jsbin example.
The solution is a tad different than your initial code, and I'd like to explain why. The need to keep returning to your local storage, saving to it, and saving flags (cache and token) did not fit, for me, with a reactive, functional approach. The heart of the solution I gave is:
var data$ = new Rx.BehaviorSubject(storageMock);
var request$ = new Rx.Subject();

request$.flatMapFirst(loadFromServer).share().startWith(storageMock).subscribe(data$);
data$.subscribe(saveToStorage);

function getOrUpdateValue() {
  return data$.take(1)
    .filter(data => (data && data.refreshtoken))
    .switchMap(data => (data.expires > new Date().getTime()
      ? data$.take(1)
      : (console.log('expired ...'), request$.onNext(true), data$.skip(1).take(1))));
}
The key is that data$ holds your latest data and is always up to date; it is easily accessible by doing a data$.take(1). The take(1) is important to make sure your subscription gets a single value and terminates (because you attempt to work in a procedural, as opposed to functional, manner). Without the take(1), your subscription would stay active and you would have multiple handlers out there; that is, you would handle future updates as well in code that was meant only for the current update.
In addition, I hold a request$ subject which is your way to start fetching new data from the server. The function works like so:
The filter ensures that if your data is empty or has no token, nothing passes through, similar to the return Rx.Observable.empty() you had.
If the data is up to date, it returns data$.take(1) which is a single element sequence you can subscribe to.
If not, it needs a refresh. To do so, it triggers request$.onNext(true) and returns data$.skip(1).take(1). The skip(1) is to avoid the current, outdated value.
For brevity I used (console.log('expired ...'), request$.onNext(true), data$.skip(1).take(1)). This might look a bit cryptic: it uses the JS comma operator, which is common in minified/uglified code. It evaluates all the expressions and returns the result of the last one. If you want more readable code, you could rewrite it like so:
.switchMap(data => {
  if (data.expires > new Date().getTime()) {
    return data$.take(1);
  } else {
    console.log('expired ...');
    request$.onNext(true);
    return data$.skip(1).take(1);
  }
});
The last part is the usage of flatMapFirst. This ensures that once a request is in progress, all following requests are dropped. You can see it work in the console printout: 'load from server' is printed several times, yet the actual sequence is invoked only once and you get a single 'loading from server done' printout. This is a more reactive-oriented solution than your original refreshtoken flag checking.
Though I didn't need the saved data, it is saved because you mentioned that you might want to read it in future sessions.
A few tips on rxjs:
Instead of using setTimeout, which can cause many problems, you can simply do Rx.Observable.timer(time_out_value).subscribe(...).
Creating an observable by hand is cumbersome (you even had to call onNext(...) and onCompleted()). There is a much cleaner way to do this using Rx.Subject. Note that there are specializations of this class, BehaviorSubject and ReplaySubject, which are worth knowing and can help a lot.
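For instance (a sketch in RxJS 4 syntax to match the code above; pretendHttpBody and saveToStorage are from the question):
// timer instead of window.setTimeout
Rx.Observable.timer(2500)
  .map(() => pretendHttpBody)
  .subscribe(body => saveToStorage(body));

// a Subject spares you hand-rolling Observable.create
var response$ = new Rx.Subject();
response$.subscribe(v => console.log('got ' + v));
response$.onNext(42);
response$.onCompleted();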
One last note. This was quite a challenge :-) I'm not familiar with your server-side code and design considerations, yet the need to suppress calls felt uncomfortable to me. Unless there is a very good reason related to your backend, my natural approach would be to use flatMap and let the last request 'win', i.e. drop previous unterminated calls and set the value.
The code is rxjs 4 based (so it can run in jsbin), if you're using angular2 (hence rxjs 5), you'll need to adapt it. Have a look at the migration guide.
================ answers to Steve's other questions (in comments below) =======
There is one article I can recommend. Its title says it all :-)
As for the procedural vs. functional approach, I'd add another variable to the service:
let token$ = data$.pluck('refreshtoken');
and then consume it when needed.
My general approach is to first map my data flows and relations and then like a good "keyboard plumber" (like we all are), build the piping. My top level draft for a service would be (skipping the angular2 formalities and provider for brevity):
class UserService {
  data$: <as above>;
  token$: data$.pluck('refreshtoken');
  private request$: <as above>;

  refresh() {
    this.request$.onNext(true);
  }
}
You might need to do some checking so the pluck does not fail.
Then, each component that needs the data or the token can access it directly.
Now let's suppose you have a service that needs to act on a change to the data or the token:
class SomeService {
  constructor(private userSvc: UserService) {
    this.userSvc.token$.subscribe(() => this.doMyUpdates());
  }
}
If you need to synthesize data, meaning use the data/token together with some local data:
Rx.Observable.combineLatest(this.userSvc.data$, this.myRelevantData$)
  .subscribe(([data, myData]) => this.doMyUpdates(data.someField, myData.someField));
Again, the philosophy is that you build the data flow and pipes, wire them up and then all you have to do is trigger stuff.
The 'mini pattern' I've come up with is to pass my trigger sequence to a service once and subscribe to the result. Let's take autocomplete as an example:
class ACService {
  fetch(text: string): Observable<Array<string>> {
    return http.get(text).map(response => response.json().data);
  }
}
Then you have to call it every time your text changes and assign the result to your component:
<div class="suggestions" *ngFor="let suggestion of suggestions | async">
  <div>{{suggestion}}</div>
</div>
and in your component:
onTextChange(text) {
  this.suggestions = this.acSVC.fetch(text);
}
but it could also be done like this:
class ACService {
  createFetcher(textStream: Observable<string>): Observable<Array<string>> {
    return textStream.flatMap(text => http.get(text))
      .map(response => response.json().data);
  }
}
}
And then in your component:
textStream: Subject<string> = new Subject<string>();
suggestions: Observable<Array<string>>;

constructor(private acSVC: ACService) {
  this.suggestions = this.acSVC.createFetcher(this.textStream);
}

onTextChange(text) {
  this.textStream.next(text);
}
The template code stays the same.
It seems like a small thing here, but once the app grows bigger and the data flow more complicated, this works much better. You have a sequence that holds your data, and you can use it around the component wherever you need it; you can even transform it further. For example, let's say you need to know the number of suggestions. In the first method, once you get the result, you need to query it further:
onTextChange(text) {
  this.suggestions = this.acSVC.fetch(text);
  this.suggestionsCount = this.suggestions.pluck('length'); // in a sequence
  // or
  this.suggestions.subscribe(suggestions => this.suggestionsCount = suggestions.length); // in a numeric variable
}
Now in the second method, you just define:
constructor(private acSVC: ACService) {
  this.suggestions = this.acSVC.createFetcher(this.textStream);
  this.suggestionsCount = this.suggestions.pluck('length');
}
Hope this helps :-)
While writing, I tried to reflect on the path I took to using reactive like this. Needless to say, ongoing experimentation, numerous jsbins, and strange failures are a big part of it. Another thing that I think helped shape my approach (though I'm not currently using it) is learning redux and reading/trying a bit of ngrx (angular's redux port). The philosophy and the approach do not let you even think procedurally, so you have to tune in to a mindset based on functional style, data, relations, and flows.

From Event driven OO to Redux architecture

I have a structure that is pretty much OO, and I am about to migrate to React/Redux due to the event mess. I am curious what to do with the current modules. I have objects with a schema like:
User {
  getName()
  getSurname()
  etc...
}
And there are lots of these; they were used as facades/factories for raw JSON data, as I used to pass JSON around and manipulate it (mutable data).
Now, how to solve this in Redux?
I get to the part where I have an async action call and receive raw data from the API, and then what?
Should I pass 'complex' objects with their getters/setters into the state? State is said to be immutable, so that doesn't sit well with Redux recommendations.
Or maybe convert the class-like elements to accessors like:
function getName(rawJson) {
  return rawJson.name
}

function setName(rawJson, name) {
  return Object.assign({}, rawJson, {name})
}
parse it in the action and return a raw JSON chunk from the action to the reducer, then stick it into the new state?
EDIT:
A simple pseudocode module for user:
function User(raw) {
  return {
    getName: function() {
      return raw.name
    },
    setName: function(name) {
      raw.name = name
      return this
    }
  }
}
My point is about moving all the data into the store and flattening/normalizing it: would it be fine to have an array of e.g. User objects in the store, or should they all be pure JSON? I want to be sure that turning all those objects into basic values is really the only correct way, because it's going to be a lot of work.
Not sure if this will be totally relevant, but to put your question another way, if I'm understanding it: where to put business logic and other validations/manipulations?
https://github.com/reactjs/redux/issues/1165
I personally follow this trend as well in that I put all of my async action manipulation in my action creators before storing them in a format of my choosing in the reducer.
Here, I choose to convert whatever objects I get back from the API to plain JSON. Similarly, you can do whatever logic you need here before dispatching a success action to store the result in your reducers.
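As a sketch of that flow (the action types, api client, and thunk middleware are assumptions for illustration, not from the question):
// async action creator (assumes redux-thunk): fetch raw users and
// keep only plain, serializable JSON
function fetchUsers() {
  return async (dispatch) => {
    dispatch({ type: 'USERS_REQUEST' });
    try {
      const rawUsers = await api.getUsers();
      // strip any class/facade wrappers down to plain objects here
      const users = rawUsers.map((u) => ({ id: u.id, name: u.name }));
      dispatch({ type: 'USERS_SUCCESS', payload: users });
    } catch (error) {
      dispatch({ type: 'USERS_FAILURE', error });
    }
  };
}

// reducer: plain JSON in, plain JSON out
function usersReducer(state = [], action) {
  switch (action.type) {
    case 'USERS_SUCCESS':
      return action.payload;
    default:
      return state;
  }
}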
Hope that helps/is relevant to whatever you were asking...
