How to search for partial match using index in fauna db

I have a faunadb collection of users. The data is as follows:
{
"username": "Hermione Granger",
"fullName": "Hermione Jean Granger",
"DOB": "19-September-1979",
"bloodStatus": "Muggle-Born",
"gender": "Female",
"parents": [
"Wendell Wilkins",
"Monica Wilkins"
]
}
When I use an index, I have to search for the whole phrase, i.e. Hermione Granger. But I want to search for just Hermione and get the result.

I came across a solution that seems to work.
The code below uses the faunadb JavaScript client.
"all_items" is an index on a collection in Fauna that returns all items in the collection.
The lambda searches on the title field, so the query returns any document whose title partially matches the search term.
I know this is a bit late; I hope it helps anyone else who may be looking to do this.
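const response = await faunaClient.query(
  q.Map(
    q.Filter(
      q.Paginate(q.Match(q.Index("all_items"))),
      // Keep only the refs whose lower-cased title contains the search term
      q.Lambda((ref) =>
        q.ContainsStr(
          q.LowerCase(
            q.Select(["data", "title"], q.Get(ref))
          ),
          title // <= this is your search term; pass it lower-cased, since the field is lower-cased before comparing
        )
      )
    ),
    // Fetch the full document for each matching ref
    q.Lambda((ref) => q.Get(ref))
  )
);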

The Match function only applies an exact comparison. Partial matches are not supported.
One approach that might work for you is to store fields that would contain multiple values that need to be indexed as arrays.
When you index a field whose value is an array, the index creates multiple index entries for the document so that any one of the array items can be used to match entries. Note that this strategy increases the read and write operations involved.
Here's an example:
> CreateCollection({ name: "u" })
{
ref: Collection("u"),
ts: 1618532727920000,
history_days: 30,
name: 'u'
}
> Create(Collection("u"), { data: { n: ["Hermione", "Granger"] }})
{
ref: Ref(Collection("u"), "295985674342892032"),
ts: 1618532785650000,
data: { n: [ 'Hermione', 'Granger' ] }
}
> Create(Collection("u"), { data: { n: ["Harry", "Potter"] }})
{
ref: Ref(Collection("u"), "295985684233060864"),
ts: 1618532795080000,
data: { n: [ 'Harry', 'Potter' ] }
}
> Create(Collection("u"), { data: { n: ["Ginny", "Potter"] }})
{
ref: Ref(Collection("u"), "295985689713967616"),
ts: 1618532800300000,
data: { n: [ 'Ginny', 'Potter' ] }
}
> CreateIndex({
name: "u_by_n",
source: Collection("u"),
terms: [
{ field: ["data", "n"] }
]
})
{
ref: Index("u_by_n"),
ts: 1618533007000000,
active: true,
serialized: true,
name: 'u_by_n',
source: Collection("u"),
terms: [ { field: [ 'data', 'n' ] } ],
partitions: 1
}
> Paginate(Match(Index("u_by_n"), ["Potter"]))
{
data: [
Ref(Collection("u"), "295985684233060864"),
Ref(Collection("u"), "295985689713967616")
]
}
Note that you cannot query for multiple array items in a single field:
> Paginate(Match(Index("u_by_n"), ["Harry", "Potter"]))
{ data: [] }
The reason is that the index has only one field defined in terms, and successful matches require sending an array having the same structure as terms to Match.
To be able to search by the full username and by its individual parts, I'd suggest storing both the string and the array version of the username field in your documents, e.g. username: 'Hermione Granger' and username_items: ['Hermione', 'Granger']. Then create one index for the string field and another for the array field, so that you can search either way.
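Keeping the two fields in sync is easiest if the array is derived from the string at write time. The following is a minimal sketch in plain JavaScript, not part of the Fauna API: toUsernameItems and withUsernameItems are hypothetical helper names, and splitting on whitespace is an assumption about how names should be tokenized. The resulting object is what you would pass as data when creating the document.

```javascript
// Hypothetical helper: split a display name into the tokens to be indexed.
// Assumes whitespace-separated tokens are the units you want to match on.
function toUsernameItems(username) {
  return username.trim().split(/\s+/).filter(Boolean);
}

// Hypothetical helper: return a copy of the document data with the array field added.
function withUsernameItems(data) {
  return { ...data, username_items: toUsernameItems(data.username) };
}

const userData = withUsernameItems({ username: "Hermione Granger" });
console.log(userData.username_items); // [ 'Hermione', 'Granger' ]
```

Each token is still matched exactly, including case, so it may be worth lower-casing both the stored tokens and the search term if case-insensitive lookups are needed.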

Related

Mongodb find documents with given field value inside an array and its id in another array inside the same document

My data model:
{
_id: ObjectId,
persons:[{
_id: ObjectId,
name: String,
...
}],
relations: [{
type: String,
personId: ObjectId,
...
}],
...
}
Here's my issue:
I am trying to find documents where a person's name is x and its _id is inside the relations array (personId) with a given type.
Example:
My data:
[{
_id:"1",
persons:[{
_id:"1",
name: "Homer"
},
{
_id:"2",
name: "Bart"
}],
relations: [{
type:"House_Owner",
personId: 1,
}],
}]
Request_1:
Find all documents where "Homer" is the house owner
Result:
[{
_id:"1",
...
}]
Request_2:
Find all documents where "Bart" is the house owner
Result:
[]
Any help would be appreciated.
The only solution I see here is to do the find operation with the given name value and after that filter the mongodb result.
PS: I cannot change the existing data model
EDIT:
I found a solution to do this by using $where operator with a javascript function but I am not sure that's the most efficient way.
db.myCollection("x").find({
$where: function() {
for (const relation of this.relations) {
if(relation.type === "House_Owner") {
for (const person of this.persons) {
if(person.name === "Homer" && person._id.equals(relation.personId)) {
return true;
}
}
}
}
}
})

You can do something like this:
const requiredName="x"
const requiredId = "id"
await yourModel.find({$and:[{"relations.personId":requiredId },{"persons.name":requiredName}]})
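Note that the two conditions in that query are evaluated independently of each other and the relation type is not checked, so the person's id has to be known in advance. If post-filtering the result in application code is acceptable (the fallback mentioned in the question), the check can be sketched in plain JavaScript as follows. hasRelationForName is a hypothetical helper name, and ids are compared as strings as an assumption, because the sample data mixes "1" and 1.

```javascript
// Hypothetical helper: true when some relation of the given type points at
// a person in the same document who has the given name.
function hasRelationForName(doc, name, type) {
  return doc.relations.some(
    (relation) =>
      relation.type === type &&
      doc.persons.some(
        (person) =>
          person.name === name &&
          String(person._id) === String(relation.personId)
      )
  );
}

// Sample data taken from the question.
const docs = [
  {
    _id: "1",
    persons: [
      { _id: "1", name: "Homer" },
      { _id: "2", name: "Bart" },
    ],
    relations: [{ type: "House_Owner", personId: 1 }],
  },
];

const homerDocs = docs.filter((d) => hasRelationForName(d, "Homer", "House_Owner"));
const bartDocs = docs.filter((d) => hasRelationForName(d, "Bart", "House_Owner"));
console.log(homerDocs.length); // 1
console.log(bartDocs.length); // 0
```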

Edit multiple objects in array using mongoose (MongoDB)

I have tried several approaches without success. I can modify several objects with the same key, but I can't modify objects with different keys. It is quite a complex problem, so any help is welcome.
{
id: 123,
"infos": [
{ name: 'Joe', value: 'Disabled', id: 0 },
{ name: 'Adam', value: 'Enabled', id: 0 }
]
};
In my database I have a collection with an array and several objects inside which gives this.
I want to modify these objects, filter by their name and modify the value.
To give you a better example, my site returns me an object with the new data, and I want to modify the database object with the new object, without clearing the array, the name key never changes.
const object = [
{ name: 'Joe', value: 'Hey', id: 1 },
{ name: 'Adam', value: 'None', id: 1 }
];
for(const obj in object) {
Schema.findOneAndUpdate({ id: 123 }, {
$set: {
[`infos.${obj}.value`]: "Test"
}
})
}
This code works but it is not optimized, it makes several requests, I would like to do everything in one request, and also it doesn't update the id, only the value.
If anyone can help me that would be great, I've looked everywhere and can't find anything
My schema structure
new Schema({
id: { "type": String, "required": true, "unique": true },
infos: []
})
I use the $addToSet method to insert objects into the infos array

Try this:
db.collection.update({
id: 123,
},
{
$set: {
"infos.$[x].value": "Value",
"infos.$[x].name": "User"
}
},
{
arrayFilters: [
{
"x.id": {
$in: [
1
]
}
},
],
multi: true
})
The filtered positional operator $[<identifier>] acts as a placeholder for every element of the array field that matches the arrayFilters conditions.
In $in you can pass a dynamic array of ids.
Example:
const ids = [1, 2 /* ..., n */]
db.collection.update(
//Same code as it is...
{
arrayFilters: [
{
"x.id": {
$in: ids
}
},
],
multi: true
})
MongoPlayGround Link : https://mongoplayground.net/p/Tuz831lkPqk

Maybe you are looking for something like this:
db.collection.update({},
{
$set: {
"infos.$[x].value": "test1",
"infos.$[x].id": 10,
"infos.$[y].value": "test2",
"infos.$[y].id": 20
}
},
{
arrayFilters: [
{
"x.name": "Adam"
},
{
"y.name": "Joe"
}
],
multi: true
})
Explained:
You define arrayFilters for all names in objects you have and update the values & id in all documents ...
playground
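To avoid hard-coding one filter per name, the $set document and the arrayFilters list can be generated from the incoming array. The sketch below is plain JavaScript that only builds the arguments and does not talk to the database. buildInfosUpdate is a hypothetical helper, and the identifiers e0, e1, ... are arbitrary; arrayFilters identifiers must begin with a lowercase letter and contain only alphanumeric characters.

```javascript
// Hypothetical helper: build one filtered positional update per incoming item,
// matching array elements by their (never-changing) name.
function buildInfosUpdate(items) {
  const $set = {};
  const arrayFilters = [];
  items.forEach((item, i) => {
    const ident = `e${i}`; // arbitrary identifier for this filter
    $set[`infos.$[${ident}].value`] = item.value;
    $set[`infos.$[${ident}].id`] = item.id;
    arrayFilters.push({ [`${ident}.name`]: item.name });
  });
  return { update: { $set }, options: { arrayFilters } };
}

const incoming = [
  { name: "Joe", value: "Hey", id: 1 },
  { name: "Adam", value: "None", id: 1 },
];
const { update, options } = buildInfosUpdate(incoming);
// Assumed usage, not executed here: Schema.updateOne({ id: 123 }, update, options)
```

This way a single request updates both the value and the id of every named element.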

finding a value in an object

I have an object:
{
id: 16,
defs: {
name: "Depot (Float)", field: "Depot"
}
}
And an array (which can have more than one object in it but for the purposes of this only has one):
[
{
"Percentage Monthly Potential": 1,
"Area Manager": "Ashar",
"Business Unit": "Retail",
"Cust no": 68345,
"Depot Name": "Leicester",
"Group Number": "",
Depot: 14,
Target: 46100
}
]
What I need to do is take the field value from the object and use it to find the key that it matches in the second object and retrieve the value of it, so in this case I should be getting 14.
Any help with this would be much appreciated.
Thanks for your time.

If you are using ES6, you can try this:
const field = lookupObject.defs.field;
const matches = array.map(arrayItem => {
return {
field,
value: arrayItem[field]
}
});
The matches array will contain the data you are interested in.
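If only the single value is needed (14 in the question's example) rather than one entry per array item, a lookup with find may be enough. This is a small sketch using the data from the question; the variable names rows, row, and depotValue are illustrative.

```javascript
const lookupObject = {
  id: 16,
  defs: { name: "Depot (Float)", field: "Depot" },
};

const rows = [
  {
    "Percentage Monthly Potential": 1,
    "Area Manager": "Ashar",
    "Business Unit": "Retail",
    "Cust no": 68345,
    "Depot Name": "Leicester",
    "Group Number": "",
    Depot: 14,
    Target: 46100,
  },
];

// Use the field name as a dynamic key and take the first row that has it.
const field = lookupObject.defs.field;
const row = rows.find((r) => field in r);
const depotValue = row ? row[field] : undefined;
console.log(depotValue); // 14
```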

RethinkDB - Updating nested array

I have a survey table that looks like so:
{
id: Id,
date: Date,
clients: [{
client_id: Id,
contacts: [{
contact_id: Id,
score: Number,
feedback: String,
email: String
}]
}]
}
I need to update the score and feedback fields under a specific contact. Currently, I am running the update like this:
function saveScore(obj){
var dfd = q.defer();
var survey = surveys.get(obj.survey_id);
survey
.pluck({ clients: 'contacts' })
.run()
.then(results => {
results.clients.forEach((item, outerIndex) => {
item.contacts.forEach((item, index, array) => {
if(Number(item.contact_id) === Number(obj.contact_id)) {
array[index].score = obj.score;
console.log(outerIndex, index);
}
});
});
return survey.update(results).run()
})
.then(results => dfd.resolve(results))
.catch(err => dfd.resolve(err));
return dfd.promise;
};
When I look at the update method, it specifies how to update nested key:value pairs. However, I can't find any examples to update an individual item in an array.
Is there a better and hopefully cleaner way to update items in a nested array?

You might need to get the array, filter out the desired value in the array and then append it again to the array. Then you can pass the updated array to the update method.
Example
Let's say you have a document with two clients that both have a name and a score and you want to update the score in one of them:
{
"clients": [
{
"name": "jacob" ,
"score": 200
} ,
{
"name": "jorge" ,
"score": 57
}
] ,
"id": "70589f08-284c-495a-b089-005812ec589f"
}
You can get that specific document, run the update command with an anonymous function and then pass the new, updated array into the clients property.
r.table('jacob').get("70589f08-284c-495a-b089-005812ec589f")
  .update(function (row) {
    return {
      // Get all the clients, except the one we want to update
      clients: row('clients').filter(function (client) {
        return client('name').ne('jorge')
      })
      // Append a new client with the updated information
      .append({ name: 'jorge', score: 57 })
    };
  });
I do think this is a bit cumbersome and there's probably a nicer, more elegant way of doing this, but this should solve your problem.
Database Schema
Maybe it's worth creating a contacts table for all your contacts and then doing some sort of join on your data. Then your contacts property in your clients array would look something like:
{
id: Id,
date: Date,
clients: [{
client_id: Id,
contact_scores: {
Id: score(Number)
},
contact_feedbacks: {
Id: feedback(String)
}
}]
}

database schema
{
"clients": [
{
"name": "jacob" ,
"score": 200
} ,
{
"name": "jorge" ,
"score": 57
}
] ,
"id": "70589f08-284c-495a-b089-005812ec589f"
}
Then you can do it like this, using a map and branch query:
r.db('users').table('participants').get('70589f08-284c-495a-b089-005812ec589f')
.update({"clients": r.row('clients').map(function(elem){
return r.branch(
elem('name').eq("jacob"),
elem.merge({ "score": 100 }),
elem)})
})
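For readers less familiar with ReQL, the map/branch pair behaves like Array.prototype.map with a conditional: matching elements are replaced by a merged copy and all others pass through unchanged, so the array order is preserved. Here is a plain JavaScript analogue of the same logic. It is an in-memory illustration, not ReQL, and updateScore is a hypothetical name.

```javascript
// Plain-JS analogue of clients.map(elem => r.branch(test, elem.merge(...), elem)).
function updateScore(clients, name, score) {
  return clients.map((elem) =>
    elem.name === name ? { ...elem, score } : elem
  );
}

const clients = [
  { name: "jacob", score: 200 },
  { name: "jorge", score: 57 },
];

console.log(JSON.stringify(updateScore(clients, "jacob", 100)));
// [{"name":"jacob","score":100},{"name":"jorge","score":57}]
```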

This works for me:
r.table(...).get(...).update({
contacts: r.row('Contacts').changeAt(0,
r.row('Contacts').nth(0).merge({feedback: "NICE"}))
})

ReQL solution
Creating a query to update a JSON array of objects in place is a rather complicated process in RethinkDB (and in most query languages). The best (and only) solution in ReQL that I know of is to use a combination of the update, offsetsOf, do, changeAt, and merge functions. This solution retains the order of objects in the array and only modifies values on objects that match in the offsetsOf calls.
The following code (or something similar) can be used to update an array of objects (i.e. clients) which contain an array of objects (i.e. contacts).
The placeholders '%_databaseName_%', '%_tableName_%', '%_documentUUID_%', '%_clientValue_%', and '%_contactValue_%' must be provided.
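r.db('%_databaseName_%').table('%_tableName_%').get('%_documentUUID_%').update(row =>
  row('clients')
    // Index of the first client whose client_id matches
    .offsetsOf(client => client('client_id').eq('%_clientValue_%'))(0)
    .do(clientIndex =>
      row('clients')(clientIndex)('contacts')
        // Index of the first contact whose contact_id matches
        .offsetsOf(contact => contact('contact_id').eq('%_contactValue_%'))(0)
        .do(contactIndex => ({
          // Replace the client in place, with the patched contact inside it
          clients: row('clients').changeAt(clientIndex,
            row('clients')(clientIndex).merge({
              contacts: row('clients')(clientIndex)('contacts').changeAt(contactIndex,
                row('clients')(clientIndex)('contacts')(contactIndex).merge({
                  'score': 0,
                  'feedback': 'xyz'
                }))
            }))
        }))
    )
)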
Why go through the trouble of forming this into ReQL?
survey
.pluck({ clients: 'contacts' }).run()
.then(results => {
results.clients.forEach((item, outerIndex) => {
item.contacts.forEach((item, index, array) => {
if(Number(item.contact_id) === Number(obj.contact_id)) {
array[index].score = obj.score;
console.log(outerIndex, index);
}
});
});
return survey.update(results).run()
})
While the code provided by Jacob (the user who asked the question here on Stack Overflow - shown above) might look simpler to write, the performance is probably not as good as the ReQL solution.
1) The ReQL solution runs on the query server (i.e. the database side), so the code is optimized during the database write (higher performance). The code above does not make full use of the query server: it makes a read request and a write request, pluck().run() and update().run(), and the data is processed on the client side (i.e. the Node.js side) after the pluck() query has run (lower performance).
2) The code above requires the query server to send all the data back to the client side (i.e. the Node.js side), so the response payload (bandwidth usage / download size) can be several megabytes. The ReQL solution is processed on the query server in a single request, so the response payload typically just confirms that the write completed; in other words, only a few bytes are sent back to the client.
ReQL is too complicated
However, ReQL (and especially SQL) seem overly complicated when working with JSON, and it seems to me that JSON should be used when working with JSON.
I've also proposed that the ReThinkDB community adopt an alternative to ReQL that uses JSON instead (https://github.com/rethinkdb/rethinkdb/issues/6736).
The solution to updating nested JSON arrays should be as simple as...
r('database.table').update({
clients: [{
client_id: 0,
contacts: [{
contact_id: 0,
score: 0,
feedback: 'xyz',
}]
}]
});

tfmontague is on the right path but I think his answer can be improved a lot. Because he uses ...(0) there's a possibility for his answer to throw errors.
zabusa also provides a ReQL solution using map and branch but doesn't show the complete nested update. I will expand on this technique.
ReQL expressions are composable so we can isolate complexity and avoid repetition. This keeps the code flat and clean.
First write a simple function mapIf
const mapIf = (rexpr, test, f) =>
rexpr.map(x => r.branch(test(x), f(x), x));
Now we can write the simplified updateClientContact function
const updateClientContact = (doc, clientId, contactId, patch) =>
  doc.merge
    ( { clients:
          mapIf
            ( doc('clients')
            , c => c('client_id').eq(clientId)
            , c =>
                // keep the client's other fields and replace only its contacts
                c.merge
                  ( { contacts:
                        mapIf
                          ( c('contacts')
                          , k => k('contact_id').eq(contactId)
                          , k => k.merge(patch)
                          )
                    }
                  )
            )
      }
    );
Use it like this
// fetch the document to update
const someDoc =
r.db(...).table(...).get(...);
// create patch for client id [1] and contact id [12]
const patch =
updateClientContact(someDoc, 1, 12, { name: 'x', feedback: 'z' });
// apply the patch
someDoc.update(patch);
Here's a concrete example you can run in reql> ...
const testDoc =
{ clients:
[ { client_id: 1
, contacts:
[ { contact_id: 11, name: 'a' }
, { contact_id: 12, name: 'b' }
, { contact_id: 13, name: 'c' }
]
}
, { client_id: 2
, contacts:
[ { contact_id: 21, name: 'd' }
, { contact_id: 22, name: 'e' }
, { contact_id: 23, name: 'f' }
]
}
, { client_id: 3
, contacts:
[ { contact_id: 31, name: 'g' }
, { contact_id: 32, name: 'h' }
, { contact_id: 33, name: 'i' }
]
}
]
};
updateClientContact(r.expr(testDoc), 2, 23, { name: 'x', feedback: 'z' });
The result will be
{ clients:
[ { client_id: 1
, contacts:
[ { contact_id: 11, name: 'a' }
, { contact_id: 12, name: 'b' }
, { contact_id: 13, name: 'c' }
]
}
, { client_id: 2
, contacts:
[ { contact_id: 21, name: 'd' }
, { contact_id: 22, name: 'e' }
, { contact_id: 23, name: 'x', feedback: 'z' } // <--
]
}
, { client_id: 3
, contacts:
[ { contact_id: 31, name: 'g' }
, { contact_id: 32, name: 'h' }
, { contact_id: 33, name: 'i' }
]
}
]
}

Better late than never
I had the same problem and was able to solve it in two ways:
With specific client_id
r.db('nameDB').table('nameTable').get('idRegister')
.update({'clients': r.row('clients')
.map(elem=>{
return r.branch(
elem('client_id').eq('your_specific_client_id'),
elem.merge({
contacts: elem('contacts').map(elem2=>
r.branch(
elem2('contact_id').eq('idContact'),
elem2.merge({
score: 99999,
feedback: 'yourString'
}),
elem2
)
)
}),
elem
)
})
})
Without specific client_id
r.db('nameDB').table('nameTable').get('idRegister')
.update({'clients': r.row('clients')
.map(elem=>
elem.merge({
contacts: elem('contacts').map(elem2=>
r.branch(
elem2('contact_id').eq('idContact'),
elem2.merge({
score: 99999,
feedback: 'yourString'
}),
elem2
)
)
})
)
})
I hope this works for you, even though the question was asked a long time ago.

MongoDB queries optimisation

I wish to retrieve several pieces of information from my User model, which looks like this:
var userSchema = new mongoose.Schema({
email: { type: String, unique: true, lowercase: true },
password: String,
created_at: Date,
updated_at: Date,
genre : { type: String, enum: ['Teacher', 'Student', 'Guest'] },
role : { type: String, enum: ['user', 'admin'], default: 'user' },
active : { type: Boolean, default: false },
profile: {
name : { type: String, default: '' },
headline : { type: String, default: '' },
description : { type: String, default: '' },
gender : { type: String, default: '' },
ethnicity : { type: String, default: '' },
age : { type: String, default: '' }
},
contacts : {
email : { type: String, default: '' },
phone : { type: String, default: '' },
website : { type: String, default: '' }
},
location : {
formattedAddress : { type: String, default: '' },
country : { type: String, default: '' },
countryCode : { type: String, default: '' },
state : { type: String, default: '' },
city : { type: String, default: '' },
postcode : { type: String, default: '' },
lat : { type: String, default: '' },
lng : { type: String, default: '' }
}
});
On the homepage I have a location filter where you can browse users by country or city.
Each entry also shows the number of users in it:
United Kingdom
All Cities (300)
London (150)
Liverpool (80)
Manchester (70)
France
All Cities (50)
Paris (30)
Lille (20)
Nederland
All Cities (10)
Amsterdam (10)
Etc...
That is for the homepage. I also have Students and Teachers pages, where I want the same information but limited to how many teachers (or students) there are in those countries and cities.
What I'm trying to do is write a MongoDB query that retrieves all of this information in a single request.
At the moment the query looks like this:
User.aggregate([
{
$group: {
_id: { city: '$location.city', country: '$location.country', genre: '$genre' },
count: { $sum: 1 }
}
},
{
$group: {
_id: '$_id.country',
count: { $sum: '$count' },
cities: {
$push: {
city: '$_id.city',
count: '$count'
}
},
genres: {
$push: {
genre: '$_id.genre',
count: '$count'
}
}
}
}
], function(err, results) {
if (err) return next();
res.json({
res: results
});
});
The problem is that I don't know how to get all the information I need.
I don't know how to get the length of the total users in every Country.
I have the users length for each Country.
I have the users length for each city.
I don't know how to get the same but for specific genre.
Is it possible to have all these information with a single query in Mongo?
Otherwise:
Creating few promises with 2, 3 different requests to Mongo like this:
getSomething
.then(getSomethingElse)
.then(getSomethingElseAgain)
.done
I'm sure it would be easier storing every time specified data but: is it good for performance when there are more than 5000 / 10000 users in the DB?
Sorry but I'm still in the process of learning and I think these things are crucial to understand MongoDB performance / optimisation.
Thanks

What you want is a "faceted search" result where you hold the statistics about the matched terms in the current result set. Subsequently, while there are products that "appear" to do all the work in a single response, you have to consider that most generic storage engines are going to need multiple operations.
With MongoDB you can use two queries to get the results themselves and another to get the facet information. This would give similar results to the faceted results available from dedicated search engine products like Solr or ElasticSearch.
But in order to do this effectively, you want to include this in your document in a way it can be used effectively. A very effective form for what you want is using an array of tokenized data:
{
"otherData": "something",
"facets": [
"country:UK",
"city:London-UK",
"genre:Student"
]
}
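Such a facets array can be derived from fields the user document already has. The sketch below is plain JavaScript and makes a few assumptions: buildFacets is a hypothetical helper, it reads genre, location.countryCode, and location.city from the schema shown in the question, and it follows the token format of the example above.

```javascript
// Hypothetical helper: derive facet tokens from an existing user document.
// Empty fields (the schema defaults to '') produce no token.
function buildFacets(user) {
  const facets = [];
  const { countryCode, city } = user.location || {};
  if (countryCode) facets.push(`country:${countryCode}`);
  if (city && countryCode) facets.push(`city:${city}-${countryCode}`);
  if (user.genre) facets.push(`genre:${user.genre}`);
  return facets;
}

const exampleUser = {
  genre: "Student",
  location: { country: "United Kingdom", countryCode: "UK", city: "London" },
};
console.log(buildFacets(exampleUser));
// [ 'country:UK', 'city:London-UK', 'genre:Student' ]
```

The array would need to be recomputed whenever genre or location changes, for example in the code path that saves the user.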
So "facets" is a single field in your document and not in multiple locations. This makes it very easy to index and query. Then you can effectively aggregate across your results and get the totals for each facet:
User.aggregate(
[
{ "$unwind": "$facets" },
{ "$group": {
"_id": "$facets",
"count": { "$sum": 1 }
}}
],
function(err,results) {
}
);
Or more ideally with some criteria in $match:
User.aggregate(
[
{ "$match": { "facets": { "$in": ["genre:Student"] } } },
{ "$unwind": "$facets" },
{ "$group": {
"_id": "$facets",
"count": { "$sum": 1 }
}}
],
function(err,results) {
}
);
Ultimately giving a response like:
{ "_id": "country:FR", "count": 50 },
{ "_id": "country:UK", "count": 300 },
{ "_id": "city:London-UK", "count": 150 },
{ "_id": "genre:Student", "count": 500 }
Such a structure is easy to traverse and inspect for things like the discrete "country" and the "city" that belongs to a "country" as that data is just separated consistently by a hyphen "-".
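As an illustration of that traversal, the flat facet counts can be regrouped on the client into the country and city tree the homepage needs. This is a sketch in plain JavaScript under the token format shown above; groupFacetCounts is a hypothetical helper name, and the last hyphen is taken as the separator so that city names containing hyphens still parse.

```javascript
// Hypothetical helper: regroup [{ _id: "kind:value", count }] into a nested structure.
function groupFacetCounts(results) {
  const countries = {};
  const genres = {};
  // Create the country entry on first use.
  const countryEntry = (code) =>
    countries[code] || (countries[code] = { total: 0, cities: {} });

  for (const { _id, count } of results) {
    const colon = _id.indexOf(":");
    const kind = _id.slice(0, colon);
    const value = _id.slice(colon + 1);
    if (kind === "country") {
      countryEntry(value).total = count;
    } else if (kind === "city") {
      const dash = value.lastIndexOf("-");
      countryEntry(value.slice(dash + 1)).cities[value.slice(0, dash)] = count;
    } else if (kind === "genre") {
      genres[value] = count;
    }
  }
  return { countries, genres };
}

const grouped = groupFacetCounts([
  { _id: "country:FR", count: 50 },
  { _id: "country:UK", count: 300 },
  { _id: "city:London-UK", count: 150 },
  { _id: "genre:Student", count: 500 },
]);
console.log(grouped.countries.UK.total); // 300
console.log(grouped.countries.UK.cities.London); // 150
```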
Trying to mash up documents within arrays is a bad idea. There is a BSON size limit of 16MB to be respected also, from which mashing together results ( especially if you are trying to keep document content ) is most certainly going to end up being exceeded in the response.
For something as simple as then getting the "overall count" of results from such a query, then just sum up the elements of a particular facet type. Or just issue your same query arguments to a .count() operation:
User.count({ "facets": { "$in": ["genre:Student"] } },function(err,count) {
});
As said here, particularly when implementing "paging" of results, then the roles of getting "Result Count", "Facet Counts" and the actual "Page of Results" are all delegated to "separate" queries to the server.
There is nothing wrong with submitting each of those queries to the server in parallel and then combining a structure to feed to your template or application looking much like the faceted search result from one of the search engine products that offers this kind of response.
Concluding
So put something in your document to mark the facets in a single place. An array of tokenized strings works well for this purpose. It also works well with query forms such as $in and $all for either "or" or "and" conditions on facet selection combinations.
Don't try and mash results or nest additions just to match some perceived hierarchical structure, but rather traverse the results received and use simple patterns in the tokens. It's very simple to
Run paged queries for the content as separate queries to either facets or overall counts. Trying to push all content in arrays and then limit out just to get counts does not make sense. The same would apply to a RDBMS solution to do the same thing, where paging result counts and the current page are separate query operations.
There is more information written on the MongoDB Blog about Faceted Search with MongoDB that also explains some other options. There are also articles on integration with external search solutions using mongoconnector or other approaches.
