CouchDB case in-sensitive and OR options like in mysql - javascript

I'm quite new with couchDB and I need a little support.
In MySQL I could run simply this query:
SELECT `name`, `id`, `desc` FROM `table`
WHERE `name`="jack" OR `cat` LIKE "%|52224|%";
And here are my two problems:
I started to create a view (still without Like option and everything):
function(doc) {
emit([doc.name, doc.cat], {
"name" : doc.name,
"desc" : doc.desc,
"id" : doc._id
});
}
1. When I use "emit([doc.name" the string must match 100% (also case sensitive).
-> How I make this option case un-sesnsitive?
That I can ask for ("Jack, jack, jAck, JAck,...) like in mysql?
2. How I create the OR option?
When I use [doc.name, doc.cat], I'm also forced to ask for both vars.
But when I have just one of them,
how I can query without creating for each option an own view?

To implement case insensitive search you just have to convert keys to lowercase:
emit([doc.name.toLowerCase(), doc.cat.toLowerCase()])
Now, if you convert your query to lowercase you'll have case insensitive matching.
The problem with this solution is that if you want both case sensitive and case insensitive search you can either emit both values:
[doc.name.toLowerCase(), doc.name, doc.cat]
and use startkey and endkey to filter results, or create a separate view.
The second question is a bit more tricky to implement.
First of all, if you need to filter by doc.name only you can send request with startkey=["jack",0] and endkey=["jack",'zzzzzz'] which will return all documents with doc.name="jack" and doc.cat between 0 and 'zzzzzz' (there probably is a better way to say "any cat", unfortunately I can't find it right now).
If you need a real OR, then you should emit two rows for each document:
emit(doc.name, doc); emit(doc.cat, doc);
This way you can POST needed keys with your request: {"keys": ["jack", "cat_name"]}
This will return every document with either "jack" OR "cat_name" key. However documents that have both will be returned twice, so you have to filter duplicates in your application code.
You can also use couchdb-lucene which will solve both of your problems and probably a lot more. It is a popular choice among couchdb users for implementing advanced queries.

Related

In MarkLogic, how do I search in JSON documents using only the key?

I have a bunch of JSON documents in my db. I need to perform delete operation on a few documents by searching the documents that have the particular field present in them {key only}. What query can I add to my code so that it finds all the documents with the field? I will be using them to get their values(integer), put them in an array and then use them one by one.
Expanding a bit on the link provided by George Bailey, you might want to use cts.uris() instead of cts.search() because xdmp.documentDelete() takes uri strings instead of documents:
const uris = cts.uris(
null,
['score-zero', 'unchecked'],
cts.jsonPropertyScopeQuery('theKey', cts.trueQuery())
);
xdmp.documentDelete(uris);
If it's a large number of documents, you might need to specify the start value and a limit on the call to cts.uris() to delete different slices of documents in multiple passes.
Hoping that helps,

How to do an 'AND' statement in Firebase or equivalent?

I need to do a query where I can show only specific data using an 'AND' statement or equivalent to it. I have taken the example which is displayed in the Firebase Documentation.
// Find all dinosaurs whose height is exactly 25 meters.
var ref = firebase.database().ref("dinosaurs");
ref.orderByChild("height").equalTo(25).on("child_added", function(snapshot) {
console.log(snapshot.key);
});
I understand this line is going to retrieve all the dinosaurs whose height is exactly 25, BUT, I need to show all dinosaurs whose height is '25' AND name is 'Dino'. Is there any way to retrieve this information?
Thanks in advance.
Actually firebase only supports filtering/ordering with one propery, but if you want to filter with more than one property like you said I want to filter with age and name, you have to use composite keys.
There is a third party library called querybase which gives you some capabilities of multy property filtering. See https://github.com/davideast/Querybase
You cannot query by multiple keys.
If you need to sort by two properties your options are:
Create a hybrid key. In reference to your example, if you wanted to get all 'Dino' and height '25' then you would create a hybrid name_age key which could look something like Dino_25. This will allow you to query and search for items with exactly the same value but you lose the ability for ordering (i.e. age less than x).
Perform one query on Firebase and the other client side. You can query by name on Firebase and then iterate through the results and keep the results that match age 25.
Without knowing much about your schema I would advise you to make sure you're flattening your data sufficiently. Often I have found that many multi-level queries can be solved by looking at how I'm storing the data. This is not always the case and sometimes you may just have to take one of the routes I have mentioned above.

Knex.js insert from select

I'm trying to generate a query like the following via Knex.js:
INSERT INTO table ("column1", "column2")
SELECT "someVal", 12345
WHERE NOT EXISTS (
SELECT 1
FROM table
WHERE "column2" = 12345
)
Basically, I want to insert values only if a particular value does not already exist. But Knex.js doesn't seem to know how to do this; if I call knex.insert() (with no values), it generates an "insert default values" query.
I tried the following:
pg.insert()
.into(tableName)
.select(_.values(data))
.whereNotExists(
pg.select(1)
.from(tableName)
.where(blah)
);
but that still just gives me the default values thing. I tried adding a .columns(Object.keys(data)) in hopes that insert() would honor that, but no luck.
Is it possible to generate the query I want with knex, or will I just have to build up a raw query, without Knex.js methods?
I believe the select needs to be passed into the insert:
pg.insert(knex.select().from("tableNameToSelectDataFrom")).into("tableToInsertInto");
Also in order to select constant values or expressions you'll need to use a knex.raw expression in the select:
knex.select(knex.raw("'someVal',12345")).from("tableName")
This is my first post and I didn't test your specific example but I've done similar things like what you're asking using the above techniques.
The most comprehensive answer I've found (with explicit column names for INSERT, and custom values in the SELECT-statement) is here:
https://github.com/knex/knex/issues/1056#issuecomment-156535234
by Chris Broome
here is a copy of that solution:
const query = knex
// this part generates this part of the
// query: INSERT "tablename" ("field1", "field2" ..)
.into(knex.raw('?? (??, ??)', ['tableOrders', 'field_user_id', 'email_field']))
// and here is the second part of the SQL with "SELECT"
.insert(function() {
this
.select(
'user_id', // select from column without alias
knex.raw('? AS ??', ['jdoe#gmail.com', 'email']), // select static value with alias
)
.from('users AS u')
.where('u.username', 'jdoe')
});
console.log(query.toString());
and the SQL-output:
insert into "orders" ("user_id", "email")
select "user_id", 'jdoe#gmail.com' AS "email"
from "users" as "u"
where "u"."username" = 'jdoe'
another one approach (by Knex developer): https://github.com/knex/knex/commit/e74f43cfe57ab27b02250948f8706d16c5d821b8#diff-cb48f4af7c014ca6a7a2008c9d280573R608 - also with knex.raw
I've managed to make it work in my project and it doesn't look all that bad!
.knex
.into(knex.raw('USER_ROLES (ORG_ID, USER_ID, ROLE_ID, ROLE_SOURCE, ROLE_COMPETENCE_ID)'))
.insert(builder => {
builder
.select([
1,
'CU.USER_ID',
'CR.ROLE_ID',
knex.raw(`'${ROLES.SOURCE_COMPETENCE}'`),
competenceId,
])
.from('COMPETENCE_USERS as CU')
.innerJoin('COMPETENCE_ROLES as CR', 'CU.COMPETENCE_ID', 'CR.COMPETENCE_ID')
.where('CU.COMPETENCE_ID', competenceId)
.where('CR.COMPETENCE_ID', competenceId);
Note that this doesn't seem to work properly with the returning clause on MSSQL (it's just being ignored by Knex) at the moment.

How do I create a "like" filter view in CouchDB

Here's an example of what I need in sql:
SELECT name FROM employ WHERE name LIKE %bro%
How do I create view like that in CouchDB?
The simple answer is that CouchDB views aren't ideal for this.
The more complicated answer is that this type of query tends to be very inefficient in typical SQL engines too, and so if you grant that there will be tradeoffs with any solution then CouchDB actually has the benefit of letting you choose your tradeoff.
1. The SQL Ways
When you do SELECT ... WHERE name LIKE %bro%, all the SQL engines I'm familiar with must do what's called a "full table scan". This means the server reads every row in the relevant table, and brute force scans the field to see if it matches.
You can do this in CouchDB 2.x with a Mango query using the $regex operator. The query would look something like this for the basic case:
{"selector":{
"name": {
"$regex": "bro"
}
}}
There do not appear to be any options exposed for case-sensitivity, etc. but you could extend it to match only at the beginning/end or more complicated patterns. If you can also restrict your query via some other (indexable) field operator, that would likely help performance. As the documentation warns:
Regular expressions do not work with indexes, so they should not be used to filter large data sets. […]
You can do a full scan in CouchDB 1.x too, using a temporary view:
POST /some_database/_temp_view
{"map": "function (doc) { if (doc.name && doc.name.indexOf('bro') !== -1) emit(null); }"}
This will look through every single document in the database and give you a list of matching documents. You can tweak the map function to also match on a document type, or to emit with a certain key for ordering — emit(doc.timestamp) — or some data value useful to your purpose — emit(null, doc.name).
2. The "tons of disk space available" way
Depending on your source data size you could create an index that emits every possible "interior string" as its permanent (on-disk) view key. That is to say for a name like "Dobros" you would emit("dobros"); emit("obros"); emit("bros"); emit("ros"); emit("os"); emit("s");. Then for a term like '%bro%' you could query your view with startkey="bro"&endkey="bro\uFFFF" to get all occurrences of the lookup term. Your index will be approximately the size of your text content squared, but if you need to do an arbitrary "find in string" faster than the full DB scan above and have the space this might work. You'd be better served by a data structure designed for substring searching though.
Which brings us too...
3. The Full Text Search way
You could use a CouchDB plugin (couchdb-lucene now via Dreyfus/Clouseau for 2.x, ElasticSearch, SQLite's FTS) to generate an auxiliary text-oriented index into your documents.
Note that most full text search indexes don't naturally support arbitrary wildcard prefixes either, likely for similar reasons of space efficiency as we saw above. Usually full text search doesn't imply "brute force binary search", but "word search". YMMV though, take a look around at the options available in your full text engine.
If you don't really need to find "bro" anywhere in a field, you can implement basic "find a word starting with X" search with regular CouchDB views by just splitting on various locale-specific word separators and omitting these "words" as your view keys. This will be more efficient than above, scaling proportionally to the amount of data indexed.
Unfortunately, doing searches using LIKE %...% aren't really how CouchDB Views work, but you can accomplish a great deal of search capability by installing couchdb-lucene, it's a fulltext search engine that creates indexes on your database that you can do more sophisticated searches with.
The typical way to "search" a database for a given key, without any 3rd party tools, is to create a view that emits the value you are looking for as the key. In your example:
function (doc) {
emit(doc.name, doc);
}
This outputs a list of all the names in your database.
Now, you would "search" based on the first letters of your key. For example, if you are searching for names that start with "bro".
/db/_design/test/_view/names?startkey="bro"&endkey="brp"
Notice I took the last letter of the search parameter, and "incremented" the last letter in it. Again, if you want to perform searches, rather than aggregating statistics, you should use a fulltext search engine like lucene. (see above)
You can use regular expressions. As per this table you can write something like this to return any id that contains "SMS".
{
"selector": {
"_id": {
"$regex": "sms"
}
}
}
Basic regex you can use on that includes
"sms$" roughly to LIKE "%sms"
"^sms" roughly to LIKE "sms%"
You can read more on regular expressions here
i found a simple view code for my problem...
{"getavailableproduct": {
"map": "function(doc) { var prefix = doc['productid'].match(/[A-Za-z0-9]+/g); if(prefix) for(var pre in prefix) { emit(prefix[pre],null); } }"
}
}
from this view code if i split a key sentence into a key word...
and i can call
?key="[search_keyword]"
but i need more complex code because if i run this code i can only find word wich i type (ex: eat, food, etc)...
but if i want to type not a complete word (ex: ea from eat, or foo from food) that code does not work..
I know it is an old question, but: What about using a "list" function? You can have all your normal views, andthen add a "list" function to the design document to process the view's results:
{
"_id": "_design/...",
"views": {
"employees": "..."
},
"lists": {
"by_name": "..."
}
}
And the function attached to "by_name" function, should be something like:
function (head, req) {
provides('json', function() {
var filtered = [];
while (row = getRow()) {
// We can retrive all row information from the view
var key = row.key;
var value = row.value;
var doc = req.query.include_docs ? row.doc : {};
if (value.name.indexOf(req.query.name) == 0) {
if (req.query.include_docs) {
filtered.push({ key: key, value: value, doc: doc});
} else {
filtered.push({ key: key, value: value});
}
}
}
return toJSON({ total_rows: filtered.length, rows: filtered });
});
}
You can, of course, use regular expressions too. It's not a perfect solution, but it works to me.
You could emit your documents like normal. emit(doc.name, null); I would throw a toLowerCase() on that name to remove case sensitivity.
and then query the view with a slew of keys to see if something "like" the query shows up.
keys = differentVersions("bro"); // returns ["bro", "br", "bo", "ro", "cro", "dro", ..., "zro"]
$.couch("db").view("employeesByName", { keys: keys, success: dealWithIt } )
Some considerations
Obviously that array can get really big really fast depending on what differentVersions returns. You might hit a post data limit at some point or conceivably get slow lookups.
The results are only as good as differentVersions is at giving you guesses for what the person meant to spell. Obviously this function can be as simple or complex as you like. In this example I tried two strategies, a) removed a letter and pushed that, and b) replaced the letter at position n with all other letters. So if someone had been looking for "bro" but typed in "gro" or "bri" or even "bgro", differentVersions would have permuted that to "bro" at some point.
While not ideal, it's still pretty fast since a look up in Couch's b-trees is fast.
why cann't we just use indexOf() in view?

What is the maximum value for a compound CouchDB key?

I'm using what seems to be a common trick for creating a join view:
// a Customer has many Orders; show them together in one view:
function(doc) {
if (doc.Type == "customer") {
emit([doc._id, 0], doc);
} else if (doc.Type == "order") {
emit([doc.customer_id, 1], doc);
}
}
I know I can use the following query to get a single customer and all related Orders:
?startkey=["some_customer_id"]&endkey=["some_customer_id", 2]
But now I've tied my query very closely to my view code. Is there a value I can put where I put my "2" to more clearly say, "I want everything tied to this Customer"? I think I've seen
?startkey=["some_customer_id"]&endkey=["some_customer_id", {}]
But I'm not sure that {} is certain to sort after everything else.
Credit to cmlenz for the join method.
Further clarification from the CouchDB wiki page on collation:
The query startkey=["foo"]&endkey=["foo",{}] will match most array keys with "foo" in the first element, such as ["foo","bar"] and ["foo",["bar","baz"]]. However it will not match ["foo",{"an":"object"}]
So {} is late in the sort order, but definitely not last.
I have two thoughts.
Use timestamps
Instead of using simple 0 and 1 for their collation behavior, use a timestamp that the record was created (assuming they are part of the records) a la [doc._id, doc.created_at]. Then you could query your view with a startkey of some sufficiently early date (epoch would probably work), and an endkey of "now", eg date +%s. That key range should always include everything, and it has the added benefit of collating by date, which is probably what you want anyways.
or, just don't worry about it
You could just index by the customer_id and nothing more. This would have the nice advantage of being able to query using just key=<customer_id>. Sure, the records won't be collated when they come back, but is that an issue for your application? Unless you are expecting tons of records back, it would likely be trivial to simply pluck the customer record out of the list once you have the data retrieved by your application.
For example in ruby:
customer_records = records.delete_if { |record| record.type == "customer" }
Anyways, the timestamps is probably the more attractive answer for your case.
Rather than trying to find the greatest possible value for the second element in your array key, I would suggest instead trying to find the least possible value greater than the first: ?startkey=["some_customer_id"]&endkey=["some_customer_id\u0000"]&inclusive_end=false.
CouchDB is mostly written in Erlang. I don't think there would be an upper limit for a string compound/composite key tuple sizes other than system resources (e.g. a key so long it used all available memory). The limits of CouchDB scalability are unknown according to the CouchDB site. I would guess that you could keep adding fields into a huge composite primary key and the only thing that would stop you is system resources or hard limits such as maximum integer sizes on the target architecture.
Since CouchDB stores everything using JSON, it is probably limited to the largest number values by the ECMAScript standard.All numbers in JavaScript are stored as a floating-point IEEE 754 double. I believe the 64-bit double can represent values from - 5e-324 to +1.7976931348623157e+308.
It seems like it would be nice to have a feature where endKey could be inclusive instead of exclusive.
This should do the trick:
?startkey=["some_customer_id"]&endkey=["some_customer_id", "\uFFFF"]
This should include anything that starts with a character less than \uFFFF (all unicode characters)
http://wiki.apache.org/couchdb/View_collation

Categories

Resources