Equal Precedence View Collation CouchDB? - javascript

According to the view collation documentation for CouchDB (http://wiki.apache.org/couchdb/View_collation), member order does matter for collation. I was wondering if there is a way to disable this attribute such that collation order does not matter? I want to be able to "search" my views such that the documents that are emitted satisfy all the key ranges for the field.
Here is some more on view collation for your reference: CouchDB sorting and filtering in the same view
Likewise, if it is possible to set CouchDB such that order does not matter for view collation, the following parameters used for the GET request should only emit docs where doc.phone_number == "ZZZZZZZ" , whereas right now it emits the documents that fall within the range of the first 3 keys and completely ignores the last key. This occurs because the last key has the least precedence in the current collation scheme.
startkey: [null,null,null,"ZZZZZZZ"],
endkey: ["\ufff0","\ufff0","\ufff0","ZZZZZZZZ"],
Sample Mapping Function
var map = function(doc) {
/*
//Keys emitted
1. name
2. address
3. age
4. phone_number
*/
emit([doc.name,doc.address,doc.num_age,doc.phone_number],doc._id)
}
Is this possible, or do I have to create multiple views to perform this? The use of multiple views seems very inefficient.
I've read that CouchDB-Lucene (How to realize complex search filters in couchdb? Should I avoid temporary views?) would be helpful for complex searching, but that doesn't seem applicable in this case.

Using multiple views is not inefficient. Quite the contrary: having four views (name, address, age and phone number) will not use significantly more time or memory than having a single view emit everything. It is the simple, straightforward, efficient way of performing "WHERE field = value" queries in CouchDB.
If you are in fact looking for "WHERE field = value AND field2 = value2" queries, then CouchDB will not help you, and you will need to use Lucene.
You need to understand that the collation merely describes how keys are ordered. Even if you could specify any arbitrary collation, you would still have to deal with the fact that CouchDB needs you to define an order for the keys, and only lets you query contiguous ranges of keys. This is not compatible with multi-dimensional range queries.
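To make the "contiguous range" point concrete, here is a minimal sketch that imitates a range query over array keys. The comparator is a simplified assumption (null before numbers before strings, strings in plain code-unit order rather than CouchDB's ICU rules), and the names rank, compareKeys, inRange and the sample key are hypothetical, made up for this illustration.

```javascript
// Simplified stand-in for CouchDB collation: null < numbers < strings.
// Real CouchDB compares strings with ICU rules; code-unit order is assumed here.
function rank(v) {
  if (v === null || v === undefined) return 0;
  if (typeof v === "number") return 1;
  return 2; // treat everything else as a string
}

function compareScalars(a, b) {
  if (rank(a) !== rank(b)) return rank(a) - rank(b);
  if (rank(a) === 0) return 0;
  if (a < b) return -1;
  if (a > b) return 1;
  return 0;
}

// Arrays are compared element by element; the first difference decides.
function compareKeys(a, b) {
  var n = Math.min(a.length, b.length);
  for (var i = 0; i < n; i++) {
    var c = compareScalars(a[i], b[i]);
    if (c !== 0) return c;
  }
  return a.length - b.length;
}

function inRange(key, startkey, endkey) {
  return compareKeys(startkey, key) <= 0 && compareKeys(key, endkey) <= 0;
}

var startkey = [null, null, null, "ZZZZZZZ"];
var endkey = ["\ufff0", "\ufff0", "\ufff0", "ZZZZZZZZ"];

// Hypothetical emitted key whose phone number does not match the last element.
var otherPhone = ["Alice", "1 Main St", 30, "5551234"];
var insideRange = inRange(otherPhone, startkey, endkey);
console.log(insideRange);
```

The key lands inside the range because the comparison is already decided at the first element, so the phone number is never looked at. Any other ordering of the elements would have the same effect for whichever element comes first.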

Related

How to do an 'AND' statement in Firebase or equivalent?

I need to do a query where I can show only specific data using an 'AND' statement or equivalent to it. I have taken the example which is displayed in the Firebase Documentation.
// Find all dinosaurs whose height is exactly 25 meters.
var ref = firebase.database().ref("dinosaurs");
ref.orderByChild("height").equalTo(25).on("child_added", function(snapshot) {
console.log(snapshot.key);
});
I understand this line is going to retrieve all the dinosaurs whose height is exactly 25, BUT, I need to show all dinosaurs whose height is '25' AND name is 'Dino'. Is there any way to retrieve this information?
Thanks in advance.
Actually, Firebase only supports filtering/ordering by one property. If you want to filter by more than one property, such as age and name, you have to use composite keys.
There is a third-party library called Querybase that gives you some multi-property filtering capabilities. See https://github.com/davideast/Querybase
You cannot query by multiple keys.
If you need to sort by two properties your options are:
Create a hybrid key. In reference to your example, if you wanted to get all 'Dino' and height '25' then you would create a hybrid name_age key which could look something like Dino_25. This will allow you to query and search for items with exactly the same value but you lose the ability for ordering (i.e. age less than x).
Perform one query on Firebase and the other client side. You can query by name on Firebase and then iterate through the results and keep the results that match age 25.
Without knowing much about your schema I would advise you to make sure you're flattening your data sufficiently. Often I have found that many multi-level queries can be solved by looking at how I'm storing the data. This is not always the case and sometimes you may just have to take one of the routes I have mentioned above.
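The two options above can be sketched roughly as follows. This runs against plain in-memory objects rather than Firebase, and the names compositeKey, filterByHeight, the "name_height" child and the sample records are all hypothetical, chosen for illustration only.

```javascript
// Option 1: build a composite value to store next to each record at write time.
function compositeKey(name, height) {
  return name + "_" + height;
}

// Option 2: query one property on the server, filter the other on the client.
function filterByHeight(records, height) {
  return Object.keys(records).filter(function (key) {
    return records[key].height === height;
  });
}

// Hypothetical data, shaped like the result of a query by name.
var dinosNamedDino = {
  a1: { name: "Dino", height: 25 },
  b2: { name: "Dino", height: 4 }
};

var key = compositeKey("Dino", 25);
var matches = filterByHeight(dinosNamedDino, 25);
console.log(key, matches);

// With the composite value stored under a (hypothetical) "name_height" child,
// the query from the question would look something like:
// ref.orderByChild("name_height").equalTo(key).on("child_added", ...);
```

The composite value has to be written and kept up to date whenever name or height changes, which is the main cost of the first option.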

Parse - How do I query a Class and include another that points to it?

I have two classes - _User and Car. A _User will have a low/limited number of Cars that they own. Each Car has only ONE owner and thus an "owner" column that is a pointer to the _User. When I go to the user's page, I want to see their _User info and all of their Cars. I would like to make one call, in Cloud Code if necessary.
Here is where I get confused. There are 3 ways I could do this -
In _User have a relationship column called "cars" that points to each individual Car. If so, how come I can't use the "include(cars)" function on a relation to include the Cars' data in my query?!!
_User.cars = relationship, Car.owner = _User(pointer)
Query the _User, and then query all Cars with (owner == _User.objectId) separately. This is two queries though.
_User.cars = null, Car.owner = _User(pointer)
In _User have an array of pointers column called "cars". Manually inject pointers to cars upon car creation. When querying the user I would use "include(cars)".
_User.cars = [Car(pointer)], Car.owner = _User(pointer)
What is your recommended way to do this and why? Which one is the fastest? The documentation just leaves me further confused.
I recommend the 3rd option, and yes, you can ask to include an array. You don't even need to "manually inject" the pointers; you just add the objects to the array and they'll automatically be converted into pointers.
You've got the right ideas. Just to clarify them a bit:
A relation. User can have a relation column called cars. To get from user to car, there's a user query and then second query like user.relation("cars").query, on which you would .find().
What you might call a belongs_to pointer in Car. To get from user to car you'd have a query to get your user and you create a carQuery like carQuery.equalTo("user", user)
An array of pointers. For small-sized collections, this is superior to the relation, because you can aggressively load cars when querying user by saying include("cars") on a user query. Not sure if there's a second query under the covers - probably not if parse (mongo) is storing these as embedded.
But I wouldn't get too tied up over one or two queries. Using the promise forms of find() will keep your code nice and tidy. There probably is a small speed advantage to the array technique, which is good while the collection size is small (<100 is my rule of thumb).
It's easy to google (or I'll add here if you have a specific question) code examples for maintaining the relations and for getting from user->car or from car->user for each approach.

javascript where clause builder, including functions

I am looking for a visual javascript WHERE clause builder that also supports scalar functions. E.g.,
...
WHERE COL_A = '1234'
AND IS_HOLIDAY(DATECOL, LOCALITY)
AND (BASE_MEASURE(UOM, QTY) > 1234 OR AMOUNT > 1234)
...
The ones I have seen lack support for functions. (Like in MS-Excel.) They also tend to pull the table schema directly from database which is not suitable for me, esp. as I have just one set to search and have no access to database.
What I have: a list of searchable column names and their data types, several sets of functions (simple mathematical/date/string and many homebrewn ones encapsulating business rules) and their arguments / return types. Null support is not important, nor are function calls within function calls. What I need is a javascript library that I can encode for metadata and let loose in a DOM element; it manages the user interactions, returning a WHERE clause string.
I will appreciate if you could please point me to either some good starting code or to an outline on how to add function support, or both.
Thanks.

How do I create a "like" filter view in CouchDB

Here's an example of what I need in sql:
SELECT name FROM employ WHERE name LIKE %bro%
How do I create view like that in CouchDB?
The simple answer is that CouchDB views aren't ideal for this.
The more complicated answer is that this type of query tends to be very inefficient in typical SQL engines too, and so if you grant that there will be tradeoffs with any solution then CouchDB actually has the benefit of letting you choose your tradeoff.
1. The SQL Ways
When you do SELECT ... WHERE name LIKE %bro%, all the SQL engines I'm familiar with must do what's called a "full table scan". This means the server reads every row in the relevant table, and brute force scans the field to see if it matches.
You can do this in CouchDB 2.x with a Mango query using the $regex operator. The query would look something like this for the basic case:
{"selector":{
"name": {
"$regex": "bro"
}
}}
There do not appear to be any options exposed for case-sensitivity, etc. but you could extend it to match only at the beginning/end or more complicated patterns. If you can also restrict your query via some other (indexable) field operator, that would likely help performance. As the documentation warns:
Regular expressions do not work with indexes, so they should not be used to filter large data sets. […]
You can do a full scan in CouchDB 1.x too, using a temporary view:
POST /some_database/_temp_view
{"map": "function (doc) { if (doc.name && doc.name.indexOf('bro') !== -1) emit(null); }"}
This will look through every single document in the database and give you a list of matching documents. You can tweak the map function to also match on a document type, or to emit with a certain key for ordering — emit(doc.timestamp) — or some data value useful to your purpose — emit(null, doc.name).
2. The "tons of disk space available" way
Depending on your source data size you could create an index that emits every possible "interior string" as its permanent (on-disk) view key. That is to say for a name like "Dobros" you would emit("dobros"); emit("obros"); emit("bros"); emit("ros"); emit("os"); emit("s");. Then for a term like '%bro%' you could query your view with startkey="bro"&endkey="bro\uFFFF" to get all occurrences of the lookup term. Your index will be approximately the size of your text content squared, but if you need to do an arbitrary "find in string" faster than the full DB scan above and have the space this might work. You'd be better served by a data structure designed for substring searching though.
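A rough sketch of such a map function is below. In a real design document the function takes only doc and emit is provided by CouchDB; passing emit as a second argument here is an assumption made so the sketch can run on its own. The name mapSuffixes and the lowercasing step are also additions for illustration.

```javascript
// Emits every suffix of the lowercased name as a view key.
// In CouchDB the signature would be function (doc) with a global emit.
function mapSuffixes(doc, emit) {
  if (typeof doc.name !== "string") return;
  var name = doc.name.toLowerCase();
  for (var i = 0; i < name.length; i++) {
    emit(name.slice(i), null);
  }
}

// Local stand-in for CouchDB's emit so the sketch runs outside the server.
var emitted = [];
mapSuffixes({ name: "Dobros" }, function (key) { emitted.push(key); });
console.log(emitted);
```

For a name of length n this emits n keys with a combined length of n(n+1)/2 characters, which is where the "text content squared" estimate comes from.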
Which brings us to...
3. The Full Text Search way
You could use a CouchDB plugin (couchdb-lucene now via Dreyfus/Clouseau for 2.x, ElasticSearch, SQLite's FTS) to generate an auxiliary text-oriented index into your documents.
Note that most full text search indexes don't naturally support arbitrary wildcard prefixes either, likely for similar reasons of space efficiency as we saw above. Usually full text search doesn't imply "brute force binary search", but "word search". YMMV though, take a look around at the options available in your full text engine.
If you don't really need to find "bro" anywhere in a field, you can implement basic "find a word starting with X" search with regular CouchDB views by just splitting on various locale-specific word separators and emitting these "words" as your view keys. This will be more efficient than above, scaling proportionally to the amount of data indexed.
Unfortunately, searches using LIKE %...% aren't really how CouchDB views work, but you can get a great deal of search capability by installing couchdb-lucene. It is a fulltext search engine that creates indexes on your database, which let you do more sophisticated searches.
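As a minimal sketch of the word-splitting idea, assuming ASCII-only separators (real data may need locale-aware splitting) and with emit passed in as an argument so the code runs outside CouchDB; the name mapWords and the sample document are hypothetical:

```javascript
// Emits each word of the name as its own view key.
// In CouchDB the signature would be function (doc) with a global emit.
function mapWords(doc, emit) {
  if (typeof doc.name !== "string") return;
  // Splitting on anything that is not a-z or 0-9 is an assumption.
  var words = doc.name.toLowerCase().split(/[^a-z0-9]+/);
  words.forEach(function (word) {
    if (word) emit(word, null);
  });
}

// Local stand-in for CouchDB's emit.
var wordKeys = [];
mapWords({ name: "Mary-Ann Brooks" }, function (key) { wordKeys.push(key); });
console.log(wordKeys);
```

A query such as startkey="bro"&endkey="bro\uFFFF" against a view built this way would then pick up the "brooks" key, following the same range pattern used for the suffix index above.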
The typical way to "search" a database for a given key, without any 3rd party tools, is to create a view that emits the value you are looking for as the key. In your example:
function (doc) {
emit(doc.name, doc);
}
This outputs a list of all the names in your database.
Now, you would "search" based on the first letters of your key. For example, if you are searching for names that start with "bro":
/db/_design/test/_view/names?startkey="bro"&endkey="brp"
Notice that the endkey is the search term with its last letter "incremented" ("o" becomes "p"). Again, if you want to perform searches, rather than aggregating statistics, you should use a fulltext search engine like Lucene (see above).
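The "increment the last letter" step can be sketched as a small helper. The name prefixRange is hypothetical, the prefix is assumed to be non-empty, and bumping by one UTF-16 code unit is an assumption that only approximates CouchDB's ICU-based string ordering.

```javascript
// Returns the startkey/endkey pair for "names starting with prefix",
// by bumping the last character of the prefix by one code unit.
function prefixRange(prefix) {
  var last = prefix.charCodeAt(prefix.length - 1);
  var endkey = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  return { startkey: prefix, endkey: endkey };
}

var range = prefixRange("bro");
// Keys are JSON-encoded and then URL-encoded for the query string.
var query = "?startkey=" + encodeURIComponent(JSON.stringify(range.startkey)) +
            "&endkey=" + encodeURIComponent(JSON.stringify(range.endkey));
console.log(query);
```

Because endkey is inclusive by default, a name that is exactly "brp" would also be returned; adding inclusive_end=false to the query avoids that.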
You can use regular expressions. As per this table you can write something like this to return any id that contains "sms".
{
"selector": {
"_id": {
"$regex": "sms"
}
}
}
Basic regexes you can use here include
"sms$", roughly equivalent to LIKE "%sms"
"^sms", roughly equivalent to LIKE "sms%"
You can read more on regular expressions here
I found some simple view code for my problem...
{"getavailableproduct": {
"map": "function(doc) { var prefix = doc['productid'].match(/[A-Za-z0-9]+/g); if(prefix) for(var pre in prefix) { emit(prefix[pre],null); } }"
}
}
This view splits the product id into keywords and emits each one as a key,
so I can call
?key="[search_keyword]"
but I need more complex code, because with this view I can only find whole words that I type (ex: eat, food, etc)...
If I type an incomplete word (ex: ea from eat, or foo from food), that code does not work.
I know it is an old question, but: What about using a "list" function? You can have all your normal views, and then add a "list" function to the design document to process the view's results:
{
"_id": "_design/...",
"views": {
"employees": "..."
},
"lists": {
"by_name": "..."
}
}
The function attached to "by_name" should be something like:
function (head, req) {
provides('json', function() {
var filtered = [];
var row; // declared explicitly to avoid an implicit global
while ((row = getRow())) {
// We can retrieve all row information from the view
var key = row.key;
var value = row.value;
var doc = req.query.include_docs ? row.doc : {};
if (value.name.indexOf(req.query.name) == 0) {
if (req.query.include_docs) {
filtered.push({ key: key, value: value, doc: doc});
} else {
filtered.push({ key: key, value: value});
}
}
}
return toJSON({ total_rows: filtered.length, rows: filtered });
});
}
You can, of course, use regular expressions too. It's not a perfect solution, but it works for me.
You could emit your documents like normal. emit(doc.name, null); I would throw a toLowerCase() on that name to remove case sensitivity.
and then query the view with a slew of keys to see if something "like" the query shows up.
keys = differentVersions("bro"); // returns ["bro", "br", "bo", "ro", "cro", "dro", ..., "zro"]
$.couch("db").view("employeesByName", { keys: keys, success: dealWithIt } )
Some considerations
Obviously that array can get really big really fast depending on what differentVersions returns. You might hit a post data limit at some point or conceivably get slow lookups.
The results are only as good as differentVersions is at giving you guesses for what the person meant to spell. Obviously this function can be as simple or complex as you like. In this example I tried two strategies, a) removed a letter and pushed that, and b) replaced the letter at position n with all other letters. So if someone had been looking for "bro" but typed in "gro" or "bri" or even "bgro", differentVersions would have permuted that to "bro" at some point.
While not ideal, it's still pretty fast since a look up in Couch's b-trees is fast.
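A hypothetical implementation of differentVersions, following the two strategies described above (drop one letter, or replace one letter with each other letter), might look like this. The lowercase a-z alphabet and the de-duplication step are assumptions; the original function was not shown.

```javascript
// Hypothetical sketch of differentVersions: returns the term itself plus
// every variant with one letter removed or one letter replaced.
function differentVersions(term) {
  var alphabet = "abcdefghijklmnopqrstuvwxyz";
  var result = [];
  function add(candidate) {
    // Skip empty strings and duplicates.
    if (candidate && result.indexOf(candidate) === -1) {
      result.push(candidate);
    }
  }
  add(term);
  for (var i = 0; i < term.length; i++) {
    // a) remove the letter at position i
    add(term.slice(0, i) + term.slice(i + 1));
    // b) replace the letter at position i with every letter of the alphabet
    for (var j = 0; j < alphabet.length; j++) {
      add(term.slice(0, i) + alphabet[j] + term.slice(i + 1));
    }
  }
  return result;
}

var keys = differentVersions("bro");
console.log(keys.length);
```

Even for a three-letter term this produces dozens of keys, which illustrates how quickly the keys array grows with the length of the search term.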
Why can't we just use indexOf() in a view?

What is the maximum value for a compound CouchDB key?

I'm using what seems to be a common trick for creating a join view:
// a Customer has many Orders; show them together in one view:
function(doc) {
if (doc.Type == "customer") {
emit([doc._id, 0], doc);
} else if (doc.Type == "order") {
emit([doc.customer_id, 1], doc);
}
}
I know I can use the following query to get a single customer and all related Orders:
?startkey=["some_customer_id"]&endkey=["some_customer_id", 2]
But now I've tied my query very closely to my view code. Is there a value I can put where I put my "2" to more clearly say, "I want everything tied to this Customer"? I think I've seen
?startkey=["some_customer_id"]&endkey=["some_customer_id", {}]
But I'm not sure that {} is certain to sort after everything else.
Credit to cmlenz for the join method.
Further clarification from the CouchDB wiki page on collation:
The query startkey=["foo"]&endkey=["foo",{}] will match most array keys with "foo" in the first element, such as ["foo","bar"] and ["foo",["bar","baz"]]. However it will not match ["foo",{"an":"object"}]
So {} is late in the sort order, but definitely not last.
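The behaviour in that quote can be reproduced with a simplified model of the documented type order (null, false, true, numbers, strings, arrays, objects). This is only a sketch: strings are compared by code unit rather than with ICU, objects are compared as flattened key/value lists in insertion order, and the function names are hypothetical.

```javascript
// Simplified model of CouchDB's documented type order:
// null < false < true < numbers < strings < arrays < objects.
function typeRank(v) {
  if (v === null) return 0;
  if (v === false) return 1;
  if (v === true) return 2;
  if (typeof v === "number") return 3;
  if (typeof v === "string") return 4;
  if (Array.isArray(v)) return 5;
  return 6; // plain object
}

function collate(a, b) {
  var ra = typeRank(a), rb = typeRank(b);
  if (ra !== rb) return ra - rb;
  if (ra === 3) return a - b;
  if (ra === 4) return a < b ? -1 : a > b ? 1 : 0; // code-unit order, not ICU
  if (ra === 5) return collateSequence(a, b);
  if (ra === 6) return collateSequence(flatten(a), flatten(b));
  return 0;
}

// Turns {k1: v1, k2: v2} into [k1, v1, k2, v2] in insertion order.
function flatten(obj) {
  var out = [];
  Object.keys(obj).forEach(function (k) { out.push(k, obj[k]); });
  return out;
}

// Element-by-element comparison; a shorter sequence sorts first on a tie.
function collateSequence(a, b) {
  for (var i = 0; i < Math.min(a.length, b.length); i++) {
    var c = collate(a[i], b[i]);
    if (c !== 0) return c;
  }
  return a.length - b.length;
}

var endkey = ["foo", {}];
var results = {
  stringChild: collate(["foo", "bar"], endkey) <= 0,
  arrayChild: collate(["foo", ["bar", "baz"]], endkey) <= 0,
  objectChild: collate(["foo", { an: "object" }], endkey) <= 0
};
console.log(results);
```

Under this model the string and array keys fall at or before the endkey, while the key containing a non-empty object sorts after it and is therefore excluded.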
I have two thoughts.
Use timestamps
Instead of using simple 0 and 1 for their collation behavior, use a timestamp that the record was created (assuming they are part of the records) a la [doc._id, doc.created_at]. Then you could query your view with a startkey of some sufficiently early date (epoch would probably work), and an endkey of "now", eg date +%s. That key range should always include everything, and it has the added benefit of collating by date, which is probably what you want anyways.
or, just don't worry about it
You could just index by the customer_id and nothing more. This would have the nice advantage of being able to query using just key=<customer_id>. Sure, the records won't be collated when they come back, but is that an issue for your application? Unless you are expecting tons of records back, it would likely be trivial to simply pluck the customer record out of the list once you have the data retrieved by your application.
For example in ruby:
customer_records, order_records = records.partition { |record| record.type == "customer" }
Anyways, the timestamps is probably the more attractive answer for your case.
Rather than trying to find the greatest possible value for the second element in your array key, I would suggest instead trying to find the least possible value greater than the first: ?startkey=["some_customer_id"]&endkey=["some_customer_id\u0000"]&inclusive_end=false.
CouchDB is mostly written in Erlang. I don't think there would be an upper limit for a string compound/composite key tuple sizes other than system resources (e.g. a key so long it used all available memory). The limits of CouchDB scalability are unknown according to the CouchDB site. I would guess that you could keep adding fields into a huge composite primary key and the only thing that would stop you is system resources or hard limits such as maximum integer sizes on the target architecture.
Since CouchDB stores everything using JSON, numbers are probably limited by what the ECMAScript standard allows. All numbers in JavaScript are stored as IEEE 754 double-precision floats. I believe a 64-bit double can represent magnitudes from about 5e-324 (the smallest positive value) up to 1.7976931348623157e+308, with either sign.
It seems like it would be nice to have a feature where endKey could be inclusive instead of exclusive.
This should do the trick:
?startkey=["some_customer_id"]&endkey=["some_customer_id", "\uFFFF"]
This should include anything that starts with a character less than \uFFFF (all unicode characters)
http://wiki.apache.org/couchdb/View_collation
