I am looking for a visual javascript WHERE clause builder that also supports scalar functions. E.g.,
...
WHERE COL_A = '1234'
AND IS_HOLIDAY(DATECOL, LOCALITY)
AND (BASE_MEASURE(UOM, QTY) > 1234 OR AMOUNT > 1234)
...
The ones I have seen lack support for functions. (Like in MS-Excel.) They also tend to pull the table schema directly from database which is not suitable for me, esp. as I have just one set to search and have no access to database.
What I have: a list of searchable column names and their data types, several sets of functions (simple mathematical/date/string and many homebrewn ones encapsulating business rules) and their arguments / return types. Null support is not important, nor are function call within function calls. What I need is a javascript library that I can encode for metadata and let loose in a DOM element; it manages the user interactions, returning a WHERE clause string.
I will appreciate if you could please point me to either some good starting code or to an outline on how to add function support, or both.
Thanks.
Related
I need to do a query where I can show only specific data using an 'AND' statement or equivalent to it. I have taken the example which is displayed in the Firebase Documentation.
// Find all dinosaurs whose height is exactly 25 meters.
var ref = firebase.database().ref("dinosaurs");
ref.orderByChild("height").equalTo(25).on("child_added", function(snapshot) {
console.log(snapshot.key);
});
I understand this line is going to retrieve all the dinosaurs whose height is exactly 25, BUT, I need to show all dinosaurs whose height is '25' AND name is 'Dino'. Is there any way to retrieve this information?
Thanks in advance.
Actually firebase only supports filtering/ordering with one propery, but if you want to filter with more than one property like you said I want to filter with age and name, you have to use composite keys.
There is a third party library called querybase which gives you some capabilities of multy property filtering. See https://github.com/davideast/Querybase
You cannot query by multiple keys.
If you need to sort by two properties your options are:
Create a hybrid key. In reference to your example, if you wanted to get all 'Dino' and height '25' then you would create a hybrid name_age key which could look something like Dino_25. This will allow you to query and search for items with exactly the same value but you lose the ability for ordering (i.e. age less than x).
Perform one query on Firebase and the other client side. You can query by name on Firebase and then iterate through the results and keep the results that match age 25.
Without knowing much about your schema I would advise you to make sure you're flattening your data sufficiently. Often I have found that many multi-level queries can be solved by looking at how I'm storing the data. This is not always the case and sometimes you may just have to take one of the routes I have mentioned above.
I have a firebase model where each object looks like this:
done: boolean
|
tags: array
|
text: string
Each object's tag array can contain any number of strings.
How do I obtain all objects with a matching tag? For example, find all objects where the tag contains "email".
Many of the more common search scenarios, such as searching by attribute (as your tag array would contain) will be baked into Firebase as the API continues to expand.
In the mean time, it's certainly possible to grow your own. One approach, based on your question, would be to simply "index" the list of tags with a list of records that match:
/tags/$tag/record_ids...
Then to search for records containing a given tag, you just do a quick query against the tags list:
new Firebase('URL/tags/'+tagName).once('value', function(snap) {
var listOfRecordIds = snap.val();
});
This is a pretty common NoSQL mantra--put more effort into the initial write to make reads easy later. It's also a common denormalization approach (and one most SQL database use internally, on a much more sophisticated level).
Also see the post Frank mentioned as that will help you expand into more advanced search topics.
I am building a query engine for a database which is pulling data from SQL and other sources. For normal use cases the users can use a web form where the use can specify filtering parameters with select and ranged inputs. But for advanced use cases, I'd like to to specify a filtering equation box where the users could type
AND, OR
Nested parenthesis
variable names
, <, =, != operators
So the filtering equation could look something like:
((age > 50) or (weight > 100)) and diabetes='yes'
Then this input would be parsed, input errors detected (non-existing variable name, etc) and SQL Alchemy queries built based on it.
I saw an earlier post about the similar problem https://stackoverflow.com/a/1395854/315168
There seem to exist several language and mini-language parsers for Python http://navarra.ca/?p=538
However, does there exist any package which would be out of the box solution or near solution for my problem? If not what would be the simplest way to construct such query parser and constructor in Python?
Have a look at https://github.com/dfilatov/jspath
It's similar to xpath, so the syntax isn't as familiar as SQL, but it's powerful over hierarchical data.
I don't know if this is still relevant to you, but here is my answer:
Firstly I have created a class that does exactly what you need. You may find it here:
https://github.com/snow884/filter_expression_parser/
It takes a list of dictionaries as an input + filter query and returns the filtered results. You just have to define the list of fields that are allowed plus functions for checking the format of the constants passed as a part of filter expression.
The filter expression it ingests has to have the following format:
(time > 45.34) OR (((user_id eq 1) OR (date gt '2019-01-04')) AND (username ne 'john.doe'))
or just
username ne 'john123'
Secondly it was foolish of me to even create this code because dataframe.query(...) from pandas already does almost exactly what you need: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.query.html
According to the view collation documentation for CouchDB(
http://wiki.apache.org/couchdb/View_collation), member order does matter for collation. I was wondering if there is a way to disable this attribute such that collation order does not matter? I want to be able to "search" my views such that the documents that are emitted satisfy all the key ranges for the field.
here is some more on view collation for your reference: CouchDB sorting and filtering in the same view
Likewise, if it is possible to set CouchDB such that order does not matter for view collation, the following parameters used for the GET request should only emit docs where doc.phone_number == "ZZZZZZZ" , whereas right now it emits the documents that fall within the range of the first 3 keys and completely ignores the last key. This occurs because the last key has the least precedence in the current collation scheme.
startkey: [null,null,null,"ZZZZZZZ"],
endkey: ["\ufff0","\ufff0","\ufff0","ZZZZZZZZ"],
Sample Mapping Function
var map = function(doc) {
/*
//Keys emitted
1. name
2. address
3. age
3. phone_number
*/
emit([doc.name,doc.address,doc.num_age,doc.phone_number],doc._id)
}
Is this possible, or do I have to create multiple views to perform this? The use of multiple views seems very inefficent.
I've read that CouchDB-Lucene:( How to realize complex search filters in couchdb? Should I avoid temporary views? )would be helpful for complex searching, but that doesn't seem applicable in this case.
Use of multiple views is not inefficient, quite to the contrary : having four views (name, address, age and phone number) will not use significantly more time or memory than having a single view emit everything. It is the simple, straightforward, efficient way of performing "WHERE field = value" queries in CouchDB.
If you are in fact looking for "WHERE field = value AND field2 = value2" queries, then CouchDB will not help you, and you will need to use Lucene.
You need to understand that the collation merely describes how keys are ordered. Even if you could specify any arbitrary collation, you will still have to deal with the fact that CouchDB need you to define an order for the keys, and only lets you query contiguous ranges of keys. This is not compatible with multi-dimensional range queries.
Here's an example of what I need in sql:
SELECT name FROM employ WHERE name LIKE %bro%
How do I create view like that in CouchDB?
The simple answer is that CouchDB views aren't ideal for this.
The more complicated answer is that this type of query tends to be very inefficient in typical SQL engines too, and so if you grant that there will be tradeoffs with any solution then CouchDB actually has the benefit of letting you choose your tradeoff.
1. The SQL Ways
When you do SELECT ... WHERE name LIKE %bro%, all the SQL engines I'm familiar with must do what's called a "full table scan". This means the server reads every row in the relevant table, and brute force scans the field to see if it matches.
You can do this in CouchDB 2.x with a Mango query using the $regex operator. The query would look something like this for the basic case:
{"selector":{
"name": {
"$regex": "bro"
}
}}
There do not appear to be any options exposed for case-sensitivity, etc. but you could extend it to match only at the beginning/end or more complicated patterns. If you can also restrict your query via some other (indexable) field operator, that would likely help performance. As the documentation warns:
Regular expressions do not work with indexes, so they should not be used to filter large data sets. […]
You can do a full scan in CouchDB 1.x too, using a temporary view:
POST /some_database/_temp_view
{"map": "function (doc) { if (doc.name && doc.name.indexOf('bro') !== -1) emit(null); }"}
This will look through every single document in the database and give you a list of matching documents. You can tweak the map function to also match on a document type, or to emit with a certain key for ordering — emit(doc.timestamp) — or some data value useful to your purpose — emit(null, doc.name).
2. The "tons of disk space available" way
Depending on your source data size you could create an index that emits every possible "interior string" as its permanent (on-disk) view key. That is to say for a name like "Dobros" you would emit("dobros"); emit("obros"); emit("bros"); emit("ros"); emit("os"); emit("s");. Then for a term like '%bro%' you could query your view with startkey="bro"&endkey="bro\uFFFF" to get all occurrences of the lookup term. Your index will be approximately the size of your text content squared, but if you need to do an arbitrary "find in string" faster than the full DB scan above and have the space this might work. You'd be better served by a data structure designed for substring searching though.
Which brings us too...
3. The Full Text Search way
You could use a CouchDB plugin (couchdb-lucene now via Dreyfus/Clouseau for 2.x, ElasticSearch, SQLite's FTS) to generate an auxiliary text-oriented index into your documents.
Note that most full text search indexes don't naturally support arbitrary wildcard prefixes either, likely for similar reasons of space efficiency as we saw above. Usually full text search doesn't imply "brute force binary search", but "word search". YMMV though, take a look around at the options available in your full text engine.
If you don't really need to find "bro" anywhere in a field, you can implement basic "find a word starting with X" search with regular CouchDB views by just splitting on various locale-specific word separators and omitting these "words" as your view keys. This will be more efficient than above, scaling proportionally to the amount of data indexed.
Unfortunately, doing searches using LIKE %...% aren't really how CouchDB Views work, but you can accomplish a great deal of search capability by installing couchdb-lucene, it's a fulltext search engine that creates indexes on your database that you can do more sophisticated searches with.
The typical way to "search" a database for a given key, without any 3rd party tools, is to create a view that emits the value you are looking for as the key. In your example:
function (doc) {
emit(doc.name, doc);
}
This outputs a list of all the names in your database.
Now, you would "search" based on the first letters of your key. For example, if you are searching for names that start with "bro".
/db/_design/test/_view/names?startkey="bro"&endkey="brp"
Notice I took the last letter of the search parameter, and "incremented" the last letter in it. Again, if you want to perform searches, rather than aggregating statistics, you should use a fulltext search engine like lucene. (see above)
You can use regular expressions. As per this table you can write something like this to return any id that contains "SMS".
{
"selector": {
"_id": {
"$regex": "sms"
}
}
}
Basic regex you can use on that includes
"sms$" roughly to LIKE "%sms"
"^sms" roughly to LIKE "sms%"
You can read more on regular expressions here
i found a simple view code for my problem...
{"getavailableproduct": {
"map": "function(doc) { var prefix = doc['productid'].match(/[A-Za-z0-9]+/g); if(prefix) for(var pre in prefix) { emit(prefix[pre],null); } }"
}
}
from this view code if i split a key sentence into a key word...
and i can call
?key="[search_keyword]"
but i need more complex code because if i run this code i can only find word wich i type (ex: eat, food, etc)...
but if i want to type not a complete word (ex: ea from eat, or foo from food) that code does not work..
I know it is an old question, but: What about using a "list" function? You can have all your normal views, andthen add a "list" function to the design document to process the view's results:
{
"_id": "_design/...",
"views": {
"employees": "..."
},
"lists": {
"by_name": "..."
}
}
And the function attached to "by_name" function, should be something like:
function (head, req) {
provides('json', function() {
var filtered = [];
while (row = getRow()) {
// We can retrive all row information from the view
var key = row.key;
var value = row.value;
var doc = req.query.include_docs ? row.doc : {};
if (value.name.indexOf(req.query.name) == 0) {
if (req.query.include_docs) {
filtered.push({ key: key, value: value, doc: doc});
} else {
filtered.push({ key: key, value: value});
}
}
}
return toJSON({ total_rows: filtered.length, rows: filtered });
});
}
You can, of course, use regular expressions too. It's not a perfect solution, but it works to me.
You could emit your documents like normal. emit(doc.name, null); I would throw a toLowerCase() on that name to remove case sensitivity.
and then query the view with a slew of keys to see if something "like" the query shows up.
keys = differentVersions("bro"); // returns ["bro", "br", "bo", "ro", "cro", "dro", ..., "zro"]
$.couch("db").view("employeesByName", { keys: keys, success: dealWithIt } )
Some considerations
Obviously that array can get really big really fast depending on what differentVersions returns. You might hit a post data limit at some point or conceivably get slow lookups.
The results are only as good as differentVersions is at giving you guesses for what the person meant to spell. Obviously this function can be as simple or complex as you like. In this example I tried two strategies, a) removed a letter and pushed that, and b) replaced the letter at position n with all other letters. So if someone had been looking for "bro" but typed in "gro" or "bri" or even "bgro", differentVersions would have permuted that to "bro" at some point.
While not ideal, it's still pretty fast since a look up in Couch's b-trees is fast.
why cann't we just use indexOf() in view?