Date Formatting with MarkLogic Optic API - javascript

My question is about getting the MarkLogic Query Console JavaScript API to format a column of strings as dates.
Working on a string directly works as expected:
var d = new Date("3/12/2019");
xdmp.monthNameFromDate(xs.date(d))
>>> March
Working with the optic api however:
const op = require('/MarkLogic/optic');
const ind = op.fromView('schema', 'money');
//get non null dates, stored as strings, [MM-DD-YYYY]
const ind2 = ind.where(op.ne(op.col('completed date'), ""))
const testMonth = op.as('testDate', fn.formatDate(xs.date(op.col('completed date')), "[M01]-[D01]-[Y0001]"))
Returns the following error:
[javascript] XDMP-CAST: function bound ()() -- Invalid cast: {_expr:"¿\"completed date\"", _preplans:null, _schemaName:null, ...} cast as xs.date
I believe this is different from the other questions on this topic because, as far as I can tell, those didn't involve the Optic API and were resolved by just operating on single strings: How to convert string to date type in MarkLogic?
I need to take an Optic "column" and convert its type to a date object so I can call xdmp.monthNameFromDate (https://docs.marklogic.com/xdmp.monthNameFromDate) and other related tools on it.
I feel like I'm missing something very straightforward about applying functions to row sets and selecting specific columns.
What I naturally want to do is apply a function to each property of the resulting row set:
let formatted = resulting_rows.map(x => new Date(x['completed date']));
or whatever. This is basically what I do client-side, but it feels wrong to throw away so much of the built-in JavaScript functionality and do this all in the browser, especially when I need to group these views by year and month.
It doesn't help that some links about operating on objects are broken:
https://docs.marklogic.com/map.keys

The op.as() call defines a dynamic column based on an expression that's applied to each row when the query is executed.
The expression can only use calls to functions provided by the Optic API. In particular, xs.date() executes immediately when called, while op.xs.date() executes as each row is processed. Similarly, fn.formatDate() executes immediately, while op.fn.formatDate() executes during row processing.
To use the dynamic column, provide it as an argument to op.select(), similar to the following sketch:
op.fromView('schema', 'money')
  .where(op.ne(op.col('completed date'), ""))
  .select([
    op.col('completed date'),
    op.as('testDate', op.fn.formatDate(
      op.xdmp.parseDateTime(
        "[M01]/[D01]/[Y0001]",
        op.col('completed date')),
      "[M01]-[D01]-[Y0001]"))
  ])
  .result();
The call to .result() executes the query pipeline.
A map is an XQuery equivalent of a JavaScript object literal and isn't used in server-side JavaScript. Optic does support a map() pipeline step, which takes a lambda and appears in the pipeline immediately before the call to result(), as documented in:
http://docs.marklogic.com/AccessPlan.prototype.map
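For illustration, a rough sketch of such a map() step (hedged: the exact row key for the view column and the slash-delimited parse picture are assumptions about this data):

op.fromView('schema', 'money')
  .where(op.ne(op.col('completed date'), ""))
  .map(row => {
    // Ordinary server-side JavaScript runs here, once per result row.
    // The key may be qualified as 'schema.money.completed date' depending on the view.
    row.month = xdmp.monthNameFromDate(
      xs.date(xdmp.parseDateTime("[M01]/[D01]/[Y0001]", row['completed date'])));
    return row;
  })
  .result();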
Belated footnote: one alternative in this case to parsing and reformatting the date would be to use op.fn.translate() to transform the column value by turning every instance of "/" into "-".
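A minimal sketch of that alternative, under the same assumptions as above:

op.fromView('schema', 'money')
  .where(op.ne(op.col('completed date'), ""))
  .select([
    op.col('completed date'),
    // fn:translate maps each "/" to "-", so "03/12/2019"
    // becomes "03-12-2019" without any date parsing.
    op.as('testDate', op.fn.translate(op.col('completed date'), "/", "-"))
  ])
  .result();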
Hoping that helps,

Related

Riak MapReduce in single node using javascript and python

I want to perform a MapReduce job on data in Riak DB using JavaScript, but I am stuck at the very beginning: I couldn't understand how it returns values.
import riak

client = riak.RiakClient()
query = client.add('user')
query.map("""
function(v){
  var i = 0;
  i++;
  return [i];
}
""")
for result in query.run():
    print "%s" % result
For simplicity I have checked the above example.
Here query is the bucket, and user contains five sets of data in RiakDB.
I thought map() returns a single value, but it returns an array with 5 values, which I think corresponds to the five sets of data in RiakDB.
1
1
1
1
1
And why can I return only an array here? It treats each dataset independently and returns a result for each, which I think is why I have five 1's. Because of this, when I process the fetched data inside map(), the return gives unexpected results for me.
So please give me some suggestions. I think it is a basic thing, but I couldn't get it. I highly appreciate your help.
When you run a MapReduce job, the map phase code is sent out to the vnodes where the data is stored and executed for each value in the data. The resulting arrays are collected and passed to a single reduce phase, which also returns an array. If there are sufficiently many results, the reduce phase may be run multiple times, with the previous reduce result and a batch of map results as input.
The fact that you are getting 5 results implies that 5 keys were seen in your bucket. There is no global state shared between instances of the map phase function, so each will have an independent i, which is why each result is 1.
You might try returning [v.key] so that you have something unique for each one, or if the values are expected to be small, you could return [JSON.stringify(v)] so you can see the entire structure that is passed to the map.
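For example, a hedged sketch of both variants (the v.key field follows Riak's documented map-phase argument; treat the exact layout as an assumption):

function(v) {
  // Runs once per key, on the vnode that stores it; no shared state.
  return [v.key];
  // ...or, to see the entire structure passed to the map phase:
  // return [JSON.stringify(v)];
}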
You should note that, according to the docs site, JavaScript MapReduce has been officially deprecated, so you may want to use Erlang functions for new development.

PostgreSQL GeoJSON <- php -> JavaScript

I'm in the throes of re-building something that I built almost a year ago (don't ask where the old version went - it's embarrassing).
The core functionality uses an $.getJSON (ajax-ish) call in javascript that runs a PHP script that runs a PostgreSQL query that builds a JSON object and returns it. (Pause for breath).
The issue is what PostgreSQL spits out when it's its turn to shine.
I'm aware of the json_build_object() and json_build_array() functionality in PostgreSQL 9.4+, but one of the DBs on which this has to run hasn't been upgraded from 9.2 and I don't have time to do so in the next month or so.
For now I am using row_to_json() (and ST_AsGeoJSON() on the geometry) to build my GeoJSON collection, which gets flung back at the client via a callback.
Taking my cue from this very nice post (and staying within very small epsilon of that post's query structure), I run the following query:
select row_to_json(fc)
from (SELECT 'FeatureCollection' As type,
             array_to_json(array_agg(f)) As features
      from (SELECT 'Feature' as type,
                   row_to_json((select l from (select $vars) as l)) as properties,
                   ST_AsGeoJSON(ST_Transform(lg.g1,4326)) as geometry
            from $source_table as lg
            where g1 && ST_Transform(ST_SetSRID(ST_MakeEnvelope($bounds),4326),4283)
           ) as f
     ) as fc;
($vars, $source_table and $bounds are supplied by PHP from POST variables).
When I fetchAll(PDO::FETCH_ASSOC) that query into $result and json_encode($result[0]["row_to_json"]), the object returned to JavaScript can be JSON.parse()'d to give the expected result: an Object containing a FeatureCollection, which in turn contains a bunch of Features, each with a geometry.
So far, so good. And quick - gets the data and is back in a second or so.
The problem is that at the query stage, the array of stuff that relates to the geometry is double-quoted: the relevant segment of the JSON for an individual Feature looks like
{"type":"Feature","geometry":"{\\"type\\":\\"Polygon\\",
\\"coordinates\\":"[[[146.885447408,-36.143199088],
[146.884964384,-36.143136232],
... etc
]]"
}",
"properties":{"address_pfi":"126546461",
"address":"blah blah",
...etc }
}
This is what I get if I COPY the PostgreSQL query result to file: it's before any mishandling of the output.
Note the (double-escaped) double-quotes that only affect attributes (in the non-JSON sense) of the geometry {type, coordinates}: the "geometry" bit looks like
"geometry":"{stuff}"
instead of
"geometry":{stuff}
If the JSON produced by PostgreSQL is put through the parser/checker at GeoJSONLint, it dies in a screaming heap (which it should - it's absolutely not 'spec') - and of course it's never going to render: it spits out 'invalid type' as you might expect.
For the moment I've sorted it out by a kludge (my normal M.O.) - when $.getJSON returns the object, I
turn it into a string, then
.replace(/"{/g, '{') and .replace(/}"/g, '}') and .replace(/\\/g, ''), and then
turn it back into an object and proceed with shenanigans.
This is not good practice (to say the least): it would be far better if the query itself could be encouraged to return valid GeoJSON.
It seems clear that the problem is the row_to_json() stage: it sees the attribute-set for "geometry" and treats it differently from the attribute-set for "properties" - it (incorrectly) wraps the "geometry" value in quotes (after backslash-escaping all of its double-quotes) but (correctly) leaves the "properties" value as-is.
So after this book-length prelude... the question.
Is there some nuance about the query that I'm missing or ignoring? I've RTFD for the relevant PostgreSQL commands, and apart from prettification switches there is nothing that I'm aware of.
And of course, if there is a parsimonious way of doing the whole round-trip I would embrace it: the only caveat is as it must retain its 'live-fetch' nature - the $.getJSON runs under a listener that triggers on "idle" in a Google Map, and the source table, variables of interest and zoom (which determines $bounds) are user-determined.
(Think of it as a way to have a map layer that updates with pan and zoom by only fetching ~200-300 simple-ish (cadastre) features at a time - far better that than generating a tile pyramid for an entire state for zooms 10-19. I bet someone has already done such a thing on bl.ocks, but I haven't found it).
It seems that you are missing the cast to json.
It should be
ST_AsGeoJSON(ST_Transform(lg.g1,4326))::json
Without the cast, ST_AsGeoJSON() returns a string, which then gets encoded a second time by row_to_json().
Alternatively, you could fetch the attributes and the GeoJSON separately, json_decode the JSON with PHP, create a GeoJSON FeatureCollection array in PHP, and finally json_encode the whole result.
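For what it's worth, once the server emits valid JSON the client-side kludge disappears entirely. A minimal sketch (the URL and bounds variable are placeholders, not from the original code):

// $.getJSON parses the response, so fc arrives as a real object:
$.getJSON('/fetch_features.php', { bounds: currentBounds }, function (fc) {
  fc.features.forEach(function (feature) {
    // Plain objects now - no .replace() string surgery needed.
    console.log(feature.geometry.type, feature.properties.address);
  });
});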

new Date(epoch) returning invalid date inside Ember component

I have a date-filter component in my Ember application that only works on initial render, not on a page reload, or even if I save a file (which triggers the application to live-update).
In the main template of my application, I render the date-filter like this, passing it a Unix timestamp:
{{date-filter unixepoch=item.date}}
Then, in components/date-filter.js, I use a computed property called timeConverter to change the unix epoch into a time string formatted according to user's language of choice, and then in my templates/components/date-filter.hbs file I do {{timeConverter}} to display the results
timeConverter: function(){
  // step 1: get the epoch I passed in to the component
  var epoch = this.get('unixepoch');
  // step 2: create a human-readable date string such as `Jun 29, 2015, 12:36PM`
  var datestring = new Date(epoch);
  // do language formatting -- code omitted as the problem is with step 2
}
It is step 2 that fails (returning Invalid Date) if I refresh the page or even save the file. It always returns the proper date string the first time this component is called. Even if I do new Date(epoch) in the parent component and try to pass the result into this component (to do foreign-language formatting), I'm having the same problem.
Question: how can I figure out what's happening inside new Date(epoch), or whether it's an issue related to the component?
I suspect your epoch value is a string (of all digits). If so, then
var datestring = new Date(+epoch);
// Note ------------------^
...will fix it by converting it to a number (+ is just one way to do it, this answer lists your options and their pros/cons). Note that JavaScript uses the newer "milliseconds since The Epoch" rather than the older (original) "seconds since The Epoch." So if doing this starts giving you dates, but they're much further back in time than you were expecting, you might want epoch * 1000 to convert seconds to milliseconds.
If it's a string that isn't all digits, it's not an epoch value at all. The only string value that the specification requires new Date to understand is the one described in the spec here (although all major JavaScript engines also understand the undocumented format using / [not -] in U.S. date order [regardless of locale]: mm/dd/yyyy — don't use it, use the standard one).
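A quick hedged illustration of both points (the timestamp values here are made up):

// A string of digits is parsed as a date *string*, not a timestamp:
var bad  = new Date("1435580160000");  // Invalid Date in most engines
var good = new Date(+"1435580160000"); // Jun 29 2015 (ms since The Epoch)

// If the source stores *seconds* since The Epoch, scale to milliseconds:
var seconds = 1435580160;
new Date(seconds);        // Jan 17 1970 - far too early
new Date(seconds * 1000); // Jun 29 2015 - expected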

Parsing and constructing filtering queries similar to SQL WHERE clause in Python/JavaScript

I am building a query engine for a database which pulls data from SQL and other sources. For normal use cases the users can use a web form where the user can specify filtering parameters with select and ranged inputs. But for advanced use cases, I'd like to provide a filtering-equation box where the users could type:
AND, OR
Nested parentheses
Variable names
>, <, =, != operators
So the filtering equation could look something like:
((age > 50) or (weight > 100)) and diabetes='yes'
Then this input would be parsed, input errors detected (non-existent variable names, etc.) and SQLAlchemy queries built based on it.
I saw an earlier post about a similar problem: https://stackoverflow.com/a/1395854/315168
There seem to exist several language and mini-language parsers for Python: http://navarra.ca/?p=538
However, does there exist any package which would be an out-of-the-box solution, or close to one, for my problem? If not, what would be the simplest way to construct such a query parser and builder in Python?
Have a look at https://github.com/dfilatov/jspath
It's similar to xpath, so the syntax isn't as familiar as SQL, but it's powerful over hierarchical data.
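For instance, a hedged sketch of a JSPath filter roughly equivalent to the question's expression (the data shape is an assumption):

var jspath = require('jspath');

var data = {
  patients: [
    { age: 60, weight: 80,  diabetes: 'yes' },
    { age: 40, weight: 120, diabetes: 'no'  }
  ]
};

// ((age > 50) or (weight > 100)) and diabetes = 'yes'
var matches = jspath.apply(
  '.patients{(.age > 50 || .weight > 100) && .diabetes === "yes"}',
  data
);
// matches -> [{ age: 60, weight: 80, diabetes: 'yes' }]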
I don't know if this is still relevant to you, but here is my answer:
Firstly I have created a class that does exactly what you need. You may find it here:
https://github.com/snow884/filter_expression_parser/
It takes a list of dictionaries plus a filter query as input, and returns the filtered results. You just have to define the list of fields that are allowed, plus functions for checking the format of the constants passed as part of the filter expression.
The filter expression it ingests has to have the following format:
(time > 45.34) OR (((user_id eq 1) OR (date gt '2019-01-04')) AND (username ne 'john.doe'))
or just
username ne 'john123'
Secondly, it was foolish of me to even create this code, because DataFrame.query(...) from pandas already does almost exactly what you need: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.query.html

Creating a BIRT Data Set with Dynamic Data - ORA-01722

Having some trouble getting BIRT to allow me to create a Data Set with Parameters that are set at run time.
The SQL that is giving me the error is:
...
FROM SPRIDEN, SPBPERS P, POSNCTL.NBRJOBS X, NHRDIST d1
where D1.NHRDIST_PAYNO between '#PAYNO_BEGIN' and '#PAYNO_END'
AND D1.NHRDIST_YEAR = '#YEAR'
...
I have my Report Parameters defined as PaynoBegin, PaynoEnd, Year
I also have a Data Set script set for beforeOpen as follows:
queryText = String (queryText).replace ("#PAYNO_END", Number(params["PaynoEnd"]));
queryText = String (queryText).replace ("#PAYNO_BEGIN", Number(params["PaynoBegin"]));
queryText = String (queryText).replace ("#YEAR", Number(params["Year"]));
The problem seems to be that JDBC can't get the ResultSet from this; however, I have 10 other reports that work the same way. If I comment out the where clause, it will generate the data set. I also tried breaking the where clause out into two AND clauses with <= and >=, but it still throws an ORA-01722 invalid number error on that line.
Any thoughts on this?
Two quite separate thoughts:
1) You have single quotes around each of your parameters in the query, yet it appears as though each one is numeric - try removing the single quotes, so that the where clause looks like this:
where D1.NHRDIST_PAYNO between #PAYNO_BEGIN and #PAYNO_END
AND D1.NHRDIST_YEAR = #YEAR
Don't forget that all three parameters should be required. If the query still returns an error, try replacing #PAYNO_BEGIN, #PAYNO_END and #YEAR with hardcoded numeric values in the query string, and see whether you still get an error.
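If it helps to localize the failure, here is a hedged sketch of a beforeOpen script that fails fast on non-numeric input (mirroring the question's params usage, which is an assumption about the report's setup):

// Substitute only after verifying each parameter is numeric, so a bad
// value surfaces here rather than as ORA-01722 inside Oracle.
function numberOrThrow(name) {
  var n = Number(params[name]);
  if (isNaN(n)) {
    throw "Report parameter '" + name + "' is not numeric: " + params[name];
  }
  return n;
}

queryText = String(queryText)
  .replace("#PAYNO_END", numberOrThrow("PaynoEnd"))
  .replace("#PAYNO_BEGIN", numberOrThrow("PaynoBegin"))
  .replace("#YEAR", numberOrThrow("Year"));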
2) You are currently using dynamic SQL - amending query strings to replace specified markers with the text of the entered parameters. This makes you vulnerable to SQL Injection attacks - if you are unfamiliar with the term, you can find a simple example here.
If you are familiar with the concept, you may be under the impression that SQL Injection attacks cannot be implemented with numeric parameters - Tom Kyte has recently posted a few articles on his blog about SQL Injection, including one that deals with a SQL Injection flaw using NLS settings with numbers.
Instead, you should use bind parameters. To do so with your report, amend your query to include:
...
FROM SPRIDEN, SPBPERS P, POSNCTL.NBRJOBS X, NHRDIST d1
where D1.NHRDIST_PAYNO between ? and ?
AND D1.NHRDIST_YEAR = ?
...
instead of the existing code; then remove the queryText replacement code from the beforeOpen script and map the three dataset parameters to the PaynoBegin, PaynoEnd and Year report parameters respectively in the Dataset Editor. (You should also change any other replaced text in your query to bind parameter markers (?) and map dataset parameters to them as required.)
