Given the ElasticSearch document below:
{
"_id": "3330481",
"_type": "user",
"_source": {
"id": "3330481",
"project": "Cool_Project_One"
}
}
I'm building a UI component that auto-suggests values of the "project" field based on the user's text input.
For example:
As the user types "Cool", I would like to show all the values of the "project" field that start with "Cool".
I've created this aggregation:
"aggs": {
"projects": {
"terms": {
"field": "project",
"size": 2
}
}
}
which returns a list of all the values of the project field, but I can't figure out how to return only the values that match a certain expression.
I found this answer that shows how to add a filter, but it seems that the filter returns only exact matches. This is what I tried:
{
"aggs": {
"projects": {
"filter": {
"term": {
"project": "Cool"
}
},
"aggs": {
"terms": {
"field": "project",
"size": 2
}
}
}
}
}
And it didn't work.
Any help would be appreciated
Some notes:
A term query looks for exact matches, casing included. If you want something more flexible, you can use a regular "match" query.
In general, you want to put the filter in the query, not in the aggregations.
You can use aggregations to group by category-type fields, and regular queries to match against title-type fields.
I would suggest using a search_as_you_type field for this, so that prefixes are also captured ("proj" in "project", for example).
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-as-you-type.html
I just saw that you have _type defined, so you are probably using an old version of Elasticsearch, and search_as_you_type is relatively new. I will add some examples with both query and aggs:
Mappings for the search_as_you_type, text, and keyword fields:
PUT test_suggestions
{
"mappings": {
"properties": {
"project": {
"type": "text",
"fields": {
"suggestions": {
"type": "search_as_you_type"
},
"keyword": {
"type": "keyword"
}
}
}
}
}
}
Indexing document
POST test_suggestions/_doc
{
"project": "Cool Project"
}
search_as_you_type query, supports prefixes out of the box:
GET test_suggestions/_search
{
"query": {
"multi_match": {
"query": "coo",
"type": "bool_prefix",
"fields": [
"project.suggestions",
"project.suggestions._2gram",
"project.suggestions._3gram"
]
}
},
"aggs": {
"project_categories": {
"terms": {
"field": "project.keyword",
"size": 10
}
}
}
}
Regular match query: it is case-insensitive, and supplying just one whole word of the field value is enough to match:
GET test_suggestions/_search
{
"query": {
"match": {
"project": "Cool"
}
},
"aggs": {
"project_categories": {
"terms": {
"field": "project.keyword",
"size": 10
}
}
}
}
Bonus: prefix search without search_as_you_type or setting up ngrams:
GET test_suggestions/_search
{
"query": {
"match_phrase_prefix": {
"project": "coo"
}
},
"aggs": {
"project_categories": {
"terms": {
"field": "project.keyword",
"size": 10
}
}
}
}
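For the UI component itself, the request and the response handling could be wrapped roughly as follows. This is only a sketch: the function names buildSuggestRequest and extractSuggestions are made up for illustration, the field and aggregation names are the ones from the examples above, and "size": 0 is added on the assumption that only the buckets are needed, not the hits.

```javascript
// Hypothetical helper: builds the request body for the prefix example above.
// "size: 0" is an assumption that the UI only needs the aggregation buckets.
function buildSuggestRequest(userInput, size = 10) {
  return {
    size: 0,
    query: {
      match_phrase_prefix: { project: userInput }
    },
    aggs: {
      project_categories: {
        terms: { field: 'project.keyword', size }
      }
    }
  };
}

// Hypothetical helper: pulls the distinct project values out of a search response.
// Returns an empty list when the aggregation is missing.
function extractSuggestions(response) {
  const buckets = response?.aggregations?.project_categories?.buckets ?? [];
  return buckets.map((bucket) => bucket.key);
}

// Hand-written response with the shape of a terms aggregation result.
const sampleResponse = {
  aggregations: {
    project_categories: {
      buckets: [{ key: 'Cool Project', doc_count: 1 }]
    }
  }
};

console.log(JSON.stringify(buildSuggestRequest('coo', 5)));
console.log(extractSuggestions(sampleResponse)); // [ 'Cool Project' ]
```

The body returned by buildSuggestRequest can be sent with whatever HTTP client or Elasticsearch client you already use.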
Related
I am using Node.js with Elasticsearch.
I am using this package:
https://www.npmjs.com/package/@elastic/elasticsearch
I have these documents in my Elasticsearch index:
[
{
"_index": "products",
"_id": "wZRh3n8Bs9qQzO6fvTTS",
"_score": 1.0,
"_source": {
"title": "laptop issues",
"description": "laptop have issue present in according"
}
},
{
"_index": "products",
"_id": "wpRh3n8Bs9qQzO6fvzQM",
"_score": 1.0,
"_source": {
"title": "buy mobile",
"description": "mobile is in Rs 250"
}
},
{
"_index": "products",
"_id": "w5Rh3n8Bs9qQzO6fvzTz",
"_score": 1.0,
"_source": {
"title": "laptop payment",
"description": "laptop payment is given in any way"
}
}
]
Now I want to fetch data from Elasticsearch. When I pass "LAP" or "lap", it returns an empty array ([]). Why? "lap" is present in all objects.
This is what I am doing:
const result = await client.search({
index: 'products',
query: {
match_phrase: {
description: "lap"
}
}
});
What am I doing wrong? I need all results where the keyword "lap" is present.
The match query is not working because you are searching for only part of the term "laptop"; match queries compare whole analyzed terms.
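To see why, here is a toy model in plain JavaScript. It does not call Elasticsearch at all: analyze is a very rough stand-in for the standard analyzer (lowercase, split on non-alphanumerics), and the three query functions are hypothetical simplifications of match, prefix and match_phrase_prefix.

```javascript
// Rough stand-in for the standard analyzer (an assumption, not the real thing).
function analyze(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// match: the query text is analyzed, then compared with whole indexed terms.
function matchQuery(fieldValue, queryText) {
  const indexed = analyze(fieldValue);
  return analyze(queryText).some((term) => indexed.includes(term));
}

// prefix: the query text is NOT analyzed; it is compared as-is
// with the start of each indexed term.
function prefixQuery(fieldValue, rawPrefix) {
  return analyze(fieldValue).some((term) => term.startsWith(rawPrefix));
}

// match_phrase_prefix: the query text is analyzed; the last term is treated
// as a prefix and the earlier terms must appear right before it, in order.
function matchPhrasePrefix(fieldValue, queryText) {
  const indexed = analyze(fieldValue);
  const terms = analyze(queryText);
  if (terms.length === 0) return false;
  const last = terms.length - 1;
  return indexed.some((_, start) =>
    terms.every((term, i) => {
      const candidate = indexed[start + i];
      if (candidate === undefined) return false;
      return i === last ? candidate.startsWith(term) : candidate === term;
    })
  );
}

const description = 'laptop have issue present in according';
console.log(matchQuery(description, 'lap'));        // false: "lap" is not a whole term
console.log(matchQuery(description, 'LAPTOP'));     // true: analyzed to "laptop"
console.log(prefixQuery(description, 'lap'));       // true
console.log(prefixQuery(description, 'LAP'));       // false: input is not lowercased
console.log(matchPhrasePrefix(description, 'LAP')); // true
```

Note the difference for uppercase input: in this model, as in Elasticsearch, the prefix query does not analyze its input, so "LAP" only matches through the analyzed variant.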
You can use Prefix Query for single term like below:
{
"query": {
"prefix": {
"title": {
"value": "lap"
}
}
}
}
If you want to search for a phrase, you can use the Phrase Prefix Query:
{
"query": {
"match_phrase_prefix": {
"title": "lap"
}
}
}
If you want to match only some of the words from the query, you can use a match query with operator set to or.
POST querycheck/_search
{
"query": {
"match": {
"title": {
"query": "i have issues",
"operator": "or"
}
}
}
}
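Applied to the Node.js code from the question, the phrase prefix variant might look like the sketch below. The helper names buildPrefixSearch and findProducts are hypothetical. It assumes a client version where search accepts query at the top level and resolves to the response body directly (as the call in the question suggests); with older clients the hits may sit under result.body instead.

```javascript
// Hypothetical helper: builds the search parameters for a phrase prefix search.
// The field defaults to "description", the field used in the question.
function buildPrefixSearch(text, field = 'description') {
  return {
    index: 'products',
    query: {
      match_phrase_prefix: { [field]: text }
    }
  };
}

// Hypothetical wrapper around an already constructed client.
// Assumption: client.search() resolves to an object with hits.hits.
async function findProducts(client, text) {
  const result = await client.search(buildPrefixSearch(text));
  return result.hits.hits.map((hit) => hit._source);
}

console.log(JSON.stringify(buildPrefixSearch('lap'), null, 2));
```

Because the input is analyzed, both "LAP" and "lap" should give the same results with this query type.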
Given:
Say that I am defining a schema for Contacts. But, I can have "Primary Contact", "Student" or one who is both; and different properties that go with all three choices. The contact types are defined in an array of contact_type: [ "Primary Contact", "Student" ] which can be either one, or both.
Say that the fields are as such per contact type:
If Primary Contact, then I want phone_number
If Student, then I want first_name
If Student and Primary Contact then I want phone_number and first_name
Usage
I use the Ajv library to validate in Node.js, using code like this:
function validator(json_schema){
const Ajv = require('ajv');
const ajv = new Ajv({allErrors: true});
return ajv.compile(json_schema)
}
const validate = validator(json_schema);
const valid = validate(input);
console.log(!!valid); //true or false
console.log(validate.errors)// object or null
Note: I've had trouble with allErrors: true while using anyOf for this, and I use the output of allErrors to return ALL the missing/invalid fields back to the user rather than returning problems one at a time. Reference: https://github.com/ajv-validator/ajv/issues/980
Schema
I have written the following schema and it works if I do either "Student" or "Primary Contact" but when I pass both, it still wants to validate against ["Student"] or ["Primary Contact"] rather than both.
{
"$schema": "http://json-schema.org/draft-07/schema",
"type": "object",
"required": [],
"properties": {},
"allOf": [
{
"if": {
"properties": {
"contact_type": {
"contains": {
"allOf": [
{
"type": "string",
"const": "Primary Contact"
},
{
"type": "string",
"const": "Student"
}
]
}
}
}
},
"then": {
"additionalProperties": false,
"properties": {
"contact_type": {
"type": "array",
"items": [
{
"type": "string",
"enum": [
"Student",
"Primary Contact"
]
}
]
},
"phone": {
"type": "string"
},
"first_name": {
"type": "string"
}
},
"required": [
"phone",
"first_name"
]
}
},
{
"if": {
"properties": {
"contact_type": {
"contains": {
"type": "string",
"const": "Student"
}
}
}
},
"then": {
"additionalProperties": false,
"properties": {
"contact_type": {
"type": "array",
"items": [
{
"type": "string",
"enum": [
"Student",
"Primary Contact"
]
}
]
},
"first_name": {
"type": "string"
}
},
"required": [
"first_name"
]
}
},
{
"if": {
"properties": {
"contact_type": {
"contains": {
"type": "string",
"const": "Primary Contact"
}
}
}
},
"then": {
"additionalProperties": false,
"properties": {
"contact_type": {
"type": "array",
"items": [
{
"type": "string",
"enum": [
"Student",
"Primary Contact"
]
}
]
},
"phone": {
"type": "string"
}
},
"required": [
"phone"
]
}
}
]
}
Example Valid Inputs:
For just ["Primary Contact"]:
{
"contact_type":["Primary Contact"],
"phone":"something"
}
For just ["Student"]:
{
"contact_type":["Student"],
"first_name":"something"
}
For ["Primary Contact", "Student"]
{
"contact_type":["Primary Contact", "Student"],
"phone":"something",
"first_name":"something"
}
Question:
I would like this to validate even if allErrors: true, is this possible? If not, how should I change the schema?
Footnotes
I don't want to change the "contact_type" from being an array unless it is the last resort. (it is a requirement, but can be broken only if there's no other way)
I can't allow any additionalItems, therefore I'm fully defining each object in the if statements although contact_type is common. If I move contact_type out, then I get error messages about passing contact_type as an additionalItem (it looks at the if statement's properties and doesn't see contact_type when it's taken out to the common place). This is why my initial properties object is empty.
Here's how I might go about solving the validation issue: https://jsonschema.dev/s/XLSDB
Here's the Schema...
(It's easier if you try to break up concerns)
{
"$schema": "http://json-schema.org/draft-07/schema",
"type": "object",
First, we want to define our conditional checking subschemas...
"definitions": {
"is_student": {
"properties": {
"contact_type": {
"contains": {
"const": "Student"
}
}
}
},
"is_primay_contact": {
"properties": {
"contact_type": {
"contains": {
"const": "Primary Contact"
}
}
}
}
},
Next, I'm assuming you always want contact_type
"required": ["contact_type"],
"properties": {
"contact_type": {
"type": "array",
"items": {
"enum": ["Primary Contact", "Student"]
}
},
And we need to define all the allowed properties in order to prevent additional properties. (draft-07 cannot "see through" applicator keywords like allOf. You can with draft 2019-09 and beyond, but that's another story)
"phone": true,
"first_name": true
},
"additionalProperties": false,
Now, we need to define our structural constraints...
"allOf": [
{
If the contact is a student, first name is required.
"if": { "$ref": "#/definitions/is_student" },
"then": { "required": ["first_name"] }
},
{
If the contact is a primary contact, then phone is required.
"if": { "$ref": "#/definitions/is_primay_contact" },
"then": { "required": ["phone"] }
},
{
However, additionally, if the contact is both a student and a primary contact...
"if": {
"allOf": [
{ "$ref": "#/definitions/is_student" },
{ "$ref": "#/definitions/is_primay_contact" }
]
},
Then we require both phone and first name...
"then": {
"required": ["phone", "first_name"]
},
Otherwise, one of phone or first name is fine (which one is covered by the previous section)
"else": {
"oneOf": [
{
"required": ["phone"]
},
{
"required": ["first_name"]
}
]
}
}
]
}
I'm not convinced this is the cleanest approach, but it does work for the requirements you've provided.
As for getting validation errors you can pass back to your END user... given the conditional requirements you lay out, it's not something you can expect with pure JSON Schema...
Having said that, ajv does provide an extension to add custom error messages, which given the way I've broken the validation down into concerns, might be useable to add custom errors as you're looking to do (https://github.com/ajv-validator/ajv-errors).
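If all you need is the full list of missing fields for the end user, one option outside JSON Schema is a small hand-written check next to the schema validation. The sketch below is not Ajv and not ajv-errors; REQUIRED_BY_TYPE and missingFields are hypothetical names, and it mirrors only the two if/then rules above (Student requires first_name, Primary Contact requires phone).

```javascript
// Hypothetical lookup table mirroring the two if/then rules of the schema.
const REQUIRED_BY_TYPE = {
  'Student': ['first_name'],
  'Primary Contact': ['phone']
};

// Returns every required field that is absent from the contact, all at once.
// Like the JSON Schema "required" keyword, it only checks that the key exists.
function missingFields(contact) {
  const types = Array.isArray(contact.contact_type) ? contact.contact_type : [];
  const required = new Set();
  for (const type of types) {
    for (const field of REQUIRED_BY_TYPE[type] || []) {
      required.add(field);
    }
  }
  return [...required].filter(
    (field) => !Object.prototype.hasOwnProperty.call(contact, field)
  );
}

console.log(missingFields({ contact_type: ['Primary Contact', 'Student'], phone: 'something' }));
// [ 'first_name' ]
console.log(missingFields({ contact_type: ['Student'], first_name: 'something' }));
// []
```

The schema stays the source of truth for validity; a helper like this would only be used to phrase the error message.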
I would like to get the opposite of a GraphQL query:
that is, the fields named in the query should not appear in the result, while the fields not named in the query should appear.
Because there is a wide variety of JSON records, I cannot write the opposite query by hand. Is there any way to generate the opposite query automatically?
For example, I have JSON like:
{
"data": {
"source": "AWS",
"hero": {
"version": "my version",
"name": "R2-D2",
"friends": [
{
"attribute": "like something",
"name": "Luke Skywalker"
},
{
"name": "Han Solo"
},
{
"name": "Leia Organa"
}
]
}
}
}
I have query
{
source,
hero {
name
friends {
attribute
}
}
}
I like to get a result:
{
"data": {
"hero": {
"version": "my version",
"friends": [
{
"name": "Luke Skywalker"
},
{
"name": "Han Solo"
},
{
"name": "Leia Organa"
}
]
}
}
}
That is, the fields missing from the query appear in the result, while the fields present in the query do not.
How can I do this in JavaScript? Can you give me an example?
You only get the fields you include in the query. If you want what is not in the query, write another query for the fields missing from the original one.
At its simplest, GraphQL is about asking for specific fields on objects.
https://graphql.org/learn/queries/
Query:
{
hero {
version
friends {
name
}
}
}
Result:
{
"data": {
"hero": {
"version": "my version",
"friends": [
{
"name": "Luke Skywalker"
},
{
"name": "Han Solo"
},
{
"name": "Leia Organa"
}
]
}
}
}
Update from comments:
Your question is more of how to Dynamically Generate GraphQL Queries?
In this case you could use fragments, but you would still have to write multiple queries.
Fragments let you construct sets of fields, and then include them in queries where you need to.
https://graphql.org/learn/queries/#fragments
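If the second query has to be generated rather than written by hand, one possible approach is to derive it from a sample record. The sketch below is plain JavaScript with no GraphQL library; complementSelection and toQuery are hypothetical names, the original query is represented as a nested object of true flags, and it assumes the sample record contains every field you care about.

```javascript
const isObject = (value) => value !== null && typeof value === 'object';

// Walks a sample record and keeps every field that is NOT in "selected".
// "selected" mirrors the original query: { field: true } or { field: { ... } }.
function complementSelection(sample, selected = {}) {
  const items = Array.isArray(sample) ? sample : [sample];
  const result = {};
  for (const item of items) {
    if (!isObject(item)) continue;
    for (const [key, value] of Object.entries(item)) {
      const sel = selected[key];
      if (sel === true) continue; // leaf already requested by the original query
      const nested = Array.isArray(value) ? value.some(isObject) : isObject(value);
      if (!nested) {
        result[key] = true;
        continue;
      }
      const child = complementSelection(value, sel || {});
      if (Object.keys(child).length > 0) {
        result[key] = { ...(result[key] || {}), ...child };
      }
    }
  }
  return result;
}

// Prints a selection object as a GraphQL selection set.
function toQuery(selection) {
  const parts = Object.entries(selection).map(([key, value]) =>
    value === true ? key : `${key} ${toQuery(value)}`
  );
  return `{ ${parts.join(' ')} }`;
}

const sample = {
  source: 'AWS',
  hero: {
    version: 'my version',
    name: 'R2-D2',
    friends: [
      { attribute: 'like something', name: 'Luke Skywalker' },
      { name: 'Han Solo' },
      { name: 'Leia Organa' }
    ]
  }
};
const original = { source: true, hero: { name: true, friends: { attribute: true } } };

console.log(toQuery(complementSelection(sample, original)));
// { hero { version friends { name } } }
```

The generated string is the same selection as the hand-written query above; it still has to be sent to the server as a separate query.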
I am having a hard time translating an aggregation query for Elasticsearch into elastic.js. I have been reading the documentation but I just cannot figure it out. The examples you can find online are mostly about the deprecated facets feature, which is not very useful.
The JSON for example aggregation is as follows:
{
"aggs": {
"foo": {
"filter": {
"bool": {
"must": [
{
"query": {
"query_string": {
"query": "*"
}
}
},
{
"terms": {
"shape": [
"wc"
]
}
}
]
}
},
"aggs": {
"field": {
"terms": {
"field": "shape",
"size": 10,
"exclude": {
"pattern": []
}
}
}
}
}
},
"size": 0
}
This is how you would nest terms aggregation into filter aggregation with elasticjs
ejs.Request()
.size(0)
.agg(ejs.FilterAggregation("foo").filter(ejs.BoolFilter()
.must(ejs.TermsFilter('shape', 'wc'))
.must(ejs.QueryFilter(ejs.QueryStringQuery().query("*"))))
.agg(ejs.TermsAggregation("field").field("shape").size(10).exclude("my_pattern"))
)
BTW, you are filtering on shape and then aggregating on it. I am not sure what exactly you are trying to do.
I found their documentation pretty good. They also have a great tool to check whether your query is valid. This should help you a lot.
Hope this helps!!
It seems like you have misplaced your query under the aggs element. Try this:
{
"size": 0,
"query": {
"bool": {
"must": [
{
"query": {
"query_string": {
"query": "*"
}
}
},
{
"terms": {
"shape": [
"wc"
]
}
}
]
}
},
"aggs": {
"foo": {
"terms": {
"field": "shape",
"size": 10,
"exclude": {
"pattern": []
}
}
}
}
}
Is it possible to analyze a field that has already been analyzed?
For example, suppose we broke down /Health & Beauty/Vitamins & Supplements/Supplements into the following using a custom analyzer with a path hierarchy tokenizer:
/Health & Beauty
/Health & Beauty/Vitamins & Supplements
/Health & Beauty/Vitamins & Supplements/Supplements
Would it be possible to then run a separate analysis on each new string and store the results with the corresponding string?
How would we do that with the following mapping:
PUT /my_index
{
"settings": {
"analysis": {
"analyzer": {
"path-analyzer": {
"type": "custom",
"tokenizer": "path-tokenizer"
},
"url-analyzer": {
"type": "custom",
"char_filter" : ["urlFormat"],
"filter": ["lowercase"],
"tokenizer": "path-tokenizer"
},
"cat-analyzer": {
"type": "custom",
"char_filter" : ["catName"],
"tokenizer": "keyword"
}
},
"char_filter" : {
"urlFormat":{
"type":"pattern_replace",
"pattern":"[^a-z|A-Z|/]+",
"replacement":"-"
},
"catName":{
"type":"pattern_replace",
"pattern":"[^/]+/",
"replacement":""
}
},
"tokenizer": {
"path-tokenizer": {
"type": "path_hierarchy"
}
}
}
},
"mappings": {
"my_type": {
"dynamic": "strict",
"properties": {
"group_path": {
"type": "string",
"index_analyzer": "url-analyzer",
"search_analyzer": "keyword",
"fields": {
"name": {
"type": "string",
"index_analyzer": "cat-analyzer",
"search_analyzer": "keyword"
}
}
}
}
}
}
}
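To make the analyzers in this mapping concrete, here is a rough emulation in plain JavaScript. It is only an approximation for discussion: pathHierarchy, urlAnalyzer and catAnalyzer are hypothetical functions that imitate the path_hierarchy tokenizer, the two pattern_replace char filters and the lowercase filter on simple inputs like the one above, and edge cases (trailing slashes, empty segments) are not handled faithfully.

```javascript
// Imitates path_hierarchy with the default "/" delimiter on simple paths.
function pathHierarchy(path, delimiter = '/') {
  const tokens = [];
  let index = path.indexOf(delimiter, 1);
  while (index !== -1) {
    tokens.push(path.slice(0, index));
    index = path.indexOf(delimiter, index + 1);
  }
  tokens.push(path);
  return tokens;
}

// url-analyzer: urlFormat char filter, path tokenizer, then lowercase filter.
function urlAnalyzer(value) {
  const filtered = value.replace(/[^a-z|A-Z\/]+/g, '-');
  return pathHierarchy(filtered).map((token) => token.toLowerCase());
}

// cat-analyzer: catName char filter, then keyword tokenizer (a single token).
function catAnalyzer(value) {
  return [value.replace(/[^\/]+\//g, '')];
}

const groupPath = '/Health & Beauty/Vitamins & Supplements/Supplements';
console.log(pathHierarchy(groupPath));
console.log(urlAnalyzer(groupPath));
console.log(catAnalyzer(groupPath)); // [ '/Supplements' ]
```

In this emulation, as with multi-fields in the mapping, each analyzer runs on the original string rather than on the tokens produced by another analyzer, which is why catAnalyzer yields a single value for the whole path.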
Thank you for your consideration