Elasticsearch setting up stemming and analyzer questions


I'm using ES with my node server via the package "elasticsearch": "12.1.3".
I do bulk inserts of my documents. Excerpt:
var body = [];
_.each(rows, function(doc) {
  body.push({
    update: {
      _index: 'mytest',
      _type: 'mydoc',
      _id: doc.id,
      _retry_on_conflict: 3
    }
  });
  body.push({
    doc: doc,
    doc_as_upsert: true
  });
});
client.bulk({
  body: body
}, ...
On demand, to individually update documents, I have this in place:
client.index({
  index: 'mytest',
  type: 'mydoc',
  id: doc.id,
  body: doc.body
}, ...);
Everything works as expected so far. Now I'm trying to add basic 'light_english' stemming.
Looking at the Docs here
and for the JS package here
I want certain fields in my document to be "fuzzy" matched, therefore I think stemming is the way to go?
It is not clear to me how I would set this up.
Assuming I use the example settings from the link above, would this be the right way to do it:
client.cluster.putSettings({
  "settings": {
    "analysis": {
      "filter": {
        "no_stem": {
          "type": "keyword_marker",
          "keywords": [ "skies" ]
        }
      },
      "analyzer": {
        "my_english": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "no_stem",
            "porter_stem"
          ]
        }
      }
    }
  }
});
And would this then work permanently for my two code examples above, if applied once?
Bonus question: What would be a good default analyzer plugin (or settings) I can use? My main goal is that a search for "Günther", for example, would also match "gunther" and vice versa.
Might it be better to do this manually before inserting/updating documents, so that strings are lower-cased, diacritics removed etc.?
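A possible starting point for both questions, sketched under assumptions rather than taken from the original post: in Elasticsearch, analysis settings are defined per index and not through the cluster settings API, so the first function below builds a settings body that could be supplied when the index is created. The built-in asciifolding token filter is what lets "Günther" and "gunther" produce the same token. The names folding_english, light_english_stemmer, buildAnalysisSettings and foldForSearch are hypothetical, and a field would still need a mapping that names the analyzer before it takes effect. The second function shows the manual alternative of folding strings in application code before indexing.

```javascript
// Sketch only: analyzer, filter and function names are illustrative assumptions.

// Builds an index settings body with a custom analyzer that lowercases,
// strips diacritics (asciifolding) and applies light English stemming.
function buildAnalysisSettings() {
  return {
    settings: {
      analysis: {
        filter: {
          light_english_stemmer: { type: 'stemmer', language: 'light_english' }
        },
        analyzer: {
          folding_english: {
            tokenizer: 'standard',
            filter: ['lowercase', 'asciifolding', 'light_english_stemmer']
          }
        }
      }
    }
  };
}

// Manual alternative: fold a string in application code before indexing.
// NFD splits "ü" into "u" plus a combining mark, which the regex then removes.
function foldForSearch(text) {
  return text
    .normalize('NFD')
    .replace(/[\u0300-\u036f]/g, '')
    .toLowerCase();
}

// Assumed usage with the client from the question (not executed here):
// client.indices.create({ index: 'mytest', body: buildAnalysisSettings() }, callback);

console.log(foldForSearch('Günther')); // prints "gunther"
```

Folding inside the analyzer keeps the stored document unchanged and applies the same rule to queries automatically; manual folding would have to be repeated on every search string as well.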

Related

Formatting cells with the Google Sheets API (v4)

I'm using the Google Sheets API (v4) to create/update spreadsheets programmatically and have run into the following issue:
As per the documentation (https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets/cells#CellFormat), I'm setting the number format to CURRENCY; this will correctly display the number as a numeric value with a ¥ sign at the front (Japanese locale). However, it does not seem to actually select the "Currency" option on the formats dropdown, and more importantly, does NOT reflect the specified format when downloading the spreadsheet (e.g. as an .xlsx file).
This is different from selecting the 'Currency' option manually (via the UI), in which case values are correctly displayed once downloaded.
Here's the relevant section of code:
import { google, sheets_v4 } from 'googleapis';
const sheetsApi = google.sheets({
  version: 'v4',
  auth: await this.getAuthClient(),
});
await sheetsApi.spreadsheets
  .batchUpdate({
    spreadsheetId: resource,
    requestBody: {
      requests: [
        {
          updateSpreadsheetProperties: {
            fields: 'locale',
            properties: {
              locale: 'ja',
            },
          },
        },
        ...,
        {
          repeatCell: {
            fields: 'userEnteredFormat.numberFormat',
            cell: {
              userEnteredFormat: {
                numberFormat: { type: 'CURRENCY' },
              },
            },
          },
        },
      ],
    },
  })
  .catch((error) => {
    console.log(error);
  });
I've also tried setting the pattern (I tried a few different ones), but haven't been able to actually set the cell format, despite the value being displayed as such.
Probably a simple mistake, but I've been stuck on this for a while.. any help would be greatly appreciated!

In that case, I think the pattern property also needs to be set. How about modifying your repeatCell request as follows?
Modified request:
{
  "repeatCell": {
    "range": {,,,}, // Please set range.
    "cell": {
      "userEnteredFormat": {
        "numberFormat": {
          "type": "CURRENCY",
          "pattern": "[$¥-411]#,##0.00" // <--- Added
        }
      }
    },
    "fields": "userEnteredFormat.numberFormat" // or "fields": "userEnteredFormat"
  }
}
Note:
In my environment, when the above modified request is used with the batchUpdate method, I could confirm that "Currency" was checked.
References:
RepeatCellRequest
NumberFormat
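As a minimal sketch of how the modified request might be assembled in code: the function name buildCurrencyFormatRequest and the row and column bounds are hypothetical placeholders, since the original leaves the range unspecified. The range uses the GridRange shape, where indexes are zero-based and end indexes are exclusive.

```javascript
// Sketch only: the function name and the example range are assumptions.
function buildCurrencyFormatRequest(sheetId, pattern) {
  return {
    repeatCell: {
      // Hypothetical range: rows 2 to 100 of column C on the given sheet.
      range: {
        sheetId: sheetId,
        startRowIndex: 1,
        endRowIndex: 100,
        startColumnIndex: 2,
        endColumnIndex: 3
      },
      cell: {
        userEnteredFormat: {
          // Both type and pattern are set, as the answer suggests.
          numberFormat: { type: 'CURRENCY', pattern: pattern }
        }
      },
      // Only the number format is overwritten; other formatting is kept.
      fields: 'userEnteredFormat.numberFormat'
    }
  };
}

const currencyRequest = buildCurrencyFormatRequest(0, '[$¥-411]#,##0.00');
console.log(JSON.stringify(currencyRequest, null, 2));
```

The resulting object would take the place of the repeatCell entry in the requests array passed to batchUpdate.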

Custom parameters for bootstrap-table server side pagination

I have a service created with spring boot, for which I am trying to display its data using the bootstrap-table library.
My service allows pagination with the query parameters ?page=x&size=y, where page starts at 0.
The response for the query returns something that looks like this:
{
  "_embedded": {
    "catalogueOrders": [ .... ]
  },
  "page": {
    "size": 20,
    "totalElements": 11,
    "totalPages": 1,
    "number": 0
  }
}
Where _embedded.catalogueOrders contains all the data, and page contains the totals.
I tried configuring my table as follows:
$('#orderTable').bootstrapTable({
  url: "http://localhost:8088/catalogueOrders?orderStatus=" + orderState,
  columns: [
    {
      field: 'orderId',
      title: 'Order ID'
    },
    {
      field: 'priority',
      title: 'Priority'
    }
  ],
  pagination: true,
  sidePagination: 'server',
  totalField: 'page.totalElements',
  pageSize: 5,
  pageList: [5, 10, 25],
  responseHandler: function(res) {
    console.log(res)
    return res["_embedded"]["catalogueOrders"]
  }
})
This retrieves and displays the data, but it returns all the results, clearly because it doesn't know how to apply the pagination. The total number of elements doesn't seem to be retrieved either, as the table displays "Showing 1 to 5 of undefined rows". Also, if I replace the responseHandler with dataField: '_embedded.catalogueOrders', the data is no longer displayed.
How do I configure the query parameters needed for pagination?
And am I doing anything wrong when I try to configure dataField and totalField?

Figured it out:
I'm not sure what was wrong with dataField and totalField, but they don't seem to work with nested fields. To resolve this, I formatted the response into a new object inside responseHandler:
dataField: 'data',
totalField: 'total',
responseHandler: function(res) {
  return {
    data: res["_embedded"]["catalogueOrders"],
    total: res["page"]["totalElements"]
  }
}
As for the query parameters, by default, bootstrap-table provides the parameters limit and offset. To customize that and convert to size and page like in my case, the queryParams function can be provided:
queryParams: function(p) {
  return {
    page: Math.floor(p.offset / p.limit),
    size: p.limit
  }
}
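To make the two conversions concrete, here is a small sketch that pulls them out as standalone functions so they can be checked without a table or a server. The function names toSpringPageParams and toTableResponse and the sample response are hypothetical; the logic is the same as in the answer above.

```javascript
// Sketch only: function names and sample data are assumptions.

// Converts bootstrap-table's limit/offset into zero-based page/size.
function toSpringPageParams(p) {
  return {
    page: Math.floor(p.offset / p.limit),
    size: p.limit
  };
}

// Flattens the nested response into top-level data/total fields.
function toTableResponse(res) {
  return {
    data: res._embedded.catalogueOrders,
    total: res.page.totalElements
  };
}

// The third page with 5 rows per page starts at offset 10.
console.log(toSpringPageParams({ offset: 10, limit: 5 })); // { page: 2, size: 5 }

const sampleResponse = {
  _embedded: { catalogueOrders: [{ orderId: 1, priority: 'high' }] },
  page: { size: 5, totalElements: 11, totalPages: 3, number: 0 }
};
console.log(toTableResponse(sampleResponse).total); // 11
```

In the table configuration these would be passed as queryParams: toSpringPageParams and responseHandler: toTableResponse, together with dataField: 'data' and totalField: 'total'.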

One: yes, it doesn't work with nested fields. If you want to use nested fields, try Sass (you'll need the compiler; there are plenty of posts about it on the web).
Two: I'm not exactly sure what you're asking, but you can set up a CSS variable:
:root {
  /* assign variables */
  --color-1: red;
  --color-2: blue;
}
/* apply variables */
p {
  color: var(--color-1);
}
You can find loads of info on this on the web.

How can I build a compound query with SearchBox in searchkit?

I'm using searchkit to try to build a basic text search. I think the query I want to build is fairly simple. It needs to be structured like this:
{
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "test search",
            "type": "phrase_prefix",
            "fields": [
              "field_1^5",
              "field_2^4",
              "field_3"
            ]
          }
        },
        {
          "term": {
            "field_id": "3"
          }
        }
      ],
      "must_not": [
        {
          "term": {
            "status": "archived"
          }
        }
      ]
    }
  },
  "size": 6,
  "highlight": {
    "fields": {
      "field_1": {},
      "field_2": {},
      "field_3": {}
    }
  }
}
I've tried using the prefixQueryFields attribute, which gave me something fairly close to what I wanted except it was using a BoolShould rather than a BoolMust, plus it was always including the default SimpleQueryString query. It looked something like this:
const prefixQueryFields = [
  'field_1^5',
  'field_2^4',
  'field_3',
];
...
<SearchBox
  searchOnChange={true}
  prefixQueryFields={prefixQueryFields}
/>
I couldn't figure out the issues there easily and decided to go with the queryBuilder attribute in SearchBox. This is what I came up with:
_queryBuilder(queryString) {
  const prefixQueryFields = [
    'field_1^5',
    'field_2^4',
    'field_3',
  ];
  return new ImmutableQuery()
    .addQuery(BoolMust([
      MultiMatchQuery(queryString, {
        type: 'phase_prefix',
        fields: prefixQueryFields,
      })
    ]))
    .addQuery(BoolMustNot([
      TermQuery('status', 'archived'),
    ]));
}
...
<SearchBox
  searchOnChange={true}
  queryBuilder={this.queryBuilder}
/>
This query came out even more messed up, and I have no idea what to try next after checking the documentation and a cursory look at the source code.
(For the sake of brevity, I will not bother including the incorrect queries these two attempts created unless someone thinks that info will be useful.)

Figured it out. Using the QueryDSL structures wasn't working out very well, but apparently you can create the query with pure JSON, which worked great. I updated my query builder to return the following:
return {
  bool: {
    must: [
      {
        multi_match: {
          query: queryString,
          type: 'phrase_prefix',
          fields: prefixQueryFields,
        }
      }
    ],
    must_not: [
      {
        term: {
          status: 'archived',
        }
      }
    ]
  }
};
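For reference, a self-contained sketch of the whole builder as a plain function, so the returned JSON can be inspected outside of React. The function name buildSearchQuery is hypothetical, and the extra term clause on field_id is taken from the target query in the question rather than from the answer's snippet, so treat it as an assumption about what the final builder contained.

```javascript
// Sketch only: buildSearchQuery is a hypothetical name for the builder function.
const prefixQueryFields = ['field_1^5', 'field_2^4', 'field_3'];

// Returns the bool query as plain JSON instead of QueryDSL helper objects.
function buildSearchQuery(queryString) {
  return {
    bool: {
      must: [
        {
          multi_match: {
            query: queryString,
            type: 'phrase_prefix',
            fields: prefixQueryFields
          }
        },
        // Assumed from the target query in the question.
        { term: { field_id: '3' } }
      ],
      must_not: [
        { term: { status: 'archived' } }
      ]
    }
  };
}

console.log(JSON.stringify(buildSearchQuery('test search'), null, 2));
```

A function like this would be handed to the component as queryBuilder={buildSearchQuery}; because it does not use this, no binding is needed.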

Bad Request in Discord.js (Node) and cant find out whats causing it

I'm coding a bot in Discord.js (Node) and I'm trying to send an embed with the server info. I've got all the code, but it keeps causing a Bad Request, and I've tried everything I know. Here's the code:
var FieldsData = [
  { name: "Channels", value: msg.guild.channels.size },
  { name: "Emojis", value: msg.guild.emojis.size },
  { name: "Members", value: msg.guild.members.size },
  { name: "Owner", value: msg.guild.owner },
  { name: "Roles", value: msg.guild.roles.size },
  { name: "Region", value: msg.guild.region },
  { name: "Id", value: msg.guild.id },
  { name: "Icon", value: msg.guild.iconURL },
  { name: "Created At", value: msg.guild.createdAt }
];
msg.channel.send('', {
  embed: {
    color: 37119,
    title: "Server info for " + msg.guild.name,
    fields: FieldsData
  }
});
I've tried the message with just one field and it works. I've tried it with each field by itself and it works, but when I put them all together they cause a Bad Request. I've checked every line and every character, and I'm stumped as to what could be causing this. The maximum number of fields is 25 and I don't have that many. All the variables are valid; none produce null or undefined. I've tried different layouts of the code, and I've tried adding, removing, editing and replacing parts here and there, but to no avail: I can't get it to work at all. I've been trying to figure this out for 2 hours and have searched online, in the docs, etc.
Please note: I'm not that advanced with JavaScript, so if I've made a big mistake, don't be surprised.
"msg" is the object of the message, Example:
Bot.on('message', function (msg) { /*Stuff*/ });
I hope I've explained this enough, I'm using the LATEST version of Discord.js at the time of posting this and I'm not using ANY other extensions, packages, etc

SHORT ANSWER:
Now, don't just ignore this post after I say this (actually read my reasons, the whole thing), but please just use a Rich Embed
LONG ANSWER:
First of all, I strongly suggest using Rich Embeds, as it is easier to play with and edit. Anyways, here:
The first suggestion comes from your message event. In ES6, we now have arrow functions which look like this (arg1, arg2) => {doSomething();}, and using this new feature, your message event handler should look more like this:
client.on('message', msg => {
  // Do my thing with that msg object
});
Now back to the point.
Objects are weird k? I believe that this: "Server info for " + msg.guild.name is not allowed. I don't know why, but when I tried to use a variable to display my bot's version, it gave me an error too. So if you want to fix that you have two options:
Recommended: Use Rich Embeds
Not Recommended: Use `${myVar}` instead (Not Tested)
Don't overcomplicate. What is the empty string ('',) for? You can just do msg.channel.send({embed:{}});
You don't just use variables for the sake of it. What is the point of using FieldsData? It is only used once, and why can't you just do:
msg.channel.send({
  embed: {
    color: 37119,
    title: "Server info for " + msg.guild.name,
    fields: [
      { name: "Channels", value: msg.guild.channels.size },
      { name: "Emojis", value: msg.guild.emojis.size },
      { name: "Members", value: msg.guild.members.size },
      { name: "Owner", value: msg.guild.owner },
      { name: "Roles", value: msg.guild.roles.size },
      { name: "Region", value: msg.guild.region },
      { name: "Id", value: msg.guild.id },
      { name: "Icon", value: msg.guild.iconURL },
      { name: "Created At", value: msg.guild.createdAt }
    ]
  }
});
VERY IMPORTANT NOTE:
Now, you don't give any valid reason why you don't want to use Rich Embeds, because a rich embed is also an object. ;-; So just use a rich embed.
I have scripts that require the use of objects, and they are pre-made.
I wonder how you get access to your embed if it's not stored... Very interesting.
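Neither the question nor the answer pins down the cause of the Bad Request. One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: embed field values are expected to be non-empty strings, while several values in FieldsData are numbers, a Date, or an object (msg.guild.owner). The sketch below coerces every value to a string before the embed is built. The helper name toEmbedFields, the 'n/a' placeholder and the stand-in guild object are all hypothetical.

```javascript
// Sketch only: toEmbedFields and the fake guild below are illustrative assumptions.

// Converts each field value to a non-empty string.
function toEmbedFields(pairs) {
  return pairs.map(function (pair) {
    const missing = pair.value === null || pair.value === undefined;
    const text = missing ? '' : String(pair.value);
    return { name: pair.name, value: text.length > 0 ? text : 'n/a' };
  });
}

// Stand-in for msg.guild so the sketch runs without a Discord connection.
const fakeGuild = {
  channels: { size: 12 },
  iconURL: null,
  createdAt: new Date(0)
};

const safeFields = toEmbedFields([
  { name: 'Channels', value: fakeGuild.channels.size },
  { name: 'Icon', value: fakeGuild.iconURL },
  { name: 'Created At', value: fakeGuild.createdAt }
]);

console.log(safeFields);
```

With the real objects, the result would be passed as fields: toEmbedFields(FieldsData) in the embed.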

Iterate through poorly formated object

I have a really badly formatted JavaScript object:
commits = {
  commit: {
    name: 'First commit'
  },
  commit: {
    name: 'Second commit'
  }
}
As you can see, each sub-object of the commits object is called commit, so it practically rules out using for ... in ... or any other JavaScript loop (well, that's what I think, but I'm a really poor JS programmer, so I'm probably wrong). So the question is: how can I iterate through that object?
Please keep in mind that I can't use jQuery here and I can't rewrite that object.
Edit: that object is parsed from the following JSON:
{
  "commits": {
    "commit": {
      "name": "First commit"
    },
    "commit": {
      "name": "Second commit"
    }
  }
}

Considering the JSON posted in the github link you provided, there's not a lot you have to do.
The JSON string seems to be valid, except that it's missing a } at the end of the string. With that fixed, it parses just fine:
JSON.parse('{\n \"accountURL\": \"https://domain.com\",\n \"newCommitsCount\": \"1\",\n \"pushURL\":\"https://domain.com/project/64249/git/source/compare/revisions/0b6438955f2a5a7981fd25cfa5b48fe3fb4c888d,7771e638d1356a14d1dc46f3f5cfaab858370a5e\",\n \"unsubscribeURL\": \"https://domain.com:443/unsubscribe?token=receiverToken&type=COMMITS&projectId=64249\",\n \"invokerEmail\": \"email#email.com\",\n \"projectURL\": \"https://domain.com/project/64249\",\n \"projectId\": \"64249\",\n \"afterPushRevision\": \"7771e638d1356a14d1dc46f3f5cfaab858370a5e\",\n \"invokerId\": \"38074\",\n \"pushDate\": \"2014-02-11T15:26:36+0000\",\n \"beforePushRevision\": \"0b6438955f2a5a7981fd25cfa5b48fe3fb4c888d\",\n \"repositoryURL\": \"git_url\",\n \"subdomain\": \"subdomain\",\n \"domain\": \"domain\",\n \"branch\": \"develop\",\n \"invokerProfileURL\": \"url\",\n \"commitsCount\": \"1\",\n \"invokerSmallAvatarURL\": \"xx\",\n \"projectName\": \"NAME\",\n \"invoker\": \"Invoker Name.\",\n \"commits\": {\"commit\": {\n \"revision\": \"7771e638d1356a14d1dc46f3f5cfaab858370a5e\",\n \"commitMessage\": \"quickfix\",\n \"committerId\": \"38074\",\n \"committerEmail\": \"email\",\n \"committerName\": \"Name.\",\n \"commitDate\": \"2014-02-11T15:26:27+0000\",\n \"commitURL\": \"https://domain.com/project/64249/git/source/commit/develop/7771e638d1356a14d1dc46f3f5cfaab858370a5e\" }}}')
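A short sketch of what happens to the shortened example from the question once it is parsed: when a key is repeated inside one JSON object, JSON.parse keeps only the last value, so the first commit is already gone before any loop runs. The helper asArray is hypothetical and assumes that the commit property may hold either a single object (as in the payload above) or an array.

```javascript
// The shortened example from the question, with both commits under the same key.
const parsed = JSON.parse(
  '{"commits":{"commit":{"name":"First commit"},"commit":{"name":"Second commit"}}}'
);

// Only one "commit" key survives parsing; the later value wins.
console.log(Object.keys(parsed.commits)); // [ 'commit' ]
console.log(parsed.commits.commit.name);  // Second commit

// Hypothetical helper: treat a single object and an array the same way.
function asArray(value) {
  return Array.isArray(value) ? value : [value];
}

asArray(parsed.commits.commit).forEach(function (commit) {
  console.log(commit.name);
});
```

Recovering every duplicate entry would require working on the raw JSON text before parsing, since the parsed object no longer contains them.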
