I am new to SQL and wonder how to select nested tables.
I have two tables like this:
sensors

sensor_id | model_no | location_id
----------+----------+------------
int       | varchar  | int

locations

location_id | name    | location | radius
------------+---------+----------+-------
int         | varchar | point    | int
They are linked by a foreign key. Currently, I select using
SELECT sensors.*, locations.*
FROM sensors INNER JOIN locations
ON sensors.location_id = locations.location_id;
to get the data from both like this:
{
"sensor_id": 1,
"model_no": "some string",
"location_id": 2,
"name": "Berlin",
"location": {
"x": 3,
"y": 3
},
"radius": 1000
}
I wonder if there is any way I can keep the location data grouped as its own object like this:
{
"sensor_id": 1,
"model_no": "some string",
"location": {
"name": "Berlin",
"location": {
"x": 3,
"y": 3
},
"radius": 1000
}
}
I am using MySQL 8 with the mysql npm package to execute the queries. I know I can modify the response in JavaScript, but I wonder whether it can be done directly in the query and, if so, whether that is better or worse for performance.
SELECT JSON_OBJECT(
'sensor_id', sensor_id,
'model_no', model_no,
'location', JSON_OBJECT(
'name', name,
'location', JSON_OBJECT(
'x', CAST(ST_X(location) AS SIGNED),
'y', CAST(ST_Y(location) AS SIGNED)
),
'radius', radius
)
) as JSON_output
FROM sensors
JOIN locations USING (location_id);
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=b17dfa3069b4bb9345a9e99e8b893121
You must manually label each column; you cannot get the desired result and use *.
A small consolation prize: If you change
ON sensors.location_id = locations.location_id;
to
USING(location_id)
there will be only one location_id column in the result.
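If you would rather keep SELECT sensors.*, locations.* and reshape the rows in Node, the nesting is a small per-row transformation. The sketch below is only an illustration: nestLocation is a hypothetical helper, and the row shape is assumed to match the flat object shown in the question.

```javascript
// Hypothetical helper: move the columns that come from `locations`
// into a nested `location` object and keep the sensor columns on top.
// The duplicate location_id is dropped, as in the desired output.
function nestLocation(row) {
  const { name, location, radius, location_id, ...sensor } = row;
  return { ...sensor, location: { name, location, radius } };
}

// Flat row shaped like the JOIN result in the question.
const flatRow = {
  sensor_id: 1,
  model_no: "some string",
  location_id: 2,
  name: "Berlin",
  location: { x: 3, y: 3 },
  radius: 1000,
};

console.log(JSON.stringify(nestLocation(flatRow), null, 2));
```

The mysql package also documents a nestTables query option that groups columns under their table name, which may be enough if the nested key can be called locations rather than location.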
Given the following db structure https://drawsql.app/sensor_network/diagrams/db, I would like to get all sensor_data from a certain location, grouping the response by station.location and sensor_data.time.
I have the following query:
select
station.location_name,
sensor_data.time,
sensor.type,
sensor_data.value,
sensor.id
from station
inner join
(sensor inner join sensor_data on sensor.id = sensor_data.sensor_id)
on station.sensor_id = sensor.id
where station.location = ?
which gives me the rows I want. However, I would like to format the response so that all rows sharing a location and time end up in the same row. An example of the current output is

location | time                | type        | value | id
---------+---------------------+-------------+-------+---
city1    | 2022-01-01 08:00:00 | temperature | 290   | 1
city1    | 2022-01-01 09:00:00 | temperature | 292   | 1
city1    | 2022-01-01 08:00:00 | ph          | 7     | 2
city1    | 2022-01-01 09:00:00 | ph          | 8     | 2
which I would like to format either like (or similar to)
[
{"location": "city1", "time": "2022-01-01 08:00:00", "temperature": 290, "ph": 7},
{"location": "city1", "time": "2022-01-01 09:00:00", "temperature": 292, "ph": 8}
]
or
[
{"location": "city1", "time": "2022-01-01 08:00:00", "sensors": [{"id": 1, "type": "temperature", "value": 290}, {"id": 2, "type": "ph", "value": 7}]},
{"location": "city1", "time": "2022-01-01 09:00:00", "sensors": [{"id": 1, "type": "temperature", "value": 292}, {"id": 2, "type": "ph", "value": 8}]}
]
Currently, I do this formatting in JavaScript using array functions, but this has proven too slow. I have tried a GROUP BY query, but I am struggling to understand how to use it when I don't want aggregate functions such as COUNT.
I am using Node.js with its mysql package; the database is MySQL 8 and the tables use the InnoDB engine.
You can achieve this with subqueries.
SELECT
st.location,
sd.time,
(
SELECT sd2.value
FROM sensor_data sd2
INNER JOIN sensor s2
ON s2.id = sd2.sensor_id
WHERE sd2.time = sd.time
AND s2.type = "temperature"
) AS temperature,
(
SELECT sd2.value
FROM sensor_data sd2
INNER JOIN sensor s2
ON s2.id = sd2.sensor_id
WHERE sd2.time = sd.time
AND s2.type = "ph"
) AS ph
FROM station st
INNER JOIN sensor s
ON st.sensors = s.id
INNER JOIN sensor_data sd
ON s.id = sd.sensor_id
WHERE st.location = ?
GROUP BY sd.time
Obviously, this works only if you know the list of types (temperature, ph) in advance, so that you can write a separate subquery for each of them. Each subquery must also return at most one row per time; otherwise MySQL fails with "Subquery returns more than 1 row".
If you don't want to build a separate subquery for each type, you can concatenate the values into a single subquery column.
SELECT
st.location,
sd.time,
(
SELECT
GROUP_CONCAT(
CONCAT(s2.type, ':', sd2.value)
SEPARATOR ','
)
FROM sensor_data sd2
INNER JOIN sensor s2
ON s2.id = sd2.sensor_id
WHERE sd2.time = sd.time
GROUP BY sd2.time
) AS sensors
FROM station st
INNER JOIN sensor s
ON st.sensors = s.id
INNER JOIN sensor_data sd
ON s.id = sd.sensor_id
WHERE st.location = ?
GROUP BY sd.time;
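On the Node side, the concatenated sensors column from the GROUP_CONCAT query above still has to be split back into fields. A minimal sketch, assuming the "type:value" format produced by that query and that type names contain neither ':' nor ','; parseSensors is a hypothetical helper name:

```javascript
// Hypothetical helper: turn "temperature:290,ph:7" into
// { temperature: 290, ph: 7 }.
function parseSensors(concatenated) {
  const result = {};
  if (!concatenated) return result; // NULL or empty column
  for (const pair of concatenated.split(",")) {
    const [type, value] = pair.split(":");
    result[type] = Number(value);
  }
  return result;
}

// Row shaped like the output of the GROUP_CONCAT query.
const sampleRow = {
  location: "city1",
  time: "2022-01-01 08:00:00",
  sensors: "temperature:290,ph:7",
};

const { sensors, ...rest } = sampleRow;
const merged = { ...rest, ...parseSensors(sensors) };
console.log(merged);
```

Keep in mind that GROUP_CONCAT output is truncated at group_concat_max_len (1024 bytes by default). MySQL 8 also provides JSON_ARRAYAGG and JSON_OBJECTAGG, which might remove the need for string parsing altogether.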
I want to display these fields: name, age, addresses_id, addresses_city, addresses_primary for each person in Data Studio.
My JSON data
{
"data": [
{
"name": "Lio",
"age": 30,
"addresses": [
{
"id": 7834,
"city": "ML",
"primary": 1
},
{
"id": 5034,
"city": "MM",
"primary": 1
}
]
},
{
"name": "Kali",
"age": 41,
"addresses": [
{
"id": 3334,
"city": "WK",
"primary": 1
},
{
"id": 1730,
"city": "DC",
"primary": 1
}
]
},
...
]
}
There is no problem if I don't render the addresses field:
return {
schema: requestedFields.build(),
rows: rows
};
//rows:
/*
"rows": [
{
"values": ["Lio", 30]
},
{
"values": ["Kali", 41]
},
...
]
*/
The problem is that I'm not able to model the nested JSON data in Google Data Studio; the trouble is specifically with the "addresses" field.
Could anyone tell me what format the rows should have in this case?
As you already know, each name in your dataset has more than one row (one person has multiple addresses). Data Studio accepts only a single value for each field, since arrays are not supported at all. So you need to work around this.
There are some ways to solve this, but always keep in mind that:
getSchema() should return all available fields for your connector (the order doesn't really matter, since Data Studio always sorts the available fields alphabetically)
getData() should return a list of values. Here the order is relevant: it must match the order of the fields requested in the call to getData() (which means the results should be dynamic: sometimes you'll return all values, sometimes not, and the order may change).
Solution 1: Return multiple rows per record
Since you can produce multiple rows for each name, just do it.
To achieve this, your field definition (=getSchema()) should include fields address_id, address_city and address_primary (you can also add address_order if you need to know the position of the address in the list).
Supposing getData() is called with all fields in the same order they were described, the rows array should look like this:
"rows": [
{
"values": ["Lio", 30, "7834", "ML", 1]
},
{
"values": ["Lio", 30, "5034", "MM", 1]
},
{
"values": ["Kali", 41, "3334", "WK", 1]
},
{
"values": ["Kali", 41, "1730", "DC", 1]
},
...
]
IMO, this is the best solution for your data.
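A rough sketch of how Solution 1 could build its rows while honoring the requested field order. The field ids and the extractors table are assumptions made for illustration; in a real connector the requested ids would come from the request passed to getData().

```javascript
// Sample input, copied from the question's JSON.
const people = [
  { name: "Lio", age: 30, addresses: [
    { id: 7834, city: "ML", primary: 1 },
    { id: 5034, city: "MM", primary: 1 },
  ] },
  { name: "Kali", age: 41, addresses: [
    { id: 3334, city: "WK", primary: 1 },
    { id: 1730, city: "DC", primary: 1 },
  ] },
];

// Hypothetical field ids mapped to value extractors.
// The id is stringified to mirror the rows shown above.
const extractors = {
  name: (person) => person.name,
  age: (person) => person.age,
  address_id: (person, address) => String(address.id),
  address_city: (person, address) => address.city,
  address_primary: (person, address) => address.primary,
};

// One output row per (person, address) pair; values follow the
// order of the requested field ids.
function buildRows(data, requestedIds) {
  return data.flatMap((person) =>
    person.addresses.map((address) => ({
      values: requestedIds.map((id) => extractors[id](person, address)),
    }))
  );
}

console.log(JSON.stringify(buildRows(people, ["name", "address_city"])));
```

Note that in this sketch a person with no addresses produces no rows at all; whether that is acceptable depends on your report.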
Solution 2: Return one address only, ignoring others
If you prefer one row per person, you can get one of the addresses and display only it (usually the main/primary address, or the first one).
To achieve this, your field definition (=getSchema()) should include fields address_id, address_city and address_primary.
Supposing getData() is called with all fields in the same order they were described, the rows array should look like this:
"rows": [
{
"values": ["Lio", 30, "7834", "ML", 1]
},
{
"values": ["Kali", 41, "3334", "WK", 1]
},
...
]
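For Solution 2, choosing which address to keep could look roughly like this. pickAddress is a hypothetical helper, and "primary" is assumed to be flagged with the value 1 as in the sample data:

```javascript
// Hypothetical helper: prefer an address flagged as primary,
// otherwise fall back to the first one; null if there are none.
function pickAddress(addresses) {
  if (!addresses || addresses.length === 0) return null;
  return addresses.find((a) => a.primary === 1) || addresses[0];
}

const lio = { name: "Lio", age: 30, addresses: [
  { id: 7834, city: "ML", primary: 1 },
  { id: 5034, city: "MM", primary: 1 },
] };

const chosen = pickAddress(lio.addresses);
const singleRow = {
  values: [lio.name, lio.age, String(chosen.id), chosen.city, chosen.primary],
};
console.log(JSON.stringify(singleRow));
```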
Solution 3: Return all addresses, serialized in a field
This is helpful if you really need all the information but do not want a complex schema.
Just create a field called addresses in your field definition (=getSchema()) and write the JSON there as a string (or any other format you want).
Supposing getData() is called with all fields in the same order they were described, the rows array should look like this:
"rows": [
{
"values": ["Lio", 30, "[{\"id\": 7834, \"city\": \"ML\", \"primary\": 1}, {\"id\": 5034, \"city\": \"MM\", \"primary\": 1}]"]
},
{
"values": ["Kali", 41, "[{\"id\": 3334, \"city\": \"WK\", \"primary\": 1}, {\"id\": 1730, \"city\": \"DC\", \"primary\": 1}]"]
},
...
]
This solution may appear senseless, but it is possible to work with this data later in Data Studio using regular expressions if really needed.
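Writing the escaped string by hand is error-prone, so in practice you would probably let JSON.stringify produce it. A small sketch, using a person object copied from the question's data:

```javascript
const kali = { name: "Kali", age: 41, addresses: [
  { id: 3334, city: "WK", primary: 1 },
  { id: 1730, city: "DC", primary: 1 },
] };

// Serialize the nested array into a single string value for the
// `addresses` field; JSON.stringify takes care of the quoting.
const serializedRow = {
  values: [kali.name, kali.age, JSON.stringify(kali.addresses)],
};
console.log(serializedRow);
```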
Solution 4: Create a different field for each address
If you're sure all records have at most a known number of addresses (in your example, both names have 2 addresses), you can create multiple fields.
Your field definition (=getSchema()) should include fields address_id1, address_city1, address_primary1, address_id2, ... address_primaryN.
I won't explain how the rows should look in this situation, but it is not hard to guess from the other examples.
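For completeness, here is one way the rows for Solution 4 might be built. MAX_ADDRESSES and widenRow are illustrative names, and padding missing addresses with null is an assumption about how you want gaps represented:

```javascript
const MAX_ADDRESSES = 2; // assumed upper bound on addresses per person

// Build one wide row: name, age, then id/city/primary for each slot,
// in the order address_id1, address_city1, address_primary1, address_id2, ...
function widenRow(person) {
  const values = [person.name, person.age];
  for (let i = 0; i < MAX_ADDRESSES; i++) {
    const address = person.addresses[i];
    // Pad with nulls when a person has fewer addresses than the maximum.
    values.push(
      address ? String(address.id) : null,
      address ? address.city : null,
      address ? address.primary : null
    );
  }
  return { values };
}

const wide = widenRow({
  name: "Lio",
  age: 30,
  addresses: [{ id: 7834, city: "ML", primary: 1 }],
});
console.log(JSON.stringify(wide));
```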
The keys and values are separated in the JSON object that I get from an API call, and I have not been able to find a solution. It looks like the following:
{
"range": "'1'!A1:AM243",
"majorDimension": "ROWS",
"values":
[
"DeptID",
"DeptDescr",
"VP Area",
"VP Descr",
"HR Category",
"Employee Relations1",
"ER1Title",
"ER1Phone",
"ER1Email",
"Employee Relations2",
"ER2Title",
"ER2Phone",
"ER2Email",
"Compensation1",
"Comp1Title",
"Comp1Phone",
"Comp1Email",
"Compensation2",
"Comp2Title",
"Comp2Phone",
"Comp2Email",
"Employment1",
"E1Title",
"E1Phone",
"E1Email",
"Employment2",
"E2Title",
"E2Phone",
"E2Email",
"Employee Pay Services1",
"EPS1Title",
"EPS1Phone",
"EPS1Email",
"Employee Pay Services2",
"EPS2Title",
"EPS2Phone",
"EPS2Email"
],
[
"20734",
"Academic Success Centers",
"VES",
"VP Enroll Mgmt & Student Aff",
"Administrative",
"Brian Schmidt",
" Employee Relations Consultant",
"(928)523-6139",
"Brian.Schmidt@nau.edu",
"Marcia Warden",
"Assistant Director, Employee Relations",
"(928)523-9624",
"Marcia.Warden@nau.edu",
"Nicole Christian",
"Employment & Compensation Analyst",
"(928)523-6127",
" Nicole.Christian@nau.edu",
"Cathy Speirs",
"Associate Director",
"(928)523-6136",
"Cathy.Speirs@nau.edu",
"Nicole Christian",
"Employment & Compensation Analyst",
"(928)523-6127",
" Nicole.Christian@nau.edu",
"Cathy Speirs",
"Associate Director",
"(928)523-6136",
"Cathy.Speirs@nau.edu",
"Katherine Kurpierz",
"Payroll Specialist",
"(928)523-6129",
"Katherine.Kurpierz@nau.edu",
"Cheryl Brothers",
"Assistant Director - HR Payroll Services",
"(928)523-6085",
"Cheryl.Brothers@nau.edu"
], etc.
But I need it to look like:
[
{
"DeptID": 20734,
"DeptDescr": "Academic Success Centers",
"VP Area": "VES",
"VP Descr": "VP Enroll Mgmt & Student Aff",
"HR Category": "Administrative",
"Employee Relations1": "Brian Schmidt",
"Employee Relations2": "Marcia Warden",
"Compensation1": "Nicole Christian",
"Compensation2": "Cathy Speirs",
"Employment1": "Nicole Christian",
"Employment2": "Cathy Speirs",
"Employee Pay Services1": "Katherine Kurpierz",
"Employee Pay Services2": "Cheryl Brothers"
},etc
I am trying to use the data to populate a drop down using javascript and ajax. Any help is really appreciated.
The object your API returns is not valid JSON. Was that API made by you, or can you get it fixed somehow?
There are two things you could do to make it work:
-One is to change it to return exactly what you want;
-Two is to fix what it returns so that it is valid JSON.
To see what is wrong with the data you posted, let's remove the contents of the arrays so the problem is easier to spot.
Your original data looks roughly like this:
{ "range": "'1'!A1:AM243",
"majorDimension": "ROWS",
"values": [],[]
}
To be valid you would need it to look like this:
{ "range": "'1'!A1:AM243",
"majorDimension": "ROWS",
"values": {
"keys": [],
"data": []
}
}
Notice that I wrapped the two arrays of "values" with { } because it has to be an object if you want it to contain two arrays in it.
Then I gave each array a key with which you can call them. With that you'd be able to get what you want from your "values", so that for each item in the "keys" array you have something in that "data" array.
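With that structure, pairing each key with its value is straightforward. The sketch below assumes the keys/data layout proposed above, with a single data row and shortened sample values:

```javascript
// Assumed shape: the fixed structure, with a single data row.
const response = {
  range: "'1'!A1:AM243",
  majorDimension: "ROWS",
  values: {
    keys: ["DeptID", "DeptDescr", "VP Area"],
    data: ["20734", "Academic Success Centers", "VES"],
  },
};

// Pair each key with the value at the same index.
const record = Object.fromEntries(
  response.values.keys.map((key, i) => [key, response.values.data[i]])
);
console.log(record);
```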
Hope this helps.
Well let's have a look;
Suppose this is a short version of the response data you got:
var res = `
{
"range": "'1'!A1:AM243",
"majorDimension": "ROWS",
"values": [
"DeptID",
"DeptDescr",
"VP Area"
],
[
"20734",
"Academic Success Centers",
"VES"
],
[
"345543",
"Academic Fails Centers",
"OK"
]
}
`;
Judging by the first array, this looks like a dump from a spreadsheet of sorts, and someone probably scripted a way to export the data in a JSON-ish format. The arrays under "values" are the rows of this "spreadsheet".
We will clean it up and extract only the chunks that look like ["value", "another value", "etc"]
// clean tabs and returns
res = res.replace(/\t/g, '').replace(/\n/g, '');
// get the array-ish chunks
var rows = res.match(/\[(((["'])(?:(?=(\\?))\4.)*?\3),*)+\]/gm);
Now let's make them real arrays:
var data = rows.map(function (row) {
return JSON.parse(row);
});
Now we have an array of arrays of strings, that is, an array of "rows", each containing the values of its "cells". The first one looks like the header row (the one with the names of the fields).
Let's make objects from each row of data except the first one. The first row serves as the keys: we match the position (index) of each value in data[n] with the value at the same position in data[0] to get a key-value pair.
// Here we will define an object to store data
var data_object = { values: [] };
// for each row except the first
for(var i = 1; i < data.length; i++) {
var my_data = {};
//for each element of this row
for(var j = 0; j < data[i].length; j++) {
my_data[data[0][j]] = data[i][j];
}
data_object.values.push(my_data);
}
We have our object, let's suppose you need it in JSON format now:
var json_data = JSON.stringify(data_object);
// let's look what we have here
console.log('json_data:', json_data);
The result looks something like this:
json_data: {"values":[{"DeptID":"20734","DeptDescr":"Academic Success Centers","VP Area":"VES"},{"DeptID":"345543","DeptDescr":"Academic Fails Centers","VP Area":"OK"}]}
NOW A WARNING:
This is what you DON'T want to do if you can fix the API you are getting this data from first. If any inconsistency appears, things will break. In this example I'm not handling any edge cases or exceptions, checking array bounds, or wrapping anything in try-catch blocks.
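If the API can be fixed so that "values" is a proper array of arrays (header row first), the regex step is unnecessary and the conversion shrinks to a few lines. The payload below is an assumed, shortened version of such a fixed response:

```javascript
// Assumed valid payload: `values` is an array of rows, header row first.
const payload = JSON.parse(`{
  "range": "'1'!A1:AM243",
  "majorDimension": "ROWS",
  "values": [
    ["DeptID", "DeptDescr", "VP Area"],
    ["20734", "Academic Success Centers", "VES"],
    ["345543", "Academic Fails Centers", "OK"]
  ]
}`);

// Split off the header row, then build one object per data row.
const [header, ...dataRows] = payload.values;
const departments = dataRows.map((cells) =>
  Object.fromEntries(header.map((key, i) => [key, cells[i]]))
);
console.log(departments);
```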
So this is how my database looks right now:
{
"_id": "r8uoPSvJY36nHgCK9",
"name": "Running",
"category": "leisure",
"duration": "2",
"createdAt": "2/10 16:42:17",
"skills": {
"creativity": 6,
"analytics": 3,
"fitness": 7,
"research": 4,
"communication": 4,
"problemSolving": 3,
"timeManagement": 7,
"leadership": 3,
"selfMotivation": 3,
"teamwork": 4
},
"started": "false",
"finished": "false"
}
How can I query the collection so that I get the value of the creativity field and store it in a variable? I have tried something like:
tasks.find({'skills.creativity': this._id});
but it doesn't appear to work.
.find() returns a cursor - i.e. all the matching records, not just the key that you're looking for.
If you're looking for the creativity field of a single record, and assuming you are trying to find the document by its _id and not by the value of the skills.creativity field (which, as @Styx points out, would be silly), then:
const creativity = tasks.findOne(this._id).skills.creativity;
If you're looking to get the creativity field for all the matching records then:
const creativityArray = tasks.find(query).map((e) => e.skills.creativity);
where query defines the set of documents you're looking for.
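One caveat: if no document matches, findOne has nothing to return and reading .skills on the result would throw. A defensive variant could look like the sketch below; getCreativity is a hypothetical helper, and plain objects stand in for the documents the collection would return:

```javascript
// Hypothetical helper: read skills.creativity from a task document,
// giving undefined instead of throwing when the document or the
// nested object is missing.
function getCreativity(task) {
  return task?.skills?.creativity;
}

// Plain object standing in for a document returned by tasks.findOne(...).
const sampleTask = {
  _id: "r8uoPSvJY36nHgCK9",
  name: "Running",
  skills: { creativity: 6, analytics: 3 },
};

console.log(getCreativity(sampleTask));
console.log(getCreativity(undefined));
```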
I am trying to write a query that finds the MAX value across all documents. The scenario: I have 100 student documents, each containing the student name, the roll number, and arrays of tests, where each test is an array of subjects with their respective marks. I can get the highest mark for the subject physics across all documents, but I cannot get it together with the student's roll number. That is what I am trying to work out.
TestDoc is:
Student[
StudenName:"A",
StudentRollNo :1,
id:"1",
StudentAdd:"---",
Test1:[
{
SubName:"S1",
Marks:20
},
{
SubName:"S2",
Marks:30
},
...
],
Test2:
[
Same as above
],
],
[
STUDENT2
] ,
and so on
Query I am using is:
select MAX(s.Marks) from c join test in c.Test1 join s in test.marks
According to your description, you want to implement something like GROUP BY in an Azure Cosmos DB query.
In my experience, at the time of writing, the aggregation capability of the Azure Cosmos DB SQL API is limited to the COUNT, SUM, MIN, MAX and AVG functions; GROUP BY and other aggregation functionality are not supported.
However, stored procedures or UDFs can be used to implement your aggregation requirement.
You could refer to documentdb-lumenize, a great package based on DocumentDB stored procedures.
For the first scenario in your post, I created two student documents in my Azure Cosmos DB account.
[
{
"id": "1",
"StudenName": "A",
"StudentRollNo": 1,
"Test": [
{
"SubName": "S1",
"Marks": 20
},
{
"SubName": "S2",
"Marks": 30
}
]
},
{
"id": "2",
"StudenName": "B",
"StudentRollNo": 2,
"Test": [
{
"SubName": "S1",
"Marks": 10
},
{
"SubName": "S2",
"Marks": 40
}
]
}
]
Then I fed the result set of the SQL query below into documentdb-lumenize, mentioned above, to get the max S2 mark.
SELECT c.StudentRollNo,test1.Marks as mark FROM c
join test1 in c.Test
where test1.SubName='S2'
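If the result set is small enough to process on the client, the highest mark and its roll number can also be picked with a plain reduce. This is an alternative sketch, not the documentdb-lumenize approach; the rows below are what the query above returns for the two sample documents:

```javascript
// Rows shaped like the result of the S2 query for the sample documents.
const resultSet = [
  { StudentRollNo: 1, mark: 30 },
  { StudentRollNo: 2, mark: 40 },
];

// Keep the row with the highest mark; null for an empty result set.
const top = resultSet.reduce(
  (best, row) => (best === null || row.mark > best.mark ? row : best),
  null
);
console.log(top);
```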
For the second scenario in your comment, I removed the where clause of the SQL above.
SELECT c.StudentRollNo,test1.Marks as mark FROM c
join test1 in c.Test
The result set then contains one row per student per test entry.
This applies only to one test. If you want to query multiple tests, you could use a stored procedure.
You could also refer to the SO threads below:
1. Azure DocumentDB - Group By Aggregates
2. Grouping by a field in DocumentDB