I am using PostgreSQL 9.3 in my Node.js application. My database currently holds about 7 lakh (700,000) records, and one of its columns has the json data type.
My query is as follows:
EXPLAIN ANALYSE select id_0, name_0, id_1, name_1, id_2, name_2, id_3, name_3, id_4, name_4, latitude, longitude, to_char(collecteddate, 'dd/mm/yyyy') as collecteddate, key, value->>'xxxxx' as value from table where
CAST(value->'xxxxx'->> 'aaaaa' as INTEGER)BETWEEN 1 and 43722 and value->'PCA_2011'->> 'aaaaa' NOT LIKE ' ' and
CAST(value->'xxxxx'->> 'bbbbb' as INTEGER)BETWEEN 1 and 100 and value->'xxxx'->> 'bbbbb' NOT LIKE ' '
and leveltype = 'nnnn' and id_1= 'ww' and id_0 = 'uuu' and collecteddate = '2011-03-31';
This query retrieves almost 1 lakh (100,000) records and takes about 3 seconds to execute. I have created an index on the json column and on the columns used in the WHERE conditions, but I think that is still a very long time. Is there any way to reduce the execution time? I am new to database optimization; are there any techniques that would bring the execution time down to a few milliseconds? Thanks in advance.
EDIT:
My index definition:
CREATE INDEX index_pop on table (id_0, id_1, collecteddate, leveltype, key, (value->'xxxxx'->>'aaaa'));
My Explain analyses result:
"Bitmap Heap Scan on table (cost=1708.27..59956.46 rows=1 width=132) (actual time=880.576..5137.266 rows=93615 loops=1)"
" Recheck Cond: (((id_0)::text = '356'::text) AND ((id_1)::text = '9'::text) AND (collecteddate = '2011-03-31'::date) AND ((leveltype)::text = 'pppp'::text))"
" Filter: ((((value -> 'xxxx'::text) ->> 'aaaa'::text) !~~ ' '::text) AND (((value -> 'xxxxx'::text) ->> 'bbbb'::text) !~~ ' '::text) AND ((((value -> 'xxxxx'::text) ->> 'aaaaa'::text))::integer >= 1) AND ((((value -> 'PCA (...)"
" Rows Removed by Filter: 4199"
" -> Bitmap Index Scan on index_name (cost=0.00..1708.27 rows=37856 width=0) (actual time=828.856..828.856 rows=97814 loops=1)"
" Index Cond: (((id_0)::text = '356'::text) AND ((id_1)::text = '9'::text) AND (collecteddate = '2011-03-31'::date) AND ((leveltype)::text = 'ppppp'::text))"
"Total runtime: 5211.271 ms"
Also, one more thing: the index used by the Bitmap Index Scan (index_name) is a different index from the one I created for my WHERE conditions. And why is only one index searched?
I have a dynamic string that is generated like one of the following:
var q = "FROM Table SELECT avg(1), avg(2), avg(3) where x='y'";
var q = "SELECT avg(1), avg(2), avg(3) FROM Table where z='x' since x days ago";
The values after the select are also dynamic where there could be 1 select option, or 10. I'm trying to create some logic to always pluck whatever is selected into an array, but having trouble dealing with the dynamic nature (string being constructed dynamically AND the # of selects being dynamic).
Basically, end result something like this:
['avg(1)', 'avg(2)', 'avg(3)']
Currently I'm doing something like the following, but it always expects the string to be formatted in a certain order (always starting with SELECT and where after the fields to pluck):
let splitQ = q.match(".*SELECT(.*)where");
let selects = splitQ[1].trim().split(",");
Here is a working solution.
It makes these assumptions about the query (keywords are matched case-insensitively):
the values come after the first instance of the word 'select '
if the query starts with 'from', values end before the first instance of ' where'
if the query starts with 'select', values end before the first instance of ' from'
const test1 = "FROM Table SELECT avg(1), avg(2), avg(3) where x='y'";
const test2 = "SELECT avg(1), avg(2), avg(3) FROM Table where z='x' since x days ago";
function extractValues(query) {
  // in both scenarios, the values always come directly after 'select '
  const valuesComeAfterMe = 'select ';
  // search a lowercased copy so the returned values keep their original case
  const lower = query.toLowerCase();
  let valuesEndBeforeMe;
  // conditionally handle both query syntaxes
  if (lower.startsWith('from')) {
    valuesEndBeforeMe = ' where';
  } else if (lower.startsWith('select')) {
    valuesEndBeforeMe = ' from';
  } else {
    throw new Error('query not handled');
  }
  // find where the value list starts and ends
  const start = lower.indexOf(valuesComeAfterMe) + valuesComeAfterMe.length;
  const end = lower.indexOf(valuesEndBeforeMe, start);
  // split values and trim whitespace
  return query.slice(start, end).split(',').map(item => item.trim());
}
console.log(extractValues(test1));
console.log(extractValues(test2));
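A slightly more general sketch, not part of the answer above: it finds the field list with one case-insensitive regular expression and splits only on commas that sit outside parentheses, so a field such as percentile(duration, 95) stays in one piece. The names splitTopLevel and extractSelects and the percentile query are made up for illustration, and it assumes the words "from" and "where" never appear inside the field list itself.

```javascript
// Hypothetical helper: split a field list on commas that are not inside parentheses.
function splitTopLevel(list) {
  const parts = [];
  let depth = 0;
  let current = '';
  for (const ch of list) {
    if (ch === '(') depth++;
    if (ch === ')') depth--;
    if (ch === ',' && depth === 0) {
      parts.push(current.trim());
      current = '';
    } else {
      current += ch;
    }
  }
  if (current.trim() !== '') parts.push(current.trim());
  return parts;
}

// Hypothetical helper: the fields run from SELECT up to the next FROM or WHERE
// keyword, or to the end of the string if neither follows.
function extractSelects(query) {
  const match = query.match(/\bselect\s+(.*?)(?=\s+(?:from|where)\b|$)/i);
  return match ? splitTopLevel(match[1]) : [];
}

console.log(extractSelects("FROM Table SELECT avg(1), avg(2), avg(3) where x='y'"));
console.log(extractSelects("SELECT percentile(duration, 95), count(*) FROM Table"));
```

Because the keywords are matched with the i flag instead of lowercasing the whole string, the returned fields keep their original case.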
How can I access columns from the SELECT in my WHERE clause? I'm probably missing quotes. For context, this is in a controller in Strapi CMS, which runs on a Node.js server.
Problem:
The error occurs at the AND condition (mainly the first st_geomfromtext line):
const rawBuilder = strapi.connections.default.raw(
`
SELECT
locations.id as Location_ID,
locations.Title as Location_Title,
locations.Latitude as Location_Latitude,
locations.Longitude as Location_Longitude,
things.id,
things.Title,
things.Location
FROM locations
RIGHT JOIN things
ON locations.id = things.Location
WHERE things.Style = ` + ctx.query['Style.id'] + `
AND round(st_distance_sphere(
st_geomfromtext(CONCAT('POINT(',locations.Longitude, ' ', locations.Latitude,')')),
st_geomfromtext(CONCAT('POINT(` + ctx.query.Longitude + ` ` + ctx.query.Latitude + `)'))
)) <= ` + 5000
)
Test works:
Just for fun, same as above, but just passed request variables for both st_geomfromtext lines, and the response works; no SQL error:
AND round(st_distance_sphere(
st_geomfromtext(CONCAT('POINT(` + ctx.query.Longitude1 + ` ` + ctx.query.Latitude1 + `)')),
st_geomfromtext(CONCAT('POINT(` + ctx.query.Longitude2 + ` ` + ctx.query.Latitude2 + `)'))
)) <= ` + 5000
So as far as I can tell, the first st_geomfromtext line is the culprit, however it (the 1st line) works fine in a Go server... another clue that this is just a syntax problem.
Below is a working example in SQL Server that should help you resolve this.
Please try these steps:
Remove the "AND" statement from your where clause and save it somewhere
Add some filter criteria that will give you just a few known locations
Add new output fields in your select criteria for each function so you will know what you are comparing.
Select CONCAT('POINT(',locations.Longitude, ' ', locations.Latitude,')') from locations
Select st_geomfromtext(CONCAT('POINT(',locations.Longitude, ' ', locations.Latitude,')')) from locations
Note: the output of the geo functions will probably look cryptic, like 0xE6100000010C75931804564253C042CF66D5E7724340
Once the values line up the way you expect then add a new version of the where clause with the adjustments you have made.
Check the units that the st_distance_sphere function returns. In SQL Server this defaults to meters.
Example in SQL Server
CREATE TABLE #locations (id INT, Title VARCHAR(50), Latitude DECIMAL(10,4), Longitude DECIMAL(10,4))
CREATE TABLE #things (id INT, Title VARCHAR(50), LocationId INT)
INSERT INTO #locations (id, Title, Latitude, Longitude) Values (1,'WH', 38.8977, -77.0365)
INSERT INTO #locations (id, Title, Latitude, Longitude) Values (2,'CB', 38.8899, -77.0091)
INSERT INTO #things (id, Title, LocationId) Values (100,'White House',1)
INSERT INTO #things (id, Title, LocationId) Values (101,'United States Capitol',2)
--My Location at the Washington Monument
DECLARE @myLat DECIMAL(10,4) = 38.8895;
DECLARE @myLong DECIMAL(10,4) = -77.0353;
SELECT
    loc.id as Location_ID,
    loc.Title as Location_Title,
    loc.Latitude as Location_Latitude,
    loc.Longitude as Location_Longitude,
    th.id,
    th.Title,
    th.LocationId,
    -- WKT points are written as POINT(longitude latitude), so both points use the same order
    geometry::STGeomFromText(CONCAT('POINT(',loc.Longitude, ' ', loc.Latitude,')'),4326) as ItemPoint,
    geometry::STGeomFromText(CONCAT('POINT(',@myLong,' ',@myLat,')'),4326) as MyPoint,
    geometry::STGeomFromText(CONCAT('POINT(',loc.Longitude, ' ', loc.Latitude,')'),4326).STDistance(geometry::STGeomFromText(CONCAT('POINT(',@myLong,' ',@myLat,')'),4326)) as Distance
FROM #locations loc
RIGHT JOIN #things th ON loc.id = th.LocationId
DROP TABLE #locations
DROP TABLE #things
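Separately from the debugging steps above, and without claiming it is the cause of the error: the original controller concatenates request values straight into the SQL text. A minimal sketch of passing them as bindings instead is shown here. It assumes that strapi.connections.default behaves like a Knex instance whose raw(sql, bindings) accepts ? placeholders, that the database is MySQL 5.7 or later (where POINT(x, y) and ST_Distance_Sphere are available), and that all four inputs are numeric. The helper name buildNearbyThingsQuery is made up for illustration.

```javascript
// Hypothetical helper: keeps the SQL text and the user-supplied values separate.
function buildNearbyThingsQuery(styleId, longitude, latitude, maxMeters) {
  const sql = `
    SELECT
      locations.id AS Location_ID,
      locations.Title AS Location_Title,
      locations.Latitude AS Location_Latitude,
      locations.Longitude AS Location_Longitude,
      things.id,
      things.Title,
      things.Location
    FROM locations
    RIGHT JOIN things ON locations.id = things.Location
    WHERE things.Style = ?
      AND ROUND(ST_Distance_Sphere(
        POINT(locations.Longitude, locations.Latitude),
        POINT(?, ?)
      )) <= ?`;
  // Query-string values arrive as strings; convert them and reject anything non-numeric.
  const bindings = [styleId, longitude, latitude, maxMeters].map(Number);
  if (bindings.some(Number.isNaN)) {
    throw new Error('all query parameters must be numeric');
  }
  return { sql, bindings };
}

// Example with literal values standing in for ctx.query['Style.id'],
// ctx.query.Longitude and ctx.query.Latitude:
const { sql, bindings } = buildNearbyThingsQuery('3', '-77.0353', '38.8895', 5000);
console.log(bindings);
// In the controller this would then be passed on as (assumed API):
//   strapi.connections.default.raw(sql, bindings)
```

Keeping the values out of the SQL string also makes it easier to log the exact statement and bindings when comparing against the version that works elsewhere.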
I am writing a script to take a stock number, loop through existing stock numbers until a match is NOT found, then assign that unique stock number to the record. My problem is that the usual data[i][2] doesn't seem to reference a 'query' the same way that Apps Script would reference an array.
Fair warning: I'm trying to expand my Apps Script skills into broader JavaScript, so there's a good chance I'm doing it all wrong. I'm all ears if you tell me I'm doing this incorrectly!
Using the log: data[i][2] gives me 'undefined' whereas data[2] gives me all fields of the third item in my query. Based on this I feel like I just need to learn how to reference it properly.
//Querying my datasource as 'var data'
var query = app.models.UsedVehicles.newQuery();
query.filters.ParentDealType._contains = prefix;
var data = query.run();
//Returns four records which is correct.
var testStockNo = prefix+month+countstring+year;
console.log("Test Stock Number " + j + ": " + testStockNo);
for (i = 0; i < data.length; i++){
console.log("data[i][2]: " + data[i][2]); //results: undefined
console.log("data[2]: " + data[2]); //results: all fields of 3rd query result.
if(data[i][2] === testStockNo){
k++;
break;
}else{
console.log("No Match");
}
}
Even if testStockNo equals the value in field:TStockNo, the log displays:
Test Stock Number 1: C1200118
data[i][2]: undefined
data[2]: Record : { TIndex: 8, TVin8: HS654987, TStockNo: null,
TParentStkNo: GSD6578, TYear: 2010, TMake: NISSAN, TModel: PICKUP,
TMileage: 24356, ParentDealType: C}
No Match
Issue/Solution:
query.run() returns an array of records, NOT an array of arrays (2D). You should access a record's value by its key (field name) instead of by an index.
Snippets:
console.log("data[i][TStockNo]: " + data[i]['TStockNo']);
console.log("data[i].TStockNo: " + data[i].TStockNo);
console.log("data[2]: " + data[2]);
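As an illustration of the point above, here is a small plain-JavaScript sketch of the duplicate check written against field names. The records array is mock data standing in for the result of query.run(), and stockNumberExists is a made-up helper name; it is not App Maker code.

```javascript
// Mock records standing in for what query.run() returns (hypothetical data).
const records = [
  { TIndex: 6, TStockNo: 'C1200118', ParentDealType: 'C' },
  { TIndex: 7, TStockNo: 'C1200218', ParentDealType: 'C' },
  { TIndex: 8, TStockNo: null, ParentDealType: 'C' }
];

// Returns true when any record already uses the candidate stock number.
function stockNumberExists(list, candidate) {
  return list.some(function (record) {
    return record.TStockNo === candidate;
  });
}

console.log(records[0][2]);        // undefined: a record has no field called "2"
console.log(records[0].TStockNo);  // C1200118
console.log(stockNumberExists(records, 'C1200118')); // true
console.log(stockNumberExists(records, 'C1200318')); // false
```

With a helper like this, the search for an unused stock number becomes a loop that keeps generating candidates while stockNumberExists returns true.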
References:
Query#Run
I have a dhcp lease file with the following example entries:
lease 172.16.20.11 {
starts 4 2014/10/09 18:33:57;
ends 4 2014/10/09 18:43:57;
cltt 4 2014/10/09 18:33:57;
binding state active;
next binding state free;
rewind binding state free;
hardware ethernet XX:XX:XX:XX:XX:XX;
client-hostname "phone";
}
I am trying to find a way to convert this information into JSON so I can use it in Dojo.
I would like the output to be like
{"leases": [{"address":"172.16.20.11", "starts":"2014/10/09 18:33:57", "ends":"2014/10/09 18:43:57", "client-hostname":"phone"}]}
Is there a way to do this?
Thanks,
Tim T
var str = 'lease 172.16.20.11 { starts 4 2014/10/09 18:33:57; ends 4 2014/10/09 18:43:57; cltt 4 2014/10/09 18:33:57; binding state active; next binding state free; rewind binding state free; hardware ethernet XX:XX:XX:XX:XX:XX; client-hostname "phone"; }';
var res = str.split(/[\s;]+/); // regex match spaces and semicolons
// Create your leases array with a lease object from the parsed string
var leases = {leases:[{
address: res[1],
starts: res[5] + " " + res[6],
ends: res[9] + " " + res[10],
client_hostname: res[30].split('"')[1]
}]};
var json = JSON.stringify(leases); //convert the array of leases to json string
[EDIT] client-hostname is written as client_hostname because an unquoted object key cannot contain a hyphen (a quoted key, 'client-hostname', would also work)
[EDIT] changed leases to be an object with an array property to more closely match your desired output
[EDIT] parsed phone from "phone" for client_hostname
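The answer above reads fixed token positions from a single lease. A rough sketch of a version that walks every lease block in a file's text follows. It assumes each block has the shape shown in the question, that no value contains a semicolon or a closing brace, and that starts and ends always carry a weekday number followed by a date and a time. The function name parseLeases is made up for illustration.

```javascript
// Hypothetical helper: parse every "lease <address> { ... }" block in the text.
function parseLeases(text) {
  const leases = [];
  const blockPattern = /lease\s+(\S+)\s*\{([^}]*)\}/g;
  let block;
  while ((block = blockPattern.exec(text)) !== null) {
    const lease = { address: block[1] };
    // Statements inside a block are separated by semicolons.
    for (const statement of block[2].split(';')) {
      const words = statement.trim().split(/\s+/);
      if (words[0] === 'starts' || words[0] === 'ends') {
        // Shape: starts <weekday> <date> <time>; keep the date and time only.
        lease[words[0]] = words.slice(2).join(' ');
      } else if (words[0] === 'client-hostname') {
        // Drop the surrounding double quotes.
        lease['client-hostname'] = words.slice(1).join(' ').replace(/^"|"$/g, '');
      }
    }
    leases.push(lease);
  }
  return { leases: leases };
}

const sample = [
  'lease 172.16.20.11 {',
  '  starts 4 2014/10/09 18:33:57;',
  '  ends 4 2014/10/09 18:43:57;',
  '  binding state active;',
  '  hardware ethernet XX:XX:XX:XX:XX:XX;',
  '  client-hostname "phone";',
  '}'
].join('\n');

console.log(JSON.stringify(parseLeases(sample)));
```

Quoting the key as 'client-hostname' keeps the hyphenated name from the desired output.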
I have been handed a project at work where I need to find duplicate pairings from multiple rows within a dataset. While the data set is much larger, the main portion revolves around the date of a training, the location of a training, and the names of the trainers. So every row of data has a date, a location, and then a comma separated list of names:
Date        Location        Names
1/13/2014   Seattle         A, B, D
1/16/2014   Dallas          C, D, E
1/20/2014   New York        A, D
1/23/2014   Dallas          C, E
1/27/2014   Seattle         B, D
1/30/2014   Houston         C, A, F
2/3/2014    Washington DC   D, A, F
2/6/2014    Phoenix         B, E
2/10/2014   Seattle         C, B
2/13/2014   Miami           A, B, E
2/17/2014   Miami           C, D
2/20/2014   New York        B, E, F
2/24/2014   Houston         A, B, F
My goal is to be able to find rows with similar pairings of names. One example would be to know that A & B were paired in Seattle on 1/13, Miami on 2/13, and Houston on 2/24, even though the third name is different in each occurrence. So instead of simply finding duplicates among the entire string of names, I would also like to find pairings among partial segments of the “Names” column.
Is this possible to do within Excel or would I need to use a programming language to accomplish the task?
While I can manually do this, it represents a lot of time that could be used towards other things. If there was a way that I could automate this it would make this portion of my task a lot simpler.
Thank you in advance for any assistance or advice on a way forward.
You can do it with VBA. The solution below assumes:
Your data is on the active sheet in columns A:C
Your results will be output in columns E:G
The output will be a list sorted by pairs, and then by dates, so you can easily see where pairs repeated.
The routine assumes no more than three trainers at a time, but could be modified to add more possible combinations.
Cities with just a single trainer will be ignored.
The routine uses a Class module to gather the information, and two Collections to process the data. It also relies on the fact that a Collection will not allow two items to be added with the same key.
Class Module
Rename the Class Module: cPairs
Option Explicit
Private pTrainer1 As String
Private pTrainer2 As String
Private pCity As String
Private pDT As Date
Public Property Get Trainer1() As String
Trainer1 = pTrainer1
End Property
Public Property Let Trainer1(Value As String)
pTrainer1 = Value
End Property
Public Property Get Trainer2() As String
Trainer2 = pTrainer2
End Property
Public Property Let Trainer2(Value As String)
pTrainer2 = Value
End Property
Public Property Get City() As String
City = pCity
End Property
Public Property Let City(Value As String)
pCity = Value
End Property
Public Property Get DT() As Date
DT = pDT
End Property
Public Property Let DT(Value As Date)
pDT = Value
End Property
Regular Module
Option Explicit
Option Compare Text
Public cP As cPairs, colP As Collection
Public colCityPairs As Collection
Public vSrc As Variant
Public vRes() As Variant
Public rRes As Range
Public I As Long, J As Long
Public V As Variant
Public sKey As String
Sub FindPairs()
vSrc = Range("A1", Cells(Rows.Count, "C").End(xlUp))
Set colP = New Collection
Set colCityPairs = New Collection
'Collect Pairs
For I = 2 To UBound(vSrc)
V = Split(Replace(vSrc(I, 3), " ", ""), ",")
If UBound(V) >= 1 Then
'sort the pairs
SingleBubbleSort V
Select Case UBound(V)
Case 1
AddPairs V(0), V(1)
Case 2
AddPairs V(0), V(1)
AddPairs V(0), V(2)
AddPairs V(1), V(2)
End Select
End If
Next I
ReDim vRes(0 To colCityPairs.Count, 1 To 3)
vRes(0, 1) = "Date"
vRes(0, 2) = "Location"
vRes(0, 3) = "Pairs"
For I = 1 To colCityPairs.Count
With colCityPairs(I)
vRes(I, 1) = .DT
vRes(I, 2) = .City
vRes(I, 3) = .Trainer1 & ", " & .Trainer2
End With
Next I
Set rRes = Range("E1").Resize(UBound(vRes, 1) + 1, UBound(vRes, 2))
With rRes
.EntireColumn.Clear
.Value = vRes
With .Rows(1)
.HorizontalAlignment = xlCenter
.Font.Bold = True
End With
.Sort key1:=.Columns(3), order1:=xlAscending, key2:=.Columns(1), order2:=xlAscending, _
Header:=xlYes
.EntireColumn.AutoFit
V = VBA.Array(vbYellow, vbGreen)
J = 0
For I = 2 To rRes.Rows.Count
If rRes(I, 3) = rRes(I - 1, 3) Then
.Rows(I).Interior.Color = .Rows(I - 1).Interior.Color
Else
J = J + 1
.Rows(I).Interior.Color = V(J Mod 2)
End If
Next I
End With
End Sub
Sub AddPairs(T1, T2)
Set cP = New cPairs
With cP
.Trainer1 = T1
.Trainer2 = T2
.City = vSrc(I, 2)
.DT = vSrc(I, 1)
sKey = .Trainer1 & "|" & .Trainer2
On Error Resume Next
colP.Add cP, sKey
If Err.Number = 457 Then
Err.Clear
colCityPairs.Add colP(sKey), sKey & "|" & colP(sKey).DT & "|" & colP(sKey).City
colCityPairs.Add cP, sKey & "|" & .DT & "|" & .City
Else
If Err.Number <> 0 Then Stop
End If
On Error GoTo 0
End With
End Sub
Sub SingleBubbleSort(TempArray As Variant)
'copied directly from support.microsoft.com
Dim Temp As Variant
Dim I As Integer
Dim NoExchanges As Integer
' Loop until no more "exchanges" are made.
Do
NoExchanges = True
' Loop through each element in the array.
For I = LBound(TempArray) To UBound(TempArray) - 1
' If the element is greater than the element
' following it, exchange the two elements.
If TempArray(I) > TempArray(I + 1) Then
NoExchanges = False
Temp = TempArray(I)
TempArray(I) = TempArray(I + 1)
TempArray(I + 1) = Temp
End If
Next I
Loop While Not (NoExchanges)
End Sub
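For comparison, the same pairing idea can be sketched outside Excel. The JavaScript below is not a translation of the VBA above; it is a minimal illustration that generates every pair with two nested loops, so it is not limited to three trainers per row. The rows are a few lines copied from the question, and findRepeatedPairs is a made-up name.

```javascript
// A few sample rows from the question: [date, location, comma-separated names].
const rows = [
  ['1/13/2014', 'Seattle', 'A, B, D'],
  ['1/23/2014', 'Dallas', 'C, E'],
  ['2/13/2014', 'Miami', 'A, B, E'],
  ['2/24/2014', 'Houston', 'A, B, F']
];

// Map each sorted pair ("A, B") to the trainings where both names appear,
// then keep only the pairs seen at more than one training.
function findRepeatedPairs(trainings) {
  const byPair = new Map();
  for (const [date, location, names] of trainings) {
    const trainers = names.split(',').map(n => n.trim()).filter(Boolean).sort();
    for (let i = 0; i < trainers.length; i++) {
      for (let j = i + 1; j < trainers.length; j++) {
        const key = trainers[i] + ', ' + trainers[j];
        if (!byPair.has(key)) byPair.set(key, []);
        byPair.get(key).push({ date: date, location: location });
      }
    }
  }
  return new Map([...byPair].filter(([, events]) => events.length > 1));
}

for (const [pair, events] of findRepeatedPairs(rows)) {
  console.log(pair, events.map(e => e.location + ' ' + e.date).join('; '));
}
```

Sorting the names within each row plays the same role as the bubble sort in the VBA: it makes "B, A" and "A, B" produce the same key.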
Ok. I got bored and did this whole thing in Python code. I assume you are familiar with the language; however, you should be able to get the following piece of code to work on any computer with Python installed.
I have made a few assumptions. For instance, I have used your example input as definite input.
A few things which will mess up the program:
Ignoring case sensitivity. Names must be typed exactly as they appear in the file, capital letters included.
Having an input file that contains the header row "Date Location Names". Just remove it and keep only the data rows in the file. I got lazy and did not bother handling this.
A ton of other small stuff. Just do what the program asks you to do and don't enter funky input.
About the program:
It revolves around a dictionary with person names as keys. Each value in the dictionary is a set of tuples containing the places that person has been and on what date. By comparing these sets and taking the intersection, we can find the answer.
It's kinda messy since I took this as Python practice. I have not coded in Python for a while and I got a thrill out of doing it all without using objects. Just follow the "instructions" and keep the input file, which stores all the information, in the same folder as the code you are running.
As a side note, you might want to check that the program yields correct output.
If you have any questions, feel free to contact me.
def readWord(line, stringIndex):
    # read characters up to the next space
    word = ""
    while(line[stringIndex] != " "):
        word += line[stringIndex]
        stringIndex += 1
    return word, stringIndex

def removeSpacing(line, stringIndex):
    # skip over consecutive spaces
    while(line[stringIndex] == " "):
        stringIndex += 1
    return stringIndex

def readPeople(line, stringIndex):
    # names are single letters separated by ", " so every third character is a name
    lineSize = len(line)
    people = []
    while(stringIndex < lineSize):
        people.append(line[stringIndex])
        stringIndex += 3
    return people

def readLine(travels, line):
    stringIndex = 0
    date, stringIndex = readWord(line, stringIndex)
    stringIndex = removeSpacing(line, stringIndex)
    location, stringIndex = readWord(line, stringIndex)
    stringIndex = removeSpacing(line, stringIndex)
    people = readPeople(line, stringIndex)
    for person in people:
        if(person not in travels.keys()):
            travels[person] = set()
        travels[person].add((date, location))
    return travels

def main():
    f = open(input("Enter filename (must be in same folder as this program code. For instance, name could be: testDocument.txt\n\n"))
    travels = dict()
    for line in f:
        travels = readLine(travels, line)
    print("\n\n\n\n PROGRAM RUNNING \n \n")
    while(True):
        persons = []
        userInput = "empty"
        while(userInput):
            userInput = input("Enter person name (Type Enter to finish typing names): ")
            if(userInput):
                persons.append(userInput)
        # intersect the (date, location) sets of all the entered persons
        output = travels[persons[0]]
        for person in persons[1:]:
            output = output.intersection(travels[person])
        print("")
        for hit in output:
            print(hit)
        print("\nFINISHED WITH ONE RUN. STARTING NEW ONE\n")

main()