I currently have a setup with multiple requests to the database in a for-loop.
// week is a moment-range object
for (const day of week.by('day')) {
// get start and end of day
const startDay = day.startOf('day').toDate();
const endDay = day.endOf('day').toDate();
// I would like to reduce this query to one
const data = await Model.find({
$or: [
{
$and: [{ start: { $gte: startDay } }, { start: { $lt: endDay } }],
},
{
$and: [{ end: { $gte: startDay } }, { end: { $lt: endDay } }],
},
],
})
.sort({ start: 'asc' })
.select('_id someref')
.populate('someref', 'name');
console.log(data);
}
This is not very efficient and therefore I would like to reduce it to one query, grouped by the current return of data.
I've already tried to prepare the find parameter in the for-loop but didn't get far. Any hints would be very much appreciated.
Your query seems to find records that either start or end in a given day. As week is a range of consecutive days, this really translates to a query for records that either start in that week, or end in that week.
So you could remove the for loop, and define startDay and endDay as the start/end of the week, as follows:
const startDay = week.start.startOf('day').toDate();
const endDay = week.end.endOf('day').toDate();
The rest of your code can remain as it is. Just remove the line with for and the corresponding ending brace.
One difference though is that you wouldn't get any duplicates any more. In your current code you would get records that both start and end in the week (but not on the same day) twice. That will not happen with this code.
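If the per-day grouping of the original loop is still needed, the result of the single query can be bucketed in application code. Below is a minimal sketch, assuming each record carries `start` and `end` as Date objects like the fields in the query above. `groupByDay`, `days`, and the sample records are hypothetical names and data for illustration, not part of Mongoose or moment-range.

```javascript
// Hypothetical helper: bucket records by the days they start or end on.
// `days` is an array of Date objects, one per day of the queried week.
function groupByDay(records, days) {
  return days.map((day) => {
    // Local-time day boundaries as a half-open interval [startDay, nextDay)
    const startDay = new Date(day.getFullYear(), day.getMonth(), day.getDate());
    const nextDay = new Date(day.getFullYear(), day.getMonth(), day.getDate() + 1);
    const inDay = (d) => d >= startDay && d < nextDay;
    return {
      day: startDay,
      records: records.filter((r) => inDay(r.start) || inDay(r.end)),
    };
  });
}

// Made-up sample data standing in for the query result
const records = [
  { _id: 1, start: new Date(2021, 0, 4, 9), end: new Date(2021, 0, 4, 17) },
  { _id: 2, start: new Date(2021, 0, 4, 22), end: new Date(2021, 0, 5, 6) },
];
const days = [new Date(2021, 0, 4), new Date(2021, 0, 5)];
const grouped = groupByDay(records, days);
console.log(grouped.map((g) => g.records.map((r) => r._id)));
```

A record that starts and ends on different days lands in both buckets, which mirrors what the original loop returned.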
Building a rental listing application using MERN stack. My Listing model is below:
const listingSchema = new mongoose.Schema(
{
hostedBy: {
type: mongoose.Schema.Types.ObjectId,
ref: 'User',
},
title: {
type: String,
required: true,
},
description: {
type: String,
required: true,
},
numberOfGuests: {
type: String,
required: true,
},
numberOfRooms: {
type: String,
required: true,
},
numberOfBeds: {
type: String,
required: true,
},
numberOfBaths: {
type: String,
required: true,
},
price: {
type: String,
required: true,
},
location: {
streetAddress: { type: String },
city: { type: String },
state: { type: String },
postalCode: { type: String },
},
bookedDates: [
{
startDate: Date,
endDate: Date,
},
],
imgUrls: [
{
type: String,
},
],
amenities: [
{
type: String,
},
],
},
{ timestamps: true }
);
Now it is fairly easy to run queries on everything given by the user's search query except for the dates they want to rent. The listing model keeps track of all bookedDates. I want to be able to search MongoDB for Listings that do not have bookedDates overlapping the dates supplied by the user's search query (showing available listings to the user). I can't think of a way to do this. I figured it is easier to keep track of only the booked dates than to remove booked dates from an array of all available dates.
Doing this directly in the DB is kind of awkward, especially if you're only storing the startDate and endDate for each booking. For example, if someone books a listing from the 1st to the 5th and another user searches for the same listing from the 3rd to the 7th, the search dates don't match the saved booking exactly, but the listing still shouldn't be counted as available.
I'd suggest taking another look at your model and perhaps even separating out the booked dates to their own documents.
But, keeping with what you have, assuming you're not booking too far in the future, it might be worth storing the bookedDates as a flat array. So if we have a listing booked from the 1st to the 3rd, and the 6th to the 8th, your array would look like this:
bookedDates: [
'2021-01-01',
'2021-01-02',
'2021-01-03',
'2021-01-06',
'2021-01-07',
'2021-01-08'
]
Then, if someone searches for the listing between the 2nd and 4th, you'd again break down those dates into a flat array, and then you should be able to use the $nin operator (https://docs.mongodb.com/manual/reference/operator/query/nin/):
const desiredDates = [
'2021-01-02',
'2021-01-03',
'2021-01-04'
]
Listing.find({ bookedDates: { $nin: desiredDates } })
To quote the relevant part of the page:
If the field holds an array, then the $nin operator selects the documents whose field holds an array with no element equal to a value in the specified array (e.g. <value1>, <value2>, etc.).
This is obviously going to work best if you have another way to filter out the majority of your listings, so you're not doing an array-array check for every listing in your database.
You'll also have to keep bookedDates up-to-date by removing past dates.
Another option is just to query your listings and do the date filtering at the application level, in which case, you can probably keep the startDate and endDate format that you have.
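For the application-level option, the check comes down to an interval-overlap test against each stored booking. Here is a minimal sketch, assuming `startDate` and `endDate` are Date objects and that both ends of a booking count as occupied (an assumption; adjust the comparisons if checkout day should be bookable). `rangesOverlap`, `filterAvailable`, and the sample listings are hypothetical.

```javascript
// Two date ranges overlap when each one starts on or before the day the other ends.
function rangesOverlap(aStart, aEnd, bStart, bEnd) {
  return aStart <= bEnd && bStart <= aEnd;
}

// Hypothetical helper: keep only listings with no booking overlapping the desired stay.
function filterAvailable(listings, desiredStart, desiredEnd) {
  return listings.filter(
    (listing) =>
      !(listing.bookedDates || []).some((b) =>
        rangesOverlap(b.startDate, b.endDate, desiredStart, desiredEnd)
      )
  );
}

// Made-up listings standing in for the query result
const listings = [
  { title: 'A', bookedDates: [{ startDate: new Date('2021-01-01'), endDate: new Date('2021-01-05') }] },
  { title: 'B', bookedDates: [{ startDate: new Date('2021-01-10'), endDate: new Date('2021-01-12') }] },
  { title: 'C', bookedDates: [] },
];
const available = filterAvailable(listings, new Date('2021-01-03'), new Date('2021-01-07'));
console.log(available.map((l) => l.title));
```

Listing A is dropped because its booking from the 1st to the 5th overlaps a search from the 3rd to the 7th, which is the case from the example at the top of this answer.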
Update for flattening dates
Something like this should work. I just brute force it - people are generally only going to book a listing for a few days mostly, so your loop is going to be quite small. There are some checks in there if it's for one day, and to make sure the start is before the end, but that's about it.
As a method, you can call it whenever you want, and it'll split two dates into a flattened string array in yyyy-mm-dd format
function getFlattenedDatesAr(inputStart, inputEnd) {
// convert to dates and make sure that start is before end
let startDate = new Date(inputStart)
let endDate = new Date(inputEnd)
if(startDate > endDate) {
let temp = startDate;
startDate = endDate;
endDate = temp;
}
// get our dates in yyyy-mm-dd format
const startDateStr = startDate.toISOString().slice(0, 10)
const endDateStr = endDate.toISOString().slice(0, 10)
// check if they've only booked for one day
if(startDateStr === endDateStr) {
return [startDateStr];
}
// fill out our dates array
const bookedDates = [startDateStr]
let currDate = startDate;
while(true) {
// NOTE: setUTCDate returns a timestamp, not a Date
// (the UTC variants keep the one-day step consistent with toISOString, which is always UTC)
const nextDateTS = currDate.setUTCDate(currDate.getUTCDate() + 1);
// get our date string and add it to our bookedDates array
currDate = new Date(nextDateTS)
const currDateStr = currDate.toISOString().slice(0, 10);
bookedDates.push(currDateStr);
// if our string matches our end date, we're done
if(currDateStr === endDateStr) {
break
}
}
return bookedDates
}
// assume these are the dates sent, in yyyy-mm-dd format
let inputStart = '2021-01-01'
let inputEnd = '2021-01-05'
const datesAr = getFlattenedDatesAr(inputStart, inputEnd)
console.log(datesAr);
I have collected an array of weather data that looks like this:
const data = [{
"city_name": "London",
"lat": 51.507351,
"lon": -0.127758,
"main": {
"temp": 289.89,
"temp_min": 288.468,
"temp_max": 291.15,
"feels_like": 287.15,
"pressure": 1004,
"humidity": 77
},
"wind": { "speed": 5.1, "deg": 230 },
"clouds": { "all": 90 },
"weather": [
{
"id": 804,
"main": "Clouds",
"description": "overcast clouds",
"icon": "04n"
}
],
"dt": 1593561600,
"dt_iso": "2020-07-01 00:00:00 +0000 UTC",
"timezone": 3600
},
...
];
This data continues in ascending date order (hour by hour), for the last 40 years.
(sample: https://pastebin.com/ciHJGhnq ) - the entire dataset is over 140MB.
From this data, I'd like to obtain the average temperature (object.main.temp) for each Month and Week of month, across the entire dataset.
The question I am trying to answer with my data is:
What is the average temperature for January, across the last 40 years.
What is the average temperature for February, across the last 40 years.
...
(get the temperature of each week in January and divide by the number of weeks, repeat for all of the other Januaries in the dataset and average that out too).
Repeat for remaining months.
The output I am aiming to create after parsing the data is:
{
[
"JANUARY": {
"weekNumber": {
"avgWeekTemp": 100.00
}
"avgMonthTemp": 69.00,
...
},
...
]
}
The city name & structure of the objects are always the same, in this case London.
// build a unique number of months
// work through our data to work out the week numbers
// work through the data once again and place the data in the right week inside finalOutput
// work through the final output to determine the average values
Unfortunately I'm not very proficient in JavaScript, so I couldn't get past the second obstacle:
"use strict";
const moment = require("moment");
const data = require("./data.json");
let months = [
{
January: [],
},
{
February: [],
},
{
March: [],
},
{
April: [],
},
{
May: [],
},
{
June: [],
},
{ July: [] },
{ August: [] },
{ September: [] },
{ October: [] },
{ November: [] },
{ December: [] },
];
const finalOutput = [];
finalOutput.push(...months);
data.forEach((object) =>
finalOutput.forEach((month) => {
if (
Object.keys(month)[0] === moment(new Date(object.dt_iso)).format("MMMM")
) {
[month].push(object.dt_iso);
}
})
);
console.log(finalOutput);
Which only returned the array of months with nothing in each month.
[
{ January: [] },
{ February: [] },
{ March: [] },
{ April: [] },
{ May: [] },
{ June: [] },
{ July: [] },
{ August: [] },
{ September: [] },
{ October: [] },
{ November: [] },
{ December: [] }
]
How can I calculate the average values per week & month across my entire data set?
I'm going to write your script for you, but while you wait here's some high-level guidance.
First, let's study your data. Each row is an hourly weather measurement. As a result, each datapoint you want will be an aggregate over a set of these rows. We should organize the script along those lines:
We'll write a function that accepts a bunch of rows and returns the arithmetic mean of the temperatures of those rows: function getAvgTemp(rows) => Number
We'll write another function that takes a bunch of rows, plus the desired month, and returns all the rows for just that month: function getRowsByMonth(rows, month) => Array(rows)
We'll write another function that takes a bunch of rows, plus the desired week number, and returns all the rows for just that week: function getRowsByWeekNumber(rows, weekNumber) => Array(rows)
^^ that's if "week number" means 1-52. But if "week number" means "week within the month," then instead we'll do:
That function will also take a month: function getRowsByMonthWeek(rows, month, weekNumber) => Array(rows)
From these basic building blocks, we can write a routine that assembles the data you want.
What would that routine look like?
Probably something like this:
Loop through all the months of the year. We won't look in the data for these months, we'll hard-code them.
For each month, we'll call getRowsByMonth on the full data set. Call this set monthRows.
We'll pass monthRows to getAvgTemp -- it doesn't care what the timespan is, it just extracts and crunches the temp data it receives. That's our avgMonthTemp solved for.
Depending on what you mean by "week number," we'll divide the monthRows into smaller sets and then pass each set into getAvgTemp. (The hardest part of your script will be this division logic, but that's not to say it will be that hard.) This gives us your weekly averages.
We'll assemble these values into a data structure and insert it into the final structure that ultimately gets returned/logged.
Here's the implementation. It's a little different than I expected.
The biggest change is that I did some pre-processing up front so that the date values don't have to be parsed multiple times. While doing that, I also calculate each row's weekNumber. As a consequence, the week logic took the form of grouping rows by their weekNumbers rather than querying the dataset by weekNumber.
Some notes:
I decided that "weekNumber" means "week-of-year."
Instead of using Moment, I found a week-number algorithm on StackOverflow. If you want to use Moment's algo instead, go ahead.
The output data structure is not what you described.
Your example is not valid JSON, so I made a guess as to what you had in mind.
Here's an example of what it looks like:
{
"JUNE": {
"avgMonthTemp": 289.9727083333334,
"avgWeekTemps": {
"25": 289.99106382978727,
"26": 289.11
}
},
"JULY": {
"avgMonthTemp": 289.9727083333334,
"avgWeekTemps": {
"27": 289.99106382978727,
"30": 289.11
}
}
}
The output will include a top-level entry for every month, whether or not there is any data for that month. However, the avgWeekTemps hash will only have entries for weeks that are present in the data. Both behaviors can be changed, of course.
It's a reusable script that processes arbitrary JSON files in the format you shared.
You mentioned that each file has data from one city, so I figured you'll be running this on multiple files. I set it up so you can pass the path to the data file as a command-line argument. Note that the CLI logic is not sophisticated, so if you're doing funky things you will have a bad time. Doing CLI stuff well is a whole separate topic.
If your data for London is in a file named london.json, this is how you would process that file and save the results to the file london-temps.json:
$ node meantemp.js london.json > london-temps.json
// meantemp.js
const FS = require('fs')
// sets the language used for month names
// for language choices, see: http://www.iana.org/assignments/language-subtag-registry/language-subtag-registry
const MONTH_NAME_LANG_CODE = 'en-US'
// generate the list of month names once
const MONTH_NAMES = Array(12).fill().map(
( _, monthNum ) => new Date(2020, monthNum).toLocaleDateString(MONTH_NAME_LANG_CODE, { month: 'long' }).toUpperCase()
)
main()
function main() {
let filepath = process.argv[2]
let cityData = readJsonFile(filepath)
// before working on the data, prep the date values for processing
let allRows = cityData.map(row => {
let _date = new Date(row.dt_iso)
let _weekNum = getWeekNum(_date)
return { ...row, _date, _weekNum }
})
let output = MONTH_NAMES.reduce(( hash, monthName, monthNum ) => {
// grab this month's rows
let monthRows = getRowsForMonth(allRows, monthNum)
// calculate monthly average
let avgMonthTemp = getMeanTemp(monthRows)
// calculate weekly averages
let rowsByWeekNum = groupRowsByWeekNum(monthRows)
let avgWeekTemps = Object.keys(rowsByWeekNum)
.reduce(( hash, weekNum ) => ({
...hash,
[weekNum]: getMeanTemp(rowsByWeekNum[weekNum])
}), {})
return {
...hash,
[monthName]: { avgMonthTemp, avgWeekTemps }
}
}, {})
console.log(JSON.stringify(output))
}
function readJsonFile( path ) {
try {
let text = FS.readFileSync(path, 'utf8')
return JSON.parse(text)
} catch ( error ) {
if(error.code === 'ENOENT') {
console.error(`Could not find or read path ${JSON.stringify(path)}`)
process.exit(1) // non-zero exit code signals failure to the shell
} else if(error instanceof SyntaxError) {
console.error(`File is not valid JSON`)
process.exit(1)
} else {
throw error
}
}
}
function getRowsForMonth( rows, monthNum ) {
return rows.filter(row => monthNum === row._date.getUTCMonth())
}
function groupRowsByWeekNum( rows ) {
return rows.reduce(( hash, row ) => {
if(!hash.hasOwnProperty(row._weekNum)) {
hash[row._weekNum] = []
}
hash[row._weekNum].push(row)
return hash
}, {})
}
// ISO8601-compliant week-of-year function
// taken from https://stackoverflow.com/a/39502645/814463
// modified by me to prevent mutation of args
function getWeekNum( date ) {
// if date is a valid date, create a copy of it to prevent mutation
date = date instanceof Date
? new Date(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate())
: new Date()
let nDay = (date.getDay() + 6) % 7
date.setDate(date.getDate() - nDay + 3)
let n1stThursday = date.valueOf()
date.setMonth(0, 1)
if (date.getDay() !== 4) {
date.setMonth(0, 1 + ((4 - date.getDay()) + 7) % 7)
}
return 1 + Math.ceil((n1stThursday - date) / 604800000)
}
function getMeanTemp( hourlyReadings ) {
let temps = hourlyReadings.map(reading => reading.main.temp)
let mean = getMean(temps)
return mean
}
function getMean( numbers ) {
let sum = numbers.reduce(( sum, num ) => sum + num, 0)
let mean = sum / numbers.length
return mean
}
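The script above groups by week-of-year. If "week number" is instead meant as week within the month, only the grouping key needs to change. Here is a sketch under one assumed definition (days 1-7 are week 1, days 8-14 are week 2, and so on); `getWeekOfMonth` and `groupRowsByWeekOfMonth` are hypothetical names, and a calendar-based definition (for example weeks starting on Monday) would give different numbers.

```javascript
// Assumed definition: days 1-7 of a month are week 1, days 8-14 are week 2, etc.
function getWeekOfMonth(date) {
  return Math.ceil(date.getUTCDate() / 7);
}

// Same shape as groupRowsByWeekNum, but keyed by week-of-month.
// Expects rows that already carry the pre-parsed `_date` field.
function groupRowsByWeekOfMonth(rows) {
  return rows.reduce((hash, row) => {
    const week = getWeekOfMonth(row._date);
    if (!Object.prototype.hasOwnProperty.call(hash, week)) {
      hash[week] = [];
    }
    hash[week].push(row);
    return hash;
  }, {});
}

// Made-up rows for illustration
const rows = [
  { _date: new Date('2020-07-01T00:00:00Z'), main: { temp: 290 } },
  { _date: new Date('2020-07-07T23:00:00Z'), main: { temp: 292 } },
  { _date: new Date('2020-07-08T00:00:00Z'), main: { temp: 288 } },
  { _date: new Date('2020-07-31T12:00:00Z'), main: { temp: 300 } },
];
const byWeek = groupRowsByWeekOfMonth(rows);
console.log(Object.keys(byWeek));
```

With this definition a 31-day month has a short fifth week, so weekly averages for week 5 are based on fewer readings than the others.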
I've a usecase in which I need to find the data of a particular month. How to get the start and end dates of given month?
Here's the sample code.
{"_id":"5e00bc55c31ecc38d023b156","heat":20,"humidity":10,"deviceId":"a-1","template":"13435158964","entryDayTime":"2019-12-23T13:08:37.841Z"},
{"_id":"5e00bbd2c31ecc38d023b155","heat":20,"humidity":10,"deviceId":"a-1","template":"13435158964","entryDayTime":"2019-12-23T13:06:26.366Z"},
{"_id":"5df4a8fb46b9da1e2c0731df","heat":88,"humidity":80,"deviceId":"a-1","template":"13435158964","entryDayTime":"2019-12-14T09:18:51.892Z"},
{"_id":"5e00b50bc127260398cf51dd","heat":20,"humidity":10,"deviceId":"a-1","template":"13435158964","entryDayTime":"2019-12-23T12:37:31.127Z"},
{"_id":"5df20e44e7c51b4bd0095af3","heat":41,"humidity":26,"deviceId":"a-1","template":"13435158964","entryDayTime":"2019-12-12T09:54:12.375Z"}
Here's my code without moment.js
Payload:
{
"deviceId":"a-1",
"year":2019,
"month":"December"
}
Collection.aggregate([
{
$match: {
"deviceId": payload.deviceId,
"entryDayTime": {
$lt: new Date(`${payload.month},${payload.year},2`).toISOString(),
$gte: new Date(`${payload.month},${payload.year},31`).toISOString()
}
}
}
])
These are the time ranges I'm getting in console(times passed in aggregate function),
2019-12-01T18:30:00.000Z
2019-12-30T18:30:00.000Z
Code with moment.js
Payload:
{
"deviceId":"a-1",
"year":2019,
"month":10
}
I've tried with moment.js too, but the times I get are not in the same format as the times stored in the database.
Collection.aggregate([
{
$match: {
"deviceId": payload.deviceId,
"entryDayTime": {
$lt:moment([payload.year]).month(payload.month).startOf('month').tz('Asia/Kolkata').format(),
$gte:moment([payload.year]).month(payload.month).endOf('month').tz('Asia/Kolkata').format()
}
}
}
])
Following are the timestamps I'm getting in console.
2019-11-01T00:00:00+05:30
2019-11-30T23:59:59+05:30
If moment.js is preferred, how to change the time format similar to the sample code's time format?
Just try this code:
var dated="2019-11-01T00:00:00+05:30";
var newdated= new Date(dated);
var output= newdated.toISOString();
console.log(output);
Result :
'2019-10-31T18:30:00.000Z'
The toISOString() method returns a string in simplified extended ISO format (ISO 8601), which is always 24 or 27 characters long (YYYY-MM-DDTHH:mm:ss.sssZ or ±YYYYYY-MM-DDTHH:mm:ss.sssZ, respectively).
The timezone is always zero UTC offset, as denoted by the suffix "Z".
To find the data of a particular month use Date.UTC with the Date constructor to create a range:
const payload = {
"deviceId": "a-1",
"year": 2019,
"month": 11 // months start from 0 = January, so 11 = December
}
const from = new Date(Date.UTC(payload.year, payload.month, 1)).toISOString(); // "2019-12-01T00:00:00.000Z"
const to = new Date(Date.UTC(payload.year, payload.month + 1, 1)).toISOString(); // "2020-01-01T00:00:00.000Z"
then use them as follows:
Collection.aggregate([
{
$match: {
"deviceId": payload.deviceId,
"entryDayTime": {
$lt: to,
$gte: from
}
}
}
])
Working example : https://mongoplayground.net/p/jkIJdJ-L7q-
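Since the first payload in the question sends the month as a name ("December"), the range construction can be wrapped in a small helper that accepts either form. This is a sketch only: `monthRangeUtc` and `MONTHS` are hypothetical names, and it assumes English month names. It returns ISO strings because the sample documents store entryDayTime as strings; if the field is stored as a BSON Date, Date objects would have to be passed instead, because MongoDB does not match strings against dates.

```javascript
const MONTHS = [
  'january', 'february', 'march', 'april', 'may', 'june',
  'july', 'august', 'september', 'october', 'november', 'december',
];

// Hypothetical helper: half-open UTC range [from, to) covering one month.
// `month` is either a zero-based index (11 = December) or an English month name.
function monthRangeUtc(year, month) {
  const monthIndex =
    typeof month === 'number' ? month : MONTHS.indexOf(String(month).toLowerCase());
  if (!Number.isInteger(monthIndex) || monthIndex < 0 || monthIndex > 11) {
    throw new RangeError(`Unknown month: ${month}`);
  }
  return {
    from: new Date(Date.UTC(year, monthIndex, 1)).toISOString(),
    // Date.UTC rolls month 12 over to January of the next year
    to: new Date(Date.UTC(year, monthIndex + 1, 1)).toISOString(),
  };
}

const range = monthRangeUtc(2019, 'December');
console.log(range);
```

The returned `from` and `to` slot into the `$gte` and `$lt` conditions of the `$match` stage above.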
I was trying to write "WHERE (CASE ... THEN ... ELSE ... END) > 0" to sequelize v3.33 but couldn't find the solution yet.
Had tried sequelize.literal('...') but didn't work.
Using "HAVING" is one of the solutions, but it performs poorly for large data extraction: it's twice as slow.
This is just an example of MySQL code but pretty much close to what I want to achieve.
SELECT
(CASE WHEN `a`.`fee` IS NULL
THEN `a.b`.`fee`
ELSE `a`.`fee`
END) AS `_fee`
FROM `a`
WHERE
(CASE WHEN `a`.`fee` IS NULL
THEN `a.b`.`fee`
ELSE `a`.`fee`
END) > 0 AND
(created_at > currentDate
AND
created_at < futureDate)
I want to convert this to sequelize. Below is as far as I can go, I don't know how to add that case closure.
models.a.findAll({
...
where: {
created_at: { $gt: startDate, $lt: endDate }
}
})
*** Don't mind about created_at, it's just an example.
You can use sequelize.where and sequelize.literal for that :
where:{
$and : [
{ created_at: { $gt: startDate, $lt: endDate } },
sequelize.where(sequelize.literal('(CASE WHEN `a`.`fee` IS NULL THEN `a.b`.`fee` ELSE `a`.`fee` END)') , { $gt : 0 } )
]
}
Note: this might not work as-is, because the table alias (`a`) in the generated query might be different; you can debug and adjust the literal to match your query.
I have a collection which has a field called timestamp containing date object. I have this query:
db.articles.find({
timestamp:{
'$lte':new Date(),
'$gte': //Something to get the last week's date
}
})
Also if it is possible, Can I sort these returned documents by length of an array in this document. Here is the schema:
section: String,
title: String,
abstract: String,
url: String,
image: {
url: String,
caption: String
},
votes:{
up: [ObjectID],
down: [ObjectID]
},
comments:[ObjectID],
timestamp: Date
I want to sort the returned objects by the difference between the sizes of votes.up and votes.down. Right now I am sorting the returned objects in JavaScript, in the code that receives the query results.
Seems the solution should look like
db.articles.find({
timestamp: {
$gte: new Date(new Date() - 7 * 60 * 60 * 24 * 1000)
}
});
It will return the previous week's data, i.e. from Sunday to Saturday of the previous week in local time, with Sunday as the first day of the week.
{
$match: {
createdAt: {
$gte: moment().day(-7).startOf('day').toDate(),
$lt: moment().startOf('week').toDate()
},
}
}
]);
I found a solution to get the objects created in the last week.
db.articles.find({timestamp:{'$lte':new Date(),'$gte':new Date(Date.now() - 7 * 24 * 60 * 60 * 1000)}})
This gets the work done. But I can't figure out how to sort the returned objects by the size of arrays.
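For the sorting part, one possible approach is an aggregation pipeline that computes the vote difference with `$size` and `$subtract` and then sorts on it. The sketch below only builds the pipeline; `buildTopArticlesPipeline` and the `voteDiff` field are hypothetical names. It assumes MongoDB 3.4 or later (for `$addFields`) and that votes.up and votes.down are always present as arrays, since `$size` raises an error otherwise.

```javascript
// Hypothetical helper: pipeline for articles from the last 7 days,
// sorted by (number of up votes - number of down votes), highest first.
function buildTopArticlesPipeline(now = new Date()) {
  const weekAgo = new Date(now.getTime() - 7 * 24 * 60 * 60 * 1000);
  return [
    { $match: { timestamp: { $gte: weekAgo, $lte: now } } },
    {
      $addFields: {
        voteDiff: { $subtract: [{ $size: '$votes.up' }, { $size: '$votes.down' }] },
      },
    },
    { $sort: { voteDiff: -1 } },
  ];
}

const pipeline = buildTopArticlesPipeline(new Date('2021-01-08T00:00:00Z'));
console.log(JSON.stringify(pipeline, null, 2));
// In the mongo shell this would be run as: db.articles.aggregate(pipeline)
```

This moves the sort into the database, so the JavaScript sorting step on the returned documents is no longer needed.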