Importing directory of json into mongoDB - javascript

I have a directory, with several sub-directories, holding a total of about a million JSON files. I need to import this entire thing into MongoDB. I imagine this is a very common problem, but I cannot find a tutorial on how to do it. Is there an easy solution here?
(Or should I write a script that iterates through the directories, reads each file into a variable, and then inserts the content into my db?)

You're basically on the right path.
Iterate through the entire directory, read the files to create a JSON object in your code, and then just store your documents directly into MongoDB.
Take a look at this.
One big caveat though: MongoDB stores your documents in collections, and while it doesn't enforce a schema, querying works best when every document in a collection has the same general format/structure; basically, each document should share most, if not all, of its properties with the others. Otherwise you're going to have to use something different, or store all of the JSON as a property of an encapsulating document.
Something like the following:
{
  '_id': 'YOUR_DOCUMENT_ID',
  'doc': 'JSON_OR_STRING_OF_YOUR_FILE'
}
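If it helps, here is a minimal sketch of that iterate-and-insert approach, assuming the official mongodb Node.js driver (the connection string, database, and collection names are placeholders):

const fs = require('fs/promises');
const path = require('path');
const { MongoClient } = require('mongodb');

// Recursively yield the path of every .json file under a directory
async function* walk(dir) {
  for (const entry of await fs.readdir(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) yield* walk(full);
    else if (entry.name.endsWith('.json')) yield full;
  }
}

async function importDir(rootDir) {
  const client = await MongoClient.connect('mongodb://localhost:27017');
  const coll = client.db('mydb').collection('docs');
  let batch = [];
  for await (const file of walk(rootDir)) {
    batch.push(JSON.parse(await fs.readFile(file, 'utf8')));
    if (batch.length >= 1000) {      // insert in batches rather than
      await coll.insertMany(batch);  // one round trip per file
      batch = [];
    }
  }
  if (batch.length) await coll.insertMany(batch);
  await client.close();
}

importDir('./data').catch(console.error);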

Related

React Native - Dynamically load tons of small images

I would like to load images dynamically from an images folder.
The file names in the folder look like {company_name}.png. The names are stored in a JSON file along with other company data like name, type, etc.
e.g.:
name: meta
logo: "./images/meta.png"
Is there any way to dynamically load these images, like require() with a variable based on the JSON logo string?
I found some solution online like:
https://stackoverflow.com/a/52598171/7990652
But all the other things are built around the JSON structure used to load all the other data.
Is there any way to load everything from a folder one by one based on the provided logo link in the json?
I also tried looking for a solution with a Babel require-all plugin, but I found nothing about that.
P.S. I know there are several threads about this question, but in almost all of them someone asks "what about if I want to load 100-1000 images?". That is the case with me too: I would like to load a lot of images, not just the 10-20 that are completely okay with a static .js require list.
Dynamically loading local assets (including images) at run-time is not supported in React Native.
In order to use static assets (images, fonts, documents, ...), all of them must be bundled at compile-time with your app.
Attempting to load an asset dynamically at run-time that wasn't pre-bundled at compile-time will cause the app to crash.
Bundle all the images you need at compile-time, then use the in-memory image references at run-time:
const staticImages = {
  image_01: require("./path/to/image_01.png"),
  image_02: require("./path/to/image_02.png"),
  image_03: require("./path/to/image_03.png"),
  // ... more images
};
Define this object of image references globally, and make sure it is created as early as possible while the app is initializing.
Later, access any image reference like this:
<Image source={staticImages[IMAGE_REFERENCE_NAME]} />
No, unfortunately your question is a duplicate of the others you have found, and it is not possible to use require dynamically, no matter how many times you want to do it.
If you can break up the images into multiple components, you can conditionally load those components. But require must be called with a fixed string.
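For illustration, a hypothetical sketch of that approach (the component and file names are made up):

import React from 'react';
import { Image } from 'react-native';

// Each logo sits behind a fixed require() string...
const MetaLogo = () => <Image source={require('./images/meta.png')} />;
const AcmeLogo = () => <Image source={require('./images/acme.png')} />;

// ...and the choice between them happens at run-time by key
const logos = { meta: MetaLogo, acme: AcmeLogo };

export function CompanyLogo({ name }) {
  const Logo = logos[name];
  return Logo ? <Logo /> : null;
}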

Recursive RESTAPI Query

For some background, I have a number of enterprise systems management products that run a REST API called Redfish. I can visit the root folder on one such device by going here:
https://ip.of.device/redfish/v1
If I browse the raw JSON from a GET request, the output includes data that looks like this (I'm truncating it due to how long it is, so there may be some JSON syntax errors here):
{
  "Description": "Data",
  "AccountService": {
    "#odata.id": "/redfish/v1/AccountService"
  },
  "CertificateService": {
    "#odata.id": "/redfish/v1/CertificateService"
  }
}
Perhaps in my searching I'm using the wrong terminology, but each of those #odata.id items is basically a 'folder' I can navigate into. Each folder has additional data values, but still more folders. I can capture the contents of folders I know about via JavaScript and parse the JSON simply enough, but there are hundreds of folders here, some multiple layers deep, and from one device to the next the folders are sometimes different.
Due to the size and dynamic nature of this data, is there a way to either recursively query this from the API itself, or recursively 'scrape' the API's #odata.id 'folder' structure like this using JavaScript? I'm tempted to write a bunch of nested queries in foreach loops, but there's surely a better way to accomplish this.
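For illustration, the naive recursive crawl I have in mind looks roughly like this (a sketch assuming Node 18+'s built-in fetch; authentication and TLS details omitted):

// Follow every '#odata.id' link recursively, remembering visited paths
// so cycles and duplicates are fetched only once
async function crawl(baseUrl, path = '/redfish/v1', visited = new Map()) {
  if (visited.has(path)) return visited;
  const res = await fetch(baseUrl + path);
  const data = await res.json();
  visited.set(path, data);

  // Collect every '#odata.id' value anywhere in this document
  const links = [];
  (function findLinks(obj) {
    for (const [key, value] of Object.entries(obj)) {
      if (key === '#odata.id' && typeof value === 'string') links.push(value);
      else if (value && typeof value === 'object') findLinks(value);
    }
  })(data);

  for (const link of links) await crawl(baseUrl, link, visited);
  return visited;
}

crawl('https://ip.of.device').then(tree => console.log([...tree.keys()]));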
My eventual goal is to perform this from nodejs, parse the data, then present the data in a web form for a user to select what fields to keep, which we'll store for faster lookups in a mongodb database along with the path to the original data for more targeted api queries later.
Thanks in advance for any suggestions.

Search and update value in json file in node js

I'm trying to create a key-value-paired datastore in Node.js, in which I'm using a JSON file as the database. Now I'm trying to read the value of a particular key and update it or delete it.
For example (dummy data):
{
  name: 'my name',
  no: 12345,
  course: {
    name: 'cname',
    id: 'cid'
  }
}
Now I want to change it as
{
  name: 'my name',
  no: 12345,
  course: {
    name: 'cname1',
    id: 'cid1'
  },
  fees: {
    primaryFee: 1000,
    donation: 100
  }
}
Or even delete the course key along with its value.
One way I thought of to achieve this is to read the entire file, store it in a variable (JSON parsed), update the value in that variable, and then write the whole data back to the file.
But this is not efficient, as it reads and writes the whole data on every update.
So is there any more efficient approach to update or delete a particular key in the file itself?
Yes, even in a high-level programming language like JS it is possible to update parts of a file in place, though this is more commonly done in low-level languages like C.
For node.js, check-out the official documentation of the file system module, especially the write function: https://nodejs.org/api/fs.html#fs_fs_write_fd_buffer_offset_length_position_callback
Other possible solutions to your problem:
- Use a database, as eol suggested.
- Use an append-only file where you only append the updates. This is more efficient because the whole file doesn't have to be rewritten; see the sketch below.
- Split your database file into several files (this is what databases often do below the surface).
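A hypothetical sketch of the append-only idea (the file name db.log and the record shape are made up):

const fs = require('fs');

// Append one update per line; value === undefined marks a deletion
function logUpdate(key, value) {
  fs.appendFileSync('db.log', JSON.stringify({ key, value }) + '\n');
}

// Rebuild the current state by replaying the log from the start
function currentState() {
  const state = {};
  if (!fs.existsSync('db.log')) return state;
  for (const line of fs.readFileSync('db.log', 'utf8').split('\n')) {
    if (!line) continue;
    const { key, value } = JSON.parse(line);
    if (value === undefined) delete state[key];
    else state[key] = value;
  }
  return state;
}

logUpdate('course', { name: 'cname1', id: 'cid1' });    // update a key
logUpdate('fees', { primaryFee: 1000, donation: 100 }); // add a key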
Since you are working on a file-based data store, you can use the fs module in Node.js to read and write files. fs.readFileSync returns the whole content of the file, which you then parse as JSON, make your changes to, and write back.
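In code, roughly like this (a minimal sketch; the file name db.json is a placeholder):

const fs = require('fs');

function update(file, mutate) {
  const data = JSON.parse(fs.readFileSync(file, 'utf8')); // read + parse everything
  mutate(data);                                           // change the object in memory
  fs.writeFileSync(file, JSON.stringify(data, null, 2));  // rewrite the whole file
}

update('db.json', data => {
  data.course = { name: 'cname1', id: 'cid1' };    // update a nested value
  data.fees = { primaryFee: 1000, donation: 100 }; // add a new key
  // delete data.course;                           // or delete a key entirely
});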
So AFAIK the only solution is to read the whole file and then update or delete the value. In the worst case you will traverse the whole structure, O(n), when updating or deleting a key.
You've discovered why databases are so complicated ;)
There isn't really a good way to update a json file without completely reading & parsing the data, editing and then writing the result again.
If you're looking for a simple file database, have a look at sqlite
For making these types of data changes, use a DB; with JSON files you have to load the full data into a variable and perform the operations on that.

How do I check if there is a duplicate file in a folder, without comparing file names using glob/listdir/etc..?

I have a folder that contains several images, the directory structure looks like this:
./images/
./images/1.png
./images/2.png
./images/3.png
./images/4.png
./images/{n}.png
These images have been downloaded and saved using the request and fs modules by a script called update.js.
Each file is named after the number of items in the folder (i.e., length + 1).
The update.js script downloads (and saves) each image, regardless of whether or not it exists.
I can get around this by deleting the images folder but this is a waste of resources.
What's the most efficient way to prevent this behaviour?
NOTE: I can't use a simple file name check, since the names are indexes.
Thanks.
You can issue an HTTP HEAD request for each file and get its headers. Then you can see how big the target file is and avoid re-downloading it if the size matches exactly.
This isn't ideal though, as different files may have the same size.
Some servers give you a Content-MD5 header, which would probably be the best option. The MD5 is unlikely to match between any two of your files unless the collection is very large.
You would be better served by just fixing the script so it stores proper metadata, though; all of this is quite hacky :). You can store the real file names and modified timestamps as another file in a sibling directory and be fairly sure it won't affect anything. Then you can just check those before doing a download.
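A sketch of the HEAD-request idea, assuming Node 18+'s built-in fetch (the URL and destination path are placeholders):

const fs = require('fs/promises');

async function downloadIfNew(url, destPath) {
  // Ask only for the headers first
  const head = await fetch(url, { method: 'HEAD' });
  const remoteSize = Number(head.headers.get('content-length'));
  try {
    const { size } = await fs.stat(destPath);
    if (size === remoteSize) return;  // same size: assume it's the same file
  } catch {
    // file doesn't exist yet: fall through and download
  }
  const res = await fetch(url);
  await fs.writeFile(destPath, Buffer.from(await res.arrayBuffer()));
}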

Finding a string in MongoDB collection

I want to implement a minor file-system-like collection in MongoDB.
So say my object looks like this:
{
  "\\": {
    'autoexec.bat': {
      name: 'autoexec',
      filetype: 'bat',
      size: 1302
    },
    'users': { /* its own tree */ },
    'windows': {
      'system': {
        'autoexec.bat': {
          name: 'autoexec',
          filetype: 'bat',
          size: 1302123
        }
      }
    }
  }
}
I am wondering how to find the term 'autoexec.bat' in the most efficient manner. Further, for a file tree, is there any better way to implement the same in Node.js or C++? I wish to implement features like search, etc.
I'd say: don't do this in one gigantic document. Let each file be its own document in a collection, with references to its parent and (probably) its children. Almost all operations then become trivial. You only have to think about an efficient way to read/delete the whole tree.
Last week at MongoNYC Kyle Banker gave a nice talk on schema design by example. I think your problem is extremely similar to his first example of musical genre hierarchy.
In effect, every file would have a document in the collection. It would have a parent field identifying its direct parent (the directory it's in, in your case) and an array of all its ancestors.
The queries that now become easy are:
- what directory is file "autoexec.bat" in
- list all files in directory "foo"
- list all files recursively in directory "foo" (all files with "foo" in their ancestors; see the sketch below)
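A hypothetical sketch of that schema and those queries with the Node.js driver (collection and field names are assumptions):

// Inside an async function, with files = db.collection('files')
await files.insertOne({
  _id: 'windows/system/autoexec.bat',
  name: 'autoexec.bat',
  parent: 'windows/system',
  ancestors: ['windows', 'windows/system'],
  size: 1302123
});

// what directory is "autoexec.bat" in
await files.findOne({ name: 'autoexec.bat' }, { projection: { parent: 1 } });

// list all files directly in directory "foo"
await files.find({ parent: 'foo' }).toArray();

// list all files recursively under "foo" (an index on { ancestors: 1 } helps here)
await files.find({ ancestors: 'foo' }).toArray();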
And don't forget, you also have the option of saving the full path name of a file/directory as well as its base name. This allows searching by the leading parts of the path if the field is indexed. It's also going to be unique, unlike the file name alone.
As always, the key pieces of information are all the ways you'll need to query this collection and the performance and load expectations. Without that, it can be easy to pick a schema which will give you some challenges later on.

Categories

Resources