I'm creating an Electron application and I want to stream an image to a file (so basically download it).
I want to use the native Fetch API because the request module would add a lot of overhead.
But there is no pipe method on the response, so I can't do something like:
fetch('https://imageurl.jpg')
.then(response => response.pipe(fs.createWriteStream('image.jpg')));
So how can I combine fetch and fs.createWriteStream?
I got it working. I made a function which transforms the response into a readable stream.
const { Readable } = require('stream');

// Wrap the fetch Response's web ReadableStream in a Node.js Readable
const responseToReadable = response => {
  const reader = response.body.getReader();
  const rs = new Readable();
  rs._read = async () => {
    const result = await reader.read();
    if (!result.done) {
      rs.push(Buffer.from(result.value));
    } else {
      rs.push(null);
    }
  };
  return rs;
};
So with it, I can do
fetch('https://imageurl.jpg')
.then(response => responseToReadable(response).pipe(fs.createWriteStream('image.jpg')));
Fetch is not really able to work with Node.js streams out of the box, because the Stream API in the browser differs from the one Node.js provides, i.e. you cannot pipe a browser stream into a Node.js stream or vice versa.
The electron-fetch module seems to solve that for you. Or you can look at this answer: https://stackoverflow.com/a/32545850/2016129 for a way of downloading files without needing nodeIntegration.
There is also needle, a smaller alternative to the bulkier request, which of course supports streams.
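For completeness, a minimal sketch of needle's documented streaming usage (the image URL and file name are just placeholders):
const needle = require('needle');
const fs = require('fs');

// With no callback, needle.get() returns a readable stream that can be piped to disk
needle
  .get('https://imageurl.jpg')
  .pipe(fs.createWriteStream('image.jpg'))
  .on('finish', () => console.log('Download finished'));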
I guess today the answer, with Node.js 18+, is:
node -e 'fetch("https://github.com/stealify").then(response => stream.Readable.fromWeb(response.body).pipe(fs.createWriteStream("./github.com_stealify.html")))'
In the above example we use the -e flag; it tells Node.js to execute our CLI code. We download the page of an interesting project here and save it as ./github.com_stealify.html in the current working directory. The code below shows the same inside a Node.js .mjs file for convenience.
CLI example using CommonJS
node -e 'fetch("https://github.com/stealify").then(({body:s}) =>
stream.Readable.fromWeb(s).pipe(fs.createWriteStream("./github.com_stealify.html")))'
fetch.cjs
fetch("https://github.com/stealify").then(({body:s}) =>
require("node:stream").Readable.fromWeb(s)
.pipe(require("node:fs").createWriteStream("./github.com_stealify.html")));
CLI example using ESM
node --input-type module -e 'stream.Readable.fromWeb(
(await fetch("https://github.com/stealify")).body)
.pipe(fs.createWriteStream("./github.com_stealify.html"))'
fetch_tla_no_tli.mjs
(await import("node:stream")).Readable.fromWeb(
(await fetch("https://github.com/stealify")).body).pipe(
(await import("node:fs")).createWriteStream("./github.com_stealify.html"));
fetch.mjs
import stream from 'node:stream';
import fs from 'node:fs';
stream.Readable
.fromWeb((await fetch("https://github.com/stealify")).body)
.pipe(fs.createWriteStream("./github.com_stealify.html"));
see: https://nodejs.org/api/stream.html#streamreadablefromwebreadablestream-options
Update: I would not use this method when dealing with files.
This is the more direct usage, as fs.promises supports all forms of iterables, equivalent to the stream/consumers API:
node -e 'fetch("https://github.com/stealify").then(({ body }) =>
fs.promises.writeFile("./github.com_stealify.html", body))'
Related
I have an Electron app which is able to upload very big files to the server via HTTP in the renderer process, without user input. I decided to use axios as my HTTP client, and it was able to retrieve upload progress, but with this I met a few problems.
Browser JavaScript and Node.js aren't always "friendly" with each other. I used the fs.createReadStream function to get the file, but axios does not understand what a ReadStream object is, and I can't pipe this stream into FormData (which I should place my file in); there are several topics on their GitHub issues tab, but nothing has been done about it so far.
I ended up using fs.readFileSync and then the form-data module with its getBuffer() method, but now my file is loaded entirely into memory before upload, and given how big my files are it kills the Electron process.
Googling, I found out about the request library, which in fact is able to pipe a stream into a request, but it's deprecated, not supported anymore, and apparently I can't get upload progress from it.
I'm running out of options. How do you upload files with Electron without user input (so without a file input), without loading them into memory upfront?
P.S. On the form-data GitHub page there is a piece of code explaining how to upload a file stream with axios, but it doesn't work: nothing is sent, and downgrading the library as one issue suggested didn't help either...
const form = new FormData();
const stream = fs.createReadStream(PATH_TO_FILE);
form.append('image', stream);
// In Node.js environment you need to set boundary in the header field 'Content-Type' by calling method `getHeaders`
const formHeaders = form.getHeaders();
axios.post('http://example.com', form, {
headers: {
...formHeaders,
},
})
.then(response => response)
.catch(error => error)
I was able to solve this and I hope it will help anyone facing the same problem.
Since request is deprecated, I looked for alternatives and found got.js for Node.js HTTP requests. It has support for streams, fs.ReadStream, etc.
You will need form-data as well; it allows you to put streams inside FormData and assign them to a key.
The following code solved my question:
import fs from 'fs'
import got from 'got'
import FormData from 'form-data'
const stream = fs.createReadStream('some_path')
// NOT native form data
const formData = new FormData()
formData.append('file', stream, 'filename');
try {
const res = await got.post('https://my_link.com/upload', {
body: formData,
headers: {
...formData.getHeaders() // sets the boundary and Content-Type header
}
}).on('uploadProgress', progress => {
// here we get our upload progress, progress.percent is a float number from 0 to 1
console.log(Math.round(progress.percent * 100))
});
if (res.statusCode === 200) {
// upload success
} else {
// error handler
}
} catch (e) {
console.log(e);
}
Works perfectly in the Electron renderer process!
When I run the code below it gives an error. Reading a file from a directory works perfectly, but when I pass a URL it gives a file-not-found error. I've checked that fs.statSync accepts a URL.
const stat = fs.statSync('http://techslides.com/demos/sample-videos/small.mp4');
Error: ENOENT: no such file or directory, stat 'http://techslides.com/demos/sample-videos/small.mp4'
fs.statSync() can take a URL, but ONLY if that URL is a file:// URL.
It is not clear what you would want to do if the argument was actually an http:// URL. You could check to see if it was not a file URL and then attempt to fetch the contents of the URL to see if it exists using a library such as got().
But fetching data from another server over HTTP will not be synchronous, so you will have to change the design of your function to return a promise instead of a synchronous API.
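As a rough sketch of that idea, assuming got is installed (remoteExists is just an illustrative helper name; the URL is the one from the question):
const got = require('got');

// Send a HEAD request; got rejects the promise on network errors and non-2xx responses,
// so resolving without an exception means the resource exists.
async function remoteExists(url) {
  try {
    await got.head(url);
    return true;
  } catch (err) {
    return false;
  }
}

remoteExists('http://techslides.com/demos/sample-videos/small.mp4')
  .then(exists => console.log(exists));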
That's because it's hosted on a web server; you need to send an HTTP GET request to fetch it locally.
Install the axios package and issue an HTTP GET request to fetch the remote resource from the web server.
npm install --save axios
Here's a program illustrating the general idea:
const fs = require('fs');
const axios = require('axios');
const { promisify } = require('util');

const writeFilePromise = promisify(fs.writeFile);

(async () => {
  const url = 'http://techslides.com/demos/sample-videos/small.mp4';
  // Request the body as a Buffer so binary data isn't mangled
  const response = await axios.get(url, { responseType: 'arraybuffer' });
  if (response.data) {
    await writeFilePromise('small.mp4', response.data);
  }
})();
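For larger files you may not want to buffer the whole body in memory; a variant sketch using axios's responseType: 'stream' option (same URL as above):
const fs = require('fs');
const axios = require('axios');

(async () => {
  const url = 'http://techslides.com/demos/sample-videos/small.mp4';
  // response.data is a readable stream that can be piped straight to a file
  const response = await axios.get(url, { responseType: 'stream' });
  response.data.pipe(fs.createWriteStream('small.mp4'));
})();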
I'm working on a Node project where I have an array of files such as
var urls = ["http://web.site/file1.iso", "https://web.site/file2.pdf", "https://web.site/file3.docx", ...];
I'm looking to download those files locally in the most efficient way possible. There could be as many as several dozen URLs in this array... Is there a good library that would help me abstract this out? I need something that I can call with the array and the desired local directory, and that will follow redirects, work with http & https, intelligently limit simultaneous downloads, etc.
node-fetch is a lovely little library that brings fetch capability to Node.js. Since fetch returns a promise, managing parallel downloads is simple. Here's an example:
const fetch = require('node-fetch')
const fs = require('fs')

// You can expand this array to include urls as required
const urls = ['http://web.site/file1.iso', 'https://web.site/file2.pdf']

// Here we map the list of urls -> a list of fetch requests
const requests = urls.map(url => fetch(url))

// Now we wait for all the requests to resolve and then save them locally
Promise.all(requests).then(files => {
  files.forEach(file => {
    // Derive a real file name here; 'PATH/FILE_NAME.EXT' is a placeholder
    file.body.pipe(fs.createWriteStream('PATH/FILE_NAME.EXT'))
  })
})
Alternatively, you could write each file as it resolves:
const fetch = require('node-fetch')
const fs = require('fs')
const path = require('path')

const urls = ['http://web.site/file1.iso', 'https://web.site/file2.pdf']

urls.map(file => {
  fetch(file).then(response => {
    // Use the file name from the URL; 'DIRECTORY_NAME' is a placeholder
    const name = path.basename(new URL(file).pathname)
    response.body.pipe(fs.createWriteStream('DIRECTORY_NAME/' + name))
  })
})
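If you also need to limit how many downloads run at once, one simple approach is to process the URL list in fixed-size batches; a sketch only, where the batch size, directory name and the downloadAll helper are arbitrary choices:
const fetch = require('node-fetch')
const fs = require('fs')
const path = require('path')

const urls = ['http://web.site/file1.iso', 'https://web.site/file2.pdf']
const BATCH_SIZE = 4 // number of simultaneous downloads

async function downloadAll (list, dir) {
  for (let i = 0; i < list.length; i += BATCH_SIZE) {
    const batch = list.slice(i, i + BATCH_SIZE)
    // Download one batch in parallel, then move on to the next
    await Promise.all(batch.map(async url => {
      const response = await fetch(url)
      const dest = fs.createWriteStream(path.join(dir, path.basename(new URL(url).pathname)))
      response.body.pipe(dest)
      // Resolve once the file has been fully written
      await new Promise((resolve, reject) => {
        dest.on('finish', resolve)
        dest.on('error', reject)
      })
    }))
  }
}

downloadAll(urls, 'DIRECTORY_NAME').then(() => console.log('done'))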
This is my first phone app. I am using Ionic for the cross-platform work, which uses Angular, as I'm sure you know. I have a separate program which scrapes a web page using puppeteer and cheerio and creates an array of values from the web page. This works.
I'm not sure how to get the array from my web scraping program read by my Ionic/Angular program.
I have a basic Ionic setup and am just trying the most basic activity of being able to see the array from the Ionic/Angular side, but after trying to put it in several places I realized I really didn't know where to import the code that returns the array into Ionic/Angular, or whether to put the web scraper code directly in one of the .ts files, or ???
This is my web scraping program:
const puppeteer = require('puppeteer'); // live webscraping
let scrape = async () => {
const browser = await puppeteer.launch({
headless: true
});
const page = await browser.newPage();
await page.goto('--page url here --'); // link to page
const result = await page.evaluate(() => {
let data = []; // Create an empty array that will store our data
let elements = document.querySelectorAll('.list-myinfo-block'); // Select all Products
let photo_elements = document.getElementsByTagName('img'); //
var photo_count = 0;
for (var element of elements) { // Loop through each product getting photos
let picture_link = photo_elements[photo_count].src;
let name = element.childNodes[1].innerText;
let itype = element.childNodes[9].innerText
data.push({
picture_link,
name,
itype
}); // Push an object with the data onto our array
photo_count = photo_count + 1;
}
return data;
});
browser.close();
return result; // Return the data
};
scrape().then((value) => {
console.log(value); // Success!
});
When I run the web scraping program I see the array with the correct values in it. The problem is getting it into the Ionic part. Sometimes the Ionic phone page shows up with nothing in it; sometimes it says it cannot find "/" ... I've tried so many different places and looked all over the web that I have quite a combination of errors. I know I'm putting it in the wrong places, or maybe not everywhere I should. Thank you!
You need a server which will run the scraper on demand.
Any scraper that uses a real browser (i.e. Chromium) will have to run on an OS that supports it. There is no other way.
Think about this:
Does your mobile support Chromium and Node.js? It does not. There is no Chromium build for mobile which supports automation with Node.js (yet).
Can you run a browser inside another browser? You cannot.
Way 1: Remote wsEndpoint
There are some services which offer a wsEndpoint, but I will not mention them here. I will describe how you can create your own wsEndpoint and use it.
Run the browser and get the wsEndpoint
The following code will launch a puppeteer instance whenever you connect to it. You have to run it inside a server.
const http = require('http');
const httpProxy = require('http-proxy');
const puppeteer = require('puppeteer');

const proxy = httpProxy.createProxyServer();

http
  .createServer()
  .on('upgrade', async (req, socket, head) => {
    // Launch a fresh browser for every connection and proxy the WebSocket to it
    const browser = await puppeteer.launch();
    const target = browser.wsEndpoint();
    proxy.ws(req, socket, head, { target });
  })
  .listen(8080);
When you run this on the server/terminal, you can use the ip of the server to connect. In my case it's ws://127.0.0.1:8080.
Use puppeteer-web
Now you will need to install puppeteer-web in your mobile/web app. To bundle Puppeteer using Browserify, follow the instructions below.
Clone Puppeteer repository:
git clone https://github.com/GoogleChrome/puppeteer && cd puppeteer
npm install
npm run bundle
This will create a ./utils/browser/puppeteer-web.js file that contains the Puppeteer bundle.
You can use it later on in your web page to drive another browser instance through its WS Endpoint:
<script src='./puppeteer-web.js'></script>
<script>
  const puppeteer = require('puppeteer');
  (async () => {
    const browser = await puppeteer.connect({
      browserWSEndpoint: '<another-browser-ws-endpoint>'
    });
    // ... drive automation ...
  })();
</script>
Way 2: Use an API
I will use express for a minimal setup. Consider that your scrape function is exported from a file called scrape.js and that you have the following index.js file:
const express = require('express')
const scrape = require('./scrape')

const app = express()

app.get('/', function (req, res) {
  scrape().then(data => res.send({ data }))
})

app.listen(8080)
This will launch an Express API on port 8080.
Now if you run it with node index.js on a server, you can call it from any mobile/web app.
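On the Ionic/Angular side you can then request that endpoint over HTTP; a minimal sketch (the server address is a placeholder, and Angular's HttpClient would work just as well as fetch):
// Somewhere in the app, e.g. when the page loads
fetch('http://YOUR_SERVER_IP:8080/')
  .then(res => res.json())
  .then(({ data }) => {
    console.log(data); // the array returned by scrape()
  });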
Helpful Resources
I had some fun with puppeteer and webpack,
playground-react-puppeteer
playground-electron-react-puppeteer-example
To keep the API running, you will need to learn a bit about backends and how to keep the server alive, etc. See these links for a full understanding of creating the server and more:
Official link to puppeteer-web
Puppeteer with docker
Docker with XVFB and Puppeteer
Puppeteer with chrome extension
Puppeteer with local wsEndpoint
Avoid memory leak on server
I am new to TypeScript and Node.js. I want to connect a server running on Node to a PostgreSQL database in a containerized environment. There are examples available for this without TypeScript (only JavaScript).
Any example or link would be helpful.
Thanks
Edit:
This is the JS example:
const { Client } = require('pg')
const client = new Client()
await client.connect()
const res = await client.query('SELECT $1::text as message', ['Hello world!'])
console.log(res.rows[0].message) // Hello world!
await client.end()
This is the example taken from https://node-postgres.com/. Can we convert that into TypeScript? Using TypeScript is a design decision, so I can't use JavaScript; I wanted to try something and experiment before giving up on it.
I ended up using TypeORM, which can be used with TS.
https://github.com/typeorm/typeorm
Installation instructions and examples are available in the GitHub repo:
https://github.com/typeorm/typeorm#Installation
This might not work for everyone, but it seems the best fit for me, and the library looks stable.
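As a rough sketch of connecting to a containerized Postgres with TypeORM (assuming a recent TypeORM version with the DataSource API; the host, credentials and database name are placeholders taken from your container setup, and the same code works in a .ts file):
import { DataSource } from "typeorm";

// Connection details are placeholders; with docker-compose the host is usually the service name
const dataSource = new DataSource({
  type: "postgres",
  host: "localhost",
  port: 5432,
  username: "postgres",
  password: "postgres",
  database: "mydb",
});

async function main() {
  await dataSource.initialize();
  const rows = await dataSource.query("SELECT $1::text as message", ["Hello world!"]);
  console.log(rows[0].message); // Hello world!
  await dataSource.destroy();
}

main().catch(console.error);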