Here is the code:
buildImgSuccess(json) {
if (typeof json !== "undefined") {
if (json.filesUploaded.length) {
//-- Update Saving Status --//
this.setState({
saving: true
});
//-- Set Vars --//
let item = 0;
let sortOrder = 0;
let imgCount = this.state.images.length;
if (!imgCount) {
imgCount = 0;
}
while (item < json.filesUploaded.length) {
//-- Determine if PDF Document was Uploaded --//
if (json.filesUploaded[item].mimetype === "application/pdf") {
//-- Handle Document Upload --//
//-- Get Number of pages --//
let theKey = json.filesUploaded[item].key;
let theHandle = json.filesUploaded[item].handle;
axios.get(`/getPhotos`, {
headers: {
"Content-Type": 'application/json'
},
transformRequest: (data, headers) => { delete headers.common.Authorization; }
}).then(jsonResult => {
let pageCount = 1;
Our linter is producing this error:
Don't make functions within a loop
It appears any time there is a .then() or .catch() inside a loop.
Does anyone understand what the problem is with this code structure, and are there any possible solutions?
Thanks!
You need to create the function outside of the while loop.
Creating it inside the loop recreates the function on every iteration, which is not performant.
See below.
Simplified Example - Wrong
let i = 0;
while (i < 10) {
  const print = (n) => console.log(n); // Created 10 times, once per iteration
  print(i);
  i++;
}
Simplified Example - Correct
let i = 0;
const print = (n) => console.log(n); // Created once
while (i < 10) {
  print(i);
  i++;
}
With your code
const handleJsonResult = (jsonResult) => {
let pageCount = 1;
//...
}
while (item < json.filesUploaded.length) {
//-- Determine if PDF Document was Uploaded --//
if (json.filesUploaded[item].mimetype === "application/pdf") {
//-- Handle Document Upload --//
//-- Get Number of pages --//
let theKey = json.filesUploaded[item].key;
let theHandle = json.filesUploaded[item].handle;
axios.get(`/getPhotos`, {
headers: {
"Content-Type": 'application/json'
},
transformRequest: (data, headers) => { delete headers.common.Authorization; }
}).then(handleJsonResult);
//...
}
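If the handler needs values from the current iteration (such as theKey or theHandle), a function defined outside the loop can no longer see them directly. One possible way around that, sketched here under assumptions, is a factory that receives those values and returns the handler. The names makeResultHandler, processUploads and fakeGet are hypothetical; fakeGet only stands in for axios.get so the sketch runs on its own.

```javascript
// Hypothetical stand-in for axios.get: resolves with a fake response.
function fakeGet(url) {
  return Promise.resolve({ data: { pages: 3 }, url: url });
}

// Defined once, outside the loop. Each call returns a handler that
// remembers the key and handle it was created with.
function makeResultHandler(theKey, theHandle, results) {
  return function handleJsonResult(jsonResult) {
    results.push({ key: theKey, handle: theHandle, pages: jsonResult.data.pages });
  };
}

function processUploads(filesUploaded) {
  const results = [];
  const pending = [];
  let item = 0;
  while (item < filesUploaded.length) {
    const file = filesUploaded[item];
    if (file.mimetype === "application/pdf") {
      // No function expression is written inside the loop body.
      pending.push(
        fakeGet("/getPhotos").then(makeResultHandler(file.key, file.handle, results))
      );
    }
    item++;
  }
  // Resolve with the collected results once every request has finished.
  return Promise.all(pending).then(function () {
    return results;
  });
}
```

The loop only calls functions that already exist, so each handler still gets its own key and handle without a new function literal per iteration.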
I would like to send the data in chunks.
Right now, what I send to the server looks like this:
for loop: 1, 2, 3
What the server receives: 3, 1, 2 (asynchronous).
I need to send it synchronously so that the server receives the chunks in the order of my for loop: 1, 2, 3.
How can I do that?
//52428800
const chunkSize = 1377628
let beginUpload = data;
let component = this;
let start = 0;
let startCount = 0;
let callStoreCouunt = 0;
for (start; start < zipedFile.length; start += chunkSize) {
const chunk = zipedFile.slice(start, start + chunkSize + 1)
startCount +=1;
// debugger
// var base64Image = new Buffer( zipedFile ).toString('base64');
var base64Image = new Buffer( chunk ).toString('base64');
console.log(chunk, startCount);
let uploadPackage: documentInterfaces.UploadPackage = {
transaction: {
documentId: {value: data.documentId.value},
transactionId: data.transactionId,
fileGuid: data.fileGuid
},
packageBuffer: base64Image
};
// debugger
component.$store.dispatch('documents/uploadPackage', uploadPackage)
.then(({ data, status }: { data: documentInterfaces.ReciveAttachScene , status: number }) => {
// debugger
if(status !== 200){
component.$message({
message: data,
type: "error"
});
component.rejectUpload(beginUpload);
}
else{
callStoreCouunt+=1;
console.log(chunk, "res" + callStoreCouunt)
debugger
if(callStoreCouunt === startCount){
let commitPackage = {
transaction: {
documentId: {value: uploadPackage.transaction.documentId.value},
transactionId: uploadPackage.transaction.transactionId,
fileGuid: uploadPackage.transaction.fileGuid
}
};
debugger
component.commitUpload(commitPackage);
}
}
});
}
You cannot control which chunk of data reaches the server first. If there's a network problem somewhere on its way, it might go around the planet multiple times before it reaches the server.
Even if the 1st chunk was sent 5 ms earlier than the 2nd one, the 2nd chunk might reach the server earlier.
But there are a few ways you can solve this:
Method 1:
Wait for the server response before sending the next chunk:
let state = {
isPaused: false
}
let sentChunks = 0
let totalChunks = getTotalChunksAmount()
let chunkToSend = ...
setInterval(() => {
if (!state.isPaused && sentChunks < totalChunks) {
state.isPaused = true
send(chunkToSend)
sentChunks += 1
}
}, 100)
onServerReachListener(response => {
if (response === ...) {
state.isPaused = false
}
})
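The same idea (wait for the response before sending the next chunk) can also be written without a timer by awaiting each send inside the loop. This is only a minimal sketch: sendChunk is a hypothetical function that sends one chunk and returns a promise resolving to an object with a status field, which is an assumption about the transport rather than something taken from the question's code.

```javascript
// sendChunk is hypothetical: it takes one chunk and returns a promise
// that resolves with the server's response ({ status: ... }).
async function sendChunksInOrder(chunks, sendChunk) {
  const responses = [];
  for (const chunk of chunks) {
    // The loop pauses here until the server has answered, so the next
    // chunk is never sent before the current one is acknowledged.
    const response = await sendChunk(chunk);
    if (response.status !== 200) {
      throw new Error("Upload failed with status " + response.status);
    }
    responses.push(response);
  }
  return responses;
}
```

The trade-off is throughput: only one request is in flight at a time, so the total time is the sum of all round trips.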
Method 2:
If you don't need to process chunks sequentially in real time, you can just wait for all of them to arrive on the server, then sort them before processing:
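let chunks = []
function onChunkReceived (chunk) {
  chunks.push(chunk)
  // Chunks can arrive out of order, so the one flagged isLast may not arrive last.
  // Assumption: every chunk carries an id and the total number of chunks.
  if (chunks.length === chunk.total) {
    chunks.sort((a, b) => a.id - b.id)
    processChunks()
  }
}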
Method 3:
If you do need to process chunks sequentially in real time, give every chunk an id property and process them in id order, storing any chunk that arrives early for later:
let chunksToProcess = []
let lastProcessedChunkId = -1
function onChunkReceived (chunk) {
  if (chunk.id === lastProcessedChunkId + 1) {
    processChunk(chunk)
    lastProcessedChunkId += 1
    processStoredChunks()
  }
  else {
    chunksToProcess.push(chunk)
  }
}
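Method 3 leaves processStoredChunks undefined. A minimal runnable sketch of the whole idea follows, assuming chunk ids start at 0 and increase by 1; createOrderedReceiver and the processChunk callback are hypothetical names used only for illustration.

```javascript
// Returns a receiver that accepts chunks in any order and calls
// processChunk on them strictly in id order (ids assumed to be 0, 1, 2, ...).
function createOrderedReceiver(processChunk) {
  const stored = new Map(); // chunks that arrived early, keyed by id
  let nextId = 0;           // id of the chunk we are waiting for

  return function onChunkReceived(chunk) {
    stored.set(chunk.id, chunk);
    // Drain every chunk that is now in sequence.
    while (stored.has(nextId)) {
      processChunk(stored.get(nextId));
      stored.delete(nextId);
      nextId += 1;
    }
  };
}
```

For example, if chunks 2, 0, 1 arrive in that order, chunk 0 is processed as soon as it arrives, and chunks 1 and 2 are processed together when chunk 1 arrives.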
I'm trying to write code that makes requests to a website for web scraping.
These are the steps:
First part of the code STARTS
The program makes a request to the mainURL.
The program selects some objects from the HTML of the mainURL and stores them in an array of objects (advert). One of the properties of each object is its link, which we'll call numberURL; the code selects it automatically using a CSS selector. There are about 80-90 objects.
The program makes a request to every numberURL (80-90 requests),
and for each of them it sets more properties on the same object and selects another link, which we'll call accountURL.
The program creates a CSV file and writes every object to its own row.
First part of the code ENDS
The first part works well and has no issues, but the second part does.
Second part of the code STARTS
The program makes a request to every accountURL from the previous objects.
The program selects some objects from the HTML of the accountURL and stores them in another array of different objects (account), also using CSS selectors.
The program should console.log() all the account objects.
Second part of the code ENDS
The second part has a bug: when logging the objects, we see that their properties still have their default values.
For debugging purposes, I took one advert object and set its link manually in the code:
post[0].link = 'https://999.md/ru/profile/denisserj'
When I run the code, it works correctly for this object and shows the changed properties, but not for the rest of them.
I tried setting some timeouts, thinking that the code tries to read the link before the second request has finished, but that had no effect.
I also tried to console.log the link to check that it exists in the array. It does, but that did not help either.
Finally, here is the code:
// CLASSES
class advert {
constructor() {
this.id = 0;
this.tile = new String();
this.link = new String();
this.phone = new String();
this.account = new String();
this.accountLink = new String();
this.text = new String();
this.operator = new String();
}
show() {
console.log(this.id, this.title, this.link, this.phone, this.account, this.accountLink, this.text, this.operator);
}
}
class account {
constructor() {
this.name = 0;
this.createdAt = 0;
this.phone = [];
this.ads = [];
this.adsNumber = 0;
}
show() {
console.log(this.name, this.createdAt, this.phone, this.ads, this.adsNumber);
}
}
// HEADERS
const mainRequest = require('request');
const auxRequest = require('request');
const cheerio1 = require('cheerio');
const cheerio2 = require('cheerio');
const fs = require('fs');
const fs2 = require('fs');
const adFile = fs.createWriteStream('anunturi.csv');
const accFile = fs2.createWriteStream('conturi.csv');
// SETTINGS
const host = 'https://999.md'
const category = 'https://999.md/ru/list/transport/cars'
const timeLimit = 60; //seconds
// VARIABLES
let post = [];
let postNumber = 0;
let acc = [];
// FUNCTIONS
function deleteFromArray(j) {
post.splice(j, 1);
}
function number(i) {
let category = post[i].link;
auxRequest(category, (error, response, html) => {
if (!error && response.statusCode == 200) {
const $ = cheerio1.load(html);
let phone;
const siteTitle = $('strong').each((id, el) => {
phone = $(el).text();
});
const txt = $('.adPage__content__description').html();
const person = $('.adPage__header__stats').find('.adPage__header__stats__owner').text();
const linkToPerson = host + $('.adPage__header__stats').find('.adPage__header__stats__owner').find('a').attr('href');
post[i].phone = phone;
post[i].account = person;
post[i].accountLink = linkToPerson;
post[i].text = txt;
if (i == postNumber) {
console.log('1. Number Putting done')
writeToFileAd(accountPutter, writeToFileAccount);
}
}
});
}
function writeToFileAd() {
adFile.write('ID, Titlu, Link, Text, Cont, LinkCont, Operator\n')
for (let i = 0; i <= postNumber; i++) {
adFile.write(`${post[i].id}, ${post[i].title}, ${post[i].link}, ${post[i].phone}, ${post[i].account}, ${post[i].accountLink}, ${post[i].operator}\n`);
}
console.log('2. Write To File Ad done')
accountPutter();
}
function accountAnalyzis(i) {
let category = post[i].link;
const mainRequest = require('request');
category = category.replace('/ru/', '/ro/');
mainRequest(category, (error, response, html) => {
if (!error && response.statusCode == 200) {
const $ = cheerio2.load(html);
const name = $('.user-profile__sidebar-info__main-wrapper').find('.login-wrapper').text();
let createdAt = $('.date-registration').text();
createdAt = createdAt.replace('Pe site din ', '');
const phones = $('.user-profile__info__data').find('dd').each((id, el) => {
let phone = $(el).text();
acc[i].phone.push(phone);
});
const ads = $('.profile-ads-list-photo-item-title').find('a').each((id, el) => {
let ad = host + $(el).attr('href');
acc[i].ads.push(ad);
acc[i].adsNumber++;
});
acc[i].name = name;
acc[i].createdAt = createdAt;
console.log(name)
if (i == postNumber) {
console.log('3. Account Putting done')
writeToFileAccount();
}
}
});
}
function writeToFileAccount() {
for (let i = 0; i <= postNumber; i++) {
accFile.write(`${acc[i].name}, ${acc[i].createdAt}, ${acc[i].phone}, ${acc[i].ads}, ${acc[i].adsNumber}\n`);
}
console.log('4. Write to file Account done');
}
function numberPutter() {
for (let i = 0; i <= postNumber; i++) {
number(i);
}
}
function accountPutter() {
for (let i = 0; i <= postNumber; i++) {
accountAnalyzis(i);
}
}
// MAIN
mainRequest(category, (error, response, html) => {
let links = [];
for (let i = 0; i < 1000; i++) {
post[i] = new advert();
}
for (let i = 0; i < 1000; i++) {
acc[i] = new account();
}
if (!error && response.statusCode == 200) {
const $ = cheerio2.load(html);
const siteTitle = $('.ads-list-photo-item-title').each((id, el) => {
const ref = host + $(el).children().attr('href');
const title = $(el).text();
post[id].id = id + 1;
post[id].title = title;
post[id].link = ref;
links[id] = ref;
postNumber = id;
});
post[0].link = 'https://999.md/ru/profile/denisserj'
numberPutter()
}
});
You have an error in this line:
const siteTitle = $('.ads-list-photo-item-title').each((id, el) => {
What you actually want is .find('a').each...
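Separately from the selector, checks such as i == postNumber in the question only show that the request with the highest index has returned, not that all requests have, because responses can arrive in any order. A minimal sketch of waiting for every request is shown here. It assumes a hypothetical callback-style fetchPage(url, callback) in place of request, and the helper names fetchAsPromise and fetchAll are illustrative only.

```javascript
// Wraps a callback-style fetch function (url, callback(error, body)) in a promise.
// fetchPage is hypothetical; it stands in for a request-like function.
function fetchAsPromise(fetchPage, url) {
  return new Promise(function (resolve, reject) {
    fetchPage(url, function (error, body) {
      if (error) {
        reject(error);
      } else {
        resolve(body);
      }
    });
  });
}

// Starts one request per URL and resolves when all of them have finished,
// regardless of the order in which the responses arrive.
function fetchAll(fetchPage, urls) {
  return Promise.all(urls.map(function (url) {
    return fetchAsPromise(fetchPage, url);
  }));
}
```

The resolved array is in the same order as urls, so the result at index i belongs to urls[i] and the file can be written once the promise resolves.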
I have multiple pictures that a user uploads. Once they hit the upload button, I want to upload each picture. I'd like to start with one at a time and then maybe try batches of 5 at a time.
@action
uploadAllFiles = async () => {
for (let i = 0; i < this.files.length; i++) {
runInAction(async () => {
const file = this.files[i];
file.uploading = true;
var data = new FormData();
data.append('folderName', '4141515');
data.append('file', file.fileObject);
await axiosGenericInstance.post('/Images', data);
this.files.remove(file);
});
}
};
but I get
Since strict-mode is enabled, changing observed observable values
outside actions is not allowed. Please wrap the code in an action if
this change is intended. Tried to modify:
So I am not doing this right.
I did it like this :
export function getDetails(n, dataArray) {
return async(dispatch) => {
for(let i = 0; i < n; i++){
myService.getDetails(dataArray['x'][i]['y'])
.then(res => {
let item = dataArray['x'][i]['y']
dispatch({type: types.DETAILS_GRAPH, item, i, res});
})
.catch(error => {
dispatch({type: types.SET_ERROR, error});
});
}
//sleep(1) here if time needed between 2 calls
};
}
Hope it helps.
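I think you should await runInAction, so that each upload finishes before the next one starts. Note that in MobX, code that runs after an await inside the callback is no longer part of the action, so the final this.files.remove(file) may still need its own runInAction.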
for (let i = 0; i < this.files.length; i++) {
// first write await
await runInAction(async () => {
const file = this.files[i];
file.uploading = true;
var data = new FormData();
data.append('folderName', '4141515');
data.append('file', file.fileObject);
await axiosGenericInstance.post('/Images', data);
this.files.remove(file);
});
}
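The question also mentions uploading in batches of 5. A minimal sketch of that part, independent of MobX, could look like this. uploadOne is a hypothetical function that uploads a single file and returns a promise (for example a wrapper around the axios post above); any observable state changes would still have to happen inside actions.

```javascript
// uploadOne is hypothetical: it uploads a single file and returns a promise.
async function uploadInBatches(files, batchSize, uploadOne) {
  const results = [];
  for (let start = 0; start < files.length; start += batchSize) {
    const batch = files.slice(start, start + batchSize);
    // Start the whole batch at once, then wait for all of it
    // before moving on to the next batch.
    const batchResults = await Promise.all(batch.map((file) => uploadOne(file)));
    results.push(...batchResults);
  }
  return results;
}
```

Calling it with a batchSize of 1 uploads one file at a time; a batchSize of 5 keeps at most five uploads in flight.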
I want to create a tree using a recursive function. The input to this function is a node and I want to add its children to it with that recursive function.
The following code will explain my problem in a better way:
function getUpstreamChildrenRecusrively(node) {
var receiverId = localStorage.getItem("ReceiverId");
//API call to get the children node
axios({
method: 'get',
url: window.location.origin+"/api/rwa/coverageView/getUpstreamChildren?id="+node.elementId,
headers: {
"ReceiverId":receiverId
}
})
.then(response => {
localStorage.setItem("ReceiverId",response.headers["receiverid"]);
var data = response.data;
for(var i = 0; i < data.length; i++) {
var obj = data[i];
var result = {};
result.text = obj.print;
result.elementId = obj.id;
result.elementText = obj.text;
result.expanded = true;
result.visible = true;
result.icon = window.location.origin+"/api"+obj.image;
getUpstreamChildrenRecusrively(result);
node.nodes = []; //nodes property will contain children
node.nodes.push(result);
console.log("Tree so far:"+JSON.stringify(node));
}
})
.catch(error => {
})
}
For every recursive call, the value of the node is a separate node having a single child in nodes property. I want to see the node to be grown with all its children as a final result.
What am I missing in this code?
Thank you in advance!
It looks like you expect getUpstreamChildrenRecusrively to work synchronously. Learn more about JavaScript async/await and Promises.
Here is how it should probably work:
async function getUpstreamChildrenRecusrively(node) {
const receiverId = localStorage.getItem("ReceiverId")
const response = await axios({
method: 'get',
url: window.location.origin+"/api/rwa/coverageView/getUpstreamChildren?id="+node.elementId,
headers: {
ReceiverId: receiverId
}
})
localStorage.setItem("ReceiverId",response.headers["receiverid"])
const data = response.data
node.nodes = node.nodes || []
for(let i = 0; i < data.length; i++) {
const obj = data[i]
const result = {}
result.text = obj.print
result.elementId = obj.id
result.elementText = obj.text
result.expanded = true
result.visible = true
result.icon = window.location.origin + "/api" + obj.image
node.nodes.push(result)
await getUpstreamChildrenRecusrively(result)
}
}
getUpstreamChildrenRecusrively(initialNode).then(() => {
console.log('result node', initialNode)
})
Your understanding of how recursion works is flawed. I am not trying to be rude, just trying to help you understand that you need to study the subject more.
First of all, you are not returning anything from your function.
You are also checking the value of node after you have called your function recursively (which performs an async call and is scoped to the current function call).
You are making recursive API calls with no check in place for when the function should stop executing, which means it will run until your API call fails.
function getUpstreamChildrenRecusrively(node) {
var receiverId = localStorage.getItem("ReceiverId");
//Api call to get the children node
return axios({
method: "get",
url:
window.location.origin +
"/api/rwa/coverageView/getUpstreamChildren?id=" +
node.elementId,
headers: {
ReceiverId: receiverId
}
})
.then(response => {
localStorage.setItem("ReceiverId", response.headers["receiverid"]);
var data = response.data;
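node.nodes = []; //nodes property will contain children
var pending = Promise.resolve();
for (var i = 0; i < data.length; i++) {
var obj = data[i];
var result = {};
result.text = obj.print;
result.elementId = obj.id;
result.elementText = obj.text;
result.expanded = true;
result.visible = true;
result.icon = window.location.origin + "/api" + obj.image;
node.nodes.push(result);
//Chain the recursive calls so each child subtree loads after the previous one
pending = pending.then(getUpstreamChildrenRecusrively.bind(null, result));
}
//Resolve with node once all children are loaded; an empty response ends the recursion
return pending.then(() => node);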
})
.catch(error => {
/* I am using ES6 here, you can use something equivalent to check if node has a value */
if (Object.keys(node).length > 0) {
return node;
} else {
/* obviously you need other error handling logic here too */
}
});
}
I'm new to ES6 and Promises. I'm trying to use pdf.js to extract the text from all pages of a PDF file into a string array. When extraction is done, I want to parse the array somehow. Say the PDF file (passed correctly as a typed array) has 4 pages and my code is:
let str = [];
PDFJS.getDocument(typedarray).then(function(pdf) {
for(let i = 1; i <= pdf.numPages; i++) {
pdf.getPage(i).then(function(page) {
page.getTextContent().then(function(textContent) {
for(let j = 0; j < textContent.items.length; j++) {
str.push(textContent.items[j].str);
}
parse(str);
});
});
}
});
It manages to work, but, of course, the problem is that my parse function is called 4 times. I want to call parse only once, after all 4 pages have been extracted.
Similar to https://stackoverflow.com/a/40494019/1765767 -- collect page promises using Promise.all and don't forget to chain then's:
function gettext(pdfUrl){
var pdf = pdfjsLib.getDocument(pdfUrl);
return pdf.then(function(pdf) { // get all pages text
var maxPages = pdf.pdfInfo.numPages;
var countPromises = []; // collecting all page promises
for (var j = 1; j <= maxPages; j++) {
var page = pdf.getPage(j);
var txt = "";
countPromises.push(page.then(function(page) { // add page promise
var textContent = page.getTextContent();
return textContent.then(function(text){ // return content promise
return text.items.map(function (s) { return s.str; }).join(''); // value page text
});
}));
}
// Wait for all pages and join text
return Promise.all(countPromises).then(function (texts) {
return texts.join('');
});
});
}
// waiting on gettext to finish completion, or error
gettext("https://cdn.mozilla.net/pdfjs/tracemonkey.pdf").then(function (text) {
alert('parse ' + text);
},
function (reason) {
console.error(reason);
});
<script src="https://npmcdn.com/pdfjs-dist/build/pdf.js"></script>
A somewhat cleaner version of @async5's answer, updated for the latest version of "pdfjs-dist": "^2.0.943"
import PDFJS from "pdfjs-dist";
import PDFJSWorker from "pdfjs-dist/build/pdf.worker.js"; // add this to fit 2.3.0
PDFJS.disableTextLayer = true;
PDFJS.disableWorker = true; // no longer available since 2.3.0 (see imports)
const getPageText = async (pdf: Pdf, pageNo: number) => {
const page = await pdf.getPage(pageNo);
const tokenizedText = await page.getTextContent();
const pageText = tokenizedText.items.map(token => token.str).join("");
return pageText;
};
/* see example of a PDFSource below */
export const getPDFText = async (source: PDFSource): Promise<string> => {
Object.assign(window, {pdfjsWorker: PDFJSWorker}); // added to fit 2.3.0
const pdf: Pdf = await PDFJS.getDocument(source).promise;
const maxPages = pdf.numPages;
const pageTextPromises = [];
for (let pageNo = 1; pageNo <= maxPages; pageNo += 1) {
pageTextPromises.push(getPageText(pdf, pageNo));
}
const pageTexts = await Promise.all(pageTextPromises);
return pageTexts.join(" ");
};
This is the corresponding typescript declaration file that I have used if anyone needs it.
declare module "pdfjs-dist";
type TokenText = {
str: string;
};
type PageText = {
items: TokenText[];
};
type PdfPage = {
getTextContent: () => Promise<PageText>;
};
type Pdf = {
numPages: number;
getPage: (pageNo: number) => Promise<PdfPage>;
};
type PDFSource = Buffer | string;
declare module 'pdfjs-dist/build/pdf.worker.js'; // needed in 2.3.0
Example of how to get a PDFSource from a File with Buffer (from node types) :
file.arrayBuffer().then((ab: ArrayBuffer) => {
const pdfSource: PDFSource = Buffer.from(ab);
});
Here's a shorter (not necessarily better) version:
async function getPdfText(data) {
let doc = await pdfjsLib.getDocument({data}).promise;
let pageTexts = Array.from({length: doc.numPages}, async (v,i) => {
return (await (await doc.getPage(i+1)).getTextContent()).items.map(token => token.str).join('');
});
return (await Promise.all(pageTexts)).join('');
}
Here, data is a string or buffer (or you could change it to take the url, etc., instead).
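One property the Promise.all-based answers here rely on: Promise.all resolves with results in the order of the input promises, not in the order they finish. Pushing into a shared array from each callback, as in the question, gives no such guarantee. The sketch below shows this with a hypothetical fakeGetPageText that stands in for page extraction and deliberately finishes later pages first; it does not use pdf.js at all.

```javascript
// Hypothetical stand-in for extracting one page: later pages finish sooner.
function fakeGetPageText(pageNo, totalPages) {
  return new Promise((resolve) => {
    setTimeout(() => resolve("page" + pageNo), (totalPages - pageNo) * 5);
  });
}

async function getAllPageTexts(totalPages) {
  const pagePromises = [];
  for (let pageNo = 1; pageNo <= totalPages; pageNo += 1) {
    pagePromises.push(fakeGetPageText(pageNo, totalPages));
  }
  // Results come back in the order the promises were pushed,
  // not in the order they finished.
  return Promise.all(pagePromises);
}
```

Here page 4 resolves first and page 1 last, yet the resolved array still lists the pages from 1 to 4.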
Here's another TypeScript version with await and Promise.all based on the other answers:
import { getDocument } from "pdfjs-dist";
import {
DocumentInitParameters,
PDFDataRangeTransport,
TypedArray,
} from "pdfjs-dist/types/display/api";
export const getPdfText = async (
src: string | TypedArray | DocumentInitParameters | PDFDataRangeTransport
): Promise<string> => {
const pdf = await getDocument(src).promise;
const pageList = await Promise.all(Array.from({ length: pdf.numPages }, (_, i) => pdf.getPage(i + 1)));
const textList = await Promise.all(pageList.map((p) => p.getTextContent()));
return textList
.map(({ items }) => items.map(({ str }) => str).join(""))
.join("");
};
If you use the PDFViewer component, here is my solution that doesn't involve any promise or asynchrony:
function getDocumentText(viewer) {
let text = '';
for (let i = 0; i < viewer.pagesCount; i++) {
const { textContentItemsStr } = viewer.getPageView(i).textLayer;
for (let item of textContentItemsStr)
text += item;
}
return text;
}
I didn't know how to do it either, but thanks to async5 I got it working. I copied his code and updated it for the new version of pdf.js.
I made minimal corrections and also took the liberty of not joining all the pages into a single string. In addition, I used a regular expression that removes much of the extra whitespace that PDF extraction unfortunately produces (it does not solve all cases, but the vast majority).
This approach should be comfortable for most people to work with, but feel free to remove the regex or make any other changes.
// pdf-to-text.js v1, require pdf.js ( https://mozilla.github.io/pdf.js/getting_started/#download )
// load pdf.js and pdf.worker.js
function pdfToText(url, separator = ' ') {
let pdf = pdfjsLib.getDocument(url);
return pdf.promise.then(function(pdf) { // get all pages text
let maxPages = pdf.numPages; // public property; avoids the private _pdfInfo field
let countPromises = []; // collecting all page promises
for (let i = 1; i <= maxPages; i++) {
let page = pdf.getPage(i);
countPromises.push(page.then(function(page) { // add page promise
let textContent = page.getTextContent();
return textContent.then(function(text) { // return content promise
return text.items.map(function(obj) {
return obj.str;
}).join(separator); // value page text
});
}));
};
// wait for all pages and join text
return Promise.all(countPromises).then(function(texts) {
for(let i = 0; i < texts.length; i++){
texts[i] = texts[i].replace(/\s+/g, ' ').trim();
};
return texts;
});
});
};
// example of use:
// waiting on pdfToText to finish completion, or error
pdfToText('files/pdf-name.pdf').then(function(pdfTexts) {
console.log(pdfTexts);
// RESULT: ['TEXT-OF-PAGE-1', 'TEXT-OF-PAGE-2', ...]
}, function(reason) {
console.error(reason);
});