I'm trying to write a parser that supports the following types of query clauses:
from: A person
at: a specific company
location: The person's location
So a sample query would look like this:
from:Alpha at:Procter And Gamble location:US
How do I write this generic parser in JavaScript? Also, I was considering including AND operators inside queries, like
from:Alpha AND at:Procter And Gamble AND location:US
However, this would conflict with any field value that itself contains "And" (Procter And Gamble).
Use a character like ";" instead of AND and then call these functions:
var query = 'from:Alpha;at:Procter And Gamble;location:US';
var result = query.split(';').map(v => v.split(':'));
console.log(result);
You'll then have an array of pairs, where pair[0] is the property name and pair[1] is the property value.
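One caveat with the split(':') approach: a value that itself contains a colon gets cut into extra pieces. A minimal sketch, assuming the same ';'-separated format, that splits each clause on its first colon only and collects the pairs into an object (the name parseClauses is purely illustrative):

```javascript
// Hypothetical helper: parse 'key:value;key:value' into a plain object.
function parseClauses(query) {
  const result = {};
  for (const clause of query.split(';')) {
    const idx = clause.indexOf(':');
    if (idx === -1) continue; // skip empty or malformed clauses
    const key = clause.slice(0, idx).trim();
    const value = clause.slice(idx + 1).trim(); // everything after the first colon
    if (key) result[key] = value;
  }
  return result;
}

console.log(parseClauses('from:Alpha;at:Procter And Gamble;location:US'));
```

Returning an object rather than an array of pairs makes lookups like result.at direct, at the cost of keeping only the last value when a key repeats.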
Assuming your query always has the form from: at: location:
you can do this:
const regex = /from:\s*(.*?)\s*at:\s*(.*?)\s*location:\s*(.*)\s*/
const queryToObj = query => {
const [,from,at,location] = regex.exec(query)
return {from,at,location}
}
console.log(queryToObj("from:Alpha at Betaat: Procter And Gamble location: US"))
However, adding a terminator lets you mix the order and omit some keywords:
const regex = /(\w+):\s*(.*?)\s*;/g
const queryToObj = query => {
const obj = {}
let temp
while(temp = regex.exec(query)){
let [,key,value] = temp
obj[key] = value
}
return obj
}
console.log(queryToObj("from:Alpha at Beta;at:Procter And Gamble;location:US;"))
console.log(queryToObj("at:Procter And Gamble;location:US;from:Alpha at Beta;"))
console.log(queryToObj("from:Alpha at Beta;"))
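If adding a terminator is not an option, one possible alternative is to treat the next known keyword as the end of the current value. The sketch below assumes the set of keywords is fixed and consists of plain words; the function name parseQuery and its default key list are assumptions taken from the question, not an established API:

```javascript
// Hypothetical helper: parse 'from:... at:... location:...' in any order,
// without terminators, given a fixed list of known keys.
function parseQuery(query, keys = ['from', 'at', 'location']) {
  const alt = keys.join('|');
  // A value runs lazily up to the next "<key>:" or to the end of the string.
  const pattern = new RegExp(
    `\\b(${alt}):\\s*(.*?)(?=\\s+(?:${alt}):|\\s*$)`,
    'g'
  );
  const result = {};
  for (const [, key, value] of query.matchAll(pattern)) {
    result[key] = value;
  }
  return result;
}

console.log(parseQuery('from:Alpha at:Procter And Gamble location:US'));
```

Because a value only ends at a keyword followed by a colon, a bare word such as "at" or "And" inside a value is left alone.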
const fs = require('fs');
// var fileRefer=new Array();
var fileRefer = fs.readFileSync('D:\\NgageAuto\\LoginID\\Creds.txt').toString().split("\n");
for(i in fileRefer) {
console.log(fileRefer[i]);
}
Output: Date: 2021-11-08 16:56:42 LoginID: pvgA1245 Password: Root#123
This is one example line from the file. I want the LoginID value (i.e. "pvgA1245") and the Password value (i.e. "Root#123").
Please help me with how I can do this.
There are hundreds of solutions to this problem, some with advantages over others in different scenarios.
Here are some using split() or a regex match, each either with destructuring or with "normal" assignment of the variables. A regex has the advantage of being more flexible, for example if the line schema sometimes differs and Password is missing. But if you can be sure the schema is always the same, or you just want to skip lines that don't have all the values, split() is totally fine.
Just add the relevant code snippet inside your for loop:
let i = "Date: 2021-11-08 16:56:42 LoginID: pvgA1245 Password: Root#123";
// split and destructure
const [ , , , , id, ,pw,] = i.split(" ");
console.log(id, pw);
// split and normally assign
const a = i.split(" ");
const id2 = a[4];
const pw2 = a[6];
console.log(id2, pw2);
// regex and destructure
const [, id3, pw3] = i.match(/(?:LoginID: ([^\s]*)) ?(?:Password: ([^\s]*))/);
console.log(id3, pw3);
// regex and normally assign
const m = i.match(/(?:LoginID: ([^\s]*)) ?(?:Password: ([^\s]*))/);
const id4 = m[1];
const pw4 = m[2];
console.log(id4, pw4);
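To illustrate the flexibility point about regexes, here is a minimal sketch in which the Password part is optional and lines without a LoginID are skipped. The function name parseCredentialLine and the returned field names are assumptions made for this example:

```javascript
// LoginID is required; the Password part may be absent.
const CRED_PATTERN = /LoginID:\s*(\S+)(?:\s+Password:\s*(\S+))?/;

// Hypothetical helper: returns { loginId, password } or null for non-matching lines.
function parseCredentialLine(line) {
  const m = CRED_PATTERN.exec(line);
  if (!m) return null;
  return { loginId: m[1], password: m[2] === undefined ? null : m[2] };
}

const lines = [
  'Date: 2021-11-08 16:56:42 LoginID: pvgA1245 Password: Root#123',
  'some unrelated line',
];
// Keep only the lines that actually contained a LoginID.
const creds = lines.map(parseCredentialLine).filter(Boolean);
console.log(creds);
```

Since \S+ stops at any whitespace, a trailing "\r" left over from splitting a Windows-style file on "\n" does not end up in the password.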
I've got an array of rows that I've parsed out of a table from html, stored in a list. Each of the rows in the list is a string that looks (something) like this:
["<td headers="DOCUMENT" class="t14data"><a target="6690-Exhibit-C-20190611-1" href="http://www.fara.gov/docs/6690-Exhibit-C-20190611-1.pdf" class="doj-analytics-processed"><span style="color:blue">Click Here </span></a></td><td headers="REGISTRATIONNUMBER" class="t14data">6690</td><td headers="REGISTRANTNAME" class="t14data">SKDKnickerbocker LLC</td><td headers="DOCUMENTTYPE" class="t14data">Exhibit C</td><td headers="STAMPED/RECEIVEDDATE" class="t14data">06/11/2019</td>","<td headers="DOCUMENT" class="t14data"><a target="5334-Supplemental-Statement-20190611-30" href="http://www.fara.gov/docs/5334-Supplemental-Statement-20190611-30.pdf" class="doj-analytics-processed"><span style="color:blue">Click Here </span></a></td><td headers="REGISTRATIONNUMBER" class="t14data">5334</td><td headers="REGISTRANTNAME" class="t14data">Commonwealth of Dominica Maritime Registry, Inc.</td><td headers="DOCUMENTTYPE" class="t14data">Supplemental Statement</td><td headers="STAMPED/RECEIVEDDATE" class="t14data">06/11/2019</td>"]
The code is pulled from the page with the following page.evaluate function using puppeteer.
I'd like to then parse this code with cheerio, which I find to be simpler and more understandable. However, when I pass each of the strings of html into cheerio, it fails to parse them correctly. Here's the current function I'm using:
let data = res.map((tr) => {
let $ = cheerio.load(tr);
const link = $("a").attr("href");
const number = $("td[headers='REGISTRATIONNUMBER']").text();
const name = $("td[headers='REGISTRANTNAME']").text();
const type = $("td[headers='DOCUMENTTYPE']").text();
const date = $("td[headers='STAMPED/RECEIVEDDATE']").text();
return { link, number, name, type, date };
});
For some reason, only the "a" tag is working correctly for each row. Meaning, the "link" variable is correctly defined, but none of the other ones are. When I use $("*") to return a list of what should be all of the td's, it returns an unusual node list:
What am I doing wrong, and how can I gain access to the td's with the various headers, and their text content? Thanks!
It usually looks more like this:
let data = res.map((i, tr) => {
const link = $(tr).find("a").attr("href");
const number = $(tr).find("td[headers='REGISTRATIONNUMBER']").text();
const name = $(tr).find("td[headers='REGISTRANTNAME']").text();
const type = $(tr).find("td[headers='DOCUMENTTYPE']").text();
const date = $(tr).find("td[headers='STAMPED/RECEIVEDDATE']").text();
return { link, number, name, type, date };
}).get();
Keep in mind that cheerio's map passes its callback (index, element), the reverse of the (element, index) order used by JavaScript's Array.prototype.map.
I found the solution. I'm simply returning the full html through puppeteer instead of trying to get individual rows, and then using the above suggestion (from @pguardiario) to parse the text:
const res = await page.evaluate(() => {
return document.body.innerHTML;
});
let $ = cheerio.load(res);
let trs = $(".t14Standard tbody tr.highlight-row");
let data = trs.map((i, tr) => {
const link = $(tr).find("a").attr("href");
const number = $(tr).find("td[headers='REGISTRATIONNUMBER']").text();
const registrant = $(tr).find("td[headers='REGISTRANTNAME']").text();
const type = $(tr).find("td[headers='DOCUMENTTYPE']").text();
const date = moment($(tr).find("td[headers='STAMPED/RECEIVEDDATE']").text()).valueOf().toString();
return { link, number, registrant, type, date };
}).get(); // .get() turns the cheerio result into a plain array
I need to parse multiple email bodies that look like:
Name: Bob smith
Email: hellol@aol.com
Phone Number: 4243331212
As part of a larger program I have the following function to parse the page based on Efficiently parsing email body in javascript:
function parse (i, body) {
var obj = {};
body.split('\n').forEach(v=>v.replace(/\s*(.*)\s*:\s*(.*)\s*/, (s,key,val)=>{obj[key]=isNaN(val)||val.length<1?val||undefined:Number(val);}));
var objArr = Object.values(obj);
var res = [];
res[0] = i
res.push(objArr)
return res
}
When I run this, I get a syntax error on this line:
body.split('\n').forEach(v=>v.replace(/\s*(.*)\s*:\s*(.*)\s*/, (s,key,val)=>{obj[key]=isNaN(val)||val.length<1?val||undefined:Number(val);}));
What am I doing wrong?
Google Apps Script's legacy (Rhino) runtime is based on an ECMAScript version that doesn't support arrow functions. Replace
array.forEach(element => element.replace(expression))
with
array.forEach(function(element) {
return element.replace(expression);
});
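Applied to the whole body-parsing step from the question, the same substitution might look like the sketch below. It avoids arrow functions and other newer syntax, assumes every relevant line has the form "Key: value", and, unlike the original, keeps all values as strings (so a phone number is not converted to a Number). The name parseBody is illustrative:

```javascript
// Hypothetical rewrite using only function expressions and var.
function parseBody(body) {
  var obj = {};
  body.split('\n').forEach(function (line) {
    var idx = line.indexOf(':');
    if (idx === -1) {
      return; // not a "Key: value" line
    }
    // Strip surrounding whitespace without relying on newer helpers.
    var key = line.slice(0, idx).replace(/^\s+|\s+$/g, '');
    var val = line.slice(idx + 1).replace(/^\s+|\s+$/g, '');
    if (key) {
      obj[key] = val;
    }
  });
  return obj;
}
```

Calling parseBody on the sample email body yields an object with the keys Name, Email, and Phone Number.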
The website I'm making is in two different languages; each piece of data is saved in MongoDB with the suffix _nl or _en.
With a URL I need to be able to set the language like this:
http://localhost/en/This-Is-English-Head/This-Is-English-Sub
My code looks like this:
var headPage = req.params.headPage;
var subPage = req.params.subPage;
var slug = 'name';
var slugSub = 'subPages.slug_en';
var myObject = {};
myObject[slugSub] = subPage;
myObject[slug] = headPage;
console.log(myObject);
Site.find(myObject,
function (err, pages) {
var Pages = {};
pages.forEach(function (page) {
Pages[page._id] = page;
});
console.log(Pages);
});
After logging it with console.log, I get the following:
{ 'subPages.slug_en': 'This-Is-English-Sub',
name: 'This-Is-English-Head' }
As you can see, the object name subPages.slug_en is seen as a string instead of an object name.
I know that JavaScript does not support underscores (I guess?), but I'm still looking for a fix; otherwise I'll be forced to change all underscores in my db to a different character...
Edit:
The final result of console.log needs to be:
{ subPages.slug_en: 'This-Is-English-Sub',
name: 'This-Is-English-Head' }
Instead of:
{ 'subPages.slug_en': 'This-Is-English-Sub',
name: 'This-Is-English-Head' }
Otherwise it does not work.
The reason you are seeing 'subPages.slug_en' (with string quotes) is because of the . in the object key, not the underscore.
Underscores are definitely supported in object keys without quoting.
Using subPages.slug_en (without string quotes) would require you to have an object as follows:
{ subPages: {slug_en: 'This-Is-English-Sub'},
name: 'This-Is-English-Head' }
Which you could set with the following (once myObject.subPages has been initialized as an object):
myObject['subPages']['slug_en'] = subPage;
Or simply:
myObject.subPages.slug_en = subPage;
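The difference between the two shapes can be seen in a small sketch. The quotes in the console output are only how a key that is not a valid identifier gets printed; the key itself is an ordinary string either way. If the object is meant as a MongoDB query filter, the single dotted string key is the form that dot notation expects. The variable names flat and nested are illustrative:

```javascript
const subPage = 'This-Is-English-Sub';

// Flat object: one key whose name happens to contain a dot.
const flat = {};
flat['subPages.slug_en'] = subPage;

// Nested object: subPages must exist before slug_en can be assigned,
// otherwise the assignment throws a TypeError.
const nested = { subPages: {} };
nested.subPages.slug_en = subPage;

console.log(Object.keys(flat));   // [ 'subPages.slug_en' ]
console.log(Object.keys(nested)); // [ 'subPages' ]
```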
I'm trying to concat two buffers with a space in between them in Node.js.
Here is my code.
var buff1 = new Buffer("Jumping");
var buff2 = new Buffer("Japang");
var buffSpace = new Buffer(1);
buffSpace[0] = "32";
var newBuff = Buffer.concat([buff1, buffSpace, buff2], (buff1.length + buff2.length + buffSpace.length));
console.log(newBuff.toString());
As per official doc, the first argument will be the Array list of Buffer objects. Hence I've created buffSpace for space.
Class Method: Buffer.concat(list[, totalLength])
list : Array List of Buffer objects to concat
totalLength: Number Total length of the buffers when concatenated
I'm getting the expected result, but I'm not sure whether this is the right way to do it. Please suggest a better solution if there is one.
There are three changes I would suggest.
First, if you are using Node v6 or later, use Buffer.from() instead of new Buffer(), as the latter is deprecated.
Second, you don't need to pass an argument for totalLength to Buffer.concat(), since it will be calculated automatically from the length of all of the buffers passed. While the docs note it will be faster to pass a total length, this will really only be true if you pass a constant value. What you are doing above is computing the length and then passing that, which is what the concat() function will do internally anyway.
Finally, I would recommend putting this in a function that works like Array.prototype.join(), but for buffers.
function joinBuffers(buffers, delimiter = ' ') {
let d = Buffer.from(delimiter);
return buffers.reduce((prev, b) => Buffer.concat([prev, d, b]));
}
And you can use it like this:
let buf1 = Buffer.from('Foo');
let buf2 = Buffer.from('Bar');
let buf3 = Buffer.from('Baz');
let joined = joinBuffers([buf1, buf2, buf3]);
console.log(joined.toString()); // Foo Bar Baz
Or set a custom delimiter like this:
let joined2 = joinBuffers([buf1, buf2, buf3], ' and ');
console.log(joined2.toString()); // Foo and Bar and Baz
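The reduce-based version calls Buffer.concat() once per element, copying the accumulated bytes each time, and reduce() without an initial value throws on an empty list. A possible variant, sketched here under the hypothetical name joinBuffersOnce (not part of Node's API), interleaves the delimiter first and concatenates a single time:

```javascript
// Hypothetical variant: build the interleaved list, then concatenate once.
function joinBuffersOnce(buffers, delimiter = ' ') {
  const d = Buffer.from(delimiter);
  const parts = [];
  buffers.forEach((b, i) => {
    if (i > 0) parts.push(d); // delimiter goes between elements only
    parts.push(b);
  });
  // Buffer.concat() returns a zero-length Buffer for an empty list.
  return Buffer.concat(parts);
}

const a = Buffer.from('Foo');
const b = Buffer.from('Bar');
const c = Buffer.from('Baz');
console.log(joinBuffersOnce([a, b, c]).toString()); // Foo Bar Baz
```

For a handful of small buffers the difference is negligible; the single concat mainly matters for long lists.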
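Read the Buffer stream and save it to a file like this:
const fs = require('fs');
// Collect each incoming chunk (a Buffer) from the request stream
const data = [];
req.on('data', chunk => {
  data.push(chunk);
});
// Once the stream has closed, join the chunks and write them out as text
req.on('close', () => {
  const parsedData = Buffer.concat(data).toString('utf8');
  fs.writeFileSync('./test.text', parsedData);
});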