replace all values using javascript in an xml payload - javascript

I have an XML string in which I want to replace the values of specific tags using JavaScript.
This is the code:
function replaceDomainName (xmlPayload,domainId)
{
var oldDomain = '<DOMAIN_NAME>OOO';
var oldDomain2 = '<DomainName>OOO';
var newDomain = '<DOMAIN_NAME>'+domainId ;
var newDomain2 = '<DomainName>'+domainId ;
var xmlString = xmlPayload.toString();
var x = xmlString.replace(/oldDomain/g,newDomain)
x = x.replace(/oldDomain2/g,newDomain2)
console.log(x);
return x ;
}
When I invoke the function with the following XML, it throws an error:
<TransmissionHeader xmlns:tran="http://xmlns.oracle.com/apps/otm/TransmissionService" xmlns="">
<Version>20b</Version>
<TransmissionCreateDt>
<GLogDate>20200819124057</GLogDate>
<TZId>UTC</TZId>
<TZOffset>+00:00</TZOffset>
</TransmissionCreateDt>
<TransactionCount>1</TransactionCount>
<SenderHostName>https://xxx</SenderHostName>
<SenderSystemID>https:xxx</SenderSystemID>
<UserName>OOO</UserName>
<SenderTransmissionNo>404836</SenderTransmissionNo>
<ReferenceTransmissionNo>0</ReferenceTransmissionNo>
<GLogXMLElementName>PlannedShipment</GLogXMLElementName>
<NotifyInfo>
<ContactGid>
<Gid>
<DomainName>OOO</DomainName>
<Xid>SYS</Xid>
</Gid>
</ContactGid>
<ExternalSystemGid>
<Gid>
<DOMAIN_NAME>OOO</DOMAIN_NAME>
<Xid>IOT_SYSTEM</Xid>
</Gid>
</ExternalSystemGid>
</NotifyInfo>
</TransmissionHeader>
error: unknown: Unexpected token (14:23)

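The pattern /oldDomain/g matches the literal text "oldDomain", not the value of the oldDomain variable, so nothing is replaced. Put the text to match directly in the regex literal, and keep the same tag name in the replacement so the opening tag still matches its closing tag:
x = x.replace(/<DOMAIN_NAME>OOO/g, '<DOMAIN_NAME>' + domainId);
x = x.replace(/<DomainName>OOO/g, '<DomainName>' + domainId);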
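If the old value has to come from a variable, the pattern can be built with the RegExp constructor instead of a regex literal. The following is a minimal sketch under that assumption; escapeRegExp and the three-argument signature are illustrative names, not part of the original code.

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace oldDomain with newDomain, but only inside <DOMAIN_NAME> or
// <DomainName> elements; other tags such as <UserName> are left untouched.
function replaceDomainName(xmlPayload, oldDomain, newDomain) {
  const xmlString = String(xmlPayload);
  const pattern = new RegExp(
    "(<(DOMAIN_NAME|DomainName)>)" + escapeRegExp(oldDomain) + "(</\\2>)",
    "g"
  );
  // A replacer function keeps the captured tags and avoids "$" surprises.
  return xmlString.replace(pattern, (match, open, tag, close) => open + newDomain + close);
}

const sampleXml =
  "<Gid><DomainName>OOO</DomainName></Gid>" +
  "<UserName>OOO</UserName>" +
  "<Gid><DOMAIN_NAME>OOO</DOMAIN_NAME></Gid>";
const replacedXml = replaceDomainName(sampleXml, "OOO", "new.com");
console.log(replacedXml);
```

The back-reference \2 makes sure the closing tag has the same name as the opening tag, so the element stays well formed.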

While you can get a lot done with Regex, it can get really complicated when parsing XML.
See this example of using DOMParser and XMLSerializer:
https://jsfiddle.net/o1cenvs3/
const XML = `<TransmissionHeader xmlns:tran="http://xmlns.oracle.com/apps/otm/TransmissionService" xmlns="">
<Version>20b</Version>
<TransmissionCreateDt>
<GLogDate>20200819124057</GLogDate>
<TZId>UTC</TZId>
<TZOffset>+00:00</TZOffset>
</TransmissionCreateDt>
<TransactionCount>1</TransactionCount>
<SenderHostName>https://xxx</SenderHostName>
<SenderSystemID>https:xxx</SenderSystemID>
<UserName>OOO</UserName>
<SenderTransmissionNo>404836</SenderTransmissionNo>
<ReferenceTransmissionNo>0</ReferenceTransmissionNo>
<GLogXMLElementName>PlannedShipment</GLogXMLElementName>
<NotifyInfo>
<ContactGid>
<Gid>
<DomainName>OOO</DomainName>
<Xid>SYS</Xid>
</Gid>
</ContactGid>
<ExternalSystemGid>
<Gid>
<DOMAIN_NAME>OOO</DOMAIN_NAME>
<Xid>IOT_SYSTEM</Xid>
</Gid>
</ExternalSystemGid>
</NotifyInfo>
</TransmissionHeader>`;
if(typeof(String.prototype.trim) === "undefined")
{
String.prototype.trim = function()
{
return String(this).replace(/^\s+|\s+$/g, '');
};
}
function replaceDomainName (xmlPayload, oldValue, newValue)
{
const parser = new DOMParser();
const xmlDoc = parser.parseFromString(xmlPayload,"text/xml");
for(let tagName of ['DOMAIN_NAME', 'DomainName']) {
const instances = xmlDoc.getElementsByTagName(tagName);
for (let instance of instances) {
// textContent reads and writes plain text, so the new value is never parsed as markup
if(instance.textContent.trim() === oldValue )
instance.textContent = newValue;
}
};
const s = new XMLSerializer();
const result = s.serializeToString(xmlDoc);
return result;
}
const resultXML = replaceDomainName(XML, 'OOO', 'new.com');
console.log('resultXML', resultXML);
const textarea = document.createElement("textarea");
textarea.value = resultXML;
textarea.cols = 80;
textarea.rows = 24;
document.body.appendChild(textarea);

Related

Modifying a JS browser extension's HTML parser to exclude images

Let me get this out of the way first: although I have a programming background, I have no experience with any of the tools or languages I'm working with here. Sorry if I've misunderstood something simple or communicated something unclearly.
I'm using an extension to generate EPUBs from website links. The extension has the option to customize the parsing logic from the options dialog. Doing so yields this code for the parser it's currently using:
var parser = new DOMParser();
var dom = parser.parseFromString(source, "text/html");
var new_link = null;
var subchaps = [];
// Wordpress content
var main_cont = dom.querySelector(".entry-content")
if(main_cont != null){
var ancs = main_cont.querySelectorAll("a");
ancs.forEach((element) => {
if(RegExp(/click here to read|read here|continue reading/i).test(element.innerText)){
new_link = helpers["link_fixer"](element.href, url);
} else if (RegExp(/chapter|part/i).test(element.innerText)) {
subchaps.push(helpers["link_fixer"](element.href, url))
}
});
}
if (new_link != null) {
var res = await fetch(new_link);
var r_txt = await res.text();
dom = parser.parseFromString(r_txt, "text/html");
var out = helpers["readability"](dom);
return {title: out.title, html: out.content};
} else if (subchaps.length > 0) {
var html = "";
for(var subc in subchaps){
console.log(subchaps[subc]);
var cres = await fetch(subchaps[subc]);
var c_txt = await cres.text();
var cdom = parser.parseFromString(c_txt, "text/html");
var out = helpers["readability"](cdom);
html += "<h1>"+out.title+"</h1>"+ out.content
}
return {title: title, html: html};
}
var out = helpers["readability"](dom);
return {title: out.title, html: out.content};
I've inspected this code and gathered that it handles three cases: two where it needs to follow links deeper before parsing, and the simple case where it is already in the right place. The lion's share of the code deals with the first two cases, and it's largely the third that I'm interested in. Unfortunately, it appears that the second-to-last line is where the parsing actually happens:
var out = helpers["readability"](dom);
And this is a magic box to me. I can't for the life of me figure out what this is referencing.
I've searched the full file for a definition of 'helpers' or even 'readability' but come up blank. I was under the impression that the part I was editing was the 'readability' parser. I thought I'd be able to pop into the parser logic, add a line to exclude nodes with the <img> tag, and live happily ever after. What am I mistaken about? Or is what I want to do impossible, given what the extension is letting me modify?
To be clear, I am not asking for a full guide on how to write parser logic. I considered just parsing and repackaging the document before the line in question, but I didn't want to write the same code 3 times, and I don't like that I can't tell what it's doing. I couldn't even begin to search for documentation, given that I can't find the definition in the first place. Even just explaining what that line does and pointing to any relevant documentation would be a great help.
Thanks in advance.
(And here is the full file, if you feel like verifying that I didn't miss anything.)
main_parser: |
var link = new URL(url);
var parser = new DOMParser();
var dom = parser.parseFromString(source, "text/html");
switch (link.hostname) {
case "www.novelupdates.com":
var paths = link.pathname.split("/");
if (paths.length > 1 && paths[1] == "series") {
return {page_type:"toc", parser:"chaps_nu"};
}
}
// Default to all links
return {page_type:"toc", parser:"chaps_all_links"};
toc_parsers:
chaps_nu:
name: Novel Update
code: |
var parser = new DOMParser();
var dom = parser.parseFromString(source, "text/html");
var chap_popup = dom.querySelector("#my_popupreading");
if (chap_popup == null) {
return []
}
var chap_lis = chap_popup.querySelectorAll("a");
var chaps = [];
chap_lis.forEach((element) => {
if (element.href.includes("extnu")) {
chaps.unshift({
url_title: element.innerText,
url: helpers["link_fixer"](element.href, url),
});
}
});
var tit = dom.querySelector(".seriestitlenu").innerText;
var desc = dom.querySelector("#editdescription").innerHTML;
var auth = dom.querySelector("#authtag").innerText;
var img = dom.querySelector(".serieseditimg > img");
if (img == null){
img = dom.querySelector(".seriesimg > img");
}
return {"chaps":chaps,
meta:{title:tit, description: desc, author: auth, cover: img.src, publisher: "Novel Update"}
};
chaps_name_search:
name: Chapter Links
code: |
var parser = new DOMParser();
var dom = parser.parseFromString(source, "text/html");
var ancs = dom.querySelectorAll("a");
var chaps = []
ancs.forEach((element) => {
if(RegExp(/chap|part/i).test(element.innerText)){
chaps.push({
url_title: element.innerText,
url: helpers["link_fixer"](element.href, url),
});
}
});
return {"chaps":chaps,
meta:{}
};
chaps_all_links:
name: All Links
code: |
var parser = new DOMParser();
var dom = parser.parseFromString(source, "text/html");
var ancs = dom.querySelectorAll("a");
var chaps = []
ancs.forEach((element) => {
chaps.push({
url_title: element.innerText,
url: helpers["link_fixer"](element.href, url),
});
});
return {"chaps":chaps,
meta:{}
};
chap_main_parser: |
var url = new URL(url);
var parser = new DOMParser();
var dom = parser.parseFromString(source, "text/html");
// Generic parser
return {chap_type: "chap", parser:"chaps_readability"};
chap_parsers:
chaps_readability:
name: Readability
code: |
var parser = new DOMParser();
var dom = parser.parseFromString(source, "text/html");
var new_link = null;
var subchaps = [];
// Wordpress content
var main_cont = dom.querySelector(".entry-content")
if(main_cont != null){
var ancs = main_cont.querySelectorAll("a");
ancs.forEach((element) => {
if(RegExp(/click here to read|read here|continue reading/i).test(element.innerText)){
new_link = helpers["link_fixer"](element.href, url);
} else if (RegExp(/chapter|part/i).test(element.innerText)) {
subchaps.push(helpers["link_fixer"](element.href, url))
}
});
}
if (new_link != null) {
var res = await fetch(new_link);
var r_txt = await res.text();
dom = parser.parseFromString(r_txt, "text/html");
var out = helpers["readability"](dom);
return {title: out.title, html: out.content};
} else if (subchaps.length > 0) {
var html = "";
for(var subc in subchaps){
console.log(subchaps[subc]);
var cres = await fetch(subchaps[subc]);
var c_txt = await cres.text();
var cdom = parser.parseFromString(c_txt, "text/html");
var out = helpers["readability"](cdom);
html += "<h1>"+out.title+"</h1>"+ out.content
}
return {title: title, html: html};
}
var out = helpers["readability"](dom);
return {title: out.title, html: out.content};
chaps_raw:
name: No Parse
code: |
return {title: title, html: source}
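On the line asked about above: helpers["readability"](dom) is an ordinary property lookup followed by a call, so helpers must be an object that the extension passes into each snippet at run time, which would explain why no definition appears in the editable file. Judging from the out.title and out.content fields it probably wraps a Readability-style article extractor, but that is an assumption, not something the file confirms. One way to drop images without touching that helper is to remove the <img> nodes from the parsed document first. A minimal sketch, where stripImages is a hypothetical name:

```javascript
// Hypothetical helper (not part of the extension): drop every <img> element
// from a parsed document before it is handed to the readability helper.
function stripImages(doc) {
  doc.querySelectorAll("img").forEach((img) => img.remove());
  return doc;
}

// Intended use inside the chaps_readability snippet, where `helpers` and `dom`
// are supplied by the extension at run time:
//   var out = helpers["readability"](stripImages(dom));
//   return {title: out.title, html: out.content};
```

Because the function returns the document it was given, the same wrapper can be applied at all three helpers["readability"] call sites without repeating the removal logic.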

VS Code Extension API - Replace a String in the document?

const textEditor = vscode.window.activeTextEditor;
if (!textEditor) {
return; // No open text editor
}
for(var i=0;i<textEditor.document.lineCount;i++)
{
var textLine = textEditor.document.lineAt(i);
for(var j=textLine.range.start.character; j<=textLine.range.end.character; j++)
{
var startposition = new vscode.Position(i,j);
var endposition = new vscode.Position(i,j+1);
var range = new vscode.Range(startposition,endposition);
var text = textEditor.document.getText(range);
if(text === "\'"){
textEditor.edit(editBuilder => editBuilder.replace(range,"\""));
}
}
}
I need to replace all the single quotes with double quotes, but the call
textEditor.edit(editBuilder => editBuilder.replace(range,"\"")); only replaces the first occurrence. I need to replace every occurrence in the document.
let doc = vscode.window.activeTextEditor.document;
let editor = vscode.window.activeTextEditor;
var j=0;
editor.edit(editBuilder => {
for(var i=0;i<doc.lineCount;i++)
{
var line = doc.lineAt(i);
for(j=0;j<line.range.end.character;j++)
{
var startposition = new vscode.Position(i,j);
var endingposition = new vscode.Position(i,j+1);
var range = new vscode.Range(startposition,endingposition);
var charac = editor.document.getText(range);
if(charac == '\'')
{
editBuilder.replace(range,'\"');
console.log(startposition);
}
}
}
})
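The first version most likely stops after one replacement because every textEditor.edit call is asynchronous and is validated against the document version it started from, so the later calls in the loop are rejected. Collecting all replacements in a single edit callback avoids that. The sketch below shows the same idea using offsets from the full document text instead of a per-character scan; singleQuoteOffsets and replaceSingleQuotes are illustrative names, and the vscode module is passed in as a parameter purely so the sketch does not depend on how it is loaded.

```javascript
// Pure helper: offsets of every single quote in the given text.
function singleQuoteOffsets(text) {
  return Array.from(text.matchAll(/'/g), (m) => m.index);
}

// Queue every replacement inside ONE edit callback so they are applied together.
// `vscode` is the extension API module, `editor` a vscode.TextEditor.
function replaceSingleQuotes(vscode, editor) {
  const doc = editor.document;
  const offsets = singleQuoteOffsets(doc.getText());
  return editor.edit((editBuilder) => {
    for (const offset of offsets) {
      const range = new vscode.Range(doc.positionAt(offset), doc.positionAt(offset + 1));
      editBuilder.replace(range, '"');
    }
  });
}

// Inside an extension command this would be called as:
//   replaceSingleQuotes(require("vscode"), vscode.window.activeTextEditor);
```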

NodeJS using a class from another file

I have a JavaScript file that references another JavaScript file containing a class, using
const Champion = require("./championgg_webscraper_cheerio.js");
I then try to instantiate an object of the class Champion by
var temp = new Champion("hello");
console.log(temp);
When I do, it prints this to the console, which suggests an undefined variable:
Champion {}
Also, when I try to print a property of the class I get undefined. I think it might not have access to the most_frequent_completed_build variable.
console.log(temp.most_frequent_completed_build);
Here is a look at the championgg_webscraper_cheerio.js file
function Champion(champName) {
//CHEERIO webscraping
var cheerio = require('cheerio');
//REQUEST http library
var request = require('request');
//url of the champion
var url = "http://champion.gg/champion/Camille/Top?";
var most_frequent_completed_build;
var highest_win_percentage_completed_build;
request(url,
function(error, response, html) {
if (!error && response.statusCode == 200) {
var $ = cheerio.load(html);
var final_build_items = $(".build-wrapper a");
var mfcb = [];
var hwpcb = [];
for (i = 0; i < 6; i++) {
var temp = final_build_items.get(i);
temp = temp.attribs.href;
//slices <'http://leagueoflegends.wikia.com/wiki/> off the href
temp = temp.slice(38);
mfcb.push(temp);
}
for (i = 6; i < 12; i++) {
var temp = final_build_items.get(i);
temp = temp.attribs.href;
//slices <'http://leagueoflegends.wikia.com/wiki/> off the href
temp = temp.slice(38);
hwpcb.push(temp);
}
most_frequent_completed_build = mfcb;
highest_win_percentage_completed_build = hwpcb;
} else {
console.log("Response Error: " + response.statusCode);
}
}
);
};
module.exports = Champion;
I think you want a constructor function named Champion (a prototype or blueprint, like classes in other programming languages such as Java).
As an alternative, I would suggest learning the ES6 way of writing classes, which is similar to Java's.
You can achieve that by assigning the variables and methods to this inside the constructor function, so that they are accessible on an object created with the new keyword, i.e. make them class members or methods.
In your case,
function Champion(champName) {
//Some code
this.most_frequent_completed_build = null;
//Rest of code
}
module.exports = Champion;
Just make sure that whenever you access class variables you use this.variable_name, for example this.most_frequent_completed_build.
Then, when you create a new object of this class in the main app, you will be able to access all class members and methods.
const Champion = require("./championgg_webscraper_cheerio.js");
var temp = new Champion("hello");
console.log(temp.most_frequent_completed_build);
You are exporting a function.
All you have to do is call that function, like this:
var temp = Champion();
You can read more about new keyword here and here
function Champion(champName) {
//CHEERIO webscraping
var cheerio = require('cheerio');
//REQUEST http library
var request = require('request');
//url of the champion
var url = "http://champion.gg/champion/Camille/Top?";
var most_frequent_completed_build;
var highest_win_percentage_completed_build;
request(url,
function(error, response, html) {
if (!error && response.statusCode == 200) {
var $ = cheerio.load(html);
var final_build_items = $(".build-wrapper a");
var mfcb = [];
var hwpcb = [];
for (i = 0; i < 6; i++) {
var temp = final_build_items.get(i);
temp = temp.attribs.href;
//slices <'http://leagueoflegends.wikia.com/wiki/> off the href
temp = temp.slice(38);
mfcb.push(temp);
}
for (i = 6; i < 12; i++) {
var temp = final_build_items.get(i);
temp = temp.attribs.href;
//slices <'http://leagueoflegends.wikia.com/wiki/> off the href
temp = temp.slice(38);
hwpcb.push(temp);
}
most_frequent_completed_build = mfcb;
highest_win_percentage_completed_build = hwpcb;
} else {
console.log("Response Error: " + response.statusCode);
}
}
);
return {most_frequent_completed_build:most_frequent_completed_build};
};
module.exports = Champion;
var temp = new Champion("hello");
console.log(temp.most_frequent_completed_build);
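Note that in both versions above the request callback runs after the constructor (or function) has already returned, so the build lists are still unset when they are logged straight away. A minimal sketch of one way to handle that, using an ES6 class with an async load method: the fetchLinks parameter and the stub data are hypothetical stand-ins for the request and cheerio step, not the original scraping code.

```javascript
class Champion {
  // fetchLinks is a hypothetical function: (champName) => Promise<string[]> of item hrefs.
  constructor(champName, fetchLinks) {
    this.champName = champName;
    this.fetchLinks = fetchLinks;
    this.most_frequent_completed_build = null;
    this.highest_win_percentage_completed_build = null;
  }

  // Resolves with this instance once both build lists have been filled in.
  async load() {
    const hrefs = await this.fetchLinks(this.champName);
    // Keep only the part after the last "/" instead of a hard-coded slice(38).
    const names = hrefs.map((href) => href.slice(href.lastIndexOf("/") + 1));
    this.most_frequent_completed_build = names.slice(0, 6);
    this.highest_win_percentage_completed_build = names.slice(6, 12);
    return this;
  }
}

// In the real file this would be followed by: module.exports = Champion;

// Stub data source so the sketch runs without network access.
const fakeFetchLinks = async () =>
  Array.from({ length: 12 }, (_, i) => "http://example.test/wiki/Item_" + i);

const champion = new Champion("Camille", fakeFetchLinks);
console.log(champion.most_frequent_completed_build); // null: nothing has loaded yet
const loadedChampion = champion.load().then((c) => {
  console.log(c.most_frequent_completed_build); // filled in once load() resolves
  return c;
});
```

The caller has to wait for load() (with then or await) before reading the properties; no arrangement of this or return inside the constructor can make the asynchronous result available synchronously.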

node.js replace() - Invalid string length error

I've just written a small script that replaces every variable name listed in a .txt file with its value in a JS file.
Example:
Txt file example (values):
Hi = "HELLO WORLD",
Hey = /someregex/g,
Hh = 'haha';
Script example:
window[Hi] = true;
"someregex hi".replace(Hey, "")
window[Hh] = 1;
Here's my script:
var fs = require("fs")
var script = fs.readFileSync("./script.js", "utf8");
var vars = fs.readFileSync("./vars.txt", "utf8");
var replace = {}
var spl = vars.replace(/\r\n/g, "").replace(/ /g, "").split(",");
console.log("caching variables")
for(var dt of spl) {
var splt = dt.split(" = ");
var name = splt[0];
var val = splt[1];
if(!name || !val) {
continue;
}
if(val.endsWith(";")) {
val = val.slice(0, -1);
}
replace[name] = val;
}
console.log("Variables are in cache!")
console.log("Replacing variables in script")
var i = 1;
var t = Object.keys(replace).length;
for(var var_name in replace) {
var var_val = replace[var_name];
var regex = new RegExp(var_name, "g");
console.log(i, "/", t, "Replacing", var_name, "with", var_val, "regex", regex)
script = script.replace(regex, var_val);
i++;
}
console.log("DONE!")
fs.writeFileSync("./dec.js", script, "utf8")
However, when i reaches about 100, I get this error:
RangeError: Invalid string length
at RegExp.[Symbol.replace] (native)
at String.replace (native)
EDIT: also, I can see that node.js process is using ~400MB of RAM and I have the error when it reaches 900MB
What's wrong?
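One plausible cause, which the post does not confirm, is that the script is rewritten once per variable: a short name can then also match inside longer identifiers or inside values inserted by earlier passes, and a "$&" in a value re-inserts the matched text, so the string can keep growing. A minimal sketch that does all replacements in a single pass with one combined pattern and a replacer function; replaceVariables and escapeRegExp are illustrative names.

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace every whole-word occurrence of a known variable name in one pass.
function replaceVariables(script, values) {
  const names = Object.keys(values);
  if (names.length === 0) return script;
  const pattern = new RegExp("\\b(" + names.map(escapeRegExp).join("|") + ")\\b", "g");
  // A function replacer inserts the value verbatim ("$&" and "$1" are not expanded),
  // and text that was already inserted is never scanned again.
  return script.replace(pattern, (name) => values[name]);
}

const decoded = replaceVariables(
  "window[Hi] = true; window[Hh] = 1;",
  { Hi: '"HELLO WORLD"', Hh: "'haha'" }
);
console.log(decoded);
```

The \b word boundaries assume the variable names are plain identifiers; names starting or ending with other characters would need a different boundary check.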

javascript parser for a string which contains .ini data

If a string contains .ini file data, how can I parse it in JavaScript?
Is there a JavaScript parser that would help here?
Typically, the string holds the content of a configuration file. (The file cannot be read through JavaScript, but I somehow get the .ini data into a string.)
I wrote a JavaScript function inspired by node-iniparser.js:
function parseINIString(data){
var regex = {
section: /^\s*\[\s*([^\]]*)\s*\]\s*$/,
param: /^\s*([^=]+?)\s*=\s*(.*?)\s*$/,
comment: /^\s*;.*$/
};
var value = {};
var lines = data.split(/[\r\n]+/);
var section = null;
lines.forEach(function(line){
if(regex.comment.test(line)){
return;
}else if(regex.param.test(line)){
var match = line.match(regex.param);
if(section){
value[section][match[1]] = match[2];
}else{
value[match[1]] = match[2];
}
}else if(regex.section.test(line)){
var match = line.match(regex.section);
value[match[1]] = {};
section = match[1];
}else if(line.length == 0 && section){
section = null;
};
});
return value;
}
Updated 2017-05-10: fixed a bug with keys that contain spaces.
EDIT:
Sample of ini file read and parse
You could try config-ini-parser; it's similar to Python's ConfigParser, without the I/O operations.
It can be installed with npm or bower. Here is an example:
var ConfigIniParser = require("config-ini-parser").ConfigIniParser;
var delimiter = "\r\n"; //or "\n" for *nix
var parser = new ConfigIniParser(delimiter); //if no delimiter is passed, the default "\n" is used
parser.parse(iniContent);
var value = parser.get("section", "option");
parser.stringify('\n'); //get all the ini file content as a string
For more detail, check the project main page or the npm package page.
Here's a function that can parse ini data from a string into an object (on the client side):
function parseINIString(data){
var regex = {
section: /^\s*\[\s*([^\]]*)\s*\]\s*$/,
param: /^\s*([\w\.\-\_]+)\s*=\s*(.*?)\s*$/,
comment: /^\s*;.*$/
};
var value = {};
var lines = data.split(/\r\n|\r|\n/);
var section = null;
for(var x=0;x<lines.length;x++)
{
if(regex.comment.test(lines[x])){
continue; //skip comment lines; a return here would abort the whole function
}else if(regex.param.test(lines[x])){
var match = lines[x].match(regex.param);
if(section){
value[section][match[1]] = match[2];
}else{
value[match[1]] = match[2];
}
}else if(regex.section.test(lines[x])){
var match = lines[x].match(regex.section);
value[match[1]] = {};
section = match[1];
}else if(lines[x].length == 0 && section){//a blank line ends the current section
section = null;
};
}
return value;
}
Based on the other responses, I've modified it so you can have nested sections (note the TypeScript type annotation on the parameter):
function parseINI(data: string) {
let rgx = {
section: /^\s*\[\s*([^\]]*)\s*\]\s*$/,
param: /^\s*([^=]+?)\s*=\s*(.*?)\s*$/,
comment: /^\s*;.*$/
};
let result = {};
let lines = data.split(/[\r\n]+/);
let section = result;
lines.forEach(function (line) {
//comments
if (rgx.comment.test(line)) return;
//params
if (rgx.param.test(line)) {
let match = line.match(rgx.param);
section[match[1]] = match[2];
return;
}
//sections
if (rgx.section.test(line)) {
section = result
let match = line.match(rgx.section);
for (let subSection of match[1].split(".")) {
!section[subSection] && (section[subSection] = {});
section = section[subSection];
}
return;
}
});
return result;
}
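As a quick check of the nested-section idea above, here is a plain JavaScript sketch of the same approach (type annotation dropped) run against a made-up sample. The sample keys and the example.test host are illustrative assumptions.

```javascript
function parseINI(data) {
  const rgx = {
    section: /^\s*\[\s*([^\]]*)\s*\]\s*$/,
    param: /^\s*([^=]+?)\s*=\s*(.*?)\s*$/,
    comment: /^\s*;.*$/,
  };
  const result = {};
  let section = result; // parameters before any [section] go to the top level
  for (const line of data.split(/[\r\n]+/)) {
    if (rgx.comment.test(line)) continue;
    const param = line.match(rgx.param);
    if (param) {
      section[param[1]] = param[2];
      continue;
    }
    const header = line.match(rgx.section);
    if (header) {
      // "[a.b]" walks (and creates, if needed) result.a.b
      section = result;
      for (const name of header[1].split(".")) {
        if (!section[name]) section[name] = {};
        section = section[name];
      }
    }
  }
  return result;
}

const sampleIni = [
  "; global settings",
  "name = demo",
  "[server]",
  "host = example.test",
  "[server.tls]",
  "enabled = true",
].join("\n");

const config = parseINI(sampleIni);
console.log(JSON.stringify(config));
// {"name":"demo","server":{"host":"example.test","tls":{"enabled":"true"}}}
```

All values come back as strings, so "true" above is the text "true", not a boolean; converting types is left to the caller.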
