doPost(e) with no result - Google script as API - javascript

I can't seem to get even the simplest doPost to reply in Apps Script. I have tried a few things from around the internet, but I cannot get a POST to return anything.
I have reduced it to the simplest script that should give a reply, as below.
function doPost(e) {
  var HTMLString = "<style> h1,p {font-family: 'Helvitica', 'Arial'}</style>"
    + "<h1>Hello World!</h1>"
    + "<p>Welcome to the Web App";
  HTMLOutput = HtmlService.createHtmlOutput(HTMLString);
  Logger.log(HTMLOutput);
  // Return plain text Output
  return ContentService.createTextOutput(HTMLOutput);
}
Deployment settings: execute as "me", accessible to "anyone".
I am using Postman to send the POST request, but I get no result.
Any ideas?

I understood your Web Apps setting from "I have edited the question to show the deployment settings are 'me' and 'anyone'".
About "Not sure what you mean by 'Reflect your latest script' to obtain a result, sorry, can you explain more?": when you use Web Apps, a script prepared with Google Apps Script is required. In this case, when you modify your script, the latest script has to be reflected in the current or a new version of the deployed Web Apps. When the script is reflected in the Web Apps as a new version, the deployment ID is changed, and by this, the endpoint of the Web Apps is also changed. I thought that this might not be useful for your situation, so I proposed reflecting the latest script in the Web Apps without changing the endpoint of the Web Apps. If that is unclear, I thought this post might be useful. Ref (Author: me)
About "The goal is to create an API, that will do things, but I haven't been able to get a result for a doPost at all. I have managed to doGet with a result, which is lovely, but I want the Post version.": unfortunately, in your current script, an object of "HtmlOutput" is passed to return ContentService.createTextOutput(HTMLOutput);. Because of this, only the text "HtmlOutput" is returned. If you want to send a value to the Web Apps with the POST method and retrieve the returned value, how about the following modification?
function doPost(e) {
  return ContentService.createTextOutput(JSON.stringify(e));
}
When you modify the script of your Web Apps, please reflect the latest script in the Web Apps. If you want to keep the endpoint of the Web Apps unchanged while doing so, please check this post.
For example, when the HTTP request is run with the POST method and a request body of '{"key":"value"}' sent as text, {"parameter":{},"postData":{"contents":"'{\"key\":\"value\"}'","length":17,"name":"postData","type":"text/plain"},"parameters":{},"contextPath":"","contentLength":17,"queryString":""} is returned.
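As a minimal sketch of sending such a request from another Apps Script project with UrlFetchApp (the /exec URL is a placeholder for your own deployment):
function postToWebApp() {
  // Placeholder URL; replace it with the /exec URL of your deployed Web App.
  var url = "https://script.google.com/macros/s/XXXXX/exec";
  var res = UrlFetchApp.fetch(url, {
    method: "post",
    contentType: "text/plain",
    payload: '{"key":"value"}'
  });
  Logger.log(res.getContentText()); // the JSON.stringify(e) text returned by doPost
}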
Note:
From your showing script, when you want to return your value of HTMLString, the modified script is as follows. But in this case, the HTML is returned as text; please be careful about this. If you want to see the rendered HTML, please use doGet and return HtmlService.createHtmlOutput(HTMLString);.
function doPost(e) {
  var HTMLString = "<style> h1,p {font-family: 'Helvitica', 'Arial'}</style>"
    + "<h1>Hello World!</h1>"
    + "<p>Welcome to the Web App";
  var HTMLOutput = HtmlService.createHtmlOutput(HTMLString).getContent();
  return ContentService.createTextOutput(HTMLOutput);
}
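For completeness, a doGet version that serves the same HTML rendered in the browser (as suggested above) would be:
function doGet(e) {
  var HTMLString = "<style> h1,p {font-family: 'Helvitica', 'Arial'}</style>"
    + "<h1>Hello World!</h1>"
    + "<p>Welcome to the Web App";
  return HtmlService.createHtmlOutput(HTMLString); // rendered as HTML when the /exec URL is opened
}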
Reference:
Taking advantage of Web Apps with Google Apps Script (Author: me)
Redeploying Web Apps without Changing URL of Web Apps for new IDE (Author: me)

Related

Getting client-side HTML from Javascript generated page using Google Apps Script

I am trying to write some code which captures limited information from webpages by using the following code in Google Apps Script:
var url = "myurlhere.com";
var str = UrlFetchApp.fetch(url).getContentText();
const RegexExp = /(?<=<span class="text-match">).*?(?=<\/span>)/gi;
var result = str.match(RegexExp);
And this works BEAUTIFULLY. The problem comes when the URL's page source is generated by client-side JavaScript. I am crawling all over the internet, and basically everywhere I look I get the same answer... "you need a headless browser", "use Node.js or Phantom.js".
Those answers don't apply to me: I am not the owner of the HTML file, so either I am missing something or Node.js does not work here, and Phantom.js will not work because this is not my local machine.
My current thoughts toward a solution: create a pop-up using Apps Script (load the entire webpage, not just the JavaScript source), and find a way to grab the generated HTML (the stuff you see when you "inspect" a page), but this is where I get stuck. I'm looking for an expert who may know best how to interface with a Chrome tab and grab this information with Google Apps Script.
Current attempt progress:
function openURLdata() {
  var js = " \
    <script> \
      window.open('myurlhere.com', 'URLdata', 'width=800, height=600'); \
      google.script.host.close(); \
    </script> \
  ";
  var html = HtmlService.createHtmlOutput(js)
    .setHeight(10)
    .setWidth(100);
  Logger.log(html.getContent());
  SpreadsheetApp.getUi().showModalDialog(html, 'Now loading.'); // If you use this on a Spreadsheet
}
Does anyone know if this is a step in the right direction? Does the fact that Apps Script runs server-side mean that I won't ever get a plain-text version of the rendered HTML page I need my data from? I have to get this task done in Google Sheets, and the target site does not have an API I can use.

Best option for crawling a website that loads content via ajax [duplicate]

Please advise how to scrape AJAX pages.
Overview:
All screen scraping first requires a manual review of the page you want to extract resources from. When dealing with AJAX you usually just need to analyze a bit more than the HTML alone.
With AJAX, this simply means that the value you want is not in the initial HTML document that you requested, but that JavaScript will be executed which asks the server for the extra information you want.
You can therefore usually just analyze the JavaScript, see which request it makes, and call that URL yourself from the start.
Example:
Take this as an example: assume the page you want to scrape from has the following script:
<script type="text/javascript">
function ajaxFunction()
{
  var xmlHttp;
  try
  {
    // Firefox, Opera 8.0+, Safari
    xmlHttp = new XMLHttpRequest();
  }
  catch (e)
  {
    // Internet Explorer
    try
    {
      xmlHttp = new ActiveXObject("Msxml2.XMLHTTP");
    }
    catch (e)
    {
      try
      {
        xmlHttp = new ActiveXObject("Microsoft.XMLHTTP");
      }
      catch (e)
      {
        alert("Your browser does not support AJAX!");
        return false;
      }
    }
  }
  xmlHttp.onreadystatechange = function()
  {
    if (xmlHttp.readyState == 4)
    {
      document.myForm.time.value = xmlHttp.responseText;
    }
  };
  xmlHttp.open("GET", "time.asp", true);
  xmlHttp.send(null);
}
</script>
Then all you need to do is make an HTTP request to time.asp on the same server instead. (Example from w3schools.)
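For instance, a minimal sketch of requesting that endpoint directly (the host is hypothetical; time.asp is the path from the example above):
var xmlHttp = new XMLHttpRequest();
// Ask the AJAX endpoint directly instead of loading the page that calls it.
xmlHttp.open("GET", "http://example.com/time.asp", true); // hypothetical host
xmlHttp.onreadystatechange = function() {
  if (xmlHttp.readyState == 4 && xmlHttp.status == 200) {
    console.log(xmlHttp.responseText); // the same value the page would have displayed
  }
};
xmlHttp.send(null);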
Advanced scraping with C++:
For complex usage, and if you're using C++, you could also consider using SpiderMonkey, Mozilla's JavaScript engine, to execute the JavaScript on a page.
Advanced scraping with Java:
For complex usage, and if you're using Java, you could also consider using Rhino, Mozilla's JavaScript engine for Java.
Advanced scraping with .NET:
For complex usage, and if you're using .NET, you could also consider using the Microsoft.Vsa assembly, recently replaced with ICodeCompiler/CodeDOM.
In my opinion the simplest solution is to use CasperJS, a framework based on the headless WebKit browser PhantomJS.
The whole page is loaded, and it's very easy to scrape any ajax-related data.
You can check this basic tutorial to learn Automating & Scraping with PhantomJS and CasperJS
You can also have a look at this example code on how to scrape Google Suggest keywords:
/*global casper:true*/
var casper = require('casper').create();
var suggestions = [];
var word = casper.cli.get(0);

if (!word) {
  casper.echo('please provide a word').exit(1);
}

casper.start('http://www.google.com/', function() {
  this.sendKeys('input[name=q]', word);
});

casper.waitFor(function() {
  return this.fetchText('.gsq_a table span').indexOf(word) === 0;
}, function() {
  suggestions = this.evaluate(function() {
    var nodes = document.querySelectorAll('.gsq_a table span');
    return [].map.call(nodes, function(node) {
      return node.textContent;
    });
  });
});

casper.run(function() {
  this.echo(suggestions.join('\n')).exit();
});
If you can get at it, try examining the DOM tree. Selenium does this as a part of testing a page. It also has functions to click buttons and follow links, which may be useful.
The best way to scrape web pages that use Ajax, or pages using JavaScript in general, is with a browser itself or a headless browser (a browser without a GUI). Currently PhantomJS is a well-promoted headless browser based on WebKit. An alternative that I have used with success is HtmlUnit (in Java, or in .NET via IKVM), which is a simulated browser. Another known alternative is using a web automation tool like Selenium.
I wrote many articles about this subject, like web scraping Ajax and JavaScript sites and automated browserless OAuth authentication for Twitter. At the end of the first article there are a lot of extra resources that I have been compiling since 2011.
I like PhearJS, but that might be partially because I built it.
That said, it's a service you run in the background that speaks HTTP(S) and renders pages as JSON for you, including any metadata you might need.
It depends on the AJAX page. The first part of screen scraping is determining how the page works. Is there some sort of variable you can iterate through to request all the data from the page? Personally I've used Web Scraper Plus for a lot of screen-scraping tasks because it is cheap, it is not difficult to get started with, and non-programmers can get it working relatively quickly.
Side note: the Terms of Use is probably something you'll want to check before doing this. Depending on the site, iterating through everything may raise some flags.
I think Brian R. Bondy's answer is useful when the source code is easy to read. I prefer an easier way: using tools like Wireshark or HttpAnalyzer to capture the packet and get the URL from the "Host" field and the "GET" field.
For example, I captured a packet like the following:
GET /hqzx/quote.aspx?type=3&market=1&sorttype=3&updown=up&page=1&count=8&time=164330 HTTP/1.1
Accept: */*
Referer: http://quote.hexun.com/stock/default.aspx
Accept-Language: zh-cn
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
Host: quote.tool.hexun.com
Connection: Keep-Alive
Then the URL is:
http://quote.tool.hexun.com/hqzx/quote.aspx?type=3&market=1&sorttype=3&updown=up&page=1&count=8&time=164330
As a low cost solution you can also try SWExplorerAutomation (SWEA). The program creates an automation API for any Web application developed with HTML, DHTML or AJAX.
Selenium WebDriver is a good solution: you program a browser and you automate what needs to be done in the browser. Browsers (Chrome, Firefox, etc) provide their own drivers that work with Selenium. Since it works as an automated REAL browser, the pages (including javascript and Ajax) get loaded as they do with a human using that browser.
The downside is that it is slow (since you would most probably like to wait for all images and scripts to load before you do your scraping on that single page).
I have previously linked to MIT's Solvent and EnvJS as my answers for scraping Ajax pages. Those projects no longer seem to be accessible.
Out of sheer necessity, I have invented another way to actually scrape Ajax pages, and it has worked for tough sites like findthecompany, which have methods to detect headless JavaScript engines and show them no data.
The technique is to use Chrome extensions to do the scraping. Chrome extensions are the best place to scrape Ajax pages because they actually give us access to the JavaScript-modified DOM. The technique is as follows (I will try to open-source the code at some point). Create a Chrome extension (assuming you know how to create one, and its architecture and capabilities; this is easy to learn and practice as there are lots of samples).
Use content scripts to access the DOM via XPath. Get the entire list, table, or dynamically rendered content into a variable as string HTML nodes. (Only content scripts can access the DOM, but they can't contact a URL using XMLHTTP.)
From the content script, use message passing to send the entire stripped DOM, as a string, to a background script. (Background scripts can talk to URLs but can't touch the DOM.) Message passing is how we get the two to talk; a rough sketch is shown after these steps.
You can use various events to loop through web pages and pass each stripped HTML node's content to the background script.
Now use the background script to talk to an external server (on localhost), a simple one created using Node.js/Python. Just send the entire HTML nodes as strings to the server, where the server persists the posted content into files, with appropriate variables to identify page numbers or URLs.
Now you have scraped the AJAX content (HTML nodes as strings), but these are partial HTML nodes. You can now load them with your favorite XPath library and use XPath to scrape the information into tables or text.
Please comment if you can't understand this, and I can write it better (first attempt). Also, I am trying to release sample code as soon as possible.
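A minimal sketch of the content-script/background-script message passing described above (the selector, message shape, and local server URL are all assumptions):
// content script (content.js): collect the rendered nodes and message them to the background script
var nodes = document.querySelectorAll('.result-row'); // hypothetical selector for the dynamically rendered content
var html = [].map.call(nodes, function(node) { return node.outerHTML; }).join('\n');
chrome.runtime.sendMessage({ type: 'scrapedHtml', page: location.href, html: html });

// background script (background.js): receive the nodes and POST them to a local server for persistence
chrome.runtime.onMessage.addListener(function(message) {
  if (message.type !== 'scrapedHtml') return;
  var xhr = new XMLHttpRequest();
  xhr.open('POST', 'http://localhost:3000/save', true); // hypothetical Node.js/Python server
  xhr.setRequestHeader('Content-Type', 'application/json');
  xhr.send(JSON.stringify(message)); // the server writes message.html to a file keyed by message.page
});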

is it possible to embed a google web app in a third party website?

For my job, we are creating a Google Apps Script based web app. However, our client would like to have the web app on their own personal, appropriately branded website. The problem is, on the "Web Apps and Google Sites Gadgets" page I only see an option to embed web apps into Google Sites specifically, and I can't find any info otherwise anywhere else. Does that mean that there is no way to embed into personal websites, or is there a workaround that I'm not finding anywhere?
Thank you!
I use ContentService to use Apps Script as a backend.
https://developers.google.com/apps-script/reference/content/content-service
On my webpage I use a call like:
var url = "https://script.google.com/macros/s/AKfycb...vZ8SvFBRWo/exec?offset="+offset+"&baseDate="+baseDate+"&callback=?";
$.getJSON( url, function( events ) { ...
In my script:
function doGet(e) {
  var callback = e.parameter.callback; // required for JSONP
  .
  .
  return ContentService.createTextOutput(callback + '(' + JSON.stringify(returnObject) + ')').setMimeType(ContentService.MimeType.JAVASCRIPT);
}

Using Parse's JavaScript framework with Google Apps Script

I would like to use the Parse framework directly in Google Apps Script and copied the following source code from Parse.com directly into my project.
However, it seems that some adjustments are required to get this to work correctly, e.g. when running the following sample code...
function upload()
{
  Parse.initialize("wCxiu8kyyyyyyyyyyyyyyyyy", "bTxxxxx8bxxxxxxxxx");
  var TestObject = Parse.Object.extend("TestObjectJSSSSS");
  var testObject = new TestObject();
  testObject.save({foo: "bar"}, {
    success: function(object) {
      alert("yay! it worked");
    }
  });
}
… I get the error message TypeError: Cannot call method "getItem" of undefined, which seems to relate to localStorage. I believe I should replace localStorage with a similar storage type available in Google Apps Script. Would that be possible, and how would I need to adjust the Parse code?
What other adjustments would I need to make to get the Parse framework to work in my Google Apps Script project?
I would suggest that using the Parse REST API would be a far simpler solution; it is meant for such cases.
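As a minimal sketch (assuming the classic api.parse.com endpoint and a REST API key, which is different from the JavaScript key passed to Parse.initialize), saving the same object through the REST API from Apps Script could look like this:
function uploadViaRest() {
  var appId = "wCxiu8kyyyyyyyyyyyyyyyyy";   // your Application ID
  var restKey = "YOUR_REST_API_KEY";        // placeholder REST API key
  var response = UrlFetchApp.fetch("https://api.parse.com/1/classes/TestObjectJSSSSS", {
    method: "post",
    contentType: "application/json",
    headers: {
      "X-Parse-Application-Id": appId,
      "X-Parse-REST-API-Key": restKey
    },
    payload: JSON.stringify({foo: "bar"})
  });
  Logger.log(response.getContentText()); // e.g. {"createdAt":"...","objectId":"..."}
}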

How do you use the Facebook Graph API in a Google Chrome Extension?

I have been trying to access the information available through the https://graph.facebook.com/id concept as JSON, but I have been unable to call or return any information based on the different snippets of code I've found around. I'm not sure if I'm using the JSON function correctly or not.
For example,
var testlink = "https://graph.facebook.com/" + id + "/&callback=?";
$.getJSON(testlink, function(json) {
  var test;
  $.each(json.data, function(i, fb) {
    test = "<ul>" + json.name + "</ul>";
  });
});
In this code, I am trying to return the name in the test variable. When I use this in a Google Chrome Extension, it just returns a blank page.
Alternatively, I've also been trying to use the Facebook JavaScript SDK in my Google Chrome extension, but I am unsure what website I should be using when signing up for an API key.
I believe that you need to establish either an OAuth session or provide your API key before you can talk to FB. It's been a while since I messed around with the FB API, but I'm pretty sure you have to register before you can use it.
Here's something that might be useful though: a JavaScript console for Facebook which allows you to test out your code! http://developers.facebook.com/tools/console/
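As a minimal sketch (the page ID is the one used in the example below; the access token is a placeholder that would come from whatever OAuth flow or app registration you use), a Graph API request with an access token looks like this:
var id = "prettyklicks";                 // example page ID
var accessToken = "YOUR_ACCESS_TOKEN";   // placeholder token from OAuth / app registration
var url = "https://graph.facebook.com/" + id + "?access_token=" + accessToken + "&callback=?";
$.getJSON(url, function(json) {
  console.log(json.name); // the object's name, if the token grants access
});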
It's an issue with Chrome, but I haven't figured out the exact problem. For example, open the Chrome inspector and type this into it:
$.getJSON("http://graph.facebook.com/prettyklicks/feed?limit=15&callback=?", function(json){console.log(json);});
Then do the same thing in Firefox. Chrome will return nothing, but FF will give you the JSON object. It's not the JSON call itself, because if you call something else, for instance:
$.getJSON("http://api.flickr.com/services/feeds/photos_public.gne?tags=cat&tagmode=any&format=json&jsoncallback=?", function(data) {console.log(data);});
It will come through normally. So there is some miscommunication between Chrome and FB. The really confusing part is that if you browse directly to the Graph entry by just pasting it into your address bar, it comes through normally that way too.
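One thing that sometimes sidesteps this in an extension context (a suggestion, not something confirmed above) is to skip JSONP entirely: declare the Graph API host in the extension's manifest permissions and issue a plain XMLHttpRequest from the background page, then parse the JSON yourself. A minimal sketch, assuming a manifest v2-style extension and a placeholder access token:
// manifest.json (excerpt): host permission enables cross-origin requests from the extension
//   "permissions": ["https://graph.facebook.com/*"]
var xhr = new XMLHttpRequest();
xhr.open("GET", "https://graph.facebook.com/prettyklicks?access_token=YOUR_ACCESS_TOKEN", true);
xhr.onreadystatechange = function() {
  if (xhr.readyState == 4 && xhr.status == 200) {
    var json = JSON.parse(xhr.responseText); // plain JSON, no JSONP wrapper needed
    console.log(json.name);
  }
};
xhr.send();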
