TideSDK, jQuery, XMLHttpRequest, and absolute URLs - JavaScript

I'm using TideSDK to get content from a website. Eventually I will need to pre-fill form data from the database on this website.
I'm able to get the page and store it to a variable.
I'm able to parse out the relative URLs (verified with alert()s).
But I'm not able to replace the body with the corrected markup:
$('html').replaceWith(html);
jQuery should be in memory, so I don't have to worry about replacing the html element, right?
I cannot figure out why this doesn't work. If an image or URL is absolute it works fine, but if it's relative it doesn't. I don't have access to the website, so I can't fix its URLs to be absolute.
My demo code: http://jsfiddle.net/Cs5MC/13/ (changed from html to body in the demo).
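Roughly, the rewriting step I'm attempting looks like this (a sketch only; baseUrl is a placeholder for the remote site's origin):
// Sketch: absolutize relative src/href attributes in the fetched markup
// before injecting it. "baseUrl" is assumed for illustration.
var baseUrl = 'http://example.com/';
var $wrapper = $('<div>').html(html);
$wrapper.find('[src], [href]').each(function () {
    var $el = $(this);
    $.each(['src', 'href'], function (i, attr) {
        var val = $el.attr(attr);
        // Skip absolute, protocol-relative and fragment URLs.
        if (val && !/^(?:[a-z]+:)?\/\//i.test(val) && val.charAt(0) !== '#') {
            $el.attr(attr, baseUrl + val.replace(/^\//, ''));
        }
    });
});
$('body').html($wrapper.html());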
Any ideas?

To begin with, if this is an implementation of Titanium, you will need to use the Ti Network API discussed in this doc.
Pulling JSON data and using it is much the same, with a callback, as it would be with jQuery or any regular XHR request.
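For example, a minimal sketch using the Ti Network HTTP client (assuming the Ti.Network.createHTTPClient interface is available in your TideSDK/Titanium version; the URL is a placeholder):
var client = Ti.Network.createHTTPClient();
client.onload = function () {
    // this.responseText holds the fetched page or JSON
    var data = JSON.parse(this.responseText);
    // ...use the data to pre-fill the form...
};
client.onerror = function (e) {
    alert('Request failed: ' + e.error);
};
client.open('GET', 'http://example.com/data.json');
client.send();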
I hope that it helps.
By the way, jQuery is heavily dependent upon the DOM to do its work, so you will always have to be careful about whether an expected DOM structure is actually present. It may not be on the Titanium platform without a web view, although I am not familiar with TideSDK and may stand corrected on that point.


Generate PDF from web app

I need to generate a PDF from the current screen in my web app, some kind of screenshot, but I'm facing serious difficulties.
The main problem is that the view contains a grid made with jQuery Gridster, and some "widgets" contain complex elements like tables, Highcharts, etc.
So plugins like jsPDF or html2canvas can't render my page into a proper PDF. They always generate it blank.
This is how the page looks. You can move/resize each element:
(Sorry for the CIA style, but there's business data in there)
Some ideas I came across but don't work are:
Using the browser's print-to-PDF feature programmatically (not possible).
Using PhantomJS (but page state matters, so...).
I believe a solution to this problem would be widely adopted by anyone trying to generate a PDF or image of the current screen in a web app. It's quite an unresolved problem.
It's OK if it only works on Google Chrome.
Many thanks.
EDIT:
One possible solution might be to find a way to represent the current layout state as an object and save it with an id.
Then retrieve that object via a URL parameter with the id and apply the stored layout to the initial page.
This way I might be able to take a screenshot with PhantomJS, but it seems quite complex to me. Any alternative?
Based on the fact that you're struggling with capturing dynamic content, I think at this point you need to take a step back and consider altering your approach. The reason these plugins are failing is that they only work with the HTML as it was before any user interactions, right?
Why not convert the HTML to PDF on the server side? The key part here is to send the current HTML back: by doing so, you're sending updated, static HTML to the server to be rendered into a PDF. I've used server-side HTML-to-PDF conversion before and it works fine, so I can't see why it wouldn't be appropriate here.
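For instance, a rough sketch of the client side (the /render-pdf endpoint and the pdfUrl response field are hypothetical; pair this with whatever server-side HTML-to-PDF tool you choose):
// Capture the page as it currently looks, after all interactions.
var currentHtml = new XMLSerializer().serializeToString(document);
$.ajax({
    url: '/render-pdf',        // hypothetical server endpoint
    type: 'POST',
    contentType: 'text/html',
    data: currentHtml
}).done(function (response) {
    // e.g. navigate to the PDF the server generated,
    // assuming it responds with the file's URL
    window.location = response.pdfUrl;
});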
See this answer for details about HTML to PDF server side.

Angularjs vs SEO vs pushState

After reading this thread I decided to use the pushState API in my AngularJS application, which is fully API-based (independent frontend and independent backend).
Here is my test site: http://huyaks.com/index.html
I created a sitemap and uploaded it to Google Webmaster Tools.
From what I can see:
Google indexed the main page and the dynamic navigation (cool!), but did not index any of the dynamic URLs.
Please take a look.
I examined the example site given in the related thread:
http://html5.gingerhost.com/london
As far as I can see, when I directly access a particular page, the content that is presumed to be dynamic is returned by the server, and therefore it's indexed. But that's impossible in my case, since my application is fully dynamic.
Could you please advise what the problem is in my particular case and how to fix it?
Thanks in advance.
Note: this question is about the pushState approach. Please do not advise me to use escaped fragments or third-party services like prerender.io. I'd like to figure out how to use this approach.
Evidently Quentin didn't read the post you're referring to. The whole point of http://html5.gingerhost.com/london is that it uses pushState and proves that it doesn't require static html for the benefit of spiders.
"This site uses HTML5 wizrdry [sic] to load the 'actual content' asynchronusly [sic] to the rest of the code: this makes it faster for users, but it's still totally indexable by search engines."
Dodgy orthography aside, this demo shows that asynchronously-loaded content is indexable.
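The underlying pattern is roughly this (a sketch; the #content container and the link selector are assumed for illustration, not taken from the demo's source):
// Load views over XHR and give each one a real URL via pushState.
$(document).on('click', 'a.nav', function (e) {
    e.preventDefault();
    var url = this.href;
    $.get(url, function (fragment) {
        $('#content').html(fragment);
        history.pushState({ path: url }, '', url);
    });
});
// Restore views when the user navigates back/forward.
window.onpopstate = function (e) {
    if (e.state) {
        $.get(e.state.path, function (fragment) {
            $('#content').html(fragment);
        });
    }
};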
As far as I can see, when I directly access a particular page the content which is presumed to be dynamic is returned by the server
It isn't. You are loading a blank page with some JavaScript in it, and that JavaScript immediately loads the content that should appear for that URL.
You need to have the server produce the HTML you get after running the JavaScript and not depend on the JS.
Google does interpret Angular pages, as you can see on this quick demo page, where the title and meta description show up correctly in the search result.
It is very likely that if they interpret JS at all, they interpret it enough for thorough link analysis.
The fact that some pages are not indexed is due to the fact that Google does not index every page they analyze, even if you add it to a sitemap or submit it for indexing in webmaster tools. On the demo page, both the regular and the scope-bound link are currently not being indexed.
Update: so to answer the question specifically, there is no issue with pushState on the test site. Those pages simply do not contain value-adding content for Google. (See their general guidelines).
Sray, I recently opened up the same question in another thread and was advised that Googlebot and Bingbot do index SPAs that use pushState. I haven't seen an example that fully convinces me, but it's what I'm told. To then cover your bases as far as Facebook is concerned, use Open Graph meta tags.
I'm still not confident about pushing forward without sending HTML snippets to bots, but like you I've found no tutorial explaining how to do this while using pushState, or even suggesting it. But here's how I imagine it would work using Symfony2...
Use prerender or another service to generate static snippets of all your pages. Store them somewhere accessible by your router.
In your Symfony2 routing file, create a route that matches your SPA. I have a test SPA running at localhost.com/ng-test/, so my route would look like this:
# Adding a trailing / to this route breaks it. Not sure why.
NgTestReroute:
    path: /ng-test/{one}/{two}/{three}/{four}
    defaults:
        _controller: DriverSideSiteBundle:NgTest:ngTestReroute
        'one': null
        'two': null
        'three': null
        'four': null
    methods: [GET]
In your Symfony2 controller, check the user agent to see if it's Googlebot or Bingbot. You should be able to do this with the code below, and then use this list to target the bots you're interested in: http://www.searchenginedictionary.com/spider-names.shtml
if (strstr(strtolower($_SERVER['HTTP_USER_AGENT']), 'googlebot')) {
    // serve the pre-rendered HTML snippet instead of the Angular app
}
If your controller finds a match to a bot, send it the HTML snippet. Otherwise, as in the case with my AngularJS app, just send the user to the index page and Angular will correctly do the rest.
Also, has your question been answered? If it has, please select one so I and others can tell what worked for you.

Intercepting JavaScript before going to JavaScript Engine in Mozilla Firefox

I want to develop an extension that works on scripts coming from an HTTP response. I know that the whole HTML document first goes to the browser's rendering engine, where it is parsed to create a DOM tree, and any script embedded inside is passed to the JavaScript engine. (Correct me if I am wrong. :))
So I want to intercept the JavaScript code before it is sent to the JavaScript engine, in order to modify it accordingly.
Are there any APIs for Mozilla Firefox which would allow me to do this? How can I do it?
While doing some other work I stumbled across this:
https://developer.mozilla.org/en-US/docs/XPCOM_Interface_Reference/NsITraceableChannel?redirectlocale=en-US&redirectslug=NsITraceableChannel
This allows you to modify the response before it is parsed. See this topic here:
http://forums.mozillazine.org/viewtopic.php?f=19&t=2800541
Here is a working example of getting the content before it is shown to the user. It doesn't change the content, though; that's what I'm asking about in the MozillaZine topic. The writeBytes call should modify it. Once you figure it out, please share, as I'm interested as well:
https://github.com/Noitidart/demo-nsITraceableChannel
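The heart of that example is a tracing listener attached via nsITraceableChannel. In outline it looks like this (a condensed sketch of the well-known pattern, written for privileged add-on code inside an "http-on-examine-response" observer; not the exact code from the repo; Cc/Ci are the usual shorthand for Components.classes/Components.interfaces):
function TracingListener() {}
TracingListener.prototype = {
    originalListener: null,
    onStartRequest: function (request, context) {
        this.originalListener.onStartRequest(request, context);
    },
    onDataAvailable: function (request, context, inputStream, offset, count) {
        // Read the raw chunk of the response.
        var bis = Cc['@mozilla.org/binaryinputstream;1']
                      .createInstance(Ci.nsIBinaryInputStream);
        bis.setData(inputStream);
        var data = bis.readBytes(count);
        // ...modify "data" here before passing it along...
        var storage = Cc['@mozilla.org/storagestream;1']
                          .createInstance(Ci.nsIStorageStream);
        storage.init(8192, data.length, null);
        var bos = Cc['@mozilla.org/binaryoutputstream;1']
                      .createInstance(Ci.nsIBinaryOutputStream);
        bos.setOutputStream(storage.getOutputStream(0));
        bos.writeBytes(data, data.length);
        // Hand the (possibly modified) chunk to the original listener.
        this.originalListener.onDataAvailable(
            request, context, storage.newInputStream(0), offset, data.length);
    },
    onStopRequest: function (request, context, statusCode) {
        this.originalListener.onStopRequest(request, context, statusCode);
    }
};
// Attached from the observer:
// var listener = new TracingListener();
// listener.originalListener = channel
//     .QueryInterface(Ci.nsITraceableChannel).setNewListener(listener);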
You can follow this answer on how to intercept each request and modify it before it is sent to the page itself. You can do transpilation or whatever you'd like there.
Take a look at this guy's add-on code. He does exactly what you are looking for:
https://addons.mozilla.org/en-US/firefox/addon/javascript-deminifier/
You can try to step in before the HTML is parsed, take all the script tags, work with them, and put them back.
...I wanted to intercept the JavaScript code before it reaches the JavaScript engine and modify it accordingly. Are there any APIs for Mozilla Firefox? How can I do it?
You can use the page-mod module of the Add-on SDK by setting contentScriptWhen: "start".
Then, after completely preventing the document from being parsed, you can fetch the same document on the side, make any modifications, and inject the resulting document into the page. Here is an answer which does just that: https://stackoverflow.com/a/36097573/6085033
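A minimal sketch of the page-mod part (assuming the Add-on SDK is available; the include pattern and the content script are placeholders):
var pageMod = require('sdk/page-mod');
pageMod.PageMod({
    include: '*',                   // match every page; narrow as needed
    contentScriptWhen: 'start',     // run before the page's own scripts
    contentScript: 'console.log("running before the document is parsed");'
});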

HTML Source-Code rip-save?

I came across a JS library (jsMovie) and wanted to see the example files, but its usage is really badly documented, so I tried to download the author's page to look at the source code. But when trying to do that, I realized that "view-source" wasn't giving me the full code (almost 80% of the code did not appear). (Tried in Chrome and Firefox.)
So my question is: how can this be? Firebug displays everything properly. It occurred to me that this could also be a good way to prevent script kiddies from ripping sites.
Here is the page: http://konsultaner.de/entwickler#Konsultaner
Hints are welcome.
Generate the current source code, as interpreted by the browser. This can be done using an XMLSerializer on document:
var generatedSource = new XMLSerializer().serializeToString(document);
From there, if you want to open a page just showing the source, you could do
window.open('data:text/plain,'+encodeURIComponent(generatedSource), '_blank');
They are using AngularJS, a front-end JavaScript framework. That means almost all parts of the page are generated dynamically by JavaScript. Therefore, you can't see the page without JavaScript running (using view-source), but you can see the generated HTML via the inspector.
If it is a static website (the scripts and templates are all there), you can still "rip" it. But not if it is a dynamic website, since all data and logic are "fed" by the server.

Checking file size -- file stored on server (not uploaded!); using JavaScript without AJAX request

As the title says: with jQuery, or even plain JavaScript, I can get a list of all the scripts (plus CSS files and images) that a particular page uses, and I'm asking whether there is any solution for getting each resource's file size.
I think I did quite thorough research here. Most answers are about files uploaded to the server, being uploaded, or just about to be uploaded, so that is not what I'm looking for. There is some support introduced in HTML5, but again, it seems to be for uploaded files only.
Of course, I'm looking for a cross-browser solution, so some crappy file object introduced in old IE is also not what I'm looking for. Also, let me underline that I'm talking purely about checking the file size of a file stored on the server and accessible by a given URL. So please don't write answers saying that I can't access local files from JavaScript for security reasons; I already know that.
I found quite a good solution on SO, but it uses an AJAX request to solve the problem. Although it is very interesting (it sends a HEAD request), it might not work on all servers (though the answer's author tested that it is supported by all major browsers). And I'm a bit uneasy about firing an AJAX request for each resource I find on each analysed page.
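For reference, that HEAD-based approach looks roughly like this (a sketch; resourceUrl stands for the URL of the script or image being checked):
// Ask the server for headers only and read Content-Length.
var xhr = new XMLHttpRequest();
xhr.open('HEAD', resourceUrl, true);
xhr.onreadystatechange = function () {
    if (xhr.readyState === 4 && xhr.status === 200) {
        // Size in bytes, if the server reports it.
        var size = xhr.getResponseHeader('Content-Length');
        alert(resourceUrl + ': ' + size + ' bytes');
    }
};
xhr.send();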
So I'm assuming that there is no such solution, and I would be happy if someone could prove my assumption wrong. But then, on the other hand, how do they do this in, for example, Firebug? If I'm not mistaken, XPI extensions are written in JavaScript, right? And Firebug certainly can measure the sizes of resources used on the current website.
To check the content length of inline scripts you can use their .text attribute:
document.getElementsByTagName('script')[0].text.length //works for inline scripts
For external scripts, where the .src attribute refers to another file/resource, there's a problem with the Access-Control-Allow-Origin security constraint, unless you allow them in your browser settings. If the external scripts are from the same domain as the page where you are trying to inspect them, it's OK.
I created a fiddle to demonstrate how to get their content length.
UPDATED 20/07
Firebug has its own implementation for intercepting page loads, which extends the Mozilla observer-service.
The question "An observer for page loads in a custom xul:browser" should give you an idea of how to implement this kind of interceptor using the Mozilla add-on API.
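In outline, such an observer looks like this (privileged extension code; a sketch of the classic pattern, not Firebug's actual source):
// Watch HTTP responses so resource sizes can be read off the channel.
var observerService = Components.classes['@mozilla.org/observer-service;1']
        .getService(Components.interfaces.nsIObserverService);
var httpObserver = {
    observe: function (subject, topic, data) {
        if (topic === 'http-on-examine-response') {
            var channel = subject.QueryInterface(
                Components.interfaces.nsIHttpChannel);
            // channel.URI.spec identifies the resource; Content-Length
            // gives its size when the server reports it.
            var length = channel.getResponseHeader('Content-Length');
        }
    }
};
observerService.addObserver(httpObserver, 'http-on-examine-response', false);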
