I'm developing a small app designed to embed HTML + JavaScript (the JavaScript manages the behavior of the HTML) into existing websites. The app is an ASP.NET MVC 3 app. What's the best approach for delivering the JavaScript to the web client? I have no access to the web clients beyond giving them the URL to retrieve the HTML/JavaScript. The clients will retrieve the HTML/JavaScript using jQuery, which then loads the result of the ASP.NET MVC 3 app into the DOM. Should the JavaScript that manages the behavior of the embedded HTML simply be a <script> tag at the end of the HTML fragment? Thanks for your time.
If the loading mechanism is in place and simply inserts the body of the HTTP response into the DOM somewhere, then including a <script> as the last tag in the payload is probably the best way to go.
Any DOM elements the script depends on should be ready for use when it is executed, and there isn't anything wrong with that technique that I know of.
You could get more sophisticated, but not without complicating your jQuery loading mechanism.
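For illustration (the ids and markup are made up), the payload returned by the MVC action might look like this, with the script as the last tag:

```html
<!-- fragment returned by the action; the elements above the script
     are already in the DOM by the time the script runs -->
<div id="embedded-widget">
  <button id="widget-button">Do something</button>
</div>
<script>
  document.getElementById('widget-button').addEventListener('click', function () {
    // behavior for the embedded HTML goes here
  });
</script>
```

Note that when jQuery inserts markup like this, it extracts the script tags and executes them after the preceding elements have been added to the DOM, which is exactly the ordering the script relies on.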
I don't know whether this is a PHP or a JavaScript technique, but what do you call this way of changing web content? For instance, on the MDC Web demo site, the page source is nearly empty, yet all the elements are there if you inspect the page.
Regarding PHP, I think it is done with PHP code in MDC Web's case, but how exactly? Is this a common technique? I want to know this method because it's useful in cases where there's no need to reload the page, but you can still change the content and the URL.
This is called a Single-Page Application (SPA).
A single-page application (SPA) is a web application or web site that interacts with the user by dynamically rewriting the current page rather than loading entire new pages from a server. This approach avoids interruption of the user experience between successive pages, making the application behave more like a desktop application.
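A rough client-side sketch of that idea (the '/fragments' URL scheme and the 'content' element id are assumptions): fetch replaces the page's content in place, and history.pushState changes the URL without a reload.

```javascript
// Browser-side sketch: rewrite the current page instead of loading a new one.
function navigate(path) {
  return fetch('/fragments' + path)                        // fetch just the content
    .then(function (res) { return res.text(); })
    .then(function (html) {
      document.getElementById('content').innerHTML = html; // rewrite in place
      history.pushState({ path: path }, '', path);         // new URL, no reload
    });
}
// e.g. wired to link clicks: navigate('/about');
```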
I can scrape static websites using Scrapy; however, this other website that I'm trying to scrape has 2 sections in its HTML, namely "head" and "body onload", and the information I need is in the "body onload" part. I believe that content is loaded after the HTML is requested, and thus the website is dynamic. Is this doable using Scrapy? What additional tools do I need?
Check out scrapy_splash, a rendering-service integration for Scrapy that will allow you to crawl JavaScript-based web sites.
You can also create your own downloader middleware and use Selenium with PhantomJS (example). The downside of this technique is that you lose the concurrency provided by Scrapy.
Anyway, I think splash is the best way to do this.
Hope this helps.
Is it possible to extract the HTML of a page as it shows in the HTML panel of Firebug or the Chrome DevTools?
I have to crawl a lot of websites, but sometimes the information is not in the static source code; JavaScript runs after the page is loaded and creates new HTML content dynamically. If I then extract the source code, that content is not there.
I have a web crawler built in Java to do this, but it's using a lot of old libraries. Therefore, I want to move to a Rails/Ruby solution for learning purposes. I already played a bit with Nokogiri and Mechanize.
If the crawler is able to execute JavaScript, you can simply get the dynamically created HTML structure using document.firstElementChild.outerHTML.
Nokogiri and Mechanize are currently not able to execute JavaScript. See
"Ruby Nokogiri Javascript Parsing" and "How do I use Mechanize to process JavaScript?" for this.
You will need another tool like WATIR or Selenium. Those drive a real web browser, and can thus handle any JavaScript.
You can't fetch the records coming from the database side; you can only fetch the HTML code, which is static.
The JavaScript must be requesting the records from the database with a query request, which can't be fetched by the crawler.
I am building a sizable, mobile application that is currently built on top of jQuery Mobile and KnockoutJS. My first approach made heavy use of a Single Page Application design along with loading all dynamic content and data via Knockout and ajax calls. This has worked OK but maintenance and development has become very complicated as jQuery Mobile loads more and more into the DOM.
I wonder about moving to more traditional, individual HTML pages that are completely static while still loading data via Knockout and ajax. This will allow browsers to cache the biggest parts of the app: the HTML pages.
Question:
How can I best pass parameters around from page to page without creating unique URLs that inhibit client-side browser caching? I want browsers to aggressively cache pages.
I realize that I can implement all kinds of server side caching but that is not my goal here. /Display/3 and /Display/5 are the same page. Will the browser cache these as one?
I wonder about passing parameters after the hash mark? /Display#3 and /Display#5? How about passing parameters via JavaScript in the global namespace?
Hoping for a standard approach here.
OK, sorry for the misunderstanding, but I think your approach goes the wrong way. You cannot use GET parameters that way, and jQuery Mobile's URL handling for AJAX is a little confusing.
Normally, when using AJAX to refresh content, you do not need to reload the page, so you need no caching: the page is already there and only some of its content is reloaded via AJAX. But jQM's single-page approach is not usable for dynamically created content that way. You can only dynamically create a page with all content in it, and jQM shows content by switching visibility. The # can then be used to switch between the pages (the # does not force a reload, as it is used for in-page navigation).
You can write your own loading function and call it from buttons and links (instead of using URL GET parameters). By using jQuery's $.ajax method with dataType "html" (rather than "json") you can refresh the content in its success handler.
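A sketch of that pattern (the /fragments/display URL, the #content id, and the function names are all hypothetical): the page itself stays at /Display so the browser caches it once, and the record id travels after the #.

```javascript
// Turn location.hash into the record id: "/Display#3" -> "3".
function parseDisplayId(hash) {
  var id = String(hash).replace(/^#/, '');
  return id === '' ? null : id;
}

// Assumes jQuery is loaded on the page.
function loadContent(id) {
  $.ajax({
    url: '/fragments/display',
    data: { id: id },
    dataType: 'html',             // expect an HTML fragment, not JSON
    success: function (html) {
      $('#content').html(html);   // refresh only the content area
      location.hash = id;         // /Display#3 -- no reload, cache intact
    }
  });
}
```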
You could try HTML5 sessionStorage/localStorage. If HTML5 is an issue, then plain old cookies.
Just to clarify, if there are several HTML pages, each page must have its own URL.
I'm developing a modal/popup system for my users to embed in their sites, along the lines of what KissInsights and Hello Bar (example here and here) do.
What is the best practice for architecting services like this? It looks like users embed a bit of JS, but that code then inserts an additional script tag.
I'm wondering how it communicates with the web service to get the user's content, etc.
TIA
You're right that usually it's simply a script that the customer embeds on their website. However, what comes after that is a bit more complicated matter.
1. Embed a script
The first step as said is to have a script on the target page.
Essentially this script is just a piece of JavaScript code. It's pretty similar to what you'd have on your own page.
This script should generate the content on the customer's page that you wish to display.
However, there are some things you need to take into account:
You can't use any libraries (or if you do, be very careful what you use): These may conflict with what is already on the page, and break the customer's site. You don't want to do that.
Never override anything, as overriding may break the customer's site: this includes event listeners, native object properties, whatever. For example, always use addEventListener or addEvent for events, because these allow you to have multiple listeners.
You can't trust any styles: All styles of HTML elements you create must be inlined, because the customer's website may have its own CSS styling for them.
You can't add any CSS rules of your own: These may again break the customer's site.
These rules apply to any script or content you run directly on the customer site. If you create an iframe and display your content there, you can ignore these rules in any content that is inside the frame.
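A minimal sketch that follows those rules (the bar's look and the buildBar name are made up): everything is created through the DOM API, all styles are inlined, and listeners go through addEventListener.

```javascript
function buildBar(message) {
  var bar = document.createElement('div');
  // inline every style -- never rely on, or inject, host-page CSS
  bar.style.cssText =
    'position:fixed;top:0;left:0;right:0;padding:8px;' +
    'background:#222;color:#fff;font:14px sans-serif;z-index:2147483647;';
  bar.textContent = message; // textContent, so no markup is interpreted
  bar.addEventListener('click', function () {
    bar.parentNode.removeChild(bar); // dismiss on click
  });
  return bar;
}
// Usage on the customer page:
//   document.body.appendChild(buildBar('Hello from the widget'));
```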
2. Process script on your server
Your embeddable script should usually be generated by a script on your server. This allows you to include logic such as choosing what to display based on parameters, or data from your application's database.
This can be written in any language you like.
Typically your script URL should include some kind of an identifier so that you know what to display. For example, you can use the ID to tell which customer's site it is or other things like that.
If your application requires users to log in, you can process this just like normal. The fact the server-side script is being called by the other website makes no difference.
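As a hedged sketch of the server side (the function and field names are made up), the generator can be a simple template keyed by the customer record looked up from the id in the script URL:

```javascript
// Hypothetical server-side generator: given the customer record looked up
// from the id in the script URL, return the JavaScript to embed.
function renderWidgetScript(customer) {
  // JSON.stringify escapes quotes so the message can't break the script
  return '(function(){' +
    'var bar = document.createElement("div");' +
    'bar.textContent = ' + JSON.stringify(customer.message) + ';' +
    'document.body.appendChild(bar);' +
    '})();';
}
```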
3. Communication between the embedded script and your server or frames
There are a few tricks to this as well.
As you may know, a plain XMLHttpRequest does not work across different domains (without CORS), so you can't use that.
The simplest way to send data over from the other site would be to use an iframe and have the user submit a form inside the iframe (or run an XMLHttpRequest inside the frame, since the iframe's content resides on your own server so there is no cross domain communication)
If your embedded script displays content in an iframe dialog, you may need to be able to tell the script embedded on the customer site when to close the iframe. This can be achieved for example by using window.postMessage
For postMessage, see http://ejohn.org/blog/cross-window-messaging/
For cross-domain communication, see http://softwareas.com/cross-domain-communication-with-iframes
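A hedged sketch of that close signal (the 'widget:close' message, the origins, and the listenForClose helper are all assumptions):

```javascript
// Inside the iframe (served from your domain), when the dialog should close:
//   parent.postMessage('widget:close', '*'); // better: pass the customer's origin
//
// In the script embedded on the customer page:
function listenForClose(win, onClose) {
  win.addEventListener('message', function (event) {
    // only trust messages that come from your own domain
    if (event.origin !== 'https://www.yourserver.com') return;
    if (event.data === 'widget:close') onClose(); // e.g. remove the iframe here
  });
}
// listenForClose(window, function () { iframe.parentNode.removeChild(iframe); });
```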
You could take a look here - it's an example of an API created using my JsApiToolkit, a framework for allowing service providers to easily create and distribute Facebook Connect-like tools to third-party sites.
The library is built on top of easyXDM for Cross Domain Messaging, and facilitates interaction via modal dialogs or via popups.
The code and the readme should be sufficient to explain how things fit together (it's really not too complicated once you abstract away things like the XDM).
About the embedding itself: you can do this directly, but most services use a 'bootstrapping' script that can easily be updated to point to the real files - this small file could be served with cache headers that ensure it is not cached for too long, while the injected files could be served as long-living files.
This way you only incur the overhead of re-downloading the bootstrapper instead of the entire set of scripts.
Best practice is to put as little code as possible into your code snippet, so you don't ever have to ask the users to update their code. For instance:
<script type="text/javascript" src="http://your.site.com/somecode.js"></script>
Works fine if the author will embed it inside their page. Otherwise, if you need a bookmarklet, you can use this code to load your script on any page:
javascript:(function(){
  var e = document.createElement('script');
  e.src = 'http://your.site.com/somecode.js'; // the deprecated "language" attribute is unnecessary
  document.head.appendChild(e);
})();
Now all your code will live at the above referenced URI, and whenever their page is loaded, a fresh copy of your code will be downloaded and executed. (not taking caching settings into account)
From that script, just make sure that you don't clobber namespaces, and check if a library exists before loading another. Use the safe jQuery object instead of $ if you are using that. And if you want to load more external content (like jQuery, UI stuff, etc.) use the onload handler to detect when they are fully loaded. For example:
function jsLoad(loc, callback){
  var e = document.createElement('script');
  e.src = loc;                       // no deprecated "language" attribute needed
  if (callback) e.onload = callback; // fires once the script has loaded
  document.head.appendChild(e);
}
Then you can simply call this function to load any js file, and execute a callback function.
jsLoad('http://link.to/some.js', function(){
// do some stuff
});
Now, a tricky way to communicate with your domain to retrieve data is to use javascript as the transport. For instance:
jsLoad('http://link.to/someother.js?data=xy&callback=getSome', function(){
var yourData = getSome();
});
Your server will have to dynamically process that route, and return some javascript that has a "getSome" function that does what you want it to. For instance:
function getSome(){
  return {'some':'data','more':'data'};
}
That will pretty effectively allow you to communicate with your server and process data from anywhere your server can get it.
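For completeness, the more common JSONP shape (names here are hypothetical) has the server wrap the data in a direct call to the callback named in the query string, so no separate getSome() step is needed:

```javascript
// Client side: define a global callback whose name matches ?callback=
var latestData;
function handleData(payload) {
  latestData = payload; // the server's data lands here as soon as the script runs
}
// jsLoad('http://link.to/someother.js?data=xy&callback=handleData');

// The server's generated response is then a single call:
//   handleData({"some":"data","more":"data"});
```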
You can serve a dynamically generated JS file (using, for example, PHP or Ruby on Rails to generate the file on each request) from your server that is imported from the customer's web site like this:
<script type="text/javascript" src="//www.yourserver.com/dynamic.js"></script>
Then you need to provide a way for your customer to decide what they want the modal/popup to contain (e.g. text, graphics, links etc.). Either you create a simple CMS or you do it manually for each customer.
Your server can see where each request for the JS file is coming from and serve different JS code based on that. The JS code can, for example, insert HTML into your customer's web site that creates a bar at the top with some text and a link.
If you want to access your customers' visitor info, you probably need to either read it from the HTML, have your customers provide the information you want in a specific way, or figure out a different way to access it from each customer's web server.