Request HTML for an HTML body which loads in via AJAX - javascript

Hi, I am currently building a web crawler. I've hit a roadblock because the response to my HTML request only contains some of the content. The rest of the content loads in my browser but does not appear when calling request(url, cb).
My guess is that this part of the HTML is loaded afterwards via something like AngularJS, because my browser shows all of the missing content (and you can see the missing content loading in after the page).
How do I work around this? Is there a way to get the HTML after all the elements have been added?
Thanks

You are correct. If you just use request to fetch the HTML, you won't see what the page looks like after all the JavaScript has run. I would look at PhantomJS or a framework that is based on PhantomJS, such as http://nrabinowitz.github.io/pjscrape/. That will allow you to access the HTML after the client-side JavaScript has been executed.

Related

How to handle the content part in AJAX page switching in PWA?

I have zero experience in native apps, which might help with this question.
Since the service worker caches everything so nicely, I don't see any reason to render the entire web page again when the page is switched (a link is clicked). So I will switch only the content, use history pushState to change the URL, and change the title. I have that part figured out.
The problem is that I cannot find any resources that support either of the two content-loading ideas I have:
Load center content via AJAX with HTML.
Load center content as data only and render the HTML on-the-fly in JS.
The first method would be fairly straightforward, but would mean a bigger payload.
The second seems much more advanced, and would mean that the HTML templates have to be in the JS somehow already. I also have a feeling that there is a method somewhere in between that would allow opening the heavily cached page (let's say the article page) and replacing only the (text) contents. But as I said, I cannot find any resources that weigh the pros and cons or give any reliable information on PWA AJAX page switching.
Any credible information on this matter would be much appreciated.
EDIT
I have kept reading and researching this matter, but sadly there is no clear indication of how to handle dynamic content over AJAX: whether I should turn the JSON data from AJAX into HTML in JS, or send it as ready-made HTML from the backend.
In favour of the second option: I have found that my theory had some weight to it. I can use pure.js to pull an HTML template from a hidden template tag and generate the HTML on the fly from JSON received over AJAX.
You are making this complicated. Can we take a look at your code, please?
If you mean retrieving data from a database via AJAX, then all you need is jQuery:
$(document).ready(function () {
  // Send the value of the #contentData1 field, not the DOM element itself
  var contentData1 = document.getElementById('contentData1').value;
  $.post("pathToPHP.php", { contentData1: contentData1 }, function (data) {
    // Insert the HTML returned by the server into the container
    $("#container").html(data);
  });
});
and the pathToPHP.php file should return the data you want:
echo "";
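For the second option in the question (send JSON and build the HTML in JS), a minimal sketch of the idea follows. It does not use pure.js; renderTemplate, escapeHtml, the {{key}} placeholder syntax, and the sample article data are all hypothetical names chosen for illustration. In a browser, the template string could come from the contents of a hidden template element and the JSON from the AJAX response.

```javascript
// Hypothetical helper: escape text so values from JSON cannot inject markup.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical helper: replace each {{key}} placeholder with the escaped
// value of data[key]; placeholders with no matching key become empty strings.
function renderTemplate(template, data) {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, function (match, key) {
    return Object.prototype.hasOwnProperty.call(data, key)
      ? escapeHtml(data[key])
      : "";
  });
}

// Assumed sample data standing in for a template tag and an AJAX response.
const articleTemplate = "<article><h1>{{title}}</h1><p>{{body}}</p></article>";
const articleJson = '{"title": "Hello", "body": "Fish & chips"}';

const articleHtml = renderTemplate(articleTemplate, JSON.parse(articleJson));
```

The trade-off matches the one described in the question: the JSON payload stays small, but the template has to be available on the client before the data arrives.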

Calling JSP Custom Tags tld through Ajax

I am currently trying to add AJAX functionality to my web app.
I have the tag library and tag handlers set up properly, so all of these custom tags work when the page does a full refresh.
However, I have never implemented AJAX before and am stuck on how to call these tags dynamically based on changes in the web page.
E.g.:
Custom Tag library under -->/WEB-INF/tld
Tag Handlers --> classes/ClassHandlers/Tag1...TagXXX
With the above calling the following tag in the JSP file works perfectly:
<tagLib:tagName Attribute1="" Attribute2="">
However, how can I get this to be dynamically inserted by Ajax?
Please let me know if I can provide any more details.
Well, I would say that it is impossible. AJAX and JSP are two incompatible technologies. JSP tags can be used only on the server side (while the HTML is being generated), whereas AJAX is a client-side technology (it runs in the user's browser). You can read more about client-server model here.

load external webpage and add custom header and use the data from webpage

I want to load an external web page on my own server and add my own header. I also need to use data from the external website, such as the URL and content (I need to search for specific data, check whether I have that data in my system, and show my data in the header). The external web page needs to keep working (for example, the buttons for opening other pages, with no new windows).
I know I can use .NET to create desktop software, but I want to create a website that will do the trick. Can this be done? PHP plus an iframe is too simple, I think: it won't give me the data from the external website, and my server won't see changes in the external URL (which is what I need).
If it's supposed to be client-side, then you can acquire the data necessary by using an Ajax request, parsing it in JavaScript and then just inserting it into an element. However you have to take into account that if the host doesn't support cross-origin resource sharing, then you won't be able to do it like this.
Ajax page source request: get full html source code of page through ajax request through javascript
Parsing elements from the source: http://ajaxian.com/archives/html-parser-in-javascript (not sure if useful)
Changing the element body:
// data --> the content you want to display in your element
document.getElementById('yourElement').innerHTML = data;
Another approach (server-side, though) is to "act" like a browser by setting your user-agent to that of a real browser and then using cURL, for example, to get the source. But you don't want to fake it, because that's not nice and you would feel bad.
Hope it gets you started!

How to add html content generated by json call to page generated html code?

I have a page that contains a JSON call to load data. I call this method on page load, and it works properly. The problem is that when I view the source of that page, I don't see the generated code.
My concern is that search engines will never see the content, even though the end user sees it.
Is there any way to add it? If so, how can it be done?
Here is an example of the code I use:
$(function () {
  // Call to the server to get data.
  var content = "Some data"; // from the JSON call
  $("#content").html(content);
});
Most if not all search engines will not recognize content inserted into a page by Ajax/JavaScript. This is why you need to load this content within the page itself if you want search engines to recognize it.
Your question seems to me a copy of what's asked over here and in Is there anyway of making json data readable by a Google spider?
So progressive enhancement and unobtrusive JavaScript are the way out.
In addition to these, optionally, but importantly, test the crawlability of your app: see what the crawler sees with "Fetch as Googlebot".
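One way to read "load this content within the page" is to render the data into the HTML on the server, so that it is already present in the page source. The sketch below assumes a Node.js server and is not taken from the answer; renderPage, escapeHtml, and the sample items are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: escape text before embedding it in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: build the full page with the data already inside
// #content, so "view source" and crawlers see it without running scripts.
function renderPage(items) {
  const listItems = items
    .map(function (item) {
      return "<li>" + escapeHtml(item) + "</li>";
    })
    .join("");
  return (
    "<!DOCTYPE html><html><body>" +
    '<ul id="content">' + listItems + "</ul>" +
    "</body></html>"
  );
}

// Assumed sample data standing in for the result of the JSON call.
const pageHtml = renderPage(["First result", "Second <result>"]);
```

Client-side JavaScript can still update #content afterwards; the page simply no longer depends on it for the initial content.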

Is it possible to insert an include tag into a webpage using JavaScript?

I want to insert the following include tag into my web page using JavaScript: <!--#include virtual='includes/myIncludeFile.htm' -->
I have tried the following, but it doesn't work: jQuery("<!--#include virtual='includes/myIncludeFile.htm' -->").appendTo(jQuery("body"));
I have output jQuery("<!--#include virtual='includes/myIncludeFile.htm' -->") to the console, and it is treated as a comment object (see screenshot).
Where am I going wrong, and how can it be done?
The basic premise of what you're trying to do isn't going to work. The HTML "include tags" you're using are also known as "server-side includes." That is, they are processed on the server before the page is sent to the client.
By the time the JavaScript code is executing on the client, the server has already finished processing the response. The client-side code can't initiate a server-side include.
One thing you can do from the client-side is use something like jQuery's .load() function to make a request to the server and load the response into a specified element on the page. Something like this:
$('#includeDiv').load('includes/myIncludeFile.htm');
This would dynamically load all of the contents from myIncludeFile.htm into an element of id "includeDiv" on the page.
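For comparison, roughly the same effect can be sketched without jQuery by using the standard fetch API. This is an illustration rather than part of the answer: loadFragment is a hypothetical helper, and its fetchFn parameter exists only so the sketch can be exercised outside a browser.

```javascript
// Hypothetical helper: request a fragment from the server and place the
// response text into the target element. fetchFn defaults to the global fetch.
async function loadFragment(target, url, fetchFn = fetch) {
  const response = await fetchFn(url);
  if (!response.ok) {
    throw new Error(
      "Request for " + url + " failed with status " + response.status
    );
  }
  target.innerHTML = await response.text();
  return target;
}

// Browser usage, assuming an element with id "includeDiv" exists:
// loadFragment(document.getElementById("includeDiv"), "includes/myIncludeFile.htm");
```

As with .load(), the include file is requested by the client after the page has loaded, so the server never sees it as a server-side include.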
