How to get hyperlink ids in another html file - javascript

I am doing a project that requires a web site. On this site I have to draw a state diagram for hyperlinks, that is, how the hyperlinks are attached to one another. For that I need a way to get the hyperlink ids in another HTML file. I know about document.getElementById.
Thanks in advance.

That would require a way to access another HTML file through AJAX, which is not possible if it isn't on your domain or if CORS isn't enabled.
There are, however, quite a few things you could do:
Use your own server-side as proxy for fetching the HTML file.
Do the processing on the server-side and let JavaScript plot the data.
Do everything on the server-side.
If you'd like to get the IDs of the links, you should use an HTML parser. Modern browsers include one, called DOMParser. You'd do something like this:
var parser = new DOMParser();
var doc = parser.parseFromString(yourHTMLSource, 'text/html');
var links = doc.getElementsByTagName('a');
for (var i = 0, length = links.length; i < length; i++) {
  links[i].getAttribute('id'); // -> Returns the ID of the link, if any
}
As I remember it, IE doesn't support this, but it has its own module for HTML parsing with somewhat different methods that is still relatively easy to use.
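For the state diagram you will probably want each link's target as well as its id. A minimal sketch, assuming the HTML has already been parsed into a document (the helper name collectLinks and the variable htmlSource are made up for illustration):

```javascript
// Collect { id, href } for every anchor in a parsed document.
// `doc` is anything exposing getElementsByTagName, for example the result of
// new DOMParser().parseFromString(html, 'text/html') in a browser.
function collectLinks(doc) {
  var anchors = doc.getElementsByTagName('a');
  var result = [];
  for (var i = 0; i < anchors.length; i++) {
    result.push({
      id: anchors[i].getAttribute('id'),     // null when the link has no id
      href: anchors[i].getAttribute('href')  // null when the link has no href
    });
  }
  return result;
}

// Browser usage (htmlSource is a hypothetical string fetched from your own server):
// var doc = new DOMParser().parseFromString(htmlSource, 'text/html');
// var edges = collectLinks(doc);
```

Each entry can then become one edge of the diagram, from the page that was parsed to the page named in href.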


SharePoint Rest Document library

I am creating a custom page, writing the HTML and JavaScript for a SharePoint site. I would like to embed document libraries inside the custom HTML I am writing in SharePoint Designer.
I have not found a way to easily embed document libraries in custom HTML, but I did stumble on some documentation for a REST API. I figured I could use this and write my own AJAX app in the HTML for users to navigate the document library.
I am currently trying this JavaScript just to see if I can pull HTML or JSON for a document library's contents:
<script type="text/javascript">
var folderUrl = "x/x/x/testDocumentLibrary/Forms/AllItems.aspx";
var url = _spPageContextInfo.webServerRelativeUrl + "/_api/Web/GetFolderByServerRelativeUrl('" + folderUrl + "')?$expand=Folders,Files";
// The request that populates `data` is not shown in this post.
for (var i = 0; i < data.Files.length; i++) {
  // ... process data.Files[i]
}
for (var j = 0; j < data.Folders.length; j++) {
  // ... process data.Folders[j]
}
</script>
I am not sure if I am using the right url for the folderUrl variable.
In order to conduct some tests, what is _spPageContextInfo.webServerRelativeUrl pulling? I am trying to see if I can work backwards and create the URL manually first, without the SP function calls.
The folderUrl variable in your example code should end with the path to the library; everything up until /Forms/AllItems.aspx, so /x/x/x/testDocumentLibrary where /x/x/x/ is the server-relative path to the site on which the library resides.
The _spPageContextInfo object provides two variations of server-relative URL, one for the current site (called a "web" in SharePoint jargon) and one for the current site collection (called a "site" in SharePoint jargon). Appropriately, these properties are named webServerRelativeUrl and siteServerRelativeUrl. Both of these are server-relative, meaning that they exclude the protocol and domain name. (Instead of the full absolute URL, they'll give you /sites/stackoverflow.)
For a REST call, you probably want the absolute URL, not the server-relative URL. You can access the web and site absolute URLs through _spPageContextInfo's properties webAbsoluteUrl and siteAbsoluteUrl.
If the list/library you're accessing is on the current site where your REST call is running, use the webAbsoluteUrl property.
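Putting the two points together, here is a rough sketch of how the endpoint could be assembled. The helper names buildFolderEndpoint and loadFolder and the contoso.example host are made up for illustration, and I am assuming the page exposes the absolute URL as _spPageContextInfo.webAbsoluteUrl and that you ask for verbose OData JSON:

```javascript
// Build the REST endpoint for a library folder.
// webAbsoluteUrl: absolute URL of the current web, e.g. _spPageContextInfo.webAbsoluteUrl.
// libraryPath: server-relative path to the library, without /Forms/AllItems.aspx.
function buildFolderEndpoint(webAbsoluteUrl, libraryPath) {
  var base = webAbsoluteUrl.replace(/\/+$/, ''); // drop trailing slashes
  return base + "/_api/web/GetFolderByServerRelativeUrl('" + libraryPath + "')?$expand=Folders,Files";
}

// Browser-only sketch: request the folder as JSON and hand the parsed body to a callback.
function loadFolder(endpoint, onDone) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', endpoint);
  xhr.setRequestHeader('Accept', 'application/json;odata=verbose');
  xhr.onload = function () {
    onDone(JSON.parse(xhr.responseText));
  };
  xhr.send();
}

// Hypothetical site and library names:
console.log(buildFolderEndpoint('https://contoso.example/sites/team/', '/sites/team/testDocumentLibrary'));
```

With the verbose Accept header the files and folders should come back nested under d.Files.results and d.Folders.results rather than directly on the response object, so check the raw JSON before writing the loops.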

Can I scrape this site using just Node?

I'm very new to JavaScript, so be patient.
I've been trying to scrape a site and get all the product URLs in a list that I will use later in another function, like this:
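var http = require('http-get');
var request = require("request");
var cheerio = require("cheerio");
function getURLS(url) {
  request(url, function (err, resp, body) {
    if (err) { return console.error(err); }
    var linklist = [];
    var $ = cheerio.load(body);
    // Collect each distinct href found inside #productResults.
    $('#productResults a').each(function (i, el) {
      var href = $(el).attr('href');
      if (href && linklist.indexOf(href) == -1) {
        linklist.push(href);
      }
    });
    // Prefix each link with the site's base URL (left empty in this post).
    var extended_links = linklist.map(function (link) {
      return '' + link;
    });
    console.log(extended_links);
  });
}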
This does work unless you go to the second page of items like this:
var http = require('http-get');
var request = require("request");
var cheerio = require("cheerio"); //etc...
As far as I know, this happens because the content on the page is loaded dynamically.
To get the contents of the page I believe I need to use PhantomJS, because that would allow me to get the HTML after the page has been fully loaded, so I installed the phantomjs-node module. I want to use NodeJS to get the URL list because the rest of my code is written in it.
I've been reading a lot about PhantomJS, but using phantomjs-node is tricky and I still don't understand how I could get the URL list with it, because I'm very new to JavaScript and to coding in general.
If someone could guide me a little bit, I'd appreciate it a lot.
Yes, you can. That page looks like it implements Google's Ajax Crawling URL.
Basically it allows websites to generate crawler friendly content for Google. Whenever you see a URL like this:[pagenum=2*ava=1]
You need to convert it to this:*ava%3D1%5D
The conversion is simple: take the base path, add a query parameter _escaped_fragment_ whose value is the URL fragment Filter=[pagenum=2*ava=1] encoded as Filter%3D%5Bpagenum%3D2*ava%3D1%5D using standard URI encoding.
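In code, the conversion might look roughly like this. It is only a sketch: the function name toEscapedFragmentUrl and the example.com address are placeholders, and I am assuming the fragment starts with #! and that encodeURIComponent is close enough to the encoding the site expects:

```javascript
// Turn a "#!" URL into its _escaped_fragment_ form.
function toEscapedFragmentUrl(url) {
  var marker = url.indexOf('#!');
  if (marker === -1) {
    return url; // nothing to convert
  }
  var base = url.slice(0, marker);
  var fragment = url.slice(marker + 2);
  // Use "&" if the base path already carries a query string.
  var separator = base.indexOf('?') === -1 ? '?' : '&';
  return base + separator + '_escaped_fragment_=' + encodeURIComponent(fragment);
}

// Placeholder host and path, same filter as in the answer:
console.log(toEscapedFragmentUrl('http://example.com/products#!Filter=[pagenum=2*ava=1]'));
// prints http://example.com/products?_escaped_fragment_=Filter%3D%5Bpagenum%3D2*ava%3D1%5D
```

The converted URL can then be passed to request() like any other page, and cheerio should see the product links in the returned HTML.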
You can read the full specification here:
Note: This does not apply to all websites, only to websites that implement Google's Ajax Crawling URL scheme. But you're in luck in this case.
You can see any product you want without using dynamic content using this url:{product_id}
For example to see product 37023:
All you have to do is for(var productid=0;productid<40000;productid++) {request...}.
Another approach is to use the phantom module. It will let you run phantom commands directly from your NodeJS app.

How to edit data of an XML node with JavaScript

I want to write some text from an HTML page into an existing local XML file with JavaScript. Is it possible to change the content of nodes?
Here is XML sample:
I will get some more text from input and want to add it after "text1", but can't find a solution.
function SaveNotes(content, player) {
  var xml = "serialize.xml";
  var xmlTree = parseXml("<Notepad></Notepad>");
  var str = xmlTree.createElement("Notes");
  // (the lines that fill and attach the new node are missing from this post)
  var xmlString = (new XMLSerializer()).serializeToString(xmlTree);
}
Here is the code to manipulate XML content or an XML file:
Please check this Fiddle
var parseXml;
parseXml = function(xmlStr) {
  return (new window.DOMParser()).parseFromString(xmlStr, "text/xml");
};
var xmlTree = parseXml("<root></root>");
// Append a new <child_name> element to the first <parent_name> element.
function add_children(child_name, parent_name) {
  var str = xmlTree.createElement(child_name);
  //strXML = parseXml(str);
  xmlTree.getElementsByTagName(parent_name)[0].appendChild(str);
  var xmlString = (new XMLSerializer()).serializeToString(xmlTree);
  return xmlString;
}
add_children("apple", "root");
add_children("orange", "root");
add_children("lychee", "root");
You can use it for searching in XML as well as for adding new nodes with content. (Sorry, I don't know how to load XML from the client side and display it.)
But this fiddle demo will be helpful for adding content to XML and searching in it.
Hope it helps :)
If you want to achieve this on the client side you can parse your xml into a document object:
And then manipulate it like you would the DOM of any html doc, e.g. createElement, appendChild etc.
Then to serialize it into a String again you could use
Persisting the data
Writing to a local file is not possible in a cross-browser way. In IE you could use ActiveX to read and write files.
You could use cookies to store data on the client side, if your data stays small enough.
In HTML5 you could use local storage.
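A small sketch of the local storage option, assuming the XML has already been serialized to a string. The helper names saveXml and loadXml and the key 'notepad-xml' are invented for the example; `storage` stands for window.localStorage or anything else with getItem/setItem:

```javascript
// Store a serialized XML string under a key.
function saveXml(storage, key, xmlString) {
  storage.setItem(key, xmlString);
}

// Read it back, or return fallbackXml when nothing has been stored yet.
function loadXml(storage, key, fallbackXml) {
  var stored = storage.getItem(key);
  return stored === null ? fallbackXml : stored;
}

// Browser usage (xmlTree is a document produced by DOMParser):
// saveXml(window.localStorage, 'notepad-xml', new XMLSerializer().serializeToString(xmlTree));
// var xmlString = loadXml(window.localStorage, 'notepad-xml', '<Notepad></Notepad>');
```

Keep in mind this stores the data in the browser profile, not in the XML file on disk.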
Try using these two packages: one to convert the XML to JSON and, when you are finished, the other to convert it back.

Get webpage and read through it using javascript

Hi, I have a quick question. Say you would like to connect to a website and find the links it contains. How do you do this with JavaScript?
I would like to do something like this:
var everythingAdiffrentPageContains; // go to some link and store that page's content in this variable
var pageLinks = [];
var anchors = everythingAdiffrentPageContains.getElementsByTagName('a');
var numAnchors = anchors.length;
for (var i = 0; i < numAnchors; i++) {
  pageLinks.push(anchors[i].href);
}
We can assume here that we have access rights to the site, so this is not a concern.
In other words, I would like to go to some site and store all of that site's hyperlinks in an array. How would you do this in JavaScript?
EDIT: since it was pointed out, I'm not trying to connect to another domain. I'm trying to connect to another Apache web server inside my LAN that hosts a website that I would like to scan for links.
Unfortunately I do not have PHP on my web server :/ But a simple JavaScript solution would do.
For example, go to X:/folder/example.html
Read it, and store the links
Unfortunately, you can't do this. "We can assume here that we have access rights to the site"... that's a false assumption from a JavaScript point of view if the page is on another domain. You simply can't access content on another domain (not HTML content anyway) via JavaScript. It's prevented by the same-origin policy, which is in place for several security reasons.
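Note that "another domain" really means "another origin": scheme, host and port all have to match, so a second web server on the same LAN usually counts as a different origin too. A quick way to check, sketched with the standard URL class (the helper name isSameOrigin and the addresses are made-up examples):

```javascript
// Two URLs share an origin only if scheme, host and port are all equal.
function isSameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

// Hypothetical LAN addresses:
console.log(isSameOrigin('http://192.168.0.10/index.html', 'http://192.168.0.10/folder/example.html')); // true
console.log(isSameOrigin('http://192.168.0.10/index.html', 'http://192.168.0.20/folder/example.html')); // false
console.log(isSameOrigin('http://192.168.0.10/index.html', 'http://192.168.0.10:8080/example.html'));   // false
```

If the check comes out false, the page has to be fetched through your own server or the other server has to allow it explicitly.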
I suggest you use a JS framework that helps you retrieve elements and work with the DOM easily.
For example, using MooTools you could achieve this by writing some code like this:
var req = new Request.HTML({
  url:'./retrieve.php?url=YOURURL', //create a server script to "retrieve" the html of another domain page
  onSuccess: function(tree,DOMelements) {
    var links = [];
    // Keep the href of every anchor in the response.
    DOMelements.each(function(el) { if (el.get('tag') == 'a') { links.push(el.get('href')); } });
  }
}).get();
The retrieve.php page should be written for example in this way:
$url = $_GET['url'];
header('Content-type: application/xml');
echo file_get_contents($url);

XML in html div?

I have put some xml-fragments in a div and retrieve it with getElementsByTagName. It works fine in Firefox but Internet Explorer ain't so nice... What should I do to fix this?
var thumbnails = content.getElementsByTagName("thumbnails");
for (var i = 0; i < thumbnails.length; i++) {
  // ... read thumbnails[i] here (the loop body is missing from this post)
}
You can't put arbitrary XML in an HTML document, in general. It's invalid HTML, and browser parsers may try to ‘fix’ the broken HTML, mangling your data.
You can embed XML inside HTML using <xml> data islands in IE, or using native-XHTML with custom namespaces in other browsers. But apart from the compatibility issue of the two different methods, it's just not really a very good idea.
Further, even if it worked, plain XML Element nodes don't have an innerHTML property in any case.
You could embed XML inside JavaScript:
<script type="text/javascript">
var xml= '<nails><thumb id="foo">bar</thumb><thumb id="bof">zot</thumb></nails>';
var doc= parseXML(xml);
var nails= doc.getElementsByTagName('thumb');
for (var i = 0; i<nails.length; i++) {
    // do something with nails[i], e.g. nails[i].getAttribute('id')
}

// Parse an XML string with DOMParser, falling back to MSXML on old IE.
function parseXML(s) {
    if ('DOMParser' in window) {
        return new DOMParser().parseFromString(s, 'text/xml');
    } else if ('ActiveXObject' in window) {
        var doc= new ActiveXObject('MSXML2.DOMDocument');
        doc.async= false;
        doc.loadXML(s);
        return doc;
    } else {
        alert('Browser cannot parse XML');
    }
}
</script>
But this means you have to encode the XML as a JavaScript string literal (eg. using a JSON encoder if you are doing it dynamically). Alternatively you could use an XMLHttpRequest to fetch a standalone XML document from the server: this is more widely supported than the DOMParser/ActiveX approach.
If you are just using XML to pass data to your script, you will find it a lot easier to write JavaScript literals to do it instead of mucking about with parsing XML.
<script type="text/javascript">
var nails= [
    {"id": "foo", "text": "bar"},
    {"id": "bof", "text": "zot"}
];
for (var i = 0; i<nails.length; i++) {
    // do something
}
</script>
Again, you can produce this kind of data structure easily using a JSON encoder if you need to do it dynamically.
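As a rough illustration of the JSON-encoder route, here it is with JavaScript's own JSON object standing in for whatever encoder your server language provides. The helper name toScriptLiteral is made up, and escaping "<" is an extra precaution I am assuming you want when the output goes inside a script block:

```javascript
// Encode data as a literal that is safe to write inside a <script> block.
function toScriptLiteral(data) {
  // Escaping "<" keeps a value such as "</script>" from ending the block early.
  return JSON.stringify(data).replace(/</g, '\\u003c');
}

var nails = [
  { id: 'foo', text: 'bar' },
  { id: 'bof', text: 'zot' }
];

console.log('var nails= ' + toScriptLiteral(nails) + ';');
// prints var nails= [{"id":"foo","text":"bar"},{"id":"bof","text":"zot"}];
```

The printed line is what the server would emit into the page; the browser then reads it as an ordinary array literal.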
IE 7 has a security issue with the innerHTML property of a DOM element. This security check silently blocks some code. It appears this may be your problem. I do not know if this is an issue with IE 8.
The fix is to add the dynamically created element to the DOM tree before accessing any of its properties, not after.
However, for best practices it is wise to change the way you are doing this. Perhaps you should edit your question to ask a better way to do this.
What I've found to be the best way of doing this is to put your XML into a textarea. This is also Ext JS's suggestion. That way, the browser doesn't try to create HTML out of your XML. When you need the XML, you just retrieve the textarea's value.
However, as other people have mentioned, I would suggest you retrieve the xml from the server for better separation between html and data, unless you really need to keep your http requests to a minimum.

