Is it possible to access elements of a website?

I have a link here: https://fantasy.espn.com/football/players/add?leagueId=1589782588 and I've been wanting to pull data from it. In the developer console I typed out
let players = document.getElementsByClassName("AnchorLink link clr-link pointer")
players[0].text
and it works perfectly. How can I get this to work in my IDE?

Disclaimer: the following is for teaching purposes only and should not be abused.
Use a public API if provided by the website owner.
Investigate what happens to the request by using the Network tab.
You'll notice that a request is made to a URI https://site.api.espn.com/apis/..... which ends in something like: ....ffl/news/players?days=30&playerId=2576623.
If you click that link you'll go directly to a page that serves an API response as JSON.
Search the entire site again in Dev Tools (Ctrl + Shift + F) for that player ID, 2576623, and you'll notice that it is stored inside each player image URI. So let's collect all those IDs.
Open Dev Tools Console and run:
var _i = document.querySelectorAll("tbody .player__column img[src*='full/']");
console.log(_i);
Now that you have your image elements it's time to collect all the IDs:
var _ids = [..._i].map(el => el.src.match(/(?<=full\/)[^\.]+(?=\.)/)[0]);
console.log(_ids)
From this point on, you can use any server-side script (or even JS, if there's no cross-origin limitation) to fetch that JSON data.
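As a minimal sketch of the ID-extraction step, the helper below applies the same idea to plain URL strings so it can be tried outside the browser. The name extractPlayerId and the sample URLs are hypothetical; only the "full/<id>.<extension>" shape comes from the answer above.

```javascript
// Hypothetical helper: pull the player ID out of an image URL of the form ".../full/<id>.<ext>".
function extractPlayerId(src) {
  // Same idea as the regex above, written with a capture group instead of lookbehind.
  const match = /full\/([^./]+)\./.exec(src);
  return match ? match[1] : null; // null when the URL does not contain "full/<id>."
}

// Sample URLs are made up for illustration.
const sources = [
  "https://example.com/i/headshots/full/2576623.png",
  "https://example.com/i/logo.png",
];
const ids = sources.map(extractPlayerId).filter((id) => id !== null);
console.log(ids);
```

Returning null instead of throwing avoids the crash that match(...)[0] would cause on an image whose URL does not follow the expected shape.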

Related

How to download files of different types of computer format and without display to the user in Javascript or C#?

I have a list of links, where each link corresponds to a different file that can be PDF, DOC, XLS, etc. obtained through a viewer on a jsp page. I have the link and the code for the file (documentId), but the way I'm doing it only downloads the last file in the list:
// File 1
var url1 = "https://site.site.net/servlet/DocumentServlet?documentId=123456789&action=viewUncontrolledCopy#toolbar=0&navpanes=0&scrollbar=0";
// File 2
var url2 = "https://site.site.net/servlet/DocumentServlet?documentId=987654321&action=viewUncontrolledCopy#toolbar=0&navpanes=0&scrollbar=0";
window.location.assign (url1);
setTimeout (20000);
window.location.assign (url2);
I have already searched and could not find anything that completely does what I need.

Try with the window.open command
// File 1
var url1 = "https://site.site.net/servlet/DocumentServlet?documentId=123456789&action=viewUncontrolledCopy#toolbar=0&navpanes=0&scrollbar=0";
// File 2
var url2 = "https://site.site.net/servlet/DocumentServlet?documentId=987654321&action=viewUncontrolledCopy#toolbar=0&navpanes=0&scrollbar=0";
window.open(url1);
window.open(url2);
This downloads the files by opening two new windows, which keeps the first page open and its JavaScript running.
Check here: https://www.w3schools.com/jsref/met_loc_assign.asp
location.assign() navigates the current window to a new page, so the page and its remaining JavaScript are abandoned and only one navigation takes effect.

SharePoint Rest Document library

I am creating a custom page writing the HTML and javascript for a SharePoint site. I would like to embed document libraries inside my custom html I am writing in SharePoint designer.
I have not found a way to easily embed document libraries in custom HTML, but I did stumble on some documentation for a REST API. I figured I could use this and write my own Ajax app in the HTML for users to navigate the document library.
I am currently trying this JavaScript just to see if I can pull HTML or JSON for a document library's contents:
<script type="text/javascript">
var folderUrl = "x/x/x/testDocumentLibrary/Forms/AllItems.aspx";
var url = _spPageContextInfo.webServerRelativeUrl + "/_api/Web/GetFolderByServerRelativeUrl('" + folderUrl + "')?$expand=Folders,Files";
$.getJSON(url,function(data,status,xhr){
for(var i = 0; i < data.Files.length;i++){
console.log(data.Files[i].Name);
}
for(var i = 0; i < data.Folders.length;i++){
console.log(data.Folders[i].Name);
}
});
</script>
I am not sure if I am using the right url for the folderUrl variable.
In order to conduct some tests, what does _spPageContextInfo.webServerRelativeUrl return? I am trying to see if I can work backwards and create the URL manually first, without the SP function calls.

The folderUrl variable in your example code should end with the path to the library; everything up until /Forms/AllItems.aspx, so /x/x/x/testDocumentLibrary where /x/x/x/ is the server-relative path to the site on which the library resides.
The _spPageContextInfo object provides two variations of server-relative URL, one for the current site (called a "web" in SharePoint jargon) and one for the current site collection (called a "site" in SharePoint jargon). Appropriately, these properties are named webServerRelativeUrl and siteServerRelativeUrl (note the casing: JavaScript property names are case-sensitive). Both are server-relative, meaning that they exclude the protocol and domain name. (Instead of https://constoso.com/sites/stackoverflow they'll give you /sites/stackoverflow.)
For a REST call, you probably want the absolute URL, not the server-relative URL. You can access the web and site absolute URLs through _spPageContextInfo's properties webAbsoluteUrl and siteAbsoluteUrl.
If the list/library you're accessing is on the current site where your REST call is running, use the webAbsoluteUrl property.
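A minimal sketch of how those pieces could be combined into the request URL follows. The helper name buildFolderRestUrl and the example.com values are hypothetical; in a SharePoint page the first argument would come from _spPageContextInfo.webAbsoluteUrl, and the endpoint shape is the one used in the question.

```javascript
// Hypothetical helper: build the REST URL for a folder's contents.
// webAbsoluteUrl stands in for _spPageContextInfo.webAbsoluteUrl;
// folderServerRelativeUrl is the library path without /Forms/AllItems.aspx.
function buildFolderRestUrl(webAbsoluteUrl, folderServerRelativeUrl) {
  const base = webAbsoluteUrl.replace(/\/+$/, ""); // drop trailing slashes
  // Single quotes inside an OData string literal are escaped by doubling them.
  const folder = folderServerRelativeUrl.replace(/'/g, "''");
  return base + "/_api/Web/GetFolderByServerRelativeUrl('" + folder + "')?$expand=Folders,Files";
}

// Made-up values for illustration only.
const restUrl = buildFolderRestUrl(
  "https://example.com/sites/demo/",
  "/sites/demo/testDocumentLibrary"
);
console.log(restUrl);
```

Building the URL in a plain function like this makes it easy to log and paste into the browser address bar, which is one way to test the endpoint manually before wiring it into an Ajax call.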

Web page doesn't reflect code changes [duplicate]

How do I clear a browser's cache with JavaScript?
We deployed the latest JavaScript code, but clients are still getting the old version.
Editorial Note: This question is semi-duplicated in the following places, and the answer in the first of the following questions is probably the best. This accepted answer is no longer the ideal solution.
How to force browser to reload cached CSS/JS files?
How can I force clients to refresh JavaScript files?
Dynamically reload local Javascript source / json data

Update: See location.reload() has no parameter for background on this nonstandard parameter and how Firefox is likely the only modern browser with support.
You can call window.location.reload(true) to reload the current page. It will ignore any cached items and retrieve new copies of the page, css, images, JavaScript, etc from the server. This doesn't clear the whole cache, but has the effect of clearing the cache for the page you are on.
However, your best strategy is to version the path or filename as mentioned in various other answers. In addition, see Revving Filenames: don’t use querystring for reasons not to use ?v=n as your versioning scheme.

You can't clear the cache with javascript.
A common way is to append the revision number or last updated timestamp to the file, like this:
myscript.123.js
or
myscript.js?updated=1234567890

Try changing the JavaScript file's src? From this:
<script language="JavaScript" src="js/myscript.js"></script>
To this:
<script language="JavaScript" src="js/myscript.js?n=1"></script>
This method should force your browser to load a new copy of the JS file.

Other than caching every hour, or every week, you may cache according to file data.
Example (in PHP):
<script src="js/my_script.js?v=<?=md5_file('js/my_script.js')?>"></script>
or even use file modification time:
<script src="js/my_script.js?v=<?=filemtime('js/my_script.js')?>"></script>

You can also force the code to be reloaded every hour, like this, in PHP :
<?php
echo '<script language="JavaScript" src="js/myscript.js?token='.date('YmdH').'"></script>';
?>
or
<script type="text/javascript" src="js/myscript.js?v=<?php echo date('YmdHis'); ?>"></script>

window.location.reload(true) seems to have been deprecated by the HTML5 standard. One way to do this without using query strings is to use the Clear-Site-Data header, which seems to be in the process of being standardized.

Put this at the end of your template:
var scripts = document.getElementsByTagName('script');
var torefreshs = ['myscript.js', 'myscript2.js']; // list of JS files to refresh
var key = 1; // change this key every time you want to force a refresh
for(var i=0;i<scripts.length;i++){
for(var j=0;j<torefreshs.length;j++){
if(scripts[i].src && (scripts[i].src.indexOf(torefreshs[j]) > -1)){
var new_src = scripts[i].src.replace(torefreshs[j],torefreshs[j] + '?k=' + key );
scripts[i].src = new_src; // change src in order to refresh js
}
}
}

Try changing this:
<script language="JavaScript" src="js/myscript.js"></script>
To this:
<script language="JavaScript" src="js/myscript.js?n=1"></script>

Here's a snippet of what I'm using for my latest project.
From the controller:
if ( IS_DEV ) {
$this->view->cacheBust = microtime(true);
} else {
$this->view->cacheBust = file_exists($versionFile)
// The version file exists, encode it
? urlencode( file_get_contents($versionFile) )
// Use today's year and week number to still have caching and busting
: date("YW");
}
From the view:
<script type="text/javascript" src="/javascript/somefile.js?v=<?= $this->cacheBust; ?>"></script>
<link rel="stylesheet" type="text/css" href="/css/layout.css?v=<?= $this->cacheBust; ?>">
Our publishing process generates a file with the revision number of the current build. This works by URL encoding that file and using that as a cache buster. As a fail-over, if that file doesn't exist, the year and week number are used so that caching still works, and it will be refreshed at least once a week.
Also, this provides cache busting for every page load while in the development environment so that developers don't have to worry with clearing the cache for any resources (javascript, css, ajax calls, etc).

Or you can have the server read the JS file with file_get_contents and echo its contents inline in the page header.

Maybe "clearing the cache" is not as easy as it should be. Instead of clearing the cache in my browsers, I realized that "touching" a file changes the modification date of the source file on the server (tested with Edge, Chrome and Firefox), and most browsers will then automatically download the current copy of what's on your server (code, graphics, any multimedia too). I suggest you copy the current scripts to the server and apply the "touch" solution before your program runs: it sets the date of all your problem files to the current date and time, so a fresh copy gets downloaded to your browser:
<?php
touch('/www/control/file1.js');
touch('/www/control/file2.js');
touch('/www/control/file2.js');
?>
...the rest of your program...
It took me some time to resolve this issue. Browsers react differently to different commands, but they all check the file time and compare it with the copy they downloaded; if the date and time differ, they refresh. If you can't go the supposedly right way, there is always another usable solution. Best regards and happy camping.

I had some trouble with the code suggested by yboussard. The inner j loop didn't work. Here is the modified code that I use with success.
function reloadScripts(toRefreshList /* list of JS files to refresh */, key /* change this key every time you want to force a refresh */) {
var scripts = document.getElementsByTagName('script');
for(var i = 0; i < scripts.length; i++) {
var aScript = scripts[i];
for(var j = 0; j < toRefreshList.length; j++) {
var toRefresh = toRefreshList[j];
if(aScript.src && (aScript.src.indexOf(toRefresh) > -1)) {
var new_src = aScript.src.replace(toRefresh, toRefresh + '?k=' + key);
// console.log('Force refresh on cached script files. From: ' + aScript.src + ' to ' + new_src)
aScript.src = new_src;
}
}
}
}
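One limitation of appending '?k=' with a string replace is that it produces a malformed URL when the script address already has a query string. As a hedged alternative sketch, the standard URL and URLSearchParams APIs can set the parameter instead; the helper name withCacheKey, the default parameter name "k", and the example.com addresses are assumptions for illustration.

```javascript
// Hypothetical helper: set (or replace) a cache-busting parameter on an absolute URL.
function withCacheKey(src, key, paramName = "k") {
  const url = new URL(src); // throws on relative URLs; script.src is absolute in browsers
  url.searchParams.set(paramName, String(key)); // replaces an existing value instead of appending
  return url.toString();
}

const plain = withCacheKey("https://example.com/js/myscript.js", 2);
const withQuery = withCacheKey("https://example.com/js/myscript.js?k=1&lang=en", 2);
console.log(plain);
console.log(withQuery);
```

Inside the loop above, the assignment could then read aScript.src = withCacheKey(aScript.src, key), which also avoids stacking a new '?k=' on every call.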

If you are using PHP, you can do:
<script src="js/myscript.js?rev=<?php echo time();?>"
type="text/javascript"></script>

Please do not give incorrect information.
The Cache API is a different type of cache from the HTTP cache.
The HTTP cache is used when the server sends the appropriate headers; you can't access it with JavaScript.
The Cache API, on the other hand, is used whenever you want. It is useful when working with a service worker, so you can intercept a request and answer it from this type of cache.
See: illustration 1, illustration 2, course.
You could use these techniques to always serve fresh content to your users:
Use location.reload(true). This does not work for me, so I wouldn't recommend it.
Use the Cache API to save into the cache and intercept the request with a service worker. Be careful with this one: if the server has sent cache headers for the files you want to refresh, the browser will answer from the HTTP cache first, and only if it does not find the file there will it go to the network, so you could end up with an old file.
Change the URL of your static files. My recommendation is to name each file after its content: I use MD5, convert the digest to a URL-friendly string, and the name then changes whenever the file content changes. With that in place you can freely send HTTP cache headers with a long lifetime.
I would recommend the third one.
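The third option can be sketched in Node.js as follows. This is only an illustration under assumptions: the helper name hashedFileName, the choice of an 8-character prefix of the MD5 hex digest, and the "name.hash.ext" layout are not from the answer, and a real build step would read the content from the file on disk.

```javascript
const crypto = require("crypto");

// Hypothetical helper: derive a content-based file name such as "myscript.<hash>.js".
function hashedFileName(fileName, content) {
  // MD5 is used here as a content fingerprint, not for security.
  const hash = crypto.createHash("md5").update(content).digest("hex").slice(0, 8);
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) return fileName + "." + hash; // no extension to preserve
  return fileName.slice(0, dot) + "." + hash + fileName.slice(dot);
}

const hashedName = hashedFileName("myscript.js", "hello");
console.log(hashedName); // myscript.5d41402a.js
```

Because the name changes only when the content does, unchanged files stay cached while edited ones are fetched again under their new name.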

You can also disable browser caching with meta HTML tags. Put the tags in the head section to keep the web page from being cached while you are coding/testing; when you are done, you can remove them.
(in the head section)
<meta http-equiv="Cache-Control" content="no-cache, no-store, must-revalidate" />
<meta http-equiv="Pragma" content="no-cache" />
<meta http-equiv="Expires" content="0"/>
Refresh your page after pasting this into the head, and it should load the new JavaScript code too.
This link will give you other options if you need them
http://cristian.sulea.net/blog/disable-browser-caching-with-meta-html-tags/
Or you can just create a button like so:
<button type="button" onclick="location.reload(true)">Refresh</button>
It refreshes and avoids caching, but it will stay on your page until you finish testing, at which point you can remove it. The first option is best, I think.

I tend to version my framework then apply the version number to script and style paths
<cfset fw.version = '001' />
<script src="/scripts/#fw.version#/foo.js"/>

Cache.delete() can be used in recent versions of Chrome, Firefox and Opera.

I found a solution to this problem recently. In my case, I was trying to update an html element using javascript; I had been using XHR to update text based on data retrieved from a GET request. Although the XHR request happened frequently, the cached HTML data remained frustratingly the same.
Recently, I discovered a cache busting method in the fetch api. The fetch api replaces XHR, and it is super simple to use. Here's an example:
async function updateHTMLElement(url) {
let res = await fetch(url, {cache: "no-store"});
if(res.ok){
let myTxt = await res.text();
document.getElementById('myElement').innerHTML = myTxt;
}
}
Notice the {cache: "no-store"} argument? It makes the browser bypass the HTTP cache for that request, so that new data gets loaded properly. My goodness, this was a godsend for me. I hope this is helpful for you, too.
Tangentially, to bust the cache for an image that gets updated on the server side, but keeps the same src attribute, the simplest and oldest method is to simply use Date.now(), and append that number as a url variable to the src attribute for that image. This works reliably for images, but not for HTML elements. But between these two techniques, you can update any info you need to now :-)

Most of the right answers are already mentioned in this topic. However, I want to add a link to the best article I have read on the subject.
https://www.fastly.com/blog/clearing-cache-browser
As far as I can see, the most suitable solution is:
POST in an iframe. Here is a short extract from that post:
=============
const ifr = document.createElement('iframe');
ifr.name = ifr.id = 'ifr_'+Date.now();
document.body.appendChild(ifr);
const form = document.createElement('form');
form.method = "POST";
form.target = ifr.name;
form.action = '/thing/stuck/in/cache';
document.body.appendChild(form);
form.submit();
There’s a few obvious side effects: this will create a browser history entry, and is subject to the same issues of non-caching of the response. But it escapes the preflight requirements that exist for fetch, and since it’s a navigation, browsers that split caches will be clearing the right one.
This one almost nails it. Firefox will hold on to the stuck object for cross-origin resources but only for subsequent fetches. Every browser will invalidate the navigation cache for the object, both for same and cross origin resources.
==============================
We tried many things, but that one works pretty well. The only issue is that you need to be able to deliver this script to the end user's page somehow so that you can reset the cache. We were lucky in our particular case.

window.parent.caches.delete("call")
close and open the browser after executing the code in console.

Because the browser caches requests to the same link, you should add a changing number to the end of the URL.
new Date().getTime() generates a different number on each call.
Just append new Date().getTime() to the end of the link, like this:
'https://stackoverflow.com/questions.php?' + new Date().getTime()
Output: https://stackoverflow.com/questions.php?1571737901173

I've solved this issue by using ETag.
ETags are similar to fingerprints, and if the resource at a given URL changes, a new ETag value must be generated. A comparison of them can determine whether two representations of a resource are the same.
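To illustrate the comparison described above, here is a minimal sketch of the check a server could perform on a conditional GET. The function name isNotModified is hypothetical, and splitting the header on commas is a simplification that assumes no comma appears inside an ETag value.

```javascript
// Hypothetical helper: decide whether a conditional GET can be answered with 304 Not Modified.
// ifNoneMatch is the raw If-None-Match request header (or undefined);
// currentEtag is the ETag the server would send for the current representation.
function isNotModified(ifNoneMatch, currentEtag) {
  if (!ifNoneMatch) return false;              // unconditional request: send the full response
  if (ifNoneMatch.trim() === "*") return true; // "*" matches any existing representation
  const strip = (tag) => tag.trim().replace(/^W\//, ""); // weak comparison ignores the W/ prefix
  const current = strip(currentEtag);
  return ifNoneMatch.split(",").some((tag) => strip(tag) === current);
}

console.log(isNotModified('"v1"', '"v1"'));         // true
console.log(isNotModified('"v1", W/"v2"', '"v2"')); // true
console.log(isNotModified('"v1"', '"v2"'));         // false
```

When the check returns true the server can reply with 304 and no body, so the browser reuses its cached copy; when the file changes, its ETag changes and the full response is sent again.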

Ref: https://developer.mozilla.org/en-US/docs/Web/API/Cache/delete
Cache.delete()
Method
Syntax:
cache.delete(request, {options}).then(function(found) {
// your cache entry has been deleted if found
});

What's the best method to EXTRACT product names given a list of SKU numbers from a website?

I have a problem.
I have a list of SKU numbers (hundreds) that I'm trying to match with the title of the product that it belongs to. I have thought of a few ways to accomplish this, but I feel like I'm missing something... I'm hoping someone here has a quick and efficient idea to help me get this done.
The products come from Aidan Gray.
Attempt #1 (Batch Program Method) - FAIL:
After searching for a SKU in Aidan Gray, the website returns a URL that looks like below:
http://www.aidangrayhome.com/catalogsearch/result/?q=SKUNUMBER
... with "SKUNUMBER" obviously being a SKU.
The first result of the webpage is almost always the product.
To click the first result (through the address bar) the following can be entered (if Javascript is enabled through the address bar):
javascript:{document.getElementsByClassName("product-image")[0].click();}
I wanted to create a .bat file through Command Prompt and execute the following command:
firefox http://www.aidangrayhome.com/catalogsearch/result/?q=SKUNUMBER javascript:{document.getElementsByClassName("product-image")[0].click();}
... but Firefox doesn't seem to allow these two commands to execute in the same tab.
If that worked, I was going to go to http://tools.buzzstream.com/meta-tag-extractor, paste the resulting links to get the titles of the pages, and export the data to CSV format, and copy over the data I wanted.
Unfortunately, I am unable to open both the webpage and the Javascript in the same tab through a batch program.
Attempt #2 (I'm Feeling Lucky Method):
I was going to use Google's &btnI URL suffix to automatically redirect to the first result.
http://www.google.com/search?btnI&q=site:aidangrayhome.com+SKUNUMBER
After opening all the links in tabs, I was going to use a Firefox add-on called "Send Tab URLs" to copy the names of the tabs (which contain the product names) to the clipboard.
The problem is that most of the results were simply not lucky enough...
If anybody has an idea or tip to get this accomplished, I'd be very grateful.

I recommend using JScript for this. It's easy to include as hybrid code in a batch script, its structure and syntax is familiar to anyone comfortable with JavaScript, and you can use it to fetch web pages via XMLHTTPRequest (a.k.a. Ajax by the less-informed) and build a DOM object from the .responseText using an htmlfile COM object.
Anyway, challenge: accepted. Save this with a .bat extension. It'll look for a text file containing SKUs, one per line, and fetch and scrape the search page for each, writing info from the first anchor element with a .className of "product-image" to a CSV file.
@if (@CodeSection == @Batch) @then
@echo off
setlocal
set "skufile=sku.txt"
set "outfile=output.csv"
set "URL=http://www.aidangrayhome.com/catalogsearch/result/?q="
rem // invoke JScript portion
cscript /nologo /e:jscript "%~f0" "%skufile%" "%outfile%" "%URL%"
echo Done.
rem // end main runtime
goto :EOF
@end // end batch / begin JScript chimera
var fso = WSH.CreateObject('scripting.filesystemobject'),
skufile = fso.OpenTextFile(WSH.Arguments(0), 1),
skus = skufile.ReadAll().split(/\r?\n/),
outfile = fso.CreateTextFile(WSH.Arguments(1), true),
URL = WSH.Arguments(2);
skufile.Close();
String.prototype.trim = function() { return this.replace(/^\s+|\s+$/g, ''); }
// returns a DOM root object
function fetch(url) {
var XHR = WSH.CreateObject("Microsoft.XMLHTTP"),
DOM = WSH.CreateObject('htmlfile');
WSH.StdErr.Write('fetching ' + url);
XHR.open("GET",url,true);
XHR.setRequestHeader('User-Agent','XMLHTTP/1.0');
XHR.send('');
while (XHR.readyState!=4) {WSH.Sleep(25)};
DOM.write(XHR.responseText);
return DOM;
}
function out(what) {
WSH.StdErr.Write(new Array(79).join(String.fromCharCode(8)));
WSH.Echo(what);
outfile.WriteLine(what);
}
WSH.Echo('Writing to ' + WSH.Arguments(1) + '...')
out('sku,product,URL');
for (var i=0; i<skus.length; i++) {
if (!skus[i]) continue;
var DOM = fetch(URL + skus[i]),
anchors = DOM.getElementsByTagName('a');
for (var j=0; j<anchors.length; j++) {
if (/\bproduct-image\b/i.test(anchors[j].className)) {
out(skus[i]+',"' + anchors[j].title.trim() + '","' + anchors[j].href + '"');
break;
}
}
}
outfile.Close();
Too bad the htmlfile COM object doesn't support getElementsByClassName. :/ But this seems to work well enough in my testing.
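One thing the script above does not handle is a product title that itself contains a double quote, which would break the CSV row. A hedged sketch of a quoting helper follows; csvField and csvRow are hypothetical names, the sample values are made up, and the code sticks to var and plain loops so that it should also be usable from the JScript portion.

```javascript
// Hypothetical helper: quote one CSV field, doubling any embedded double quotes.
function csvField(value) {
  return '"' + String(value).replace(/"/g, '""') + '"';
}

// Hypothetical row builder mirroring the sku,product,URL layout above.
function csvRow(fields) {
  var cells = [];
  for (var i = 0; i < fields.length; i++) {
    cells.push(csvField(fields[i]));
  }
  return cells.join(",");
}

// Made-up sample values for illustration.
var row = csvRow(["AB123", 'Lamp, 12" shade', "http://example.com/lamp"]);
console.log(row);
```

In the main loop, the call to out(...) could then pass csvRow([skus[i], title, href]) instead of concatenating the quotes by hand.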

Mozilla (Firefox, Thunderbird) Extension: How to get extension id (from install.rdf)?

If you are developing an extension for one of the Mozilla applications (e.g. Firefox, Thunderbird, etc.) you define an extension id in the install.rdf.
If for some reason you need to know the extension id, e.g. to retrieve the extension dir in the local file system (1) or to send it to a web service (usage statistics), it would be nice to get it from the install.rdf rather than having it hardcoded in your JavaScript code.
But how do I access the extension id from within my extension?
1) example code:
var extId = "myspecialthunderbirdextid@mydomain.com";
var filename = "install.rdf";
var file = extManager.getInstallLocation(extId).getItemFile(extId, filename);
var fullPathToFile = file.path;

I'm fairly sure the 'hard-coded ID' should never change throughout the lifetime of an extension. That's the entire purpose of the ID: it's unique to that extension, permanently. Just store it as a constant and use that constant in your libraries. There's nothing wrong with that.
What IS bad practice is using the install.rdf, which exists for the sole purpose of... well, installing. Once the extension is developed, the install.rdf file's state is irrelevant and could well be inconsistent.
"An Install Manifest is the file an Add-on Manager-enabled XUL application uses to determine information about an add-on as it is being installed" [1]
To give an analogy, it's like accessing the memory of a deleted object from an overflow. That object still exists in memory, but it's no longer logically relevant and using its data is a really, really bad idea.
[1] https://developer.mozilla.org/en/install_manifests

Like lwburk, I don't think it's available through Mozilla's APIs, but I have an idea that works, although it seems like a complex hack. The basic steps are:
Set up a custom resource url to point to your extension's base directory
Read the file and parse it into XML
Pull the id out using XPath
Add the following line to your chrome.manifest file
resource packagename-base-dir chrome/../
Then we can grab and parse the file with the following code:
function myId(){
var req = new XMLHttpRequest();
// synchronous request
req.open('GET', "resource://packagename-base-dir/install.rdf", false);
req.send(null);
if( req.status !== 0){
throw("file not found");
}
var data = req.responseText;
// this is so that we can query xpath with namespaces
var nsResolver = function(prefix){
var ns = {
"rdf" : "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
"em" : "http://www.mozilla.org/2004/em-rdf#"
};
return ns[prefix] || null;
};
var parser = CCIN("@mozilla.org/xmlextras/domparser;1", Ci.nsIDOMParser);
var doc = parser.parseFromString(data, "text/xml");
// you might have to change this xpath expression a bit to fit your setup
var myExtId = doc.evaluate("//em:targetApplication//em:id", doc, nsResolver,
Ci.nsIDOMXPathResult.FIRST_ORDERED_NODE_TYPE, null);
return myExtId.singleNodeValue.textContent;
}
I chose to use a XMLHttpRequest(as opposed to simply reading from a file) to retrieve the contents since in Firefox 4, extensions aren't necessarily unzipped. However, XMLHttpRequest will still work if the extension remains packed (haven't tested this, but have read about it).
Please note that resource URLs are shared by all installed extensions, so if packagename-base-dir isn't unique, you'll run into problems. You might be able to leverage Programmatically adding aliases to solve this problem.
This question prompted me to join StackOverflow tonight, and I'm looking forward to participating more... I'll be seeing you guys around!

As Firefox now just uses Chrome's WebExtension API, you can use @serg's answer at How to get my extension's id from JavaScript?:
You can get it like this (no extra permissions required) in two different ways:
Using runtime api: var myid = chrome.runtime.id;
Using i18n api: var myid = chrome.i18n.getMessage("@@extension_id");

I can't prove a negative, but I've done some research and I don't think this is possible. Evidence:
This question, which shows that the nsIExtensionManager interface expects you to retrieve extension information by ID
The full nsIExtensionManager interface description, which shows no method that helps
The interface does allow you to retrieve a full list of installed extensions, so it's possible to retrieve information about your extension using something other than the ID. See this code, for example:
var em = Cc['@mozilla.org/extensions/manager;1']
.getService(Ci.nsIExtensionManager);
const nsIUpdateItem = Ci.nsIUpdateItem;
var extension_type = nsIUpdateItem.TYPE_EXTENSION;
var items = em.getItemList(extension_type, {});
items.forEach(function(item, index, array) {
alert(item.name + " / " + item.id + " version: " + item.version);
});
But you'd still be relying on hardcoded properties, of which the ID is the only one guaranteed to be unique.

Take a look at this add-on; maybe its author could help you, or you can figure it out yourself:
[Extension Manager] Extended is very simple to use. After installing, just open the extension manager by going to Tools and then clicking Extensions. You will now see next to each extension the id of that extension.
(Not compatible yet with Firefox 4.0)
https://addons.mozilla.org/firefox/addon/2195
