Scrape HTML off AppBuilder page [closed] - javascript

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I found this website with some interesting data I wish to analyze. The page is really slow and built around .docx files, but it has an HTML preview of each document:
http://www.produktresume.dk/AppBuilder/search?page=0
My current idea for a strategy is:
Wait for the page to load (I haven't tried this before)
Dig into the <div class="widget_inside">
Grab the href of every <a class="preview_link">
Iterate over all the collected links and parse the HTML into some .json/.csv for later analysis
I'm pretty new to scraping. I previously had some luck with BeautifulSoup in Python, but on a page that doesn't load its content dynamically. I have been using Node.js lately, so I would prefer to be able to do it in JS with some npm package.
Can anybody help me find the right tools for the job and give some pointers/comments on the best strategy?
Bonus info
By decoding one of the filter links to the left this comes up:
http://www.produktresume.dk/AppBuilder/search?expand_all=true&page=0&refinements_token={}&selected_tokens[]={"s":[{"id":"folder-refinement","xPath":"$folders","separator":"\u003e","logic":"OR","s":[{"n":"Human","k":"Human"}]}]}
Don't know if that would be of any use?
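For what it's worth, the decoded link above can be taken apart (and put back together) with the Python standard library alone. This is only a sketch: the helper names read_filter and build_filter_url are made up for illustration, and whether the site accepts a rebuilt URL in exactly this form is an assumption that has not been checked against the server.

```python
import json
from urllib.parse import urlsplit, parse_qs, urlencode

# The decoded filter link from the question, split over several lines.
FILTER_URL = (
    'http://www.produktresume.dk/AppBuilder/search'
    '?expand_all=true&page=0&refinements_token={}'
    '&selected_tokens[]={"s":[{"id":"folder-refinement","xPath":"$folders",'
    '"separator":"\\u003e","logic":"OR","s":[{"n":"Human","k":"Human"}]}]}'
)


def read_filter(url):
    """Return (page number, decoded filter token) from a search URL."""
    query = parse_qs(urlsplit(url).query)
    page = int(query['page'][0])
    token = json.loads(query['selected_tokens[]'][0])
    return page, token


def build_filter_url(base, page, token):
    """Rebuild a percent-encoded search URL for another page (hypothetical helper)."""
    params = {
        'expand_all': 'true',
        'page': page,
        'refinements_token': '{}',
        'selected_tokens[]': json.dumps(token, separators=(',', ':')),
    }
    return base + '?' + urlencode(params)
```

If the server honours the page parameter together with the filter, something like build_filter_url(base, n, token) for each page n could replace clicking through the pager, but that would need to be verified first.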

so would prefer to be able to do it in JS with some npm package
Although I used Python, this answer shows a very simple and synchronous (slow!) way of grabbing all of those preview links. It took about 1 hour and 15 minutes to go through all 757 pages.
I wasn't sure how exactly you wanted to save the information in each of those preview links so I left that part to you. It would also be trivial to modify this script to simply download all of those .docx files instead of just grabbing the preview links.
import json

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.common.exceptions import (
    NoSuchElementException,
    WebDriverException
)
from selenium.webdriver.common.by import By

base_url = 'http://www.produktresume.dk'

# Run Chrome without a visible window
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
# options= is the current name of the older chrome_options= keyword
driver = webdriver.Chrome(options=chrome_options)
driver.get('{}/AppBuilder/search'.format(base_url))

result = []
while True:
    # Collect every preview link on the current results page
    soup = BeautifulSoup(driver.page_source, 'lxml')
    preview_links = [base_url + link['href']
                     for link in soup.find_all('a', class_='preview_link')]
    result.extend(preview_links)
    try:
        # 'Næste' is Danish for 'Next'; stop once there is no next page.
        # find_element(By.LINK_TEXT, ...) is the current spelling of
        # find_element_by_link_text, which newer Selenium releases dropped.
        element = driver.find_element(By.LINK_TEXT, 'Næste')
        element.click()
    except (NoSuchElementException, WebDriverException):
        break

driver.quit()

with open('preview_links.json', 'w') as f:
    json.dump(result, f, indent=2)
preview_links.json
https://bpaste.net/show/b597d6910f18
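The last step of the question's strategy (turning each preview into .json/.csv) is left open above. A minimal sketch of one way to do it, assuming that the plain visible text of each preview page is what should be stored; the structure of the real preview pages is not known here, so TextExtractor, html_to_text and records_to_csv are purely illustrative names, and the url/text column layout is an assumption.

```python
import csv
import io
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ('script', 'style'):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ('script', 'style') and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def html_to_text(html):
    """Reduce an HTML document to its visible text, joined by single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return ' '.join(parser.chunks)


def records_to_csv(records):
    """records: iterable of (url, html) pairs. Returns the CSV as a string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(['url', 'text'])
    for url, html in records:
        writer.writerow([url, html_to_text(html)])
    return buffer.getvalue()
```

Each url would come from preview_links.json and each html from fetching that link (for example with driver.page_source as in the script above). The same (url, text) pairs could just as well be passed to json.dump.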

Related

What is a good alternative for javascript escape() that lets you encode code? [duplicate]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
I'm creating a website where live editing of code (for Java, C, Python, JavaScript, etc.) is required. I'm aware of CodeMirror. I want to know how to run code on a website (like the W3Schools "Try it Yourself" feature) and have it run locally instead of requiring server infrastructure.
For front-end, it's pretty easy to emulate the basics of how Stack Snippets work here on Stack Exchange. The general idea is to create textareas for the different sections (HTML/JS/CSS), then when you want to render it, create an iframe whose content is created by inserting the textarea values:
// JavaScript
const [button, html, css, javascript, iframe] = document.querySelectorAll('button, textarea, iframe');
button.addEventListener('click', () => {
  const fullHTML = `
    <!doctype html><html>
      <head><style>${css.value}</style></head>
      <body>${html.value}<script>${javascript.value}<\/script></body>
    </html>`;
  iframe.src = 'data:text/html,' + encodeURIComponent(fullHTML);
});

/* CSS */
textarea {
  display: block;
}

<!-- HTML -->
<button>Run</button>
<textarea><span id="span">a span</span></textarea>
<textarea>span { color: green; }</textarea>
<textarea>span.onclick = () => span.style.color = 'yellow';</textarea>
<iframe sandbox="allow-scripts"></iframe>
The above is tweaked from an example from Kaiido on Meta.
This isn't exactly how Stack Snippets work - they do require a backend to take the form values and construct the HTML response the browser sees - but it's pretty similar.
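To make the backend variant a little more concrete: a server would take the three form values and build the same document the client-side snippet builds. The following is only a rough sketch in Python, not how Stack Snippets are actually implemented; render_snippet is a hypothetical name, and the "</" replacement is a crude guard that can change the meaning of unusual user code.

```python
def render_snippet(html, css, javascript):
    """Assemble one HTML document from the three submitted form values."""
    # Turn "</" into "<\/" so user code cannot close the <style> or <script>
    # element early. Inside JS and CSS string literals "\/" still means "/".
    safe_css = css.replace('</', '<\\/')
    safe_js = javascript.replace('</', '<\\/')
    return (
        '<!doctype html><html>\n'
        '<head><style>{css}</style></head>\n'
        '<body>{html}<script>{js}</script></body>\n'
        '</html>'
    ).format(css=safe_css, html=html, js=safe_js)
```

The returned string would be sent as the response body for the iframe to load. The iframe should still be sandboxed as in the snippet above, since the document contains arbitrary user code.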
For anything other than the front-end, it's a lot more complicated: you'll have to retrieve the source text the user has written, send a request to a server to run it in the correct language, then send the result back to the client and render it. For anything that a browser can't run natively already, there's no way around having "server infrastructure" to process and run code.

Best way to integrate frontend and backend without ajax or api [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
I want to use a PHP variable in a frontend framework like Vue.js.
What is the best way to integrate a frontend framework with a backend framework?
This is my idea, but I think there is a better way to do this.
<script id="data">
  let $user = <?= json_encode($user) ?>;
</script>
Some content...
<script>
  new Vue({
    data: {
      user: $user
    },
    mounted() {
      $("#data").remove();
    }
  });
</script>
While 'simplicity' is wonderful, 'functionality' is also pretty critical...
Sometimes you can get by with your type of coding (use it for some things that come into the PHP file that are needed to load the page, for example), and what you have may work for this particular situation (and, no, there isn't any way I can see to make it "better"...), though most pages will need more data that is 'fluid', and you will quickly run out of projects where you can write only 'simple' code.
Learn to use ajax (it is pretty simple once you get the hang of it) and copy/paste from your own 'library' (save snippets in a place you remember - you will find MANY things you want to keep... - I keep a 'functions.php' file and over the years it has grown pretty large with great bits-n-pieces.)
Since you are using jQuery already, here's one way to do ajax... (there are others, again, study and find the way you like...)
var url = "https://theURLtoMyAjax.phpPage";
// this is like writing everything in the address bar (again, there are other ways...)
var elements = "theStuff=thatIwantToSend&someMore=somethingElse";
$.post(url, elements, function (data) {
  // do all kinds of wonderful things in here!
  // the 'data' is what is returned from your call, so you can use it to update data on the page, etc.
});
So, as you can see, only a couple lines of code to add Ajax and tons of things you can do once you do it, so learn it - and use it!
Happy coding!
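To show what the receiving end sees: the elements string above arrives as an ordinary URL-encoded form body. The endpoint in this answer is a PHP page; the sketch below uses Python purely for illustration, and parse_form_body is a made-up helper name.

```python
from urllib.parse import parse_qs


def parse_form_body(body):
    """Turn 'a=1&b=2' into {'a': '1', 'b': '2'}.

    A field that is sent more than once keeps all of its values as a list.
    """
    parsed = parse_qs(body, keep_blank_values=True)
    return {name: values[0] if len(values) == 1 else values
            for name, values in parsed.items()}
```

For the string used above, this yields a dict with the keys theStuff and someMore, which is roughly what PHP exposes through $_POST.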

Python backend with JS frontend [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I'm creating a Python-powered web framework that aims to use as little JavaScript as possible. I need something lightweight, like uilang, to pass events to my Python code. I suppose that would be a jQuery solution that somehow pings some kind of observer object(?).
I have found some efforts to create a pure-Python interface, like Remi, but I still have no clue how I should reimplement it in my code.
Let's say I make a class like that:
import bottle


class WebView():
    def __init__(self, template, **callbacks):
        """
        Callbacks are dict, {'object': callback}
        """
        self.template = template
        # Start from an empty string so setCallback can append to it
        self.customJS = ''
        # Iterate over (name, callback) pairs, not just the keys
        for obj, call in callbacks.items():
            self.setCallback(obj, call)

    def __call__(self):
        """
        Here I return template
        Assume {{customJS}} record in template, to be simple
        """
        return bottle.template(self.template, customJS=self.customJS)

    def setCallback(self, obj, call):
        """
        Build custom js to insert
        """
        self.customJS += ('<script ....%s ... %s' % (obj, call))
So, how could I make JS pass an event from, say, a button press back to my callback?
I understand that this question might be on the edge of being too broad, but I'm trying to be as descriptive as possible because I really don't know JS at all.
The thing is, you don't need JavaScript for a Python web framework. You would be fine serving pages with Flask or Django without a single line of JS.
These pages would be pretty static, with a few forms, but they would work perfectly.
Now, if you want more dynamic content and interaction, you'll probably need JS, using XMLHttpRequests to asynchronously call your Python backend on events. But in order to do so properly, you should start by learning JS.
You could probably do it with websockets too, however I don't think it's the best way. You can use the websocket-python library on the Python side, and on the website you just send a websocket message in every button click callback.
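On the Python side, the XMLHttpRequest approach boils down to an endpoint that receives an event and looks up the matching callback. A minimal sketch using only the standard library rather than bottle, under several assumptions: the browser POSTs a JSON body such as {"target": "save-button", "type": "click"}, and CALLBACKS, dispatch_event, EventHandler and serve are hypothetical names that play the role of the callbacks dict in the question.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical registry: element id -> Python callback taking the event dict.
CALLBACKS = {
    'save-button': lambda event: 'saved ({})'.format(event.get('type')),
}


def dispatch_event(callbacks, body):
    """Decode a JSON event and call the callback registered for its target.

    Returns a (status_code, response_dict) pair.
    """
    try:
        event = json.loads(body)
    except ValueError:
        return 400, {'error': 'body is not valid JSON'}
    target = event.get('target') if isinstance(event, dict) else None
    callback = callbacks.get(target) if isinstance(target, str) else None
    if callback is None:
        return 404, {'error': 'no callback registered for this target'}
    return 200, {'result': callback(event)}


class EventHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the client announced
        length = int(self.headers.get('Content-Length', 0))
        status, payload = dispatch_event(CALLBACKS, self.rfile.read(length))
        data = json.dumps(payload).encode('utf-8')
        self.send_response(status)
        self.send_header('Content-Type', 'application/json')
        self.send_header('Content-Length', str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port=8000):
    """Blocks forever; binds to localhost only."""
    ThreadingHTTPServer(('127.0.0.1', port), EventHandler).serve_forever()
```

The JS generated by setCallback would then only need to POST the element id and event type to this endpoint from a click handler and do something with the JSON that comes back.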

How to make python invoke a JavaScript function? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I am rather new at this, so please cut me some slack.
I am working in a web application using google app engine. My server runs using Python, and when the user visits my "/" page, the server gets some HTML source code from a website and places it into a variable called html_code.
Now I would like to pass the html_code variable to the client so he can analyze it using JavaScript. Is this possible? How can I achieve this? Please provide examples if possible.
Thanks in advance, Pedro.
I assume you want to send a piece of HTML to the browser that won't disrupt the existing layout and functioning of the page but will still be part of the DOM (and hence accessible from JS). In that case you may consider a hidden/invisible iframe.
Note: the following is a very quick and dirty way of getting something to the client side.
Based upon https://developers.google.com/appengine/docs/python/gettingstartedpython27/introduction the following makes use of webapp2 and jinja2. "html_code" would be available for usage within a file named index.html. How you render / surface the fetched document is up to you, but as has been previously mentioned an iframe would probably work well in this situation.
import os

from google.appengine.api import urlfetch
import webapp2
import jinja2

JINJA_ENVIRONMENT = jinja2.Environment(
    loader=jinja2.FileSystemLoader(os.path.dirname(__file__)),
    extensions=['jinja2.ext.autoescape'])


class MainHandler(webapp2.RequestHandler):
    def get(self):
        html_code = urlfetch.fetch('http://stackoverflow.com/questions/18936253/how-to-make-python-invoke-a-javascript-function')
        template_values = {
            'html_code': html_code.content
        }
        template = JINJA_ENVIRONMENT.get_template('index.html')
        self.response.write(template.render(template_values))


app = webapp2.WSGIApplication([
    ('/', MainHandler)
], debug=True)
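If the goal is for JavaScript to receive html_code as a plain string (rather than as a rendered iframe), one alternative is to embed it in the page as a JavaScript string literal. This is a different technique from the iframe suggested above, shown only as a sketch; to_js_literal and script_tag_for are hypothetical helper names, and the value is assumed to be text (a byte string would need decoding first).

```python
import json


def to_js_literal(value):
    """Serialize a value as JSON that is safe to place inside a <script> tag.

    JSON string syntax is also valid JavaScript. Escaping <, > and & keeps
    fetched markup such as "</script>" from ending the surrounding element.
    """
    text = json.dumps(value)
    return (text.replace('<', '\\u003c')
                .replace('>', '\\u003e')
                .replace('&', '\\u0026'))


def script_tag_for(html_code):
    """Build a <script> element that defines a global htmlCode variable."""
    return '<script>var htmlCode = {};</script>'.format(to_js_literal(html_code))
```

The result of to_js_literal(html_code) could be passed to the template in place of the raw content. With autoescaping on, the template would have to output it unescaped (Jinja2's safe filter), which is acceptable here only because the literal has already been escaped for the script context.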

console-like interface on a web page using javascript [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 5 years ago.
I really like MySQL's mysql CLI tool, and I don't like phpMyAdmin.
[IMHO]It's a nice thing for a Windows user, but it's not so good when you're used to the console.[/IMHO]
What I want is to build a web page containing an element with console-like input (for example, something like this) that gets input from the user, sends it to a PHP script on the back end, and shows the back-end response.
The back-end script is done (it was the easiest part), but I can't find any JavaScript library that implements console-like input.
I've tried to examine the example I've provided and modify it for my needs, but it's too bloated (because it doesn't use any libraries) and implements one specific thing. I would also like this element to provide some auto-completion for input.
Any ideas for such a JS library?
I think you are looking for this: jQueryTerminal
There is shellinabox, a JavaScript terminal.
EDIT:
There is also the xterm.js library, which is a real terminal emulator.
EDIT 2:
My jQuery Terminal library is useful when you need custom behavior. You can write your code in JS or as backend code, but the backend needs to be simple input -> output. If you want to run interactive backend commands, such as vi or emacs, you need a proper tty; for this, use xterm.js (or implement that in JavaScript). For any other usage jQuery Terminal is better. It has a lot of features, and you don't need to run a process on the server (listening on a port), which is usually forbidden on shared hosting or GitHub Pages.
instead of using console.log() use document.write()
It will write text on the webpage just like console.log would in the console
I've made a console library called Simple Console (I'll probably rename it because simple-console is taken on npm)
It handles command history and such for you, and you can use it to implement any kind of console.
var handleCommand = (command) => {
  var req = new XMLHttpRequest();
  req.addEventListener("load", () => {
    con.log(req.responseText);
    // TODO: use con.error for errors and con.warn for warnings
    // TODO: maybe log a table element to display rows of data
  });
  // TODO: actually pass the command to the server
  req.open("GET", "mysql.php");
  req.send();
};

var con = new SimpleConsole({
  handleCommand,
  placeholder: "Enter MySQL queries",
  storageID: "mysql-console"
});
document.body.appendChild(con.element);
Check out the documentation on GitHub for more information.
hmm firebug console ?
http://getfirebug.com/commandline
