I am creating a React app in Next.js and ran into a problem. I want to click on a word in a displayed PDF and send it to an API call. I've seen a few PDF-to-text solutions before, but they destroy the document's structure. Is this the only option? All the highlight options in PDFs are just based on a position in the canvas.
And another question: how do I properly handle this click event? Let's say we are working with a plain text document. How do I wrap the words so that I can use onClick on them?
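For the plain-text case, one possible approach is to split the text on whitespace, wrap each word in a span, and handle clicks with a single delegated listener on the container. This is only a minimal sketch: the function names, the "word" class, and the data-word attribute are made up for illustration, and it does not address the PDF case.

```javascript
// Escape the characters that are special in HTML text and attribute values.
function escapeHtml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Wrap every run of non-whitespace characters in a <span>, leaving the
// whitespace between words unchanged. The "word" class and "data-word"
// attribute are hypothetical names chosen for this sketch.
function wrapWords(text) {
  return text
    .split(/(\s+)/) // the capture group keeps the whitespace pieces
    .map((piece) =>
      /\S/.test(piece)
        ? `<span class="word" data-word="${escapeHtml(piece)}">${escapeHtml(piece)}</span>`
        : piece
    )
    .join("");
}

// Browser-only part: one delegated click listener on the container
// instead of one listener per word.
function attachWordClicks(container, onWord) {
  container.addEventListener("click", (event) => {
    const span = event.target.closest("span.word");
    if (span) onWord(span.dataset.word);
  });
}
```

In a React component the same split could be mapped to span elements with an onClick prop instead of building an HTML string; the callback passed as onWord is where the API call would go.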
Related
I'm trying to use JavaScript to scrape data from the following page, specifically the "free shipping free returns" text that appears when you hover your mouse over the cart icon:
Whenever I hover over the cart icon, new HTML is added to the DOM.
And when I move my mouse away, the previously added HTML goes away. I want to be able to parse data from the HTML that gets added without having the popup visible. How would I be able to scrape this text data even if someone does not hover over the cart icon? Is there a way to access all the HTML data at once?
You can try to catch the JavaScript function being executed when you hover your mouse over the cart icon. You can do this via the developer tools. Add break points to code execution if the DOM changes (on the parent element in which the new element is added).
Once you get the function, just execute it directly on that page and you'll probably be able to see the popup and extract its contents.
You could also try to simulate a hover as explained in these answers: How do I simulate a mouseover in pure JavaScript that activates the CSS ":hover"?
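A rough sketch of simulating the hover from script follows. It dispatches synthetic mouseenter and mouseover events, which run JavaScript listeners attached to the element; it does not switch on the CSS ":hover" state. The selector "#cart-icon" is a placeholder, not something taken from the page in question.

```javascript
// Dispatch the events that hover handlers usually listen for.
// This triggers JavaScript listeners only, not CSS ":hover" rules.
function simulateHover(target) {
  // Use MouseEvent where it exists (browsers); fall back to plain Event.
  const EventCtor = typeof MouseEvent === "function" ? MouseEvent : Event;
  // mouseenter does not bubble in browsers, mouseover does.
  target.dispatchEvent(new EventCtor("mouseenter", { bubbles: false }));
  target.dispatchEvent(new EventCtor("mouseover", { bubbles: true }));
}

// Browser usage; "#cart-icon" is a hypothetical selector.
if (typeof document !== "undefined") {
  const icon = document.querySelector("#cart-icon");
  if (icon) simulateHover(icon);
}
```

Whether this is enough depends on which event the site's own code listens for, so it may need a different event type.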
Scraping a page for data is not usually recommended, since pages can change over time (especially ones that are not written directly in HTML but generated; these usually have CSS classes like 8h2H1).
If this is not supposed to be a long-term solution, the above answer by #nvkrjn is a good answer. Or you can just check for an element with the id free-shipping-label.
But if this is supposed to be a long-term solution, I would suggest using an API (this site doesn't seem to have one) or querying the database the way the page's JavaScript does. Also, if you're using a non-browser environment (e.g. BeautifulSoup), it may not run the JS required to get the data.
I want to automate copying some content from a website. Here is my manual workflow:
There is a master webpage which contains a set of links.
When I click on one of those links, it opens another page (say, a topic page) with a set of tabs.
I click on one specific tab, which loads a page containing several buttons with the same HTML and CSS applied to them.
The click event of each button calls a JavaScript function, passing four integer parameters.
The function generates a separate popup window with some small content, which I then print as PDF.
The issue is that the website blocks right click and text selection. The popup window contains an image, which I print as PDF by right-clicking on the title bar and selecting print as PDF. When I checked the source of the popup, I found that it uses
"data:image/png;base64,<source for image>"
as value for src of <image>.
Now the big question: can I write a script that runs on either the master page or a topic page, automatically clicks the buttons, and saves those images either directly as PNG or as PDF? I am good at Java, Groovy, Python, C#... I also explored JavaScript a lot, but that was many years ago and I have really lost touch with JS. Can I do this with, say, Greasemonkey or some other way? Any pointers (preferably detailed) would be helpful, or even some small pseudocode that I can paste into the console of a topic page to click all the buttons and save the image from each popup, so that I don't have to click buttons and save images manually. That alone would help a lot, since there are many more buttons per topic page than there are topic pages.
Update
Well, I know this question is not at all specific, so here are my initial hurdles now that I have started trying it out:
Given that I am programmatically calling all the functions behind the onclick events, how can I get hold of the popups in code? That is, how can I reference the popup that is opened by a function call in JS?
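One way to get a reference to the popup, assuming the site's function opens it with window.open (an assumption; the page source would confirm it), is to wrap window.open and keep whatever it returns. The data URL found in the image's src can then be decoded into raw bytes. This is a sketch with made-up helper names, not a tested solution for this particular site.

```javascript
// Split a "data:<mime>;base64,<payload>" URL into its MIME type and raw bytes.
function decodeDataUrl(dataUrl) {
  const match = /^data:([^;,]+);base64,(.*)$/s.exec(dataUrl);
  if (!match) throw new Error("not a base64 data URL");
  const binary = atob(match[2]); // one character per decoded byte
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return { mime: match[1], bytes };
}

// Remember every window the page opens. This only helps if the site's own
// function really calls window.open (an assumption for this sketch).
function capturePopups(win) {
  const opened = [];
  const originalOpen = win.open;
  win.open = function (...args) {
    const popup = originalOpen.apply(win, args);
    opened.push(popup);
    return popup;
  };
  return opened; // filled in as popups are opened
}
```

In the browser console this could be used as `const popups = capturePopups(window);` before calling the button functions. Once a popup has loaded, and provided it is on the same origin, something like `popups[0].document.querySelector("img").src` should give the data URL to pass to decodeDataUrl.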
I need a quick way to get the image URL, just like I would get if I right-clicked on an image and selected "Copy Image URL". I'm thinking AppleScript, though others have mentioned JavaScript.
This needs to be compatible with an Automator workflow and needs to work with Google Chrome, Chromium, and Safari, at a minimum.
More specifics:
I already have an Automator workflow that this will be added to.
The workflow begins with text and images that I have selected on a webpage using the mouse.
The processing of the text is working fine.
I just need an AppleScript, JavaScript, or shell script (which I assume are the only kinds of outside code that can be added to an Automator workflow) that will grab any and all image URLs within the part of the page selected in step 2.
Images are NOT downloaded. Only the image URL is needed.
The basic logic is this:
Does selected input contain images?
If yes,
get URL of image(s)
pass to the next step
else continue
Any help or ideas appreciated!
OS X Services would be your best bet. Those work with text selections and are supported in most apps (e.g. see the Safari>Services submenu). You can also assign them keyboard shortcuts, which is very handy for repetitive tasks.
Basically, you want to get the selection as web content (i.e. HTML data, not plain text) then extract the URLs from that. You can create services in Automator, which includes various actions for working with web content, so I recommend starting there.
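As a rough illustration of the "extract the URLs" step, here is a small JavaScript function that pulls the src values out of img tags in an HTML string. It assumes the selection has already been obtained as HTML text; how the workflow hands that string over is not shown here. The regular expressions are a simplification and will not cope with every kind of markup.

```javascript
// Return the src value of every <img> tag found in an HTML string.
// Regex-based, so it is only a rough sketch for reasonably clean markup.
function extractImageUrls(html) {
  const urls = [];
  const imgTag = /<img\b[^>]*>/gi;
  // Require whitespace before "src" so attributes like data-src are skipped.
  const srcAttr = /\ssrc\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/i;
  for (const tag of html.match(imgTag) ?? []) {
    const m = srcAttr.exec(tag);
    if (m) urls.push(m[1] ?? m[2] ?? m[3]);
  }
  return urls;
}
```

An empty result corresponds to the "else continue" branch of the logic in the question. Relative URLs are returned as they appear in the HTML, so they may still need to be resolved against the page address.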
I'm attempting to implement an image upload tool on my website, similar to how Google has done their image upload search. This would need to be able to have images dropped into it; have urls pasted into it; or have a user upload it from their own computer.
Another related question, how does Google have the selection method change when you click 'Paste URL' or 'Upload file'?
images.google.com for idea source.
Thanks in advance.
I believe Google spent several years developing the algorithms behind searching for images with other images, so if this is a website you are developing for yourself, I'd explore other options. Otherwise, tell your boss it's impractical.
As for the method change, I imagine (without looking too closely) that they overlay the camera icon on the input and, on click, display a <div> or other container element over the input field, which can then contain alternative input methods.
When the user clicks on Paste or Upload, it replaces the content of a sub container with the relevant HTML using jQuery or whichever JavaScript library it is that they use.
I have 2 questions:
How would I make a small overlay open if a mouse hovers over any image on a webpage?
How would I find selected pieces of text on a webpage and make them into a link?
(similar to what Kontera or Vibrant does)
EDIT - Let me explain.
If a mouse hovers over any image on the website with a particular tag, I want a magnified version of the image opened next to it
If I have a word - "skills" inside my database and the webpage on which my Javascript is added has the word "skills" on it, I want it to be highlighted and linked to another page
1:
I know that in CSS you can use a hover keyword to affect the page (but I know little about CSS).
In JavaScript/HTML there are onMouseOver and onMouseOut, so:
<img onMouseOver="javascript:" onMouseOut="javascript:">
2:
You could use a regex to find the text and replace it.
document.body.innerHTML = document.body.innerHTML.replace(new RegExp("skills|other|words", "g"), "<a>$&</a>"); // Note that this also searches inside the HTML tags, so it is better to search specific locations for the text rather than the whole page.
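To avoid matching inside tags, one option is to split the HTML into tag pieces and text pieces and run the replacement on the text pieces only. The following is a sketch under that assumption; the function names and the word-to-URL map are invented for the example, and it still does not skip text that already sits inside a link or a script element.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Turn each listed word into a link, touching text between tags only.
// "links" maps a word to the URL it should point to (hypothetical data).
function linkWords(html, links) {
  const words = Object.keys(links);
  if (words.length === 0) return html;
  const pattern = new RegExp(`\\b(?:${words.map(escapeRegExp).join("|")})\\b`, "g");
  return html
    .split(/(<[^>]*>)/) // the capture group keeps the tags in the result
    .map((piece) =>
      piece.startsWith("<")
        ? piece // a tag: leave attributes and tag names alone
        : piece.replace(pattern, (word) => `<a href="${links[word]}">${word}</a>`)
    )
    .join("");
}
```

For example, `linkWords(container.innerHTML, { skills: "/skills" })` leaves a class="skills" attribute untouched while linking the word in the visible text; the matching is case-sensitive as written.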
jQuery is a good JavaScript library. You can do (1) fairly easily with a host of jQuery plugins.
As for (2), you would probably have to do that server side. Not sure what server technology you use, but when the view is rendered, you would have to have some sort of filter go through all the words on the page and create the links.