java scraping data hidden in html and script [closed] - javascript

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I would like my Java program to display a particular line of a web page. This line is an img tag whose src points to an image on a server. However, neither Jsoup nor an InputStreamReader over URL.openStream() can get the line, because it is generated only when a pin on a map is clicked.
Here is a site:
https://webgispu.wigeogis.com/kunden/omvpetrom/client/map.php?BRAND=OMV&LNG=SI&CTRISO=SVN&MODE=NEXTDOOR&VEHICLE=CAR
The site displays this data for one gas station at a time, in a frame that opens only when you click a pin on the map. What's more, the src link to the image with the gas price changes every two hours. I would like my program to reach those images, but I don't know how. When I use an InputStreamReader to get the HTML of this site, I cannot figure out where to go next.
Here is the line I am looking for (it is an img tag; this is an example, and 'tmp2C31' changes every two hours):
'img src="https://webgispu.wigeogis.com/temp/tmp2C31.tmp.png" alt="" title="" style="margin-bottom:5px;display:block;" class="preisImageClass" '
Please have a look at the link above and suggest which classes and methods I should use in my program. I have already read about OCR, so there is no need to explain how to get data out of the images.
Thanks.

I think what you're looking for is an HTML parser. In my opinion, the best parser is jsoup.
From the site:
jsoup is a Java library for working with real-world HTML. It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jquery-like methods.
With it, you can select exactly what your program should display, straight from the HTML document.
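As a rough illustration of pulling one element out of an HTML string, here is a minimal sketch in JavaScript. It uses regular expressions rather than jsoup or any real parser, so it only handles simple, double-quoted attributes. The function name findPriceImageUrls and the sample string are made up for illustration; the class name and URL come from the question. It can only find the tag if the fetched HTML actually contains it.

```javascript
// Collect the src of every <img> tag that carries the class "preisImageClass".
// Regex-based sketch: assumes double-quoted attributes and no ">" inside them.
function findPriceImageUrls(html) {
  const urls = [];
  const imgTags = html.match(/<img\b[^>]*>/gi) || [];
  for (const tag of imgTags) {
    // Skip images that do not have the wanted class
    if (!/\bclass="[^"]*\bpreisImageClass\b[^"]*"/i.test(tag)) continue;
    const src = tag.match(/\bsrc="([^"]*)"/i);
    if (src) urls.push(src[1]);
  }
  return urls;
}

// Hypothetical input built from the tag quoted in the question
const sample =
  '<div><img src="https://webgispu.wigeogis.com/temp/tmp2C31.tmp.png" alt="" title="" ' +
  'style="margin-bottom:5px;display:block;" class="preisImageClass"></div>' +
  '<img src="logo.png" class="other">';

console.log(findPriceImageUrls(sample));
```

For anything beyond a quick experiment, a real HTML parser is the safer choice.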

This code will save the page's HTML to a file named html.txt:
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.InputStreamReader;
import java.net.URL;

public class TestOMV {
    // Downloads the raw HTML of startSite and writes it to html.txt
    public void htmlToTxt(String startSite) throws Exception {
        URL u = new URL(startSite);
        // try-with-resources closes both streams even if an exception is thrown
        try (BufferedReader br = new BufferedReader(new InputStreamReader(u.openStream()));
             BufferedWriter bw = new BufferedWriter(new FileWriter("html.txt"))) {
            String line;
            while ((line = br.readLine()) != null) {
                bw.write(line);
                bw.newLine();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        TestOMV a = new TestOMV();
        a.htmlToTxt(
            "https://webgispu.wigeogis.com/kunden/omvpetrom/client/map.php?BRAND=OMV&LNG=SI&CTRISO=SVN&MODE=NEXTDOOR&VEHICLE=CAR");
    }
}

Related

Access a page's HTML [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
Is it possible to take a link and access its HTML code through that link? For example I would like to take a link from Amazon and put it within my own HTML code, use JavaScript to getElementsByClassName to get the price from that link and display it back into my HTML code.
It is possible. You could send a GET request to the Amazon page, which gives you the HTML in the response. At that point you have a plain string, which you then need to parse. Last time I used the Node module jsdom to do that.
In more detail:
HTTP is the protocol we use to request data from a server. I've written an explanatory Node.js script:
const https = require('https');
const zlib = require('zlib');
const { JSDOM } = require('jsdom');

// The HTTP GET request
https.get('https://www.amazon.com', (response) => {
  let html = '';
  // Amazon may compress the response so that it is smaller and faster to send;
  // decompress only when the server says the body is gzip-encoded
  const gzipped = response.headers['content-encoding'] === 'gzip';
  const body = gzipped ? response.pipe(zlib.createGunzip()) : response;
  body.setEncoding('utf8');
  // The page is too big to arrive in one piece, so it is delivered in chunks
  body.on('data', (chunk) => {
    html += chunk;
  });
  // When the transmission has finished we can do whatever we want with the HTML
  body.on('end', () => {
    const amazon = new JSDOM(html);
    console.log(amazon.window.document.querySelector('html').innerHTML);
  });
}).on('error', (err) => console.error(err));

Creating .eml file from dynamic web components (React/Vue/Angular) [string of compiled html]

The title may be confusing, so let me expand a little more:
My goal is to have a front-end framework or library, like React, Vue, or Angular, that handles the normal user interface work, such as the user entering data or uploading an image to a server.
I then want the web app to build an HTML email. I'm thinking the best way is to create a text file of HTML, but saved as .eml instead of .txt so that it's easy to open in mail clients and send.
My question:
- How can I create a string of dynamic HTML that is then saved as a file for the user to download? "Dynamic" means that sometimes there may be just 1 or 2 items and sometimes 15; the number varies, and a loop runs over however many objects there are so that the appropriate amount of HTML is created.
I'm asking because we all know how to display a view in React and the others, but how can we get a programmatic pseudo-view in the logic code? That is, how do we get a string representation of the HTML that the view outputs, and then create an .eml file holding that HTML so the user can download it?
Is this even possible with today's popular frameworks?
====
EDIT
Just an idea I had from research, for generating the file it seems a Blob might be best.
var file = new Blob([html_string], {type: 'text/plain'})
So, for React, the code would be something like the following (thanks to Chris's answer to this SO question):
class MyApp extends React.Component {
  _downloadTxtFile = () => {
    var element = document.createElement("a");
    var file = new Blob([document.getElementById('myInput').value], {type: 'text/plain'});
    element.href = URL.createObjectURL(file);
    element.download = "myFile.txt";
    element.click();
  }

  render() {
    return (
      <div>
        <input id="myInput" />
        <button onClick={this._downloadTxtFile}>Download txt</button>
      </div>
    );
  }
}

ReactDOM.render(<MyApp />, document.getElementById("myApp"));
That leaves the question of how to create the string of HTML. Maybe ES6 template literals with embedded expressions? But that wouldn't exactly be JSX, so I'm not sure how to put a for loop in there. I'll continue researching, unless someone knows how to put all this together.
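One possible way to build such a string with template literals is to let map() and join() play the role of the for loop. The sketch below is framework-independent. The helper names (escapeHtml, buildHtml, buildEml) and the item fields (name, price) are hypothetical, and the .eml wrapper is a bare minimum of headers that assumes an ASCII subject line; a real message would likely need more.

```javascript
// Escape text so user-supplied values cannot break the generated markup
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build the HTML body: one <li> per item, however many items there are
function buildHtml(items) {
  const rows = items
    .map((item) => `<li>${escapeHtml(item.name)}: ${escapeHtml(item.price)}</li>`)
    .join('\n');
  return `<html><body>\n<ul>\n${rows}\n</ul>\n</body></html>`;
}

// Wrap the HTML in minimal message headers; a blank line separates headers and body
function buildEml(subject, html) {
  return [
    `Subject: ${subject}`,
    'MIME-Version: 1.0',
    'Content-Type: text/html; charset=UTF-8',
    '',
    html,
  ].join('\r\n');
}

const eml = buildEml('Your items', buildHtml([
  { name: 'Pen', price: '1.50' },
  { name: 'Ink <black>', price: '4.00' },
]));
console.log(eml);
```

In the browser, the resulting string could be handed to the Blob approach from the question, for example new Blob([eml], {type: 'message/rfc822'}) with a download name ending in .eml.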

Download a file from a webpage based on the file saved date [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I am still very new to web development and need some suggestions, please. I am creating a page with a date selector, and I have a folder in which one file is saved every day. What I am trying to do is this: the user selects the date of the file they want and clicks download, and the file saved on that date is downloaded. Can someone please give me an idea of how to get this to work? I have tried some things with JavaScript and PHP but could not get a working solution.
This code should do the job.
First we scan the given path and list all files, then we check each file's creation date to find the target file.
<?php
$Path = './'; // Set the path of the files here
$TargetDate = '2016-08-11'; // We look for the first file with this date
$TargetFile = null; // The result is stored here
// Scan the folder. Note: on Unix, 'ctime' is the inode change time rather
// than a true creation time, so 'mtime' may be the better field there.
foreach (glob(rtrim($Path, '/') . '/*') as $File) {
    $Stat = stat($File);
    if (date("Y-m-d", $Stat['ctime']) == $TargetDate) {
        $TargetFile = $File;
        break;
    }
}
// Report the result
if (is_null($TargetFile)) {
    echo 'No file found';
} else {
    echo $TargetFile;
}

Best practice for localization and globalization of strings and labels [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
I'm a member of a team with more than 20 developers. Each developer works on a separate module (something near 10 modules). In each module we might have at least 50 CRUD forms, which means that we currently have near 500 add buttons, save buttons, edit buttons, etc.
However, because we want to globalize our application, we need to be able to translate the texts in it. For example, the word add should become ajouter everywhere for French users.
What we've done so far is keep, for each view in the UI or presentation layer, a dictionary of key/value pairs of translations. While rendering the view, we translate the required texts and strings using this dictionary. However, with this approach we've ended up with something near 500 add entries in 500 dictionaries, which means we've breached the DRY principle.
On the other hand, if we centralize common strings, like putting add in one place, and ask developers to use it everywhere, we encounter the problem of not being sure whether a string is already defined in the centralized dictionary or not.
Another option might be to have no translation dictionary and use online translation services like Google Translate, Bing Translator, etc.
Another problem we've encountered is that some developers, under the stress of delivering the project on time, can't remember the translation keys. For example, for the text of the add button, one developer has used add while another has used new, etc.
What is the best practice, or the most well-known method, for globalization and localization of an application's string resources?
As far as I know, there's a good library called localeplanet for localization and internationalization in JavaScript. Furthermore, I think it's native and has no dependencies on other libraries (e.g. jQuery).
Here's the library's website: http://www.localeplanet.com/
Also look at this article by Mozilla, where you can find a very good method and algorithms for client-side translation: http://blog.mozilla.org/webdev/2011/10/06/i18njs-internationalize-your-javascript-with-a-little-help-from-json-and-the-server/
What all those articles and libraries have in common is that they use an i18n class and a get method (sometimes also defining a shorter function name such as _) for converting a key to its value. In my explanation, the key is the string you want to translate and the value is the translated string.
Then, you just need a JSON document to store the keys and values.
For example:
var _ = document.webL10n.get;
alert(_('test'));
And here the JSON:
{ "test": "blah blah" }
I believe using current popular libraries solutions is a good approach.
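To make the key/value lookup concrete, here is a minimal, library-free sketch of such a get function. The name createTranslator and the dictionary contents are hypothetical, and this is not the implementation of localeplanet or webL10n. Returning the key itself when no translation exists is one common fallback, chosen here as an assumption.

```javascript
// Hypothetical dictionary, as it might be parsed from a JSON document
const dictionary = JSON.parse('{ "test": "blah blah", "add": "ajouter" }');

// Create a lookup function that returns the translation for a key,
// or the key itself when the dictionary has no entry for it
function createTranslator(dict) {
  return function translate(key) {
    return Object.prototype.hasOwnProperty.call(dict, key) ? dict[key] : key;
  };
}

const _ = createTranslator(dictionary);
console.log(_('test'));
console.log(_('add'));
console.log(_('missing key'));
```

Keeping every shared key in one dictionary like this is what lets a common word such as add be defined once rather than once per view.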
When you’re faced with a problem to solve (and frankly, who isn’t
these days?), the basic strategy usually taken by we computer people
is called “divide and conquer.” It goes like this:
Conceptualize the specific problem as a set of smaller sub-problems.
Solve each smaller problem.
Combine the results into a solution of the specific problem.
But “divide and conquer” is not the only possible strategy. We can also take a more generalist approach:
Conceptualize the specific problem as a special case of a more general problem.
Somehow solve the general problem.
Adapt the solution of the general problem to the specific problem.
- Eric Lippert
I believe many solutions already exist for this problem in server-side languages such as ASP.Net/C#.
I've outlined some of the major aspects of the problem
Issue: We need to load data only for the desired language
Solution: For this purpose we save data to a separate files for each language
ex. res.de.js, res.fr.js, res.en.js, res.js(for default language)
Issue: Resource files for each page should be separated so we only get the data we need
Solution: We can use some tools that already exist like
https://github.com/rgrove/lazyload
Issue: We need a key/value pair structure to save our data
Solution: I suggest a JavaScript object instead of a string/string pair.
We can then benefit from IntelliSense in an IDE
Issue: General members should be stored in a public file that all pages can access
Solution: For this purpose I create a folder in the root of the web application called Global_Resources, and in each subfolder a folder named 'Local_Resources' that stores the subfolder's shared file
Issue: Members defined by a subsystem/subfolder/module should override the Global_Resources members within their scope
Solution: I considered a file for each
Application Structure
root/
    Global_Resources/
        default.js
        default.fr.js
    UserManagementSystem/
        Local_Resources/
            default.js
            default.fr.js
            createUser.js
        Login.htm
        CreateUser.htm
The corresponding code for the files:
Global_Resources/default.js
var res = {
Create : "Create",
Update : "Save Changes",
Delete : "Delete"
};
Global_Resources/default.fr.js
var res = {
Create : "créer",
Update : "Enregistrer les modifications",
Delete : "effacer"
};
The resource file for the desired language should be selected from Global_Resources and loaded on the page. It should be the first file loaded on every page.
UserManagementSystem/Local_Resources/default.js
res.Name = "Name";
res.UserName = "UserName";
res.Password = "Password";
UserManagementSystem/Local_Resources/default.fr.js
res.Name = "nom";
res.UserName = "Nom d'utilisateur";
res.Password = "Mot de passe";
UserManagementSystem/Local_Resources/createUser.js
// Override res.Create on Global_Resources/default.js
res.Create = "Create User";
UserManagementSystem/Local_Resources/createUser.fr.js
// Override Global_Resources/default.fr.js
res.Create = "Créer un utilisateur";
manager.js file (this file should be loaded last)
// NOTE: fileExists(), loadScript(), parentFoldersOfCurrentPage and
// currentPageFolderPath are placeholders that the application must provide.
var lang = "fr";
var globalResourcePath = "Global_Resources";
var resourceFiles = [];

// Return the language-specific file if it exists, otherwise the default one
function pickResource(basePath) {
    var localized = basePath + "." + lang + ".js";
    if (fileExists(localized)) return localized;
    var fallback = basePath + ".js";
    if (fileExists(fallback)) return fallback;
    throw new Error("File Not Found: " + fallback);
}

// Global resources come first
resourceFiles.push(pickResource(globalResourcePath + "/default"));

// Then the Local_Resources of every parent folder of the current page
parentFoldersOfCurrentPage.forEach(function (folder) {
    resourceFiles.push(pickResource(folder + "/Local_Resources/default"));
});

// Finally the resources of the current page itself
var pageNameWithoutExtension = "SomePage";
resourceFiles.push(pickResource(currentPageFolderPath + pageNameWithoutExtension));

// Load in order, so that later files override earlier ones
resourceFiles.forEach(function (file) { loadScript(file); });
Hope it helps :)
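The override order described in this answer (global first, then folder level, then page level, with a fallback to the default language) can also be sketched without any file loading. In the sketch below the resource files are replaced by in-memory objects; the bundles map and the function names pick and resolveResources are hypothetical stand-ins, and the fallback works per file rather than per key, which is an assumption.

```javascript
// Hypothetical in-memory stand-ins for the resource files shown above
const bundles = {
  'Global_Resources/default': { Create: 'Create', Update: 'Save Changes', Delete: 'Delete' },
  'Global_Resources/default.fr': { Create: 'créer', Update: 'Enregistrer les modifications', Delete: 'effacer' },
  'UserManagementSystem/Local_Resources/default': { Name: 'Name' },
  'UserManagementSystem/Local_Resources/default.fr': { Name: 'nom' },
  'UserManagementSystem/Local_Resources/createUser': { Create: 'Create User' },
  'UserManagementSystem/Local_Resources/createUser.fr': { Create: 'Créer un utilisateur' },
};

// Prefer the language-specific bundle, fall back to the default one
function pick(base, lang) {
  return bundles[`${base}.${lang}`] || bundles[base] || {};
}

// Merge bundles from most general to most specific; later entries win
function resolveResources(bases, lang) {
  return Object.assign({}, ...bases.map((base) => pick(base, lang)));
}

const order = [
  'Global_Resources/default',
  'UserManagementSystem/Local_Resources/default',
  'UserManagementSystem/Local_Resources/createUser',
];
const res = resolveResources(order, 'fr');
console.log(res.Create, '|', res.Delete, '|', res.Name);
```

Requesting a language with no bundles, such as 'de', falls through to the default files, so the page still gets the page-level override of Create.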
jQuery.i18n is a lightweight jQuery plugin for enabling internationalization in your web pages. It allows you to package custom resource strings in ‘.properties’ files, just like in Java Resource Bundles. It loads and parses resource bundles (.properties) based on provided language or language reported by browser.
To learn more about it, take a look at "How to internationalize your pages using jQuery?"

Tag images in the image itself? HOW-TO

How can I tag regions of an image, in the image itself, on a web page?
I know Taggify, but are there other options?
Orkut also does this to tag people's faces. How is it done?
Does anyone know of a public framework that is able to do it?
See a sample below from Taggify:
I know this isn't JavaScript, but C# (via WPF, in .NET 3.0 and later) has an API for doing this. The System.Windows.Media.Imaging namespace has a class called BitmapMetadata which can be used to read and write image metadata (which is stored in the image itself). Here is a method for retrieving the metadata for an image given a file path:
// Requires: using System.IO; using System.Linq; using System.Windows.Media.Imaging;
public static BitmapMetadata GetMetaData(string path)
{
    using (Stream s = new System.IO.FileStream(path, FileMode.Open, FileAccess.ReadWrite, FileShare.ReadWrite))
    {
        var decoder = BitmapDecoder.Create(s, BitmapCreateOptions.None, BitmapCacheOption.OnDemand);
        // The first frame carries the metadata for single-frame formats
        var frame = decoder.Frames.FirstOrDefault();
        if (frame != null)
        {
            return frame.Metadata as BitmapMetadata;
        }
        return null;
    }
}
The BitmapMetadata class has a property for tags as well as other common image metadata. To save metadata back to the image, you can use the InPlaceBitmapMetadataWriter Class.
There's a map tag in HTML that can be used in conjunction with JavaScript to 'tag' different parts of an image.
You can see the details here.
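As a hedged sketch of that idea, the following JavaScript turns a list of rectangular tags into an img element with a matching map of area elements. The function names (escapeAttr, buildTaggedImage), the tag fields (label, x, y, width, height) and the sample values are all hypothetical. Rectangular areas use coords in the order left, top, right, bottom.

```javascript
// Escape a value for use inside a double-quoted HTML attribute
function escapeAttr(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;');
}

// Build an <img> plus a <map> whose <area> elements mark the tagged regions
function buildTaggedImage(src, mapName, tags) {
  const areas = tags.map((t) => {
    const coords = [t.x, t.y, t.x + t.width, t.y + t.height].join(',');
    const label = escapeAttr(t.label);
    return `  <area shape="rect" coords="${coords}" href="#" alt="${label}" title="${label}">`;
  });
  return [
    `<img src="${escapeAttr(src)}" usemap="#${mapName}" alt="">`,
    `<map name="${mapName}">`,
    ...areas,
    '</map>',
  ].join('\n');
}

const markup = buildTaggedImage('photo.jpg', 'people', [
  { label: 'Alice', x: 10, y: 20, width: 50, height: 70 },
  { label: 'Bob', x: 120, y: 25, width: 45, height: 65 },
]);
console.log(markup);
```

The title attribute gives a hover tooltip in most browsers; a script could additionally listen for mouse events on the area elements to draw a visible frame or label.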
I will revive this question and help a bit. So far the only thing I have found is http://www.sanisoft.com/downloads/imgnotes-0.2/example.html , a jQuery tagging implementation. If anyone knows of another way, please tell us.
;)
You can check out Image.InfoCards (IIC) at http://www.imageinfocards.com . With the IIC meta-data utilities you can add meta-data in very user-friendly groups called "cards".
The supplied utilities (including a Java applet) allow you to tag GIFs, JPEGs and PNGs without changing them visually.
IIC is presently proprietary but there are plans to make it an open protocol in Q1 2009.
The difference between IIC and others like IPTC/DIG35/DublinCore/etc is that it is much more consumer-centric and doesn't require a CS degree to understand and use it...
