How to grab info from DC Metro site to create XML file? - javascript

I feel like this may be a trivial problem for most people but I'm new at doing all this, so any help would be much appreciated!
So I need to get the coordinates of all the DC metro stops from the website. I did some searching and what I figured out is that the site with all the stations provides you with the option to click on the name of the station, which then shows a map of where the station is located. When you click on the map, you are directed to a google maps page where the coordinates are shown in the search box. I also noticed that the URL contains the coordinates as well.
From the research I did, it looks like it's possible to parse through the source code of the original DC metro website that holds all of the stations, go through each link to the stations, and then parse through the source code of each station's individual website to grab the coordinates and the name of the station. Once that is retrieved, it can be stored into an XML file. I wanted to make the XML look something like:
<stations>
<station>
<name>Ballston-MU</name>
<lat>38.882071</lat>
<long>-77.111845</long>
</station>
<station>
<name>Addison Road</name>
<lat>38.886713</lat>
<long>-76.893592</long>
</station>
...
</stations>
I don't really have a preference for which language to use. I'm not even sure which one would be easier. I've used JavaScript and jQuery to do the rest of the project. But since I only need the XML file, I don't think it'll matter what language I use to create it.
Sorry I know this is super long!!!

Just in case anyone was wondering, I did what user thg435 said and used the DC metro's own API. Just registered, got an API key, and used the URL they gave to get the XML file with all the info needed! :)
This was the URL (gotta insert your own custom API key to make it work):
http://api.wmata.com/StationPrediction.svc/GetPrediction/A10?api_key=YOUR_API_KEY
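Once the station names and coordinates are available (from the API or anywhere else), writing them into the XML layout from the question takes only a few lines. This is a minimal sketch in Python using the standard library; the function name `stations_to_xml` and the hard-coded sample list are illustrative assumptions, not something from the WMATA API.

```python
import xml.etree.ElementTree as ET


def stations_to_xml(stations):
    """Build the <stations> document from (name, lat, long) string tuples."""
    root = ET.Element("stations")
    for name, lat, lon in stations:
        station = ET.SubElement(root, "station")
        ET.SubElement(station, "name").text = name
        ET.SubElement(station, "lat").text = lat
        ET.SubElement(station, "long").text = lon
    # encoding="unicode" returns a str instead of bytes
    return ET.tostring(root, encoding="unicode")


# Sample values copied from the question; real data would come from the API response.
sample = [
    ("Ballston-MU", "38.882071", "-77.111845"),
    ("Addison Road", "38.886713", "-76.893592"),
]
xml_text = stations_to_xml(sample)
print(xml_text)
```

To save the result, write `xml_text` to a file such as `stations.xml`.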

python javascript scrape automatically

Python novice here.
I am trying to scrape company information from the Dutch Transparency Benchmark website for a number of different companies, but I'm at a loss as to how to make it work. I've tried
pd.read_html("https://www.transparantiebenchmark.nl/en/scores-0#/survey/4/company/793")
and
requests.get("https://www.transparantiebenchmark.nl/en/scores-0#/survey/4/company/793")
and then working from there. However, it seems like the data is dynamically generated/queried, and thus not actually contained in the html source code these methods retrieve.
If I go to my browser's developer tools and copy the "final" html as shown there in the "Elements" tab, the whole information is in there. But as I'd like to repeat the process for several of the companies, is there any way to automate it?
Alternatively, if there's no direct way to obtain the info from the HTML, there might be a second possibility. The site allows you to download the information as an Excel file for each individual company. Is it possible to somehow automatically "click" the download button and save the file somewhere? Then I might be able to loop over all the companies I need.
Please excuse if this question is poorly worded, and thank you very much in advance
Tusen takk!
Edit: I have also tried it using BeautifulSoup, as @pmkroeker suggested. But I'm not really sure how to make it work so that it first runs all the JavaScript and the site actually contains the data.
I think you will want to use a library to render the page. This answer seems to apply to Python. I will also copy the code from that answer for completeness.
You can pip install selenium from a command line, and then run something like:
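from urllib.request import urlopen  # Python 3; urllib2 only exists in Python 2
from selenium import webdriver
url = 'http://www.google.com'
file_name = 'C:/Users/Desktop/test.txt'
# Download the raw HTML and save it to disk
conn = urlopen(url)
data = conn.read()  # bytes
conn.close()
file = open(file_name, 'wb')  # binary mode, because data is bytes
file.write(data)
file.close()
# Open the saved page in a real browser and read the resulting HTML
browser = webdriver.Firefox()
browser.get('file:///' + file_name)
html = browser.page_source
browser.quit()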
I think you could probably skip the file write and just pass it to that browser.get call, but I'll leave that to you to find out.
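For what it's worth, here is a sketch of that shortcut: hand the URL straight to the browser and read `page_source` afterwards. The function name `render_page` and its `make_browser` parameter are my own illustration, not part of Selenium. The parameter exists only so that a different driver (or a stand-in for testing) can be swapped in, and Selenium is imported lazily so the function can be defined without it installed.

```python
def render_page(url, make_browser=None):
    """Return the HTML of `url` as seen by a real browser after it loads the page."""
    if make_browser is None:
        # Imported lazily; requires `pip install selenium` and a Firefox driver.
        from selenium import webdriver
        make_browser = webdriver.Firefox
    browser = make_browser()
    try:
        browser.get(url)            # the browser loads the page and runs its JavaScript
        return browser.page_source  # HTML of the current DOM, not the raw download
    finally:
        browser.quit()              # always close the browser, even on errors


# Usage (needs Selenium and Firefox installed):
# html = render_page("https://www.transparantiebenchmark.nl/en/scores-0#/survey/4/company/793")
```

If the data arrives after the initial load, `page_source` may still be read too early, so a wait before reading it may be needed.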
The other thing you can do is look for the AJAX calls in your browser's developer tools. In Chrome, for example, open the three-dot menu -> More tools -> Developer tools (or press F12), then look at the Network tab. There will be various requests. Click one, open the Preview tab, and go through the requests until you find a response that looks like JSON data. You are effectively looking for the API calls the site uses to get the data it renders. Once you find one, click the Headers tab and you will see a Request URL.
For example, https://sa-tb.nl/api/widget/chart/survey/4/sector/38 has lots of data.
The problem here is that it may or may not be repeatable (the API may change, IDs may change). You may have a similar problem with plain HTML scraping, as the HTML could change just as easily.
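Once you have found such a request URL, you can call it directly from Python and skip the browser entirely. The sketch below uses only the standard library. The helper names `sector_url` and `fetch_json` are made up for illustration, and it is an assumption that other survey and sector IDs follow the same URL pattern as the one example above.

```python
import json
from urllib.request import urlopen

# Base of the one endpoint found in the Network tab above.
API_BASE = "https://sa-tb.nl/api/widget/chart/survey"


def sector_url(survey_id, sector_id):
    """Build a request URL, assuming other IDs follow the same pattern."""
    return "{}/{}/sector/{}".format(API_BASE, survey_id, sector_id)


def fetch_json(url, timeout=10):
    """Download `url` and parse the response body as JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a real network request):
# data = fetch_json(sector_url(4, 38))
```

Looping over several IDs is then just a `for` loop around `fetch_json`.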

How do I create environment variables to protect my Google Maps API Key(or any other secret value) for my website?

I am learning to code my own website using Bootstrap and have easily placed a map on my page using a Google map API-key and script from Google Developers:
<script async defer
src="https://maps.googleapis.com/maps/api/js?key=YOUR_API_KEY&callback=initMap">
</script>
Ideally I would have something like this (I have tried it):
Html: <script src="map-value-pair.php"></script><script async defer src="$MapURL"></script>
PHP: <?php $MapURL="maps.googleapis.com/..." ?>
Obviously this doesn't hide the URL, since it is still printed to the HTML.
I am not convinced anyone actually tries to hide this key for a basic purpose like mine, because the HTTP referrer restriction on Google Developers is sufficient. But after trying many different approaches to obfuscate the key, I have decided I would like to learn to create environment variables to hide values. I think it could be useful, and I would like to extend my server-side knowledge. Google's best practices also suggest to "store them in environment variables or in files outside of your application's source tree".
I have found a related stack: What steps should I take to protect my Google Maps API Key?
In that stack a related link was given to hackernoon.com (limited links in first post) My hosting service uses cPanel and provides this apache environment variables tutorial: documentation.cpanel.net...
I am having problems finding material to start myself off. Right now I have two potential action plans:
enable ssh on cpanel>follow cPanel tutorial noted above>figure out how to access variable in html
or
enable ssh on cpanel>install node.js somehow>follow something like this on twilio>figure out how to access variable in html
Any help or suggestions appreciated. Best case scenario some edits to my action plan with a couple links to check out. I just need some sort of confidence to move forward with one of these plans.
P.S. Woo first post!
EDIT: Minor cleanup of question. Added ideally script.
Anything you output as HTML can be scraped by anyone visiting your website. You can't access variables etc in HTML as it is the final output, generated server side once it has processed all your variables.
As you are using the Javascript library then there is no way to hide your key. Javascript is executed on the client's browser, so you have to pass them your API key for it to work.
If all you need are static maps then you can generate them server side and return them in your HTML without ever revealing your key to the end user.
Here's a PHP example:
// $userLocation and $apiKey are placeholders: set them to your own values first
$mapimg = "https://maps.googleapis.com/maps/api/staticmap?center=" . urlencode($userLocation) . "&size=640x320&key=" . $apiKey; // Include the user location in your request URL
$imagedata = file_get_contents($mapimg); // Retrieve the image server side
$src = base64_encode($imagedata); // Encode the image server side
$imagemap = '<img src="data:image/png;base64,'.$src.'">'; // Create a variable to insert the image into your page (the API returns PNG by default)
Then within your PHP script to output the HTML you just include $imagemap wherever you want the map to appear.
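Since the question was also about environment variables, here is a rough sketch of how the two ideas fit together on the server side: the key is read from the environment rather than from source code, and it is only used to build the request URL that the server itself fetches. The sketch is in Python rather than PHP. The variable name `MAPS_API_KEY` and the function `static_map_url` are assumptions for illustration.

```python
import os
from urllib.parse import urlencode


def static_map_url(center, size="640x320", env_var="MAPS_API_KEY"):
    """Build a Static Maps request URL using a key stored in the environment."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError("Set the {} environment variable first".format(env_var))
    # urlencode escapes spaces, commas, etc. in the location string
    query = urlencode({"center": center, "size": size, "key": key})
    return "https://maps.googleapis.com/maps/api/staticmap?" + query


# Usage: the server would fetch this URL and send only the image to the visitor.
# url = static_map_url("Washington, DC")
```

The key never appears in the HTML as long as the server downloads the image and serves the bytes (or a data URI) itself, as in the PHP example above.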
This stack has a similar question, see geocodezip's answer: How do I securely use Google API Keys
(I looked but could not find this answer before posting the original question).
Basically, what I gather is that the API key must be visible for Google's JavaScript to work, despite Google's somewhat contradictory advice here: https://support.google.com/googleapi/answer/6310037
If you know of some reading material for me or others about creating environment variables or their usefulness, please share it. I am still interested in that!

Google script to automatically add predetermined file paths to Google form response

I have little to no coding experience, save for a bit of HTML, CSS, and AppleScript. However, I'm not sure where else to go for insight on this question. Currently I am creating a Google Form for colleagues to submit work orders that will be recorded on a Google Spreadsheet.
But, I need to figure out how to write a script that will automatically add a specific file path next to files uploaded to the form.
For example, if a user uploaded a file named "061217_VideoAuthor_Slide1.mp4", then instead of the Google Drive link (which is what this file upload option populates automatically) I need that file name to be recorded, and its cell needs to be changed from a Google Drive link to a specific file path
(Ex. /user/editor1/template/061217_VideoAuthor_Slide1.mp4)
Is there anyone that can point me to a sort-of specific tutorial or some documentation that could help me figure this out?
You can run logic every time the form is submitted that looks at the values in the sheet and augments them as needed.
Here's a walkthrough someone else put together: onFormSubmit
Your goal could be achieved by running a script every time a form is submitted, on a schedule, or on demand.
You will need to use JavaScript to extract the file ID from the file URL, then call getFileById(id) from the Google Drive Service to get the file itself, and getName() on that file to get its name.
On the other hand, instead of replacing the file URL, I suggest adding the file path in another column, just in case you need the link later.
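The string handling involved is small. In the actual Apps Script it would be written in JavaScript, but the logic is sketched here in Python for illustration. It assumes the form stores links shaped like `https://drive.google.com/open?id=FILE_ID` (with a fallback for `/d/FILE_ID/` style links), and the helper names are made up. Looking up the file name from the ID is left out, since that needs the Drive service.

```python
import posixpath
import re
from urllib.parse import urlparse, parse_qs


def extract_file_id(drive_url):
    """Pull the file ID out of a Drive link (assumed URL shapes, see above)."""
    parsed = urlparse(drive_url)
    ids = parse_qs(parsed.query).get("id")
    if ids:
        return ids[0]
    match = re.search(r"/d/([A-Za-z0-9_-]+)", parsed.path)
    return match.group(1) if match else None


def target_path(file_name, base_dir="/user/editor1/template"):
    """Join the fixed folder from the question with the uploaded file's name."""
    return posixpath.join(base_dir, file_name)


print(target_path("061217_VideoAuthor_Slide1.mp4"))
```

In Apps Script, the ID from the first step is what gets passed to getFileById(id), and the result of getName() is what goes into the second step.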

Google Docs Spreadsheet to JSON

I've seen numerous articles on this but they seem outdated, for instance none of the Google Docs Spreadsheet urls has key parameter. I read this as well:
JSON data from google spreadsheet
Then I read this to access data
https://developers.google.com/gdata/samples/spreadsheet_sample
My spreadsheet exists at:
https://docs.google.com/spreadsheets/d/1SKI5773_68HiSve1fsz7fr4gotjFWHB7KBuVsOlLz6I/edit#gid=0
I've tried using this code, I think I have a problem with the key or syntax, please guide to fix.
<script src="http://spreadsheets.google.com/feeds/feed/1SKI5773_68HiSve1fsz7fr4gotjFWHB7KBuVsOlLz6I/worksheet/public/basic?alt=json-in-script&callback=importGSS"></script>
<script type="text/javascript">
function importGSS(json) {
console.log('finished');
}
</script>
The src attribute in your script tag is an invalid link (and you can see this for yourself by viewing your link directly in a browser).
The feed/key/worksheet section of the URL has the right key but the wrong feed and worksheet.
In the URL, replace "feed" with either "cells" (separate value for each cell) or "list" (separate value for each row).
At the same time, replace "worksheet" with "od6" (indicating the leftmost, or default, sheet - see this blog post for accessing other sheets).
If you view this new URL directly in a browser, you can see that it returns a meaningful value.
Your final script tag might look like this:
<script src="https://spreadsheets.google.com/feeds/list/1SKI5773_68HiSve1fsz7fr4gotjFWHB7KBuVsOlLz6I/od6/public/values?alt=json-in-script&callback=importGSS"></script>
For more info, you can see an example on the Google Developers site
APISpark PaaS has a feature to create and deploy a custom JSON API based on a GSpreadsheet. That might help and give you more control on the web API (CORS support, authentication, custom domain and so on).
See the tutorial here: https://apispark.com/docs/tutorials/google-spreadsheet
You may want to consider an alternative to this way of requesting your sheet data, because this method is deprecated.
That said, you can still use another feed format; the alternatives are listed at:
https://spreadsheets.google.com/feeds/worksheets/your-spreadsheet-id/private/full
In that result you can see which export formats are available. Would a CSV or alt=json format help you?
Another potential solution here is to use this https://gist.github.com/ronaldsmartin/47f5239ab1834c47088e to wrap around your existing spreadsheet.
Add the id and sheet URL parameters to the URL below.
https://script.google.com/macros/s/AKfycbzGvKKUIaqsMuCj7-A2YRhR-f7GZjl4kSxSN1YyLkS01_CfiyE/exec
Eg: your id is your sheet id which is
1SKI5773_68HiSve1fsz7fr4gotjFWHB7KBuVsOlLz6I
and your sheet which is
Sheet1
In your case you can actually see your data (it's actually working) here as json at
https://script.google.com/macros/s/AKfycbzGvKKUIaqsMuCj7-A2YRhR-f7GZjl4kSxSN1YyLkS01_CfiyE/exec?id=1SKI5773_68HiSve1fsz7fr4gotjFWHB7KBuVsOlLz6I&sheet=Sheet1
To be safe, you should deploy the code sheetAsJson.gs in the github gist above as your own in your Google Drive.
You have plenty of possible answers above. For those that come anew, if you're looking for a more controlled JSON generator, check out this gist:
JSONPuller
It takes in a Spreadsheet and returns an array of objects, using the line that you choose as the headers (defaults to whichever line is frozen).
Cheers,
The easiest way for me was to export spreadsheet to CSV
File -> Download -> Comma-separated values
Then just convert the CSV to JSON with a free online service.
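If you would rather not paste your data into an online service, the CSV-to-JSON step is also easy to do locally. Here is a minimal Python sketch using only the standard library. The function name `csv_to_json` is illustrative, and it assumes the first row of the CSV holds the column headers.

```python
import csv
import io
import json


def csv_to_json(csv_text):
    """Convert CSV text (first row = headers) to a JSON array of objects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [dict(row) for row in reader]  # one object per spreadsheet row
    return json.dumps(rows, indent=2)


# Usage with a file downloaded via File -> Download -> Comma-separated values:
# with open("sheet.csv", newline="", encoding="utf-8") as f:
#     print(csv_to_json(f.read()))
```

All values come out as strings, so numbers need converting afterwards if that matters.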

Sending google visualization chart to email

Can we send Google Visualization chart to an email client?
I tried to copy and paste the JavaScript code while sending the email, but it was removed on the fly by Gmail.
Thanks and Regards.
Disclaimer: I'm Image-Charts founder.
6 years later! Google Image Charts has been deprecated since 2012, and as an indie hacker, I don't want to rewrite an image generation backend from scratch each time I start a new SaaS just to be able to send charts in email...
That's why I've built Image-charts 👍 and added gif animation on top of it 🚀(chart animations in emails are awesome!!), no more server-side chart rendering pain, no scaling issues, it's blazing fast, 1 URL = 1 image chart.
https://image-charts.com/chart
?cht=bvg
&chd=t:10,15,25,30,40,80
&chs=700x300
&chxt=x,y
&chxl=0:|March '18|April '18|May '18|June '18|July '18|August '18|
&chdl=Visitors (in thousands)
&chf=b0,lg,90,05B142,1,0CE858,0.2
&chxs=1N**K
&chtt=Visitors report
&chma=0,0,10,10
&chl=||||+33% !|x2 !
I ran into this problem as well. In order to send a chart in email, you need to render it as an image because email clients strip Javascript.
If you're using Google Charts, you'll have to run the Javascript and then export it using getImageURI. To automate this, you need a headless renderer like puppeteer.
The solution to the problem is open source. I wrapped chart rendering in a library and web server: https://github.com/typpo/quickchart. This web service handles the rendering details, all you do is call the API with your data.
For example, define your chart in the query parameters:
https://quickchart.io/chart?width=500&height=300&c={type:'bar',data:{labels:['January','February','March','April','May'],datasets:[{label:'Dogs',data:[50,60,70,180,190]},{label:'Cats',data:[100,200,300,400,500]}]}}
The above URL renders this image:
Hope this helps!
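When building such a URL from code, the chart definition should be URL-encoded rather than pasted in raw. Here is a rough sketch in Python, assuming the `c` parameter also accepts the chart definition as plain JSON (the URL above uses JavaScript object syntax); the function name `quickchart_url` is illustrative.

```python
import json
from urllib.parse import urlencode


def quickchart_url(config, width=500, height=300):
    """Build a chart image URL with the chart config safely URL-encoded."""
    query = urlencode({"width": width, "height": height, "c": json.dumps(config)})
    return "https://quickchart.io/chart?" + query


# Same data as the example URL above.
config = {
    "type": "bar",
    "data": {
        "labels": ["January", "February", "March", "April", "May"],
        "datasets": [
            {"label": "Dogs", "data": [50, 60, 70, 180, 190]},
            {"label": "Cats", "data": [100, 200, 300, 400, 500]},
        ],
    },
}
url = quickchart_url(config)
print(url)
```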
Google charts could be published in 2 ways:
as an Image. Edit Chart-> Publish Chart-> Format : image. An image link is generated. This image link could be either used in any html page or could be embedded in any email.
as an Interactive Chart. Edit Chart-> Publish Chart-> Format : Interactive Chart. In this case javascript code has to be inserted. This could only be published in html pages. This could not be attached in email body as most email servers/clients do not process javascript code (AFAIK).
3.5 years later... :)
My team at Ramen recently spun out some internal functionality into a standalone product that does just this: https://ChartURL.com
You can generate charts on the fly using an "Encrypted URL" scheme, or you can send us huge amounts of data and return a Short URL that'll resolve to an image.
It was built on top of C3js.org so there's a ton of flexibility in what you can generate.
These URLs can be used in web apps & mobile apps, but the original intent was email charts so I hope this helps!
There is very little JS support in email clients, so you will have to use an image chart. But you could wrap the chart in a link to the SVG version.
Doesn't Google Charts have an API where you can just build a URL and it returns an image - no Javascript needed? It certainly used to. If you can use that, then:
a) Just put the URL in the email and let the user's email client fetch it
b) Fetch the image with cURL and attach it to the email.
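Option (a) can be sketched with Python's standard `email` package: the message carries an HTML part whose `<img>` tag points at the chart URL, plus a plain-text fallback. The function name `chart_email` and the addresses in the usage comment are placeholders. Sending the message (for example with `smtplib`) is left out.

```python
import html
from email.message import EmailMessage


def chart_email(subject, sender, recipient, image_url):
    """Build an email whose HTML body embeds a chart by URL (no JavaScript)."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient
    # Plain-text fallback for clients that do not render HTML
    msg.set_content("View the chart here: " + image_url)
    # Escape the URL so characters like & and " are safe inside the attribute
    img_tag = '<img src="{}" alt="chart">'.format(html.escape(image_url, quote=True))
    msg.add_alternative("<html><body>{}</body></html>".format(img_tag), subtype="html")
    return msg


# Usage (placeholder addresses):
# msg = chart_email("Visitors report", "me@example.com", "you@example.com",
#                   "https://quickchart.io/chart?width=500&height=300&c=...")
```

Many email clients block remote images until the reader allows them, which is one argument for option (b), attaching the image instead.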
