The oEmbed spec requires a site to link to its oEmbed endpoint and encode the current URL in that link. This is quite annoying for static/CDN-served websites, which now have to encode and return the request URL in the HTML response.
I'm wondering whether it is known that major oEmbed consumers (e.g. Slack, Facebook, or oEmbed client libraries) add this URL themselves when requesting, so that it may be reasonable in practice to break the spec and serve the link statically. Any examples of a static implementation would be insightful.
Dynamic:
Link: <http://flickr.com/services/oembed?url=http%3A%2F%2Fflickr.com%2Fphotos%2Fbees%2F2362225867%2F&format=json>; rel="alternate"; type="application/json+oembed"; title="Bacon Lollys oEmbed Profile"
Static:
Link: <http://flickr.com/services/oembed?format=json>; rel="alternate"; type="application/json+oembed"; title="Bacon Lollys oEmbed Profile"
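In other words, for the static form to work, a consumer would effectively have to fill in the url parameter itself after discovery. Here is a rough sketch of that hypothetical client-side logic (an assumption about what a lenient consumer could do, not how Slack, Facebook, or any particular library is documented to behave):

// Hypothetical consumer-side sketch: discover the oEmbed endpoint from the page's
// <link> tag and fill in the url parameter if the href does not already carry one.
async function resolveOembed(pageUrl) {
  const html = await (await fetch(pageUrl)).text();
  const match = html.match(/<link[^>]+type="application\/json\+oembed"[^>]+href="([^"]+)"/i);
  if (!match) return null;

  const endpoint = new URL(match[1].replace(/&amp;/g, '&'), pageUrl);
  if (!endpoint.searchParams.has('url')) {
    // The "static" variant omits url=, so the consumer would have to add it itself.
    endpoint.searchParams.set('url', pageUrl);
  }
  return (await fetch(endpoint)).json();
}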
I implemented discovery using the html tag alternative (as opposed to the Link header).
Our frontend services are deployed by having an NGINX container serve up static build files. I wanted to add oEmbed discovery to all responses, so I used ndk_http_module and ngx_http_set_misc_module to create an escaped URI variable and inject it as a link tag at the end of the head element. After playing around with it for a few days on platforms like Slack and Teams, including both the url query parameter and the format parameter hasn't caused any conflicts so far. The config looks like this:
server {
    listen 80;

    # Percent-encode the requested host + URI (requires ndk_http_module and ngx_http_set_misc_module).
    set_escape_uri $escaped_uri $http_host$request_uri;

    # Inject the discovery <link> just before </head>; ${OEMBED_URL} is assumed to be
    # substituted (e.g. by envsubst) before NGINX loads the config.
    sub_filter '</head>' '<link rel="alternate" type="application/json+oembed" href="${OEMBED_URL}?format=json&url=https%3A%2F%2F$escaped_uri" title="Bacon Lollys oEmbed Profile"></head>';

    ...
}
Related
I have a file available through a URL (it needs authorization). I created a mailto: link and would like to attach this file in the mail. How can I do that?
Something like "mailto:toto#gmail.fr&attachment=site.com/file.pdf"
mailto: doesn't support attachments, but there are various ways you could achieve a similar effect:
Link to the file in a message body
You mentioned that the link needs authorisation; you could generate temporary URLs that last 30 minutes (or more/less) and allow downloads, so users can then attach the file themselves (a rough sketch of this follows below)
Send the email yourself
Your service could send an email to your user (or on behalf of your user) with the attachment using something like Amazon SES, or Mailchimp, etc...
Render your PDF into HTML
It seems you are planning on attaching PDF files. Depending on the complexity of the PDF files, you could attempt to convert the PDF into email-friendly HTML using one of many tools, such as pdf2htmlEX or Pandoc.
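As a rough sketch of the first option (linking to a temporarily authorised file from the message body), where getTemporaryUrl stands in for whatever your own backend exposes:

// Minimal sketch of the "link in the body" option. getTemporaryUrl is a hypothetical
// helper that returns a short-lived, pre-signed download URL; mailto itself only
// carries the subject and body text.
function buildMailtoLink(recipient, temporaryUrl) {
  var subject = encodeURIComponent('Your file');
  var body = encodeURIComponent(
    'You can download the file here (the link expires in 30 minutes):\n' + temporaryUrl
  );
  return 'mailto:' + recipient + '?subject=' + subject + '&body=' + body;
}

// Usage (hypothetical): someAnchor.href = buildMailtoLink('someone@example.com', getTemporaryUrl('file.pdf'));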
If you're hoping for a universal solution, you can't. The mailto protocol described in RFC 2368 tells us:
The creator of a mailto URL cannot expect the resolver of a URL to
understand more than the "subject" and "body" headers.
Even though other headers might be used and understood by some mail clients, this isn't a universally compatible solution. Unless you tell your users to open these links with a specific mail client that you know supports more headers (like a hypothetical attachment header), you should consider this not to be doable.
My application uses AngularJS for the frontend and .NET for the backend.
In my application I have a list view. On clicking a list item, it fetches a pre-rendered HTML page from S3.
I am using angular state.
app.js
...
state('staticpage', {
    url: "/staticpage",
    templateUrl: function () {
        return 'http://xxxxxxx.cloudfront.net/staticpage/staticpage1.html';
    },
    controller: 'StaticPageCtrl',
    title: 'Static Page'
})
StaticPage1.html
<div>
    Hello static world 1!
</div>
How do I do SEO here?
Do I really need to do HTML snapshots using PhantomJS or something similar?
Yes, PhantomJS would do the trick, or you can use prerender.io; with that service you can also just use their open-source renderer and run your own server.
Another way is to use the _escaped_fragment_ meta tag.
I hope this helps, if you have any questions add comments and I will update my answer.
Did you know that Google renders HTML pages and executes the JavaScript code in the page, and does not need any pre-rendering anymore?
https://webmasters.googleblog.com/2014/05/understanding-web-pages-better.html
And take a look at these:
http://searchengineland.com/tested-googlebot-crawls-javascript-heres-learned-220157
http://wijmo.com/blog/how-to-improve-seo-in-angularjs-applications/
My project's front-end is also built on top of Angular, and I decided to solve the SEO issue like this:
I've created an endpoint for all search engines (SE) where all the requests go with the _escaped_fragment_ parameter;
I parse the HTTP request for the _escaped_fragment_ GET parameter;
I make a cURL request with the parsed category and article parameters and get the article content;
Then I render the simplest (and SEO-friendly) template for the SE with the article content, or throw a 404 Not Found exception if the article does not exist;
In total: I do not need to prerender any HTML pages or use prerender.io, I have a nice user interface for my users, and search engines index my pages very well (a rough sketch of this flow follows below).
P.S. Do not forget to generate sitemap.xml and include there all URLs (with _escaped_fragment_) which you want to be indexed.
P.P.S. Unfortunately my project's back-end is built on top of PHP, so I cannot show you a suitable example, but if you want more explanation do not hesitate to ask.
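For illustration only, here is a rough Node/Express equivalent of that flow (the original back-end is PHP, so treat this purely as a sketch; fetchArticle and renderSeoTemplate are hypothetical stand-ins for your own data lookup and crawler-friendly template):

// Rough Node/Express sketch of serving simple HTML to crawlers that send _escaped_fragment_.
const express = require('express');
const app = express();

async function fetchArticle(path) {            // hypothetical data lookup
  return { title: 'Example article', body: 'Example content' };
}
function renderSeoTemplate(article) {           // simplest possible SEO template
  return '<html><head><title>' + article.title + '</title></head><body>' +
         article.body + '</body></html>';
}

app.get('*', async (req, res, next) => {
  const fragment = req.query._escaped_fragment_;
  if (fragment === undefined) return next();    // normal visitors get the Angular app

  const article = await fetchArticle(fragment); // e.g. /category/article-slug
  if (!article) return res.status(404).send('Not Found');
  res.send(renderSeoTemplate(article));         // simple, crawler-friendly HTML for the bot
});

app.listen(3000);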
Firstly, you cannot assume anything.
Google does say that their bots can understand JavaScript applications very well, but that is not true for all scenarios.
Start by using the "Fetch as Google" feature in Webmaster Tools for your link and see if the page is rendered properly. If yes, then you need not read further.
In case you see just your skeleton HTML, this is because the Google bot assumes the page load is complete before it actually completes. To fix this you need an environment where you can recognize that a request is from a bot and return it a prerendered page.
To create such an environment, you need to make some changes in your code.
Follow the instructions in Setting up SEO with AngularJS and PhantomJS,
or alternatively just write code in any server-side language like PHP to generate prerendered HTML pages of your application.
(PhantomJS is not mandatory.)
Create a redirect rule in your server config which detects the bot and redirects it to prerendered plain HTML files (the only thing you need to make sure of is that the content of the page you return matches the actual page content, otherwise bots might not consider the content authentic).
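A hedged sketch of such a rule, written as Express middleware rather than server config (the user-agent list and snapshot directory layout are illustrative assumptions):

// Illustrative Express middleware: known crawlers get a prerendered HTML snapshot
// from disk, everyone else falls through to the normal Angular app.
const path = require('path');

const BOT_UA = /googlebot|bingbot|yandex|baiduspider|facebookexternalhit|twitterbot/i;

function serveSnapshotsToBots(snapshotDir) {
  return function (req, res, next) {
    if (!BOT_UA.test(req.headers['user-agent'] || '')) return next();

    // Map e.g. /staticpage -> <snapshotDir>/staticpage.html; the snapshot content
    // must mirror what the real page renders, or crawlers may distrust it.
    const name = req.path.slice(1).replace(/\//g, '_') || 'index';
    res.sendFile(path.join(snapshotDir, name + '.html'), function (err) {
      if (err) next();
    });
  };
}

// Usage (hypothetical): app.use(serveSnapshotsToBots(__dirname + '/snapshots'));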
It is to be noted that you also need to consider how you will make entries in sitemap.xml dynamically when you add pages to your application in the future.
In case you are not looking for such overhead and you are lacking time, you can certainly go with a managed service like Prerender.
Eventually bots will mature and understand your application, and you will be able to say goodbye to your SEO proxy infrastructure. This is just for the time being.
At this point in time, the question really becomes somewhat subjective, at least with Google -- it really depends on your specific site, like how quickly your pages render, how much content renders after the DOM loads, etc. Certainly (as #birju-shaw mentions) if Google can't read your page at all, you know you need to do something else.
Google has officially deprecated the _escaped_fragment_ approach as of October 14, 2015, but that doesn't mean you might not want to still pre-render.
YMMV on trusting Google (and other crawlers) for reasons stated here, so the only definitive way to find out which is best in your scenario would be to test it out. There could be other reasons you may want to pre-render, but since you mentioned SEO specifically, I'll leave it at that.
If you have a server-side templating system (php, python, etc.) you can implement a solution like prerender.io
If you only have AngularJS files hosted on a static server (e.g. Amazon S3) => have a look at the answer in the following post: AngularJS SEO for static webpages (S3 CDN)
Yes, you need to prerender the page for the bots; prerender.io can be used, and your page must have the fragment meta tag:
<meta name="fragment" content="!">
I'm trying to use the ms-seo package for Meteor, but I'm not understanding how it works.
It's supposed to add meta tags to your page for crawlers and social media (Google, Facebook, Twitter, etc.).
To see it working according to the docs all I should have to do is
meteor add manuelschoebel:ms-seo
and then add some defaults
Meteor.startup(function () {
    if (Meteor.isClient) {
        return SEO.config({
            title: 'Manuel Schoebel - MVP Development',
            meta: {
                'description': 'Manuel Schoebel develops Minimal Viable Producs (MVP) for Startups',
            },
            og: {
                'image': 'http://manuel-schoebel.com/images/authors/manuel-schoebel.jpg',
            }
        });
    }
});
which I did but that code only executes on the client (browser). How is that helpful to search engines?
So I tested it:
curl http://localhost:3000
The result has no meta tags.
If, in the browser, I go to http://localhost:3000 and inspect the elements in the debugger, I see the tags, but if I check the page source I don't.
I don't understand how client-side added tags have anything to do with SEO. I thought Google, Facebook, and Twitter, when scanning your page for meta tags, basically just do a single request, effectively the same as curl http://localhost:3000.
So how does this package actually do anything useful? I feel stupid; with 27k users it must work, but I don't understand how. Does it require the spiderable package to get static pages generated?
You are correct. You need to use something like the spiderable package or prerender.io to get this to work. This package will add tags, but like any Meteor page, it's rendered on the client.
Try this with curl to see the result when using spiderable:
curl http://localhost:3000/?_escaped_fragment_=
Google will now render the JS itself, so for Google to index your page correctly you don't need to use spiderable/prerender.io, but for other search engines I believe you still do.
An alternate answer:
Don't use spiderable, as it uses PhantomJS which is rather resource intensive when bots crawl your site.
Many Meteor devs are using Prerender these days, check it out.
If you still have some problems with social share buttons or the package, try reading this: https://webdevelopment7636.wordpress.com/2017/02/15/social-share-with-meteor/ . It was the only way I got mine to work, and you don't have to worry about PhantomJS or spiderable to make it work fine.
It is a complete tutorial using meteorhacks:ssr and meteorhacks:picker. You have to create a crawler filter on the server side and a route that will be called by it when it is activated. The route dynamically feeds the template and the data into an HTML file in the "private" folder and renders that HTML for the crawler. The template in the private folder is the one that gets the meta tags and the title tag.
This is the file that will be on the private folder
I can't put the other links with the code here, but if you need any more help, go to the first link and see if the tutorial helps.
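For reference, a rough server-side sketch of the crawler filter and route described above, using the meteorhacks:picker and meteorhacks:ssr APIs (the route, template name, user-agent list, and data lookup are illustrative assumptions, not the tutorial's exact code):

// Server-side sketch: crawlers get server-rendered HTML built from a private/ template.
if (Meteor.isServer) {
  // Compile the crawler-facing template kept in the private/ folder.
  SSR.compileTemplate('seoPage', Assets.getText('seo-page.html'));

  // Only requests that look like crawlers reach this sub-router.
  var crawlerRouter = Picker.filter(function (req) {
    return /googlebot|facebookexternalhit|twitterbot|linkedinbot/i
      .test(req.headers['user-agent'] || '');
  });

  crawlerRouter.route('/:slug', function (params, req, res) {
    var doc = { title: params.slug, description: 'Example description' }; // assumed lookup
    res.writeHead(200, { 'Content-Type': 'text/html' });
    res.end(SSR.render('seoPage', doc));
  });
}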
I am hosting a widget on another domain than the site in which I am embedding the widget.
The dashboard.js loads fine, but the HTML template request fails with:
XMLHttpRequest cannot load http://192.168.2.72:8081/widgets/templates/dashboard.html. Origin http://192.168.2.72:8080 is not allowed by Access-Control-Allow-Origin.
The url to the template is correct, so I can only assume this is a cross domain error. In the widget, the template is referred to like:
templatePath: dojo.moduleUrl("monitor/dashboard", "../templates/dashboard.html"),
This all works when it's a local widget. Is there any way to get Dojo to load the HTML template in a better way?
This is how I have defined the loader:
<script data-dojo-config="async: 0, dojoBlankHtmlUrl: '/blank.html', parseOnLoad:true,
packages: [
{name: 'monitor', location: 'http://192.168.2.72:8081' + '/widgets'},
]"
src="/media/scripts/dojo/dojo/dojo.js"></script>
Well, there are several ways to solve it.
The first solution is a server-side solution using CORS (Cross-Origin Resource Sharing). If you can set the CORS headers like:
Access-Control-Allow-Origin: *
Your browser will detect this and will allow the XMLHttpRequest.
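Purely as a sketch, if the templates on port 8081 were served by a small Node/Express process, the header could be added like this (an NGINX add_header directive or any other server-side mechanism achieves the same thing):

// Minimal sketch: let the template host answer cross-origin requests.
const express = require('express');
const app = express();

app.use(function (req, res, next) {
  // '*' allows any origin; in practice you would usually restrict this,
  // e.g. to 'http://192.168.2.72:8080'.
  res.setHeader('Access-Control-Allow-Origin', '*');
  next();
});

app.use('/widgets', express.static('widgets')); // serves templates/dashboard.html etc.
app.listen(8081);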
While this solution is probably the best, you could also use some alternatives, for example by using JSONP (for example with dojo/request/script). However, using JSONP also means that you cannot use a plain HTML template, but you have to convert your HTML template to a JavaScript string.
If you then use the templateString property, you can pass the template as a string instead of specifying the path.
The templateString property also lets you build the template ahead of time: if you can produce your HTML template as a JavaScript string, you can automate that, for example by using Grunt and the grunt-html-convert task.
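A minimal sketch of a widget that inlines its markup via templateString instead of templatePath, so no cross-domain XHR is needed (the markup here is illustrative):

// Dojo AMD widget using an inlined template string.
define([
  'dojo/_base/declare',
  'dijit/_WidgetBase',
  'dijit/_TemplatedMixin'
], function (declare, _WidgetBase, _TemplatedMixin) {
  return declare([_WidgetBase, _TemplatedMixin], {
    templateString: '<div class="dashboard">Hello from an inlined template</div>'
  });
});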
You might be able to do a similar thing with the Dojo build system by using depsScan. This build transform should scan modules and convert legacy code to AMD and it should also look for things like dojo.cache(), dojo.moduleUrl() and templatePath and convert it to templateString.
Look at the documentation for more info.
The last (and also pretty common) solution is to use a reverse proxy. If you have to host your HTML templates on a different domain, you can still define a reverse proxy in your HTTP server and redirect certain calls to the other domain, for example (Apache 2):
ProxyPass /templates http://other-domain.com
ProxyPassReverse /templates http://other-domain.com
This allows you to go to /templates/my-template.html, which will be redirected to http://other-domain.com/my-template.html.
I have some text that includes URLs to GitHub Gists. I'd like to look for those URLs and put the Gist inline in the content client-side. Some things I've tried:
A direct lookup to GitHub's OEmbed API.
For https://gist.github.com/733951, this means I do a JSON-P lookup to
https://github.com/api/oembed?format=json&url=https%3A%2F%2Fgist.github.com%2F733951,
extract the html property of the object, and add that to my page. The problem
here is that GitHub's OEmbed API only returns the first three lines of the Gist.
Using the jQuery-embedly plugin.
Calling
jQuery('a.something').embedly({allowscripts: true})
works, but Embedly strips formatting from the Gist. Wrapping it in a <pre> tag doesn't help because there are no line-breaks.
Using GitHub's .js version of the gist.
https://gist.github.com/733951.js uses document.write, so I don't have any control over where in the page it ends up when I load it dynamically. (If I could write it into the HTML source it would show up in the right place, but this is all being done client-side.)
I've been inspired by client side gist embedding and built a script.js hack library just for that (I also use it to remove the embedded link style and use my own style that fits better on my page) ...
It's more generic than just embedding gists and pasties; actually I'm using it to dynamically load some third-party widgets (Google Maps, Twitter posts) ...
https://github.com/kares/script.js
Here's the embedding example:
https://github.com/kares/script.js/blob/master/examples/gistsAndPasties.html
UPDATE: the Gist API has since added JSONP support; a jQuery sample:
var printGist = function (gist) {
    console.log(gist.repo, ' (' + gist.description + ') :');
    console.log(gist.div);
};

$.ajax({
    url: 'https://gist.github.com/1641153.json',
    dataType: 'jsonp',
    success: printGist
});
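To actually place the gist on the page, you can insert the returned div markup and load the accompanying stylesheet when the response provides one (#gist-container is an assumed placeholder element in your own page):

// Sketch: embed the gist markup and, if present, its stylesheet.
var embedGist = function (gist) {
    if (gist.stylesheet) {
        $('head').append($('<link>', { rel: 'stylesheet', href: gist.stylesheet }));
    }
    $('#gist-container').html(gist.div);
};

$.ajax({
    url: 'https://gist.github.com/1641153.json',
    dataType: 'jsonp',
    success: embedGist
});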
I just started a project called UrlSpoiler (on github). It will help you embed gists dynamically. It's hosted on Heroku on the free/shared platform so you can play with it, but I'd recommend copying the code you need into your own app for production use.