Client-side Website Localization Using URL Path - javascript

I'm working on localizing a website that I recently built: https://xmllint.com
The project is rather small, and I mostly use it to teach myself JavaScript along with Webpack and other web-related technologies/frameworks.
The website is 100% browser-based and does not have a lot of content. For that reason, I decided to go with a placeholder-based approach to translate the content itself.
The replacement of the placeholders with the 'real' content happens via JavaScript at the bottom of the HTML. Ultimately I want to have the content ready before the page renders, so that search engines can index the new pages nicely.
What I want to achieve is that the page itself detects the language code (e.g., https://xmllint.com/es/ for Spanish) from the URL and then performs the translation based on that value.
What I'm struggling with is how to handle the language part of the URL in the web page itself, as the corresponding directory does not actually exist on the server.
So far, I have tried redirecting all HTTP 404s to the index.html file itself (on the hosting side), as suggested for SPAs.
This leads to problems loading resources, as the relative paths now resolve against the language-code part of the URL.
Two ideas came to mind.
Improve the current Webpack build so that I only deliver a single file including all assets. That way I would not have problems with relative paths and I should be good.
Should I introduce a routing framework like Vue?
What I'm not asking for:
How to parse the URL itself.
URL parameters; for SEO reasons I don't want to use those.
Hacky ideas or workarounds. I have no time pressure and want to know how this is done best.
Any help/ideas are greatly appreciated.

Given that you have no time pressure, I'd personally recommend using a JavaScript framework, or more specifically, Vue.js. Since you already mentioned it, I assume you have basic knowledge of it.
I see various ways to benefit from choosing this path:
The actual problem you're facing will no longer be an issue. The application will handle all the routing, so all you have to do is return the index.html and you're good to go (see the sketch below).
The developer experience (build process, hot reload, deployment, ...) will dramatically improve your daily work.
Your bundle size will very likely shrink.
You're prepared for future growth of your application.
Best of all: you're challenging yourself by using a technology you probably don't have much experience with. Speaking for myself, that should be reason enough. :-)
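Here is a minimal sketch of what the routing could look like with Vue 3 and Vue Router. The component names and file layout are assumptions for illustration, not taken from your project:

```javascript
// main.js - a minimal sketch; component names and paths are illustrative.
import { createApp } from 'vue';
import { createRouter, createWebHistory } from 'vue-router';
import App from './App.vue';
import Home from './components/Home.vue';

const router = createRouter({
  history: createWebHistory(),
  routes: [
    // The optional :lang segment matches e.g. /es/ even though no such
    // directory exists on the server (the host rewrites 404s to index.html).
    { path: '/:lang([a-z]{2})?', component: Home },
  ],
});

router.beforeEach((to, from, next) => {
  // Fall back to English when no language segment is present.
  const lang = to.params.lang || 'en';
  document.documentElement.lang = lang;
  next();
});

createApp(App).use(router).mount('#app');
```

Note that the asset paths in the built index.html should be absolute (e.g. /main.js rather than main.js), so they still resolve when the page is served under /es/.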
Happy coding!

Related

How to download and query html pages where JS processing is necessary?

I often compile informal datasets by running some kind of XPath/XQuery on publicly available web pages. Usually the structure of the HTML is regular enough that useful information can be extracted easily.
But today I've come across tunefind.com. This website makes extensive use of the ReactJS framework, so most of the structure of the page is built client-side by JavaScript. The pages, when initially downloaded, are very basic and missing a lot of information. They are populated by a script that uses a hopelessly messy blob of JSON data at the bottom of the page.
The only way I can think of to deal with this would be to use some kind of GUI-based web engine and just not display the GUI part. But that is a preposterous amount of work for these casual little CLI tools that I use to gather information.
Is there any way to perform the JavaScript preprocessing without dealing with unnecessary graphics?
Even if you were to process the page without the graphics, the React JavaScript will be geared towards running in a browser context: at the very least it will expect a functioning DOM to exist, and the application itself may also require clicks/transitions to happen before you can see some data.
Your best bet, then, is to load the page in a browser. To keep this simple, there are plenty of good browser automation frameworks designed for exactly this.
I've used a fair few libraries over the years, including PhantomJS, and recently I've gotten the most mileage out of Nightmare.js.
It runs an Electron browser for you and gives you a useful promisified JavaScript API to control it with, covering common browser actions such as clicking, following links, etc.
You can configure it to hide the browser, which is useful for making a CLI tool; however, it's a bit of a pseudo-headless mode and will still require a windowing/graphical context (e.g., an X server).
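As a rough illustration, a scraping script along these lines could look like the following; the URL and selector are placeholders, not taken from the question:

```javascript
// A minimal sketch using Nightmare; the URL and selector are placeholders.
const Nightmare = require('nightmare');

const nightmare = Nightmare({ show: false }); // keep the Electron window hidden

nightmare
  .goto('https://example.com/some-react-page')
  .wait('.content-rendered-by-react') // wait until client-side rendering is done
  .evaluate(() => document.documentElement.outerHTML)
  .end()
  .then((html) => process.stdout.write(html))
  .catch((err) => console.error(err));
```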
Hope this helps.
PS: If you're at all used to Docker, it's not hard to make this just a running container!

Thymeleaf or Angular4 with Springboot

I've mostly worked with the Spring Boot framework, with JSP covering the things I need for the website part. Now I've got a project to do that mostly, if not entirely, revolves around the website, and I have a couple of questions.
Just to be clear first: I don't have experience with either Thymeleaf or Angular, so whichever I pick will be a first for me (I think the Thymeleaf syntax would be easier for me to handle).
OK, so the main goal in my mind is not to re-render the whole page every time I load data from the backend. I figure I would have header/content/footer parts, where every time I click something, only the content part changes. I would also like to show a loader while the content part is changing. This web application will need to be secure for users who register.
I've searched the web to investigate both frameworks, but I cannot seem to find the right answer that would let me continue with my development.
I do not mind creating separate REST services in later development if some other platform needs to hook up to the service, should I decide to go with Thymeleaf. What do you guys think, in which direction should I go: Thymeleaf or Angular? Any help and/or discussion would be much appreciated.
I am sorry if this seems like a general question, but I just need some basic guidelines to start with. Cheers!
I think transitioning to Thymeleaf is probably going to be the easiest for you, but Angular is a great choice as well; from there it is up to you. If you would rather write mostly expressions similar to JSTL expressions, but in Spring's SpEL language, then use Thymeleaf; if you would rather write JavaScript, then use Angular. It's just a matter of preference for what you are doing.
Fragmenting (header, footer, body, etc.) is native to both frameworks; it just depends on which one you want to use. Whichever you go with, loading specific sections without re-rendering the others is going to require AJAX, and you will need to feel comfortable with how the template framework works.
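The client-side part of that AJAX approach is the same in either case; a minimal sketch (the URL and element ids are assumptions for illustration) might look like this:

```javascript
// A minimal sketch: fetch just the content fragment and swap it in,
// showing a loader while the request is in flight. The URL and element
// ids are illustrative assumptions.
async function loadContent(url) {
  const loader = document.getElementById('loader');
  const content = document.getElementById('content');
  loader.style.display = 'block'; // show the loader
  try {
    const response = await fetch(url); // server returns only the fragment
    content.innerHTML = await response.text();
  } finally {
    loader.style.display = 'none'; // hide the loader again
  }
}

// Example usage: replace the content area without re-rendering the page.
// loadContent('/fragments/dashboard');
```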
I would suggest reading up on both of them to figure out which one you feel more comfortable with.
Angular
Thymeleaf
Both of them have great documentation for beginners, and Baeldung and Mkyong have good walkthroughs for Thymeleaf. Angular's documentation I found good enough on its own.
The loader you can do with simple CSS and JS. There are tons of demos out there on how to build full-screen loaders and how to turn them on or off with JS or CSS. IHateTomatoes has a good article about building a full-screen loader with a no-JS fallback option, which should give you a good starting point.
Your point about it needing to be secure is a whole other monster. I would look into Spring Security. It's relatively straightforward, especially when using Spring Boot. If you want, it can manage the user's session and help prevent session hijacking, add CSRF tokens to prevent cross-site request forgery, control permissions on URLs, and so on down the line. It all just depends.
Either way, don't randomly stab at security; you will end up with something that feels secure but is not, which leaves you and your users in a very bad spot. Again, Baeldung has a great walkthrough of the user registration and login process that can help get you started with Spring Security and show how to tie everything together.
Pretty high level answer, but hopefully gave you some good starting points and some resources.
Build apps decoupling the frontend from the backend.
Always build apps following the API-first approach.
The API-first approach involves setting up the foundation of your app, which is the application programming interface.
For me, the differences between Thymeleaf and Angular are:
Using Thymeleaf: you don't need to create RESTful/web-service endpoints on the frontend side, because the frontend is rendered on the server and can call the backend directly.
Using Angular: besides the RESTful/web-service endpoints on your backend side, you have to build endpoints on the frontend side as well (sketched below), because you don't want to expose direct access from JavaScript code to the backend.
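To illustrate the Angular case, here is a minimal sketch of such a frontend-side endpoint that just forwards to the real backend. Express, the URLs, and the token are all assumptions for illustration, and the global fetch assumes Node 18+:

```javascript
// A minimal sketch of a frontend-side proxy endpoint (Node 18+, Express).
// The backend URL and token are illustrative assumptions.
const express = require('express');
const app = express();

app.get('/api/users', async (req, res) => {
  // The browser calls this endpoint; credentials for the real backend
  // stay on the server and are never exposed to client-side JavaScript.
  const response = await fetch('http://backend.internal/users', {
    headers: { Authorization: `Bearer ${process.env.BACKEND_TOKEN}` },
  });
  res.json(await response.json());
});

app.listen(3000);
```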
Hope this helps, and happy coding!

Techniques to implement javascript enabled websites that also run without it

I'm looking for design patterns, frameworks, or techniques to implement a web page that fulfills these requirements:
The web pages are rendered statically at first load, without needing any JavaScript support.
If enabled, JavaScript should be used to load new portions of the website when the user tries to follow a link, and change the URL accordingly using the HTML5 history api or equivalent.
If not available, new pages should be loaded statically by following the links.
I shouldn't write the code twice, obviously. This would lead to inconsistencies.
I've been thinking on this problem for a while but I haven't come with an answer.
Edit: MVC sounds like a good start to solve this problem, but I definitely want to avoid writing the views code twice.
This requires backend support, so the technology of your backend matters.
That said, this sounds an awful lot like the Rails library Turbolinks:
https://github.com/rails/turbolinks/
On the front end alone, supporting both JS and non-JS is known as graceful degradation, and there are lots of articles on it all over the web.
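On the client side, the core of the pattern can be sketched roughly like this; the selectors are assumptions for illustration. Without JavaScript, the links are ordinary navigations; with it, the handler takes over:

```javascript
// A minimal sketch: enhance normal links so that, with JavaScript enabled,
// only part of the page is replaced and the URL is updated via the
// History API. The selectors are illustrative assumptions.
document.addEventListener('click', async (event) => {
  const link = event.target.closest('a[data-enhance]');
  if (!link) return; // non-enhanced links behave normally
  event.preventDefault();

  const response = await fetch(link.href);
  const html = await response.text();
  const doc = new DOMParser().parseFromString(html, 'text/html');

  // Swap in only the main content of the fetched page.
  document.querySelector('main').replaceWith(doc.querySelector('main'));
  history.pushState({}, '', link.href); // keep the address bar in sync
});

// Simplest possible back/forward handling: reload the stored URL.
window.addEventListener('popstate', () => location.reload());
```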
If I understand your problem, I think these links will help:
http://www.implico.pl/lang-en/ajaxgetcontent_dynamic_ajax_website,7.html
http://www.asual.com/jquery/address/
There is also a GitHub project (browserstate/history.js) that handles this, but I can't post more links; I need reputation. :p

Using node.js to serve content from a Backbone.js app to search crawlers for SEO

Either my google-fu has failed me or there really aren't too many people doing this yet. As you know, Backbone.js has an Achilles heel: it cannot serve the HTML it renders to page crawlers such as Googlebot, because they do not run JavaScript (although, given that it's Google, with their resources, the V8 engine, and the sobering fact that JavaScript applications are on the rise, I expect this to happen someday). I'm aware that Google has a hashbang workaround policy, but it's simply a bad idea. Plus, I'm using pushState. This is an extremely important issue for me, and I would expect it to be for others as well. SEO is something that cannot be ignored, and thus Backbone as-is cannot be considered for the many applications out there that require or depend on it.
Enter node.js. I'm only just starting to get into this craze but it seems possible to have the same Backbone.js app that exists on the client be on the server holding hands with node.js. node.js would then be able to serve html rendered from the Backbone.js app to page crawlers. It seems feasible but I'm looking for someone who is more experienced with node.js or even better, someone who has actually done this, to advise me on this.
What steps do I need to take to allow me to use node.js to serve my Backbone.js app to web crawlers? Also, my Backbone app consumes an API that is written in Rails which I think would make this less of a headache.
EDIT: I failed to mention that I already have a production app written in Backbone.js. I'm looking to apply this technique to that app.
First of all, let me add a disclaimer that I think this use of node.js is a bad idea. Second disclaimer: I've done similar hacks, but only for the purpose of automated testing, not for crawlers.
With that out of the way, let's go. If you intend to run your client-side app on the server, you'll need to recreate the browser environment there:
Most obviously, you're missing the DOM (Document Object Model), basically the AST on top of your parsed HTML document. The node.js solution for this is jsdom.
That, however, will not suffice. Your browser also exposes the BOM (Browser Object Model): access to browser features like, for example, history.pushState. This is where it gets tricky. There are two options. You can try to bend PhantomJS or CasperJS to run your app and then scrape the HTML off it; this is fragile, since you're running a huge, full WebKit browser with the UI parts sawed off.
The other option is Zombie, a lightweight re-implementation of browser features in JavaScript. According to its page it supports pushState, but my experience is that the browser emulation is far from complete; still, give it a try and see how far you get.
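For reference, trying Zombie looks roughly like this; the URL and selector are placeholders, and the exact API may differ between versions:

```javascript
// A minimal sketch using Zombie; the URL and selector are placeholders.
const Browser = require('zombie');

const browser = new Browser();
browser.visit('https://example.com/app', () => {
  // At this point Zombie has loaded the page and run its scripts,
  // to the extent its browser emulation supports them.
  console.log(browser.html('#content'));
});
```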
I'm going to leave it to you to decide whether pushing your rendering engine to the server side is a sound decision.
Because Node.js is built on V8 (Chrome's engine), it will run JavaScript, including Backbone.js. Creating your models and so forth would be done in exactly the same way.
The Node.js environment of course lacks a DOM, so this is the part you need to recreate. I believe the most popular module is:
https://github.com/tmpvar/jsdom
Once you have an accessible DOM API in Node.js, you simply build its nodes as you would for a typical browser client (maybe using jQuery) and respond to server requests with the rendered HTML (via $("myDOM").html() or similar).
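Put together, the server side of this idea could be sketched as follows; the Express route and the stand-in "rendering" step are assumptions for illustration, in place of whatever your Backbone views actually do:

```javascript
// A minimal sketch: build a DOM in Node with jsdom, render into it, and
// answer crawler requests with the serialized HTML. The route and the
// stand-in rendering step are illustrative assumptions.
const express = require('express');
const { JSDOM } = require('jsdom');

const app = express();

app.get('*', (req, res) => {
  const dom = new JSDOM('<!DOCTYPE html><html><body><div id="app"></div></body></html>');
  const { document } = dom.window;

  // Here you would run the same rendering code the browser runs
  // (e.g. your Backbone views) against this document instead.
  document.getElementById('app').textContent = `Rendered ${req.url} on the server`;

  res.send(dom.serialize()); // crawlers receive finished HTML
});

app.listen(3000);
```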
I believe you can take a fallback-strategy type of approach. Consider what happens when a link is clicked with JavaScript turned off versus turned on. Anything on your page that should be crawlable needs a reasonable fallback when JavaScript is off: your links should always have the server URL as the href, and the default action should be prevented with JavaScript (see the sketch below).
I wouldn't say this is Backbone's responsibility, necessarily. The only things Backbone can really help you with here are modifying your URL when the page changes and letting your models/collections run on both client and server; the views and routers, I believe, would be strictly client-side.
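A minimal sketch of that link pattern, assuming jQuery and a Backbone router are already set up; the data attribute is an illustrative assumption:

```javascript
// A minimal sketch: links keep a real server URL in href as the no-JS
// fallback; with JS enabled, the click is intercepted and handed to the
// Backbone router instead. The data attribute is an illustrative assumption.
$(document).on('click', 'a[data-internal]', function (event) {
  event.preventDefault(); // stop the full-page load
  Backbone.history.navigate($(this).attr('href'), { trigger: true });
});
```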
What you can do, though, is make your Jade pages and partials renderable from the client side or the server side, with or without content injected. That way the same page can be rendered either way: if you replace a big chunk of your page and change the URL, the HTML you are grabbing can come from the same template as if someone had gone directly to that page.
When your server receives a request, it should take you directly to that page rather than going through the main entry point, loading Backbone, and having it manipulate the page into the state the URL implies.
I think you should be able to achieve this just by rearranging things in your app a bit. No real rewriting, just a good amount of moving things around. You may need to write a controller that serves your HTML files with content either injected or not; this gives your Backbone app the HTML it needs to couple with the data from its models. Like I said, those same templates can be used when you directly hit those links through the routers defined in Express/node.js.
This is on my todo list of things to do with our app: have node.js parse the Backbone routes (stored in memory when the app starts) and, at the very least, serve the main page's template as straight HTML. Anything more would probably be too much overhead/processing for the backend when you consider thousands of users hitting your site.
I believe Backbone apps like Airbnb's do it this way as well, but only for robots like the Google crawler. You also need this for things like Facebook likes, as Facebook sends out a crawler to read your og: tags.
A working solution is to use Backbone everywhere:
https://github.com/Morriz/backbone-everywhere, but it forces you to use Node as your backend.
Another alternative is to use the same templates on the server and the front end.
The front end loads Mustache templates using the require.js text plugin, and the server renders the page using the same Mustache templates.
Another addition is to render bootstrapped module data in a script tag as JSON, to be used immediately by Backbone to populate models and collections.
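That bootstrapping part could be sketched like this; the collection and variable names are assumptions for illustration:

```javascript
// A minimal sketch of bootstrapping. The server renders something like:
//   <script>window.bootstrapData = { "comments": [ /* ... */ ] };</script>
// On load, Backbone uses that data instead of making an extra request.
// CommentCollection and bootstrapData are illustrative assumptions.
var comments = new CommentCollection();
if (window.bootstrapData && window.bootstrapData.comments) {
  comments.reset(window.bootstrapData.comments); // populate without fetching
} else {
  comments.fetch(); // fall back to hitting the API
}
```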
Basically you need to decide what it is that you're serving: is it a true app (i.e. something that could stand in as a replacement for a dedicated desktop application), or is it a presentation of content (i.e. classical "web page")? If you're concerned about SEO, it's likely that it's actually the latter ("content site") and in that case the "single-page app" model isn't appropriate; you really want the "progressively enhanced website" model instead (look up such phrases as "unobtrusive JavaScript", "progressive enhancement" and "adaptive Web design").
To amplify a little, "server sends only serialized data and client does all rendering" is only appropriate in the "true app" scenario. For the "content site" scenario, the appropriate model is "server does main rendering, client makes it look better and does some small-scale rendering to avoid disruptive page transitions when possible".
And, by the way, the objection that progressive enhancement means "making sure that a sighted user doesn't get anything better than a blind user with text-to-speech" is an expression of political resentment, not reality. Progressively enhanced sites can be as fancy as you want them to be from the perspective of a user with a high-end rendering system.

Pure Javascript and HTML app & deployment via CDN ... good idea?

A big and general question, though NOT a discussion
A friend and I are discussing a web application being developed. Currently it uses PHP, but the PHP doesn't store anything and it is all OAuth-based. The whole thing talks to an independent API; the PHP really just mirrors a lot of the JavaScript logic for browsers without JavaScript support.
If it were decided to enforce Javascript as a requirement (let's not go into that ... whole other issue)
Are there any technical, fundamental problems with serving the app as HTML + JavaScript hosted on a CDN? That is, 100% static JavaScript and HTML with no server-side logic, since the JavaScript is just as capable of making all the API calls as the PHP. Are any existing sites doing this?
We can't think of any show-stoppers, but it seems like a scary thought to make a "web" app 100% client-side, so we are looking for more input.
(To clarify, the question is about deploying using ONLY JavaScript and HTML, and abandoning server-side processing outside of the JSON API or whatever.)
Thanks in advance!
One issue is with search engines.
Search engine crawlers index the raw HTML source code of a web page. If you use JavaScript to load new data and generate new content, crawlers won't execute it, so your content won't get indexed.
However, Google is offering a solution for this - read here: http://code.google.com/web/ajaxcrawling/
Other than this, I can't think of any other issue...
Amazon has been offering this service on S3 for a little while now: http://aws.typepad.com/aws/2011/02/host-your-static-website-on-amazon-s3.html. Essentially it allows you to specify a default index page and error pages. Otherwise, you just upload your HTML to S3 and point the www CNAME of your domain at the Amazon S3 bucket or CloudFront CDN.
The only catch is that if a user types example.com instead of www.example.com, you need to ensure your DNS correctly forwards them to www; S3 will not be able to handle a naked domain (http://example.com/).
Regarding how good an idea it is, it sounds good to us as well, and we are currently exploring the option. So far it appears to work fine. What we have done is set up beta.example.com to point to a CDN-hosted site (S3) and are testing whether it gives us everything we need. Performance is great, though!
