I am aware there are several other similar types of questions about pushState and SEO, but I cannot find one asking about this issue.
If I have a page with the URL site.com/Product/Detail2 that loads all the "pages" associated with site.com/Product into it and then scrolls Detail2 into view, will it cause problems with SEO if there are links like site.com/Product/Detail1 and site.com/Product/Detail3? Each of these URLs will actually load the exact same content, but scroll the user to the portion of the page that detail is on, similar to how fragment identifiers work. I understand Google won't run the JavaScript and will spider all those product URLs, but I have read that Google doesn't like different URLs returning the exact same content. For example, site.com/Product/Detail1 and site.com/Product/Detail2 will both return the same content when a user initially navigates to them, and code will scroll the user to the specific detail.
I don't want to have to make AJAX calls to dynamically load content just to keep the different product sub-URLs from returning the exact same content. I could see a solution where navigating to each URL initially loads only that one sub-URL's content and then fetches the rest of the Product content with AJAX calls. That would let Google see unique content at each of those product URLs, while the user always sees one big page that scrolls the sub-URLs into view when they use the nav bar.
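For illustration, a minimal sketch of that hybrid approach (the /Product/section/ fragment endpoint, the product-sections container id, and the detail names are all hypothetical placeholders):

// The server renders only the section for the URL that was requested
// (e.g. Detail2); this script then pulls in the sibling sections and
// scrolls the requested one into view.
const allDetails = ['Detail1', 'Detail2', 'Detail3'];            // hypothetical detail names
const current = location.pathname.split('/').pop();              // e.g. "Detail2"
const container = document.getElementById('product-sections');   // hypothetical container id

allDetails
  .filter(name => name !== current)
  .forEach(name => {
    fetch('/Product/section/' + name)      // hypothetical endpoint returning an HTML fragment
      .then(res => res.text())
      .then(html => {
        const section = document.createElement('section');
        section.id = name;
        section.innerHTML = html;
        container.appendChild(section);    // ordering is ignored to keep the sketch short
      });
  });

document.getElementById(current)?.scrollIntoView();

The initial response for each URL is then unique, while the user still ends up on one long, scrollable page.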
Has anyone else thought about this specific issue and dealt with it before?
Use the canonical tag on the detail pages (the ones that describe only one item and ideally have descriptive URLs), i.e. a <link rel="canonical" href="..."> element in the head of the page.
More on rel="canonical"
First I should explain my site structure. I have WordPress pages (each with their own page-{id}.php template to change what they look like); these are also the main pages on my website. Consider these an overview, like a catalogue in a store (which my website is not, but it's a good comparison).
On these pages I show a list of links (to continue the metaphor, the products of that store) to other pages. When I click on one of these links I go to a new page with the actual content (a detailed view of the product, in our metaphor). These pages are WordPress posts and are controlled by the single_post.php template.
Now the problem: for four out of five pages there is no problem and the single_post.php template does a great job. But for the last page I wanted to go a different route, and I cannot change the contents of that page, because if I do I'd have to change the single_post.php template file and break the other pages.
So here's my question: I'm aware of the is_home function in WordPress, and I was wondering if there is a way to check, in the single_post.php file, which page I'm on, so that depending on whether I'm on one of the four good pages or the one bad page, I can show different content. (Basically, is there something like the is_home function where you change the "home" part to the specific name of a page?)
This doesn't have to be something specific like is_home; a regular JavaScript solution or something similar would work just fine too.
You can use the is_single function like below to check which post you are in:
if (is_single(123)) {             // check whether the post ID equals 123
    // some code
}
if (is_single('My post')) {       // check whether the post title equals "My post"
    // some code
}
if (is_single('my-post-slug')) {  // check whether the post slug equals my-post-slug
    // some code
}
Having looked at some old (2009) questions, I anticipate the answer is no, but I can't find a recent enough definitive answer, so I'm asking again...
I have some JavaScript-enabled tabs on a page which can be automatically pre-selected by passing a parameter in the query string like this:
www.example.com/landing-page?tab=tab1
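For context, the pre-selection works roughly like this (a simplified sketch; the .tab-panel class and the panel ids are hypothetical, not the actual markup):

// Read the ?tab= parameter and show only the matching tab panel.
const params = new URLSearchParams(window.location.search);
const selected = params.get('tab') || 'tab1';   // tab1 is the default

document.querySelectorAll('.tab-panel').forEach(panel => {
  panel.style.display = (panel.id === selected) ? 'block' : 'none';
});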
Is there a way to prevent Google from indexing content in the tabs that are not currently visible? Also, I'd need it to treat the URL as a different page if a different tab is specified in the query string.
Update: I understand according to this page that you can set specific parameters to be considered unique pages based on the parameter having different values. So now the only question is whether or not it's possible to hide content on a page from Googlebot. According to some old answers, it sounds like probably not, but again, I'm looking for an up-to-date officially documented answer.
As far as I know there is no official documentation on this specific issue, but it depends on the type of content you have in each tab.
If the content of each tab is links, you can add the rel="nofollow" attribute to them.
If you're displaying resources in the tabs, like images or PDFs, you can set up robots.txt to block Google from accessing them.
Example:
User-agent: Googlebot
Disallow: /images
Disallow: /pdf
But if you want to block Googlebot from accessing certain text inside a page, that is impossible, because Google will fetch all the resources on a given page, unless you block Google at the page level.
My site uses pagination to get to different event pages. The problem is, the conferences on these pages are not getting picked up by search engines. Below is my code...
[pagination links labelled "1" and "2", pointing to javascript:; and relying on onclick handlers]
What can I do for SEO so Google will crawl and find all of the conferences on the other pages?
Make real pages with real URLs.
Link to the real pages instead of to javascript:;.
Cancel the default behaviour of the link (by returning false if you are going to keep using onclick attributes) so that the JS still has the effect you want.
Use pushState and friends to update the URL and make the back button work (a sketch follows below).
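A minimal sketch of points 2-4, assuming the real event pages live at URLs like /events/page/2 and that an element with id "event-list" holds the conference list (both names are assumptions, not taken from the original markup):

// Real links in the markup, e.g. <a class="page-link" href="/events/page/2">2</a>
document.querySelectorAll('a.page-link').forEach(link => {
  link.addEventListener('click', event => {
    event.preventDefault();                                // cancel the normal page load
    loadEvents(link.href);                                 // swap the list in with JS
    history.pushState({ url: link.href }, '', link.href);  // update the address bar
  });
});

// Make the back/forward buttons work.
window.addEventListener('popstate', event => {
  loadEvents((event.state && event.state.url) || location.href);
});

// Fetch the server-rendered page and reuse just its event list.
function loadEvents(url) {
  fetch(url)
    .then(res => res.text())
    .then(html => {
      const doc = new DOMParser().parseFromString(html, 'text/html');
      const fresh = doc.getElementById('event-list');
      if (fresh) document.getElementById('event-list').replaceWith(fresh);
    });
}

Because the hrefs are real URLs that the server can also render on its own, Googlebot and visitors without JavaScript still get crawlable pages, while everyone else gets the in-place update.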
I have links on a page that scroll to other parts of the same page. Is there a way to make the page search engine friendly?
So let's say my FAQ page has 2 questions and other content: 1) How to do A? and 2) How to do B?
Someone searches for how to do B. My site shows up with that question as the search result's title, and when the user clicks it, the page jumps to that part of the page.
Jumping to different parts of a page is accomplished via anchors/fragment identifiers (#). The problem you'll face is that search engines ignore the fragment part of a URL. If you don't want to create separate pages, you can:
Create a single page with a table of contents at the top to jump users down into their relevant sections (the page title won't be affected but this does most of what you requested).
Create a single page with 2 different URLs. This can be as easy as adding a query parameter to the end of your URL, e.g. www.yourwebsite.com/yourcontent and www.yourwebsite.com/yourcontent?param=1. If your page is created dynamically, then you can (on the server side) update your page title and description based on the query parameter as well.
If you want parts of a page to have "different titles", then you have to make multiple pages with different URLs and titles in the meta information. Also remember to add canonical references to the main page.
So you would have one page for the FAQ with the questions and answers.
One page for question 1 with the question and answer.
Another page for question 2 with question and answer.
Stack Overflow is very good at this. Note the URL of this question, for example: it has the question in the URL of the page, which makes the question "search engine friendly".
I suggest you do the same.
You can read Google's webmaster pages about how to set up canonical URLs. Google (and other search engines) like it when you do this.
I designed a website in which the whole site is contained within one page (index.php).
Within the page, <section> tags define different parts of the site (home, contact, blog etc.)
Navigation is achieved by buttons that are always visible and, when clicked, use JavaScript to change the visibility of the sections, so that only one is shown at any time.
More specifically, this is done by using the hash in the URL and handling the hashchange event. This results in URLs such as www.site.com/#home (the default if no other hash is present) and www.site.com/#contact.
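In outline, the handler looks something like this (a simplified sketch; only the section ids already mentioned are assumed):

// Show only the section whose id matches the current hash.
function showSection() {
  const target = window.location.hash.replace('#', '') || 'home';
  document.querySelectorAll('section').forEach(section => {
    section.style.display = (section.id === target) ? 'block' : 'none';
  });
}

window.addEventListener('hashchange', showSection);
showSection();   // run once on the initial load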
I want to know if this is a good design. It works, but I get the feeling there must be a better way to achieve the same thing? To clarify, I was aiming for a site that loaded all the main content once, so that there were no more page loads after the initial load and moving between sections would be smoother.
On top of this, another problem is introduced concerning SEO. The site shows up in Google, but if, for example, a search query matches a term in a specific section, clicking the result still loads the default #home page, not the specific section the term was found in. How can I rectify this?
Finally, one of the sections is a blog section, which is the only section that does not load all at once, since by default it loads the latest post from a database. When a user selects a different post from a list (which is itself loaded using AJAX), AJAX is used to fetch and display the new post, and pushState changes the history. Again, to give each post a unique URL that can be referenced externally, the menu changes the URL, which is handled by JavaScript, resulting in URLs such as www.site.com/?blogPost=2#blog and www.site.com/?blogPost=1#blog.
These posts aren't seen by Google at all. Using the Googlebot tool shows that the crawler sees the blog section as always empty, so none of the blog posts are indexed.
What can I change?
(I don't know if this should be on the Webmasters Stack Exchange, so sorry if it's in the wrong place.)
Build a normal site. Give each page a normal URL. Let Google index those URLs. If you don't have pages for Google to index, then it can't index your content.
Progressively enhance the site with JS/Ajax.
When a link is followed (or other action that, without JS, would load a new page is performed) use JavaScript to transform the current page into the target page.
Use pushState to change the URL to the URL that would have been loaded if you were not using JavaScript. (Do this instead of using the fragment identifier (#) hack.)
Make sure you listen for history events so you can transform the page back when the back button is clicked.
This results in situations such as:
User arrives at /foo from Google
/foo contains all the content for the /foo page
User clicks link to /bar
JavaScript changes the content of the page to match what the user would have got from going to /bar directly, and sets the URL to /bar with pushState (a sketch of this follows)
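A rough sketch of that flow, assuming each page keeps its main content in an element with id "content" and that fetching /bar returns the full server-rendered page (both assumptions for illustration only):

// Intercept same-site links and transform the current page into the target page.
document.addEventListener('click', event => {
  const link = event.target.closest('a');
  if (!link || link.origin !== location.origin) return;   // leave external links alone
  event.preventDefault();
  swapContent(link.href);
  history.pushState({}, '', link.href);   // the URL becomes /bar, as if loaded normally
});

// Transform the page back (or forward) when the history buttons are used.
window.addEventListener('popstate', () => swapContent(location.href));

function swapContent(url) {
  fetch(url)
    .then(res => res.text())
    .then(html => {
      const doc = new DOMParser().parseFromString(html, 'text/html');
      document.title = doc.title;
      document.getElementById('content').innerHTML =
        doc.getElementById('content').innerHTML = doc.getElementById('content').innerHTML;
    });
}

The key point is that /bar exists as a real page on the server, so this is purely an enhancement; crawlers and non-JS clients still get the normal page loads.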
Note that there is also the (not recommended) hashbang technique which hacks a one-page site into a form that Google can index, but which is not robust, doesn't work for any other non-JS client and is almost as much work as doing things properly.