
Javascript, Object Notation and Web Indexing



Bobafart
18 Feb 2007, 7:44 PM
You know, the more I learn about object notation, letting JavaScript handle the UI rather than markup, and keeping JavaScript and HTML from mixing (so-called "unobtrusive JavaScript"), the more I realize that my markup is quickly shrinking to just a few lines while my JS files are rapidly expanding.

An extreme example of unobtrusive JavaScript used to create a complex web app:



<html>
<head>
<script type="text/javascript" src="jsfile1.js"></script>
<script type="text/javascript" src="jsfile2.js"></script>
<script type="text/javascript" src="jsfile3.js"></script>
<script type="text/javascript" src="jsfile4.js"></script>
</head>
<body>
<div id="myDiv"></div>
</body>
</html>


This makes me wonder: are we destroying web indexing as we know it?

I mean, how is the web supposed to be indexed? I realize Google has tonnes of money and will adapt, but for the black hat SEOs and the current web indexing systems, how will indexing be done as we see more and more unobtrusive JavaScript?

Exclusively through <meta> tags? Hopefully not!

Your thoughts?

JeffHowden
18 Feb 2007, 9:49 PM
Unobtrusive JavaScript doesn't mean that you eliminate markup and code tons of JavaScript. No, what it means is that scripting merely hooks into existing DOM elements to enhance what's already there.
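A minimal sketch of what "hooking into existing DOM elements" looks like in practice (the function name and the `nav` id are illustrative, not from the thread). The markup stays a plain, indexable list of links; if the script runs it adds behavior, and if it doesn't, the page still works:

```javascript
// Progressive enhancement: the markup already contains a working <ul id="nav">
// of plain links. This script only decorates what's there.
function enhanceNav(doc) {
  var nav = doc.getElementById('nav');   // hook into markup that already exists
  if (!nav) return false;                // degrade gracefully if the element is absent
  var links = nav.getElementsByTagName('a');
  for (var i = 0; i < links.length; i++) {
    links[i].className += ' enhanced';   // e.g. attach richer styling/behavior here
  }
  return true;
}

// In a browser you would call this once the DOM is ready:
// window.onload = function () { enhanceNav(document); };
```

The key point is the direction of dependency: the script depends on the markup, never the other way around, so crawlers and script-less browsers see the full content.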

Additionally, I don't think web apps are a good match for SEO. There aren't many true web apps that are suited to access without some sort of login. If you have lots of data available to the public, I wouldn't recommend the web app paradigm for it.

ilazarte
18 Feb 2007, 11:40 PM
Unobtrusive JavaScript can be considered a relative of "progressive enhancement".
Check out this video:

http://yuiblog.com/blog/2006/10/03/video-sweeney-hackday06/

tryanDLS
19 Feb 2007, 7:06 AM
I don't know that I'd get hung up about indexing. You're building applications, not simple web pages, so you have to consider whether you really should be indexing it. That's not to say that you wouldn't have some simple html page that gets fed to the spiders. There was actually some discussion about this in a recent Ajaxian.com post - the one about saving view source, I think.

jack.slocum
19 Feb 2007, 7:08 AM
My blog generates a sitemap for Google every time I post, so my site continues to get indexed.

Another option would be putting your links into existing markup, or just using display:none links for search engines if you need to be indexed.
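For readers wondering what "generates a sitemap every post" might look like, here's a hypothetical sketch (the function and URLs are illustrative, not Jack's actual code). It emits the standard sitemaps.org XML that Google's Sitemaps service accepts:

```javascript
// Build a minimal sitemap.xml from a list of post URLs.
// The sitemaps.org 0.9 namespace is the one Google's crawler expects.
function buildSitemap(urls) {
  var entries = [];
  for (var i = 0; i < urls.length; i++) {
    entries.push('  <url><loc>' + urls[i] + '</loc></url>');
  }
  return '<?xml version="1.0" encoding="UTF-8"?>\n' +
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n' +
    entries.join('\n') + '\n' +
    '</urlset>';
}
```

A blog engine would regenerate this file after each post and then "ping" the search engine so it knows to re-fetch it.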

Animal
19 Feb 2007, 7:22 AM
Yes, I saw that discussion. One poster neatly highlighted the difference between an application, and a "web page" about that application.

The web page needs indexing - you can't index an application.

dfenwick
19 Feb 2007, 1:19 PM
The web page needs indexing - you can't index an application.

Amen. And this is precisely why there are so many broken links in web search systems. Unfortunately there's no well-used semantic for telling a web crawler to go away, and quite a few crawlers ignore the semantics even when you do use them.

dj
19 Feb 2007, 6:25 PM
My blog generates a sitemap for Google every time I post, so my site continues to get indexed...

Hi Jack, you should check that. E.g. http://www.google.com/search?q=site%3Ajackslocum.com shows that your blog is not indexed by Google at all. Maybe the problem is your robots.txt. Have a look at your Webmaster Tools (http://www.google.com/webmasters/sitemaps/)

Jul
19 Feb 2007, 8:00 PM
dj, I think you're trying to search Google with the wrong domain. Try yui-ext.com:

http://www.google.com/search?q=site:yui-ext.com

Belgabor
19 Feb 2007, 8:15 PM
No, that's only the forum and docs; his blog is missing from Google. If you search for some keywords from the blog's front page, you'll only get unrelated sites and sites that syndicate his blog, but not the blog itself.

jack.slocum
19 Feb 2007, 8:57 PM
Actually, it appears the "ping" was no longer working. They stopped receiving my sitemap a while ago :shock: I redid it, let's hope it works now. Thanks for the heads up.

sjivan
27 Mar 2007, 5:53 AM
Yes, I saw that discussion. One poster neatly highlighted the difference between an application, and a "web page" about that application.

The web page needs indexing - you can't index an application.

Do you have a URL for this? Many "social" sites have a large public area for browsing as well as an application side where a user can log in and carry out operations like tags, favourites, friends, comments and other rich app functionality. The public areas definitely need to be crawled, so this swings the page design toward the traditional approach with entire page refreshes and RESTful URLs, while I suppose the "my account" area could leverage a richer (RIA) interface. You guys agree?

Backbase has an interesting white paper on building a RIA that supports web indexing, but that requires a lot more work: the RESTful URLs crawled by engines need to return the same content as when navigated by a user. If it's not done right, Google might consider this cloaking and ban the website.
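The "same content both ways" requirement can be sketched in a few lines (this is illustrative, not Backbase's actual technique; `renderPost` and the post shape are invented for the example). One URL serves full HTML to crawlers and non-JS clients, and the same data as JSON to the rich client:

```javascript
// One handler, two representations of identical content. Because the crawler's
// HTML and the rich client's JSON carry the same title and body, serving them
// from the same URL is content negotiation, not cloaking.
function renderPost(post, wantsJson) {
  if (wantsJson) {
    return JSON.stringify({ title: post.title, body: post.body });
  }
  return '<html><body><h1>' + post.title + '</h1>' +
         '<p>' + post.body + '</p></body></html>';
}
```

The cloaking risk sjivan mentions appears exactly when the two branches drift apart, so keeping them backed by one data source is the safer design.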

dfenwick
27 Mar 2007, 11:45 AM
Yes, I saw that discussion. One poster neatly highlighted the difference between an application, and a "web page" about that application.

The web page needs indexing - you can't index an application.

Do you have a URL for this? Many "social" sites have a large public area for browsing as well as an application side where a user can log in and carry out operations like tags, favourites, friends, comments and other rich app functionality. The public areas definitely need to be crawled, so this swings the page design toward the traditional approach with entire page refreshes and RESTful URLs, while I suppose the "my account" area could leverage a richer (RIA) interface. You guys agree?

Backbase has an interesting white paper on building a RIA that supports web indexing, but that requires a lot more work: the RESTful URLs crawled by engines need to return the same content as when navigated by a user. If it's not done right, Google might consider this cloaking and ban the website.

Documents should be indexed; applications should not. If you store data on a backend and you want that data to be indexed, you should provide something that builds documents from the data. Then it can be indexed.

What the indexers should not do (but do anyway) is index things generated from applications that use identification tags to extract data from backends. These should (in my opinion) never be indexed because they are volatile: if I rebuild the database on the backend and it regenerates all of the identifiers, the web index in the search engine is now invalid even though the application works properly.

If something shouldn't be indexed (and many of my applications shouldn't be), I add a <META NAME="ROBOTS" CONTENT="NOINDEX,NOFOLLOW"> tag to the HTML. Unfortunately there are quite a few web crawlers out there that ignore the meta tags and try to index things anyway.
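For crawlers that do honor the conventions, a robots.txt file is the site-wide counterpart of that meta tag. A minimal example, assuming the application lives under an illustrative /app/ path:

```
# Ask well-behaved crawlers to stay out of the application area
# while leaving the document pages crawlable.
User-agent: *
Disallow: /app/
```

As with the meta tag, this is advisory only; badly behaved crawlers ignore it.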