I have just been reading an article over at The Register about Google Noise (which seems to keep coming up lately). Anyway, it seems that now even Trackback is gaining ground in the whole Blog vs Google saga. From the example Andrew uses in the article, yes, it appears that Google will index junk, but don't all of the engines? The only way to avoid this would be to make humans part of the index auditing process, which is not feasible.
The reason Google is the one in the limelight over this whole indexing subject is the way it currently prefers newer content over old. Unless there is a major human element involved, this will always be a problem.
There is one simple solution to Google indexing Trackback pages and other junk.
I noticed that comment entries and such from my blog were being indexed by Google, and in some cases ranked higher than the actual archive pages. So, what did I do? Easy. I just added the MovableType cgi directory to robots.txt. Problem solved: Google will no longer index any Trackback pages, comment pages, etc. The only things it will index now are the actual archive pages. No more junk indexed on Google from my blog, unless of course you class my thoughts as junk!
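For anyone wanting to try the same thing, here is a rough sketch of what such a robots.txt rule might look like, plus a quick check with Python's standard library that it blocks what it should. The `/cgi-bin/mt/` path, the example.com URLs and the `is_crawlable` helper are my own placeholders for illustration; use whatever directory your MovableType scripts actually live in.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. /cgi-bin/mt/ is an assumed install
# path for the MovableType CGI scripts, not necessarily yours.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/mt/
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` is allowed to fetch `url` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Placeholder URLs: one archive page, one Trackback script URL.
    for url in (
        "http://example.com/archives/000123.html",
        "http://example.com/cgi-bin/mt/mt-tb.cgi?__mode=view&entry_id=1",
    ):
        print(url, is_crawlable(ROBOTS_TXT, url))
```

Bear in mind that robots.txt only asks well-behaved crawlers to stay out of that directory, so pages that are already indexed may take a while to drop out.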
The problem is getting everyone using MT and other blog engines to do something like that.