Archive for the ‘SEO experiments’ Category

Ranking prediction: result

Sunday, August 17th, 2008

As George Costanza used to say, “I was wrong”. This week I’m wrong about the effect of the so-called powerful external link that I mentioned before.

It turned up in Webmaster Tools as a credited link to the site. Did it change the order of the blog home and site home in the Google rankings for the “search experiments” query, as I predicted? No, it did not. The blog home still sits there as the first result, with the home page indented.

I guess that with all the cross-linking, those two pages may well have similar rank, and the blog home page is more relevant in terms of its content.

Bad assumptions cause incorrect conclusions…

Monday, August 11th, 2008

Hmm. In my last post I suggested that I had reached a conclusion about the CSS and div-related image indexing test. I might have done, but I think that I jumped there. 

The original motivation for the test was to work out why certain images were not being indexed. Two hypotheses presented themselves – some slightly sloppy nesting of divisions, and a clear CSS hack.

Sure enough, when the various relevant pages were indexed and cached in Google, I found what I was looking for – that some of the pages didn’t appear to show the images in the cache. This would also tend, I thought, to support the hypothesis that the pictures not appearing here would be excluded from the image search results – because Google, having “refused” to cache the images on the page, would surely “refuse” again to include them in the index.

A nice enough hypothesis – and having been pleased with myself for devising it, of course I wanted it to be true, so I started looking for results that would confirm it.

Those pages showed up with no images, and I published my immediate conclusions. However, I was looking at the cached pages in Firefox and Safari. Today I took a look using another browser that I rarely use, Internet Explorer. Using this browser, the images were visible. Google hasn’t “refused” to cache them. The hypothesis seems much weakened. 

I’ll continue to track what happens to these pages and the images on them, and report back. However, an important lesson has been learned from the experiment in any case: do not allow your desire to be correct to skew your interpretation of the results that are returned.

Bad CSS to blame for non-caching of images

Sunday, August 10th, 2008

The first SEO experiment on the main site was intended to determine which of two possible code faux pas was more likely to be the cause of images not showing up in Google’s Image search results, a problem that had occurred on another site – which is why the test was a little specific in nature, and not very generic.

On examining Google’s cache of the pages in question, it was clear that the main images on those pages were not appearing. Looking at the code, two possible culprits suggested themselves. 

Firstly, in a rather messy way, classes and ids were being used interchangeably as style selectors for divisions (“divs”), and although there were not any repeated ids, there was a div with a particular id, which was then referenced as a class in another, nested div.

<div id="blah">
	<div class="blah">
		[picture and other content]
	</div>
</div>

Not invalid HTML, but messy.

The other candidate was some strange-looking CSS code, apparently designed to get over some problem with rendering in IE6 (which may itself have been caused by the messy HTML…)

.hack {
	color: blue;
	font-size: 18px;
	height: 1%;
	overflow: hidden;
}

It’s the last two lines, obviously, that are the candidates for causing issues. This CSS validates, and the pages render as expected in all browsers that I have tried. Browsers are very forgiving, however…

So, I recreated pages with these problems, including controls and permutations with the different errors.
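As a rough sketch of what one such permutation might look like, here is a page that applies only the CSS hack to otherwise clean markup. The page title, image path and alt text are hypothetical placeholders, not the actual test pages.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Test page: CSS hack only</title>
  <style>
    /* The suspect rule, copied from the example above */
    .hack {
      color: blue;
      font-size: 18px;
      height: 1%;
      overflow: hidden;
    }
  </style>
</head>
<body>
  <!-- Clean nesting: no id is reused as a class, so only the CSS varies -->
  <div class="hack">
    <!-- Hypothetical image path, for illustration only -->
    <img src="test-image.jpg" alt="Test image for the CSS hack page">
    <p>Some accompanying text.</p>
  </div>
</body>
</html>
```

A control page would presumably be the same markup with the last two declarations removed from the rule, so that any difference in the cache can be pinned on the hack alone.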

The conclusion is that it is the CSS hack that is causing the images not to render in Google’s cache. It’s too early to tell whether this is also having an effect on the indexing of these images, because none of the images is yet indexed.

The cache for the badly nested divs page shows the picture, whereas the cache for the CSS-hacked test page does not render the picture.

Does this mean that Google is excluding certain types of “hidden” content, or does it mean that its internal “browser” for rendering its cached pages is a bit more strict about rendering pages accurately? Only when the pages have settled in the index and the images on the test pages have made it (or not) into the image search results will we be able to speculate more intelligently on this.

Experiment 2 – anchor text and noindex

Tuesday, July 29th, 2008

I’ve begun a new search experiment. At that link you’ll find the “home page” for that experiment, which links to a couple of other pages. These pages, while not identical, are pretty similar in content. Each of them has a link using unusual anchor text to another pair of final destination pages, each of which has a little bit of text and a picture. One of the linking pages is set to “noindex”. 
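For illustration, the “noindex” linking page might look something like the sketch below. It assumes the page is excluded with a robots meta tag (it could equally be done another way), and the file name and anchor text are made-up placeholders rather than the ones used in the experiment.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <!-- Asks search engines not to index this page -->
  <meta name="robots" content="noindex">
  <title>Linking page (noindex)</title>
</head>
<body>
  <p>Some text, broadly similar to the other linking page.</p>
  <!-- "placeholder-anchor-text" stands in for the unusual anchor text;
       destination-b.html is a hypothetical file name -->
  <a href="destination-b.html">placeholder-anchor-text</a>
</body>
</html>
```

The other linking page would be the same apart from the missing robots meta tag and a link to the other destination page.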

The intention is to see whether both of the destination pages will rank for a search on the unusual anchor text, and if so, which one ranks higher.

The expectation would be that both pages would appear for that search, along with the pages that contain the terms. 

If that is the case, then I won’t put too much weight on which one ranks higher, but it should give us a platform for further iterations.

The importance of a controlled environment

Thursday, July 24th, 2008

The meta-experiment relating to indexing is over, having fallen victim to a failure to maintain a hermetically sealed environment for the experiment.

The idea was to see whether pages from the site would be indexed by Google when they had no external links and no submission to Google had been made.

Despite the fact that no submission has been made and no links sought or set up, one has crept through.

It seems that Technorati have some detail about this blog, presumably through some hook-up with Wordpress. The relevant Technorati pages don’t currently appear in the Google index, but this aggregator is picking up blog posts from Technorati with certain tags, in this case “w3c”, and publishing them.

All very interesting in itself, but it does rather blow the intended experiment. Which just goes to show how hard it is to maintain a hermetic environment for experiments on the web.

Anyhow, now that it’s blown, I can pump in a bit of link juice from elsewhere – I need the pages to be indexed for current and future experiments.

Using Google tools

Wednesday, July 23rd, 2008

As part of the “how little can I do and still get indexed” experiment, I’ve added both Google Analytics and Google Webmaster Tools to the site, to see if these alone will inspire the big G to index it.

Expected outcome: not in the index

Next steps: Use the Google “Add URL” tool. I’m going to give this a couple of days though. 

Incidentally, if anyone thinks that a couple of days is not enough time to wait, I’m pretty confident that I could get indexed in 24 hours if I was in a rush – I will need to have this site indexed for other experiments in future, so I’m not prepared to wait indefinitely…

Welcome to my site

Tuesday, July 22nd, 2008

Welcome to the blog for search-experiments.com. I don’t think you’ll find anything of too much interest here at present; it’s a newly created scratchpad for me to mess around without doing too much harm.

The first experiment is already live, but I’m also running a “meta-experiment” to see what it takes to get the domain noticed by Google.

Note that I’m not trying to get it indexed at present – I know just how I could do that quite easily if I wanted to. No, what I’m doing is seeing how little I have to do to get it indexed.

At present, this consists of creating a few pages for the site, and adding this blog. No URL submission, no external links, no Google Analytics on the page, no Google Webmaster Tools account, no XML sitemap.

Oh, the only other thing that I’ve done which might give Google a clue as to the existence of this site is to search Google for the domain name. Also, and in the interest of full disclosure, I have accessed the site with a browser equipped with the Google Toolbar.

I don’t expect to be successfully indexed now. Probably the next step will be to set up Google Analytics and maybe Webmaster Tools, as I will want to access those in the near future.

 

Edit: Added link (29/07/08)