Recommended Practises, Search

Why does my Sitecore website perform so badly?

After many years of reviewing Sitecore solutions, I would conclude that when pages perform badly it more or less always comes down to one single thing: bad code. And by bad code, I mean code that iterates over an excessive number of items.

Iterating A LOT of items

Previously we used to see a lot of calls to descendants when solutions were using axes, XPath or Sitecore Query, but luckily this is a disappearing trend with the introduction of Sitecore Search in Sitecore 7. This does not mean that there is less bad code; maybe even on the contrary. The use of ORMs and abstractions in many solutions means that layers of code can easily hide item iterations. Neither of these patterns or technologies is inherently bad, but they do provide the means to hide “bad code”.

The ARGH! Example

The example in the diagram below is from a customer here in Australia who had a problem with a very slow-running product search. The diagram breaks down the different parts of the search page and describes the functionality of each part and the number of items it hits. It was part of the presentation to the client to highlight where the performance bottleneck was (and that Sitecore itself was not the cause of the problem).

After looking through layers and layers of code (3000+ lines, to be more specific), we discovered that the code in the different layers was searching Lucene and iterating over the search results, loading the items individually and recursively running business logic.

A simple search resulting in a page of 50 links would easily take up to 5 minutes, simply because the code was iterating over more than 1.7 million items from the Sitecore database, which would take time in any solution 🙂

Excessive Item Iterations

To cut a very long story short: always keep an eye on the number of items being iterated, and never rely on caching to do the trick.
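
As a rough illustration of keeping the iteration count down, here is a minimal sketch that pages over lightweight search hits first and only loads the items that end up on the page. It deliberately does not use the Sitecore API: PagedSearch, LoadItem and the ItemsLoaded counter are hypothetical stand-ins, and the hit IDs simply represent whatever the search index returns.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagedSearch
{
    // Counts how many full items were loaded, to make the cost visible.
    public static int ItemsLoaded;

    // Hypothetical stand-in for an expensive per-item load from the database.
    public static string LoadItem(int id)
    {
        ItemsLoaded++;
        return "Item " + id;
    }

    // Pages over the lightweight hit IDs first, then loads only the items on that page.
    public static List<string> GetPage(IEnumerable<int> hitIds, int pageIndex, int pageSize)
    {
        return hitIds
            .Skip(pageIndex * pageSize)
            .Take(pageSize)
            .Select(id => LoadItem(id))
            .ToList();
    }
}
```

With 100,000 hits and a page size of 50, GetPage loads 50 items rather than 100,000. The same idea applies to filtering and sorting: do as much as possible on the index, and touch the items themselves as late as possible.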

Architecture, Sitecore

Dedicated image server and Sitecore

Many sites with a large number of visitors use a dedicated image server. There are several reasons for this, but the main one is that most browsers allow only a few simultaneous connections to the same domain, which stalls the download of a page that references more resources than that. Imagine, for instance, a page consisting of multiple images, stylesheets and JavaScript files. Because the client allows only a few simultaneous downloads, the browser cannot download all the resources at the same time; it has to wait for one download to complete before starting the next.

The reason for this behavior is that the HTTP/1.1 specification (RFC 2616) recommends that a client keep no more than two simultaneous connections to a server. This was specified to protect servers from heavy IO load when files were requested. Browsers like IE7 and Firefox 2 follow this recommendation and allow only 2 connections, while newer browsers like IE8 and Firefox 3 allow 6.

To ensure that the HTML can be fully loaded independently, without waiting for other downloads, and to take load off the primary web server, a dedicated image server can be used. This server holds all the files and uses a different domain (for instance images.mydomain.com). All files used on the page can then be referenced with a full URL and thereby be downloaded from a different domain, e.g. <img src="http://images.mydomain.com/image1.jpg" alt="image1" />. As the images are now on a different domain, they can be requested independently of the server providing the HTML, and more simultaneous connections can be opened.

But how is this possible in Sitecore where the media library controls all the images?

One of the easiest ways is to have another publishing target for the media. In that way you will have two frontend servers: One serving the normal requests and one serving all media items. You can easily set up another publishing target and for instance use the staging module to clear the cache. Read more on SDN if you want to know how to set it up.

The remaining problem is: how do we ensure that all media items get prefixed with another domain? Unfortunately there isn’t a simple setting for this (it will probably come in a future release of Sitecore, I hope :)). However, there is a simple solution to the problem, as links are expanded by the LinkProvider, which can be overridden. More precisely, the links are expanded in the LinkProvider.ExpandDynamicLinks method, so we need to override it and replace the media path:

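using Sitecore.Links;

namespace TestApplication
{
  // Link provider that points all media links at the dedicated image server.
  public class CustomLinkProvider : LinkProvider
  {
    public const string MEDIA_PATH = "~/media/";

    public override string ExpandDynamicLinks(string text, bool resolveSites)
    {
      // Let Sitecore expand the dynamic links first, then prefix every media path.
      string baseExpands = base.ExpandDynamicLinks(text, resolveSites);
      return baseExpands.Replace(MEDIA_PATH, "http://images.pentia.dk/" + MEDIA_PATH);
    }
  }
}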

Here I override ExpandDynamicLinks in the derived class CustomLinkProvider and replace every media path (identified by the prefix ~/media/) with the full path to the image server. The remaining thing to do is to replace the LinkProvider type in the web.config:

<linkManager defaultProvider="sitecore">
  <providers>
    <clear />
    <add name="sitecore" type="TestApplication.CustomLinkProvider, TestApplication" addAspxExtension="true" alwaysIncludeServerUrl="false" encodeNames="true" languageEmbedding="asNeeded" languageLocation="filePath" shortenUrls="true" useDisplayName="false" />
  </providers>
</linkManager>

Here I just point the LinkProvider type to my own implementation.
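
The plain string replacement in the provider is deliberately simple. If the expanded HTML may contain media paths with a leading slash, or URLs that already point at the image server, a slightly more defensive replacement could look like the sketch below. It is only an illustration: MediaUrlRewriter and PrefixMediaUrls are hypothetical names, not part of Sitecore, and the sketch assumes that media URLs always appear at the start of a quoted attribute value.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MediaUrlRewriter
{
    // Matches a media path at the very start of a quoted attribute value,
    // with or without a leading slash. Absolute URLs are not matched.
    private static readonly Regex MediaPath =
        new Regex("(?<quote>[\"'])/?~/media/", RegexOptions.Compiled);

    // Prefixes every relative media path in the HTML with the media server URL.
    public static string PrefixMediaUrls(string html, string mediaServerUrl)
    {
        if (string.IsNullOrEmpty(html))
        {
            return html;
        }

        string prefix = mediaServerUrl.TrimEnd('/') + "/~/media/";
        return MediaPath.Replace(html, m => m.Groups["quote"].Value + prefix);
    }
}
```

ExpandDynamicLinks could then return MediaUrlRewriter.PrefixMediaUrls(baseExpands, "http://images.pentia.dk/") instead of calling Replace directly. URLs that are already absolute are left untouched, so running the rewrite twice does not double the prefix.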

Now that this is set up, all requests for media items will be handled by the image server. This not only gives the client faster load times, but also puts less pressure on the main server. Furthermore, you can tune the media server for handling media items by increasing the media cache, setting the prefetch cache etc.

Enjoy your new Sitecore media server 🙂

Uncategorized

Type before function?

I’ve recently been made aware of how type-centric Sitecore is in its architecture and best practices.

Consider the main sections in Sitecore: Content, Templates, Layouts, and even the folder structure /xsl, /layouts. Sitecore seems inherently to point the architect towards categorizing the solution by the types of elements it consists of (which templates does my website consist of, how many layouts do I have, which XSLTs do I have to write etc.) instead of looking at the conceptual structure of the website, i.e. which functionality do I have, e.g. newsletter, document, navigation etc. Piecing together all the parts which make up one function on the website involves browsing through a lot of folders in Sitecore and looking into a lot of code, in which many of the references are very loose, e.g. template names in XSLT files, or assembly references in the web.config (or even worse: assembly references in Sitecore). All in all, it is not an architecture which is very helpful for reuse and overview.

I envision a future version of Sitecore working in a more functionality-centric way. For example, imagine this Sitecore content tree:

  • /sitecore
    • Content (This is where the editors edit the site – not much different here)
    • Components (This is where we – the developers – roam)
      • Component (each functionality has its own section)
        • Templates (this defines the templates for this component)
        • Presentation (This defines all the presentation elements for this component, layouts, renderings etc.)
        • Settings (General settings, dictionary texts etc. needed by the component)

…and that’s it. No more browsing the entire tree looking for the settings for the Mailing list module. No more: “I wonder if this layout is used by any of my 367 templates”. No more: “Was that document.xsl or document.ascx?”. Just imagine how easy it would be to make a package for porting functionality to a new website.

Just my two cents…

Uncategorized

Assigning layouts

My first idea was to write a post about the top 5 mistakes when developing in Sitecore, but then I decided to go for the one potential danger I consider grave enough for its own post: Assigning layouts to items.

A layout is basically a simple field (__layout) on an item which describes how the item is rendered by the rendering engine, and as such can be assigned on the item and on the Standard values (http://sdn5.sitecore.net/End%20User/Template%20Manager/Standard%20Values.aspx).

In pre-5.3 versions of Sitecore, standard values were not available, so in order to have layout reuse, layouts were assigned to templates. This functionality has been preserved in Sitecore 5.3 for backward compatibility. I’ve actually just recently been made aware of Sitecore’s recommendation to put layout settings on standard values instead of on the template. Architecturally this is absolutely the right way to go; the template way was a hardcoded layout engine feature to enable reuse.

Well to get back to the point: Do not – ever – assign layouts directly to items.

Assigning layouts to template standard values gives you the reuse you want and need, in order to maintain an architecturally sound Sitecore installation. Consider a situation where you have a layout (header, footer, menu, spots, content) defined on your document template, and you want to introduce a breadcrumb on your site. If the architecture is correct, this can be accomplished merely by changing the document template and all documents on your website will be updated.

Then again, consider a situation where 10 specific documents need a slightly different layout. Well, the simplest way would be to just set the layout directly on these documents. WRONG! Setting layouts directly on items not only throws away all reuse, but more importantly makes you completely lose the overview of your solution. If you in this case wish to introduce a breadcrumb, you will have to manually find (on your 10,000 page website) the items which have a specific layout and correct these, which often leads you to the painful realization that you have to write a script.

Therefore, what you should consider is introducing data templates and layout templates and using template inheritance. In projects at Pentia, we define fields and functionality in one template, then derive another template from it and set the layout on that. Consider a template DocumentData which holds the fields Title and Text, and another, Document, which derives from DocumentData and defines the layout (and other standard values). If I need specific layouts on certain items, I derive another layout template from DocumentData, e.g. WideDocument, which then defines the specific layout. This means that all layout settings are always in the Templates section of Sitecore: easy to find and reusable.
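
To make the inheritance idea concrete, here is a small, simplified model of it. This is not the Sitecore API: the Template class, its StandardLayout field and ResolveLayout are hypothetical, and the model assumes a single base template per template, whereas real Sitecore templates can have several.

```csharp
using System;

// Simplified model of templates with standard values; not the Sitecore API.
public class Template
{
    public string Name;
    public Template Base;          // the template this one derives from, if any
    public string StandardLayout;  // layout set on standard values, or null

    public Template(string name, Template baseTemplate = null, string standardLayout = null)
    {
        Name = name;
        Base = baseTemplate;
        StandardLayout = standardLayout;
    }

    // Walks up the inheritance chain until a template defines a layout.
    public string ResolveLayout()
    {
        for (Template t = this; t != null; t = t.Base)
        {
            if (t.StandardLayout != null)
            {
                return t.StandardLayout;
            }
        }
        return null;
    }
}

public static class LayoutDemo
{
    public static string[] Run()
    {
        // Data template: fields only, no layout.
        var documentData = new Template("DocumentData");
        // Layout templates: derive from the data template and define the layout.
        var document = new Template("Document", documentData, "Standard layout");
        var wideDocument = new Template("WideDocument", documentData, "Wide layout");

        // Introducing a breadcrumb is one change, in one place.
        document.StandardLayout = "Standard layout with breadcrumb";

        return new[]
        {
            document.ResolveLayout(),
            wideDocument.ResolveLayout(),
            documentData.ResolveLayout()
        };
    }
}
```

Every item based on Document picks up the breadcrumb layout without being touched, while the ten special documents get their layout from WideDocument, which still lives in the Templates section.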

Sitecore

Configuration in Sitecore

Ever considered putting environment-specific configuration in the Sitecore database? …well, don’t.

In the official Mailing list module, Sitecore has placed, among other things, the database connection string in the Sitecore database, and we have had nothing but grief from it. Consider moving from development to test to production, or, in the latest example, a customer who wanted an internal test site updated nightly with the production database. Sigh!

In true Sitecore style, we’ve even looked into the possibility of hooking into the database connection of the mailing list module, but no:

  • All database access in the MailingList module goes through the internal class Sitecore.Modules.MailingList.Core.SqlHelper
  • This class is only accessible through the internal property SqlHelper SqlHelper in Sitecore.Modules.MailingList.Core.MailingList
  • SqlHelper opens a database connection in the protected (but not virtual) IDbConnection CreateConnection()
  • This function returns an SqlConnection if the Sitecore field “Database” contains the text “MSSQLSERVER” (*Sigh*), otherwise an OleDbConnection is opened
  • The sql server ConnectionString is read from public static string Sitecore.Modules.MailingList.Core.MailingListSettings.ConnectionString
  • This property reads the value from the field ConnectionString in the Sitecore database

As you can see there is no way of plugging in anywhere…
