Recommended Practises, Search

Why does my Sitecore website perform so badly?

After many years of doing a lot of reviews on Sitecore I would conclude that it more or less always comes down to one single thing when pages perform badly: bad code. And when I say bad code, I mean code iterating an excessive number of items.

Iterating A LOT of items

Previously we used to see a lot of calls to descendants when solutions were using axes, XPath or Sitecore Query, but luckily this is a disappearing trend with the introduction of Sitecore Search in Sitecore 7. This does not mean that there is less bad code – maybe even the contrary. The use of ORM and abstractions in many solutions means that layers of code can easily hide item iterations. Neither of the two patterns or technologies is inherently bad, but they do provide means to hide “bad code”.

The ARGH! Example

The example in the diagram below is from a customer here in Australia who had a problem with a very slow running product search. The diagram breaks down the different parts of the search page and describes the functionality and number of items hit by each part. It was part of the presentation to the client to highlight where the performance bottleneck was (and that Sitecore itself was not the cause of the problem).

After looking through layers and layers of code (3000+ lines to be more specific), we discovered that the code in the different layers was searching Lucene and iterating search results, loading the items individually and recursively running business logic.

A simple search resulting in a page of 50 links would easily take up to 5 minutes, simply because the code was iterating in excess of 1.7 million items from the Sitecore database – which would take time in any solution 🙂

Excessive Item Iterations

To cut a very long story short: always keep an eye on the number of items being iterated – and never rely on caching to do the trick.
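As a minimal sketch of the point above, the following compares loading every search hit before paging with paging the hit ids first and loading only the page. CountingRepository and SearchPageSketch are hypothetical stand-ins invented for this illustration, not Sitecore types; the counter simply represents one item read per call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an expensive item load; not a Sitecore type.
public sealed class CountingRepository
{
    public int Loads { get; private set; }

    public string Load(int id)
    {
        Loads++;                    // every call represents one item read
        return "item-" + id;
    }
}

public static class SearchPageSketch
{
    // Loads every hit and then keeps one page: cost grows with the total hit count.
    public static List<string> LoadAllThenPage(CountingRepository repo, IEnumerable<int> hitIds, int pageSize)
    {
        return hitIds.Select(repo.Load).ToList().Take(pageSize).ToList();
    }

    // Pages the ids first and then loads: cost is bounded by the page size.
    public static List<string> PageThenLoad(CountingRepository repo, IEnumerable<int> hitIds, int pageSize)
    {
        return hitIds.Take(pageSize).Select(repo.Load).ToList();
    }
}
```

Both methods return the same page, but the first touches every hit while the second touches only as many items as the page shows. Counting loads like this in a test or a profiler is one way to keep an eye on the number of items a page really iterates.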

New releases, Sitecore, xDB

Merging Visitor cookies in Sitecore CMS 7.5 and the Experience Database

Sitecore 7.5 hasn’t hit the streets yet, but as rumour has it, it will soon.

One of the highly anticipated features of Sitecore 7.5 is the ability to merge visitors across devices or cookies. This was something which had to be implemented specifically in the past, but will now be part of the platform.

How do I merge two visitors?

First of all, visitors in Sitecore 7.5 are no longer called visitors, but contacts. This is a part of a larger renaming scheme where e.g. visits (sessions) are now interactions and visitors are contacts. The renaming has consequences in the API, in that the DMS (which is now the Experience Database, xDB) API has had a major rework. Therefore expect a slightly harder upgrade path for websites with heavy DMS customizations.

The key to merging contacts lies in identification. Two interactions on two different devices cannot be associated unless something common identifies them. This could be a login, an entered email address or something similar. It’s basically up to you to determine what and how to uniquely identify two contacts on two different interactions as being the same.

Sitecore.Analytics.Tracker.Current.Session.Identify(string key)

After you have determined the appropriate key to uniquely identify a user, it’s as simple as calling the Identify method on the current interaction, ehh session? (Oops, apparently the rename hasn’t been done entirely consistently.)
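Since two interactions can only be matched when they are identified with the same key, it may be worth normalizing the key before identifying. The sketch below is a minimal illustration under that assumption: ContactKeySketch and its methods are hypothetical helpers, and the identify action is passed in as a delegate so the logic can be tried outside Sitecore.

```csharp
using System;

public static class ContactKeySketch
{
    // Hypothetical helper: trims and lower-cases a raw identifier such as an email address.
    // Returns null when there is nothing usable to identify the contact with.
    public static string NormalizeKey(string rawKey)
    {
        if (string.IsNullOrWhiteSpace(rawKey))
            return null;
        return rawKey.Trim().ToLowerInvariant();
    }

    // Calls the supplied identify action only when a usable key exists.
    // In a Sitecore 7.5 site the action would wrap the Identify call quoted above.
    public static bool TryIdentify(string rawKey, Action<string> identify)
    {
        if (identify == null)
            throw new ArgumentNullException("identify");

        string key = NormalizeKey(rawKey);
        if (key == null)
            return false;

        identify(key);
        return true;
    }
}
```

In a site this could be called after a login or form submit, for example as ContactKeySketch.TryIdentify(email, key => Sitecore.Analytics.Tracker.Current.Session.Identify(key)). Whether lower-casing is appropriate depends on the kind of identifier you choose.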

How does it work?

Using the new session handling mechanisms in Sitecore 7.5, Sitecore loads the contact associated with the current interaction’s cookie into session. The contact is then locked in shared session, making sure that two separate interactions cannot change the contact simultaneously.

  1. When calling Identify on the contact in the current interaction, any other contact identified with the same key is loaded using the LoadContactReadOnly in the ContactRepository class.
  2. The newly loaded contact is locked, making sure other interactions are not changing it.
  3. The current interaction is transferred onto the new contact, running the TransferSession pipeline.
  4. MergeContacts is called, passing the current contact as “dying” and the loaded contact as “surviving”.
  5. The MergeContacts pipeline is triggered, which in turn merges tags, counters, automation states, attributes and facets on the contacts.
  6. The new merged contact is now the active contact in the interaction and tracking will continue on this across multiple devices.

Things to keep in mind:

  • Sitecore 7.5 introduces new terms for the Analytics API
  • Expect a development effort when upgrading – dependent on the level of analytics use
  • In Sitecore 7.5, the platform will be able to merge visitors automatically
  • Nothing comes for free – you will have to determine how to uniquely identify contacts, and call the appropriate method
  • All steps of the actual merge are configurable using pipelines

See Nick’s posts on 7.5 for a more in-depth walkthrough:
http://www.techphoria414.com/Blog/2014/June/Copy_of_One_Month_with_Sitecore_7-5_Part_3

Recommended Practises, Sitecore

How do I get the Publishing date for a page in Sitecore CMS?

During the last week I’ve stumbled upon this question multiple times (strangely enough, when dealing with many Sitecore solutions I find myself stumbling upon the same issues, and oddly enough the same issue often arises on different solutions at the same time) and the answer is: “You need to determine what you mean by Publishing Date”.

Getting the date the item was published.

So you mean the date and time the page was transferred from the master database to the web database? Would this be the very first time the page item is transferred to the web database? Or maybe the first time the latest version of the page is transferred? Or maybe the last time the version was transferred?

The answer to all the above would be: There is no such date available in the Sitecore CMS API!

If you still want to pursue this path, John West has a brilliant post apparently solving this: http://www.sitecore.net/Learn/Blogs/Technical-Blogs/John-West-Sitecore-Blog/Posts/2011/08/Intercept-Item-Publishing-with-the-Sitecore-ASPNET-CMS.aspx

But you have to keep a couple of things in mind:

  1. Perhaps most important – does this date really make sense? Remember that this is not the date the page item was last changed, but merely the time of the transfer to the web database.
  2. This date would be the date the latest version was published to the web database.
  3. A full publish of an item would update this date – even if the item already exists in the web database. A full publish removes the items from the web database before a new publish commences.
  4. I am always most reluctant to make changes directly in the web database (as suggested by John West). There are generally many components of Sitecore (indexing, publishing etc.) which are based on items being transferred as-is from master to web. You might get it to work – but will every feature of Sitecore keep working? Correctly?

Getting the date the page was last changed.

Well this one is easy: Item.Statistics.Updated will give you just this, the date/time for the last change – any change – on the version of the item. Setting up workflows and a scheduled publishing strategy (which I would recommend on any solution) will transfer the page changes automatically to the web database, thereby getting an updated item published as soon as possible.

But! If you decide to use this field, keep in mind:

  1. The Updated date/time field will be updated by any change, e.g. both major revisions and just smaller changes. Even fields which are not shown on the page or not edited manually by users will trigger a change. For example: If an editor makes a revision to the text on a page on July 1., but an approver waits two months before approving the change (without content changes), the Updated date will be September 1. (because the item workflow state was changed on this date).
  2. There is no way for the editor to control this field (unless you start meddling with the inner workings of Sitecore, which is never recommended). The field is controlled by Sitecore.

Using your own date/time field

Most likely, when thinking this issue through, you will arrive at the conclusion that the editors – or at least the administrators – need to have total control over the Published Date shown on the page. This of course means that the field will have to be a custom field added to all page items – and the most straightforward solution is to just make the editors set the value when they make a change.

But if you need to automate the date/time value, there are a number of options which spring to mind:

  1. First option is to set the field value when the item is created. This can be done by setting the standard values on the template: http://briancaos.wordpress.com/2011/03/24/initial-field-values-for-sitecore-setting-a-default-future-date/
  2. You could change the date/time if some predetermined fields change (e.g. the content fields shown on the page). This would be done in an event handler for the item saved event: http://www.sitecore.net/Learn/Blogs/Technical-Blogs/John-West-Sitecore-Blog/Posts/2010/11/Intercepting-Item-Updates-with-Sitecore.aspx
  3. You could update the field in a workflow action, e.g. approval. This would require a custom workflow action. See more here: http://sdn.sitecore.net/upload/sitecore6/workflowreference-usletter.pdf
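The decision behind the second option can be sketched independently of Sitecore: only move the custom date when one of the fields shown on the page actually changed. This is a minimal sketch, not a ready-made event handler. PublishedDateSketch and the field names "Title" and "Body" are hypothetical, and the before/after dictionaries stand in for the field values you would read in the save handler.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PublishedDateSketch
{
    // Hypothetical field names; replace with the content fields shown on your pages.
    private static readonly string[] WatchedFields = { "Title", "Body" };

    // Returns true when any watched field differs between the old and new values.
    public static bool ContentChanged(IDictionary<string, string> before, IDictionary<string, string> after)
    {
        return WatchedFields.Any(name =>
            !string.Equals(Get(before, name), Get(after, name), StringComparison.Ordinal));
    }

    // Keeps the existing date unless the content changed.
    public static DateTime NextPublishedDate(DateTime current, bool contentChanged, DateTime now)
    {
        return contentChanged ? now : current;
    }

    // Treats a missing field as an empty value.
    private static string Get(IDictionary<string, string> values, string name)
    {
        string value;
        return values.TryGetValue(name, out value) ? value : string.Empty;
    }
}
```

With this rule, the approval two months later in the earlier example would leave the date untouched, because a workflow state change alters none of the watched fields.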

Things to keep in mind

  • Think carefully what you mean by “Published Date”
  • There is no “Publishing Date” concept in Sitecore.
  • Use the Updated date with care – it will get changed by Sitecore at every change to the item
  • Consider using a custom field for this. There are ways of automating the value in the field without meddling with the inner workings of Sitecore.
Search, Sitecore

Avoiding downtime while rebuilding your Lucene search indexes in Sitecore ASP.NET CMS

This is the first blog post I’ve written since my arrival in Sydney and my new job within Sitecore. It will be about a simple yet strong feature often overlooked in the Lucene indexing capabilities of Sitecore 7: how to avoid downtime on your Lucene indexes.
One of the problems I run into occasionally is that a rebuild of an index, e.g. the web database index, will cause features of the site to not work properly. This is due to the fact that the index is taken offline while rebuilding.

But what causes a rebuild?

A rebuild may be triggered manually through the control panel in Sitecore, or by an indexing strategy such as RemoteRebuild (see more here). But full rebuilds might also be triggered by simply publishing too many items from master to web. This is due to the Threshold setting in e.g. the PublishEndStrategy, which will turn many single-item index updates into one full rebuild if the given threshold is reached (see more here).

How do I avoid downtime then?

Well the solution to the above problem is relatively simple, yet often overlooked: Changing the standard Lucene index handler (the class which actually handles reading and writing to the Lucene index files) from LuceneIndex to SwitchOnRebuildLuceneIndex will remove this problem.

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <contentSearch>
        <configuration type="Sitecore.ContentSearch.LuceneProvider.LuceneSearchConfiguration, Sitecore.ContentSearch.LuceneProvider">
          <indexes hint="list:AddIndex">
            <index id="sitecore_web_index" type="Sitecore.ContentSearch.LuceneProvider.SwitchOnRebuildLuceneIndex, Sitecore.ContentSearch.LuceneProvider">
              <param desc="name">$(id)</param>
              <param desc="folder">$(id)</param>
<!-- ... -->

The SwitchOnRebuildLuceneIndex maintains two separate indexes on disk, one active and one passive. As a rebuild occurs, it is executed on the passive index and when the rebuild completes, the passive becomes active and the active becomes passive. Thereby, there will be no downtime of the index.
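The active/passive idea described above can be illustrated with a small in-memory model. This is only a sketch of the switching principle and not Sitecore's implementation: SwitchOnRebuildSketch is a hypothetical class, lists of strings stand in for the two index folders on disk, and a single rebuild at a time is assumed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory model of the active/passive idea; not Sitecore's implementation.
public sealed class SwitchOnRebuildSketch
{
    private List<string> active = new List<string>();
    private List<string> passive = new List<string>();
    private readonly object gate = new object();

    // Readers always get a snapshot of the last completed build.
    public string[] Read()
    {
        lock (gate)
        {
            return active.ToArray();
        }
    }

    // Rebuilds the passive copy while readers keep using the active one,
    // then swaps the two. Assumes only one rebuild runs at a time.
    public void Rebuild(IEnumerable<string> documents)
    {
        passive.Clear();
        passive.AddRange(documents);

        lock (gate)
        {
            List<string> previous = active;
            active = passive;       // the freshly built copy goes live
            passive = previous;     // the old copy is reused for the next rebuild
        }
    }
}
```

The slow part of the rebuild happens entirely on the copy nobody reads, and the only moment readers are held back is the short swap at the end. The price is the second copy, which is why the disk space matters.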

Things to keep in mind

Therefore, a couple of things to keep in mind:

  • Switch to SwitchOnRebuildLuceneIndex if you want to make sure that you have no downtime on your Sitecore search indexes.
  • Make room on disk for maintaining duplicate indexes.
  • Always choose your Index update strategies carefully.
  • Different server roles might require different index update strategies.
Sitecore, .NET, LINQ

Inheritance techniques in Sitecore ASP.NET CMS

There are two techniques for inheritance which are commonly used in all Sitecore solutions: hierarchical inheritance and template inheritance. Sadly, Sitecore offers no straightforward out-of-the-box API methods for the two techniques. This post describes some simple and neat ways of implementing these techniques – and also shows how to use extension methods in .NET to extend Sitecore and ASP.NET classes. This post was partly inspired by the extension method contest at the Learn Sitecore website.

Template inheritance:

Template inheritance is an integral part of Sitecore which I hope that all of you Sitecore fans out there use extensively in your solutions. At Pentia, template inheritance is so vital that in practice, you will see no item.TemplateID == SomeID or item.TemplateKey == "sometemplate" in any of our code, just like you (hopefully) would never use MyObject.GetType().Name == "MyClass" in C#.
No, we always use the following extension method:

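public static bool IsDerived(this Item item, ID templateId)
{
  // ID.IsNullOrEmpty also covers a null reference, which templateId.IsNull would not.
  if (item == null || ID.IsNullOrEmpty(templateId) || item.Template == null)
    return false;

  // Make sure the template to test against exists in the item's database.
  TemplateItem templateItem = item.Database.Templates[templateId];
  if (templateItem == null)
    return false;

  // True when the item's template is the given template or inherits from it.
  Template template = TemplateManager.GetTemplate(item);
  return template != null && (template.ID == templateItem.ID || template.DescendsFrom(templateItem.ID));
}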

The method is used as item.IsDerived("MyTemplate"), just as you would always do MyObject is MyClass in C#. (The string form assumes an overload which resolves the template by name; only the ID version is shown here.) Sadly, Sitecore offers no obvious way to do the same out-of-the-box….wait… oh yeah, you could naturally do Sitecore.Xml.Xsl.XslHelper.IsItemOfType()

Hierarchical inheritance:

While template inheritance is used to give all content of different types the same properties, hierarchical inheritance in Sitecore is slightly different. This is primarily used to allow a piece of content, a setting or a configuration to propagate down a branch in the tree – i.e. propagate down a number of content types which can be totally unrelated, but grouped together solely because of their content position. For example, this could be used to apply a given design theme to an entire subsite within your site. Using hierarchical inheritance, extension methods and LINQ, this could be accomplished as such (pardon the train-wreck):

protected void Page_Load(object sender, EventArgs e) {
  var theme = this.GetDataSourceItem().GetAncestorOrSelf().First(i => i.IsDerived("HasTheme")).Fields["Theme"].Value;
  //...
}

In this example there are – besides the IsDerived extension method described above – two very useful extension methods:

GetDataSourceItem extends the ASP.NET UserControl to provide easy access to the DataSource set on a sublayout and is implemented as follows:

public static Item GetDataSourceItem(this UserControl control)
{
  Sublayout sublayout = GetSublayout(control);
  if (sublayout == null)
    return null;
  string dataSourcePath = sublayout.DataSource;
  return !String.IsNullOrEmpty(dataSourcePath) ? Context.Database.GetItem(dataSourcePath) : null;
}

GetAncestorOrSelf extends Item to return a list with the item itself and all its parents – and this is naturally where the key to hierarchical inheritance lies:

public static IEnumerable<Item> GetAncestorOrSelf(this Item item)
{
  // Walk from the item up to the root; yields nothing when the item is null.
  while (item != null)
  {
    yield return item;
    item = item.Parent;
  }
}
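The same lookup can be made more forgiving by using FirstOrDefault and a fallback value, so that a branch without any theme does not throw. The sketch below shows the idea on a plain object tree. Node, HierarchySketch and ResolveTheme are hypothetical names invented for this illustration and are not Sitecore types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a content item; not a Sitecore type.
public sealed class Node
{
    public string Name;
    public string Theme;    // null when the node defines no theme
    public Node Parent;
}

public static class HierarchySketch
{
    // Yields the node itself followed by each parent up to the root.
    public static IEnumerable<Node> AncestorsOrSelf(Node node)
    {
        for (Node current = node; current != null; current = current.Parent)
            yield return current;
    }

    // Returns the nearest theme, or the fallback when no ancestor defines one.
    public static string ResolveTheme(Node start, string fallback)
    {
        Node owner = AncestorsOrSelf(start).FirstOrDefault(n => n.Theme != null);
        return owner != null ? owner.Theme : fallback;
    }
}
```

Because the walk starts at the item itself, a theme set lower in the tree overrides one set higher up, which is the behaviour you usually want from hierarchical inheritance.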

I hope this post shows you how some simple extension methods can give you a lot of power in your solutions.

.NET, Architecture, publishing, Sitecore

Drawing the customization line

Sitecore is an extremely flexible system; in fact most of Sitecore is built with the extensibility and customization features it provides. These include pipelines, events, commands, SheerUI, providers and more. All in all: there is not much which cannot be achieved in Sitecore and only the imagination of you and your client sets the limits of customizations – and believe me, in my work of helping out clients and Sitecore partners all over the world, I’ve seen Sitecore completely twisted in many ways. I’ve seen Sitecore’s entire layout engine replaced with an item based layout definition, with subitems representing placeholders and subitems representing presentation elements. I’ve seen publishing and item saved pipelines so crammed full of custom functionality that performance was non-existent and I’ve seen XSLTs with one line of code: a call to an extension function which would return an entire HTML structure as a string.

So where do we draw the line in the holy name of bending functionality and giving the customer “what they want”? Remember:

  • Convincing the client to choose standard functionality is also an option.
  • Each change of standard Sitecore functionality is potentially expensive – for you and for the client.
  • If possible, do not replace Sitecore functionality, extend it.
  • Think about what you put into Sitecore: Content and functionality can be accessed through .NET code and does not necessarily need to be added to the Sitecore DB’s or in a pipeline.
  • Keep track of your customizations – source control them and document them.
  • Remember that your customizations will have to upgrade with Sitecore.
  • Stay true to yourself: If it feels wrong, don’t do it.

In short, just because you can does not mean that you should :-)

Introduction, Sitecore

Getting your Sitecore project right

This post is for you clients who have already selected Sitecore as your new website platform and are starting up a new Sitecore project.
Here are a couple of my thoughts on how you can get your Sitecore project on the right track from the beginning, by setting the stage for a good collaboration with your implementation partner.

The website you are building today should also be the website for tomorrow and hence the website you are building should be extendable, flexible and scalable. It is absolutely possible to build a Sitecore web platform which will last many iterations – provided that you take this into account early in the process.

Select your implementation partner carefully

Your implementation partner is the single most important collaboration partner in your project. They bind everything together: requirement specifications, user experience design, hardware, software, domain knowledge and more, and are therefore crucial for the project’s success. Therefore your primary focus initially in your project should without doubt be to find the right implementation partner.

In my opinion, what you should focus on is:

Human chemistry: A website is not primarily a technological project, but a communication project. Therefore select a partner with whom you can communicate openly and freely. If you sense that they are listening, factors such as domain knowledge are less important.

Experience: Sitecore is not a difficult tool to learn – but it takes time to master. Therefore, try to find a partner which has multiple large scale projects under its belt, and preferably with projects which have undergone a number of iterations.

Technology based: In my experience, companies which are technology based, i.e. which focus primarily on the integration and platform parts of the solutions, make more future proof Sitecore solutions than user experience driven companies such as design agencies. Therefore if you are looking for someone to build your future web platform as opposed to just your next website, opt for a company with vast knowledge of Microsoft .NET and surrounding technologies.

References: Most implementation partners can most likely show an impressive list of references – but please do not stop there. Call their references and enquire about support, quality etc. Hearing whether their existing customers have gotten value for money is very useful.

Don’t be too specific in your requirement specifications

My suggestion is that you use your implementation partner as a sparring partner on requirements. Remember that these guys have built other solutions before yours and might bring experience, skills and functionality which will benefit you. Also, allowing multiple implementation partners to suggest different solutions to your website’s objective – as opposed to an RFP checklist – will allow you to better evaluate their creativity.

Therefore, in the specifications, try to explain the objectives you have for your company, users or editors, instead of the precise functionality. At Pentia we have had multiple requests for debate forum functionality in solutions – which in our experience is a prime example of a specific functionality which is often never used by users. Had the clients explained which objective they wanted to achieve on the website, instead of the specific functionality, we could have advised better, earlier in the process and given more value to the client. By the way: in most cases we managed to dissuade the clients from actually implementing the debate forum, and used the precious development time for something much more valuable.

In short: Requirements change. Therefore, being too specific and detailed about functionality already in the RFP process will most likely get you a whole lot of expensive, unused functionality.

Be open about your development budget

This is in my book a no-brainer. The only reasons for not being open about the budget are if you adopt the “they-are-all-thieves-and-robbers” attitude or if you hope to haggle your way to a cheaper website. In both cases, you are doing yourself and your website a whole lot of damage.

First of all, you have to trust your implementation partner, as they hold an immense power to make your project a success or failure. If you don’t, find another partner. Secondly, this is not a standard product you are buying. If you push your implementation partner on time or money, they have but one place to push back: quality. This basically means that your solution will be in a worse state, bringing lower reliability or higher support cost.

Therefore, a selection process is not about getting the lowest price or best solution description. It’s all about finding the implementation partner which you trust the most. And if you have found that implementation partner, why not be open about almost everything, including budget?

Bring your partners on board early

There is a lot of benefit in bringing in all your partners as early in the process as possible – this means both strategy and design partners as well as implementation and hosting partners. Each domain has something to bring to the process and can potentially save you a lot of money and hassle. Getting the implementation partner and hosting partner to talk together as early as possible can save a lot of time in the deployment process, and in my experience, by getting the implementation partner involved in the strategy and graphical design process a lot of hours can be saved in communication afterwards. Furthermore implementation partners often have prebuilt functionality which – if it fits the project – can save you a lot of time and money. The earlier this is brought forward, the easier it is to fit into any graphical design or information architecture.

 
