Sitecore xDB Personalisation at the client

Nick Hills headed over to SUGCON last week to share his infinite wisdom with the rest of the Sitecore community. If you missed it or want a recap, you can listen to his talk below:

This talk shows off techniques that let you combine two great tools: the personalisation power of xDB and the benefits of CDN edge-caching. Editors can configure and design personalisation rules as normal, yet still harness all the performance gains that CDNs offer.
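
For a flavour of what personalisation at the client can look like, here’s a minimal sketch of one common pattern: serve the page itself from the CDN edge cache, and pull just the personalised fragments from an uncached origin endpoint where the xDB rules run. The endpoint name and response shape below are illustrative assumptions, not the exact API from the talk:

```typescript
// Minimal sketch of client-side personalisation over an edge-cached page.
// The endpoint (/api/personalisation) and response shape are illustrative
// assumptions, not the specific API shown in the talk.
interface PersonalisationFragment {
  placeholderId: string; // id of the element to personalise
  html: string;          // personalised markup rendered by the origin
}

async function applyPersonalisation(): Promise<void> {
  // The page itself comes from the CDN edge cache; only this small request
  // goes back to the origin, where the xDB personalisation rules run.
  const response = await fetch("/api/personalisation", {
    credentials: "include", // send the analytics/session cookie to xDB
  });
  if (!response.ok) return; // fall back to the default (cached) content

  const fragments: PersonalisationFragment[] = await response.json();
  for (const fragment of fragments) {
    const target = document.getElementById(fragment.placeholderId);
    if (target) {
      target.innerHTML = fragment.html;
    }
  }
}

document.addEventListener("DOMContentLoaded", () => {
  void applyPersonalisation();
});
```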

He’s also handily put together a whitepaper weighing up the pros and cons of the various approaches – http://blog.boro2g.co.uk/personalization-at-scale-whitepaper/

Get in touch if you want to find out more on +44 (0)117 932 7700 or info@trueclarity.co.uk

Using Cloudflare as a CDN – a review

Recently one of our clients was experiencing an increase in site downtime. During our investigation of the outage incidents we discovered that the site was increasingly becoming a victim of DoS (denial of service) attacks.

From the data we looked at, it appeared the ‘hacker’ would trawl the site, homing in on the pages with the longest response times, and then repeatedly hit those pages with requests, using up resources on the site until the CPU on the database server maxed out and the site went down.

Our client hosts with Rackspace, who offer a security solution, so we asked them for pricing. They suggested that their managed service would be rather expensive for our needs and recommended we take a look at Cloudflare.

Cloudflare offers a low-cost (entry-level plans are free) Content Delivery Network which enables you to save bandwidth and reduce requests to your server by caching some content. In addition (and this was the feature we were most interested in), Cloudflare offers built-in security protection to guard against DoS attacks.

Both the caching and security settings are highly configurable through an easy-to-use interface, the help documentation is clear and well written, and support is good (support tickets are prioritised according to the plan you’re on, so clients on paid plans get priority over those on free plans, which seems fair).

Cloudflare is amazingly simple and low risk to implement. The simplest way is to delegate the DNS for your top-level domain, e.g. example.com, to Cloudflare, who take over the management of your zone file. You can then choose which of your zone file entries you want to send through Cloudflare and which you don’t. You can set Cloudflare up ready to go with all services in ‘pause’ mode, which means that when your DNS initially points to them they don’t do anything other than relay requests.

If you (or your IT department) aren’t happy to delegate the entire DNS for your domain (maybe you have internal systems running on that domain), it is possible to get a CNAME record set up by Cloudflare for a subdomain, e.g. www.example.com. This is the route we needed to go down for our client, and this option does require you to be on a paid-for plan (we went for Business at $200 per website per month).

The steps we followed to implement Cloudflare were as follows:

1) Set up a Cloudflare account and added card details for the paid-for plan
2) Requested a CNAME record from Cloudflare support (we got this within 24 hours)
3) Were given a TXT record by Cloudflare to add to the DNS for our example.com domain to allow them to take control
4) Once that was done, Cloudflare gave us a CNAME record for the DNS entry
5) The client reduced the TTL on the domain
6) We set up all the configuration of the www.example.com domain in Cloudflare but left it ‘paused’
7) The client added the CNAME record to the DNS and, once we’d waited for the TTL to expire, we did a tracert to confirm we were actually pointing at Cloudflare (see the sketch after this list)
8) We then did the cool bit: pressing the ‘unpause’ button and sending users through the CDN
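
As a rough equivalent of the tracert check in step 7, a small Node/TypeScript sketch can confirm that the subdomain resolves through Cloudflare. The hostname is a placeholder, and the comment about the CNAME target reflects what Cloudflare typically issues for CNAME setups:

```typescript
// Quick check that www.example.com now points at Cloudflare.
// The hostname is a placeholder for your own domain.
import { resolveCname, resolve4 } from "node:dns/promises";

async function checkDelegation(hostname: string): Promise<void> {
  // For CNAME (partial) setups, Cloudflare typically issues a target
  // ending in .cdn.cloudflare.net.
  try {
    const cnames = await resolveCname(hostname);
    console.log(`CNAME records for ${hostname}:`, cnames);
  } catch {
    console.log(`No CNAME record found for ${hostname}`);
  }

  // The resolved A records should fall inside Cloudflare's published ranges.
  const addresses = await resolve4(hostname);
  console.log(`A records for ${hostname}:`, addresses);
}

void checkDelegation("www.example.com");
```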

We gave the site a smoke test and everything seemed to be working as expected. During the day we then proceeded to ‘tune’ Cloudflare by gradually turning on the various options that allow you to cache static content (Cloudflare provides a handy list of file extensions it treats as ‘static’, and you can use page rules to bypass these or to cache more file types – see the sketch below).
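
Page rules can also be managed through Cloudflare’s v4 API rather than the dashboard. Here’s a hedged sketch of what that looks like (the zone ID, API token and URL pattern are placeholders; we used the dashboard for this work):

```typescript
// Sketch: create a Cloudflare page rule via the v4 API to cache everything
// under a given URL pattern. Zone ID, API token and the URL pattern are
// placeholders - substitute your own values.
const ZONE_ID = "your-zone-id";
const API_TOKEN = "your-api-token";

async function createCacheEverythingRule(pattern: string): Promise<void> {
  const response = await fetch(
    `https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/pagerules`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${API_TOKEN}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        targets: [
          { target: "url", constraint: { operator: "matches", value: pattern } },
        ],
        actions: [{ id: "cache_level", value: "cache_everything" }],
        status: "active",
      }),
    }
  );
  const result = await response.json();
  console.log("Page rule created:", result.success);
}

void createCacheEverythingRule("www.example.com/assets/*");
```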

Each time we made a change we checked the site and made sure everything looked OK before making the next change. We also checked that real traffic wasn’t being blocked, both by looking at Google Analytics to ensure there wasn’t a sudden drop in activity and by asking Rackspace to ensure that all Cloudflare IP addresses (again, there’s a useful list – see below) were whitelisted.
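
Cloudflare publishes those IP ranges as plain text at https://www.cloudflare.com/ips-v4 and https://www.cloudflare.com/ips-v6, so pulling down the current list to hand to your hosting provider is straightforward. A small sketch:

```typescript
// Fetch Cloudflare's published IP ranges so they can be whitelisted at the
// hosting provider. Both URLs return one CIDR range per line.
async function fetchCloudflareRanges(): Promise<string[]> {
  const urls = [
    "https://www.cloudflare.com/ips-v4",
    "https://www.cloudflare.com/ips-v6",
  ];
  const ranges: string[] = [];
  for (const url of urls) {
    const response = await fetch(url);
    const body = await response.text();
    ranges.push(...body.trim().split("\n"));
  }
  return ranges;
}

fetchCloudflareRanges().then((ranges) => {
  console.log(`Cloudflare currently publishes ${ranges.length} ranges:`);
  ranges.forEach((range) => console.log(range));
});
```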

At the end of the first couple of days of using Cloudflare we had enough data to see that it was making a difference. It had saved lots of requests (almost 50% of all requests were being served from the Cloudflare cache) and had blocked over 100 threats (with the security setting on ‘low’).

[Screenshots: Cloudflare dashboard showing requests served from cache and threats blocked]

The website ‘felt’ much, much faster from a user perspective, although our external monitoring wasn’t reflecting this, which somewhat confused us. This must be something Cloudflare get asked about on a regular basis, as they give a very clear response to it at http://blog.cloudflare.com/ttfb-time-to-first-byte-considered-meaningles/.

So the site was faster, we were blocking some hacking attempts, and we were saving bandwidth – all looked good. However, when we looked at our IIS logs (a quick way to scan them is sketched below) we could see that we were still getting some bad HTTP requests (PROPFIND, COOK and OPTIONS requests for non-existent URLs) and attempts at XSS and SQL injection. Our site/code was rejecting these requests, as our IIS filters and security settings meant the hackers weren’t getting anywhere, but ideally we didn’t want these requests hitting our server at all and wanted Cloudflare to catch and block them.
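
A quick and dirty way to get a feel for the scale of the problem is to scan the W3C logs for suspicious verbs. A rough sketch (the log path is a placeholder, and the verb list is just what we happened to see):

```typescript
// Rough sketch: scan an IIS W3C log for suspicious HTTP verbs.
// The log path is a placeholder; field positions are read from the
// #Fields directive rather than assuming the default W3C layout.
import { createReadStream } from "node:fs";
import { createInterface } from "node:readline";

const SUSPICIOUS = new Set(["PROPFIND", "COOK", "OPTIONS", "TRACE"]);

async function scanLog(path: string): Promise<void> {
  const lines = createInterface({ input: createReadStream(path) });
  let methodIndex = -1;

  for await (const line of lines) {
    if (line.startsWith("#Fields:")) {
      // e.g. "#Fields: date time s-ip cs-method cs-uri-stem ..."
      methodIndex = line
        .replace("#Fields:", "")
        .trim()
        .split(" ")
        .indexOf("cs-method");
      continue;
    }
    if (line.startsWith("#") || methodIndex < 0) continue;

    const fields = line.split(" ");
    if (SUSPICIOUS.has(fields[methodIndex])) {
      console.log(line);
    }
  }
}

void scanLog("C:\\inetpub\\logs\\LogFiles\\W3SVC1\\u_ex150101.log");
```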

We then took advantage of the Cloudflare WAF (Web Application Firewall), and this is now blocking most of the ‘dodgy’-looking requests we’d seen in our IIS logs. We’ve raised a support ticket with Cloudflare about the few remaining dodgy requests, and they’ve responded very promptly to say they will add a WAF rule to block those. If they come through on that promise we’ll be very happy.

[Screenshot: Cloudflare WAF rules]

All in all, Cloudflare appears to deliver on its promises, is incredibly easy to set up and configure, and support seems good. There are lots of options we’ve not explored yet, such as using their API to automatically clear the cache on a publish from Sitecore (sketched below), which would enable us to cache more than static content. For a relatively low cost it certainly seems to offer a good alternative to Akamai.
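
For example, the cache clear could be triggered from a Sitecore publish event calling Cloudflare’s purge endpoint. A minimal sketch of the API call itself (zone ID and token are placeholders; wiring it into the Sitecore publish pipeline is left as an exercise):

```typescript
// Sketch: purge the Cloudflare cache via the v4 API, e.g. triggered after
// a Sitecore publish. Zone ID and API token are placeholders.
const ZONE_ID = "your-zone-id";
const API_TOKEN = "your-api-token";

async function purgeCache(urls?: string[]): Promise<boolean> {
  // Purge everything, or just the specific URLs that were republished.
  const body = urls ? { files: urls } : { purge_everything: true };
  const response = await fetch(
    `https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/purge_cache`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${API_TOKEN}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    }
  );
  const result = await response.json();
  return result.success === true;
}

// e.g. after republishing the homepage:
void purgeCache(["https://www.example.com/"]);
```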

A Tale of Two Architects

Ever got caught up in a debate about up-front architecture versus evolutionary architecture? I know I have, but which is right? I believe it all depends on the business context and the information you have available.

In my mind both approaches are valid, just generally not on the same set of problems. Deciding which architectural approach to take deserves serious consideration, and if you’re not having a long hard think about which approach to take, you probably should be.

This post hopes to equip you with some basic ideas for having a meaningful conversation about how you should approach your project from an architectural point of view: whether it’s the design-it-before-you-code-it approach, or quickly starting on the implementation and using what you learn to feed back into the direction you should be heading.

I’m going to split the approaches to architecture into the following two broad types:

Architecture Approach A

The first approach to architecture involves a lot of planning and discussion and a general effort to deeply understand the problem and map out the system up front. Typically there will be a person or delegation of people responsible for the architecture as their primary role, and a separate team responsible for software implementation will build against the architectural vision. It will generally involve a lot of meetings and documentation.

Architecture Approach B

The second approach is to transfer the responsibility for architecture to the implementation team and allow them to grow the architecture organically, based on the continued learning and insight gained during the build and on feedback from stakeholders as they see the software evolve.

The Winning Architectural Approach

I believe both approaches will get there in the end with a determined team (so simply delivering a project using one approach doesn’t prove the choice was right, only that you have a good team). However, I think there are clear cases when one approach is better than the other.

The Architect Who Faces Certainty

An architect who faces certainty usually works in a business whose landscape is unlikely to change, with users and customers that have clear expectations and highly predictable requirements. This is a clear case for Architecture Approach A – the Certain Approach.

The Architect Who Faces Uncertainty

An architect who faces uncertainty works in a business environment with a volatile market and whimsical users/customers whose needs are unpredictable. This is a clear case for Architecture Approach B – the Uncertain Approach.

The Future of Architecture

Choosing the right architectural approach really depends on how much information you have about the future. If you have certainty, making all your decisions up front will get your project where it needs to be efficiently, as everything can be mapped out. If you do not have certainty, you need an evolutionary approach to architecture, based on gathering the information you need for the next set of decisions. Typically that information is gathered by implementing part of your architecture and extracting insight from real results.

If the time taken to plan is spiralling out of control, conversations involve a lot of “what if this happened in the future?”, and you are hedging your design against things that may never happen, then you are probably trying to plan up front against uncertainty – a sure sign you need to rethink your approach.

Another way of looking at the two approaches is from a scientific point of view. When trying to model something we understand, it’s easy to come up with a sound model. When modelling something we don’t understand, we must experiment in order to piece the model together.

As computing power continues to increase and software tools become more powerful, it’s becoming harder and harder to model the systems our tools are capable of building, and to predict how users/customers will use them. Companies should be comfortable with both architectural approaches to ensure the best chance of success in developing software.

Designing software at the start of a project in a certain world is well understood, and I don’t think there is much more to teach anyone about how to do it well. Evolutionary architecture, on the other hand, is a seldom-discussed topic in software literature, and there are similar-sounding labels for other ideas. To feel your way through a project, implementing the right parts at the right time to get the right feedback, is tough and ultimately an art form – one with much scope for new thought, but already used with great success. I believe it is vital to invest time in exploring the benefits that evolutionary architecture can bring to delivering large-scale, customer-facing web platforms that inherently have many unknowns, without being paralysed by the fear of implementing without certainty.

Whatever you decide, uncertainty will be your guide 🙂