Using Cloudflare as a CDN – a review

Recently one of our clients was experiencing an increase in site downtime. During our investigation of the outage incidents we discovered that the site was increasingly becoming a victim of DoS (denial-of-service) attacks.

From the data we looked at, it appeared the ‘hacker’ would trawl the site, homing in on the pages with the longest response times, and then hit those pages repeatedly with requests. This used up resources on the site, eventually causing the database CPU to max out and the site to go down.

Our client hosts with Rackspace, who offer a security solution, so we asked them for pricing. They suggested that their managed service would be rather expensive for our needs and recommended we take a look at Cloudflare.

Cloudflare offers a low-cost (entry-level plans are free) Content Delivery Network which enables you to save bandwidth and reduce requests to your server by caching some content. In addition (and this was the feature we were most interested in), Cloudflare offers built-in security protection to guard against DoS attacks.

Both the caching and security settings are highly configurable through an easy-to-use interface, the help documentation is clear and well written, and support is good. Support tickets are prioritised according to the plan you’re on: clients on paid plans get priority over those on free plans, which seems fair.

Cloudflare is amazingly simple and low risk to implement. The simplest way is to delegate the DNS for your root domain (e.g. example.com) to Cloudflare, who take over the management of your zone file. You can then choose which of your zone file entries you want to send through Cloudflare and which you don’t. You can set Cloudflare up ready to go with all services in ‘pause’ mode, which means that when your DNS does initially point to them they don’t do anything other than relay requests.

If you (or your IT department) aren’t happy to delegate the entire DNS for your domain (maybe you have internal systems running on that domain) then it is possible to have Cloudflare set up a CNAME record for a subdomain, e.g. http://www.example.com. This is the route we needed to go down for our client, and this option does require you to be on a paid plan (we went for Business at $200 per website per month).

The steps we followed for implementing Cloudflare were as follows:

1) We set up a Cloudflare account and added card details for the paid plan
2) We requested a CNAME record from Cloudflare support (we got this in 24 hours)
3) Cloudflare gave us a TXT record to add to the DNS for our example.com domain to allow them to take control
4) Once that was done, Cloudflare gave us the CNAME record to add to the DNS
5) The client reduced the TTL on the domain
6) We set up all the configuration of the http://www.example.com domain in Cloudflare but set it to ‘pause’
7) The client added the CNAME record to the DNS and, once we’d waited for the TTL to expire, we ran a tracert to check that we were actually pointing at Cloudflare
8) We then did the cool bit, which was pressing the ‘unpause’ button and sending users through the CDN
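
The check in step 7 could also be done in code rather than by reading tracert output. The following is a minimal Python sketch of that idea, not what we ran at the time. It tests whether an address falls inside a list of network ranges. The ranges shown are only a small sample that we believe appears in Cloudflare's published list, so treat them as an assumption and load the current list before relying on them. The function names are our own.

```python
import ipaddress
import socket

# Sample of Cloudflare IPv4 ranges (assumption: check Cloudflare's published
# list for the current set before relying on these).
SAMPLE_CLOUDFLARE_RANGES = [
    "173.245.48.0/20",
    "103.21.244.0/22",
    "104.16.0.0/13",
    "172.64.0.0/13",
]


def is_in_ranges(ip, ranges=SAMPLE_CLOUDFLARE_RANGES):
    """Return True if the IP address falls inside any of the CIDR ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in ranges)


def points_at_cloudflare(hostname, ranges=SAMPLE_CLOUDFLARE_RANGES):
    """Resolve the hostname (needs network access) and test its IPv4 address."""
    return is_in_ranges(socket.gethostbyname(hostname), ranges)
```

Calling points_at_cloudflare("www.example.com") after the TTL has expired would then answer the same question as the tracert, provided the range list is complete and up to date.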

We gave the site a smoke test and everything seemed to be working as expected. During the day we then proceeded to ‘tune’ Cloudflare by gradually turning on the various options that allow you to cache static content (Cloudflare provide a handy list of the file extensions they treat as ‘static’ files, and you can use page rules to bypass these or to cache more file types).
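
To make the idea of a ‘static’ extension list concrete, here is a small Python sketch of the decision involved. It only models the concept and is not how Cloudflare matches page rules. The extension set is an illustrative subset we have assumed, not Cloudflare's actual list, and the `extra` and `bypass` parameters are hypothetical stand-ins for rules that cache more file types or bypass the defaults.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Illustrative subset only (assumption); the real list is in Cloudflare's docs.
STATIC_EXTENSIONS = {"css", "js", "jpg", "jpeg", "gif", "png", "ico", "svg", "pdf"}


def is_static(url, extra=(), bypass=()):
    """Return True if the URL path ends in an extension treated as static.

    extra:  extensions to cache in addition to the defaults
    bypass: extensions to exclude even if they are in the defaults
    """
    # Only the path matters: the query string and host are ignored.
    suffix = PurePosixPath(urlsplit(url).path).suffix.lstrip(".").lower()
    allowed = (STATIC_EXTENSIONS | set(extra)) - set(bypass)
    return suffix in allowed
```

For example, is_static("http://www.example.com/css/site.css?v=2") is true with the defaults, while a page URL with no extension is not treated as static.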

Each time we made a change we checked the site and made sure everything looked OK before making the next change. We also checked that real traffic wasn’t being blocked by looking at Google Analytics to ensure there wasn’t a sudden drop in activity and asking Rackspace to ensure that all Cloudflare IP addresses (again there’s a useful list) were whitelisted.

At the end of the first couple of days of using Cloudflare we had enough data to see that it was making a difference. It had saved lots of requests (almost 50% of all requests were coming from Cloudflare cache) and had blocked over 100 threats (with the security setting on ‘low’).

The website ‘felt’ much, much faster from a user perspective, although our external monitoring wasn’t reflecting this, which somewhat confused us. This must be something Cloudflare get asked about on a regular basis, and they give a very clear response at http://blog.cloudflare.com/ttfb-time-to-first-byte-considered-meaningles/.

So the site was faster, we were blocking some hacking attempts and we were saving bandwidth; all looked good. However, we looked at the IIS logs and could see that we were still getting some bad HTTP requests (PROPFIND, COOK, OPTIONS requests for non-existent URLs) and attempts at XSS and SQL injection. Our site was rejecting these requests, as our IIS filters and security settings meant the hackers weren’t getting anywhere, but ideally we didn’t want these requests hitting our server at all and wanted Cloudflare to catch and block them.
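
Spotting these requests by eye in a large log is tedious, so here is a minimal Python sketch of how the unusual methods could be counted. It assumes the logs are in the W3C extended format with a ‘#Fields:’ header line and a cs-method column, and that GET, POST and HEAD are the only methods the site expects. The sample lines are made up for illustration, and the function name is our own.

```python
from collections import Counter

# Methods the site is expected to serve (assumption for this sketch).
EXPECTED_METHODS = {"GET", "POST", "HEAD"}

# Made-up sample lines in W3C extended log format, for illustration only.
SAMPLE_LOG = [
    "#Fields: date time cs-method cs-uri-stem sc-status",
    "2014-01-01 10:00:00 GET /default.aspx 200",
    "2014-01-01 10:00:01 PROPFIND /webdav 404",
    "2014-01-01 10:00:02 OPTIONS /nothing 404",
    "2014-01-01 10:00:03 PROPFIND /x 404",
]


def unexpected_methods(log_lines, expected=EXPECTED_METHODS):
    """Count requests whose cs-method is not in the expected set."""
    fields = []
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The directive names the columns used by the rows that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or "cs-method" not in fields:
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # skip malformed rows
        method = values[fields.index("cs-method")]
        if method not in expected:
            counts[method] += 1
    return counts
```

Running unexpected_methods over each day's log file before and after a WAF change would give a rough measure of how many of these requests are still reaching the server.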

We then took advantage of the Cloudflare WAF (Web Application Firewall) and this is now blocking most of the ‘dodgy’ looking requests we’ve seen in our IIS logs. We’ve raised a support ticket with Cloudflare support about the few remaining dodgy requests and they’ve responded very promptly to say they will add a WAF rule to block those. If they come through on that promise we’ll be very happy.

All in all, Cloudflare appears to deliver on its promises, is incredibly easy to set up and configure, and support seems good. There are lots of options we’ve not explored yet, such as using their API to automatically clear the cache on a publish from Sitecore, which would enable us to cache more than static content. For a relatively low cost it certainly seems to offer a good alternative to Akamai.
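
As a rough idea of what clearing the cache on publish might involve, here is a Python sketch that builds a purge request for a list of URLs. We have not implemented this. It assumes the purge_cache endpoint of Cloudflare's v4 API with bearer-token authentication, which may differ from the API available on your plan or at the time of writing. The zone ID and token are placeholders, the function name is our own, and the Sitecore publish hook that would call it is not shown.

```python
import json
import urllib.request

# Base URL of the Cloudflare v4 API (assumption: check the current API docs).
API_BASE = "https://api.cloudflare.com/client/v4"


def build_purge_request(zone_id, api_token, urls):
    """Build (but do not send) a request asking Cloudflare to purge the given URLs."""
    body = json.dumps({"files": list(urls)}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/zones/{zone_id}/purge_cache",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )


# To send it (needs network access and real credentials):
#   urllib.request.urlopen(build_purge_request(zone_id, api_token, urls))
```

Building the request separately from sending it keeps the sketch easy to inspect and test without touching a live zone.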

Going backwards is sometimes the only way to go forwards

I was recently drafted in as Project Manager on a redesign project for one of our clients. The project had been underway for several weeks, the team were all very busy, the ‘work in progress’ boards showed that lots was being worked on but the ‘accepted’ and ‘live’ parts of the board were worryingly empty. There was a project plan but it was clear the team weren’t delivering against it.

I observed what was happening in the team over the course of the next week or so whilst trying to get my head round the requirements. The team were doing daily stand-ups, reporting in on what they were working on, but everyone seemed to be working on unconnected component-level user stories.

The team were complaining about the designs, they were inconsistent, kept changing, looked OK for print but wouldn’t work for the web, didn’t work if the content (which was content managed) got longer etc. etc. etc.

Many stories had been started but couldn’t be finished because they were blocked for one reason or another. If the team got blocked, they just started something else.

The team were demoralised, they weren’t getting any sense of progress, couldn’t see the big picture and nobody was doing anything about their frustrations.

One day I asked a question about how we were going to put the redesigned site live. The customer was going to need to do some content population in the live CMS so we would need to let them do that without the risk of the new pages going live before the client was ready. Some of the content on the existing site was going to be re-used but would need to be available in the new designs. Stakeholders would want to be able to see and approve the new site before it went live but this categorically couldn’t be on a URL that just anyone could access – even within the organisation concerned access to the new site would need to be restricted.

When I asked the question everyone in the room looked blank, the team hadn’t thought about this, nobody had asked the question before and there were no ideas.

So we stopped. Mad as it sounds, on a project which was running behind time, running over budget and with nothing actually delivered we decided to stop work. Anything we did was pointless, we weren’t getting anything accepted or live and we had no plan for getting things live.

We don’t like to deliver bad news, but if we have bad news we don’t try and hide it and we deliver it early. We called the customer, we told them we had a problem, we said we were behind plan and that we were going to stop and re-think the approach. Understandably they were concerned, they were adamant that the dates we had given them weren’t moveable, they said they didn’t want to have another call like that again and that they wanted us to update them when we had a new plan.

The senior developer and I then got together and talked about the problem. We took a fresh look at what we needed to deliver and simply printed out in colour all the pages we needed to build. Whilst our backlog of component level stories looked scarily unachievable, the number of pages we needed to build looked far less daunting.

We had a team of 3 developers on the project. All we needed to do was a page a day between the three developers and we’d still have 2 weeks before the go live date for amends, launch plan and as contingency. Was a page a day feasible or should we ask each developer to produce one page each every 3 days? We talked about the strengths of the team – one of the team was a whizz at HTML/CSS, another had a lot of Sitecore backend experience, another was a good all-rounder.

The Senior Developer brought the other developers into the room and put the ‘page a day’ challenge to them. They agreed they’d all prefer to work together, doing the things they were each best at, and they were happy to commit to doing a page a day with a ‘we don’t go home unless we’ve met our goal’ attitude. They were fired up: they had something tangible to aim for, and it would be really easy for them to know if they’d achieved what they wanted to.

My job as the Project Manager having got that commitment from the team was to make their jobs as easy as possible – clearing the path ahead of them.

I talked to the client about the inconsistencies in the designs and we consolidated a number of the components which were similar thereby reducing the work that was needed (and producing a better user experience). I asked the designer for an updated style guide confirming which fonts, colours, buttons should be used where and when.  We agreed the style guide was the definitive version of what we should follow even if the Photoshop files we had didn’t.  Before we started work on any page I suggested that we’d do a final sanity check that we were working to a confirmed/signed off design to minimise changes later.

We agreed that any ‘changes’ after delivery would be added to a backlog which we wouldn’t look at until after all the main work was done. We’d then prioritise the changes and decide which were ‘must have’ for launch and get as far down them as we could. If there were more ‘must have’ changes than there was time available the client accepted the fact the date would need to slip but that the slippage would be down to the organisational wants/needs and not our inability to deliver what we’d committed to.

We told the client that we wouldn’t start any story that we weren’t confident we could finish and get approval on. We set the client expectations about the time/effort this would require on their part and got their commitment to make the relevant people available on the right days to sign off the work.

We agreed that getting a page ‘done’ for the customer to sign off was getting it all the way through to a live environment and onto a URL to which we could restrict access. This required a change to the architectural approach and we had to spend several days unpicking and re-doing some work we’d already done. But it was absolutely necessary.

Our new plan and approach massively de-risked the launch of the new site. Code was going to be deployed up to live every day. Those that had access would be able to see the site coming together on the live servers, they could update content on the old site and see it in the new designs, they could add new content. In short on go live day, all we needed to do was a configuration change to point the site URL at the new site we’d created – no downtime for the end users or for the content editors.

With renewed vigour the team jumped on the new plan. We stuck those printed-out designs on the wall in ‘to do’, ‘doing’ and ‘done’ columns, and every day those designs would move across the columns and we’d see the progress. The increased production with the team working together rather than independently was amazing to see. If one person finished their piece of the jigsaw early they’d help someone else out. Every day they were doing a new page; some days they did 1.5 or 2 pages.

Daily stand-ups became unnecessary: the team were working so closely together with the same daily goal that they were communicating non-stop all day across the desks. I didn’t need to ask whether they were on track or how they were doing – I could see and hear it. The client didn’t need updated project plans or status reports – the team’s progress was visible every day by the release of something new.

The team delivered the site on time and under budget. The client was able to use the spare budget to include some changes and new features they wanted. The launch day was as smooth a transition as anticipated, literally a push button exercise to switch the new site over. The client was over the moon. The team were buzzing and had a real sense of ownership and pride in the end result.

It was a real team effort and a project I thoroughly enjoyed working on (even at the most stressful times) with a great team of committed professionals (you know who you are).

Here’s what we/I learnt:
• Spend enough time planning up front – don’t be tempted to dive straight in to the work
• Don’t assume every project or piece of work is the same and that processes which have worked before will work again – look at each challenge with a fresh pair of eyes
• The power of a team is far greater than the sum of the individuals – play to your strengths
• Think about getting things live early and how you’ll launch – de-risk big bang launches
• Don’t start what you can’t finish
• Tackle impediments and issues head-on (frustrations and moans often hide an impediment)
• If it’s not working don’t be afraid to stop and rip up the rulebook
• Find a way to see your progress – nothing gives a greater sense of achievement
• A problem shared is a problem halved – getting the team ideas and commitment is better than telling them what to do and making the commitment for them

The Agile Manifesto – Individuals and interactions over processes and tools

Here at True Clarity we like to think of ourselves as ‘agile’ and we like to keep things simple. When asked ‘What’s your process?’ we have been known to say ‘we don’t have one, we tailor it to you’.

Whilst our existing customers understand the way we work, this approach can lead potential customers to perceive us as a bit of an ad-hoc, fly-by-the-seat-of-our-pants, make-it-up-as-we-go-along, cowboy sort of outfit.

Of course we have a process but it’s so familiar and natural to us that it’s like breathing – we don’t think about it and therefore we’re not very good at articulating it.

So this blog post is the first of what we hope will become a series of posts which talk about our processes and the tools and techniques we use. We won’t be inventing any new processes here – these are things we do or tools we use already, every day – naturally.

“The aspects of things that are most important to us are hidden because of their simplicity and familiarity.”

Prof. Ludwig Wittgenstein