We can schedule this for you wholesale: Boiler replacement and predictive analytics

Tom Harrison
Mar 13, 2018 · 5 min read

When I bought my flat in 2015 it needed a lot of modernisation and came equipped with a venerable Potterton Netaheat Electronic boiler. The user manual I found in a kitchen drawer referenced the Gas Safety (Installation and Use) Regulations 1984, and Potterton replaced the Netaheat with the Profile in 1988, so by the time I inherited it, the boiler was about 30 years old and still going strong.

Every day on schedule it would chug into life as I fumbled for the remote control to turn up the volume on my stereo and drown out the noise (how far we’ve come, now I just bark orders at my Sonos One). Every time I had family over they’d nag that it needed replacing.

“Nah,” I’d reply, “if there was anything that badly wrong with it, it would have broken already”…

Imagine, however, that we’re managing boiler replacement not in one property but in about 18,000, and that we’re qualified, experienced heating engineers and asset managers rather than a digital professional with a cavalier attitude towards the efficacy of central heating. At Hackney, we currently aim to replace boilers in around 3,000 of our properties per year, targeting the programme at boilers over ten years old or with energy efficiency ratings of G or worse. It all seems sensible, but as ever: what if there’s a better way?

Define “Better”

“Better” of course depends on your point of view, and as a landlord we need to balance a few factors:

  • Reducing our costs by not replacing boilers unnecessarily, or trying to fix things beyond the point of repair.
  • Minimising boiler breakdowns experienced by tenants, especially vulnerable tenants, to ensure they live in safe and warm homes.
  • Minimising tenants’ heating bills and the environmental impact of our properties by ensuring they are energy efficient.

Boilers are a good component to work with as they require servicing annually, have repairs logged against them relatively frequently and are replaced more often than e.g. windows or roofs. This means we hold a lot of data, but the general principles of this project would hold true for other components too. Pivigo gave us a great deal for their next data science boot camp, so we signed up again, with our team consisting of:

To pilot is to pivot

The general problem we asked our team to solve was how to identify the boilers likely to suffer a terminal failure ahead of time so that we could prioritise them for replacement.

During our initial data dive, however, we found that our repairs data showed boilers are often replaced without our assets database being updated, so it wasn’t always possible to tell whether a boiler had been replaced as part of a planned programme or because it had suffered a terminal failure. This was great for evidencing communication breakdown between teams, but less helpful when trying to do predictive analytics. Additionally, although our data was well structured, there were large numbers of heating-related jobs that had nothing to do with the boilers themselves and so needed stripping from the dataset (a rough sketch of that filtering follows the list below). This meant that we couldn’t answer some of our initial questions, such as “what is the optimum point to replace a boiler?” and “can we identify boilers likely to suffer a terminal failure ahead of time?”, but our data did allow us to answer some others, including:

  • What features impact boiler reliability?
  • Are some operatives better than others?
  • Which boilers are most likely to need repair?
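As a concrete illustration of that filtering step, here’s a rough pandas sketch. The schema, job codes and descriptions are entirely invented; Hackney’s actual repairs extract looks different.

```python
import pandas as pd

# Invented repairs extract; the real schema and job codes will differ.
repairs = pd.DataFrame({
    "property_id": ["P1", "P1", "P2", "P3", "P3"],
    "job_code":    ["BLR01", "RAD02", "BLR03", "TRV01", "BLR01"],
    "description": ["Boiler not firing", "Bleed radiators",
                    "Replace boiler PCB", "Replace thermostatic valve",
                    "Boiler losing pressure"],
})

# Keep only jobs that are plausibly about the boiler itself, dropping radiator,
# valve and pipework work that would otherwise inflate the failure counts.
is_boiler_job = (
    repairs["job_code"].str.startswith("BLR")
    | repairs["description"].str.contains("boiler", case=False)
)
boiler_repairs = repairs[is_boiler_job]
print(boiler_repairs)
```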

The takeaway from this is that the double diamond approach is pretty useful for data analysis projects. When dealing with legacy systems you will likely have a pretty good idea of what sort of data is there and what you’d like to get out of it, but it’s only after a period of discovery that you can define the exact problems and questions you will be able to answer with the dataset. Personally I don’t consider this an issue, but it is something to bear in mind when setting expectations with other stakeholders…

Seeing the forest through the trees

With the data cleansed and the problem better defined, the team started with a Kaplan-Meier survival analysis written in Python using the pandas and NumPy packages. The analysis looked at a number of features of our properties, households and boilers in turn to identify their impact on reliability.
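For flavour, here’s a minimal Kaplan-Meier estimator in that vein, built only on pandas and NumPy. It isn’t the team’s actual code, and the boiler ages and failure flags are invented; boilers that hadn’t failed by the end of the data (or were removed for other reasons) are treated as censored with a 0.

```python
import numpy as np
import pandas as pd

def kaplan_meier(durations, observed):
    """Kaplan-Meier curve: S(t) is the product over event times t_i <= t of (1 - d_i / n_i)."""
    df = pd.DataFrame({"t": durations, "event": observed}).sort_values("t")
    # Number still "at risk" just before each time point (all boilers with duration >= t).
    df["n_at_risk"] = np.arange(len(df), 0, -1)
    # At each distinct time: d = failures observed, n = boilers at risk.
    grouped = df.groupby("t").agg(d=("event", "sum"), n=("n_at_risk", "max"))
    grouped["survival"] = (1 - grouped["d"] / grouped["n"]).cumprod()
    return grouped.reset_index()

# Invented data: boiler age in years at failure or censoring, and whether the
# boiler actually failed (1) or was still running / replaced early (0).
ages =   [4, 7, 10, 12, 15, 16, 18, 20, 22, 25]
failed = [0, 1,  1,  0,  1,  1,  0,  1,  1,  1]
print(kaplan_meier(ages, failed))
```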

The make and model of the boiler was fairly important, but happily our repairs team hadn’t needed data science to figure that out and had already standardised new installs on what turned out to be the statistically most reliable model (in Hackney properties anyway, YMMV). Unsurprisingly, the biggest influences on reliability were the age and repair history of the boiler, but other factors also had some fairly interesting effects. Larger property and household sizes were both associated with higher failure rates, which makes sense: boilers have to work harder to heat bigger properties and provide hot water to more people, so they are under more strain than they would be in smaller properties. We also found that, in blocks of flats, boilers in properties on higher floors were more reliable than those on lower floors. Living on the top floor of my block, I find this easy to believe: my central heating is off for about two-thirds of the year as I get all the warmth from the flats below me, so my boiler is under less stress.

The difficulty in assessing features in this way, however, is that we can’t look at any of them (e.g. boiler make and model) in isolation, as any given boiler will be influenced by other features (e.g. property size) that are not being directly measured. Across a large enough dataset this noise is low enough to see the broad trends outlined above, but to better counteract it the survival analyses were combined with a Random Forest algorithm in R to generate survival probabilities for specific boilers in specific properties (as the name suggests, a Random Forest is essentially an ensemble of decision trees).
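The team’s survival forest was built in R, so as a stand-in here is a much simpler Python sketch of the same idea: a scikit-learn random forest that scores each boiler’s probability of failing within the next twelve months from property and household features. The column names, values and the 12-month framing are all illustrative, not Hackney’s actual model.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Invented feature table: one row per boiler, with a label saying whether it
# needed a repair in the following twelve months.
boilers = pd.DataFrame({
    "age_years":        [3, 8, 12, 15, 20, 6, 11, 17, 22, 9],
    "repairs_last_3yr": [0, 1, 2, 4, 5, 0, 1, 3, 6, 2],
    "bedrooms":         [1, 2, 3, 2, 4, 1, 3, 2, 4, 2],
    "household_size":   [1, 2, 4, 3, 5, 1, 4, 2, 6, 3],
    "floor":            [5, 0, 2, 1, 0, 7, 3, 1, 0, 2],
    "failed_next_12m":  [0, 0, 1, 1, 1, 0, 0, 1, 1, 0],
})

X = boilers.drop(columns="failed_next_12m")
y = boilers["failed_next_12m"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Per-boiler probability of a failure in the next year, usable to rank the
# stock for the replacement programme.
print(model.predict_proba(X_test)[:, 1].round(2))

# Feature importances give a rough view of which factors drive failures.
print(dict(zip(X.columns, model.feature_importances_.round(2))))
```

A proper survival forest predicts the whole survival curve rather than a fixed-horizon probability, but the ranking idea is the same.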

Give it to me straight, Robodoc

From the survival probabilities calculated for each of our boilers we should now know which boilers are best to replace, except… Well, define “best”. The probabilities can certainly rank the boilers by how likely they are to break down, and they show that the “half life” of a boiler is 16 years, longer than the ten years we currently use as our cut-off for replacement. This suggests that we could sweat our assets a bit longer than we do currently and save some money, but as I said earlier, we need to balance that against the impact on tenants of having no heating or hot water.
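To make the “half life” concrete: it’s simply the point where the fitted survival curve crosses 50%. The numbers below are invented purely to land on the 16-year figure.

```python
import pandas as pd

# Invented survival curve (probability a boiler is still going at each age);
# in practice this comes from the Kaplan-Meier fit described above.
curve = pd.DataFrame({
    "age_years": [5, 8, 10, 12, 14, 16, 18, 20],
    "survival":  [0.97, 0.92, 0.85, 0.74, 0.61, 0.49, 0.37, 0.25],
})

# The "half life" is the first age at which survival drops to 50% or below.
half_life = curve.loc[curve["survival"] <= 0.5, "age_years"].min()
print(half_life)  # 16 with these made-up numbers
```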

How we do this requires consideration; the analysis lends itself to a risk-based approach, with the acceptable level of risk depending on the people living in the property. Is that something we can quantify, or does it require a more qualitative assessment? It’s an interesting area, and I think it’s more indicative of a future where AI helps people make complicated decisions rather than replacing them by making decisions for them.
