One year ago this month, I wrote a post about how the COVID-19 pandemic was going to greatly accelerate the pace of global digital transformation: how, literally overnight, we were being forced to find new ways of working, meeting, shopping, managing healthcare, and even staying entertained, and how there would be a tremendous surge in demand for video conferencing, home delivery services, online learning, ecommerce, media streaming, eSports, and telemedicine.

I wrote that “enterprises need to act quickly to ensure they can meet this new level of demand and expectations of dependability and quality of service. The current crisis will carve a new landscape in business – winners and losers will emerge in every industry as a result.”

Perhaps nowhere has this been more evident than in the world of retail sales.

Record Growth

2020 saw staggering growth in ecommerce sales. According to Digital Commerce 360, ecommerce sales hit $791.70 billion in 2020, up 32.4% year-over-year and double the growth rate of 2019. Online penetration of overall retail sales, which historically has grown at about one percentage point per year, hit 19.6% last year, up from 15.8% in 2019. It was the largest annual gain in ecommerce penetration ever recorded, compressing years of growth into just months. Total retail sales grew 6.9% to just over $4 trillion, with ecommerce sales accounting for nearly 75% of that gain. One in every five dollars spent on retail in the last quarter of 2020 came from online orders.

But there is a darker side to that growth: it is widening the divide between mega-retailers like Walmart and Amazon and their smaller competitors. Amazon alone accounted for nearly half of all ecommerce sales growth in 2020. To remain competitive, smaller retailers will need to step up their game. When leaving one store for a competitor is only a mouse click away, it has never been more critical for online retailers to provide flawless shopping experiences.

And yet, anecdotally it seems we’re seeing more issues with online websites and services than at any other time in recent history. Many online sites have experienced degradation of service, poor customer experiences, and even complete outages. Operations teams are scrambling to keep up while demands on the business continue to increase. The surge in demand for online services is exposing weaknesses in, and in some cases a total lack of, adequate performance monitoring of ecommerce websites and platforms.

The New Normal

It would be tempting to hope for things to slow down and get back to normal, but all indications are that ecommerce growth trends are only going to continue. This is the new normal. It is survival of the fittest. So if you’re an online retailer, what can you do today to ensure the best possible performance of your digital properties? Here are three suggestions.

Implement robust monitoring and analytics of all your ecommerce sites and digital services.

If you haven’t already, you need to implement a robust monitoring program for all of your digital properties immediately to provide complete visibility and absolute clarity into system performance. Having deep, insightful analytics to quickly identify and resolve issues while continually optimizing consumer experiences is absolutely paramount to providing a high quality of service.

If you’re just getting started, developing monitoring expertise is a journey. It begins with implementing the fundamental building blocks and then layering on increasing sophistication over time. As you advance your monitoring practices, operations become far more strategic and proactive. Organizations move from fire-fighting to driving measurable business performance and results. Here is a handy guide to help you benchmark where you are today and steps you can take to strengthen your monitoring practices.

It’s also critical that the business adopt a data-driven culture and a philosophy of continuous improvement. It’s more important to get started and evolve over time than to shoot for immediate perfection. Build your initial monitoring plan collaboratively with technical and business leaders, set what you believe to be acceptable performance levels, and measure results. Then meet regularly to share data and results and further refine your monitoring plan. It’s an iterative process – one in which there is always room for improvement.

Define and measure service level objectives that are tied to business success.

Here it is important that the technical and business leaders work closely together to define what is important to the business, what success looks like, and how that success will be measured. This sounds deceptively simple, but in practice it can be incredibly difficult. It’s a great mental exercise to ask (not that you would, but), “If we could only monitor one KPI, one metric, one telemetry point, what would it be?” Do this exercise with a cross-functional team across the business and you’ll get a range of answers. Working through those differences puts you on your way to really understanding what’s important to the business and therefore what you should be monitoring to ensure you meet those goals.

Armed with that input, operations teams can then define the associated service level objectives (SLOs) that will be used to measure and know, empirically, whether or not the business is delivering an acceptable customer experience. There is a bit of an art to setting SLOs because you need to strike a balance between stability and the ability to introduce change. You can certainly reduce the risk of error by greatly limiting changes in production but then you also limit the business’s ability to introduce new features to enhance the customer’s shopping experience.

Setting an SLO is not about driving the highest possible level of performance at all times but about setting the minimum viable service level that will still deliver acceptable quality to the customer. The margin between perfect performance and that minimum is what’s known as an error budget, which you can use to determine whether you are introducing too much change into production or could be deploying changes even faster. You can read more about the art and science of setting SLOs here.
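To make the error budget idea concrete, here is a minimal sketch in Python. It assumes a request-based SLO (the fraction of requests that must succeed over some window); the class name, field names, and the example numbers are all hypothetical and chosen purely for illustration, not taken from any particular monitoring product.

```python
from dataclasses import dataclass


@dataclass
class ErrorBudget:
    """Hypothetical request-based error budget for one SLO window."""

    slo_target: float      # assumed target, e.g. 0.999 means 99.9% of requests succeed
    total_requests: int    # requests served in the window
    failed_requests: int   # requests that missed the objective

    @property
    def allowed_failures(self) -> float:
        # The budget is the gap between perfect (100%) and the SLO target.
        return (1.0 - self.slo_target) * self.total_requests

    @property
    def remaining_fraction(self) -> float:
        # 1.0 means the budget is untouched; a negative value means it is overspent.
        allowed = self.allowed_failures
        if allowed == 0:
            return 1.0 if self.failed_requests == 0 else float("-inf")
        return 1.0 - self.failed_requests / allowed

    @property
    def can_ship_changes(self) -> bool:
        # Simple illustrative policy: keep deploying while any budget remains.
        return self.remaining_fraction > 0


# Illustrative numbers only: a 99.9% target over one million requests.
budget = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=250)
print(f"Allowed failures: {budget.allowed_failures:.0f}")
print(f"Budget remaining: {budget.remaining_fraction:.0%}")
print(f"OK to keep deploying: {budget.can_ship_changes}")
```

With these numbers, roughly three quarters of the budget is still unspent, which suggests there is room to release changes faster. If failures exceeded the allowance, the same calculation would argue for slowing down and prioritizing stability. Real policies are usually more nuanced than a single threshold, so treat this as a starting point for the conversation between technical and business leaders.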

Make sure your current monitoring system is up to the task.

As mentioned earlier, many businesses are finding their current monitoring solutions and practices are not keeping pace with the dramatic increase in online demand. Here are some signs your monitoring platform may need a tune-up:

  1. Being Blindsided. It’s never fun when the first indication you have of problems in production is complaints from customers. It could be that your monitoring system just needs to be better configured, but it could also mean that it is limited in the amount of data it can collect, that data has been lost, or that ingestion of data for analytics is delayed.
  2. Preventable Outages and False Positives. In this scenario, you are experiencing too many preventable outages on the one hand or too many false positive alerts on the other. Either you are not monitoring what you actually care about, or you know exactly what you care about but your monitoring system can’t express what you want to monitor.
  3. Bad Data. You spend hours studying a graph to isolate a problem only to find out the data you’ve been analyzing is either wrong or outdated. This can happen when you’ve had to compromise and store summary data to save space and/or cost, and you lose the drill-down granularity you need for root cause analysis and other analytics.
  4. Missing Data. You have urgent operational and business impact questions, but it turns out you haven’t been collecting the data you need to answer them. This typically happens when your monitoring system is limited in its ability to collect and retain the massive amount of telemetry data generated by today’s modern infrastructure.
  5. Monitoring Crashes. Your monitoring solution is actually less dependable than the systems it monitors, and you lose data when it crashes. Be careful of solutions with single points of failure and/or that need to be deployed within the actual “blast zone” (of a potential outage) in order to collect data.
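
The loss of drill-down granularity described under "Bad Data" is easy to illustrate with a few lines of Python. This is only a sketch with made-up numbers: it assumes per-second page-load latencies in milliseconds, and the function name and sample values are hypothetical.

```python
from statistics import mean


def summarize(samples_ms):
    """Return the kind of rollup a space-constrained system might keep."""
    return {"avg_ms": mean(samples_ms), "max_ms": max(samples_ms)}


# Hypothetical minute of per-second latencies: 57 normal seconds
# at 200 ms and a 3-second spike at 5000 ms.
per_second_ms = [200] * 57 + [5000] * 3

rollup = summarize(per_second_ms)
print(rollup["avg_ms"])  # 440: the one-minute average looks only mildly elevated
print(rollup["max_ms"])  # 5000: the raw data shows some shoppers waited five seconds
```

If only the one-minute average is retained, the spike is flattened into a number that may never trip an alert, and the detail needed for root cause analysis is gone.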

Successful monitoring boils down to the ability to harness and analyze infrastructure performance metrics in real time to ensure you are meeting your quality of service objectives. The power and value of monitoring increase dramatically the more you can harness all of your telemetry from all of your infrastructure to confidently make the best decisions for your business. The return on investment is well worth the time and effort.

Seize the Day

A robust monitoring program combined with thoughtful SLOs and a comprehensive monitoring and analytics platform will give you the tools you need to compete in today’s online-first economy. You’ll have more confidence and speed in your decision making, the ability to move fast without breaking things, and if nothing else, you’ll sleep better at night knowing your customers are getting great service.

Beyond providing a great or even exceptional online experience, savvy brands are recognizing that the surge in ecommerce is reshaping the competitive landscape and creating an opportunity to gain substantial market share. For those companies that want to separate themselves from the pack, there may never be a better time.

As Shakespeare wrote, “There is a tide in the affairs of men, which, taken at the flood, leads on to fortune.” The tide of ecommerce growth is certainly there for the taking.
