WFB Fractional CFO
  • Home
  • Services
  • Contact
  • Strategy
  • Finance
  • Operations
  • About

Customer Segments, Analysis, and FP&A

12/27/2025

0 Comments

 
TL;DR : Clustering as a method for forecasting could be effective if the automation was present to cluster as customers were entered. It could prove to be very accurate for cash flow forecasting, but also planning for future resources, at least given the data sample used

I am a big fan of using tools that are generally associated with ML (machine learning) to conduct analysis on data sets, especially for FP&A work. Many people don't realize that the programming and development that goes into ML data fitting, filtering and analysis is just mathematical theories and equations that can also be used on a scripting basis. I prefer to do some pre-analysis on data to get an understanding of what the data is saying about a given subject matter so I make use of the data capabilities of platforms such as R and Python in addition to Excel. Even if you are not a developer, I am not, you can still find many useful tools that enable you to glean quick insights into data that would be difficult to do within Excel or a BI tool directly. I refer to this as the pre-prep phase before getting into model construction, costing, or forecasting and is the actual data analytics that should go into financial analytics else you could be missing out.
​
The great thing about data analytics and data science is that you actually don't need to know anything about the data that you are given and you can still begin to draw conclusions. I have been given data-frames with no labels other than row1, ... row10,000 and column1, ..., column30 and relationships can still be found; clusters, dominating dimensions, and even regressions predictors if one wants to run through several arrangements of least squares calculations. This was just a test, but I like to point out that data speaks if you understand data regardless of industry, a huge misconception in hiring practices these days as this skillset is desperately lacking in the corporate finance and accounting functional department. But that aside, let's see how data talks.
In business everything is segmented, processes to functional departments, and it should be no surprise that how we segment can impact effective reporting and analysis in addition to functions well beyond reporting and analysis. Let's think of revenues since we are discussing customers. We see segmenting by geography, business unit, product/service category, product/service lines (or subcategories), delivery channels, margin groupings, and so many attributes. The method for presentation is generally chosen by executive management or, in the cases where segmentation reporting is required, by external reporting compliance. Unfortunately, this is also the format for the internal reporting, P&Ls, budgeting, and forecasting. It is not unfortunate in its presentation, but in the fact that these numbers will be used as the baseline without any consideration to the customers and their attributes as drivers to the revenues. Are you certain that the appropriate drivers are being given the proper attention according to their influence?

This is where customer analysis and segmentation can come in handy. While this is typically designated to marketing in attempts to target campaigns at customer clusters defined by traits, there is no reason that FP&A should not be able to benefit from this information. Obviously, this is a good reason for collaboration so that the same analysis is not completed more than once, but this topic can also be a great discussion for IT, Marketing, and Finance, on data development, automation, analysis and reporting if there is something noteworthy.
Picture
Above is a screenshot of the xlsx file. The dataset is a 'fake-data' telecomm sample set of churned customers. There are 7,043 rows of data collected. The data is all geographically located in California, and while the Latitude, Longitude and Zip Code is given so we could have perhaps imported other data from the US Census Bureau such as income or occupation by area, I did not. This would have given a rough estimation of income in a geographic area, but not necessarily be useful. In an an actual scenario, it might be worth a check, but this is just a look at how data communicates.
​
Below you can get a better look at the column headers, attributes, that were recorded and the data type. The data type is relevant when using the tools in Python or R. When clustering and choosing a method such as k-means or k-nearest neighbors the data must be numeric. If the data is categorical, such as the contract type ('Month-to-Month', 'One-Year', 'Two-Year'), then k-modes is used. If you use both, as is the case in marketing, then there is a method called k-prototypes that manages both types of data. 
Picture
The method for analysis does become important. Clustering is about grouping the data together but this clustering is done by selecting the number of clusters desired. Oversimplifying, if you want five clusters then five center points, called centroids, are randomly chosen, and then all the row entries are measured in distance from centroids. A new placement of centroids is chosen and then again all row entries measured. This continues until five centroids are chosen and the total distances measured between all row elements and their five respective centroids is minimized and stable (unchanging). The distances and the centroids create the cluster.

There is a good explanation of clustering here: https://neptune.ai/blog/k-means-clustering

​Therefore, it shouldn't be too difficult to imagine that a selection of six clusters would change the outcome. Because of this, there are methods to select an "optimum" number of clusters since there is a balance in the number of clusters and the information available from the clusters, after all, every unique row could be its own cluster if we went to that level of granularity. How useful would that be? So one method that is used is called the Elbow Method and would look something like the image below that was used for this dataset.
Picture
​In order to get a good cluster number, we look at the steepness of the slopes and how much it changes. In this case, there is little change in the slope from 4 to 14, but it is quite steep from 2 to 4. I chose 4 clusters.
Since that bit of background is out of the way, what are these clusters and what information can we glean?
I decided to narrow my columns to a select group of nine attributes, five that are numeric (float64) and four that are categories (category). Given the available attributes, I thought these may have the most influence on creating clusters and perhaps getting more insight into customers that had churned.
Picture
​The categories have the following entries for Dependents, Phone Service, Internet Service, and the Contract. Rather that the number of dependents, this is simply a yes or no, as is whether there is phone service in the contract with internet service.
Picture
​If we get a quick view of the numerical categories below, then we see that Tenure Months is the length of time until churn, Monthly Charges is the monthly bill, Churn Value is 1 or 0 for customers churned or not, respectively, Churn Score is the rating given to the customers to determine if they will churn and CLTV is the computation for the Customer Lifetime Value.
Picture
With some visuals for the categorical, we can already begin to draw some conclusions about the clusters and the customers that are fit into these four clusters. We can see that there is quite a difference in the 'Internet Service' and 'Contract' type. Coincidently, I would use these to organize revenue presentation and forecasting in addition to assessing cost. I would choose Contract types with and without Internet service for six categories given the data available. You will see why as we run through the clusters.
What do we see in the categorical first?

Starting with #2 and #3:

Cluster 2:
At a high level, these are the primarily single group that spend on the fiber optic internet with month-to-month contracts. This would probably be the cluster that is going to churn to the most affordable contract offering out there. And as luck would have it, this is the cluster that has 99% of the total churned customers. Due to the month-to-month, the odds were good that this group has a high churn.

Cluster 3:
This cluster is your Internet Cluster and most likely to package with Phone Service. The utility cluster as there is nearly 75% of Contract types that are one or two year. This is the cluster that would churn for a customer service issue or specific reason having to do with performance.

Cluster 0:
The older demographic and single parent families, this is almost guaranteed even though this was not included in the analysis. This might also be the cluster with the longest tenure of those that have churned. The reason is that there are few contracts with any Internet Service and those that have Internet Service have the DSL, older and cheaper. This is also the smallest cluster, two-thirds have one-year and two-year contracts and aren't looking to have to switch. In telecomm there are federal statutes for companies that provide services that in their given area phone service must be delivered to all homes that desire it.
​
Cluster 1:
This is what I dub the mix, but this is, in effect, Cluster #2 that hasn't churned. This cluster would probably be the most difficult to determine when or why they would churn. They align their 'Dependents' with #3, their 'Phone Service' with #0, their contract dispersion is similar to #2 and due to the Internet Service and Phone Service breakout you could conclude that this cluster packages for utility as #3, but prefers the freedom of no contracts like #2 ... as I said ... those that haven't churned. 
Picture
Now for the numerical data that I have a couple of different ways.

Bar Chart. I really like the 'Tenure Months' view for a quick analysis visual, and the 'Churn Score' isn't terrible. These aren't for a dashboard necessarily, we just need some information. The 'Tenure Months' shows the nice skew, displaying both the count and the months of tenure. We see that mirror image of #2 and #3 that we would expect for a high churn on short tenure, month-to-month in #2 and the opposite for the utility cluster, #3.

We see that the Churn Score was generally higher for cluster #2 as it is tightly grouped to the right (+80s) and it is also a higher customer count, making sense given that the entire cluster #2 is churn. Given that the other clusters do not have churn, and are quite evenly dispersed we can conclude that cluster #2 was determined by churn (as it was) and that the scoring seems to be a good predictor of churn. There is also an interesting relationship of Tenure Months and Monthly Charges to CLTV as there should but it is the dynamic that is interesting. Cluster #0, very low monthly charges but skewing toward a long tenure, therefore grouping at higher CLTV values. Cluster #2, skewing higher on monthly charges, but for very short tenures with high counts of customers and a fair dispersion of CLTV. Cluster #3, long tenures, high charges, high CLTV.
​
Cluster #1 is, of course, similar to Cluster #2 and we would draw from this group to determine our churn if our data is any indication. Clusters #0 and #3 would be our base, consistent, revenues for 45 - 60 months and during forecasting, a minimal number of within these clusters would churn if their contracts hadn't expired. 
Picture
Boxplots. Same data, different view.
​
Cluster 2:
(above) At a high level, these are the primarily single group that spend on the fiber optic internet with month-to-month contracts. This would probably be the cluster that is going to churn to the most affordable contract offering out there. And as luck would have it, this is the cluster that has 99% of the total churned customers. Due to the month-to-month, the odds were good that this group has a high churn.
Lowest tenure, highest churn. The higher monthly charges that can easily be seen from this visual. Higher charges could be due to the month-to-month contract pricing or to value-adds, but we should keep in mind that this group may also require higher servicing costs and acquisition costs. This is just a brainstorm as there was not data to support costs, but it is the next place to begin a dive.

Cluster 3
This cluster is your Internet Cluster and most likely to package with Phone Service. The utility cluster as there is nearly 75% of Contract types that are one or two year. This is the cluster that probably churned for a customer service issue or specific reason having to do with performance.
Highest revenue generators, Value-Adds. Longest Tenure, Highest CLTVs. This all aligns. This is a cluster that you would want to gather customer survey data to determine churn, if any, develop early warnings and quickly determine solutions if necessary. It is unlikely that this cluster wants to churn given the monthly investment.

Cluster 0:
The older demographic and single parent families, this is almost guaranteed even though this was not included in the analysis. This might also be the cluster with the longest tenure of those that have churned. The reason is that there are few contracts with any Internet Service and those that have Internet Service have the DSL, older and cheaper. This is also the smallest cluster, two-thirds have one-year and two-year contracts and aren't looking to have to switch. In telecomm there are federal statutes for companies that provide services that in their given area phone service must be delivered to all homes that desire it.
Consistency and lengthy tenure contribute to a high CLTV. The goal with this group is cost control, then this cluster could be a foundational set of revenues each month.

Cluster 1:
This is what I dub the mix, but this is, in effect, Cluster #2 that hasn't churned. This cluster would probably be the most difficult to determine when or why they would churn. They align their 'Dependents' with #3, their 'Phone Service' with #0, their contract dispersion is similar to #2 and due to the Internet Service and Phone Service breakout you could conclude that this cluster packages for utility as #3, but prefers the freedom of no contracts like #2 ... as I said ... those that haven't churned.
This cluster has a slightly higher mean tenure, slightly lower monthly charges and a slightly lower CLTV. Clusters #1 and #2 present an opportunity.
Picture
We have a good look at the clusters.

Quick comments on Marketing and Sales:

Behaviorally, people are reluctant to change unless there is a conscious effort to do so on their part. That said, those that signed into month-to-month did so for a purpose that was probably motivated by personality and belief. It is difficult to get someone to sign into a contract if they are a month-to-month personality. The same goes for those that get all the bells and whistles of add-ons. While disposable income does come into these decisions, it also more often than not comes down to personality and behavior. Those that are minimalists will be a minimalist regardless of disposable income. Those that desire the flare of the add-ons will strive for as many as they can afford. This is what economists try to model through utility and behavioral economics.

From a Marketing and Sales' perspective, It is fairly easy to determine what drives clusters #0 (need) and #3 (utility). Marketing will probably push efforts to keep #3 and offer more utility to their services, such as faster internet and since #0 is unlikely to be in a high disposable income category and more likely in the government compliance category there is unlikely to be many campaigns targeting their cluster. If these clusters are similar across telco providers, then it should be noted that their #0 and #3 will also be unlikely to leave. It would likely take considerable investment to acquire a #3 from another company. That leaves us with #2, the churned, and #1, the next to churn.

There are two strategies to push the monthly recurring revenue because of the high turnover of #1 and #2. These clusters are where you would seek to gain market share relative to periods of time. What I mean by this is that you seek to increase mean monthly tenure of the month-to-month contracts. In the first strategy, the goal should be to determine clusters #1 and #2's cost-to-benefit maintenance to drive tenure for longer before they churn. Increasing tenure in cluster #1 would probably be more likely than increasing the monthly charge as I would venture a guess that #1 is not willing to get add-ons relative to their #2 counterparts. The second strategy, is the more difficult pursuit of attempting to either increase the per month spend or get the month-to-months to sign onto a contract, thus increasing their tenure but possibly in conflict with their natural behavior.

Finance Comments:

As mentioned the method for presenting the revenues is often determined by the external reporting compliance, but this might be a less effective approach for forecasting and getting a real view at the drivers of your revenues, the customer base.

I would, in a real-world instance, still dig deeper and attempt use the location from the latitude and longitude to gain insight into customer occupations, income, age, household size and ratio of single family homes to multi-family homes in the neighborhood using available data from external sources such as the US Census Bureau and the Bureau of Labor Statistics. This, while not on a customer specific basis, would give me a peace of mind to whether my understanding of the clusters is on point. We will assume that what I stated above is accurate.

  • Reporting, Presentation, and Forecasting:

We have a couple of interesting methods for internal reporting. The degree to which this is capable relies on the data, IT capabilities of the company and the analyst. Their shortcomings are your decision makers' analytical and strategic shortcomings.

First (very straight forward, but still two different views):

Given SEC reporting requirements for reporting operating segments, there may be separate reporting for internet services, phone services, and retail, however, I have seen some companies still report as a single segment. This is generally due to the roll-up of ledgers as it is easier, but is it useful internally?

Note below and how these segments, still attempt to group by customer utility and behavior, though the two formats, Internet/Contract and then Contract/Internet, represent the customer base in slightly different ways. Internet/Contract, which I prefer if we had to choose a high level view, can capture those clusters well and could even go deeper into DSL vs Fiber. Of course, while the example is rough, we would miss influence of Cluster 1 and this is important if they truly are the cluster yet to churn.
​
ie: Cluster 3 : Internet @ Two-Year vs. Cluster 2 : Internet @ Month-to-Month
Picture
​Contract/Internet, gives a slightly different grouping as we would have an easier time using these segments to identify Cluster #1 and Cluster #2, primarily Month-to-Month, and Clusters #0 and #3, primarily contract. 
Picture
Second (more interesting, possibly valuable, definitely differentiating):

Finally, a more interesting way might be to utilize the clusters. I think this method could be very useful for cash flow forecasting. Though this is an unlikely method for reporting since people don't usually think of segmentation in this way, but it would make for some interesting forecasting.

Think if you automated the clustering assignment as new customers were entered into the system. This would first, create a baseline for marketing since the cluster would already suggest something about personality and behavior. Second, we have seen that churn occurs in Cluster #2, those with a high churn score. Now I am not privy to how the churn score was calculated on a customer by customer basis. It is likely there is a Probit regression to predict the probability of churn that is used, much like banking uses a probability of default model for risk management.

The output would look something like the table below where I used the Contract category (Month-to-Month, One Year, Two Year) and Monthly Charge to determine likelihood of churn. While the variables are statistically significant, as they should be, the R-squared has low explanatory power for any variance of the predicted outcome and the observations (0.2039).

What the model really says is that having a contract makes a churn almost nonexistent, the coefficient is -0.9534, the values of the Contract are
​
0 : Month-to-Month,
1 :One Year,
2 : Two Year.

Having a monthly charge increases the likelihood of churn, a probability of about 1% of the monthly charge. We of course have to take these variables into the broader context of the formula,

Y = -0.9291 - 0.9534*Contract + 0.0109*Monthly Charge,
​
which states that churn probability is unlikely except in the instances of Month-to-Month Contracts and Monthly Charges >= $85.23. As this monthly charge increases, so to does the churn probability.

Picture
The model really needs some additional variables to see if we can get more explanatory power in the variance for prediction, but given that what the model says makes sense in terms of contracts and costs, we are probably on the right track and further, i think it suggests the the individual customer churn score could indeed be the probability of churn. Now, where this churn score to continue to adjust over time (which it could) then this churn score could be the probability of churning from one month to the next and be useful for calculating expected MRR and CLTV.
 
So .. lets quickly look at the rundown of MRR and what happens using the clusters and some assumptions about the churn score.
​
  • To get a timeline, I will just run backwards from today (2025 - 12 - 26) using the Tenure Months. This will create an interesting picture as all customers in the list were present until today, but all will have different start dates. This will still be our standard SaaS broad groupings of New, Continued and Churned as of today's date, but we wont know if any Upgrades or Downgrades occurred because those changes are not indicated. Any changes and further New would simply need to be forecasted and modeled (as stated using the Clusters) 
Picture
​The pricing for the services is implied in the Monthly Charge, each add-on can be approximated by backing out the selections and payment by the customers
Picture
​The great thing about python or R is that you can do alot of data analysis quickly on large data frames in addition to the visuals. The numbers for the clusters give some people more information than the visuals. Below is the quick segmenting of Clusters by Churned (1) or Not (0), for attributable monthly revenue, the medina churn score and the median tenure in months. Note, again, cluster #1( dubbed 'yet to churn') and cluster #2 (churned).
Picture
​We could look more granularly with a quick change. Granularity, however, has its limits in its usefulness. This view, below, could be interesting with a long run tally over the past two to three years of customers. A grouping of our customer set that has a thousand or more customers in each group could prove to be a statistically significant representation of a long run average of the customer base. We could use the clusters, subsegment of plans, and map out monthly revenues for the median tenure as a solid forecast. This level would also allow for a better mapping of costs to each service since there is certainly different costs that area allocated to the levels of internet. The interesting note is the DSL'ers seem a bit less likely to churn than the Fiber.
Picture
Now how can we use clustering to our advantage for forecasting revenues?
​
First, we determine how many of each cluster will take up services as new each month. If we dug into store level demographics, we could definitely come up with pretty good historic numbers for each cluster.
Second, we can choose our revenue calculation mechanism in two ways. We could simply use the monthly revenue by cluster and the customer count to get an average to carry forward relative to the size of new customers in each cluster or we can use the services and add-ons to come up with a formula to predict a price based on service selected.

Such a formula for cluster #0 comes from a multivariate regression below. In this case, the 'Constant' represents Phone Service. This regression gives a prediction of monthly charges based on service level selection. If we believe that 1000 new customers would be cluster #0, we could use our long run segmentation from above to determine that 14% get internet only (140), 11% get phone and DSL internet (110), and 74% get only phone service (740). If no other add-ons were purchased then using the equation below, monthly charge would be $18,290 - $20,100 depending on how you wanted to use the constant. This is inline with the service pricing table above. Further, for this cluster you could anticipate a churn rate of 4.36 per 1000 customers and a weighted ave tenure of 40 months. If this was roughly a long run and stable figure then this cohort of cluster #0 would give $18,000 - $20,000 MRR for the extent of their 40 months with minimal churn loss.
Picture
Conclusion

This could be done for all four clusters and the activities from clustering to Probit and segmentation could all be done through an automated process or manually using small sample sets and applying the figures to larger models. If the data and modeling is constructed correctly, much of this could be minimally maintained although some skill in SQL and perhaps Power Query would be needed.
​
While these clusters worked out quite nicely with a focused cluster of churning customers, there is no reason to think that a clustering wouldn't also identify this factor in real-world data. The point is that if you were to dive into data, you would fine behaviors exactly like this and this adds predictability through the clusters that would otherwise be unavailable using the external reporting categories or even high-level segmentation that doesn’t account for some of the more unique behaviors that can be discovered in a clustering of attributes. Use your data to its strengths and create competencies.
 
 


I would like to mention. If you are just starting out using a package, it helps to have a guide.
There is a very nice article on Kaggle that conducts walk through using K-prototypes by Taron Zakaryan.
https://www.kaggle.com/code/taronzakaryan/clustering-with-k-prototypes
0 Comments



Leave a Reply.

    All case studies and blog writings are written by:
    William F Bryant
    MSc MBA CMA
About
Contact
Services
Case Studies
Blog
Copyright © 2025
  • Home
  • Services
  • Contact
  • Strategy
  • Finance
  • Operations
  • About