Financial forecasting in the pharmaceutical sector
Reliable sales forecasting has a strong positive impact on many areas of a company, provided it is done well and used in the right way.
It gives sales teams a yardstick against which to measure their targets, which naturally drives performance. It also enables informed retrospective reviews: with data in hand, one can assess whether the forecasts were overly optimistic or whether, on the contrary, the way of working needs to be corrected for the coming year.
In the case of logistics, annual sales forecasts are not very useful. In general, sales will not change drastically and unexpectedly in the short term, and a long-term financial forecast (quarters, or whole years) will not be accurate enough to serve as the main source of information for production or logistics. However, it is possible to make shorter-term forecasts that take into account short-term variations due to transient effects, such as the company's own or third-party promotions, or a competitor's stock shortage. This type of forecast does not prioritise accuracy (sales of this kind are difficult to predict by their very nature) but rather agility: having the information available in an automated way, without the need for a team on standby 24/7.
Finally, perhaps the most important of all the impacts of a good forecast is the least visible from the outside: the company's financial vertical. An annual or even quarterly forecast allows for a well-balanced short- and long-term debt structure, built on predictable cash flows.
Why is it not easy to make such predictions in the pharmaceutical sector?
However, in the pharmaceutical sector there are a number of additional challenges to good sales forecasting, challenges that are not unfamiliar to Izertis. Each of them is complex enough to be a case study in its own right. As an example:
- What impact will a new launch have?
Prevalence and incidence figures for a disease are known, but they are estimates with varying precision depending on the disease.
The differential improvement of this new launch over existing medicines is a crucial variable that is very difficult to measure.
Commercial investment has a large impact, and that impact varies according to therapeutic area, competitors, etc. It is therefore not easy to know what effect spending on the different commercial items will have for a medicine that is not yet on sale.
- What impact will the approval of a new indication for an existing medicine have?
- What will the revenue degradation curve look like for a drug on which a patent expires?
- What impact does spending on each trade item have on sales?
Certainly, some actions are over-invested in while others are under-invested in. If we can afford to run controlled trials of certain drugs in certain regions, the isolated impact of each change can be estimated using causal impact analysis techniques.
- How can we predict sales in a market regulated by tenders (such as the Netherlands, the United Arab Emirates, most African countries, etc.)?
For each of these use cases, there are different solutions: finding similar launches and regressing them on the level of similarity and then weighting it by market variables, experimenting with small changes in commercial investment in small regions, using causal impact analysis to determine which variables have caused the change, and so on. Each use case should be treated differently depending on the market for each medicine, the objectives, and the information available inside and outside the company.
Sales of certain products are correlated: how can this information be incorporated into the models?
Forecasting models in the pharmaceutical sector face the recurring challenge of using sales information from the rest of the portfolio to feed forecasting models for a specific product.
A trivial, but flawed, solution would be to one-hot encode each product. A better approach is to use embeddings, where each brand is represented by an automatically learned numeric vector, so that similar brands correspond to nearby vectors and dissimilar brands to more distant ones. This representation compresses the brand information into the model far more efficiently, and it eliminates arbitrary information that only adds noise (such as the ordering of the brands). The mapping from each brand to a numerical vector can be learned during the training of the model itself or, ideally, pre-trained in a semi-supervised manner using additional information. That additional information does not even have to be limited to the company's own products; information on competing products can also be used.
A very appropriate way to do this in the case of pharmaceuticals would be to use the indications for each medicine. In this way it is possible to create embeddings that, given a product, return a vector that is close in vector space to other products that have similar indications, without the need to use any sales information.
Using these embeddings, we can generate a single model that predicts the sales of multiple brands, conditioned on the brand selected. Such a model is trained on much richer information than one trained exclusively on a single product's sales.
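As an illustrative sketch, the indications themselves can serve as pre-trained embeddings: represent each brand as a multi-hot vector over an indication vocabulary, and brands with overlapping indications come out close under cosine similarity, without using any sales data. The brands and indications below are entirely hypothetical.

```python
# Sketch: indication-based brand embeddings (hypothetical data).
import numpy as np

INDICATIONS = ["hypertension", "heart failure", "migraine", "epilepsy"]

# Hypothetical portfolio: brand -> set of approved indications
BRANDS = {
    "brand_a": {"hypertension", "heart failure"},
    "brand_b": {"hypertension"},
    "brand_c": {"migraine", "epilepsy"},
}

def embed(indications: set) -> np.ndarray:
    """Multi-hot embedding over the indication vocabulary."""
    return np.array([1.0 if i in indications else 0.0 for i in INDICATIONS])

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

vectors = {name: embed(inds) for name, inds in BRANDS.items()}

# Brands sharing an indication are closer than unrelated brands
sim_ab = cosine(vectors["brand_a"], vectors["brand_b"])
sim_ac = cosine(vectors["brand_a"], vectors["brand_c"])
```

In practice these vectors would be fed to the sales model alongside the time-series features, or refined further during training.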
Why can't a pre-packaged solution be used?
In each case, different data is available and a different solution must be found. The main condition for success in this type of project is to make the most of every available piece of data. This includes incorporating into the models abstract, unstructured data that is not trivially translatable into mathematical language.
To address these prediction challenges at the level of model definition, we propose three approaches:
- Time series analysis models, such as ARIMA, which suit cases with little data and highly stationary series. They work well when a short-horizon forecast is required (e.g. weeks).
- Classical regression models, creating additional features from past values of the variable to be predicted (known as lags).
- A hybrid solution: recurrent neural networks (RNNs), such as LSTMs. Conceptually they work like the previous approach, except that the lags are generated and weighted automatically during training.
A common mistake is to assume that an LSTM will always work better because it is the most advanced and modern of the three suggested models. In practice, however, there are often data availability constraints that mean that this is not the case.
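A minimal sketch of the second approach, assuming a monthly sales series: build lag features from past values and fit an ordinary least-squares regression (any regressor could take its place). The series here is synthetic.

```python
# Sketch: lag-feature regression on a synthetic monthly sales series.
import numpy as np

def make_lags(series: np.ndarray, n_lags: int):
    """Supervised dataset: row t = [y_{t-n_lags}, ..., y_{t-1}], target y_t."""
    X = np.column_stack(
        [series[i : len(series) - n_lags + i] for i in range(n_lags)]
    )
    y = series[n_lags:]
    return X, y

rng = np.random.default_rng(0)
sales = 100 + np.cumsum(rng.normal(2.0, 1.0, size=48))  # synthetic trend

X, y = make_lags(sales, n_lags=3)

# Ordinary least squares with an intercept term
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# One-step-ahead forecast from the last three observations
next_month = coef[0] + sales[-3:] @ coef[1:]
```

The same feature matrix can be handed to any classical regressor (gradient boosting, random forests, etc.); the lag construction is the part specific to the technique.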
Finally, as the AAAI 2021 best paper award shows, it is also possible to use transformers to predict time series.
Once the modelling approach has been decided, one option is to create a single model that predicts the next time point, then use that prediction as if it were real data to predict the following one, and so on. This technique (rolling forecasting) produces the easiest models to create and maintain (which is no mean feat), but not the best forecasts.
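The rolling technique can be sketched in a few lines: predict one step ahead, append the prediction to the history as if it were observed, and repeat. The drift model below is a toy stand-in for whatever one-step model is actually used.

```python
# Sketch: rolling forecasting with a toy one-step model.
import numpy as np

def rolling_forecast(history, predict_next, horizon: int):
    """Feed each prediction back into the window as if it were real data."""
    window = list(history)
    out = []
    for _ in range(horizon):
        y_hat = predict_next(window)
        out.append(y_hat)
        window.append(y_hat)  # the forecast becomes "observed" data
    return out

def drift(window):
    """Toy one-step model: last value plus the recent average change."""
    diffs = np.diff(window[-6:])
    return window[-1] + float(diffs.mean())

history = [100, 102, 104, 106, 108, 110]
forecast = rolling_forecast(history, drift, horizon=3)
# forecast -> [112.0, 114.0, 116.0]
```

Note how any error in the first prediction is carried into every subsequent one; this compounding is the price paid for the technique's simplicity.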
Another way is to create a separate model for each desired time horizon. For example, create six models, each trained to predict a specific horizon: one for the next month, one for the value in two months' time, and so on. This technique offers better results at the cost of a more complex architecture that is harder to maintain.
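A sketch of this direct multi-horizon strategy, reusing the lag-feature idea: one least-squares model per horizon, each trained to predict the value h steps ahead from the same window of past values. The series is a toy linear trend.

```python
# Sketch: one model per forecast horizon (direct strategy).
import numpy as np

def fit_direct_models(series, n_lags=3, horizons=(1, 2, 3)):
    """Fit one least-squares model per horizon h, target y_{t+h-1}."""
    models = {}
    for h in horizons:
        rows, targets = [], []
        for t in range(n_lags, len(series) - h + 1):
            rows.append(series[t - n_lags : t])   # lag window
            targets.append(series[t + h - 1])      # value h steps ahead
        A = np.column_stack([np.ones(len(rows)), np.array(rows)])
        coef, *_ = np.linalg.lstsq(A, np.array(targets), rcond=None)
        models[h] = coef
    return models

def predict(models, last_window):
    """One prediction per horizon from the latest lag window."""
    x = np.concatenate([[1.0], last_window])
    return {h: float(coef @ x) for h, coef in models.items()}

series = np.arange(24, dtype=float)  # toy linear series
models = fit_direct_models(series)
preds = predict(models, series[-3:])
```

Each model is trained and monitored independently, which is exactly the maintenance overhead the article mentions.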
How can I estimate the impact of an event?
What is the impact of approving a new indication for a medicine? How can the launch of a competitor affect sales?
To answer these questions, one can turn to a branch of applied mathematics that attempts to analyse how much of the change in a variable is due to a given event: causal impact analysis.
If we could go back in time and observe how much of the variation in sales is due to the new event, training a model would be relatively easy. The challenge is that we do not have this information: we do not know how much of the change is due to the event and how much to other variables. To work around this, we can deploy a model that produces a forecast from the moment the event occurs. The difference between the forecast and reality gives us an estimate of the causal impact. This estimate is of course biased, since additional uncontrolled variables are also involved in the change, but if we look at a sufficiently large number of such events, the effects of those other variables tend to cancel out, leaving the one factor common to all of them: the event whose impact we wish to estimate.
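The idea can be sketched on synthetic data: fit a counterfactual model on the pre-event period only, forecast the post-event period, and read the estimated impact off the gap between reality and forecast. All figures below are simulated; libraries in the style of Google's CausalImpact wrap this same logic with richer models.

```python
# Sketch: causal impact via a counterfactual forecast (synthetic data).
import numpy as np

rng = np.random.default_rng(42)
t = np.arange(60)
sales = 50 + 0.5 * t + rng.normal(0, 1.0, size=60)  # baseline trend + noise
event_t = 40
sales[event_t:] += 8.0  # the event lifts sales by roughly 8 units

# Fit a simple linear trend on the pre-event window only
pre_t, pre_y = t[:event_t], sales[:event_t]
A = np.column_stack([np.ones(event_t), pre_t])
coef, *_ = np.linalg.lstsq(A, pre_y, rcond=None)

# Counterfactual: what the model expects had the event not happened
counterfactual = coef[0] + coef[1] * t[event_t:]

# The gap between reality and the counterfactual estimates the impact
impact = sales[event_t:] - counterfactual
estimated_lift = float(impact.mean())  # close to the simulated +8
```

In a real setting the counterfactual model would use control series (other regions, other products) rather than a bare trend, and the estimate would be averaged across many comparable events, as described above.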