How investment banks will benefit from the first wave of machine learning

How are you going to adopt machine learning at your organization?

Machine learning may not have crossed your mind yet, but it’s definitely on the radar of someone at your firm. Over the past quarter, the number of discussions I have had with clients about the impact of machine learning has skyrocketed. Indeed, according to a recent LinkedIn survey, 55% of respondents who work in investment banking believe machine learning to be one of the most important developments in finance. The interest isn’t confined to financial services, either: Google Trends shows an almost eightfold increase in searches for the term over the past few years.


So why is machine learning so appealing?

For investment banking departments, machine learning represents an opportunity to either:

  • Reduce costs, for example by improving process efficiency; or
  • Increase revenues, for example by identifying timely opportunities

These may seem like different sides of the same coin, and indeed either strategy will have a similar impact on earnings, but, before deciding what projects to pursue, it’s important to consider the potential speed to monetization, any impacts on additional machine learning projects, and regulatory or compliance hurdles.

Having spent a lot of time thinking about the potential of machine learning and how it could have the most impact at a bank, I believe that for most firms, it makes sense first to deploy machine learning to reduce costs and then to apply it to revenue generation.

Investment banking can be split into two areas: pitching and execution. While cost reductions on the execution side of the firm are welcome, they won't make a big difference. You’re already making money executing deals, so while you may be able to make the process more efficient, in the grand scheme of things the savings likely won't add up to a big dollar figure.

The pitching side of the business, however, is 100% cost. Any efficiencies or cost reductions achieved there can add up to massive benefits, especially when you consider how much of your time is spent on pitching.

Another critical difference between pitching and execution is the set of legal and regulatory concerns, which fall mainly on the execution side of the business. What do I mean by this? Ask an expert in machine learning how it works and the answer you receive will likely be not only tough to follow, but also tough to explain to your legal and regulatory folks, which may raise some red flags. Plus, are you willing to give advice to your clients based on something you can’t explain in plain English? So my short warning here is to think long and hard about how you want to use machine learning.

Given the constraints of using machine learning on the execution side, it makes sense to deploy it as a cost saving tool on the pitching side of the business.

Machine learning requires data and training

For machine learning to work effectively, lots of data and training are needed. Conceptually, machine learning operates in a similar way to the human brain. We are exposed to a variety of situations (data) and, over time, our brain optimizes how to react to these stimuli (training). Machine learning is not that different. The more data and training an algorithm is exposed to, the better it tends to perform.

What many investment banking teams haven’t tapped into yet is that the "data" part of this relationship doesn't have to be limited to what bankers traditionally consider to be data. As an ex-banker, if someone says "data" to me I immediately think of financial statements or trading data, but with regard to machine learning, data can be much more broadly defined, which in turn can lead to a much broader dollar impact.

You have more data than you think

A while ago, I wrote that investment banks should embrace crowdsourcing. I argued that tapping into knowledge across the firm to do things such as creating a competitive set would save tons of time and create a robust starting point for analysis.

The average bulge bracket bank will create around 20,000 pitchbooks a year. On average, each pitchbook will have 41 pages, 2.1 sets of analysis per page, and eight different companies covered, including the primary company. Multiplied out, that is well over a million sets of analysis a year, each a potential data point for training a model such as a neural net. Currently, we invest a lot of time in developing the right set of comps, but we don’t view the outcome as an institutional data set that can be used again. Shifting expectations of what can be considered data and how it can be reused can point you toward some obvious targets for machine learning-based optimizations, which in turn can lead to notable cost reductions.
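To make the scale concrete, here is a back-of-the-envelope sketch in Python that multiplies out the averages quoted above. It assumes the averages can simply be multiplied together and that every analysis touches every company in the pitchbook; both are simplifications of mine, not measured facts, and the function and key names are purely illustrative.

```python
# Back-of-the-envelope estimate using the averages quoted in the text.
# Assumption: the averages multiply independently, and every analysis
# covers every company in the pitchbook. Real figures will differ.
PITCHBOOKS_PER_YEAR = 20_000
PAGES_PER_PITCHBOOK = 41
ANALYSES_PER_PAGE = 2.1
COMPANIES_PER_PITCHBOOK = 8  # includes the primary company


def estimate_data_points() -> dict:
    """Return rough yearly counts implied by the quoted averages."""
    pages = PITCHBOOKS_PER_YEAR * PAGES_PER_PITCHBOOK
    analyses = round(pages * ANALYSES_PER_PAGE)
    # Upper bound: each analysis observed once per covered company.
    company_observations = analyses * COMPANIES_PER_PITCHBOOK
    # Each pitchbook also records a banker's choice of peers for one target.
    comps_peer_choices = PITCHBOOKS_PER_YEAR * (COMPANIES_PER_PITCHBOOK - 1)
    return {
        "pages": pages,
        "analyses": analyses,
        "company_observations": company_observations,
        "comps_peer_choices": comps_peer_choices,
    }


if __name__ == "__main__":
    for name, count in estimate_data_points().items():
        print(f"{name}: {count:,}")
```

Even if the true numbers are a fraction of these, the point stands: the comps choices bankers already make every year form a sizeable data set that is rarely stored or reused.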

But what about the training part?

If you’ve read this far and thought, “Wait, don’t market data vendors already use algorithms to generate ‘quick comps’ sets?” you are right. But what you may not know is that these lists are generated using “dumb algorithms,” meaning no feedback is passed back to the algorithm to improve the results. So the comps set created today will likely be the same as the comps set generated last year, and probably the same as the one generated in 12 months. There is no evolution or adaptation to new information, and, if given the choice between a dumb algorithm and a smart one, I know which I’d invest in.

With this use case in mind, how do you train an algorithm to optimize a comps set? I break this down into two important components: perspective and success.

First, why do you use a comps set? Usually, it’s because you’re marketing one of a myriad of products or transactions, like a follow-on offering or an investment-grade issuance. Each scenario would likely require a different comps set, which is where your perspective as a banker brings additional value. The second element, success, can be informed by the result of the pitch. We know that most investment banking pitches don’t lead to anything, but for the ones that do, this information should form a feedback loop with the algorithm.

Incorporating these two pieces of incremental information transforms a dumb algorithm into a smart one. In this way, machine learning provides a better pitch comps set for much less work than typical processes. Once in place, a virtuous feedback loop is created where success breeds success, which ultimately is what you’re trying to capture.
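As a rough illustration of how perspective and success could feed back into a comps suggestion, here is a minimal Python sketch. It is a simple score-counting heuristic rather than a trained model, and everything in it (the class name `CompsFeedback`, the reward and penalty values, the company names) is a hypothetical assumption for illustration, not a description of any vendor's or bank's system.

```python
from collections import defaultdict


class CompsFeedback:
    """Toy feedback loop: peers used in winning pitches gain score.

    "Perspective" is modeled as the product being pitched, and
    "success" as whether the pitch won. Both are illustrative choices.
    """

    def __init__(self, win_reward=1.0, loss_penalty=0.1):
        self.win_reward = win_reward
        self.loss_penalty = loss_penalty
        # (product, target) -> peer -> accumulated score
        self._scores = defaultdict(lambda: defaultdict(float))

    def record_pitch(self, product, target, comps, won):
        """Feed the outcome of one pitch back into the peer scores."""
        delta = self.win_reward if won else -self.loss_penalty
        for peer in comps:
            self._scores[(product, target)][peer] += delta

    def suggest(self, product, target, k=5):
        """Return up to k peers, highest score first (ties by name)."""
        scores = self._scores.get((product, target), {})
        ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        return [peer for peer, _ in ranked[:k]]


if __name__ == "__main__":
    feedback = CompsFeedback()
    # Hypothetical pitches for a made-up target company.
    feedback.record_pitch("follow-on", "ACME", ["Alpha", "Beta", "Gamma"], won=False)
    feedback.record_pitch("follow-on", "ACME", ["Beta", "Delta"], won=True)
    print(feedback.suggest("follow-on", "ACME", k=2))
```

Keying the scores by product as well as target is what separates a follow-on comps set from an investment-grade one, and the outcome flag is what lets the list change over time instead of staying the same year after year. A production system would need far more than this (decay of old results, sparse-data handling, explainability), but the shape of the loop is the same.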

You may consider creating a comps set a mundane way of applying the hype of machine learning, but investment bankers spend a lot of time thinking about the comps they use, and it’s an area where significant cost savings can be reaped. It’s also easy to extend this example to a range of different pitching areas, most of which probably wouldn't require new code to implement, just a new set of data to train the system and then the addition of feedback.

In investment banking, the first wave of machine learning deployment is to apply algorithms to already existing data and train the algorithm to optimize results, leading to resource reductions that meaningfully impact the bottom line.

What conversations about machine learning are taking place at your bank? What else could it be readily applied to? Email me with your ideas.