In this episode of Better Edge, Baljash S. Cheema, MD, assistant professor of Cardiology at Northwestern Medicine, and Zach Miller, PhD, manager of Machine Learning and Artificial Intelligence at Northwestern Medicine, discuss the innovative use of machine learning to identify patients with hypertrophic cardiomyopathy and heart failure. Emphasizing the critical collaboration between clinical experts and data scientists, Dr. Cheema and Dr. Miller explain the process of designing and optimizing AI models to better meet clinician needs and improve patient outcomes.
Designing AI Models to Identify Hypertrophic Cardiomyopathy

Zach Miller, PhD | Baljash S. Cheema, MD, MSCI, MSAI
Zach Miller, PhD is a Manager of Machine Learning and Artificial Intelligence at Northwestern Medicine.
Baljash S. Cheema, MD, MSCI, MSAI is an Assistant Professor of Cardiology.
Melanie Cole, MS (Host): Welcome to Better Edge, a Northwestern Medicine podcast for physicians. I'm Melanie Cole. And we have a panel for you today to highlight internal development of machine learning to identify patients with cardiac disease, specifically heart failure and hypertrophic cardiomyopathy.
Joining me is Dr. Baljash Cheema, he's an Assistant Professor of Cardiology at Northwestern Medicine; and Dr. Zach Miller, he's the Manager of Machine Learning and Artificial Intelligence at Northwestern Medicine. Gentlemen, thank you so much for joining us today. This is really a fascinating topic.
Dr. Miller, I'd like to start with you. Can you walk us through the process you followed to develop an AI algorithm for identifying patients with hypertrophic cardiomyopathy, and why was this a good use case for leveraging AI? We're hearing so much about AI in the cardiology space. Tell us about that.
Dr. Zach Miller: Generally, whenever we are starting on a project like this one, we meet with our clinical stakeholders. So Dr. Cheema was our clinical stakeholder who brought this forward. So first thing we try to do is, like-- we're not medical specialists, we are data scientists. So, we sit down and we talk about, you know, what do we want to do with this algorithm? What are we trying to accomplish? Not mathematically, not nerd stuff, but like real clinical, operational outcomes.
And so, once we've talked through that, gotten an understanding of what we're trying to solve, what the problem is, what some of the complexities are, then we go and kind of dig into the data and say, "Okay, what data can we bring to this problem? What concepts from medicine can we turn into mathematical things that we can act on?" And then, how do we bring that into a kind of a machine learning algorithm that will then make some type of prediction that's useful? So in this case, we're trying to just say what is the probability that a patient has hypertrophic cardiomyopathy given the set of data that we have access to?
And this was a really good use case because we had historical data where we've had physicians diagnosing HCM for years and years now. We have some of the best doctors in the world at diagnosing HCM. And so, we were able to use their clinical expertise, turn that into an algorithm by looking at what data did they use to make a decision, what decision did they make, and then try to replicate that with an AI algorithm on the backend.
So, you know, we had really well-defined inputs. We had really defined outputs. And the data was pretty consistent, so that made it a really nice target for us to work on. And so far, it's turned out pretty well.
Melanie Cole, MS: That's very cool, and so forward thinking. Dr. Cheema, from a clinical perspective, what are some of the key indicators of hypertrophic cardiomyopathy that your team considers? How do they align with the features used in the AI model? And while you're telling us that, how did the collaboration between a data scientist and a cardiologist enhance that overall quality and applicability of the AI model, as you think of what the key indicators that you considered were?
Dr. Baljash Cheema: So, this is really a fascinating use case and one that's been a pleasure to be a part of and work on. So, hypertrophic cardiomyopathy, for any of our listeners out there that are not taking care of these patients, this is a genetic condition. It's inherited. People are born with it. They often will develop symptoms in their teenage years or other periods of rapid growth, but it leads to one specific part of the heart being larger than the others. We call it asymmetric hypertrophy, which means bigger than it should be in one particular area, but not all of the heart. And there is no other reason for that to occur in that individual, and so we assume it's this condition called hypertrophic cardiomyopathy.
Now, probably five out of every six people in the U.S. who have hypertrophic cardiomyopathy (I'll abbreviate it as HCM) don't actually know they have the condition. And most of them feel totally fine and don't have any symptoms, but some of them are at risk for more serious health-related outcomes, even the risk of dying suddenly due to a heart rhythm problem. So, finding those patients is really important. We have great therapies that we can offer them now. There are a number of medications that have been studied in clinical trials, some now on the market, that have made a marked difference in the symptoms that people experience. And we have surgical-based approaches and catheter-based approaches as well that we can offer.
Patients with HCM will often feel discomfort in their chest, trouble breathing, palpitations, or racing heart, just generally not feeling as well as they should when they're doing their activities, going through their day or going on hikes and exercising. When we are looking at them clinically, we can see changes to their electrocardiogram or ECG. We can see changes to their echocardiogram, the ultrasound of their heart.
We also document a certain way when we're writing our clinical notes about murmurs that we may hear when we're listening to their heart or certain ways that they're describing their condition. So, all of this gets registered in some way, shape, or form in the electronic health record, either as a read on the ECG or a read on the echocardiogram, or just the way that we write our clinical notes. And so, working with our machine learning and AI colleagues, we pulled features, those specific data points that we thought commonly occur in patients that have HCM. So again, some of the descriptions that we use in the clinical notes, some of the features from the ECG, some of the specific features from the echo into creating this algorithm.
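[Editor's note: The feature-pulling step Dr. Cheema describes, combining echo measurements, ECG reads, and clinical-note language into inputs for an algorithm, can be sketched roughly as below. Every field name and keyword here is a hypothetical stand-in, not the actual feature set.]

```python
# Illustrative sketch (hypothetical field names): turning structured echo/ECG
# values and simple note-text flags into one feature dictionary for a model.
import re

def extract_features(record):
    """Build a flat feature dict from one (toy) EHR record."""
    note = record["note_text"].lower()
    return {
        "septal_thickness_mm": record["echo"]["septal_thickness_mm"],
        "lvot_gradient_mmhg": record["echo"]["lvot_gradient_mmhg"],
        "ecg_lvh_flag": int(record["ecg"]["lvh"]),
        # Crude keyword flags standing in for real NLP over clinical notes.
        "note_mentions_murmur": int(bool(re.search(r"\bmurmur\b", note))),
        "note_mentions_palpitations": int("palpitation" in note),
    }

record = {
    "echo": {"septal_thickness_mm": 18.0, "lvot_gradient_mmhg": 42.0},
    "ecg": {"lvh": True},
    "note_text": "Harsh systolic murmur at left sternal border; reports palpitations.",
}
print(extract_features(record))
```

The real work, as both guests note, is deciding with clinicians which of these signals carry genuine predictive value.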
I would say the collaboration's been fantastic. It's really great working with Dr. Miller's team. This is really a joy to be able to talk about the clinical problems and to think a little bit like a data scientist. I have some background in AI, I got a master's in it, did a fellowship in the program. But really, what I think of the role of myself on the team and other clinicians that have a similar interest to me is to be able to articulate these problems clearly and have some sense of what are the tools we could use and then we transition that over to the experts like Zach and his team to really create the algorithms that we're using and deploying in practice.
Melanie Cole, MS: Well, thank you for that. So, Dr. Miller, how did you approach the validation of the model? Speak about the iterations and revisions that you made based on the validation results, and on what Dr. Cheema was just describing as the features that were most important for the model.
Dr. Zach Miller: So like Dr. Cheema was talking about, one of the things that we try to do is pull in the clinical expertise. So, I think there's a little bit of a general misconception around AI, that it's bringing all of this knowledge to the table that it doesn't matter what you feed it, it's going to do a good job.
But what we've really found is that we are not able to bring medical expertise into the AI space. We have to borrow that from the physician. So, you can think of Dr. Cheema writing a note about a patient. And we need to synthesize that information into an outcome. And so a lot of the iterations and refinement were around what data can we use that Dr. Cheema is highlighting. Like, "Hey, look at these things. Look at these types of measurements. Look at these types of notes. And try to figure out which parts of this actually bring value to the problem? And which parts of this are we not able to-- maybe this measurement is useful for a human, but the model that we're trying to build doesn't have enough context to understand how to use that information."
And so, a lot of the iteration was around, "Let's try these types of clinical notes. Let's look at echocardiogram notes. Let's look at this type of measurement." And ultimately, we've iterated through it three or four times. The data scientist on my team is Asma Baccouche. She did, I think, probably 20 or 30 versions of the model that we tested over time. And as we revised, we narrowed in on what brings predictive power. We are now working through validation of the model. We validated originally on a holdout set, basically a set of patients that we didn't learn anything about. We kept them completely separate while we were training the system. And then, we looked at them and said, "Okay, how well does this model predict on these patients? Does it get answers that a clinician would be able to use? And is it accurate enough that we're not going to be wasting clinician time by giving a bunch of false positives?"
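[Editor's note: The holdout evaluation Dr. Miller describes can be sketched in a few lines. This is an illustrative toy example with synthetic data, not the Northwestern pipeline; the features, model choice, and thresholds here are all assumptions for demonstration.]

```python
# Illustrative sketch (synthetic data, not the real system): keep a set of
# patients completely separate from training, then check precision so the
# model does not flood clinicians with false positives.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GroupShuffleSplit
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 5))  # stand-in features (echo, ECG, note-derived)
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > 1.2).astype(int)
patient_id = np.arange(n)    # grouping matters when a patient has many encounters

# Split at the patient level so no patient appears in both train and test.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=patient_id))

model = GradientBoostingClassifier().fit(X[train_idx], y[train_idx])

# Evaluate only on held-out patients the model has never seen.
pred = model.predict(X[test_idx])
print(f"precision={precision_score(y[test_idx], pred):.2f} "
      f"recall={recall_score(y[test_idx], pred):.2f}")
```

The key design point is that the split is by patient, not by row, so nothing about the holdout patients influences training.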
And once we were happy with that in what we call the test set there, the group that we held out, we then are moving it to what we call a shadow deployment. So basically, we've turned it on in the background. We're not making clinical decisions yet. But we're letting it run and collect data on brand new patients that we've never seen before. Patients that are coming through the health system right now today. And every day, it is pulling down new patients, making predictions, and now we're starting to validate those predictions in kind of the background.
Once we're happy that the model is doing something useful, then we'll be ready to turn it on for real. Once we've kind of completely validated the approach, completely validated the system, and we've got it integrated into the workflow in a nice way, we'll be ready to like use it in clinical practice. But we always try to be really careful that we're not turning on a model that's not going to be operationally useful or wasting a bunch of time by turning on a model that's making bad predictions.
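[Editor's note: The "shadow deployment" Dr. Miller describes, scoring live patients while driving no clinical action, can be sketched as below. All names here (the log file, the record fields, the stub model) are hypothetical illustrations, not the production system.]

```python
# Illustrative sketch (assumed names): a shadow deployment scores incoming
# patients and appends predictions to an audit log for later validation,
# but never surfaces anything to clinicians.
import json
import datetime

def shadow_score(patients, model_predict, log_path="shadow_log.jsonl"):
    """Score new patients and append each prediction to a JSONL audit log."""
    with open(log_path, "a") as log:
        for p in patients:
            record = {
                "patient_id": p["id"],
                "predicted_prob": model_predict(p["features"]),
                "scored_at": datetime.datetime.now(
                    datetime.timezone.utc
                ).isoformat(),
                "acted_on": False,  # shadow mode: no clinical action taken
            }
            log.write(json.dumps(record) + "\n")

# Toy usage with a stub model standing in for the real classifier.
patients = [{"id": "p1", "features": [0.2]}, {"id": "p2", "features": [0.9]}]
shadow_score(patients, model_predict=lambda f: round(f[0], 2))
```

In practice the daily batch of new patients would come from the EHR, and the logged predictions would be compared against subsequent confirmed diagnoses before the model is ever turned on "for real."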
Melanie Cole, MS: This is fascinating. Dr. Cheema, can you point us towards instances where clinical expertise influenced the model's adjustment and refinement, and the considerations you made to assess the feasibility of using this AI model in a real-world clinical setting? Tell us whether the clinical workflow had to be redesigned, and how this directly applies to what you're doing with patients every single day.
Dr. Baljash Cheema: I'll start with the second part of that question, because I think that's the most important piece of this entire process: what are we actually going to do with these tools that we create? How will they actually be used in real life? And one of the benefits that we have is that we have previously worked with Zach and his team on a separate model that is used to identify patients that have advanced heart failure. So, advanced heart failure means the typical medications that we prescribe that help people live longer and have more time outside of the hospital and live a higher quality of life are no longer working, and really the only option for that individual patient to extend their life is a heart transplant or an implantable heart pump that takes over the function of the heart.
So since we've worked on creating that algorithm in the past, we've thought through a lot of the details of, once we have the algorithm, how do we deploy it? Where does it live? Who has access to it? Who's reviewing it? When we have an output from the model, which is a recommendation or a piece of evidence, I would say a prediction really, that an individual patient may have advanced heart failure, how do we then act?
And the way that we have done this is we have a team that reviews these cases, discusses them with the physicians as needed, and then we bring these patients into our clinic to see a heart failure doctor to determine whether the model is right. Are there other opportunities for things that are lesser than a heart transplant or a heart pump? Or should we be really thinking about those interventions and moving forward with them?
So, using the same context that we have from that project, we have thought about how to deploy this model. And so, our main task here is to try to find patients that have HCM but don't know that they have it, especially those that may be having symptoms from HCM. There are some parameters that we can see from the echocardiogram that tell us specifically that high pressures are developing inside the heart; that's the way the body is trying to overcome the challenges that you have when you have HCM, and that can lead to people having symptoms. So, that's a target for us to treat. We call that obstructive HCM, obstruction to the blood flow going through the heart.
There were many instances, I would say, specific times where we thought about what the model is giving us as an output. What are the features it's using to make its prediction, and do they make sense? I remember looking through, and at times some of the blood work that it was using to make its best prediction didn't really correlate with what we see clinically. The types of blood work that we were seeing in one iteration of the model just are not ones that typically vary with HCM. And it probably has something to do with the fact that the patients we were training on weren't coming into the hospital as much, so maybe they weren't getting that blood work checked as frequently.
And once we talked to our machine learning engineers about how to restructure things, we saw different features come up as the most important ones when making the decision, features that actually really made sense clinically. So, it's how thick the heart is, what the pressure is inside the heart, the types of symptoms people were describing.
So, the model that we got to, the model that we're validating right now before we deploy it live in the electronic health record, really correlates with the types of things that I see when I see these HCM patients in my clinic.
Dr. Zach Miller: So, to add on one thing, Dr. Cheema is being very charitable here. And I think it's worth highlighting, in this case, the case he's talking about where we found some blood work that wasn't correlating with his clinical insight, that's actually a huge win for our process. So, we love being wrong. It sounds counterintuitive, but this is all research, right? We're building things that don't currently exist. We're building things that are new, kind of cutting edge. And so, we should be wrong some of the time. And in this specific case, what happened was we were accidentally pulling data that was incomplete in the EHR. When you look back in history, because we're using all that historical data to make predictions, the data from years back, you know, 2018 and prior, was incomplete, which meant that we could accidentally infer whether a patient had HCM or not based on whether they had this measurement or not. It's a little bit technical; the details don't really matter.
But the big thing here is this is a great version where the clinicians were able to look at this and go, "Wait a second. Something about this doesn't make sense. Your process here, whatever's coming out of this doesn't make any sense to me. We can't move forward until we fix this." And it helped us tremendously to have a clinical stakeholder kind of sit down and just bring their intuition to our problem. And we really try to bring this as part of our process. Like, let's talk with the physicians, let's talk with the stakeholders. Let's understand the clinical significance of things. Because if we're wrong and we put it into production, we're going to actively hurt patients. We're going to actively do something that is not clinical standard of care. And so, we always want to be refining our process. And I think it's really important to highlight that, in this specific case, Dr. Cheema was able to catch it right away, and was able to put us on the right track. And that's, I think, a really great part of the collaboration that we have.
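[Editor's note: The leakage problem Dr. Miller describes, where whether a lab was measured at all can encode the diagnosis, lends itself to a quick audit. The numbers and field names below are synthetic assumptions for illustration only.]

```python
# Illustrative sketch (synthetic data): if a lab value is recorded far more
# often for one class, the "was it measured?" signal leaks the label. One
# quick audit: compare the outcome rate among patients with vs. without the
# measurement.
import numpy as np

rng = np.random.default_rng(42)
n = 10_000
has_hcm = rng.random(n) < 0.05  # assumed 5% base rate for this toy cohort

# Suppose the lab was recorded for 90% of HCM patients (seen often in clinic)
# but only 20% of everyone else: missingness now encodes the diagnosis.
lab_recorded = np.where(has_hcm, rng.random(n) < 0.9, rng.random(n) < 0.2)

p_given_recorded = has_hcm[lab_recorded].mean()
p_given_missing = has_hcm[~lab_recorded].mean()
print(f"P(HCM | lab recorded) = {p_given_recorded:.3f}")
print(f"P(HCM | lab missing)  = {p_given_missing:.3f}")
# A large gap flags the feature (or its missingness pattern) as a leakage
# risk: a model could "predict" HCM just from the presence of the lab.
```

A model trained on such a feature looks accurate on historical data but fails on new patients, which is exactly the kind of error a clinical stakeholder can catch by asking whether the top features make medical sense.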
Melanie Cole, MS: Thank you both. And I'd really love to give you each a chance for a final thought here. And, Dr. Miller, looking ahead, how do you envision the ongoing collaboration between data scientists such as yourself and clinicians evolving in the field of Cardiology or medicine in general? I mean, this is a big question. What do you see as the implications for care at Northwestern Medicine, Bluhm Cardiovascular Institute? And if you were to give advice to other providers out there and say, you know, this is really something special that can help us predict, what would you tell them?
Dr. Zach Miller: I think what we're seeing is that my team is already working broadly across the healthcare system, so it's not just Cardiology. We're also working with some Oncology folks. We're working with some Palliative Care folks. What we're really seeing is that we can't necessarily bring the right problem to solve from a nonclinical perspective. But once we know what problem to solve, once we can discuss that and figure out what we can do, we can do a lot of good.
And I think there's a lot of places that we can take this. We can go into clinical care gaps, we can go into identifying patients, we can go into diagnostic information. We can go into searching for the correct patients for, say, a clinical trial. We're looking at all of those different spaces. So, there's a really broad set of problems that can be solved with machine learning and artificial intelligence. But we really are reliant on the physicians to bring us their pain points, bring us their ideas, bring us their concepts.
We're not going to be able to solve every problem with the data we have, but we can always give it a great try. And we're really looking to continue that and make it very collaborative. We think that getting something into operations requires both clinical support and data science support, and we really need to meld those two worlds, and we're seeing a lot of really good outcomes from that so far.
Melanie Cole, MS: What an interesting, really fascinating discussion this is. And Dr. Cheema, last word to you. When you think of the primary benefits and challenges of incorporating AI into clinical practice, what would you like other physicians and clinicians to know about embarking on this kind of initiative? Because it really is about patient outcome, satisfaction, just really overall quality of care. So, what would you like to say to other physicians?
Dr. Baljash Cheema: I really believe that physicians deserve a seat at the table in having these conversations, but that they absolutely should not be the only ones at the table. This has to be a communication, a collaboration between clinicians and data scientists. And that is really the best way to create these products, to make sure that we are bringing the right questions there, which the clinicians can help identify, and to make sure all of us collectively are thinking about how we're going to use these AI models in practice and what is feasible.
And then, you know, in a lot of ways, to get a reality check from the machine learning and AI engineers. I think too often clinicians who don't have some extra training in this space think that data can solve everything. And there are times that it can, but there are many occasions where we either have incomplete data or we're asking the wrong questions, and that's where an AI and a machine learning engineer can sit down and really help clarify what exactly is possible.
So, I really enjoy the partnership that we have with our machine learning and AI team at Northwestern. I think it's certainly the model for the future, and I would advocate for clinicians that are interested in this space to maintain that interest, do some more education on your own accord, and consider specific focused training in machine learning and AI to be able to work on these kinds of projects for the rest of our careers.
I mean, I sincerely think that for the entirety of my career, I will be having these kinds of conversations. Of course, this space changes very fast, so I know this will evolve and grow, but I hope to keep doing machine learning and AI work throughout my time in academic medicine.
Melanie Cole, MS: Wow. Thank you both so much for joining us and really telling us about this collaboration and how important this is to not only the cardiology world, but to medicine in general. Thank you again. And to refer your patient or for more information, please visit our website at breakthroughsforphysicians.nm.org/cardiovascular to get connected with one of our providers. That concludes this episode of Better Edge, a Northwestern Medicine Podcast for physicians. I'm Melanie Cole.