Artificial Intelligence/Machine Learning in the Analysis of Biotherapeutics
Therapeutic protein-based medicinal products (e.g. monoclonal antibodies) have enabled new treatment options for various cancers, autoimmune diseases, infectious diseases and genetic disorders. However, because of their complexity, characterising these products is a major challenge. One such challenge is posed by subvisible particles consisting of protein aggregates, which can form, for example, under stress conditions or in the presence of leachables or silicone oil in pre-filled syringes. Although they account for only a small fraction of the total protein, these particles can increase the risk of adverse immune reactions in patients.
Among the imaging techniques explored to date for characterising these particles, flow imaging microscopy (FIM) has shown particular promise. With FIM, large sets of complex images of individual subvisible particles can be acquired from a single sample. Although these image sets are rich in structural information, analysing them manually is very time-consuming. Commonly used methods for analysing FIM images rely on a small set of predefined features such as aspect ratio, compactness or pixel intensity, so most of the complex morphological information encoded in a FIM image goes unused.
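To make the limitation of predefined features concrete, the following minimal Python sketch computes the three descriptors named above for a single particle image. It is an illustration only: the function name `particle_features`, the thresholding rule (dark particle on a bright background) and the exact formulas (bounding-box aspect ratio, compactness as 4πA/P²) are assumptions, since FIM software packages define these quantities in different ways.

```python
import math

def particle_features(image, threshold):
    """Compute three simple descriptors for one grayscale particle image.

    image: list of rows of pixel values (0-255).
    threshold: pixels darker than this are treated as particle (assumed convention).
    """
    pixels = [(r, c) for r, row in enumerate(image)
              for c, value in enumerate(row) if value < threshold]
    if not pixels:
        raise ValueError("no particle pixels below threshold")

    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    area = len(pixels)

    # Perimeter: number of pixel edges that border a non-particle pixel.
    particle = set(pixels)
    neighbours = ((1, 0), (-1, 0), (0, 1), (0, -1))
    perimeter = sum(1 for r, c in pixels for dr, dc in neighbours
                    if (r + dr, c + dc) not in particle)

    return {
        "aspect_ratio": min(width, height) / max(width, height),
        "compactness": 4 * math.pi * area / perimeter ** 2,
        "mean_intensity": sum(image[r][c] for r, c in pixels) / area,
    }

# Toy 4x5 image with a 2x2 dark particle (illustrative values, not real FIM data).
image = [
    [255, 255, 255, 255, 255],
    [255,  40,  60, 255, 255],
    [255,  50,  70, 255, 255],
    [255, 255, 255, 255, 255],
]
features = particle_features(image, threshold=128)
# aspect_ratio is 1.0, mean_intensity is 55.0, compactness is pi/4 (about 0.785)
```

Each image is reduced to three numbers here, so any texture or fine structure inside the particle is discarded.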
New possibilities through AI
One possible way to overcome the limitations of current optical image analysis techniques for therapeutic proteins is to apply artificial intelligence/machine learning (AI/ML), specifically convolutional neural networks (CNNs or ConvNets). CNNs are a class of artificial neural networks that are already used in many areas of image analysis. They automatically extract data-driven features (i.e. measurable properties) encoded in images. These complex features (e.g. fingerprints specific to certain proteins) can potentially be used to monitor the morphological characteristics of particles in biotherapeutics and to track the concentration of particles in a medicinal product.
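The feature extraction performed by a CNN is built from convolution filters that slide over the image. The short Python sketch below shows a single such filter followed by a ReLU activation. It is a simplified illustration rather than the software described in the FDA article: the helper names `conv2d` and `relu` and the hand-picked `VERTICAL_EDGE` filter are assumptions, and in a trained CNN the filter weights are learned from data instead of being set by hand.

```python
def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation, the operation CNN layers call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0, value) for value in row] for row in feature_map]

# Hand-picked filter that responds to dark-to-bright vertical edges (assumption).
VERTICAL_EDGE = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

# Toy image: dark region on the left, bright region on the right.
image = [
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
]
feature_map = relu(conv2d(image, VERTICAL_EDGE))
# feature_map == [[0, 27, 27], [0, 27, 27]]
```

The feature map is zero over the uniform dark region and large where the filter overlaps the edge. A CNN stacks many such learned filters, which is how it can pick up texture and shape cues that predefined descriptors miss.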
These networks can be trained on large amounts of data using either supervised learning or a fingerprinting approach, both of which have important applications in the analysis of complex visual data such as FIM images. Supervised learning allows a CNN to extract feature information from raw images and to correlate these features with the experimental conditions that produce particles of different morphologies. It relies on predefined labels assigned to the individual images used to train the network. After training, the CNN can predict which of the predefined labels best matches a new image that was not used in training. This approach is useful in root cause analysis when the conditions that can lead to protein aggregation are known in advance.
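The supervised workflow described above (labelled training images, training, then prediction for an unseen image) can be sketched in a few lines of Python. To keep the example self-contained, a nearest-centroid classifier stands in for the CNN, and each image is represented by a short feature vector. The function names, the stress-condition labels "shaking" and "heating" and all numeric values are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict

def train_centroids(features, labels):
    """Average the feature vectors of each label (stand-in for CNN training)."""
    sums = {}
    counts = defaultdict(int)
    for vector, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vector)
        sums[label] = [s + v for s, v in zip(sums[label], vector)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}

def predict(centroids, vector):
    """Return the predefined label whose centroid is closest to the new vector."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vector))

# Hypothetical training set: one feature vector per particle image, each
# labelled with the stress condition that produced it (toy values).
train_features = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.3], [0.3, 0.1]]
train_labels = ["shaking", "shaking", "heating", "heating"]

centroids = train_centroids(train_features, train_labels)

# A new image that was not part of the training set.
predicted = predict(centroids, [0.7, 0.75])
# predicted == "shaking"
```

As in the CNN case, the classifier can only answer with one of the labels it was trained on, which is why this approach fits root cause analysis when the candidate conditions are known beforehand.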
Applying AI/ML in the form of CNNs has made it possible to process large collections of images with high efficiency and accuracy by distinguishing complex "texture features" that are not readily identifiable with existing image processing software. The methodology described in the FDA article is applicable to a range of products in the pharmaceutical and biopharmaceutical industries for monitoring changes in product attributes (e.g. particles/aggregates) during manufacturing. AI thus offers potential new strategies for monitoring and analysing product quality attributes.
To learn more about the software described there and about this topic in general, read the full FDA article "Artificial Intelligence/Machine Learning Assisted Image Analysis for Characterizing Biotherapeutics".