An artificial neural network (ANN) is a series of computer programs that receive data and pass it on. It is modeled on the brain, with each program acting as a neuron that, when triggered by data meeting certain criteria, transfers information to other neurons in the network.

The key advantage is that a network can be trained to recognize patterns independently. For example, while a classical analysis system can look at an image, a neural network can determine whether it is a picture of a cat.

Training involves feeding data to the first layer of neurons in the network, which might, for example, be programmed to detect areas of light and dark. These feed data to the next layer of neurons, which recognize edges, and these in turn connect to neurons that identify shapes, and so on.


A network trained to recognize patterns can also make predictions based on data, which has obvious applications in process development: researchers can modify culture parameters to assess the likely impact on titer.

The challenge is that dynamic cell cultures are more complicated than static pet pictures, according to Jens Smiatek, PhD, lecturer and head of the theoretical chemical physics group, Institute for Computational Physics, University of Stuttgart, who says industry should consider using recurrent neural networks (RNNs).

“Standard feed-forward artificial neural networks are well-suited to predict, classify, or calculate static properties. A typical example: train the network to identify cats and dogs in photos,” says Smiatek.

“The RNN is able to predict and to classify dynamic properties. For example: predict the trajectory of cats and dogs in movies. Hence, one usually maps input to output (temporal) sequences while standard ANNs usually map static input to output variables. Both [ANNs and RNNs] have to be trained by iterative schemes. Thus, the methods mainly differ in their applications.”


Smiatek and colleagues shared details of such an RNN in May, writing that unlike traditional neural networks, which assume that inputs and outputs are independent, the output of a recurrent neural network depends on the prior inputs.
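The difference between the two network types can be illustrated with a minimal sketch. It is not the model from the publication: the single-neuron setup, the weights, and the function names `feed_forward` and `run_rnn` are assumptions chosen only to show that a recurrent output carries a memory of earlier inputs, whereas a feed-forward output does not.

```python
import math


def feed_forward(x, w=0.8, b=0.1):
    """Static map: the output depends only on the current input x."""
    return math.tanh(w * x + b)


def run_rnn(inputs, w_in=0.8, w_rec=0.5, b=0.1):
    """Simple recurrence: h_t = tanh(w_in * x_t + w_rec * h_{t-1} + b).

    The hidden state h is carried from step to step, so each output
    depends on the whole input history, not just the latest value.
    """
    h = 0.0  # initial hidden state
    outputs = []
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h + b)
        outputs.append(h)
    return outputs


# Two illustrative sequences that end in the same value (made-up numbers).
seq_a = [0.2, 0.4, 1.0]
seq_b = [0.9, -0.7, 1.0]

ff_a = feed_forward(seq_a[-1])
ff_b = feed_forward(seq_b[-1])
rnn_a = run_rnn(seq_a)[-1]
rnn_b = run_rnn(seq_b)[-1]

print(ff_a == ff_b)    # True: same final input, same output
print(rnn_a == rnn_b)  # False: different histories, different outputs
```

In a cultivation setting, the sequence would be a time series of process measurements rather than toy numbers, and the weights would be learned from historic runs rather than fixed by hand.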

The researchers claim their system has the potential to predict, with a high degree of accuracy, the impact that changes to key process parameters would have on titer.

“Trained RNNs can be used to predict the outcomes of future cultivation processes. In the publication, we have shown that RNNs can be used to predict the properties of future platform processes. Therefore, we can try to find optimal process conditions (temperatures, pH values, etc.) even before the initial wet-lab work starts. Moreover, one could identify deviating process conditions in terms of model predictive control,” notes Smiatek.

Having process data to train the network is key, Smiatek says, which means biopharmaceutical firms with the appropriate infrastructure are well placed.

“The RNN can be used for early process predictions, control, and optimization without a significant amount of actual wet-lab work. A prerequisite [is] historic data sets of closely related processes with a broad variability in parameter space,” continues Smiatek.

“If such data are available, one can optimize future processes in terms of tailor-made process conditions for high product titer and quality. Thus, the RNNs reveal their largest benefit for the development of early-stage processes. In consequence, wet-lab work can be reduced, which saves time and costs. The corresponding experiments can be performed in silico.”

Data choice

However, while a data infrastructure is a prerequisite for firms interested in using an RNN, technology investment needs can be minimized by carefully selecting the information used for training, Smiatek says.

“One can choose which data to use, for instance ignoring viabilities, pH values etc., depending on the availability. However, the accuracy is definitely higher if more information and data is included,” according to Smiatek.

“Thus, routinely monitored process data from cultivation processes such as VCD [viable cell density], TCD [total cell density], viability, glucose and lactate concentration, and titer is fully sufficient for a first guess. Even the amount of historic data [need] not be that high as long as the parameter space is sufficiently covered in terms of extreme and mean values.

“The average biomanufacturing line would provide enough data for a reliable RNN model. The consideration of in-process monitoring data would be helpful but is not a must have for accurate predictions.”
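One way to read the requirement that the parameter space be covered "in terms of extreme and mean values" is as a simple range check on the historic data before a model is trained. The sketch below is an illustration under that assumption, not a method described by Smiatek: the function `coverage_report`, the parameter names, and all numbers are hypothetical.

```python
def coverage_report(history, planned):
    """Summarize historic runs per parameter and flag planned values
    that fall outside the range the historic data covers."""
    report = {}
    for name, target in planned.items():
        values = [run[name] for run in history]
        low, high = min(values), max(values)
        report[name] = {
            "min": low,
            "max": high,
            "mean": sum(values) / len(values),
            # A prediction outside the historic range is an extrapolation.
            "covered": low <= target <= high,
        }
    return report


# Made-up historic set points, for illustration only.
history = [
    {"temperature_c": 36.0, "ph": 6.9},
    {"temperature_c": 37.0, "ph": 7.1},
    {"temperature_c": 35.0, "ph": 7.0},
]

# Hypothetical conditions for a future process.
planned = {"temperature_c": 36.5, "ph": 7.3}

report = coverage_report(history, planned)
for name, row in report.items():
    print(name, row)
```

Here the planned temperature lies inside the historic range while the planned pH does not, so a prediction at that pH would rest on extrapolation rather than on data the network has seen.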