The integration of Deep Learning methods to improve Nuclei Detection and Segmentation
31 May 2022
By OracleBio

In this article, we briefly discuss different methods to detect and segment nuclei and, in particular, give examples of how artificial intelligence (AI) methods based on Deep Learning (DL) are increasingly being applied within our organisation to streamline our image analysis workflow.

Nuclei identification is a fundamental step in the quantification of digital whole slide images (WSI) in both healthy and diseased tissue. Variability in tissue quality and nuclear morphology, together with stain heterogeneity across a particular sample set, presents challenges that may compromise accurate nuclei detection and segmentation and thereby affect the generation of robust, high-quality data.

Nuclear structure and composition are relevant to normal development and physiology and alterations or variances can contribute to many human diseases, such as cancer. For example, aberrations in nuclear structure and morphology associated with cancer may include irregular nuclear shape and nucleolar alterations.

The heteromorphic appearance of nuclei, in both healthy and diseased cells, can affect the detection and segmentation of nuclei, which in turn can compromise the accuracy of the data generated. Moreover, variability arising during sample fixation, slide preparation and staining, as well as scanning artefacts, adds a further level of complexity to the automation of nuclei detection and segmentation.

Although various traditional Machine Learning (ML) methods for nuclei detection and segmentation are available, such as Random Forests and K-means clustering, they are typically designed to address specific cell types or indications and are therefore limited in their ability to accurately detect and segment nuclei within a heterogeneous sample set.

Traditional methods relying on intensity thresholding are, by design, highly sensitive to local and global variations in absolute pixel intensity values. This can hamper the accuracy of detecting cell nuclei across all samples within a particular image set, where variability in staining is introduced by technical processes. Furthermore, these methods do not account for other biological factors such as tissue environments containing high cell densities, where overlapping structures result in nuclear boundaries being blurred or obscured. Collectively, these elements make it difficult to delineate the boundary of individual nuclear objects accurately, which ultimately has a negative impact on the quantification of WSI. In practice, traditional nuclei detection and segmentation methods require multiple post-processing steps to achieve optimal performance across a particular sample set.
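
The sensitivity of a fixed intensity threshold to staining variation can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the synthetic tile, the threshold of 0.5, the scaling factor of 0.7 and the helper name `count_foreground` are hypothetical and do not come from our workflow.

```python
import numpy as np

def count_foreground(image, threshold):
    """Count pixels that a fixed global threshold classes as nucleus."""
    return int((image > threshold).sum())

# Synthetic 8x8 tile (arbitrary intensity units): background at 0.1,
# with a 4x4 "nucleus" at 0.6. Hypothetical data, for illustration only.
tile = np.full((8, 8), 0.1)
tile[2:6, 2:6] = 0.6

# The same tile with weaker staining: every intensity scaled by 0.7.
weak_tile = tile * 0.7

# Hypothetical threshold tuned on the well-stained tile.
threshold = 0.5

well_stained_pixels = count_foreground(tile, threshold)
weak_stained_pixels = count_foreground(weak_tile, threshold)
print(well_stained_pixels, weak_stained_pixels)
```

In this toy case the nucleus is detected in the well-stained tile and missed entirely in the weakly stained one, although the underlying object is identical.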

To address these challenges, we are actively integrating DL methods into our image analysis workflow to better manage sample heterogeneity and to enable us to detect and segment nuclei with a greater degree of accuracy and efficiency across the HALO® and Visiopharm software platforms.

DL methods have the potential to improve significantly on traditional ML applications, thereby providing a valuable tool for both pathologists and scientists in the digital pathology field. DL is a sophisticated form of ML and is part of the broader family of AI methods. Specifically, it is composed of complex, multi-layered neural networks that rely on training data to automatically and adaptively learn hierarchies of features, which ultimately enables object detection and image classification. DL methods can be applied to various stains and tissues exhibiting a wide range of clinical presentations and histopathological features.

Examples of how DL methods developed in-house at OracleBio perform in segmenting nuclei across chromogenic and fluorescent stained skin, NSCLC and tonsil tissue sections are shown in the figures below.

[Above – Figure 1]: Nuclei Detection and Segmentation using a DL method applied across an immunohistochemically (IHC) stained skin tissue sample:

  • A: IHC stained skin tissue section exhibiting heteromorphic and sometimes incomplete nuclei
  • B: Colour deconvolution was employed to better demonstrate nuclei appearance
  • C: A DL Nuclei Segmentation algorithm was trained across a heterogeneous sample set and subsequently applied across this tissue section to detect and segment cell nuclei (green overlay mask)
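
The colour deconvolution step mentioned for panel B can be sketched as follows. This is an illustrative implementation of the general technique and not the routine used in HALO®, Visiopharm or our own pipeline. The stain vectors are the commonly published haematoxylin, eosin and DAB values from Ruifrok and Johnston; which vectors were used for the figure is not stated, so treat them, and the function name `colour_deconvolve`, as assumptions.

```python
import numpy as np

# Commonly published stain vectors (rows: haematoxylin, eosin, DAB).
# Assumption: the vectors used for the figure are not given in the article.
STAIN_VECTORS = np.array([
    [0.65, 0.70, 0.29],
    [0.07, 0.99, 0.11],
    [0.27, 0.57, 0.78],
])
# Normalise each stain vector to unit length.
UNIT_STAIN_VECTORS = STAIN_VECTORS / np.linalg.norm(
    STAIN_VECTORS, axis=1, keepdims=True
)

def colour_deconvolve(rgb):
    """Split an RGB image (floats in (0, 1]) into per-stain channels.

    Returns an array of the same shape whose last axis holds the
    haematoxylin, eosin and DAB contributions.
    """
    rgb = np.clip(np.asarray(rgb, dtype=float), 1e-6, 1.0)
    # Beer-Lambert: optical density is linear in stain concentration.
    optical_density = -np.log(rgb)
    # OD = C @ UNIT_STAIN_VECTORS, so C = OD @ inverse(UNIT_STAIN_VECTORS).
    concentrations = optical_density.reshape(-1, 3) @ np.linalg.inv(
        UNIT_STAIN_VECTORS
    )
    return concentrations.reshape(rgb.shape)

# Usage: a single synthetic pixel stained with haematoxylin only.
pixel = np.exp(-UNIT_STAIN_VECTORS[0]).reshape(1, 1, 3)
channels = colour_deconvolve(pixel)
print(channels[0, 0])
```

In practice, stain vectors are usually calibrated to the scanner and staining protocol rather than taken from published defaults.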

[Above – Figure 2]: Nuclei Detection and Segmentation using a DL method applied across a multiplex immunofluorescence (mIF) stained NSCLC tissue sample:

  • A: Magnified view of a mIF stained non-small cell lung cancer (NSCLC) tissue section
  • B: Magnified view of the same mIF stained NSCLC tissue section shown in A demonstrating the ability of a DL Nuclei Segmentation algorithm to accurately detect individual DAPI stained nuclei (outlined in blue) within a nuclei-rich environment

[Above – Figure 3]: Nuclei Detection within a cell dense region across an IHC stained tonsil tissue sample:

  • A: IHC stained tonsil tissue section exhibiting cell dense regions
  • B: Colour deconvolution was employed to highlight the nuclei
  • C: A DL Nuclei Segmentation algorithm was applied across this tissue section to detect and segment cell nuclei (blue overlay mask)

There are several considerations to take into account when deploying DL methods.

In the first instance, the selection of an appropriate deep learning network is crucial and requires an understanding of which network is best suited to the task at hand. Some networks, such as U-Net or DenseNet, work better for the detection of small objects like nuclei, compared to DeepLabv3, which is optimised to detect larger structures (e.g., tissue regions of interest).

The input magnification level employed is very much task-dependent, as different magnifications provide varying levels of information. For instance, fine cellular structures and abnormalities are best detected at higher magnifications, whereas general tissue architecture is better discriminated at lower magnifications.

Appropriate and adequate feature selection is an important processing strategy. Failure to include representative training data, e.g., a broad range of nuclear staining and morphologies, may lead to classifier over- or underfitting. Overfitting means that a classifier was trained on only a very specific type of pathology, so the network will struggle to generalise and detect slight variations of the same feature. Underfitting, on the other hand, occurs when not enough training and/or data is provided to the network, which results in an overly simplistic model that is unable to differentiate between the different classes. We customise training data on a per-study basis, following a review of the full image set, in order to minimise this bias.

Where required, the training data is reviewed by our in-house clinical pathologists prior to initiating the training of DL algorithms to ensure all required nuclear pathologies are included.

Algorithm validation:

Once a DL nuclei segmentation algorithm has been trained, we employ validation processes to assess the accuracy of the algorithm. A set of randomly selected study images is used to compare the number of nuclei detected by the algorithm within the selected fields of view (FoV) per sample with the number obtained by a manual count on the same FoV. A mean correlation (R²) of >0.8 across all images evaluated is deemed an acceptable level of accuracy. Regression parameters (slope/intercept) per marker can also be provided as part of the read-outs.
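
The comparison described above can be sketched in a few lines. This is a minimal illustration that assumes a simple linear regression of algorithm counts against manual counts per image. The function names `regress_counts` and `mean_r2` and the example counts are hypothetical; only the >0.8 acceptance threshold comes from the text.

```python
import numpy as np

def regress_counts(algorithm_counts, manual_counts, r2_threshold=0.8):
    """Regress algorithm counts on manual counts for the FoVs of one image."""
    x = np.asarray(manual_counts, dtype=float)
    y = np.asarray(algorithm_counts, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)   # least-squares straight line
    r2 = np.corrcoef(x, y)[0, 1] ** 2        # squared Pearson correlation
    return {
        "slope": float(slope),
        "intercept": float(intercept),
        "r2": float(r2),
        "acceptable": bool(r2 > r2_threshold),
    }

def mean_r2(per_image_counts):
    """Average R² over (algorithm_counts, manual_counts) pairs, one per image."""
    return float(np.mean([regress_counts(a, m)["r2"] for a, m in per_image_counts]))

# Usage with made-up counts for three FoVs in each of two images.
images = [
    ([102, 251, 398], [100, 250, 400]),
    ([48, 120, 205], [50, 118, 210]),
]
for algorithm, manual in images:
    print(regress_counts(algorithm, manual))
print("mean R²:", mean_r2(images))
```

A slope close to 1 and an intercept close to 0 indicate that the algorithm neither systematically over-counts nor under-counts relative to the manual count.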

Benefits of DL for nuclear detection:

Overall, the key benefit of DL is the ability to robustly identify nuclei, even when sparse nuclear staining and poor preservation are encountered, especially in tumour samples. Nuclear detection using DL is found to be far superior to previous ML or feature-driven approaches.

In Summary

The integration of Deep Learning methods into our image analysis workflow has enabled us to better manage heterogeneous sample sets and thereby enhance the efficiency and accuracy of the data generated.

Ultimately, deploying these methods will help to improve R&D data, diagnostic algorithm performance, and clinical outcomes for patients.

Karen McClymont

About the author: 

Marianne Cowan — Senior Image Analysis Scientist at OracleBio

With a PhD in Molecular, Cell and Systems Biology, Marianne joined OracleBio in 2016 as a Senior Image Analysis Scientist. She is a key member of the Image Analysis team, where she is actively involved in managing and supporting studies for our clients.
