With this much variation between image types, a single unified analysis module is not feasible. That is why, at Ardigen, we have developed a technology platform that can handle all of them. Drawing on our expertise, we choose the computer vision approach best suited to a given image type and to the challenges it poses. For histological images, we need to deal with differences in staining and tissue preparation between medical centers. Microarray image analysis is challenging because the data comes from both scanners and imagers, each with its own artifacts. HCS images, in turn, can be presented as 3D volumes or as 2D slices.
Challenges in biomedical image analysis
Before we dive into the details of image analysis, let us introduce a few challenges that we have encountered in our projects:
• very high image resolution: to properly show a single cell or a spot, many images, such as HCS, histology, or microarray data, need to be acquired at high magnification. Depending on the task, the images can be resized (e.g. microarrays, to register the grid) or partitioned into smaller patches (e.g. HCS, histology) so that valuable detail is preserved,
• weak labels, which do not correspond directly to a single data point (e.g. a patch) but to a bag of instances (e.g. the set of patches made from a whole-slide biopsy image),
• batch effects, which can be related to the device acquiring the image (imager vs. scanner) or to the different laboratories that prepare the materials and acquire the images,
• data imbalance: edge cases are underrepresented in the dataset because they are uncommon in the population,
• data quality: even curated databases have issues, such as blurriness, that can affect AI algorithms, especially datasets that were not intended to be used with AI,
• label quality, because doctors’ diagnoses or human annotators’ labels can disagree with one another,
• relatively small number of images available for algorithm training, compared to natural image datasets like ImageNet,
• artifacts related to the devices, like luminance glitches, distortions, or any other image defects,
• presenting the rationale behind a prediction, which is essential when the algorithm makes a decision on which a human life may depend,
• assessment of the reliability of the prediction: whether it is trustworthy or not,
• and many, many more.
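To make a few of these points concrete, here is a minimal Python sketch (using NumPy) of three ideas from the list above: partitioning a large image into patches, grouping those patches into a weakly labeled bag, and computing inverse-frequency class weights as one common way to counter data imbalance. The function names `tile_image`, `make_bag`, and `inverse_frequency_weights`, the patch size, and the choice to drop incomplete edge patches are all assumptions made for illustration; this is not a description of Ardigen's platform.

```python
import numpy as np


def tile_image(image, patch_size):
    """Split an (H, W) or (H, W, C) array into non-overlapping square patches.

    Assumption for brevity: edge regions that do not fill a whole patch are
    dropped. Padding the image first is a common alternative.
    """
    height, width = image.shape[:2]
    if height < patch_size or width < patch_size:
        raise ValueError("image is smaller than a single patch")
    patches = []
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            patches.append(image[top:top + patch_size, left:left + patch_size])
    return np.stack(patches)


def make_bag(image, slide_label, patch_size):
    """Build a weakly labeled bag: many patches share one slide-level label.

    No individual patch carries its own label; only the bag does.
    """
    return {"patches": tile_image(image, patch_size), "label": slide_label}


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes receive larger weights, which can be passed to a weighted
    loss function during training.
    """
    classes, counts = np.unique(np.asarray(labels), return_counts=True)
    weights = len(labels) / (len(classes) * counts)
    return dict(zip(classes.tolist(), weights.tolist()))


# Hypothetical example: a blank 1000 x 1500 RGB "slide" with slide-level label 1.
slide = np.zeros((1000, 1500, 3), dtype=np.uint8)
bag = make_bag(slide, slide_label=1, patch_size=256)
print(bag["patches"].shape)  # (15, 256, 256, 3): 3 rows x 5 columns of patches

# Three common cases and one rare case: the rare class gets the larger weight.
print(inverse_frequency_weights([0, 0, 0, 1]))
```

In practice, whole-slide images are usually too large to load into memory as a single array, so the tiling step would read regions from disk instead. The sketch only shows the bookkeeping that connects patches, bags, and labels.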