Computer vision techniques may transform the medical field, in particular the way diagnoses are made and how accurate they are. Computer vision algorithms excel at detecting complex patterns within images that may go unnoticed even by trained clinicians. Improving the accuracy of a computer vision system helps to reduce false positive rates and increase true positive rates. Thus, these systems can help doctors allocate their time and resources more effectively. Medical images may stem from a variety of imaging modalities, such as X-ray, ultrasound, computed tomography, and (functional) magnetic resonance imaging.
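To make the two rates mentioned above concrete, here is a minimal sketch of how they could be computed from a list of known diagnoses and a list of predictions. The labels are made up for illustration (1 = condition present, 0 = condition absent), and the helper name `rates` is hypothetical.

```python
def rates(y_true, y_pred):
    """Return (true positive rate, false positive rate) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # sick, flagged as sick
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # sick, missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged as sick
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, correctly cleared
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Made-up example: 4 patients with the condition, 6 without.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tpr, fpr = rates(y_true, y_pred)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")  # prints "TPR = 0.75, FPR = 0.17"
```

In this toy example, three of the four actual cases are found (true positive rate 0.75) and one of the six healthy patients is flagged by mistake (false positive rate of about 0.17).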
Before we dive deeper into computer vision applications in the medical field, let’s take a moment to discuss computer vision in general and how it enables machines to “see”.
Humans perceive visual information through light that reaches the eye and hits the photoreceptors on the retina. These in turn trigger signals that are sent to the brain, where an image is formed and processed. Machines receive visual information as digital images, which are based on an array of numbers. Each cell of the array constitutes a pixel, and the numbers stored in that cell define the pixel’s color. While human vision is not yet fully understood, several of its concepts have inspired computer vision. However, human vision operates on a more abstract level than computer vision. Humans use abstract concepts to make sense of visual information and consider the context in which an image appears. This often allows us to understand visual information even if we have never seen it before. In contrast, machines mainly operate by comparing visual input to information they have seen before.
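The idea of an image as an array of numbers can be sketched in a few lines of Python. The pixel values below are made up, and plain nested lists are used to keep the example self-contained. In practice an array library such as NumPy would typically be used instead.

```python
# A 3x3 grayscale image: one number per pixel (0 = black, 255 = white).
gray = [
    [0, 128, 255],
    [64, 128, 192],
    [255, 128, 0],
]

# A 2x2 color image: three numbers per pixel (red, green, blue).
rgb = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

height, width = len(gray), len(gray[0])
center_pixel = gray[1][1]   # row 1, column 1
blue_pixel = rgb[1][0]      # the pure blue pixel

# Because pixels are just numbers, image operations are arithmetic.
# Inverting the image, for instance, turns dark pixels bright and vice versa.
inverted = [[255 - value for value in row] for row in gray]

print(height, width, center_pixel, blue_pixel)
print(inverted)
```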
Artificial neural networks are used in computer vision for image classification and object recognition. These networks are trained on large image datasets that contain thousands of images. In machine learning, training refers to the adjustment of weights within a neural network so that it represents the training dataset as well as possible (in this case images together with their known results, e.g. class labels). Convolutional neural networks (CNNs) are one type of neural network that is particularly well suited for the analysis and interpretation of images.
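The following sketch illustrates what "adjusting weights" means, using a deliberately simplified setup: a single artificial neuron rather than a CNN, trained by gradient descent on four made-up 2x2 "images" labelled bright (1) or dark (0). The function names, the learning rate, and the number of epochs are assumptions chosen for illustration only.

```python
import math


def predict(pixels, weights, bias):
    """Weighted sum of the pixel values, squashed into the range (0, 1)."""
    z = sum(w * x for w, x in zip(weights, pixels)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train(images, labels, learning_rate=0.5, epochs=200):
    """Repeatedly nudge the weights so predictions move toward the labels."""
    weights = [0.0] * len(images[0])
    bias = 0.0
    for _ in range(epochs):
        for pixels, label in zip(images, labels):
            error = predict(pixels, weights, bias) - label
            # Each weight moves against the error, scaled by its input pixel.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, pixels)]
            bias -= learning_rate * error
    return weights, bias


# Made-up training set: flattened 2x2 images with pixel values in [0, 1].
images = [
    [0.9, 0.8, 1.0, 0.7],  # bright
    [0.8, 0.9, 0.9, 1.0],  # bright
    [0.1, 0.2, 0.0, 0.1],  # dark
    [0.2, 0.0, 0.1, 0.3],  # dark
]
labels = [1, 1, 0, 0]

weights, bias = train(images, labels)
for pixels, label in zip(images, labels):
    print(label, round(predict(pixels, weights, bias), 3))
```

Before training, every prediction is 0.5. After training, the bright images score close to 1 and the dark ones close to 0. A real CNN follows the same principle with many more weights, arranged in convolutional layers.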
Once a CNN has been trained, it can analyze and classify new images by comparing them to the patterns it has learned for certain object categories from the images it has seen before. A CNN could, for example, analyze an X-ray image of a hip bone. If it detects patterns that match those it has learned for “hip fracture”, the image will be classified as showing a fractured hip. It’s important to remember that this ability to identify and distinguish objects only extends to images that are highly similar to the initial training data. Even slight movements that result in motion blur, or other minor changes, may already interfere with the classifier’s abilities.
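To see why motion blur can matter, it helps to look at what it does to the pixel values a classifier receives. The sketch below simulates a simple horizontal blur by averaging each pixel with its left and right neighbours. The tiny image and the helper name `horizontal_blur` are made up for illustration, and real motion blur is usually more complex.

```python
def horizontal_blur(image, length=3):
    """Average each pixel with its horizontal neighbours (edges are clamped)."""
    half = length // 2
    blurred = []
    for row in image:
        last = len(row) - 1
        new_row = []
        for x in range(len(row)):
            window = [row[min(max(x + d, 0), last)]
                      for d in range(-half, half + 1)]
            new_row.append(sum(window) / len(window))
        blurred.append(new_row)
    return blurred


# A sharp vertical edge: dark on the left, bright on the right.
sharp = [[0, 0, 0, 255, 255, 255] for _ in range(3)]
blurred = horizontal_blur(sharp)

print(sharp[0])    # prints [0, 0, 0, 255, 255, 255]
print(blurred[0])  # prints [0.0, 0.0, 85.0, 170.0, 255.0, 255.0]
```

The abrupt jump from 0 to 255 becomes a gradual ramp. A model that learned to rely on sharp edges during training now sees input that differs from anything in its training data.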
X-ray images are among the most widely used types of medical images. They enable clinicians to detect anomalies, originally within bone structures and more recently in soft tissue such as fat, muscles and organs as well. Computer vision systems can be trained to classify X-ray images into different categories such as “bone fracture” or “lesion”. Employing high-quality training data based on annotated images allows computer vision systems to compete with or even outperform human experts in the accuracy of their X-ray analysis.
Such a computer vision application was, for example, proposed by researchers at Shenzhen University, China. They introduced an AI system for the examination of X-ray images to detect nodules. Their experimental results showed that the AI system was able to detect nodules with high precision and even outperformed radiologists in this task.
Ultrasound images are used, for example, in the diagnosis of abdominal problems such as conditions affecting the liver or the kidneys. Another well-known application of ultrasound imaging is the regular monitoring of fetal development during pregnancy.
A team of researchers has introduced an AI system for the detection of precursors to thyroid cancer based on ultrasound images and the AutoML Vision tool by Google. This AI system predicted possible thyroid conditions with an accuracy of 77.4% when examining lesions in images of more than 100 patients. The researchers plan to refine their system to provide an automatic thyroid cancer screening tool.
Computed tomography (CT) scans consist of a collection of X-ray images (slices) of a body part such as the chest or the brain, which allow for the examination of individual layers within that body part. CT scans are often used when screening for the presence of tumors or internal bleeding in organs. Thus, CT scans play a crucial role in the detection of potentially life-threatening conditions.
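For a program, a CT scan described this way is simply a stack of 2D images, that is, a three-dimensional array. The sketch below uses a tiny made-up volume of three 2x2 slices with arbitrary intensity values. The indexing convention `volume[z][y][x]` and the helper `mean_intensity` are assumptions for illustration.

```python
# A made-up CT volume: 3 slices of 2x2 pixels, indexed as volume[z][y][x].
volume = [
    [[0, 0], [0, 0]],     # slice 0
    [[0, 40], [0, 0]],    # slice 1
    [[0, 80], [60, 0]],   # slice 2
]

depth = len(volume)
height = len(volume[0])
width = len(volume[0][0])

# Examining an individual layer means picking one slice from the stack.
middle_slice = volume[1]


def mean_intensity(slice_2d):
    """Average pixel value of one slice."""
    values = [value for row in slice_2d for value in row]
    return sum(values) / len(values)


# Example analysis: which slice is brightest on average?
brightest = max(range(depth), key=lambda z: mean_intensity(volume[z]))

print((depth, height, width))  # prints (3, 2, 2)
print(brightest)               # prints 2
```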
AI systems have, for example, been applied to CT images for the detection of lung cancer with an accuracy of approximately 95%, which far exceeds the 65% accuracy that trained clinicians achieve.
Another computer vision system for the analysis of CT scans of the head outperformed half of the expert radiologists it was compared with in locating small hemorrhages. The system delivered a prediction regarding the presence of a hemorrhage within a second. If a hemorrhage was found, the system could also determine its location within the brain. Such a system may support doctors in the diagnosis of traumatic brain injuries and strokes.
Magnetic resonance imaging (MRI) is used to detect soft tissue damage, joint problems, damage to the circulatory system, and problems affecting the bone structure. Image annotation techniques such as semantic segmentation can be used to build machine learning approaches for the detection of health problems that are currently hard to identify, such as cerebral aneurysms or clogged blood vessels. Naturally, as more and more high-quality training data becomes available, the accuracy of MRI-based computer vision tools will increase and further areas of application will be explored.
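In semantic segmentation, every pixel of an image receives a class label, so an annotation is itself an array of the same size as the image. The sketch below shows this with a made-up 4x4 image and a matching mask. The class names and pixel values are hypothetical and not taken from any real MRI dataset.

```python
BACKGROUND, VESSEL = 0, 1

# Made-up 4x4 image (pixel intensities).
image = [
    [10, 12, 200, 11],
    [9, 210, 205, 10],
    [11, 198, 12, 9],
    [10, 11, 10, 12],
]

# Segmentation mask: one class label per pixel, same shape as the image.
mask = [
    [0, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

# Collect the intensities of all pixels annotated as "vessel".
vessel_values = [
    image[y][x]
    for y in range(len(mask))
    for x in range(len(mask[0]))
    if mask[y][x] == VESSEL
]

total_pixels = len(mask) * len(mask[0])
vessel_fraction = len(vessel_values) / total_pixels

print(vessel_values)    # prints [200, 210, 205, 198]
print(vessel_fraction)  # prints 0.25
```

Pairs of images and masks like these serve as training data: the model learns to predict the mask when it is given only the image.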
A 2018 study employed computer vision techniques for the analysis of functional MRI (fMRI) data and the diagnosis of cognitive impairment. Cognitive disorders can be identified by examining brain activity, and fMRI scans visualize which areas of the brain are active during a certain task or at a certain point in time.