Using Machine Learning to Improve Breast Cancer Detection | Writing at the Research University, Spring 2020

After decades of replication, the human genome begins to deteriorate, and mutations accumulate that can transform healthy cells into malignant tumors. As human lifespans steadily increase due to improvements in medicine, so too do the chances of new cancerous mutations appearing (National Cancer Institute). As a result, cancer has become an ever more present specter in the lives of many. For women, breast cancer is the second most common cause of cancer-related death. The sheer number of breast cancer screenings performed each year reflects how common the disease is among women: some 39 million screening exams were performed in 2014 alone (Wu et al. 2019). However, with the improvement of computational technologies over the past decade, computer scientists and doctors have teamed up to research tools that can combat this disease more effectively.

The first step towards treating breast cancer is screening for a tumor. Traditionally, a radiologist screening for breast cancer reviews mammograms and makes predictions based on their own expertise, often with the assistance of Computer Aided Detection (CAD) technologies. These current technologies are primitive, providing radiologists with only basic annotation capabilities. A study by Lehman et al. (2015) showed that these CAD technologies do not actually improve the accuracy of diagnoses made by radiologists. With the goal of creating CAD software that uses new machine learning techniques to identify breast cancer more accurately, a team of 31 researchers, mostly from New York University and its affiliated School of Medicine, conducted the study discussed here and published it in 2019 (Wu et al. 2019). The researchers set out to create and optimize a machine learning algorithm that could accurately predict the presence of malignant tumors in the breast while avoiding false positives and false negatives as far as possible. The paper builds upon earlier research on machine learning algorithms capable of identifying malignancy in tumors (the study specifically refers to work done by Geras et al.). The study also goes a step further by putting the resulting model into practice, demonstrating its beneficial influence on predictions made by radiologists.

Before describing how the researchers developed their model, some clarification about the machine learning process is helpful. First, some definitions: a machine learning algorithm is a defined series of steps used by the researchers to train a model. A model is constructed by the researchers, and once trained by the algorithm, it consists of a learned set of weights that it uses for decision making (Erikson et al. 2017). This article uses the terms model and neural network interchangeably. One can think of the algorithm as a kind of robot teacher, which tests and evaluates the model’s decision-making capability, then adjusts it accordingly. This testing and adjusting process is repeated until no substantial improvement is achieved (Erikson et al. 2017).
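The test-and-adjust cycle described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' training code: it assumes a tiny logistic regression model with one input feature, plain gradient descent as the "robot teacher," and a hypothetical four-example dataset. The names `train`, `tolerance`, and `max_rounds` are invented for this sketch.

```python
import math

def predict(weights, features):
    """Model: weights[0] is a bias term, the rest multiply the input features."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(weights, examples):
    """Evaluate: average penalty for how far predictions are from the labels."""
    eps = 1e-12
    total = 0.0
    for features, label in examples:
        p = min(max(predict(weights, features), eps), 1.0 - eps)
        total -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return total / len(examples)

def train(examples, learning_rate=0.5, tolerance=1e-5, max_rounds=5000):
    """Algorithm: test the model, adjust its weights, stop when gains are tiny."""
    weights = [0.0] * (len(examples[0][0]) + 1)
    previous_loss = log_loss(weights, examples)
    for round_number in range(1, max_rounds + 1):
        # Test: measure the prediction error on every example.
        gradient = [0.0] * len(weights)
        for features, label in examples:
            error = predict(weights, features) - label
            gradient[0] += error
            for i, x in enumerate(features):
                gradient[i + 1] += error * x
        # Adjust: nudge each weight in the direction that reduces the error.
        weights = [w - learning_rate * g / len(examples)
                   for w, g in zip(weights, gradient)]
        current_loss = log_loss(weights, examples)
        # Stop once a round no longer brings a substantial improvement.
        if previous_loss - current_loss < tolerance:
            break
        previous_loss = current_loss
    return weights, current_loss, round_number

# Hypothetical toy data: one feature per example, label 1 means "malignant".
examples = [([0.0], 0), ([1.0], 0), ([2.0], 1), ([3.0], 1)]
initial_loss = log_loss([0.0, 0.0], examples)
weights, final_loss, rounds = train(examples)
print(f"loss went from {initial_loss:.3f} to {final_loss:.3f} in {rounds} rounds")
```

The learned `weights` are the model; the loop in `train` is the algorithm. A real neural network has millions of weights rather than two, but the cycle of testing, adjusting, and stopping is the same idea.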

To train their model for cancer classification, the researchers used existing mammograms (high-resolution X-ray images) from previous breast cancer screening cases as training data for their neural network. The mammograms for each case came in sets of four, corresponding to four different views of one patient’s breast region. Some preparatory steps, such as normalizing the image dimensions, were applied to the data as well. The data were then passed to radiologists, who labeled them so that they could be used for training. The model itself has two stages: the first is made up of four ResNet architectures, each of which analyzes one of the views provided to the model, and the second aggregates the representations computed by the first to make a final prediction as to whether or not a tumor is present. Additionally, a patch-based model that evaluates small sections of each input view was coupled with the main model to try to improve accuracy. The model was pre-trained by transferring previously learned weights and representations, and the training data was then used to optimize it further. The researchers trained five copies of each of four variants of their general-purpose model, and the predictions of the copies were averaged.
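The two-stage structure and the averaging of copies can be sketched as follows. This is a toy illustration under stated assumptions, not the study's architecture: `encode_view` is a hypothetical stand-in for a ResNet (it only computes the mean and maximum pixel value), the weights of each copy are made-up numbers rather than trained values, and the view names follow the standard screening views (left and right craniocaudal and mediolateral oblique), which the text above does not name.

```python
import math

# Assumed names for the four views in one exam.
VIEWS = ("L-CC", "R-CC", "L-MLO", "R-MLO")

def encode_view(image):
    """Stage 1 stand-in: turn one view (a 2-D list of pixels) into features.

    The study uses a ResNet per view; this toy version returns only the
    mean and maximum intensity so the example stays self-contained.
    """
    pixels = [p for row in image for p in row]
    return [sum(pixels) / len(pixels), max(pixels)]

def aggregate(view_features, weights, bias):
    """Stage 2 stand-in: concatenate per-view features, output a probability."""
    concatenated = [f for view in VIEWS for f in view_features[view]]
    z = bias + sum(w * f for w, f in zip(weights, concatenated))
    return 1.0 / (1.0 + math.exp(-z))

def predict_exam(exam, weights, bias):
    """Run both stages for a single copy of the model."""
    view_features = {view: encode_view(exam[view]) for view in VIEWS}
    return aggregate(view_features, weights, bias)

def ensemble_predict(exam, copies):
    """Average the predictions of several independently trained copies."""
    predictions = [predict_exam(exam, w, b) for w, b in copies]
    return sum(predictions) / len(predictions)

# Hypothetical 2x2 "images"; real mammograms contain millions of pixels.
exam = {
    "L-CC": [[0.1, 0.2], [0.3, 0.4]],
    "R-CC": [[0.2, 0.2], [0.1, 0.3]],
    "L-MLO": [[0.5, 0.6], [0.7, 0.9]],
    "R-MLO": [[0.1, 0.1], [0.2, 0.2]],
}
# Five copies with slightly different made-up weights (8 = 4 views x 2 features).
copies = [([0.4 + 0.05 * k] * 8, -2.0) for k in range(5)]

individual = [predict_exam(exam, w, b) for w, b in copies]
averaged = ensemble_predict(exam, copies)
print("individual copies:", [round(p, 3) for p in individual])
print("ensemble average:", round(averaged, 3))
```

Averaging several copies is a common way to smooth out the quirks of any single training run, which is presumably why the researchers report averaged predictions.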

Of the four model variants, a view-wise model, which concatenated the same kinds of views for the left and right sides, made the most accurate predictions. This model achieved an AUC (area under the receiver operating characteristic curve, where 1.0 is a perfect score and 0.5 is no better than chance) of 0.895, on par with that of experienced radiologists. The paper also found that averaging the probabilities of malignancy predicted by the radiologists and by the machine learning model produced results that were more accurate than either could achieve alone.
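To make the AUC and the averaging idea concrete, here is a small Python sketch. AUC can be read as the fraction of (malignant, non-malignant) pairs in which the malignant case receives the higher score, and the function below computes it that way. The labels and probabilities are invented for illustration and are not data from the study; the function names `auc` and `hybrid` are also assumptions of this sketch.

```python
def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def hybrid(radiologist_probs, model_probs):
    """Average the two probability estimates for each exam."""
    return [(r + m) / 2 for r, m in zip(radiologist_probs, model_probs)]

# Hypothetical exams: 1 = malignant, 0 = not malignant.
labels = [1, 1, 0, 0, 0]
radiologist = [0.9, 0.3, 0.4, 0.1, 0.2]  # made-up reader estimates
model = [0.4, 0.8, 0.3, 0.5, 0.1]        # made-up model estimates

print("radiologist AUC:", round(auc(labels, radiologist), 3))
print("model AUC:", round(auc(labels, model), 3))
print("hybrid AUC:", round(auc(labels, hybrid(radiologist, model)), 3))
```

In this toy data each reader misranks one pair of exams, but they err on different exams, so the average ranks every pair correctly. That is the intuition behind the hybrid result: combining two imperfect but different judges can cancel out some of their individual mistakes.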

One limitation of this study is the equipment available to the researchers. The graphics processing units used for computation had a limited amount of memory, restricting the researchers to relatively shallow models. The amount of data available for training was also likely a limiting factor; in theory, the more data a model has to learn from, the better it will perform. The authors concluded by saying that their model was relatively simple in nature and that more complex models, based on more than imaging alone, could be created to further improve the efficacy of CAD tools built with machine learning. One example of such additional input is biometric data, such as the amount of carbon dioxide or glucose in the blood. Machine learning often excels in situations where a large amount of data is available but it is not obvious how different parts of the data are correlated (Erikson et al. 2017). Therefore, when fed data that may not be obviously related to the presence of malignant tumors, machine learning models may be able to find connections between seemingly unrelated data and improve their accuracy (Erikson et al. 2017). Consequently, rather than just examining the same things humans normally would, a model that analyzes data humans would have a difficult time interpreting might improve prediction accuracy more significantly.

The code produced by the researchers is open-source: it is publicly available through a repository on GitHub, an online code-sharing service, and anyone with the technical knowledge is free to alter it and experiment with it on their own. The GitHub page also includes a ReadMe detailing which software libraries the researchers used. Using this public resource, other researchers can verify the findings of the study if they have data that they would like to test the model on. The code repository also includes a premade dataset to run the model on. All of the software used in the project is open-source or freeware.

This paper is an important step towards modernized treatment technologies, making a technical contribution to breast cancer screening procedures. With fewer false positives and false negatives, it will be easier to provide treatment to those who need it without wasting resources or causing further harm. Additionally, the study illustrates in miniature an important feature of future human life: the overlap of information technology and biotechnology. Enormous amounts of information are being collected today because of the insights that information can yield when paired with improved computational technologies capable of processing and deriving meaning from it. This study alone used over one million images to train its models. I think healthcare is one area where the effects on humanity’s future are inevitable, and this article provides a way to explore the topic and expose it to a wider audience.

References:

Erikson BJ, Korfiatus P, Akkus Z, Kline TL. Machine learning for medical imaging. RadioGraphics 37, 505-515 (2017).

Geras KJ, Wolfson S, Shen Y, Wu N, Kim SG, Kim E, Heacock L, Parikh U, Moy L, and Cho K. High-resolution breast cancer screening with multi-view deep convolutional neural networks. arXiv:1703.07047 (2017).

Lehman CD, Wellman RD, Buist DS, Kerlikowske K, Tosteson AN, and Miglioretti DL. Diagnostic accuracy of digital screening mammography with and without computer-aided detection. JAMA Internal Medicine 175(11) (2015).

National Cancer Institute. c2015. Bethesda (MD): National Institutes of Health; [accessed 2020 Feb 11]. https://www.cancer.gov/about-cancer/understanding/what-is-cancer.

Wu N, Phang J, Park J, et al. Deep neural networks improve radiologists’ performance in breast cancer screening. IEEE Transactions on Medical Imaging (Early Access), 1 (2019).

Image taken from:

mikemacmarketing / original posted on flickr Liam Huang / clipped and posted on flickr – https://www.flickr.com/photos/186021024@N08/49203125457, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=84805375