Refining a deep learning model for object detection


This is the fourth post in my series about a computer vision project I worked on to identify liver tumors in CT scans. In my previous post, I had taken a break from my deep learning model to work on data management and data labeling.

Now, I’ll return to the model and try to refine it for better results. At this point in the project, I was granted access to a GPU cluster at SAS, which allowed me to train my models at much higher rates, giving me the ability to test more parameters in less time.

I’ll spare you the gory details (feel free to connect with me if you’d like to discuss), but in the end, I decided on a VGG16 model with modifications to the fully connected layers, where I added a dropout rate of 0.5 in hopes of preventing overfitting, which is basically when the model memorizes the training data too well and can’t perform well on new data that it sees later.

For those deep learners out there, here’s the model architecture:

deep learning model architecture

Deep learning model architecture

3D visualization of segmentation results

Once the model classifies all of the patches in a given 3D CT scan, the next step is to reassemble the results into something visual. So, the last step in the pipeline is taking those raw classification outputs from the VGG16 model and using SAS to visualize, in 3D space, the segmented lesions that the network has identified.

After transposing the rows of pixel-wise classifications determined by the model into a single row representing a linearized flat image, this row of pixels is condensed into a 2D image that visualizes the segmentation result of a particular slice in the CT scan. Once this is done for each of the 2D slices in a given scan, we now have segmentation results for the entire scan. The resulting output would ideally look something like the ground-truth classification given by the radiologist:

Here’s an example of the output from the actual pipeline, when viewed in our good Python friend Mayavi:

I found this visualization to be incredibly cool (and validating). It’s so different from just looking at linear metrics of model performance, and just fun to drag around and play with!

Luckily enough for me, I managed to complete this 3D visualization before the final demo of my project. After spending a good deal of time combing through all of my Jupyter notebooks that contained various pieces of the pipeline and compiling the code that illustrated the complete end-to-end system, I was more excited than ever to present my work!

I’ll tell you all about that final presentation and the model’s accuracy in my next post.

Learn how to do deep learning with SAS

About Author

Julia Gong

Julia Gong is a sophomore at Stanford University majoring in Mathematical and Computational Science and minoring in Linguistics. She started at SAS in the summer of 2016, when she created a skin cancer detection software at JMP using image analysis and statistical modeling techniques. In the summer of 2017, she used JMP Scripting Language to build an interactive custom R add-in builder for JMP. In the summer of 2018, she built an end-to-end, automated data pipeline for liver tumor segmentation in 3D CT scans using deep learning and computer vision for biomedical image analytics in SAS Viya and CAS. She has been recognized in international technology competitions, loves public speaking and puns, and enjoys seeking novel solutions in overlooked intersections of AI, machine learning, language, environmental conservation, medicine, service, and art. Julia hopes to pursue a career that unites her many interests in computer vision, artificial intelligence, medicine, natural language processing, social good, education, and sustainability.

Related Posts

Back to Top