End-to-End Semi-Supervised Opportunistic Osteoporosis Screening Using Computed Tomography

  • 1 Healthcare AI Team, National Cancer Center, Goyang, Korea
  • 2 Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea
  • 3 Kim Jaechul Graduate School of AI, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea


Osteoporosis is the most common metabolic bone disease and can cause fragility fractures. Despite this, screening utilization rates for osteoporosis remain low among populations at risk. Automated bone mineral density (BMD) estimation using computed tomography (CT) can help bridge this gap and serve as an alternative screening method to dual-energy X-ray absorptiometry (DXA).
This retrospective study investigated the feasibility of an opportunistic, population-agnostic screening method for osteoporosis that uses abdominal CT scans without bone densitometry phantom-based calibration. A total of 268 abdominal CT-DXA pairs and 99 abdominal CT studies without DXA scores were obtained from an oncology specialty clinic in the Republic of Korea. For each subject, the center axial CT slices of the L1, L2, L3, and L4 lumbar vertebrae were annotated with the CT slice level and spine segmentation labels. Deep learning models were trained to localize the center axial slice of each vertebra in the CT scan of the torso, segment the vertebral bone, and estimate BMD for the four lumbar vertebrae (L1 to L4).
Automated vertebra-level predictions of DXA BMD showed a mean absolute error (MAE) of 0.079, Pearson’s r of 0.852 (P<0.001), and R² of 0.714. Subject-level predictions on the held-out test set had an MAE of 0.066, Pearson’s r of 0.907 (P<0.001), and R² of 0.781.
CT scans collected during routine examinations without bone densitometry calibration can be used to generate DXA BMD predictions.
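
The agreement statistics reported above (MAE, Pearson’s r, and R²) can be computed along the following lines. This is only a minimal sketch, not the study's evaluation code: the function names `regression_metrics` and `subject_level` are hypothetical, R² is assumed to be the coefficient of determination (1 − SS_res/SS_tot), and the subject-level value is assumed to be the plain mean of the four vertebra-level predictions.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MAE, Pearson's r, and R^2 for paired reference/predicted values.

    Assumption: R^2 is the coefficient of determination, 1 - SS_res / SS_tot.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mae = float(np.mean(np.abs(y_pred - y_true)))
    pearson_r = float(np.corrcoef(y_true, y_pred)[0, 1])
    ss_res = float(np.sum((y_true - y_pred) ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "pearson_r": pearson_r, "r2": r2}


def subject_level(vertebra_preds):
    """Average per-vertebra predictions (one row per subject, columns L1..L4).

    Assumption: the subject-level score is the unweighted mean over L1 to L4.
    """
    return np.asarray(vertebra_preds, dtype=float).mean(axis=1)
```

Calling `regression_metrics(dxa_values, cnn_values)` on the vertebra-level pairs, and again on the output of `subject_level` against the total lumbar DXA values, would yield the two sets of statistics; `dxa_values` and `cnn_values` are placeholder names for the paired arrays.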


Osteoporosis; Opportunistic screening; Bone mineral density; Dual-energy X-ray absorptiometry; Deep learning


  • Fig. 1. Overview of the end-to-end opportunistic osteoporosis screening method. The grey text boxes denote deep learning-based subtasks. (A) The abdominal computed tomography (CT) scan of a subject is input into the system. (B) Frontal and sagittal maximum intensity projections are generated from the CT scan. (C) Two deep learning-based regressors predict the center axial slice locations for the four lumbar vertebrae (L1 to L4). (D) For each of the four vertebrae, the predicted center slice, one slice above it, and two slices below it are selected. (E) Vertebral bone, including the spinous process, is segmented with a trained model. (F) Using the segmentation masks from the previous step, a 196×196-pixel area containing the segmented vertebral bone is cropped from each slice. For each of the four lumbar vertebrae, the four selected slices are concatenated counterclockwise to form a 392×392-pixel image that serves as input to the estimation algorithm. (G) A convolutional neural network image regressor estimates bone mineral density (BMD) for each of the four vertebrae independently.

  • Fig. 2. Overall flow of the proposed lumbar spine localization method. (A) In the first stage, deep neural network 1 predicts the index of the top axial slice of the L1 vertebra. (B) In the second stage, deep neural network 2 predicts the center slice of the L1, L2, L3, and L4 vertebrae. The numbers below each image denote image dimensions. 3D, three-dimensional; CT, computed tomography; MIP, maximum intensity projection.

  • Fig. 3. Training data for semi-supervised vertebra segmentation. For all subjects, the central axial computed tomography slice was annotated with segmentation labels. During training, all axial slices for a given vertebra were used, regardless of whether a segmentation ground truth was available.

  • Fig. 4. Data generation process for the bone mineral density estimation subtask. (A) Four computed tomography (CT) slices are selected by the detection subtask, and their corresponding masks are generated by the segmentation subtask. (B) A 196×196-pixel patch centered on the center of mass of the binary segmentation mask is cropped from each CT slice. (C) The cropped CT slices are concatenated counterclockwise to form a 392×392-pixel input image.

  • Fig. 5. Regression (left) and Bland-Altman (right) plots for bone mineral density using dual-energy X-ray absorptiometry (BMD_DXA) ground truths and end-to-end bone mineral density using convolutional neural network (BMD_CNN) predictions. For both regression plots, the 95% confidence interval and 95% prediction interval are represented by the orange dotted line and grey dashed line, respectively. (A) Vertebral-level results for L1, L2, L3, and L4 predictions evaluated independently (n=392). (B) Subject-level results for averaged L1 to L4 BMD_CNN predictions against total lumbar BMD_DXA (n=98). SD, standard deviation.
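
The cropping and tiling steps described in the Fig. 4 caption (a 196×196 patch centered on the mask's center of mass, then four patches concatenated counterclockwise into a 392×392 image) might look roughly like the sketch below. Several details are assumptions not given in the captions: the helper names, zero-padding for patches that extend past the slice border, and the tiling order, taken here as top-left, bottom-left, bottom-right, top-right.

```python
import numpy as np

PATCH = 196  # patch side length in pixels, as stated in the Fig. 4 caption


def center_of_mass(mask):
    """Return the (row, col) center of mass of a binary mask, rounded to ints."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        raise ValueError("segmentation mask is empty")
    return int(np.rint(rows.mean())), int(np.rint(cols.mean()))


def crop_patch(ct_slice, mask, size=PATCH, pad_value=0):
    """Crop a size x size patch centered on the mask's center of mass.

    Assumption: regions outside the slice are filled with pad_value.
    """
    cy, cx = center_of_mass(mask)
    half = size // 2
    padded = np.pad(ct_slice, half, mode="constant", constant_values=pad_value)
    # After padding, the original pixel (cy, cx) sits at (cy + half, cx + half),
    # so a window starting at (cy, cx) is centered on it.
    return padded[cy:cy + size, cx:cx + size]


def tile_counterclockwise(patches):
    """Tile four equal patches into a 2x2 grid.

    Assumed order: top-left, bottom-left, bottom-right, top-right.
    """
    p0, p1, p2, p3 = patches
    top = np.concatenate([p0, p3], axis=1)
    bottom = np.concatenate([p1, p2], axis=1)
    return np.concatenate([top, bottom], axis=0)
```

For one vertebra, applying `crop_patch` to each of the four selected slices and passing the results to `tile_counterclockwise` produces a single 392×392 array that could be fed to an image regressor.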


