Consider cropping around the spinal cord #37

Open
jcohenadad opened this issue Jul 17, 2023 · 24 comments

@jcohenadad
Member

Cropping around the spinal cord for training and inference could help with:

  • class imbalance
  • training/inference time
  • memory issues

There are at least two approaches for cropping:

Related to #29

@valosekj
Member

I'm just adding that the SC segmentation does not work perfectly for PSIR/STIR images (i.e., contrasts with lesion masks). This should hopefully not be a problem as the SC segmentation will only be used for cropping. Here is a comment in which I investigated different settings for SC seg. Alternatively, we can bring T2w seg to PSIR/STIR, as done here.

@jcohenadad
Member Author

Thank you for your insights, @valosekj

> Alternatively, we can bring T2w seg to PSIR/STIR, as done here.

I would like to avoid that, as it adds constraints from the user side (ie: it requires this extra contrast thereby making the overall method less generalizable)

@plbenveniste plbenveniste self-assigned this Jul 18, 2023
@plbenveniste
Collaborator

Just pushed the modifications on branch plb/nnunet:

  • created seg_sc.py file to segment the spinal cord using sct
  • created crop_sc.py file to crop an image using a mask
  • modified the convert_bids_to_nnunet.py to enable cropping when converting the dataset to the nnU-Net format.

Now, when calling convert_bids_to_nnunet.py, we can ask to crop the training and/or the testing images. However, this process is very slow.
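
For readers unfamiliar with the idea, cropping an image around a mask can be sketched in a few lines of NumPy. This is only an illustration of the principle, not the code in crop_sc.py (which relies on SCT); the function name crop_to_mask and the margin parameter are hypothetical, and the image and mask are assumed to be arrays of the same shape.

```python
import numpy as np


def crop_to_mask(image, mask, margin=0):
    """Crop `image` to the bounding box of the non-zero voxels of `mask`.

    `margin` (hypothetical parameter) is the number of extra voxels kept on
    each side of the bounding box, clipped to the image bounds.
    """
    if image.shape != mask.shape:
        raise ValueError("image and mask must have the same shape")
    coords = np.argwhere(mask > 0)
    if coords.size == 0:
        # Empty mask: nothing to crop around, return the image unchanged.
        return image
    lower = np.maximum(coords.min(axis=0) - margin, 0)
    upper = np.minimum(coords.max(axis=0) + 1 + margin, image.shape)
    slices = tuple(slice(lo, hi) for lo, hi in zip(lower, upper))
    return image[slices]


# Toy volume and toy "spinal cord" mask, for illustration only.
image = np.arange(5 * 6 * 7).reshape(5, 6, 7)
mask = np.zeros_like(image)
mask[1:3, 2:4, 3:6] = 1
cropped = crop_to_mask(image, mask, margin=1)
print(cropped.shape)  # (4, 4, 5)
```

Returning the image unchanged when the mask is empty mirrors the fallback of keeping the original file when the segmentation fails.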

@valosekj
Member

Cool progress, @plbenveniste!

A few notes:

  1. I would add QC to the SC segmentation command (using -qc and -qc-subject flags) to quickly assess the SC seg quality. Potentially, I would also consider QC for the cropping (sct_crop_image does not support QC natively, but maybe we could use some tweak with sct_qc)
  2. I am not sure if sct_deepseg_sc you are using here works well for both PSIR and STIR. IIRC, sct_propseg -c t1 worked better for PSIR; see this comment.
  3. Since both seg_sc.py as well as crop_sc.py are essentially single line functions, you could include them directly within the convert_bids_to_nnunet.py script.
  4. I would keep the original convert_bids_to_nnunet.py script without SC seg and cropping as it is, and create a second script, e.g., convert_bids_to_nnunet_seg_and_crop.py. This allows us to reproduce both workflows, the one without cropping and the one with cropping.
  5. This last note is more general - if you run your scripts, they will process subjects individually, which can take a long time in the case of the canproco dataset (n~450 subjects). From this perspective, it would be better to use the sct_run_batch wrapper for SC seg and cropping. The sct_run_batch allows parallelization across subjects.

Note: All my notes are just for brainstorming; I am not discarding any ideas.
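
To illustrate note 5, here is a minimal Python sketch of running the segmentation for several subjects concurrently. It is not sct_run_batch, only the same idea expressed with concurrent.futures. The helper names, the subject IDs, the file names and the contrast value passed to -c are assumptions made for the example; the -qc and -qc-subject flags are the ones mentioned in note 1.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_seg_command(image, output, contrast, qc_folder, subject):
    """Build an sct_deepseg_sc call with QC enabled (-qc / -qc-subject)."""
    return [
        "sct_deepseg_sc", "-i", image, "-c", contrast, "-o", output,
        "-qc", qc_folder, "-qc-subject", subject,
    ]


def run_command(command):
    """Run one command and return its exit code (requires SCT on the PATH)."""
    return subprocess.run(command, check=False).returncode


def run_in_parallel(commands, runner=run_command, max_workers=4):
    """Run the commands concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(runner, commands))


commands = [
    # Hypothetical subject IDs and file names; "t2" is a placeholder contrast.
    build_seg_command(f"{sub}_STIR.nii.gz", f"{sub}_STIR_seg.nii.gz", "t2", "qc", sub)
    for sub in ["sub-001", "sub-002"]
]
# Dry run: print each command instead of executing it, so this works without SCT.
exit_codes = run_in_parallel(commands, runner=lambda cmd: print(" ".join(cmd)) or 0)
```

Threads are enough here because each worker only waits for an external process; dropping the runner argument would execute the real commands.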

@jcohenadad
Member Author

Currently the cropping approach is exploratory. So we don't need a 'clean' pipeline with proper QC, manual correction, etc. Obviously, you want to make sure that the segmentation is "good enough", ie: works in >90% of your images (otherwise you could not conclude about the efficiency of cropping for model training). In order to assess this, you need QC. So that could be added in your function call:

https://github.com/ivadomed/canproco/blob/cfe61fdf18073e01f97d45f379f7188095392c8a/nnunet/seg_sc.py#L24C41-L24C74

If segmentation does not work well (because of the anisotropic and sagittal orientation), then you might need to:

  • manually correct some segmentation
    • pros: precise seg
    • cons: takes a LOT of time
  • run the segmentation on the T2w image (which is WAY more robust bc good cord/CSF contrast and isotropic resolution), and then bring that segmentation in the space of the PSIR/STIR (using sct_register_multimodal -identity 1)
    • pros: time efficient, robust
    • cons: if subject moves, segmentation is shifted. HOWEVER, in this case you only need the seg for cropping, so it's OK to have slight shift.

@jcohenadad
Member Author

About using sct_run_batch: the way the functions are called (ie: system call within Python) makes it more difficult to use sct_run_batch. Now that the functions are coded, I suggest to just use them, and then conclude if cropping is a good idea or not. And if not, revert to the previous version (or do not merge to master).

That being said, it might be OK to keep the seg/crop and upgraded version of nnunet conversion script as is (ie: with the crop option)-- however it needs to be documented.

@plbenveniste
Collaborator

New script for segmentation of the spinal cord on an entire dataset: seg_dataset.py.
It creates new files with the suffix '_seg'.

Currently performing QC on the segmentations.

@plbenveniste
Collaborator

It's hard to perform QC with the axial view:
image
This corresponds to this:
image

Currently modifying the code to use a sagittal view for quality control.

@plbenveniste
Collaborator

plbenveniste commented Jul 20, 2023

To have a better QC for the segmentation, we used:
sct_qc -i {image} -s {output_path} -d {output_path} -p sct_deepseg_lesion -plane sagittal -qc {qc_folder}

image

@plbenveniste
Collaborator

However, there is still one problem: PSIR images tend to have bad segmentations, with no segmentation at all in the lower slices of the volume.
Since we only need these segmentations to crop the image, we take the segmentations as they are and dilate them along the z-axis. This way we keep the entirety of the spine while removing regions "far" from the centerline, which reduces the class imbalance in the dataset.
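
A minimal NumPy sketch of what dilating a mask along z only could look like. This is an assumption about the operation, not the code used in the scripts; the function name dilate_along_z is hypothetical and z is assumed to be the last array axis.

```python
import numpy as np


def dilate_along_z(mask, n_slices, z_axis=2):
    """Binary dilation of `mask` by `n_slices` in both directions, along z only."""
    # Move z to the front so that slicing the first axis shifts along z.
    moved = np.moveaxis(np.asarray(mask, dtype=bool), z_axis, 0)
    out = moved.copy()
    for shift in range(1, min(n_slices, moved.shape[0] - 1) + 1):
        out[shift:] |= moved[:-shift]  # propagate towards higher z
        out[:-shift] |= moved[shift:]  # propagate towards lower z
    return np.moveaxis(out, 0, z_axis)


# Toy mask with a single segmented voxel at z = 3.
mask = np.zeros((3, 3, 6), dtype=bool)
mask[1, 1, 3] = True
dilated = dilate_along_z(mask, n_slices=2)
print(dilated[1, 1, :].astype(int))  # [0 1 1 1 1 1]
```

With a large enough n_slices the in-plane footprint of the segmentation is copied to every slice, so a crop based on the dilated mask keeps the full extent along z even where the segmentation was missing.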

@valosekj
Member

> To have a better QC for the segmentation, we used: sct_qc -i {image} -s {output_path} -d {output_path} -p sct_deepseg_lesion -plane sagittal -qc {qc_folder}

Cool trick! In the past, I used -p sct_label_vertebrae to create sagittal SC QC. But it created only a single slice. So the workaround with -p sct_deepseg_lesion -plane sagittal is definitely better!

@plbenveniste
Collaborator

In the end, QC is more useful after cropping than after the segmentation of the spinal cord.
To perform QC on the cropping, I used:
sct_qc -i {image} -d {output_path} -p sct_image_stitch -plane sagittal -qc {qc_folder}

Here is the result:
QC_cropping

@plbenveniste
Collaborator

I ran segmentation and cropping on Joplin with the following code

import os

os.system('python ./canproco/nnunet/seg_dataset.py -i ./data/canproco/ -c STIR,PSIR -q ./data/qc_seg')
os.system('python ./canproco/nnunet/crop_dataset.py -i ./data/canproco/ -c STIR,PSIR -q ./data/qc_seg -r True')

Segmentation on STIR was done without any problem.
However, I had the following problem when cropping on PSIR (i.e., using sct_propseg):
Screen Shot 2023-07-21 at 2 12 30 PM

It seems that there is a problem with the location at which the output file is supposed to be. However, the folder is stored at the same location for sct_deepseg_sc and it works fine.

image

NB: It works fine on my computer but not on Joplin. (maybe an SCT issue)

@valosekj
Member

Interesting, I just tried sct_propseg on joplin on the same subject with the latest SCT master, and the command worked for me:

terminal output
$ cd ~/extrassd1/janvalosek/data/canproco/sub-edm143/ses-M0/anat
$ sct_propseg -i sub-edm143_ses-M0_PSIR.nii.gz -c t1 -o sub-edm143_ses-M0_PSIR_seg.nii.gz

--
Spinal Cord Toolbox (git-master-0204cfebce8a005dbb8d14768270135ea32839bc)

sct_propseg -i sub-edm143_ses-M0_PSIR.nii.gz -c t1
--

Image header specifies datatype 'int16', but array is of type 'float64'. Header metadata will be overwritten to use 'float64'.
Creating temporary folder (/tmp/sct_2023-07-21_14-32-50_propseg_vw16rmuy)
Image header specifies datatype 'int16', but array is of type 'float64'. Header metadata will be overwritten to use 'float64'.
Creating temporary folder (/tmp/sct_2023-07-21_14-32-50_optic-detect-centerline_sy406lrg)
Remove temporary files...
rm -rf /tmp/sct_2023-07-21_14-32-50_optic-detect-centerline_sy406lrg
Creating temporary folder (/tmp/sct_2023-07-21_14-32-52_propseg-centerline-optic_wuvdypvo)
/home/GRAMES.POLYMTL.CA/p118175/code/spinalcordtoolbox/bin/isct_propseg -t t1 -o /tmp/sct_2023-07-21_14-32-50_propseg_vw16rmuy -verbose -i /mnt/extrassd1/janvalosek/data/canproco/sub-edm143/ses-M0/anat/sub-edm143_ses-M0_PSIR.nii.gz -init-centerline /tmp/sct_2023-07-21_14-32-52_propseg-centerline-optic_wuvdypvo/centerline_optic.nii.gz -centerline-binary # in /mnt/extrassd1/janvalosek/data/canproco/sub-edm143/ses-M0/anat
mv /tmp/sct_2023-07-21_14-32-50_propseg_vw16rmuy/sub-edm143_ses-M0_PSIR_seg.nii.gz ./sub-edm143_ses-M0_PSIR_seg.nii.gz
mv /tmp/sct_2023-07-21_14-32-50_propseg_vw16rmuy/sub-edm143_ses-M0_PSIR_centerline.nii.gz ./sub-edm143_ses-M0_PSIR_centerline.nii.gz

Check consistency of segmentation...
Creating temporary folder (/tmp/sct_2023-07-21_14-33-21_propseg-check-segmentation_dqmgdz8e)
/tmp/sct_2023-07-21_14-33-21_propseg-check-segmentation_dqmgdz8e/tmp.segmentation.nii.gz
/tmp/sct_2023-07-21_14-33-21_propseg-check-segmentation_dqmgdz8e/tmp.centerline.nii.gz

Get data dimensions...
/tmp/sct_2023-07-21_14-33-21_propseg-check-segmentation_dqmgdz8e/tmp.segmentation_RPI_c.nii.gz
rm -rf /tmp/sct_2023-07-21_14-33-21_propseg-check-segmentation_dqmgdz8e
Copy header input --> output(s) to make sure qform is the same.
Image header specifies datatype 'float64', but array is of type 'uint8'. Header metadata will be overwritten to use 'uint8'.
Image header specifies datatype 'float64', but array is of type 'uint8'. Header metadata will be overwritten to use 'uint8'.

Done! To view results, type:
fsleyes sub-edm143_ses-M0_PSIR.nii.gz -cm greyscale sub-edm143_ses-M0_PSIR_seg.nii.gz -cm red -a 100.0 &

@plbenveniste
Collaborator

I found the problem:
sct_propseg doesn't like it when the output is in a subfolder. If you try the following, it should fail:
sct_propseg -i sub-edm143_ses-M0_PSIR.nii.gz -c t1 -o ./test/sub-edm143_ses-M0_PSIR_seg.nii.gz

@valosekj
Member

> I found the problem:
> sct_propseg doesn't like it when the output is in a subfolder. If you try the following, it should fail:
> sct_propseg -i sub-edm143_ses-M0_PSIR.nii.gz -c t1 -o ./test/sub-edm143_ses-M0_PSIR_seg.nii.gz

Okay, sct_propseg has a problem with . in the output filename. If I use $PWD instead of ., it works:

$ sct_propseg -i sub-edm143_ses-M0_PSIR.nii.gz -c t1 -o $PWD/test/sub-edm143_ses-M0_PSIR_seg.nii.gz

Could you please open an issue about this under SCT repo?
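
Since the scripts call SCT from Python, the $PWD workaround can be mirrored by making the output path absolute before building the command. A minimal sketch, assuming a hypothetical helper propseg_command; whether this avoids the problem on every SCT version has not been verified here.

```python
import os


def propseg_command(image, output, contrast="t1"):
    """Build an sct_propseg call whose output path is absolute (no leading '.')."""
    return ["sct_propseg", "-i", image, "-c", contrast, "-o", os.path.abspath(output)]


cmd = propseg_command(
    "sub-edm143_ses-M0_PSIR.nii.gz",
    "./test/sub-edm143_ses-M0_PSIR_seg.nii.gz",
)
print(" ".join(cmd))
```

os.path.abspath resolves the path against the current working directory and removes the "./" component, which is what $PWD achieves in the shell command above.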

@plbenveniste
Collaborator

plbenveniste commented Jul 24, 2023

Segmentation and cropping done.
QC on cropping: out of the 782 images, 39 (5%) had a bad segmentation (either missing some lower part of the spinal cord or segmenting something completely out of the zone of interest)

List of images to change
sub-mon009_ses-M12_PSIR.nii.gz
sub-mon032_ses-M0_PSIR.nii.gz
sub-mon050_ses-M0_PSIR.nii.gz
sub-mon055_ses-M0_PSIR.nii.gz
sub-mon096_ses-M12_PSIR.nii.gz
sub-mon107_ses-M0_PSIR.nii.gz
sub-mon113_ses-M0_PSIR.nii.gz
sub-mon118_ses-M0_PSIR.nii.gz
sub-mon129_ses-M0_PSIR.nii.gz
sub-mon138_ses-M12_PSIR.nii.gz
sub-mon142_ses-M12_PSIR.nii.gz
sub-mon148_ses-M0_PSIR.nii.gz
sub-mon150_ses-M12_PSIR.nii.gz
sub-mon152_ses-M0_PSIR.nii.gz
sub-mon168_ses-M0_PSIR.nii.gz
sub-mon175_ses-M0_PSIR.nii.gz
sub-mon185_ses-M12_PSIR.nii.gz
sub-mon191_ses-M0_PSIR.nii.gz
sub-mon193_ses-M0_PSIR.nii.gz
sub-mon209_ses-M12_PSIR.nii.gz
sub-mon036_ses-M0_PSIR.nii.gz
sub-mon202_ses-M12_PSIR.nii.gz
sub-van137_ses-M12_PSIR.nii.gz
sub-van161_ses-M0_PSIR.nii.gz
sub-van171_ses-M0_PSIR.nii.gz
sub-van176_ses-M0_PSIR.nii.gz
sub-tor057_ses-M0_PSIR.nii.gz
sub-tor079_ses-M0_PSIR.nii.gz
sub-tor103_ses-M12_PSIR.nii.gz
sub-tor133_ses-M0_PSIR.nii.gz
sub-tor139_ses-M0_PSIR.nii.gz
sub-edm075_ses-M0_PSIR.nii.gz
sub-edm105_ses-M0_PSIR.nii.gz
sub-edm142_ses-M12_PSIR.nii.gz
sub-edm156_ses-M12_PSIR.nii.gz

I noticed that some images were exactly the same between M0 and M12. Maybe that's an issue to look at?

Non-exhaustive list
sub-mon003_ses-M0_PSIR.nii.gz same as M12
sub-tor029_ses-M0_PSIR.nii.gz same as M12
sub-mon011_ses-M0_PSIR.nii.gz same as M12
sub-tor016_ses-M0_PSIR.nii.gz same as M12
sub-tor021_ses-M0_PSIR.nii.gz same as M12
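
Such duplicates could be searched for systematically by hashing the file contents. A minimal sketch using only the standard library; the helper names are hypothetical. It hashes the decompressed bytes, because two .nii.gz files holding the same image can still differ in their gzip headers.

```python
import gzip
import hashlib
from collections import defaultdict


def content_hash(path):
    """SHA-256 of the decompressed content of a gzip-compressed file."""
    digest = hashlib.sha256()
    with gzip.open(path, "rb") as handle:
        # Read in 1 MiB chunks to keep memory usage low on large volumes.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_identical(paths):
    """Return the groups of paths (two or more) whose content is identical."""
    groups = defaultdict(list)
    for path in paths:
        groups[content_hash(path)].append(path)
    return [group for group in groups.values() if len(group) > 1]


# Usage (paths are placeholders):
# group_identical(["sub-X_ses-M0_PSIR.nii.gz", "sub-X_ses-M12_PSIR.nii.gz"])
```

This only detects byte-identical header and data; images that were re-exported with different header fields would need a comparison of the voxel arrays instead.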

The images with a poor segmentation/cropping were replaced by their original NIfTI files.
Converted to the nnU-Net file structure:

  • Number of images for training: 443
  • Number of images for testing: 339

@plbenveniste
Collaborator

plbenveniste commented Jul 24, 2023

Improvement of the Dice score : Dice score around 0.55
image

Looking at the prediction results, it seems that the model now segments lesions more exhaustively. That is most likely due to the reduction in class imbalance obtained by cropping the volumes around the spinal cord.

Currently working on a script to detail the prediction results on the images from the M12 time point.

@valosekj
Member

> I noticed that some images were exactly the same between M0 and M12. Maybe that's an issue to look at?
>
> Non-exhaustive list
> sub-mon003_ses-M0_PSIR.nii.gz same as M12
> sub-tor029_ses-M0_PSIR.nii.gz same as M12
> sub-mon011_ses-M0_PSIR.nii.gz same as M12
> sub-tor016_ses-M0_PSIR.nii.gz same as M12
> sub-tor021_ses-M0_PSIR.nii.gz same as M12

Good catch! Can you please open a separate issue about that under this repo? We will have to contact the sites and figure out what happened.

@jcohenadad
Member Author

jcohenadad commented Jul 25, 2023

> Improvement of the Dice score : around 0.55

Do you mean that the Dice improved by 0.55 or that it has reached 0.55 with the cropped data? If the latter, what was the previous Dice score?

@plbenveniste
Collaborator

The current best Dice score is 0.55. The Dice score was around 0.49 before cropping the images.
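
For reference, the Dice score between a predicted mask and a reference mask is twice the size of their overlap divided by the sum of their sizes. A minimal NumPy sketch of that definition follows; it is not the evaluation code of nnU-Net, and returning 1.0 when both masks are empty is a convention chosen for this example.

```python
import numpy as np


def dice_score(prediction, reference):
    """Dice = 2 * |P and R| / (|P| + |R|) for two binary masks."""
    pred = np.asarray(prediction).astype(bool)
    ref = np.asarray(reference).astype(bool)
    denominator = pred.sum() + ref.sum()
    if denominator == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return 2.0 * np.logical_and(pred, ref).sum() / denominator


# Toy masks: one overlapping voxel, two voxels in each mask.
print(dice_score([1, 1, 0, 0], [1, 0, 1, 0]))  # 0.5
```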

@plbenveniste
Collaborator

image

It seems that the model didn't have time to converge. Let's try with 2000 epochs.

@plbenveniste
Collaborator

plbenveniste commented Jul 25, 2023

Nota Bene:

It seems that the predictions include more segmented lesions when inference is run with --disable_tta than without it.
Example on image sub-cal072_ses-M12_STIR: it identifies one more lesion, another lesion is seen on a slice where it wasn't seen before, and overall the segmented lesions are 'bigger'.

ezgif com-crop

On the left with TTA, on the right with --disable_tta.

Because we favor recall over precision, we will run inference with --disable_tta.
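
To make the trade-off explicit: recall is the fraction of true lesion voxels that are detected, and precision is the fraction of predicted voxels that are correct. A minimal sketch with made-up counts, which are purely illustrative and are not measurements from this dataset:

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true positive, false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts: the second setting finds more true positives
# at the cost of more false positives.
conservative = precision_recall(tp=8, fp=2, fn=4)
permissive = precision_recall(tp=10, fp=5, fn=2)
print(conservative, permissive)
```

In this toy case the more permissive setting has the higher recall and the lower precision, which is the kind of trade-off accepted when recall is favored.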

@jcohenadad
Member Author

> It seems that the predictions include more segmented lesions when inference is run with --disable_tta than without it.

Very interesting observation. Tagging @naga-karthik @rohanbanerjee @tzebre
