Dataset aggregation #1
I just remembered that we also have a lot of data from UMass (git-annex data: umass-ms-*, 3 datasets).
I updated the new code to aggregate the following datasets, which are labelled:
The command I ran was:
python ms-lesion-agnostic/monai/1_create_msd_data.py -pd ~/net/ms-lesion-agnostic/data/ -po ~/net/ms-lesion-agnostic/msd_data/ --lesion-only --canproco-exclude canproco/exclude.yml
The output is the following:
Total number of derivatives in the root directory: 4407
Number of images in train set: 1636
Number of images in validation set: 569
Number of images in test set: 544
Total number of images in the dataset: 2749
The total number of images in the dataset (2749) differs from the total number of derivatives (4407) because we decided to keep only the images that contain lesions. The output is the following file:
for now, but maybe in the future it would be desirable to develop a model that also has good specificity (i.e., a high true negative rate)
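The lesion-only filtering discussed above can be sketched as follows. This is a minimal illustration, not the code of 1_create_msd_data.py: the masks are plain nested lists standing in for loaded segmentation arrays, and the names `has_lesion`, `keep_lesion_only` and the file names are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple

# A 3D binary mask as nested lists (hypothetical stand-in for a NIfTI array).
Mask = Sequence[Sequence[Sequence[int]]]


def has_lesion(mask: Mask) -> bool:
    """Return True if at least one voxel of the mask is non-zero."""
    return any(voxel != 0 for plane in mask for row in plane for voxel in row)


def keep_lesion_only(pairs: Iterable[Tuple[str, Mask]]) -> List[str]:
    """Keep the image paths whose segmentation contains at least one lesion voxel."""
    return [path for path, mask in pairs if has_lesion(mask)]


# Hypothetical example: one mask with a lesion voxel, one empty mask.
empty = [[[0, 0], [0, 0]]]
lesion = [[[0, 1], [0, 0]]]
pairs = [("sub-01_T2w.nii.gz", lesion), ("sub-02_T2w.nii.gz", empty)]
kept = keep_lesion_only(pairs)
print(kept)
```

Dropping empty masks means the model never sees lesion-free images, which is why specificity on healthy scans would need to be checked separately.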
There was an issue in the code when gathering segmentations from python ms-lesion-agnostic/monai/1_create_msd_data.py -pd ~/net/ms-lesion-agnostic/data/ -po ~/net/ms-lesion-agnostic/msd_data/ --lesion-only --canproco-exclude canproco/exclude.yml
This is the output of the code:
Total number of derivatives in the root directory: 4407
Number of images in train set: 1712
Number of images in validation set: 590
Number of images in test set: 569
Total number of images in the dataset: 2871
For the purpose of writing an abstract for ACTRIMS, I am recording here some information about the data that we use.
Sites used for external validation:
I ran the script to analyze the dataset:
EDIT: I fixed the re-orientation problem so that all resolutions are now computed in RPI orientation.
Here is the output:
Number of images: 2871
Number of images for training: 1712
Number of images for validation: 590
Number of images for testing: 569
Number of images per contrast: {'UNIT1': 265, 'T2w': 1773, 'STIR': 72, 'PSIR': 286, 'T2star': 474, 'T1w': 1}
Number of images per orientation: {'iso': 272, 'ax': 1693, 'sag': 906}
Average resolution: [1.25201238 0.54277958 2.93960629]
Std resolution: [1.17157561 0.23521095 1.95652873]
Median resolution: [0.57291669 0.5625 3.29999995]
-------------------------------------
Number of images in ms-basel-2018: 46
Contrast in ms-basel-2018: {'T2w', 'T1w'}
Number of images per contrast in ms-basel-2018: {'T2w': 24, 'T1w': 22}
Number of images in ms-basel-2020: 31
Contrast in ms-basel-2020: {'PD'}
Number of images per contrast in ms-basel-2020: {'PD': 31}
-------------------------------------
Number of images in umass: 3516
Contrast in umass: {'T2w', 'PD', 'T1w'}
Number of images per contrast in umass: {'T2w': 1806, 'PD': 537, 'T1w': 1173}
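For reference, per-contrast counts like the ones above could be obtained along these lines. This is only a sketch, assuming the files follow BIDS naming so that the contrast is the last underscore-separated part of the file name; `contrast_of`, `count_contrasts` and the example paths are hypothetical and not taken from the analysis script.

```python
from collections import Counter
from pathlib import Path
from typing import Dict, Iterable


def contrast_of(path: str) -> str:
    """Return the BIDS suffix, e.g. 'T2w' for 'sub-01_acq-ax_T2w.nii.gz'."""
    name = Path(path).name
    stem = name.split(".", 1)[0]    # drop '.nii.gz' or '.nii'
    return stem.rsplit("_", 1)[-1]  # last underscore-separated part


def count_contrasts(paths: Iterable[str]) -> Dict[str, int]:
    """Count how many images there are for each contrast."""
    return dict(Counter(contrast_of(p) for p in paths))


# Hypothetical file names, for illustration only.
paths = [
    "sub-01/anat/sub-01_T2w.nii.gz",
    "sub-01/anat/sub-01_PD.nii.gz",
    "sub-02/anat/sub-02_acq-ax_T2w.nii.gz",
]
counts = count_contrasts(paths)
print(counts)
```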
I added the ms-nmo-beijing dataset, from which we only get some T1w images. I also added the computation of the resolution and the orientation for every dataset.
Here is the output after these changes:
Number of images: 2871
Number of images for training: 1712
Number of images for validation: 590
Number of images for testing: 569
Number of images per contrast: {'UNIT1': 265, 'T2w': 1773, 'STIR': 72, 'PSIR': 286, 'T2star': 474, 'T1w': 1}
Number of images per orientation: {'iso': 272, 'ax': 1693, 'sag': 906}
Average resolution: [1.25201238 0.54277958 2.93960629]
Std resolution: [1.17157561 0.23521095 1.95652873]
Median resolution: [0.57291669 0.5625 3.29999995]
Minimum pixel dimension: 0.1874999850988388
Maximum pixel dimension: 9.541563034057617
-------------------------------------
Number of images in ms-basel-2018: 46
Contrast in ms-basel-2018: {'T1w', 'T2w'}
Number of images per contrast in ms-basel-2018: {'T1w': 22, 'T2w': 24}
Number of images in ms-basel-2020: 31
Contrast in ms-basel-2020: {'PD'}
Number of images per contrast in ms-basel-2020: {'PD': 31}
Average resolution: [2.43636375 0.61377165 0.61377165]
Std resolution: [0.90967523 0.25884103 0.25884103]
Median resolution: [2.99999976 0.57291669 0.57291669]
Minimum pixel dimension: 0.3385416567325592
Maximum pixel dimension: 3.300001859664917
Number of images per orientation in basel: {'sag': 55, 'iso': 22}
-------------------------------------
Number of images in umass: 3512
Contrast in umass: {'T1w', 'T2w', 'PD'}
Number of images per contrast in umass: {'T1w': 1169, 'T2w': 1806, 'PD': 537}
Average resolution: [2.14845656 0.43490765 1.70391261]
Std resolution: [1.4642299 0.10913737 1.53917671]
Median resolution: [3.29995835 0.42969999 0.42970002]
Minimum pixel dimension: 0.3124999701976776
Maximum pixel dimension: 11.24999713897705
Number of images per orientation in umass: {'sag': 2088, 'ax': 1424}
-------------------------------------
Number of images in beijing: 346
Contrast in beijing: {'T1w'}
Number of images per contrast in beijing: {'T1w': 346}
Average resolution: [1.33011619 0.924434 2.19872953]
Std resolution: [1.05835971 0.13317857 2.83172577]
Median resolution: [1.00000072 1. 1. ]
Minimum pixel dimension: 0.390625
Maximum pixel dimension: 13.799997329711914
Number of images per orientation in beijing: {'sag': 113, 'iso': 174, 'ax': 59}
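The resolution and orientation statistics reported above can be approximated with the sketch below. It assumes voxel sizes are already expressed in RPI order (axis 0 = right-left, axis 1 = posterior-anterior, axis 2 = inferior-superior) and uses an assumed rule for the orientation label: isotropic if all voxel sizes are within a tolerance of each other, otherwise named after the axis with the thickest spacing. The functions `orientation_of` and `summarize`, the tolerance and the example values are hypothetical, and the actual script may use a different rule.

```python
from collections import Counter
from statistics import fmean, median, pstdev
from typing import Dict, Sequence, Tuple

Zooms = Tuple[float, float, float]  # voxel sizes in mm, RPI order (assumed)


def orientation_of(zooms: Zooms, tol: float = 0.1) -> str:
    """Guess the acquisition orientation from voxel sizes (assumed rule)."""
    if max(zooms) <= (1 + tol) * min(zooms):
        return "iso"
    thick_axis = zooms.index(max(zooms))
    # Thick slices along R-L are sagittal, along P-A coronal, along I-S axial.
    return {0: "sag", 1: "cor", 2: "ax"}[thick_axis]


def summarize(all_zooms: Sequence[Zooms]) -> Dict[str, object]:
    """Per-axis mean, population std and median, plus global min/max pixel size."""
    per_axis = list(zip(*all_zooms))
    flat = [z for zooms in all_zooms for z in zooms]
    return {
        "mean": [fmean(a) for a in per_axis],
        "std": [pstdev(a) for a in per_axis],
        "median": [median(a) for a in per_axis],
        "min_pixdim": min(flat),
        "max_pixdim": max(flat),
        "orientations": dict(Counter(orientation_of(z) for z in all_zooms)),
    }


# Hypothetical voxel sizes, for illustration only.
zooms_list = [(3.0, 0.5, 0.5), (0.5, 0.5, 3.0), (1.0, 1.0, 1.0)]
stats = summarize(zooms_list)
print(stats)
```

Under this rule, a dataset whose median resolution has its largest value on the first axis would be mostly sagittal, and one whose largest value is on the third axis would be mostly axial.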
This issue describes the aggregation of the available datasets.
The datasets of interest for this project are:
Labeled datasets:
Unlabeled datasets: