The code is written in Python 3.10; the required packages are listed in requirements.txt.
To run the code, clone the repository into the current directory:
git clone https:/jana370/DeepSAD
It is recommended to set up a virtual environment to run the code:
# pip install virtualenv
cd <path of DeepSAD directory>
python -m virtualenv env
.\env\Scripts\activate
Then the required packages can be installed:
pip install -r requirements.txt
After that the code can be run.
DeepSAD.py is the Deep SAD implementation, make_graphics.py creates the figure for the tests that include labeled normal data, and make_graphic_pollution.py creates the figure for the tests with mislabeled data. Each script is run from the command line:
python DeepSAD.py
python make_graphics.py
python make_graphic_pollution.py
The Deep SAD implementation accepts the following command-line options:
-d or --dataset: choose the dataset to use; either "mnist", "fmnist", or "cifar10"; default is "mnist"
-m or --mode: choose the type of loss function used for Deep SAD:
    "standard" treats labeled normal data the same as unlabeled data and applies the weight only to labeled anomalies;
    "standard_normal" applies the weight to both labeled normal data and labeled anomalies;
    "extended" applies the weight to the labeled normal data and the second weight to the labeled anomalies;
    default is "standard"
-w or --weight: choose the weight used in the loss function; note that in the "extended" mode this defines only the weight for the labeled normal data; default is 3
-sw or --second_weight: choose the second weight, which is used for the labeled anomalies in the "extended" mode; default is 4
-cn or --category_normal: choose the category used as the normal class; the following categories are defined for each dataset:
    MNIST: category 0: 0, 6, 8, and 9; category 1: 1, 4, and 7; category 2: 2, 3, and 5;
    F-MNIST: category 0: T_shirt, Pullover, Coat, and Shirt; category 1: Trouser and Dress; category 2: Sandal, Sneaker, Bag, and Ankleboot;
    CIFAR-10: category 0: plane, car, ship, and truck; category 1: bird and frog; category 2: cat, deer, dog, and horse;
    default is 0
-ca or --category_anomaly: choose the category used as the anomaly class; the following categories are defined for each dataset:
    MNIST: category 0: 0, 6, 8, and 9; category 1: 1, 4, and 7; category 2: 2, 3, and 5;
    F-MNIST: category 0: T_shirt, Pullover, Coat, and Shirt; category 1: Trouser and Dress; category 2: Sandal, Sneaker, Bag, and Ankleboot;
    CIFAR-10: category 0: plane, car, ship, and truck; category 1: bird and frog; category 2: cat, deer, dog, and horse;
    default is 1
-ra or --ratio_anomaly: choose the ratio of labeled anomalies to use; the value should be between 0 and 1; default is 0.05
-rn or --ratio_normal: choose the ratio of labeled normal data to use; the value should be between 0 and 1; default is 0.0
-rpu or --ratio_pollution_unlabeled: choose the ratio of pollution in the unlabeled data; the value should be between 0 and 1; default is 0.1
-rpl or --ratio_pollution_labeled: choose the ratio of pollution in the labeled anomalies; the value should be between 0 and 1; default is 0.0
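How the three modes combine the two weights can be sketched per sample as follows. This is a minimal illustration, assuming the loss follows the published Deep SAD formulation (squared distance to the hypersphere center for unlabeled and normal samples, inverse squared distance for labeled anomalies). The function name `sample_loss`, the label encoding, and the `eps` stabilizer are hypothetical and are not taken from DeepSAD.py.

```python
MODES = ("standard", "standard_normal", "extended")


def sample_loss(dist_sq, label, mode="standard", weight=3.0,
                second_weight=4.0, eps=1e-6):
    """Per-sample loss term for one embedded point (illustrative sketch).

    dist_sq: squared distance of the embedding to the center.
    label:   0 = unlabeled, 1 = labeled normal, -1 = labeled anomaly
             (an assumed encoding, not necessarily the one in DeepSAD.py).
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")

    if label == 0:
        # Unlabeled data is always pulled toward the center, unweighted.
        return dist_sq

    if label == 1:
        # "standard" treats labeled normal data like unlabeled data;
        # the other two modes apply the (first) weight to it.
        return dist_sq if mode == "standard" else weight * dist_sq

    if label == -1:
        # Labeled anomalies are pushed away via the inverse distance.
        # Only "extended" uses the second weight for them.
        anomaly_weight = second_weight if mode == "extended" else weight
        return anomaly_weight / (dist_sq + eps)

    raise ValueError(f"unknown label: {label!r}")
```

For instance, with `dist_sq=2.0` a labeled normal sample contributes `2.0` in the "standard" mode and `weight * 2.0` in the other two modes, while a labeled anomaly is scaled by `second_weight` only in the "extended" mode.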
With the default settings, Deep SAD runs on MNIST in the "standard" mode with the weight 3, category 0 as the normal class, category 1 as the anomaly class, a labeled anomaly ratio of 0.05, no labeled normal data, a pollution of 0.1 in the unlabeled data, and no pollution in the labeled anomalies. This configuration is run by using:
python DeepSAD.py
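To make the category numbers concrete, the groupings listed for -cn and -ca can be written as a lookup table. This is only a sketch: the names `CATEGORIES` and `split_classes` are hypothetical, the class names are copied from the option descriptions, and DeepSAD.py may represent the groups differently (for example as integer class indices).

```python
# Category groups per dataset, copied from the option descriptions.
CATEGORIES = {
    "mnist": {
        0: [0, 6, 8, 9],
        1: [1, 4, 7],
        2: [2, 3, 5],
    },
    "fmnist": {
        0: ["T_shirt", "Pullover", "Coat", "Shirt"],
        1: ["Trouser", "Dress"],
        2: ["Sandal", "Sneaker", "Bag", "Ankleboot"],
    },
    "cifar10": {
        0: ["plane", "car", "ship", "truck"],
        1: ["bird", "frog"],
        2: ["cat", "deer", "dog", "horse"],
    },
}


def split_classes(dataset="mnist", category_normal=0, category_anomaly=1):
    """Return (normal classes, anomaly classes) for the chosen categories."""
    groups = CATEGORIES[dataset]
    return groups[category_normal], groups[category_anomaly]


# The default run: MNIST with category 0 as normal and category 1 as anomaly.
normal_classes, anomaly_classes = split_classes()
print(normal_classes, anomaly_classes)
```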
As a further example, Deep SAD on CIFAR-10 in the "extended" mode, with the weight 2 for labeled normal data and the second weight 4 for labeled anomalies, category 1 as the normal class, category 2 as the anomaly class, a labeled anomaly ratio of 0.01, a labeled normal data ratio of 0.1, a pollution of 0.1 in the unlabeled data, and a pollution of 0.01 in the labeled anomalies, can be run by using:
python DeepSAD.py -d "cifar10" -m "extended" -w 2 -sw 4 -cn 1 -ca 2 -ra 0.01 -rn 0.1 -rpu 0.1 -rpl 0.01
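For readers who want to see how options like these fit together, the following sketch declares the same flags and defaults with Python's standard `argparse` module and parses the example command above. It is an illustration under assumptions, not the parser used in DeepSAD.py: the helper names `ratio` and `build_parser` are hypothetical, and the range check on the ratios is an assumed way of enforcing the "between 0 and 1" requirement.

```python
import argparse


def ratio(text):
    """Parse a float and require it to lie in [0, 1] (assumed validation)."""
    value = float(text)
    if not 0.0 <= value <= 1.0:
        raise argparse.ArgumentTypeError(
            f"ratio must be between 0 and 1, got {value}")
    return value


def build_parser():
    """Declare the documented options with their documented defaults."""
    p = argparse.ArgumentParser(description="Deep SAD options (sketch)")
    p.add_argument("-d", "--dataset", default="mnist",
                   choices=["mnist", "fmnist", "cifar10"])
    p.add_argument("-m", "--mode", default="standard",
                   choices=["standard", "standard_normal", "extended"])
    p.add_argument("-w", "--weight", type=float, default=3)
    p.add_argument("-sw", "--second_weight", type=float, default=4)
    p.add_argument("-cn", "--category_normal", type=int,
                   choices=[0, 1, 2], default=0)
    p.add_argument("-ca", "--category_anomaly", type=int,
                   choices=[0, 1, 2], default=1)
    p.add_argument("-ra", "--ratio_anomaly", type=ratio, default=0.05)
    p.add_argument("-rn", "--ratio_normal", type=ratio, default=0.0)
    p.add_argument("-rpu", "--ratio_pollution_unlabeled", type=ratio,
                   default=0.1)
    p.add_argument("-rpl", "--ratio_pollution_labeled", type=ratio,
                   default=0.0)
    return p


# Parse the CIFAR-10 example from an explicit list instead of sys.argv.
example = ("-d cifar10 -m extended -w 2 -sw 4 -cn 1 -ca 2 "
           "-ra 0.01 -rn 0.1 -rpu 0.1 -rpl 0.01").split()
args = build_parser().parse_args(example)
print(vars(args))
```

Parsing an empty argument list with the same parser yields the defaults, which corresponds to running `python DeepSAD.py` without options.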