Adversarial debiasing is an in-processing technique that learns a classifier to maximize prediction accuracy while simultaneously reducing an adversary's ability to determine the protected attribute from the predictions.
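Intuitively, the classifier minimizes its own prediction loss while maximizing the loss of an adversary that tries to recover the protected attribute from the classifier's outputs; adversary_loss_weight controls the trade-off. The base-R sketch below illustrates that combined objective for intuition only; it is not the package's TensorFlow-based implementation (which also applies a gradient-projection step), and all names in it are illustrative.
sigmoid <- function(x) 1 / (1 + exp(-x))
bce <- function(p, y) -mean(y * log(p) + (1 - y) * log(1 - p))

set.seed(1)
x <- matrix(rnorm(200), ncol = 2)   # toy features
y <- rbinom(100, 1, 0.5)            # true labels
z <- rbinom(100, 1, 0.5)            # protected attribute

w_clf <- rnorm(2)                   # toy classifier weights
w_adv <- rnorm(1)                   # toy adversary weight
y_hat <- sigmoid(x %*% w_clf)       # classifier predictions
z_hat <- sigmoid(y_hat * w_adv)     # adversary recovers z from predictions

alpha <- 0.1                        # plays the role of adversary_loss_weight
y_loss <- bce(y_hat, y)             # prediction loss: classifier minimizes this
z_loss <- bce(z_hat, z)             # adversary loss: classifier wants this large
combined <- y_loss - alpha * z_loss # classifier's effective training objective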
Usage
adversarial_debiasing(
  unprivileged_groups,
  privileged_groups,
  scope_name = "current",
  sess = tf$compat$v1$Session(),
  seed = NULL,
  adversary_loss_weight = 0.1,
  num_epochs = 50L,
  batch_size = 128L,
  classifier_num_hidden_units = 200L,
  debias = TRUE
)
Arguments
- unprivileged_groups
A list with two elements: the name of the protected attribute column and the value identifying the unprivileged group.
- privileged_groups
A list with two elements: the name of the protected attribute column and the value identifying the privileged group.
- scope_name
Scope name for the TensorFlow variables.
- sess
A TensorFlow session.
- seed
Seed to make predict repeatable. If not NULL, must be an integer.
- adversary_loss_weight
Hyperparameter that chooses the strength of the adversarial loss.
- num_epochs
Number of training epochs. Must be an integer.
- batch_size
Batch size. Must be an integer.
- classifier_num_hidden_units
Number of hidden units in the classifier model. Must be an integer.
- debias
If TRUE, learn the classifier with debiasing; if FALSE, learn it without.
Examples
if (FALSE) {
load_aif360_lib()
ad <- adult_dataset()
p <- list("race", 1)
u <- list("race", 0)
sess <- tf$compat$v1$Session()
debiased_model <- adversarial_debiasing(privileged_groups = p,
                                        unprivileged_groups = u,
                                        scope_name = "debiased_classifier",
                                        debias = TRUE,
                                        sess = sess)
debiased_model$fit(ad)
ad_debiased <- debiased_model$predict(ad)
}
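For a baseline comparison, the same function can be trained without the adversary by setting debias = FALSE. A sketch continuing the example above; the scope name and variable names are arbitrary:
if (FALSE) {
plain_model <- adversarial_debiasing(privileged_groups = p,
                                     unprivileged_groups = u,
                                     scope_name = "plain_classifier",
                                     debias = FALSE,
                                     sess = sess)
plain_model$fit(ad)
ad_nodebiasing <- plain_model$predict(ad)
}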