Moderator API Documentation

This section provides a detailed reference of the classes, methods, and attributes available in the Moderator module.

class nlpguard.moderator.moderator.HuggingFaceDatasetModerator

Bases: Moderator

sentences_removal_mitigation_strategy(**kwargs): Abstract method to perform the sentence removal mitigation strategy.

word_replacement_with_hypernym_mitigation_strategy(**kwargs): Abstract method to perform the word replacement with hypernym mitigation strategy.

word_replacement_with_synonyms_mitigation_strategy(**kwargs): Abstract method to perform the word replacement with synonyms mitigation strategy.

words_removal_mitigation_strategy(**kwargs): Abstract method to perform the word removal mitigation strategy.

class nlpguard.moderator.moderator.Moderator

Bases: ABC

Abstract Moderator Class.

abstract sentences_removal_mitigation_strategy(**kwargs): Abstract method to perform the sentence removal mitigation strategy.

abstract word_replacement_with_hypernym_mitigation_strategy(**kwargs): Abstract method to perform the word replacement with hypernym mitigation strategy.

abstract word_replacement_with_synonyms_mitigation_strategy(**kwargs): Abstract method to perform the word replacement with synonyms mitigation strategy.

abstract words_removal_mitigation_strategy(**kwargs): Abstract method to perform the word removal mitigation strategy.
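The contract implied by the abstract class can be sketched as follows. This is an illustration only: the `Moderator` defined here is a stand-in that copies the four method names above so the example runs without nlpguard installed, and `ToyModerator` is a hypothetical subclass, not part of the library.

```python
from abc import ABC, abstractmethod


class Moderator(ABC):
    """Stand-in for the abstract base documented above (illustration only)."""

    @abstractmethod
    def sentences_removal_mitigation_strategy(self, **kwargs): ...

    @abstractmethod
    def word_replacement_with_hypernym_mitigation_strategy(self, **kwargs): ...

    @abstractmethod
    def word_replacement_with_synonyms_mitigation_strategy(self, **kwargs): ...

    @abstractmethod
    def words_removal_mitigation_strategy(self, **kwargs): ...


class ToyModerator(Moderator):
    """Hypothetical subclass: all four abstract methods must be overridden."""

    def sentences_removal_mitigation_strategy(self, **kwargs):
        raise NotImplementedError

    def word_replacement_with_hypernym_mitigation_strategy(self, **kwargs):
        raise NotImplementedError

    def word_replacement_with_synonyms_mitigation_strategy(self, **kwargs):
        raise NotImplementedError

    def words_removal_mitigation_strategy(self, texts=(), protected_attributes=(), **kwargs):
        # Toy behaviour: drop whitespace-separated words listed as protected.
        banned = {w.lower() for w in protected_attributes}
        return [" ".join(t for t in text.split() if t.lower() not in banned) for text in texts]


def can_instantiate(cls):
    """Return True if cls() succeeds, False if abstract methods block it."""
    try:
        cls()
        return True
    except TypeError:
        return False
```

Instantiating the abstract base raises `TypeError`, while a subclass that overrides every abstract method can be created normally.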

class nlpguard.moderator.moderator.PandasDataFrameModerator

Bases: Moderator

Moderator Implementation for Pandas DataFrames.

static _batch_sentences_removal(texts, tokenizer, protected_attributes)

Removes sentences containing protected attributes from texts in batch.

Parameters:

texts (list(str)) – List of texts.
tokenizer (transformers.AutoTokenizer) – Tokenizer.
protected_attributes (list(str)) – List of protected attributes to remove.

Returns:

List of mitigated texts obtained by removing sentences containing protected attributes.

Return type:

list(str)
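The idea behind sentence removal can be sketched in plain Python. This is a minimal sketch, not the library's implementation: the function name is hypothetical, the `tokenizer` argument is left out because the reference does not describe how it is used, and sentence splitting and whole-word, case-insensitive matching are assumptions.

```python
import re


def remove_sentences_with_protected_attributes(texts, protected_attributes):
    """Drop every sentence that mentions a protected attribute."""
    if not protected_attributes:
        return list(texts)
    # Whole-word, case-insensitive match on any protected attribute (assumption).
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(w) for w in protected_attributes) + r")\b",
        re.IGNORECASE,
    )
    mitigated = []
    for text in texts:
        # Naive split after '.', '!' or '?' followed by whitespace (assumption).
        sentences = re.split(r"(?<=[.!?])\s+", text.strip())
        kept = [s for s in sentences if s and not pattern.search(s)]
        mitigated.append(" ".join(kept))
    return mitigated
```

A text whose sentences all contain a protected attribute becomes an empty string in this sketch; how the library treats that case is not stated here.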

static _batch_words_removal(texts, tokenizer, protected_attributes)

Removes protected attributes (words) from texts in batch.

Parameters:

texts (list(str)) – List of texts.
tokenizer (transformers.AutoTokenizer) – Tokenizer.
protected_attributes (list(str)) – List of protected attributes to remove.

Returns:

List of mitigated texts.

Return type:

list(str)
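Word removal differs from sentence removal in that only the matching words are dropped and the rest of the sentence survives. A minimal sketch under the same assumptions (hypothetical function name, no `tokenizer`, whole-word case-insensitive matching):

```python
import re


def remove_protected_words(texts, protected_attributes):
    """Delete protected words from each text and tidy the leftover spacing."""
    if not protected_attributes:
        return list(texts)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(w) for w in protected_attributes) + r")\b",
        re.IGNORECASE,
    )
    mitigated = []
    for text in texts:
        stripped = pattern.sub("", text)
        # Collapse the double spaces left behind by removed words.
        mitigated.append(re.sub(r"\s+", " ", stripped).strip())
    return mitigated
```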

_generate_synonym_sentences(texts, protected_words, glove_word_embedding, n_synonyms, keep_original_sentence) → list[str]

Generate sentences by replacing words with synonyms.

Parameters:

texts (list(str)) – List of original sentences.
protected_words (list(str)) – List of protected words.
glove_word_embedding (gensim.models.KeyedVectors) – GloVe embeddings model.
n_synonyms (int) – Number of synonym sentences to generate.
keep_original_sentence (bool) – Flag indicating whether to keep the original sentence in the output.

Returns:

List of texts with synonym-replaced sentences for each original sentence.

Return type:

list(str)
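The augmentation step can be sketched as below. The sketch takes a ready-made `synonyms_by_word` dictionary instead of a GloVe model so that it runs on its own; that parameter, the function name, and the choice to pass texts without protected words through unchanged are assumptions rather than documented behaviour.

```python
import re


def generate_synonym_sentences(texts, synonyms_by_word, n_synonyms, keep_original_sentence=True):
    """Return texts augmented with up to n_synonyms synonym-replaced variants."""
    patterns = {
        word: re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
        for word in synonyms_by_word
    }
    output = []
    for text in texts:
        present = [w for w, p in patterns.items() if p.search(text)]
        if not present:
            output.append(text)  # assumption: untouched texts pass through
            continue
        if keep_original_sentence:
            output.append(text)
        for i in range(n_synonyms):
            variant = text
            for word in present:
                candidates = synonyms_by_word[word]
                if i < len(candidates):
                    # A callable replacement avoids backslash escapes in the synonym.
                    variant = patterns[word].sub(lambda _m, r=candidates[i]: r, variant)
            if variant != text:
                output.append(variant)
    return output
```

With `keep_original_sentence=False`, the original of each text that contains a protected word is dropped and only the variants remain.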

static _get_hypernyms(word_list)

Returns the hypernyms of the words in the given list.

static _get_synonyms(word_list, glove_word_embedding, k=5) → dict[str, list[str]]

Returns synonyms for the given word list using the GloVe embeddings.

Parameters:

word_list (list(str)) – List of words to find synonyms for.
glove_word_embedding (gensim.models.KeyedVectors) – Pre-trained word embeddings.
k (int, optional) – Number of synonyms to return for each word. Defaults to 5.

Returns:

Dictionary of words and their corresponding synonyms.

Return type:

dict
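A nearest-neighbour lookup of this kind can be sketched with `most_similar`, which `gensim.models.KeyedVectors` provides and which returns `(word, score)` pairs. To keep the example runnable without downloading embeddings, `TinyVectors` is a hypothetical stub exposing only the two features used; returning an empty list for out-of-vocabulary words is an assumption.

```python
class TinyVectors:
    """Hypothetical stub mimicking the KeyedVectors features used below."""

    def __init__(self, neighbours):
        self._neighbours = neighbours

    def __contains__(self, word):
        return word in self._neighbours

    def most_similar(self, word, topn=10):
        # Same return shape as gensim: a list of (word, similarity) tuples.
        return self._neighbours[word][:topn]


def get_synonyms(word_list, embedding, k=5):
    """Map each word to its k nearest neighbours in the embedding space."""
    synonyms = {}
    for word in word_list:
        if word in embedding:
            synonyms[word] = [n for n, _score in embedding.most_similar(word, topn=k)]
        else:
            synonyms[word] = []  # out-of-vocabulary word (assumed behaviour)
    return synonyms
```

Nearest neighbours in an embedding space are related words rather than strict synonyms, so the returned lists may include antonyms or co-occurring terms.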

static load_GloVe_embedding_model(model_name='glove-wiki-gigaword-300')

Load the GloVe embeddings model.

Parameters:

model_name (str, optional) – Name of the GloVe model to load. Defaults to 'glove-wiki-gigaword-300'.

Returns:

Loaded GloVe embeddings.

Return type:

gensim.models.KeyedVectors

sentences_removal_mitigation_strategy(df_train, tokenizer, protected_attributes_per_label_dict, text_column_name, label_column_name, id2label, mitigate_each_label_separately=False, batch_size=128) → DataFrame

Performs the sentence removal mitigation strategy.

Parameters:

df_train (pandas.DataFrame) – Training dataset.
tokenizer (transformers.AutoTokenizer) – Tokenizer.
protected_attributes_per_label_dict (dict) – Dictionary of protected attributes per class label.
text_column_name (str) – Name of the column containing the text.
label_column_name (str) – Name of the column containing the class label.
id2label (dict) – Dictionary mapping class indices to class labels.
mitigate_each_label_separately (bool, optional) – Whether to mitigate each class label separately. If True, protected attributes identified as important for a class label are mitigated for that class label only. If False, protected attributes identified for any class label are mitigated for all class labels. Defaults to False.
batch_size (int, optional) – Batch size. Defaults to 128.

Returns:

Dataframe with mitigated texts obtained by protected attributes sentence removal.

Return type:

pandas.DataFrame
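The effect of mitigate_each_label_separately can be sketched as a choice of which protected attributes apply to a given row. The helper name is hypothetical, and keying the dictionary by label name is an assumption.

```python
def attributes_to_mitigate(protected_attributes_per_label_dict, label,
                           mitigate_each_label_separately=False):
    """Pick the protected attributes that apply to a row with the given label."""
    if mitigate_each_label_separately:
        # Only the attributes identified for this row's own label.
        return sorted(set(protected_attributes_per_label_dict.get(label, [])))
    # Otherwise the union of attributes identified for every label.
    merged = set()
    for words in protected_attributes_per_label_dict.values():
        merged.update(words)
    return sorted(merged)
```

For example, with `{"nurse": ["she", "her"], "surgeon": ["he"]}`, a row labelled "nurse" is mitigated for "her" and "she" only when the flag is True, and for all three words when it is False.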

word_replacement_with_hypernym_mitigation_strategy(**kwargs): Abstract method to perform the word replacement with hypernym mitigation strategy.

word_replacement_with_synonyms_mitigation_strategy(df_train, tokenizer, protected_attributes_per_label_dict, text_column_name, label_column_name, id2label, n_synonyms=5, keep_original_sentence=True, mitigate_each_label_separately=False, batch_size=128) → DataFrame

Performs the word replacement with synonyms mitigation strategy. For each protected attribute found in a text, n_synonyms variants are generated by replacing the attribute with one of its n_synonyms most similar words.

Parameters:

df_train (pandas.DataFrame) – Training dataset.
tokenizer (transformers.AutoTokenizer) – Tokenizer.
protected_attributes_per_label_dict (dict) – Dictionary of protected attributes per class label.
text_column_name (str) – Name of the column containing the text.
label_column_name (str) – Name of the column containing the class label.
id2label (dict) – Dictionary mapping class indices to class labels.
n_synonyms (int, optional) – Number of synonym-based sentences to generate per text. Defaults to 5.
keep_original_sentence (bool, optional) – Whether to keep the original sentence containing the identified protected attribute. Defaults to True.
mitigate_each_label_separately (bool, optional) – Whether to mitigate each class label separately. If True, protected attributes identified as important for a class label are mitigated for that class label only. If False, protected attributes identified for any class label are mitigated for all class labels. Defaults to False.
batch_size (int, optional) – Batch size. Defaults to 128.

Returns:

Dataframe with mitigated texts obtained by replacing protected attributes with synonyms.

Return type:

pandas.DataFrame

words_removal_mitigation_strategy(df_train, tokenizer, protected_attributes_per_label_dict, text_column_name, label_column_name, id2label, mitigate_each_label_separately=False, batch_size=128) → DataFrame

Performs the word removal mitigation strategy.

Parameters:

df_train (pandas.DataFrame) – Training dataset.
tokenizer (transformers.AutoTokenizer) – Tokenizer.
protected_attributes_per_label_dict (dict) – Dictionary of protected attributes per class label.
text_column_name (str) – Name of the column containing the text.
label_column_name (str) – Name of the column containing the class label.
id2label (dict) – Dictionary mapping class label ids to class label names.
mitigate_each_label_separately (bool, optional) – Whether to mitigate each class label separately. Defaults to False.
batch_size (int, optional) – Batch size for processing. Defaults to 128.

Returns:

DataFrame with the mitigated texts.

Return type:

pandas.DataFrame
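The role of batch_size can be sketched as chunked processing: the texts are cut into slices of at most batch_size items and each slice is handed to a batch function. The helper `apply_in_batches` and its `batch_fn` parameter are hypothetical names; this is one plausible shape for the loop, not the library's code.

```python
def apply_in_batches(texts, batch_fn, batch_size=128):
    """Apply batch_fn to consecutive slices of texts and concatenate the results."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    results = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        results.extend(batch_fn(batch))
    return results
```

The last slice may be shorter than batch_size, and the output order matches the input order.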