API Reference¤

Core Modules¤

config ¤

Classes:

  • Config

    Class for loading config files

Config ¤

Config(config_path: str | Path)

Class for loading config files

Parameters:

  • config_path ¤

    (str | Path) –

    Path to config file

Attributes:

data property ¤

data: dict[str, Any]

Get data config

lora property ¤

lora: dict[str, Any] | None

Get LoRA config

methods property ¤

methods: list[dict[str, Any]]

Get method configs

model property ¤

model: dict[str, Any]

Get model config

sampling_parameters property ¤

sampling_parameters: dict[str, Any]

Get sampling_parameters config
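The properties above mirror the top-level sections of the config file. A minimal sketch of such a file, assuming YAML; the section names come from the documented properties, but the leaf keys (beyond the DataLoader parameters) are illustrative assumptions:

```yaml
# Top-level sections correspond to Config properties.
data:
  data_path: data/train.csv
  data_format: csv
  text_column: text
  label_column: label
model:
  model_name: example/model   # hypothetical model identifier
methods:
  - type: loss
  - type: mink
    params:
      ratio: 0.2
sampling_parameters:
  temperature: 0.0
lora: null                    # optional; null when no LoRA adapter is used
```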

data_loader ¤

Classes:

DataLoader ¤

DataLoader(
    data_path: str | Path | None = None,
    data_format: str = "csv",
    text_column: str = "text",
    label_column: str = "label",
)

Data loader class

Parameters:

  • data_path ¤

    (str | Path | None, default: None ) –

    Path to the data (file or directory, or dataset name for huggingface format)

  • data_format ¤

    (str, default: 'csv' ) –

    Data format ("csv", "jsonl", "json", "parquet", "huggingface")

  • text_column ¤

    (str, default: 'text' ) –

    Name of the text column

  • label_column ¤

    (str, default: 'label' ) –

    Name of the label column

Methods:

  • get_data

    Get data

  • load_mimir

    Load Mimir dataset with fixed text length constraints

  • load_wikimia

    Load WikiMIA dataset with specified text length

get_data ¤

get_data(text_length: int | None = None) -> tuple[list[str], list[int]]

Get data

Parameters:

  • text_length ¤
    (int | None, default: None ) –

    Number of words to split (if None, no split)

Returns:

  • texts ( list[str] ) –

    List of texts

  • labels ( list[int] ) –

    List of labels

load_mimir staticmethod ¤

load_mimir(data_path: str, token: str) -> DataLoader

Load Mimir dataset with fixed text length constraints

Parameters:

  • data_path ¤
    (str) –

    Path to the data (dataset name for huggingface format)

  • token ¤
    (str) –

    Hugging Face token

Returns:

  • DataLoader

    Loaded DataLoader instance

load_wikimia staticmethod ¤

load_wikimia(text_length: int) -> DataLoader

Load WikiMIA dataset with specified text length

Parameters:

  • text_length ¤
    (int) –

    Text length (one of 32, 64, 128, 256)

Returns:

  • DataLoader

    Loaded DataLoader instance

evaluator ¤

Classes:

  • EvaluationResult

    Container for evaluation results with detailed information

  • Evaluator

    Evaluator for membership inference attacks

EvaluationResult dataclass ¤

EvaluationResult(
    results_df: DataFrame,
    detailed_results: list[dict[str, Any]],
    labels: list[int],
    data_stats: dict[str, Any],
    cache_stats: dict[str, Any] = dict(),
)

Container for evaluation results with detailed information

Evaluator ¤

Evaluator(
    data_loader: DataLoader,
    model_loader: ModelLoader,
    methods: list[BaseMethod],
    max_cache_size: int = 1000,
)

Evaluator for membership inference attacks

Parameters:

  • data_loader ¤

    (DataLoader) –

    Data loader

  • model_loader ¤

    (ModelLoader) –

    Model loader

  • methods ¤

    (list[BaseMethod]) –

    List of methods to use for evaluation

  • max_cache_size ¤

    (int, default: 1000 ) –

    Maximum cache size

Methods:

  • evaluate

    Evaluate membership inference attacks on data with specified number of words

evaluate ¤

evaluate(config: Config) -> EvaluationResult

Evaluate membership inference attacks on data with specified number of words

Parameters:

  • config ¤
    (Config) –

    Configuration

Returns:

  • EvaluationResult

    EvaluationResult containing DataFrame, detailed results, labels, and stats
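The overall flow, running every configured method over the loaded texts and collecting per-method scores against the labels, can be sketched with a stub; StubMethod and this evaluate loop are illustrative, not the library's implementation:

```python
class StubMethod:
    """Stand-in for a BaseMethod: produces one score per text."""
    method_name = "stub"

    def run(self, texts: list[str]) -> list[float]:
        # placeholder scoring: longer texts get higher scores
        return [float(len(t)) for t in texts]

def evaluate(methods, texts, labels):
    """Run each method on the texts and gather rows of scores and labels."""
    rows = []
    for method in methods:
        scores = method.run(texts)
        rows.append({"method": method.method_name,
                     "scores": scores,
                     "labels": labels})
    return rows

results = evaluate([StubMethod()], ["aaaa", "bb"], [1, 0])
```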

model_loader ¤

Classes:

ModelLoader ¤

ModelLoader(model_config: dict[str, Any])

vLLM model loader class

Parameters:

  • model_config ¤

    (dict[str, Any]) –

    Model configuration

Methods:

get_lora_request ¤

get_lora_request(lora_config: dict[str, Any]) -> LoRARequest

Get LoRA request

Parameters:

  • lora_config ¤
    (dict[str, Any]) –

    LoRA configuration

Returns:

  • LoRARequest

    LoRA request

get_sampling_params ¤

get_sampling_params(sampling_parameters: dict[str, Any]) -> SamplingParams

Get sampling parameters

Parameters:

  • sampling_parameters ¤
    (dict[str, Any]) –

    Sampling parameters configuration

Returns:

  • SamplingParams

    Sampling parameters

utils ¤

Functions:

fix_seed ¤

fix_seed(seed: int = 0) -> None

Fix random seed

Parameters:

  • seed ¤

    (int, default: 0 ) –

    Seed value to fix
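A minimal sketch of what a seed fixer typically does; which libraries the actual implementation seeds is an assumption:

```python
import random

def fix_seed(seed: int = 0) -> None:
    """Seed Python's RNG for reproducibility; a full implementation
    would likely also seed numpy and torch, e.g.
    np.random.seed(seed); torch.manual_seed(seed)."""
    random.seed(seed)

fix_seed(0)
a = random.random()
fix_seed(0)
b = random.random()
```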

get_metrics ¤

get_metrics(scores: list[float], labels: list[int]) -> tuple[float, float, float]

Calculate evaluation metrics

Parameters:

  • scores ¤

    (list[float]) –

    List of scores

  • labels ¤

    (list[int]) –

    List of labels (1: membership, 0: non-membership)

Returns:

  • auroc ( float ) –

    AUROC

  • fpr95 ( float ) –

    FPR when TPR is 95%

  • tpr05 ( float ) –

    TPR when FPR is 5%
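A self-contained sketch of these three metrics via a simple threshold sweep, assuming higher scores indicate membership and both classes are present; the library may compute them differently (e.g. via scikit-learn):

```python
def get_metrics(scores, labels):
    """Return (auroc, fpr@95%tpr, tpr@5%fpr) from scores and 0/1 labels."""
    pos = sum(labels)
    neg = len(labels) - pos
    # sweep thresholds at each observed score: predict member if score >= t
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    # AUROC by trapezoidal integration over the ROC curve
    pts = sorted([(0.0, 0.0)] + points + [(1.0, 1.0)])
    auroc = sum((x2 - x1) * (y1 + y2) / 2
                for (x1, y1), (x2, y2) in zip(pts, pts[1:]))
    fpr95 = min(fpr for fpr, tpr in points if tpr >= 0.95)
    tpr05 = max((tpr for fpr, tpr in points if fpr <= 0.05), default=0.0)
    return auroc, fpr95, tpr05

metrics = get_metrics([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
```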

MIA Methods¤

methods ¤

Modules:

  • base

Classes:

  • BaseMethod

    Base class for membership inference methods

  • CONReCaLLMethod

    Con-ReCall membership inference method

  • DCPDDMethod

    DC-PDD membership inference method

  • LossMethod

    Loss (log-likelihood) based membership inference method

  • LowerMethod

    Lower (lowercased-text comparison) membership inference method

  • MethodFactory

    Method factory class

  • MinKMethod

    Min-K% Prob based membership inference method

  • PACMethod

    PAC (Polarized Augment Calibration) based membership inference method

  • ReCaLLMethod

    ReCaLL membership inference method

  • RefMethod

    Reference model based membership inference method

  • SaMIAMethod

    SaMIA membership inference method

  • ZlibMethod

    Zlib compression-based membership inference method

BaseMethod ¤

BaseMethod(method_name: str, method_config: dict[str, Any] = None)


Base class for membership inference methods

Parameters:

  • method_name ¤

    (str) –

    Name of the method

  • method_config ¤

    (dict[str, Any], default: None ) –

    Method configuration

Methods:

cleanup_model staticmethod ¤

cleanup_model(model: LLM) -> None

Release GPU memory used by a vLLM model

Parameters:

  • model ¤
    (LLM) –

    LLM model to clean up

clear_cache classmethod ¤

clear_cache() -> None

Clear cache and reset statistics

get_cache_stats classmethod ¤

get_cache_stats() -> dict[str, Any]

Get cache statistics

Returns:

  • dict[str, Any]

    Cache statistics

get_outputs ¤

get_outputs(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[RequestOutput]

Get model outputs for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[RequestOutput]

    List of model outputs

process_output abstractmethod ¤

process_output(output: RequestOutput) -> float

Process model output and calculate score

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

Returns:

  • float

    Score

run ¤

run(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[float]

Run inference for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[float]

    List of scores

set_max_cache_size classmethod ¤

set_max_cache_size(size: int) -> None

Set maximum cache size

Parameters:

  • size ¤
    (int) –

    Maximum cache size (number of entries)
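The bounded class-level cache suggested by set_max_cache_size, clear_cache, and get_cache_stats can be sketched with an OrderedDict-based LRU; this is an illustrative sketch, not the library's code:

```python
from collections import OrderedDict

class OutputCache:
    """LRU cache of model outputs with hit/miss statistics."""

    def __init__(self, max_size: int = 1000):
        self.max_size = max_size
        self._cache: OrderedDict[str, object] = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: str):
        if key in self._cache:
            self._cache.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        return None

    def put(self, key: str, value: object) -> None:
        self._cache[key] = value
        self._cache.move_to_end(key)
        if len(self._cache) > self.max_size:
            self._cache.popitem(last=False)  # evict least recently used

    def stats(self) -> dict:
        return {"size": len(self._cache), "hits": self.hits, "misses": self.misses}

cache = OutputCache(max_size=2)
cache.put("a", 1); cache.put("b", 2); cache.put("c", 3)  # "a" is evicted
```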

CONReCaLLMethod ¤

CONReCaLLMethod(method_config: dict[str, Any] = None)

Inherits from: BaseMethod

Con-ReCall membership inference method

Parameters:

  • method_config ¤

    (dict[str, Any], default: None ) –

    Method configuration

Methods:

cleanup_model staticmethod ¤

cleanup_model(model: LLM) -> None

Release GPU memory used by a vLLM model

Parameters:

  • model ¤
    (LLM) –

    LLM model to clean up

clear_cache classmethod ¤

clear_cache() -> None

Clear cache and reset statistics

get_cache_stats classmethod ¤

get_cache_stats() -> dict[str, Any]

Get cache statistics

Returns:

get_outputs ¤

get_outputs(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[RequestOutput]

Get model outputs for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[RequestOutput]

    List of model outputs

process_output ¤

process_output(output: RequestOutput, prefix_token_length: int) -> float

Process model output and calculate loss (negative log-likelihood)

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

  • prefix_token_length ¤
    (int) –

    Number of prefix tokens to exclude

Returns:

run ¤

run(
    texts: list[str],
    labels: list[int],
    model: LLM,
    tokenizer: AnyTokenizer,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[float]

Con-ReCall algorithm to calculate scores for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • labels ¤
    (list[int]) –

    List of labels

  • model ¤
    (LLM) –

    LLM model

  • tokenizer ¤
    (AnyTokenizer) –

    Tokenizer

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

set_max_cache_size classmethod ¤

set_max_cache_size(size: int) -> None

Set maximum cache size

Parameters:

  • size ¤
    (int) –

    Maximum cache size (number of entries)

DCPDDMethod ¤

DCPDDMethod(method_config: dict[str, Any] = None)

Inherits from: BaseMethod

DC-PDD membership inference method

Parameters:

  • method_config ¤

    (dict[str, Any], default: None ) –

    Method configuration

Methods:

cleanup_model staticmethod ¤

cleanup_model(model: LLM) -> None

Release GPU memory used by a vLLM model

Parameters:

  • model ¤
    (LLM) –

    LLM model to clean up

clear_cache classmethod ¤

clear_cache() -> None

Clear cache and reset statistics

get_cache_stats classmethod ¤

get_cache_stats() -> dict[str, Any]

Get cache statistics

Returns:

get_outputs ¤

get_outputs(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[RequestOutput]

Get model outputs for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[RequestOutput]

    List of model outputs

process_output ¤

process_output(
    output: RequestOutput, input_ids: list[int], freq_dist: list[int]
) -> float

Process model output and calculate DC-PDD score

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

  • input_ids ¤
    (list[int]) –

    Token IDs of the input text

  • freq_dist ¤
    (list[int]) –

    Token frequency distribution used for calibration

Returns:

run ¤

run(
    texts: list[str],
    model: LLM,
    tokenizer: AnyTokenizer,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[float]

DC-PDD algorithm to calculate scores for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • tokenizer ¤
    (AnyTokenizer) –

    Tokenizer

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

set_max_cache_size classmethod ¤

set_max_cache_size(size: int) -> None

Set maximum cache size

Parameters:

  • size ¤
    (int) –

    Maximum cache size (number of entries)

LossMethod ¤

LossMethod(method_config: dict[str, Any] = None)

Inherits from: BaseMethod

Loss (log-likelihood) based membership inference method

Parameters:

  • method_config ¤

    (dict[str, Any], default: None ) –

    Method configuration

Methods:

cleanup_model staticmethod ¤

cleanup_model(model: LLM) -> None

Release GPU memory used by a vLLM model

Parameters:

  • model ¤
    (LLM) –

    LLM model to clean up

clear_cache classmethod ¤

clear_cache() -> None

Clear cache and reset statistics

get_cache_stats classmethod ¤

get_cache_stats() -> dict[str, Any]

Get cache statistics

Returns:

get_outputs ¤

get_outputs(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[RequestOutput]

Get model outputs for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[RequestOutput]

    List of model outputs

process_output ¤

process_output(output: RequestOutput) -> float

Process model output and calculate loss

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

Returns:

run ¤

run(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[float]

Run inference for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

set_max_cache_size classmethod ¤

set_max_cache_size(size: int) -> None

Set maximum cache size

Parameters:

  • size ¤
    (int) –

    Maximum cache size (number of entries)

LowerMethod ¤

LowerMethod(method_config: dict[str, Any] = None)

Inherits from: BaseMethod

Lower (lowercased-text comparison) membership inference method

Parameters:

  • method_config ¤

    (dict[str, Any], default: None ) –

    Method configuration

Methods:

cleanup_model staticmethod ¤

cleanup_model(model: LLM) -> None

Release GPU memory used by a vLLM model

Parameters:

  • model ¤
    (LLM) –

    LLM model to clean up

clear_cache classmethod ¤

clear_cache() -> None

Clear cache and reset statistics

get_cache_stats classmethod ¤

get_cache_stats() -> dict[str, Any]

Get cache statistics

Returns:

get_outputs ¤

get_outputs(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[RequestOutput]

Get model outputs for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[RequestOutput]

    List of model outputs

process_output ¤

process_output(output: RequestOutput) -> float

Process model output and calculate loss

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

Returns:

  • float

    Negative mean log-likelihood (loss)

run ¤

run(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[float]

Run Lower algorithm and calculate scores for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

set_max_cache_size classmethod ¤

set_max_cache_size(size: int) -> None

Set maximum cache size

Parameters:

  • size ¤
    (int) –

    Maximum cache size (number of entries)

MethodFactory ¤

Method factory class

Methods:

create_method staticmethod ¤

create_method(method_config: dict[str, Any]) -> BaseMethod

Create a method

Parameters:

  • method_config ¤
    (dict[str, Any]) –

    Method configuration:

      - type: Type of method ('loss', 'lower', 'zlib', 'mink', 'pac', 'recall', 'conrecall', 'samia', 'dcpdd', 'ref')
      - params: Method-specific parameters

Returns:

  • BaseMethod

    Created method instance

Raises:

  • ValueError

    If unknown method type is specified

create_methods staticmethod ¤

create_methods(methods_config: list[dict[str, Any]]) -> list[BaseMethod]

Create multiple methods

Parameters:

  • methods_config ¤
    (list[dict[str, Any]]) –

    List of method configurations

Returns:

  • list[BaseMethod]

    List of created methods

MinKMethod ¤

MinKMethod(method_config: dict[str, Any] = None)

Inherits from: BaseMethod

Min-K% Prob based membership inference method

Parameters:

  • method_config ¤

    (dict[str, Any], default: None ) –

    Method configuration:

      - ratio: Ratio of lowest probability tokens to use (0.0-1.0)
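Given per-token log-probabilities, the role of the ratio parameter can be sketched as follows, assuming the score is the mean log-prob over the lowest-probability tokens (an illustration, not the library's exact code):

```python
import math

def mink_score(token_logprobs: list[float], ratio: float) -> float:
    """Mean log-prob of the `ratio` fraction of lowest-probability tokens."""
    k = max(1, math.ceil(ratio * len(token_logprobs)))
    lowest = sorted(token_logprobs)[:k]  # the most "surprising" tokens
    return sum(lowest) / k

score = mink_score([-0.1, -5.0, -0.2, -4.0], ratio=0.5)
```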

Methods:

cleanup_model staticmethod ¤

cleanup_model(model: LLM) -> None

Release GPU memory used by a vLLM model

Parameters:

  • model ¤
    (LLM) –

    LLM model to clean up

clear_cache classmethod ¤

clear_cache() -> None

Clear cache and reset statistics

get_cache_stats classmethod ¤

get_cache_stats() -> dict[str, Any]

Get cache statistics

Returns:

get_outputs ¤

get_outputs(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[RequestOutput]

Get model outputs for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[RequestOutput]

    List of model outputs

process_output ¤

process_output(output: RequestOutput) -> float

Process model output and calculate Min-K% score

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

Returns:

run ¤

run(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[float]

Run inference for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

set_max_cache_size classmethod ¤

set_max_cache_size(size: int) -> None

Set maximum cache size

Parameters:

  • size ¤
    (int) –

    Maximum cache size (number of entries)

PACMethod ¤

PACMethod(method_config: dict[str, Any] = None)

Inherits from: BaseMethod

PAC (Polarized Augment Calibration) based membership inference method

Parameters:

  • method_config ¤

    (dict[str, Any], default: None ) –

    Method configuration

Methods:

cleanup_model staticmethod ¤

cleanup_model(model: LLM) -> None

Release GPU memory used by a vLLM model

Parameters:

  • model ¤
    (LLM) –

    LLM model to clean up

clear_cache classmethod ¤

clear_cache() -> None

Clear cache and reset statistics

get_cache_stats classmethod ¤

get_cache_stats() -> dict[str, Any]

Get cache statistics

Returns:

get_outputs ¤

get_outputs(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[RequestOutput]

Get model outputs for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[RequestOutput]

    List of model outputs

process_output ¤

process_output(output: RequestOutput) -> float

Process model output and calculate Polarized Distance

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

Returns:

  • float

    Polarized Distance

run ¤

run(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[float]

PAC algorithm to calculate scores for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

set_max_cache_size classmethod ¤

set_max_cache_size(size: int) -> None

Set maximum cache size

Parameters:

  • size ¤
    (int) –

    Maximum cache size (number of entries)

ReCaLLMethod ¤

ReCaLLMethod(method_config: dict[str, Any] = None)

Inherits from: BaseMethod

ReCaLL membership inference method

Parameters:

  • method_config ¤

    (dict[str, Any], default: None ) –

    Method configuration

Methods:

cleanup_model staticmethod ¤

cleanup_model(model: LLM) -> None

Release GPU memory used by a vLLM model

Parameters:

  • model ¤
    (LLM) –

    LLM model to clean up

clear_cache classmethod ¤

clear_cache() -> None

Clear cache and reset statistics

get_cache_stats classmethod ¤

get_cache_stats() -> dict[str, Any]

Get cache statistics

Returns:

get_outputs ¤

get_outputs(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[RequestOutput]

Get model outputs for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[RequestOutput]

    List of model outputs

process_output ¤

process_output(output: RequestOutput, prefix_token_length: int) -> float

Process model output and calculate loss (negative log-likelihood)

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

  • prefix_token_length ¤
    (int) –

    Number of prefix tokens to exclude

Returns:

run ¤

run(
    texts: list[str],
    labels: list[int],
    model: LLM,
    tokenizer: AnyTokenizer,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[float]

ReCaLL algorithm to calculate scores for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • labels ¤
    (list[int]) –

    List of labels

  • model ¤
    (LLM) –

    LLM model

  • tokenizer ¤
    (AnyTokenizer) –

    Tokenizer

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

set_max_cache_size classmethod ¤

set_max_cache_size(size: int) -> None

Set maximum cache size

Parameters:

  • size ¤
    (int) –

    Maximum cache size (number of entries)
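ReCaLL is commonly described as the ratio of the conditional log-likelihood (text preceded by a non-member prefix) to the unconditional one. Given per-token log-probs, a sketch (illustrative, not the library's exact code):

```python
def nll(token_logprobs: list[float]) -> float:
    """Negative mean log-likelihood of a token sequence."""
    return -sum(token_logprobs) / len(token_logprobs)

def recall_score(cond_logprobs: list[float], uncond_logprobs: list[float]) -> float:
    """Ratio of conditional NLL (with prefix) to unconditional NLL."""
    return nll(cond_logprobs) / nll(uncond_logprobs)

score = recall_score([-1.0, -1.0], [-2.0, -2.0])
```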

RefMethod ¤

RefMethod(method_config: dict[str, Any] = None)

Inherits from: BaseMethod

Reference model based membership inference method

Parameters:

  • method_config ¤

    (dict[str, Any], default: None ) –

    Method configuration

Methods:

cleanup_model staticmethod ¤

cleanup_model(model: LLM) -> None

Release GPU memory used by a vLLM model

Parameters:

  • model ¤
    (LLM) –

    LLM model to clean up

clear_cache classmethod ¤

clear_cache() -> None

Clear cache and reset statistics

get_cache_stats classmethod ¤

get_cache_stats() -> dict[str, Any]

Get cache statistics

Returns:

get_outputs ¤

get_outputs(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[RequestOutput]

Get model outputs for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[RequestOutput]

    List of model outputs

process_output ¤

process_output(output: RequestOutput) -> float

Process model output and calculate loss

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

Returns:

run ¤

run(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[float]

Ref algorithm to calculate scores for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model (target)

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[float]

    List of Ref scores

set_max_cache_size classmethod ¤

set_max_cache_size(size: int) -> None

Set maximum cache size

Parameters:

  • size ¤
    (int) –

    Maximum cache size (number of entries)
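Reference-based calibration typically compares the target model's loss with a reference model's loss on the same text. A sketch given per-token log-probs from both models (illustrative; the library may use a ratio or other calibration):

```python
def nll(token_logprobs: list[float]) -> float:
    """Negative mean log-likelihood of a token sequence."""
    return -sum(token_logprobs) / len(token_logprobs)

def ref_score(target_logprobs: list[float], ref_logprobs: list[float]) -> float:
    """Target loss minus reference loss; lower suggests membership."""
    return nll(target_logprobs) - nll(ref_logprobs)

score = ref_score([-0.5, -0.5], [-1.5, -1.5])
```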

SaMIAMethod ¤

SaMIAMethod(method_config: dict[str, Any] = None)

Inherits from: BaseMethod

SaMIA membership inference method

Parameters:

  • method_config ¤

    (dict[str, Any], default: None ) –

    Method configuration

Methods:

cleanup_model staticmethod ¤

cleanup_model(model: LLM) -> None

Release GPU memory used by a vLLM model

Parameters:

  • model ¤
    (LLM) –

    LLM model to clean up

clear_cache classmethod ¤

clear_cache() -> None

Clear cache and reset statistics

get_cache_stats classmethod ¤

get_cache_stats() -> dict[str, Any]

Get cache statistics

Returns:

get_outputs ¤

get_outputs(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[RequestOutput]

Get model outputs for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[RequestOutput]

    List of model outputs

process_output ¤

process_output(output: RequestOutput) -> float

Calculate SaMIA score from a single model output

Note: BaseMethod.run calls this method, but SaMIA uses a custom multi-sample implementation, so single-output processing is not supported. Use the run method instead.

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

Returns:

run ¤

run(
    texts: list[str],
    model: LLM,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[float]

SaMIA algorithm to calculate scores for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

set_max_cache_size classmethod ¤

set_max_cache_size(size: int) -> None

Set maximum cache size

Parameters:

  • size ¤
    (int) –

    Maximum cache size (number of entries)

ZlibMethod ¤

ZlibMethod(method_config: dict[str, Any] = None)

Inherits from: BaseMethod

Zlib compression-based membership inference method

Parameters:

  • method_config ¤

    (dict[str, Any], default: None ) –

    Method configuration

Methods:

cleanup_model staticmethod ¤

cleanup_model(model: LLM) -> None

Release GPU memory used by a vLLM model

Parameters:

  • model ¤
    (LLM) –

    LLM model to clean up

clear_cache classmethod ¤

clear_cache() -> None

Clear cache and reset statistics

get_cache_stats classmethod ¤

get_cache_stats() -> dict[str, Any]

Get cache statistics

Returns:

  • dict[str, Any]

    Cache statistics

get_outputs ¤

get_outputs(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[RequestOutput]

Get model outputs for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[RequestOutput]

    List of model outputs

process_output ¤

process_output(output: RequestOutput) -> float

Process model output and calculate zlib-compressed information content ratio

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

Returns:

  • float

    zlib-compressed information content ratio
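
The zlib score relates the model's loss on a text to how compressible the text is. A minimal stdlib sketch, assuming the score is the per-text loss divided by the zlib-compressed byte length (the library's exact normalization may differ):

```python
import zlib

def zlib_score(text: str, loss: float) -> float:
    """Per-text loss normalized by the zlib-compressed byte length
    (hypothetical normalization; the exact formula here may differ)."""
    compressed_len = len(zlib.compress(text.encode("utf-8")))
    return loss / compressed_len

# Highly repetitive (compressible) text yields a shorter compressed form,
# so the same loss produces a larger ratio.
score = zlib_score("the quick brown fox jumps over the lazy dog", loss=3.2)
```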

run ¤

run(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[float]

Run inference for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[float]

    List of scores

set_max_cache_size classmethod ¤

set_max_cache_size(size: int) -> None

Set maximum cache size

Parameters:

  • size ¤
    (int) –

    Maximum cache size (number of entries)

base ¤

Classes:

  • BaseMethod

    Base class for membership inference methods

BaseMethod ¤

BaseMethod(method_name: str, method_config: dict[str, Any] = None)


Base class for membership inference methods

Parameters:

  • method_name ¤
    (str) –

    Name of the method

  • method_config ¤
    (dict[str, Any], default: None ) –

    Method configuration

Methods:

cleanup_model staticmethod ¤
cleanup_model(model: LLM) -> None

Release GPU memory used by a vLLM model

Parameters:

  • model ¤
    (LLM) –

    LLM model to clean up

clear_cache classmethod ¤
clear_cache() -> None

Clear cache and reset statistics

get_cache_stats classmethod ¤
get_cache_stats() -> dict[str, Any]

Get cache statistics

Returns:

  • dict[str, Any]

    Cache statistics

get_outputs ¤
get_outputs(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[RequestOutput]

Get model outputs for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[RequestOutput]

    List of model outputs

process_output abstractmethod ¤
process_output(output: RequestOutput) -> float

Process model output and calculate score

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

Returns:

  • float

    Score

run ¤
run(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[float]

Run inference for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[float]

    List of scores

set_max_cache_size classmethod ¤
set_max_cache_size(size: int) -> None

Set maximum cache size

Parameters:

  • size ¤
    (int) –

    Maximum cache size (number of entries)

conrecall ¤

Classes:

  • CONReCaLLMethod

    Con-ReCall membership inference method

CONReCaLLMethod ¤

CONReCaLLMethod(method_config: dict[str, Any] = None)

Bases: BaseMethod

Con-ReCall membership inference method

Parameters:

  • method_config ¤
    (dict[str, Any], default: None ) –

    Method configuration

Methods:

cleanup_model staticmethod ¤
cleanup_model(model: LLM) -> None

Release GPU memory used by a vLLM model

Parameters:

  • model ¤
    (LLM) –

    LLM model to clean up

clear_cache classmethod ¤
clear_cache() -> None

Clear cache and reset statistics

get_cache_stats classmethod ¤
get_cache_stats() -> dict[str, Any]

Get cache statistics

Returns:

  • dict[str, Any]

    Cache statistics

get_outputs ¤
get_outputs(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[RequestOutput]

Get model outputs for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[RequestOutput]

    List of model outputs

process_output ¤
process_output(output: RequestOutput, prefix_token_length: int) -> float

Process model output and calculate loss (negative log-likelihood)

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

  • prefix_token_length ¤
    (int) –

    Number of prefix tokens to exclude

Returns:

  • float

    Negative log-likelihood (loss)

run ¤
run(
    texts: list[str],
    labels: list[int],
    model: LLM,
    tokenizer: AnyTokenizer,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[float]

CON-ReCaLL algorithm to calculate scores for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • labels ¤
    (list[int]) –

    List of labels

  • model ¤
    (LLM) –

    LLM model

  • tokenizer ¤
    (AnyTokenizer) –

    Tokenizer

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[float]

    List of Con-ReCall scores

set_max_cache_size classmethod ¤
set_max_cache_size(size: int) -> None

Set maximum cache size

Parameters:

  • size ¤
    (int) –

    Maximum cache size (number of entries)

dcpdd ¤

Classes:

  • DCPDDMethod

    DC-PDD membership inference method

DCPDDMethod ¤

DCPDDMethod(method_config: dict[str, Any] = None)

Bases: BaseMethod

DC-PDD membership inference method

Parameters:

  • method_config ¤
    (dict[str, Any], default: None ) –

    Method configuration

Methods:

cleanup_model staticmethod ¤
cleanup_model(model: LLM) -> None

Release GPU memory used by a vLLM model

Parameters:

  • model ¤
    (LLM) –

    LLM model to clean up

clear_cache classmethod ¤
clear_cache() -> None

Clear cache and reset statistics

get_cache_stats classmethod ¤
get_cache_stats() -> dict[str, Any]

Get cache statistics

Returns:

  • dict[str, Any]

    Cache statistics

get_outputs ¤
get_outputs(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[RequestOutput]

Get model outputs for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[RequestOutput]

    List of model outputs

process_output ¤
process_output(
    output: RequestOutput, input_ids: list[int], freq_dist: list[int]
) -> float

Process model output and calculate DC-PDD score

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

  • input_ids ¤
    (list[int]) –

    Input token IDs

  • freq_dist ¤
    (list[int]) –

    Token frequency distribution

Returns:

  • float

    DC-PDD score

run ¤
run(
    texts: list[str],
    model: LLM,
    tokenizer: AnyTokenizer,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[float]

DC-PDD algorithm to calculate scores for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • tokenizer ¤
    (AnyTokenizer) –

    Tokenizer

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[float]

    List of DC-PDD scores

set_max_cache_size classmethod ¤
set_max_cache_size(size: int) -> None

Set maximum cache size

Parameters:

  • size ¤
    (int) –

    Maximum cache size (number of entries)

factory ¤

Classes:

MethodFactory ¤

Method factory class

Methods:

create_method staticmethod ¤
create_method(method_config: dict[str, Any]) -> BaseMethod

Create a method

Parameters:

  • method_config ¤
    (dict[str, Any]) –

    Method configuration

    - type: Type of method ('loss', 'lower', 'zlib', 'mink', 'pac', 'recall', 'conrecall', 'samia', 'dcpdd', 'ref')
    - params: Method-specific parameters

Returns:

  • BaseMethod

    Created method instance

Raises:

  • ValueError

    If unknown method type is specified
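
A method configuration is a plain dict with a type key and optional method-specific params, per the description above. The configs below are illustrative; the keys inside "params" are assumptions, not the library's exact schema:

```python
# Illustrative method configs following the documented 'type' values.
# The contents of "params" (e.g. "ratio") are assumptions for illustration.
loss_config = {"type": "loss"}
mink_config = {"type": "mink", "params": {"ratio": 0.2}}

methods_config = [loss_config, mink_config]

# Hypothetical usage (requires the library to be installed):
# methods = MethodFactory.create_methods(methods_config)
```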

create_methods staticmethod ¤
create_methods(methods_config: list[dict[str, Any]]) -> list[BaseMethod]

Create multiple methods

Parameters:

  • methods_config ¤
    (list[dict[str, Any]]) –

    List of method configurations

Returns:

  • list[BaseMethod]

    List of created methods

loss ¤

Classes:

  • LossMethod

    Loss (log-likelihood) based membership inference method

LossMethod ¤

LossMethod(method_config: dict[str, Any] = None)

Bases: BaseMethod

Loss (log-likelihood) based membership inference method

Parameters:

  • method_config ¤
    (dict[str, Any], default: None ) –

    Method configuration

Methods:

cleanup_model staticmethod ¤
cleanup_model(model: LLM) -> None

Release GPU memory used by a vLLM model

Parameters:

  • model ¤
    (LLM) –

    LLM model to clean up

clear_cache classmethod ¤
clear_cache() -> None

Clear cache and reset statistics

get_cache_stats classmethod ¤
get_cache_stats() -> dict[str, Any]

Get cache statistics

Returns:

  • dict[str, Any]

    Cache statistics

get_outputs ¤
get_outputs(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[RequestOutput]

Get model outputs for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[RequestOutput]

    List of model outputs

process_output ¤
process_output(output: RequestOutput) -> float

Process model output and calculate loss

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

Returns:

  • float

    Loss

run ¤
run(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[float]

Run inference for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[float]

    List of scores

set_max_cache_size classmethod ¤
set_max_cache_size(size: int) -> None

Set maximum cache size

Parameters:

  • size ¤
    (int) –

    Maximum cache size (number of entries)

lower ¤

Classes:

  • LowerMethod

    Lower-based membership inference method

LowerMethod ¤

LowerMethod(method_config: dict[str, Any] = None)

Bases: BaseMethod

Lower-based membership inference method

Parameters:

  • method_config ¤
    (dict[str, Any], default: None ) –

    Method configuration

Methods:

cleanup_model staticmethod ¤
cleanup_model(model: LLM) -> None

Release GPU memory used by a vLLM model

Parameters:

  • model ¤
    (LLM) –

    LLM model to clean up

clear_cache classmethod ¤
clear_cache() -> None

Clear cache and reset statistics

get_cache_stats classmethod ¤
get_cache_stats() -> dict[str, Any]

Get cache statistics

Returns:

  • dict[str, Any]

    Cache statistics

get_outputs ¤
get_outputs(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[RequestOutput]

Get model outputs for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[RequestOutput]

    List of model outputs

process_output ¤
process_output(output: RequestOutput) -> float

Process model output and calculate loss

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

Returns:

  • float

    Negative mean log-likelihood (loss)

run ¤
run(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[float]

Run Lower algorithm and calculate scores for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[float]

    List of scores

set_max_cache_size classmethod ¤
set_max_cache_size(size: int) -> None

Set maximum cache size

Parameters:

  • size ¤
    (int) –

    Maximum cache size (number of entries)

mink ¤

Classes:

  • MinKMethod

    Min-K% Prob based membership inference method

MinKMethod ¤

MinKMethod(method_config: dict[str, Any] = None)

Bases: BaseMethod

Min-K% Prob based membership inference method

Parameters:

  • method_config ¤
    (dict[str, Any], default: None ) –

    Method configuration

    - ratio: Ratio of lowest probability tokens to use (0.0-1.0)

Methods:

cleanup_model staticmethod ¤
cleanup_model(model: LLM) -> None

Release GPU memory used by a vLLM model

Parameters:

  • model ¤
    (LLM) –

    LLM model to clean up

clear_cache classmethod ¤
clear_cache() -> None

Clear cache and reset statistics

get_cache_stats classmethod ¤
get_cache_stats() -> dict[str, Any]

Get cache statistics

Returns:

  • dict[str, Any]

    Cache statistics

get_outputs ¤
get_outputs(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[RequestOutput]

Get model outputs for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[RequestOutput]

    List of model outputs

process_output ¤
process_output(output: RequestOutput) -> float

Process model output and calculate Min-K% score

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

Returns:

  • float

    Min-K% score

run ¤
run(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[float]

Run inference for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[float]

    List of scores

set_max_cache_size classmethod ¤
set_max_cache_size(size: int) -> None

Set maximum cache size

Parameters:

  • size ¤
    (int) –

    Maximum cache size (number of entries)
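
Given per-token log-probabilities, Min-K% averages the lowest ratio fraction of them. A self-contained sketch of that computation (standalone, not the class itself):

```python
def min_k_score(token_logprobs: list[float], ratio: float = 0.2) -> float:
    """Average log-probability of the lowest `ratio` fraction of tokens."""
    k = max(1, int(len(token_logprobs) * ratio))
    lowest = sorted(token_logprobs)[:k]
    return sum(lowest) / k

# Higher (less negative) scores suggest the text is more familiar to the model.
score = min_k_score([-0.1, -0.2, -5.0, -0.3, -4.0], ratio=0.4)
```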

pac ¤

Classes:

  • PACMethod

    PAC (Polarized Augment Calibration) based membership inference method

PACMethod ¤

PACMethod(method_config: dict[str, Any] = None)

Bases: BaseMethod

PAC (Polarized Augment Calibration) based membership inference method

Parameters:

  • method_config ¤
    (dict[str, Any], default: None ) –

    Method configuration

Methods:

cleanup_model staticmethod ¤
cleanup_model(model: LLM) -> None

Release GPU memory used by a vLLM model

Parameters:

  • model ¤
    (LLM) –

    LLM model to clean up

clear_cache classmethod ¤
clear_cache() -> None

Clear cache and reset statistics

get_cache_stats classmethod ¤
get_cache_stats() -> dict[str, Any]

Get cache statistics

Returns:

  • dict[str, Any]

    Cache statistics

get_outputs ¤
get_outputs(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[RequestOutput]

Get model outputs for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[RequestOutput]

    List of model outputs

process_output ¤
process_output(output: RequestOutput) -> float

Process model output and calculate Polarized Distance

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

Returns:

  • float

    Polarized Distance

run ¤
run(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[float]

PAC algorithm to calculate scores for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[float]

    List of PAC scores

set_max_cache_size classmethod ¤
set_max_cache_size(size: int) -> None

Set maximum cache size

Parameters:

  • size ¤
    (int) –

    Maximum cache size (number of entries)

prefix_utils ¤

Functions:

  • compute_prefix_loss

    Compute negative mean log-likelihood excluding prefix tokens.

  • extract_prefix

    Randomly select num_shots texts from the list without modifying the original.

  • process_prefix

    Process prefix to fit within model's max length.

compute_prefix_loss ¤

compute_prefix_loss(output: RequestOutput, prefix_token_length: int) -> float

Compute negative mean log-likelihood excluding prefix tokens.

Shared by ReCaLL and CON-ReCaLL methods.

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

  • prefix_token_length ¤
    (int) –

    Number of prefix tokens to exclude

Returns:

  • float

    Negative mean log-likelihood (loss)
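
Given per-token log-probabilities and the prefix length, the computation above reduces to a mean over the non-prefix tokens. A stdlib sketch (operating on a plain list rather than a RequestOutput):

```python
def prefix_loss(token_logprobs: list[float], prefix_token_length: int) -> float:
    """Negative mean log-likelihood over tokens after the prefix."""
    target = token_logprobs[prefix_token_length:]
    return -sum(target) / len(target)

# Only the last two tokens count toward the loss here.
loss = prefix_loss([-0.5, -0.4, -2.0, -1.0], prefix_token_length=2)
```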

extract_prefix ¤

extract_prefix(texts: list[str], num_shots: int) -> list[str]

Randomly select num_shots texts from the list without modifying the original.

Parameters:

  • texts ¤
    (list[str]) –

    List of texts to sample from

  • num_shots ¤
    (int) –

    Number of texts to select

Returns:

  • list[str]

    List of randomly selected texts

process_prefix ¤

process_prefix(
    model: LLM,
    tokenizer: AnyTokenizer,
    prefix: list[str],
    avg_length: int,
    pass_window: bool,
    num_shots: int,
) -> tuple[list[str], int]

Process prefix to fit within model's max length.

Parameters:

  • model ¤
    (LLM) –

    LLM model

  • tokenizer ¤
    (AnyTokenizer) –

    Tokenizer

  • prefix ¤
    (list[str]) –

    List of prefix texts

  • avg_length ¤
    (int) –

    Average token length of texts

  • pass_window ¤
    (bool) –

    If True, skip window check

  • num_shots ¤
    (int) –

    Number of shots

Returns:

  • tuple[list[str], int]

    Tuple of (processed prefix, actual number of shots)

recall ¤

Classes:

  • ReCaLLMethod

    ReCaLL membership inference method

ReCaLLMethod ¤

ReCaLLMethod(method_config: dict[str, Any] = None)

Bases: BaseMethod

ReCaLL membership inference method

Parameters:

  • method_config ¤
    (dict[str, Any], default: None ) –

    Method configuration

Methods:

cleanup_model staticmethod ¤
cleanup_model(model: LLM) -> None

Release GPU memory used by a vLLM model

Parameters:

  • model ¤
    (LLM) –

    LLM model to clean up

clear_cache classmethod ¤
clear_cache() -> None

Clear cache and reset statistics

get_cache_stats classmethod ¤
get_cache_stats() -> dict[str, Any]

Get cache statistics

Returns:

  • dict[str, Any]

    Cache statistics

get_outputs ¤
get_outputs(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[RequestOutput]

Get model outputs for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[RequestOutput]

    List of model outputs

process_output ¤
process_output(output: RequestOutput, prefix_token_length: int) -> float

Process model output and calculate loss (negative log-likelihood)

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

  • prefix_token_length ¤
    (int) –

    Number of prefix tokens to exclude

Returns:

  • float

    Negative log-likelihood (loss)

run ¤
run(
    texts: list[str],
    labels: list[int],
    model: LLM,
    tokenizer: AnyTokenizer,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[float]

ReCaLL algorithm to calculate scores for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • labels ¤
    (list[int]) –

    List of labels

  • model ¤
    (LLM) –

    LLM model

  • tokenizer ¤
    (AnyTokenizer) –

    Tokenizer

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[float]

    List of ReCaLL scores

set_max_cache_size classmethod ¤
set_max_cache_size(size: int) -> None

Set maximum cache size

Parameters:

  • size ¤
    (int) –

    Maximum cache size (number of entries)
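
ReCaLL scores are commonly defined as the ratio of the conditional loss (text given a non-member prefix) to the unconditional loss. A sketch under that assumption, using losses as produced by a helper like compute_prefix_loss:

```python
def recall_score(conditional_loss: float, unconditional_loss: float) -> float:
    """ReCaLL score: conditional NLL (given prefix) over unconditional NLL.
    This ratio form is an assumption about the scoring, for illustration."""
    return conditional_loss / unconditional_loss

# Member texts tend to see a larger relative loss drop when prefixed,
# so lower scores suggest membership under this sketch.
score = recall_score(conditional_loss=1.2, unconditional_loss=2.0)
```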

ref ¤

Classes:

  • RefMethod

    Reference model based membership inference method

RefMethod ¤

RefMethod(method_config: dict[str, Any] = None)

Bases: BaseMethod

Reference model based membership inference method

Parameters:

  • method_config ¤
    (dict[str, Any], default: None ) –

    Method configuration

Methods:

cleanup_model staticmethod ¤
cleanup_model(model: LLM) -> None

Release GPU memory used by a vLLM model

Parameters:

  • model ¤
    (LLM) –

    LLM model to clean up

clear_cache classmethod ¤
clear_cache() -> None

Clear cache and reset statistics

get_cache_stats classmethod ¤
get_cache_stats() -> dict[str, Any]

Get cache statistics

Returns:

  • dict[str, Any]

    Cache statistics

get_outputs ¤
get_outputs(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[RequestOutput]

Get model outputs for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[RequestOutput]

    List of model outputs

process_output ¤
process_output(output: RequestOutput) -> float

Process model output and calculate loss

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

Returns:

  • float

    Loss

run ¤
run(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[float]

Ref algorithm to calculate scores for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model (target)

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[float]

    List of Ref scores

set_max_cache_size classmethod ¤
set_max_cache_size(size: int) -> None

Set maximum cache size

Parameters:

  • size ¤
    (int) –

    Maximum cache size (number of entries)

samia ¤

Classes:

  • SaMIAMethod

    SaMIA membership inference method

Functions:

  • get_suffix

    Extracts a suffix from the given text, based on the specified prefix ratio and text length.

  • ngrams

    Generates n-grams from a sequence.

  • rouge_n

    Calculates the ROUGE-N score between a candidate and a reference.

SaMIAMethod ¤

SaMIAMethod(method_config: dict[str, Any] = None)

Bases: BaseMethod

SaMIA membership inference method

Parameters:

  • method_config ¤
    (dict[str, Any], default: None ) –

    Method configuration

Methods:

cleanup_model staticmethod ¤
cleanup_model(model: LLM) -> None

Release GPU memory used by a vLLM model

Parameters:

  • model ¤
    (LLM) –

    LLM model to clean up

clear_cache classmethod ¤
clear_cache() -> None

Clear cache and reset statistics

get_cache_stats classmethod ¤
get_cache_stats() -> dict[str, Any]

Get cache statistics

Returns:

  • dict[str, Any]

    Cache statistics

get_outputs ¤
get_outputs(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[RequestOutput]

Get model outputs for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[RequestOutput]

    List of model outputs

process_output ¤
process_output(output: RequestOutput) -> float

Calculate SaMIA score from a single model output.

Note: This method is called from BaseMethod.run, but SaMIA uses a custom implementation over multiple samples, so it is not supported for a single output. Use the run method instead.

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

Returns:

run ¤
run(
    texts: list[str],
    model: LLM,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[float]

SaMIA algorithm to calculate scores for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[float]

    List of SaMIA scores

set_max_cache_size classmethod ¤
set_max_cache_size(size: int) -> None

Set maximum cache size

Parameters:

  • size ¤
    (int) –

    Maximum cache size (number of entries)

get_suffix ¤

get_suffix(text: str, prefix_ratio: float, text_length: int) -> list

Extracts a suffix from the given text, based on the specified prefix ratio and text length.
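
A stdlib sketch of this split, under the assumption that the cut point is word-based at prefix_ratio * text_length (the function name and return shape here are illustrative):

```python
def split_prefix_suffix(text: str, prefix_ratio: float, text_length: int):
    """Split `text` into (prefix string, suffix word list) at a word boundary
    given by prefix_ratio * text_length. Illustrative sketch of get_suffix."""
    words = text.split()[:text_length]
    cut = int(text_length * prefix_ratio)
    return " ".join(words[:cut]), words[cut:]

prefix, suffix = split_prefix_suffix("a b c d e f g h", 0.5, 8)
```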

ngrams ¤

ngrams(sequence: str, n: int) -> zip

Generates n-grams from a sequence.

rouge_n ¤

rouge_n(candidate: list, reference: list, n: int = 1) -> float

Calculates the ROUGE-N score between a candidate and a reference.
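
Together, ngrams and rouge_n amount to measuring the recall of reference n-grams in the candidate. A minimal stdlib version (function names here are illustrative, not the module's exact implementation):

```python
from collections import Counter

def make_ngrams(tokens: list, n: int):
    """Generate n-grams from a token sequence as tuples."""
    return zip(*(tokens[i:] for i in range(n)))

def rouge_n_recall(candidate: list, reference: list, n: int = 1) -> float:
    """Fraction of reference n-grams that also occur in the candidate."""
    cand = Counter(make_ngrams(candidate, n))
    ref = Counter(make_ngrams(reference, n))
    overlap = sum((cand & ref).values())  # clipped n-gram overlap
    total = sum(ref.values())
    return overlap / total if total else 0.0

score = rouge_n_recall("the cat sat".split(), "the cat ran".split(), n=1)
```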

zlib ¤

Classes:

  • ZlibMethod

    Zlib compression-based membership inference method

ZlibMethod ¤

ZlibMethod(method_config: dict[str, Any] = None)

Bases: BaseMethod

Zlib compression-based membership inference method

Parameters:

  • method_config ¤
    (dict[str, Any], default: None ) –

    Method configuration

Methods:

cleanup_model staticmethod ¤
cleanup_model(model: LLM) -> None

Release GPU memory used by a vLLM model

Parameters:

  • model ¤
    (LLM) –

    LLM model to clean up

clear_cache classmethod ¤
clear_cache() -> None

Clear cache and reset statistics

get_cache_stats classmethod ¤
get_cache_stats() -> dict[str, Any]

Get cache statistics

Returns:

  • dict[str, Any]

    Cache statistics

get_outputs ¤
get_outputs(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[RequestOutput]

Get model outputs for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[RequestOutput]

    List of model outputs

process_output ¤
process_output(output: RequestOutput) -> float

Process model output and calculate zlib-compressed information content ratio

Parameters:

  • output ¤
    (RequestOutput) –

    Model output

Returns:

  • float

    zlib-compressed information content ratio

run ¤
run(
    texts: list[str],
    model: LLM,
    sampling_params: SamplingParams,
    lora_request: LoRARequest = None,
    data_config: dict[str, Any] = None,
) -> list[float]

Run inference for a list of texts

Parameters:

  • texts ¤
    (list[str]) –

    List of texts

  • model ¤
    (LLM) –

    LLM model

  • sampling_params ¤
    (SamplingParams) –

    Sampling parameters

  • lora_request ¤
    (LoRARequest, default: None ) –

    LoRA request

  • data_config ¤
    (dict[str, Any], default: None ) –

    Data configuration

Returns:

  • list[float]

    List of scores

set_max_cache_size classmethod ¤
set_max_cache_size(size: int) -> None

Set maximum cache size

Parameters:

  • size ¤
    (int) –

    Maximum cache size (number of entries)