blockingpy.gpu_faiss_blocker.GPUFaissBlocker

class blockingpy.gpu_faiss_blocker.GPUFaissBlocker

A class for performing blocking with GPU-accelerated FAISS (Facebook AI Similarity Search) algorithms.

Parameters:

None

index

The FAISS index used for nearest neighbor search

Type:

faiss.Index (gpu)

x_columns

Column names of the reference dataset

Type:

array-like or None

METRIC_MAP

Mapping of distance metric names to FAISS metric types

Type:

dict

See also

BlockingMethod

Abstract base class defining the blocking interface

Notes

The available FAISS index types are: ‘flat’, ‘ivf’, ‘ivfpq’ and ‘cagra’.

For more details about the FAISS library and implementation, see: https://github.com/facebookresearch/faiss
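The four index type names listed above can be checked before any GPU work starts. The following is a minimal sketch, not part of blockingpy: the helper name normalize_index_type and the constant SUPPORTED_INDEX_TYPES are hypothetical, and how the index type is actually passed to the blocker (presumably through controls) is not specified on this page.

```python
# Index type names taken from the Notes section; the tuple itself is a
# hypothetical constant, not an attribute of GPUFaissBlocker.
SUPPORTED_INDEX_TYPES = ("flat", "ivf", "ivfpq", "cagra")


def normalize_index_type(name):
    """Return the lower-cased index type, or raise ValueError if unknown."""
    key = name.strip().lower()
    if key not in SUPPORTED_INDEX_TYPES:
        allowed = ", ".join(SUPPORTED_INDEX_TYPES)
        raise ValueError(f"Unknown index type {name!r}; expected one of: {allowed}")
    return key
```

Validating the name up front gives a readable error message instead of a failure deep inside index construction.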

__init__()

Methods

__init__()

block(x, y, k, verbose, controls)

Perform blocking using the GPU FAISS algorithms.

block(x, y, k, verbose, controls)

Perform blocking using the GPU FAISS algorithms.

Parameters:
  • x (DataHandler) – Reference dataset containing features for indexing

  • y (DataHandler) – Query dataset to find nearest neighbors for

  • k (int) – Number of nearest neighbors to find

  • verbose (bool, optional) – If True, print detailed progress information

  • controls (dict) – Algorithm control parameters.

Returns:

DataFrame containing the blocking results, with columns:

  • ‘y’ – indices from the query dataset

  • ‘x’ – indices of matched items from the reference dataset

  • ‘dist’ – distances to the matched items

Return type:

pandas.DataFrame
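To make the result layout concrete, here is a small CPU-only sketch that produces records with the same three column names for the simplest case of a single nearest neighbor per query row. It uses an exact Euclidean search from the standard library rather than FAISS; the function name nearest_pairs and the toy data are hypothetical, and the distance values the real method reports depend on the metric and index it is configured with.

```python
import math


def nearest_pairs(x_rows, y_rows):
    """Return one {'y', 'x', 'dist'} record per query row.

    Exact brute-force search on the CPU. This only illustrates the
    documented column layout; it is not how GPUFaissBlocker works.
    """
    records = []
    for y_idx, y_vec in enumerate(y_rows):
        # Pick the reference row with the smallest Euclidean distance.
        x_idx, dist = min(
            ((i, math.dist(y_vec, x_vec)) for i, x_vec in enumerate(x_rows)),
            key=lambda pair: pair[1],
        )
        records.append({"y": y_idx, "x": x_idx, "dist": dist})
    return records


# Toy data: three reference points (x) and two query points (y).
reference = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
queries = [[0.9, 0.0], [4.0, 5.0]]
rows = nearest_pairs(reference, queries)
```

Passing rows to pandas.DataFrame would give a table with the columns ‘y’, ‘x’ and ‘dist’, one row per query.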