blockingpy.text_encoders.text_transformer.TextTransformer
- class blockingpy.text_encoders.text_transformer.TextTransformer(**control_txt)
Facade for selecting a concrete TextEncoder based on a control dictionary.
- Parameters:
**control_txt – Configuration mapping. Must contain the key encoder, set to one of the registry keys ('shingle' or 'embedding'). Additional sub‑mappings with the same names may provide encoder‑specific keyword arguments.
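The facade pattern described above can be sketched without the library. This is a minimal illustration, not blockingpy's implementation: the stand-in classes, the `REGISTRY` dict, the `build_encoder` helper, and the keyword names `n` and `dim` are all hypothetical. Only the `encoder` key, the two registry keys, and the idea of same-named sub-mappings come from the description.

```python
from typing import Any, Dict, List


class ShingleStandIn:
    """Hypothetical stand-in for a shingle encoder (not blockingpy's class)."""

    def __init__(self, n: int = 2) -> None:
        self.n = n  # assumed keyword argument, for illustration only

    def transform(self, texts: List[str]) -> List[List[str]]:
        # Split each string into overlapping character n-grams.
        return [
            [t[i:i + self.n] for i in range(len(t) - self.n + 1)]
            for t in texts
        ]


class EmbeddingStandIn:
    """Hypothetical stand-in for an embedding encoder (no real model)."""

    def __init__(self, dim: int = 4) -> None:
        self.dim = dim  # assumed keyword argument, for illustration only

    def transform(self, texts: List[str]) -> List[List[float]]:
        # Toy vector: the string length repeated `dim` times.
        return [[float(len(t))] * self.dim for t in texts]


# Registry keys mirror the two documented values of `encoder`.
REGISTRY: Dict[str, Any] = {
    "shingle": ShingleStandIn,
    "embedding": EmbeddingStandIn,
}


def build_encoder(**control_txt: Any) -> Any:
    """Pick a concrete encoder from a control mapping (hypothetical helper)."""
    name = control_txt.get("encoder")
    if name not in REGISTRY:
        raise ValueError(f"'encoder' must be one of {sorted(REGISTRY)}")
    # The sub-mapping named after the encoder carries its keyword arguments.
    kwargs = control_txt.get(name, {})
    return REGISTRY[name](**kwargs)


encoder = build_encoder(encoder="shingle", shingle={"n": 2})
shingles = encoder.transform(["abc"])
print(shingles)
```

Keeping the encoder-specific options in a sub-mapping named after the encoder means one configuration dictionary can hold settings for both encoders, with only the selected one being read.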
Methods
__init__(**control_txt)
fit(X[, y]) – Learn stateful parameters from X.
fit_transform(X[, y]) – Fit the encoder on X and return the transformed matrix.
transform(X) – Convert raw strings into a numeric feature matrix.
- fit(X, y=None)
Learn stateful parameters from X.
The default implementation is a no-op that returns self; override in subclasses that need to build a vocabulary or train a model.
- Parameters:
X – Series of input strings to learn from.
y – Ignored. Present for scikit-learn API compatibility.
- Returns:
self, to allow method chaining.
- Return type:
TextEncoder
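The fit() contract (a no-op that returns self unless a subclass overrides it) and the fit()-then-transform() behaviour of fit_transform() can be sketched as follows. Both classes are hypothetical illustrations, not blockingpy code, and they assume plain Python lists where the library works with a Series and returns a matrix with column names.

```python
from typing import Dict, List, Optional


class BaseEncoderSketch:
    """Hypothetical base class mirroring the documented fit() contract."""

    def fit(self, X: List[str], y: Optional[object] = None) -> "BaseEncoderSketch":
        # Default: nothing to learn; return self so calls can be chained.
        return self

    def transform(self, X: List[str]) -> List[List[int]]:
        raise NotImplementedError

    def fit_transform(self, X: List[str], y: Optional[object] = None) -> List[List[int]]:
        # Equivalent to calling fit() followed by transform().
        return self.fit(X, y).transform(X)


class VocabEncoderSketch(BaseEncoderSketch):
    """Hypothetical subclass that overrides fit() to build a vocabulary."""

    def __init__(self) -> None:
        self.vocab_: Dict[str, int] = {}

    def fit(self, X: List[str], y: Optional[object] = None) -> "VocabEncoderSketch":
        # Assign each distinct whitespace token a column index.
        tokens = sorted({tok for text in X for tok in text.split()})
        self.vocab_ = {tok: i for i, tok in enumerate(tokens)}
        return self

    def transform(self, X: List[str]) -> List[List[int]]:
        # Count token occurrences per input string; unknown tokens are skipped.
        rows = []
        for text in X:
            row = [0] * len(self.vocab_)
            for tok in text.split():
                if tok in self.vocab_:
                    row[self.vocab_[tok]] += 1
            rows.append(row)
        return rows


docs = ["red apple", "green apple"]
chained = VocabEncoderSketch().fit(docs).transform(docs)
combined = VocabEncoderSketch().fit_transform(docs)
print(chained == combined)
```

Returning self from fit() is what makes the chained form `fit(docs).transform(docs)` possible, which in turn lets fit_transform() be written as a single expression.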
- fit_transform(X, y=None)
Fit the encoder on X and return the transformed matrix.
Equivalent to calling fit() followed by transform().
- Parameters:
X – Series of input strings.
y – Ignored.
- Returns:
The encoded feature matrix together with its column names.
- Return type: