API Reference

Python API for programmatic use. All public classes are importable from their respective subpackages.

Configuration — `xlmtec.core.config`

`ConfigBuilder`

Fluent builder for constructing a validated `PipelineConfig`.

from xlmtec.core.config import ConfigBuilder
from xlmtec.core.types import TrainingMethod, DatasetSource

config = (
    ConfigBuilder()
    .with_model("gpt2", torch_dtype="float32")
    .with_dataset("./data.jsonl", source=DatasetSource.LOCAL_FILE, max_samples=1000)
    .with_tokenization(max_length=512)
    .with_training(TrainingMethod.LORA, "./output", num_epochs=3, batch_size=4)
    .with_lora(r=8, lora_alpha=32, lora_dropout=0.1)
    .build()
)

| Method | Key kwargs | Description |
| --- | --- | --- |
| `.with_model(name, **kwargs)` | `torch_dtype`, `load_in_4bit`, `load_in_8bit` | Set model config |
| `.with_dataset(path, source, **kwargs)` | `max_samples`, `text_columns`, `shuffle` | Set dataset config |
| `.with_tokenization(**kwargs)` | `max_length`, `truncation`, `padding` | Set tokenization config |
| `.with_training(method, output_dir, **kwargs)` | `num_epochs`, `batch_size`, `learning_rate`, `fp16` | Set training config |
| `.with_lora(**kwargs)` | `r`, `lora_alpha`, `lora_dropout`, `target_modules` | Set LoRA config |
| `.with_evaluation(metrics, **kwargs)` | `batch_size`, `num_samples` | Set evaluation config |
| `.build()` | (none) | Validate and return `PipelineConfig` |
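
The chaining-then-validate flow above can be illustrated with a small stand-in. This is only a sketch of the fluent builder pattern, not xlmtec's implementation: `SketchBuilder`, its plain-dict sections, and the rule that model, dataset, and training sections are required are all assumptions made for illustration.

```python
from typing import Any, Dict


class SketchBuilder:
    """Hypothetical stand-in for a fluent config builder (not xlmtec's class)."""

    REQUIRED = ("model", "dataset", "training")  # assumed required sections

    def __init__(self) -> None:
        self._sections: Dict[str, Dict[str, Any]] = {}

    def with_model(self, name: str, **kwargs: Any) -> "SketchBuilder":
        self._sections["model"] = {"name": name, **kwargs}
        return self  # returning self is what makes the calls chainable

    def with_dataset(self, path: str, **kwargs: Any) -> "SketchBuilder":
        self._sections["dataset"] = {"path": path, **kwargs}
        return self

    def with_training(self, method: str, output_dir: str, **kwargs: Any) -> "SketchBuilder":
        self._sections["training"] = {"method": method, "output_dir": output_dir, **kwargs}
        return self

    def build(self) -> Dict[str, Dict[str, Any]]:
        # Validation is deferred to build(), so sections can be set in any order.
        missing = [s for s in self.REQUIRED if s not in self._sections]
        if missing:
            raise ValueError(f"missing config sections: {missing}")
        return dict(self._sections)


config_sketch = (
    SketchBuilder()
    .with_model("gpt2", torch_dtype="float32")
    .with_dataset("./data.jsonl", max_samples=1000)
    .with_training("lora", "./output", num_epochs=3)
    .build()
)
```

Deferring validation to `build()` means a half-configured builder fails in one place with one message, rather than at whichever `with_*` call happened to come first.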

`PipelineConfig`

Pydantic model holding the full pipeline config. Supports JSON and YAML I/O.

from pathlib import Path
from xlmtec.core.config import PipelineConfig  # module per this section's heading

config = PipelineConfig.from_yaml(Path("config.yaml"))
config = PipelineConfig.from_json(Path("config.json"))
config.to_yaml(Path("config.yaml"))

Data Pipeline — `xlmtec.data`

`quick_load`

One-liner for loading and tokenizing a dataset.

from xlmtec.data import quick_load
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
dataset = quick_load("./data.jsonl", tokenizer, max_samples=500, max_length=512)
# Returns: datasets.Dataset with input_ids, attention_mask, labels

`prepare_dataset`

Full pipeline with optional train/validation split.

from xlmtec.data import prepare_dataset

result = prepare_dataset(
    dataset_config=config.dataset.to_config(),
    tokenization_config=config.tokenization.to_config(),
    tokenizer=tokenizer,
    split_for_validation=True,
    validation_ratio=0.1,
)
# result["train"], result["validation"]
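
To make `validation_ratio` concrete, here is a rough sketch of a shuffled hold-out split over a plain list. It is an illustration only: `split_for_validation_sketch` and its `seed` parameter are hypothetical, and the text above does not specify how `prepare_dataset` shuffles or rounds.

```python
import random
from typing import Dict, List, Sequence, TypeVar

T = TypeVar("T")


def split_for_validation_sketch(
    rows: Sequence[T], validation_ratio: float = 0.1, seed: int = 0
) -> Dict[str, List[T]]:
    """Hypothetical helper: shuffle rows, then hold out a fraction for validation."""
    if not 0.0 < validation_ratio < 1.0:
        raise ValueError("validation_ratio must be strictly between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_val = max(1, int(len(shuffled) * validation_ratio))  # assumed: at least one row
    return {"train": shuffled[n_val:], "validation": shuffled[:n_val]}


split = split_for_validation_sketch(list(range(100)), validation_ratio=0.1)
```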

Model Loading — `xlmtec.models.loader`

from xlmtec.models.loader import load_model_and_tokenizer

model, tokenizer = load_model_and_tokenizer(config.model.to_config())

`load_model_and_tokenizer` handles device mapping, 4-bit/8-bit quantization, and `pad_token` setup automatically.

Trainers — `xlmtec.trainers`

`TrainerFactory.train` (recommended)

Single entry point that selects the trainer from the configured `TrainingMethod`.

from xlmtec.trainers import TrainerFactory

result = TrainerFactory.train(
    model=model,
    tokenizer=tokenizer,
    dataset=dataset,
    training_config=config.training.to_config(),
    lora_config=config.lora.to_config(),          # required for lora / qlora / instruction
    distillation_config=distillation_config,       # required for vanilla_distillation
    feature_distillation_config=fd_config,         # required for feature_distillation
)
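
The "which config does each method need" rule in the comments above can be sketched as a lookup table plus an early check. Everything here is hypothetical: the `REQUIRED_CONFIG` table is inferred only from those inline comments, the method strings are assumed spellings of the `TrainingMethod` values, and `check_required_configs` is not an xlmtec function.

```python
from typing import Any, Dict, Optional

# Assumed mapping, taken from the inline comments in the TrainerFactory.train call.
REQUIRED_CONFIG: Dict[str, str] = {
    "lora": "lora_config",
    "qlora": "lora_config",
    "instruction": "lora_config",
    "vanilla_distillation": "distillation_config",
    "feature_distillation": "feature_distillation_config",
}


def check_required_configs(method: str, **configs: Optional[Any]) -> None:
    """Hypothetical pre-flight check: fail early if a method's config is missing."""
    needed = REQUIRED_CONFIG.get(method)  # None means no extra config is needed
    if needed is not None and configs.get(needed) is None:
        raise ValueError(f"method {method!r} requires {needed}")


check_required_configs("lora", lora_config={"r": 8})  # passes silently
```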

`LoRATrainer` / `QLoRATrainer` / `FullFineTuner` / `InstructionTrainer`

from xlmtec.trainers import LoRATrainer

trainer = LoRATrainer(model, tokenizer, training_config, lora_config)
result = trainer.train(dataset)

`DPOTrainer`

Requires `pip install "trl>=0.7.0"` (quote the specifier so the shell does not treat `>` as a redirect). The dataset must have `prompt`, `chosen`, and `rejected` columns.

from xlmtec.trainers import DPOTrainer, validate_dpo_dataset

validate_dpo_dataset(dataset)   # raises ValueError if columns are missing
trainer = DPOTrainer(model, tokenizer, training_config, lora_config, beta=0.1)
result = trainer.train(dataset)

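`beta` scales the implicit KL penalty that ties the policy to the reference model. In the standard DPO objective (and in TRL, which this trainer requires), a higher `beta` (0.3–0.5) keeps the policy closer to the reference, while a lower `beta` (0.05–0.1) lets the preference data move it further.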
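
For reference, the per-pair DPO objective can be written out directly to show where `beta` enters. This sketch uses the published DPO loss, -log σ(β · margin), on precomputed sequence log-probabilities; `dpo_pair_loss` and its argument names are illustrative and say nothing about how `DPOTrainer` is implemented internally.

```python
import math


def dpo_pair_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """DPO loss for one (chosen, rejected) pair of sequence log-probabilities."""
    # How much more the policy favors each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)), written as log(1 + exp(-margin)).
    return math.log1p(math.exp(-margin))


# A policy identical to the reference has zero margin, so the loss is log(2).
loss_at_start = dpo_pair_loss(-12.0, -15.0, -12.0, -15.0, beta=0.1)
```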

`ResponseDistillationTrainer`

Student learns to match the output distribution of a larger teacher model (KL divergence + cross-entropy loss).

from xlmtec.core.types import DistillationConfig
from xlmtec.trainers import ResponseDistillationTrainer

distillation_config = DistillationConfig(
    teacher_model_name="gpt2-medium",
    temperature=2.0,     # higher = softer teacher distribution
    alpha=0.5,           # blend: alpha×KL + (1-alpha)×CE
)
trainer = ResponseDistillationTrainer(
    model, tokenizer, training_config, distillation_config
)
result = trainer.train(dataset)
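
The `alpha×KL + (1-alpha)×CE` blend can be worked through for a single token position in plain Python. This is a sketch under assumptions, not the trainer's code: it applies `temperature` to both distributions in the KL term only, and it omits the temperature-squared scaling that some distillation implementations add, since the formula above does not mention it.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    student_logits: List[float],
    teacher_logits: List[float],
    label: int,
    temperature: float = 2.0,
    alpha: float = 0.5,
) -> float:
    """alpha * KL(teacher || student) + (1 - alpha) * CE for one token position."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Cross-entropy against the hard label uses the unscaled student distribution.
    ce = -math.log(softmax(student_logits)[label])
    return alpha * kl + (1 - alpha) * ce


loss = distillation_loss([2.0, 0.5, -1.0], [3.0, 0.2, -2.0], label=0)
```

With `alpha=1.0` the hard labels are ignored entirely, and a student whose logits equal the teacher's has zero loss.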

`FeatureDistillationTrainer`

Extends response distillation with MSE loss on intermediate hidden states for stronger layer-level supervision.

from xlmtec.core.types import FeatureDistillationConfig
from xlmtec.trainers import FeatureDistillationTrainer

fd_config = FeatureDistillationConfig(
    teacher_model_name="gpt2-medium",
    temperature=2.0,
    alpha=0.5,           # KL divergence weight
    beta=0.3,            # hidden-state MSE weight
    feature_layers=None, # None = auto-select 4 evenly-spaced layers
)
trainer = FeatureDistillationTrainer(
    model, tokenizer, training_config, fd_config
)
result = trainer.train(dataset)

`feature_layers` accepts a list of student layer indices to supervise, e.g. `[0, 4, 8, 11]`. Each student layer is mapped to the teacher layer at the proportionally corresponding depth. Pass `None` for automatic selection.
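
One plausible reading of "evenly-spaced" selection and "proportional" mapping is sketched below. Both helper functions are hypothetical, and the rounding rule is an assumption; the library may pick slightly different indices.

```python
from typing import List


def evenly_spaced_layers(num_layers: int, count: int = 4) -> List[int]:
    """Pick `count` indices spread from the first layer to the last."""
    if count >= num_layers:
        return list(range(num_layers))
    if count == 1:
        return [num_layers - 1]
    step = (num_layers - 1) / (count - 1)
    return [round(i * step) for i in range(count)]


def map_to_teacher(student_layer: int, num_student: int, num_teacher: int) -> int:
    """Map a student layer to the teacher layer at the same relative depth."""
    if num_student == 1:
        return num_teacher - 1
    return round(student_layer * (num_teacher - 1) / (num_student - 1))


# gpt2 has 12 transformer layers, gpt2-medium has 24.
student_layers = evenly_spaced_layers(12)
teacher_layers = [map_to_teacher(i, 12, 24) for i in student_layers]
```

Under these assumptions a 12-layer student supervised by a 24-layer teacher gets student layers `[0, 4, 7, 11]` paired with teacher layers `[0, 8, 15, 23]`, so the first and last layers always line up.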

`TrainingResult`

Frozen dataclass returned by all `BaseTrainer` subclasses.

result.output_dir             # Path — where model/adapter was saved
result.train_loss             # float
result.eval_loss              # float | None
result.epochs_completed       # int
result.steps_completed        # int
result.training_time_seconds  # float
result.trainer_logs           # Dict[str, Any] — raw HF Trainer log history
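
"Frozen" means the fields cannot be reassigned after construction. The sketch below mirrors the field list above in a hypothetical `TrainingResultSketch`; the default values are assumptions added so the example is short, not the library's defaults.

```python
from dataclasses import FrozenInstanceError, dataclass, field
from pathlib import Path
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class TrainingResultSketch:
    """Hypothetical mirror of the fields listed above (not the library class)."""

    output_dir: Path
    train_loss: float
    eval_loss: Optional[float] = None
    epochs_completed: int = 0
    steps_completed: int = 0
    training_time_seconds: float = 0.0
    trainer_logs: Dict[str, Any] = field(default_factory=dict)


result_sketch = TrainingResultSketch(output_dir=Path("./output"), train_loss=1.23)

try:
    result_sketch.train_loss = 0.0  # frozen dataclasses reject attribute assignment
except FrozenInstanceError:
    print("TrainingResultSketch is read-only")
```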

Pruning — `xlmtec.trainers`

Pruners are not `BaseTrainer` subclasses: they transform a model in place rather than training it.

`StructuredPruner`

Soft structured pruning. Scores each attention head by mean absolute weight magnitude, then zeros the lowest-scoring `sparsity` fraction of heads in each layer. Weights are zeroed rather than removed, so the model's tensor shapes are unchanged.

from pathlib import Path
from xlmtec.core.types import PruningConfig
from xlmtec.trainers import StructuredPruner

pruning_config = PruningConfig(
    output_dir=Path("./outputs/pruned"),
    sparsity=0.3,           # fraction of heads to zero
    method="heads",         # "heads" (default) or "ffn"
    min_heads_per_layer=1,  # safety floor — never collapse a layer entirely
)
pruner = StructuredPruner(model, tokenizer, pruning_config)
result = pruner.prune()

result.output_dir              # Path
result.original_param_count    # int
result.zeroed_param_count      # int
result.sparsity_achieved       # float
result.heads_pruned_per_layer  # Dict[str, int] — layer name → heads zeroed
result.pruning_time_seconds    # float

`method="heads"` targets the query-projection rows of each attention layer; `method="ffn"` targets the gate/fc1 neuron rows of each FFN layer.
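
The head-scoring step can be sketched on nested lists, with no tensor library. This is a toy illustration, not `StructuredPruner` itself: `prune_heads` is hypothetical, each head is represented as its own small block of rows, and flooring `n_heads * sparsity` is an assumed rounding rule.

```python
from typing import List

Head = List[List[float]]  # one head's slice of the projection weight, as rows


def head_score(head: Head) -> float:
    """Mean absolute weight magnitude of one head."""
    values = [abs(w) for row in head for w in row]
    return sum(values) / len(values)


def prune_heads(layer: List[Head], sparsity: float, min_heads_per_layer: int = 1) -> List[int]:
    """Zero the lowest-scoring heads in place and return their indices."""
    n_heads = len(layer)
    # Never zero so many heads that fewer than the floor remain active.
    n_prune = min(int(n_heads * sparsity), n_heads - min_heads_per_layer)
    n_prune = max(n_prune, 0)
    ranked = sorted(range(n_heads), key=lambda h: head_score(layer[h]))
    pruned = sorted(ranked[:n_prune])
    for h in pruned:
        for row in layer[h]:
            row[:] = [0.0] * len(row)  # shape is preserved, values are zeroed
    return pruned


layer = [
    [[0.9, -0.8], [0.7, 0.6]],      # head 0: large weights
    [[0.01, 0.02], [-0.01, 0.0]],   # head 1: near zero
    [[0.5, -0.4], [0.3, 0.2]],      # head 2
    [[0.05, 0.1], [0.0, -0.05]],    # head 3
]
pruned_heads = prune_heads(layer, sparsity=0.5)
```

Here heads 1 and 3 have the smallest mean magnitude, so they are the two zeroed at `sparsity=0.5`.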

`WandaPruner`

WANDA (Weight AND Activation) unstructured pruning. Scores each weight by |W_ij| × ‖X_j‖₂, where W_ij connects input feature j to output i and ‖X_j‖₂ is the L2 norm of input feature j's activations over the calibration data, then zeros the lowest-scoring `sparsity` fraction. A calibration dataset is recommended; without one, the pruner falls back to magnitude-only scoring.

from pathlib import Path

import torch
from xlmtec.core.types import WandaConfig
from xlmtec.trainers import WandaPruner

wanda_config = WandaConfig(
    output_dir=Path("./outputs/wanda"),
    sparsity=0.5,
    n_calibration_samples=128,
    calibration_seq_len=128,
    use_row_wise=True,       # per-output-row threshold (recommended)
    layer_types=None,        # None = auto (Linear + Conv1D)
)

# With calibration data (recommended)
calib_ids = torch.load("calib_ids.pt")   # (N, seq_len) token id tensor
pruner = WandaPruner(model, tokenizer, wanda_config)
result = pruner.prune(calibration_input_ids=calib_ids)

# Without calibration data (magnitude-only fallback)
result = pruner.prune()

result.output_dir           # Path
result.original_param_count # int (total weights across all target layers)
result.zeroed_param_count   # int
result.sparsity_achieved    # float
result.layers_pruned        # int — number of linear layers processed
result.pruning_time_seconds # float
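
The scoring rule |W_ij| × ‖X_j‖₂ with a per-output-row threshold can be shown on small lists. This is a minimal sketch of the technique, not `WandaPruner`'s code; `wanda_prune` is a hypothetical function and flooring `in_features * sparsity` is an assumed rounding rule.

```python
import math
from typing import List

Matrix = List[List[float]]


def wanda_prune(weight: Matrix, activations: Matrix, sparsity: float) -> Matrix:
    """Return a copy of `weight` with the lowest-scoring entries of each row zeroed.

    weight:      (out_features, in_features)
    activations: (n_tokens, in_features) calibration inputs to this layer
    """
    n_in = len(weight[0])
    # L2 norm of each input feature j across all calibration tokens.
    feature_norms = [math.sqrt(sum(tok[j] ** 2 for tok in activations)) for j in range(n_in)]
    n_prune = int(n_in * sparsity)  # entries to zero per output row
    pruned: Matrix = []
    for row in weight:
        scores = [abs(w) * feature_norms[j] for j, w in enumerate(row)]
        drop = set(sorted(range(n_in), key=lambda j: scores[j])[:n_prune])
        pruned.append([0.0 if j in drop else w for j, w in enumerate(row)])
    return pruned


W = [[0.5, -0.1, 0.3, 0.05], [-0.2, 0.4, 0.01, 0.6]]
X = [[1.0, 10.0, 0.1, 2.0], [1.0, 10.0, 0.1, 2.0]]
W_pruned = wanda_prune(W, X, sparsity=0.5)
```

In the first row the weight `-0.1` survives even though `0.3` is larger in magnitude, because its input feature is far more active. That is the difference from magnitude-only pruning.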

Evaluation — `xlmtec.evaluation`

`BenchmarkRunner`

from xlmtec.evaluation.benchmarker import BenchmarkRunner
from xlmtec.core.types import EvaluationConfig, EvaluationMetric

eval_config = EvaluationConfig(
    metrics=[EvaluationMetric.ROUGE_L, EvaluationMetric.BLEU],
    num_samples=200,
)
runner = BenchmarkRunner(base_model, finetuned_model, tokenizer, eval_config)
report = runner.run(dataset)

report.summary()              # formatted string
report.base_scores            # Dict[str, float]
report.finetuned_scores       # Dict[str, float]
report.delta                  # Dict[str, float] — improvement per metric
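
A natural reading of `delta` is the finetuned score minus the base score for each metric. The sketch below assumes that definition; `score_delta` and the metric key names are hypothetical.

```python
from typing import Dict


def score_delta(base: Dict[str, float], finetuned: Dict[str, float]) -> Dict[str, float]:
    """Per-metric difference, finetuned minus base, over metrics present in both."""
    return {name: finetuned[name] - base[name] for name in base if name in finetuned}


base_scores = {"rougeL": 0.25, "bleu": 0.125}
finetuned_scores = {"rougeL": 0.5, "bleu": 0.25}
delta = score_delta(base_scores, finetuned_scores)
```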

Individual metrics

from xlmtec.evaluation.metrics import RougeMetric, BleuMetric
from xlmtec.core.types import EvaluationMetric

metric = RougeMetric(EvaluationMetric.ROUGE_L)
score = metric.compute(
    predictions=["the quick brown fox"],
    references=["the quick brown fox"],
)
# score = 1.0
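
To show why identical strings score 1.0, here is a minimal ROUGE-L F1 built on the longest common subsequence. It is a simplified sketch: it splits on whitespace with no stemming or normalization, and `rouge_l_f1` is not the `RougeMetric` implementation, whose scores may differ on real text.

```python
from typing import List


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if tok_a == tok_b else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(prediction: str, reference: str) -> float:
    """ROUGE-L F1 on whitespace tokens (no stemming or other normalization)."""
    pred, ref = prediction.split(), reference.split()
    lcs = lcs_length(pred, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(pred), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


identical = rouge_l_f1("the quick brown fox", "the quick brown fox")
partial = rouge_l_f1("the brown fox", "the quick brown fox")
```

Dropping one word from the prediction keeps precision at 1.0 but lowers recall to 0.75, so the F1 falls to 6/7.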
