Layer-wise Minimal Pair Probing

Revealing grammatical and conceptual hierarchies inside speech representations.

We design minimal pair probing experiments to uncover how self-supervised speech models encode grammar and meaning across layers. By generating linguistically controlled perturbations, we map, layer by layer, where syntactic and conceptual distinctions emerge within contextual speech encoders.
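A minimal sketch of what such a layer-wise probe can look like in practice: the snippet below extracts hidden states from every layer of a self-supervised speech encoder and scores how far apart the two members of an audio minimal pair sit at each layer. The checkpoint name (facebook/wav2vec2-base) and the mean-pooled cosine separation score are illustrative assumptions, not the exact setup used in the study.

  # Layer-wise scoring of one minimal pair. Assumes a HuggingFace wav2vec 2.0
  # encoder; the mean-pooled cosine separation used here is illustrative.
  import torch
  from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

  MODEL_NAME = "facebook/wav2vec2-base"  # hypothetical checkpoint choice
  extractor = Wav2Vec2FeatureExtractor.from_pretrained(MODEL_NAME)
  model = Wav2Vec2Model.from_pretrained(MODEL_NAME, output_hidden_states=True)
  model.eval()

  def layerwise_scores(wave_ok, wave_perturbed, sr=16000):
      """Return one separation score per encoder layer for an audio minimal pair."""
      inputs = extractor([wave_ok, wave_perturbed], sampling_rate=sr,
                         return_tensors="pt", padding=True)
      with torch.no_grad():
          outputs = model(**inputs)
      scores = []
      for layer in outputs.hidden_states:      # embeddings + one entry per block
          pooled = layer.mean(dim=1)           # mean-pool over time frames
          sim = torch.cosine_similarity(pooled[0], pooled[1], dim=0)
          scores.append(1.0 - sim.item())      # larger = pair more separated
      return scores

Repeating this over a dataset of controlled pairs yields a per-layer profile for each linguistic phenomenon.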

Highlights

  • Introduces minimal pair datasets spanning morphology, syntax, and semantics for spoken language.
  • Provides a hierarchical view of when different kinds of linguistic information crystallize during layer-by-layer processing (see the plotting sketch after this list).
  • Offers practical guidance on which layers to target for transfer learning and task-specific fine-tuning.
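As referenced above, a small plotting helper can turn per-pair layer scores (e.g. the output of layerwise_scores in the earlier sketch) into the cross-layer profile that shows where a given phenomenon crystallizes. The averaging and chart style below are assumptions for illustration, not the dashboards used in the project.

  # Aggregate per-pair layer scores and plot the cross-layer trend for one
  # phenomenon. The mean aggregation and line chart are illustrative choices.
  import numpy as np
  import matplotlib.pyplot as plt

  def plot_layer_trend(all_scores, label="syntax"):
      """all_scores: list of per-layer score lists, one entry per minimal pair."""
      mean_scores = np.asarray(all_scores).mean(axis=0)
      layers = np.arange(len(mean_scores))
      plt.plot(layers, mean_scores, marker="o", label=label)
      plt.xlabel("Encoder layer")
      plt.ylabel("Mean pair separation")
      plt.legend()
      plt.tight_layout()
      plt.show()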

Role

  • Designed the probing study and evaluation pipeline for minimal pair analyses.
  • Built visualization dashboards that make cross-layer trends immediately interpretable.
  • Presented the findings in an oral talk at EMNLP 2025.

Resources

  • Paper: (link forthcoming)
