Layer-wise Minimal Pair Probing
Revealing grammatical and conceptual hierarchies inside speech representations.
We design minimal pair probing experiments to uncover how self-supervised speech models encode grammar and meaning across layers. By generating linguistically controlled perturbations, we map the emergence of syntactic and conceptual understanding within contextual speech encoders.
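A minimal sketch of the layer-wise probing idea is below. The encoder choice (wav2vec 2.0 base via HuggingFace transformers), mean-pooling over time, and the logistic-regression probe are illustrative assumptions, not the exact setup used in the paper.

```python
import numpy as np
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

MODEL_NAME = "facebook/wav2vec2-base"  # assumed encoder; swap in the model under study
extractor = Wav2Vec2FeatureExtractor.from_pretrained(MODEL_NAME)
model = Wav2Vec2Model.from_pretrained(MODEL_NAME, output_hidden_states=True).eval()

def layerwise_features(waveform: np.ndarray, sr: int = 16000) -> list[np.ndarray]:
    """Return one mean-pooled feature vector per encoder layer for a single utterance."""
    inputs = extractor(waveform, sampling_rate=sr, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # hidden_states holds one (1, frames, dim) tensor per layer (plus the CNN front-end output)
    return [h.mean(dim=1).squeeze(0).numpy() for h in outputs.hidden_states]

def probe_accuracy_per_layer(good_utts, bad_utts):
    """Fit a linear probe at each layer to separate acceptable vs. perturbed utterances."""
    good = [layerwise_features(w) for w in good_utts]
    bad = [layerwise_features(w) for w in bad_utts]
    accuracies = []
    for layer in range(len(good[0])):
        X = np.stack([f[layer] for f in good] + [f[layer] for f in bad])
        y = np.array([1] * len(good) + [0] * len(bad))
        acc = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()
        accuracies.append(acc)
    return accuracies  # plotted against layer index, this traces where the contrast becomes linearly separable
```

Given lists of acceptable and perturbed waveforms for one minimal pair type (e.g., a subject-verb agreement contrast), the per-layer accuracy curve shows at which depth the contrast is reliably encoded.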
Highlights
- Introduces minimal pair datasets spanning morphology, syntax, and semantics for spoken language.
- Provides a layer-wise view of where linguistic information crystallizes as speech models process their input.
- Offers guidance on which layers to target for transfer learning and task-specific fine-tuning.
Role
- Designed the probing study and evaluation pipeline for minimal pair analyses.
- Built visualization dashboards that make cross-layer trends immediately interpretable.
- Presented the findings in an oral presentation at EMNLP 2025.
Resources
- Paper: (link forthcoming)