The Universal Weight Subspace Hypothesis

Prakhar Kaushik, Shravan Chaudhari, Ankit Vaidya, Rama Chellappa, Alan Yuille
Johns Hopkins University

Chaos Converges into Order

We analyze over 1,100 deep neural networks, including 500 Mistral-7B LoRAs and 500 Vision Transformers, and provide the first large-scale empirical evidence that trained networks systematically converge to shared, low-dimensional spectral subspaces, regardless of initialization, task, or domain.
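
A minimal sketch (not the paper's exact procedure) of how one might probe for such a shared subspace, assuming a list of same-shaped weight matrices collected from independently trained models; the function name and shapes are illustrative:

import numpy as np

def shared_subspace(weights, k=16):
    """Return the top-k principal directions in weight space and all singular values."""
    X = np.stack([w.reshape(-1) for w in weights])    # (num_models, num_params)
    X -= X.mean(axis=0, keepdims=True)                # center across models
    U, S, Vt = np.linalg.svd(X, full_matrices=False)  # rows of Vt: orthonormal directions
    return Vt[:k], S

# Stand-in checkpoints; real use would load weights from trained models.
rng = np.random.default_rng(0)
basis, S = shared_subspace([rng.standard_normal((64, 64)) for _ in range(50)], k=8)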

Geometric Universality

Models trained on disjoint data collapse into the same parametric subspace. This suggests architecture dictates geometry more than data does.

Extreme Efficiency

Storing only subspace coefficients enables massive compression (up to 100x). Train new tasks by optimizing lightweight coefficients instead of full weights.
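
A minimal sketch of coefficient-only adaptation, assuming a frozen shared basis of shape (k, out, in) is already available (e.g. from the SVD sketched above); the layer and names are illustrative, not the paper's implementation:

import torch
import torch.nn as nn

class SubspaceLinear(nn.Module):
    def __init__(self, base_weight, basis):
        super().__init__()
        self.register_buffer("W0", base_weight)                  # frozen pretrained weight
        self.register_buffer("basis", basis)                     # frozen shared directions (k, out, in)
        self.coeff = nn.Parameter(torch.zeros(basis.shape[0]))   # only k scalars are trained

    def forward(self, x):
        # Effective weight = pretrained weight + linear combination of the basis directions.
        W = self.W0 + torch.einsum("k,kij->ij", self.coeff, self.basis)
        return x @ W.T

layer = SubspaceLinear(torch.randn(32, 64), torch.randn(8, 32, 64))
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # 8 trainable scalars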

Model Merging

Seamlessly merge models without data. Our method outperforms SOTA merging baselines (Task Arithmetic, TIES) by aligning spectral directions.
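
A minimal sketch of data-free merging in a shared subspace, contrasted with plain task arithmetic. It assumes task vectors (fine-tuned minus pretrained weights) and a shared orthonormal basis Vt such as the one from the SVD above; this is an illustration, not the paper's exact merging rule:

import numpy as np

def merge_in_subspace(task_vectors, Vt):
    """Project each task vector onto the shared directions, average the coefficients,
    and map the averaged coefficients back to weight space."""
    coeffs = np.stack([Vt @ tv.reshape(-1) for tv in task_vectors])  # (num_models, k)
    return (coeffs.mean(axis=0) @ Vt).reshape(task_vectors[0].shape)

def task_arithmetic(task_vectors, alpha=1.0):
    """Baseline for comparison: a simple scaled average of the raw task vectors."""
    return alpha * np.mean(task_vectors, axis=0)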

The Spectral Decay

The plot illustrates the explained variance ratio of principal components across 500 Vision Transformers.

Despite random initializations and different datasets, the majority of variance is captured in the first few dimensions (the "Universal Subspace").

[Figure: explained variance ratio per principal component for 500 ViTs and 500 Mistral LoRAs, with the shared subspace boundary marked.]
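
A minimal sketch of the quantity the plot reports: the per-component explained variance ratio and an illustrative cutoff for the shared subspace. It assumes the singular values S from an SVD over stacked model weights; the 90% threshold is a placeholder, not the paper's boundary definition:

import numpy as np

def explained_variance_curve(S, threshold=0.90):
    evr = (S**2) / (S**2).sum()                # explained variance ratio per component
    cumulative = np.cumsum(evr)
    cutoff = int(np.searchsorted(cumulative, threshold)) + 1  # components needed to reach threshold
    return evr, cutoff

S = np.linspace(10.0, 0.1, 100) ** 3           # fast-decaying stand-in spectrum
evr, k = explained_variance_curve(S)
print(f"{k} components capture 90% of the variance")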

Universal Implications

Sustainable AI

Drastically reduces the carbon footprint of training large-scale neural models by reusing subspaces.

Transfer Learning

Explains why techniques like parameter-efficient fine-tuning succeed across architectures.

Interpretability

Offers new insights into the intrinsic organization of information within deep networks.

Democratization

Allows under-resourced researchers to adapt SOTA models without massive compute clusters.

BibTeX

@misc{kaushik2025universalweightsubspacehypothesis,
      title={The Universal Weight Subspace Hypothesis}, 
      author={Prakhar Kaushik and Shravan Chaudhari and Ankit Vaidya and Rama Chellappa and Alan Yuille},
      year={2025},
      eprint={2512.05117},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2512.05117}, 
}