Rapid Network Adaptation: Learning to Adapt Neural Networks Using Test-Time Feedback
Neural networks are powerful but brittle under distribution shift. Rapid Network Adaptation (RNA) proposes a different paradigm: instead of anticipating all possible shifts during training, models should adapt on the fly at test time.
🧠 Motivation
Models fail under distribution shift (blur, noise, occlusion, domain changes, real-world deployment). Training-time robustness techniques (data augmentation, robust objectives) cannot cover the vast space of possible shifts.
Instead of anticipating shifts, adapt when they occur.
🔁 Adaptive Inference
Traditional models are static at test time (fixed parameters, no feedback). RNA introduces a closed-loop process: prediction → test-time signal → controller → updated model → improved prediction. Inference becomes dynamic and feedback-driven.
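A minimal sketch of one such feedback step (the `model`, `controller`, and `signal_fn` names and the `film=` keyword are illustrative assumptions, not the paper's actual API):

```python
import torch

# Hedged sketch of feedback-driven inference; names are stand-ins.
@torch.no_grad()
def feedback_inference(model, controller, signal_fn, x):
    y_init = model(x)                      # initial (static) prediction
    signal = signal_fn(y_init, x)          # label-free test-time signal
    modulation = controller(signal)        # controller -> parameter updates
    y_adapted = model(x, film=modulation)  # re-run with modulated features
    return y_adapted
```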
⚙️ Method
Base model: pretrained (frozen).
Adaptation signal: computed from the unlabeled test batch, e.g., prediction entropy, self-supervised losses, geometric constraints, or k-NN retrieval.
Controller: maps the model's outputs and the adaptation signal to parameter updates, applied via FiLM modulation.
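As a concrete sketch of what FiLM modulation and a lightweight controller could look like in PyTorch (class names and the MLP design here are assumptions, not the paper's exact architecture):

```python
import torch
import torch.nn as nn

class FiLM(nn.Module):
    """Feature-wise Linear Modulation: per-channel scale and shift."""
    def forward(self, h, gamma, beta):
        # h: (B, C, H, W) features; gamma, beta: (B, C) from the controller
        return gamma[:, :, None, None] * h + beta[:, :, None, None]

class Controller(nn.Module):
    """Maps a signal embedding to FiLM parameters for one layer."""
    def __init__(self, signal_dim, channels, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(signal_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * channels),
        )

    def forward(self, signal):
        gamma, beta = self.net(signal).chunk(2, dim=-1)
        return 1.0 + gamma, beta  # offset so modulation starts near identity
```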
🚀 Amortized Adaptation
RNA replaces test-time optimization with a learned controller:
- ❌ no iterative SGD
- ❌ no test-time tuning
- ✅ single forward pass
This is amortized optimization: the cost of adaptation is paid once, during training, when the controller learns how to adapt; see the contrast sketch below.
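A hedged side-by-side sketch of the two regimes; `proxy_loss` uses entropy, and the controller interface and `film=` keyword are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def proxy_loss(logits):
    # Prediction entropy: a common label-free proxy objective.
    lp = F.log_softmax(logits, dim=-1)
    return -(lp.exp() * lp).sum(dim=-1).mean()

def adapt_tto(model, x, steps=10, lr=1e-3):
    # Test-time optimization: iterative SGD on the proxy, per test batch.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        proxy_loss(model(x)).backward()
        opt.step()
    return model(x)

@torch.no_grad()
def adapt_rna(model, controller, x):
    # Amortized adaptation: one extra gradient-free pass through the controller.
    signal = proxy_loss(model(x)).reshape(1)  # scalar signal (signal_dim=1)
    gamma, beta = controller(signal)
    return model(x, film=(gamma, beta))       # hypothetical modulated forward
```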
🧩 Architecture
- Base network frozen
- Adaptation via FiLM layers only
- Controller predicts modulation parameters
- Controller is lightweight (~5–20% of the base model's parameters)
→ stable, efficient, and deployable.
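A short runnable sketch of this split, with stand-in modules for the base network and controller:

```python
import torch.nn as nn

# Hypothetical base model and controller (stand-ins for the real networks).
base_model = nn.Sequential(nn.Conv2d(3, 64, 3), nn.ReLU(), nn.Conv2d(64, 64, 3))
controller = nn.Sequential(nn.Linear(1, 256), nn.ReLU(), nn.Linear(256, 128))

# Freeze the base network: only the controller is trainable.
for p in base_model.parameters():
    p.requires_grad = False

ctrl = sum(p.numel() for p in controller.parameters())
base = sum(p.numel() for p in base_model.parameters())
print(f"controller size: {100 * ctrl / base:.1f}% of the base model")
```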
📡 Test-Time Signals
Signals act as error proxies without ground truth. Crucially:
Improving the proxy does not guarantee improving the task.
Poor signals can reduce entropy yet still harm task performance → signal design is critical.
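Entropy was sketched above; here is one more example of a signal family, a self-supervision-style consistency check (illustrative only, not necessarily the paper's formulation):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def consistency_signal(model, x, sigma=0.05):
    # Hedged sketch: disagreement between predictions on two lightly noised
    # views of the same unlabeled batch. High disagreement suggests the model
    # is operating under shift, without requiring any ground-truth labels.
    p1 = F.softmax(model(x + sigma * torch.randn_like(x)), dim=-1)
    p2 = F.softmax(model(x + sigma * torch.randn_like(x)), dim=-1)
    return (p1 - p2).abs().mean()
```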
📊 Experiments
Tasks: classification, segmentation, depth, optical flow
Datasets: ImageNet, COCO, ScanNet, Taskonomy, Replica, Hypersim
Shifts: 2D corruptions, 3D geometric shifts, cross-dataset generalization
📈 Results
- Matches or outperforms test-time optimization (TTO)
- Generalizes across tasks, modalities, unseen shifts
- Speed: RNA ~0.01 s vs TTO ~3–5 s → orders of magnitude faster
🔬 Context (TTA Landscape)
| Method | Description |
|---|---|
| TTO | iterative SGD on a proxy loss at test time |
| Semi-amortized | hybrid: a learned initialization followed by a few optimization steps |
| RNA | fully learned, single-pass adaptation (no test-time gradients) |
RNA removes online optimization → faster and simpler deployment.
💡 Key Insights
- Adaptation > robustness for unpredictable shifts
- Learning adaptation strategies scales better than hand-design
- Test-time feedback is useful without labels
- Inference is better viewed as a feedback-driven process, not a fixed forward pass
🔗 Links
- 🌐 Project Page: rapid-network-adaptation.epfl.ch
- 📄 arXiv Paper: arXiv:2309.15762
- 💻 Code: GitHub repository
📌 Citation
```bibtex
@inproceedings{yeo2023rna,
  title     = {Rapid Network Adaptation: Learning to Adapt Neural Networks Using Test-Time Feedback},
  author    = {Yeo, Teresa and Kar, Oğuzhan Fatih and Sodagar, Zahra and Zamir, Amir},
  booktitle = {ICCV},
  year      = {2023}
}
```