Update README.md

Commit 3c03f66d authored 3 years ago by Aditya Prakash
--- a/README.md
+++ b/README.md
 # Multi-Modal Fusion Transformer for End-to-End Autonomous Driving

-## [Project Page](https://ap229997.github.io/projects/transfuser/) | [Paper](https://arxiv.org/pdf/2104.09224.pdf) | Supplementary | [Video](https://youtu.be/cc05F56vjVI) | [Poster](https://ap229997.github.io/projects/transfuser/assets/poster.pdf) | Blog
+## [Project Page](https://ap229997.github.io/projects/transfuser/) | [Paper](https://arxiv.org/pdf/2104.09224.pdf) | Supplementary | [Video](https://youtu.be/WxadQyQ2gMs) | [Poster](https://ap229997.github.io/projects/transfuser/assets/poster.pdf) | Blog

-<img src="transfuser/assets/teaser.png" height="192" hspace=30> <img src="transfuser/assets/full_arch.png" width="400">
+<img src="transfuser/assets/teaser.svg" height="192" hspace=30> <img src="transfuser/assets/full_arch.svg" width="400">

 This repository contains the code for the CVPR 2021 paper [Multi-Modal Fusion Transformer for End-to-End Autonomous Driving](http://www.cvlibs.net/publications/Prakash2021CVPR.pdf). If you find our code or paper useful, please cite
 ```bibtex

--- a/aim/README.md
+++ b/aim/README.md
 # AIM

-<p align="center"> <img src="assets/model.png" width="512"> </p>
+<p align="center"> <img src="assets/model.svg" width="512"> </p>

 AIM consists of a ResNet34 image encoder with an autoregressive GRU-based waypoint prediction network. This is equivalent to adapting CILRS to predict waypoints conditioned on goal locations rather than predicting vehicle controls conditioned on navigational commands.
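
The autoregressive waypoint prediction described above can be illustrated with a minimal sketch. It is not the repository's implementation: the bias-free `GRUCell`, the random weights, the choice of four waypoints, the delta-based update, and all function names are assumptions made for illustration, and the image encoder is replaced by a plain feature vector.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def rand_matrix(rows, cols, rng):
    """Small random weights; a trained model would load learned values."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Bias-free GRU cell (hypothetical stand-in for a trained cell)."""

    def __init__(self, input_size, hidden_size, rng):
        n = input_size + hidden_size
        self.Wz = rand_matrix(hidden_size, n, rng)  # update gate
        self.Wr = rand_matrix(hidden_size, n, rng)  # reset gate
        self.Wh = rand_matrix(hidden_size, n, rng)  # candidate state

    def __call__(self, x, h):
        xh = x + h
        z = [sigmoid(v) for v in matvec(self.Wz, xh)]
        r = [sigmoid(v) for v in matvec(self.Wr, xh)]
        gated = x + [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(v) for v in matvec(self.Wh, gated)]
        # Blend the previous state with the candidate state.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def predict_waypoints(features, goal, num_waypoints=4, seed=0):
    """Roll out waypoints one at a time, feeding each prediction back in."""
    rng = random.Random(seed)
    cell = GRUCell(input_size=4, hidden_size=len(features), rng=rng)
    W_out = rand_matrix(2, len(features), rng)  # hidden state -> (dx, dy)
    h = list(features)       # hidden state initialised from image features
    waypoint = [0.0, 0.0]    # assumed start: ego-vehicle origin
    waypoints = []
    for _ in range(num_waypoints):
        # Input is the previous waypoint concatenated with the goal location.
        h = cell(waypoint + list(goal), h)
        dx, dy = matvec(W_out, h)
        waypoint = [waypoint[0] + dx, waypoint[1] + dy]
        waypoints.append(tuple(waypoint))
    return waypoints
```

Each step consumes the previous waypoint, which is what makes the rollout autoregressive; the goal location is supplied at every step as the conditioning signal.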


--- a/cilrs/README.md
+++ b/cilrs/README.md
 # CILRS

-<p align="center"> <img src="assets/model.png" width="512"> </p>
+<p align="center"> <img src="assets/model.svg" width="512"> </p>

 [CILRS](https://arxiv.org/pdf/1904.08980.pdf) is a conditional imitation learning method in which the agent learns to predict vehicle controls from an RGB image and the measured speed while being conditioned on the navigational command. In addition, the output of the image encoder is also used to predict the vehicle speed.
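
Conditioning on the navigational command can be sketched as selecting one control head per command. The snippet below is only an illustration under stated assumptions: the command names, the linear heads, and the `predict_controls` function are hypothetical, and the auxiliary speed prediction is not covered.

```python
# Hypothetical command names; the actual set depends on the simulator setup.
COMMANDS = ("follow_lane", "turn_left", "turn_right", "go_straight")


def predict_controls(image_features, speed, command, heads):
    """Pick the head that matches the command and apply it to the inputs.

    heads maps each command to a 3-row weight matrix producing
    (steer, throttle, brake) from the image features plus measured speed.
    """
    if command not in heads:
        raise KeyError(f"no control head for command {command!r}")
    x = list(image_features) + [speed]
    steer, throttle, brake = (
        sum(w * v for w, v in zip(row, x)) for row in heads[command]
    )
    return {"steer": steer, "throttle": throttle, "brake": brake}
```

Only the selected head contributes to the output, so each command effectively gets its own control policy on top of shared features.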


--- a/geometric_fusion/README.md
+++ b/geometric_fusion/README.md
 # Geometric Fusion

-<p align="center"> <img src="assets/model.png"> </p>
+<p align="center"> <img src="assets/model.svg"> </p>

 Geometric Fusion consists of multi-scale image-to-LiDAR and LiDAR-to-image feature projections (inspired by [ContFuse](https://openaccess.thecvf.com/content_ECCV_2018/papers/Ming_Liang_Deep_Continuous_Fusion_ECCV_2018_paper.pdf)). This is equivalent to replacing the transformers in TransFuser with geometry-based feature projections.


--- a/late_fusion/README.md
+++ b/late_fusion/README.md
 # Late Fusion

-<p align="center"> <img src="assets/model.png" width="600"> </p>
+<p align="center"> <img src="assets/model.svg" width="600"> </p>

 Late Fusion consists of a 2-stream encoder in which the RGB image and the LiDAR BEV inputs are processed independently of each other. These features are then combined via element-wise summation and passed to the waypoint prediction network. This is equivalent to removing the transformer modules from TransFuser.
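
The element-wise summation step can be sketched as follows. The function name `late_fusion` and the use of flat feature lists are assumptions for illustration; the two encoders are represented only by their output vectors.

```python
def late_fusion(image_features, lidar_features):
    """Combine two independently encoded feature vectors by element-wise sum."""
    if len(image_features) != len(lidar_features):
        raise ValueError("both streams must produce features of the same size")
    return [a + b for a, b in zip(image_features, lidar_features)]
```

Because the two streams interact only through this sum, neither encoder sees the other modality while computing its features.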


--- a/transfuser/README.md
+++ b/transfuser/README.md
 # TransFuser

-<p align="center"> <img src="assets/model.png"> </p>
+<p align="center"> <img src="assets/model.svg"> </p>

 TransFuser uses the self-attention mechanism of transformers to fuse image and LiDAR features at multiple resolutions.
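
Self-attention fusion at a single resolution can be sketched as below. This is a simplified illustration, not the repository's code: it uses identity query/key/value projections instead of learned ones, a single head, and a residual addition, and the names `self_attention` and `fuse` are hypothetical. Applying it at several resolutions would repeat the same step on feature maps of different sizes.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity projections."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        # Each output token is a weighted average of all tokens.
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out


def fuse(image_tokens, lidar_tokens):
    """Attend jointly over both modalities, then split the tokens back."""
    tokens = image_tokens + lidar_tokens
    attended = self_attention(tokens)
    # Residual connection: add the attended features to the originals.
    fused = [[t + a for t, a in zip(tok, att)] for tok, att in zip(tokens, attended)]
    n = len(image_tokens)
    return fused[:n], fused[n:]
```

Since every token attends to every other token, each image token can draw on LiDAR tokens and vice versa, which is the difference from combining the two streams only at the end.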