Multiview Scene Graph

by Zhang, Juexiao; Song, Haorui; Tang, Xinran; Chen, Feng; Gao, Zhu; Li, Sihang; Liu, Xinhao

in Annotations / Datasets / Edge joints / Graphical representations / Graphs / Image reconstruction / Object recognition / Visual fields / Visual tasks

2024

Paper

Overview

A proper scene representation is central to the pursuit of spatial intelligence where agents can robustly reconstruct and efficiently understand 3D scenes. A scene representation is either metric, such as landmark maps in 3D reconstruction, 3D bounding boxes in object detection, or voxel grids in occupancy prediction, or topological, such as pose graphs with loop closures in SLAM or visibility graphs in SfM. In this work, we propose to build Multiview Scene Graphs (MSG) from unposed images, representing a scene topologically with interconnected place and object nodes. The task of building MSG is challenging for existing representation learning methods since it needs to jointly address visual place recognition, object detection, and object association from images with limited fields of view and potentially large viewpoint changes. To evaluate any method tackling this task, we developed an MSG dataset and annotation based on a public 3D dataset. We also propose an evaluation metric based on the intersection-over-union score of MSG edges. Moreover, we develop a novel baseline method built on mainstream pretrained vision models, combining visual place recognition and object association into one Transformer decoder architecture. Experiments demonstrate that our method has superior performance compared to existing relevant baselines.
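To make the edge-based metric mentioned in the overview more concrete, here is a minimal sketch in Python. It assumes an MSG can be stored as a set of undirected edges between node identifiers and that the score is a plain Jaccard index (intersection over union) of the predicted and ground-truth edge sets. The overview does not give the exact definition, so the node names, the helpers `make_edges` and `edge_iou`, and the handling of empty graphs are all illustrative assumptions, not the authors' implementation.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# Hypothetical representation: an undirected edge is a frozenset of two node ids,
# so ("place_0", "place_1") and ("place_1", "place_0") compare equal.
Edge = FrozenSet[str]


def make_edges(pairs: Iterable[Tuple[str, str]]) -> Set[Edge]:
    """Convert (node, node) pairs into a set of undirected edges."""
    return {frozenset(pair) for pair in pairs}


def edge_iou(pred: Set[Edge], truth: Set[Edge]) -> float:
    """Jaccard index of two edge sets (assumed stand-in for the edge IoU score)."""
    union = pred | truth
    if not union:
        # Assumption: two empty graphs count as a perfect match.
        return 1.0
    return len(pred & truth) / len(union)


# Toy ground truth: two place nodes (images) that both observe the same chair.
truth = make_edges([
    ("place_0", "place_1"),
    ("place_0", "obj_chair"),
    ("place_1", "obj_chair"),
])

# Toy prediction: the place-place edge and one place-object edge are correct,
# but the second view is wrongly linked to a table instead of the chair.
pred = make_edges([
    ("place_1", "place_0"),
    ("place_0", "obj_chair"),
    ("place_1", "obj_table"),
])

score = edge_iou(pred, truth)
print(f"edge IoU: {score:.2f}")
```

In this toy case two of the four distinct edges are shared, so the score is 0.5. Storing edges as frozensets keeps the comparison independent of the order in which the two endpoints are listed.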

Publisher

Cornell University Library, arXiv.org

Subject

Annotations / Datasets / Edge joints / Graphical representations / Graphs / Image reconstruction / Object recognition