Vision and Language Navigation Using Minimal Voice Instructions

Authors

  • Ansh Shah ktu
  • Parth Kansara
  • Parth Meswani
  • Prof. Prachi Tawde

Keywords:

Indoor Navigation, Computer Vision, Natural Language Processing, Matterport 3D

Abstract

The proposed system is an algorithm for navigating any 3D-mapped
environment in the Matterport 3D Simulator from only minimal voice
instructions. During the training phase, the nodes of a selected
environment are traversed sequentially in the Simulator, and an
object recognition algorithm is applied to the panorama at each
node. This identifies and tags the objects in the vicinity of each
viewpoint. During the testing phase, a natural language
instruction specifying the goal location is taken as input. The
goal location is identified from among the viewpoints in the 3D
environment by matching the instruction to the tags generated in
the training phase. A shortest path algorithm then navigates from
the starting location to the goal location. The work focuses on
implementing this algorithm, which combines natural language
processing and computer vision and can be employed by agents for
indoor navigation.
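
The testing-phase pipeline described above (match the instruction to viewpoint tags, then plan a shortest path) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the viewpoint IDs, edge lengths, and object tags are invented toy data, the tags are written by hand rather than produced by an object recognition model, and Dijkstra's algorithm is assumed as the shortest path method because the abstract does not name one.

```python
import heapq

# Hypothetical toy data: viewpoint IDs, edge lengths, and object tags are
# invented for illustration and do not come from the Matterport3D dataset.
GRAPH = {
    "vp_hall": {"vp_kitchen": 2.0, "vp_living": 3.0},
    "vp_kitchen": {"vp_hall": 2.0, "vp_bedroom": 4.5},
    "vp_living": {"vp_hall": 3.0, "vp_bedroom": 1.0},
    "vp_bedroom": {"vp_kitchen": 4.5, "vp_living": 1.0},
}
TAGS = {
    "vp_hall": {"door", "mirror"},
    "vp_kitchen": {"refrigerator", "sink"},
    "vp_living": {"sofa", "tv"},
    "vp_bedroom": {"bed", "lamp"},
}


def find_goal(instruction, tags):
    """Return the viewpoint whose tags overlap most with the instruction words."""
    cleaned = instruction.lower().replace(".", " ").replace(",", " ")
    words = set(cleaned.split())
    best, best_score = None, 0
    for viewpoint in sorted(tags):  # sorted for a deterministic tie-break
        score = len(words & tags[viewpoint])
        if score > best_score:
            best, best_score = viewpoint, score
    return best  # None when no tag appears in the instruction


def shortest_path(graph, start, goal):
    """Dijkstra's algorithm (assumed here) over the viewpoint graph."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == goal:
            break
        for neighbour, weight in graph[node].items():
            candidate = d + weight
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                prev[neighbour] = node
                heapq.heappush(heap, (candidate, neighbour))
    if goal not in dist:
        return None  # goal unreachable from start
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


goal = find_goal("Take me to the bed", TAGS)
route = shortest_path(GRAPH, "vp_hall", goal)
print(goal, route)
```

In a real setting the tag table would be filled during the training phase by running an object detector on each panorama, and the keyword overlap used here would likely be replaced by a more robust language matching step.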

 

Published

2022-12-20

How to Cite

Ansh Shah, Parth Kansara, Parth Meswani, & Prof. Prachi Tawde. (2022). Vision and Language Navigation Using Minimal Voice Instructions. National Conference on Emerging Computer Applications, 3(1). Retrieved from https://ajcejournal.in/nceca/article/view/193