
Learning Paper Structure, Part 1

    Graduate student life     paper

The paper studied in this post is "Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning".

On the Introduction

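The Introduction covers the background, what the authors did, the shortcomings of existing work, and the results the authors achieve, as follows: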

Since the 1970s, there have been various attempts to build a system that can understand such relationships. Recently, with the rise of deep learning models, learning-based approaches have gained wide popularity [1], [2].

This gives part of the background.

In order to achieve higher adaptability and flexibility, we introduce a target-driven model.

This states the specific contribution.

Unfortunately, training and quantitatively evaluating DRL algorithms in real environments is often tedious.

This states a shortcoming: training and evaluating in real environments is tedious.

We evaluate our method for the following tasks: (1) Target generalization, where the goal is to navigate to targets that have not been used during training within a scene. (2) Scene generalization, where the goal is to navigate to targets in scenes not used for training. (3) Real-world generalization, where we demonstrate navigation to targets using a real robot.

These are the evaluation goals.

In summary, …

This is the summary.

On RELATED WORK

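The related work section surveys many existing techniques and methods. Map-based methods need a map, and the authors then point out that their method does not. For map-less navigation, prior work mostly focuses on obstacle avoidance given an input image; the authors' method is map-less and is also not based on 3D reconstruction.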

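The section then turns to reinforcement learning. It mentions the helicopter work and the pioneering DRL work on playing Atari games, and notes that, by comparison, the authors' inputs are more complex and the learned knowledge can be transferred.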

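Next the paper introduces the experimental platform, that is, the environment the experiments need. This amounts to the experimental background.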

The advantage of a simulation system is that it is scalable:

The advantage of using a physics engine for modeling the world is that it is highly scalable (training a robot in real houses is not easily scalable).

The paper then describes the target-driven model and the framework used for it, a deep siamese actor-critic network.

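  • A is the problem statement. It analyzes the scene and defines what the model does: learn a mapping from 2D images to actions in a 3D scene.
  • B is the problem formulation. It analyzes generalization and adds that new targets do not require re-training.
  • C is the learning setup. The key point is the model.
  • D is the model. Their model embeds some kind of layer, and this design lets scenes share one layer and targets share one layer. I did not understand this part.
  • E is the training method.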
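
To make points A and D more concrete for myself, here is a minimal sketch of what a siamese actor-critic forward pass could look like. It is not the authors' implementation. Everything in it is an assumption made for illustration: the class name SiameseActorCritic, the layer sizes, the four actions, the scene names, and the use of short feature vectors in place of real images. The one idea it tries to show is that the same embedding weights process both the current observation and the target, while each scene gets its own policy and value heads.

```python
import math
import random


def linear(weights, bias, x):
    """Dense layer: W x + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(x):
    m = max(x)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero bias (hypothetical initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class SiameseActorCritic:
    """Toy model: shared siamese embedding plus per-scene actor and critic heads."""

    def __init__(self, feat_dim, embed_dim, n_actions, scenes, seed=0):
        rng = random.Random(seed)
        # Shared by every scene and target; applied to BOTH inputs (the siamese part).
        self.embed = init_layer(rng, embed_dim, feat_dim)
        # Shared layer that fuses the two embeddings into one joint representation.
        self.fuse = init_layer(rng, embed_dim, 2 * embed_dim)
        # One policy head (actor) and one value head (critic) per scene.
        self.policy = {s: init_layer(rng, n_actions, embed_dim) for s in scenes}
        self.value = {s: init_layer(rng, 1, embed_dim) for s in scenes}

    def forward(self, scene, observation, target):
        obs_emb = relu(linear(*self.embed, observation))
        tgt_emb = relu(linear(*self.embed, target))  # same weights as above
        joint = relu(linear(*self.fuse, obs_emb + tgt_emb))  # list concatenation
        action_probs = softmax(linear(*self.policy[scene], joint))
        state_value = linear(*self.value[scene], joint)[0]
        return action_probs, state_value


# Hypothetical usage: 8-dimensional features stand in for image features.
model = SiameseActorCritic(feat_dim=8, embed_dim=4, n_actions=4,
                           scenes=["bathroom", "kitchen"])
obs = [0.5] * 8
goal = [0.1] * 8
probs, value = model.forward("kitchen", obs, goal)
print(len(probs), round(sum(probs), 6))
```

Because the target is an input rather than something baked into the weights, a new target can be passed to forward without adding parameters, which is how I read the remark in B that new targets do not need re-training.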