Paper Structure Study, Part 1

2016-09-22 Graduate Student Life paper

The paper studied in this post is "Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning".

On the Introduction

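The introduction presents the background, what the authors did, the shortcomings of existing approaches, and the results they achieved, as follows: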

Since the 1970s, there have been various attempts to build a system that can understand such relationships. Recently, with the rise of deep learning models, learning-based approaches have gained wide popularity [1], [2].

This covers part of the background.

In order to achieve higher adaptability and flexibility, we introduce a target-driven model.

This states the concrete contribution.

Unfortunately, training and quantitatively evaluating DRL algorithms in real environments is often tedious.

The shortcoming: training and evaluating in real environments is tedious.

We evaluate our method for the following tasks: (1) Target generalization, where the goal is to navigate to targets that have not been used during training within a scene. (2) Scene generalization, where the goal is to navigate to targets in scenes not used for training. (3) Real-world generalization, where we demonstrate navigation to targets using a real robot.

The goals, stated as three evaluation tasks.

In summary, …

The summary.

On RELATED WORK

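The related work section introduces many existing techniques and methods. Map-based methods require a map, and the authors then point out that their method does not. For map-less navigation, existing work mostly focuses on obstacle avoidance given an input image; the authors' method is map-less and does not rely on 3D reconstruction either.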

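Next it turns to reinforcement learning, mentioning the helicopter work and the pioneering DRL work on playing Atari games. The authors note that their inputs are more complex and that their model can transfer knowledge.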

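Next the paper introduces the experimental platform, i.e. the environment the experiments need, which amounts to describing the experimental setting.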

The benefit of a simulation system is that it is scalable:

The advantage of using a physics engine for modeling the world is that it is highly scalable (training a robot in real houses is not easily scalable).

Then comes the target-driven model and the framework used for it, the deep siamese actor-critic network.

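A comes first and is the problem statement. It analyzes the scenario and pins down the model's role: learning a mapping from 2D images to actions in a 3D scene.
B is the problem formulation. It analyzes generalization ability and adds that new targets require no retraining.
C is the learning setup. The key point is the model.
D is the model. Their model embeds some kind of layer, and this design lets scenes share a layer and targets share a layer. I don't understand this part yet.
E is the training method.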
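
To make points B and D a little more concrete, here is a minimal sketch of one possible reading of a siamese actor-critic network. It is not the authors' implementation: the class name TargetDrivenPolicy, the scene names, all layer sizes, and the random weights are assumptions for illustration only. The idea it tries to show is that the same embedding weights are applied to both the current observation and the target (the "siamese" part), and that each scene gets its own policy/value head. Because the target is an input rather than something baked into the weights, a new target can be passed in without retraining.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(weights, vec):
    """Matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def relu(vec):
    return [max(0.0, x) for x in vec]


def softmax(vec):
    m = max(vec)
    exps = [math.exp(x - m) for x in vec]
    total = sum(exps)
    return [e / total for e in exps]


class TargetDrivenPolicy:
    """Toy siamese actor-critic; names and sizes are hypothetical."""

    def __init__(self, feat_dim, embed_dim, num_actions, scene_names, seed=0):
        rng = random.Random(seed)
        # One embedding matrix, applied to BOTH observation and target.
        self.embed = make_matrix(embed_dim, feat_dim, rng)
        # Fusion layer shared by every scene and every target.
        self.fuse = make_matrix(embed_dim, 2 * embed_dim, rng)
        # Assumed scene-specific part: one policy/value head per scene.
        self.heads = {
            name: {
                "policy": make_matrix(num_actions, embed_dim, rng),
                "value": make_matrix(1, embed_dim, rng),
            }
            for name in scene_names
        }

    def forward(self, observation, target, scene):
        obs_emb = relu(linear(self.embed, observation))
        tgt_emb = relu(linear(self.embed, target))
        # "+" on lists concatenates the two embeddings.
        joint = relu(linear(self.fuse, obs_emb + tgt_emb))
        head = self.heads[scene]
        action_probs = softmax(linear(head["policy"], joint))  # actor output
        state_value = linear(head["value"], joint)[0]          # critic output
        return action_probs, state_value


model = TargetDrivenPolicy(feat_dim=8, embed_dim=4, num_actions=4,
                           scene_names=["kitchen", "bedroom"])
rng = random.Random(1)
observation = [rng.random() for _ in range(8)]  # stand-in for image features
target = [rng.random() for _ in range(8)]       # stand-in for target features
probs, value = model.forward(observation, target, "kitchen")
```

In this sketch, switching the target only changes the second argument to forward, while switching the scene selects a different head on top of the same shared layers.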