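Paper Structure Study, Part 1
The paper studied in this note is "Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning".
On the Introduction
The Introduction covers the background, what the authors did, the shortcomings of existing approaches, and the results they achieved, as follows: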
Since the 1970s, there have been various attempts to build a system that can understand such relationships. Recently, with the rise of deep learning models, learning-based approaches have gained wide popularity [1], [2].
This gives part of the background.
In order to achieve higher adaptability and flexibility, we introduce a target-driven model.
This states the specific contribution.
Unfortunately, training and quantitatively evaluating DRL algorithms in real environments is often tedious.
This is the shortcoming: training and evaluating in real environments is tedious.
We evaluate our method for the following tasks: (1) Target generalization, where the goal is to navigate to targets that have not been used during training within a scene. (2) Scene generalization, where the goal is to navigate to targets in scenes not used for training. (3) Real-world generalization, where we demonstrate navigation to targets using a real robot.
These are the evaluation goals.
In summary, …
This is the summary.
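On RELATED WORK
The related work section surveys many existing techniques and methods. Map-based methods require a map, and the authors point out that their method does not. Work on map-less navigation mostly concentrates on obstacle avoidance given an input image; this paper focuses on map-less navigation and does not rely on 3D reconstruction either.
It then turns to reinforcement learning, mentioning the helicopter work and the pioneering DRL work on playing Atari games. Compared with those, this paper's inputs are more complex, and its model can transfer knowledge.
Next the paper introduces the experimental platform, that is, the environment the experiments need, which amounts to describing the experimental setting.
The benefit of a simulation system is that it is scalable: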
The advantage of using a physics engine for modeling the world is that it is highly scalable (training a robot in real houses is not easily scalable).
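The paper then describes the target-driven model and the framework used for it, the deep siamese actor-critic network.
- A is the problem statement. It analyzes the scenario and defines the model's role: learning a mapping from 2D images to actions in the 3D scene.
- B is the problem formulation. It discusses generalization and adds that new targets do not require retraining.
- C is the learning setup. The key point is the model.
- D is the model. Their model embeds the inputs with some kind of layer, and the design lets the scenes share one layer and the targets share one layer. I do not understand this part yet.
- E is the training method.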
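To make item D a little more concrete, here is a minimal sketch of the general siamese actor-critic idea. It is not the authors' implementation. The layer sizes, the class name `SiameseActorCritic`, and the random weights are all assumptions made for illustration. The point it tries to show is that the current observation and the target pass through the same embedding weights, the two embeddings are fused, and the fused vector feeds a policy head and a value head. Because the target is an input rather than something baked into the weights, switching targets does not require a new network.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights for a bias-free dense layer (hypothetical initialisation)."""
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]


def dense(weights, x, relu=True):
    """Apply a dense layer, optionally followed by ReLU."""
    out = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return [max(0.0, o) for o in out] if relu else out


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


class SiameseActorCritic:
    """Toy siamese actor-critic; all dimensions are illustrative assumptions."""

    def __init__(self, feat_dim=8, embed_dim=4, n_actions=4, seed=0):
        rng = random.Random(seed)
        # One set of embedding weights, used by BOTH branches (the siamese part).
        self.embed = make_layer(feat_dim, embed_dim, rng)
        # Fusion layer that combines the two embeddings.
        self.fuse = make_layer(2 * embed_dim, embed_dim, rng)
        # Actor (policy) and critic (value) heads.
        self.policy_head = make_layer(embed_dim, n_actions, rng)
        self.value_head = make_layer(embed_dim, 1, rng)

    def forward(self, observation, target):
        obs_emb = dense(self.embed, observation)  # shared weights
        tgt_emb = dense(self.embed, target)       # same shared weights
        joint = dense(self.fuse, obs_emb + tgt_emb)  # list concatenation, then fuse
        policy = softmax(dense(self.policy_head, joint, relu=False))
        value = dense(self.value_head, joint, relu=False)[0]
        return policy, value


# Usage: the same network is queried with two different targets.
net = SiameseActorCritic()
data_rng = random.Random(1)
obs = [data_rng.random() for _ in range(8)]
goal_a = [data_rng.random() for _ in range(8)]
goal_b = [data_rng.random() for _ in range(8)]
policy_a, value_a = net.forward(obs, goal_a)
policy_b, value_b = net.forward(obs, goal_b)
```

The sketch uses a single pair of heads and makes no claim about how the paper arranges its layers across scenes. It only illustrates weight sharing between the observation branch and the target branch.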