
ShareDataWith and ShareBufferWith are prohibited in OP

Tao Luo edited this page Dec 24, 2019 · 1 revision

Ops must not call the ShareDataWith or ShareBufferWith methods


ShareDataWith and ShareBufferWith are prohibited in OP (English Version)


Rule overview

  • Section 1: Background
  • Section 2: Replacing ShareDataWith or ShareBufferWith calls
  • Section 3: CI check notes

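Note: while this rule is being enforced, cases may come up that it does not yet cover. The rule will be supplemented and refined as it is put into practice, so please send feedback.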

Background

Currently, some developers call Tensor::ShareDataWith on an Op's Output and Input inside the Op, i.e. Output = ShareDataWith(Input). This can cause the following errors:

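  • The call effectively creates a hidden edge in the operator graph that connects Input and Output. Graph analysis cannot represent this edge, which leads to errors in graph-based optimizations;
  • The call also amounts to an in-place operation inside the Op, which may cause problems when GPU memory is released. (For details, see the explanation on the official website.)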

The framework automatically checks which Ops can run in place and which cannot. For this reason, calling Tensor::ShareDataWith or Tensor::ShareBufferWith on Output/Input inside an Op is prohibited.
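
The hidden-edge problem described above can be illustrated with a minimal sketch. ToyTensor below is a hypothetical stand-in written for this example, not Paddle's framework::Tensor; it only assumes that sharing means both tensors end up holding the same buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy stand-in for a tensor (hypothetical; NOT Paddle's framework::Tensor).
class ToyTensor {
 public:
  explicit ToyTensor(std::size_t n = 0)
      : buf_(std::make_shared<std::vector<float>>(n, 0.0f)) {}

  // Mimics the aliasing behaviour: afterwards both tensors hold the same buffer.
  void ShareDataWith(const ToyTensor& src) { buf_ = src.buf_; }

  float* data() { return buf_->data(); }
  long owners() const { return buf_.use_count(); }

 private:
  std::shared_ptr<std::vector<float>> buf_;
};

// Returns true when a write through the "output" is visible through the "input".
bool WriteLeaksIntoInput() {
  ToyTensor in(4);
  ToyTensor out(4);
  out.ShareDataWith(in);  // hidden link: nothing outside this function records it
  out.data()[0] = 42.0f;  // a write to the output...
  // ...changes the input too, and the buffer now has two owners.
  return in.data()[0] == 42.0f && in.owners() == 2;
}
```

Calling WriteLeaksIntoInput() from a main function returns true. A pass that only sees "in" and "out" as separate variables has no way to learn that freeing or overwriting one affects the other.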

Replacing ShareDataWith or ShareBufferWith calls

Some Op files call ShareDataWith, for example lod_reset_op.h. Only the code related to ShareDataWith is shown here.

template <typename DeviceContext, typename T>
class LoDResetGradKernel : public framework::OpKernel<T> {
 public:
  void Compute(const framework::ExecutionContext& ctx) const {
    auto* d_out = ctx.Input<framework::Tensor>(framework::GradVarName("Out"));
    auto* d_x = ctx.Output<framework::Tensor>(framework::GradVarName("X"));

    // Prohibited: d_x now aliases d_out's buffer, a link the graph cannot see.
    d_x->ShareDataWith(*d_out);
  }
};

Because calling Tensor::ShareDataWith or Tensor::ShareBufferWith inside an Op is prohibited, framework::TensorCopy can be used as a replacement. The code above can be changed as follows:

template <typename DeviceContext, typename T>
class LoDResetGradKernel : public framework::OpKernel<T> {
 public:
  void Compute(const framework::ExecutionContext& ctx) const {
    auto* d_out = ctx.Input<framework::Tensor>(framework::GradVarName("Out"));
    auto* d_x = ctx.Output<framework::Tensor>(framework::GradVarName("X"));

    // Copy d_out into d_x on the same place, so d_x gets its own memory.
    framework::TensorCopy(*d_out, d_out->place(), d_x);
  }
};
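
As a rough illustration of why a copy avoids the aliasing problem, the sketch below uses a hypothetical ToyTensor and ToyTensorCopy (both invented for this example; neither is part of Paddle) to show that the destination ends up with independent storage.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a tensor (hypothetical; NOT Paddle's framework::Tensor).
struct ToyTensor {
  std::vector<float> values;
};

// Hypothetical analogue of a tensor copy: dst receives its own storage.
void ToyTensorCopy(const ToyTensor& src, ToyTensor* dst) {
  dst->values.resize(src.values.size());
  std::copy(src.values.begin(), src.values.end(), dst->values.begin());
}

// Returns true when modifying the copy leaves the source untouched.
bool CopyKeepsBuffersIndependent() {
  ToyTensor d_out{{1.0f, 2.0f, 3.0f}};
  ToyTensor d_x;
  ToyTensorCopy(d_out, &d_x);
  d_x.values[0] = -1.0f;  // write to the copy only
  return d_out.values[0] == 1.0f &&                     // source unchanged
         d_x.values[1] == 2.0f &&                       // contents were copied
         d_x.values.data() != d_out.values.data();      // separate buffers
}
```

The trade-off is extra memory and a copy of the data, in exchange for inputs and outputs that can be analysed and released independently.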

CI check notes

This rule is currently checked in PR_CI_CPU_Py2. If a modified Op calls ShareDataWith or ShareBufferWith, the check fails and the BuildLog shows an error message similar to the following:

****************
Using ShareDataWith or ShareBufferWith is not recommended. You must have one RD's (zhhsplendid (Recommend), sneaxiy or luotao1 or lanxianghit) approval to use these methods. For more information, please refer to https://github.com/PaddlePaddle/Paddle/wiki/ShareDataWith-is-prohibited-in-OP. For more information, please refer to https://github.com/PaddlePaddle/Paddle/wiki/ShareDataWith-is-prohibited-in-OP. The error lines are as follows:
paddle/fluid/operators/fill_op.h
+      tensor.ShareDataWith(out)

There are 1 approved errors.
****************

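Modify the code as indicated by the error message so that it complies with the rule that Tensor::ShareDataWith and Tensor::ShareBufferWith must not be called inside an Op. If you are certain the method must be called inside the Op, ask one of the approvers (listed in the CI BuildLog) to review the change. At least one approval is required.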
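
The actual CI script is not shown on this page. As a rough sketch of the kind of check it performs, the hypothetical function below (FindProhibitedCalls, a name invented for this example) scans diff lines and reports added lines that mention either method, assuming unified-diff input where added lines start with "+".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch of a diff scan; not the real PR_CI_CPU_Py2 check.
// Returns every added line that mentions a prohibited method name.
std::vector<std::string> FindProhibitedCalls(
    const std::vector<std::string>& diff_lines) {
  static const char* const kProhibited[] = {"ShareDataWith", "ShareBufferWith"};
  std::vector<std::string> hits;
  for (const std::string& line : diff_lines) {
    // Only newly added lines count; skip "+++ b/file" headers.
    if (line.empty() || line[0] != '+' || line.compare(0, 3, "+++") == 0) {
      continue;
    }
    for (const char* name : kProhibited) {
      if (line.find(name) != std::string::npos) {
        hits.push_back(line);
        break;  // report each line once
      }
    }
  }
  return hits;
}
```

Running a scan like this on a local diff before pushing can show in advance whether a change will need approval.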

Feedback

If you run into problems, please contact @guofei.
