PyTorch 使用自定义 C ++类扩展 TorchScript

原文： https://pytorch.org/tutorials/advanced/torch_script_custom_classes.html

本教程是自定义运算符教程的后续教程，并介绍了我们为将 C ++类同时绑定到 TorchScript 和 Python 而构建的 API。该 API 与 pybind11 非常相似，如果您熟悉该系统，则大多数概念都将转移过来。

在 C ++中实现和绑定类

在本教程中，我们将定义一个简单的 C ++类，该类在成员变量中保持持久状态。

// This header is all you need to do the C++ portions of this
// tutorial
#include <torch/script.h>
// This header is what defines the custom class registration
// behavior specifically. script.h already includes this, but
// we include it here so you know it exists in case you want
// to look at the API or implementation.
#include <torch/custom_class.h>


#include <string>
#include <vector>


template <class T>
struct Stack : torch::jit::CustomClassHolder {
  std::vector<T> stack_;
  Stack(std::vector<T> init) : stack_(init.begin(), init.end()) {}


  void push(T x) {
    stack_.push_back(x);
  }
  T pop() {
    auto val = stack_.back();
    stack_.pop_back();
    return val;
  }


  c10::intrusive_ptr<Stack> clone() const {
    return c10::make_intrusive<Stack>(stack_);
  }


  void merge(const c10::intrusive_ptr<Stack>& c) {
    for (auto& elem : c->stack_) {
      push(elem);
    }
  }
};

有几件事要注意：

torch/custom_class.h是您需要使用自定义类扩展 TorchScript 的标头。
注意，无论何时使用自定义类的实例，我们都通过c10::intrusive_ptr<>的实例来实现。将intrusive_ptr视为类似于std::shared_ptr的智能指针。使用此智能指针的原因是为了确保在语言(C ++，Python 和 TorchScript）之间对对象实例进行一致的生命周期管理。
注意的第二件事是用户定义的类必须继承自torch::jit::CustomClassHolder。这确保了所有设置都可以处理前面提到的生命周期管理系统。

现在让我们看一下如何使该类对 TorchScript 可见，该过程称为绑定该类：

// Notice a few things:
// - We pass the class to be registered as a template parameter to
//   `torch::jit::class_`. In this instance, we've passed the
//   specialization of the Stack class ``Stack<std::string>``.
//   In general, you cannot register a non-specialized template
//   class. For non-templated classes, you can just pass the
//   class name directly as the template parameter.
// - The single parameter to ``torch::jit::class_()`` is a
//   string indicating the name of the class. This is the name
//   the class will appear as in both Python and TorchScript.
//   For example, our Stack class would appear as ``torch.classes.Stack``.
static auto testStack =
  torch::jit::class_<Stack<std::string>>("Stack")
      // The following line registers the contructor of our Stack
      // class that takes a single `std::vector<std::string>` argument,
      // i.e. it exposes the C++ method `Stack(std::vector<T> init)`.
      // Currently, we do not support registering overloaded
      // constructors, so for now you can only `def()` one instance of
      // `torch::jit::init`.
      .def(torch::jit::init<std::vector<std::string>>())
      // The next line registers a stateless (i.e. no captures) C++ lambda
      // function as a method. Note that a lambda function must take a
      // `c10::intrusive_ptr<YourClass>` (or some const/ref version of that)
      // as the first argument. Other arguments can be whatever you want.
      .def("top", [](const c10::intrusive_ptr<Stack<std::string>>& self) {
        return self->stack_.back();
      })
      // The following four lines expose methods of the Stack<std::string>
      // class as-is. `torch::jit::class_` will automatically examine the
      // argument and return types of the passed-in method pointers and
      // expose these to Python and TorchScript accordingly. Finally, notice
      // that we must take the *address* of the fully-qualified method name,
      // i.e. use the unary `&` operator, due to C++ typing rules.
      .def("push", &Stack<std::string>::push)
      .def("pop", &Stack<std::string>::pop)
      .def("clone", &Stack<std::string>::clone)
      .def("merge", &Stack<std::string>::merge);

使用 CMake 将示例构建为 C ++项目

现在，我们将使用 CMake 构建系统来构建上述 C ++代码。首先，将到目前为止介绍的所有 C ++代码放入class.cpp/文件中。然后，编写一个简单的CMakeLists.txt文件并将其放置在同一目录中。 CMakeLists.txt的外观如下：

cmake_minimum_required(VERSION 3.1 FATAL_ERROR)
project(custom_class)


find_package(Torch REQUIRED)


## Define our library target
add_library(custom_class SHARED class.cpp)
set(CMAKE_CXX_STANDARD 14)
## Link against LibTorch
target_link_libraries(custom_class "${TORCH_LIBRARIES}")

另外，创建一个build目录。您的文件树应如下所示：

custom_class_project/
  class.cpp
  CMakeLists.txt
  build/

现在，要构建项目，请继续从 PyTorch 网站下载适当的 libtorch 二进制文件。将 zip 存档解压缩到某个位置(在项目目录中可能很方便），并记下将其解压缩到的路径。接下来，继续调用 cmake，然后进行构建项目：

$ cd build
$ cmake -DCMAKE_PREFIX_PATH=/path/to/libtorch ..
  -- The C compiler identification is GNU 7.3.1
  -- The CXX compiler identification is GNU 7.3.1
  -- Check for working C compiler: /opt/rh/devtoolset-7/root/usr/bin/cc
  -- Check for working C compiler: /opt/rh/devtoolset-7/root/usr/bin/cc -- works
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- Check for working CXX compiler: /opt/rh/devtoolset-7/root/usr/bin/c++
  -- Check for working CXX compiler: /opt/rh/devtoolset-7/root/usr/bin/c++ -- works
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Looking for pthread.h
  -- Looking for pthread.h - found
  -- Looking for pthread_create
  -- Looking for pthread_create - not found
  -- Looking for pthread_create in pthreads
  -- Looking for pthread_create in pthreads - not found
  -- Looking for pthread_create in pthread
  -- Looking for pthread_create in pthread - found
  -- Found Threads: TRUE
  -- Found torch: /torchbind_tutorial/libtorch/lib/libtorch.so
  -- Configuring done
  -- Generating done
  -- Build files have been written to: /torchbind_tutorial/build
$ make -j
  Scanning dependencies of target custom_class
  [ 50%] Building CXX object CMakeFiles/custom_class.dir/class.cpp.o
  [100%] Linking CXX shared library libcustom_class.so
  [100%] Built target custom_class

您会发现，构建目录中现在有一个动态库文件。在 Linux 上，它可能名为libcustom_class.so。因此，文件树应如下所示：

custom_class_project/
  class.cpp
  CMakeLists.txt
  build/
    libcustom_class.so

从 Python 和 TorchScript 使用 C ++类

现在我们已经将我们的类及其注册编译为.so文件，我们可以将 <cite>.so</cite> 加载到 Python 中并进行尝试。这是一个演示脚本的脚本：

import torch


## `torch.classes.load_library()` allows you to pass the path to your .so file
## to load it in and make the custom C++ classes available to both Python and
## TorchScript
torch.classes.load_library("libcustom_class.so")
## You can query the loaded libraries like this:
print(torch.classes.loaded_libraries)
## prints {'/custom_class_project/build/libcustom_class.so'}


## We can find and instantiate our custom C++ class in python by using the
## `torch.classes` namespace:
## ## This instantiation will invoke the Stack(std::vector<T> init) constructor
## we registered earlier
s = torch.classes.Stack(["foo", "bar"])


## We can call methods in Python
s.push("pushed")
assert s.pop() == "pushed"


## Returning and passing instances of custom classes works as you'd expect
s2 = s.clone()
s.merge(s2)
for expected in ["bar", "foo", "bar", "foo"]:
    assert s.pop() == expected


## We can also use the class in TorchScript
## For now, we need to assign the class's type to a local in order to
## annotate the type on the TorchScript function. This may change
## in the future.
Stack = torch.classes.Stack


@torch.jit.script
def do_stacks(s : Stack): # We can pass a custom class instance to TorchScript
    s2 = torch.classes.Stack(["hi", "mom"]) # We can instantiate the class
    s2.merge(s) # We can call a method on the class
    return s2.clone(), s2.top()  # We can also return instances of the class
                                 # from TorchScript function/methods


stack, top = do_stacks(torch.classes.Stack(["wow"]))
assert top == "wow"
for expected in ["wow", "mom", "hi"]:
    assert stack.pop() == expected

使用自定义类保存，加载和运行 TorchScript 代码

我们也可以在使用 libtorch 的 C ++进程中使用自定义注册的 C ++类。举例来说，让我们定义一个简单的nn.Module，该实例在我们的 Stack 类上实例化并调用一个方法：

import torch


torch.classes.load_library('libcustom_class.so')


class Foo(torch.nn.Module):
    def __init__(self):
        super().__init__()


    def forward(self, s : str) -> str:
        stack = torch.classes.Stack(["hi", "mom"])
        return stack.pop() + s


scripted_foo = torch.jit.script(Foo())
print(scripted_foo.graph)


scripted_foo.save('foo.pt')

我们文件系统中的foo.pt现在包含我们刚刚定义的序列化 TorchScript 程序。

现在，我们将定义一个新的 CMake 项目，以展示如何加载此模型及其所需的.so 文件。有关如何执行此操作的完整说明，请查看在 C ++教程中加载 TorchScript 模型。

与之前类似，让我们创建一个包含以下内容的文件结构：

cpp_inference_example/
  infer.cpp
  CMakeLists.txt
  foo.pt
  build/
  custom_class_project/
    class.cpp
    CMakeLists.txt
    build/

请注意，我们已经复制了序列化的foo.pt文件以及上面custom_class_project的源代码树。我们将添加custom_class_project作为对此 C ++项目的依赖项，以便我们可以将自定义类构建到二进制文件中。

让我们用以下内容填充infer.cpp：

#include <torch/script.h>


#include <iostream>
#include <memory>


int main(int argc, const char* argv[]) {
  torch::jit::script::Module module;
  try {
    // Deserialize the ScriptModule from a file using torch::jit::load().
    module = torch::jit::load("foo.pt");
  }
  catch (const c10::Error& e) {
    std::cerr << "error loading the model\n";
    return -1;
  }


  std::vector<c10::IValue> inputs = {"foobarbaz"};
  auto output = module.forward(inputs).toString();
  std::cout << output->string() << std::endl;
}

同样，让我们定义我们的 CMakeLists.txt 文件：

cmake_minimum_required(VERSION 3.1 FATAL_ERROR)
project(infer)


find_package(Torch REQUIRED)


add_subdirectory(custom_class_project)


## Define our library target
add_executable(infer infer.cpp)
set(CMAKE_CXX_STANDARD 14)
## Link against LibTorch
target_link_libraries(infer "${TORCH_LIBRARIES}")
## This is where we link in our libcustom_class code, making our
## custom class available in our binary.
target_link_libraries(infer -Wl,--no-as-needed custom_class)

您知道练习：cd build，cmake和make：

$ cd build
$ cmake -DCMAKE_PREFIX_PATH=/path/to/libtorch ..
  -- The C compiler identification is GNU 7.3.1
  -- The CXX compiler identification is GNU 7.3.1
  -- Check for working C compiler: /opt/rh/devtoolset-7/root/usr/bin/cc
  -- Check for working C compiler: /opt/rh/devtoolset-7/root/usr/bin/cc -- works
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- Check for working CXX compiler: /opt/rh/devtoolset-7/root/usr/bin/c++
  -- Check for working CXX compiler: /opt/rh/devtoolset-7/root/usr/bin/c++ -- works
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Looking for pthread.h
  -- Looking for pthread.h - found
  -- Looking for pthread_create
  -- Looking for pthread_create - not found
  -- Looking for pthread_create in pthreads
  -- Looking for pthread_create in pthreads - not found
  -- Looking for pthread_create in pthread
  -- Looking for pthread_create in pthread - found
  -- Found Threads: TRUE
  -- Found torch: /local/miniconda3/lib/python3.7/site-packages/torch/lib/libtorch.so
  -- Configuring done
  -- Generating done
  -- Build files have been written to: /cpp_inference_example/build
$ make -j
  Scanning dependencies of target custom_class
  [ 25%] Building CXX object custom_class_project/CMakeFiles/custom_class.dir/class.cpp.o
  [ 50%] Linking CXX shared library libcustom_class.so
  [ 50%] Built target custom_class
  Scanning dependencies of target infer
  [ 75%] Building CXX object CMakeFiles/infer.dir/infer.cpp.o
  [100%] Linking CXX executable infer
  [100%] Built target infer

现在我们可以运行令人兴奋的 C ++二进制文件：

$ ./infer
  momfoobarbaz

难以置信！

定义自定义 C ++类的序列化/反序列化方法

如果您尝试将具有自定义绑定 C ++类的ScriptModule保存为属性，则会出现以下错误：

# export_attr.py
import torch


torch.classes.load_library('libcustom_class.so')


class Foo(torch.nn.Module):
  def __init__(self):
      super().__init__()
      self.stack = torch.classes.Stack(["just", "testing"])


  def forward(self, s : str) -> str:
      return self.stack.pop() + s


scripted_foo = torch.jit.script(Foo())


scripted_foo.save('foo.pt')

$ python export_attr.py
RuntimeError: Cannot serialize custom bound C++ class __torch__.torch.classes.Stack. Please define serialization methods via torch::jit::pickle_ for this class. (pushIValueImpl at ../torch/csrc/jit/pickler.cpp:128)

这是因为 TorchScript 无法自动找出 C ++类中保存的信息。您必须手动指定。这样做的方法是使用class_上的特殊def_pickle方法在类上定义__getstate__和__setstate__方法。

注意

TorchScript 中__getstate__和__setstate__的语义与 Python pickle 模块的语义相同。您可以阅读更多有关如何使用这些方法的信息。

这是一个如何更新Stack类的注册码以包含序列化方法的示例：

static auto testStack =
  torch::jit::class_<Stack<std::string>>("Stack")
      .def(torch::jit::init<std::vector<std::string>>())
      .def("top", [](const c10::intrusive_ptr<Stack<std::string>>& self) {
        return self->stack_.back();
      })
      .def("push", &Stack<std::string>::push)
      .def("pop", &Stack<std::string>::pop)
      .def("clone", &Stack<std::string>::clone)
      .def("merge", &Stack<std::string>::merge)
      // class_<>::def_pickle allows you to define the serialization
      // and deserialization methods for your C++ class.
      // Currently, we only support passing stateless lambda functions
      // as arguments to def_pickle
      .def_pickle(
            // __getstate__
            // This function defines what data structure should be produced
            // when we serialize an instance of this class. The function
            // must take a single `self` argument, which is an intrusive_ptr
            // to the instance of the object. The function can return
            // any type that is supported as a return value of the TorchScript
            // custom operator API. In this instance, we've chosen to return
            // a std::vector<std::string> as the salient data to preserve
            // from the class.
            [](const c10::intrusive_ptr<Stack<std::string>>& self)
                -> std::vector<std::string> {
              return self->stack_;
            },
            // __setstate__
            // This function defines how to create a new instance of the C++
            // class when we are deserializing. The function must take a
            // single argument of the same type as the return value of
            // `__getstate__`. The function must return an intrusive_ptr
            // to a new instance of the C++ class, initialized however
            // you would like given the serialized state.
            [](std::vector<std::string> state)
                -> c10::intrusive_ptr<Stack<std::string>> {
              // A convenient way to instantiate an object and get an
              // intrusive_ptr to it is via `make_intrusive`. We use
              // that here to allocate an instance of Stack<std::string>
              // and call the single-argument std::vector<std::string>
              // constructor with the serialized state.
              return c10::make_intrusive<Stack<std::string>>(std::move(state));
            });

注意

我们采用与 pickle API 中的 pybind11 不同的方法。 pybind11 作为传递给class_::def()的特殊功能pybind11::pickle()，为此我们有一个单独的方法def_pickle。这是因为名称torch::jit::pickle已经被使用，我们不想引起混淆。

以这种方式定义(反）序列化行为后，脚本现在可以成功运行：

import torch


torch.classes.load_library('libcustom_class.so')


class Foo(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.stack = torch.classes.Stack(["just", "testing"])


    def forward(self, s : str) -> str:
        return self.stack.pop() + s


scripted_foo = torch.jit.script(Foo())


scripted_foo.save('foo.pt')
loaded = torch.jit.load('foo.pt')


print(loaded.stack.pop())

$ python ../export_attr.py
testing

结论

本教程向您介绍了如何向 TorchScript(以及扩展为 Python）公开 C ++类，如何注册其方法，如何从 Python 和 TorchScript 使用该类以及如何使用该类保存和加载代码以及运行该代码。在独立的 C ++过程中。现在，您可以使用与第三方 C ++库接口的 C ++类扩展 TorchScript 模型，或实现需要 Python，TorchScript 和 C ++之间的界线才能平滑融合的任何其他用例。

与往常一样，如果您遇到任何问题或疑问，可以使用我们的论坛或 GitHub 问题进行联系。另外，我们的常见问题解答(FAQ）页面可能包含有用的信息。

w3cschool 编程狮，随时随地学编程

PyTorch 使用自定义 C ++类扩展 TorchScript

在 C ++中实现和绑定类

使用 CMake 将示例构建为 C ++项目

从 Python 和 TorchScript 使用 C ++类

使用自定义类保存，加载和运行 TorchScript 代码

定义自定义 C ++类的序列化/反序列化方法

结论

PyTorch 入门

PyTorch深度学习

PyTorch 图片

Pytorch 音频

Pytorch 文本

PyTorch 命名为 Tensor(实验性）

PyTorch 强化学习

PyTorch 在生产中部署 PyTorch 模型

PyTorch 并行和分布式训练

PyTorch 扩展

PyTorch 模型优化

PyTorch 用其他语言

PyTorch 基础知识

PyTorch 笔记

PyTorch 语言绑定

Python API

PyTorch torchvision参考

PyTorch 音频参考

PyTorch 社区