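Conversion tools for TensorFlow or ONNX ANNs to CZANN
Project description
This project provides easy-to-use conversion tools that generate a CZANN file from a TensorFlow or ONNX model residing in memory or on disk. The resulting file can be used in the ZEN Intellesis module starting with ZEN blue >=3.2 and ZEN Core >3.0.
Please check the following compatibility matrix for ZEN Blue/Core and the respective version (self.version) of the CZANN Model Specification JSON metadata file (see CZANN Model Specification below). Version compatibility is defined via the Semantic Versioning Specification (SemVer).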
| Model (legacy)/JSON | ZEN Blue | ZEN Core |
|---|---|---|
| 1.1.0 | >= 3.5 | >= 3.4 |
| 1.0.0 | >= 3.5 | >= 3.4 |
| 3.1.0 (legacy) | >= 3.4 | >= 3.3 |
| 3.0.0 (legacy) | >= 3.2 | >= 3.1 |
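If you encounter a version mismatch when importing a model into ZEN, please check that you are using the correct version of this package.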
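As a rough illustration of how the matrix can be read, the following sketch encodes the table rows as minimum versions and compares a ZEN release against them. The names `COMPATIBILITY`, `parse_version` and `is_supported` are hypothetical helpers written for this example; they are not part of czmodel.

```python
# Minimum ZEN versions per model/JSON version, copied from the compatibility matrix.
# Hypothetical helper code for illustration only; czmodel does not ship this.
COMPATIBILITY = {
    "1.1.0": {"blue": (3, 5), "core": (3, 4)},
    "1.0.0": {"blue": (3, 5), "core": (3, 4)},
    "3.1.0 (legacy)": {"blue": (3, 4), "core": (3, 3)},
    "3.0.0 (legacy)": {"blue": (3, 2), "core": (3, 1)},
}


def parse_version(text):
    """Turn a version string such as '3.5' or '3.5.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_supported(model_version, product, zen_version):
    """Return True if the ZEN release meets the minimum listed for the model version.

    product is 'blue' or 'core'. Only as many version components as the
    minimum has (major and minor) take part in the comparison.
    """
    minimum = COMPATIBILITY[model_version][product]
    return parse_version(zen_version)[: len(minimum)] >= minimum
```

For example, `is_supported("1.1.0", "core", "3.3")` is `False` because the matrix requires ZEN Core 3.4 or later for that model version.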
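Structure of the repo
This repo is divided into 3 separate packages: core, tensorflow, and pytorch.
- Core - Provides the base functionality and does not depend on TensorFlow or PyTorch.
- TensorFlow - Provides TensorFlow-specific functionality and converters based on TensorFlow logic.
- PyTorch - Provides PyTorch-specific functionality and converters based on PyTorch logic.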
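Installation
The library provides a base package and extras for export functionality that requires specific dependencies:
- pip install czmodel - Installs only the base dependencies, without any TensorFlow- or PyTorch-specific packages.
- pip install czmodel[tensorflow] - Installs the base and TensorFlow-specific packages.
- pip install czmodel[pytorch] - Installs the base and PyTorch-specific packages.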
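To see which extras are usable in the current environment, one option is to probe for the backend packages without importing them. This is a minimal sketch that assumes the extras pull in the top-level packages `tensorflow` and `torch`; `available_backends` is a hypothetical helper, not a czmodel function.

```python
from importlib.util import find_spec


def available_backends():
    """Report which deep learning backends can be found on the import path.

    find_spec only locates the package; it does not import it, so the check
    is cheap even when TensorFlow or PyTorch are installed.
    """
    return {name: find_spec(name) is not None for name in ("tensorflow", "torch")}


if __name__ == "__main__":
    for backend, found in available_backends().items():
        print(f"{backend}: {'installed' if found else 'not installed'}")
```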
Samples
For czmodel[pytorch]:
For czmodel[tensorflow]:
System setup
The current version of this toolbox only requires a fresh Python 3.x installation. It was tested with Python 3.7 on Windows.
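Model conversion
The toolbox provides a convert module that contains all supported conversion strategies. It currently supports converting Keras / PyTorch models that are either held in memory or stored on disk together with a corresponding metadata JSON file (see CZANN Model Specification below).
Keras / PyTorch models in memory
The toolbox also provides functionality that can be imported, for example in the training script used to fit a Keras / PyTorch model. It provides different converters to target specific versions of the export format. Currently, two converters are available:
- DefaultConverter: Exports a .czann file that complies with the specification below.
- LegacyConverter (segmentation only): Exports a .czmodel file (version 3.1.0 of the legacy ANN-based segmentation models in ZEN).
The converters can be accessed by running:
For Keras models: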
from czmodel.tensorflow.convert import DefaultConverter, LegacyConverter
For PyTorch models:
from czmodel.pytorch.convert import DefaultConverter, LegacyConverter
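Every converter provides a convert_from_model_spec function that uses a model specification object to convert a model to the corresponding export format. It accepts a tensorflow.keras.Model / torch.nn.Module, which is exported to ONNX format (for Keras models, SavedModel is used as a fallback if the ONNX export fails) and at the same time wrapped into a .czann/.czmodel file that can be imported and used by Intellesis.
To supply the metadata, the toolbox provides a ModelSpec class that must be filled with the model, a ModelMetadata instance containing the information required by the specification (see Model Metadata below), and optionally a license file.
A CZANN/CZMODEL can be created from a Keras / PyTorch model in the following three steps.
1. Create a model metadata class
Exporting a CZANN requires meta information, which must be provided through a ModelMetadata instance.
For segmentation: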
from czmodel.core.model_metadata import ModelMetadata, ModelType
model_metadata = ModelMetadata(
input_shape=[1024, 1024, 3],
output_shape=[1024, 1024, 5],
model_type=ModelType.SINGLE_CLASS_SEMANTIC_SEGMENTATION,
classes=["class1", "class2", "class3", "class4", "class5"],
model_name="ModelName",
min_overlap=[90, 90]
)
For regression:
from czmodel.core.model_metadata import ModelMetadata, ModelType
model_metadata = ModelMetadata(
input_shape=[1024, 1024, 3],
output_shape=[1024, 1024, 3],
model_type=ModelType.REGRESSION,
model_name="ModelName",
min_overlap=[90, 90]
)
For legacy CZMODEL models, the legacy ModelMetadata must be used:
from czmodel.core.legacy_model_metadata import ModelMetadata as LegacyModelMetadata
model_metadata_legacy = LegacyModelMetadata(
name="Simple_Nuclei_SegmentationModel_Legacy",
classes=["class1", "class2"],
pixel_types="Bgr24",
color_handling="ConvertToMonochrome",
border_size=90,
)
2. Create a model specification
The model and its corresponding metadata are now wrapped into a ModelSpec object.
from czmodel.tensorflow.model_spec import ModelSpec # for czmodel[tensorflow]
#from czmodel.pytorch.model_spec import ModelSpec # for czmodel[pytorch]
model_spec = ModelSpec(
model=model,
model_metadata=model_metadata,
license_file="C:\\some\\path\\to\\a\\LICENSE.txt"
)
The corresponding model spec for legacy models is instantiated analogously.
from czmodel.tensorflow.legacy_model_spec import ModelSpec as LegacyModelSpec # for czmodel[tensorflow]
#from czmodel.pytorch.legacy_model_spec import ModelSpec as LegacyModelSpec # for czmodel[pytorch]
legacy_model_spec = LegacyModelSpec(
model=model,
model_metadata=model_metadata_legacy,
license_file="C:\\some\\path\\to\\a\\LICENSE.txt"
)
3. Convert the model
The actual model conversion is finally performed with the ModelSpec object and the output path and name of the CZANN.
from czmodel.tensorflow.convert import DefaultConverter as TensorflowDefaultConverter # for czmodel[tensorflow]
TensorflowDefaultConverter().convert_from_model_spec(model_spec=model_spec, output_path='some/path', output_name='some_file_name')
from czmodel.pytorch.convert import DefaultConverter as PytorchDefaultConverter # for czmodel[pytorch]
PytorchDefaultConverter().convert_from_model_spec(model_spec=model_spec, output_path='some/path', output_name='some_file_name', input_shape=(3, 1024, 1024))
For legacy models, the interface is similar.
from czmodel.tensorflow.convert import LegacyConverter as TensorflowLegacyConverter # for czmodel[tensorflow]
TensorflowLegacyConverter().convert_from_model_spec(model_spec=legacy_model_spec, output_path='some/path',
                                                    output_name='some_file_name')
from czmodel.pytorch.convert import LegacyConverter as PytorchLegacyConverter # for czmodel[pytorch]
PytorchLegacyConverter().convert_from_model_spec(model_spec=legacy_model_spec, output_path='some/path',
output_name='some_file_name', input_shape=(3, 1024, 1024))
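In the calls above, the PyTorch converters receive input_shape=(3, 1024, 1024), while the metadata lists the input shape as [1024, 1024, 3]. Assuming the converter argument is simply the channels-first form of the metadata shape, the value can be derived instead of typed twice. `hwc_to_chw` is a hypothetical helper for this sketch and not part of czmodel.

```python
def hwc_to_chw(shape):
    """Move the trailing channel dimension to the front.

    [h, w, c] becomes (c, h, w); a 3D shape [z, h, w, c] becomes (c, z, h, w).
    """
    *spatial, channels = shape
    return (channels, *spatial)


# Shape as written in the ModelMetadata examples (channels last).
metadata_input_shape = [1024, 1024, 3]
# Shape in the form passed to the PyTorch converters in the examples (channels first).
pytorch_input_shape = hwc_to_chw(metadata_input_shape)
```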
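Exported TensorFlow / PyTorch models
Not all TensorFlow / PyTorch models can be converted. A model exported from TensorFlow / PyTorch can be converted if the model and the provided metadata comply with the CZANN Model Specification below.
The actual conversion is triggered by either of the following calls: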
from czmodel.tensorflow.convert import DefaultConverter as TensorflowDefaultConverter # for czmodel[tensorflow]
TensorflowDefaultConverter().convert_from_json_spec('Path to JSON file', 'Output path', 'Model Name')
from czmodel.pytorch.convert import DefaultConverter as PytorchDefaultConverter # for czmodel[pytorch]
PytorchDefaultConverter().convert_from_json_spec('Path to JSON file', 'Output path', (3, 1024, 1024), 'Model Name')
or by using the command line interface of the savedmodel2czann script (Keras models only):
savedmodel2czann path/to/model_spec.json output/path/ output_name --license_file path/to/license_file.txt
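Adding pre- and post-processing layers (Keras models only)
Both convert_from_json_spec and convert_from_model_spec in the converter classes accept the following optional parameters:
- spatial_dims: Sets new spatial dimensions for the new input node of the model. This parameter is expected to contain the new height and width, in that order. Note: The spatial input dimensions can only be changed in ANN architectures that are invariant to the spatial dimensions of the input, e.g. FCNs.
- preprocessing: One or more pre-processing layers that will be prepended to the deployed model. A pre-processing layer must be derived from the tensorflow.keras.layers.Layer class.
- postprocessing: One or more post-processing layers that will be appended to the deployed model. A post-processing layer must be derived from the tensorflow.keras.layers.Layer class.
While ANN models are often trained on images in RGB(A) space, the ZEN infrastructure requires models inside a CZANN to expect inputs in BGR(A) color space. This toolbox offers pre-processing layers that convert the color space before the input is passed to the model that is actually deployed. The following code shows how to add a BGR to RGB conversion layer to a model and set its spatial input dimensions to 512x512.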
from czmodel.tensorflow.util.transforms import TransposeChannels
from czmodel.tensorflow.convert import DefaultConverter
# Define dimensions and pre-processing
spatial_dims = 512, 512 # Optional: Target spatial dimensions of the model
# Optional: Pre-Processing layers to be prepended to the model. Can be a single layer, a list of layers or None.
preprocessing = [TransposeChannels(order=(2, 1, 0))]
postprocessing = None # Optional: Post-Processing layers to be appended to the model. Can be a single layer, a list of layers or None.
# Perform conversion
DefaultConverter().convert_from_model_spec(
model_spec=model_spec,
output_path='some/path',
output_name='some_file_name',
spatial_dims=spatial_dims,
preprocessing=preprocessing,
postprocessing=postprocessing
)
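To make the effect of the channel order (2, 1, 0) concrete, the sketch below applies the same reordering to a tiny image stored as nested Python lists. It only illustrates the index permutation and is not the implementation of TransposeChannels; `transpose_channels` and `transpose_image` are hypothetical names.

```python
def transpose_channels(pixel, order=(2, 1, 0)):
    """Reorder the channels of one pixel; (2, 1, 0) swaps the first and last channel."""
    return tuple(pixel[index] for index in order)


def transpose_image(image, order=(2, 1, 0)):
    """Apply the channel reordering to every pixel of an image given as rows of pixels."""
    return [[transpose_channels(pixel, order) for pixel in row] for row in image]


# A 1x2 image in BGR order: one pure blue pixel and one pure red pixel.
bgr_image = [[(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]]
rgb_image = transpose_image(bgr_image)
```

Because the permutation (2, 1, 0) is its own inverse, applying it a second time restores the original channel order.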
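Additionally, the toolbox offers a SigmoidToSoftmaxScores layer that can be appended through the postprocessing parameter. It converts the outputs of a model with a sigmoid output activation into the outputs that an equivalent model with a softmax activation would produce.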
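The equivalence behind such a conversion can be checked numerically for the binary case: a sigmoid score p for a logit z equals the second entry of softmax([0, z]), so the two-class softmax scores are [1 - p, p]. The sketch below verifies this identity with the standard library. It is an illustration of the math under that binary-case assumption, not the code of the SigmoidToSoftmaxScores layer, and all function names are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function mapping a logit to a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    shift = max(logits)
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def sigmoid_to_two_class_scores(p):
    """Expand one sigmoid score into [background, foreground] scores that sum to 1."""
    return [1.0 - p, p]


def max_difference(z):
    """Largest absolute gap between the expanded sigmoid score and softmax([0, z])."""
    expanded = sigmoid_to_two_class_scores(sigmoid(z))
    reference = softmax([0.0, z])
    return max(abs(a - b) for a, b in zip(expanded, reference))
```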
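Unpacking CZANN/CZSEG files
The czmodel library offers functionality to unpack existing CZANN/CZSEG models. For a given .czann or .czseg model, the underlying ANN model can be extracted to a specified folder, and the corresponding metadata is returned as an instance of the metadata classes defined in the czmodel library.
For CZANN files: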
from czmodel.tensorflow.convert import DefaultConverter # for czmodel[tensorflow]
#from czmodel.pytorch.convert import DefaultConverter # for czmodel[pytorch]
from pathlib import Path
model_metadata, model_path = DefaultConverter().unpack_model(model_file='Path of the .czann file',
target_dir=Path('Output Path'))
For CZSEG/CZMODEL files:
from czmodel.tensorflow.convert import LegacyConverter # for czmodel[tensorflow]
#from czmodel.pytorch.convert import LegacyConverter # for czmodel[pytorch]
from pathlib import Path
model_metadata, model_path = LegacyConverter().unpack_model(model_file='Path of the .czseg/.czmodel file',
                                                            target_dir=Path('Output Path'))
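CZANN Model Specification
This section specifies the requirements for an artificial neural network (ANN) model and the additionally required metadata that enable execution of the model inside the ZEN Intellesis infrastructure starting with ZEN blue >=3.2 and ZEN Core >3.0.
The model format currently allows bundling models for semantic segmentation, instance segmentation, object detection, classification and regression. It is defined as a ZIP archive with the file extension .czann that contains the following files with the respective filenames:
- A JSON metadata file. (filename: model.json)
- The model in ONNX/TensorFlow SavedModel format. In the case of the SavedModel format, the folder representing the model must be zipped into a single file. (filename: model.model)
- Optional: A license file for the contained model. (filename: license.txt)
The metadata file must comply with the following specification: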
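Since a .czann file is described here as a plain ZIP archive, its metadata can in principle be inspected with the standard library alone. The sketch below builds a small in-memory archive with the documented filenames and reads model.json back. The archive content is a placeholder rather than a real model, and `build_demo_czann` and `read_metadata` are hypothetical helpers; for real files the unpack_model function of the converters remains the supported route.

```python
import io
import json
import zipfile


def build_demo_czann(metadata):
    """Create an in-memory ZIP archive laid out like the format described above."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("model.json", json.dumps(metadata))
        # Placeholder bytes only; a real archive holds an ONNX or zipped SavedModel here.
        archive.writestr("model.model", b"placeholder, not a real model")
    buffer.seek(0)
    return buffer


def read_metadata(czann_file):
    """Read and parse model.json from a path or file-like object holding the archive."""
    with zipfile.ZipFile(czann_file) as archive:
        with archive.open("model.json") as handle:
            return json.load(handle)


demo_metadata = {
    "Id": "064587eb-d5a1-4434-82fc-2fbc9f5871f9",
    "Type": "Regression",
    "InputShape": [1024, 1024, 3],
    "OutputShape": [1024, 1024, 3],
}
loaded = read_metadata(build_demo_czann(demo_metadata))
```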
{
"$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
"$id": "http://127.0.0.1/model_format.schema.json",
"title": "Exchange format for ANN models",
"description": "A format that defines the meta information for exchanging ANN models. Any future versions of this specification should be evaluated through https://docs.snowplowanalytics.com/docs/pipeline-components-and-applications/iglu/igluctl-0-7-2/#lint-1 with --skip-checks numericMinMax,stringLength,optionalNull and https://www.json-buddy.com/json-schema-analyzer.htm.",
"type": "object",
"self": {
"vendor": "com.zeiss",
"name": "model-format",
"format": "jsonschema",
"version": "1-1-0"
},
"properties": {
"Id": {
"description": "Universally unique identifier of 128 bits for the model.",
"type": "string"
},
"Type": {
"description": "The type of problem addressed by the model.",
"type": "string",
"enum": ["SingleClassInstanceSegmentation", "MultiClassInstanceSegmentation", "SingleClassSemanticSegmentation", "MultiClassSemanticSegmentation", "SingleClassClassification", "MultiClassClassification", "ObjectDetection", "Regression"]
},
"MinOverlap": {
"description": "The minimum overlap of tiles for each dimension in pixels. Must be divisible by two. In tiling strategies that consider tile borders instead of overlaps the minimum overlap is twice the border size.",
"type": "array",
"items": {
"description": "The overlap of a single spatial dimension",
"type": "integer",
"minimum": 0
},
"minItems": 1
},
"Classes": {
"description": "The class names corresponding to the last output dimension of the prediction. If the last dimension of the prediction has shape n the provided list must be of length n",
"type": "array",
"items": {
"description": "A name describing a class for segmentation and classification tasks",
"type": "string"
},
"minItems": 2
},
"ModelName": {
"description": "The name of exported neural network model in ONNX (file) or TensorFlow SavedModel (folder) format in the same ZIP archive as the meta data file. In the case of ONNX the model must use ONNX opset version 12. In the case of TensorFlow SavedModel all operations in the model must be supported by TensorFlow 2.0.0. The model must contain exactly one input node which must comply with the input shape defined in the InputShape parameter and must have a batch dimension as its first dimension that is either 1 or undefined.",
"type": "string"
},
"InputShape": {
"description": "The shape of an input image. A typical 2D model has an input of shape [h, w, c] where h and w are the spatial dimensions and c is the number of channels. A 3D model is expected to have an input shape of [z, h, w, c] that contains an additional dimension z which represents the third spatial dimension. The batch dimension is not specified here. The input of the model must be of type float32 in the range [0..1].",
"type": "array",
"items": {
"description": "The size of a single dimension",
"type": "integer",
"minimum": 1
},
"minItems": 3,
"maxItems": 4
},
"OutputShape": {
"description": "The shape of the output image. A typical 2D model has an output of shape [h, w, c] where h and w are the spatial dimensions and c is the number of classes. A 3D model is expected to have an output shape of [z, h, w, c] that contains an additional dimension z which represents the third spatial dimension. The batch dimension is not specified here. If the output of the model represents an image, it must be of type float32 in the range [0..1].",
"type": "array",
"items": {
"description": "The size of a single dimension",
"type": "integer",
"minimum": 1
},
"minItems": 3,
"maxItems": 4
},
"Scaling": {
"description": "The extent of a pixel in x- and y-direction (in that order) in units of m.",
"type": "array",
"items": {
"description": "The extent of a pixel in a single dimension in units of m",
"type": "number"
},
"minItems": 2,
"maxItems": 2
}
},
"required": ["Id", "Type", "InputShape", "OutputShape"]
}
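JSON files can contain escape sequences, so \ characters in paths must be escaped as \\.
The following code snippets show examples of valid metadata files:
For single-class semantic segmentation: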
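Writing the JSON with a serializer avoids escaping backslashes by hand. The sketch below shows how Python's json module doubles each backslash of a Windows path on output and restores the path on input. The key "LicenseFile" is a hypothetical property name used only for this illustration; it is not taken from the specification.

```python
import json

# A Windows path with five single backslashes (escaped here for the Python literal).
license_path = "C:\\some\\path\\to\\a\\LICENSE.txt"

# json.dumps escapes every backslash, so the JSON text contains ten of them.
# "LicenseFile" is a made-up key for this example.
encoded = json.dumps({"LicenseFile": license_path})

# Parsing the JSON text yields the original path again.
decoded_path = json.loads(encoded)["LicenseFile"]
```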
{
"Id": "b511d295-91ff-46ca-bb60-b2e26c393809",
"Type": "SingleClassSemanticSegmentation",
"Classes": ["class1", "class2", "class3", "class4", "class5"],
"InputShape": [1024, 1024, 3],
"OutputShape": [1024, 1024, 5]
}
For regression:
{
"Id": "064587eb-d5a1-4434-82fc-2fbc9f5871f9",
"Type": "Regression",
"InputShape": [1024, 1024, 3],
"OutputShape": [1024, 1024, 3]
}
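A metadata file can be given a quick sanity check before packaging. The following sketch hand-codes a few of the constraints from the specification (required properties, the Type enum, shape lengths, and the class count matching the last output dimension). It is a simplified, hypothetical helper and not a full JSON Schema validator; `validate_metadata` is not part of czmodel.

```python
ALLOWED_TYPES = {
    "SingleClassInstanceSegmentation", "MultiClassInstanceSegmentation",
    "SingleClassSemanticSegmentation", "MultiClassSemanticSegmentation",
    "SingleClassClassification", "MultiClassClassification",
    "ObjectDetection", "Regression",
}
REQUIRED = ("Id", "Type", "InputShape", "OutputShape")


def _is_valid_shape(shape):
    """A shape is a list of 3 or 4 integers, each at least 1."""
    return (
        isinstance(shape, list)
        and 3 <= len(shape) <= 4
        and all(isinstance(d, int) and not isinstance(d, bool) and d >= 1 for d in shape)
    )


def validate_metadata(meta):
    """Return a list of problems found; an empty list means the checks passed."""
    errors = [f"missing required property: {key}" for key in REQUIRED if key not in meta]
    if "Type" in meta and meta["Type"] not in ALLOWED_TYPES:
        errors.append(f"unknown Type: {meta['Type']}")
    for key in ("InputShape", "OutputShape"):
        if key in meta and not _is_valid_shape(meta[key]):
            errors.append(f"{key} must list 3 or 4 integer dimensions >= 1")
    classes = meta.get("Classes")
    if classes is not None:
        if not isinstance(classes, list) or len(classes) < 2:
            errors.append("Classes must list at least 2 names")
        elif _is_valid_shape(meta.get("OutputShape")) and len(classes) != meta["OutputShape"][-1]:
            errors.append("Classes length must equal the last OutputShape dimension")
    return errors


segmentation_example = {
    "Id": "b511d295-91ff-46ca-bb60-b2e26c393809",
    "Type": "SingleClassSemanticSegmentation",
    "Classes": ["class1", "class2", "class3", "class4", "class5"],
    "InputShape": [1024, 1024, 3],
    "OutputShape": [1024, 1024, 5],
}
problems = validate_metadata(segmentation_example)
```

For complete validation, the schema itself can be fed to a dedicated JSON Schema library instead of the hand-written checks shown here.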