TensorFlow VS TensorFlow Mobile VS TensorFlow Lite
Published: 2019-06-26

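Introduction to TensorFlow

TensorFlow is a machine learning framework whose architecture is divided into three main parts: Client, Master, and Worker. This decoupled design makes it highly flexible and easy to deploy across a cluster of machines.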

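TensorFlow's code architecture

The overall architecture of TensorFlow is shown below (image taken from an external source).

[Figure: overall TensorFlow architecture]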

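Client

The Client is the part that algorithm engineers work with directly. It is available in several languages, including Python, C++, and Java. Its main responsibilities are:

  • Defining the computation as a computation graph. Machine learning frameworks mainly follow one of two programming models: imperative or declarative. The imperative model is the ordinary style of programming. The declarative model resembles RxJava: you first build a data pipeline, and data is fed in and processed only when an event triggers it. TensorFlow, in the graph mode discussed here, uses the declarative model: algorithm engineers use the Client API to build a computation graph.
  • Providing the Session interface to execute the computation graph.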
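
The "build first, run later" idea can be sketched in plain Python. This is only a toy model of the declarative style: `Node`, `placeholder`, `add`, `mul`, and this `Session` class are hypothetical stand-ins written for illustration, not TensorFlow APIs.

```python
# Toy illustration of the declarative (graph-first) model.
# Every name here is hypothetical; none of this is TensorFlow code.

class Node:
    """A graph node: either a placeholder or an operation on other nodes."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op
        self.inputs = tuple(inputs)
        self.name = name


def placeholder(name):
    return Node("placeholder", name=name)


def add(a, b):
    return Node("add", (a, b))


def mul(a, b):
    return Node("mul", (a, b))


class Session:
    """Evaluates a node only when run() is called, using the fed values."""

    _KERNELS = {
        "add": lambda x, y: x + y,
        "mul": lambda x, y: x * y,
    }

    def run(self, node, feed_dict):
        if node.op == "placeholder":
            return feed_dict[node.name]
        args = [self.run(i, feed_dict) for i in node.inputs]
        return self._KERNELS[node.op](*args)


# Step 1: describe the computation. Nothing is computed here.
x = placeholder("x")
y = placeholder("y")
z = add(mul(x, x), y)  # z = x * x + y

# Step 2: feed data and execute the graph.
result = Session().run(z, {"x": 3, "y": 4})
print(result)  # 13
```

The same graph object `z` can be run again with different inputs, which is what lets a runtime analyze, split, and distribute it before any data arrives.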
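
Distributed Master

  • Splits the computation graph into smaller subgraphs.
  • Splits each subgraph further into smaller computation pieces so that they can run in parallel in different processes or even on different devices.
  • Distributes the computation pieces to different Workers.
  • Triggers the Workers to execute the tasks assigned to them.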
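
A minimal sketch of the partitioning step, assuming each node has already been assigned to a device. The graph, the device names, and the `partition` helper are hypothetical and only illustrate the idea of grouping nodes per device and finding the edges that cross device boundaries.

```python
from collections import defaultdict

# Hypothetical graph: node name -> (device, list of input node names).
graph = {
    "a": ("worker0", []),
    "b": ("worker0", ["a"]),
    "c": ("worker1", ["b"]),
    "d": ("worker1", ["c", "a"]),
}


def partition(graph):
    """Group nodes by device and list edges that cross device boundaries."""
    pieces = defaultdict(list)
    cross_edges = []
    for name, (device, inputs) in graph.items():
        pieces[device].append(name)
        for src in inputs:
            src_device = graph[src][0]
            if src_device != device:
                # A real runtime would have to transfer this value
                # between the two workers.
                cross_edges.append((src, name))
    return dict(pieces), cross_edges


pieces, cross_edges = partition(graph)
print(pieces)       # {'worker0': ['a', 'b'], 'worker1': ['c', 'd']}
print(cross_edges)  # [('b', 'c'), ('a', 'd')]
```

Each entry in `cross_edges` marks a place where Workers have to exchange results, which corresponds to the send and receive role described for Worker Services.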
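
Worker Services

  • Invoke the TensorFlow kernels to execute computation pieces according to the available hardware.
  • Communicate with other Workers to send and receive computation results.

Kernel Implementations

  • Provide fine-grained, independent computation functions (operations), such as addition, subtraction, and string splitting.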

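TensorFlow on mobile

Running models directly on the device saves bandwidth, responds faster, is more stable because it does not depend on network quality or connectivity, and is more secure because no data needs to be transmitted. There is therefore real demand for on-device inference. Running TensorFlow on mobile or other embedded devices involves different priorities from the cloud: lower power consumption, higher speed, and smaller size. Two solutions currently target mobile devices: TensorFlow Mobile and TensorFlow Lite. TensorFlow Mobile came out earlier and is relatively stable, but its performance was not heavily optimized for mobile. It is no longer recommended and is expected to be deprecated in early 2019.

According to the official website, the main differences between TensorFlow Mobile and TensorFlow Lite are:

  • TensorFlow Lite is the evolution of TensorFlow Mobile. In most cases, TensorFlow Lite has a smaller binary size, fewer dependencies, and better performance.
  • TensorFlow Lite is still under development, so some features may be missing. The official team has committed to stepping up development.
  • TensorFlow Lite supports a fairly limited set of OPs, whereas TensorFlow Mobile's coverage is broader.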

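Differences seen from the source code

That is the official description, but it remains rather vague. What exactly does TensorFlow Mobile trim away, and which OPs does it support? How does the implementation of TensorFlow Lite actually differ? The only way to answer these questions is to analyze the source code.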

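Overview of the TensorFlow source directories

The tensorflow/core directory contains the code of the TF core modules.

public: API header directory, defining the APIs called from outside, mainly session.h and tensor_c_api.h.
client: implementation files for the API.
platform: OS-related interfaces, such as file system and env.
protobuf: all .proto files, used to serialize structures for data transfer.
common_runtime: the common runtime, including session, executor, threadpool, rendezvous, memory management, and device allocation algorithms.
distributed_runtime: distributed execution modules, such as rpc session, rpc master, rpc worker, and graph manager.
framework: basic functional modules, such as log, memory, and tensor.
graph: operations on the computation graph, such as construct, partition, optimize, and execute.
kernels: core Ops, such as matmul, conv2d, argmax, and batch_norm.
lib: common base libraries, such as gif, gtl (Google template library), hash, and histogram.
ops: basic op definitions, op gradients, IO-related ops, and control-flow and data-flow operations.

The tensorflow/stream_executor directory is the parallel computing framework, developed by Google's stream executor team.

The tensorflow/contrib directory holds contributor code. Its android subdirectory contains the Android version of TensorFlow Mobile, and its lite subdirectory contains the source code of TensorFlow Lite.
The tensorflow/python directory contains the Python API client scripts.
The tensorflow/tensorboard directory is the visualization and analysis tool. It can visualize models and also monitor changes in model parameters.
The third_party directory contains TF's third-party dependencies.
eigen3: the Eigen matrix library, called by TF's basic ops.
gpus: wrappers for the cuda/cudnn libraries.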

What does TensorFlow Mobile trim away?

TensorFlow is built with Bazel, so we can analyze the differences by reading the build files.
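
The build files in this section rely heavily on Bazel's `select()`, which picks a different list of dependencies depending on the active build configuration. The sketch below imitates that choice in plain Python so the Android branch is easier to follow. `resolve_select` is a hypothetical helper, it assumes at most one non-default condition is active, and the default dependency list is shortened for illustration.

```python
DEFAULT = "//conditions:default"


def resolve_select(branches, active_conditions):
    """Return the branch for an active condition, else the default branch.

    Simplification: assumes at most one non-default condition is active.
    """
    for condition, value in branches.items():
        if condition != DEFAULT and condition in active_conditions:
            return value
    if DEFAULT in branches:
        return branches[DEFAULT]
    raise ValueError("no matching condition and no default branch")


# Shortened, illustrative copy of the deps of //tensorflow/c:c_api.
c_api_deps = {
    "//tensorflow:android": [
        ":c_api_internal",
        "//tensorflow/core:android_tensorflow_lib_lite",
    ],
    DEFAULT: [
        ":c_api_internal",
        "//tensorflow/core:core_cpu",
        "//tensorflow/core:framework",
    ],
}

android_deps = resolve_select(c_api_deps, {"//tensorflow:android"})
desktop_deps = resolve_select(c_api_deps, set())
print(android_deps)
print(desktop_deps)
```

Read this way, an Android build of the C API pulls in `android_tensorflow_lib_lite` instead of the full set of core libraries.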

TensorFlow's default build configuration
# ===== /tensorflow/BUILD =====
tf_cc_shared_object(
    name = "libtensorflow.so",
    linkopts = select({
        "//tensorflow:darwin": [
            "-Wl,-exported_symbols_list",  # This line must be directly followed by the exported_symbols.lds file
            "$(location //tensorflow/c:exported_symbols.lds)",
            "-Wl,-install_name,@rpath/libtensorflow.so",
        ],
        "//tensorflow:windows": [],
        "//conditions:default": [
            "-z defs",
            "-Wl,--version-script",  #  This line must be directly followed by the version_script.lds file
            "$(location //tensorflow/c:version_script.lds)",
        ],
    }),
    visibility = ["//visibility:public"],
    deps = [
        "//tensorflow/c:c_api",
        "//tensorflow/c:c_api_experimental",
        "//tensorflow/c:exported_symbols.lds",
        "//tensorflow/c:version_script.lds",
        "//tensorflow/c/eager:c_api",
        "//tensorflow/core:tensorflow",
    ],
)

# ===== /tensorflow/c/BUILD =====
tf_cuda_library(
    name = "c_api",
    srcs = [
        "c_api.cc",
        "c_api_function.cc",
    ],
    hdrs = [
        "c_api.h",
    ],
    copts = tf_copts(),
    visibility = ["//visibility:public"],
    deps = select({
        "//tensorflow:android": [
            ":c_api_internal",
            "//tensorflow/core:android_tensorflow_lib_lite",
        ],
        "//conditions:default": [
            ":c_api_internal",
            "//tensorflow/cc/saved_model:loader",
            "//tensorflow/cc:gradients",
            "//tensorflow/cc:ops",
            "//tensorflow/cc:grad_ops",
            "//tensorflow/cc:scope_internal",
            "//tensorflow/cc:while_loop",
            "//tensorflow/core:core_cpu",
            "//tensorflow/core:core_cpu_internal",
            "//tensorflow/core:framework",
            "//tensorflow/core:op_gen_lib",
            "//tensorflow/core:protos_all_cc",
            "//tensorflow/core:lib",
            "//tensorflow/core:lib_internal",
        ],
    }) + select({
        "//tensorflow:with_xla_support": [
            "//tensorflow/compiler/tf2xla:xla_compiler",
            "//tensorflow/compiler/jit",
        ],
        "//conditions:default": [],
    }),
)

tf_cuda_library(
    name = "c_api_experimental",
    srcs = [
        "c_api_experimental.cc",
    ],
    hdrs = [
        "c_api_experimental.h",
    ],
    copts = tf_copts(),
    visibility = ["//visibility:public"],
    deps = [
        ":c_api",
        ":c_api_internal",
        "//tensorflow/c/eager:c_api",
        "//tensorflow/compiler/jit/legacy_flags:mark_for_compilation_pass_flags",
        "//tensorflow/contrib/tpu:all_ops",
        "//tensorflow/core:core_cpu",
        "//tensorflow/core:framework",
        "//tensorflow/core:lib",
        "//tensorflow/core:lib_platform",
        "//tensorflow/core:protos_all_cc",
    ],
)

# ===== /tensorflow/c/eager/BUILD =====
tf_cuda_library(
    name = "c_api",
    srcs = [
        "c_api.cc",
        "c_api_debug.cc",
        "c_api_internal.h",
    ],
    hdrs = ["c_api.h"],
    copts = tf_copts() + tfe_xla_copts(),
    visibility = ["//visibility:public"],
    deps = select({
        "//tensorflow:android": [
            "//tensorflow/core:android_tensorflow_lib_lite",
        ],
        "//conditions:default": [
            "//tensorflow/c:c_api",
            "//tensorflow/c:c_api_internal",
            "//tensorflow/core:core_cpu",
            "//tensorflow/core/common_runtime/eager:attr_builder",
            "//tensorflow/core/common_runtime/eager:context",
            "//tensorflow/core/common_runtime/eager:eager_executor",
            "//tensorflow/core/common_runtime/eager:execute",
            "//tensorflow/core/common_runtime/eager:kernel_and_device",
            "//tensorflow/core/common_runtime/eager:tensor_handle",
            "//tensorflow/core/common_runtime/eager:copy_to_device_node",
            "//tensorflow/core:core_cpu_internal",
            "//tensorflow/core:framework",
            "//tensorflow/core:framework_internal",
            "//tensorflow/core:lib",
            "//tensorflow/core:lib_internal",
            "//tensorflow/core:protos_all_cc",
        ],
    }) + select({
        "//tensorflow:with_xla_support": [
            "//tensorflow/compiler/tf2xla:xla_compiler",
            "//tensorflow/compiler/jit",
            "//tensorflow/compiler/jit:xla_device",
        ],
        "//conditions:default": [],
    }) + [
        "//tensorflow/core/common_runtime/eager:eager_operation",
        "//tensorflow/core/distributed_runtime/eager:eager_client",
        "//tensorflow/core/distributed_runtime/rpc/eager:grpc_eager_client",
        "//tensorflow/core/distributed_runtime/rpc:grpc_channel",
        "//tensorflow/core/distributed_runtime/rpc:grpc_server_lib",
        "//tensorflow/core/distributed_runtime/rpc:grpc_worker_cache",
        "//tensorflow/core/distributed_runtime/rpc:grpc_worker_service",
        "//tensorflow/core/distributed_runtime/rpc:rpc_rendezvous_mgr",
        "//tensorflow/core/distributed_runtime:remote_device",
        "//tensorflow/core/distributed_runtime:server_lib",
        "//tensorflow/core/distributed_runtime:worker_env",
        "//tensorflow/core:gpu_runtime",
    ],
)

# ===== /tensorflow/core/BUILD =====
cc_library(
    name = "tensorflow",
    visibility = ["//visibility:public"],
    deps = [
        ":tensorflow_opensource",
        "//tensorflow/core/platform/default/build_config:tensorflow_platform_specific",
    ],
)

tf_cuda_library(
    name = "tensorflow_opensource",
    copts = tf_copts(),
    visibility = ["//visibility:public"],
    deps = [
        ":all_kernels",
        ":core",
        ":direct_session",
        ":example_parser_configuration",
        ":gpu_runtime",
        ":lib",
    ],
)

cc_library(
    name = "all_kernels",
    visibility = ["//visibility:public"],
    deps = if_dynamic_kernels(
        [],
        otherwise = [":all_kernels_statically_linked"],
    ),
)

# This is a link-only library to provide a DirectSession
# implementation of the Session interface.
tf_cuda_library(
    name = "direct_session",
    copts = tf_copts(),
    linkstatic = 1,
    visibility = ["//visibility:public"],
    deps = [
        ":direct_session_internal",
    ],
    alwayslink = 1,
)

filegroup(
    name = "example_parser_configuration_testdata",
    srcs = [
        "example/testdata/parse_example_graph_def.pbtxt",
    ],
)

cc_library(
    name = "core",
    visibility = ["//visibility:public"],
    deps = [
        ":core_cpu",
        ":gpu_runtime",
        ":sycl_runtime",
    ],
)

cc_library(
    name = "lib",
    hdrs = [
        "lib/bfloat16/bfloat16.h",
        "lib/core/arena.h",
        "lib/core/bitmap.h",
        "lib/core/bits.h",
        "lib/core/casts.h",
        "lib/core/coding.h",
        "lib/core/errors.h",
        "lib/core/notification.h",
        "lib/core/raw_coding.h",
        "lib/core/status.h",
        "lib/core/stringpiece.h",
        "lib/core/threadpool.h",
        "lib/gtl/array_slice.h",
        "lib/gtl/cleanup.h",
        "lib/gtl/compactptrset.h",
        "lib/gtl/flatmap.h",
        "lib/gtl/flatset.h",
        "lib/gtl/inlined_vector.h",
        "lib/gtl/optional.h",
        "lib/gtl/priority_queue_util.h",
        "lib/hash/crc32c.h",
        "lib/hash/hash.h",
        "lib/histogram/histogram.h",
        "lib/io/buffered_inputstream.h",
        "lib/io/compression.h",
        "lib/io/inputstream_interface.h",
        "lib/io/path.h",
        "lib/io/proto_encode_helper.h",
        "lib/io/random_inputstream.h",
        "lib/io/record_reader.h",
        "lib/io/record_writer.h",
        "lib/io/table.h",
        "lib/io/table_builder.h",
        "lib/io/table_options.h",
        "lib/math/math_util.h",
        "lib/monitoring/collected_metrics.h",
        "lib/monitoring/collection_registry.h",
        "lib/monitoring/counter.h",
        "lib/monitoring/gauge.h",
        "lib/monitoring/metric_def.h",
        "lib/monitoring/sampler.h",
        "lib/random/distribution_sampler.h",
        "lib/random/philox_random.h",
        "lib/random/random_distributions.h",
        "lib/random/simple_philox.h",
        "lib/strings/numbers.h",
        "lib/strings/proto_serialization.h",
        "lib/strings/str_util.h",
        "lib/strings/strcat.h",
        "lib/strings/stringprintf.h",
        ":platform_base_hdrs",
        ":platform_env_hdrs",
        ":platform_file_system_hdrs",
        ":platform_other_hdrs",
        ":platform_port_hdrs",
        ":platform_protobuf_hdrs",
    ],
    visibility = ["//visibility:public"],
    deps = [
        ":lib_internal",
        "@com_google_absl//absl/container:inlined_vector",
        "@com_google_absl//absl/strings",
        "@com_google_absl//absl/types:optional",
    ],
)

# This includes implementations of all kernels built into TensorFlow.
cc_library(
    name = "all_kernels_statically_linked",
    visibility = ["//visibility:private"],
    deps = [
        "//tensorflow/core/kernels:array",
        "//tensorflow/core/kernels:audio",
        "//tensorflow/core/kernels:batch_kernels",
        "//tensorflow/core/kernels:bincount_op",
        "//tensorflow/core/kernels:boosted_trees_ops",
        "//tensorflow/core/kernels:candidate_sampler_ops",
        "//tensorflow/core/kernels:checkpoint_ops",
        "//tensorflow/core/kernels:collective_ops",
        "//tensorflow/core/kernels:control_flow_ops",
        "//tensorflow/core/kernels:ctc_ops",
        "//tensorflow/core/kernels:cudnn_rnn_kernels",
        "//tensorflow/core/kernels:data_flow",
        "//tensorflow/core/kernels:dataset_ops",
        "//tensorflow/core/kernels:decode_proto_op",
        "//tensorflow/core/kernels:encode_proto_op",
        "//tensorflow/core/kernels:fake_quant_ops",
        "//tensorflow/core/kernels:function_ops",
        "//tensorflow/core/kernels:functional_ops",
        "//tensorflow/core/kernels:grappler",
        "//tensorflow/core/kernels:histogram_op",
        "//tensorflow/core/kernels:image",
        "//tensorflow/core/kernels:io",
        "//tensorflow/core/kernels:linalg",
        "//tensorflow/core/kernels:list_kernels",
        "//tensorflow/core/kernels:lookup",
        "//tensorflow/core/kernels:logging",
        "//tensorflow/core/kernels:manip",
        "//tensorflow/core/kernels:math",
        "//tensorflow/core/kernels:multinomial_op",
        "//tensorflow/core/kernels:nn",
        "//tensorflow/core/kernels:parameterized_truncated_normal_op",
        "//tensorflow/core/kernels:parsing",
        "//tensorflow/core/kernels:partitioned_function_ops",
        "//tensorflow/core/kernels:random_ops",
        "//tensorflow/core/kernels:random_poisson_op",
        "//tensorflow/core/kernels:remote_fused_graph_ops",
        "//tensorflow/core/kernels:required",
        "//tensorflow/core/kernels:resource_variable_ops",
        "//tensorflow/core/kernels:rpc_op",
        "//tensorflow/core/kernels:scoped_allocator_ops",
        "//tensorflow/core/kernels:sdca_ops",
        "//tensorflow/core/kernels:searchsorted_op",
        "//tensorflow/core/kernels:set_kernels",
        "//tensorflow/core/kernels:sparse",
        "//tensorflow/core/kernels:state",
        "//tensorflow/core/kernels:stateless_random_ops",
        "//tensorflow/core/kernels:string",
        "//tensorflow/core/kernels:summary_kernels",
        "//tensorflow/core/kernels:training_ops",
        "//tensorflow/core/kernels:word2vec_kernels",
    ] + tf_additional_cloud_kernel_deps() + if_not_windows([
        "//tensorflow/core/kernels:fact_op",
        "//tensorflow/core/kernels:array_not_windows",
        "//tensorflow/core/kernels:math_not_windows",
        "//tensorflow/core/kernels:quantized_ops",
        "//tensorflow/core/kernels/neon:neon_depthwise_conv_op",
    ]) + if_mkl([
        "//tensorflow/core/kernels:mkl_concat_op",
        "//tensorflow/core/kernels:mkl_conv_op",
        "//tensorflow/core/kernels:mkl_cwise_ops_common",
        "//tensorflow/core/kernels:mkl_fused_batch_norm_op",
        "//tensorflow/core/kernels:mkl_identity_op",
        "//tensorflow/core/kernels:mkl_input_conversion_op",
        "//tensorflow/core/kernels:mkl_lrn_op",
        "//tensorflow/core/kernels:mkl_pooling_ops",
        "//tensorflow/core/kernels:mkl_relu_op",
        "//tensorflow/core/kernels:mkl_reshape_op",
        "//tensorflow/core/kernels:mkl_slice_op",
        "//tensorflow/core/kernels:mkl_softmax_op",
        "//tensorflow/core/kernels:mkl_transpose_op",
        "//tensorflow/core/kernels:mkl_tfconv_op",
        "//tensorflow/core/kernels:mkl_aggregate_ops",
    ]) + if_cuda([
        "//tensorflow/core/grappler/optimizers:gpu_swapping_kernels",
        "//tensorflow/core/grappler/optimizers:gpu_swapping_ops",
    ]),
)
TensorFlow Mobile's build configuration
# ===== tensorflow/contrib/android/BUILD =====
cc_binary(
    name = "libtensorflow_inference.so",
    srcs = [],
    copts = tf_copts() + [
        "-ffunction-sections",
        "-fdata-sections",
    ],
    linkopts = if_android([
        "-landroid",
        "-latomic",
        "-ldl",
        "-llog",
        "-lm",
        "-z defs",
        "-s",
        "-Wl,--gc-sections",
        "-Wl,--version-script",  # This line must be directly followed by LINKER_SCRIPT.
        "$(location {})".format(LINKER_SCRIPT),
    ]),
    linkshared = 1,
    linkstatic = 1,
    tags = [
        "manual",
        "notap",
    ],
    deps = [
        ":android_tensorflow_inference_jni",
        "//tensorflow/core:android_tensorflow_lib",
        LINKER_SCRIPT,
    ],
)

cc_library(
    name = "android_tensorflow_inference_jni",
    srcs = if_android([":android_tensorflow_inference_jni_srcs"]),
    copts = tf_copts(),
    visibility = ["//visibility:public"],
    deps = [
        "//tensorflow/core:android_tensorflow_lib_lite",
        "//tensorflow/java/src/main/native",
    ],
    alwayslink = 1,
)

# ===== tensorflow/core/BUILD =====
cc_library(
    name = "android_tensorflow_lib",
    srcs = if_android([":android_op_registrations_and_gradients"]),
    copts = tf_copts(),
    tags = [
        "manual",
        "notap",
    ],
    visibility = ["//visibility:public"],
    deps = [
        ":android_tensorflow_lib_lite",
        ":protos_all_cc_impl",
        "//tensorflow/core/kernels:android_tensorflow_kernels",
        "//third_party/eigen3",
        "@protobuf_archive//:protobuf",
    ],
    alwayslink = 1,
)

cc_library(
    name = "android_tensorflow_lib_lite",
    srcs = if_android(["//tensorflow/core:android_srcs"]),
    copts = tf_copts(android_optimization_level_override = None),
    linkopts = ["-lz"],
    tags = [
        "manual",
        "notap",
    ],
    visibility = ["//visibility:public"],
    deps = [
        ":mobile_additional_lib_deps",
        ":protos_all_cc_impl",
        ":stats_calculator_portable",
        "//third_party/eigen3",
        "@double_conversion//:double-conversion",
        "@nsync//:nsync_cpp",
        "@protobuf_archive//:protobuf",
    ],
    alwayslink = 1,
)

alias(
    name = "android_srcs",
    actual = ":mobile_srcs",
    visibility = ["//visibility:public"],
)

filegroup(
    name = "mobile_srcs",
    srcs = [
        ":mobile_srcs_no_runtime",
        ":mobile_srcs_only_runtime",
    ],
    visibility = ["//visibility:public"],
)

# Core sources for Android builds.
filegroup(
    name = "mobile_srcs_no_runtime",
    srcs = [
        ":protos_all_proto_text_srcs",
        ":error_codes_proto_text_srcs",
        "//tensorflow/core/platform/default/build_config:android_srcs",
    ] + glob(
        [
            "client/**/*.cc",
            "framework/**/*.h",
            "framework/**/*.cc",
            "lib/**/*.h",
            "lib/**/*.cc",
            "platform/**/*.h",
            "platform/**/*.cc",
            "public/**/*.h",
            "util/**/*.h",
            "util/**/*.cc",
        ],
        exclude = [
            "**/*test.*",
            "**/*testutil*",
            "**/*testlib*",
            "**/*main.cc",
            "debug/**/*",
            "framework/op_gen_*",
            "lib/jpeg/**/*",
            "lib/png/**/*",
            "lib/gif/**/*",
            "util/events_writer.*",
            "util/stats_calculator.*",
            "util/reporter.*",
            "platform/**/cuda_libdevice_path.*",
            "platform/default/test_benchmark.*",
            "platform/cuda.h",
            "platform/google/**/*",
            "platform/hadoop/**/*",
            "platform/gif.h",
            "platform/jpeg.h",
            "platform/png.h",
            "platform/stream_executor.*",
            "platform/windows/**/*",
            "user_ops/**/*.cu.cc",
            "util/ctc/*.h",
            "util/ctc/*.cc",
            "util/tensor_bundle/*.h",
            "util/tensor_bundle/*.cc",
            "common_runtime/gpu/**/*",
            "common_runtime/eager/*",
            "common_runtime/gpu_device_factory.*",
        ],
    ),
    visibility = ["//visibility:public"],
)

filegroup(
    name = "mobile_srcs_only_runtime",
    srcs = [
        "//tensorflow/core/kernels:android_srcs",
        "//tensorflow/core/util/ctc:android_srcs",
        "//tensorflow/core/util/tensor_bundle:android_srcs",
    ] + glob(
        [
            "common_runtime/**/*.h",
            "common_runtime/**/*.cc",
            "graph/**/*.h",
            "graph/**/*.cc",
        ],
        exclude = [
            "**/*test.*",
            "**/*testutil*",
            "**/*testlib*",
            "**/*main.cc",
            "common_runtime/gpu/**/*",
            "common_runtime/eager/*",
            "common_runtime/gpu_device_factory.*",
            "graph/dot.*",
        ],
    ),
    visibility = ["//visibility:public"],
)

cc_library(
    name = "stats_calculator_portable",
    srcs = [
        "util/stat_summarizer_options.h",
        "util/stats_calculator.cc",
    ],
    hdrs = [
        "util/stats_calculator.h",
    ],
    copts = tf_copts(),
)

cc_library(
    name = "mobile_additional_lib_deps",
    deps = tf_additional_lib_deps() + [
        "@com_google_absl//absl/strings",
    ],
)

# ===== tensorflow/core/kernels/BUILD =====
cc_library(
    name = "android_tensorflow_kernels",
    srcs = select({
        "//tensorflow:android": [
            "//tensorflow/core/kernels:android_core_ops",
            "//tensorflow/core/kernels:android_extended_ops",
        ],
        "//conditions:default": [],
    }),
    copts = tf_copts(),
    linkopts = select({
        "//tensorflow:android": [
            "-ldl",
        ],
        "//conditions:default": [],
    }),
    tags = [
        "manual",
        "notap",
    ],
    visibility = ["//visibility:public"],
    deps = [
        "//tensorflow/core:android_tensorflow_lib_lite",
        "//tensorflow/core:protos_all_cc_impl",
        "//third_party/eigen3",
        "//third_party/fft2d:fft2d_headers",
        "@fft2d",
        "@gemmlowp",
        "@protobuf_archive//:protobuf",
    ],
    alwayslink = 1,
)

# Core kernels we want on Android. Only a subset of kernels to keep
# base library small.
filegroup(
    name = "android_core_ops",
    srcs = [
        "aggregate_ops.cc",
        "aggregate_ops.h",
        "aggregate_ops_cpu.h",
        "assign_op.h",
        "bias_op.cc",
        "bias_op.h",
        "bounds_check.h",
        "cast_op.cc",
        "cast_op.h",
        "cast_op_impl.h",
        "cast_op_impl_bfloat.cc",
        "cast_op_impl_bool.cc",
        "cast_op_impl_complex128.cc",
        "cast_op_impl_complex64.cc",
        "cast_op_impl_double.cc",
        "cast_op_impl_float.cc",
        "cast_op_impl_half.cc",
        "cast_op_impl_int16.cc",
        "cast_op_impl_int32.cc",
        "cast_op_impl_int64.cc",
        "cast_op_impl_int8.cc",
        "cast_op_impl_uint16.cc",
        "cast_op_impl_uint32.cc",
        "cast_op_impl_uint64.cc",
        "cast_op_impl_uint8.cc",
        "concat_lib.h",
        "concat_lib_cpu.cc",
        "concat_lib_cpu.h",
        "concat_op.cc",
        "constant_op.cc",
        "constant_op.h",
        "cwise_ops.h",
        "cwise_ops_common.cc",
        "cwise_ops_common.h",
        "cwise_ops_gradients.h",
        "dense_update_functor.cc",
        "dense_update_functor.h",
        "dense_update_ops.cc",
        "example_parsing_ops.cc",
        "fill_functor.cc",
        "fill_functor.h",
        "function_ops.cc",
        "function_ops.h",
        "gather_functor.h",
        "gather_nd_op.cc",
        "gather_nd_op.h",
        "gather_nd_op_cpu_impl.h",
        "gather_nd_op_cpu_impl_0.cc",
        "gather_nd_op_cpu_impl_1.cc",
        "gather_nd_op_cpu_impl_2.cc",
        "gather_nd_op_cpu_impl_3.cc",
        "gather_nd_op_cpu_impl_4.cc",
        "gather_nd_op_cpu_impl_5.cc",
        "gather_nd_op_cpu_impl_6.cc",
        "gather_nd_op_cpu_impl_7.cc",
        "gather_op.cc",
        "identity_n_op.cc",
        "identity_n_op.h",
        "identity_op.cc",
        "identity_op.h",
        "immutable_constant_op.cc",
        "immutable_constant_op.h",
        "matmul_op.cc",
        "matmul_op.h",
        "no_op.cc",
        "no_op.h",
        "non_max_suppression_op.cc",
        "non_max_suppression_op.h",
        "one_hot_op.cc",
        "one_hot_op.h",
        "ops_util.h",
        "pack_op.cc",
        "pooling_ops_common.h",
        "reshape_op.cc",
        "reshape_op.h",
        "reverse_sequence_op.cc",
        "reverse_sequence_op.h",
        "sendrecv_ops.cc",
        "sendrecv_ops.h",
        "sequence_ops.cc",
        "shape_ops.cc",
        "shape_ops.h",
        "slice_op.cc",
        "slice_op.h",
        "slice_op_cpu_impl.h",
        "slice_op_cpu_impl_1.cc",
        "slice_op_cpu_impl_2.cc",
        "slice_op_cpu_impl_3.cc",
        "slice_op_cpu_impl_4.cc",
        "slice_op_cpu_impl_5.cc",
        "slice_op_cpu_impl_6.cc",
        "slice_op_cpu_impl_7.cc",
        "softmax_op.cc",
        "softmax_op_functor.h",
        "split_lib.h",
        "split_lib_cpu.cc",
        "split_op.cc",
        "split_v_op.cc",
        "strided_slice_op.cc",
        "strided_slice_op.h",
        "strided_slice_op_impl.h",
        "strided_slice_op_inst_0.cc",
        "strided_slice_op_inst_1.cc",
        "strided_slice_op_inst_2.cc",
        "strided_slice_op_inst_3.cc",
        "strided_slice_op_inst_4.cc",
        "strided_slice_op_inst_5.cc",
        "strided_slice_op_inst_6.cc",
        "strided_slice_op_inst_7.cc",
        "unpack_op.cc",
        "variable_ops.cc",
        "variable_ops.h",
    ],
)

# Other kernels we may want on Android.
#
# The kernels can be consumed as a whole or in two groups for
# supporting separate compilation. Note that the split into groups
# is entirely for improving compilation time, and not for
# organizational reasons; you should not depend on any
# of those groups independently.
filegroup(
    name = "android_extended_ops",
    srcs = [
        ":android_extended_ops_group1",
        ":android_extended_ops_group2",
        ":android_quantized_ops",
    ],
    visibility = ["//visibility:public"],
)

filegroup(
    name = "android_extended_ops_headers",
    srcs = [
        "argmax_op.h",
        "avgpooling_op.h",
        "batch_matmul_op_impl.h",
        "batch_norm_op.h",
        "control_flow_ops.h",
        "conv_2d.h",
        "conv_ops.h",
        "data_format_ops.h",
        "depthtospace_op.h",
        "depthwise_conv_op.h",
        "fake_quant_ops_functor.h",
        "fused_batch_norm_op.h",
        "gemm_functors.h",
        "image_resizer_state.h",
        "initializable_lookup_table.h",
        "lookup_table_init_op.h",
        "lookup_table_op.h",
        "lookup_util.h",
        "maxpooling_op.h",
        "mfcc.h",
        "mfcc_dct.h",
        "mfcc_mel_filterbank.h",
        "mirror_pad_op.h",
        "mirror_pad_op_cpu_impl.h",
        "pad_op.h",
        "random_op.h",
        "reduction_ops.h",
        "reduction_ops_common.h",
        "relu_op.h",
        "relu_op_functor.h",
        "reshape_util.h",
        "resize_bilinear_op.h",
        "resize_nearest_neighbor_op.h",
        "reverse_op.h",
        "save_restore_tensor.h",
        "segment_reduction_ops.h",
        "softplus_op.h",
        "softsign_op.h",
        "spacetobatch_functor.h",
        "spacetodepth_op.h",
        "spectrogram.h",
        "string_util.h",
        "tensor_array.h",
        "tile_functor.h",
        "tile_ops_cpu_impl.h",
        "tile_ops_impl.h",
        "topk_op.h",
        "training_op_helpers.h",
        "training_ops.h",
        "transpose_functor.h",
        "transpose_op.h",
        "where_op.h",
        "xent_op.h",
    ],
)

filegroup(
    name = "android_extended_ops_group1",
    srcs = [
        "argmax_op.cc",
        "avgpooling_op.cc",
        "batch_matmul_op_real.cc",
        "batch_norm_op.cc",
        "bcast_ops.cc",
        "check_numerics_op.cc",
        "control_flow_ops.cc",
        "conv_2d.h",
        "conv_grad_filter_ops.cc",
        "conv_grad_input_ops.cc",
        "conv_grad_ops.cc",
        "conv_grad_ops.h",
        "conv_ops.cc",
        "conv_ops_fused.cc",
        "conv_ops_using_gemm.cc",
        "crop_and_resize_op.cc",
        "crop_and_resize_op.h",
        "cwise_op_abs.cc",
        "cwise_op_add_1.cc",
        "cwise_op_add_2.cc",
        "cwise_op_bitwise_and.cc",
        "cwise_op_bitwise_or.cc",
        "cwise_op_bitwise_xor.cc",
        "cwise_op_div.cc",
        "cwise_op_equal_to_1.cc",
        "cwise_op_equal_to_2.cc",
        "cwise_op_not_equal_to_1.cc",
        "cwise_op_not_equal_to_2.cc",
        "cwise_op_exp.cc",
        "cwise_op_floor.cc",
        "cwise_op_floor_div.cc",
        "cwise_op_floor_mod.cc",
        "cwise_op_greater.cc",
        "cwise_op_greater_equal.cc",
        "cwise_op_invert.cc",
        "cwise_op_isfinite.cc",
        "cwise_op_isnan.cc",
        "cwise_op_left_shift.cc",
        "cwise_op_less.cc",
        "cwise_op_less_equal.cc",
        "cwise_op_log.cc",
        "cwise_op_logical_and.cc",
        "cwise_op_logical_not.cc",
        "cwise_op_logical_or.cc",
        "cwise_op_maximum.cc",
        "cwise_op_minimum.cc",
        "cwise_op_mul_1.cc",
        "cwise_op_mul_2.cc",
        "cwise_op_neg.cc",
        "cwise_op_pow.cc",
        "cwise_op_reciprocal.cc",
        "cwise_op_right_shift.cc",
        "cwise_op_round.cc",
        "cwise_op_rsqrt.cc",
        "cwise_op_select.cc",
        "cwise_op_sigmoid.cc",
        "cwise_op_sign.cc",
        "cwise_op_sqrt.cc",
        "cwise_op_square.cc",
        "cwise_op_squared_difference.cc",
        "cwise_op_sub.cc",
        "cwise_op_tanh.cc",
        "cwise_op_xlogy.cc",
        "cwise_op_xdivy.cc",
        "data_format_ops.cc",
        "decode_wav_op.cc",
        "deep_conv2d.cc",
        "deep_conv2d.h",
        "depthwise_conv_op.cc",
        "dynamic_partition_op.cc",
        "encode_wav_op.cc",
        "fake_quant_ops.cc",
        "fifo_queue.cc",
        "fifo_queue_op.cc",
        "fused_batch_norm_op.cc",
        "listdiff_op.cc",
        "population_count_op.cc",
        "population_count_op.h",
        "winograd_transform.h",
        ":android_extended_ops_headers",
    ] + select({
        ":xsmm_convolutions": [
            "xsmm_conv2d.h",
            "xsmm_conv2d.cc",
        ],
        "//conditions:default": [],
    }),
)

filegroup(
    name = "android_extended_ops_group2",
    srcs = [
        "batchtospace_op.cc",
        "ctc_decoder_ops.cc",
        "decode_bmp_op.cc",
        "depthtospace_op.cc",
        "dynamic_stitch_op.cc",
        "in_topk_op.cc",
        "initializable_lookup_table.cc",
        "logging_ops.cc",
        "lookup_table_init_op.cc",
        "lookup_table_op.cc",
        "lookup_util.cc",
        "lrn_op.cc",
        "maxpooling_op.cc",
        "mfcc.cc",
        "mfcc_dct.cc",
        "mfcc_mel_filterbank.cc",
        "mfcc_op.cc",
        "mirror_pad_op.cc",
        "mirror_pad_op_cpu_impl_1.cc",
        "mirror_pad_op_cpu_impl_2.cc",
        "mirror_pad_op_cpu_impl_3.cc",
        "mirror_pad_op_cpu_impl_4.cc",
        "mirror_pad_op_cpu_impl_5.cc",
        "pad_op.cc",
        "padding_fifo_queue.cc",
        "padding_fifo_queue_op.cc",
        "queue_base.cc",
        "queue_op.cc",
        "queue_ops.cc",
        "random_op.cc",
        "reduction_ops_all.cc",
        "reduction_ops_any.cc",
        "reduction_ops_common.cc",
        "reduction_ops_max.cc",
        "reduction_ops_mean.cc",
        "reduction_ops_min.cc",
        "reduction_ops_prod.cc",
        "reduction_ops_sum.cc",
        "relu_op.cc",
        "reshape_util.cc",
        "resize_bilinear_op.cc",
        "resize_nearest_neighbor_op.cc",
        "restore_op.cc",
        "reverse_op.cc",
        "save_op.cc",
        "save_restore_tensor.cc",
        "save_restore_v2_ops.cc",
        "segment_reduction_ops.cc",
        "session_ops.cc",
        "softplus_op.cc",
        "softsign_op.cc",
        "spacetobatch_functor.cc",
        "spacetobatch_op.cc",
        "spacetodepth_op.cc",
        "sparse_fill_empty_rows_op.cc",
        "sparse_reshape_op.cc",
        "sparse_to_dense_op.cc",
        "spectrogram.cc",
        "spectrogram_op.cc",
        "stack_ops.cc",
        "string_join_op.cc",
        "string_util.cc",
        "summary_op.cc",
        "tensor_array.cc",
        "tensor_array_ops.cc",
        "tile_functor_cpu.cc",
        "tile_ops.cc",
        "tile_ops_cpu_impl_1.cc",
        "tile_ops_cpu_impl_2.cc",
        "tile_ops_cpu_impl_3.cc",
        "tile_ops_cpu_impl_4.cc",
        "tile_ops_cpu_impl_5.cc",
        "tile_ops_cpu_impl_6.cc",
        "tile_ops_cpu_impl_7.cc",
        "topk_op.cc",
        "training_op_helpers.cc",
        "training_ops.cc",
        "transpose_functor_cpu.cc",
        "transpose_op.cc",
        "unique_op.cc",
        "where_op.cc",
        "xent_op.cc",
        ":android_extended_ops_headers",
    ],
)

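Through its build options, TensorFlow Mobile trims down the full TensorFlow: it keeps the core functionality while dropping code that is not needed on a device, such as the distributed execution logic, the Windows compatibility logic, and the GPU computation logic.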
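
The trimming is expressed mostly through `glob(..., exclude = [...])` in the mobile source filegroups. The sketch below imitates that include/exclude filtering in plain Python. `glob_to_regex` and `select_sources` are hypothetical helpers that only approximate Bazel's glob rules (`**/` matches zero or more directories, `*` stays within one path segment), and the file list is illustrative rather than a real directory listing.

```python
import re


def glob_to_regex(pattern):
    """Translate a simplified Bazel-style glob into a compiled regex."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")  # zero or more directories
            i += 3
        elif pattern[i] == "*":
            out.append("[^/]*")     # anything within one path segment
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")


def select_sources(files, include, exclude):
    """Keep files matching any include pattern and no exclude pattern."""
    inc = [glob_to_regex(p) for p in include]
    exc = [glob_to_regex(p) for p in exclude]
    return [
        f for f in files
        if any(r.match(f) for r in inc) and not any(r.match(f) for r in exc)
    ]


# Illustrative paths relative to tensorflow/core.
files = [
    "common_runtime/executor.cc",
    "common_runtime/executor_test.cc",
    "common_runtime/gpu/gpu_device.cc",
    "common_runtime/eager/context.cc",
    "graph/graph.cc",
    "graph/dot.cc",
    "distributed_runtime/master.cc",
]

# A shortened copy of the patterns used by mobile_srcs_only_runtime.
include = ["common_runtime/**/*.h", "common_runtime/**/*.cc",
           "graph/**/*.h", "graph/**/*.cc"]
exclude = ["**/*test.*", "common_runtime/gpu/**/*",
           "common_runtime/eager/*", "graph/dot.*"]

kept = select_sources(files, include, exclude)
print(kept)  # ['common_runtime/executor.cc', 'graph/graph.cc']
```

Tests, GPU code, and eager code are dropped by the exclude list, while the distributed runtime is dropped simply because no include pattern mentions it.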

Does TensorFlow Mobile support every OP?

No. TensorFlow Mobile includes only the core, essential OPs, not all of them. See android_core_ops and android_extended_ops above for the details.
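
In practice this means a model should be checked against the set of OPs compiled into the mobile library before it is shipped. A minimal sketch of such a check follows. The op names in `MOBILE_OPS` and `model_ops` are assumptions chosen for illustration, and `unsupported_ops` is a hypothetical helper, not a TensorFlow API.

```python
# Hypothetical subset of op names available in a mobile build.
MOBILE_OPS = {"MatMul", "Conv2D", "Relu", "Softmax", "Reshape", "Identity"}


def unsupported_ops(graph_ops, registered_ops):
    """Return the op names used by a graph that the runtime does not provide."""
    return sorted(set(graph_ops) - set(registered_ops))


# Op types used by an imaginary image-classification graph.
model_ops = ["Conv2D", "Relu", "MatMul", "Softmax", "DecodeJpeg"]

missing = unsupported_ops(model_ops, MOBILE_OPS)
print(missing)  # ['DecodeJpeg']
```

An empty result means every op in the graph has a kernel in the trimmed library. Anything listed has to be removed from the graph or added to the build.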

How does the implementation of TensorFlow Lite differ?

The source code of TensorFlow Lite lives in the tensorflow/contrib/lite directory. Its core build logic is as follows:

### tensorflow/contrib/lite/BUILD
cc_library(
    name = "framework",
    srcs = [
        "allocation.cc",
        "graph_info.cc",
        "interpreter.cc",
        "model.cc",
        "mutable_op_resolver.cc",
        "optional_debug_tools.cc",
        "stderr_reporter.cc",
    ] + select({
        "//tensorflow:android": [
            "nnapi_delegate.cc",
            "mmap_allocation.cc",
        ],
        "//tensorflow:windows": [
            "nnapi_delegate_disabled.cc",
            "mmap_allocation_disabled.cc",
        ],
        "//conditions:default": [
            "nnapi_delegate_disabled.cc",
            "mmap_allocation.cc",
        ],
    }),
    hdrs = [
        "allocation.h",
        "context.h",
        "context_util.h",
        "error_reporter.h",
        "graph_info.h",
        "interpreter.h",
        "model.h",
        "mutable_op_resolver.h",
        "nnapi_delegate.h",
        "op_resolver.h",
        "optional_debug_tools.h",
        "stderr_reporter.h",
    ],
    copts = tflite_copts(),
    linkopts = [
    ] + select({
        "//tensorflow:android": [
            "-llog",
        ],
        "//conditions:default": [
        ],
    }),
    deps = [
        ":arena_planner",
        ":graph_info",
        ":memory_planner",
        ":schema_fbs_version",
        ":simple_memory_arena",
        ":string",
        ":util",
        "//tensorflow/contrib/lite/c:c_api_internal",
        "//tensorflow/contrib/lite/core/api",
        "//tensorflow/contrib/lite/kernels:eigen_support",
        "//tensorflow/contrib/lite/kernels:gemm_support",
        "//tensorflow/contrib/lite/nnapi:nnapi_lib",
        "//tensorflow/contrib/lite/profiling:profiler",
        "//tensorflow/contrib/lite/schema:schema_fbs",
    ],
)

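Whereas TensorFlow Mobile is a trimmed-down version of the full TensorFlow, TensorFlow Lite is essentially a reimplementation. Internally, even the most basic data structures of the core, such as OP and Context, are new. Externally, the model file format changed from Protocol Buffers (PB) to FlatBuffers, and the size of the library was reduced substantially, to about 300 KB. A converter is provided to turn ordinary TensorFlow models into the format TensorFlow Lite requires. From every angle, then, TensorFlow Lite is a new implementation.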
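
One reason a FlatBuffers-style format suits mobile is that data can be read in place by following offsets, without first parsing the whole file into objects. The sketch below shows that idea with a toy layout. It is not the real FlatBuffers wire format or the TensorFlow Lite schema, and `build_buffer` and `read_tensor` are hypothetical helpers.

```python
import struct


def build_buffer(tensors):
    """Pack float32 tensors into one buffer.

    Toy layout: a uint32 count, then one (offset, length) pair per tensor,
    then the raw float data. Native byte order is used throughout.
    """
    header_size = 4 + 8 * len(tensors)
    table = []
    data = b""
    offset = header_size
    for values in tensors:
        blob = struct.pack("=%df" % len(values), *values)
        table.append((offset, len(values)))
        data += blob
        offset += len(blob)
    header = struct.pack("=I", len(tensors))
    for off, n in table:
        header += struct.pack("=II", off, n)
    return header + data


def read_tensor(buf, index):
    """Return one tensor as a float32 view; other tensors are never decoded."""
    view = memoryview(buf)
    (count,) = struct.unpack_from("=I", view, 0)
    if not 0 <= index < count:
        raise IndexError(index)
    off, n = struct.unpack_from("=II", view, 4 + 8 * index)
    # Slicing and casting a memoryview does not copy the underlying bytes.
    return view[off:off + 4 * n].cast("f")


buf = build_buffer([[1.0, 2.0], [0.5, 0.25, 0.125]])
second = read_tensor(buf, 1)
print(list(second))  # [0.5, 0.25, 0.125]
```

With a memory-mapped model file, the same access pattern lets a runtime touch only the bytes it actually needs, which matches the `mmap_allocation.cc` source listed in the build rule above.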

Reposted from: http://dqykl.baihongyu.com/
