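TensorFlow is a machine learning framework whose overall architecture is divided into three main parts: Client, Master, and Worker. This decoupled design makes it highly flexible and easy to deploy on a cluster of machines.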
[Figure: overall TensorFlow architecture. The image source is not given.]
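The Client is the part that algorithm engineers work with directly. It is available in several language versions, including Python, C++, and Java. Its main functions are: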
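Running a model directly on the device has several advantages: it saves bandwidth, responds quickly, is more stable because it does not depend on network quality or connectivity, and is more secure because no data has to be transmitted. There is therefore real demand for on-device inference. Running TensorFlow on mobile or other embedded devices has different priorities from running it in the cloud: lower power consumption, higher speed, and a smaller binary size matter most. For mobile devices there are currently two solutions, TensorFlow Mobile and TensorFlow Lite. TensorFlow Mobile came out earlier and is fairly stable, but it was not heavily optimized for mobile in terms of performance. It is no longer recommended and is expected to be deprecated in early 2019.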
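According to the official website, the main differences between TensorFlow Mobile and TensorFlow Lite are:

That is the official description, but it still leaves things vague. What exactly did TensorFlow Mobile trim away, and which ops does it support? How does the implementation of TensorFlow Lite actually differ? The only way to answer these questions is to analyze the source code.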
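The tensorflow/core directory contains the code of TF's core modules.
  public: API header files that define the interface for external callers, mainly session.h and tensor_c_api.h.
  client: implementation files for the API.
  platform: OS-related interfaces, such as file system and env.
  protobuf: .proto files only, used to serialize structures for data transfer.
  common_runtime: the common runtime library, including session, executor, threadpool, rendezvous, memory management, and device allocation algorithms.
  distributed_runtime: distributed execution modules, such as rpc session, rpc master, rpc worker, and graph manager.
  framework: basic functional modules, such as log, memory, and tensor.
  graph: computation graph operations, such as construct, partition, optimize, and execute.
  kernels: core op implementations, such as matmul, conv2d, argmax, and batch_norm.
  lib: common base libraries, such as gif, gtl (Google template library), hash, and histogram.
  ops: basic op definitions, op gradients, IO-related ops, and control flow and data flow operations.
The tensorflow/stream_executor directory is a parallel computing framework developed by Google's stream executor team.
The tensorflow/contrib directory is where contributors develop. Its android subdirectory holds the Android version of TensorFlow Mobile, and its lite subdirectory holds the TensorFlow Lite source code.
The tensorflow/python directory contains the Python API client scripts.
The tensorflow/tensorboard directory is the visualization and analysis tool. It can visualize models and also monitor how model parameters change.
The third_party directory contains TF's third-party dependencies.
  eigen3: the Eigen matrix library, called by TF's basic ops.
  gpus: wrappers around the CUDA/cuDNN programming libraries.
TensorFlow is built with Bazel, so we can analyze the differences by reading the build files.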
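Reading BUILD files by eye gets tedious when a target has dozens of dependencies. As a rough aid, the sketch below pulls the labels out of the `deps` attribute of one named target using only the Python standard library. The helper names `_find_end` and `target_deps` are made up for this illustration, and the parsing rests on simplifying assumptions: `name` is the first attribute of the rule, string literals contain no escaped quotes, and `select()` condition keys are recognized by the colon that follows them. It is not a real Starlark parser; `bazel query` is the authoritative tool.

```python
import re
from typing import List


def _find_end(text: str, start: int, stop_at_comma: bool) -> int:
    """Scan from `start` to the end of the enclosing expression.

    Returns the index of the first unmatched closing bracket, or of the
    first top-level comma when `stop_at_comma` is true.
    """
    depth = 0
    i = start
    while i < len(text):
        ch = text[i]
        if ch == '"':
            # Skip a string literal (escaped quotes are not handled).
            i = text.find('"', i + 1)
            if i == -1:
                return len(text)
        elif ch == "#":
            # Skip a comment up to the end of the line.
            i = text.find("\n", i)
            if i == -1:
                return len(text)
        elif ch in "([{":
            depth += 1
        elif ch in ")]}":
            if depth == 0:
                return i
            depth -= 1
        elif ch == "," and depth == 0 and stop_at_comma:
            return i
        i += 1
    return len(text)


def target_deps(build_text: str, target_name: str) -> List[str]:
    """Return the de-duplicated labels listed in `deps` of `target_name`."""
    name = re.search(r'\bname\s*=\s*"%s"' % re.escape(target_name), build_text)
    if name is None:
        return []
    # Everything between the name attribute and the rule's closing paren.
    body = build_text[name.end():_find_end(build_text, name.end(), False)]
    deps = re.search(r"\bdeps\s*=", body)
    if deps is None:
        return []
    expr = body[deps.end():_find_end(body, deps.end(), True)]
    labels = []
    for literal in re.finditer(r'"([^"]*)"', expr):
        if expr[literal.end():].lstrip().startswith(":"):
            continue  # a select() condition key, not a dependency
        labels.append(literal.group(1))
    return list(dict.fromkeys(labels))


SAMPLE_BUILD = '''
tf_cuda_library(
    name = "c_api",
    srcs = ["c_api.cc"],
    deps = select({
        "//tensorflow:android": [
            ":c_api_internal",
            "//tensorflow/core:android_tensorflow_lib_lite",
        ],
        "//conditions:default": [
            ":c_api_internal",
            "//tensorflow/core:framework",
        ],
    }),
)
'''

if __name__ == "__main__":
    for label in target_deps(SAMPLE_BUILD, "c_api"):
        print(label)
```

Note that the result merges every `select()` branch, so it shows the union of dependencies across platforms rather than what a particular platform would link.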
===== /tensorflow/BUILD =====
tf_cc_shared_object(
    name = "libtensorflow.so",
    linkopts = select({
        "//tensorflow:darwin": [
            "-Wl,-exported_symbols_list",  # This line must be directly followed by the exported_symbols.lds file
            "$(location //tensorflow/c:exported_symbols.lds)",
            "-Wl,-install_name,@rpath/libtensorflow.so",
        ],
        "//tensorflow:windows": [],
        "//conditions:default": [
            "-z defs",
            "-Wl,--version-script",  # This line must be directly followed by the version_script.lds file
            "$(location //tensorflow/c:version_script.lds)",
        ],
    }),
    visibility = ["//visibility:public"],
    deps = [
        "//tensorflow/c:c_api",
        "//tensorflow/c:c_api_experimental",
        "//tensorflow/c:exported_symbols.lds",
        "//tensorflow/c:version_script.lds",
        "//tensorflow/c/eager:c_api",
        "//tensorflow/core:tensorflow",
    ],
)

===== /tensorflow/c/BUILD =====
tf_cuda_library(
    name = "c_api",
    srcs = [
        "c_api.cc",
        "c_api_function.cc",
    ],
    hdrs = [
        "c_api.h",
    ],
    copts = tf_copts(),
    visibility = ["//visibility:public"],
    deps = select({
        "//tensorflow:android": [
            ":c_api_internal",
            "//tensorflow/core:android_tensorflow_lib_lite",
        ],
        "//conditions:default": [
            ":c_api_internal",
            "//tensorflow/cc/saved_model:loader",
            "//tensorflow/cc:gradients",
            "//tensorflow/cc:ops",
            "//tensorflow/cc:grad_ops",
            "//tensorflow/cc:scope_internal",
            "//tensorflow/cc:while_loop",
            "//tensorflow/core:core_cpu",
            "//tensorflow/core:core_cpu_internal",
            "//tensorflow/core:framework",
            "//tensorflow/core:op_gen_lib",
            "//tensorflow/core:protos_all_cc",
            "//tensorflow/core:lib",
            "//tensorflow/core:lib_internal",
        ],
    }) + select({
        "//tensorflow:with_xla_support": [
            "//tensorflow/compiler/tf2xla:xla_compiler",
            "//tensorflow/compiler/jit",
        ],
        "//conditions:default": [],
    }),
)

tf_cuda_library(
    name = "c_api_experimental",
    srcs = [
        "c_api_experimental.cc",
    ],
    hdrs = [
        "c_api_experimental.h",
    ],
    copts = tf_copts(),
    visibility = ["//visibility:public"],
    deps = [
        ":c_api",
        ":c_api_internal",
        "//tensorflow/c/eager:c_api",
        "//tensorflow/compiler/jit/legacy_flags:mark_for_compilation_pass_flags",
        "//tensorflow/contrib/tpu:all_ops",
        "//tensorflow/core:core_cpu",
        "//tensorflow/core:framework",
        "//tensorflow/core:lib",
        "//tensorflow/core:lib_platform",
        "//tensorflow/core:protos_all_cc",
    ],
)

===== /tensorflow/c/eager/BUILD =====
tf_cuda_library(
    name = "c_api",
    srcs = [
        "c_api.cc",
        "c_api_debug.cc",
        "c_api_internal.h",
    ],
    hdrs = ["c_api.h"],
    copts = tf_copts() + tfe_xla_copts(),
    visibility = ["//visibility:public"],
    deps = select({
        "//tensorflow:android": [
            "//tensorflow/core:android_tensorflow_lib_lite",
        ],
        "//conditions:default": [
            "//tensorflow/c:c_api",
            "//tensorflow/c:c_api_internal",
            "//tensorflow/core:core_cpu",
            "//tensorflow/core/common_runtime/eager:attr_builder",
            "//tensorflow/core/common_runtime/eager:context",
            "//tensorflow/core/common_runtime/eager:eager_executor",
            "//tensorflow/core/common_runtime/eager:execute",
            "//tensorflow/core/common_runtime/eager:kernel_and_device",
            "//tensorflow/core/common_runtime/eager:tensor_handle",
            "//tensorflow/core/common_runtime/eager:copy_to_device_node",
            "//tensorflow/core:core_cpu_internal",
            "//tensorflow/core:framework",
            "//tensorflow/core:framework_internal",
            "//tensorflow/core:lib",
            "//tensorflow/core:lib_internal",
            "//tensorflow/core:protos_all_cc",
        ],
    }) + select({
        "//tensorflow:with_xla_support": [
            "//tensorflow/compiler/tf2xla:xla_compiler",
            "//tensorflow/compiler/jit",
            "//tensorflow/compiler/jit:xla_device",
        ],
        "//conditions:default": [],
    }) + [
        "//tensorflow/core/common_runtime/eager:eager_operation",
        "//tensorflow/core/distributed_runtime/eager:eager_client",
        "//tensorflow/core/distributed_runtime/rpc/eager:grpc_eager_client",
        "//tensorflow/core/distributed_runtime/rpc:grpc_channel",
        "//tensorflow/core/distributed_runtime/rpc:grpc_server_lib",
        "//tensorflow/core/distributed_runtime/rpc:grpc_worker_cache",
        "//tensorflow/core/distributed_runtime/rpc:grpc_worker_service",
        "//tensorflow/core/distributed_runtime/rpc:rpc_rendezvous_mgr",
        "//tensorflow/core/distributed_runtime:remote_device",
        "//tensorflow/core/distributed_runtime:server_lib",
        "//tensorflow/core/distributed_runtime:worker_env",
        "//tensorflow/core:gpu_runtime",
    ],
)

===== /tensorflow/core/BUILD =====
cc_library(
    name = "tensorflow",
    visibility = ["//visibility:public"],
    deps = [
        ":tensorflow_opensource",
        "//tensorflow/core/platform/default/build_config:tensorflow_platform_specific",
    ],
)

tf_cuda_library(
    name = "tensorflow_opensource",
    copts = tf_copts(),
    visibility = ["//visibility:public"],
    deps = [
        ":all_kernels",
        ":core",
        ":direct_session",
        ":example_parser_configuration",
        ":gpu_runtime",
        ":lib",
    ],
)

cc_library(
    name = "all_kernels",
    visibility = ["//visibility:public"],
    deps = if_dynamic_kernels(
        [],
        otherwise = [":all_kernels_statically_linked"],
    ),
)

# This is a link-only library to provide a DirectSession
# implementation of the Session interface.
tf_cuda_library(
    name = "direct_session",
    copts = tf_copts(),
    linkstatic = 1,
    visibility = ["//visibility:public"],
    deps = [
        ":direct_session_internal",
    ],
    alwayslink = 1,
)

filegroup(
    name = "example_parser_configuration_testdata",
    srcs = [
        "example/testdata/parse_example_graph_def.pbtxt",
    ],
)

cc_library(
    name = "core",
    visibility = ["//visibility:public"],
    deps = [
        ":core_cpu",
        ":gpu_runtime",
        ":sycl_runtime",
    ],
)

cc_library(
    name = "lib",
    hdrs = [
        "lib/bfloat16/bfloat16.h",
        "lib/core/arena.h",
        "lib/core/bitmap.h",
        "lib/core/bits.h",
        "lib/core/casts.h",
        "lib/core/coding.h",
        "lib/core/errors.h",
        "lib/core/notification.h",
        "lib/core/raw_coding.h",
        "lib/core/status.h",
        "lib/core/stringpiece.h",
        "lib/core/threadpool.h",
        "lib/gtl/array_slice.h",
        "lib/gtl/cleanup.h",
        "lib/gtl/compactptrset.h",
        "lib/gtl/flatmap.h",
        "lib/gtl/flatset.h",
        "lib/gtl/inlined_vector.h",
        "lib/gtl/optional.h",
        "lib/gtl/priority_queue_util.h",
        "lib/hash/crc32c.h",
        "lib/hash/hash.h",
        "lib/histogram/histogram.h",
        "lib/io/buffered_inputstream.h",
        "lib/io/compression.h",
        "lib/io/inputstream_interface.h",
        "lib/io/path.h",
        "lib/io/proto_encode_helper.h",
        "lib/io/random_inputstream.h",
        "lib/io/record_reader.h",
        "lib/io/record_writer.h",
        "lib/io/table.h",
        "lib/io/table_builder.h",
        "lib/io/table_options.h",
        "lib/math/math_util.h",
        "lib/monitoring/collected_metrics.h",
        "lib/monitoring/collection_registry.h",
        "lib/monitoring/counter.h",
        "lib/monitoring/gauge.h",
        "lib/monitoring/metric_def.h",
        "lib/monitoring/sampler.h",
        "lib/random/distribution_sampler.h",
        "lib/random/philox_random.h",
        "lib/random/random_distributions.h",
        "lib/random/simple_philox.h",
        "lib/strings/numbers.h",
        "lib/strings/proto_serialization.h",
        "lib/strings/str_util.h",
        "lib/strings/strcat.h",
        "lib/strings/stringprintf.h",
        ":platform_base_hdrs",
        ":platform_env_hdrs",
        ":platform_file_system_hdrs",
        ":platform_other_hdrs",
        ":platform_port_hdrs",
        ":platform_protobuf_hdrs",
    ],
    visibility = ["//visibility:public"],
    deps = [
        ":lib_internal",
        "@com_google_absl//absl/container:inlined_vector",
        "@com_google_absl//absl/strings",
        "@com_google_absl//absl/types:optional",
    ],
)

# This includes implementations of all kernels built into TensorFlow.
cc_library(
    name = "all_kernels_statically_linked",
    visibility = ["//visibility:private"],
    deps = [
        "//tensorflow/core/kernels:array",
        "//tensorflow/core/kernels:audio",
        "//tensorflow/core/kernels:batch_kernels",
        "//tensorflow/core/kernels:bincount_op",
        "//tensorflow/core/kernels:boosted_trees_ops",
        "//tensorflow/core/kernels:candidate_sampler_ops",
        "//tensorflow/core/kernels:checkpoint_ops",
        "//tensorflow/core/kernels:collective_ops",
        "//tensorflow/core/kernels:control_flow_ops",
        "//tensorflow/core/kernels:ctc_ops",
        "//tensorflow/core/kernels:cudnn_rnn_kernels",
        "//tensorflow/core/kernels:data_flow",
        "//tensorflow/core/kernels:dataset_ops",
        "//tensorflow/core/kernels:decode_proto_op",
        "//tensorflow/core/kernels:encode_proto_op",
        "//tensorflow/core/kernels:fake_quant_ops",
        "//tensorflow/core/kernels:function_ops",
        "//tensorflow/core/kernels:functional_ops",
        "//tensorflow/core/kernels:grappler",
        "//tensorflow/core/kernels:histogram_op",
        "//tensorflow/core/kernels:image",
        "//tensorflow/core/kernels:io",
        "//tensorflow/core/kernels:linalg",
        "//tensorflow/core/kernels:list_kernels",
        "//tensorflow/core/kernels:lookup",
        "//tensorflow/core/kernels:logging",
        "//tensorflow/core/kernels:manip",
        "//tensorflow/core/kernels:math",
        "//tensorflow/core/kernels:multinomial_op",
        "//tensorflow/core/kernels:nn",
        "//tensorflow/core/kernels:parameterized_truncated_normal_op",
        "//tensorflow/core/kernels:parsing",
        "//tensorflow/core/kernels:partitioned_function_ops",
        "//tensorflow/core/kernels:random_ops",
        "//tensorflow/core/kernels:random_poisson_op",
        "//tensorflow/core/kernels:remote_fused_graph_ops",
        "//tensorflow/core/kernels:required",
        "//tensorflow/core/kernels:resource_variable_ops",
        "//tensorflow/core/kernels:rpc_op",
        "//tensorflow/core/kernels:scoped_allocator_ops",
        "//tensorflow/core/kernels:sdca_ops",
        "//tensorflow/core/kernels:searchsorted_op",
        "//tensorflow/core/kernels:set_kernels",
        "//tensorflow/core/kernels:sparse",
        "//tensorflow/core/kernels:state",
        "//tensorflow/core/kernels:stateless_random_ops",
        "//tensorflow/core/kernels:string",
        "//tensorflow/core/kernels:summary_kernels",
        "//tensorflow/core/kernels:training_ops",
        "//tensorflow/core/kernels:word2vec_kernels",
    ] + tf_additional_cloud_kernel_deps() + if_not_windows([
        "//tensorflow/core/kernels:fact_op",
        "//tensorflow/core/kernels:array_not_windows",
        "//tensorflow/core/kernels:math_not_windows",
        "//tensorflow/core/kernels:quantized_ops",
        "//tensorflow/core/kernels/neon:neon_depthwise_conv_op",
    ]) + if_mkl([
        "//tensorflow/core/kernels:mkl_concat_op",
        "//tensorflow/core/kernels:mkl_conv_op",
        "//tensorflow/core/kernels:mkl_cwise_ops_common",
        "//tensorflow/core/kernels:mkl_fused_batch_norm_op",
        "//tensorflow/core/kernels:mkl_identity_op",
        "//tensorflow/core/kernels:mkl_input_conversion_op",
        "//tensorflow/core/kernels:mkl_lrn_op",
        "//tensorflow/core/kernels:mkl_pooling_ops",
        "//tensorflow/core/kernels:mkl_relu_op",
        "//tensorflow/core/kernels:mkl_reshape_op",
        "//tensorflow/core/kernels:mkl_slice_op",
        "//tensorflow/core/kernels:mkl_softmax_op",
        "//tensorflow/core/kernels:mkl_transpose_op",
        "//tensorflow/core/kernels:mkl_tfconv_op",
        "//tensorflow/core/kernels:mkl_aggregate_ops",
    ]) + if_cuda([
        "//tensorflow/core/grappler/optimizers:gpu_swapping_kernels",
        "//tensorflow/core/grappler/optimizers:gpu_swapping_ops",
    ]),
)
===== tensorflow/contrib/android/BUILD =====
cc_binary(
    name = "libtensorflow_inference.so",
    srcs = [],
    copts = tf_copts() + [
        "-ffunction-sections",
        "-fdata-sections",
    ],
    linkopts = if_android([
        "-landroid",
        "-latomic",
        "-ldl",
        "-llog",
        "-lm",
        "-z defs",
        "-s",
        "-Wl,--gc-sections",
        "-Wl,--version-script",  # This line must be directly followed by LINKER_SCRIPT.
        "$(location {})".format(LINKER_SCRIPT),
    ]),
    linkshared = 1,
    linkstatic = 1,
    tags = [
        "manual",
        "notap",
    ],
    deps = [
        ":android_tensorflow_inference_jni",
        "//tensorflow/core:android_tensorflow_lib",
        LINKER_SCRIPT,
    ],
)

cc_library(
    name = "android_tensorflow_inference_jni",
    srcs = if_android([":android_tensorflow_inference_jni_srcs"]),
    copts = tf_copts(),
    visibility = ["//visibility:public"],
    deps = [
        "//tensorflow/core:android_tensorflow_lib_lite",
        "//tensorflow/java/src/main/native",
    ],
    alwayslink = 1,
)

===== tensorflow/core/BUILD =====
cc_library(
    name = "android_tensorflow_lib",
    srcs = if_android([":android_op_registrations_and_gradients"]),
    copts = tf_copts(),
    tags = [
        "manual",
        "notap",
    ],
    visibility = ["//visibility:public"],
    deps = [
        ":android_tensorflow_lib_lite",
        ":protos_all_cc_impl",
        "//tensorflow/core/kernels:android_tensorflow_kernels",
        "//third_party/eigen3",
        "@protobuf_archive//:protobuf",
    ],
    alwayslink = 1,
)

cc_library(
    name = "android_tensorflow_lib_lite",
    srcs = if_android(["//tensorflow/core:android_srcs"]),
    copts = tf_copts(android_optimization_level_override = None),
    linkopts = ["-lz"],
    tags = [
        "manual",
        "notap",
    ],
    visibility = ["//visibility:public"],
    deps = [
        ":mobile_additional_lib_deps",
        ":protos_all_cc_impl",
        ":stats_calculator_portable",
        "//third_party/eigen3",
        "@double_conversion//:double-conversion",
        "@nsync//:nsync_cpp",
        "@protobuf_archive//:protobuf",
    ],
    alwayslink = 1,
)

alias(
    name = "android_srcs",
    actual = ":mobile_srcs",
    visibility = ["//visibility:public"],
)

filegroup(
    name = "mobile_srcs",
    srcs = [
        ":mobile_srcs_no_runtime",
        ":mobile_srcs_only_runtime",
    ],
    visibility = ["//visibility:public"],
)

# Core sources for Android builds.
filegroup(
    name = "mobile_srcs_no_runtime",
    srcs = [
        ":protos_all_proto_text_srcs",
        ":error_codes_proto_text_srcs",
        "//tensorflow/core/platform/default/build_config:android_srcs",
    ] + glob(
        [
            "client/**/*.cc",
            "framework/**/*.h",
            "framework/**/*.cc",
            "lib/**/*.h",
            "lib/**/*.cc",
            "platform/**/*.h",
            "platform/**/*.cc",
            "public/**/*.h",
            "util/**/*.h",
            "util/**/*.cc",
        ],
        exclude = [
            "**/*test.*",
            "**/*testutil*",
            "**/*testlib*",
            "**/*main.cc",
            "debug/**/*",
            "framework/op_gen_*",
            "lib/jpeg/**/*",
            "lib/png/**/*",
            "lib/gif/**/*",
            "util/events_writer.*",
            "util/stats_calculator.*",
            "util/reporter.*",
            "platform/**/cuda_libdevice_path.*",
            "platform/default/test_benchmark.*",
            "platform/cuda.h",
            "platform/google/**/*",
            "platform/hadoop/**/*",
            "platform/gif.h",
            "platform/jpeg.h",
            "platform/png.h",
            "platform/stream_executor.*",
            "platform/windows/**/*",
            "user_ops/**/*.cu.cc",
            "util/ctc/*.h",
            "util/ctc/*.cc",
            "util/tensor_bundle/*.h",
            "util/tensor_bundle/*.cc",
            "common_runtime/gpu/**/*",
            "common_runtime/eager/*",
            "common_runtime/gpu_device_factory.*",
        ],
    ),
    visibility = ["//visibility:public"],
)

filegroup(
    name = "mobile_srcs_only_runtime",
    srcs = [
        "//tensorflow/core/kernels:android_srcs",
        "//tensorflow/core/util/ctc:android_srcs",
        "//tensorflow/core/util/tensor_bundle:android_srcs",
    ] + glob(
        [
            "common_runtime/**/*.h",
            "common_runtime/**/*.cc",
            "graph/**/*.h",
            "graph/**/*.cc",
        ],
        exclude = [
            "**/*test.*",
            "**/*testutil*",
            "**/*testlib*",
            "**/*main.cc",
            "common_runtime/gpu/**/*",
            "common_runtime/eager/*",
            "common_runtime/gpu_device_factory.*",
            "graph/dot.*",
        ],
    ),
    visibility = ["//visibility:public"],
)

cc_library(
    name = "stats_calculator_portable",
    srcs = [
        "util/stat_summarizer_options.h",
        "util/stats_calculator.cc",
    ],
    hdrs = [
        "util/stats_calculator.h",
    ],
    copts = tf_copts(),
)

cc_library(
    name = "mobile_additional_lib_deps",
    deps = tf_additional_lib_deps() + [
        "@com_google_absl//absl/strings",
    ],
)

===== tensorflow/core/kernels/BUILD =====
cc_library(
    name = "android_tensorflow_kernels",
    srcs = select({
        "//tensorflow:android": [
            "//tensorflow/core/kernels:android_core_ops",
            "//tensorflow/core/kernels:android_extended_ops",
        ],
        "//conditions:default": [],
    }),
    copts = tf_copts(),
    linkopts = select({
        "//tensorflow:android": [
            "-ldl",
        ],
        "//conditions:default": [],
    }),
    tags = [
        "manual",
        "notap",
    ],
    visibility = ["//visibility:public"],
    deps = [
        "//tensorflow/core:android_tensorflow_lib_lite",
        "//tensorflow/core:protos_all_cc_impl",
        "//third_party/eigen3",
        "//third_party/fft2d:fft2d_headers",
        "@fft2d",
        "@gemmlowp",
        "@protobuf_archive//:protobuf",
    ],
    alwayslink = 1,
)

# Core kernels we want on Android. Only a subset of kernels to keep
# base library small.
filegroup(
    name = "android_core_ops",
    srcs = [
        "aggregate_ops.cc",
        "aggregate_ops.h",
        "aggregate_ops_cpu.h",
        "assign_op.h",
        "bias_op.cc",
        "bias_op.h",
        "bounds_check.h",
        "cast_op.cc",
        "cast_op.h",
        "cast_op_impl.h",
        "cast_op_impl_bfloat.cc",
        "cast_op_impl_bool.cc",
        "cast_op_impl_complex128.cc",
        "cast_op_impl_complex64.cc",
        "cast_op_impl_double.cc",
        "cast_op_impl_float.cc",
        "cast_op_impl_half.cc",
        "cast_op_impl_int16.cc",
        "cast_op_impl_int32.cc",
        "cast_op_impl_int64.cc",
        "cast_op_impl_int8.cc",
        "cast_op_impl_uint16.cc",
        "cast_op_impl_uint32.cc",
        "cast_op_impl_uint64.cc",
        "cast_op_impl_uint8.cc",
        "concat_lib.h",
        "concat_lib_cpu.cc",
        "concat_lib_cpu.h",
        "concat_op.cc",
        "constant_op.cc",
        "constant_op.h",
        "cwise_ops.h",
        "cwise_ops_common.cc",
        "cwise_ops_common.h",
        "cwise_ops_gradients.h",
        "dense_update_functor.cc",
        "dense_update_functor.h",
        "dense_update_ops.cc",
        "example_parsing_ops.cc",
        "fill_functor.cc",
        "fill_functor.h",
        "function_ops.cc",
        "function_ops.h",
        "gather_functor.h",
        "gather_nd_op.cc",
        "gather_nd_op.h",
        "gather_nd_op_cpu_impl.h",
        "gather_nd_op_cpu_impl_0.cc",
        "gather_nd_op_cpu_impl_1.cc",
        "gather_nd_op_cpu_impl_2.cc",
        "gather_nd_op_cpu_impl_3.cc",
        "gather_nd_op_cpu_impl_4.cc",
        "gather_nd_op_cpu_impl_5.cc",
        "gather_nd_op_cpu_impl_6.cc",
        "gather_nd_op_cpu_impl_7.cc",
        "gather_op.cc",
        "identity_n_op.cc",
        "identity_n_op.h",
        "identity_op.cc",
        "identity_op.h",
        "immutable_constant_op.cc",
        "immutable_constant_op.h",
        "matmul_op.cc",
        "matmul_op.h",
        "no_op.cc",
        "no_op.h",
        "non_max_suppression_op.cc",
        "non_max_suppression_op.h",
        "one_hot_op.cc",
        "one_hot_op.h",
        "ops_util.h",
        "pack_op.cc",
        "pooling_ops_common.h",
        "reshape_op.cc",
        "reshape_op.h",
        "reverse_sequence_op.cc",
        "reverse_sequence_op.h",
        "sendrecv_ops.cc",
        "sendrecv_ops.h",
        "sequence_ops.cc",
        "shape_ops.cc",
        "shape_ops.h",
        "slice_op.cc",
        "slice_op.h",
        "slice_op_cpu_impl.h",
        "slice_op_cpu_impl_1.cc",
        "slice_op_cpu_impl_2.cc",
        "slice_op_cpu_impl_3.cc",
        "slice_op_cpu_impl_4.cc",
        "slice_op_cpu_impl_5.cc",
        "slice_op_cpu_impl_6.cc",
        "slice_op_cpu_impl_7.cc",
        "softmax_op.cc",
        "softmax_op_functor.h",
        "split_lib.h",
        "split_lib_cpu.cc",
        "split_op.cc",
        "split_v_op.cc",
        "strided_slice_op.cc",
        "strided_slice_op.h",
        "strided_slice_op_impl.h",
        "strided_slice_op_inst_0.cc",
        "strided_slice_op_inst_1.cc",
        "strided_slice_op_inst_2.cc",
        "strided_slice_op_inst_3.cc",
        "strided_slice_op_inst_4.cc",
        "strided_slice_op_inst_5.cc",
        "strided_slice_op_inst_6.cc",
        "strided_slice_op_inst_7.cc",
        "unpack_op.cc",
        "variable_ops.cc",
        "variable_ops.h",
    ],
)

# Other kernels we may want on Android.
#
# The kernels can be consumed as a whole or in two groups for
# supporting separate compilation. Note that the split into groups
# is entirely for improving compilation time, and not for
# organizational reasons; you should not depend on any
# of those groups independently.
filegroup(
    name = "android_extended_ops",
    srcs = [
        ":android_extended_ops_group1",
        ":android_extended_ops_group2",
        ":android_quantized_ops",
    ],
    visibility = ["//visibility:public"],
)

filegroup(
    name = "android_extended_ops_headers",
    srcs = [
        "argmax_op.h",
        "avgpooling_op.h",
        "batch_matmul_op_impl.h",
        "batch_norm_op.h",
        "control_flow_ops.h",
        "conv_2d.h",
        "conv_ops.h",
        "data_format_ops.h",
        "depthtospace_op.h",
        "depthwise_conv_op.h",
        "fake_quant_ops_functor.h",
        "fused_batch_norm_op.h",
        "gemm_functors.h",
        "image_resizer_state.h",
        "initializable_lookup_table.h",
        "lookup_table_init_op.h",
        "lookup_table_op.h",
        "lookup_util.h",
        "maxpooling_op.h",
        "mfcc.h",
        "mfcc_dct.h",
        "mfcc_mel_filterbank.h",
        "mirror_pad_op.h",
        "mirror_pad_op_cpu_impl.h",
        "pad_op.h",
        "random_op.h",
        "reduction_ops.h",
        "reduction_ops_common.h",
        "relu_op.h",
        "relu_op_functor.h",
        "reshape_util.h",
        "resize_bilinear_op.h",
        "resize_nearest_neighbor_op.h",
        "reverse_op.h",
        "save_restore_tensor.h",
        "segment_reduction_ops.h",
        "softplus_op.h",
        "softsign_op.h",
        "spacetobatch_functor.h",
        "spacetodepth_op.h",
        "spectrogram.h",
        "string_util.h",
        "tensor_array.h",
        "tile_functor.h",
        "tile_ops_cpu_impl.h",
        "tile_ops_impl.h",
        "topk_op.h",
        "training_op_helpers.h",
        "training_ops.h",
        "transpose_functor.h",
        "transpose_op.h",
        "where_op.h",
        "xent_op.h",
    ],
)

filegroup(
    name = "android_extended_ops_group1",
    srcs = [
        "argmax_op.cc",
        "avgpooling_op.cc",
        "batch_matmul_op_real.cc",
        "batch_norm_op.cc",
        "bcast_ops.cc",
        "check_numerics_op.cc",
        "control_flow_ops.cc",
        "conv_2d.h",
        "conv_grad_filter_ops.cc",
        "conv_grad_input_ops.cc",
        "conv_grad_ops.cc",
        "conv_grad_ops.h",
        "conv_ops.cc",
        "conv_ops_fused.cc",
        "conv_ops_using_gemm.cc",
        "crop_and_resize_op.cc",
        "crop_and_resize_op.h",
        "cwise_op_abs.cc",
        "cwise_op_add_1.cc",
        "cwise_op_add_2.cc",
        "cwise_op_bitwise_and.cc",
        "cwise_op_bitwise_or.cc",
        "cwise_op_bitwise_xor.cc",
        "cwise_op_div.cc",
        "cwise_op_equal_to_1.cc",
        "cwise_op_equal_to_2.cc",
        "cwise_op_not_equal_to_1.cc",
        "cwise_op_not_equal_to_2.cc",
        "cwise_op_exp.cc",
        "cwise_op_floor.cc",
        "cwise_op_floor_div.cc",
        "cwise_op_floor_mod.cc",
        "cwise_op_greater.cc",
        "cwise_op_greater_equal.cc",
        "cwise_op_invert.cc",
        "cwise_op_isfinite.cc",
        "cwise_op_isnan.cc",
        "cwise_op_left_shift.cc",
        "cwise_op_less.cc",
        "cwise_op_less_equal.cc",
        "cwise_op_log.cc",
        "cwise_op_logical_and.cc",
        "cwise_op_logical_not.cc",
        "cwise_op_logical_or.cc",
        "cwise_op_maximum.cc",
        "cwise_op_minimum.cc",
        "cwise_op_mul_1.cc",
        "cwise_op_mul_2.cc",
        "cwise_op_neg.cc",
        "cwise_op_pow.cc",
        "cwise_op_reciprocal.cc",
        "cwise_op_right_shift.cc",
        "cwise_op_round.cc",
        "cwise_op_rsqrt.cc",
        "cwise_op_select.cc",
        "cwise_op_sigmoid.cc",
        "cwise_op_sign.cc",
        "cwise_op_sqrt.cc",
        "cwise_op_square.cc",
        "cwise_op_squared_difference.cc",
        "cwise_op_sub.cc",
        "cwise_op_tanh.cc",
        "cwise_op_xlogy.cc",
        "cwise_op_xdivy.cc",
        "data_format_ops.cc",
        "decode_wav_op.cc",
        "deep_conv2d.cc",
        "deep_conv2d.h",
        "depthwise_conv_op.cc",
        "dynamic_partition_op.cc",
        "encode_wav_op.cc",
        "fake_quant_ops.cc",
        "fifo_queue.cc",
        "fifo_queue_op.cc",
        "fused_batch_norm_op.cc",
        "listdiff_op.cc",
        "population_count_op.cc",
        "population_count_op.h",
        "winograd_transform.h",
        ":android_extended_ops_headers",
    ] + select({
        ":xsmm_convolutions": [
            "xsmm_conv2d.h",
            "xsmm_conv2d.cc",
        ],
        "//conditions:default": [],
    }),
)

filegroup(
    name = "android_extended_ops_group2",
    srcs = [
        "batchtospace_op.cc",
        "ctc_decoder_ops.cc",
        "decode_bmp_op.cc",
        "depthtospace_op.cc",
        "dynamic_stitch_op.cc",
        "in_topk_op.cc",
        "initializable_lookup_table.cc",
        "logging_ops.cc",
        "lookup_table_init_op.cc",
        "lookup_table_op.cc",
        "lookup_util.cc",
        "lrn_op.cc",
        "maxpooling_op.cc",
        "mfcc.cc",
        "mfcc_dct.cc",
        "mfcc_mel_filterbank.cc",
        "mfcc_op.cc",
        "mirror_pad_op.cc",
        "mirror_pad_op_cpu_impl_1.cc",
        "mirror_pad_op_cpu_impl_2.cc",
        "mirror_pad_op_cpu_impl_3.cc",
        "mirror_pad_op_cpu_impl_4.cc",
        "mirror_pad_op_cpu_impl_5.cc",
        "pad_op.cc",
        "padding_fifo_queue.cc",
        "padding_fifo_queue_op.cc",
        "queue_base.cc",
        "queue_op.cc",
        "queue_ops.cc",
        "random_op.cc",
        "reduction_ops_all.cc",
        "reduction_ops_any.cc",
        "reduction_ops_common.cc",
        "reduction_ops_max.cc",
        "reduction_ops_mean.cc",
        "reduction_ops_min.cc",
        "reduction_ops_prod.cc",
        "reduction_ops_sum.cc",
        "relu_op.cc",
        "reshape_util.cc",
        "resize_bilinear_op.cc",
        "resize_nearest_neighbor_op.cc",
        "restore_op.cc",
        "reverse_op.cc",
        "save_op.cc",
        "save_restore_tensor.cc",
        "save_restore_v2_ops.cc",
        "segment_reduction_ops.cc",
        "session_ops.cc",
        "softplus_op.cc",
        "softsign_op.cc",
        "spacetobatch_functor.cc",
        "spacetobatch_op.cc",
        "spacetodepth_op.cc",
        "sparse_fill_empty_rows_op.cc",
        "sparse_reshape_op.cc",
        "sparse_to_dense_op.cc",
        "spectrogram.cc",
        "spectrogram_op.cc",
        "stack_ops.cc",
        "string_join_op.cc",
        "string_util.cc",
        "summary_op.cc",
        "tensor_array.cc",
        "tensor_array_ops.cc",
        "tile_functor_cpu.cc",
        "tile_ops.cc",
        "tile_ops_cpu_impl_1.cc",
        "tile_ops_cpu_impl_2.cc",
        "tile_ops_cpu_impl_3.cc",
        "tile_ops_cpu_impl_4.cc",
        "tile_ops_cpu_impl_5.cc",
        "tile_ops_cpu_impl_6.cc",
        "tile_ops_cpu_impl_7.cc",
        "topk_op.cc",
        "training_op_helpers.cc",
        "training_ops.cc",
        "transpose_functor_cpu.cc",
        "transpose_op.cc",
        "unique_op.cc",
        "where_op.cc",
        "xent_op.cc",
        ":android_extended_ops_headers",
    ],
)
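TensorFlow Mobile uses build options to trim the full TensorFlow: it keeps the core functionality and drops code that is not needed on a device, such as the distributed execution logic, the Windows compatibility logic, and the GPU computation logic.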
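Much of this trimming is expressed through `glob(include, exclude = ...)` in the mobile source filegroups: directories such as `platform/windows` and `common_runtime/gpu` are excluded, and `distributed_runtime` is simply never included. The sketch below emulates that selection over a plain list of paths. It is a simplified illustration, not Bazel's implementation: `_pattern_to_regex` and `bazel_like_glob` are hypothetical helpers, only `*` and `**` are handled, and the file names in `PATHS` are invented for the example.

```python
import re
from typing import Iterable, List, Sequence


def _pattern_to_regex(pattern: str):
    """Translate a Bazel-style glob pattern into a compiled regex.

    `**/` matches zero or more directories, `*` matches within one
    path segment. Other characters are matched literally.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append(r"(?:[^/]+/)*")
            i += 3
        elif pattern.startswith("**", i):
            parts.append(r".*")
            i += 2
        elif pattern[i] == "*":
            parts.append(r"[^/]*")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


def bazel_like_glob(paths: Iterable[str],
                    include: Sequence[str],
                    exclude: Sequence[str] = ()) -> List[str]:
    """Keep paths matching any include pattern and no exclude pattern."""
    inc = [_pattern_to_regex(p) for p in include]
    exc = [_pattern_to_regex(p) for p in exclude]
    return sorted(
        p for p in paths
        if any(r.match(p) for r in inc) and not any(r.match(p) for r in exc)
    )


# Invented file names, laid out like tensorflow/core.
PATHS = [
    "framework/tensor.cc",
    "framework/tensor_test.cc",
    "platform/env.cc",
    "platform/windows/env.cc",
    "common_runtime/executor.cc",
    "common_runtime/gpu/gpu_device.cc",
    "distributed_runtime/master.cc",
]

# A small subset of the patterns used by the mobile source filegroups.
INCLUDE = ["framework/**/*.cc", "platform/**/*.cc", "common_runtime/**/*.cc"]
EXCLUDE = ["**/*test.*", "platform/windows/**/*", "common_runtime/gpu/**/*"]

if __name__ == "__main__":
    for path in bazel_like_glob(PATHS, INCLUDE, EXCLUDE):
        print(path)
```

In this example the test file, the Windows file, and the GPU file are filtered out by the exclude list, while the distributed runtime file is dropped because no include pattern covers it.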
TensorFlow Mobile does not include every op, only the core, essential ones. See android_core_ops and android_extended_ops above for the details.
The TensorFlow Lite source code is in the tensorflow/contrib/lite directory. Its core build logic is as follows:
### tensorflow/contrib/lite/BUILD
cc_library(
    name = "framework",
    srcs = [
        "allocation.cc",
        "graph_info.cc",
        "interpreter.cc",
        "model.cc",
        "mutable_op_resolver.cc",
        "optional_debug_tools.cc",
        "stderr_reporter.cc",
    ] + select({
        "//tensorflow:android": [
            "nnapi_delegate.cc",
            "mmap_allocation.cc",
        ],
        "//tensorflow:windows": [
            "nnapi_delegate_disabled.cc",
            "mmap_allocation_disabled.cc",
        ],
        "//conditions:default": [
            "nnapi_delegate_disabled.cc",
            "mmap_allocation.cc",
        ],
    }),
    hdrs = [
        "allocation.h",
        "context.h",
        "context_util.h",
        "error_reporter.h",
        "graph_info.h",
        "interpreter.h",
        "model.h",
        "mutable_op_resolver.h",
        "nnapi_delegate.h",
        "op_resolver.h",
        "optional_debug_tools.h",
        "stderr_reporter.h",
    ],
    copts = tflite_copts(),
    linkopts = [
    ] + select({
        "//tensorflow:android": [
            "-llog",
        ],
        "//conditions:default": [
        ],
    }),
    deps = [
        ":arena_planner",
        ":graph_info",
        ":memory_planner",
        ":schema_fbs_version",
        ":simple_memory_arena",
        ":string",
        ":util",
        "//tensorflow/contrib/lite/c:c_api_internal",
        "//tensorflow/contrib/lite/core/api",
        "//tensorflow/contrib/lite/kernels:eigen_support",
        "//tensorflow/contrib/lite/kernels:gemm_support",
        "//tensorflow/contrib/lite/nnapi:nnapi_lib",
        "//tensorflow/contrib/lite/profiling:profiler",
        "//tensorflow/contrib/lite/schema:schema_fbs",
    ],
)
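The `select()` calls in the framework target are what switch platform-specific sources in and out, for example the real NNAPI delegate on Android versus the disabled stub elsewhere. The sketch below mimics how such a `select()` picks a branch. `resolve_select` is a hypothetical helper written for this illustration, and it simplifies Bazel's behavior: here more than one matching condition is always an error, whereas Bazel can also choose the most specific match.

```python
from typing import Dict, List, Set

DEFAULT_CONDITION = "//conditions:default"


def resolve_select(branches: Dict[str, List[str]], active: Set[str]) -> List[str]:
    """Return the list for the single active condition, else the default."""
    matches = [
        key for key in branches
        if key != DEFAULT_CONDITION and key in active
    ]
    if len(matches) > 1:
        raise ValueError("ambiguous select(): %s" % ", ".join(matches))
    if matches:
        return list(branches[matches[0]])
    if DEFAULT_CONDITION in branches:
        return list(branches[DEFAULT_CONDITION])
    raise ValueError("no condition matched and no default branch given")


# The platform-dependent part of `srcs` in the framework target.
PLATFORM_SRCS = {
    "//tensorflow:android": [
        "nnapi_delegate.cc",
        "mmap_allocation.cc",
    ],
    "//tensorflow:windows": [
        "nnapi_delegate_disabled.cc",
        "mmap_allocation_disabled.cc",
    ],
    "//conditions:default": [
        "nnapi_delegate_disabled.cc",
        "mmap_allocation.cc",
    ],
}

if __name__ == "__main__":
    print(resolve_select(PLATFORM_SRCS, {"//tensorflow:android"}))
    print(resolve_select(PLATFORM_SRCS, set()))
```

The same mechanism appears in the TensorFlow Mobile targets, where the `//tensorflow:android` branch swaps a long dependency list for `android_tensorflow_lib_lite`.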
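Whereas TensorFlow Mobile is a trimmed-down version of the full TensorFlow, TensorFlow Lite is essentially a reimplementation. Internally, even the most basic data structures of the TensorFlow core, such as OP and Context, are new. Externally, the model file format changed from PB (Protocol Buffers) to FlatBuffers, the size of the library was reduced substantially, to about 300 KB, and a converter is provided to turn an ordinary TensorFlow model into the format TensorFlow Lite requires. From every angle, then, TensorFlow Lite is a new implementation.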
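One visible consequence of the format change is that a converted model can be recognized from its first bytes. A FlatBuffers file starts with a 4-byte offset to the root table, optionally followed by a 4-byte file identifier. The sketch below checks for that identifier, assuming the value `TFL3` that the TensorFlow Lite schema declares. The function name `looks_like_tflite` is made up here, and the check is only a heuristic: it does not validate the rest of the buffer.

```python
import struct

# Assumed file identifier of the TensorFlow Lite FlatBuffers schema.
TFLITE_FILE_IDENTIFIER = b"TFL3"


def looks_like_tflite(data: bytes) -> bool:
    """Heuristic check for a FlatBuffers model carrying the TFLite identifier.

    Bytes 0..3 hold the little-endian offset to the root table and
    bytes 4..7 hold the optional file identifier.
    """
    return len(data) >= 8 and data[4:8] == TFLITE_FILE_IDENTIFIER


if __name__ == "__main__":
    # Synthetic buffers for illustration, not real models.
    fake_flatbuffer = struct.pack("<I", 16) + TFLITE_FILE_IDENTIFIER + b"\x00" * 8
    fake_protobuf = b"\x0a\x05hello\x12\x03abc"
    print(looks_like_tflite(fake_flatbuffer))
    print(looks_like_tflite(fake_protobuf))
```

A frozen GraphDef in PB format has no such fixed identifier, which is one reason the two formats cannot be loaded interchangeably.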
Reposted from: http://dqykl.baihongyu.com/