ramjet.py_mapper
Code for use with TensorFlow’s Dataset class that allows CPU map functions to run with multiprocessing.
Module Contents
Classes
PyMapper(map_function: Callable, number_of_parallel_calls: int)
A class which allows for mapping a py_function to a TensorFlow dataset in parallel on CPU.
Functions
map_py_function_to_dataset(dataset: tf.data.Dataset, map_function: Callable, number_of_parallel_calls: int, output_types: Union[Tuple[tf.dtypes.DType, ...], tf.dtypes.DType] = tf.float32, output_shapes: Union[List[Tuple[int, ...]], Tuple[int, ...]] = None, flat_map: bool = False)
A one-line wrapper to allow mapping a parallel py function to a dataset.
class PyMapper(map_function: Callable, number_of_parallel_calls: int)
A class which allows for mapping a py_function to a TensorFlow dataset in parallel on CPU.
__init__(self, map_function: Callable, number_of_parallel_calls: int)
Parameters:
- map_function – The function to map to the dataset.
- number_of_parallel_calls – The number of parallel calls of the mapping function.
send_to_map_pool(self, *example_elements)
Sends the elements of a single example to the pool for processing.
Parameters:
- example_elements – The elements of a single example in the dataset, to be processed by the pool. Often this is a single element.
Returns: The output of the map function on the example.
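As a rough illustration of the description above, the following sketch shows one way a pool-backed mapper could forward the elements of a single example to a worker pool and wait for the result. It is not the ramjet implementation: `PoolMapperSketch`, `scale_and_offset`, and `close` are hypothetical names, and a `ThreadPool` from the standard library stands in for a process pool so that the sketch runs without a `__main__` guard.

```python
from multiprocessing.pool import ThreadPool
from typing import Any, Callable


class PoolMapperSketch:
    """Hypothetical stand-in for PyMapper; not the ramjet implementation."""

    def __init__(self, map_function: Callable, number_of_parallel_calls: int):
        self.map_function = map_function
        # Assumption: ThreadPool mirrors the multiprocessing.Pool interface.
        # A process pool would be swapped in for CPU-bound map functions.
        self.pool = ThreadPool(number_of_parallel_calls)

    def send_to_map_pool(self, *example_elements: Any) -> Any:
        """Run the map function on one example in the pool and wait for the result."""
        return self.pool.apply(self.map_function, example_elements)

    def close(self) -> None:
        """Shut down the pool once no more examples will be sent."""
        self.pool.close()
        self.pool.join()


def scale_and_offset(value: float, offset: float) -> float:
    """Hypothetical map function taking the two elements of one example."""
    return 2.0 * value + offset


mapper = PoolMapperSketch(scale_and_offset, number_of_parallel_calls=2)
result = mapper.send_to_map_pool(3.0, 1.0)  # One example made of two elements.
mapper.close()
```

Each call blocks until the worker returns, so parallelism in a design like this would come from several such calls being in flight at once.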
map_to_dataset(self, dataset: tf.data.Dataset, output_types: Union[List[tf.dtypes.DType], tf.dtypes.DType] = tf.float32, output_shapes: Union[List[Tuple[int, ...]], Tuple[int, ...]] = None, flat_map: bool = False)
Maps the map function to the passed dataset.
Parameters:
- dataset – The dataset to apply the map function to.
- output_types – The TensorFlow output types of the function to convert to.
- output_shapes – The shape of the outputs of the dataset.
- flat_map – Determines whether to flatten the first level of the output, similar to TensorFlow’s flat_map. Note that output_shapes should be the shape of the unflattened output.
Returns: The mapped dataset.
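The effect of flat_map described above can be sketched in plain Python, without TensorFlow. This only illustrates the flattening semantics under the assumption that the map function returns several outputs per input example; `map_examples` and `split_into_windows` are hypothetical helpers, not part of ramjet.

```python
from itertools import chain
from typing import Callable, List, Sequence


def map_examples(examples: Sequence, map_function: Callable, flat_map: bool = False) -> List:
    """Apply map_function to each example, optionally flattening the first level."""
    outputs = [map_function(example) for example in examples]
    if flat_map:
        # Each output is itself a sequence of examples; concatenate them.
        return list(chain.from_iterable(outputs))
    return outputs


def split_into_windows(series: List[int]) -> List[List[int]]:
    """Hypothetical map function: split one series into windows of length 2."""
    return [series[start:start + 2] for start in range(0, len(series), 2)]


examples = [[1, 2, 3, 4], [5, 6, 7, 8]]

# Without flattening, each dataset element holds all windows of one series.
nested = map_examples(examples, split_into_windows)

# With flattening, each window becomes its own dataset element.
flat = map_examples(examples, split_into_windows, flat_map=True)
```

Here each unflattened output has shape (2, 2), while each element after flattening has shape (2,). That distinction is what the note on the flat_map parameter refers to.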
map_py_function_to_dataset(dataset: tf.data.Dataset, map_function: Callable, number_of_parallel_calls: int, output_types: Union[Tuple[tf.dtypes.DType, ...], tf.dtypes.DType] = tf.float32, output_shapes: Union[List[Tuple[int, ...]], Tuple[int, ...]] = None, flat_map: bool = False) → tf.data.Dataset
A one-line wrapper to allow mapping a parallel py function to a dataset.
Parameters:
- dataset – The dataset whose elements the mapping function will be applied to.
- map_function – The function to map to the dataset.
- number_of_parallel_calls – The number of parallel calls of the mapping function.
- output_types – The TensorFlow output types of the function to convert to.
- output_shapes – The shapes to set on the outputs, since TensorFlow cannot infer them from the Python function.
- flat_map – Determines whether to flatten the first level of the output, similar to TensorFlow’s flat_map. Note that output_shapes should be the shape of the unflattened output.
Returns: The mapped dataset.