`ramjet.photometric_database.lightcurve_database`

Code for a base generalized database for photometric data to be subclassed.

Module Contents

Classes

LightcurveDatabase(data_directory='data'): A base generalized database for photometric data to be subclassed.

class LightcurveDatabase(data_directory='data')

Bases: abc.ABC

A base generalized database for photometric data to be subclassed.

window_shift: int

How much the window shifts for a windowed batch set.

Returns:	The window shift size.

__init__(self, data_directory='data'): Initialize self. See help(type(self)) for accurate signature.

log_dataset_file_names(self, dataset: tf.data.Dataset, dataset_name: str): Saves the names of the files used in a dataset to a CSV file in the trial directory.

static normalize_log_0_to_1(lightcurve: np.ndarray): Normalizes the logarithm of the lightcurve to the range 0 to 1.

normalize(self, lightcurve: np.ndarray)

Normalizes the lightcurve.

Parameters:	lightcurve – The lightcurve to normalize.
Returns:	The normalized lightcurve.

static normalize_on_percentiles(lightcurve: np.ndarray): Normalizes the lightcurve using percentiles. The 10th percentile is mapped to -1 and the 90th percentile to 1.
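The percentile mapping described above can be sketched as a linear rescaling. This is a minimal illustration, not the library's implementation: the function body, and in particular the handling of a flat lightcurve, are assumptions.

```python
import numpy as np


def normalize_on_percentiles(lightcurve: np.ndarray) -> np.ndarray:
    """Linearly rescale so the 10th percentile maps to -1 and the 90th to 1."""
    percentile_10, percentile_90 = np.percentile(lightcurve, [10, 90])
    midpoint = (percentile_90 + percentile_10) / 2
    half_range = (percentile_90 - percentile_10) / 2
    if half_range == 0:
        # Assumption: a flat lightcurve is mapped to all zeros to avoid
        # dividing by zero. The real method may behave differently.
        return np.zeros_like(lightcurve, dtype=float)
    return (lightcurve - midpoint) / half_range


example_lightcurve = np.arange(101, dtype=float)
normalized = normalize_on_percentiles(example_lightcurve)
```

Because the mapping is linear and increasing, values outside the 10th to 90th percentile range fall outside [-1, 1] rather than being clipped, which keeps outliers such as flares or transits visible.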

static shuffle_in_unison(a, b, seed=None): Shuffle two arrays in unison.
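One common way to shuffle two arrays in unison is to draw a single permutation and index both arrays with it. The sketch below assumes that approach and returns shuffled copies; the actual method may instead shuffle in place or manage the random state differently.

```python
import numpy as np


def shuffle_in_unison(a, b, seed=None):
    """Return copies of a and b reordered by the same random permutation."""
    a = np.asarray(a)
    b = np.asarray(b)
    if len(a) != len(b):
        raise ValueError("Both arrays must have the same length.")
    random_generator = np.random.default_rng(seed)
    permutation = random_generator.permutation(len(a))
    # Indexing both arrays with one permutation keeps each pair aligned.
    return a[permutation], b[permutation]


fluxes = np.array([10, 20, 30, 40])
labels = fluxes * 2  # Each label is tied to its flux value.
shuffled_fluxes, shuffled_labels = shuffle_in_unison(fluxes, labels, seed=0)
```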

static remove_random_values(lightcurve: np.ndarray): Removes random values from the lightcurve.

get_ratio_enforced_dataset(self, positive_training_dataset: tf.data.Dataset, negative_training_dataset: tf.data.Dataset, positive_to_negative_data_ratio: float): Generates a dataset with an enforced data ratio.

static repeat_dataset_to_size(dataset: tf.data.Dataset, size: int): Repeats a dataset to make it a desired length.
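The idea of repeating a dataset to a fixed length can be illustrated on a plain Python list. The helper name `repeat_to_size` is hypothetical, and the real method operates on a `tf.data.Dataset` rather than a list; this only mirrors the behavior the description suggests.

```python
from itertools import cycle, islice


def repeat_to_size(items, size):
    """Cycle through items until exactly `size` elements have been produced."""
    if size < 0:
        raise ValueError("size must be non-negative.")
    if size > 0 and len(items) == 0:
        raise ValueError("Cannot repeat an empty collection.")
    # cycle() restarts from the beginning each time the items are exhausted,
    # and islice() stops after `size` elements.
    return list(islice(cycle(items), size))


repeated = repeat_to_size([1, 2, 3], 7)
truncated = repeat_to_size([1, 2, 3], 2)
```

Note that in this sketch a requested size smaller than the input simply yields the first `size` elements.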

is_positive(self, example_path)

Checks if an example contains a microlensing event or not.

Parameters:	example_path – The path to the example to check.
Returns:	Whether or not the example contains a microlensing event.

static make_uniform_length(example: np.ndarray, length: int, randomize: bool = True, seed: int = None): Makes the example a specific length by truncating examples that are too long and repeating examples that are too short.
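A minimal sketch of the truncate-or-repeat behavior follows. How `randomize` is applied is not stated in the description, so the random cyclic start used here is an assumption, as is the rest of the function body.

```python
import numpy as np


def make_uniform_length(example, length, randomize=True, seed=None):
    """Truncate or repeat the example along its first axis to `length`."""
    example = np.asarray(example)
    if example.shape[0] == 0:
        raise ValueError("Cannot resize an empty example.")
    if randomize:
        # Assumption: randomizing means starting at a random position
        # and wrapping around, so no part of the example is favored.
        random_generator = np.random.default_rng(seed)
        start = random_generator.integers(example.shape[0])
        example = np.roll(example, -start, axis=0)
    if example.shape[0] < length:
        # Repeat enough whole copies to reach at least the target length.
        repeats = -(-length // example.shape[0])  # Ceiling division.
        example = np.concatenate([example] * repeats, axis=0)
    return example[:length]


too_long = make_uniform_length(np.arange(10), 4, randomize=False)
too_short = make_uniform_length(np.array([1, 2, 3]), 7, randomize=False)
```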

get_training_and_validation_datasets_for_file_paths(self, example_paths: Union[Iterable[Path], Callable[[], Iterable[Path]]])

Creates training and validation datasets from a list of all file paths. The database validation ratio is used to determine the size of the split.

Parameters:	example_paths – The total list of file paths.
Returns:	The training and validation datasets.

static extract_shuffled_chunk_and_remainder(array_to_extract_from: Union[List, np.ndarray], chunk_ratio: float, chunk_to_extract_index: int = 0)

Shuffles an array, extracts a chunk of the data, and returns the chunk and remainder of the array.

Parameters:	array_to_extract_from – The array to process.
	chunk_ratio – The number of equal size chunks to split the array into before extracting one.
	chunk_to_extract_index – The index of the chunk to extract out of all chunks.
Returns:	The chunk which is extracted, and the remainder of the array excluding the chunk.
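The shuffle, split, and extract steps can be sketched with NumPy as below. Two things here are assumptions rather than documented behavior: `chunk_ratio` is interpreted as the fraction of the array held by each chunk (so 0.2 yields five chunks), which is one reading of the float-typed parameter, and the `seed` argument is added only to make the illustration reproducible.

```python
import numpy as np


def extract_shuffled_chunk_and_remainder(array, chunk_ratio, chunk_index=0, seed=None):
    """Shuffle the array, split it into chunks, and pull one chunk out."""
    random_generator = np.random.default_rng(seed)
    shuffled = random_generator.permutation(np.asarray(array))
    # Assumption: a ratio of 0.2 means each chunk holds a fifth of the data.
    number_of_chunks = int(round(1 / chunk_ratio))
    chunks = np.array_split(shuffled, number_of_chunks)
    chunk = chunks[chunk_index]
    remaining_chunks = chunks[:chunk_index] + chunks[chunk_index + 1:]
    if remaining_chunks:
        remainder = np.concatenate(remaining_chunks)
    else:
        remainder = shuffled[:0]  # Nothing is left when there is one chunk.
    return chunk, remainder


validation, training = extract_shuffled_chunk_and_remainder(
    np.arange(10), chunk_ratio=0.2, chunk_index=1, seed=0)
```

Changing the chunk index while keeping the seed fixed selects a different held-out chunk from the same shuffle, which is the pattern used for k-fold style validation splits.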

flat_window_zipped_example_and_label_dataset(self, dataset, batch_size, window_shift)

Takes a zipped example and label dataset and repeats examples in a windowed fashion of a given batch size. It is expected that the resulting dataset will subsequently be batched in some fashion by the given batch size.

Parameters:	dataset – The zipped example and label dataset.
	batch_size – The size of the batches to produce.
	window_shift – The shift of the moving window between batches.
Returns:	The flattened window dataset.
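The windowing behavior can be illustrated without TensorFlow on a plain list of (example, label) pairs. The helper names `sliding_windows` and `flatten_windows` are hypothetical, and dropping a trailing partial window is an assumption of this sketch; the real method works on a `tf.data.Dataset` and may keep partial windows.

```python
def sliding_windows(pairs, batch_size, window_shift):
    """Collect overlapping windows of `batch_size` pairs, moving by `window_shift`."""
    if batch_size < 1 or window_shift < 1:
        raise ValueError("batch_size and window_shift must be at least 1.")
    windows = []
    # Assumption: windows that would run past the end of the data are dropped.
    for start in range(0, len(pairs) - batch_size + 1, window_shift):
        windows.append(pairs[start:start + batch_size])
    return windows


def flatten_windows(windows):
    """Lay the windows end to end so that later batching recovers them."""
    return [pair for window in windows for pair in window]


pairs = [("a", 0), ("b", 1), ("c", 0), ("d", 1)]
windows = sliding_windows(pairs, batch_size=3, window_shift=1)
flat = flatten_windows(windows)
# Batching the flat sequence by the same batch size gives the windows back.
rebatched = [flat[index:index + 3] for index in range(0, len(flat), 3)]
```

With a shift smaller than the batch size, each example appears in several batches, which is why the flattened sequence is longer than the input.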

padded_window_dataset_for_zipped_example_and_label_dataset(self, dataset: tf.data.Dataset, batch_size: int, window_shift: int, padded_shapes: Tuple[List, List])

Takes a zipped example and label dataset, and converts it to padded batches, where each batch uses overlapping examples based on a sliding window.

Parameters:	dataset – The zipped example and label dataset.
	batch_size – The size of the batches to produce.
	window_shift – The shift of the moving window between batches.
	padded_shapes – The output padded shape.
Returns:	The padded window dataset.

window_dataset_for_zipped_example_and_label_dataset(self, dataset: tf.data.Dataset, batch_size: int, window_shift: int)

Takes a zipped example and label dataset, and converts it to batches, where each batch uses overlapping examples based on a sliding window.

Parameters:	dataset – The zipped example and label dataset.
	batch_size – The size of the batches to produce.
	window_shift – The shift of the moving window between batches.
Returns:	The window dataset.

clear_data_directory(self): Empties the data directory.

create_data_directories(self): Creates the data directories to be used by the database.

static paths_dataset_from_list_or_generator_factory(list_or_generator_factory: Union[Iterable[Path], Callable[[], Iterable[Path]]])

Produces a dataset of path strings from either a list of example paths or a factory that generates example paths.

Parameters:	list_or_generator_factory – The list or generator factory.
Returns:	The new paths dataset.
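Accepting either a list or a factory can be sketched in plain Python as below. The helper name `path_strings` is hypothetical, and the real method returns a TensorFlow dataset rather than a generator; the sketch only shows how the two input forms might be unified.

```python
from pathlib import Path
from typing import Callable, Iterable, Iterator, Union


def path_strings(
    list_or_generator_factory: Union[Iterable[Path], Callable[[], Iterable[Path]]],
) -> Iterator[str]:
    """Yield each path as a string, whether given a list or a factory."""
    if callable(list_or_generator_factory):
        # A factory is called to obtain a fresh iterable of paths.
        paths = list_or_generator_factory()
    else:
        paths = list_or_generator_factory
    for path in paths:
        yield str(path)


example_paths = [Path("a.fits"), Path("b.fits")]
from_list = list(path_strings(example_paths))
from_factory = list(path_strings(lambda: iter(example_paths)))
```

A factory is useful because a generator can be consumed only once, while calling the factory again produces a new generator for each pass over the data.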
