Adding a new dataset
The simplest way is to create a class that inherits from ptlflow.data.datasets.BaseFlowDataset and then populate its lists according to the structure of your dataset. The code below shows an example:
from ptlflow.data.datasets import BaseFlowDataset

class MyDataset(BaseFlowDataset):
    def __init__(
        self,
        my_params_here
    ) -> None:
        super().__init__(
            dataset_name='MyDatasetName',
            transform=MyAugmentationTransform,
            get_valid_mask=True,  # To return valid pixel masks; recommended to be True
            get_occlusion_mask=True,  # If the dataset has occlusion masks
            get_motion_boundary_mask=False,  # If the dataset has motion boundary masks
            get_backward=False,  # If the dataset has backward flow, occ, and mb masks
            get_meta=True  # To return some metadata, such as file paths
        )
        # Read your dataset paths here (for example, using glob) and populate the following lists:
        # self.img_paths
        # self.flow_paths
        #
        # The lists below are optional:
        # self.occ_paths
        # self.mb_paths
        # self.flow_b_paths
        # self.occ_b_paths
        # self.mb_b_paths
        # self.metadata

        # For example, suppose one sample has two images, one flow file, and one occlusion mask.
        # You should add it to the lists as follows:
        self.img_paths.append(['/path/to/first/image.png', '/path/to/second/image.png'])
        self.flow_paths.append(['/path/to/flow.flo'])
        self.occ_paths.append(['/path/to/occlusion_mask.png'])
        # Notice that we always append a list.
That is all! BaseFlowDataset handles the actual loading of the data, as long as the lists are correctly populated.
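The comments above suggest collecting the paths with glob. As an illustration only, here is a minimal sketch of that path-collection logic, assuming a hypothetical layout where each scene folder contains frame_*.png images and flow_*.flo files with one flow per consecutive frame pair (the function name and directory layout are assumptions, not part of PTLFlow):

```python
from pathlib import Path

def collect_sample_paths(root_dir):
    """Pair consecutive frames with their flow files.

    Assumes a hypothetical layout: root_dir/scene_*/frame_NNNN.png
    and root_dir/scene_*/flow_NNNN.flo, where flow_i goes from
    frame_i to frame_{i+1}. Adapt the globs to your own dataset.
    """
    img_paths = []
    flow_paths = []
    for scene_dir in sorted(Path(root_dir).iterdir()):
        frames = sorted(scene_dir.glob('frame_*.png'))
        flows = sorted(scene_dir.glob('flow_*.flo'))
        for i, flow in enumerate(flows):
            # Each sample is a list: two images and one flow file
            img_paths.append([str(frames[i]), str(frames[i + 1])])
            flow_paths.append([str(flow)])
    return img_paths, flow_paths
```

Inside `__init__`, the results would simply be assigned to `self.img_paths` and `self.flow_paths` (or appended sample by sample, as in the example above).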
If you want to see more details, check the API definitions of all the datasets in datasets.py. They can serve as a guide for implementing your own.