Failed to pickle dataio_prep.<locals>.audio_pipeline

#8
by Snaffal - opened

AttributeError: Can't pickle local object 'dataio_prep.<locals>.audio_pipeline'

I don't understand why this cannot be pickled. I converted this repo to a regression task and made (I think) all the necessary changes, from the output nodes to the data processing. When I run train_with_wav2vec2.py as given in the repo, I get this error message. This is my current dataio_prep function:

def dataio_prep(hparams):
    """This function prepares the datasets to be used in the brain class.
    It also defines the data processing pipeline through user-defined
    functions. We expect `prepare_mini_librispeech` to have been called before
    this, so that the `train.json`, `valid.json`, and `test.json` manifest
    files are available.
    Arguments
    ---------
    hparams : dict
        This dictionary is loaded from the `train.yaml` file, and it includes
        all the hyperparameters needed for dataset construction and loading.
    Returns
    -------
    datasets : dict
        Contains three keys, "train", "valid", and "test", that correspond
        to the appropriate DynamicItemDataset objects.
    """

    @sb.utils.data_pipeline.takes("wav")
    @sb.utils.data_pipeline.provides("sig")
    def audio_pipeline(wav):
        """Load the signal, and pass it and its length to the corruption class.
        This is done on the CPU in the `collate_fn`."""
        sig = sb.dataio.dataio.read_audio(wav)
        return sig

    # Define label pipeline:
    @sb.utils.data_pipeline.takes("valance", "arousal")
    @sb.utils.data_pipeline.provides("valance", "arousal")
    def label_pipeline(valance, arousal):
        yield valance
        yield arousal
    
    # Define datasets. We also connect the dataset with the data processing
    # functions defined above.
    datasets = {}
    data_info = {
        "train": hparams["train_annotation"],
        "valid": hparams["valid_annotation"],
        "test": hparams["test_annotation"],
    }
    for dataset in data_info:
        datasets[dataset] = sb.dataio.dataset.DynamicItemDataset.from_json(
            json_path=data_info[dataset],
            replacements={"data_root": hparams["data_folder"]},
            dynamic_items=[audio_pipeline, label_pipeline],
            output_keys=["id", "sig", "valance", "arousal"],  # Must match the keys provided above
        )

    return datasets
Snaffal changed discussion title from Failed to pickle to Failed to pickle dataio_prep.<locals>.audio_pipeline
Snaffal changed discussion status to closed

Hi @Snaffal, were you able to solve the issue?
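
For anyone landing here with the same message: the error itself comes from Python's standard `pickle` module, which serializes functions by their importable name and so cannot handle a function defined inside another function. A minimal sketch that reproduces it without SpeechBrain is below. The names `make_local_pipeline`, `module_level_pipeline`, and `can_pickle` are hypothetical and only stand in for `dataio_prep` and its nested pipelines. Why pickling is triggered in this training run is not shown in the thread. One common cause, stated here as an assumption, is a DataLoader with worker processes on a platform that starts them with "spawn", which has to pickle the dataset and its pipeline functions.

```python
import pickle


def make_local_pipeline():
    # Hypothetical stand-in for dataio_prep: the pipeline is defined inside
    # another function, so its qualified name contains "<locals>".
    def audio_pipeline(wav):
        return wav

    return audio_pipeline


def module_level_pipeline(wav):
    # Defined at module top level, so pickle can refer to it by name.
    return wav


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        # Local objects fail here; the exception type depends on the
        # Python version, so both are caught.
        return False


local_fn = make_local_pipeline()
print(local_fn.__qualname__)
print(can_pickle(local_fn))
print(can_pickle(module_level_pipeline))
```

Run as a script, the nested function should report that it cannot be pickled while the top-level one can. If the assumption about worker processes holds, two things that may be worth trying are moving the pipeline functions out of `dataio_prep` to module level, or running the DataLoader without worker processes. Neither has been confirmed against this repo.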
