Possible adaptation for M1 devices

#7
by Snaffal - opened

Hey, I am currently trying to fine-tune this model for a regression task using the valence and arousal ratings. I have already converted the data-preparation and training scripts, but when starting the training I get stuck when PyTorch tries to use CUDA, which is not available on my M1 device. Can someone point me to where I need to change the parameters so that it uses Metal, i.e. set my device to "mps"?
I need this ASAP because the time for my thesis is running out... :'(

I hope you have MPS set up on your Mac; I was able to do it on an M2, though Google Colab is still way better.

If you are working with something like:
https://github.com/speechbrain/speechbrain/blob/develop/recipes/IEMOCAP/emotion_recognition/train.py
then after line 285, i.e.
hparams_file, run_opts, overrides = sb.parse_arguments(sys.argv[1:])
add:
run_opts["device"] = "mps"  # "cpu", "cuda", or "mps", based on what you are working with
Updating just the "device" key (rather than replacing the whole run_opts dict) avoids overriding the other run options that were just parsed, but be careful all the same.
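If the same script has to run on different machines, a minimal device-picking sketch may help (plain PyTorch; pick_device is a hypothetical helper, not part of the recipe):

import torch

def pick_device() -> str:
    """Return the best available torch device string, falling back to CPU."""
    if torch.cuda.is_available():
        return "cuda"
    # MPS requires PyTorch >= 1.12, macOS >= 12.3, and Apple Silicon
    if getattr(torch.backends, "mps", None) is not None and torch.backends.mps.is_available():
        return "mps"
    return "cpu"

run_opts["device"] = pick_device()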

You might also have to adjust the num_workers value in the YAML file:
https://github.com/speechbrain/speechbrain/blob/develop/recipes/IEMOCAP/emotion_recognition/hparams/train.yaml
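If your copy of the hparams file groups these settings under dataloader_options (as the SpeechBrain recipes typically do), the change might look like this; verify the key names against your own file:

# hparams/train.yaml
dataloader_options:
    batch_size: !ref <batch_size>
    shuffle: True
    num_workers: 0  # 0 keeps data loading in the main process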

It should work.

I am trying to convert it to a regression model. Which one would you recommend using: train.py or train_with_wav2vec.py?

SpeechBrain org

I am trying to convert it to a regression model. Which one would you recommend using: train.py or train_with_wav2vec.py?

This is a wav2vec HF model; it was trained using train_with_wav2vec.py, so you should use that one as well. Make sure to change the training file to reflect the regression problem (e.g. the output linear layer has size 1, etc.).
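A rough sketch of the kind of change involved (plain PyTorch rather than the recipe's exact class structure; RegressionHead is a hypothetical name, 768 matches wav2vec2-base, and the head outputs 2 values because this thread targets both valence and arousal; use 1 for a single score):

import torch
import torch.nn as nn

# Hypothetical regression head on top of pooled wav2vec 2.0 features.
class RegressionHead(nn.Module):
    def __init__(self, hidden_size: int = 768, n_targets: int = 2):
        super().__init__()
        self.out = nn.Linear(hidden_size, n_targets)  # valence + arousal

    def forward(self, pooled_feats: torch.Tensor) -> torch.Tensor:
        return self.out(pooled_feats)

head = RegressionHead()
feats = torch.randn(4, 768)   # stand-in for pooled encoder output
targets = torch.rand(4, 2)    # valence/arousal annotations scaled to [0, 1]
loss = nn.functional.mse_loss(head(feats), targets)  # replaces the classification loss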

This is a wav2vec HF model; it was trained using train_with_wav2vec.py, so you should use that one as well. Make sure to change the training file to reflect the regression problem (e.g. the output linear layer has size 1, etc.).

Thank you! I also managed to get the CUDA stuff fixed based on your tip. Now every step I take, I run into a wall hahahaha.

Now the issue is that the system can't pickle an object, failing with: AttributeError: Can't pickle local object 'dataio_prep.<locals>.audio_pipeline'
I have included the function I adapted for regression below. From similar issues I found online, something in this function seems to prevent the serialisation of this object.
I don't want to be too bothersome or take up too much of your time, but this is my thesis and I need one more modality for my system... so any help would be greatly appreciated.


def dataio_prep(hparams):
    """This function prepares the datasets to be used in the brain class.
    It also defines the data processing pipeline through user-defined
    functions. We expect `prepare_mini_librispeech` to have been called before
    this, so that the `train.json`, `valid.json`, and `test.json` manifest
    files are available.

    Arguments
    ---------
    hparams : dict
        This dictionary is loaded from the `train.yaml` file, and it includes
        all the hyperparameters needed for dataset construction and loading.

    Returns
    -------
    datasets : dict
        Contains three keys, "train", "valid", and "test", that correspond
        to the appropriate DynamicItemDataset objects.
    """

    # Define audio pipeline
    @sb.utils.data_pipeline.takes("wav")
    @sb.utils.data_pipeline.provides("sig")
    def audio_pipeline(wav):
        """Load the signal, and pass it and its length to the corruption class.
        This is done on the CPU in the `collate_fn`."""
        sig = sb.dataio.dataio.read_audio(wav)
        return sig

    # Initialization of the label encoder. The label encoder assigns to each
    # of the observed labels a unique index (e.g., 'spk01': 0, 'spk02': 1, ...)
    """
    # NOT NEEDED FOR REGRESSION: the following part is for classification,
    # converting labels to indices.
    label_encoder = sb.dataio.encoder.CategoricalEncoder()

    # Define label pipeline:
    @sb.utils.data_pipeline.takes("emo")
    @sb.utils.data_pipeline.provides("emo", "emo_encoded")
    def label_pipeline(emo):
        yield emo
        emo_encoded = label_encoder.encode_label_torch(emo)
        yield emo_encoded
    """

    # Define datasets. We also connect the dataset with the data processing
    # functions defined above.
    datasets = {}
    data_info = {
        "train": hparams["train_annotation"],
        "valid": hparams["valid_annotation"],
        "test": hparams["test_annotation"],
    }
    for dataset in data_info:
        datasets[dataset] = sb.dataio.dataset.DynamicItemDataset.from_json(
            json_path=data_info[dataset],
            replacements={"data_root": hparams["data_folder"]},
            dynamic_items=[audio_pipeline],
            output_keys=["id", "sig", "valence", "arousal"],  # Update output keys
        )

    # Load or compute the label encoder (with multi-GPU DDP support)
    # Please take a look into the lab_enc_file to see the label-to-index
    # mapping.
    """
    # AGAIN, NOT NEEDED FOR REGRESSION.
    # This section is responsible for loading or creating the label encoder
    # and saving it to a file named "label_encoder.txt".
    lab_enc_file = os.path.join(hparams["save_folder"], "label_encoder.txt")
    label_encoder.load_or_create(
        path=lab_enc_file,
        from_didatasets=[datasets["train"]],
        output_key="emo",
    )
    """
    return datasets
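For anyone who hits the same error: "Can't pickle local object" typically comes from the DataLoader's worker processes. On macOS, multiprocessing defaults to the spawn start method, which pickles whatever is shipped to the workers, and a function defined inside dataio_prep is a local object that cannot be pickled. Two common workarounds, sketched under that assumption:

# Workaround 1: no worker processes, so nothing needs pickling.
# In the hparams YAML, set dataloader_options -> num_workers: 0.

# Workaround 2: hoist the pipeline to module level so it is picklable,
# then pass dynamic_items=[audio_pipeline] as before.
@sb.utils.data_pipeline.takes("wav")
@sb.utils.data_pipeline.provides("sig")
def audio_pipeline(wav):
    """Module-level functions can be pickled by the spawn start method."""
    return sb.dataio.dataio.read_audio(wav)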

nvm, I fixed it

Snaffal changed discussion status to closed
