Package errors when running hyperparam_optimiz_for_disease_classifier.py
Hi, I'm running hyperparam_optimiz_for_disease_classifier.py for fine-tuning the pre-trained Geneformer model, but I encountered many errors regarding several packages. I tried various combinations of package versions, but they either didn't fix the issues or brought new issues. I'm pasting the errors I haven't fixed below:
- ray.init
runtime_env = {"conda": "base",
"env_vars": {"LD_LIBRARY_PATH": "/raid/swang12/Transformer/myenv/lib"}}
ray.init(runtime_env=runtime_env)
TypeError: init() got an unexpected keyword argument 'runtime_env'
I currently work around this by calling ray.init() without runtime_env.
- aioredis
(myenv) swang12@sbbdmaplp003:/raid/swang12/Transformer$ python3.8 hyperparam_optimiz_for_disease_classifier.py
2023-08-13 20:38:06,838 INFO services.py:1172 -- View the Ray dashboard at http://127.0.0.1:8265
(raylet) Traceback (most recent call last):
(raylet) File "/home/swang12/.local/lib/python3.8/site-packages/ray/new_dashboard/agent.py", line 334, in <module>
(raylet) raise e
(raylet) File "/home/swang12/.local/lib/python3.8/site-packages/ray/new_dashboard/agent.py", line 323, in <module>
(raylet) loop.run_until_complete(agent.run())
(raylet) File "/usr/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
(raylet) return future.result()
(raylet) File "/home/swang12/.local/lib/python3.8/site-packages/ray/new_dashboard/agent.py", line 121, in run
(raylet) self.aioredis_client = await dashboard_utils.get_aioredis_client(
(raylet) File "/home/swang12/.local/lib/python3.8/site-packages/ray/new_dashboard/utils.py", line 662, in get_aioredis_client
(raylet) return await aioredis.create_redis_pool(
(raylet) AttributeError: module 'aioredis' has no attribute 'create_redis_pool'
- tune.CLIReporter
trainer.hyperparameter_search(
direction="maximize",
backend="ray",
resources_per_trial={"cpu":8,"gpu":1},
hp_space=lambda _: ray_config,
search_alg=hyperopt_search,
n_trials=100, # number of trials
progress_reporter=tune.CLIReporter(max_report_frequency=600,
sort_by_metric=True,
max_progress_rows=100,
mode="max",
metric="eval_accuracy",
metric_columns=["loss", "eval_loss", "eval_accuracy"]
)
)
Traceback (most recent call last):
File "hyperparam_optimiz_for_disease_classifier.py", line 190, in <module>
progress_reporter=tune.CLIReporter(max_report_frequency=600,
TypeError: __init__() got an unexpected keyword argument 'sort_by_metric'
Besides sort_by_metric, these parameters also raise errors: mode, metric, metric_columns.
For ray, I tried 1.2, 1.5, 1.9, and the latest 2.6 versions. For aioredis, I tried from 0.x to the latest 2.x versions.
I was wondering if you could let me know the key package versions that work for you. Thank you very much!
Sincerely,
Su Wang
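One way to cope with a reporter class whose constructor differs between Ray versions is to pass only the keyword arguments that the installed version accepts. The following is a minimal sketch under stated assumptions: `supported_kwargs` is a hypothetical helper, not part of Ray, and `DemoReporter` is a stand-in class so the example runs without Ray installed.

```python
import inspect


def supported_kwargs(factory, kwargs):
    """Return only the entries of kwargs that factory's signature accepts.

    Hypothetical helper, not part of Ray. If factory itself takes **kwargs,
    everything is passed through unchanged.
    """
    params = inspect.signature(factory).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {k: v for k, v in kwargs.items() if k in params}


class DemoReporter:
    """Stand-in for an older reporter class that lacks the newer arguments."""

    def __init__(self, max_report_frequency=5, max_progress_rows=20):
        self.max_report_frequency = max_report_frequency
        self.max_progress_rows = max_progress_rows


# Arguments taken from the call in the post above.
wanted = {
    "max_report_frequency": 600,
    "sort_by_metric": True,
    "max_progress_rows": 100,
    "mode": "max",
    "metric": "eval_accuracy",
}
accepted = supported_kwargs(DemoReporter, wanted)
dropped = sorted(set(wanted) - set(accepted))
reporter = DemoReporter(**accepted)
print("dropped:", dropped)
```

With a real installation, the same helper could be called as `supported_kwargs(tune.CLIReporter, wanted)`. Dropping arguments silently changes how progress is reported, so it is worth printing the dropped names as done here.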
Hi! It looks like you are using Python 3.8. The Ray 2.6.1 issues should be resolved in a Python 3.10.12 environment. I recommend either 1) creating a Python 3.10 conda environment before installing Ray 2.6.x, or 2) doing a conda/mamba installation after first requesting that the environment be set to Python 3.10.12. I suggest conda/mamba installations for Ray. You could also install Ray from PyPI inside your conda environment with !pip install ray and !pip install ray[tune].
Thank you,
Madhavan
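A quick guard at the top of the script can confirm which interpreter is actually running before any Ray import is attempted. This is a minimal sketch: the 3.10 threshold comes from the suggestion above, and `python_is_at_least` is a hypothetical helper name.

```python
import sys


def python_is_at_least(required, current=None):
    """Return True if the interpreter version is at least `required`.

    `required` and `current` are (major, minor) tuples; `current` defaults
    to the running interpreter and is overridable only for testing.
    """
    current = tuple(sys.version_info[:2]) if current is None else tuple(current)
    return current >= tuple(required)


if not python_is_at_least((3, 10)):
    found = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Python {found} found; Python 3.10 is suggested in this thread for Ray 2.6.x")
```

Running the script with `python3.8` versus `python` can pick up different interpreters and site-packages, so this check is most useful inside the script itself.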
Hi Madhavan, thanks a lot for your reply! I created a new virtual environment with Python 3.10.12 and installed Ray 2.6.1 and ray[tune], but I still get this error:
(myenv) swang12@sbbdmaplp003:/raid/swang12/Transformer31012$ python hyperparam_optimiz_for_disease_classifier.py
Traceback (most recent call last):
File "/raid/swang12/Transformer31012/hyperparam_optimiz_for_disease_classifier.py", line 19, in <module>
from ray.tune.suggest.hyperopt import HyperOptSearch
ModuleNotFoundError: No module named 'ray.tune.suggest'
Could you help me with this? Thank you very much!
Sincerely,
Su
Sure! Since Ray Tune has been updated, you might want to try changing "from ray.tune.suggest.hyperopt import HyperOptSearch" to "from ray.tune.search.hyperopt import HyperOptSearch".
- Madhavan
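If the script has to run against both older and newer Ray releases, the import can try the new module path first and fall back to the old one. The sketch below assumes only the two module paths quoted in this thread; `import_first` is a hypothetical helper, and the final fallback to `None` is there so the example still runs where Ray is not installed.

```python
import importlib


def import_first(names):
    """Import and return the first module in `names` that can be imported.

    Hypothetical helper; raises ImportError listing every path tried.
    """
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:  # also covers ModuleNotFoundError
            continue
    raise ImportError(f"none of these modules could be imported: {names}")


# Module paths taken from this thread: newer Ray first, older Ray second.
HYPEROPT_PATHS = ["ray.tune.search.hyperopt", "ray.tune.suggest.hyperopt"]

try:
    HyperOptSearch = import_first(HYPEROPT_PATHS).HyperOptSearch
except ImportError:
    HyperOptSearch = None  # Ray is not installed in this environment
```

Pinning one Ray version in the environment is simpler when that is an option; the fallback is mainly useful for scripts shared across machines.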
Hi Madhavan, I followed your instructions and it works. Now I get another error:
Traceback (most recent call last):
File "/raid/swang12/Transformer31012/hyperparam_optimiz_for_disease_classifier.py", line 110, in <module>
trainset_v4 = trainset_v3.map(classes_to_ids, num_proc=num_proc)
File "/home/swang12/anaconda3/envs/myenv/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 592, in wrapper
out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
File "/home/swang12/anaconda3/envs/myenv/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 557, in wrapper
out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
File "/home/swang12/anaconda3/envs/myenv/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3189, in map
for rank, done, content in iflatmap_unordered(
File "/home/swang12/anaconda3/envs/myenv/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 1387, in iflatmap_unordered
raise RuntimeError(
RuntimeError: One of the subprocesses has abruptly died during map operation.To debug the error, disable multiprocessing.
I tried disabling multiprocessing but it doesn't work. Do you have any clues? Thank you very much!
Sincerely,
Su
When multiprocessing is disabled, what is the error you get? This error comes from a map operation on an Arrow dataset from the datasets library. For troubleshooting, you can try dataset = dataset.map(function_name, num_proc=1).
Another consideration is memory: on a Linux machine, you can monitor usage with htop/top. This goes hand in hand with concurrency, since the script can crash if the background load approaches capacity.
Thanks,
Madhavan
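To see the underlying exception that the multiprocessing wrapper hides, the mapped function can also be applied to rows one at a time in the main process. This is a minimal sketch with toy data: `first_failure` is a hypothetical helper, and the `classes_to_ids` body and label dictionary below are stand-ins, since the real ones are not shown in this thread. It only surfaces Python exceptions; a worker killed for running out of memory will not show up this way.

```python
def first_failure(func, rows):
    """Apply func to each row in order.

    Returns (index, exception) for the first row that raises,
    or None if every row succeeds. Hypothetical helper.
    """
    for index, row in enumerate(rows):
        try:
            func(row)
        except Exception as exc:
            return index, exc
    return None


# Toy stand-ins for the label mapping and the mapped function.
target_name_id_dict = {"healthy": 0, "disease": 1}


def classes_to_ids(example):
    example["label"] = target_name_id_dict[example["label"]]
    return example


rows = [{"label": "healthy"}, {"label": "disease"}, {"label": "unknown"}]
result = first_failure(classes_to_ids, rows)
if result is not None:
    index, exc = result
    print(f"row {index} failed with {type(exc).__name__}: {exc}")
```

Iterating over a datasets `Dataset` yields one dictionary per row, so passing the real dataset in place of `rows` should work the same way, although it will be slow on large data.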
This can also happen due to a poor network connection between the data and the node where the processing occurs, for example if the data is read from a mounted network drive. We recommend copying the data onto a scratch disk local to the processing node instead.
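Staging the dataset onto local scratch before processing can be sketched as follows. The paths here are temporary directories standing in for the network mount and the scratch disk, and `stage_locally` is a hypothetical helper; the real locations depend on the cluster.

```python
import shutil
import tempfile
from pathlib import Path


def stage_locally(source_dir, scratch_root):
    """Copy source_dir into scratch_root and return the path of the local copy.

    An existing copy is reused rather than overwritten. Hypothetical helper.
    """
    source = Path(source_dir)
    target = Path(scratch_root) / source.name
    if not target.exists():
        shutil.copytree(source, target)
    return target


# Demo: temporary directories stand in for the network mount and scratch disk.
with tempfile.TemporaryDirectory() as mount, tempfile.TemporaryDirectory() as scratch:
    dataset_dir = Path(mount) / "example.dataset"
    dataset_dir.mkdir()
    (dataset_dir / "data.arrow").write_bytes(b"placeholder")

    local_copy = stage_locally(dataset_dir, scratch)
    copied_ok = (local_copy / "data.arrow").read_bytes() == b"placeholder"
    print("staged to", local_copy, "ok:", copied_ok)
```

The script would then load from the returned path, for example with `datasets.load_from_disk(str(local_copy))`, instead of reading from the network drive.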