in_silico_perturber UnboundLocalError: local variable 'original_max_len' referenced before assignment
#240 by junguyen - opened
Hello,
I seem to be facing the same error described in Discussion #182; however, pulling the latest version of Geneformer did not fix it:
import gc

import torch
from geneformer import InSilicoPerturber

# Free Python garbage and cached GPU memory before starting
gc.collect()
torch.cuda.empty_cache()
isp = InSilicoPerturber(perturb_type="delete",
                        perturb_rank_shift=None,
                        # HNF4A: ENSG00000101076
                        genes_to_perturb=["ENSG00000101076"],
                        combos=0,
                        anchor_gene=None,
                        model_type="Pretrained",
                        num_classes=0,
                        emb_mode="cell",
                        cell_emb_style="mean_pool",
                        filter_data=None,
                        cell_states_to_model=None,
                        max_ncells=10,
                        emb_layer=-1,
                        forward_batch_size=10,
                        nproc=16,
                        token_dictionary_file="/home/ubuntu/Geneformer/geneformer/token_dictionary.pkl")
# Perturb data
isp.perturb_data("/home/ubuntu/Geneformer/",
                 "/data/genecorpus_filtered_hep/",
                 "/data/genecorpus_filtered_hep/delete_cell/",
                 "delete_cell_HNF4A")
Filter (num_proc=16): 100%|██████████| 12775/12775 [00:13<00:00, 981.99 examples/s]
Map (num_proc=16): 100%|██████████| 3000/3000 [00:12<00:00, 231.00 examples/s]
Map (num_proc=16): 100%|██████████| 3000/3000 [00:00<00:00, 8204.96 examples/s]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/tensorflow/lib/python3.10/site-packages/geneformer/in_silico_perturber.py", line 974, in perturb_data
    self.in_silico_perturb(model,
  File "/opt/tensorflow/lib/python3.10/site-packages/geneformer/in_silico_perturber.py", line 1052, in in_silico_perturb
    cos_sims_data = quant_cos_sims(model,
  File "/opt/tensorflow/lib/python3.10/site-packages/geneformer/in_silico_perturber.py", line 411, in quant_cos_sims
    attention_mask = gen_attention_mask(original_minibatch, original_max_len)
UnboundLocalError: local variable 'original_max_len' referenced before assignment
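For anyone reading along who hasn't run into this error class before: an UnboundLocalError of this kind generally means a local name was assigned on only some code paths and then read on a path where the assignment never ran. The sketch below is a hypothetical illustration of that pattern, not the actual Geneformer code; the functions `max_len_buggy` and `max_len_safe` and the `compute_len` flag are made up for the example, and I don't know which branch is skipped inside `quant_cos_sims`.

```python
# Hypothetical illustration only; not taken from geneformer/in_silico_perturber.py.

def max_len_buggy(batch, compute_len):
    """Binds original_max_len only when compute_len is truthy."""
    if compute_len:
        original_max_len = max(len(seq) for seq in batch)
    # When compute_len is False the name was never bound in this scope,
    # so reading it here raises UnboundLocalError.
    return original_max_len


def max_len_safe(batch, compute_len):
    """Binds original_max_len on every path before it is read."""
    original_max_len = None
    if compute_len:
        original_max_len = max(len(seq) for seq in batch)
    return original_max_len


batch = [[1, 2, 3], [4, 5]]

try:
    max_len_buggy(batch, compute_len=False)
except UnboundLocalError as err:
    # The exact wording differs between Python versions,
    # but the message names the unbound variable.
    error_message = str(err)
```

If the real cause is similar, the traceback would depend on which combination of parameters is passed, since that would decide whether the assigning branch runs.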
When I change max_ncells to 30 and set forward_batch_size=30, I receive a different error (see below). I'm not sure whether the two errors are related, but I thought I'd post both anyway. I ran this code successfully with these parameters on Aug 31, 2023, so the errors may be due to changes made after that date.
Filter (num_proc=16): 100%|██████████| 12775/12775 [00:12<00:00, 986.85 examples/s]
Map (num_proc=16): 100%|██████████| 30/30 [00:13<00:00, 2.30 examples/s]
Map (num_proc=16): 100%|██████████| 30/30 [00:00<00:00, 172.79 examples/s]
Map (num_proc=16): 100%|██████████| 30/30 [00:00<00:00, 170.27 examples/s]
Map (num_proc=16): 100%|██████████| 30/30 [00:00<00:00, 177.20 examples/s]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/tensorflow/lib/python3.10/site-packages/geneformer/in_silico_perturber.py", line 974, in perturb_data
    self.in_silico_perturb(model,
  File "/opt/tensorflow/lib/python3.10/site-packages/geneformer/in_silico_perturber.py", line 1052, in in_silico_perturb
    cos_sims_data = quant_cos_sims(model,
  File "/opt/tensorflow/lib/python3.10/site-packages/geneformer/in_silico_perturber.py", line 444, in quant_cos_sims
    cos_sims += [cos(minibatch_emb, minibatch_comparison).to("cpu")]
  File "/opt/tensorflow/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/tensorflow/lib/python3.10/site-packages/torch/nn/modules/distance.py", line 87, in forward
    return F.cosine_similarity(x1, x2, self.dim, self.eps)
RuntimeError: The size of tensor a (2047) must match the size of tensor b (2046) at non-singleton dimension 1
Happy to report that I am no longer receiving these errors today!
junguyen changed discussion status to closed