staramr fails with "KeyError: 'Predicted Phenotype'" #115

Closed
apetkau opened this issue Feb 12, 2020 · 10 comments
Labels
bug Something isn't working

Comments

@apetkau
Member

apetkau commented Feb 12, 2020

When running staramr on any genome, I get the following error:

2020-02-12 16:04:35,752 INFO: Scheduling blasts for SRR1952908.fasta
2020-02-12 16:04:36,591 ERROR: 'Predicted Phenotype'
...
  File "pandas/_libs/index.pyx", line 90, in pandas._libs.index.IndexEngine.get_value
  File "pandas/_libs/index.pyx", line 133, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 157, in pandas._libs.index.IndexEngine._get_loc_duplicates
KeyError: 'Predicted Phenotype'

This looks to be due to an issue with recent versions of the pandas library. Downgrading pandas to 0.25.3 works (e.g., with conda install pandas==0.25.3).

The staramr code should likely be updated to support more recent pandas versions.

@apetkau apetkau added the bug Something isn't working label Feb 12, 2020
@kapsakcj

kapsakcj commented May 4, 2020

Hey @apetkau - just FYI, I hit this issue as well with my Docker image for staramr 0.7.1.
pip3 install staramr==0.7.1 installs pandas 1.0.3.

When I tested staramr, I hit an error similar to the one you described above:

...
2020-05-04 20:39:48 INFO: Scheduling blasts and MLST for contigs.fasta
2020-05-04 20:39:51 WARNING: No drug found for organism=salmonella, gene=parC, position=57
2020-05-04 20:39:51 ERROR: 'Predicted Phenotype'
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py", line 4410, in get_value
    return libindex.get_value_at(s, key)
  File "pandas/_libs/index.pyx", line 44, in pandas._libs.index.get_value_at
  File "pandas/_libs/index.pyx", line 45, in pandas._libs.index.get_value_at
  File "pandas/_libs/util.pxd", line 98, in pandas._libs.util.get_value_at
  File "pandas/_libs/util.pxd", line 83, in pandas._libs.util.validate_indexer
TypeError: 'str' object cannot be interpreted as an integer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
...

The problem was resolved when I downgraded to pandas 0.25.3 inside my container.

I also tested the biocontainer available on quay.io (quay.io/biocontainers/staramr:0.7.1--py_1), which worked fine with no errors. That container has pandas 0.25.3.

Hope this helps! and thanks for your help earlier today with the pypi package!

@apetkau
Member Author

apetkau commented May 4, 2020

Thanks for that. Yes, this is still an issue that I haven't had a chance to fix yet.

@kapsakcj

kapsakcj commented May 5, 2020

No worries. Just wanted to make you aware in case it would help future development.

It might be worth adding something to the dependencies list and/or install instructions in the README to specify the compatible pandas versions.
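
As a rough illustration of the suggestion above, a version guard could look something like the following. This is only a sketch: the function names are hypothetical, and the cut-off at 1.0.0 is an assumption based solely on the reports in this thread (0.25.3 works, 1.0.3 fails), not a tested boundary.

```python
def parse_version(version):
    """Turn a string like '1.0.3' into a tuple like (1, 0, 3).

    Only the leading digits of each of the first three components are used,
    so a suffix such as 'rc0' is ignored.
    """
    parts = []
    for piece in version.split('.')[:3]:
        digits = ''
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


# Assumption: first pandas release line reported as failing in this thread.
FIRST_FAILING = (1, 0, 0)


def pandas_version_ok(version):
    """Return True if the version is older than the assumed failing line."""
    return parse_version(version) < FIRST_FAILING
```

In practice the argument would be pandas.__version__, and the same constraint could instead be expressed as a version pin in the package's dependency list.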

@apetkau
Member Author

apetkau commented Jun 16, 2020

Regarding this issue: I have updated the bioconda packages for versions 0.7.1 and 0.4.0 so that a compatible version of pandas gets installed. Installing via pip is still affected, and I have not yet had a chance to update the code so that it works with newer versions of pandas.

@javiertognarelli

Hi @apetkau, I just got stuck on this issue, and while debugging the error I found that the following changes fixed it for me (using pandas 1.0.5):

  1. file AMRDetectionSummaryResistance.py line 25:
    flattened_phenotype_list = [y.strip() for x in dataframe.get('Predicted Phenotype').tolist() for y in x.split(self.SEPARATOR)]

  2. file AMRDetectionSummary.py line 45:
    lambda x: {'Gene': (self.SEPARATOR + ' ').join(x.get('Gene'))})

It seems pandas 1.0.5 sometimes didn't like df['key'], whereas df.get('key') worked.

Best regards,
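
The general difference between the two access styles mentioned in the comment above can be sketched as follows. This does not reproduce the staramr failure itself, and the thread does not establish why pandas 1.0.x triggered it; the sample data and the SEPARATOR value are made up for illustration.

```python
import pandas as pd

SEPARATOR = ','  # hypothetical stand-in for self.SEPARATOR in staramr

# Made-up sample data with the column named in the error message.
df = pd.DataFrame({
    'Gene': ['gyrA', 'parC'],
    'Predicted Phenotype': ['ciprofloxacin, nalidixic acid', 'unknown'],
})

# Same shape as the list comprehension in fix 1, using .get() for the column.
flattened = [y.strip()
             for x in df.get('Predicted Phenotype').tolist()
             for y in x.split(SEPARATOR)]

# A Series that does not contain the label 'Predicted Phenotype'.
s = pd.Series({'Gene': 'gyrA'})

# Bracket access raises KeyError for a missing label...
try:
    s['Predicted Phenotype']
    bracket_raised = False
except KeyError:
    bracket_raised = True

# ...whereas .get() returns its default (None) instead of raising.
get_result = s.get('Predicted Phenotype')
```

One trade-off of .get() is that a genuinely missing column yields None rather than an immediate KeyError, so any error then surfaces later, wherever the None is first used.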

@apetkau
Member Author

apetkau commented Sep 22, 2020

That's awesome. Thanks so much @javiertognarelli 😄. We can incorporate this fix in and release an update (or you can submit a pull request with fixed code if you want).

@javiertognarelli

I'm glad you like it. I'm still new to GitHub and I'm using staramr from conda, so I guess it'd be better if you do it.
Cheers.

@apetkau
Member Author

apetkau commented Sep 23, 2020

@javiertognarelli sounds great. Thanks so much 😄

@apetkau
Member Author

apetkau commented Oct 13, 2020

Fixed in #126

@ValentinCledassou

ValentinCledassou commented May 9, 2022

Hello, I have the same problem with staramr ("KeyError: 'Predicted Phenotype'") on Galaxy.org (Galaxy Version 0.5.1) and Galaxy.eu (Galaxy Version 0.7.2+galaxy0).
