KeyError appears when I use the Dataprep module #973

Open
DummyBroker opened this issue Sep 14, 2023 · 1 comment
Labels
type: bug Something isn't working


@DummyBroker

Describe the bug
I use Anaconda and installed the dataprep module with the following command:
conda install -c conda-forge dataprep
Then I tried the example code from the website:

from dataprep.datasets import load_dataset
from dataprep.eda import create_report
from dataprep.eda import plot, plot_correlation, plot_missing
df = load_dataset("titanic")
print(df.columns.tolist())
create_report(df).show()

and it showed the following error:

from dataprep.datasets import load_dataset
from dataprep.eda import create_report
from dataprep.eda import plot, plot_correlation, plot_missing

df = load_dataset("titanic")
print(df.columns.tolist())
['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp', 'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked']

create_report(df).show()
Computing series-max-agg-6f34ce939adc72d34b6b5a81d3b66957:   0%|          | 0/1420 [00:00<?, ?it/s]C:\ProgramData\anaconda3\Lib\site-packages\dask\core.py:119: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))
error happended in column:Survived                                                                              
Traceback (most recent call last):

  File C:\ProgramData\anaconda3\Lib\site-packages\pandas\core\indexes\base.py:3653 in get_loc
    values are attempted to be sorted, but any TypeError from

  File pandas\_libs\index.pyx:147 in pandas._libs.index.IndexEngine.get_loc

  File pandas\_libs\index.pyx:176 in pandas._libs.index.IndexEngine.get_loc

  File pandas\_libs\hashtable_class_helper.pxi:7080 in pandas._libs.hashtable.PyObjectHashTable.get_item

  File pandas\_libs\hashtable_class_helper.pxi:7088 in pandas._libs.hashtable.PyObjectHashTable.get_item

KeyError: 'Survived'


The above exception was the direct cause of the following exception:

Traceback (most recent call last):

  Cell In[123], line 1
    create_report(df).show()

  File C:\ProgramData\anaconda3\Lib\site-packages\dataprep\eda\create_report\__init__.py:68 in create_report
    "components": format_report(df, cfg, mode, progress),

  File C:\ProgramData\anaconda3\Lib\site-packages\dataprep\eda\create_report\formatter.py:78 in format_report
    comps = format_basic(edaframe, cfg)

  File C:\ProgramData\anaconda3\Lib\site-packages\dataprep\eda\create_report\formatter.py:291 in format_basic
    res_variables = _format_variables(df, cfg, data)

  File C:\ProgramData\anaconda3\Lib\site-packages\dataprep\eda\create_report\formatter.py:120 in _format_variables
    rndrd = render(itmdt, cfg)

  File C:\ProgramData\anaconda3\Lib\site-packages\dataprep\eda\distribution\render.py:2473 in render
    visual_elem = render_cat(itmdt, cfg)

  File C:\ProgramData\anaconda3\Lib\site-packages\dataprep\eda\distribution\render.py:1573 in render_cat
    fig = bar_viz(

  File C:\ProgramData\anaconda3\Lib\site-packages\dataprep\eda\distribution\render.py:223 in bar_viz
    df["pct"] = df[col] / nrows * 100

  File C:\ProgramData\anaconda3\Lib\site-packages\pandas\core\frame.py:3761 in __getitem__
    key = com.apply_if_callable(key, self)

  File C:\ProgramData\anaconda3\Lib\site-packages\pandas\core\indexes\base.py:3655 in get_loc

KeyError: 'Survived'

My numpy version is 1.25.2
My pandas version is 2.0.3
My Python version is 3.11.4

I want to know why this error happens and how to solve it.
Is there anything that needs to be added?
Thank you so much!

@DummyBroker DummyBroker added the type: bug Something isn't working label Sep 14, 2023
@sanchitsharma

I was facing this same issue. At least two of your dependencies are incompatible with dataprep as of Dec 3, 2023:
Python is only supported for 3.8 <= version < 3.11 (https://pypi.org/project/dataprep/).
pandas is only supported for versions < 2 (found while installing with pip).
Dataprep runs for me with the following versions:

python-3.10.10
pandas-1.5.3
numpy-1.26.2
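
For anyone who wants to check their own environment against the two constraints mentioned above, here is a minimal sketch using only the standard library. The helper names `parse_version` and `find_conflicts` are made up for this example and are not part of dataprep; the limits are just the ones quoted in this comment (3.8 <= Python < 3.11, pandas < 2) and may not match what later dataprep releases require.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '1.5.3' into (1, 5, 3); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def find_conflicts(python_version, pandas_version):
    """Return a list of problems; an empty list means no known conflict.

    Hypothetical helper: the limits below are assumptions taken from this
    thread, not read from dataprep's own package metadata.
    """
    problems = []
    # Assumed limit from the comment above: 3.8 <= Python < 3.11.
    if not (3, 8) <= tuple(python_version[:2]) < (3, 11):
        problems.append("Python must satisfy 3.8 <= version < 3.11")
    # Assumed limit from the comment above: pandas < 2.
    if parse_version(pandas_version) >= (2,):
        problems.append("pandas must be < 2")
    return problems


if __name__ == "__main__":
    try:
        installed_pandas = version("pandas")
    except PackageNotFoundError:
        installed_pandas = None

    if installed_pandas is None:
        print("pandas is not installed in this environment")
    else:
        for problem in find_conflicts(sys.version_info, installed_pandas):
            print("Possible conflict:", problem)
```

With the versions from the original report (Python 3.11.4, pandas 2.0.3), `find_conflicts((3, 11, 4), "2.0.3")` flags both constraints, while the combination listed above (Python 3.10.10, pandas 1.5.3) returns an empty list.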
