Error while loading dataset from HF into Kaggle notebook
"data = load_dataset("datadrivenscience/movie-genre-prediction", use_auth_token=True)"
While loading the dataset, I got the following error:
"UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 7: invalid start byte"
Can you try again? We made a fix to the dataset.
Also, please use the latest version of datasets.
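For anyone unsure which version they are on, here is a minimal sketch for checking what is installed in the current environment (for example before and after running `pip install --upgrade datasets`). The helper name `installed_version` is made up for illustration and is not part of any library.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Print the versions of the packages involved in load_dataset calls.
for name in ("datasets", "huggingface_hub"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

In a notebook, restart the kernel after upgrading, otherwise the old version may still be the one that is imported.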
raise ConnectionError(f"Couldn't reach '{path}' on the Hub ({type(e).__name__})")
ConnectionError: Couldn't reach 'datadrivenscience/movie-genre-prediction' on the Hub (SSLError)
@abhishek - I am not able to access the data.
Did you go to the dataset page and submit an access request? Did you use your token?
@abhishek . Yes, I have tried accessing the data with my token, which I created on Hugging Face. Surprisingly, I am now able to access the data with load_dataset from the datasets library in Google Colab. Why am I not able to access the data on my local PC?
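To rule out a missing token, one option is to pass it explicitly. This is only a sketch under a few assumptions: the token is stored in the `HF_TOKEN` environment variable, access to the dataset has already been granted on its page, and the installed `datasets` release accepts the `token=` argument (the release used earlier in this thread takes `use_auth_token=` instead). The helper names `read_hub_token` and `load_gated_dataset` are hypothetical.

```python
import os
from typing import Optional

DATASET_ID = "datadrivenscience/movie-genre-prediction"


def read_hub_token(env_var: str = "HF_TOKEN") -> Optional[str]:
    """Return the token stored in `env_var`, or None if it is unset or blank."""
    token = os.environ.get(env_var, "").strip()
    return token or None


def load_gated_dataset(dataset_id: str = DATASET_ID):
    """Load a gated dataset, passing the token explicitly when one is set."""
    # Imported lazily so the token helper works without `datasets` installed.
    from datasets import load_dataset

    # Assumption: a recent `datasets` release that accepts `token=`.
    # With token=None the library falls back to the locally saved login.
    return load_dataset(dataset_id, token=read_hub_token())
```

Calling `load_gated_dataset()` needs network access and an approved access request, so a failure here can still be a connection problem rather than a token problem.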
@SSwaminathan
Glad to know that it worked in Colab.
Your concern mentioned here: https://huggingface.co/spaces/competitions/movie-genre-prediction/discussions/4 is already being looked at internally.
Please don't create multiple posts for the same problem, as it creates confusion :)
@abhishek . Sure. My intention here was not to create multiple posts. My first post was about the Hugging Face login issue and the second one was about the data. Since there was already a post about the data, I used that one.
Do you get an error while accessing the data?
The following image is from my local PC.
It seems like the same SSL error mentioned in the other post :)
We are looking into it!
Sure. Thanks
Can you try doing:
pip install --upgrade certifi requests
and then try loading the dataset again?
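After upgrading, it can help to see which CA bundle Python will actually use, since an outdated bundle is one possible cause of an SSLError. A minimal sketch, assuming nothing beyond the standard library plus an optional `certifi` install; the function name `ca_bundle_report` is made up for illustration.

```python
import ssl

try:
    import certifi  # optional; installed alongside requests
except ImportError:
    certifi = None


def ca_bundle_report() -> dict:
    """Collect basic facts about the TLS setup of this Python environment."""
    paths = ssl.get_default_verify_paths()
    return {
        "openssl_version": ssl.OPENSSL_VERSION,
        "default_cafile": paths.cafile,
        "default_capath": paths.capath,
        "certifi_bundle": certifi.where() if certifi else None,
    }


for key, value in ca_bundle_report().items():
    print(f"{key}: {value}")
```

Comparing this output between the machine that works and the one that fails may show whether the two are using different certificate bundles.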
@abhishek . The same issue still persists.
@abhishek . Now I am having the issue in Google Colab too.
You need to update datasets to the latest version.
OK, thank you, it works now.
@urielnguefack
I am getting the same error. I also updated datasets.
What did you do to resolve the problem?