🤗 Datasets, check!
Well, that was quite a tour through the 🤗 Datasets library. Congratulations on making it this far! With the knowledge that you've gained from this chapter, you should be able to:
- Load datasets from anywhere, be it the Hugging Face Hub, your laptop, or a remote server at your company.
- Wrangle your data using a mix of the `Dataset.map()` and `Dataset.filter()` functions.
- Quickly switch between data formats like Pandas and NumPy using `Dataset.set_format()`.
- Create your very own dataset and push it to the Hugging Face Hub.
- Embed your documents using a Transformer model and build a semantic search engine using FAISS.
In Chapter 7, we'll put all of this to good use as we take a deep dive into the core NLP tasks that Transformer models are great for. Before jumping ahead, though, put your knowledge of 🤗 Datasets to the test with a quick quiz!