How to download a dataset using the Kaggle API

Kaggle is one of the most popular communities for data science and machine learning. Through Kaggle, people can join competitions, learn new concepts, share notebooks, download datasets, and much more. This article focuses on downloading a dataset with the Kaggle API command-line interface. The steps below are based on the official docs and serve as an illustrative example of the process described there.

Step 1: Install the Kaggle package

To access the Kaggle API, the kaggle package must be installed through pip. Assuming pip is already installed on the system, run the command pip install kaggle. Figure 1 shows an example of installing kaggle and the output of the command.

install_kaggle.png
Figure 1: Kaggle installation
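The same installation step can also be driven from Python. The sketch below is only an illustration: the helper names kaggle_installed and pip_install_command are hypothetical, and it assumes the package is published on PyPI under the name kaggle, as the pip command above implies. It looks the package up without importing it, since importing kaggle before a token is in place may fail.

```python
import importlib
import importlib.util
import sys
from typing import List


def kaggle_installed() -> bool:
    """Return True if the kaggle package can be located, without importing it."""
    importlib.invalidate_caches()  # notice packages installed after startup
    return importlib.util.find_spec("kaggle") is not None


def pip_install_command(package: str = "kaggle") -> List[str]:
    """Build a pip command bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", package]


# Example usage (needs network access, so it is left commented out):
# import subprocess
# if not kaggle_installed():
#     subprocess.run(pip_install_command(), check=True)
```

Calling pip through sys.executable installs the package into the same environment that runs the script, which avoids a common mix-up when several Python versions are present.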

Step 2: Create a new API token

Head over to your account settings and click the button labeled "Create New Api Token" in the API section, as shown in Figure 2.

kaggle_api_token.png
Figure 2: Kaggle API token button

Creating a new token downloads a JSON file to the local system. Make a new folder called ".kaggle" in the C:\Users\<username>\ directory and move the new JSON file into it.
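Moving the token can be scripted as well. The following is a minimal sketch, not an official procedure: the function name install_token is hypothetical, and it assumes the downloaded file should end up named kaggle.json inside the ".kaggle" folder of the user's home directory (Path.home() resolves to C:\Users\<username> on Windows). It also restricts the file permissions, which is a precaution for a file that holds a secret key.

```python
import os
import shutil
import stat
from pathlib import Path
from typing import Optional


def install_token(downloaded: Path, config_dir: Optional[Path] = None) -> Path:
    """Move a downloaded token file into the .kaggle folder and return its new path."""
    # Assumption: the CLI looks for the token in <home>/.kaggle/kaggle.json.
    config_dir = config_dir or Path.home() / ".kaggle"
    config_dir.mkdir(parents=True, exist_ok=True)

    target = config_dir / "kaggle.json"
    shutil.move(str(downloaded), str(target))

    # Owner read/write only. On Windows, chmod only affects the read-only flag.
    os.chmod(target, stat.S_IRUSR | stat.S_IWUSR)
    return target


# Example usage (the Downloads path is an assumption about where the browser saved it):
# install_token(Path.home() / "Downloads" / "kaggle.json")
```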

Step 3: Download the dataset

There are many ways to browse existing datasets; a very simple one is to search for them on the Kaggle website. After selecting a dataset, copy the API command as shown in Figure 3.

copy_api_command.png
Figure 3: Copy API command button

After copying the command, execute it in a terminal as shown in Figure 4 to download the dataset.

download_datasets.png
Figure 4: Execution of the command to download datasets
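For repeatable downloads, the copied command can be assembled and run from a script. The sketch below assumes the copied command has the form kaggle datasets download -d <owner>/<dataset>; the helper name download_command and the slug "some-owner/some-dataset" are hypothetical placeholders, not a real dataset. The -p (destination folder) and --unzip options are part of the Kaggle CLI's datasets download command.

```python
from typing import List


def download_command(dataset: str, dest: str = ".", unzip: bool = False) -> List[str]:
    """Build the argument list for downloading a dataset with the Kaggle CLI."""
    owner, _, name = dataset.partition("/")
    if not owner or not name or "/" in name:
        raise ValueError("dataset must look like '<owner>/<dataset>'")

    cmd = ["kaggle", "datasets", "download", "-d", dataset, "-p", dest]
    if unzip:
        cmd.append("--unzip")  # extract the archive after downloading
    return cmd


# Example usage (needs the kaggle package, a valid token, and network access):
# import subprocess
# subprocess.run(download_command("some-owner/some-dataset", dest="data", unzip=True), check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems if the destination path contains spaces.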

Conclusion

The Kaggle API CLI is a powerful tool that can be used for many tasks. This article explained how to use it to download datasets. For further information, visit the official GitHub repo and the official docs.