Kaggling Tutorial #0: Download A Dataset from Kaggle Using API-Key

Kaggle
Colab
Author

Ismail TG

Published

October 18, 2023

How To Download a Dataset From Kaggle into a Colab Notebook:

  • In this mini tutorial we will create a Colab notebook and download a Kaggle dataset into it using a Kaggle API key.
  • Being able to work on several free computing platforms gives us more room to prototype multiple ideas in parallel.

Getting the API-Key:

  • First we need to get an API key from the Kaggle website, on the Settings page:
  • Click on Create New Token; this downloads a JSON file named kaggle.json.
  • Then we need to upload this file to the Colab notebook where we will work with the dataset.
  • Next we create a new directory, ~/.kaggle, and copy kaggle.json into it:
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle
  • Now we need to restrict the permissions of kaggle.json so that only the file's owner can read and write it:
!chmod 600 ~/.kaggle/kaggle.json
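The same setup can also be done from Python instead of shell commands. The following is a minimal sketch, not part of the original workflow: the function name install_kaggle_token and its home parameter are hypothetical, and it assumes the token is a JSON object with "username" and "key" fields.

```python
import json
import shutil
import stat
from pathlib import Path


def install_kaggle_token(token_path, home=None):
    """Copy kaggle.json into <home>/.kaggle and restrict it to the owner.

    Hypothetical helper: `home` defaults to the current user's home
    directory and exists mainly so the function can be tried in a sandbox.
    """
    token_path = Path(token_path)
    home = Path(home) if home is not None else Path.home()

    # Fail early if the token looks malformed (assumed fields: username, key).
    creds = json.loads(token_path.read_text())
    missing = {"username", "key"} - set(creds)
    if missing:
        raise ValueError(f"kaggle.json is missing fields: {sorted(missing)}")

    kaggle_dir = home / ".kaggle"
    kaggle_dir.mkdir(parents=True, exist_ok=True)  # like `mkdir -p ~/.kaggle`
    target = kaggle_dir / "kaggle.json"
    shutil.copyfile(token_path, target)            # like `cp kaggle.json ~/.kaggle`
    target.chmod(stat.S_IRUSR | stat.S_IWUSR)      # like `chmod 600`
    return target
```

In a notebook, calling install_kaggle_token("kaggle.json") after uploading the file should leave the token in the same place as the three shell commands above.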

Downloading the Dataset:

  • Now that everything is in place, we can download our dataset from Kaggle:
! kaggle competitions download linking-writing-processes-to-writing-quality
Downloading linking-writing-processes-to-writing-quality.zip to /content
 90% 97.0M/108M [00:00<00:00, 137MB/s]
100% 108M/108M [00:00<00:00, 135MB/s] 
  • This command gives us a .zip file that contains all the files we will work with.
  • To unzip it we will use Python's zipfile module:
from zipfile import ZipFile

zip_path = 'linking-writing-processes-to-writing-quality.zip'
with ZipFile(zip_path, 'r') as archive:

    # list all the contents of the zip file
    archive.printdir()

    # extract all files into the current directory
    print('extraction...')
    archive.extractall()
    print('Done!')
File Name                                             Modified             Size
sample_submission.csv                          2023-10-02 17:22:24           48
test_logs.csv                                  2023-10-02 17:22:24          398
train_logs.csv                                 2023-10-02 17:22:30    485679766
train_scores.csv                               2023-10-02 17:23:10        32132
extraction...
Done!
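For reuse across notebooks, the download and extraction steps can be wrapped in small functions. This is only a sketch under a few assumptions: the helper names are hypothetical, the kaggle command-line tool is installed with a valid kaggle.json in place, and the archive is named after the competition, as in the output above. The -p option tells the kaggle CLI which folder to download into.

```python
import subprocess
from pathlib import Path
from zipfile import ZipFile


def build_download_command(competition, dest="."):
    """Return the kaggle CLI call as an argument list (-p sets the output folder)."""
    return ["kaggle", "competitions", "download", competition, "-p", str(dest)]


def download_competition(competition, dest="."):
    """Run the download; requires the kaggle CLI and a valid kaggle.json."""
    subprocess.run(build_download_command(competition, dest), check=True)
    # Assumption: the archive is named <competition>.zip, as in the log above.
    return Path(dest) / f"{competition}.zip"


def extract_archive(zip_path, dest):
    """Extract every member of zip_path into dest and return the member names."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()
```

A typical call would be extract_archive(download_competition("linking-writing-processes-to-writing-quality"), "data"), which keeps the CSV files in their own data folder instead of the working directory.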

Conclusion:

  • We successfully downloaded a dataset from Kaggle using an API key.
  • From here we can apply all kinds of data manipulation, EDA, and model training to it.