How to Download a Dataset from Kaggle into a Colab Notebook:
- In this mini tutorial we will create a Colab notebook and download a Kaggle dataset using a Kaggle API key.
- Being able to work on several free computing platforms gives us more options for prototyping multiple ideas in parallel.
Getting the API Key:
- First we need to get an API key from the Kaggle website: open the settings page.
- Click on Create New Token; it will download a JSON file: kaggle.json
- Then we need to upload this file into the Colab notebook where we will work with the dataset.
- Here we create a new directory ~/.kaggle and copy kaggle.json inside:
!mkdir ~/.kaggle
!cp kaggle.json ~/.kaggle
- Now we need to restrict the permissions of kaggle.json so that only the file owner can read and write it:
!chmod 600 ~/.kaggle/kaggle.json
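The same setup can also be done from Python instead of shell commands. The following is a minimal sketch, not part of the original notebook: the function name install_kaggle_token and its home parameter are hypothetical, and the check assumes the token file is JSON containing "username" and "key" fields.

```python
import json
import shutil
import stat
from pathlib import Path


def install_kaggle_token(token_path, home=None):
    """Copy a Kaggle API token into <home>/.kaggle with owner-only permissions.

    Hypothetical helper; `home` defaults to the current user's home directory.
    """
    token_path = Path(token_path)
    home = Path(home) if home is not None else Path.home()

    # Sanity check (assumption): the token is JSON with "username" and "key".
    with token_path.open() as fh:
        token = json.load(fh)
    missing = {"username", "key"} - token.keys()
    if missing:
        raise ValueError(f"kaggle.json is missing fields: {sorted(missing)}")

    # Equivalent of: mkdir ~/.kaggle (does not fail if it already exists)
    target_dir = home / ".kaggle"
    target_dir.mkdir(parents=True, exist_ok=True)

    # Equivalent of: cp kaggle.json ~/.kaggle
    target = target_dir / "kaggle.json"
    shutil.copyfile(token_path, target)

    # Equivalent of: chmod 600 ~/.kaggle/kaggle.json
    target.chmod(stat.S_IRUSR | stat.S_IWUSR)
    return target
```

In a notebook cell this would be called as install_kaggle_token("kaggle.json") once the file has been uploaded. Unlike a plain mkdir, the sketch can be re-run without an error when the directory already exists.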
Downloading the Dataset:
- Now that everything is in place, we can download our dataset from Kaggle:
! kaggle competitions download linking-writing-processes-to-writing-quality
Downloading linking-writing-processes-to-writing-quality.zip to /content
90% 97.0M/108M [00:00<00:00, 137MB/s]
100% 108M/108M [00:00<00:00, 135MB/s]
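If the download needs to be scripted, for example to reuse it with another competition, the command above can be wrapped in Python. This is only a sketch under stated assumptions: the helper names build_download_command and download_competition are hypothetical, the kaggle command-line tool must already be installed and configured, and the optional -p flag is used to choose the download directory.

```python
import shlex
import subprocess


def build_download_command(competition, dest=None):
    """Build the argument list for `kaggle competitions download` (hypothetical helper)."""
    cmd = ["kaggle", "competitions", "download", competition]
    if dest is not None:
        # -p selects the directory the archive is downloaded to.
        cmd += ["-p", str(dest)]
    return cmd


def download_competition(competition, dest=None):
    """Run the download; raises CalledProcessError if the CLI exits with an error."""
    cmd = build_download_command(competition, dest)
    print("Running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)


# Example call (needs the kaggle CLI and a valid token, so it is left commented out):
# download_competition("linking-writing-processes-to-writing-quality")
```

Building the command as a list rather than a single string avoids shell quoting issues, and check=True turns a failed download into a Python exception instead of a silent failure.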
- The file we get from this command is a .zip archive, which contains all the files and datasets we will work with.
- To unzip this file we will use zipfile:
from zipfile import ZipFile

file = 'linking-writing-processes-to-writing-quality.zip'

# open the archive read-only (named zf to avoid shadowing the built-in zip)
with ZipFile(file, 'r') as zf:
    # list all the contents of the zip file
    zf.printdir()
    # extract all files into the current working directory
    print('extraction...')
    zf.extractall()
    print('Done!')
File Name                 Modified                  Size
sample_submission.csv     2023-10-02 17:22:24         48
test_logs.csv             2023-10-02 17:22:24        398
train_logs.csv            2023-10-02 17:22:30  485679766
train_scores.csv          2023-10-02 17:23:10      32132
extraction...
Done!
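As a variation on the extraction step, the archive can be unpacked into its own folder so the extracted files do not mix with the notebook's other files. The sketch below is one possible way to do this; the function name extract_archive and the default folder name "data" are assumptions, not part of the original notebook.

```python
from pathlib import Path
from zipfile import ZipFile


def extract_archive(zip_path, dest="data"):
    """Extract a zip archive into `dest` and return (name, uncompressed size) pairs.

    Hypothetical helper; `dest` is created if it does not exist.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with ZipFile(zip_path, "r") as zf:
        # Collect member names and uncompressed sizes before extracting.
        members = [(info.filename, info.file_size) for info in zf.infolist()]
        zf.extractall(dest)
    return members


# Example call (assumes the archive from the download step is in the working directory):
# for name, size in extract_archive("linking-writing-processes-to-writing-quality.zip"):
#     print(f"{name}: {size} bytes")
```

Returning the member list instead of printing it makes it easy to check afterwards that the expected files, such as train_logs.csv, are present.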
Conclusion:
- We successfully downloaded a dataset from Kaggle using an API key.
- From here we can apply all kinds of data manipulation, EDA, and model training to it.
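As a possible first step before any EDA, the header and a few rows of each extracted CSV can be inspected with the standard library alone. This is a small sketch; the helper name preview_csv is hypothetical, and it assumes the CSV files were extracted into the current working directory.

```python
import csv
from itertools import islice


def preview_csv(path, n_rows=5):
    """Return the header and the first `n_rows` data rows of a CSV file."""
    with open(path, newline="") as fh:
        reader = csv.reader(fh)
        header = next(reader, [])  # empty list if the file has no rows
        rows = list(islice(reader, n_rows))
    return header, rows


# Example call (file names taken from the archive listing):
# for name in ["train_scores.csv", "train_logs.csv"]:
#     header, rows = preview_csv(name, n_rows=3)
#     print(name, header, rows)
```

Because only the first rows are read, this stays fast even for the large train_logs.csv file.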