How to download Kaggle data into Google Colab

Colab is an awesome initiative from Google Research that lets anyone use an Nvidia Tesla K80 GPU for free. I always struggled to show my students the potential of deep learning without GPUs. Everything changed when I discovered Colab.

To get started in Colab (I will write a more comprehensive post in the near future), you just need to go to https://colab.research.google.com and sign in with your Google account.


To download data from Kaggle, you first need an API key. Go to https://www.kaggle.com/{username}/account and click on Create New API Token. It will download a JSON file (kaggle.json) that looks like this:

{"username":"{username}","key":"{API key}"}

Then open a new notebook in Colab and create the following cell, replacing the placeholders with the values from that file:

!pip install -U -q kaggle
!mkdir -p ~/.kaggle
!echo '{"username":"{user}","key":"{API key}"}' > ~/.kaggle/kaggle.json
!chmod 600 ~/.kaggle/kaggle.json

This installs the kaggle Python library, stores your credentials where the library expects to find them, and makes the credentials file readable only by you.
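If the shell quoting in the echo line gives you trouble, the same file can be written from a Python cell. This is only a sketch: write_kaggle_credentials is a name I made up for illustration, not part of the kaggle library, and it assumes the default ~/.kaggle location used in the cell above.

```python
import json
import os
import stat


def write_kaggle_credentials(username, key, config_dir=None):
    """Write kaggle.json with owner-only permissions and return its path."""
    # Default to ~/.kaggle, the same directory the shell cell uses.
    if config_dir is None:
        config_dir = os.path.join(os.path.expanduser("~"), ".kaggle")
    os.makedirs(config_dir, exist_ok=True)
    path = os.path.join(config_dir, "kaggle.json")
    # json.dump takes care of quoting and escaping the values.
    with open(path, "w") as f:
        json.dump({"username": username, "key": key}, f)
    # Same effect as `chmod 600`: read/write for the owner only.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return path
```

Call it as write_kaggle_credentials("{user}", "{API key}") with the placeholders replaced by your own values.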

Then we can download the data by typing

!mkdir -p data
!kaggle competitions download -c miia4406-movie-genre-classification -f dataTraining.csv -p data

Note that you must be registered in the competition whose data you want to download. The data is downloaded to the Colab virtual machine, so you can now read it into Python.
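Depending on the version of the kaggle client, the file may arrive compressed instead of as a plain CSV. The sketch below assumes that a compressed download is saved as the file name plus ".zip" in the same folder; locate_download is a hypothetical helper of mine, not something the kaggle library provides.

```python
import os
import zipfile


def locate_download(data_dir, file_name):
    """Return the path to file_name in data_dir, unzipping it first if needed."""
    plain_path = os.path.join(data_dir, file_name)
    zipped_path = plain_path + ".zip"
    # Assumption: a compressed download is saved as <file_name>.zip.
    if not os.path.exists(plain_path) and os.path.exists(zipped_path):
        with zipfile.ZipFile(zipped_path) as archive:
            archive.extractall(data_dir)
    # Fail loudly if the file is still missing, e.g. the download did not run.
    if not os.path.exists(plain_path):
        raise FileNotFoundError(f"{file_name} was not found in {data_dir}")
    return plain_path
```

For example, locate_download('data', 'dataTraining.csv') returns a path that can be passed straight to pd.read_csv.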

import pandas as pd
import os
dataTraining = pd.read_csv(os.path.join('data', 'dataTraining.csv'), encoding='UTF-8', index_col=0)
dataTraining.head()
      year  title                      plot                                             genres                              rating
3107  2003  Most                       most is the story of a single father who takes…  ['Short', 'Drama']                  8.0
 900  2008  How to Be a Serial Killer  a serial killer decides to teach the secrets o…  ['Comedy', 'Crime', 'Horror']       5.6
6724  1941  A Woman's Face             in sweden , a female blackmailer with a disfi…   ['Drama', 'Film-Noir', 'Thriller']  7.2
4704  1954  Executive Suite            in a friday afternoon in new york , the presi…   ['Drama']                           7.4
2582  1990  Narrow Margin              in los angeles , the editor of a publishing h…   ['Action', 'Crime', 'Thriller']     6.6
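One thing to keep in mind: the genres column looks like a list, but when it comes from a CSV it is most likely read as plain text. Assuming each cell holds a Python-style list literal, as the output above suggests, a small helper can convert it. parse_genres is a hypothetical name used here for illustration.

```python
import ast


def parse_genres(cell):
    """Turn a string such as "['Short', 'Drama']" into a Python list."""
    # literal_eval only accepts Python literals, so it is safer than eval.
    value = ast.literal_eval(cell)
    if not isinstance(value, list):
        raise ValueError(f"expected a list literal, got {cell!r}")
    return value
```

It can then be applied to the whole column with dataTraining['genres'].apply(parse_genres).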

That’s it, you can now train your models using a Tesla K80 for free.

Click here to open this notebook in Colab (Colab notebooks are stored in Google Drive and can be shared like a normal Google document).
