How to download a Kaggle dataset from the command line

Problem

Suppose you have found your favorite data set on Kaggle, but it is several gigabytes in size and you need it on your deep learning machine, not on your local laptop.

You cannot simply use wget because you need to be logged in to Kaggle.

Solution

The solution is to export your browser cookies from a logged-in Kaggle session and tell wget to send those cookies when downloading the data.

To export your cookies, install the Chrome extension called cookietxt-export and do the following:

  1. Log in to Kaggle
  2. Press the cookietxt-export button and copy the cookie text
  3. Go to the terminal of the deep learning machine and paste the cookie text into a file called, e.g., cookies.txt.
  4. Go back to the Kaggle site and copy the download link, e.g.

    https://www.kaggle.com/kmader/rsna-bone-age/downloads/rsna-bone-age.zip/2

  5. Finally, in a terminal on the deep learning machine, use wget with the cookie file to download your data set:

    wget -x -c --load-cookies cookies.txt https://www.kaggle.com/kmader/rsna-bone-age/downloads/rsna-bone-age.zip

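This command uses your exported cookies to download the Kaggle data set file to your deep learning machine. The -c flag resumes a partially downloaded file, and -x recreates the URL's directory structure locally.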
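The steps above can be wrapped in a small script that checks the cookie file before calling wget and checks the result afterwards. This is a minimal sketch, not a definitive implementation: the function names check_cookie_file, looks_like_zip and download_with_cookies are hypothetical helpers, and the checks assume that the exported file is in the Netscape cookies.txt format (seven tab-separated fields per cookie) and that the data set is a zip archive.

```shell
#!/bin/sh
# Sketch only: check_cookie_file, looks_like_zip and download_with_cookies
# are hypothetical helper names, not part of wget or Kaggle.

# Succeeds if the file is non-empty and contains at least one Netscape-format
# cookie line (7 tab-separated fields) whose domain ends in kaggle.com.
check_cookie_file() {
    [ -s "$1" ] || { echo "cookie file missing or empty: $1" >&2; return 1; }
    awk -F '\t' 'NF == 7 && $1 ~ /kaggle\.com$/ { found = 1 }
                 END { exit (found ? 0 : 1) }' "$1" ||
        { echo "no kaggle.com cookie found in: $1" >&2; return 1; }
}

# Succeeds if the file starts with "PK", the first two bytes of a zip archive.
# An HTML page saved in place of the archive would fail this check.
looks_like_zip() {
    [ "$(head -c 2 "$1" 2>/dev/null)" = "PK" ]
}

# Validates the cookie file ($1), then downloads the URL ($2) with the same
# wget flags as the command above.
download_with_cookies() {
    check_cookie_file "$1" || return 1
    wget -x -c --load-cookies "$1" "$2"
}

# Example (commented out so the snippet can be sourced without downloading):
# download_with_cookies cookies.txt \
#     "https://www.kaggle.com/kmader/rsna-bone-age/downloads/rsna-bone-age.zip" &&
#     looks_like_zip "www.kaggle.com/kmader/rsna-bone-age/downloads/rsna-bone-age.zip" ||
#     echo "download failed or is not a zip; the cookies may have expired" >&2
```

Because of the -x flag, wget saves the file under a directory named after the host and URL path, which is why the example checks www.kaggle.com/kmader/... rather than a file in the current directory.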
