Using Wget with Cookies

One of the powerful tools available in most Linux distributions is the Wget command-line utility.  With a simple one-line command, the tool can download files from the web and save them to the local disk.  While this capability might initially seem only moderately useful (why not just use Chrome or Firefox to download the file?), most Linux servers are managed remotely through SSH.  SSH normally offers only a command-line interface without any graphical components, so all server maintenance needs to be done through the command line.  Wget is used constantly during installation to download files from the Internet and install new programs on the system.

Normally, downloading a file from the Internet using Wget is done as follows:

wget http://www.domain.com/path/to/file

In addition to downloading programs, however, Wget can be used to remotely trigger events or run jobs in web applications.  To reuse the existing code of the web application, backend jobs are often implemented as scripts on the website.  To run the job, the server simply needs to request the webpage at a predefined interval.  It can do so with Wget, writing the page to standard output and discarding it by redirecting to /dev/null:

wget -qO- http://www.domain.com/script.php > /dev/null 2>&1

This command can then be placed in a cron job and executed at whatever interval the application needs.
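As a rough sketch of the cron step, the snippet below builds a crontab line for the job and prints it so it can be pasted into `crontab -e`. The 15-minute schedule is an assumption chosen only for illustration. Cron usually runs commands with /bin/sh, where the bash-only `&>` shorthand is not available, so the sketch uses the portable `> /dev/null 2>&1` form.

```shell
#!/bin/sh
# Build a crontab line for the batch job and print it.
# SCHEDULE is a hypothetical example (every 15 minutes); adjust as needed.
JOB_URL="http://www.domain.com/script.php"
SCHEDULE="*/15 * * * *"

# Crontab fields: minute hour day-of-month month day-of-week command
CRON_LINE="$SCHEDULE wget -qO- $JOB_URL > /dev/null 2>&1"

printf '%s\n' "$CRON_LINE"

# To append it to the current user's crontab instead of pasting by hand:
#   (crontab -l 2>/dev/null; printf '%s\n' "$CRON_LINE") | crontab -
```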

A problem arises, however, when the script is secured, as it should be, so that non-administrative users cannot run system batch jobs.  Passing login parameters directly in the URL is insecure because URLs are recorded in server logs.  Ideally, the job should authenticate with a token-based cookie stored only on the local machine.

To generate the cookie, first request the login script, passing the login parameters as POST data, and save the resulting cookie to disk:

wget -qO- --keep-session-cookies --save-cookies cookies.txt --post-data 'user=MYUSER&password=MYPASS' http://www.domain.com/login.php

The post-data fields will differ depending on the login form's arguments.  The resulting cookies will be saved to the file cookies.txt in the current folder.  This command should only be run once, by hand, and should not be stored in any script, so that the password is not hard-coded in a file.
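One possible way to do the one-time login without typing the password into the command itself is sketched below. It is an illustration, not the only approach: the function name `save_login_cookie` is hypothetical, the form fields `user` and `password` are the placeholders from the command above, and the values are not URL-encoded. The sketch uses Wget's `--post-file` option so the password does not appear in the process list, at the cost of holding it briefly in a private temporary file that is deleted afterwards.

```shell
#!/bin/sh
# Sketch: one-time login that saves the session cookie to disk.
# LOGIN_URL and COOKIE_FILE are placeholders; override them as needed.
LOGIN_URL="${LOGIN_URL:-http://www.domain.com/login.php}"
COOKIE_FILE="${COOKIE_FILE:-cookies.txt}"

save_login_cookie() {
    # $1 = user name, $2 = password (assumed to need no URL-encoding)
    post_file=$(mktemp) || return 1      # mktemp creates the file with mode 600
    printf 'user=%s&password=%s' "$1" "$2" > "$post_file"

    old_umask=$(umask)
    umask 077                            # a newly created cookie file is owner-only
    wget -qO- --keep-session-cookies --save-cookies "$COOKIE_FILE" \
         --post-file="$post_file" "$LOGIN_URL" > /dev/null
    status=$?
    umask "$old_umask"

    rm -f "$post_file"                   # remove the temporary POST body
    return "$status"
}

# Interactive use (not run automatically):
#   printf 'Password: '; stty -echo; read -r pass; stty echo; echo
#   save_login_cookie MYUSER "$pass"
```

Because the password is read at the prompt, it ends up neither in a script nor in the shell history.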

Finally, the authentication token in cookies.txt can be used to run the script in the batch job:

wget -qO- --load-cookies cookies.txt http://www.domain.com/script.php > /dev/null 2>&1
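For use from cron, the command can be wrapped so that a missing cookie file or a failed request is noticed rather than silently discarded. The following is a minimal sketch under stated assumptions: the function name `run_job` and the exit code 2 are hypothetical choices, and the check relies on Wget returning a non-zero exit status for network errors and HTTP error responses. An application that answers an expired session with a normal login page would not be caught this way.

```shell
#!/bin/sh
# Sketch: run the authenticated batch job and report failures.
# JOB_URL and COOKIE_FILE are placeholders; override them as needed.
JOB_URL="${JOB_URL:-http://www.domain.com/script.php}"
COOKIE_FILE="${COOKIE_FILE:-cookies.txt}"

run_job() {
    # Refuse to run without a readable cookie file.
    if [ ! -r "$COOKIE_FILE" ]; then
        echo "cookie file $COOKIE_FILE is missing or unreadable; log in again" >&2
        return 2
    fi
    # Discard the page itself; only the exit status is of interest.
    wget -qO- --load-cookies "$COOKIE_FILE" "$JOB_URL" > /dev/null 2>&1
}

# Typical call from a cron script (not run automatically):
#   run_job || echo "batch job failed with status $?" >&2
```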

This technique provides a more secure way to run batch jobs in web applications, since the password is neither stored in the job itself nor sent in a URL.  Ideally, the system account would only have access to the particular batch jobs that it executes.

An alternative method of executing batch jobs on the PHP platform is to call the PHP executable directly from the command line, instead of going through the web server.  While this can work in some instances, dynamic web applications often require virtual paths to be properly set and can behave unexpectedly when called from the command line.  It is generally safer and more cross-platform compatible to work within the web server framework, and to access the job service in the same manner that other web requests are processed.  The Wget technique will also work with other web development technologies, such as Node.js, ASP.NET, Rails, and Django.

Written by Andrew Palczewski

About the Author
Andrew Palczewski is CEO of apHarmony, a Chicago software development company. He holds a Master's degree in Computer Engineering from the University of Illinois at Urbana-Champaign and has over ten years' experience in managing development of software projects.

6 thoughts on “Using Wget with Cookies”

  1. So many services now are cookie dependent, this is a great way to get one’s scripts working again.

    As an alternative, would it be possible to login using a different browser and then copy the cookie over to the cookies.txt file?

    Lorian Bartle

  2. Hi

    I have a question for you- if there are hidden parameters in the form then how to handle them?

    Eg.

    Please let me know how to handle these parameters?

    Yours sincerely,
    Arvind.
