Prevent scripts from being killed

Posted March 25th, 2009 by Juozas

Getting back to shared-server problems. I have some very time-consuming scripts running through CRON: some nice web scraping jobs. They are not processing-intensive, just slow, because the websites they fetch are slow. These jobs are really hard to divide into separate scripts (that’s another article), so a single script needs to be able to run for hours without limits. However, web servers don’t allow that by default.

To start with, the maximum execution time is the first problem. By default, PHP terminates a script that has been running for more than 30 seconds (the max_execution_time setting). The actual limit depends on server configuration, but it’s nowhere near several hours. So the first thing to do is remove the time limit:

set_time_limit(0);

Zero means no time limit at all. However, the problem is not solved yet. If you are calling your script through Apache, it’s most likely that a script that produces no output for about 5 minutes will be killed too. I believe this depends on web server settings, but it can be easily tested: just create an infinite loop and try to load it in Firefox.

After some time Firefox will display a “Download” window with your script’s name, which means that your process has just been killed. I haven’t spent much time analyzing this behaviour, but the easiest fix is to print some text as the script works, for example:

for ($i = 0; $i < $pageCount; $i++)
{
   // Report progress and push it to the client so the connection stays active
   print $i . ' out of ' . $pageCount . ', working with: ' . $pages[$i] . "\n";
   flush();
   hardWork($pages[$i]);
}

This not only prevents the script from being killed, but also displays progress (x of N). It’s very useful when the code may have bugs, because it shows the exact unit on which your code got stuck. You also need to make sure there is enough memory. I use this:

ini_set('memory_limit', '128M');

Not all scripts require that much memory, but since this is used only for CRON tasks, it’s not unsafe.
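To judge whether a limit like 128M is really enough, one option is to compare it with the peak memory usage at the end of a run. The following is only a minimal sketch: the helper names shorthandToBytes() and memoryReport() are made up for illustration, while ini_get() and memory_get_peak_usage() are standard PHP functions.

```php
<?php
// Hypothetical helper: converts php.ini shorthand such as '128M' to bytes.
// A value of '-1' (no limit) is returned unchanged as -1.
function shorthandToBytes(string $value): int
{
    $value  = trim($value);
    $unit   = strtolower(substr($value, -1));
    $number = (int) $value; // an explicit cast keeps the leading number, e.g. 128

    switch ($unit) {
        case 'g': return $number * 1024 * 1024 * 1024;
        case 'm': return $number * 1024 * 1024;
        case 'k': return $number * 1024;
        default:  return $number;
    }
}

// Hypothetical helper: one-line summary of peak usage against the limit.
function memoryReport(): string
{
    $limit = shorthandToBytes((string) ini_get('memory_limit'));
    $peak  = memory_get_peak_usage(true);

    $limitText = $limit < 0
        ? 'no limit'
        : sprintf('%.1f MB limit', $limit / 1048576);

    return sprintf('peak %.1f MB, %s', $peak / 1048576, $limitText);
}

// Print the summary as the last step of the job.
print memoryReport() . "\n";
```

Printing this at the end of every job gives a rough idea of how much headroom is left before the limit needs raising.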

Furthermore, when scripts are called by wget or a browser, they are killed as soon as the user aborts the request. So if you click “Stop loading this page” in Firefox, execution stops. That’s usually good, but my experience has shown that sometimes wget (or a similar tool) decides not to wait any longer and simply stops loading. The process gets killed again.

I don’t know why it happened, but I spent a whole day trying to make a script complete its execution. Memory wasn’t an issue and there were no bugs, but it still kept being killed. Fortunately, there is a solution for this problem too:

ignore_user_abort(true);

Ignore user abort: it does what it says. The script keeps running even after the client disconnects.
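When chasing this kind of problem, it may help to record whether the client was still connected when the job finished. Below is a rough sketch of that idea; describeConnectionStatus() is a made-up helper name, while ignore_user_abort(), connection_status(), register_shutdown_function() and the CONNECTION_* constants are standard PHP. Note that PHP generally notices a disconnect only when it next tries to send output.

```php
<?php
// Keep running even if the client (wget, browser) disconnects.
ignore_user_abort(true);

// Hypothetical helper: turns the connection_status() bitfield into text.
function describeConnectionStatus(int $status): string
{
    if ($status === CONNECTION_NORMAL) {
        return 'normal';
    }

    $flags = [];
    if ($status & CONNECTION_ABORTED) {
        $flags[] = 'aborted';
    }
    if ($status & CONNECTION_TIMEOUT) {
        $flags[] = 'timeout';
    }

    return implode('+', $flags);
}

// When the script ends, write the final connection state to the error log.
register_shutdown_function(function () {
    error_log('Job finished, connection: ' . describeConnectionStatus(connection_status()));
});
```

A log entry saying “aborted” for a job that still completed suggests that wget gave up waiting while ignore_user_abort() kept the script alive.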

The last thing to make sure of is that output compression is disabled. When running CRON jobs, gzipping content is absolutely useless; it also uses memory and creates more problems with buffer flushing. I disable it like this:

apache_setenv('no-gzip', 1);
ini_set('zlib.output_compression', 0);
ini_set('implicit_flush', 1);
header("Content-Encoding: none");

My server uses gzip by default, so these settings make sure that the output is not compressed.

That’s all. I use all these lines at the start of my CRON jobs’ front controller and everything works fine. Please don’t use them in user-facing scripts, because they can create problems: if you have no access to running processes, stuck processes with no time limit will probably bring down your web server.
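Put together, the settings from this post could be collected into a single function called at the top of the front controller. This is a sketch under assumptions rather than the exact code in use: prepareLongRunningJob() is a hypothetical name, and the function_exists() and headers_sent() guards are additions so the same file also runs where PHP is not an Apache module.

```php
<?php
// Hypothetical helper collecting the settings from this post in one place.
// Call it before the script produces any output.
function prepareLongRunningJob(string $memoryLimit = '128M'): void
{
    set_time_limit(0);                      // no execution time limit
    ignore_user_abort(true);                // survive client disconnects
    ini_set('memory_limit', $memoryLimit);  // room for large jobs

    // apache_setenv() exists only when PHP runs as an Apache module.
    if (function_exists('apache_setenv')) {
        apache_setenv('no-gzip', '1');
    }

    // Compression and headers can only be changed before output starts.
    if (!headers_sent()) {
        ini_set('zlib.output_compression', '0');
        header('Content-Encoding: none');
    }

    // Flush automatically after every output call.
    ini_set('implicit_flush', '1');
}

prepareLongRunningJob();
```

Keeping everything in one function makes it harder to enable these settings by accident in a user-facing script.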

Comments (4)

  1. Giorgio Sironi

    Nice, though I never understood why PHP permits setting the memory limit in the script itself while a 1 KB file cannot be written (without using chmod first)…

  2. Wesley

    I do the same, except disable output compression. Smart, I will add (remove) that from my scripts as well. (Though they typically don’t output anything)

    I do think that the apache_setenv line could be removed, since you typically execute these php cron jobs via the CLI (command line interface) and not via wget http://url/script.php

  3. Adam L

    I don’t understand why you just don’t run the command from the Command Line?

  4. Juozas (author)

    Because of the hosting provider I use: I have no access to a shell, so CRON jobs are run with wget, not the CLI.
