Back to the problems of shared servers. I have some very time-consuming scripts running through CRON: some nice web scraping jobs. They are not processing-intensive, just slow because the websites they read are slow. These jobs are really hard to divide into separate scripts (that’s another article), so a single script has to be able to run for hours without limits. However, web servers don’t like that by default.
To start with, the maximum execution time is the first problem. By default, PHP aborts a script that has been running for more than 30 seconds (the max_execution_time setting). The actual limit depends on the server configuration, but it’s nowhere near several hours. So the first thing to do is to remove the time limit:
set_time_limit(0);
Zero means no time limit at all. However, the problem is not solved yet. If you call your script through Apache, a script that produces no output in about 5 will most likely be killed too. I believe this depends on the web server settings, but it can be easily tested: just create an infinite loop and try to load it in Firefox.
After some time Firefox will display a “Download” window with your script name, which means that your process has just been killed. I haven’t spent much time analyzing this behaviour, but the easiest thing to do is to print some text, for example:
for ($i = 0; $i < $pageCount; $i++) {
    // Report progress and push it to the client right away.
    print $i . ' out of ' . $pageCount . ', working with: ' . $pages[$i] . "\n";
    flush();
    hardWork($pages[$i]);
}
This not only prevents the script from being killed, but also displays completion (x of N) information. It’s very useful when the code may have bugs, because it shows the exact unit where your code got stuck. Also, you need to make sure that there is enough memory. I have this code:
ini_set('memory_limit', '128M');
Not all scripts require that much memory, but since this is used only for CRON tasks, it’s not unsafe.
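To judge whether 128M is really needed, the peak memory usage can be printed at the end of a run. This is only a minimal sketch: formatBytes() is a made-up helper name, not part of PHP, while memory_get_peak_usage() and ini_get() are built in.

```php
<?php
// Hypothetical helper: turn a byte count into a short readable string.
function formatBytes(int $bytes): string
{
    $units = ['B', 'K', 'M', 'G'];
    $i = 0;
    $value = (float) $bytes;
    while ($value >= 1024 && $i < count($units) - 1) {
        $value /= 1024;
        $i++;
    }
    return round($value, 1) . $units[$i];
}

ini_set('memory_limit', '128M');

// ... the job itself would run here ...

// Passing true reports memory allocated from the system,
// not only the part PHP is actively using.
print 'Peak memory: ' . formatBytes(memory_get_peak_usage(true))
    . ' of ' . ini_get('memory_limit') . "\n";
```

If the reported peak stays far below the limit over many runs, the limit can probably be lowered for that job.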
Furthermore, when scripts are called by wget or just a browser, they are killed as soon as the user aborts the request. So if you click “Stop loading this page” in Firefox, execution stops. That’s good, but my experience showed that sometimes wget (or another similar tool) decides not to wait any longer and simply stops loading. The process gets killed again.
I don’t know why, but I spent a whole day trying to make a script complete its execution. Memory wasn’t an issue and there were no bugs, but it still kept being killed. Nevertheless, there is a solution for this problem too:
ignore_user_abort(true);
Ignore user abort: it does what it says. The script keeps running even after the client has disconnected.
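Once the client is gone, nobody sees the printed progress any more, so it can help to write it to a log file as well. The following is a rough sketch under a few assumptions: logLine() is a made-up helper, the log path and the $pages list are placeholders, and the real work is left out. connection_aborted() is a built-in function that returns 1 once PHP has noticed the disconnect, which usually happens only when the script tries to send output.

```php
<?php
ignore_user_abort(true);

// Hypothetical helper: append one timestamped line to a log file.
function logLine(string $file, string $message): void
{
    file_put_contents($file, date('c') . ' ' . $message . "\n", FILE_APPEND);
}

$logFile = sys_get_temp_dir() . '/cron-job.log'; // assumed location
$pages = ['page-1', 'page-2', 'page-3'];          // placeholder work units

foreach ($pages as $i => $page) {
    $done = $i + 1;
    print $done . ' out of ' . count($pages) . "\n";
    flush();

    // Without ignore_user_abort(true) the script would stop at this point.
    if (connection_aborted()) {
        logLine($logFile, 'client gone, still working on ' . $page);
    }
    // ... the real work for $page goes here ...
}

logLine($logFile, 'job finished');
```

With this in place the log still shows that the job reached the end, even if wget gave up halfway.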
The last thing to make sure of is that output compression and buffering are disabled. When running CRON jobs, gzipping content is absolutely useless; it also uses memory and creates more problems with buffer flushing. I have it disabled by this:
apache_setenv('no-gzip', '1');            // ask Apache not to gzip the response
ini_set('zlib.output_compression', '0'); // turn off PHP's own compression
ini_set('implicit_flush', '1');          // flush after every output call
header('Content-Encoding: none');
My server uses gzip by default, so these settings make sure that the output is not compressed.
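If PHP's own output buffering is switched on (the output_buffering setting, or an ob_start() call somewhere in the code), flush() alone may not push anything to the client. A possible way around that, assuming no other code relies on those buffers, is to close them first. The function name disableOutputBuffering() is made up for this sketch; the ob_* functions are built in.

```php
<?php
// Hypothetical helper: close all PHP-level output buffers so that
// print + flush() reach the web server immediately.
function disableOutputBuffering(): void
{
    while (ob_get_level() > 0) {
        // Stop if a buffer refuses to be closed, to avoid looping forever.
        if (!@ob_end_flush()) {
            break;
        }
    }
    // From now on, flush automatically after every output call.
    ob_implicit_flush(true);
}

disableOutputBuffering();
print "buffers closed\n";
```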
That’s all. I use all these lines at the start of my CRON jobs’ front controller and everything works fine. Please don’t use them in user-facing scripts, because they can create problems: if you have no access to the running processes, stuck processes with no time limit will probably bring down your web server.
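Put together, the start of such a front controller might look roughly like this. It is a sketch rather than the exact file: the function name prepareLongRunningJob() is made up, and the function_exists() and headers_sent() checks are additions so that the same code does not fail when run from the command line, where apache_setenv() is not available.

```php
<?php
// Hypothetical helper collecting the settings described in this post.
function prepareLongRunningJob(string $memoryLimit = '128M'): void
{
    set_time_limit(0);                      // no execution time limit
    ini_set('memory_limit', $memoryLimit);  // room for big jobs
    ignore_user_abort(true);                // survive a client disconnect

    // apache_setenv() exists only when PHP runs as an Apache module.
    if (function_exists('apache_setenv')) {
        apache_setenv('no-gzip', '1');
    }
    ini_set('zlib.output_compression', '0');
    ini_set('implicit_flush', '1');

    // Headers can only be sent before any output.
    if (!headers_sent()) {
        header('Content-Encoding: none');
    }
}

prepareLongRunningJob();
```

Keeping everything in one function also makes it harder to include these settings by accident in a script that serves normal visitors.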







