Tip on Using Environment Variables Inside Python Scripts
After receiving several error emails from cron when trying to run the ETL pipeline I describe here, I realized that setting environment variables in your .profile does not guarantee that your .sh scripts will have access to them when cron runs them. For example, if script.py looks like
import os
sqlusr = os.getenv('SQLUSR')
sqlpass = os.getenv('SQLCRD')
and these variables are required for something else in the script, the following cron job will not work:
*/10 * * * * /usr/bin/python script.py
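Part of what makes this confusing is that os.getenv returns None for a missing variable instead of raising, so the error usually shows up later, for example when the database connection is opened. A minimal sketch of a guard that fails early and names what is missing is shown here; require_env is a hypothetical helper, not something from the original script, and the cron_like dictionary only stands in for the environment cron provides.

```python
import os


def require_env(names, environ=None):
    """Return the named variables as a dict, or raise listing every missing one.

    `environ` defaults to the real process environment; passing a plain dict
    makes the behaviour easy to try without touching cron.
    """
    environ = os.environ if environ is None else environ
    # An unset variable and an empty string are both treated as missing.
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in names}


# Stand-in for what cron provides: a few basics, none of the exports from .profile.
cron_like = {"PATH": "/usr/bin:/bin"}

try:
    require_env(["SQLUSR", "SQLCRD"], cron_like)
except RuntimeError as err:
    print(err)  # missing environment variables: SQLUSR, SQLCRD
```

In script.py the call would be require_env(["SQLUSR", "SQLCRD"]) with no second argument, so the cron error email states which variable was absent.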
The job fails because cron runs in a mostly empty environment: it does not start a login shell, so .profile is never read, and SQLUSR and SQLCRD are not available even though they are defined there. There are several ways to resolve this; I used the following approach. Wrap script.py in a shell script (cron_script.sh) as shown below:
#!/usr/local/bin/bash
export SQLUSR=valueofvariable
export SQLCRD=valueofvariable
source ~/yourproject/env/bin/activate
cd ~/whereyoukeepthepythonscript
python script.py
deactivate
and, after making it executable with chmod +x, call cron_script.sh in your crontab:
*/10 * * * * ~/cron_script.sh
Alternatively, source your .profile inside cron_script.sh instead of exporting the values there. Sourcing your .profile seems more secure, since the credentials then live in only one file, but I’m not sure if it actually is.
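The sourcing idea can also be tried from the Python side. This is only a sketch and a different technique from the wrapper above: it asks a child shell to source the file and print its environment, then picks out the variables of interest. load_profile_vars is a hypothetical name, /bin/sh and the env command are assumed to exist, and the parsing assumes the values contain no newlines.

```python
import subprocess


def load_profile_vars(profile_path, names):
    """Source `profile_path` in a child shell and return the requested variables.

    Only variables the profile *exports* are visible, because the values are
    read back from the output of `env`.
    """
    result = subprocess.run(
        # `. file` is the POSIX spelling of `source file`; "$1" is profile_path.
        ["/bin/sh", "-c", '. "$1" >/dev/null && env', "sh", profile_path],
        env={"PATH": "/usr/bin:/bin"},  # start from a cron-like, nearly empty environment
        capture_output=True,
        text=True,
        check=True,  # raise if the file is missing or the shell reports an error
    )
    found = {}
    for line in result.stdout.splitlines():
        name, sep, value = line.partition("=")
        if sep and name in names:
            found[name] = value
    return found


# Possible use at the top of script.py, before the os.getenv calls:
#   import os
#   os.environ.update(
#       load_profile_vars(os.path.expanduser("~/.profile"), {"SQLUSR", "SQLCRD"})
#   )
```

Whether this is preferable to sourcing in cron_script.sh is a matter of taste; it keeps the crontab entry pointing straight at Python, at the cost of starting an extra shell on every run.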