This blog post explains how setting an environment variable can save thousands (or in some cases, tens of thousands) of unnecessary system calls that can be generated by glibc over small periods of time.
This has been tested on Ubuntu Precise (12.04) and Ubuntu Xenial (16.04). It likely applies to other flavors of Linux, as well. It is very easy to test if this applies to you and to correct it, if so. Keep reading for more details!
To avoid extra system calls in server processes where you won’t be updating the timezone (or can restart processes when you do), simply set the TZ environment variable to :/etc/localtime (or some other timezone file of your choice) for the process. This will cause glibc to avoid making extra (and unnecessary) system calls.
To understand why this is and how to test if your processes can benefit, read on!
In our previous blog post about Linux system calls, we explained the different system call methods and highlighted a very interesting method in particular: the vDSO system call method.
The purpose of the vDSO system call method was to create a way for certain, very frequently used system calls (like gettimeofday, time, clock_gettime, etc.) to avoid needing to actually enter the kernel and cause a context switch from user land to kernel land. The result of this method is that certain system calls, like those listed, can be used by programs at much, much lower cost.
What would happen if every time you called one of these fast vDSO system calls (like time) you also called a normal system call like, say, stat, which does not pass through the vDSO?
If you did that, you’d essentially be negating some of the performance improvement you were meant to gain by the vDSO optimization in the first place; you’d be making a slow system call very often.
It turns out that this situation happens rather often with a pair of functions that are commonly used together:
- time: A vDSO-enabled system call used to obtain the number of seconds since the epoch, and
- localtime: A glibc-provided function which converts the output of time to a local time in the user’s timezone. This is not a system call, but internally localtime can make a system call in some cases.
The time vDSO-enabled system call and the localtime function from glibc are often used together in applications, either directly by the programmer or at a lower level, unbeknownst to the programmer, for formatting dates and times for everything from log messages to SQL queries. This pair is commonly used in Ruby on Rails.
It turns out that the localtime function in glibc will check if the TZ environment variable is set. If it is not set (the two Ubuntus I’ve tested do not set it), then glibc will use the stat system call every time localtime is called.
In other words: your system supports calling the time system call via the Linux kernel’s vDSO to avoid the cost of switching to the kernel. But, as soon as your program calls time, it calls localtime immediately after, which invokes a system call anyway. So, you’ve eliminated one system call with the vDSO, but replaced it with another.
Let’s see a sample program that shows this behavior, how to use strace to detect this, and finally how to prevent it by setting the TZ environment variable.
Sample program showing the issue
Let’s start by creating a simple test program that reproduces this issue:
#include <time.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
  int i = 0;
  time_t timep;

  printf("Greetings!\n");
  for (i = 0; i < 10; i++) {
    time(&timep);
    localtime(&timep);
  }
  printf("Godspeed, dear friend!\n");
  return 0;
}
You can compile this program by simply running gcc -o test test.c. As you can see, this program simply calls time and localtime in a loop 10 times.
Verifying this with strace
After the first one, every call to the localtime glibc function will generate a system call to stat. Don’t believe me? Let’s use strace to prove it using the program shown above:
$ gcc -o test test.c
$ strace -ttT ./test
... truncated output ...
23:55:16.957457 write(1, "Greetings!\n", 11) = 11 <0.000033>
23:55:16.957560 open("/etc/localtime", O_RDONLY|O_CLOEXEC) = 3 <0.000013>
23:55:16.957650 fstat(3, {st_mode=S_IFREG|0644, st_size=127, ...}) = 0 <0.000007>
23:55:16.957723 fstat(3, {st_mode=S_IFREG|0644, st_size=127, ...}) = 0 <0.000006>
23:55:16.957797 read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\1\0\0\0\0"..., 4096) = 127 <0.000011>
23:55:16.957904 lseek(3, -71, SEEK_CUR) = 56 <0.000008>
23:55:16.957975 read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\1\0\0\0\0"..., 4096) = 71 <0.000008>
23:55:16.958048 close(3) = 0 <0.000009>
23:55:16.958130 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=127, ...}) = 0 <0.000009>
23:55:16.958215 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=127, ...}) = 0 <0.000009>
23:55:16.958296 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=127, ...}) = 0 <0.000010>
23:55:16.958375 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=127, ...}) = 0 <0.000009>
23:55:16.958454 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=127, ...}) = 0 <0.000009>
23:55:16.958533 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=127, ...}) = 0 <0.000009>
23:55:16.958611 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=127, ...}) = 0 <0.000009>
23:55:16.958690 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=127, ...}) = 0 <0.000009>
23:55:16.958769 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=127, ...}) = 0 <0.000009>
23:55:16.958848 write(1, "Godspeed, dear friend!\n", 23) = 23 <0.000020>
In the strace output above we see a few things:
- The /etc/localtime file is opened. fstat is called twice, followed by two reads of the file to pull in the timezone data.
- Next, 9 calls to stat are made, passing /etc/localtime over and over.
Notice that the strace output does not show the call to time. This is expected. System calls made via the vDSO do not appear in strace output. To see them, you’d need to use ltrace instead.
What’s going on here is that the first call to localtime in glibc opens and reads the contents of /etc/localtime. All subsequent calls to localtime internally call stat; they do this to ensure that the timezone file has not changed.
On many systems /etc/localtime is a symlink to a timezone file. It is conceivable that a program might be running when the /etc/localtime symlink is updated. If this were to happen, glibc would notice this when localtime is called and re-read the file before doing any time conversions.
Many production systems use the UTC timezone and don’t ever need (or want) to change that. For this use case, there’s no reason to stat the /etc/localtime file or symlink over and over and over when localtime is called. The timezone is never going to change from UTC (or if it does, the application can just be restarted).
The easiest way to prevent these stat calls is to set the TZ environment variable. When the TZ environment variable is set, glibc will:
- Notice you’ve told it explicitly what timezone file to use for your program.
- Read the file and cache it internally.
- Never stat or read that file path again, as long as the value of the TZ environment variable is left unchanged.
Let’s set the TZ variable and check with strace:
$ TZ=:/etc/localtime strace -ttT ./test
00:14:53.274564 write(1, "Greetings!\n", 11) = 11 <0.000051>
00:14:53.274721 open("/etc/localtime", O_RDONLY|O_CLOEXEC) = 3 <0.000038>
00:14:53.274873 fstat(3, {st_mode=S_IFREG|0644, st_size=127, ...}) = 0 <0.000026>
00:14:53.275001 fstat(3, {st_mode=S_IFREG|0644, st_size=127, ...}) = 0 <0.000026>
00:14:53.275129 read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\1\0\0\0\0"..., 4096) = 127 <0.000028>
00:14:53.275257 lseek(3, -71, SEEK_CUR) = 56 <0.000027>
00:14:53.275379 read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\1\0\0\0\0"..., 4096) = 71 <0.000033>
00:14:53.275512 close(3) = 0 <0.000029>
00:14:53.275657 write(1, "Godspeed, dear friend!\n", 23) = 23 <0.000048>
As you can see, simply setting TZ causes glibc to read /etc/localtime a single time and never again. The same program runs with 9 fewer system calls, all because of a single environment variable.
Effect on production systems
The effect of setting TZ on a production system will depend mostly on how often localtime is called. This is application specific and can vary with request load. All that said, if you aren’t changing your timezone often (or ever), it may be worth simply eliminating these unnecessary calls even if your system isn’t making many of them.
On my test system with a real life app (not the example shown above):
- Without TZ set, normal operation yields approximately 14,925 calls to stat over a 30 second period (or roughly 497 stats per second).
- With TZ set, the same workload results in 8 calls to stat over a 30 second period.
So, I’ve eliminated nearly 15,000 extra system calls (and their associated context switches) in every 30 second window without changing anything other than an environment variable. Pretty cool.
Using strace and questioning why patterns of system calls emerge from your infrastructure can help you better understand exactly what your systems are doing and why. You may even be able to remove unnecessary system calls and save some system resources, too.