View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000113 | tcsh | General | public | 2019-10-11 22:04 | 2019-10-19 12:54 |
Reporter | sobomax | Assigned To | christos | ||
Priority | normal | Severity | major | Reproducibility | always |
Status | resolved | Resolution | fixed | ||
Fixed in Version | 6.22.00 | ||||
Summary | 0000113: Error handling is seriously broken in kill built-in when started in non-interactive mode and with SIGINT ignored | ||||
Description | We were investigating an issue where one of our Jenkins scripts occasionally hangs indefinitely at 100% CPU usage. It appears that a very simple command such as tcsh -c "kill TERM XYZ" hangs forever in an infinite loop when the following three conditions are met:
1. tcsh is started with standard output being anything but a terminal (file, /dev/null, pipe);
2. the parent process that invoked tcsh sets SIGINT to SIG_IGN (in our case we start it using su(1), which apparently does this);
3. kill(2) fails (e.g. ESRCH, etc.).
The issue seems to be [Free]BSD-specific, as we cannot reproduce it on Linux using the very same version of the software. tcsh invokes kill(2) over and over again:
    14:46:59.309 95884: No such process
    14:46:59.309 95884: No such process
    14:46:59.309 95884: No such process
    14:46:59.309 95884: No such process
    14:46:59.309 95884: No such process
    [...ad infinitum...]
| ||||
Steps To Reproduce |
    #include <assert.h>
    #include <fcntl.h>
    #include <signal.h>
    #include <unistd.h>

    int
    main(void)
    {
        int devnull;

        assert(signal(SIGINT, SIG_IGN) != SIG_ERR);
        devnull = open("/dev/null", O_WRONLY);
        assert(devnull >= 0);
        assert(dup2(devnull, STDOUT_FILENO) >= 0);
        assert(close(devnull) == 0);
        execl("/bin/tcsh", "/bin/tcsh", "-c", "kill -TERM 99999", NULL);
        return (255);
    }
| ||||
Additional Information | Version 6.20.00. | ||||
Tags | No tags attached. | ||||
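
Conditions 1 and 2 from the description can be checked from inside any process before it starts tcsh. The following is a minimal sketch, not code from tcsh or su(1); the helper names `signal_is_ignored` and `hang_preconditions_met` are hypothetical and exist only for illustration. It relies on the POSIX rule that a disposition of SIG_IGN is preserved across exec, which is why the parent's setting reaches the shell.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: report whether the disposition of signo is
 * currently SIG_IGN in the calling process. An ignored disposition
 * is inherited by whatever this process later executes.
 */
bool signal_is_ignored(int signo)
{
    struct sigaction sa;

    assert(signo > 0);
    /* A NULL "act" argument only queries the current disposition. */
    if (sigaction(signo, NULL, &sa) != 0)
        return false;
    return sa.sa_handler == SIG_IGN;
}

/*
 * Hypothetical helper: true when stdout is not a terminal (condition 1)
 * and SIGINT is ignored (condition 2). Condition 3, a failing kill(2),
 * depends on the target PID and is not checked here.
 */
bool hang_preconditions_met(void)
{
    return !isatty(STDOUT_FILENO) && signal_is_ignored(SIGINT);
}
```

A wrapper script or launcher could call `hang_preconditions_met()` and log a warning before invoking an affected tcsh version; whether that is worthwhile depends on the environment.
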
|
This is how it looks while it hangs:
    99089 - R 53:05.77 _su -m -c kill -TERM 95884 (csh)
Some of the ktrace logs of the actual issue (not our simulated test case):
    99089 csh CALL ioctl(0x6,TIOCGETA,0x7fffffffc1d8)
    99089 csh RET ioctl -1 errno 25 Inappropriate ioctl for device
    99089 csh CALL sigprocmask(SIG_SETMASK,0x687fd0,0)
    99089 csh RET sigprocmask 0
    99089 csh CALL close(0x6)
    99089 csh RET close 0
    99089 csh CALL openat(AT_FDCWD,0x8014407f0,0<O_RDONLY>)
    99089 csh NAMI "/root/.history"
    99089 csh RET openat -1 errno 13 Permission denied
    99089 csh CALL sigprocmask(SIG_BLOCK,0,0x687fd0)
    99089 csh RET sigprocmask 0
    99089 csh CALL setitimer(0,0x7fffffffe400,0x7fffffffe3e0)
    99089 csh STRU itimerval { .interval = {0, 0}, .value = {0, 0} }
    99089 csh STRU itimerval { .interval = {0, 0}, .value = {0, 0} }
    99089 csh RET setitimer 0
    99089 csh CALL close(0)
    99089 csh RET close -1 errno 9 Bad file descriptor
    99089 csh CALL dup(0x13)
    99089 csh RET dup 0
    99089 csh CALL fcntl(0,F_SETFD,0)
    99089 csh RET fcntl 0
    99089 csh CALL close(0x1)
    99089 csh RET close -1 errno 9 Bad file descriptor
    99089 csh CALL dup(0x11)
    99089 csh RET dup 1
    99089 csh CALL fcntl(0x1,F_SETFD,0)
    99089 csh RET fcntl 0
    99089 csh CALL close(0x2)
    99089 csh RET close -1 errno 9 Bad file descriptor
    99089 csh CALL dup(0x12)
    99089 csh RET dup 2
    99089 csh CALL fcntl(0x2,F_SETFD,0)
    99089 csh RET fcntl 0
    99089 csh CALL sigprocmask(SIG_BLOCK,0,0x687fd0)
    99089 csh RET sigprocmask 0
    99089 csh CALL kill(0x1768c,SIGTERM)
    99089 csh RET kill -1 errno 3 No such process
    99089 csh CALL stat(0x7fffffffd9c0,0x7fffffffd948)
    99089 csh NAMI "/usr/share/nls/C/libc.cat"
    99089 csh RET stat -1 errno 2 No such file or directory
    99089 csh CALL stat(0x7fffffffd9c0,0x7fffffffd948)
    99089 csh NAMI "/usr/share/nls/libc/C"
    99089 csh RET stat -1 errno 2 No such file or directory
    99089 csh CALL stat(0x7fffffffd9c0,0x7fffffffd948)
    99089 csh NAMI "/usr/local/share/nls/C/libc.cat"
    99089 csh RET stat -1 errno 2 No such file or directory
    99089 csh CALL stat(0x7fffffffd9c0,0x7fffffffd948)
    99089 csh NAMI "/usr/local/share/nls/libc/C"
    99089 csh RET stat -1 errno 2 No such file or directory
    99089 csh CALL write(0x1,0x6aa960,0x17)
    99089 csh GIO fd 1 wrote 23 bytes "95884: No such process "
    99089 csh RET write 23/0x17
    99089 csh CALL lseek(0x10,0,SEEK_END)
    99089 csh RET lseek -1 errno 29 Illegal seek
    99089 csh CALL sigprocmask(SIG_SETMASK,0x687fd0,0)
    99089 csh RET sigprocmask 0
    99089 csh CALL sigprocmask(SIG_BLOCK,0,0x687fd0)
    99089 csh RET sigprocmask 0
    99089 csh CALL sigprocmask(SIG_SETMASK,0x687fd0,0)
    99089 csh RET sigprocmask 0
    99089 csh CALL sigprocmask(SIG_BLOCK,0,0x687fd0)
    99089 csh RET sigprocmask 0
    99089 csh CALL setitimer(0,0x7fffffffe400,0x7fffffffe3e0)
    99089 csh STRU itimerval { .interval = {0, 0}, .value = {0, 0} }
    99089 csh STRU itimerval { .interval = {0, 0}, .value = {0, 0} }
    99089 csh RET setitimer 0
    99089 csh CALL close(0)
    99089 csh RET close 0
    99089 csh CALL dup(0x13)
    99089 csh RET dup 0
    99089 csh CALL fcntl(0,F_SETFD,0)
    99089 csh RET fcntl 0
    99089 csh CALL close(0x1)
    99089 csh RET close 0
    99089 csh CALL dup(0x11)
    99089 csh RET dup 1
    99089 csh CALL fcntl(0x1,F_SETFD,0)
    99089 csh RET fcntl 0
    99089 csh CALL close(0x2)
    99089 csh RET close 0
    99089 csh CALL dup(0x12)
    99089 csh RET dup 2
    99089 csh CALL fcntl(0x2,F_SETFD,0)
    99089 csh RET fcntl 0
    99089 csh CALL sigprocmask(SIG_BLOCK,0,0x687fd0)
    99089 csh RET sigprocmask 0
    99089 csh CALL kill(0x1768c,SIGTERM)
    99089 csh RET kill -1 errno 3 No such process
    99089 csh CALL write(0x1,0x6aa960,0x17)
    99089 csh GIO fd 1 wrote 23 bytes "95884: No such process "
[cycle repeats after that] |
|
Fixed on HEAD; please reopen if you have issues. |
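
On systems that still run an affected version, a caller such as a CI job can bound the damage by running the command under a timeout, so that a spinning child is killed instead of hanging the job indefinitely. The following is a minimal sketch of that idea and is not part of tcsh or Jenkins; the function name `run_with_timeout` and its return convention are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Seconds on the monotonic clock, unaffected by wall-clock changes. */
static double monotonic_seconds(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Hypothetical helper: execute argv[0] with arguments argv and wait at
 * most timeout_sec seconds. Returns the child's exit status (127 if the
 * exec itself failed), or -1 if fork failed, the child died from a
 * signal, or the child was killed because the timeout expired.
 */
int run_with_timeout(char *const argv[], unsigned timeout_sec)
{
    const struct timespec tick = { 0, 50 * 1000 * 1000 }; /* 50 ms poll */
    double deadline;
    int status;
    pid_t pid, r;

    assert(argv != NULL && argv[0] != NULL);

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv(argv[0], argv);
        _exit(127); /* only reached if execv failed */
    }

    deadline = monotonic_seconds() + (double)timeout_sec;
    for (;;) {
        r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
        if (r < 0)
            return -1;
        if (monotonic_seconds() >= deadline)
            break;
        nanosleep(&tick, NULL);
    }

    /* SIGKILL cannot be ignored, unlike the SIGINT in this report. */
    kill(pid, SIGKILL);
    (void)waitpid(pid, &status, 0);
    return -1;
}
```

For the command in this report, the argument vector would be `{ "/bin/tcsh", "-c", "kill -TERM 99999", NULL }`. Polling keeps the sketch short; a production wrapper might block in `sigtimedwait` or use a platform facility instead.
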
Date Modified | Username | Field | Change |
---|---|---|---|
2019-10-11 22:04 | sobomax | New Issue | |
2019-10-11 22:17 | sobomax | Note Added: 0003315 | |
2019-10-19 12:54 | christos | Assigned To | => christos |
2019-10-19 12:54 | christos | Status | new => assigned |
2019-10-19 12:54 | christos | Status | assigned => resolved |
2019-10-19 12:54 | christos | Resolution | open => fixed |
2019-10-19 12:54 | christos | Fixed in Version | => 6.22.00 |
2019-10-19 12:54 | christos | Note Added: 0003316 |