For some reason, the Linux kernel has changed how some syscalls are
used. As the source code shows, we watch fork/vfork/clone for process
creation and exit/exit_group for process termination. In my opinion,
fork and clone must differ in some way, otherwise there would be only
one syscall. However, in the logs I saw only clone and never fork,
only exit_group and never exit. In fact, fork calls clone with some
specific settings. The two still differ in meaning: fork implies a
parent and a child, while clone implies a caller and a callee, and
lets the two share resources. Both fork and clone create a new
process; pthread creates tasks, which are called threads. The pid is
in fact a task id.
The 3 coroutines now work well, and I have a process tree stored as
map[int]*process. Some open questions remain:
- Is it right for the 2nd coroutine to send an event to the 3rd as
soon as it sees EOE?
- How can the delay between exit_group and deletePid be made clear
and suitable?
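One way to make the delay explicit is to keep it as a single field on the tree and schedule deletion with a timer. The sketch below is only an illustration, not the repo's code: the `tree` type, its methods, and the 50 ms delay are assumptions, and only `map[int]*process` and `deletePid` are names taken from the notes above.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// process is a hypothetical node of the process tree.
type process struct {
	pid      int
	ppid     int
	children []int
}

// tree wraps the pid map. The mutex is needed because timers
// run deletePid on their own goroutine.
type tree struct {
	mu    sync.Mutex
	pids  map[int]*process
	delay time.Duration // grace period between exit_group and deletePid
}

func newTree(delay time.Duration) *tree {
	return &tree{pids: make(map[int]*process), delay: delay}
}

// create records pid under ppid. It reports whether the parent
// was still known, i.e. whether the new node could be linked.
func (t *tree) create(ppid, pid int) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	parent, ok := t.pids[ppid]
	if ok {
		parent.children = append(parent.children, pid)
	}
	t.pids[pid] = &process{pid: pid, ppid: ppid}
	return ok
}

// exit does not delete at once; it schedules deletePid after the delay,
// so a late clone event can still find its parent.
func (t *tree) exit(pid int) {
	time.AfterFunc(t.delay, func() { t.deletePid(pid) })
}

func (t *tree) deletePid(pid int) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.pids, pid)
}

func (t *tree) has(pid int) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	_, ok := t.pids[pid]
	return ok
}

func main() {
	t := newTree(50 * time.Millisecond) // assumed value, tune as needed
	t.create(0, 1)                      // seed the root; parent 0 is unknown
	t.create(1, 100)
	t.exit(100)                         // exit_group of 100 arrives first
	linked := t.create(100, 101)        // the clone 100 -> 101 arrives late
	fmt.Println("late child linked to parent:", linked)
	time.Sleep(200 * time.Millisecond)
	fmt.Println("pid 100 still present:", t.has(100))
}
```

With the delay in one place, "suitable" becomes a matter of measuring how late related events can arrive and setting that one value.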
Next steps:
- Move the pids from the map into a database, which means we should
separate the front end from the back end. Also, when something is
removed (such as a process exiting), don't delete it from the
database; instead set a tag and record its exit code. In other
words, we judge whether a process is alive by its exit tag, not by
whether the entry exists.
- Record containers too, for instance rootFS, root process, name,
id, etc. Keep them in a map as well, and maintain this database table.
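The "tag instead of delete" idea can be sketched with an in-memory table standing in for the future database. Everything here is hypothetical (the `procRecord`, `containerRecord` and `procTable` types and their field names); it only shows that liveness is read from the exit tag rather than from the presence of a row.

```go
package main

import "fmt"

// procRecord is a hypothetical row of the process table.
type procRecord struct {
	Pid      int
	Ppid     int
	Exited   bool // the exit tag
	ExitCode int  // only meaningful when Exited is true
}

// containerRecord is a hypothetical row of the container table,
// holding the fields listed in the notes.
type containerRecord struct {
	ID      string
	Name    string
	RootFS  string
	RootPid int
}

// procTable stands in for the database. A real table would need a key
// beyond the pid alone, because the kernel reuses pids.
type procTable struct {
	rows map[int]*procRecord
}

func newProcTable() *procTable {
	return &procTable{rows: make(map[int]*procRecord)}
}

func (p *procTable) insert(ppid, pid int) {
	p.rows[pid] = &procRecord{Pid: pid, Ppid: ppid}
}

// markExited tags the row and keeps it. It reports whether the pid was known.
func (p *procTable) markExited(pid, code int) bool {
	r, ok := p.rows[pid]
	if !ok {
		return false
	}
	r.Exited = true
	r.ExitCode = code
	return true
}

// alive looks at the exit tag, not at whether the entry exists.
func (p *procTable) alive(pid int) bool {
	r, ok := p.rows[pid]
	return ok && !r.Exited
}

func main() {
	tbl := newProcTable()
	tbl.insert(1, 100)
	tbl.markExited(100, 137)

	c := containerRecord{ID: "abc123", Name: "demo", RootFS: "/example/rootfs", RootPid: 100}
	fmt.Println("container", c.Name, "root pid alive:", tbl.alive(c.RootPid))
	fmt.Println("exit code kept:", tbl.rows[100].ExitCode)
}
```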
|
|
This repo supervises all processes in containers; in other words, it
inspects the behavior of Docker containers and builds the pid tree.
There are several ways for programs in user space to interact with
kernel space:
- system calls, listed in the kernel source under arch/x86/syscalls
- ioctl
- the /proc virtual file system, for reading real-time kernel info
- netlink sockets
The pid we should pay attention to is that of /usr/bin/containerd,
which may be started by the docker-daemon service and has ppid 1. Each
time a container is started or stopped, containerd forks a pid, and
that pid forks again; the result is the main process of the container.
To grab pid creation and exit information, this program is based on
go-libauditd, which uses a netlink socket to receive audit logs from
the kernel. The worry is that one event is always split into several
entries, and the entries of several events may arrive interleaved.
So, as I see it, this program has 3 coroutines and 2 channels. The
first receives raw event messages from audit and sends them to
channel 1; the second listens on channel 1 and assembles each event
until it sees an EOE, then sends it to channel 2; the third takes
events from channel 2 and handles them, for example by creating or
deleting a pid. In particular, since two related pieces of
information (pid 1 forks pid 2, then pid 1 exits) may arrive out of
order, deletion must be delayed for some time (maybe 1 second) to
keep the process tree correct.
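The three-stage layout described above can be sketched as follows. This is a minimal illustration and not the repo's actual code: `rawMsg`, `event`, `assemble` and `runPipeline` are made-up names, the first stage replays a slice instead of reading the netlink socket, and the message texts are invented. Entries are grouped by an assumed per-event serial number so that interleaved events do not mix.

```go
package main

import "fmt"

// rawMsg is a hypothetical raw audit entry.
type rawMsg struct {
	seq  int    // serial number shared by all entries of one event
	kind string // record type, e.g. "SYSCALL" or "EOE"
	text string
}

// event is one fully assembled audit event.
type event struct {
	seq   int
	parts []string
}

// assemble is stage 2: it buffers entries per serial number and
// forwards an event only when its EOE entry arrives.
func assemble(in <-chan rawMsg, out chan<- event) {
	pending := make(map[int][]string)
	for m := range in {
		if m.kind == "EOE" {
			out <- event{seq: m.seq, parts: pending[m.seq]}
			delete(pending, m.seq)
			continue
		}
		pending[m.seq] = append(pending[m.seq], m.text)
	}
	close(out)
}

// runPipeline wires 3 goroutines with 2 channels and returns the
// events in the order stage 3 received them.
func runPipeline(msgs []rawMsg) []event {
	ch1 := make(chan rawMsg)
	ch2 := make(chan event)
	done := make(chan []event)

	go func() { // stage 1: receive raw messages (here: replay a slice)
		for _, m := range msgs {
			ch1 <- m
		}
		close(ch1)
	}()
	go assemble(ch1, ch2) // stage 2: group entries until EOE
	go func() {           // stage 3: handle complete events
		var got []event
		for e := range ch2 {
			got = append(got, e)
		}
		done <- got
	}()
	return <-done
}

func main() {
	// Two events whose entries arrive interleaved.
	msgs := []rawMsg{
		{1, "SYSCALL", "syscall=clone pid=100"},
		{2, "SYSCALL", "syscall=exit_group pid=100"},
		{1, "PROCTITLE", "proctitle=sh"},
		{2, "EOE", ""},
		{1, "EOE", ""},
	}
	for _, e := range runPipeline(msgs) {
		fmt.Println(e.seq, e.parts)
	}
}
```

Stage 3 sees events in the order their EOE entries arrive, which may differ from the order in which the events began; that is the reason the deletion delay is still needed.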