| Commit message (Collapse) | Author | Files | Lines |
|
For some reason, the kernel connector can catch exec events, but it
doesn't tell me what the process execs or what its args are.
So we use audit to collect this information and complete the record
in the database.
However, the connector and audit have different delays, although they
both use netlink sockets, so an exec may come before its fork. We
deal with it the same way as before. There are also lost exec events,
maybe because of the ppid check on exec events, but the check is
necessary: if it is deleted, too much irrelevant information floods
into the database (I've tried). So leave it there and move on.
Besides, a new discovery: pthread_create also uses the clone syscall,
but if pid 1 has a thread 2, the exec info will say that pid 2 execs.
So I shouldn't ignore connector msgs where childPid != childTgid.
This is my first attempt to use git submodules in my own project, and
also a local golang package. Congratulations!
Next, on to fixing the file operations. Hopefully there won't be too
many fucking bugs.
|
|
In this commit I successfully catch the open/close syscalls and insert
them into MongoDB as an independent collection instead of along with
the pids. I now also record opens with the "O_TRUNC" flag as written.
|
|
There are 2 bugs from ancestor commits:
- In the 'things_left' tag commit (the grandparent of this commit), we
added a function that allows execve to come before fork, but when that
happens, I forgot to insert the basic info (pid, ppid, etc.), so it
didn't work as designed. Now it works: we insert the execve with pid
and ppid, so that the fork event can find it and fill in the other
info. However, we shouldn't set start_stamp in this case, so its
absence also serves as a flag. I've not removed the unused execve
info yet; that waits for the future.
- In the parent commit, syscallRegex was changed, because as we add
more syscalls to watch, we need info about all their params, not only
the first one. Instead of using a single a0 to get the first param, I
use argsRegex for all the params. But this change caused syscallRegex
to mismatch. Now it's fixed.
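A minimal sketch of pulling every aN param out of a SYSCALL record with
one regex. The pattern, the parseArgs helper and the sample record are
assumptions for illustration, not the regex actually used in this repo:

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// argsRegex matches every aN=<hex> field (a0, a1, ...) of a SYSCALL record.
// The pattern is an assumption; the project's real regex may differ.
var argsRegex = regexp.MustCompile(`\ba(\d)=([0-9a-fA-F]+)`)

// parseArgs returns the syscall params keyed by position (a0 -> 0, a1 -> 1).
func parseArgs(record string) (map[int]uint64, error) {
	args := make(map[int]uint64)
	for _, m := range argsRegex.FindAllStringSubmatch(record, -1) {
		idx, err := strconv.Atoi(m[1])
		if err != nil {
			return nil, err
		}
		// audit prints syscall params as hex without a 0x prefix
		val, err := strconv.ParseUint(m[2], 16, 64)
		if err != nil {
			return nil, err
		}
		args[idx] = val
	}
	return args, nil
}

func main() {
	// Made-up record body, shaped like an audit SYSCALL line.
	record := "arch=c000003e syscall=2 success=yes exit=3 " +
		"a0=7ffd5c2a1f40 a1=241 a2=1b6 a3=0 items=1 ppid=100 pid=101"
	args, err := parseArgs(record)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("flags=%#x mode=%#o\n", args[1], args[2])
}
```

Keeping the params in a map keyed by position means a syscall that
needs a1 or a2 does not require another change to syscallRegex.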
|
|
To record it, we must listen to open/write and several other syscalls,
and I've now added open to the 2nd coroutine. For the open syscall,
what we should do is check the flags (the 2nd param of the syscall) to
find out whether the process can write to the file. If so, the exit
value is its file descriptor, which we need to keep, because when
write is called, audit shows only the file descriptor and no file name.
So the next step is to add things to the 3rd coroutine, get the whole
program running again, and find the bugs.
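The flag check and the fd-to-name bookkeeping described above could be
sketched like this. The flag values are the x86-64 Linux ones, defined
locally; fdTable and its methods are hypothetical names, and O_TRUNC is
counted as a write as in the commit above:

```go
package main

import "fmt"

// open(2) flag values on x86-64 Linux, defined locally so the sketch
// builds on any platform.
const (
	oAccMode = 0x3
	oWronly  = 0x1
	oRdwr    = 0x2
	oTrunc   = 0x200
)

// opensForWrite reports whether the open flags (2nd param of open)
// let the process modify the file. O_TRUNC counts as a write.
func opensForWrite(flags uint64) bool {
	mode := flags & oAccMode
	return mode == oWronly || mode == oRdwr || flags&oTrunc != 0
}

// fdKey identifies one open file of one process.
type fdKey struct{ pid, fd int }

// fdTable remembers which file name sits behind each writable fd,
// because a write record carries only the fd.
type fdTable map[fdKey]string

// onOpen records the fd (the exit value of open) if it is writable.
func (t fdTable) onOpen(pid, exit int, flags uint64, path string) {
	if exit < 0 || !opensForWrite(flags) {
		return // failed open, or read-only: nothing to track
	}
	t[fdKey{pid, exit}] = path
}

// onWrite resolves the fd of a write back to a file name.
func (t fdTable) onWrite(pid, fd int) (string, bool) {
	name, ok := t[fdKey{pid, fd}]
	return name, ok
}

// onClose forgets the fd so a reused number is not misattributed.
func (t fdTable) onClose(pid, fd int) {
	delete(t, fdKey{pid, fd})
}

func main() {
	files := fdTable{}
	// 0x241 is O_WRONLY|O_CREAT|O_TRUNC; the open returned fd 3.
	files.onOpen(101, 3, 0x241, "/tmp/out.log")
	if name, ok := files.onWrite(101, 3); ok {
		fmt.Println("pid 101 wrote to", name)
	}
	files.onClose(101, 3)
}
```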
|
|
The most important work during this time was finding a solution to
the out-of-order bug. To describe it in detail: info from audit may
be out of order, which means a fork may come after the execve, or even
after the exit. What an absurd phenomenon, to see a process that is
not yet created work or exit!
To deal with this problem, I've tried several ways:
- In the 2nd coroutine, when the EOE msg comes, if it's a fork/clone
event, send it immediately, otherwise wait for some time (such as
100 ms). But overall this adds more delay, and has other problems.
- The 2nd coroutine doesn't send directly, but records all finished
event ids in a slice, and another thread checks once per second: if
there is something in the slice, it sends the corresponding events in
order of event id. But the event that happens first doesn't always
have the lower id or time. For example, 1 forks 2, then 2 execves, and
audit in the kernel itself may get the execve before the fork (maybe
fork does some other setup), which means the execve has the earlier
timestamp and the lower event id. So the out-of-order problem is not
completely resolved. If we then add delays to non-clone events, a more
serious problem happens: we must use a mutex to lock the slice of
finished event ids to prevent races between the send thread and the
wait thread, but the wait thread can't get the mutex again, because
there are too many clone events and sends are too frequent!
- So I use no delay but MongoDB: when an execve comes, if its pid is
not recorded, just insert it and wait for the fork. It does work, but
some other work is still left to do:
  - What should I do if "2 forks 3" comes before "1 forks 2"? For now
  I assume it doesn't happen, but what if?
  - When an execve comes before its fork, I record it, but if this
  process has a parent I don't care about, should it be deleted, or
  stay there?
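A rough sketch of the "insert on execve, complete on fork" idea, using
an in-memory map as a stand-in for the MongoDB collection. The process
struct, its field names and the store type are assumptions, not the
real schema:

```go
package main

import "fmt"

// process is a hypothetical record; an empty StartStamp means
// "execve seen, fork not seen yet".
type process struct {
	Pid, Ppid  int
	StartStamp string
	Execve     [][]string // args of every execve from this pid
}

// store stands in for the database collection, keyed by pid.
type store map[int]*process

// onExecve appends the execve; if the pid is unknown, it first inserts
// a placeholder with pid and ppid and leaves StartStamp empty.
func (s store) onExecve(pid, ppid int, args []string) {
	p, ok := s[pid]
	if !ok {
		p = &process{Pid: pid, Ppid: ppid}
		s[pid] = p
	}
	p.Execve = append(p.Execve, args)
}

// onFork completes a waiting placeholder, or creates a fresh record
// (which also replaces a stale record of a reused pid).
func (s store) onFork(ppid, pid int, stamp string) {
	if p, ok := s[pid]; ok && p.StartStamp == "" {
		p.Ppid = ppid
		p.StartStamp = stamp
		return
	}
	s[pid] = &process{Pid: pid, Ppid: ppid, StartStamp: stamp}
}

func main() {
	db := store{}
	// The execve of pid 2 arrives before "1 forks 2".
	db.onExecve(2, 1, []string{"ls", "-l"})
	db.onFork(1, 2, "1600000000.123")
	fmt.Printf("%+v\n", *db[2])
}
```

Neither of the two open questions above is handled here: a placeholder
whose fork never arrives simply stays in the map.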
Also, as mentioned above, I've added an EXECVE field to the process
record in the db, which records every execve (time and args) from the
same process. Besides, exit_timestamp and exit_code can be caught now,
but too many processes have no exit info. This is also to be fixed.
Now, let's listen for the files changed by a process. Don't forget the
to-do work listed above!
|
|
I failed to print the process tree out. While I'm printing the tree,
the tree itself gets changed, maybe deleted. What's more, the output
shows that there are 4 lines with the same ppid and pid; what an
absurd result! It may be caused by multi-threading. So, use a database
instead. MongoDB stores data as BSON (binary JSON) documents rather
than in relational tables like MySQL, which means it's easier to
use (?).
Besides inserting, I've also solved the problem that "fork" is called
once but returns twice. For instance, when pid 1 forks pid 2, the
audit log shows not one event "syscall=clone,ppid=1,pid=2", but
actually two events, "syscall=clone,exit=0,ppid=0,pid=1" and
"syscall=clone,exit=2,ppid=0,pid=1", which is just what we see in
sys_fork in the kernel source.
To deal with this, when the syscall is clone and exit is 0, we just
drop it.
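In code, the rule could look like the sketch below. The syscallEvent
struct and the forkEdge helper are hypothetical names; dropping
negative exit values (a failed clone) is an extra assumption beyond
what the commit message says:

```go
package main

import "fmt"

// syscallEvent holds the few fields of a cooked audit event needed here.
type syscallEvent struct {
	Syscall string
	Exit    int // return value of the syscall
	Ppid    int
	Pid     int
}

// forkEdge returns the parent and child pid for the parent-side return
// of a clone. The record with exit=0 is dropped, as are failed clones.
func forkEdge(ev syscallEvent) (parent, child int, ok bool) {
	if ev.Syscall != "clone" || ev.Exit <= 0 {
		return 0, 0, false
	}
	// In the parent, clone returns the new pid, so Exit is the child.
	return ev.Pid, ev.Exit, true
}

func main() {
	events := []syscallEvent{
		{Syscall: "clone", Exit: 0, Ppid: 0, Pid: 1}, // dropped
		{Syscall: "clone", Exit: 2, Ppid: 0, Pid: 1}, // 1 forks 2
	}
	for _, ev := range events {
		if parent, child, ok := forkEdge(ev); ok {
			fmt.Printf("%d forks %d\n", parent, child)
		}
	}
}
```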
Open question: find out the exit code when a process calls
exit/exit_group, and finish the code to record it in the database.
|
|
Putting all the src code in only one file is too ugly, so divide it!
And mv the files into the src dir to keep the whole repo clean.
|
|
For some reason, the Linux kernel has made some changes to its
syscalls. As shown in the src code, we watch fork/vfork/clone for
process creation, and exit/exit_group for process termination. In my
opinion, the fork and clone syscalls should be totally different,
otherwise there would be only one syscall. However, according to the
logs, I see only clone and no fork, exit_group and no exit. In fact,
fork calls clone and then does some special setup. They're different:
fork means parent and child, while clone means caller and callee, and
allows sharing things between the two. Both fork and clone make a new
process; pthread makes tasks, which are called threads. A pid is
actually a task id.
Now the 3 coroutines work well, and I've got a process tree stored in
a map[int]*process. Some questions are hidden here:
- Is it right for the 2nd coroutine to send to the 3rd as soon as the
EOE arrives?
- How do we choose a clear and suitable delay between exit_group and
deletePid?
Next work:
- Move the pids from the map into a database, which means that we
should divide front end and back end. Besides, when something goes
away (such as a process exiting), don't delete it from the database;
instead just set a tag and record its exit code. In other words, we
judge whether a process is alive not by the existence of its entry
but by its exit tag.
- Record containers, for instance rootFS, root process, name, id,
etc. Record them in a map, and maintain this database table.
|
|
As planned, the first coroutine throws raw event information to the
second, which organizes all info for the same event according to the
event id, which is unique as long as the computer is not shut down.
I've encountered several difficulties, so I list their solutions
here to remember them:
- Raw info from the 1st coroutine is correct, but wrong when the 2nd
gets it; or it's correct when received, then a regular expr is matched
against it, the first match is in line with expectations, but the next
match goes totally wrong, and the info is different from what was
received.
Looking into the src of go-libaudit, we find that when it reads from
the netlink socket, the read buffer is always the same slice. It
first receives a long message, then **passes the original slice to
rawEvent.Data**, and then receives a shorter message. rawEvent.Data is
passed to the 2nd coroutine as **a pointer to rawEvent**, which means
all 3 parties use the same piece of memory. Then, when a shorter
message comes from the socket, the slice is not reallocated; instead
the data is written again into the same memory, so coroutine 2 gets
dirty data.
To deal with it, we change the type of the channel from pointer to
interface, and make a deep copy of rawEvent before passing it down.
As a result, the 2nd coroutine gets a copy of the message, not the
original, and it finally comes out right.
- A regular expr may look correct by design but still fail to match
the right string. There may be something wrong that can't be seen by
eye; try rewriting the expr, and that may fix it.
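The buffer-reuse problem from the first item above, and the deep-copy
fix, can be reproduced in a few lines. The rawEvent type here is a
local stand-in with assumed fields, not go-libaudit's own type, and
clone is a hypothetical helper:

```go
package main

import "fmt"

// rawEvent is a local stand-in: a message type plus a byte slice.
type rawEvent struct {
	Type int
	Data []byte
}

// clone returns a deep copy whose Data has its own backing array.
func (e rawEvent) clone() rawEvent {
	data := make([]byte, len(e.Data))
	copy(data, e.Data)
	return rawEvent{Type: e.Type, Data: data}
}

func main() {
	buf := make([]byte, 64) // the reader's single, reused buffer

	n := copy(buf, "first long message") // first read from the socket
	shared := rawEvent{Type: 1300, Data: buf[:n]}
	copied := shared.clone()

	copy(buf, "short") // next, shorter read lands in the same memory

	fmt.Printf("shared: %q\n", shared.Data) // head overwritten: dirty
	fmt.Printf("copied: %q\n", copied.Data) // still the first message
}
```

Sending the copy over the channel costs one allocation per message,
but the receiver no longer depends on when the next read happens.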
Also, there are some hidden dangers:
- The 2nd coroutine has no error checks, although the err variable is
set and caught to satisfy the compiler. We **shall** add them later.
- Is it reasonable to pass cooked event info immediately to the 3rd
coroutine without waiting for some time? Info from the network is out
of order after all.
Fight! Fight! Fight!
|