aboutsummaryrefslogtreecommitdiffstats
path: root/.gitignore (follow)
Commit message (Collapse)AuthorAgeFilesLines
* Add db structure, fix filePath, start filteringWe-unite2024-08-121-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | This commit I made several changes: - Use structure instead of simple bson.M(interface{}). bson.M has some shortcomings: 1) It makes the database in chaos and hard to read, but this's not important; 2) Some entrys may has more or less content than others, which makes it hard to decode and filt. So I design new data structure to encode and decode. Hopes that there's no bugs. - Fix the way to calculate file path. The original method is to add all the PATH entries together, that's totally wrong! PATH entry has several types, as it shows in "objtype". I can't find it in the kernel src code, so what i know is just "PARENT" means the dir the file is in, while the filename itself has the path, so we whould ignore all "PARENT"s. When the src code is found, we should check it again. - Fix bugs in updating. The update function of mongodb is set to required to has a '$' such as 'set'/'push', so when we update a whole doc, we should use replace but not update function. And, we should never ignore the error infomation it gives us. Hope that there's no more bugs for this Big Change. Now its' time to write filter as well as viewer. Best wishes with NO BUGS!
* Use netlink connector to recv pid info, fix execWe-unite2024-08-011-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | For some reasons, kernel-connector can catch exec event, but it doesn't tell me about what the process exec and what're its args. So we should use audit to collect these infomations, and complete in the database. However, there's different delays between connector and audit, although they both use netlink socket, as a result of which, exec may comes before fork. we deal with it the same way. But, there's also exec event lost, may because of the check for ppid in exec event, but it's necessary, and if is deleted, too much irrelavent infomation would flood into database, i've tried. So make it there, just go forward. Besides, what's newly discovered is that pthread_create also use clone syscall, but if pid 1 has a thread 2, the exec info will say that pid 2 execs. So i shouldn't ignore connector msg that childPid ne childTgid. This is my first attempt to use git-submodule function in my own pro- ject, also golang local package. Congratulations! Now, fight to fix about file operations. Hope that there wouldn't be too many fucking bugs.
* Fix execve before fork & Fix regex to match "exit"We-unite2024-07-261-0/+2
| | | | | | | | | | | | | | | | | | | There's 2 bugs from ancestor commits: - In the 'things_left' tag commit(the grandpa of this commit), we add a function that allows execve comes before fork, but when it happens, I forget to insert the basic info (pid, ppid, etc.), as a result of which it doesn't work in the designed way. Now it is well, insert execve with pid and ppid, so that the fork event can find it and finish other info. However, we shouldn't make start_stamp in this case, so that it's also a flag. I've not removed the unused execve info, waiting for the future. - In the parent commit, the syscallRegex is changed, because when we add more syscalls to be watched, we need more info about their params but not only the first one. Instead of keeping using single a0 to get the first param, i use argsRegex for all the params. But this change causes mismatch of syscallRegex. Now it's fixed.
* Use mongodb, insert process info into itWe-unite2024-07-221-3/+5
| | | | | | | | | | | | | | | | | | | | I failed to print the process tree out. While I'm printing the tree, the tree itself gets changed, maybe deleted. What's more, the output show that there are 4 lines with the same ppid and pid, how an absurd result! It may be caused by multi-thread. So, use database instead. Mongodb uses bson(binary json) to store data but not relational database like mysql, which means it's more easy to use.(?) Beside inserting, I've also solved a question that "fork" is called once but returns twice. For instance, pid 1 forked pid 2, in the audit log it's not an event "syscall=clone,ppid=1,pid=2", but actually two events "syscall=clone,exit=0,ppid=0,pid=1" and "syscall=clone,exit= 2,ppid=0,pid=1", which is just what we see in sys_fork in kernel source. To deal with this, when syscall is clone and exit is 0 we just drop it. Left question: To find out the exit code when a process exit/exit_group, and finish the code to record it in the database.
* Depart the whole program into several files.We-unite2024-07-191-0/+5
| | | | | Put all the src code in only one file is to ugly, so devide it! and mv them into src dir to keep the whole repo clear.
* Mainly finish the second coroutine, organize eventWe-unite2024-07-181-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As is planed, the first coroutine throw rae event infomation to the second, and it organizes all info for the same event accroding to event id, which is unique without shutdown of this computer. There's several defficuties I've encountered, so I list their solution here to remeber: - raw info from 1st coroutine is correct, but wrong when 2nd gets it; or it's correct while recieved, then regular expr goes to match it, the first match is inline with expectations, but the next match goes totally wrong, and the info is different from what is received. Look into the src of go-libaudit, we'll find out that when heard from netlink socket, the read buffer is always the same slice, it first received a long data, then **pass the origin slice to rawEvent.Data**, and then received a shorter data. rawEvent.Data is passed to 2nd coruntine as **a pointer to rawEvent**, which means all this 3 process use the same part of memory. Then, when a shorter info comes from socket, the slice won't be moved, otherwise it write aigin to this part of mem, then coroutine 2 will get a dirty data. To deal with it, we change the type of channel from pointer to interface, and make a deep copy of rawEvent before passing down. As a result, the 2nd coroutine gets a copy of message but not origin, it finally comes right. - While designing a regular expr, it's thought correct but miss matched from the right string. There maybe sth wrong that can't be discovered by people's eye, you can try to rewrite the expr, then it may be fixed. Also, there's some hidden dangers: - 2nd coroutine comes with no error checks althouth err variable is set and catched ubder the rules of compiler. we **shall** make it later. - Is it reasonable to pass cooked event info immediately to 3rd coroutine without waiting some time? Info from network is out of order after all. Fight! Fight! Fight!
* Initial commitWe-unite2024-07-171-0/+2
This repo is to supervise all processes in containers, in other words inspect behaviors of dockers, and get the pid tree. There are several ways for programs in user space to intereact with kernel space: - system calls, which can be found out in source path arch/x86/syscalls - ioctl - /proc virtual file system, to read kernel realtime info - nerlink socket the pid we should pay attention to is /usr/bin/containerd, which may come from service docker-daemon and ppid is 1. Each time a docker is start or stop, this forks a pid, the pid then forks, that's the main process of the docker. To grub the info of pid create or exit, this program is based on go-libauditd, which uses netlink socket to hear from kernel about audit log. What's worrying is that one event is always devided into several entries, and several events may be received alternately. So, from my point of view, which program has 3 coroutines and 2 channels. the first receives raw event message from audit, then throw it to channel 1; the second listen to channel 1, and organizes each event until there's a EOE, then throw to channel 2; the third discover event from channel 2, deal with th event, such as create or delete pid. Specially, since two relative infomation(pid 1 fork pid2, then pid 1 exits)may comes out of order, deletion mast be delayed for some time(may 1 second), to keep the process tree correct.