This post covers the Linux commands I use most often when analyzing logs on a server.
1. Common use cases
- Investigating the cause of a sudden spike in the number of requests to the server
- Checking whether anyone has broken in after a vulnerability is discovered on the server
2. Commonly used Linux commands
Command | Description |
---|---|
grep | Find lines containing the specified string or pattern |
awk | Split lines into fields and select rows by condition |
sort | Sort lines |
uniq | Remove adjacent duplicate lines and, optionally, count them |
wc | Count characters, words, and lines |
sed | Replace text that matches a pattern |
3. Subject of analysis

    # access_log
    203.0.113.1 - - [03/May/2020:12:00:00 +0900] "GET /index.html HTTP/1.1" 200 1000 "http://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
    203.0.113.1 - - [03/May/2020:12:10:00 +0900] "GET /index.html HTTP/1.1" 200 1000 "http://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
    203.0.113.2 - - [03/May/2020:12:20:00 +0900] "GET /index.html HTTP/1.1" 200 1000 "http://example.com/" "Mozilla/5.0 (Windows NT 6.3; Win64; x64)"
    203.0.113.2 - - [03/May/2020:12:30:00 +0900] "GET /index.html HTTP/1.1" 200 1000 "http://example.com/" "Mozilla/5.0 (Windows NT 6.3; Win64; x64)"
    203.0.113.2 - - [03/May/2020:12:40:00 +0900] "GET /index.html HTTP/1.1" 200 1000 "http://example.com/" "Mozilla/5.0 (Windows NT 6.3; Win64; x64)"
    198.51.100.3 - - [03/May/2020:12:50:00 +0900] "GET /index.cgi?page=<Script>alert('Evil')</Script> HTTP/1.1" 200 3000 "-" "Evil User Agent"
    198.51.100.3 - - [03/May/2020:13:00:00 +0900] "GET /../../../../../etc/shadow HTTP/1.1" 200 3000 "-" "Evil User Agent"

In this log:
- 203.0.113.0/24 is the address range of legitimate users
- 198.51.100.0/24 is the address range of a suspicious source
4. Application
4.1. Pattern search (grep)
・ Investigate directory traversal attacks
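
    $ grep -n "\.\./" access_log
    7:198.51.100.3 - - [03/May/2020:13:00:00 +0900] "GET /../../../../../etc/shadow HTTP/1.1" 200 3000 "-" "Evil User Agent"

The dots are escaped because an unescaped `.` matches any character, so the pattern `..` would match every line of the log.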
・ Investigate suspicious script insertion

    $ grep -i --color "<script>" access_log
    198.51.100.3 - - [03/May/2020:12:50:00 +0900] "GET /index.cgi?page=<Script>alert('Evil')</Script> HTTP/1.1" 200 3000 "-" "Evil User Agent"

・ Investigate suspicious HTML tag insertion

    $ grep -o -E "<[^>]+>[^<]+<[^>]+>" access_log
    <Script>alert('Evil')</Script>

grep options used above:
Option | Description |
---|---|
-n | Display the line number of each matching line |
-i | Ignore the difference between uppercase and lowercase |
--color | Highlight the matched text in color |
-o | Print only the matched part of the line |
-E | Use extended regular expressions |
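
These options can be combined in a single call. As a minimal sketch, the hypothetical helper `find_traversal` below searches for directory traversal attempts in plain form (`../`) and in percent-encoded form (`%2e%2e`). The encoded form does not appear in the sample log; it is included only as an assumption about what an attacker might send.

```shell
#!/bin/sh
# find_traversal FILE
# Hypothetical helper: print, with line numbers, every line that contains
# "../" or a percent-encoded ".." ("%2e%2e" in either case).
find_traversal() {
    # -n: show line numbers, -i: ignore case, -E: extended regex (for "|")
    grep -n -i -E '\.\./|%2e%2e' "$1"
}

# Usage: find_traversal access_log
```

On the sample log this should report only line 7, the request for `/../../../../../etc/shadow`.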
4.2. Exclude unnecessary log entries
・ Remove entries from legitimate sources

    $ grep -v "^203\.0\.113\." access_log
    198.51.100.3 - - [03/May/2020:12:50:00 +0900] "GET /index.cgi?page=<Script>alert('Evil')</Script> HTTP/1.1" 200 3000 "-" "Evil User Agent"
    198.51.100.3 - - [03/May/2020:13:00:00 +0900] "GET /../../../../../etc/shadow HTTP/1.1" 200 3000 "-" "Evil User Agent"

grep option used above:
Option | Description |
---|---|
-v | Print only the lines that do not match |
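
When there is more than one trusted range, several patterns can be passed to one `grep -v` call with repeated `-e` options. The sketch below assumes a second trusted prefix, `192.0.2.`, which is not part of the sample log and is used purely for illustration; `exclude_trusted` is a hypothetical name.

```shell
#!/bin/sh
# exclude_trusted FILE
# Hypothetical helper: drop every line whose client IP starts with one of
# the trusted prefixes. 192.0.2. is an assumed extra prefix, not from the log.
exclude_trusted() {
    # -v inverts the match; each -e adds one pattern; ^ anchors at line start
    grep -v -e '^203\.0\.113\.' -e '^192\.0\.2\.' "$1"
}

# Usage: exclude_trusted access_log
```

Anchoring with `^` keeps the pattern from matching the same digits elsewhere in the line, for example inside a referrer URL.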
4.3. Extract fields (awk)
・ Extract the client IP address

    $ awk '{print $1}' access_log
    203.0.113.1
    203.0.113.1
    203.0.113.2
    203.0.113.2
    203.0.113.2
    198.51.100.3
    198.51.100.3

・ Extract the client’s User Agent

    $ awk -F'"' '{print $6}' access_log
    Mozilla/5.0 (Windows NT 10.0; Win64; x64)
    Mozilla/5.0 (Windows NT 10.0; Win64; x64)
    Mozilla/5.0 (Windows NT 6.3; Win64; x64)
    Mozilla/5.0 (Windows NT 6.3; Win64; x64)
    Mozilla/5.0 (Windows NT 6.3; Win64; x64)
    Evil User Agent
    Evil User Agent

awk options used above:
Option | Description |
---|---|
{print $1} | Print the first field of each line |
-F | Specify the field separator (default is whitespace) |
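
awk can also select rows by condition, which the examples above do not show. As a rough sketch, the hypothetical helper `large_responses` prints the client IP and request path of every request that returned status 200 with a response larger than 2000 bytes. The 2000-byte threshold is an arbitrary assumption chosen to fit the sample log.

```shell
#!/bin/sh
# large_responses FILE
# Hypothetical helper. With the default whitespace separator the fields are:
#   $1 = client IP, $7 = request path, $9 = status code, $10 = response size
large_responses() {
    # The condition before the braces selects rows; the action prints two fields.
    awk '$9 == 200 && $10 > 2000 {print $1, $7}' "$1"
}

# Usage: large_responses access_log
```

On the sample log this should list only the two requests from 198.51.100.3, since every other response is 1000 bytes.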
4.4. Statistics and sorting (sort / uniq / wc)
・ Count the requests from each client IP address

    $ awk '{print $1}' access_log | sort | uniq -c | sort -rn
    3 203.0.113.2
    2 203.0.113.1
    2 198.51.100.3

・ Count the unique client IP addresses

    $ awk '{print $1}' access_log | sort | uniq | wc -l
    3

Commands and options used above:
Command, option | Description |
---|---|
uniq -c | Prefix each line with its number of occurrences (the input must be sorted first, because only adjacent duplicates are merged) |
sort -n | Sort by numeric value |
sort -r | Sort in descending order (default is ascending order) |
wc -l | Count the number of lines |
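
The same counting idea helps with the first use case, a sudden spike in requests. The sketch below counts requests per hour. It assumes the timestamp is the fourth whitespace-separated field, as in the sample log, and that entries are already in time order so that `uniq -c` can be used without sorting. `requests_per_hour` is a hypothetical name.

```shell
#!/bin/sh
# requests_per_hour FILE
# Hypothetical helper: count requests for each date-and-hour bucket.
requests_per_hour() {
    # $4 looks like "[03/May/2020:12:00:00"; skipping the "[" and taking
    # 14 characters yields "03/May/2020:12" (date plus hour).
    # Entries are assumed to be in time order, so equal buckets are adjacent.
    awk '{print substr($4, 2, 14)}' "$1" | uniq -c
}

# Usage: requests_per_hour access_log
```

On the sample log this should report 6 requests for `03/May/2020:12` and 1 for `03/May/2020:13`.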
4.5. Replace (sed)
・ Anonymize the source IP address of normal requests

    $ sed -E "s/203\.0\.113\.[0-9]{1,3}/xxx.xxx.xxx.xxx/g" access_log
    xxx.xxx.xxx.xxx - - [03/May/2020:12:00:00 +0900] "GET /index.html HTTP/1.1" 200 1000 "http://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
    xxx.xxx.xxx.xxx - - [03/May/2020:12:10:00 +0900] "GET /index.html HTTP/1.1" 200 1000 "http://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
    xxx.xxx.xxx.xxx - - [03/May/2020:12:20:00 +0900] "GET /index.html HTTP/1.1" 200 1000 "http://example.com/" "Mozilla/5.0 (Windows NT 6.3; Win64; x64)"
    xxx.xxx.xxx.xxx - - [03/May/2020:12:30:00 +0900] "GET /index.html HTTP/1.1" 200 1000 "http://example.com/" "Mozilla/5.0 (Windows NT 6.3; Win64; x64)"
    xxx.xxx.xxx.xxx - - [03/May/2020:12:40:00 +0900] "GET /index.html HTTP/1.1" 200 1000 "http://example.com/" "Mozilla/5.0 (Windows NT 6.3; Win64; x64)"
    198.51.100.3 - - [03/May/2020:12:50:00 +0900] "GET /index.cgi?page=<Script>alert('Evil')</Script> HTTP/1.1" 200 3000 "-" "Evil User Agent"
    198.51.100.3 - - [03/May/2020:13:00:00 +0900] "GET /../../../../../etc/shadow HTTP/1.1" 200 3000 "-" "Evil User Agent"

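sed options used above:
Option | Description |
---|---|
s/A/B/g | Replace text matching pattern A with string B. The trailing g replaces every match on a line instead of only the first. |
-E | Use extended regular expressions (needed for quantifiers such as {1,3}) |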
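
If the network part of an address should stay visible, a capture group can keep the first three octets and mask only the last one. This is a minimal sketch under the assumption that the client IP is an IPv4 address at the start of each line; `mask_last_octet` is a hypothetical name.

```shell
#!/bin/sh
# mask_last_octet FILE
# Hypothetical helper: replace the last octet of the leading IPv4 address
# on each line with "xxx".
mask_last_octet() {
    # The parentheses capture the first three octets; \1 puts them back.
    # No g flag is needed because only the address at line start is targeted.
    sed -E 's/^([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\.[0-9]{1,3}/\1.xxx/' "$1"
}

# Usage: mask_last_octet access_log
```

On the sample log, lines would then start with `203.0.113.xxx` or `198.51.100.xxx`, which still allows grouping by network.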
Conclusion
With the six commands above, most day-to-day log analysis becomes straightforward.
It is also common to chain them with the pipe character | so that each command works on the output of the previous one.
For example:

    $ grep -v "^203\.0\.113\." access_log | awk '{print $1}' | sort | uniq -c | sort -rn
    2 198.51.100.3

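
The same pattern works for other fields. As one more sketch, the hypothetical helper `suspicious_agents` ranks the User Agent strings of all requests that do not come from the legitimate range, reusing the quote-separated field from section 4.3.

```shell
#!/bin/sh
# suspicious_agents FILE
# Hypothetical helper: count User Agent strings of requests from outside
# the legitimate 203.0.113.0/24 range, most frequent first.
suspicious_agents() {
    grep -v '^203\.0\.113\.' "$1" |   # drop legitimate clients
        awk -F'"' '{print $6}' |      # 6th quote-separated field = User Agent
        sort | uniq -c | sort -rn     # count and rank
}

# Usage: suspicious_agents access_log
```

On the sample log this should print a single line showing a count of 2 for `Evil User Agent`.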