No. Given a bunch of Apache logs, I needed to find the top 10 queries that met conditions A and B. My approach:
1. grepping for the conditions and extracting the query with sed
2. appending a char to a flat file named after the query (i.e. increasing that query's count by 1)
3. sorting the files by size and taking the top 10
Parsing log files[0]:
grep -h cond_A ./*.log | grep cond_B | sed -E "s:.*(query).*:./results/\1.txt:" |
  while read -r f; do echo 1 >> "$f"; done
Finding top 10 results:
ls -lS ./results/*.txt | head
[0]: Untested code; `cond_A`, `cond_B`, and the `(query)` capture are placeholders for the real conditions and the regex that actually extracts the query, but you get the point.
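As an aside, the whole flat-file trick can be skipped by counting in memory instead. Here's an untested awk sketch of the same idea (again, cond_A, cond_B, and the query pattern are placeholders):

awk '/cond_A/ && /cond_B/ { if (match($0, /query/)) count[substr($0, RSTART, RLENGTH)]++ }
  END { for (q in count) print count[q], q }' ./*.log | sort -nr | head

match() records where the pattern hit via RSTART/RLENGTH, substr() pulls out the matched query text, and the END block dumps count/query pairs for the usual sort | head.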
You probably could have done the same thing in one line:
grep -h cond_A *.log | grep cond_B | sed 'some regex to get rid of dates, etc' | sort | uniq -c | sort -nr | head -n 10
Or something like that... Not very efficient, but it works in a pinch. I actually do something like this all the time for large datasets: by the time I'd have written something better, this set of commands has already finished.
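For what it's worth, if the logs are in Apache's combined format and the "query" you care about is the request line, that placeholder sed can be a single awk field split (hypothetical; adjust to whatever your lines actually look like):

grep -h cond_A *.log | grep cond_B | awk -F'"' '{ print $2 }' | sort | uniq -c | sort -nr | head -n 10  # $2 is the quoted "GET /path HTTP/1.1" field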