I was processing logs like this:
$ tail -1000000 process.log|grep 'PROCESSING N'|cut -d: -f2 > ids
$ cat ids
1
2
3
4
6
8
26
44
54
55
56
83
87
93
95
121
129
139
143
159
161
165
56
83
159
56
83
159
169
56
83
159
56
83
159
173
56
83
159
56
83
159
175
56
83
159

Then to get stats:
$ for id in `cat ids|sort -rn|uniq`; do echo `grep -w $id ids|wc -l` :$id; done|sort -nr
8 :83
8 :56
8 :159
1 :95
1 :93
1 :87
1 :8
1 :6
1 :55
1 :54
1 :44
1 :4
1 :3
1 :26
1 :2
1 :175
1 :173
1 :169
1 :165
1 :161
1 :143
1 :139
1 :129
1 :121
1 :1

Can this be simplified somehow?

Refactorings

Nick

November 1, 2008, 16:57

I'd write a small script for this. I've been using Ruby lately, so here's one solution:

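filename = 'ids'   # one ID per line, as produced by the tail/grep/cut pipeline
counts   = Hash.new(0)

File.foreach(filename) do |line|
  id = line[/\d+/]          # first run of digits, or nil if the line has none
  counts[id] += 1 if id
end

# Print "count :id", highest count first
counts.sort_by { |_id, count| -count }.each do |id, count|
  puts "#{count} :#{id}"
end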

Elij

November 1, 2008, 18:29

It's straightforward to do this with sort/uniq (see below). If you need to inject the colon, use awk or sed.

tail -1000000 process.log | grep 'PROCESSING N' | cut -d: -f2 | sort | uniq -c
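
To get closer to the output format in the question (count, then " :id", highest count first), one option is a second sort plus a small awk step. This is only a sketch: the function name count_ids is made up for illustration, and it assumes one ID per line on stdin with no embedded spaces.

```shell
# count_ids: hypothetical helper name; reads one ID per line on stdin
count_ids() {
  sort |                          # group identical IDs together
    uniq -c |                     # prefix each distinct ID with its count
    sort -rn |                    # highest count first
    awk '{ print $1 " :" $2 }'    # reformat "count id" as "count :id"
}

# Usage with the pipeline from the question:
# tail -1000000 process.log | grep 'PROCESSING N' | cut -d: -f2 | count_ids
```

This reads the data once instead of running grep once per distinct ID. The order of IDs that share the same count is not guaranteed to match the loop version.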
