本文实例讲述了python统计日志ip访问数的方法。分享给大家供大家参考。具体如下:
import re f=open(\"/tmp/a.log\",\"r\") arr={} lines = f.readlines() for line in lines: ipaddress=re.compile(r\'^#(((2[0-4]\\d|25[0-5]|[01]?\\d\\d?)\\.){3}(2[0-4]\\d|25[0-5]|[01]?\\d\\d?))\') match=ipaddress.match(line) if match: ip = match.group(1) if(arr.has_key(ip)): arr[ip]+=1 else: arr.setdefault(ip,1) f.close() for key in arr: print key+\"->\"+str(arr[key])
日志格式为:
#111.172.249.84 - - [12/Dec/2011:05:33:36 +0800] \"GET /images/i/goTop.png HTTP/1.0\" 200 486 \"http://wh.xxxx.com/\" \"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E)\" #111.172.249.84 - - [12/Dec/2011:05:33:36 +0800] \"GET /images/i/goTop.png HTTP/1.0\" 200 486 \"http://wh.xxxx.com/\" \"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E)\" #111.172.249.85 - - [12/Dec/2011:05:33:36 +0800] \"GET /images/i/goTop.png HTTP/1.0\" 200 486 \"http://wh.xxxx.com/\" \"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E)\" #111.172.249.86 - - [12/Dec/2011:05:33:36 +0800] \"GET /images/i/goTop.png HTTP/1.0\" 200 486 \"http://wh.xxxx.com/\" \"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E)\"
输出结果:
111.172.249.86->1 111.172.249.84->2 111.172.249.85->1
注释:python字段的setdefault用法为获取信息,如果获取不到的时候就按照他的参数设置该值
>>> a={} >>> a[\'key\']=\'123\' >>> print (a) {\'key\': \'123\'} >>> print (a.setdefault(\'key\',\'456\')) #显示a这个字典的\'key\'值的内容,因为字典有,所以不会去设置它 123 >>> print (a.setdefault(\'key1\',\'456\')) #显示a这个字典的\'key1\'值的内容,因为字典没有,所以设置为456了 456 >>> a {\'key1\': \'456\', \'key\': \'123\'}
希望本文所述对大家的Python程序设计有所帮助。