Python
Basic
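Basic - Basic operations
- Python division
    # Python 2
    10 / 3 = 3
    10 / 3.0 = 3.333333
    10 // 3 = 3
    10 // 3.0 = 3.0    # fractional part dropped (floored), but the result is still a float
    # Python 3
    10 / 3 = 3.333333
    10 / 3.0 = 3.333333
    10 // 3 = 3
    10 // 3.0 = 3.0    # fractional part dropped (floored), but the result is still a float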
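A small sketch of the Python 3 rules above, assuming a Python 3 interpreter. The variable names are only for illustration; the negative case is added because `//` floors toward negative infinity rather than truncating toward zero.

```python
# Python 3 division semantics
true_div = 10 / 3         # true division always returns a float
floor_int = 10 // 3       # floor division of two ints returns an int
floor_float = 10 // 3.0   # a float operand makes the floored result a float
neg_floor = -10 // 3      # floors toward negative infinity, not toward zero
quotient, remainder = divmod(10, 3)  # same as (10 // 3, 10 % 3)

print(true_div, floor_int, floor_float, neg_floor, quotient, remainder)
```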
- Check a variable's type
    type()
- ord, chr, unichr
    chr(65) = 'A'
    ord('a') = 97
    unichr(12345) = u'\u3039'    # Python 2 only; in Python 3, chr() covers all Unicode code points
Basic - eval, exec
-
An explanation
Bit manipulation
Bit manipulation - Base conversion
- Convert decimal to binary, octal, and hexadecimal
    bin(); oct(); hex()
    60 = 0b111100 = 0o74 = 0x3c
- Convert other bases to decimal
    int('101', 2) = 5
    int('17', 8) = 15
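A quick round trip between bases, sketched for Python 3 (where `oct()` gives the `0o` prefix). `format()` with a width is shown as one way to get zero-padded binary; the variable names are illustrative.

```python
n = 60

as_bin = bin(n)   # binary string with the '0b' prefix
as_oct = oct(n)   # octal string with the '0o' prefix (Python 3)
as_hex = hex(n)   # hexadecimal string with the '0x' prefix

# int() accepts the matching prefix when the base is given explicitly
back = [int(as_bin, 2), int(as_oct, 8), int(as_hex, 16)]

# format() produces the digits without a prefix, here zero-padded to 8 bits
padded = format(n, '08b')

print(as_bin, as_oct, as_hex, back, padded)
```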
Bit manipulation - Bit operations
- Binary operations
    &  : bitwise AND
    |  : bitwise OR
    ^  : bitwise XOR
    ~  : bitwise NOT (inversion)
    << : left shift
    >> : right shift
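The operators above applied to two sample values. The operands 60 and 13 are arbitrary choices for illustration.

```python
a = 60   # 0b111100
b = 13   # 0b001101

and_result = a & b    # bits set in both a and b
or_result = a | b     # bits set in either a or b
xor_result = a ^ b    # bits set in exactly one of a and b
not_result = ~a       # inversion; for Python ints this equals -a - 1
left_shift = a << 2   # shift left by 2 bits (multiply by 4)
right_shift = a >> 2  # shift right by 2 bits (floor-divide by 4)

print(and_result, or_result, xor_result, not_result, left_shift, right_shift)
```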
List
List - Sort
- Two methods
    a.sort()                       # sorts a in place and returns None
    a = sorted(a, reverse=True)    # returns a new sorted list
List - Counter
- count function
    a = [1,1,1,'1']
    a.count(1) = 3
    a.count('1') = 1
- Counting with collections.Counter
  http://www.zlovezl.cn/articles/collections-in-python/
    from collections import Counter
    s = '''A Counter i....'''.lower()
    c = Counter(s)
    # Get the 5 most frequent characters
    print c.most_common(5)
    # Result: [(' ', 54), ('e', 32), ('s', 25), ('a', 24), ('t', 24)]
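A self-contained Counter sketch for Python 3, using a short made-up word instead of the elided text above.

```python
from collections import Counter

word = "banana"            # illustrative input
counts = Counter(word)     # maps each character to its number of occurrences

top_two = counts.most_common(2)  # the two most frequent (char, count) pairs
missing = counts["z"]            # missing keys count as 0 instead of raising KeyError

print(top_two, missing)
```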
List - Join, Split
- http://wangwei007.blog.51cto.com/68019/1100587
    li = ['my','name','is','bob']
    '_'.join(li) = 'my_name_is_bob'
    'a  b'.split(' ') = ['a','','b']    # two spaces in the middle
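The empty string in the split result comes from splitting on a single space. A short comparison with the no-argument form, which splits on runs of whitespace:

```python
text = "a  b"  # two spaces between a and b

by_single_space = text.split(" ")  # every single space is a separator
by_whitespace = text.split()       # runs of whitespace are treated as one separator

joined = "_".join(["my", "name", "is", "bob"])

print(by_single_space, by_whitespace, joined)
```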
List - Extended Slices
- Official explanation
  https://docs.python.org/2/whatsnew/2.3.html#extended-slices
    L = range(11)    # L = [0,1,2,3,4,5,6,7,8,9,10]
    L[::2]           # = [0,2,4,6,8,10]
    L[0:10:2]        # = [0,2,4,6,8]
    L[::-1]          # = [10,9,8,7,6,5,4,3,2,1,0]
- Get a specific index range
    # list
    a = [1,2,3,4,5,6]
    a[:3] = [1,2,3]
    a[3:] = [4,5,6]
    a[-2:] = [5,6]
- Reverse
    a = [1,2,3]
    a[::-1] = [3,2,1]    # returns a new list
    a.reverse()          # updates a in place
List - copy
Ref: http://www.jb51.net/article/64030.htm
- Normally, = just creates another reference (an alias) to the same list
    # wrong
    a = [1,2,3,4,5]
    b = a
    b.append(6)
    print a    # [1,2,3,4,5,6]
- In most cases, either of the following two ways copies a list object
    # one possible answer
    b = a[:]
    b.append(6)
    print a    # [1,2,3,4,5]
    # another possible answer
    b = list(a)
- When the list contains other lists, you must use copy.deepcopy(); a[:] and list(a) copy only the outer list
    import copy
    a = [[1,2,3], [2,3,4]]
    b = copy.deepcopy(a)
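A sketch of why the nested case needs `copy.deepcopy()`: a slice copy shares the inner lists with the original, so a change to an inner list shows up in both. The names `shallow` and `deep` are illustrative.

```python
import copy

a = [[1, 2, 3], [2, 3, 4]]

shallow = a[:]            # new outer list, but the inner lists are shared
deep = copy.deepcopy(a)   # the inner lists are copied as well

a[0].append(99)           # mutate an inner list of the original

print(shallow[0])  # sees the change, because shallow[0] is a[0]
print(deep[0])     # unaffected
```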
List - Traverse
- Basic:
    >>> nums = [6, 7, 8, 9, 10]
    >>> for i in nums:
    ...     print i,
    6 7 8 9 10
- with index using enumerate:
    >>> nums = [6, 7, 8, 9, 10]
    >>> for idx, n in enumerate(nums):
    ...     print idx, n
    0 6
    1 7
    2 8
    3 9
    4 10
List - Special Usage
- range usage
    # Python 2, where range() returns a list; in Python 3 wrap it in list()
    range(10) == range(0,10) == range(0,10,1) == [0,1,2,3,4,5,6,7,8,9]
    range(10,0,-1) == [10,9,8,7,6,5,4,3,2,1] == range(1,11)[::-1]
- Last element
    a = [1,2,3,4]
    a[-1] = 4
- Get the index of an element
    # a.index(obj[, start_search_index])
    >>> a = [1,2,3,4,3,2,1]
    >>> a.index(3)         # 2
    >>> a.index(3, 2)      # 2
    >>> a.index(3, 3)      # 4
    >>> a.index(5)         # raises ValueError; the program exits if it is not caught
    >>> a.index(max(a))    # index of the max element
List - Bisect
- Official Intro: https://docs.python.org/2/library/bisect.html
- A handy module that inserts a value into a sorted list while keeping the list sorted
- Note: the list must already be sorted in ascending order
- Insert:
    import bisect as bi
    # insert to the left of any equal entries
    bi.insort_left(l, val)    # bi.insort_left(l, val, lo=0, hi=len(l))
    l.insert(bi.bisect_left(l, val), val)    # same as above
    # the two below are equivalent: insert to the right of any equal entries
    bi.insort_right(l, val)
    bi.insort(l, val)
- Find the insertion position
    import bisect as bi
    # return the index where val would be inserted (left of any equal entries)
    bi.bisect_left(l, val)    # bi.bisect_left(l, val, lo=0, hi=len(l))
    # the two below are equivalent: return the insertion index (right of any equal entries)
    bi.bisect_right(l, val)
    bi.bisect(l, val)
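A concrete run of the calls above on a small sorted list with a duplicate, to show how the left and right variants differ. The list contents are made up for illustration.

```python
import bisect

scores = [1, 3, 3, 5]   # must already be sorted in ascending order

left = bisect.bisect_left(scores, 3)    # index before the existing 3s
right = bisect.bisect_right(scores, 3)  # index after the existing 3s

bisect.insort(scores, 4)                # inserts 4 and keeps the list sorted

print(left, right, scores)
```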
Generator
Generator - Basic Usage
- Using ()
    >>> L = [x * x for x in range(10)]
    >>> L
    [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
    >>> g = (x * x for x in range(10))
    >>> g
    <generator object <genexpr> at 0x104feab40>
    >>> for n in g:
    ...     print n,
    ...
    0 1 4 9 16 25 36 49 64 81
- A yield example
    # An example
    def rev_str(my_str):
        length = len(my_str)
        for i in range(length - 1, -1, -1):
            yield my_str[i]

    for char in rev_str("hello"):
        print(char),    # o l l e h
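A Python 3 sketch of the same generator, stepped manually to show that values are produced lazily and that a generator can only be consumed once.

```python
def rev_str(my_str):
    """Yield the characters of my_str from last to first."""
    for i in range(len(my_str) - 1, -1, -1):
        yield my_str[i]

gen = rev_str("hello")

first = next(gen)      # runs the generator body up to the first yield
rest = "".join(gen)    # consumes the remaining characters
leftover = list(gen)   # the generator is now exhausted

print(first, rest, leftover)
```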
-
Ref
String
String - Encode, Decode
- Encode
  http://www.runoob.com/python/att-string-decode.html
    # Python 2 only: in Python 3, 'base64' is not a text encoding for str.encode()
    >>> str = "this is string example....wow!!!";
    >>> print "Encoded String: " + str.encode('base64','strict')
    Encoded String: dGhpcyBpcyBzdHJpbmcgZXhhbXBsZS4uLi53b3chISE=
- Decode
  http://www.runoob.com/python/att-string-encode.html
    # str.decode(encoding='UTF-8',errors='strict')
    >>> str = "this is string example....wow!!!";
    >>> str = str.encode('base64','strict');
    >>> print "Encoded String: " + str;
    >>> print "Decoded String: " + str.decode('base64','strict')
    Encoded String: dGhpcyBpcyBzdHJpbmcgZXhhbXBsZS4uLi53b3chISE=
    Decoded String: this is string example....wow!!!
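The `'base64'` string codec shown above exists only in Python 2. A possible Python 3 equivalent, sketched with the standard `base64` module, which works on bytes and so needs an explicit encode/decode step:

```python
import base64

text = "this is string example....wow!!!"

# str -> bytes -> Base64 bytes -> str
encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")

# str -> Base64 bytes -> original bytes -> str
decoded = base64.b64decode(encoded).decode("utf-8")

print("Encoded String: " + encoded)
print("Decoded String: " + decoded)
```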
Queue
Queue - Basic
- Official Intro: https://docs.python.org/2/tutorial/datastructures.html?highlight=queue
    >>> from collections import deque
    >>> queue = deque(["Eric", "John", "Michael"])
    >>> queue.append("Terry")     # Terry arrives
    >>> queue.append("Graham")    # Graham arrives
    >>> queue.popleft()           # 'Eric'
    >>> queue.popleft()           # 'John'
    >>> queue                     # deque(['Michael', 'Terry', 'Graham'])
Dict
Dict - Traverse
- Traverse:
    for k,v in d.items():
        print k,v
Dict - Sort
- Using operator
    # Sort by keys
    import operator
    x = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}
    sorted_x = sorted(x.items(), key=operator.itemgetter(0), reverse=True)
    for k,v in sorted_x:    # sorted() returns a list of (key, value) tuples, so they unpack directly
        ...
    # Sort by values
    import operator
    x = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}
    sorted_x = sorted(x.items(), key=operator.itemgetter(1), reverse=True)
- Using lambda
    # Sort by keys:
    sorted(d.items(), key=lambda x: x[0], reverse=True)
    # Sort by values:
    sorted(d.items(), key=lambda x: x[1], reverse=True)
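Both sort orders run on the sample dict from above, as a quick check of what `sorted()` gives back (a list of tuples, not a dict).

```python
import operator

x = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}

# Descending by key: itemgetter(0) picks the key out of each (key, value) pair
by_key = sorted(x.items(), key=operator.itemgetter(0), reverse=True)

# Descending by value: the lambda picks the value out of each pair
by_value = sorted(x.items(), key=lambda kv: kv[1], reverse=True)

print(by_key)
print(by_value)
```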
Set
Set - Basic Usage
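- Basic
    s.add('d')              # add one item
    s.update([10,37,42])    # add multiple items to s
    s.remove('H')           # remove an item; raises KeyError if it is missing
    x in s                  # test whether x is a member of s
    x not in s              # test whether x is not a member of s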
- Operators
    x & y    # intersection
    x | y    # union
    x - y    # difference
    x ^ y    # symmetric difference (items in x or y, but not in both)
- Others
    # test whether every element of s is in t
    s.issubset(t)
    s <= t
    # test whether every element of t is in s
    s.issuperset(t)
    s >= t
    # remove all elements from set s
    s.clear()
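The set operators and tests above on two small example sets. The set contents are made up for illustration.

```python
x = {"a", "b", "c"}
y = {"b", "c", "d"}

intersection = x & y    # items in both x and y
union = x | y           # items in x or y
difference = x - y      # items in x but not in y
symmetric = x ^ y       # items in exactly one of x and y

s = {"b", "c"}
is_subset = s <= x      # same as s.issubset(x)
is_superset = x >= s    # same as x.issuperset(s)

print(intersection, union, difference, symmetric, is_subset, is_superset)
```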
File
File - Traverse a directory (3 methods)
- using os.listdir
  Lists only the files and directories directly under the given folder; to go deeper you have to write the recursion yourself
    import os
    sep = os.sep    # get the system path separator
    root = "." + sep + "Desktop" + sep + "mywiki" + sep
    def traverse_dir(root):
        for f in os.listdir(root):
            full_path = os.path.join(root, f)    # join path
            if os.path.isfile(full_path):
                print full_path    # file
            if os.path.isdir(full_path):
                traverse_dir(full_path)    # dir, call traverse_dir again
- using os.walk
  Recursively walks through all subfolders for you
    # os.walk(top, topdown=True, onerror=None)
    # topdown=True : a directory is yielded before its subdirectories
    # topdown=False: a directory is yielded after its subdirectories
    import os
    def traverse_dir(rootDir):
        for root, dirs, files in os.walk(rootDir):
            print root, dirs, files
            for d in dirs:
                print os.path.join(root, d)
            for f in files:
                print os.path.join(root, f)
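A Python 3 sketch of what `topdown` changes in `os.walk`: only the order in which directories are yielded. It builds a throwaway tree in a temporary directory; the folder names `a` and `b` and the helper `walk_order` are hypothetical.

```python
import os
import tempfile

def walk_order(top, topdown):
    """Return directory paths relative to top, in the order os.walk yields them."""
    return [os.path.relpath(root, top)
            for root, dirs, files in os.walk(top, topdown=topdown)]

with tempfile.TemporaryDirectory() as top:
    # Build top/a/b/note.txt
    os.makedirs(os.path.join(top, "a", "b"))
    with open(os.path.join(top, "a", "b", "note.txt"), "w") as f:
        f.write("hello")

    top_down = walk_order(top, True)    # parents first
    bottom_up = walk_order(top, False)  # children first

print(top_down)
print(bottom_up)
```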
- using os.path.walk (Python 2 only; removed in Python 3)
  Takes a callback function and recursively walks through all subfolders
    # os.path.walk(top, func, arg)
    # func: callback function, must accept 3 args (arg, dirname, files)
    # arg: passed through to the callback, here as a tuple
    import os
    # callback function
    def find_file(arg, dirname, files):
        for file in files:
            file_path = os.path.join(dirname, file)
            if os.path.isfile(file_path):
                print "find file:%s" % file_path
    os.path.walk("./Desktop/mywiki", find_file, ())
-
Ref
File - Read csv
- First row is the column names
    import csv
    with open(path, 'r') as f:
        reader = csv.DictReader(f)
        # fields: No., Time, Source, Destination, Protocol, Length, Host, Info, Request URI Query Parameter
        for row in reader:
            print row["Info"]
- First row is data, so the column names must be specified
    import csv
    with open('names.csv', 'r') as csvfile:
        fieldnames = ['first_name', 'last_name']
        reader = csv.DictReader(csvfile, fieldnames=fieldnames)
- No column names; each row is a list instead
    import csv
    with open('some.csv', 'rb') as f:
        reader = csv.reader(f)
        for row in reader:
            print row
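The three reading styles side by side, sketched for Python 3. To stay self-contained it reads from in-memory `io.StringIO` objects instead of files on disk; the sample rows are made up.

```python
import csv
import io

# Hypothetical CSV content standing in for real files
with_header = io.StringIO("first_name,last_name\nEric,Idle\nJohn,Cleese\n")
without_header = io.StringIO("Eric,Idle\nJohn,Cleese\n")

# 1. Column names taken from the first row
rows_from_header = list(csv.DictReader(with_header))

# 2. First row is data, so the column names are supplied explicitly
rows_with_names = list(csv.DictReader(without_header,
                                      fieldnames=["first_name", "last_name"]))

# 3. No column names: each row is a plain list of strings
rows_as_lists = list(csv.reader(io.StringIO("Eric,Idle\n")))

print(rows_from_header[0]["last_name"], len(rows_with_names), rows_as_lists)
```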
Numpy
Numpy - array to list, list to array
- Usage
    import numpy as np
    l = [2, 3, 4, 5]
    arr = np.array(l)    # convert list to array
    l = arr.tolist()     # convert array to list
Numpy - get indices when sorting an array
- get indices
    >>> import numpy as np
    >>> arr = np.array([1, 10, 2, 4, 8])    # convert list to array
    >>> arr.argsort()    # ascending
    array([0, 2, 3, 4, 1])
    >>> arr.argsort()[::-1]    # descending
    array([1, 4, 3, 2, 0])
- get max / min n indices
    >>> import numpy as np
    >>> top_n = 3
    >>> arr = np.array([1, 3, 2, 4, 5])    # convert list to array
    >>> top_n_indices = arr.argsort()[-top_n:][::-1]    # get max n indices
    >>> top_n_indices
    array([4, 3, 1])
    >>> min_n_indices = arr.argsort()[:top_n]    # get min n indices
    >>> min_n_indices
    array([0, 2, 1])
Special
Special - with
- open one file
    with open("x.txt") as f:
        data = f.read()
        # do something with data
- open multiple files
    with open("x.txt") as f1, open('xxx.txt') as f2:
        # do something with f1,f2
Special - Levenshtein Distance
- Installation:
  Ref: https://github.com/ztane/python-Levenshtein/
  Ref: http://www.cnblogs.com/kaituorensheng/archive/2013/05/18/3085653.html
- Usage:
    import Levenshtein
    # Levenshtein distance (edit distance)
    # deletion, insertion, and substitution each cost 1
    Levenshtein.distance(str1, str2)
    # Levenshtein ratio, r = (sum - ldist) / sum
    # sum = len(str1) + len(str2)
    # but here deletion and insertion cost 1, while substitution costs 2
    Levenshtein.ratio(str1, str2)
    # Hamming distance
    # the two strings must have the same length
    Levenshtein.hamming(str1, str2)
    # Jaro distance
    Levenshtein.jaro(s1, s2)
    # Jaro–Winkler distance
    Levenshtein.jaro_winkler(s1, s2)
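To make the cost rules above concrete, here is a pure-Python sketch of the edit distance and of the ratio formula as described in these notes. It is not the library's implementation; `edit_distance` and `ratio` are illustrative helpers, and the `sub_cost` parameter is an assumption used to switch between the two costings.

```python
def edit_distance(s1, s2, sub_cost=1):
    """Edit distance by dynamic programming.

    Insertion and deletion cost 1; substitution costs sub_cost.
    """
    prev = list(range(len(s2) + 1))  # distances from the empty prefix of s1
    for i, c1 in enumerate(s1, start=1):
        curr = [i]                   # turning s1[:i] into "" takes i deletions
        for j, c2 in enumerate(s2, start=1):
            curr.append(min(
                prev[j] + 1,                                  # delete c1
                curr[j - 1] + 1,                              # insert c2
                prev[j - 1] + (0 if c1 == c2 else sub_cost),  # match or substitute
            ))
        prev = curr
    return prev[-1]

def ratio(s1, s2):
    """r = (sum - ldist) / sum, where ldist charges 2 for a substitution."""
    total = len(s1) + len(s2)
    if total == 0:
        return 1.0
    return (total - edit_distance(s1, s2, sub_cost=2)) / total

print(edit_distance("kitten", "sitting"))
print(ratio("kitten", "sitting"))
```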