Python

Basic

Basic - Basic Operations

  1. Python division

    # python2
    10 / 3 = 3
    10 / 3.0 = 3.333333
    10 // 3 = 3
    10 // 3.0 = 3.0 # floor division drops the fractional part, but the result is still a float
    
    # python3
    10 / 3 = 3.333333
    10 / 3.0 = 3.333333
    10 // 3 = 3
    10 // 3.0 = 3.0 # floor division drops the fractional part, but the result is still a float
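
The Python 3 rows above can be checked directly. The sketch below assumes Python 3 and adds two cases the table does not list (a negative operand and `divmod`) purely as illustration.

```python
# Assumes Python 3, where / is always true division.
true_div = 10 / 3        # 3.3333333333333335
floor_int = 10 // 3      # 3 (int)
floor_float = 10 // 3.0  # 3.0 (float)

# Floor division rounds toward negative infinity, not toward zero.
neg_floor = -10 // 3     # -4

# divmod returns the floor quotient and the remainder together.
quotient, remainder = divmod(10, 3)  # (3, 1)
```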
    
  2. Check a variable's type

    type()
    
  3. ord, chr, unichr

    chr(65) = 'A'
    ord('a') = 97
    unichr(12345) = u'\u3039' # Python 2 only; in Python 3, chr() handles all code points
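
For comparison, a minimal sketch of the same conversions assuming Python 3, where `str` is Unicode and `unichr` no longer exists. The variable names are only for illustration.

```python
# Assumes Python 3: chr() covers what unichr() did in Python 2.
letter = chr(65)               # 'A'
code = ord('a')                # 97
symbol = chr(12345)            # the character U+3039
round_trip = ord(chr(12345))   # back to 12345
```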
    

Basic - eval, exec

  1. An explanation

    http://www.mojidong.com/python/2013/05/10/python-exec-eval/
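
As a rough illustration of the distinction: `eval` evaluates a single expression and returns its value, while `exec` runs statements and returns `None`. This is a sketch assuming Python 3 (where `exec` is a function); the strings and variable names are made up for the example, and neither call should receive untrusted input.

```python
# eval: one expression in, its value out.
value = eval("1 + 2 * 3")            # 7

# exec: runs statements; assigned names land in the dict passed as globals.
namespace = {}
exec("total = sum(range(5))", namespace)
total = namespace["total"]           # 10

# exec itself always returns None.
result = exec("pass")
```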

Bit manipulation

Bit manipulation - Base Conversion

  1. Decimal to binary, octal, hexadecimal

    bin(); oct(); hex()
    60 = 0b111100 = 0o74 = 0x3c
    
  2. Other bases to decimal

    int('101',2) = 5
    int('17',8) = 15
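
A round trip through the conversions above, as a small sketch assuming Python 3 (where `oct` uses the `0o` prefix). The `format` calls are an addition for getting digits without a prefix.

```python
n = 60
as_bin, as_oct, as_hex = bin(n), oct(n), hex(n)   # '0b111100', '0o74', '0x3c'

# int() with an explicit base parses the digits back to decimal.
from_bin = int('111100', 2)   # 60
from_oct = int('74', 8)       # 60
from_hex = int('3c', 16)      # 60

# format() gives the digits without the 0b/0o/0x prefix, optionally zero-padded.
bare_bin = format(n, 'b')     # '111100'
padded = format(5, '08b')     # '00000101'
```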
    

Bit manipulation - Bit Operations

List

List - Sort

List - Counter

  1. The count method

    a = [1,1,1,'1']
    a.count(1) = 3
    a.count('1') = 1
    
  2. Counting with collections.Counter

    http://www.zlovezl.cn/articles/collections-in-python/

    from collections import Counter
    s = '''A Counter i....'''.lower()
    
    c = Counter(s)
    # get the 5 most frequent characters
    print c.most_common(5)
    
    # Result:
    [(' ', 54), ('e', 32), ('s', 25), ('a', 24), ('t', 24)]
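
A self-contained variant of the Counter example, assuming Python 3. The sample sentence is hypothetical, since the original string is elided above.

```python
from collections import Counter

# Hypothetical sample text; the original string is elided in the notes above.
text = "the quick brown fox jumps over the lazy dog".lower()

counts = Counter(text)
top3 = counts.most_common(3)      # list of (char, count), most frequent first

# Counter accepts any iterable of hashable items, e.g. words.
word_counts = Counter(text.split())
the_count = word_counts["the"]    # 2
missing = word_counts["cat"]      # 0, missing keys do not raise KeyError
```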
    

List - Join, Split

  1. Ref: http://wangwei007.blog.51cto.com/68019/1100587

    li = ['my','name','is','bob']
    '_'.join(li) = 'my_name_is_bob'
    'a  b'.split(' ') = ['a','','b'] # two spaces between a and b
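
The empty string in the last result comes from splitting on every single space. A short sketch (assuming Python 3) contrasting that with `split()` called without an argument:

```python
words = ['my', 'name', 'is', 'bob']
joined = '_'.join(words)              # 'my_name_is_bob'

# split(' ') splits on every single space, so runs of spaces yield empty strings.
by_space = 'a  b'.split(' ')          # ['a', '', 'b']

# split() with no argument splits on runs of whitespace and drops empties.
by_whitespace = 'a  b'.split()        # ['a', 'b']

# join needs strings, so convert other types first.
numbers = ','.join(str(n) for n in [1, 2, 3])   # '1,2,3'
```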
    

List - Extended Slices

  1. Official explanation

    https://docs.python.org/2/whatsnew/2.3.html#extended-slices

    L = range(11) # L = [0,1,2,3,4,5,6,7,8,9,10]
    L[::2] # = [0,2,4,6,8,10]
    L[0:10:2] # = [0,2,4,6,8]
    L[::-1] # = [10,9,8,7,6,5,4,3,2,1,0]
    
  2. Get a specific index range

    # list
    a = [1,2,3,4,5,6]
    a[:3] = [1,2,3]
    a[3:] = [4,5,6]
    a[-2:] = [5,6]
    
  3. Reverse

    a = [1,2,3]
    a[::-1] = [3,2,1]
    a.reverse() # reverses a in place and returns None
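
The difference between the copying and in-place forms is easy to miss, so here is a small sketch assuming Python 3. `reversed()` is an addition not listed above.

```python
a = [1, 2, 3]

b = a[::-1]            # new list [3, 2, 1]; a is unchanged at this point
c = list(reversed(a))  # reversed() returns an iterator, so wrap it in list()

ret = a.reverse()      # reverses a in place and returns None
```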
    

List - copy

Ref: http://www.jb51.net/article/64030.htm

  1. Normally, = just creates another reference (an alias) to the same list

    # wrong: b is the same list object as a
    a = [1,2,3,4,5]
    b = a
    b.append(6)
    print a # [1,2,3,4,5,6]
    
  2. In most cases, either of the following copies a list object

    # one option
    b = a[:]
    b.append(6)
    print a # [1,2,3,4,5]
    
    # another option
    b = list(a)
    
  3. When the list contains other lists, use copy.deepcopy(); a[:] and list(a) copy only the outer list

    import copy
    a = [[1,2,3], [2,3,4]]
    b = copy.deepcopy(a)
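
Why a slice copy is not enough for nested lists can be shown with a short sketch (assuming Python 3; the variable names are illustrative):

```python
import copy

a = [[1, 2, 3], [2, 3, 4]]

shallow = a[:]              # new outer list, but the inner lists are shared
deep = copy.deepcopy(a)     # inner lists are copied as well

a[0].append(99)             # mutate an inner list through the original

shallow_first = shallow[0]  # [1, 2, 3, 99], the change is visible
deep_first = deep[0]        # [1, 2, 3], the deep copy is isolated
```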
    

List - Traverse

  1. Basic:

    >>> nums = [6, 7, 8, 9, 10]
    >>> for i in nums:
    ...     print i,
    6 7 8 9 10
    
  2. with index using enumerate:

    >>> nums = [6, 7, 8, 9, 10]
    >>> for idx, n in enumerate(nums):
    ...     print idx, n
    0 6
    1 7
    2 8
    3 9
    4 10
    

List - Special Usage

  1. range usage

    range(10) == range(0,10) == range(0,10,1) == [0,1,2,3,4,5,6,7,8,9]
    range(10,0,-1) == [10,9,8,7,6,5,4,3,2,1] == range(1,11)[::-1]
    
  2. Last element

    a = [1,2,3,4]
    a[-1] = 4
    
  3. Get the index of an element

    # a.index(obj[, start_search_index])
    >>> a = [1,2,3,4,3,2,1]
    >>> a.index(3)      # 2
    >>> a.index(3, 2)   # 2
    >>> a.index(3, 3)   # 4
    >>> a.index(5)      # raises ValueError; the program exits if it is not caught
    >>> a.index(max(a)) # index of the largest element
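
Since `index` raises when the value is absent, a lookup is often wrapped. The sketch below assumes Python 3; `find` is a hypothetical helper, not a list method.

```python
a = [1, 2, 3, 4, 3, 2, 1]

first = a.index(3)        # 2
second = a.index(3, 3)    # 4, the search starts at index 3

# Hypothetical helper: return -1 instead of raising when the value is absent.
def find(seq, value):
    try:
        return seq.index(value)
    except ValueError:
        return -1

absent = find(a, 5)           # -1
max_index = a.index(max(a))   # 3
```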
    

List - Bisect

  1. Official Intro: https://docs.python.org/2/library/bisect.html

    • A handy module that inserts a value into a sorted list while keeping it sorted
    • Note: the list must already be sorted in ascending order
  2. Insert:

    import bisect as bi
    
    # insert to the left of any equal values
    bi.insort_left(l, val) # bi.insort_left(l, val, lo=0, hi=len(l))
    l.insert(bi.bisect_left(l, val), val) # same as above
    
    # The two calls below are equivalent: insert to the right of any equal values
    bi.insort_right(l, val)
    bi.insort(l, val)
    
  3. Find the insertion position

    import bisect as bi
    
    # return the index where val would be inserted (left of any equal values)
    bi.bisect_left(l, val) # bi.bisect_left(l, val, lo=0, hi=len(l))
    
    # The two calls below are equivalent: return the insertion index to the right of any equal values
    bi.bisect_right(l, val)
    bi.bisect(l, val)
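
A concrete run of the left and right variants on a list with duplicates, as a sketch assuming Python 3. The `contains` helper is hypothetical and only shows one common use of `bisect_left`.

```python
import bisect

scores = [10, 20, 20, 30]          # must already be sorted ascending

left = bisect.bisect_left(scores, 20)    # 1, before the existing 20s
right = bisect.bisect_right(scores, 20)  # 3, after the existing 20s

bisect.insort(scores, 25)                # scores is now [10, 20, 20, 25, 30]

# Hypothetical helper: binary-search membership test built on bisect_left.
def contains(sorted_list, value):
    i = bisect.bisect_left(sorted_list, value)
    return i < len(sorted_list) and sorted_list[i] == value
```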
    

Generator

Generator - Basic Usage

  1. Using ()

     >>> L = [x * x for x in range(10)]
     >>> L
     [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
     >>> g = (x * x for x in range(10))
     >>> g
     <generator object <genexpr> at 0x104feab40>
    
     >>> for n in g:
     ...     print n,
     ...
     0 1 4 9 16 25 36 49 64 81
    
  2. A yield example

     # An example
     def rev_str(my_str):  
         length = len(my_str)  
         for i in range(length - 1,-1,-1):  
             yield my_str[i]
     for char in rev_str("hello"):  
         print(char), # o l l e h
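
A generator keeps its position between calls and can be consumed only once. The sketch below reuses `rev_str` from above under Python 3 to show both points; the variable names are illustrative.

```python
def rev_str(my_str):
    """Yield the characters of my_str from last to first."""
    for i in range(len(my_str) - 1, -1, -1):
        yield my_str[i]

gen = rev_str("hello")
first = next(gen)            # 'o'
rest = ''.join(gen)          # 'lleh', the generator resumes where it stopped
leftover = list(gen)         # [], an exhausted generator yields nothing more
```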
    

String

String - Encode, Decode

  1. Encode

    http://www.runoob.com/python/att-string-encode.html

    # Python 2 only: the 'base64' codec is not available through str.encode() in Python 3
    >>> s = "this is string example....wow!!!"
    >>> print "Encoded String: " + s.encode('base64','strict')
    
    Encoded String: dGhpcyBpcyBzdHJpbmcgZXhhbXBsZS4uLi53b3chISE=
    
  2. Decode

    http://www.runoob.com/python/att-string-decode.html

    # str.decode(encoding='UTF-8',errors='strict')
    >>> s = "this is string example....wow!!!"
    >>> encoded = s.encode('base64','strict')
    >>> print "Encoded String: " + encoded
    >>> print "Decoded String: " + encoded.decode('base64','strict')
    
    Encoded String: dGhpcyBpcyBzdHJpbmcgZXhhbXBsZS4uLi53b3chISE=
    Decoded String: this is string example....wow!!!
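
The `'base64'` codec on `str` works only in Python 2. A minimal Python 3 sketch of the same round trip, using the standard `base64` module instead (a swapped-in approach, not the method shown above):

```python
import base64

text = "this is string example....wow!!!"

# b64encode works on bytes, so encode the text to UTF-8 first.
encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")

# b64decode returns bytes; decode them back to text.
decoded = base64.b64decode(encoded).decode("utf-8")
```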
    

Queue

Queue - Basic

Dict

Dict - Traverse

Dict - Sort

  1. Using operator

    # Sort by keys
    import operator
    x = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}
    sorted_x = sorted(x.items(), key=operator.itemgetter(0),reverse=True)
    for k,v in sorted_x: # sorted_x is a list of (key, value) tuples, so each item unpacks directly
       ...
    
    # Sort by values:
    import operator
    x = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}
    sorted_x = sorted(x.items(), key=operator.itemgetter(1),reverse=True)
    
  2. Using lambda

    # Sort by keys:
    sorted(d.items(), key=lambda x: x[0],reverse=True)
    
    # Sort by values:
    sorted(d.items(), key=lambda x: x[1],reverse=True)
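
The results for the sample dict used above, as a sketch assuming Python 3. The last line is an extra variant for when only the keys are needed.

```python
x = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}

by_key_desc = sorted(x.items(), key=lambda kv: kv[0], reverse=True)
by_value_desc = sorted(x.items(), key=lambda kv: kv[1], reverse=True)
# by_key_desc   -> [(4, 3), (3, 4), (2, 1), (1, 2), (0, 0)]
# by_value_desc -> [(3, 4), (4, 3), (1, 2), (2, 1), (0, 0)]

# If only the keys are needed, sort them by their value directly.
keys_by_value = sorted(x, key=x.get, reverse=True)   # [3, 4, 1, 2, 0]
```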
    

Set

Set - Basic Usage

  1. Basic

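    s.add('d') # add one item
    s.update([10,37,42]) # add multiple items to s
    s.remove('H') # remove an item; raises KeyError if it is absent
    x in s # test whether x is a member of s
    x not in s # test whether x is not a member of s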
    
  2. symbol

    x & y # intersection
    x | y # union
    x - y # difference
    x ^ y # symmetric difference (items in x or in y, but not in both)
    
  3. Others

    # test whether every element of s is in t
    s.issubset(t)
    s <= t
    
    # test whether every element of t is in s
    s.issuperset(t)
    s >= t
    
    # remove all elements from set s
    s.clear()
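
The operators above applied to two small sets, as a sketch assuming Python 3. `discard` is an addition that the notes do not mention.

```python
s = {1, 2, 3}
t = {2, 3, 4}

both = s & t          # {2, 3}
either = s | t        # {1, 2, 3, 4}
only_s = s - t        # {1}
exactly_one = s ^ t   # {1, 4}

is_subset = {2, 3} <= t      # True
is_superset = t >= {2, 3}    # True

# discard() is like remove() but does not raise KeyError for a missing item.
s.discard(42)
```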
    

File

File - Traverse a directory (3 methods)

  1. using os.listdir: lists only the files and directories directly inside the given folder; to go deeper, write the recursion yourself

    import os
    sep = os.sep # get the system path separator
    root = "." + sep + "Desktop" + sep + "mywiki" + sep
    def traverse_dir(root):
        for f in os.listdir(root):
            full_path = os.path.join(root, f) # join path
            if os.path.isfile(full_path):
                print full_path # file
            if os.path.isdir(full_path):
                traverse_dir(full_path) # dir, call traverse_dir again
    
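  2. using os.walk: recursively traverses every subdirectory for you

    # os.walk(top, topdown = True, onerror = None)
    # topdown = True : a directory is yielded before its subdirectories (pre-order)
    # topdown = False: a directory is yielded after its subdirectories (post-order)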
    
    import os 
    def traverse_dir(rootDir): 
        for root, dirs, files in os.walk(rootDir): 
            print root, dirs, files
            for d in dirs: 
                print os.path.join(root, d)      
            for f in files: 
                print os.path.join(root, f)
    
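  3. using os.path.walk (Python 2 only; removed in Python 3): takes a callback function and also recursively traverses every subdirectory

    # os.path.walk(top, func, arg)
    # func: callback, called as func(arg, dirname, names) for each directory
    # arg:  passed through unchanged to every call of func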
    
    import os
    # call back func
    def find_file(arg, dirname, files):
        for file in files:
            file_path = os.path.join(dirname, file)
            if os.path.isfile(file_path):
                print "find file:%s" % file_path
    
    os.path.walk("./Desktop/mywiki", find_file, ())
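
Because `os.path.walk` exists only in Python 2, a Python 3 version would typically rely on `os.walk`. The sketch below assumes Python 3; `list_files` is a hypothetical helper, and the temporary directory tree exists only so the example runs anywhere.

```python
import os
import tempfile

def list_files(root_dir):
    """Return the paths of all regular files under root_dir, recursively."""
    found = []
    for root, dirs, files in os.walk(root_dir):
        for name in files:
            found.append(os.path.join(root, name))
    return sorted(found)

# Build a small throwaway tree: a.txt and sub/b.txt.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(tmp, rel), "w") as f:
            f.write("x")
    relative = [os.path.relpath(p, tmp) for p in list_files(tmp)]
```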
    

File - Read csv

  1. First Row is column name

    import csv
    
    with open(path, 'r') as f:
        reader = csv.DictReader(f)
        # fields: No., Time, Source, Destination, Protocol, Length, Host, Info, Request URI Query Parameter
        for row in reader:
            print row["Info"]
    
  2. First row is data, so the column names must be specified

    import csv

    with open('names.csv', 'r') as csvfile:
        fieldnames = ['first_name', 'last_name']
        reader = csv.DictReader(csvfile, fieldnames=fieldnames)
        for row in reader:
            print row['first_name']
    
  3. Does not use column name, row is a list instead

    with open('some.csv', 'rb') as f:
        reader = csv.reader(f)
        for row in reader:
            print row
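
The three reading modes side by side, as a sketch assuming Python 3. The CSV contents are hypothetical, and `io.StringIO` stands in for an opened file so that nothing needs to exist on disk.

```python
import csv
import io

# Header row present: DictReader takes the column names from the first row.
with_header = io.StringIO("first_name,last_name\nAda,Lovelace\n")
rows = list(csv.DictReader(with_header))
last = rows[0]["last_name"]            # 'Lovelace'

# No header row: supply the column names through fieldnames.
no_header = io.StringIO("Ada,Lovelace\nAlan,Turing\n")
named = list(csv.DictReader(no_header, fieldnames=["first_name", "last_name"]))
firsts = [r["first_name"] for r in named]   # ['Ada', 'Alan']

# Plain reader: each row is a list of strings.
plain = list(csv.reader(io.StringIO("1,2\n3,4\n")))   # [['1', '2'], ['3', '4']]
```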
    

Numpy

Numpy - array to list, list to array

Numpy - get indices when sorting an array

  1. get indices

    >>> import numpy as np
    >>> arr = np.array([1, 10, 2, 4, 8]) # convert list to array
    
    >>> arr.argsort() # ascending
    array([0, 2, 3, 4, 1])
    
    >>> arr.argsort()[::-1] # descending
    array([1, 4, 3, 2, 0])
    
  2. get max / min n indices

    >>> import numpy as np
    >>> top_n = 3
    >>> arr = np.array([1, 3, 2, 4, 5]) # convert list to array
    
    >>> arr.argsort()[-top_n:][::-1] # indices of the n largest values
    array([4, 3, 1])
    
    >>> arr.argsort()[:top_n] # indices of the n smallest values
    array([0, 2, 1])
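
For plain lists, the same idea can be sketched without NumPy by sorting the index range with the values as the key. This is a standard-library analogue assuming Python 3, not NumPy's implementation.

```python
values = [1, 10, 2, 4, 8]

# Indices that would sort the list ascending, analogous to arr.argsort().
order = sorted(range(len(values)), key=values.__getitem__)   # [0, 2, 3, 4, 1]

top_n = 3
largest_first = order[::-1][:top_n]    # [1, 4, 3]
smallest_first = order[:top_n]         # [0, 2, 3]
```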
    

Special

Special - with

  1. open one file

    with open("x.txt") as f:
        data = f.read()
        # do something with data
    
  2. open multiple files

    with open("x.txt") as f1, open('xxx.txt') as f2:
        pass # do something with f1, f2
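
What `with` guarantees can be seen with a toy context manager: the exit step runs even when the body raises, and several managers exit in reverse order. This is a sketch assuming Python 3; `tracked` and `events` are hypothetical names.

```python
from contextlib import contextmanager

events = []

@contextmanager
def tracked(name):
    """Hypothetical context manager that records enter and exit."""
    events.append("enter " + name)
    try:
        yield name
    finally:
        events.append("exit " + name)   # runs even if the body raises

try:
    with tracked("a") as first, tracked("b") as second:
        raise RuntimeError("boom")
except RuntimeError:
    pass
```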
    

Special - Levenshtein Distance

  1. Installation:

    Ref: https://github.com/ztane/python-Levenshtein/

    Ref: http://www.cnblogs.com/kaituorensheng/archive/2013/05/18/3085653.html

  2. Usage:

    import Levenshtein
    
    # Calc Levenshtein Distance (or Edit Distance)
    # deletion, insertion, and substitution each cost 1
    Levenshtein.distance(str1, str2)
    
    # Calc Levenshtein Ratio, r = (sum - ldist) / sum
    # sum = len(str1) + len(str2)
    # but here deletion and insertion cost 1, substitution costs 2
    Levenshtein.ratio(str1, str2)
    
    # Calc Hamming Distance
    # len must be the same
    Levenshtein.hamming(str1, str2)
    
    # Calc Jaro Distance
    Levenshtein.jaro(s1, s2)
    
    # Calc Jaro–Winkler Distance
    Levenshtein.jaro_winkler(s1, s2)
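
To make the cost model concrete, here is a pure-Python sketch of the edit distance and the Hamming distance, assuming Python 3. It is a textbook dynamic-programming version written for illustration, not the library's implementation, and the function names are hypothetical.

```python
def edit_distance(s1, s2):
    """Levenshtein distance: insert, delete, and substitute each cost 1."""
    previous = list(range(len(s2) + 1))   # distances from "" to each prefix of s2
    for i, c1 in enumerate(s1, start=1):
        current = [i]                     # distance from s1[:i] to ""
        for j, c2 in enumerate(s2, start=1):
            current.append(min(
                previous[j] + 1,               # delete c1
                current[j - 1] + 1,            # insert c2
                previous[j - 1] + (c1 != c2),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]

def hamming(s1, s2):
    """Number of differing positions; the lengths must match."""
    if len(s1) != len(s2):
        raise ValueError("strings must have the same length")
    return sum(a != b for a, b in zip(s1, s2))
```

For example, `edit_distance("kitten", "sitting")` gives 3 (two substitutions and one insertion).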