Python

2021-01-25

  • 1 Basics
    • 1.1 String and List: Query
      • 1.1.1 Length
      • 1.1.2 Check if a list is empty
      • 1.1.3 Index a list with another list like this
      • 1.1.4 Count number of True in list
    • 1.2 String and List: Modify
      • 1.2.1 Remove element by index in a list
      • 1.2.2 Sort string by the number inside
      • 1.2.3 Remove specific substrings from a set of strings
      • 1.2.4 Modify items in string by slicing and combination
      • 1.2.5 Remove all special characters, punctuation and spaces from a string
      • 1.2.6 Replace whitespace with another character
      • 1.2.7 Get the lower-case of string
      • 1.2.8 Trim white spaces in string:
    • 1.3 String and List: Conversion
      • 1.3.1 Convert float type to string
      • 1.3.2 Convert string to list
      • 1.3.3 Convert list to string:
      • 1.3.4 Check a string represents an int
    • 1.4 String and List: Search
      • 1.4.1 Search str1 in str2
      • 1.4.2 Check exact word in strings
      • 1.4.3 Extract number from strings
      • 1.4.4 Find the first item matched
      • 1.4.5 Get all the terms matched:
      • 1.4.6 Get the index of all items that contain specific string
    • 1.5 dict
      • 1.5.1 Basics
      • 1.5.2 list(dictionary.keys()) raises error in pdb
      • 1.5.3 Convert to Other Types
    • 1.6 range
      • 1.6.1 convert range to list
    • 1.7 tuple
    • 1.8 bytes
    • 1.9 IO
      • 1.9.1 Read Text File
      • 1.9.2 Read a specific line
      • 1.9.3 Write Text File
      • 1.9.4 Booleans Formatted in Strings
      • 1.9.5 Print without Newline
      • 1.9.6 Print to file
    • 1.10 Function
      • 1.10.1 Default Parameter
    • 1.11 Control Statement
      • 1.11.1 if-else
      • 1.11.2 while Loop
  • 2 Numpy and Scipy
    • 2.1 Basics
      • 2.1.1 ND array to 1D
    • 2.2 Advanced
      • 2.2.1 Normality test
  • 3 Advanced
    • 3.1 Tips
      • 3.1.1 Check the normality of an array
      • 3.1.2 Create a nested directory
      • 3.1.3 Explanation on if __name__ == "__main__"
      • 3.1.4 Test a Variable is None
      • 3.1.5 Output Complex Number in Certain Format
      • 3.1.6 Check Python version
      • 3.1.7 How do I manually throw/raise an exception in Python?
  • 4 Packages
    • 4.1 os
      • 4.1.1 Iterate over files in a directory
      • 4.1.2 Get the full path to the directory a Python file is contained in
      • 4.1.3 Excluding directories in os.walk
      • 4.1.4 Extract file name from path
    • 4.2 subprocess
      • 4.2.1 Calling the source command from subprocess.Popen
    • 4.3 colorama
      • 4.3.1 Print colored output with Python 3
    • 4.4 pip
      • 4.4.1 Uninstall all packages
      • 4.4.2 Uninstall the dependencies of a package
    • 4.5 pyinstaller
      • 4.5.1 Cannot run on Windows 10
      • 4.5.2 pyinstaller 3.5 Not Working with python3.8
    • 4.6 argparse
      • 4.6.1 Example 1
      • 4.6.2 Example 2
    • 4.7 time Related Modules
      • 4.7.1 Measure Time Elapsed in Python
      • 4.7.2 Get Time from String
      • 4.7.3 Time Delay
    • 4.8 mpmath
      • 4.8.1 Example of Calculation
  • 5 Misc
    • 5.1 System
      • 5.1.1 Shebang
    • 5.2 Mixed Usage with Other Software
      • 5.2.1 Compile Python with sqlite3 Support
      • 5.2.2 iPython Notebook Font

1 Basics

1.1 String and List: Query

1.1.1 Length

Get the length of string or list

len(StringArray)

1.1.2 Check if a list is empty

if not my_list:

1.1.3 Index a list with another list like this

L = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
Idx = [0, 3, 7]
T = L[Idx]  # TypeError: list indices must be integers or slices, not list
T = [L[i] for i in Idx]  # ['a', 'd', 'h']

Single-line loop using a list comprehension:

>>> squares = []
>>> for x in range(10):
...     squares.append(x**2)
...
>>> squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

The loop above can be written as a single list comprehension:

squares = [x**2 for x in range(10)]

1.1.4 Count number of True in list

>>> sum([True, True, False, False, False, True])
3

1.2 String and List: Modify

1.2.1 Remove element by index in a list

>>> a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> del a[-1]
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8]
>>> del a[2:4]
>>> a
[0, 1, 4, 5, 6, 7, 8]

1.2.2 Sort string by the number inside

In [1]: import re
In [2]: l = ['asdas2', 'asdas1', 'asds3ssd']
In [3]: sorted(l, key=lambda x: int(re.sub(r'\D', '', x)))
Out[3]: ['asdas1', 'asdas2', 'asds3ssd']

where re.sub(r'\D', '', x) removes every non-digit character, leaving only the digits.

1.2.3 Remove specific substrings from a set of strings

>>> x = 'Pear.good'
>>> y = x.replace('.good','')
>>> y
'Pear'
>>> x
'Pear.good'

1.2.4 Modify items in string by slicing and combination

>>> a = '0034'
>>> a[1] = '1'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment
>>> a = a[:1] + '1' + a[2:]
>>> print(a)
0134
>>> a = a[:2] + '2' + a[2:]
>>> print(a)
01234

1.2.5 Remove all special characters, punctuation and spaces from a string

>>> string = "Special $#! characters spaces 888323"
>>> ''.join(e for e in string if e.isalnum())
'Specialcharactersspaces888323'

1.2.6 Replace whitespace with another character

mystring.replace(" ", "_")

1.2.7 Get the lower-case of string

s = "Kilometer"
print(s.lower())

1.2.8 Trim white spaces in string:

string.strip()

Whitespace on the right side:

string.rstrip()

Whitespace on the left side:

string.lstrip()

1.3 String and List: Conversion

1.3.1 Convert float type to string

'%.2f' % 1.234 # output: '1.23'
'%.2f' % 5.0 # output: '5.00'

1.3.2 Convert string to list

states = "Alaska Alabama Arkansas American Samoa Arizona California Colorado"
print(states.split())
>>> 'ab c\n\nde fg\rkl\r\n'.splitlines()
['ab c', '', 'de fg', 'kl']

1.3.3 Convert list to string:

items = ['1', '2', '3']  # avoid shadowing the built-in names list and str
joined = ''.join(items)

1.3.4 Check a string represents an int

def RepresentsInt(s):
    try: 
        int(s)
        return True
    except ValueError:
        return False
>>> print(RepresentsInt("+123"))
True
>>> print(RepresentsInt("10.0"))
False

Alternatively, use a regular expression.
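
The regular-expression variant could look like the sketch below. The helper name represents_int_re and the pattern are illustrative assumptions; the pattern accepts an optional sign followed by ASCII digits, which is narrower than what int() accepts (surrounding whitespace is rejected, for instance).

```python
import re

# Hypothetical helper: optional sign followed by one or more ASCII digits
_INT_PATTERN = re.compile(r'[+-]?[0-9]+')

def represents_int_re(s):
    """Return True if the whole string looks like a signed integer."""
    return _INT_PATTERN.fullmatch(s) is not None

print(represents_int_re("+123"))  # True
print(represents_int_re("10.0"))  # False
```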

1.4 String and List: Search

1.4.1 Search str1 in str2

s.index(sub[, start[, end]])   # lowest index; raises ValueError if not found
s.rindex(sub[, start[, end]])  # highest index; raises ValueError if not found
s.find(sub[, start[, end]])    # like index, but returns -1 if not found
s.rfind(sub[, start[, end]])   # like rindex, but returns -1 if not found

1.4.2 Check exact word in strings

import re
s = '98787This is correct'
for words in ['This is correct', 'This', 'is', 'correct']:
    if re.search(r'\b' + words + r'\b', s):
        print('{0} found'.format(words))

1.4.3 Extract number from strings

import re
re.findall(r'\d+', 'hello 42 I\'m a 32 string 30')
['42', '32', '30']

Check if some string exists in a list:

some_list = ['abc-123', 'def-456', 'ghi-789', 'abc-456']
if any("abc" in s for s in some_list):
    print("found")  # replace with whatever should happen on a match

1.4.4 Find the first item matched

[i for i in range(100000) if i == 1000][0]
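
Building the whole list just to take element 0 scans every candidate. A sketch using the built-in next with a generator expression stops at the first match; returning None when nothing matches is a choice made for this example.

```python
# Stop at the first match instead of building the full list
first = next((i for i in range(100000) if i == 1000), None)
print(first)  # 1000

# With no match, the default (None here) is returned instead of raising StopIteration
missing = next((i for i in range(10) if i == 1000), None)
print(missing)  # None
```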

1.4.5 Get all the terms matched:

matching = [s for s in some_list if "abc" in s]

1.4.6 Get the index of all items that contain specific string

indices = [i for i, s in enumerate(mylist) if 'aa' in s]

1.5 dict

1.5.1 Basics

mydict.keys()
mydict.values()

1.5.2 list(dictionary.keys()) raises error in pdb

The problem is that list is also a pdb debugger command. Prefix the expression with an exclamation point so that pdb runs it as Python:

!list(error['extras'].keys())

Commands that the debugger doesn’t recognize are assumed to be Python statements and are executed in the context of the program being debugged. Python statements can also be prefixed with an exclamation point (!).

1.5.3 Convert to Other Types

list(mydict.keys())

1.6 range

In Python 3, range is not a function but an immutable sequence type.
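
Because range is a sequence type, it supports len, indexing, slicing and membership tests without building a list. A short illustration (the variable name is arbitrary):

```python
r = range(1, 1001)

print(len(r))        # 1000
print(r[0], r[-1])   # 1 1000
print(r[10:13])      # range(11, 14)
print(500 in r)      # True
```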

1.6.1 convert range to list

my_list=list(range(1,1001))

1.7 tuple

A tuple is an immutable sequence of Python objects.

  1. Tuples are sequences, just like lists

  2. Tuples cannot be changed unlike lists

  3. Tuples use parentheses, whereas lists use square brackets

  4. zip can combine two arrays into a list of tuples. In Python 3, zip returns an iterator, so wrap it in list() to get the list.

    X = np.linspace(0.0, 1.0, 5)
    Y = X**2
    list(zip(X, Y))
    # values: [(0.0, 0.0), (0.25, 0.0625), (0.5, 0.25), (0.75, 0.5625), (1.0, 1.0)]

1.8 bytes

In Python 3, bytes is a data type distinct from str. To convert between bytes and str, use the following commands.

b = b"example"
s = "example"
bytes(s, encoding = "utf8")  # str -> bytes
str(b, encoding = "utf-8")   # bytes -> str
b.decode("utf-8")            # bytes -> str

1.9 IO

Prompt the user for input

  1. raw_input in Python 2.X
  2. input in Python 3

1.9.1 Read Text File

# The with block closes the file automatically
with open("test.txt", "r") as f:
    content = f.readlines()

1.9.2 Read a specific line

with open('xxx.txt') as f:
    for i, line in enumerate(f):
        if i == 6:
            break
    else:
        print('Not 7 lines in file')
        line = None

Note that enumerate is zero-based, so i == 6 selects the seventh line.
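
An alternative sketch uses itertools.islice to skip directly to the wanted line. The helper name read_line and the temporary test file are assumptions made for this example.

```python
import os
import tempfile
from itertools import islice

def read_line(path, index):
    """Return line number `index` (zero-based), or None if the file is too short."""
    with open(path) as f:
        # islice skips `index` lines, then next takes one
        return next(islice(f, index, index + 1), None)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "xxx.txt")
    with open(path, "w") as f:
        f.write("".join("row {}\n".format(i) for i in range(10)))
    seventh = read_line(path, 6)
    beyond = read_line(path, 20)

print(repr(seventh))  # 'row 6\n'
print(beyond)         # None
```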

1.9.3 Write Text File

writelines expects a list of strings, while write expects a single string. The expression line1 + "\n" + line2 merges those strings into a single string before passing it to write. If you have many lines, you may want to use "\n".join(list_of_lines).
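
The difference between write and writelines can be sketched as follows. The file name out.txt and the temporary directory are placeholders chosen for this example.

```python
import os
import tempfile

lines = ["line1", "line2", "line3"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.txt")

    # write takes one string, so join the lines first
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")

    # writelines takes an iterable of strings and adds no newlines itself
    with open(path, "a") as f:
        f.writelines(line + "\n" for line in lines)

    with open(path) as f:
        content = f.read()

print(content)
```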

1.9.4 Booleans Formatted in Strings

>>> print("%r, %r"%(True, False) )
True, False

1.9.5 Print without Newline

print('.', end='', flush=True)

1.9.6 Print to file

with open("output.txt", "a") as f:
    print("Hello StackOverflow!", file=f)
    print("I have a question.", file=f)

1.10 Function

1.10.1 Default Parameter

>>> def function(data=[]):
...     data.append(1)
...     return data
...
>>> function()
[1]

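A mutable default such as data=[] or data=[1, 2] is not advisable: the list is created once, when the function is defined, and is then shared by every call that omits the argument.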
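
The pitfall with a mutable default can be seen in a short sketch. The function names are made up for this example; using None as the default is the common workaround, not the only one.

```python
def append_shared(data=[]):
    # The default list is created once and reused on every call
    data.append(1)
    return data

def append_fresh(data=None):
    # Common idiom: use None as the sentinel and build a new list per call
    if data is None:
        data = []
    data.append(1)
    return data

print(append_shared())  # [1]
print(append_shared())  # [1, 1]
print(append_fresh())   # [1]
print(append_fresh())   # [1]
```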

1.11 Control Statement

1.11.1 if-else

>>> if x < 0:
...     x = 0
...     print('Negative changed to zero')
... elif x == 0:
...     print('Zero')
... elif x == 1:
...     print('Single')
... else:
...     print('More')

1.11.2 while Loop

count = 0
while count < 9:
    print('The count is:', count)
    count = count + 1
print("Good bye!")

2 Numpy and Scipy

2.1 Basics

2.1.1 ND array to 1D

Use np.ravel (returns a 1D view when possible), np.ndarray.flatten (always returns a 1D copy), or np.ndarray.flat (a 1D iterator).

In [12]: a = np.array([[1,2,3], [4,5,6]])
In [13]: b = a.ravel()
In [14]: b
Out[14]: array([1, 2, 3, 4, 5, 6])

In [15]: c = a.flatten()

In [20]: d = a.flat
In [21]: d
Out[21]: <numpy.flatiter object at 0x8ec2068>
In [22]: list(d)
Out[22]: [1, 2, 3, 4, 5, 6]

2.2 Advanced

2.2.1 Normality test

import numpy as np
from scipy import stats

mat = np.loadtxt("gdb.txt", delimiter=',')
arr = mat.ravel()
print(np.mean(arr))
print(np.std(arr))

import statsmodels.api as sm
import matplotlib.pyplot as plt
# 'sjc' looks like a user-defined style sheet; substitute any style available locally
plt.style.use('sjc')
fig=plt.figure()
ax=fig.gca()

ax.hist(arr, bins=50, density=True)
plt.savefig('normal.png')

3 Advanced

3.1 Tips

3.1.1 Check the normality of an array

Ways: 1) check its mean and standard deviation; 2) plot a histogram; 3) draw a QQ plot.

Reference: QQ plot

3.1.2 Create a nested directory

On Python ≥ 3.5, use pathlib.Path.mkdir:

from pathlib import Path
Path("/my/directory").mkdir(parents=True, exist_ok=True)

3.1.3 Explanation on if __name__ == "__main__"

When the Python interpreter reads a source file, it executes all of the code found in it. Before executing the code, it will define a few special variables. For example, if the python interpreter is running that module (the source file) as the main program, it sets the special __name__ variable to have a value "__main__". If this file is being imported from another module, __name__ will be set to the module’s name.
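
A minimal sketch of the idiom (the function greet is a made-up example): the guarded block runs only when the file is executed directly, so importing the module does not trigger the print.

```python
def greet(name):
    """Build a greeting; importable without side effects."""
    return "Hello, " + name

# Runs only when the file is executed directly, not when it is imported
if __name__ == "__main__":
    print(greet("world"))
```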

3.1.4 Test a Variable is None

if variable is None:

3.1.5 Output Complex Number in Certain Format

Split the complex number x into x.real and x.imag, format the two parts separately, and combine them in the form x.real + x.imag*i.
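
One way to do this is sketched below. The helper name format_complex, the default of three decimals and the trailing i are choices made for this example.

```python
def format_complex(x, digits=3):
    """Format a complex number as 'a+bi' with a fixed number of decimals."""
    # The '+' flag forces an explicit sign on the imaginary part
    return "{0:.{d}f}{1:+.{d}f}i".format(x.real, x.imag, d=digits)

print(format_complex(1.5 - 2.25j))       # 1.500-2.250i
print(format_complex(3 + 4j, digits=1))  # 3.0+4.0i
```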

3.1.6 Check Python version

>>> import sys
>>> print (sys.version)
2.5.2 (r252:60911, Jul 31 2008, 17:28:52) 
[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)]
>>> sys.version_info
(2, 5, 2, 'final', 0)
assert sys.version_info >= (2,5)

3.1.7 How do I manually throw/raise an exception in Python?

Check this link.
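
As a minimal sketch, an exception is raised with the raise statement, preferably using the most specific built-in exception class that fits. The function set_age and its message are hypothetical.

```python
def set_age(age):
    # Raise a specific built-in exception with a clear message
    if age < 0:
        raise ValueError("age must be non-negative, got {}".format(age))
    return age

try:
    set_age(-1)
except ValueError as err:
    message = str(err)
    print("Caught:", message)
```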

4 Packages

4.1 os

4.1.1 Iterate over files in a directory

import os

directory = "."  # directory to scan
for filename in os.listdir(directory):
    # endswith accepts a tuple of suffixes
    if filename.endswith((".asm", ".py")):
        print(os.path.join(directory, filename))

4.1.2 Get the full path to the directory a Python file is contained in

import os 
dir_path = os.path.dirname(os.path.realpath(__file__))

4.1.3 Excluding directories in os.walk

Modifying dirs in-place will prune the (subsequent) files and directories visited by os.walk:

import os

exclude = {'New folder', 'Windows', 'Desktop'}
# top is the root directory of the walk
for root, dirs, files in os.walk(top, topdown=True):
    dirs[:] = [d for d in dirs if d not in exclude]

4.1.4 Extract file name from path

import os
print(os.path.basename(your_path))

4.2 subprocess

4.2.1 Calling the source command from subprocess.Popen

>>> foo = subprocess.Popen("source the_script.sh", shell = True)
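/bin/sh: source: not found

source is not an executable command; it is a shell builtin, and the /bin/sh used by shell=True may not provide it. Use the POSIX equivalent "." instead, or run the command through bash by passing executable="/bin/bash" to Popen.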
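
A sketch of the "." form on a POSIX system is shown below. The script content and the GREETING variable are invented for this example. Note that anything set by the sourced script exists only inside that child shell, not in the Python process.

```python
import os
import subprocess
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical script that only sets a variable
    script = os.path.join(tmp, "the_script.sh")
    with open(script, "w") as f:
        f.write("GREETING=hello\n")

    # "." is the POSIX spelling of source and works in /bin/sh
    cmd = '. "{}" && echo "$GREETING"'.format(script)
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)

sourced_value = result.stdout.strip()
print(sourced_value)
```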

4.3 colorama

4.3.1 Print colored output with Python 3

import colorama
from colorama import Fore, Style
print(Fore.BLUE + "Hello World")

And call this to reset the color settings:

print(Style.RESET_ALL)

4.4 pip

4.4.1 Uninstall all packages

Dump out all the packages into the file packages.txt. Then uninstall them.

pip freeze > packages.txt
pip uninstall -r packages.txt

4.4.2 Uninstall the dependencies of a package

Use the pip-autoremove package:

pip-autoremove some-package

4.5 pyinstaller

4.5.1 Cannot run on Windows 10

Error message

ModuleNotFoundError: No module named 'pkg_resources.py2_warn'

Modify the following file

C:\Users\***\AppData\Local\Programs\Python\Python37\Lib\site-packages\pkg_resources\__init__.py

Comment out the line __import__('pkg_resources.py2_warn').

4.5.2 pyinstaller 3.5 Not Working with python3.8

Install the development version.

pip install https://github.com/pyinstaller/pyinstaller/archive/develop.tar.gz

4.6 argparse

4.6.1 Example 1

import argparse

# SystemCmdList is assumed to be a list defined elsewhere in the script
parser = argparse.ArgumentParser(description = "Convert markdown to Html.")
parser.add_argument("file", help = "Default convert style.")
parser.add_argument("-t", "--toc", action = "store_true",
                    help = "Add Table of Content.")
args = parser.parse_args()
if args.toc:
    SystemCmdList.append("--toc")

4.6.2 Example 2

import argparse
parser = argparse.ArgumentParser(description='Process some integers.')
parser.add_argument('op', choices = ['checkin', 'askhelp'],
                    help='Operation to do')
parser.add_argument('--doi', nargs=1, type=str, default = '',
                    help='DOI of the paper for the help')
parser.add_argument('--helpcredit', type=int, nargs=1, default = 5,
                    metavar='N', help='Credit score for the help')
args = parser.parse_args()
my_muchong = MuChong(MUCHONG_USERNAME, MUCHONG_PASSWORD)
#  my_muchong.login()
if ( args.op == 'checkin' ):
    my_muchong.check_in()
elif ( args.op == 'askhelp' ):
    my_muchong.askForHelp(args.doi[0],helpcredit=args.helpcredit)

4.7 time Related Modules

4.7.1 Measure Time Elapsed in Python

import time
start = time.time()
print("hello")
end = time.time()
print(end - start)
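
time.time() follows the wall clock, which can jump when the system clock is adjusted. For timing short intervals, time.perf_counter is a monotonic, high-resolution alternative. A small sketch, with a stand-in workload:

```python
import time

start = time.perf_counter()
total = sum(range(100000))  # stand-in workload
elapsed = time.perf_counter() - start
print("elapsed: {:.6f} s".format(elapsed))
```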

4.7.2 Get Time from String

from datetime import datetime
datetime_object = datetime.strptime('Jun 1 2005  1:33PM', '%b %d %Y %I:%M%p')

4.7.3 Time Delay

import time
time.sleep(5) # delays for 5 seconds

4.8 mpmath

mpmath is a package for arbitrary-precision arithmetic. There are other ways to get more precision than a double: numpy.longdouble is one, and Decimal is another. However, they are not powerful enough when I need to calculate a list of coefficients accurately in some numerical schemes.

A comparison between mpmath and numpy.longdouble is as follows,

import numpy as np
from mpmath import mp, mpf, matrix
mp.dps = 32
# Note: np.sqrt(1.0/7.0) is evaluated in double precision before the cast to longdouble
val = np.longdouble( np.sqrt(1.0/7.0) )
print("Type of val is",type(val))
p = np.finfo(val).precision
prt_str = "np.sqrt(1/7): "+"%."+"%d"%(p)+"f"
print(prt_str%(val))
print("True value  : 0.3779644730092272272145165362341800608157513118689214")
print("mpmath (1/7)**(1/2):", ( mpf(1)/mpf(7) )**( mpf(1)/mpf(2) ) )
print("True value         : 0.3779644730092272272145165362341800608157513118689214")

The output is as follows,

Type of val is <class 'numpy.float128'>
np.sqrt(1/7): 0.377964473009227198
True value  : 0.3779644730092272272145165362341800608157513118689214
mpmath (1/7)**(1/2): 0.37796447300922722721451653623418
True value         : 0.3779644730092272272145165362341800608157513118689214

The true value is from Wolfram Alpha.
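
For reference, the same square root with the standard-library Decimal type mentioned above can be sketched like this. The precision of 32 significant digits is chosen here to mirror mp.dps = 32; it is not part of the original comparison.

```python
from decimal import Decimal, getcontext

getcontext().prec = 32  # 32 significant digits
val = (Decimal(1) / Decimal(7)).sqrt()
print("Decimal sqrt(1/7):", val)
```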

4.8.1 Example of Calculation

from mpmath import mp, mpf, matrix
mp.dps = 32
gamma = mpf('0.43586652150845899941601945')
alpha = 1 - 4*gamma + 2*gamma**2
beta = -1 + 6*gamma - 9*gamma**2 + 3*gamma**3
b2 = (-3*alpha**2) / (4*beta)
c2 = (2-9*gamma+6*gamma**2) / (3*alpha)
A = matrix(3,3)
A[0,0] = gamma
A[1,1] = gamma
A[2,2] = gamma
A[1,0] = c2 - gamma
A[2,0] = 1 - b2 - gamma
A[2,1] = b2
print("A is ")
print(A)
B = A[2,:]
print("B is ")
print(B)
C = matrix(3,1) 
C[0,0] = gamma
C[1,0] = c2
C[2,0] = 1
print("C is ")
print(C)

The output is as follows,

A is
[      0.43586652150845899941601945                                  0.0                           0.0]
[0.28206673924577050029199027679033         0.43586652150845899941601945                           0.0]
[ 1.2084966491760100703364776750294  -0.64436317068446906975249712502944  0.43586652150845899941601945]
B is
[1.2084966491760100703364776750294  -0.64436317068446906975249712502944  0.43586652150845899941601945]
C is
[      0.43586652150845899941601945]
[0.71793326075422949970800972679033]
[                               1.0]

5 Misc

5.1 System

5.1.1 Shebang

#!/usr/bin/env python3

5.2 Mixed Usage with Other Software

5.2.1 Compile Python with sqlite3 Support

  1. Download sqlite3 source

  2. Download Python source

  3. Edit the setup.py file to add the sqlite3 include directory to it. Something like this,

    sqlite_inc_paths = [ '/usr/include',
                         '/usr/include/sqlite',
                         '/usr/include/sqlite3',
                         '/usr/local/include',
                         '/usr/local/include/sqlite',
                         '/usr/local/include/sqlite3',
                         '//include' ]

  4. Run configure, make, and make install.

  5. Make sure there are no warning messages related to sqlite3.

5.2.2 iPython Notebook Font

For Jupyter notebook version >= 4.1, modify the custom css file, which is located at ~/.jupyter/custom/custom.css. Do something like the following.

@import url('https://fonts.googleapis.com/css?family=Source+Code+Pro');
.CodeMirror pre {
    font-family: 'Source Code Pro', monospace;
    font-size: 16pt;
    font-weight: 600;
    line-height: normal;
}
Created on 2021-01-25 with pandoc