  1. Download Python Documentation - https://docs.python.org/3/download.html
  2. Python core development news and information - https://blog.python.org/
  3. Python Developer's Guide - https://devguide.python.org/
  4. Python best practices for formatting code - https://www.python.org/dev/peps/pep-0008/
  5. Installing packages using pip and virtual environments -
  6. packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments/
  7. Packaging python projects -
  8. https://packaging.python.org/en/latest/tutorials/packaging-projects/
  9. pyenv Tutorial -
  10. https://amaral.northwestern.edu/resources/guides/pyenv-tutorial
  11. Pythontic.com Pythonic Permutations
  12. Scipy Lecture Notes - https://scipy-lectures.org/
  13. Book: Python Scripting for Computational Science
  14. Book: Think Python
  15. https://southampton.ac.uk/~fangohr/training/python/pdfs/
  16. PLOS - Computational Biology
  17. https://www.physics.ohio-state.edu/undergrad/greStuff/
  18. https://docs.python.org/3/tutorial/venv.html # tutorial for venv
  19. Boston University TechWeb -
  20. www.bu.edu/tech/support/research/software-and-programming/common-languages/python/
  21. Google Python Style Guide - https://google.github.io/styleguide/pyguide.html
  22. Python charts - https://python-charts.com/
  23. PyQt Scientific Graphics and GUI Library for Python - https://www.pyqtgraph.org
  24. The complete PyQt5 tutorial - https://www.pythonguis.com/pyqt5-tutorial/
  25. Customizing Ticks -
  26. https://jakevdp.github.io/PythonDataScienceHandbook/04/10-customising-ticks.html
  27. Python Concurrency Learning Paths - https://superfastpython.com/learning-paths/
  28. gmpy2 - https://gmpy2.readthedocs.io/en/latest/intro.html
  29. Memory profiler - https://github.com/pythonprofilers/memory_profiler
  30. Meliae - https://launchpad.net/meliae
  31. Built-in Types - https://docs.python.org/3/library/stdtypes.html#int.bit_length
  32. Pytest - https://docs.pytest.org/en/7.4.x/contents.html
  33. ISciNumPy - https://iscinumpy.dev/
  34. Software Engineering for Scientific Computing -
  35. https://henryiii.github.io/se-for-sci/content/intro.html
  36. Scientific Python - https://scientific-python.org/
  37. Scientific Python Library Development Guide -
  38. https://learn.scientific-python.org/development/
  39. Using pytest proto-fixtures - https://michaelgoerz.net/notes/using-proto-fixtures.html
  40. Python plot HTML browser - www.scivision.dev/python-html-plotting-iframe-share/
  41. Catch Numpy warnings as error - https://www.scivision.dev/python-numpy-catch-warnings/
  42. Parallel processing in Python -
  43. https://berkeley-scf.github.io/tutorial-parallelization/parallel-python
  44. Catch all errors in Python - https://embeddedinventor.com/python-catch-all-errors/
  45. Better Scientific Software (BSSw) - https://bssw.io/
  46. Python Packaging User Guide -
  47. https://packaging.python.org/en/latest/tutorials/packaging-projects/
  48. Programming is simply the act of entering instructions for the computer to perform.
  49. Programming instructions - source code
  50. Debugging programs - finding and fixing errors
  51. People develop programming skills through practice.
  52. Three Rules of Programming:
  53. 1. It Must Work!
  54. 2. It Must be Maintainable!
  55. 3. Now Think Performance.
  56. Python source files are UTF-8-encoded text files that normally have a .py
  57. suffix. International (Unicode) characters can be freely used in the source
  58. code as long as you use the UTF-8 encoding.
  59. It is common to use #! to specify the interpreter on the first line of a
  60. program, like this:
  61. #!/usr/bin/env python
  62. The interpreter runs statements in order until it reaches the end of the input
  63. file. At that point, the program terminates and Python exits.
  64. # package
  65. The difference between a regular directory and a Python package is that the
  66. latter includes a file named __init__.py that instructs Python's interpreter to
  67. understand the directory as a package with Python code.
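A minimal sketch of the rule above, built in a throwaway directory. The package name demo_pkg and its greet() function are hypothetical, made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk: a directory plus an __init__.py file.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"            # hypothetical package name
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text(
    "def greet():\n    return 'hello from demo_pkg'\n"
)

# Make the parent directory importable, then import the package by name.
sys.path.insert(0, str(root))
demo_pkg = importlib.import_module("demo_pkg")
message = demo_pkg.greet()
print(message)
```

Since Python 3.3 a directory without __init__.py can still be imported as a namespace package, but it cannot run package initialization code.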
  68. # dynamically typed language
  69. Python does not require a variable's data type to be specified upon initialization.
  70. # naming conventions
  71. * variable: snake_case
  72. * global constant: ALL_CAPS
  73. * function: snake_case
  74. * class: CamelCase
  75. * hidden/not public: _underscore
  76. * built-in: __dunder__
  77. # load a file into the python console
  78. $ python -i program.py
  79. # list of Python's built-ins that you should avoid using as variables
  80. >>> import builtins
  81. >>> dir(builtins)
  82. ['ArithmeticError', 'AssertionError', 'AttributeError',
  83. 'BaseException', 'BaseExceptionGroup', 'BlockingIOError',
  84. 'BrokenPipeError', 'BufferError', 'BytesWarning',
  85. 'ChildProcessError', 'ConnectionAbortedError', 'ConnectionError',
  86. 'ConnectionRefusedError', 'ConnectionResetError', 'DeprecationWarning',
  87. 'EOFError', 'Ellipsis', 'EncodingWarning', 'EnvironmentError',
  88. 'Exception', 'ExceptionGroup', 'False', 'FileExistsError',
  89. 'FileNotFoundError', 'FloatingPointError', 'FutureWarning',
  90. 'GeneratorExit', 'IOError', 'ImportError', 'ImportWarning',
  91. 'IndentationError', 'IndexError', 'InterruptedError', 'IsADirectoryError',
  92. 'KeyError', 'KeyboardInterrupt', 'LookupError', 'MemoryError',
  93. 'ModuleNotFoundError', 'NameError', 'None', 'NotADirectoryError',
  94. 'NotImplemented', 'NotImplementedError', 'OSError', 'OverflowError',
  95. 'PendingDeprecationWarning', 'PermissionError', 'ProcessLookupError',
  96. 'RecursionError', 'ReferenceError', 'ResourceWarning', 'RuntimeError',
  97. 'RuntimeWarning', 'StopAsyncIteration', 'StopIteration', 'SyntaxError',
  98. 'SyntaxWarning', 'SystemError', 'SystemExit', 'TabError', 'TimeoutError',
  99. 'True', 'TypeError', 'UnboundLocalError', 'UnicodeDecodeError',
  100. 'UnicodeEncodeError', 'UnicodeError', 'UnicodeTranslateError',
  101. 'UnicodeWarning', 'UserWarning', 'ValueError', 'Warning', 'ZeroDivisionError',
  102. '__build_class__', '__debug__', '__doc__', '__import__', '__loader__', '__name__',
  103. '__package__', '__spec__', 'abs', 'aiter', 'all', 'anext', 'any', 'ascii', 'bin',
  104. 'bool', 'breakpoint', 'bytearray', 'bytes', 'callable', 'chr', 'classmethod',
  105. 'compile', 'complex', 'copyright', 'credits', 'delattr', 'dict', 'dir', 'divmod',
  106. 'enumerate', 'eval', 'exec', 'exit', 'filter', 'float', 'format', 'frozenset',
  107. 'getattr', 'globals', 'hasattr', 'hash', 'help', 'hex', 'id', 'input', 'int',
  108. 'isinstance', 'issubclass', 'iter', 'len', 'license', 'list', 'locals', 'map',
  109. 'max', 'memoryview', 'min', 'next', 'object', 'oct', 'open', 'ord', 'pow', 'print',
  110. 'property', 'quit', 'range', 'repr', 'reversed', 'round', 'set', 'setattr', 'slice',
  111. 'sorted', 'staticmethod', 'str', 'sum', 'super', 'tuple', 'type', 'vars', 'zip']
  112. >>> dir(__builtins__)
  113. [... identical to the dir(builtins) list above ...]
  143. >>> len(dir(__builtins__))
  144. 158
  145. >>> type(dir(__builtins__))
  146. <class 'list'>
  147. >>> dir(__builtins__)[0]
  148. 'ArithmeticError'
  149. >>> type(dir(__builtins__)[0])
  150. <class 'str'>
  151. >>> [s for s in dir(__builtins__) if s.islower() and not s.startswith('_')]
  152. ['abs', 'aiter', 'all', 'anext', 'any', 'ascii', 'bin', 'bool', 'breakpoint',
  153. 'bytearray', 'bytes', 'callable', 'chr', 'classmethod', 'compile', 'complex',
  154. 'copyright', 'credits', 'delattr', 'dict', 'dir', 'divmod', 'enumerate', 'eval',
  155. 'exec', 'exit', 'filter', 'float', 'format', 'frozenset', 'getattr', 'globals',
  156. 'hasattr', 'hash', 'help', 'hex', 'id', 'input', 'int', 'isinstance', 'issubclass',
  157. 'iter', 'len', 'license', 'list', 'locals', 'map', 'max', 'memoryview', 'min',
  158. 'next', 'object', 'oct', 'open', 'ord', 'pow', 'print', 'property', 'quit', 'range',
  159. 'repr', 'reversed', 'round', 'set', 'setattr', 'slice', 'sorted', 'staticmethod',
  160. 'str', 'sum', 'super', 'tuple', 'type', 'vars', 'zip']
  161. >>> [s for s in dir(__builtins__) if s.endswith('Error')]
  162. ['ArithmeticError', 'AssertionError', 'AttributeError', 'BlockingIOError',
  163. 'BrokenPipeError', 'BufferError', 'ChildProcessError', 'ConnectionAbortedError',
  164. 'ConnectionError', 'ConnectionRefusedError', 'ConnectionResetError', 'EOFError',
  165. 'EnvironmentError', 'FileExistsError', 'FileNotFoundError', 'FloatingPointError',
  166. 'IOError', 'ImportError', 'IndentationError', 'IndexError', 'InterruptedError',
  167. 'IsADirectoryError', 'KeyError', 'LookupError', 'MemoryError', 'ModuleNotFoundError',
  168. 'NameError', 'NotADirectoryError', 'NotImplementedError', 'OSError', 'OverflowError',
  169. 'PermissionError', 'ProcessLookupError', 'RecursionError', 'ReferenceError',
  170. 'RuntimeError', 'SyntaxError', 'SystemError', 'TabError', 'TimeoutError', 'TypeError',
  171. 'UnboundLocalError', 'UnicodeDecodeError', 'UnicodeEncodeError', 'UnicodeError',
  172. 'UnicodeTranslateError', 'ValueError', 'ZeroDivisionError']
  173. >>> [s for s in dir(__builtins__) if s.endswith('Warning')]
  174. ['BytesWarning', 'DeprecationWarning', 'EncodingWarning', 'FutureWarning',
  175. 'ImportWarning', 'PendingDeprecationWarning', 'ResourceWarning', 'RuntimeWarning',
  176. 'SyntaxWarning', 'UnicodeWarning', 'UserWarning', 'Warning']
  177. >>> 'len' in dir(__builtins__)
  178. True
  179. >>> len('string')
  180. 6
  181. >>> builtins.len('string') # built-in len() function as builtins.len()
  182. 6
  183. >>> builtins.len
  184. <built-in function len>
  185. >>> builtins.len is len
  186. True
  187. >>> __builtins__.len('string') # __builtins__ module without importing it
  188. 6
  189. >>> __builtins__.len
  190. <built-in function len>
  191. >>> __builtins__.len is len
  192. True
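A small sketch of why these names are worth avoiding. The helper names shadowed_length and is_builtin_name are hypothetical; only builtins and len come from the notes above.

```python
import builtins

def shadowed_length(text):
    len = 5                      # local name 'len' hides the built-in function
    try:
        return len(text)         # calls the int 5, which is not callable
    except TypeError as exc:
        return f"TypeError: {exc}"

def is_builtin_name(name):
    """True when the name is already defined in the builtins module."""
    return hasattr(builtins, name)

broken = shadowed_length("string")
still_works = builtins.len("string")   # the original stays reachable via builtins
print(broken)
print(still_works, is_builtin_name("list"), is_builtin_name("names"))
```

A check like is_builtin_name() can be run on a candidate variable name before using it.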
  193. # install the build tools
  194. $ doas apt-get update
  195. $ doas apt-get install make build-essential libssl-dev zlib1g-dev libbz2-dev \
  196.     libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev \
  197.     xz-utils tk-dev libffi-dev liblzma-dev libgdbm-compat-dev python3-openssl
  198. # Python-3.11.4 Current Stable version installation
  199. $ wget https://www.python.org/ftp/python/3.11.4/Python-3.11.4.tgz
  200. $ tar -xzvf Python-3.11.4.tgz
  201. $ mv Python-3.11.4/ ~/.python/
  202. $ cd ~/.python/
  203. $ ./configure --enable-optimizations
  204. $ make
  205. $ make test
  206. $ doas make altinstall
  207. Note: The system Python lives in /usr/bin. With the default prefix, make
  208. altinstall installs python3.11 under /usr/local/bin and skips the python3
  209. symlink, so the system default is neither overwritten nor shadowed.
  210. # upgrade Python-3.11.4 to Python-3.11.5 [https://www.python.org/downloads/source/]
  211. 1. download current stable release: Python 3.11.5 [Gzipped source tarball]
  212. $ wget https://www.python.org/ftp/python/3.11.5/Python-3.11.5.tgz
  213. 2. untar Python-3.11.5.tgz
  214. $ tar -xzvf Python-3.11.5.tgz
  215. 3. move Python-3.11.5/ to ~/.python3.11.5/
  216. $ mv downloads/Python-3.11.5/ .python3.11.5/
  217. 4. change directory .python3.11.5/
  218. $ cd .python3.11.5/
  219. 5. read the file README.rst
  220. $ vi -M README.rst
  221. 6. build instructions on linux
  222. $ ./configure --enable-optimizations
  223. $ make
  224. $ make test
  225. $ doas make altinstall
  226. 7. remove Python 3.11.4 version ~/.python/
  227. $ doas rm -rf ~/.python/
  228. 8. move ~/.python3.11.5/ ~/.python/
  229. $ mv ~/.python3.11.5/ ~/.python/
  230. 9. verify Python 3.11.5 install path
  231. $ which python3.11
  232. 10. verify Python version
  233. $ python3.11 --version
  234. # built-in type hinting/variable annotation
  235. You can add type hinting with the following built-in types:
  236. * int integer
  237. * float floating point number
  238. * bool boolean value (subclass of int)
  239. * str text, sequence of unicode codepoints
  240. * bytes 8-bit string, sequence of byte values
  241. * object an arbitrary object (object is the common base class)
  242. These can be used both in functions and in variable annotation. Variable
  243. annotation, added to the Python language in 3.6, allows you to add type hints
  244. to variables.
  245. Here are some examples:
  246. x: int # a variable named x without initialization
  247. y: float = 1.0 # a float variable, initialized to 1.0
  248. z: bool = False
  249. a: str = "Hello type hinting"
  250. # sequence type hinting
  251. A collection is a group of items in Python. Common collections are list, dict,
  252. tuple and set. Before Python 3.9 these built-in types could not take type
  253. parameters in annotations, so the typing module provides List, Dict, Tuple and Set.
  254. from typing import List
  255. names: List[str] = ["Math"]
  256. Here you created a list with a single str in it. This specifies that you are
  257. creating a list of strings. List takes exactly one type parameter, which applies
  258. to every item regardless of the list's length, so List[str, str] is an error:
  259. from typing import List
  260. names: List[str] = ["Math", "Physics"]
  261. A tuple is hinted per position, which suits fixed-size collections:
  262. from typing import Tuple
  263. s: Tuple[int, float, str] = (5, 3.14, "laplace")
  264. Dictionaries are a little different in that you hint the types of both the
  265. keys and the values:
  266. from typing import Dict
  267. d: Dict[str, int] = {"Newton": 25}
  268. If a tuple holds a variable number of items of one type, you can use an ellipsis:
  269. from typing import Tuple
  270. t: Tuple[int, ...] = (4, 5, 6)
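On Python 3.9 or newer the same hints can be written with the built-in types themselves. A short sketch under that version assumption; the variable unique is added for illustration.

```python
# Assumes Python 3.9+, where built-in collections accept type parameters.
names: list[str] = ["Math", "Physics"]
s: tuple[int, float, str] = (5, 3.14, "laplace")
d: dict[str, int] = {"Newton": 25}
t: tuple[int, ...] = (4, 5, 6)          # any number of ints
unique: set[str] = {"Math"}             # hypothetical extra example

print(names, s, d, t, unique)
```

The typing versions (List, Dict, Tuple) keep working, so older code does not need to change.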
  271. # hinting values that could be None
  272. Sometimes a value needs to be initialized as None, but when it gets set later,
  273. you want it to be something else.
  274. For that, you can use Optional:
  275. from typing import Optional
  276. result: Optional[str] = function()
  277. On the other hand, if the value can never be None, you should add an assert to
  278. your code:
  279. assert result is not None
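A minimal sketch of the Optional pattern. The function find_author and its lookup table are hypothetical stand-ins for function() above.

```python
from typing import Optional

def find_author(title: str) -> Optional[str]:
    """Return the author for a known title, or None when it is unknown."""
    authors = {"Principia": "Newton"}      # hypothetical lookup table
    return authors.get(title)

result: Optional[str] = find_author("Principia")
if result is not None:
    # Inside this branch a type checker treats result as str.
    shout = result.upper()

missing = find_author("Unknown")
print(shout, missing)
```

An explicit "is not None" check does the same narrowing as the assert, without raising when the value is absent.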
  280. # type hinting functions
  281. Type hinting functions is similar to type hinting variables. The main
  282. difference is that you can also add a return type to a function.
  283. Let's take a look at an example:
  284. def adder(x: int, y: int) -> None:
  285.     print(f"The total of {x} + {y} = {x+y}")
  286. This example shows you that adder() takes two arguments, x and y, and that they
  287. should both be integers. The return type is None, which you specify using the
  288. -> after the ending parentheses but before the colon.
  289. Let's say that you want to assign the adder() function to a variable. You can
  290. annotate the variable as a Callable like this:
  291. from typing import Callable
  292. def adder(x: int, y: int) -> None:
  293.     print(f"The total of {x} + {y} = {x+y}")
  294. a: Callable[[int, int], None] = adder
  295. The Callable takes in a list of arguments for the function. It also allows you
  296. to specify the return type.
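Callable is also useful for parameters that receive a function. A small sketch; apply and add are hypothetical names chosen for illustration.

```python
from typing import Callable

def apply(op: Callable[[int, int], int], x: int, y: int) -> int:
    """Call op with two ints and return its int result."""
    return op(x, y)

def add(x: int, y: int) -> int:
    return x + y

total = apply(add, 2, 3)                       # a named function
product = apply(lambda a, b: a * b, 2, 3)      # a lambda with the same shape
print(total, product)
```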
  297. Let's look at one more example where you pass in more complex arguments:
  298. from typing import Tuple, Optional
  299. def some_func(x: int, y: Tuple[str, str], z: Optional[float] = None) -> Optional[str]:
  300.     if x > 10:
  301.         return None
  302.     return "You called some_func"
  303. For this example, you created some_func() that accepts 3 arguments:
  304. * an int
  305. * a two-item tuple of strings
  306. * an optional float that is defaulted to None
  307. Note that when a parameter has both a type hint and a default value, PEP 8
  308. puts a space before and after the equals sign.
  309. The function also returns either None or a string.
  310. # when things get complicated
  311. You have already learned what to do when a value can be None, but what else can
  312. you do when things get complicated? For example, what do you do if the argument
  313. being passed in can be multiple different types?
  314. For that specific use case, you can use Union:
  315. from typing import Union
  316. z: Union[str, int]
  317. What this type hint means is that the variable z, can be either a string or an
  318. integer.
  319. If a function may take in an object of any type at all, you can use Any:
  320. from typing import Any
  321. x: Any = some_function()
  322. Use Any with caution because you can't really tell what it is that you are
  323. returning. Since it can be "any" type, it is like catching all exceptions with
  324. a bare except. You don't know what exception you are catching with that and you
  325. also don't know what type you are hinting at when you use Any.
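A short sketch of how a Union value is usually handled: check the type once, then each branch works with a single type. The function describe is hypothetical.

```python
from typing import Union

def describe(z: Union[str, int]) -> str:
    if isinstance(z, int):
        # In this branch z is known to be an int.
        return f"int: {z + 1}"
    # Otherwise z must be a str.
    return f"str: {z.upper()}"

print(describe(1))
print(describe("a"))
```

With Any no such narrowing is needed, which is exactly why a type checker cannot help there.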
  326. # -m flag
  327. The -m command-line flag will import any Python module and run it as a script.
  328. # debug mode
  329. $ python -m pdb cachefib.py
  330. > /home/user/directory/cachefib.py(3)<module>()
  331. (Pdb)
  332. (Pdb) ll
  333.   1     #!/usr/bin/env python
  334.   2
  335.   3     cache = {}
  336.   4
  337.   5     def fibonacci(n):
  338.   6         if n < 3:
  339.   7             return 1
  340.   8
  341.   9         if n in cache:
  342.  10             return cache[n]
  343.  11
  344.  12         cache[n] = fibonacci(n - 1) + fibonacci(n - 2)
  345.  13         return cache[n]
  346.  14
  347.  15
  348.  16     fib = fibonacci(30)
  349.  17  -> print(f"The 30th Fibonacci number is {fib}")
  350. (Pdb)
  351. (Pdb) whatis fib
  352. <class 'int'>
  353. (Pdb) pp fib # pretty print
  354. 832040
  355. (Pdb) p fibonacci(50)
  356. 12586269025
  357. (Pdb) b cachefib.py:13 # breakpoint at line 13
  358. (Pdb) b cachefib.fibonacci # breakpoint at function fibonacci
  359. (Pdb) b 24 # breakpoint at line 24
  360. (Pdb) break # list of breakpoints
  361. (Pdb) disable 1 # disable a breakpoint
  362. (Pdb) enable 1 # enable a breakpoint
  363. (Pdb) clear 1 # remove a breakpoint entirely
  364. (Pdb) tbreak 3 # temporary break at line 3
  365. (Pdb) l # list 11 lines around the current line
  366. (Pdb) l 11,21 # list the lines 11 - 21
  367. (Pdb) ll # list all the lines in the program
  368. (Pdb) exit # to exit debug mode
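Besides python -m pdb, the built-in breakpoint() (Python 3.7+) stops at a chosen line from inside the code. The sketch below swaps in a recording hook so it runs without a prompt; record_stop and calls are hypothetical, and that swap is an assumption made only for this demo.

```python
import sys

calls = []

def record_stop(*args, **kwargs):
    """Stand-in hook so this sketch runs without an interactive prompt."""
    calls.append("breakpoint reached")

# Demo assumption: replace the default hook (pdb.set_trace).
# Delete this line to land in a real (Pdb) prompt instead.
sys.breakpointhook = record_stop

cache = {}

def fibonacci(n):
    if n < 3:
        return 1
    if n in cache:
        return cache[n]
    if n == 30:
        breakpoint()              # calls sys.breakpointhook()
    cache[n] = fibonacci(n - 1) + fibonacci(n - 2)
    return cache[n]

fib = fibonacci(30)
sys.breakpointhook = sys.__breakpointhook__   # restore the default hook
print(fib, calls)
```

With the default hook in place, the same commands listed above (p, n, s, c, ll) work at the prompt that breakpoint() opens.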
  369. Table of pdb commands:
  370. args [a]
  371. Print the argument list of the current function
  372. break [b]
  373. Creates a breakpoint (requires parameters) in the program execution
  374. By using the break command to set breakpoints, you'll run the program
  375. up until the specified breakpoint.
  376. Type clear and then y to remove all current breakpoints. You can then
  377. place a breakpoint where a function is defined.
  378. To see a list of breakpoints that are currently set to run, use the
  379. command break without any arguments. You'll receive information about
  380. the particularities of the breakpoint(s) you've set.
  381. We can also disable a breakpoint with the command disable and the
  382. number of the breakpoint.
  383. To enable a breakpoint, use the enable command, and to remove a breakpoint
  384. entirely, use the clear command.
  385. To create a temporary breakpoint that is cleared automatically the first
  386. time program execution hits it, use the command tbreak.
  387. continue [c/cont]
  388. Continues program execution until there is a breakpoint
  389. help [h]
  390. Provides list of commands or help for a specified command
  391. You can use the command help to learn pdb commands, and
  392. help <command> to learn more about a specific command.
  393. jump [j]
  394. Set the next line to be executed
  395. list [l]
  396. Print the source code around the current line
  397. Without providing arguments, the list command provides 11 lines
  398. around the current line.
  399. list 3,7 [l 3,7]
  400. Print the lines 3-7
  401. longlist [ll]
  402. List all source code for the current function or frame
  403. next [n]
  404. Continue execution until the next line in the current function is reached
  405. run
  406. Restart the debugged python program at any place within the program
  407. step [s]
  408. Execute the current line, stopping at first possible occasion
  409. The difference between step and next is that step will stop within a called
  410. function, while next executes called functions to only stop at the next line
  411. of the current function.
  412. pp
  413. Pretty-prints the value of the expression using the pprint module
  414. quit/exit [q]
  415. Aborts the program
  416. return [r]
  417. Continue execution until the current function returns
  418. Note: Call the last command you called by pressing the <Enter> key at the prompt.
  419. Ref:
  420. https://docs.python.org/3/library/pdb.html
  421. https://www.digitalocean.com/community/tutorials/how-to-use-the-python-debugger
  422. https://www.redhat.com/sysadmin/python-debugger-pdb
  423. # linux system packages required for scipy
  424. $ sudo apt-get install gcc gfortran python3-dev meson libopenblas-dev \
  425.     liblapack-dev cython
  426. # wisdom of OOP
  427. write methods only when the exclusive association with the data type is not in
  428. doubt (e.g. ADT's).
  429. # executing python script in ipython
  430. %run script.py
  431. Terminate an interactive session:
  432. >>> import sys
  433. >>> sys.exit()
  434. or
  435. >>> raise SystemExit
  436. or
  437. control-D for UNIX/Linux
  438. Shell commands in Python:
  439. >>> import os
  440. >>> os.system('ls -CF')
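os.system() returns only the exit status and sends the command's output straight to the terminal. When the output is needed as a string, subprocess.run() is one alternative. A sketch that runs the current interpreter as the child process so it works on any platform:

```python
import subprocess
import sys

completed = subprocess.run(
    [sys.executable, "-c", "print('hello from a child process')"],
    capture_output=True,   # collect stdout and stderr instead of printing them
    text=True,             # decode bytes to str
    check=True,            # raise CalledProcessError on a non-zero exit status
)
output = completed.stdout.strip()
print(output, completed.returncode)
```

Passing the command as a list avoids shell quoting problems; the same call shape works for ['ls', '-CF'] on Linux.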
  441. In the interactive interpreter, the underscore character _ is a variable that
  442. holds the value of the last evaluated expression.
  443. Functions are defined with parameters and are called with arguments.
  444. When defining a function, its input variables are called the parameters of the
  445. function. The input used when executing the function is called its argument.
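The distinction in a few lines; scale is a hypothetical function used only to label the two terms.

```python
def scale(value, factor=2):        # value and factor are parameters
    return value * factor

doubled = scale(21)                      # 21 is the argument bound to value
tripled = scale(value=5, factor=3)       # arguments passed by keyword
print(doubled, tripled)
```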
  446. A list is a basic Python data structure.
  447. To see the methods associated with a list, type the object name (list),
  448. followed by a period, and press tab.
  449. The help() function is the key to understanding most other functions.
  450. >>> help(round) # display description of what the function does
  451. >>> help(round(-2.01)) # shows help for int, not for round
  452. Python evaluates an expression like this from the inside out. First it
  453. calculates the value of round(-2.01), which is the int -2, then it provides
  454. help on that value's type.
  455. >>> help() # to start interactive help utility
  456. help> list # get documentation on the list class
  457. help> numpy # get documentation on the numpy package
  458. help> scipy # get documentation on the scipy package
  459. help> matplotlib # get documentation on the matplotlib package
  460. help> matplotlib.pylab # get documentation on the matplotlib.pylab
  461. help> matplotlib.pyplot # get documentation on the matplotlib.pyplot
  462. help> quit/q # end the interactive help session
  463. Strings are a collection of characters which are stored together to represent
  464. arbitrary text inside a python program.
  465. Special characters in strings:
  466. \<newline>  line continuation (backslash at end of line)
  467. \\          literal backslash
  468. \'          single quote
  469. \"          double quote
  470. \a          bell
  471. \b          backspace
  472. \x1b        escape character (Python has no \e escape)
  473. \0          null character
  474. \n          newline
  475. \t          horizontal tab
  476. \f          form feed
  477. \r          carriage return
  478. \ooo        character with octal value ooo
  479. \xhh        character with hexadecimal value hh
  480. threelines = 'First\          threelines = '''First
  481. Second\                       Second
  482. Third'                        Third'''
  483. ''' ''' or """ """ creates a multi-line string literal (newlines kept); a trailing \ joins the lines.
  484. Python provides what are called raw strings, in which backslash escape sequences
  485. are not interpreted. To construct a raw string, precede the opening quote
  486. character with a lowercase or uppercase r (r or R). Note, however, that a raw
  487. string cannot end in a single backslash.
  488. >>> print('Here is a backslash: \\')
  489. Here is a backslash: \
  490. >>> print(r'Here is a backslash: \ ')
  491. Here is a backslash: \
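A quick sketch comparing the two kinds of literal. The path values are made-up examples.

```python
normal = 'a\tb'                    # \t is one tab character: 3 characters
raw = r'a\tb'                      # backslash and t stay separate: 4 characters
windows_path = r'C:\new\table.txt' # no accidental \n or \t

# A raw string cannot end in a single backslash, so append it separately.
ends_with_backslash = r'C:\temp' + '\\'

print(len(normal), len(raw))
print(windows_path)
print(ends_with_backslash)
```

Raw strings are most common for Windows paths and regular expressions, where backslashes are frequent.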
  492. Unicode string:
  493. In Python 3 every str is a sequence of Unicode code points, so there is no
  494. separate Unicode string type. The u prefix is accepted for compatibility with
  495. Python 2 code and has no effect:
  496. print(u'institute of mathematical sciences')
  497. A str cannot be concatenated with bytes; decode the bytes to str first.
  498. Python considers the type of an object when it tries to apply an operator, so
  499. that if you try to concatenate a string and a number, you'll have problems.
  500. >>> x = 12./7.
  501. >>> print("The answer is " + x)
  502. TypeError: can only concatenate str (not "float") to str
  503. The number (x) must first be converted to a string with str or repr before it
  504. can be concatenated. (The Python 2 backquote operator was removed in Python 3.)
  505. >>> print("The answer is " + repr(x))
  506. The answer is 1.7142857142857142
  507. >>> print("The answer is " + str(x))
  508. The answer is 1.7142857142857142
  509. The asterisk (*), when used between a string and an integer, creates a new
  510. string with the old string repeated by the value of the integer. The order of
  511. the operands is not important.
  512. >>> '-' * 10
  513. '----------'
  514. >>> 10 * '-'
  515. '----------'
  516. indexing and slicing string:
  517. >>> name = 'institute'
  518. >>> for n in range(len(name)):
  519. ...     print(name[n])
  520. i
  521. n
  522. s
  523. t
  524. i
  525. t
  526. u
  527. t
  528. e
  529. >>> name = 'institute'
  530. >>> for n in name:
  531. ...     print(n)
  532. i
  533. n
  534. s
  535. t
  536. i
  537. t
  538. u
  539. t
  540. e
  541. >>> name = 'institute of mathematical sciences'
  542. >>> name[0]
  543. 'i'
  544. >>> name[-1]
  545. 's'
  546. >>> len(name)
  547. 34
  548. >>> name[:50]
  549. 'institute of mathematical sciences'
  550. >>> name[-50:]
  551. 'institute of mathematical sciences'
  552. If you use a value for a slice index which is larger than the length of the
  553. string, python does not raise an exception, but treats the index as if it was
  554. the length of the string.
  555. Using a second index which is less than or equal to the first index will result
  556. in an empty string.
  557. Strings in Python are immutable objects; this means that you can't change the
  558. value of a string in place.
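The slicing and immutability rules above, checked in a short sketch. The variable names are chosen for illustration.

```python
name = 'institute of mathematical sciences'

clamped = name[:50]        # index past the end is treated as len(name)
empty = name[5:2]          # second index below the first gives ''

try:
    name[0] = 'I'          # strings do not support item assignment
except TypeError as exc:
    error = str(exc)

# Build a new string instead of modifying the old one.
capitalized = 'I' + name[1:]
print(clamped == name, repr(empty))
print(error)
print(capitalized)
```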
  559. >>> name = 'institute of mathematical sciences'
  560. >>> name.split()
  561. ['institute', 'of', 'mathematical', 'sciences']
  562. >>> name = 'institute of mathematical  sciences' # two spaces before 'sciences'
  563. >>> name.split(' ')
  564. ['institute', 'of', 'mathematical', '', 'sciences']
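The difference between the two calls only shows up when the text has repeated whitespace. A sketch assuming a string with two spaces before the last word:

```python
text = 'institute of mathematical  sciences'   # two spaces before 'sciences'

by_whitespace = text.split()        # runs of whitespace count as one separator
by_single_space = text.split(' ')   # every single space is a separator

# Joining the whitespace-split words normalizes the spacing.
rejoined = ' '.join(by_whitespace)
print(by_whitespace)
print(by_single_space)
print(rejoined)
```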
  565. # Python Key words
  566. False None True and as assert async await break class continue def del
  567. elif else except finally for from global if import in is lambda nonlocal
  568. not or pass raise return try while with yield
  569. (Note: In Python 3, print and exec are built-in functions, not keywords)
  570. Arithmetic operations:
  571. Operator   Name             Description
  572. a + b      Addition         Sum of a and b
  573. a - b      Subtraction      Difference of a and b
  574. a * b      Multiplication   Product of a and b
  575. a / b      True division    Quotient of a and b
  576. a // b     Floor division   Quotient of a and b, rounded down to a whole number
  577. a % b      Modulus          Remainder after division of a by b
  578. a ** b     Exponentiation   a raised to the power of b
  579. -a         Negation         The negative of a
  580. +a         Unary plus       a unchanged
  581. * The division operator (/) produces a floating-point number when applied to
  582. integers. Therefore, 7/4 is 1.75.
  583. * The floor division operator // rounds the quotient down (toward negative
  584. infinity) and works with both integers and floating-point numbers.
  585. * The modulo operator returns the remainder of the division x//y.
  586. For example, 7 % 4 is 3.
  587. * For floating-point numbers, the modulo operator returns the floating-point
  588. remainder of x//y, which is x - (x//y) * y
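A short check of // and % on negative and floating-point operands. The sample pairs are chosen for illustration.

```python
pairs = [(7, 4), (-7, 4), (7, -4), (7.5, 2)]
results = []
for x, y in pairs:
    q, r = x // y, x % y          # floor quotient and matching remainder
    assert x == q * y + r         # identity: x == (x//y)*y + x%y
    results.append((q, r))

truncated = int(-7 / 4)           # truncation rounds toward zero instead
print(results)                    # [(1, 3), (-2, 1), (-2, -1), (3.0, 1.5)]
print(truncated)                  # -1
```

The remainder takes the sign of the divisor, which is why -7 % 4 is 1 and 7 % -4 is -1.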
  589. Mathematical Functions:
  590. Function        Description
  591. abs(x)          Absolute value
  592. divmod(x, y)    Returns (x//y, x%y)
  593. pow(x, y)       Returns x**y
  594. pow(x, y, z)    Returns (x**y)%z
  595. round(x, n)     Rounds to the nearest multiple of 10 to the -nth power
  596. * round() function implements "banker's rounding". If the value being rounded is
  597. equally close to two multiples, it is rounded to the nearest even multiple.
  598. For example, round(0.5) is 0 and round(1.5) is 2.
  599. Bit Manipulation Operators:
  600. Operation   Description
  601. x << y      Left shift
  602. x >> y      Right shift
  603. x & y       Bitwise and
  604. x | y       Bitwise or
  605. x ^ y       Bitwise xor (exclusive or)
  606. ~x          Bitwise negation
  607. * One would commonly use these with binary integers.
  608. For example:
  609. a = 0b11001001
  610. mask = 0b11110000
  611. x = (a & mask) >> 4 # x = 0b1100 (12)
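Extending the example above with the other operators. The names high, low, with_bit1, toggled and negated are hypothetical.

```python
a = 0b11001001                 # 201
mask = 0b11110000

high = (a & mask) >> 4         # upper four bits: 0b1100
low = a & 0b00001111           # lower four bits: 0b1001
with_bit1 = a | (1 << 1)       # set bit 1:       0b11001011
toggled = a ^ 0b11111111       # flip low 8 bits: 0b00110110
negated = ~a                   # -(a + 1) for Python integers

print(bin(high), bin(low), bin(with_bit1), bin(toggled), negated)
```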
  612. Comparison Operators:
  613. Operation   Description
  614. x == y      Equal to
  615. x != y      Not equal to
  616. x < y       Less than
  617. x > y       Greater than
  618. x >= y      Greater than or equal to
  619. x <= y      Less than or equal to
  620. * The result of a comparison is a Boolean value True or False
  621. * A value is considered false if it is literally False, None, numerically zero,
  622. or empty. Otherwise, it's considered true.
  623. Logical Operators:
  624. Operator   Description
  625. x or y     If x is false, return y; otherwise, return x
  626. x and y    If x is false, return x; otherwise, return y
  627. not x      If x is false, return True; otherwise, return False
  628. * Python does not have increment (++) or decrement(--) operators.
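Because or and and return one of their operands rather than a strict Boolean, they are often used for defaults and guards. A brief sketch with made-up values:

```python
default_name = '' or 'anonymous'     # '' is false, so the right side is returned
first = 'Ada' or 'anonymous'         # 'Ada' is true, so it is returned as is
guard = [] and [][0]                 # [] is false; [][0] is never evaluated

# Truth value of a few objects: zero, None and empty containers are false.
truthy = [bool(v) for v in (False, None, 0, 0.0, '', [], {}, 'x', [0], 1)]

count = 0
count += 1                           # augmented assignment replaces count++
print(default_name, first, guard, count)
print(truthy)
```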
  629. pyenv Tutorial: https://amaral.northwestern.edu/resources/guides/pyenv-tutorial
  630. $ git clone git@github.com:pyenv/pyenv.git .pyenv
  631. $ vi .bashrc
  632. export PYENV_ROOT="$HOME/.pyenv"
  633. export PATH="$PYENV_ROOT/bin:$PATH"
  634. eval "$(pyenv init -)"
  635. $ source ~/.bashrc
  636. pyenv walkthrough:
  637. $ pyenv global
  638. system
  639. $ pyenv versions
  640. * system (set by /home/user/.pyenv/version)
  641. $ pyenv install --list # show all available python versions to install
  642. $ pyenv install 3.11.0 # install python version 3.11.0
  643. $ pyenv versions # pyenv now lists two python versions
  644. * system (set by /home/user/.pyenv/version)
  645. 3.11.0
  646. $ pyenv global 3.11.0 # to use python 3.11.0 as the global
  647. # To set a project-specific (local) version with pyenv
  648. $ pyenv global system
  649. $ mkdir project_directory
  650. $ cd project_directory/
  651. project_directory$ pyenv local 3.11.0
  652. project_directory$ python -V
  653. Python 3.11.0
  654. project_directory$ cd ..
  655. $ python -V
  656. Python 3.9.2
  657. # delete/remove a virtual environment (requires the pyenv-virtualenv plugin)
  658. $ pyenv virtualenv-delete <name>
  659. $ pyenv uninstall <version> # remove an installed Python version
  660. Creating a virtual environment:
  661. In Python 3.6+, the recommended way to create a virtual environment is to run:
  662. $ python3 -m venv /path/to/new/virtual/environment
  663. Make sure that python3 resolves to whichever version of python3 you'd like to
  664. bind to your virtual environment. For example, to create a new virtual
  665. environment for CS41 named cs41-env in your home directory, you could run:
  666. $ python3 -m venv ~/cs41-env
  667. Activation and Deactivation:
  668. To activate a virtual environment on macOS or Linux running bash or zsh, source
  669. the following path:
  670. $ source ~/cs41-env/bin/activate
  671. https://github.com/stanfordpython/python-handouts/blob/master/virtual-environment.md
  672. $ deactivate # deactivate a virtual environment
  673. which python3 # installation search path of python3
  674. /usr/bin/python3 # in Ubuntu 16.04
  675. When a script file is used, it is sometimes useful to be able to run the script
  676. and enter interactive mode afterwards. This can be done by passing -i before
  677. the script.
  678. $ python3 -i script.py
  679. !pydoc numpy # documentation for numpy in ipython
  680. Example:
  681. #!/usr/bin/env python3
  682. from math import sin
  683. import sys
  684. x = float(sys.argv[1])
  685. print('sin({0}) = {1}'.format(x, sin(x)))
  686. print('sin({x:g}) = {s:.3f}'.format(x=x, s=sin(x)))
  687. # curly brackets are placeholders - {0}, {1}
  688. Print Statements:
  689. x = 0.8
  690. print("The value of x is", x)
  691. print("sin(%f) = %f" %(x, sin(x)))
  692. print("sin(%.2f) = %.2f" %(x, sin(x)))
  693. print("sin({0}) = {1}".format(x, sin(x)))
  694. print("sin({x:g}) = {s:.3f}".format(x=x, s=sin(x)))
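The three formatting styles can be compared side by side. This sketch adds the f-string form (Python 3.6+) next to the % and .format() styles shown above; the variable names are illustrative.

```python
from math import sin

x = 0.8
line_percent = "sin(%.2f) = %.2f" % (x, sin(x))
line_format = "sin({0:.2f}) = {1:.2f}".format(x, sin(x))
line_fstring = f"sin({x:.2f}) = {sin(x):.2f}"
print(line_fstring)   # sin(0.80) = 0.72
```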
  695. Complex Numbers:
  696. a = 1+2j
  697. b = 3-5j
  698. print(a*b)
  699. Logarithm:
  700. from math import log10
  701. log10(5)
  702. Special characters in strings:
  703. s = "\"This is a quote\" and \n here comes a backslash: \\"
  704. print(s)
  705. "This is a quote" and
  706. here comes a backslash: \
  707. String concatenation:
  708. Strings can be glued together with the + and the * operators
  709. "hello "*3 + "world"
  710. 'hello hello hello world'
  711. quote = "I will not eat chips all day"
  712. (quote + ", ")*10 + quote
  713. "1"*10 # '1111111111'
  714. int("1")*10 # 10
  715. Slicing:
  716. You can extract a sub-string with the [start:end] slicing notation:
  717. quote[2:6]
  718. 'will'
  719. If the start argument is left out, the substring will start from the first
  720. character:
  721. quote[:6] # I will
  722. quote[7:] # not eat chips all day
  723. Negative indices can be used to index "from the right":
  724.  | c | h | i | p | s |
  725.  0   1   2   3   4   5
  726. -5  -4  -3  -2  -1
  727. 'chips'[1:-2]
  728. 'hi'
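The slicing rules above, collected into one runnable sketch (the strings are the ones used in these notes):

```python
quote = "I will not eat chips all day"
word = "chips"
print(quote[2:6])     # will
print(quote[:6])      # I will
print(quote[7:])      # not eat chips all day
print(word[-1])       # s
print(word[1:-2])     # hi
print(word[::-1])     # spihc
print(word[10:])      # prints an empty line: out-of-range slices do not raise
```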
  729. Python strings ar
  730. import numpy as np
  731. dir(np)
  732. help(np.zeros)
  733. chmod u+x file.py
  734. ./file.py
  735. import this # Zen of Python
  736. # exit
  737. CTRL-D # to exit python
  738. quit()/exit()
  739. Pip (recursive acronym for "Pip installs Packages") is a cross platform package
  740. manager for installing and managing Python packages (which can be found in the
  741. Python Package Index (PyPI))
  742. # string format codes
  743. %s string
  744. %c character
  745. %d integer
  746. %f floating-point number
  747. %o octal number
  748. %x hexadecimal number
  749. %e scientific notation
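A quick sketch of each code with the % operator; the values are arbitrary.

```python
print("%s has %d items" % ("cart", 3))   # cart has 3 items
print("%c" % 65)                          # A
print("%.2f" % 3.14159)                   # 3.14
print("%o %x" % (8, 255))                 # 10 ff
print("%e" % 12345.678)                   # 1.234568e+04
```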
  750. # system usage
  751. python3 -m pip install <pkgname> # install
  752. python3 -m pip install --upgrade <pkgname> # upgrade
  753. python3 -m pip search <pkgname> # search (no longer works: PyPI disabled its search API)
  754. python3 -m pip list | less -N # list
  755. python3 -m pip help | less -N # help
  756. pip3 --version # version
  757. # virtual environment using venv
  758. ## install
  759. pip install setuptools
  760. pip install wheel
  761. pip install numpy
  762. pip install scipy
  763. pip install sympy
  764. pip install gnuplotlib
  765. pip install matplotlib
  766. pip install matplotlib-venn
  767. pip install seaborn
  768. pip install altair
  769. pip install bokeh
  770. pip install dash
  771. pip install pandas
  772. pip install pylint
  773. pip install keras
  774. pip install theano
  775. pip install tensorflow
  776. pip install num2word
  777. pip install num2words
  778. pip install scikit-learn # imported as sklearn
  779. pip install imageio
  780. pip install mglearn
  781. pip install mpld3
  782. pip install mpmath
  783. pip install nltk
  784. pip install opencv-python
  785. pip install pygal
  786. pip install altair
  787. pip install umPlot
  788. pip install tilemapbase
  789. pip install PROJ
  790. pip install pydataset
  791. Ref: https://packaging.python.org/tutorials/installing-packages/
  792. Ref: https://pip.pypa.io/en/latest/ # pip documentation
  793. ## upgrade
  794. pip install --upgrade pip
  795. pip install --upgrade <pkgname>
  796. pip install --upgrade `pip list --outdated | awk 'NR>2 {print $1}'`
  797. for pip > 20.0:
  798. pip list --format freeze --outdated | sed 's/=.*//g' | xargs -n1 pip install -U
  799. ## downgrade
  800. pip install <pkgname>==<version>
  801. ## search
  802. pip search packagename # fails on current pip: PyPI disabled its search API
  803. ## show
  804. pip show packagename
  805. ## list
  806. pip list | less -N
  807. ## list pkgs [Documentation: https://pip.pypa.io/en/stable/]
  808. # list outdated packages
  809. pip list --outdated
  810. # list outdated packages using freeze formatting
  811. pip list --outdated --format=freeze
  812. # list uptodate packages
  813. pip list --uptodate
  814. # list editable projects
  815. pip list --editable
  816. # do not list globally-installed packages
  817. pip list --local
  818. # output packages installed in user-site
  819. pip list --user
  820. # restrict to the specified installation path
  821. pip list --path <path>
  822. # include pre-release and development versions
  823. pip list --pre
  824. # select the output format: columns, freeze, json
  825. pip list --format <list_format>
  826. # list packages that are not dependencies
  827. pip list --not-required
  828. # exclude editable package from output
  829. pip list --exclude-editable
  830. # include editable package from output
  831. pip list --include-editable
  832. # exclude specified package from the output
  833. pip list --exclude <package>
  834. # base URL of the python package index
  835. pip list --index-url <url>
  836. # extra URLs of package indexes to use
  837. pip list --extra-index-url <url>
  838. # ignore package index
  839. pip list --no-index
  840. # if a URL or path to an html file, then parse for links to archives such as
  841. # sdist (tar.gz) or wheel (.whl) files.
  842. pip list --find-links <url>
  843. for pip >= 10.0.1 (Python script):
  844. import subprocess, sys
  845. from importlib.metadata import distributions # replaces the deprecated pkg_resources
  846. packages = [dist.metadata["Name"] for dist in distributions()]
  847. subprocess.run([sys.executable, "-m", "pip", "install", "--upgrade", *packages])
  848. ## to upgrade all local packages
  849. pip install pip-review
  850. pip-review --local --interactive
  851. ## get complete information of installed package
  852. pip show <pkgname>
  853. ## help
  854. pip help | less
  855. (To see a list of all commands type)
  856. pip help install | less
  857. ## write all packages list to a file
  858. pip freeze > requirements.txt
  859. open the text file, replace the == with >=, and execute
  860. pip install -r requirements.txt --upgrade
  861. ## version
  862. pip --version
  863. ## uninstall
  864. pip uninstall <pkgname>
  865. pip uninstall <pkgname1> <pkgname2> <pkgname3>
  866. # pip
  867. Usage: sudo python3 -m pip <command> [options]
  868. Commands:
  869. install Install packages
  870. download Download packages
  871. uninstall Uninstall packages
  872. freeze Output installed packages in requirements format
  873. list List installed packages
  874. show Show information about installed packages
  875. check Verify installed packages have compatible dependencies
  876. search Search PyPI for packages
  877. wheel Build wheels from your requirements
  878. hash Compute hashes of package archives
  879. completion A helper command used for command completion
  880. help Show help for commands
  881. # to check installed
  882. python3 -c "import numpy"
  883. python3 -c "import sklearn"
  884. # pylint
  885. pylint file.py
  886. # pysparks
  887. pysparks file.py
  888. time = [time for time in np.linspace(0, 40, 10) for n in range(2)]
  889. print(time)
  890. [0.0, 0.0, ... 40.0, 40.0]
  891. time = [np.linspace(0, 40, 10) for n in range(2)]
  892. print(time)
  893. [array([0., ... 40.]), array([0., ... 40.])]
  894. time = 2*[np.linspace(0, 40, 10)] # result is same as above
  895. print(time)
  896. [array([0., ... 40.]), array([0., ... 40.])]
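The two forms print the same values, but they differ in one respect: the comprehension evaluates its expression once per iteration, while list repetition stores the same object twice. A sketch with plain lists (used here in place of NumPy arrays so it runs without extra packages; the same aliasing applies to arrays):

```python
row = [0.0, 10.0, 20.0]
by_comprehension = [list(row) for n in range(2)]   # two independent copies
by_repetition = 2 * [row]                          # the same object twice

row[0] = -1.0
print(by_comprehension[0][0])                      # 0.0, copies are unaffected
print(by_repetition[0][0], by_repetition[1][0])    # -1.0 -1.0
print(by_repetition[0] is by_repetition[1])        # True
```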
  897. # to check the architecture of python installed
  898. import platform, sys
  899. platform.architecture(), sys.maxsize
  900. import numpy as np
  901. np.__version__
  902. # for reference
  903. /usr/local/lib/python3.5/dist-packages/numpy/core/function_base.py
  904. # to find the python architecture
  905. import platform
  906. platform.architecture()
  907. Matplotlib gallery:
  908. https://matplotlib.org/gallery.html
  909. https://github.com/rasbt/matplotlib-gallery
  910. http://nbviewer.ipython.org/github/cs109/content/blob/master/
  911. lec_03_statistical_graphs.ipynb
  912. Matplotlib examples:
  913. https://matplotlib.org/1.3.1/examples/
  914. https://github.com/jbmouret/matplotlib_for_papers
  915. https://www.programcreek.com/python/example/102352/
  916. matplotlib.pyplot.ticklable_format
  917. # to list available style format
  918. import matplotlib.pyplot as plt
  919. plt.style.available
  920. style sheets reference:
  921. https://matplotlib.org/stable/gallery/style_sheets/style_sheets_reference.html
  922. # customizing matplotlib with style sheets and rcParams
  923. # Ref: https://matplotlib.org/stable/tutorials/introductory/customizing.html
  924. # to display where the currently active matplotlibrc file was loaded from
  925. >>> import matplotlib
  926. >>> matplotlib.matplotlib_fname()
  927. '/home/user/.venv/dsci/lib/python3.12/site-packages/matplotlib/mpl-data/matplotlibrc'
  928. The path /user/.../mpl-data/ is where we would like to go and locate the style sheets:
  929. $ cd .venv/dsci/lib/python3.12/site-packages/matplotlib/mpl-data/
  930. $ ls
  931. fonts/ images/ kpsewhich.lua matplotlibrc plot_directive/ sample_data/ stylelib/
  932. The directory of interest is stylelib/
  933. $ cd stylelib/
  934. $ ls
  935. bmh.mplstyle
  936. classic.mplstyle
  937. dark_background.mplstyle
  938. fast.mplstyle
  939. fivethirtyeight.mplstyle
  940. ggplot.mplstyle
  941. ...
  942. If we have write privilege to the above mentioned path for stylelib/, we can put
  943. the custom style sheet into the same directory and invoke the style sheet with:
  944. >>> plt.style.use("signature")
  945. If we don't have the write privilege, the only extra thing we would need to do
  946. is to include the full path of the custom style sheet:
  947. >>> plt.style.use("/home/user/signature.mplstyle")
  948. # invoke custom style sheet
  949. plt.style.use("signature") # apply globally
  950. with plt.style.context("signature"): # apply locally with context manager
  951.     plt.plot([1, 2, 3, 4])
  952. # return to default styling
  953. >>> import matplotlib as mpl
  954. >>> mpl.rcParams.update(mpl.rcParamsDefault) # reset via rcParams.update
  955. >>> import matplotlib.pyplot as plt
  956. >>> plt.style.use("default") # reset with default style sheet
  957. # matplotlib animation writers list
  958. >>> import matplotlib.animation as animation
  959. >>> animation.writers.list()
  960. ['pillow', 'ffmpeg', 'ffmpeg_file', 'imagemagick', 'imagemagick_file', 'html']
  961. Python - https://www.python.org/
  962. Python Course - https://www.python-course.eu/index.php
  963. Scipy Cookbook
  964. https://scipy-cookbook.readthedocs.io/items/FrequencySweptDemo.html
  965. https://book.pythontips.com/en/latest/index.html
  966. # Some examples
  967. x = [1,3,5,7,9]
  968. sum_squared = 0
  969. for i in range(len(x)): # index-based loop
  970.     sum_squared+=x[i]**2
  971. for y in x: # alternative: iterate directly (reset sum_squared = 0 first)
  972.     sum_squared+=y**2
  973. x = [1,3,5,7,9]
  974. sum_squared = sum([y**2 for y in x]) # pythonic way
  975. x = [1,2,3,4,5,6,7,8,9]
  976. even_squared = [y**2 for y in x if y%2==0]
  977. squared_cubed = [y**2 if y%2==0 else y**3 for y in x]
  978. # Dictionary comprehension
  979. x = [1,2,3,4,5,6,7,8,9]
  980. {k:k**2 for k in x}
  981. {k:k**2 for k in x if k%2==0}
  982. {k:k**2 if k%2==0 else k**3 for k in x}
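The three dictionary comprehensions as a runnable sketch. Note that the filter must test the loop variable k, not the list x. The result names are illustrative.

```python
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
squares = {k: k**2 for k in x}
even_squares = {k: k**2 for k in x if k % 2 == 0}      # filter: keeps even keys only
mixed = {k: k**2 if k % 2 == 0 else k**3 for k in x}   # conditional value: keeps all keys
print(even_squares)          # {2: 4, 4: 16, 6: 36, 8: 64}
print(mixed[3], mixed[4])    # 27 16
```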
  983. # Pandas
  984. import pandas as pd
  985. # reading data
  986. data = pd.read_csv('file.csv')
  987. data = pd.read_csv('file.csv', sep=';', encoding='latin-1', nrows=1000, skiprows=[2,5])
  988. # (read_csv, read_excel, read_clipboard, read_sql)
  989. # writing data
  990. data.to_csv('file.csv', index=None)
  991. # index=None (or index=False) writes the data without the row index. Without it, you
  992. # get an additional first column holding the index (0,1,2,... by default).
  993. # (.to_excel, .to_json, .to_pickle, .to_clipboard)
  994. # checking the data
  995. data.shape # gives (#rows, #columns)
  996. data.describe() # Compute basic statistics
  997. # seeing the data
  998. data.head(3)
  999. # Print the first 3 rows of the data. Similarly to .head(), .tail() will look at the
  1000. # last rows of the data.
  1001. data.loc[8] # Print the row whose index label is 8
  1002. data.loc[8, 'column_1'] # Print the value of 'column_1' in the row labelled 8
  1003. data.loc[range(4,6)] # Subset of the rows labelled 4 and 5 (6 is excluded)
  1004. # total sum per column and per row and saving it to Total
  1005. df.loc['Total',:] = df.sum(axis=0) # Total sum per column
  1006. df.loc[:,'Total'] = df.sum(axis=1) # Total sum per row
  1007. To be continued...
  1008. Source: https://towardsdatascience.com/be-a-more-efficient-data-scientist-today-master
  1009. -pandas-with-this-guide-ea362d27386
  1010. # 23 Pandas codes for Data Science
  1011. 1. read in a CSV dataset
  1012. pd.read_csv("csv_file") # pd.DataFrame.from_csv was removed in pandas 1.0
  1013. 2. read in an excel dataset
  1014. pd.read_excel("excel_file")
  1015. 3. write your data frame directly to csv
  1016. df.to_csv("data.csv", sep=",", index=False)
  1017. 4. basic dataset feature info
  1018. df.info()
  1019. 5. basic dataset statistics
  1020. print(df.describe())
  1021. 6. print data frame in a table
  1022. print(tabulate(print_table, headers=headers)) # needs: from tabulate import tabulate
  1023. 7. list the column names
  1024. df.columns
  1025. 8. drop missing data
  1026. df.dropna(axis=0, how='any')
  1027. 9. replace missing data
  1028. df.replace(to_replace=None, value=None)
  1029. 10. check for NANs
  1030. pd.isnull(object)
  1031. 11. drop a feature
  1032. df.drop('feature_variable_name', axis=1) # axis is either 0 for rows, 1 for columns
  1033. 12. convert object type to float
  1034. pd.to_numeric(df["feature_name"], errors='coerce')
  1035. 13. convert data frame to numpy array
  1036. df.to_numpy() # df.as_matrix() was removed in pandas 1.0
  1037. 14. get first "n" rows of a data frame
  1038. df.head(n)
  1039. 15. get data by feature name
  1040. df[feature_name] # column by name; df.loc[label] selects a row by index label
  1041. 16. apply a function to a data frame
  1042. df["height"].apply(lambda height: 2 * height)
  1043. or
  1044. def multiply(x):
  1045.     return x*2
  1046. df["height"].apply(multiply)
  1047. 17. renaming a column
  1048. df.rename(columns = {df.columns[2]: 'size'}, inplace=True)
  1049. 18. get the unique entries of a column
  1050. df["name"].unique()
  1051. 19. accessing sub-data frames
  1052. new_df = df[["name", "size"]]
  1053. 20. summary information about your data
  1054. df.sum() # sum of values in a data frame
  1055. df.min() # lowest value of a data frame
  1056. df.max() # highest value
  1057. df.idxmin() # index of the lowest value
  1058. df.idxmax() # index of the highest value
  1059. df.describe() # statistical summary of the data frame, with quartiles, median, etc.
  1060. df.mean() # average values
  1061. df.median() # median values
  1062. df.corr() # correlation between columns
  1063. df["size"].median() # to get these values for only one column
  1064. 21. sorting your data
  1065. df.sort_values(by="size", ascending=False) # a DataFrame needs the 'by' column
  1066. 22. boolean indexing
  1067. df[df["size"] == 5]
  1068. 23. selecting values
  1069. df.loc[[0], ['size']]
  1070. t = np.array([0.00, 0.31, 0.59, 0.90, 1.21, 1.48, 1.81])
  1071. dt = np.mean(np.diff(t)) # mean time step
  1072. st = np.std(np.diff(t)) # standard deviation of diff(t)
  1073. F = 1/dt # sampling frequency
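The same estimate can be sketched with the standard library only, which avoids the NumPy dependency. The name steps is illustrative; pstdev is chosen here because it is the population standard deviation, matching NumPy's default for std.

```python
from statistics import mean, pstdev

t = [0.00, 0.31, 0.59, 0.90, 1.21, 1.48, 1.81]
steps = [b - a for a, b in zip(t, t[1:])]   # same role as diff(t)
dt = mean(steps)       # average time step
st = pstdev(steps)     # spread of the time steps
F = 1 / dt             # estimated sampling frequency
print(round(dt, 4), round(st, 4), round(F, 3))
```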
  1074. time = np.linspace(0, 10, 5)        for n in range(5):
  1075. for n in range(len(time)):              time = np.linspace(0, 10, 5)
  1076.     print(n, time)                      print(n, time)
  1077. a = 1 + 2 + 3 + \        a = (1 + 2 + 3 +
  1078.     4 + 5 + 6 + \             4 + 5 + 6 +
  1079.     7 + 8 + 9                 7 + 8 + 9)
  1080. # 10 Python tips and tricks for writing better code
  1081. 1. Ternary Operator
  1082. In computer science, a ternary operator is an operator that takes three arguments (or
  1083. operands). The arguments and result can be of different types.
  1084. condition = True        condition = True
  1085. if condition:           x = 1 if condition else 0 # ternary condition
  1086.     x = 1
  1087. else:                   print(x)
  1088.     x = 0
  1089. print(x)
  1090. 2. working with large numbers
  1091. num1 = 10000000000 num1 = 10_000_000_000
  1092. num2 = 100000000 num2 = 100_000_000
  1093. total = num1 + num2 total = num1 + num2
  1094. print(total) print(f'{total:,}')
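A runnable version of the right-hand column, with one extra format specifier. Underscores in numeric literals and the "," and "_" format specifiers are standard since Python 3.6.

```python
num1 = 10_000_000_000
num2 = 100_000_000
total = num1 + num2
print(total)              # 10100000000
print(f"{total:,}")       # 10,100,000,000
print(f"{total:_}")       # 10_100_000_000
print(int("1_000") + 1)   # 1001, int() also accepts underscores in strings
```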
  1095. 3. context manager
  1096. f = open('test.txt', 'r')           with open('test.txt', 'r') as f:
  1097. file_contents = f.read()                file_contents = f.read()
  1098. f.close()
  1099. words = file_contents.split(' ')    words = file_contents.split(' ')
  1100. word_count = len(words)             word_count = len(words)
  1101. print(word_count)                   print(word_count)
  1102. # Opening two files at the same time with one 'with' statement
  1103. with open("file1", "r") as source, open("file2", "w") as destination:
  1104.     destination.write(source.read())
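A self-contained sketch of the context-manager pattern. It writes its own input into a temporary directory, so the file names here are hypothetical and nothing is left behind after the run.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:     # removed automatically on exit
    src = os.path.join(folder, "file1.txt")       # hypothetical file names
    dst = os.path.join(folder, "file2.txt")
    with open(src, "w") as f:
        f.write("the quick brown fox")
    with open(src, "r") as source, open(dst, "w") as destination:
        destination.write(source.read())
    with open(dst, "r") as f:
        word_count = len(f.read().split())
    file_closed = f.closed                        # True: 'with' closed the file

print(word_count, file_closed)   # 4 True
```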
  1105. 4. enumerate function - return both index and value
  1106. names = ['Corey', 'Chris', 'Dave', 'Travis']
  1107. index = 0
  1108. for name in names:
  1109.     print(index, name)
  1110.     index += 1
  1111. names = ['Corey', 'Chris', 'Dave', 'Travis']
  1112. # for index, name in enumerate(names): # default: count from 0
  1113. for index, name in enumerate(names, start=1):
  1114.     print(index, name)
  1115. 5. zip function (unpacking) - loop over two or more lists at once
  1116. names = ['Peter Parker', 'Clark Kent', 'Wade Wilson', 'Bruce Wayne']
  1117. heroes = ['Spiderman', 'Superman', 'Deadpool', 'Batman']
  1118. for index, name in enumerate(names):
  1119.     hero = heroes[index]
  1120.     print(f'{name} is actually {hero}')
  1121. names = ['Peter Parker', 'Clark Kent', 'Wade Wilson', 'Bruce Wayne']
  1122. heroes = ['Spiderman', 'Superman', 'Deadpool', 'Batman']
  1123. for name, hero in zip(names, heroes):
  1124.     print(f'{name} is actually {hero}')
  1125. names = ['Peter Parker', 'Clark Kent', 'Wade Wilson', 'Bruce Wayne']
  1126. heroes = ['Spiderman', 'Superman', 'Deadpool', 'Batman']
  1127. universes = ['Marvel', 'DC', 'Marvel', 'DC']
  1128. for name, hero, universe in zip(names, heroes, universes):
  1129.     print(f'{name} is actually {hero} from {universe}')
  1130. names = ['Peter Parker', 'Clark Kent', 'Wade Wilson', 'Bruce Wayne']
  1131. heroes = ['Spiderman', 'Superman', 'Deadpool', 'Batman']
  1132. universes = ['Marvel', 'DC', 'Marvel', 'DC']
  1133. # to print tuples of all three values
  1134. for value in zip(names, heroes, universes):
  1135.     print(value)
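A few related zip idioms, sketched with shortened lists. zip stops at the shortest input, itertools.zip_longest pads instead, dict(zip(...)) builds a mapping, and zip(*pairs) reverses the pairing. The fill value 'unknown' and the name identity are illustrative.

```python
from itertools import zip_longest

names = ['Peter Parker', 'Clark Kent', 'Wade Wilson']
heroes = ['Spiderman', 'Superman']          # one entry short on purpose

short = list(zip(names, heroes))            # zip stops at the shortest input
padded = list(zip_longest(names, heroes, fillvalue='unknown'))
print(len(short), len(padded))              # 2 3
print(padded[-1])                           # ('Wade Wilson', 'unknown')

identity = dict(zip(names, heroes))         # pairs become key/value entries
print(identity['Clark Kent'])               # Superman

pairs = [('a', 1), ('b', 2)]
letters, numbers = zip(*pairs)              # "unzip" with the * operator
print(letters, numbers)                     # ('a', 'b') (1, 2)
```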
  1136. 6. To be continued ...
  1137. # Pylint
  1138. pylint program.py
  1139. # Valgrind in Python
  1140. valgrind python3 file.py # to use valgrind
  1141. valgrind --tool=massif python3 file.py # use of memory during program execution
  1142. # tabs into space (to fix python indentation)
  1143. pip install autopep8
  1144. autopep8 script.py # print only
  1145. autopep8 -i script.py # write file
  1146. On most UNIX-like systems, you can also run:
  1147. expand -t4 oldfile.py > newfile.py
  1148. from the command line, changing the number if you want to replace tabs with a
  1149. number of spaces other than 4. You can easily write a shell script to do this
  1150. with a bunch of files at once, retaining the original file names.
  1151. # profiling in python
  1152. python3 -m cProfile -o file.prof file.py
  1153. Using the -o switch will output the profiler results to the file.prof
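The profiler can also be driven from inside a script with the standard cProfile and pstats modules. The sketch below profiles a hypothetical function slow_sum and prints the five most expensive entries; a saved file.prof can be inspected the same way with pstats.Stats("file.prof").

```python
import cProfile
import io
import pstats

def slow_sum(n):                      # hypothetical function to profile
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)   # five most expensive entries
report = stream.getvalue()
print(report)
```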
  1154. # line_profiler and kernprof (Ref: https://github.com/pyutils/line_profiler)
  1155. pip install line_profiler
  1156. kernprof will create an instance of LineProfiler and insert it into the
  1157. __builtins__ namespace with the name profile. It has been written to be used
  1158. as a decorator, so in your script, you decorate the functions you want to
  1159. profile with @profile.
  1160. @profile
  1161. def function(x):
  1162.     ...
  1163. kernprof -l program.py
  1164. mpiexec -np 2 kernprof -l program.py
  1165. The default behavior of kernprof is to put the results into a binary file,
  1166. program.py.lprof. You can tell kernprof to immediately view the formatted
  1167. results at the terminal with the [-v/--view] option. Otherwise, you can
  1168. view the results later like so:
  1169. $ python -m line_profiler program.py.lprof
  1170. kernprof -lv program.py
  1171. mpiexec -np 2 kernprof -lv program.py
  1172. kernprof also works with cProfile, its third-party incarnation lsprof, or the
  1173. pure-Python profile module depending on what is available.
  1174. # memory profiler (Ref: https://github.com/pythonprofilers/memory_profiler)
  1175. pip install memory_profiler
  1176. use mprof to generate a full memory usage report of your executable and to plot it:
  1177. mprof run program.py
  1178. mprof plot
  1179. line-by-line memory usage:
  1180. @profile
  1181. def function():
  1182.     ...
  1183. python -m memory_profiler program.py
  1184. a function decorator is also available:
  1185. from memory_profiler import profile
  1186. @profile
  1187. def function():
  1188.     ...
  1189. In this case the script can be run without specifying -m memory_profiler in the
  1190. command line.
  1191. In the function decorator, you can specify the precision as an argument to the decorator
  1192. function:
  1193. from memory_profiler import profile
  1194. @profile(precision=4)
  1195. def function():
  1196.     ...
  1197. if a python script with decorator @profile is called using -m memory_profiler in the
  1198. command line, the precision parameter is ignored.
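If memory_profiler is not installed, the standard-library tracemalloc module gives a rough alternative. This is a different tool, not part of memory_profiler, and the allocation below is only there to have something to measure.

```python
import tracemalloc

tracemalloc.start()
data = [list(range(1000)) for _ in range(100)]   # allocate something measurable
current, peak = tracemalloc.get_traced_memory()  # bytes currently held / at peak
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

print(f"current = {current / 1024:.1f} KiB, peak = {peak / 1024:.1f} KiB")
for stat in snapshot.statistics("lineno")[:3]:   # top three allocating lines
    print(stat)
```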
  1199. # complex number
  1200. >>> complex(2,3)
  1201. (2+3j)
  1202. >>> z = complex(input('Enter a complex number: '))
  1203. >>> z = 2+3j
  1204. >>> z.real
  1205. 2.0
  1206. >>> z.imag
  1207. 3.0
  1208. >>> z.conjugate()
  1209. (2-3j)
  1210. # fractional number
  1211. >>> from fractions import Fraction
  1212. >>> Fraction(2, 3)
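  1213. Fraction(2, 3)
  1214. >>> Fraction('2/3') # pass a string; Fraction(2/3) would convert the inexact float
  1215. Fraction(2, 3)
  1216. >>> f = Fraction(input('Enter a fractional number: '))
  1217. >>> type(f) # <class 'fractions.Fraction'>
  1218. >>> int('4.8')
  1219. ValueError: invalid literal for int() with base 10: '4.8'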
  1220. >>> int(float('4.8'))
  1221. 4
  1222. >>> N = 4.0
  1223. >>> N.is_integer()
  1224. True
  1225. >>> N = 4.8
  1226. >>> N.is_integer()
  1227. False
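A short sketch of the Fraction behaviour noted above. Building a Fraction from a float captures the float's exact binary value, so a string or limit_denominator() is used when the intended ratio is 2/3. The bound of 100 is arbitrary.

```python
from fractions import Fraction

third = Fraction(1, 3)
print(third + Fraction(1, 6))            # 1/2
print(Fraction("2/3"))                   # 2/3
print(Fraction(0.5))                     # 1/2, exact because 0.5 is a binary fraction
approx = Fraction(2 / 3).limit_denominator(100)
print(approx)                            # 2/3
print(Fraction(2 / 3) == Fraction(2, 3)) # False, the float 2/3 is not exact
print(int(float("4.8")))                 # 4, truncates toward zero
print((4.0).is_integer(), (4.8).is_integer())   # True False
```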
  1228. # random number generation
  1229. np.random.randint(2, size=10)
  1230. array([1, 0, 0, 0, 1, 1, 0, 0, 1, 0])
  1231. np.random.randint(1, size=10)
  1232. array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
  1233. # generate a 2 x 4 array of ints between 0 and 4, inclusive:
  1234. np.random.randint(5, size=(2, 4))
  1235. array([[4, 0, 2, 1],
  1236. [3, 2, 2, 0]])
  1237. Ref: https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.random.randint.html
  1238. # Array
  1239. a = np.array([11, 12, 13, 14])
  1240. b = np.array([1, 2, 3, 4])
  1241. c = a - b # array subtraction
  1242. b**2 # squaring an array
  1243. np.cos(b) # a trigonometric function performed on the array
  1244. b < 2 # conditional operations
  1245. # help
  1246. import numpy
  1247. help(numpy)
  1248. # use 'enumerate' function in loops instead of creating an 'index' variable
  1249. Harmful:
  1250. my_container = ['Larry', 'Moe', 'Curly']
  1251. index = 0
  1252. for element in my_container:
  1253.     print('{} {}'.format(index, element))
  1254.     index += 1
  1255. Idiomatic:
  1256. my_container = ['Larry', 'Moe', 'Curly']
  1257. for index, element in enumerate(my_container):
  1258.     print('{} {}'.format(index, element))
  1259. # Tips and Tricks in Python
  1260. 1. Swapping of Two numbers
  1261. x, y = 10, 20
  1262. print(x, y)
  1263. Result: 10 20
  1264. x, y = y, x
  1265. print(x, y)
  1266. Result: 20 10
  1267. 2. Reversing a string in Python
  1268. a = "GeeksForGeeks"
  1269. print("Reverse is", a[::-1])
  1270. Result: Reverse is skeeGroFskeeG
  1271. 3. Create a single string from all the elements in list
  1272. a = ["Geeks", "For", "Geeks"]
  1273. print(" ".join(a))
  1274. Result: Geeks For Geeks
  1275. 4. Chaining of Comparison Operators
  1276. n = 10
  1277. print(1 < n < 20) # same as (1 < n) and (n < 20); the bound 20 is assumed
  1278. print(1 > n <= 9) # same as (1 > n) and (n <= 9)
  1279. Result: True
  1280. False
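A sketch of how chained comparisons evaluate: a < b < c means (a < b) and (b < c), with the middle operand evaluated only once. The helper middle() and the list calls are hypothetical and exist only to count evaluations.

```python
n = 10
print(1 < n <= 10)          # True, same as (1 < n) and (n <= 10)
print(1 > n <= 9)           # False, the first comparison already fails

calls = []

def middle():               # hypothetical helper to show single evaluation
    calls.append(1)
    return 5

in_range = 0 < middle() < 10
print(in_range)             # True
print(len(calls))           # 1, the middle operand is evaluated once
```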
  1281. 5. Print the file path of imported modules
  1282. import os
  1283. import socket
  1284. print(os)
  1285. print(socket)
  1286. 6. Use of Enums
  1287. class MyName:
  1288.     Geeks, For, Geeks = range(3)
  1289. print(MyName.Geeks)
  1290. print(MyName.For)
  1291. print(MyName.Geeks)
  1292. Result: 2
  1293. 1
  1294. 2
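The class above only imitates an enumeration, and the repeated name Geeks is silently rebound to 2. The standard enum module gives distinct, named members instead. A minimal sketch with a hypothetical Color enum:

```python
from enum import Enum, auto

class Color(Enum):        # hypothetical example enum
    RED = auto()          # auto() numbers members from 1
    GREEN = auto()
    BLUE = auto()

print(Color.RED)          # Color.RED
print(Color.RED.value)    # 1
print(Color(2))           # Color.GREEN
print(Color["BLUE"].name) # BLUE
print([c.name for c in Color])   # ['RED', 'GREEN', 'BLUE']
```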
  1295. 7. Return multiple values from functions
  1296. def x():
  1297.     return 1, 2, 3, 4
  1298. a, b, c, d = x()
  1299. print(a, b, c, d)
  1300. Result: 1 2 3 4
  1301. 8. Find the most frequent value in a list
  1302. test = [1, 2, 3, 4, 2, 2, 3, 1, 4, 4, 4]
  1303. print(max(set(test), key = test.count))
  1304. Result: 4
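The max(set(...), key=test.count) idiom rescans the list once per distinct value. collections.Counter counts in a single pass and also reports the frequency. A sketch on the same data:

```python
from collections import Counter

test = [1, 2, 3, 4, 2, 2, 3, 1, 4, 4, 4]
counts = Counter(test)
value, frequency = counts.most_common(1)[0]
print(value, frequency)        # 4 4
print(counts.most_common(2))   # [(4, 4), (2, 3)]
```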
  1305. # Lists
  1306. empty_list = list()
  1307. also_empty_list = []
  1308. zeros_list = [0] * 5
  1309. print(zeros_list)
  1310. >>> [0, 0, 0, 0, 0]
  1311. empty_list.append(1)
  1312. print(empty_list)
  1313. >>> [1]
  1314. print(len(empty_list))
  1315. >>> 1
  1316. # List indexing
  1317. list_var = list(range(10)) # in Python 3, range() alone is not a list
  1318. print(list_var)
  1319. >>> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
  1320. print(list_var[4])
  1321. >>> 4
  1322. print(list_var[4:7])
  1323. >>> [4, 5, 6]
  1324. print(list_var[0::3]) # empty index means to the beginning/end
  1325. >>> [0, 3, 6, 9]
  1326. print(list_var)
  1327. >>> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
  1328. print(list_var[-1])
  1329. >>> 9
  1330. print(list_var[::-1])
  1331. >>> [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
  1332. print(list_var[3:1:-1])
  1333. >>> [3, 2]
  1334. # Dictionaries
  1335. empty_dict = dict()
  1336. also_empty_dict = {}
  1337. filled_dict = {3: 'Hello, ', 4: 'World!'}
  1338. print(filled_dict[3] + filled_dict[4])
  1339. >>> Hello, World!
  1340. filled_dict[5] = 'New String'
  1341. print(filled_dict)
  1342. >>> {3: 'Hello, ', 4: 'World!', 5: 'New String'}
  1343. del filled_dict[3]
  1344. print(filled_dict)
  1345. >>> {4: 'World!', 5: 'New String'}
  1346. print(len(filled_dict))
  1347. >>> 2
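A few more dictionary operations that commonly go with the ones above, sketched on hypothetical data: get() with a default, setdefault(), iteration over items(), and pop().

```python
ages = {"Larry": 40, "Moe": 38}            # hypothetical data
fallback = ages.get("Curly", "unknown")    # no KeyError for a missing key
print(fallback)                            # unknown
ages.setdefault("Curly", 35)               # insert only if the key is absent
for name, age in ages.items():             # iterate over key/value pairs
    print(name, age)
removed = ages.pop("Moe")                  # remove a key and return its value
print(removed, "Moe" in ages, len(ages))   # 38 False 2
```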
  1348. # Functions, Lambda functions
  1349. def add_numbers(a, b):
  1350.     return a + b
  1351. print(add_numbers(3, 4))
  1352. >>> 7
  1353. lambda_add_numbers = lambda a, b: a + b
  1354. print(lambda_add_numbers(3, 4))
  1355. >>> 7
  1356. # Loops, List and Dictionary Comprehensions
  1357. for i in range(10):
  1358.     print('Looping %d' %i)
  1359. >>> Looping 0
  1360. ...
  1361. >>> Looping 9
  1362. filled_list = [a/2 for a in range(10)]
  1363. print(filled_list)
  1364. >>> [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
  1365. filled_dict = {a:a**2 for a in range(5)}
  1366. print(filled_dict)
  1367. >>> {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
  1368. # zip
  1369. L1 = [1,2,3,4]
  1370. L2 = [5,6,7,8]
  1371. In: list(zip(L1,L2))
  1372. Out: [(1,5), (2,6), (3,7), (4,8)]
  1373. In: for (x,y) in zip(L1,L2):
  1374.     print(x, y, '--', x+y)
  1375. Out: 1 5 -- 6
  1376. ........
  1377. 4 8 -- 12
  1378. # Python Lambda Functions
  1379. def average(x, y):
  1380.     return (x+y)/2
  1381. avg = average(2, 5)
  1382. average = lambda x, y: (x+y)/2
  1383. var = [1, 5, -2, 3, -7, 4]
  1384. sorted_var = sorted(var)
  1385. Ref: https://www.pythonforthelab.com/blog/intro-to-python-lambda-functions/
  1386. # Beyond the for-loop
  1387. integers = range(0, 10)
  1388. even = []
  1389. for i in integers:
  1390.     if i%2 == 0:
  1391.         even.append(i)
  1392. >>> even
  1393. [0, 2, 4, 6, 8]
  1394. integers = range(0, 10)
  1395. even = filter(lambda x: x%2 == 0, integers) # an iterator; use list(even) to see the values
  1396. integers = range(0, 10)
  1397. def is_even(x):
  1398.     return x%2 == 0
  1399. even = filter(is_even, integers)
  1400. integers = range(0, 10)
  1401. even = [x for x in integers if x%2 == 0]
  1402. # Map
  1403. integers = range(0, 10)
  1404. list(map(lambda x: x*x, integers))
  1405. >>> [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
  1406. integers = range(0, 10)
  1407. [x*x for x in integers]
  1408. >>> [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
  1409. # Reduce
  1410. from functools import reduce
  1411. integers = range(1, 10)
  1412. reduce(lambda x, y: x*y, integers)
  1413. >>> 362880
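For products specifically, math.prod (Python 3.8+) does the same job as the reduce call above. The sketch also shows reduce with an initial value, which covers empty input, and a non-arithmetic reduction. The name running_max is illustrative.

```python
from functools import reduce
import math
import operator

integers = range(1, 10)
product = reduce(operator.mul, integers)     # same as the lambda version
print(product)                               # 362880
print(math.prod(integers))                   # 362880
empty_product = reduce(operator.mul, [], 1)  # the initial value covers empty input
print(empty_product)                         # 1
running_max = reduce(lambda best, x: x if x > best else best, [3, 9, 2])
print(running_max)                           # 9
```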
  1414. # Sum
  1415. integers = range(1, 10)
  1416. sum(integers)
  1417. >>> 45
  1418. # Any and All
  1419. any([False, True, False])
  1420. >>> True
  1421. all([False, True, False])
  1422. >>> False
  1423. To check for even numbers in a list:
  1424. integers = range(1, 10)
  1425. any(x%2 == 0 for x in integers)
  1426. >>> True
  1427. all(x%2 == 0 for x in integers)
  1428. >>> False
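Two edge cases worth remembering: on an empty iterable any() is False and all() is True, and both stop at the first deciding element. A brief sketch (the list seen only records which elements were examined):

```python
print(any([]), all([]))        # False True

seen = []

def check(x):                  # records every element that gets examined
    seen.append(x)
    return x % 2 == 0

found = any(check(x) for x in range(1, 10))
print(found, seen)             # True [1, 2]
```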
  1429. # List Methods
  1430. list.append(x)
  1431. list.extend(iter)
  1432. list.insert(i, x)
  1433. list.remove(x)
  1434. list.clear()
  1435. list.copy()
  1436. list.reverse()
  1437. list.pop([i])
  1438. list.sort()
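Each method from the list above applied in turn to one small list; the comments show the list after each step.

```python
items = [3, 1, 2]
items.append(4)            # [3, 1, 2, 4]
items.extend([5, 6])       # [3, 1, 2, 4, 5, 6]
items.insert(0, 0)         # [0, 3, 1, 2, 4, 5, 6]
items.remove(3)            # [0, 1, 2, 4, 5, 6]
last = items.pop()         # last = 6, items = [0, 1, 2, 4, 5]
first = items.pop(0)       # first = 0, items = [1, 2, 4, 5]
backup = items.copy()      # shallow copy
items.reverse()            # [5, 4, 2, 1]
items.sort()               # [1, 2, 4, 5]
print(items, last, first)  # [1, 2, 4, 5] 6 0
items.clear()
print(items, backup)       # [] [1, 2, 4, 5]
```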
  1439. # format text
  1440. greeting = 'Hello'
  1441. name = 'Raman'
  1442. message = '{}, {}. Welcome!'.format(greeting, name)
  1443. message = f'{greeting}, {name}. Welcome!'
  1444. print(message)
  1445. $ Hello, Raman. Welcome!
  1446. message = f'{greeting}, {name.upper()}. Welcome!'
  1447. print(message)
  1448. $ Hello, RAMAN. Welcome!
  1449. print(dir(name)) # dir function
  1450. print(help(str)) # help function
  1451. print(help(str.lower)) # help function
  1452. Creating a vector:
  1453. vector_row = np.array([1, 2, 3])
  1454. vector_col = np.array([[1], [2], [3]])
  1455. Creating a matrix:
  1456. matrix = np.array([[1,2], [1,2], [1,2]])
  1457. matrix_obj = np.asmatrix([[1,2], [1,2], [1,2]]) # np.mat was removed in NumPy 2.0
  1458. # Seaborn
  1459. import seaborn as sns
  1460. splot = sns.regplot(x='fieldname1', y='fieldname2', data=dfname, fit_reg=False)
  1461. splot.set(xscale='log') # logarithmic x-axis
  1462. Note: https://cmdlinetips.com/2019/04/how-to-make-scatter-plot-in-python/
  1463. # convert an array into a list
  1464. np.array([[1,2,3], [4,5,6]]).tolist()
  1465. [[1,2,3], [4,5,6]]
  1466. # num2words
  1467. from num2words import num2words
  1468. print(num2words(36)) # thirty-six
  1469. print(num2words(36, to = 'ordinal')) # thirty-sixth
  1470. print(num2words(36, to = 'ordinal_num')) # 36th
  1471. print(num2words(36, to = 'year'))
  1472. print(num2words(36, to = 'currency')) # zero euro, thirty-six cents
  1473. print(num2words(36, lang = 'es')) # treinta y seis
  1474. # to find out which file you're importing
  1475. import math
  1476. print(math.__file__)
  1477. /usr/local/lib/python3.12/lib-dynload/math.cpython-312-x86_64-linux-gnu.so
  1478. import numpy
  1479. print(numpy.__file__)
  1480. /home/saran/.venv/fdtd/lib/python3.12/site-packages/numpy/__init__.py
  1481. # Checking which indices have null for column c
  1482. pd.isnull(df['c'])
  1483. # Checking which indices don't have null for column c
  1484. pd.notnull(df['c'])
  1485. # Selecting rows of df where c is not null
  1486. df[pd.notnull(df['c'])]
  1487. # Selecting rows of df where c is null
  1488. df[pd.isnull(df['c'])]
  1489. # Selecting rows of column c of df where c is not null
  1490. df['c'][pd.notnull(df['c'])]
  1491. What to do about outliers?
  1492. 1. Remove the case. If you have many cases and the outlier has no apparent
  1493. explanation, or the explanation is that it is an error, you can simply get rid of it.
  1494. 2. Assign the next value nearer to the median in place of the outlier value.
  1495. You will find that this approach leaves the distribution close to what it would
  1496. be without the value. You can use this approach if you have few cases and are
  1497. trying to maintain the number of observations you do have.
  1498. 3. Calculate the mean of the remaining values without the outlier and assign
  1499. that to the outlier case. While I have seen this frequently, I don't really
  1500. understand its justification and I think it distorts the distribution more than
  1501. the previous solution.
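The three options above can be sketched in plain Python. This is only an illustration: the sample values, the 1.5 * IQR rule used to flag the outlier, and the helper names are assumptions made for the example, not part of the original advice.

```python
import statistics

# Hypothetical sample with one obvious outlier (95).
values = [10, 12, 11, 13, 12, 11, 95]

# Assumption: flag outliers with the common 1.5 * IQR rule.
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

def is_outlier(v):
    return v < low or v > high

inliers = [v for v in values if not is_outlier(v)]

# Option 1: remove the case.
removed = inliers

# Option 2: replace each outlier with the nearest remaining value
# (the largest inlier for a high outlier, the smallest for a low one).
def nearest_inlier(v):
    return max(inliers) if v > high else min(inliers)

clipped = [nearest_inlier(v) if is_outlier(v) else v for v in values]

# Option 3: replace each outlier with the mean of the remaining values.
inlier_mean = statistics.mean(inliers)
mean_filled = [inlier_mean if is_outlier(v) else v for v in values]

print(removed)
print(clipped)
print(mean_filled)
```

Option 2 keeps the replaced value at the edge of the observed range, while option 3 pulls it into the middle of the distribution, which is the distortion the note above warns about.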
  1502. EDA is one of the most crucial aspects of any data science project, and an
  1503. absolute must before starting any machine learning work. Establishing a high
  1504. degree of confidence in the validity, interpretation and applicability of the
  1505. data set, and of the project in general, makes the desired outcomes far more
  1506. likely.
  1507. # box-plot
  1508. So far we analysed univariate outliers, i.e. we used only the DIS column to
  1509. check for outliers. Multivariate outlier analysis is also possible. Can we do
  1510. it with a box-plot?
  1511. It depends: if you have a categorical variable, you can plot any continuous
  1512. variable against it and look for outliers within each category.
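A box plot drawn per category flags the points that fall outside the whiskers of their own group. The sketch below does the same check numerically with the standard library. The categories, the values and the 1.5 * IQR whisker rule are assumptions for illustration; the DIS data from the note is not reproduced here.

```python
import statistics
from collections import defaultdict

# Hypothetical (category, value) observations.
rows = [
    ("A", 1.0), ("A", 1.2), ("A", 1.1), ("A", 0.9), ("A", 1.0), ("A", 5.0),
    ("B", 4.8), ("B", 5.1), ("B", 5.0), ("B", 4.9), ("B", 5.2), ("B", 5.0),
]

# Group the continuous values by their category.
groups = defaultdict(list)
for category, value in rows:
    groups[category].append(value)

def whisker_bounds(values):
    """Return the (low, high) limits of a box plot using the 1.5 * IQR rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Collect, per category, the values lying outside that category's whiskers.
outliers = {}
for category, values in groups.items():
    low, high = whisker_bounds(values)
    outliers[category] = [v for v in values if v < low or v > high]

print(outliers)
```

The value 5.0 is flagged in category A but is perfectly ordinary in category B, which is the kind of outlier a single pooled box plot would not show.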
  1513. # Visualization:
  1514. Matplotlib - https://matplotlib.org
  1515. Seaborn - https://seaborn.pydata.org
  1516. Plotly - https://plot.ly
  1517. Bokeh - https://bokeh.pydata.org
  1518. Pygal - http://pygal.org/en
  1519. Dash - https://plot.ly/products/dash
  1520. Altair - https://altair-viz.github.io
  1521. http://www.intellspot.com/python-visualization-tools/
  1522. Python Programming Books:
  1523. 01. Automate the Boring Stuff with Python by Al Sweigart
  1524. 02. Effective Python: 59 Specific ways to write better Python by Brett Slatkin
  1525. 03. Fluent Python: Clear, Concise and Effective Programming by Luciano Ramalho
  1526. 04. Hello Web App by Tracy Osborn
  1527. 05. Invent Your Own Computer Games with Python by Al Sweigart
  1528. 06. Learning Python by Mark Lutz and David Ascher
  1529. 07. Learning Python: Learn to code like a professional with Python by Fabrizio Romano
  1530. 08. Learn to Program with Python 3 by Irv Kalb
  1531. 09. Programming Arcade Games with Python and Pygame by Paul Craven
  1532. 10. Python 101 by Mike Driscoll
  1533. 11. Python 3 Object-Oriented Programming by Dusty Phillips
  1534. 12. Python Cookbook by David Beazley and Brian K. Jones
  1535. 13. Python Scripting with Scribus by Greg Pittman
  1536. 14. Python Tricks: The Book by Dan Bader
  1537. 15. Scaling Python by Julien Danjou
  1538. 16. The Hacker's Guide to Python by Julien Danjou
  1539. 17. The Quick Python Book by Naomi Ceder
  1540. 18. Treading on Python: Volume 2 Intermediate Python by Matt Harrison
  1541. """
  1542. Error: After October 2020 you may experience errors when installing or updating packages.
  1543. This is because pip will change the way that it resolves dependency conflicts.
  1544. We recommend you use --use-feature=2020-resolver to test your packages with the new
  1545. resolver before it becomes the default.
  1546. """
  1547. # use r code in python
  1548. from rpy2.robjects import pandas2ri
  1549. pandas2ri.activate()
  1550. from rpy2.robjects import r
  1551. r.data('iris')
  1552. df = pandas2ri.rpy2py(r['iris'])
  1553. df.to_csv("/path/to/file.csv", sep=',')
  1554. df.to_csv("/path/to/file.csv", sep=',', encoding='utf-8')
  1555. # save session in IPython
  1556. 1. %logstart # saves IPython session
  1557. 2. %save session.py 1-N
  1558. 3. %%file session.py
  1559. 4. import readline
  1560. readline.write_history_file("/home/saran/session.py")
  1561. 5. %store session.py
  1562. 6. %store -r session.py
  1563. # linux commands in IPython
  1564. ls
  1565. cd
  1566. pwd
  1567. %run script.py
  1568. %load script.py
  1569. # jupyter qtconsole
  1570. jupyter qtconsole --style monokai
  1571. jupyter qtconsole -h
  1572. # jupyter notebook remote access
  1573. jupyter notebook --ip=0.0.0.0 --port=8080 --no-browser
  1574. First make sure you install Jupyter notebook in both remote and local host.
  1575. In remote host, open the terminal, change directory to where you have your notebooks and type:
  1576. jupyter notebook --no-browser --port=8889
  1577. In your local computer, open terminal and then type
  1578. ssh -N -f -L localhost:8888:localhost:8889 user@remote_hostname
  1579. Now open web browser and type:
  1580. localhost:8888
  1581. jupyter --paths
  1582. jupyter kernelspec list
  1583. # empty line in markdown
  1584. &nbsp;
  1585. # using find command
  1586. find . -type f -name iris.csv
  1587. # UnicodeDecodeError when reading csv file in pandas with python
  1588. import pandas as pd
  1589. df = pd.read_csv('/path/to/file/file_name.csv', engine='python')
  1590. alternate solution:
  1591. - open the csv file in vi editor
  1592. - save the file in utf-8 format
  1593. then,
  1594. import pandas as pd
  1595. df = pd.read_csv('/path/to/file/file_name.csv', encoding='utf-8')
  1596. other encodings include, 'cp1252', 'ISO-8859-1'
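When the encoding of a file is unknown, one option is to try the candidates in order and keep the first one that decodes cleanly. A minimal sketch using only the standard library; the helper name detect_encoding, the sample file and the candidate order are assumptions. The result could then be passed to pd.read_csv(..., encoding=...).

```python
import os
import tempfile

def detect_encoding(path, candidates=("utf-8", "cp1252", "ISO-8859-1")):
    """Return the first candidate encoding that decodes the whole file."""
    with open(path, "rb") as f:
        raw = f.read()
    for encoding in candidates:
        try:
            raw.decode(encoding)
            return encoding
        except UnicodeDecodeError:
            continue
    return None

# Hypothetical CSV saved as cp1252: the byte 0xE9 ("é") is not valid UTF-8 here.
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write("name,city\nRené,Orléans\n".encode("cp1252"))
    csv_path = tmp.name

encoding = detect_encoding(csv_path)
print(encoding)
os.remove(csv_path)
```

ISO-8859-1 accepts every byte value, so it should stay last in the list: it never fails, but it may decode the text incorrectly.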
  1597. # unpacking data
  1598. a, b, c = 1, 2, 3 # print(a, b, c) = 1 2 3
  1599. a, b, c = [1, 2, 3] # print(a, b, c) = 1 2 3
  1600. # dictionary
  1601. mydict = {'a': 1, 'b': 2, 'c': 3}
  1602. mydict.keys() # keys
  1603. mydict.values() # values
  1604. for n in mydict.keys():
  1605.     print(mydict[n])
  1606. 1
  1607. 2
  1608. 3
  1609. # convert python list to numpy array
  1610. import numpy as np
  1611. mylist = [1, 2, 3, 4, 5]
  1612. myarry = np.array(mylist)
  1613. print(myarry)
  1614. print(myarry.shape)
  1615. mylist = [[1, 2, 3], [4, 5, 6]]
  1616. myarry = np.array(mylist)
  1617. print(myarry)
  1618. print(myarry.shape)
  1619. print('First row:', myarry[0])
  1620. print('Last row:', myarry[-1])
  1621. print('Specific row and column:', myarry[0,2])
  1622. print('All last column values:', myarry[:,-1])
  1623. # number of dimensions
  1624. myarry.ndim
  1625. # size of an array
  1626. myarry.size
  1627. # shape of an array
  1628. myarry.shape
  1629. # set option in pandas
  1630. from pandas import set_option
  1631. set_option('display.width', 100)
  1632. set_option('display.precision', 3)
  1633. df.describe()
  1634. # plot groupby (for classification problems)
  1635. df.groupby('colname').size()
  1636. plt.plot(df.groupby('colname').size())
  1637. plt.show()
  1638. # groupby size, count, describe
  1639. df.groupby('colname').size()
  1640. df.groupby('colname').count()
  1641. df.groupby('colname').describe()
  1642. # correlations between attributes
  1643. Some machine learning algorithms like linear and logistic regression can suffer poor
  1644. performance if there are highly correlated attributes in your dataset.
  1645. df.corr()
  1646. df.corr(method = 'pearson') # assumes a normal distribution of the attributes involved
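To act on the note above, one can list the attribute pairs whose absolute Pearson correlation exceeds a cut-off. The sketch below uses plain Python so it runs without pandas; the column data, the 0.9 threshold and the helper name pearson are assumptions for illustration. With a DataFrame, the same pairs can be read off df.corr().

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical columns: 'b' is almost a multiple of 'a', 'c' is unrelated.
columns = {
    "a": [1, 2, 3, 4, 5, 6],
    "b": [2.1, 3.9, 6.2, 8.0, 9.8, 12.1],
    "c": [5, 1, 4, 2, 6, 3],
}

threshold = 0.9  # assumed cut-off for "highly correlated"
high_pairs = [
    (p, q)
    for p, q in combinations(columns, 2)
    if abs(pearson(columns[p], columns[q])) > threshold
]
print(high_pairs)
```

One attribute from each reported pair is a candidate for removal before fitting a linear or logistic regression.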
  1647. # skewness - degree of distortion
  1648. A symmetrical distribution has skewness value = 0 (Gaussian/normal distribution)
  1649. fairly symmetrical: -0.5 to +0.5
  1650. moderately skewed: -1 to -0.5 or +0.5 to +1
  1651. highly skewed: less than -1 or greater than +1
  1652. df.skew() # calculate skewness of a pandas dataframe
  1653. df.hist() # histogram plot for skewness
  1654. plt.show()
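The bands above can be applied mechanically once a skewness value is available. The sketch below computes the adjusted Fisher-Pearson coefficient by hand and maps it onto the three bands. The sample data and both helper names are assumptions, and whether this is exactly the estimator df.skew() reports should be checked against the pandas documentation.

```python
from math import sqrt

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness coefficient (needs at least 3 values)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    g1 = m3 / m2 ** 1.5
    return sqrt(n * (n - 1)) / (n - 2) * g1

def describe_skew(skew):
    """Map a skewness value onto the three bands listed above."""
    if abs(skew) <= 0.5:
        return "fairly symmetrical"
    if abs(skew) <= 1:
        return "moderately skewed"
    return "highly skewed"

symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 1, 10]
print(describe_skew(sample_skewness(symmetric)))
print(describe_skew(sample_skewness(right_tailed)))
```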
  1655. #
  1656. The code:
  1657. np.random.seed(1)
  1658. np.random.normal(loc = 0, scale = 1, size = (3,3))
  1659. Operates effectively the same as this code:
  1660. np.random.seed(1)
  1661. np.random.randn(3,3)
  1662. # number of elements
  1663. row, col = df.shape
  1664. print(row, col)
  1665. print(df.shape[0], df.shape[1])
  1666. # Python Notes [Higher Level Programming]
  1667. Books and tutorials:
  1668. Python Library Reference
  1669. Python 3 tutorial
  1670. Think Python
  1671. # Built-in documentation
  1672. pydoc module
  1673. pydoc module.func
  1674. Example: !pydoc sys.exit
  1675. # Running the script from the command-line
  1676. #!/usr/bin/env python3
  1677. # the shebang line above tells the shell which interpreter to use; keep it free of trailing comments
  1678. from math import sin # access library functionality like the function sin
  1679. import sys # and the list sys.argv (of command-line arguments)
  1680. x = float(sys.argv[1]) # read first command-line argument and convert it to a float
  1681. # print out the result using a format string
  1682. print("sin({0}) = {1}".format(x, sin(x)))
  1683. # complete control of the formatting of floats (similar to C's printf syntax)
  1684. print("sin({x:g}) = {s:.3f}".format(x=x, s=sin(x)))
  1685. $ python program.py 0.8 # python - name of the interpreter
  1686. sin(0.8) = 0.7173560908995228
  1687. $ chmod u+x program.py # make the file executable
  1688. $ ./program.py 0.8
  1689. sin(0.8) = 0.7173560908995228
  1690. # Python as a calculator
  1691. In[]: 1+2
  1692. 3
  1693. In[]: 4.5/3 + (1+2)*3
  1694. 10.5
  1695. In[]: 4**5
  1696. 1024
  1697. In[]: a = 1+2j
  1698. In[]: b = 3-5j
  1699. In[]: a*b
  1700. (13+1j)
  1701. In[]: from math import log10
  1702. In[]: log10(5)
  1703. 0.6989700043360189
  1704. # variables and data types
  1705. 'some string' is equivalent to "some string"
  1706. text = """ large portions of a text can be conveniently placed inside
  1707. triple-quoted strings (newlines are preserved)"""
  1708. # special characters in strings
  1709. use the backslash \ to escape special characters
  1710. s = "\"This is a quote\" and \n here comes a backslash: \\"
  1711. print(s)
  1712. "This is a quote" and
  1713. here comes a backslash: \
  1714. # string concatenation
  1715. "condensed "*3 + "matter"
  1716. condensed condensed condensed matter
  1717. quote = "I will not eat chips all day"
  1718. (quote + ", ") * 10 + quote
  1719. # slicing notation [start:end]
  1720. quote[2:6] # will
  1721. quote[:6] # I will
  1722. quote[7:] # not eat chips all day
  1723. quote[2:-4] # will not eat chips all
  1724. python strings are immutable, meaning that they cannot be changed
  1725. quote[1] = "x"
  1726. TypeError: 'str' object does not support item assignment
  1727. if one wants to change a string, one needs to create a new one:
  1728. new_quote = quote[:1] + "x" + quote[2:]
  1729. print(new_quote)
  1730. Ixwill not eat chips all day
  1731. >>> 'day' in quote
  1732. True
  1733. >>> quote.find('i')
  1734. 3
  1735. >>> quote.split()
  1736. ['I', 'will', 'not', 'eat', 'chips', 'all', 'day']
  1737. >>> quote.replace('chips', 'salad')
  1738. 'I will not eat salad all day'
  1739. >>> quote.lower()
  1740. 'i will not eat chips all day'
  1741. >>> quote.upper()
  1742. 'I WILL NOT EAT CHIPS ALL DAY'
  1743. >>> quote.strip() # remove leading/trailing blanks
  1744. 'I will not eat chips all day'
  1745. # lists
  1746. lists can contain items of different type, though in practice they often
  1747. have the same type
  1748. mylist = ['institute', 'of', 'mathematical', 'sciences']
  1749. mylist = ['institute', 4, True]
  1750. # list operations
  1751. >>> mylist[0] # indexing
  1752. 'institute'
  1753. >>> mylist[1:] # slicing
  1754. [4, True]
  1755. >>> newlist = mylist + ["!"]*3
  1756. >>> newlist
  1757. ['institute', 4, True, '!', '!', '!']
  1758. in contrast to strings, lists are mutable and can be changed
  1759. mylist = [11, 12, 14]
  1760. mylist[2] = 13
  1761. mylist
  1762. [11, 12, 13]
  1763. we can also append additional items to a list
  1764. mylist.append(14)
  1765. mylist
  1766. [11, 12, 13, 14]
  1767. # tuples
  1768. tuples are very similar to lists, but they are immutable, just like strings
  1769. mytuple = ('a string', 2.5, 6, 'another string')
  1770. mytuple = 'a string', 2.5, 6, 'another string' # shorter notation
  1771. mytuple[1] = -10 # Error, tuple cannot be changed
  1772. instead we need to create a new tuple with the changed values, for example
  1773. by converting the tuple to list, changing it, and converting it back to a
  1774. tuple:
  1775. l = list(mytuple)
  1776. l[1:3] = ["is", "not"]
  1777. mytuple = tuple(l)
  1778. mytuple
  1779. ('a string', 'is', 'not', 'another string')
  1780. # change backends in matplotlib
  1781. matplotlib.get_backend() # find backend
  1782. matplotlib.use('TKAgg', force=True) # use TKAgg (the old warn= argument no longer exists in current Matplotlib)
  1783. import matplotlib.pyplot as plt
  1784. plt.switch_backend('TKAgg')
  1785. # what backends are available and where
  1786. import matplotlib as m; help(m);
  1787. import matplotlib as m
  1788. print('I: {}\nN: {}'.format(m.rcsetup.interactive_bk, m.rcsetup.non_interactive_bk))
  1789. import matplotlib as m
  1790. p = m.get_backend(); print("current backend is:", p)
  1791. import matplotlib as m
  1792. p = m.matplotlib_fname(); print("The matplotlibrc is located at:\n", p)
  1793. # setting the back-end
  1794. There are 3 ways to configure backend:
  1795. 1. setting the rcParams["backend"] (default: 'Agg') parameter in matplotlibrc file
  1796. 2. setting the MPLBACKEND environment (shell) variable
  1797. 3. using the function matplotlib.use()
  1798. # using the backend
  1799. import matplotlib
  1800. matplotlib.use('TKAgg', force=True) # Agg rendering to a TK canvas
  1801. matplotlib.use('wxcairo', force=True) # Cairo rendering to a wxwidgets canvas
  1802. matplotlib.use('wxagg', force=True) # Agg rendering to a wxwidgets canvas
  1803. matplotlib.use('webagg', force=True) # On show() will start a tornado server with an interactive figure
  1804. matplotlib.use('qt5cairo', force=True) # Cairo rendering to a Qt5 canvas
  1805. matplotlib.use('qt5agg', force=True) # Agg rendering to a Qt5 canvas
  1806. # extra info
  1807. pip install pycairo # Cairo: GTK3 based backend (replaces: cairocffi)
  1808. pip install mplcairo # Cairo: Easy & Specific for matplotlib
  1809. pip install PyQt5 # Qt5: Require: Qt's qmake tool
  1810. pip install Pyside2 # Qt5: Require: shiboken2 & clang lib bindings
  1811. pip install wxPython # wxAgg:
  1812. pip install tornado # webAgg: Require: pycurl, twisted, pycares
  1813. # change the backend
  1814. 1. First locate the matplotlibrc file:
  1815. import matplotlib
  1816. matplotlib.matplotlib_fname()
  1817. 2. Open terminal and do:
  1818. cd /users/serafeim/.matplotlib/
  1819. ls
  1820. 3. Edit the file (if it does not exist use this command: touch matplotlibrc to create it):
  1821. vi matplotlibrc
  1822. 4. Save in matplotlibrc:
  1823. backend: TKAgg
  1824. cd .envn/dsci/lib/python3.7/site-packages/matplotlib/mpl-data/
  1825. vi matplotlibrc
  1826. line No. 81
  1827. backend: TKAgg
  1828. Neither Agg nor TkAgg requires any dependencies beyond Python's standard library. If you
  1829. only need to save figures to files and never call plt.show(), use Agg instead (just
  1830. replace TkAgg with Agg wherever it appears below).
  1831. Either add the following line to ~/.config/matplotlib/matplotlibrc:
  1832. backend: TkAgg
  1833. or the following lines to python file:
  1834. import matplotlib
  1835. matplotlib.use('TkAgg')
  1836. import matplotlib.pyplot as plt
  1837. To display where the currently active matplotlibrc file was loaded from, one can do the
  1838. following:
  1839. >>> import matplotlib
  1840. >>> matplotlib.matplotlib_fname()
  1841. '/home/user/.config/matplotlib/matplotlibrc'
  1842. Ref: https://matplotlib.org/stable/tutorials/introductory/customizing.html
  1843. # jupyter notebook with inline matplotlib
  1844. jupyter notebook --matplotlib=inline
  1845. # garbage collection
  1846. >>> x = 9
  1847. >>> print(x)
  1848. 9
  1849. >>> del x
  1850. >>> print(x)
  1851. NameError: name 'x' is not defined
  1852. # tuple:
  1853. A tuple is a general way of grouping together a number of values with a variety of types
  1854. into one compound type. Tuples have a fixed length: once declared they cannot grow or
  1855. shrink in size.
  1856. # functions:
  1857. Functions start with a header introduced by the def keyword. The indented block of code
  1858. following the : is run when the function is called. return is another keyword uniquely
  1859. associated with functions. When Python encounters a return statement, it exits the
  1860. function immediately, and passes the value on the right hand side to the calling context.
  1861. def least_difference(a, b, c):
  1862.     """Return the smallest difference between any two numbers among a, b, c.
  1863.     >>> least_difference(1, 5, -5)
  1864.     4
  1865.     """
  1866.     diff1 = abs(a - b)
  1867.     diff2 = abs(b - c)
  1868.     diff3 = abs(a - c)
  1869.     return min(diff1, diff2, diff3)
  1870. print(
  1871.     least_difference(1, 10, 100),
  1872.     least_difference(1, 10, 10),
  1873.     least_difference(5, 6, 7))
  1874. help(least_difference) # display the docstring
  1875. The convention of including 1 or more example calls in a function's docstring is far from
  1876. universally observed, but it can be very effective at helping someone understand your
  1877. function.
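Example calls written in the >>> form can also be checked automatically with the standard-library doctest module. A minimal sketch that redefines least_difference so it is self-contained; feeding the docstring to DocTestParser and DocTestRunner directly is just one way of doing this, chosen here so the snippet does not depend on how the file is run.

```python
import doctest

def least_difference(a, b, c):
    """Return the smallest difference between any two numbers among a, b, c.

    >>> least_difference(1, 5, -5)
    4
    """
    diff1 = abs(a - b)
    diff2 = abs(b - c)
    diff3 = abs(a - c)
    return min(diff1, diff2, diff3)

# Extract the example from the docstring and run it.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    least_difference.__doc__,
    {"least_difference": least_difference},  # names visible to the example
    "least_difference",                      # test name used in reports
    None,                                    # no source file
    0,                                       # line number offset
)
result = doctest.DocTestRunner(verbose=False).run(test)
print(result)
```

For a whole file, python -m doctest program.py runs every docstring example in it and reports only the failures.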
  1878. print(1, 2, 3, sep=' < ')
  1879. 1 < 2 < 3
  1880. # functions applied to functions
  1881. def multiply_by_five(x):
  1882.     return 5 * x
  1883. def call(fn, arg):
  1884.     """ Call fn on arg """
  1885.     return fn(arg)
  1886. def squared_call(fn, arg):
  1887.     """ Call fn on the result of calling fn on arg """
  1888.     return fn(fn(arg))
  1889. print(
  1890.     call(multiply_by_five, 1),
  1891.     squared_call(multiply_by_five, 1),
  1892.     sep='\n'
  1893. )
  1894. 5
  1895. 25
  1896. Functions that operate on other functions are called "higher-order functions".
  1897. By default, max returns the largest of its arguments. But if we pass in a function using
  1898. the optional 'key' argument, it returns the argument x that maximizes key(x) (aka the
  1899. 'argmax').
  1900. def modulus_5(x):
  1901.     """ Returns the remainder of x after dividing by 5 """
  1902.     return x % 5
  1903. print(
  1904.     max(100, 51, 14), # print biggest number
  1905.     max(100, 51, 14, key=modulus_5), # print the number with the biggest remainder modulo 5
  1906.     sep='\n'
  1907. )
  1908. 100
  1909. 14
  1910. # find the word-size, int_info, float_info
  1911. import sys
  1912. sys.maxsize
  1913. sys.int_info
  1914. sys.float_info
  1915. # NotADirectoryError: [Errno 20] Not a directory: 'dvipng'
  1916. sudo apt-get install dvipng
  1917. # IPython and the pylab mode
  1918. $ ipython --pylab
  1919. # built-in factorial
  1920. import math
  1921. math.factorial(1000)
  1922. # user defined factorial
  1923. def factorial(n):
  1924.     """returns n!"""
  1925.     if not isinstance(n, int):
  1926.         raise TypeError("integer required as input")
  1927.     return 1 if n < 2 else n * factorial(n-1)
  1928. factorial(998) # close to the default recursion limit (1000)
  1929. # cpu count
  1930. import os
  1931. os.cpu_count()
  1932. # hidden gems in numpy
  1933. import numpy as np
  1934. np.iinfo(np.int16)
  1935. np.iinfo(np.int32)
  1936. np.iinfo(np.int64)
  1937. x, y = np.ogrid[1:10, 1:5]
  1938. # solve the system of equations x + 2y = 1 and 3x + 5y = 2
  1939. import numpy as np
  1940. A = np.array([[1, 2], [3, 5]])
  1941. b = np.array([1, 2])
  1942. x = np.linalg.solve(A, b)
  1943. print(x) # array([-1., 1.])
  1944. np.allclose(np.dot(A, x), b) # check that the solution is correct -Output: True
  1945. Ref: https://numpy.org/doc/stable/reference/generated/numpy.linalg.solve.html
  1946. # performance check
  1947. $ /usr/bin/time -p ./program.py
  1948. $ /usr/bin/time --verbose ./program.py
  1949. $ python -m cProfile -s cumulative program.py
  1950. $ python -m cProfile -o profile.stats program.py
  1951. $ kernprof -l -v ./program.py
  1952. $ python -m memory_profiler program.py
  1953. $ perf stat -e cycles,instructions,cache-references,cache-misses,branches,\
  1954. branch-misses,task-clock,faults,minor-faults,cs,migrations python program.py
  1955. $ python -m timeit '[x**0.5 for x in range(1000)]'
  1956. $ python -m timeit -s 'from math import sqrt' '[sqrt(x) for x in range(1000)]'
  1957. $ python -m cProfile -s ncalls program.py # which function is called the most
  1958. number of times
  1959. $ python -m cProfile -o program.stats program.py
  1960. $ python -m pstats program.stats
  1961. $ python -m cProfile -s tottime program.py
  1962. sort string   meaning
  1963. calls         call count
  1964. cumulative    cumulative time
  1965. cumtime       cumulative time
  1966. file          file name
  1967. filename      file name
  1968. module        file name
  1969. ncalls        call count
  1970. pcalls        primitive call count
  1971. line          line number
  1972. name          function name
  1973. nfl           name/file/line
  1974. stdname       standard name
  1975. time          internal time
  1976. tottime       internal time
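The same sort strings can be used from inside Python through the pstats module. A small sketch; the function slow_sum and its workload are made up for the example, and the report is captured in a string only so it can be inspected.

```python
import cProfile
import io
import pstats

def slow_sum(n):
    """Hypothetical workload: sum of squares below n."""
    return sum(i * i for i in range(n))

# Profile a single call.
profiler = cProfile.Profile()
profiler.enable()
slow_sum(10000)
profiler.disable()

# Sort by cumulative time and keep the five most expensive entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

Replacing "cumulative" with "tottime" or "ncalls" gives the orderings produced by the -s option on the command line.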
  1977. # find packages that should or should not be in requirements for a project:
  1978. https://pypi.org/project/pip-check-reqs/2.0/
  1979. https://packaging.python.org/en/latest/tutorial/#creating-your-own-project
  1980. # remove a package and its unused dependencies
  1981. Ref:https://pypi.org/project/pip-autoremove/
  1982. pip-autoremove <package> -y
  1983. # binary, octal, hexadecimal
  1984. bin(64)
  1985. oct(64)
  1986. hex(64)
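The three built-ins return strings that carry a base prefix, and int() with an explicit base converts them back. A short sketch of the round trip; the variable names are arbitrary.

```python
n = 64

# Each built-in returns a string with a base prefix.
print(bin(n))   # 0b1000000
print(oct(n))   # 0o100
print(hex(n))   # 0x40

# int() with an explicit base parses the prefixed string back to an integer.
assert int(bin(n), 2) == int(oct(n), 8) == int(hex(n), 16) == n

# format() gives the digits without the prefix, optionally zero-padded.
padded = format(n, "08b")
print(padded)   # 01000000
```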
  1987. # mayavi reference
  1988. 1. 3D plotting with Mayavi
  1989. https://scipy-lectures.org/packages/3d_plotting/index.html
  1990. 2. Installation
  1991. https://docs.enthought.com/mayavi/mayavi/installation.html
  1992. https://www.math.univ-paris13.fr/~cuvelier/Python/Python-Mayavi-Qt5-centOS7.html
  1993. 3. Mayavi: 3D scientific data visualization and plotting in Python
  1994. https://mayavi.readthedocs.io/en/latest/
  1995. 4. Example gallery
  1996. https://docs.enthought.com/mayavi/mayavi/auto/examples.html
  1997. 5. 3D plotting with Mayavi
  1998. http://python4esac.github.io/plotting/mayavi_example.html
  1999. 6. SAM's Scientific Python Tools
  2000. https://metaphor.ethz.ch/fsdb/sam/PythonTutorial/tips_mayavi2.html
  2001. 7. Better 3d visualizations with Mayavi
  2002. https://wwwstud.fh-zwickau.de/jef19jdw/teaching/pti01830/mayavi.html
  2003. 8. Mayavi surf
  2004. https://wizardforcel.gitbooks.io/scipy-cookbook-en/content/61.html
  2005. 9. Mayavi github codes - https://github.com/enthought/mayavi
  2006. # monitoring CUDA activity on a GPU
  2007. watch -n 0.1 nvidia-smi
  2008. nvidia-smi -lms 500 (every 500 milliseconds)
  2009. watch nvidia-smi -q -i 0 -d UTILIZATION
  2010. watch gpustat -cp
  2011. watch -c gpustat -cp --color
  2012. watch -n 0.5 -c gpustat -cp --color
  2013. Ref: https://stackoverflow.com/questions/8223811/a-top-like-utility-for-monitoring-cuda-activity-on-a-gpu
  2014. # control GPUs with nvidia-smi
  2015. nvidia-smi
  2016. nvidia-smi -L # to list all available NVIDIA devices
  2017. nvidia-smi --query-gpu=index,name,uuid,serial --format=csv # to list certain details about each GPU
  2018. nvidia-smi -q -d SUPPORTED_CLOCKS # listing of available clock speeds for each GPU
  2019. nvidia-smi -q -d CLOCK # to review the current GPU clock speed, default clock speed, and maximum possible clock speed
  2020. nvidia-smi -q -d PERFORMANCE
  2021. nvidia-smi topo --matrix
  2022. nvidia-smi nvlink --status
  2023. nvidia-smi nvlink --capabilities
  2024. nvidia-smi -i 0 -q # list all available data on a particular GPU, specify the ID of the card with -i
  2025. Ref: https://www.microway.com/hpc-tech-tips/nvidia-smi_control-your-gpus/
  2026. # MPI for Python
  2027. https://mpi4py.readthedocs.io/en/stable/index.html
  2028. http://education.molssi.org/parallel-programming/03-distributed-examples-mpi4py/index.html
  2029. https://www.ibm.com/docs/en/smpi/10.4?topic=command-mpirun-options
  2030. mpiexec --help
  2031. mpiexec -n 4 python program.py
  2032. mpiexec --use-hwthread-cpus python program.py
  2033. mpiexec -n 4 --oversubscribe python program.py
  2034. mpiexec --mca plm_rsh_args -x ./program.py
  2035. # MPI communication
  2036. 1. communication for generic python objects
  2037. * use "lower case" methods: send(), recv()
  2038. 2. communication for buffer-provider objects (e.g. numpy arrays)
  2039. * use "upper case" methods: Send(), Recv()
  2040. # importing mpi library
  2041. from mpi4py import MPI
  2042. # getting important information
  2043. comm = MPI.COMM_WORLD
  2044. rank = comm.Get_rank()
  2045. size = comm.Get_size()
  2046. name = MPI.Get_processor_name()
  2047. comm.Barrier()
  2048. # collective communication: reduction operations (in mpi4py: MPI.MAX, MPI.MIN, MPI.SUM, ...)
  2049. MPI_MAX - returns the maximum element
  2050. MPI_MIN - returns the minimum element
  2051. MPI_SUM - sums the elements
  2052. MPI_PROD - multiplies all elements
  2053. MPI_LAND - performs a logical "and" across the elements
  2054. MPI_LOR - performs a logical "or" across the elements
  2055. MPI_MAXLOC - the maximum value and the rank of the process that owns it
  2056. MPI_MINLOC - the minimum value and the rank of the process that owns it
  2057. # operating system information
  2058. import os
  2059. os.uname()
  2060. # anonymous function
  2061. list1 = [2, 18, 9, 22, 17, 24, 8, 12, 27]
  2062. filter(lambda x: x%3 == 0, list1)
  2063. for n in filter(lambda x: x%3 == 0, list1):
  2064.     print(n)
  2065. map(lambda x: x*2 + 10, list1)
  2066. for n in map(lambda x: x*2 + 10, list1):
  2067.     print(n)
  2068. reduce(lambda x,y: x + y, list1) # needs "from functools import reduce"; equivalent to sum(list1)
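  2069. # built-in functions in Python 2 (source: http://docs.python.org/2/library/functions.html); many of these (e.g. basestring, execfile, raw_input, unicode, xrange) no longer exist in Python 3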
  2070. abs() divmod() input() open() staticmethod()
  2071. all() enumerate() int() ord() str()
  2072. any() eval() isinstance() pow() sum()
  2073. basestring() execfile() issubclass() print() super()
  2074. bin() file() iter() property() tuple()
  2075. bool() filter() len() range() type()
  2076. bytearray() float() list() raw_input() unichr()
  2077. callable() format() locals() reduce() unicode()
  2078. chr() frozenset() long() reload() vars()
  2079. classmethod() getattr() map() repr() xrange()
  2080. cmp() globals() max() reversed() zip()
  2081. compile() hasattr() memoryview() round() __import__()
  2082. complex() hash() min() set() apply()
  2083. delattr() help() next() setattr() buffer()
  2084. dict() hex() object() slice() coerce()
  2085. dir() id() oct() sorted() intern()
  2086. # putting an if-elif-else statement on one line
  2087. # function
  2088. def f(x):
  2089.     return 1 if x == 0 else 2*x if x == 1 else 10*x
  2090. def f(x):
  2091.     return 1 if x == 0 else (2*x if x == 1 else 10*x)
  2092. def f(x):
  2093.     return (x == 0 and 1) or (x == 1 and 2*x) or 10*x
  2094. # code format
  2095. black -> pylint -> pytype -> pytest
  2096. # lint using pylint
  2097. pylint --list-msgs # get list of pylint warnings
  2098. pylint --help-msg=C6409 # get more information on a particular message
  2099. # array size
  2100. x = np.array([0.2, 6.4, 3.0, 1.6])
  2101. for n in range(x.size):
  2102.     print(n)
  2103. 0
  2104. 1
  2105. 2
  2106. 3
  2107. n = x.size
  2108. print(n)
  2109. 4
  2110. # show config of the numpy, scipy package
  2111. >>> import numpy as np
  2112. >>> np.show_config()
  2113. # print the installed module location
  2114. $ python -c "import pip; print(pip)"
  2115. $ python -c "import scipy; print(scipy)"
  2116. # QuTiP: Quantum Toolbox in Python
  2117. >>> import qutip
  2118. >>> qutip.about()
  2119. # integer to character
  2120. The built-in function chr() takes an integer argument and produces the
  2121. corresponding character.
  2122. >>> chr(65)
  2123. 'A'
  2124. >>> chr(66)
  2125. 'B'
  2126. >>> for i in range(65, 75):
  2127. ...     print(chr(i))
  2128. # character to integer
  2129. The built-in function ord() is effectively the inverse of chr(). ord() takes a
  2130. string of length one, i.e., a single character, and returns the corresponding
  2131. Unicode code point (identical to the ASCII value for ASCII characters).
  2132. >>> ord('A')
  2133. 65
  2134. >>> ord('B')
  2135. 66
  2136. >>> for ch in "ASCII = numbers":
  2137. ...     print(ord(ch))
  2138. # scipy constants
  2139. >>> import scipy.constants
  2140. >>> dir(scipy.constants) # list of constants
  2141. >>> from scipy.constants import g, G
  2142. >>> g
  2143. 9.80665
  2144. >>> G
  2145. 6.6743e-11
  2146. # ruff - linter written in Rust
  2147. # Install
  2148. $ pip install ruff
  2149. # Usage
  2150. $ ruff check . # lint all files in the current directory
  2151. $ ruff check path/to/code # lint all files in /path/to/code
  2152. $ ruff check path/to/code/*.py # lint all .py files in /path/to/code
  2153. $ ruff check path/to/code/to/file.py # lint file.py
  2154. # virtual environment in google-colab (doesn't work as expected)
  2155. !pip install virtualenv
  2156. !virtualenv <name>
  2157. !source /content/<name>/bin/activate
  2158. # google colab environment setup
  2159. 1. !pip list | grep mpi4py
  2160. 2. !pip install mpi4py
  2161. 3. !nvidia-smi
  2162. # writing file in google-colab
  2163. %%writefile program.py
  2164. [Shift + Enter] to save the above file
  2165. !mpiexec --allow-run-as-root -np 8 python program.py
  2166. !mpiexec --allow-run-as-root --oversubscribe -np 8 python program.py
  2167. # downloading files to your local file system from google colab
  2168. from google.colab import files
  2169. with open('example.txt', 'w') as f:
  2170.     f.write('some content')
  2171. files.download('example.txt')
  2172. # uploading files from your local file system to google colab
  2173. from google.colab import files
  2174. files.upload()
  2175. # permanently install a module on google colab
  2176. * os and sys are needed to create the symlink and to extend the module search path
  2177. import os, sys
  2178. * drive is a module that allows us use Python to interact with google drive
  2179. from google.colab import drive
  2180. * mounting google drive allows us to work with its contents
  2181. drive.mount('/content/gdrive')
  2182. * the last three lines are what changes the path of the file
  2183. nb_path = '/content/notebooks'
  2184. os.symlink('/content/gdrive/My Drive/Colab Notebooks', nb_path)
  2185. sys.path.insert(0, nb_path) # or append(nb_path)
  2186. * install the module in the notebook directory permanently
  2187. !pip install --target=$nb_path <module>
  2188. * colab notebook
  2189. from google.colab import drive
  2190. drive.mount('/content/gdrive')
  2191. import sys
  2192. sys.path.append('/content/gdrive/My Drive/Colab Notebooks')
  2193. import <module>
  2194. # rules of functional programming
  2195. At its core, functional programming is just programming with functions - pure
  2196. mathematical functions. The result of a function depends only on the arguments,
  2197. and there are no side effects, such as I/O or mutation of state. Programs are
  2198. built by combining functions together.
  2199. There are two main things you need to know to understand the concept:
  2200. * Data is immutable: If you want to change data, such as an array, you
  2201. return a new array with the changes, not the original.
  2202. * Functions are stateless: Functions act as if for the first time, every
  2203. single time! In other words, the function always gives the same return
  2204. value for the same arguments.
  2205. There are three best practices that you should generally follow:
  2206. 1. Your functions should accept at least one argument.
  2207. 2. Your functions should return data, or another function.
  2208. 3. Don't use loops!
  2209. Doing functional programming meaningfully in a language without higher-order
  2210. functions (the ability to pass functions as arguments and return functions),
  2211. lambdas (anonymous functions), and generics is difficult. Most modern languages
  2212. have these, but there are differences in how well different languages support
  2213. functional programming. The languages with the best support are called
  2214. functional programming languages. These include Haskell, OCaml, F#, and Scala,
  2215. which are statically typed, and the dynamically typed Erlang and Clojure.
  2216. Recall that the result of a function depends only on its inputs. Alas, almost
  2217. all programming languages have "features" that break this assumption. Null
  2218. values, type case (instanceof), type casting, exceptions, side-effects, and the
  2219. possibility of infinite recursion are trap doors that break equational reasoning
  2220. and impair a programmer's ability to reason about the behavior or correctness of
  2221. a program.
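The rules above can be sketched in Python as follows. This is only an illustration: the names add_tax and total are hypothetical, and tuples stand in for immutable data.

```python
from functools import reduce

def add_tax(prices, rate):
    """Pure function: returns a new tuple and leaves `prices` untouched."""
    return tuple(map(lambda p: p * (1 + rate), prices))

def total(prices):
    """Fold the values together with reduce instead of an explicit loop."""
    return reduce(lambda acc, p: acc + p, prices, 0.0)

prices = (10.0, 20.0, 30.0)
taxed = add_tax(prices, 0.5)
print(prices)        # the original data is unchanged
print(taxed)         # a new tuple with the changes
print(total(taxed))  # same arguments always give the same result
```

Both functions take arguments, return data, and avoid loops and shared state, so calling them twice with the same input gives the same output.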
  2222. # infix function example
  2223. The function combines two functions into one, applying g to the output of f.
  2224. def compose(g, f):
  2225.     return lambda x: g(f(x))
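A short usage sketch of compose; the helper functions increment and double are hypothetical and exist only to show that the order of the arguments matters.

```python
def compose(g, f):
    """Return a function that applies f first, then g."""
    return lambda x: g(f(x))

def increment(x):
    return x + 1

def double(x):
    return x * 2

double_then_increment = compose(increment, double)
increment_then_double = compose(double, increment)

print(double_then_increment(5))  # increment(double(5))
print(increment_then_double(5))  # double(increment(5))
```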
  2226. # number of values
  2227. 2**20 == 1 << 20
  2228. # pytype
  2229. While annotations are optional for pytype, it will check and apply them where
  2230. present.
  2231. type inference and checking:
  2232. $ pytype program.py
  2233. Generate type annotations in standalone files ("pyi files"), which can be
  2234. merged back into the Python source with a provided merge-pyi tool.
  2235. merging back inferred type information:
  2236. $ merge-pyi -i program.py .pytype/pyi/program.pyi
  2237. # pyqtgraph examples
  2238. pyqtgraph includes an extensive set of examples that can be accessed by running
  2239. >>> import pyqtgraph.examples
  2240. >>> pyqtgraph.examples.run()
  2241. # pip installation logs
  2242. $ pip install pylint --log pylintlog.txt
  2243. # install qtbase5-dev and set qmake tool on PATH for pyqt5
  2244. $ sudo apt-get install qtbase5-dev
  2245. $ which qmake
  2246. /usr/bin/qmake
  2247. $ qmake --version
  2248. QMake version 3.1
  2249. Using Qt version 5.15.2 in /usr/lib/i386-linux-gnu
  2250. $ pip install pyqt5
  2251. # pip install pyqt5
  2252. Successfully installed PyQt-builder-1.13.0 packaging-21.3 ply-3.11
  2253. pyparsing-3.0.9 setuptools-65.3.0 sip-6.6.2 toml-0.10.2
  2254. Cleaning up ...
  2255. Removing source in /tmp/pip-install-2gmr_frd/sip
  2256. Removed build tracker: '/tmp/pip-req-tracker-vuj8lfsc'
  2257. Installing build dependencies ... done
  2258. Running command /usr/bin/python3 /tmp/tmppi_h1r7x
  2259. get_requires_for_build_wheel /tmp/tmpxrg1n2t9
  2260. Getting requirements to build wheel ... done
  2261. Created temporary directory: /tmp/pip-modern-metadata-b___1na9b
  2262. Running command /usr/bin/python3 /tmp/tmpwp46ffki
  2263. prepare_metadata_for_build_wheel /tmp/tmpxrg1n2t9
  2264. Querying qmake about your Qt installation ...
  2265. This is the GPL version of PyQt 5.15.7 (licensed under the GNU General
  2266. Public License) for Python 3.8.2 on linux.
  2267. Type 'L' to view the license.
  2268. Type 'yes' to accept the terms of the license.
  2269. Type 'no' to decline the terms of the license.
  2270. Solution:
  2271. When pip does not have a wheel to work from, it attempts to compile from
  2272. source. By passing a --config-settings argument to pip you can pass an
  2273. argument to the configure.py which would be used during compilation.
  2274. pyqt has an argument to automatically accept the license --confirm-license.
  2275. However, pip expects the argument in a Key=value form so you need to pass
  2276. --confirm-license= (i.e. no value) and it will work. It took a while (about
  2277. 30+ min) but did finally get pyqt5 installed.
  2278. $ pip install pyqt5 --config-settings --confirm-license= --verbose
  2279. # using 'or' in if statement
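  2280. weather == "Good!" or weather == "Great!"   # correct: two explicit comparisons
  2281. weather in ("Good!", "Great!")              # correct and more concise
  2282. (weather == "Good!") or ("Great!")          # wrong: always truthy, "Great!" is a non-empty string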
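The three forms do not behave the same. A minimal check, using the hypothetical helper names is_nice and is_nice_buggy:

```python
def is_nice(weather):
    """Membership test: True only for the listed values."""
    return weather in ("Good!", "Great!")

def is_nice_buggy(weather):
    """The right operand is the non-empty string "Great!", which is truthy."""
    return bool((weather == "Good!") or ("Great!"))

for weather in ("Good!", "Great!", "Awful!"):
    print(weather, is_nice(weather), is_nice_buggy(weather))
```

The buggy form reports True even for "Awful!", because `or` returns its second operand when the first one is false.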
  2283. # Object-Oriented Programming
  2284. Functions bound to objects are known as methods.
  2285. For example, where a string possesses methods designed to manipulate its
  2286. sequence of characters, a NumPy array possesses methods for operating on
  2287. the numerical data bound to that array.
  2288. >>> string = "institute"
  2289. >>> string.capitalize() # use the string-method 'capitalize'
  2290. 'Institute'
  2291. >>> import numpy as np
  2292. >>> array = np.array([[0, 1, 2], [3, 4, 5]])
  2293. >>> array.sum() # use the array-method 'sum'
  2294. 15
  2295. An object can possess data, known as attributes, which summarize information
  2296. about that object.
  2297. For example, the array-attributes ndim and shape provide information about the
  2298. indexing-layout of that array's numerical data.
  2299. # accessing an object's attributes
  2300. >>> array.ndim
  2301. 2
  2302. >>> array.shape
  2303. (2, 3)
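The same vocabulary applies to user-defined classes. A minimal sketch, where Vector2D is a hypothetical class written only for this illustration:

```python
class Vector2D:
    """Hypothetical class used only to illustrate attributes and methods."""

    def __init__(self, x, y):
        self.x = x  # attribute: data stored on the object
        self.y = y

    def norm(self):
        """Method: a function bound to the object, operating on its data."""
        return (self.x ** 2 + self.y ** 2) ** 0.5

v = Vector2D(3, 4)
print(v.x, v.y)   # attributes are read without parentheses
print(v.norm())   # methods are called
```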
  2304. # psutil
  2305. Psutil provides complete access to system information.
  2306. >>> import psutil
  2307. >>> psutil.boot_time()
  2308. >>> psutil.cpu_count()
  2309. >>> psutil.cpu_freq()
  2310. >>> psutil.cpu_stats()
  2311. >>> psutil.cpu_times()
  2312. >>> psutil.cpu_times_percent()
  2313. >>> psutil.cpu_percent()
  2314. >>> psutil.cpu_percent(interval=5, percpu=True)
  2315. >>> psutil.version_info
  2316. >>> psutil.swap_memory()
  2317. >>> psutil.users()
  2318. >>> psutil.net_connections()
  2319. >>> psutil.net_if_addrs()
  2320. >>> psutil.disk_partitions()
  2321. # there's always a better way of writing
  2322. dxs = [-1, 1, 1, -1, -1, 1, 1, -1]
  2323. don't write as:
  2324. ddxs = [random.random() * 0.7 - 0.7/2 for i in range(len(dxs))]
  2325. always write as:
  2326. ddxs = [random.random() * 0.7 - 0.7/2 for _ in dxs]
  2327. # smart if/else condition
  2328. # recommended
  2329. def condition(x):
  2330.     return x if x > 0 else 0
  2331. # not recommended
  2332. def condition(x):
  2333.     return (x > 0) * x
  2334. if x > 0, then (x > 0) == 1 and (x > 0) * x == x
  2335. else (x > 0) == 0 and (x > 0) * x == 0
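A quick comparison of the two forms. The function names are hypothetical. The sketch also shows one subtle difference: for negative floats the multiplication form produces -0.0 rather than 0.

```python
import math

def clamp_conditional(x):
    return x if x > 0 else 0

def clamp_multiply(x):
    return (x > 0) * x   # bool is promoted to 0 or 1

for x in (3, -3, 2.5, -2.5):
    print(x, clamp_conditional(x), clamp_multiply(x))

# The results compare equal, but the sign bit differs for negative floats.
print(math.copysign(1.0, clamp_multiply(-2.5)))
```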
  2336. # if-else in return statement
  2337. def fibonacci(n):
  2338.     return fibonacci(n - 1) + fibonacci(n - 2) if n > 2 else 1
  2339. # access python modules source code
  2340. >>> import numpy as np
  2341. >>> np.ones
  2342. <function ones at 0x7faa039956c0>
  2343. >>> np.ones.__code__
  2344. <code object ones at 0x7faa065dbeb0, file "/home/saran/.envn/dsci/lib/python3.11/site-packages/numpy/core/numeric.py", line 136>
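The standard inspect module offers a more direct route to the file and the source text. A minimal sketch, using the standard-library json module as a stand-in (functions implemented in C or wrapped by decorators may not be retrievable this way):

```python
import inspect
import json

print(inspect.getsourcefile(json.dumps))   # path of the file defining the function
source = inspect.getsource(json.dumps)     # source text as a single string
print(source.splitlines()[0])              # first line of the definition
```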
  2345. # idiomatic and pythonic code [Reference: https://martinheinz.dev/blog/32]
  2346. * In Python you can use either 'is' or '==' for comparisons: 'is' checks
  2347. identity and '==' checks value.
  2348. * Using 'is None', 'is True' or 'is False' isn't just about convention or improved
  2349. readability. It is also faster, especially when 'x is None' is used instead
  2350. of 'x == None' inside a loop.
  2351. * # Bad
  2352. try:
  2353.     page = urlopen(url)
  2354.     ...
  2355. finally:
  2356.     page.close()
  2357. # Good
  2358. from contextlib import closing
  2359. with closing(urlopen(url)) as page:
  2360.     ...
  2361. * # Bad
  2362. import os
  2363. try:
  2364.     os.remove(path)
  2365. except FileNotFoundError:
  2366.     pass
  2367. # Good
  2368. from contextlib import suppress
  2369. with suppress(FileNotFoundError):
  2370.     os.remove(path)
  2371. * Variable unpacking
  2372. # first = 1, middle = [2, 3, 4], last = 5
  2373. first, *middle, last = [1, 2, 3, 4, 5]
  2374. # first = 1, second = 2, rest = [3, 4, 5]
  2375. first, second, *rest = [1, 2, 3, 4, 5]
  2376. # name = "John", address = "Some Street", email = "john@mail.com"
  2377. name, address, *_, email = ["John", "Some Street", "Credit Number", "john@mail.com"]
  2378. # header_row -> first line
  2379. # table_rows -> list of remaining lines
  2380. header_row, *table_rows = open("filename").read().split("\n")
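The identity versus value distinction from the first bullets can be checked directly. In this sketch AlwaysEqual is a hypothetical class, written only to show why 'is None' is the safer test:

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""
    def __eq__(self, other):
        return True

a = [1, 2, 3]
b = [1, 2, 3]
print(a == b)   # same value
print(a is b)   # two distinct objects

obj = AlwaysEqual()
print(obj == None)  # misleading: the overridden __eq__ is consulted
print(obj is None)  # identity cannot be overridden
```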
  2381. # module information
  2382. >>> import numpy
  2383. >>> numpy
  2384. <module 'numpy' from '/home/saran/.envn/dsci/lib/python3.11/site-packages/numpy/__init__.py'>
  2385. >>> numpy.__name__
  2386. 'numpy'
  2387. >>> numpy.__doc__
  2388. # python paths
  2389. >>> import sysconfig
  2390. >>> sysconfig.get_paths()
  2391. {'stdlib': '/usr/local/lib/python3.11',
  2392. 'platstdlib': '/home/saran/.envn/dsci/lib/python3.11',
  2393. 'purelib': '/home/saran/.envn/dsci/lib/python3.11/site-packages',
  2394. 'platlib': '/home/saran/.envn/dsci/lib/python3.11/site-packages',
  2395. 'include': '/usr/local/include/python3.11',
  2396. 'platinclude': '/usr/local/include/python3.11',
  2397. 'scripts': '/home/saran/.envn/dsci/bin',
  2398. 'data': '/home/saran/.envn/dsci'}
  2399. # dask reference
  2400. (venv)$ pip install dask
  2401. (venv)$ pip install --upgrade "dask[distributed]" # enable distributed compute
  2402. # using local cluster to start dask
  2403. >>> import dask
  2404. >>> from dask.distributed import Client
  2405. >>> client = Client()
  2406. <Client: 'tcp://127.0.0.1:34085' processes=4 threads=4, memory=7.67 GiB>
  2407. # Python 3.12 support for the Linux perf profiler
  2408. Reference: https://docs.python.org/3.12/howto/perf_profiling.html
  2409. We can run perf to sample CPU stack traces at 9999 hertz:
  2410. $ perf record -F 9999 -g -o perf.data python program.py
  2411. Then we can use perf report to analyze the data:
  2412. $ perf report --stdio -n -g
  2413. # bit manipulation
  2414. 1. multiplication
  2415. a = 10
  2416. a = a << 1 # multiply a by 2
  2417. 2. division
  2418. a = 10
  2419. a = a >> 1 # divide a by 2
  2420. 3. odd/even test
  2421. x = 5
  2422. if x & 1 == 1:
  2423.     print("x is an odd number")
  2424. else:
  2425.     print("x is an even number")
  2426. Using the bitwise AND operator:
  2427. * The idea is to check whether the last (least significant) bit of the number is set.
  2428. * If the last bit is set, the number is odd; otherwise it is even.
  2429. * For an odd number, the bitwise AND of the number with 1 is 1, because
  2430. the last bit is set; for an even number it is 0.
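The three tricks above can be verified against ordinary arithmetic. A small sketch, with is_odd as a hypothetical helper name; note that for negative integers a right shift matches floor division, not truncation.

```python
def is_odd(n):
    """True when the least significant bit of n is set."""
    return n & 1 == 1

for n in (10, 7, 0, -3):
    assert n << 1 == n * 2      # left shift by one doubles
    assert n >> 1 == n // 2     # right shift by one floor-divides by two
    print(n, n << 1, n >> 1, is_odd(n))
```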
  2431. # print without for-looping
  2432. >>> institute = ["Institute", "of", "Mathematical", "Sciences"]
  2433. >>> print(*institute)
  2434. Institute of Mathematical Sciences
  2435. # Integer string conversion length limitation
  2436. Reference:
  2437. https://docs.python.org/3/library/stdtypes.html#integer-string-conversion-length-limitation
  2438. def fibonacci(n: int) -> int:
  2439.     """compute nth fibonacci"""
  2440.     fib: list[int] = [0, 1, 1]
  2441.     for _ in range(1, n):
  2442.         fib[-1] = fib[0] + fib[1]
  2443.         fib[0], fib[1] = fib[1], fib[-1]
  2444.     return fib[0] if n == 0 else fib[1] if n == 1 else fib[-1]
  2445. if __name__ == "__main__":
  2446.     N: int = 100000
  2447.     print(fibonacci(N))
  2448. (venv)$ ./fibonacci.py
  2449. Traceback (most recent call last):
  2450. File "/home/saran/codelearn/python/./fibonaccid.py", line 23, in <module>
  2451. print(fibonacci(N))
  2452. ValueError: Exceeds the limit (4300 digits) for integer string conversion;
  2453. use sys.set_int_max_str_digits() to increase the limit
  2454. (venv)$ python -X int_max_str_digits=0 fibonacci.py
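As an alternative to the -X flag, the limit can be lifted from inside the program with the function named in the error message. A minimal sketch; the hasattr guard is an assumption added so the snippet also runs on interpreters that predate the limit.

```python
import sys

n = 10 ** 5000   # an integer with more than 4300 decimal digits

if hasattr(sys, "set_int_max_str_digits"):
    previous = sys.get_int_max_str_digits()
    sys.set_int_max_str_digits(0)        # 0 disables the limit
    try:
        digits = len(str(n))
    finally:
        sys.set_int_max_str_digits(previous)   # restore the old limit
else:
    digits = len(str(n))

print(digits)
```

Restoring the previous value keeps the safeguard active for the rest of the program.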
  2455. # binary methods on integer types
  2456. 1. Return the number of bits necessary to represent an integer in binary,
  2457. excluding the sign and leading zeros:
  2458. >>> n = -37
  2459. >>> bin(n)
  2460. '-0b100101'
  2461. >>> n.bit_length()
  2462. 6
  2463. 2. Return the number of ones in the binary representation of the absolute
  2464. value of the integer:
  2465. >>> n = 19
  2466. >>> bin(n)
  2467. '0b10011'
  2468. >>> n.bit_count()
  2469. 3
  2470. >>> (-n).bit_count()
  2471. 3
  2472. 3. Return an array of bytes representing an integer:
  2473. >>> (1024).to_bytes(2, byteorder='big')
  2474. b'\x04\x00'
  2475. The default values can be used to conveniently turn an integer into a
  2476. single byte object:
  2477. >>> (65).to_bytes()
  2478. b'A'
  2479. 4. Return the integer represented by the given array of bytes:
  2480. >>> int.from_bytes(b'\x00\x10', byteorder='big')
  2481. 16
  2482. >>> int.from_bytes(b'\x00\x10', byteorder='little')
  2483. 4096
  2484. >>> int.from_bytes(b'\xfc\x00', byteorder='big', signed=True)
  2485. -1024
  2486. >>> int.from_bytes(b'\xfc\x00', byteorder='big', signed=False)
  2487. 64512
  2488. >>> int.from_bytes([255, 0, 0], byteorder='big')
  2489. 16711680
  2490. Reference: https://docs.python.org/3/library/stdtypes.html#int.from_bytes
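These methods combine naturally: bit_length() gives the smallest byte count for to_bytes(), and from_bytes() reverses the conversion. A minimal round-trip sketch for a non-negative integer:

```python
n = 1024
length = (n.bit_length() + 7) // 8          # smallest number of bytes that holds n
raw = n.to_bytes(length, byteorder="big")
restored = int.from_bytes(raw, byteorder="big")

print(length, raw)
print(restored == n)
```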
  2491. # disassemble - disassembler for python bytecode
  2492. >>> import math
  2493. >>> import dis
  2494. >>> dis.dis("math.pi")
  2495. 0 0 RESUME 0
  2496. 1 2 LOAD_NAME 0 (math)
  2497. 4 LOAD_ATTR 2 (pi)
  2498. 24 RETURN_VALUE
  2499. >>> dis.dis("math.sin()")
  2500. 0 0 RESUME 0
  2501. 1 2 LOAD_NAME 0 (math)
  2502. 4 LOAD_ATTR 3 (NULL|self + sin)
  2503. 24 CALL 0
  2504. 32 RETURN_VALUE
  2505. >>> import numpy as np
  2506. >>> dis.dis("np.zeros()")
  2507. 0 0 RESUME 0
  2508. 1 2 LOAD_NAME 0 (np)
  2509. 4 LOAD_ATTR 3 (NULL|self + zeros)
  2510. 24 CALL 0
  2511. 32 RETURN_VALUE
  2512. Reference: https://docs.python.org/3/library/dis.html
  2513. # format specifiers
  2514. Format specifiers may also contain evaluated expressions. This allows code such as:
  2515. >>> import decimal
  2516. >>> width, precision = 10, 4
  2517. >>> value = decimal.Decimal('12.34567')
  2518. >>> f'result: {value:{width}.{precision}}'
  2519. 'result:      12.35'
  2520. Once expressions in a format specifier are evaluated (if necessary), format specifiers
  2521. are not interpreted by the f-string evaluator. Just as in str.format(), they are
  2522. merely passed in to the __format__() method of the object being formatted.
  2523. Reference: https://peps.python.org/pep-0498/#format-specifiers
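To see that the evaluated specifier is simply handed to __format__(), one can define a class that echoes the spec it receives. Tagged is a hypothetical class written only for this sketch.

```python
class Tagged:
    """Hypothetical class that shows the spec arriving in __format__."""
    def __init__(self, value):
        self.value = value

    def __format__(self, spec):
        return f"<{spec}|{self.value}>"

width, precision = 8, 2
print(f"{Tagged(3):{width}.{precision}f}")   # the nested fields are filled in first
print(f"{3.14159:{width}.{precision}f}")     # float interprets the same spec itself
```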
  2524. # invoke library for managing shell-oriented subprocesses
  2525. $ pip install invoke
  2526. $ invoke --list # to see which tasks are available, you use the --list option
  2527. $ invoke all
  2528. $ invoke clean
  2529. Reference:
  2530. Getting started: https://docs.pyinvoke.org/en/stable/getting-started.html
  2531. Invoking tasks:
  2532. https://docs.pyinvoke.org/en/stable/concepts/invoking-tasks.html#how-tasks-run
  2533. # ctypes
  2534. create a ctypes array (after "import ctypes"):
  2535. nval = 10 # number of elements
  2536. arr = (ctypes.c_int * nval)() # array with nval integer elements
  2537. for n in range(nval): # initialize
  2538.     arr[n] = n
  2539. converting an existing list into a ctypes array:
  2540. arr = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
  2541. nval = len(arr)
  2542. arr = (ctypes.c_int * nval)(*arr)
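A self-contained version of the list conversion, with a few checks on the resulting array. The name IntArray is hypothetical; it just makes the intermediate array type visible.

```python
import ctypes

values = [0, 1, 2, 3, 4]
IntArray = ctypes.c_int * len(values)   # array type: 5 consecutive C ints
arr = IntArray(*values)

print(len(arr), list(arr))
print(ctypes.sizeof(arr) == len(values) * ctypes.sizeof(ctypes.c_int))
arr[0] = 42                             # elements can be modified in place
print(arr[0])
```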
  2543. # numpy.ctypeslib.ndpointer
  2544. An ndpointer instance is used to describe an ndarray in restype and argtypes
  2545. specifications. This approach is more flexible than using, for example,
  2546. POINTER(c_double), since several restrictions can be specified, which are
  2547. verified upon calling the ctypes function. These include data type, number of
  2548. dimensions, shape and flags. If a given array does not satisfy the specified
  2549. restrictions, a TypeError is raised.
  2550. numpy.ctypeslib.ndpointer(dtype=None, ndim=None, shape=None, flags=None)
  2551. Parameters: dtype: data-type, optional
  2552. Array data-type.
  2553. ndim: int, optional
  2554. Number of array dimensions.
  2555. shape: tuple of ints, optional
  2556. Array shape.
  2557. flags: str or tuple of str
  2558. Array flags; may be one or more of:
  2559. * C_CONTIGUOUS/C/CONTIGUOUS
  2560. * F_CONTIGUOUS/F/FORTRAN
  2561. * OWNDATA/O
  2562. * WRITEABLE/W
  2563. * ALIGNED/A
  2564. * WRITEBACKIFCOPY/X
  2565. Returns: klass: ndpointer type object
  2566. A type object, which is an _ndptr instance containing dtype,
  2567. ndim, shape and flags information.
  2568. Raises: TypeError
  2569. If a given array does not satisfy the specified restrictions.
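The restriction check can be observed without loading a C library, by calling from_param, the hook that ctypes itself invokes on each argument. This is only a sketch: it assumes numpy is installed, and the names Float1D and accepts are hypothetical.

```python
import numpy as np
from numpy.ctypeslib import ndpointer

# Accept only 1-D, C-contiguous float64 arrays.
Float1D = ndpointer(dtype=np.float64, ndim=1, flags="C_CONTIGUOUS")

def accepts(array):
    """Return True if `array` satisfies the ndpointer restrictions."""
    try:
        Float1D.from_param(array)
        return True
    except TypeError:
        return False

print(accepts(np.zeros(4, dtype=np.float64)))        # matches all restrictions
print(accepts(np.zeros(4, dtype=np.int32)))          # wrong dtype
print(accepts(np.zeros((2, 2), dtype=np.float64)))   # wrong number of dimensions
```

In real use, Float1D would appear in the argtypes list of a function loaded with ctypes, and the same TypeError would be raised at call time.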
  2570. # np.double
  2571. >>> np.double is np.float64
  2572. True
  2573. # append numpy arrays to a list
  2574. Ex = []
  2575. for n in range(nsteps):
  2576.     for k in range(100):
  2577.         ex[k] = ...
  2578.     Ex.append(ex.copy()) # list of numpy arrays
  2579. # append numpy arrays to a numpy array
  2580. Ex = np.empty((0, ex.shape[0]))
  2581. for n in range(nsteps):
  2582.     for k in range(100):
  2583.         ex[k] = ...
  2584.     Ex = np.vstack((Ex, ex)) # 2-D numpy array, one row per step
  2585. # reshape numpy array
  2586. >>> ex = np.zeros(4*10, dtype=np.float64)
  2587. >>> ex = ex.reshape(10, 4)
  2588. >>> ex
  2589. array([[0., 0., 0., 0.],
  2590. [0., 0., 0., 0.],
  2591. [0., 0., 0., 0.],
  2592. [0., 0., 0., 0.],
  2593. [0., 0., 0., 0.],
  2594. [0., 0., 0., 0.],
  2595. [0., 0., 0., 0.],
  2596. [0., 0., 0., 0.],
  2597. [0., 0., 0., 0.],
  2598. [0., 0., 0., 0.]])
  2599. >>> ex = np.zeros(4*10, dtype=np.float64)
  2600. >>> ex = ex.reshape(ex.shape[0], 1)
  2601. >>> ex
  2602. array([[0.],
  2603. [0.],
  2604. [0.],
  2605. [0.],
  2606. [0.],
  2607. [0.],
  2608. [0.],
  2609. [0.],
  2610. [0.],
  2611. [0.],
  2612. [0.],
  2613. [0.],
  2614. [0.],
  2615. [0.],
  2616. [0.],
  2617. [0.],
  2618. [0.],
  2619. [0.],
  2620. [0.],
  2621. [0.],
  2622. [0.],
  2623. [0.],
  2624. [0.],
  2625. [0.],
  2626. [0.],
  2627. [0.],
  2628. [0.],
  2629. [0.],
  2630. [0.],
  2631. [0.],
  2632. [0.],
  2633. [0.],
  2634. [0.],
  2635. [0.],
  2636. [0.],
  2637. [0.],
  2638. [0.],
  2639. [0.],
  2640. [0.],
  2641. [0.]])
  2642. >>> ex = np.zeros(4*10, dtype=np.float64)
  2643. >>> ex = np.reshape(ex, (-1, 4))
  2644. >>> ex
  2645. array([[0., 0., 0., 0.],
  2646. [0., 0., 0., 0.],
  2647. [0., 0., 0., 0.],
  2648. [0., 0., 0., 0.],
  2649. [0., 0., 0., 0.],
  2650. [0., 0., 0., 0.],
  2651. [0., 0., 0., 0.],
  2652. [0., 0., 0., 0.],
  2653. [0., 0., 0., 0.],
  2654. [0., 0., 0., 0.]])
  2655. >>> ex = np.zeros(4*10, dtype=np.float64)
  2656. >>> ex.shape = (ex.size//4, 4)
  2657. >>> ex
  2658. array([[0., 0., 0., 0.],
  2659. [0., 0., 0., 0.],
  2660. [0., 0., 0., 0.],
  2661. [0., 0., 0., 0.],
  2662. [0., 0., 0., 0.],
  2663. [0., 0., 0., 0.],
  2664. [0., 0., 0., 0.],
  2665. [0., 0., 0., 0.],
  2666. [0., 0., 0., 0.],
  2667. [0., 0., 0., 0.]])
  2668. >>> ex = np.zeros(4*10, dtype=np.float64)
  2669. >>> ex.shape = (-1, 4)
  2670. >>> ex
  2671. array([[0., 0., 0., 0.],
  2672. [0., 0., 0., 0.],
  2673. [0., 0., 0., 0.],
  2674. [0., 0., 0., 0.],
  2675. [0., 0., 0., 0.],
  2676. [0., 0., 0., 0.],
  2677. [0., 0., 0., 0.],
  2678. [0., 0., 0., 0.],
  2679. [0., 0., 0., 0.],
  2680. [0., 0., 0., 0.]])
  2681. >>> ex = np.zeros(4*10, dtype=np.float64)
  2682. >>> ex = np.reshape(ex, (1, ex.size))
  2683. >>> ex
  2684. array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
  2685. 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
  2686. 0., 0., 0., 0., 0., 0., 0., 0.]])
  2687. >>> ex.shape
  2688. (1, 40)
  2689. >>> ex.size
  2690. 40
  2691. >>> ex = np.zeros(4*10, dtype=np.float64)
  2692. >>> ex.shape = (-1, 4)
  2693. >>> ex
  2694. array([[0., 0., 0., 0.],
  2695. [0., 0., 0., 0.],
  2696. [0., 0., 0., 0.],
  2697. [0., 0., 0., 0.],
  2698. [0., 0., 0., 0.],
  2699. [0., 0., 0., 0.],
  2700. [0., 0., 0., 0.],
  2701. [0., 0., 0., 0.],
  2702. [0., 0., 0., 0.],
  2703. [0., 0., 0., 0.]])
  2704. >>> ex.flatten()
  2705. array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
  2706. 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
  2707. 0., 0., 0., 0., 0., 0.])
  2708. >>> ex
  2709. array([[0., 0., 0., 0.],
  2710. [0., 0., 0., 0.],
  2711. [0., 0., 0., 0.],
  2712. [0., 0., 0., 0.],
  2713. [0., 0., 0., 0.],
  2714. [0., 0., 0., 0.],
  2715. [0., 0., 0., 0.],
  2716. [0., 0., 0., 0.],
  2717. [0., 0., 0., 0.],
  2718. [0., 0., 0., 0.]])
  2719. # type hints
  2720. Reference:
  2721. https://docs.python.org/3/library/typing.html
  2722. https://mypy.readthedocs.io/en/stable/cheat_sheet_py3.html
  2723. Checking the type of a variable:
  2724. x = [1, 2, 3, 4, 5] # list
  2725. type(x) # returns list
  2726. isinstance(x, list) # True
  2727. Type aliases:
  2728. A type alias is defined using the type statement (Python 3.12+), which creates
  2729. an instance of TypeAliasType. In this example, Vector and list[float] will be
  2730. treated equivalently by static type checkers:
  2731. type Vector = list[float]
  2732. def scale(scalar: float, vector: Vector) -> Vector:
  2733.     return [scalar * num for num in vector]
  2734. passes type checking (a list of floats qualifies as a Vector):
  2735. new_vector = scale(2.0, [1.0, -4.2, 5.4])
  2736. note: By default the bodies of untyped functions are not checked, consider using --check-untyped-defs [annotation-unchecked]
  2737. $ mypy --check-untyped-defs *.py
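On interpreters older than 3.12 the type statement is a syntax error. A runnable sketch of the same example using a plain assignment as the alias, which works on Python 3.9 and newer:

```python
Vector = list[float]   # assignment-style alias, usable before Python 3.12

def scale(scalar: float, vector: Vector) -> Vector:
    """Multiply every component of the vector by the scalar."""
    return [scalar * num for num in vector]

new_vector = scale(2.0, [1.0, -4.2, 5.4])
print(new_vector)
```

The annotations are not enforced at run time; a checker such as mypy is what reports a mismatch.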
  2738. # print docstrings
  2739. >>> import numpy as np
  2740. >>> import sources
  2741. >>> dir(np)
  2742. >>> dir(sources)
  2743. >>> help(np.zeros)
  2744. >>> help(sources.pulse)
  2745. >>> print(np.zeros.__doc__)
  2746. >>> print(sources.pulse.__doc__)
  2747. # measure execution time with timeit
  2748. >>> import timeit
  2749. >>> import numpy as np
  2750. >>> import sources
  2751. >>> ke = 200
  2752. >>> ex = np.zeros(ke, dtype=np.float64)
  2753. >>> hy = np.zeros(ke, dtype=np.float64)
  2754. >>> timeit.timeit('sources.pulse(ke, ex, hy)', globals=globals())
  2755. >>> timeit.timeit(lambda: sources.pulse(ke, ex, hy))
  2756. >>> timeit.Timer(lambda: sources.pulse(ke, ex, hy)).timeit()
  2757. Normally, such a function would run in a few milliseconds, but the reported
  2758. timings are in the order of seconds. That's because, by default, timeit.timeit
  2759. will run the benchmarked code 1 million times to provide a result where any
  2760. temporary change in speed of the execution won't impact the final result much.
  2761. The default value for number (the number of executions) is 1,000,000. Be aware that
  2762. running time-consuming code with the default value can take significant time.
  2763. >>> timeit.timeit('sources.pulse(ke, ex, hy)', number=10, globals=globals())
  2764. >>> timeit.timeit(lambda: sources.pulse(ke, ex, hy), number=10)
  2765. >>> timeit.Timer(lambda: sources.pulse(ke, ex, hy)).timeit(number=10)
  2766. timeit.repeat() repeats the timeit() measurement and returns the results as a list:
  2767. >>> timeit.repeat(lambda: sources.pulse(ke, ex, hy), repeat=5, number=100)
  2768. Reference: https://note.nkmk.me/en/python-timeit-measure/
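A self-contained variant of the calls above that runs without the sources module. Here pulse_stub is a hypothetical stand-in for sources.pulse, and the timings depend on the machine.

```python
import timeit

def pulse_stub(n):
    """Hypothetical stand-in for sources.pulse: sums n squares."""
    return sum(k * k for k in range(n))

# Total time for 1000 calls, as a float in seconds.
single = timeit.timeit(lambda: pulse_stub(200), number=1000)

# Five independent measurements of 1000 calls each.
runs = timeit.repeat(lambda: pulse_stub(200), repeat=5, number=1000)

print(f"total for 1000 calls: {single:.6f} s")
print(f"best time per call:   {min(runs) / 1000:.3e} s")
```

Taking the minimum of the repeated runs is a common choice, since slower runs usually reflect interference from other processes.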
  2769. # profiling CPU usage
  2770. $ python -m cProfile sources.py
  2771. $ python
  2772. >>> import cProfile
  2773. >>> profile = cProfile.Profile()
  2774. >>> import sources
  2775. >>> import numpy as np
  2776. >>> ke = 200
  2777. >>> ex = np.zeros(ke, dtype=np.float64)
  2778. >>> hy = np.zeros(ke, dtype=np.float64)
  2779. >>> profile.runcall(lambda: sources.pulse(ke, ex, hy))
  2780. >>> profile.print_stats()
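The same steps can be scripted, with pstats used to sort and trim the report. In this sketch pulse_stub is again a hypothetical stand-in for sources.pulse, and the report is captured in a string so that it can be inspected or saved.

```python
import cProfile
import io
import pstats

def pulse_stub(n):
    """Hypothetical stand-in for sources.pulse: sums n squares."""
    return sum(k * k for k in range(n))

profile = cProfile.Profile()
profile.runcall(pulse_stub, 10_000)     # profile a single call

buffer = io.StringIO()
stats = pstats.Stats(profile, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)   # five most expensive entries
report = buffer.getvalue()
print(report)
```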
  2781. # profiling and timing script
  2782. %time    time the execution of a single statement
  2783. %timeit  time repeated execution of a single statement for more accuracy
  2784. %prun    run script with the profiler
  2785. %lprun   run script with the line-by-line profiler
  2786. %memit   measure the memory use of a single statement
  2787. %mprun   run code with the line-by-line memory profiler
  2788. Note: the last four commands are not bundled with IPython - you'll need to get
  2789. the line_profiler and memory_profiler extensions.
  2790. Reference:
  2791. https://jakevdp.github.io/PythonDataScienceHandbook/01.07-timing-and-profiling.html
  2792. # timing script snippets with %time and %timeit
  2793. In [1]: import numpy as np
  2794. In [2]: ke = 200
  2795. In [3]: ex = np.zeros(ke, dtype=np.float64)
  2796. In [4]: hy = np.zeros(ke, dtype=np.float64)
  2797. In [5]: import sources
  2798. In [6]: %time sources.pulse(ke, ex, hy)
  2799. In [7]: %timeit sources.pulse(ke, ex, hy)
  2800. # profiling full script with %prun
  2801. In [1]: import numpy as np
  2802. In [2]: ke = 200
  2803. In [3]: ex = np.zeros(ke, dtype=np.float64)
  2804. In [4]: hy = np.zeros(ke, dtype=np.float64)
  2805. In [5]: import sources
  2806. In [6]: %prun sources.pulse(ke, ex, hy)
  2807. # line-by-line profiling with %lprun
  2808. In [1]: import numpy as np
  2809. In [2]: ke = 200
  2810. In [3]: ex = np.zeros(ke, dtype=np.float64)
  2811. In [4]: hy = np.zeros(ke, dtype=np.float64)
  2812. In [5]: import sources
  2813. In [6]: %load_ext line_profiler # load the line_profiler IPython extension
  2814. In [7]: %lprun -f sources.pulse sources.pulse(ke, ex, hy)
  2815. # memory profiling with %memit and %mprun
  2816. In [1]: import numpy as np
  2817. In [2]: ke = 200
  2818. In [3]: %load_ext memory_profiler
  2819. In [4]: %memit ex = np.zeros(ke, dtype=np.float64)
  2820. In [5]: %memit hy = np.zeros(ke, dtype=np.float64)
  2821. In [6]: %mprun -f sources.pulse sources.pulse(ke, ex, hy)
  2822. # pytest
  2823. $ pytest
  2824. $ pytest ./
  2825. $ pytest file.py
  2826. $ pytest file.py --collect-only
  2827. $ pytest file.py::test_name
  2828. $ pytest file.py -k <sub_string>/test_name
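For reference, a minimal test file that the commands above could collect. The file name test_fibonacci.py and the function names are hypothetical. pytest picks up functions whose names start with test_ and reports a failure when an assert does not hold.

```python
# test_fibonacci.py (hypothetical file name)

def fibonacci(n):
    """Iterative Fibonacci with fibonacci(0) == 0 and fibonacci(1) == 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def test_fibonacci_base_cases():
    assert fibonacci(0) == 0
    assert fibonacci(1) == 1

def test_fibonacci_tenth():
    assert fibonacci(10) == 55
```

With this file, `pytest test_fibonacci.py::test_fibonacci_tenth` runs a single test and `pytest test_fibonacci.py -k base` selects the tests whose names contain "base".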