Remember as little as possible! Instead have a few good websites (or notes) that you can access easily.
Don’t try to remember syntax. Instead try to understand how the syntax is structured (e.g. Python uses indentation to group statements into blocks).
Experiment! Experiment! Experiment! Playing with the code does not cost anything. So, be curious. Go with your intuition and try things out. Things won’t work so well at the start but it will get better.
Keyboard shortcuts make life easy and efficient. Learn as many as you can.
Don’t work alone. Learning is more fun and faster if you discuss and clarify things with a colleague.
Just learn what you need. When starting programming it is better to learn the basics and just what you need to solve your problem.
Copy and edit others’ code. This is the fastest way to get started. After some practice you will become more independent.
Python is a free, high-level (i.e. more English-like), highly readable programming language.
Python is an interpreted language. The latest version of the interpreter is Python 3.
There are several ways to install the Python interpreter. One of the best (and easiest) is with the Anaconda distribution.
Python instructions (code) are usually saved in a file with the extension .py and then ‘passed’ on to the Python interpreter. However, if you use the (powerful and versatile) Jupyter Notebook, the file will be saved with a .ipynb extension.
Python code is shown in a grey box like this:
print('Code is shown in a box like this')
Step 1. Visit the download page at Anaconda.
Step 2. Download and install the 64-bit Python 3 distribution suitable for your operating system.
(Windows users should run Anaconda as Administrator)
There are many ways to issue commands to the Python interpreter. A Jupyter Notebook is one very easy environment in which to write Python (or R or Julia) code.
Jupyter Notebook files have the extension .ipynb.
Jupyter allows you to combine Markdown and Python in the same document.
Jupyter Notebooks have two types of cells: Markdown cells and Code cells. Both types of cells are ‘run’ by pressing SHIFT + ENTER.
Take a look at this site for some cool tricks and optimisations for these notebooks.
Colab is a (free) platform for coding in Python (and some other languages). Colab offers an environment (almost) identical to Jupyter Notebooks.
Some advantages of using Colab are:
- Colab allows us to use Python without having to install it on our computers.
- Colab enables us to share our code with others (just like any other Google document).
- Colab does all the processing on its servers, so it won’t tax your computer (of course, you pay the price of computations being slightly slower).
Let’s see what Colab can do by watching their introductory video.
Markdown is a simple but powerful language for ‘word processing’. Markdown is not only succinct and efficient but (like \(\LaTeX\)) tries to separate content from style. So, you will have fewer distractions and can focus on producing the content.
You can convert the content of Markdown (.md) files directly into PDF or HTML formats. Markdown is also used in Jupyter notebooks.
Markdown has different ‘flavours’ that have slightly different features. You can read more about it at the Wikipedia page.
OS | Example Software |
---|---|
iOS | Werdsmith |
OS X | MacDown |
Windows | WriteMonkey |
Cross platform | Typora |
Web based | StackEdit, Dillinger, Madoko, Markdown & LaTeX Editor |
scatter (which is commented below).
fmt is short for ‘format string’. This decides the shape of the data point.
from matplotlib import pyplot as plt

# Some data for plotting
x = [0, 1, 2, 3, 4, 5]
y_1 = [0, 2, 4, 6, 8, 10]
y_2 = [0, 4, 8, 12, 16, 20]
err = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]

# Lets start plotting
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(5, 5))
ax.plot(x, y_1, color='red', linestyle='dashed', label='$Y_1$ values')
ax.errorbar(x, y_2, yerr=err, xerr=.25, color='black', fmt='o', label='$Y_2$ values')
# ax.scatter(x, y_2, color='blue', label='$Y_2$ values')

ax.set_xlabel('x-values')
ax.set_ylabel('y-values')
ax.set_title('X vs Y')
ax.grid(alpha=.25)
ax.legend(loc='upper left')

plt.savefig('simple-01.png', dpi=150)
plt.show()
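To see what the format string does, the short sketch below draws the same data with three different fmt values and reads back the marker and line style that matplotlib chose for each. The data, the vertical offsets and the particular fmt values are made up for illustration, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook
from matplotlib import pyplot as plt

# Made-up data for illustration
x = [0, 1, 2, 3]
y = [0, 1, 4, 9]
err = [0.5, 0.5, 0.5, 0.5]

fig, ax = plt.subplots()
styles = {}
for offset, fmt in enumerate(['o', 's--', '^:']):
    # Shift each copy upwards so the three series do not overlap
    shifted = [value + 3 * offset for value in y]
    container = ax.errorbar(x, shifted, yerr=err, fmt=fmt, label=f"fmt='{fmt}'")
    data_line = container.lines[0]  # the line object holding the markers
    styles[fmt] = (data_line.get_marker(), data_line.get_linestyle())

ax.legend(loc='upper left')
print(styles)
```

The first character of each fmt picks the marker ('o' circle, 's' square, '^' triangle), and anything after it picks the line style used to join the points.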
Comment out the errorbar plot and uncomment the scatter plot.
Please spend not more than 5 minutes on this exercise.
from matplotlib import pyplot as plt

# Some data for plotting
x = [0, 1, 2, 3, 4, 5]
y_1 = [0, 2, 4, 6, 8, 10]
y_2 = [0, 4, 8, 12, 16, 20]
err = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]

# Lets start plotting
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(5, 5))
ax.plot(x, y_1, color='green', linestyle='dashed', label='$Y_1$ values')
ax.scatter(x, y_2, color='blue', label='$Y_2$ values')

ax.set_xlabel('x-values')
ax.set_ylabel('y-values')
ax.set_title('X vs Y and 2Y')
ax.grid(alpha=.25, color='blue')
ax.legend(loc='upper left')

plt.savefig('simple-01_ex-01.png', dpi=150)
plt.show()
import numpy as np
from matplotlib import pyplot as plt

x = np.linspace(-np.pi, np.pi, num=100, endpoint=True)
cos_x = np.cos(x)
sin_x = np.sin(x)

fig, axes = plt.subplots()
axes.plot(x, sin_x, label='sin x')
axes.plot(x, cos_x, label='cos x')

axes.fill_between(x, cos_x, sin_x, where=cos_x > sin_x,
                  color='orange', alpha=.125, label='cos x > sin x')

axes.fill_between(x, cos_x, sin_x, where=cos_x < sin_x,
                  color='b', alpha=.075, label='cos x < sin x')

axes.set_xticks([-np.pi, -np.pi/2, 0, np.pi/2, np.pi])
axes.set_xticklabels([r'$-\pi$', r'$-\pi/2$', r'$0$', r'$+\pi/2$', r'$+\pi$'])

axes.legend()
axes.grid(axis='x', alpha=.5)

plt.savefig('simple-02.png', dpi=150)
plt.show()
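The where argument of fill_between in the code above is just a NumPy array of True/False values, one per x value. As a minimal sketch (the variable names mask, n_selected and x_selected are only for illustration), the following inspects that array for the condition cos x > sin x:

```python
import numpy as np

x = np.linspace(-np.pi, np.pi, num=100, endpoint=True)
cos_x = np.cos(x)
sin_x = np.sin(x)

mask = cos_x > sin_x            # one True/False entry per x value
n_selected = int(mask.sum())    # True counts as 1, False as 0
x_selected = x[mask]            # only the x values where the condition holds

print(n_selected, 'of', x.size, 'points satisfy cos x > sin x')
print('from x =', x_selected.min(), 'to x =', x_selected.max())
```

Roughly half of the points are selected, because cos x exceeds sin x between \(-3\pi/4\) and \(\pi/4\), which is half of the plotted range.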
matplotlib allows several syntaxes. One is referred to as the pyplot API. It is simple but can be limited.
from matplotlib import pyplot as plt

# Some data for plotting
x = [0, 1, 2, 3, 4, 5]
y_1 = [0, 2, 4, 6, 8, 10]
y_2 = [0, 4, 8, 12, 16, 20]
err = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]

# Lets start plotting
plt.figure(figsize=(5, 5))
plt.plot(x, y_1, color='red', linestyle='dashed', label='$Y_1$ values')
plt.errorbar(x, y_2, yerr=err, color='black', fmt='o', label='$Y_2$ values')

plt.xlabel('x-values')
plt.ylabel('y-values')
plt.title('X vs Y')
plt.grid(alpha=.25)
plt.legend(loc='upper left')
plt.show()
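To make the difference between the two syntaxes concrete, here is a small side-by-side sketch. The pyplot functions act on whatever matplotlib considers the 'current' axes, while the other style keeps the figure and axes in variables (fig and ax) and calls methods on them. The data is made up, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook
from matplotlib import pyplot as plt

x = [0, 1, 2, 3, 4, 5]
y = [0, 2, 4, 6, 8, 10]

# pyplot interface: functions act on the 'current' figure and axes
plt.figure(figsize=(5, 5))
plt.plot(x, y)
plt.title('X vs Y')
current_ax = plt.gca()   # ask pyplot which axes it has been drawing on

# Object-oriented interface: we hold the figure and axes explicitly
fig, ax = plt.subplots(figsize=(5, 5))
ax.plot(x, y)
ax.set_title('X vs Y')

print(current_ax.get_title(), '|', ax.get_title())
```

Both routes produce the same kind of plot. Holding ax explicitly becomes more convenient once a figure has several axes, because there is no need to keep track of which one is 'current'.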
matplotlib comes with several predefined styles that we can apply with just a single line!
plt.style.use('<NAME OF STYLE>')
plt.xkcd(True/False)
from matplotlib import pyplot as plt

# plt.style.use('default')
plt.style.use('bmh')
# plt.style.use('ggplot')
# plt.style.use('grayscale')
# plt.style.use('fivethirtyeight')
# plt.xkcd()

# Some data for plotting
x = [0, 1, 2, 3, 4, 5]
y_1 = [0, 2, 4, 6, 8, 10]
y_2 = [0, 4, 8, 12, 16, 20]
err = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]

# Lets start plotting
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(5, 5))
ax.plot(x, y_1, label='$Y_1$ values')
ax.errorbar(x, y_2, yerr=err, fmt='o', label='$Y_2$ values')

ax.set_xlabel('x-values')
ax.set_ylabel('y-values')
ax.set_title('X vs Y')
ax.grid(alpha=.25)
ax.legend(loc='upper left')

plt.savefig('simple-01_styled.png', dpi=150)
plt.show()
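If you are not sure which style names exist on your installation, matplotlib can list them. The sketch below also shows one possible way of applying a style to a single plot only, using plt.style.context, so that later plots keep the default look. The choice of 'ggplot' and the tiny data set are just for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook
from matplotlib import pyplot as plt

# Names that can be passed to plt.style.use()
print(plt.style.available)

default_facecolor = plt.rcParams['axes.facecolor']

# Apply a style temporarily; settings are restored when the block ends
with plt.style.context('ggplot'):
    styled_facecolor = plt.rcParams['axes.facecolor']
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [0, 1, 4])

restored_facecolor = plt.rcParams['axes.facecolor']
print(default_facecolor, styled_facecolor, restored_facecolor)
```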
Seaborn is a Python package that is built on top of matplotlib and allows us to do useful things with less code. The Seaborn functions are often friendlier than just using matplotlib.
Even if you do not use the functions of Seaborn, you can still make your plots look nicer just by calling Seaborn!
Seaborn allows you to adjust the styles of your plot (out of the box) in two convenient ways:
set_theme()
set_context()
Please visit this Seaborn help page for more information.
Seaborn style and context.
from matplotlib import pyplot as plt
import seaborn as sns

# Please refer to http://seaborn.pydata.org/tutorial/aesthetics.html
# Set default Seaborn parameters
sns.set_theme()
sns.set_style('darkgrid')  # Options: darkgrid, whitegrid, dark, white, and ticks
sns.set_context(context='talk')  # Options: paper, notebook, talk, and poster

# Some data for plotting
x = [0, 1, 2, 3, 4, 5]
y_1 = [0, 2, 4, 6, 8, 10]
y_2 = [0, 4, 8, 12, 16, 20]
err = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]

# Lets start plotting
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(5, 5))
ax.plot(x, y_1, label='$Y_1$ values')
ax.errorbar(x, y_2, yerr=err, fmt='o', label='$Y_2$ values')

ax.set_ylabel('y-values')
ax.set_title('X vs Y')
ax.legend(loc='upper left')

plt.tight_layout()
plt.savefig('simple-01_styled-with-sns.png', dpi=150)
plt.show()
import numpy as np
from matplotlib import pyplot as plt

x = np.linspace(-np.pi, np.pi, num=100, endpoint=True)
cos_x = np.cos(x)
sin_x = np.sin(x)

fig, axes = plt.subplots(ncols=1, nrows=2, figsize=(8, 5), sharex=True)

axes[0].plot(x, sin_x, label='sin x')
axes[0].fill_between(x, sin_x, 0, alpha=.125)

axes[1].plot(x, cos_x, label='cos x', color='green')
axes[1].fill_between(x, cos_x, 0, color='green', alpha=.125)

axes[1].set_xlabel('x values')

for ax in axes.flat:
    ax.legend(loc='upper left', frameon=False)
    ax.grid(axis='x', alpha=.5)

plt.tight_layout()
plt.savefig('multi-plots-01.png')
plt.show()
from matplotlib import pyplot as plt
import numpy as np

#--------- Generate cosine and sine values --------#
x = np.linspace(-np.pi, np.pi, num=100, endpoint=True)
cos_x = np.cos(x)
sin_x = np.sin(x)
fun_cos_x = np.exp(-x)*np.cos(5*x)
fun_sin_x = np.exp(-x)*np.sin(5*x)

#------------------ Plot the data -----------------#
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(12, 8), sharex='col', sharey='row')

# Plot 0,0 : Cosines
axes[0, 0].plot(x, cos_x, color='r', label='cos x')
axes[0, 0].plot(x, cos_x**2, color='grey', linestyle='--', label='cos$^2$ x')
axes[0, 0].set_title('Cosine x & Cosine$^2$ x')
axes[0, 0].set_xlim(-np.pi, np.pi)
axes[0, 0].legend(loc='lower center', frameon=False)

# Plot 0,1 : Sine
axes[0, 1].plot(x, sin_x, color='g', label='sin x')
axes[0, 1].plot(x, sin_x**2, color='grey', linestyle='--', label='sin$^2$ x')
axes[0, 1].set_title('Sin x & Sin$^2$ x')
axes[0, 1].set_ylim(-1.25, 1.25)
axes[0, 1].legend(loc='lower right', frameon=False)

# Plot 1,0 : Function with Cosine
axes[1, 0].plot(x, fun_cos_x, color='r')
axes[1, 0].fill_between(x, fun_cos_x, 0, color='r', alpha=.125)
axes[1, 0].set_title('Function with Cosine')
axes[1, 0].set_xlim(-np.pi, np.pi)

# Plot 1,1 : Function with Sine
axes[1, 1].plot(x, fun_sin_x, color='g')
axes[1, 1].fill_between(x, fun_sin_x, 0, color='g', alpha=.125)
axes[1, 1].set_title('Function with Sine')
axes[1, 1].set_xlim(-np.pi, np.pi)

axes[1, 0].set_xlabel('Angle (radians)')
axes[1, 1].set_xlabel('Angle (radians)')

axes[0, 0].set_ylim(-1, 1)
axes[0, 1].set_ylim(-1, 1)

axes[1, 0].set_ylim(-20, 15)
axes[1, 1].set_ylim(-20, 15)

for a in axes.flat:  # 'flat' 'opens' the 2D array into a simple 1D sequence
    a.grid(alpha=.5)
    a.set_xlim(-np.pi, np.pi)

plt.tight_layout()
plt.savefig('multi-plots-02.png')
plt.show()
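It can help to look at what plt.subplots actually returns when there is more than one plot. The sketch below (a bare 2 x 2 grid with nothing plotted; the names flat_axes and titles are only for illustration) shows that axes is a NumPy array indexed by row and column, and that axes.flat visits the same objects row by row.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook
from matplotlib import pyplot as plt

fig, axes = plt.subplots(nrows=2, ncols=2)
print(type(axes).__name__, axes.shape)  # a NumPy array of Axes objects

flat_axes = list(axes.flat)  # the same four Axes, visited row by row
titles = []
for index, ax in enumerate(flat_axes):
    # Recover the (row, column) position from the position in the flat sequence
    row, col = divmod(index, axes.shape[1])
    ax.set_title(f'axes[{row}, {col}]')
    titles.append(ax.get_title())

print(titles)
```

This is why a loop over axes.flat is a convenient place for settings that every panel shares, such as the grid.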
from matplotlib import pyplot as plt
import numpy as np

# --------- Generate cosine and sine values --------#
x = np.linspace(-np.pi, np.pi, num=200, endpoint=True)
cos_x = np.cos(x)
fun_cos_x = np.exp(-x)*np.cos(5*x)

# ------------------ Plot the data -----------------#
fig, axes = plt.subplots(figsize=(10, 5))

axes.plot(x, cos_x, label='$\\cos x$', color='black', linestyle='--')
axes.fill_between(x, cos_x, -1, alpha=.05)

axes.set_xlim(-np.pi, np.pi)
axes.set_ylim(-1, 1)
axes.set_xlabel('Angle (radians)')
axes.set_ylabel('$\\cos (x)$', rotation=0, ha='right')
axes.grid(alpha=.5)
axes.legend(loc=(.825, .9), frameon=False)

# Create a twin axis that shares the x-axis
ax_twin = axes.twinx()
ax_twin.plot(x, fun_cos_x, label='$e^{-x}\\cos (5x)$')
ax_twin.set_ylim(-20, 15)
ax_twin.set_ylabel('$e^{-x}\\cos (5x)$', rotation=0, ha='left')
ax_twin.fill_between(x, fun_cos_x, -20, alpha=.1)
ax_twin.legend(loc=(.825, .825), frameon=False)

plt.tight_layout()
plt.savefig('multi-plots-03.png')
plt.show()
Recast the plot from Example 1 so that the plots are in a single row, with a shared y axis.
Please spend not more than 10 minutes on this exercise.
import numpy as np
from matplotlib import pyplot as plt

x = np.linspace(-np.pi, np.pi, num=100, endpoint=True)
cos_x = np.cos(x)
sin_x = np.sin(x)

fig, axes = plt.subplots(ncols=2, nrows=1, figsize=(20, 5), sharey=True)

axes[0].plot(x, sin_x, label='sin x')
axes[0].fill_between(x, sin_x, 0, alpha=.125)

axes[1].plot(x, cos_x, label='cos x', color='green')
axes[1].fill_between(x, cos_x, 0, color='green', alpha=.125)

for ax in axes.flat:
    ax.legend(loc='upper left')
    ax.grid(axis='x', alpha=.5)
    ax.set_xlabel('x values')

plt.tight_layout()
plt.savefig('multi-plots-01_ex.png')
plt.show()
Please work on this in your own time at home!
The Lennard-Jones potential is a simple model for the interaction between two atoms as a function of their distance, \(r\). The potential (\(U\)), inter-atomic force (\(F\)), depth (\(\epsilon\)) and position (\(r_0\)) of the well are given by:
\[ \begin{align*} U(r) &= \dfrac{B}{r^{12}} - \dfrac{A}{r^{6}}\\[1em] F(r) = -\dfrac{dU}{dr} &= \dfrac{12 B}{r^{13}} - \dfrac{6 A}{r^{7}}\\[1em] \epsilon &= -\dfrac{A^2}{4B}\\[1em] r_0 &= \left(\dfrac{2B}{A}\right)^{1/6} \end{align*} \]
\(A\) and \(B\) are positive constants.
For small displacements from the equilibrium inter-atomic separation (where \(F = 0\)), the potential may be approximated by the harmonic oscillator function \(V\) given by:
\[ \begin{align} V(r) &= \dfrac{1}{2}k(r-r_0)^2 + \epsilon \\[1em] \text{where}\quad k &= \left.\dfrac{d^2U}{dr^2}\right|_{r_0} = \dfrac{156 B}{{r_0}^{14}} - \dfrac{42 A}{{r_0}^{8}} \end{align} \]
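Before plotting, it can be reassuring to check these formulas numerically. The sketch below uses the values of \(A\) and \(B\) (converted to kelvin) from the plotting code in this section and confirms that the force vanishes at \(r_0\), that \(U(r_0) = \epsilon\), and that the formula for \(k\) agrees with a finite-difference estimate of the second derivative. The step size h and the names U, F and k_numeric are choices made for this illustration.

```python
import numpy as np

# Constants from the plotting code of this section, converted from J to K
A = 1.024E-23 / 1.381E-23   # K nm^6
B = 1.582E-26 / 1.381E-23   # K nm^12


def U(r):
    """Lennard-Jones potential."""
    return B / r**12 - A / r**6


def F(r):
    """Inter-atomic force, F = -dU/dr."""
    return 12 * B / r**13 - 6 * A / r**7


r0 = (2 * B / A) ** (1 / 6)              # position of the well
eps = -A**2 / (4 * B)                    # depth of the well
k = 156 * B / r0**14 - 42 * A / r0**8    # curvature of U at r0

# Central finite-difference estimate of the second derivative at r0
h = 1e-4
k_numeric = (U(r0 + h) - 2 * U(r0) + U(r0 - h)) / h**2

print('r0 =', r0, 'nm')
print('F(r0) =', F(r0))
print('U(r0) =', U(r0), ' eps =', eps)
print('k =', k, ' k (numeric) =', k_numeric)
```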
from matplotlib import pyplot as plt
import numpy as np

my_colors = {'force': '#1b9e77', 'potential': '#d95f02', 'harmonic': '#7570b3'}

# ------------------------------ Values for plotting ------------------------
n = 1000  # Number of points
r_min, r_max = .3275, .8  # Plotting range

# -------------------------- Constants for the potential --------------------
A = 1.024E-23  # J nm^6
B = 1.582E-26  # J nm^12

A = A/1.381E-23  # K nm^6
B = B/1.381E-23  # K nm^12

r0 = (2.*B/A)**(1.0/6.0)
eps = -1./4.*A**2./B  # Epsilon

# ------------------------------- Data for plotting -------------------------
r = np.linspace(r_min, r_max, n)  # Generate r values for plotting
potential = B/r**12 - A/r**6  # Calculate potential
force = 12*B/r**13 - 6*A/r**7  # Calculate force

r_2 = np.linspace(r0 - .065, r0 + .065, n)  # Generate r values for plotting
k = 156.*B/r0**14. - 42.*A/r0**8
v = 1./2.*k*(r_2-r0)**2 + eps  # Harmonic approximation

# -------------------------------- Start plotting ---------------------------
fig, axes = plt.subplots(figsize=(10, 8))

axes.plot(r, potential, label='Potential $U(r)$', color=my_colors['potential'], linestyle='-')
axes.plot(r_2, v, linestyle='--', color=my_colors['harmonic'], label='Harmonic $V(r)$')
axes.grid(True, axis='both', alpha=.5)

axes.text(0.565, -135, "Lennard-Jones Potential\nwith Harmonic Approximation",
          fontsize=16, horizontalalignment='left')
axes.set_xlabel('Interatomic separation (nm)')
axes.set_ylabel('Potential (K)')
axes.set_ylim(-150, 150)
axes.set_xlim(.3, .8)
axes.legend(loc='upper right', frameon=False)

# Plot force
ax_twin = axes.twinx()  # Get a new y-axis for the force
ax_twin.plot(r, force, label='Force', color=my_colors['force'], linestyle='-')
ax_twin.set_ylabel('Force (K/nm)')  # A and B are in kelvin, so the force is in K/nm
ax_twin.legend(loc=(.8, .8625), frameon=False)
ax_twin.set_ylim(-900, 900)
ax_twin.set_xlim(.3, .8)

plt.savefig('plot-lj.png', dpi=150)
plt.show()
The file Spectrum 01 contains data from a diffraction experiment with an atomic hydrogen lamp. The data is in the form of two columns. The first is the measured angle (in radians) and the second is the measured intensity of the light (in an arbitrary unit). We expect to see the lines for red (656.3 nm) and green (434.0 nm) at angles \(\pm 0.473\) and \(\pm 0.306\) respectively.
Plot this data and indicate the position of the red and green lines.
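As a quick sanity check on the expected angles, we can ask whether the two lines are consistent with each other. The sketch below assumes (this is not stated in the exercise) that the pattern comes from first-order diffraction by a grating, so that \(d \sin\theta = \lambda\), and computes the spacing \(d\) implied by each line. The names lines and spacing are only for illustration.

```python
import numpy as np

# Wavelength (nm) and expected angle (radians) of each line, from the text
lines = {'red': (656.3, 0.473), 'green': (434.0, 0.306)}

spacing = {}
for name, (wavelength, angle) in lines.items():
    # Assumed first-order grating relation: d * sin(theta) = wavelength
    spacing[name] = wavelength / np.sin(angle)
    print(f'{name}: d = {spacing[name]:.1f} nm')
```

If the assumption holds, both lines should give nearly the same spacing, which is a useful cross-check before marking the angles on the plot.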
import numpy as np
from matplotlib import pyplot as plt

data = np.loadtxt('spectrum-01.txt', skiprows=2)
angle = data[:, 0]
intensity = data[:, 1]

fig, axes = plt.subplots(figsize=(8.09, 5))
axes.plot(angle, intensity, color='black')
axes.fill_between(x=angle, y1=intensity, y2=0, color='k', alpha=.25)
axes.grid(alpha=.25)

axes.set_xlabel('Angle (radians)')
axes.set_ylabel('Intensity (%)')
axes.set_title('Source 1')

ang, wavelength = 0.473, 656.3
axes.vlines([-ang, ang], 0, 70, linestyle='--',
            alpha=.5, color=(.5, 0, 0), label=f'{wavelength} nm')

ang, wavelength = 0.306, 434.0
axes.vlines([-ang, ang], 0, 70, linestyle='--',
            alpha=.5, color=(0, .5, 0), label=f'{wavelength} nm')

axes.legend()
plt.savefig('data-01.png', dpi=150)
plt.show()
The sites shown below share CO\(_2\) and temperature data from ice core measurements at Lake Vostok. Plot these to see if there is any correlation between temperature and CO\(_2\) levels.
Data | Link |
---|---|
Temperature | https://cdiac.ess-dive.lbl.gov/ftp/trends/temp/vostok/vostok.1999.temp.dat |
CO\(_2\) | https://cdiac.ess-dive.lbl.gov/ftp/trends/co2/vostok.icecore.co2 |
Note:
- You can read the data directly by giving the URL.
- Remember to use skiprows.
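If skiprows is unfamiliar, the sketch below shows what it does on a tiny made-up text file held in memory (the header lines and numbers are invented and are not the Vostok data). np.loadtxt ignores the first skiprows lines and converts the rest into a 2D array, from which columns can be sliced.

```python
import io
import numpy as np

# A made-up file: two header lines followed by two columns of numbers
raw_text = """Made-up station data
age temperature
100 -1.5
200 -2.0
300 -0.5
"""

# Skip the two header lines; the remaining lines become a 2D array
data = np.loadtxt(io.StringIO(raw_text), skiprows=2)
age = data[:, 0]
temperature = data[:, 1]

print(data.shape)
print(age, temperature)
```

For a real data file, open it in a browser or text editor first and count the number of header lines to decide the value of skiprows.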
import numpy as np
from matplotlib import pyplot as plt

plt.style.use('ggplot')

src = 'https://cdiac.ess-dive.lbl.gov/ftp/trends/temp/vostok/vostok.1999.temp.dat'
t_data = np.loadtxt(src, skiprows=59)
t_age = t_data[:, 1]
t_temperature = t_data[:, 3]

src = 'https://cdiac.ess-dive.lbl.gov/ftp/trends/co2/vostok.icecore.co2'
co2_data = np.loadtxt(src, skiprows=20)
co2_age = co2_data[:, 1]
co2_concentration = co2_data[:, 3]

# Lets start plotting
fig, (co2_ax, t_ax) = plt.subplots(nrows=2, ncols=1, sharex=True, figsize=(10, 4))

co2_ax.plot(co2_age/1000, co2_concentration)
t_ax.plot(t_age/1000, t_temperature, color='#0077BB')

co2_ax.set_ylabel('CO$_2$ Concentration')
t_ax.set_ylabel('Temperature (C)')
t_ax.set_xlabel('Millennia before present')

for ax in [co2_ax, t_ax]:
    ax.grid(alpha=.35)
    ax.set_xlim(0, 430)

plt.suptitle('Vostok Ice Core Data for global CO$_2$ and Temperatures')
fig.tight_layout(rect=[0, 0.03, 1, 0.95])
plt.savefig('data-ex.png', dpi=150)
plt.show()
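Looking at the two curves is one way to judge correlation; putting a number on it is another. The two records are sampled at different ages, so one possible approach is to interpolate the temperature onto the CO\(_2\) ages and then compute a correlation coefficient. The sketch below illustrates the idea with synthetic stand-in arrays (two sine waves, not the Vostok data), so the value it prints says nothing about the real records.

```python
import numpy as np

# Synthetic stand-ins for the two records (NOT the Vostok data)
t_age = np.linspace(0, 400_000, 400)
t_temperature = 4 * np.sin(2 * np.pi * t_age / 100_000) - 4

co2_age = np.linspace(2_000, 398_000, 150)
co2_concentration = 230 + 40 * np.sin(2 * np.pi * co2_age / 100_000)

# Estimate the temperature at the ages where CO2 was measured.
# np.interp expects the known x values (t_age) to be increasing.
temperature_at_co2_age = np.interp(co2_age, t_age, t_temperature)

# Pearson correlation coefficient between the two aligned series
r = np.corrcoef(co2_concentration, temperature_at_co2_age)[0, 1]
print('correlation coefficient:', r)
```

With the real data, co2_age, co2_concentration, t_age and t_temperature from the code above could be used in place of the synthetic arrays, provided the ages are in increasing order.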
“The greatest value of a picture is when it forces us to notice what we never expected to see”.
—John W. Tukey, Exploratory Data Analysis
Here are a few visualisation fundamentals that you should think about. These are based on the ideas presented by Alberto Cairo in “The Truthful Art: Data, Charts, and Maps for Communication”.
Not all charts are equal. Pick an appropriate chart (see here for example).
Our brains have been fine tuned by evolution
Our minds are capacity limited: the more unnecessary stuff you add, the less we will pick up.
Following are some ‘rules’ for the sensible use of colour. They have been extracted from this article by Stephen Few.
If you want different objects of the same color in a table or graph to look the same, make sure that the background—the color that surrounds them—is consistent.
If you want objects in a table or graph to be easily seen, use a background color that contrasts sufficiently with the object.
Use color only when needed to serve a particular communication goal.
Use different colors only when they correspond to differences of meaning in the data.
Use soft, natural colors to display most information and bright and/or dark colors to highlight information that requires greater attention.
When using color to encode a sequential range of quantitative values, stick with a single hue (or a small set of closely related hues) and vary intensity from pale colors for low values to increasingly darker and brighter colors for high values.
Non-data components of tables and graphs should be displayed just visibly enough to perform their role, but no more so, for excessive salience could cause them to distract attention from the data.
To guarantee that most people who are colorblind can distinguish groups of data that are color coded, avoid using a combination of red and green in the same display.
Avoid using visual effects in graphs.
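The rule about encoding a sequential range with a single hue can be tried out directly in matplotlib. The sketch below is one possible illustration: it uses the built-in 'Blues' colormap and some made-up values, scales the values to the range 0 to 1, and colours each bar accordingly, so that low values are pale and high values are dark.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook
from matplotlib import pyplot as plt
import numpy as np

# Made-up quantitative values to encode with colour
values = np.array([2, 5, 9, 14, 20])

# Scale the values to the range 0..1 expected by a colormap
scaled = (values - values.min()) / (values.max() - values.min())

cmap = plt.get_cmap('Blues')   # a single-hue sequential colormap
colors = cmap(scaled)          # one RGBA colour per value

fig, ax = plt.subplots()
ax.bar(range(len(values)), values, color=colors)
ax.set_xlabel('Item')
ax.set_ylabel('Value')
```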
Original website: colorbrewer2.org.
Also see: Every ColorBrewer Scale
These are alternative tools to pick a suitable colour palette.
Title | Author | Link | Comments |
---|---|---|---|
Fundamentals of Data Visualization | Claus O. Wilke | Link | Highly recommended. Written for scientists. |
The Truthful Art: Data, Charts, and Maps for Communication | Alberto Cairo | | Highly recommended. Has a bias towards journalism. |
Storytelling with Data: A Data Visualization Guide for Business Professionals | Cole Nussbaumer Knaflic | | Easy read. Not for science but has useful ideas. |
Scientific Computing for Chemists | Charles J. Weiss | Link | Highly recommended. |