1.5 Storing Data

There are several ways to store data for various tasks while a program is running.

1.5.1 Using Python Dictionaries to store data

  • You can think of a dictionary as a list where each item (value) has a name (key).
  • We access the items by using the ‘key’.
  • Note the use of { }.

Usage Example 1

# Create a dictionary
shapes = {'square': 'a 2D object','cube': 'a 3D object' }
print(shapes.keys())
# dict_keys(['square', 'cube'])

Usage Example 2

shapes = {'square': 'a 2D object','cube': 'a 3D object' }

print(shapes['square'])
# a 2D object
print(shapes['cube'])
# a 3D object

Usage Example 3

shapes = {'square': 'a 2D object','cube': 'a 3D object' }

# Add new data under a new key
shapes['triangle']='has three sides.'
print(shapes)
# {'square': 'a 2D object', 'cube': 'a 3D object', 'triangle': 'has three sides.'}

Usage Example 4

Here is one way you can use a dictionary in a for loop.

shapes = {'square': 'a 2D object','cube': 'a 3D object' }

# .items() yields (key, value) pairs, in that order
for key, value in shapes.items():
    print(key, ' ---> ', value)
# square  --->  a 2D object
# cube  --->  a 3D object
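
The examples above only look up keys that exist. As a minimal sketch of what happens otherwise, the snippet below shows two common ways to guard against a missing key. The key 'circle' and the fallback text are made up for illustration.

```python
shapes = {'square': 'a 2D object', 'cube': 'a 3D object'}

# Membership test: 'in' checks the keys, not the values
print('square' in shapes)
# True
print('circle' in shapes)
# False

# .get() returns a fallback instead of raising KeyError
print(shapes.get('circle', 'unknown shape'))
# unknown shape

# Plain indexing with a missing key raises KeyError
try:
    shapes['circle']
except KeyError as err:
    print('Missing key:', err)
# Missing key: 'circle'
```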

1.5.2 Pandas DataFrame

For the sake of completeness: a pandas DataFrame is a much more sophisticated way to store data. It holds tabular data with labeled rows and columns.
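
As a rough sketch of what that can look like, the snippet below builds a small DataFrame from a dictionary of lists. It assumes pandas is installed; the column names and values are invented for illustration.

```python
import pandas as pd

# Each dictionary key becomes a column; each list holds that column's values
df = pd.DataFrame({
    'shape': ['square', 'cube', 'triangle'],
    'dimensions': [2, 3, 2],
})

print(df.shape)                  # (rows, columns)
# (3, 2)
print(df['shape'].tolist())      # Select a column by name
# ['square', 'cube', 'triangle']
print(df.loc[1, 'dimensions'])   # Select one cell by row label and column name
# 3
```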

1.5.3 Using Python Lists to store data

#----------- Simple Python lists -----------#
py_list1 = [0,1,2,3,4,5]
py_list2 = ['a','b','c','d','e']
py_list3 = [['a','b','c'],['d','e','f'],['g','h','i']]

Usage Example 1

print(py_list1[0])              # Indexing
# 0
print(py_list1[2::4])           # Slicing
# [2]
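
The expression py_list1[2::4] uses the general form list[start:stop:step], where any omitted part falls back to its default. A short sketch of how the three parts interact, reusing py_list1:

```python
py_list1 = [0, 1, 2, 3, 4, 5]

print(py_list1[1:4])     # start=1, stop=4 (the stop index is excluded)
# [1, 2, 3]
print(py_list1[::2])     # every second element
# [0, 2, 4]
print(py_list1[2::4])    # start at index 2, step 4: index 6 is out of range
# [2]
print(py_list1[::-1])    # a negative step walks backwards
# [5, 4, 3, 2, 1, 0]
print(py_list1[-2:])     # a negative index counts from the end
# [4, 5]
```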

Usage Example 2

print(py_list3[1])              # Indexing 2D lists
# ['d', 'e', 'f']
print(py_list3[1][0])           # Indexing 2D lists
# d

Usage Example 3

py_list2[2] = 10000             # Modifying a list
print(py_list2)
# ['a', 'b', 10000, 'd', 'e']

Usage Example 4

print(py_list1 + py_list2)      # Addition
# [0, 1, 2, 3, 4, 5, 'a', 'b', 10000, 'd', 'e']

Usage Example 5

print(2*py_list2)               # Multiplication
# ['a', 'b', 10000, 'd', 'e', 'a', 'b', 10000, 'd', 'e']

1.5.4 Using NumPy Arrays to store data

NumPy arrays behave very differently from Python lists. For example, the comma-style index np_list3[1,0] works on a 2D array but raises an error on a nested Python list.
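
A minimal sketch of that difference, using a small 2×2 example. The names nested and grid are hypothetical and chosen only for this illustration.

```python
import numpy as np

nested = [['a', 'b'], ['c', 'd']]      # Python list of lists
grid = np.array(nested)                # 2D NumPy array

print(nested[1][0])                    # Chained indexing works for lists
# c
print(grid[1, 0])                      # Comma indexing works for arrays
# c

try:
    nested[1, 0]                       # Lists do not accept a tuple index
except TypeError as err:
    print('TypeError:', err)
```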

import numpy as np
#----------- Simple NumPy arrays -----------#
np_list1 = np.array([0,1,2,3,4,5])
np_list2 = np.array([0,10,100,1000,10000,100000])
np_list3 = np.array([['a','b','c'],['d','e','f'],['g','h','i']])

Usage Example 1

print(np_list1[0])                          # Indexing
# 0
print(np_list1[2::4])                       # Slicing
# [2]

Usage Example 2

print(np_list3[1])                          # Indexing 2D arrays
# ['d' 'e' 'f']
print(np_list3[1][0])                       # Indexing 2D arrays
# d
print(np_list3[1,0])                        # Easier syntax for indexing 2D arrays
# d

Usage Example 3

np_list2[2] = 10000                         # Modifying an array
print(np_list2)
# [     0     10  10000   1000  10000 100000]

Usage Example 4

print(np_list1 + np_list2)                  # Addition
# [     0     11  10002   1003  10004 100005]

Usage Example 5

print(2*np_list2)                           # Multiplication
# [     0     20  20000   2000  20000 200000]

Usage Example 6

print(np.vstack((np_list1,np_list2)))       # Combining 'vertically'
# [[     0      1      2      3      4      5]
#  [     0     10  10000   1000  10000 100000]]

Usage Example 7

print(np.hstack((np_list1,np_list2)))       # Combining 'horizontally'
# [     0      1      2      3      4      5      0     10  10000   1000
#   10000 100000]
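
One way to see what the two functions do is to compare the shapes of their results. A small sketch, with the hypothetical arrays a and b standing in for np_list1 and np_list2:

```python
import numpy as np

a = np.array([0, 1, 2])
b = np.array([10, 20, 30])

stacked_v = np.vstack((a, b))   # Each input becomes a row
stacked_h = np.hstack((a, b))   # Inputs are joined end to end

print(stacked_v.shape)
# (2, 3)
print(stacked_h.shape)
# (6,)
print(stacked_v[1, 2])          # Row 1, column 2
# 30
```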

1.5.5 Python Lists vs. NumPy arrays

py_list = [1,2,3]
np_arr = np.array([1,2,3])
py_list_2d = [[1,2,3],[10,20,30],[100,200,300]]
np_arr_2d = np.array([[1,2,3],[10,20,30],[100,200,300]])

Action          Python list                 Result         NumPy array                Result               Comment
Mutability                                  Mutable                                   Mutable
Addition        py_list + [4,5]             [1,2,3,4,5]    np_arr + [4,5]             Error
                                                           np_arr + [4,5,6]           [5,7,9]
Multiplication  2*py_list                   [1,2,3,1,2,3]  2*np_arr                   [2,4,6]
Powers          py_list**2                  Error          np_arr**2                  [1,4,9]
Functions       pow(py_list,2)              Error          pow(np_arr,2)              [1,4,9]
Boolean         py_list >= 2                Error          np_arr >= 2                [False, True, True]
Masking         py_list[[True,False,True]]  Error          np_arr[[True,False,True]]  [1,3]
Indexing        py_list_2d[:, 1]            Error          np_arr_2d[:, 1]            [2,20,200]
                py_list_2d[1,:]             Error          np_arr_2d[1,:]             [10,20,30]
Appending       py_list += [100]            [1,2,3,100]    np.append(np_arr,100)      [1,2,3,100]          Python lists are faster.
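
A few rows of the comparison can be checked directly. The sketch below is one way to do that. The helper try_op is hypothetical and exists only to report an error type instead of stopping the script.

```python
import numpy as np

py_list = [1, 2, 3]
np_arr = np.array([1, 2, 3])

def try_op(label, op):
    """Run op() and print its result, or the error type if it fails."""
    try:
        print(label, '->', op())
    except (TypeError, ValueError) as err:
        print(label, '->', type(err).__name__)

try_op('py_list + [4,5]', lambda: py_list + [4, 5])
# py_list + [4,5] -> [1, 2, 3, 4, 5]
try_op('np_arr + [4,5]', lambda: np_arr + [4, 5])
# np_arr + [4,5] -> ValueError
try_op('py_list**2', lambda: py_list ** 2)
# py_list**2 -> TypeError
try_op('np_arr**2', lambda: np_arr ** 2)
# np_arr**2 -> [1 4 9]
try_op('py_list >= 2', lambda: py_list >= 2)
# py_list >= 2 -> TypeError
try_op('np_arr >= 2', lambda: np_arr >= 2)
# np_arr >= 2 -> [False  True  True]

# Appending: a list grows in place, np.append returns a new array
py_list.append(100)
appended = np.append(np_arr, 100)
print(py_list, np_arr, appended)
# [1, 2, 3, 100] [1 2 3] [  1   2   3 100]
```

Because np.append builds a new array on every call, growing an array one element at a time copies the data repeatedly, which is consistent with the table's note that Python lists are faster for appending.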