1.5 Storing Data
There are several ways to store data for various tasks while a program is running.
1.5.1 Using Python Dictionaries to store data
- You can think of a dictionary as a list in which each item (value) has a name (key).
- We access the items by using the key.
- Note the use of curly braces { } to create a dictionary.
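To make the comparison with a list concrete, here is a minimal sketch. The names `shape_list` and `shape_dict` are made up for this illustration only.

```python
# Hypothetical names, used only for this illustration.
shape_list = ['a 2D object', 'a 3D object']      # items are found by position
shape_dict = {'square': 'a 2D object',           # items are found by key
              'cube': 'a 3D object'}

from_list = shape_list[0]          # we must remember that position 0 is the square
from_dict = shape_dict['square']   # the key says what we are asking for

print(from_list == from_dict)
# True
```

Both lookups return the same description; the dictionary version simply documents itself through the key.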
Usage Example 1
# Create a dictionary
shapes = {'square': 'a 2D object', 'cube': 'a 3D object'}
print(shapes.keys())
# dict_keys(['square', 'cube'])
Usage Example 2
shapes = {'square': 'a 2D object', 'cube': 'a 3D object'}
print(shapes['square'])
# a 2D object
print(shapes['cube'])
# a 3D object
Usage Example 3
shapes = {'square': 'a 2D object', 'cube': 'a 3D object'}
# Add new data under a new key
shapes['triangle'] = 'has three sides.'
print(shapes)
# {'square': 'a 2D object', 'cube': 'a 3D object', 'triangle': 'has three sides.'}
Usage Example 4
Here is one way you can use a dictionary in a for loop. Note that .items() gives the key first and the value second.
shapes = {'square': 'a 2D object', 'cube': 'a 3D object'}
for key, value in shapes.items():
    print(key, '--->', value)
# square ---> a 2D object
# cube ---> a 3D object
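When looping with `.items()`, each step hands back a (key, value) pair, key first. The short sketch below, offered only as an illustration, lists what `.items()`, `.keys()` and `.values()` contain for the same dictionary.

```python
shapes = {'square': 'a 2D object', 'cube': 'a 3D object'}

pairs = list(shapes.items())     # each pair is (key, value), in that order
keys = list(shapes.keys())       # just the keys
values = list(shapes.values())   # just the values

print(pairs)
# [('square', 'a 2D object'), ('cube', 'a 3D object')]
print(keys)
# ['square', 'cube']
print(values)
# ['a 2D object', 'a 3D object']
```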
1.5.2 Pandas DataFrame
For the sake of completeness, a pandas DataFrame is a much more sophisticated way to store data.
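As a rough idea of what this looks like, here is a minimal sketch that assumes the pandas package is installed. The column names and values are made up for this illustration.

```python
import pandas as pd

# Hypothetical data, made up for this illustration.
df = pd.DataFrame({'shape': ['square', 'cube'],
                   'dimensions': [2, 3]})

print(df.shape)                  # (number of rows, number of columns)
# (2, 2)
print(list(df.columns))          # every column has a name
# ['shape', 'dimensions']
print(df['dimensions'].sum())    # a whole column can be processed at once
# 5
```

A DataFrame behaves a little like a dictionary of named columns, which is why the column is retrieved with `df['dimensions']`.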
1.5.3 Using Python Lists to store data
#----------- Simple Python lists -----------#
py_list1 = [0,1,2,3,4,5]
py_list2 = ['a','b','c','d','e']
py_list3 = [['a','b','c'],['d','e','f'],['g','h','i']]
Usage Example 1
print(py_list1[0]) # Indexing
# 0
print(py_list1[2::4]) # Slicing
# [2]
Usage Example 2
print(py_list3[1]) # Indexing 2D lists
# ['d', 'e', 'f']
print(py_list3[1][0]) # Indexing 2D lists
# d
Usage Example 3
py_list2[2] = 10000 # Modifying a list
print(py_list2)
# ['a', 'b', 10000, 'd', 'e']
Usage Example 4
print(py_list1 + py_list2) # Addition
# [0, 1, 2, 3, 4, 5, 'a', 'b', 10000, 'd', 'e']
Usage Example 5
print(2*py_list2) # Multiplication
# ['a', 'b', 10000, 'd', 'e', 'a', 'b', 10000, 'd', 'e']
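The slice `py_list1[2::4]` used earlier can look cryptic. A slice has the general form `start:stop:step`, and the sketch below (an illustration only; the variable names on the left are made up) shows a few variations on the same list.

```python
py_list1 = [0, 1, 2, 3, 4, 5]

# A slice has the form list[start:stop:step]; the stop position is not included.
first_three = py_list1[0:3]         # positions 0, 1, 2
every_other = py_list1[::2]         # whole list, in steps of 2
from_two_by_four = py_list1[2::4]   # positions 2, 6, ... but 6 does not exist

print(first_three)
# [0, 1, 2]
print(every_other)
# [0, 2, 4]
print(from_two_by_four)
# [2]
```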
1.5.4 Using Numpy Arrays to store data
Numpy arrays behave very differently from Python lists. For example, indexing with np_list3[1,0] works for a Numpy array, but the same syntax does not work with Python lists.
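The difference in indexing can be checked directly. The following is a minimal sketch that assumes NumPy is installed; `py_grid` and `np_grid` are made-up names for this illustration.

```python
import numpy as np

# Hypothetical names, used only for this illustration.
py_grid = [['a', 'b', 'c'], ['d', 'e', 'f']]
np_grid = np.array(py_grid)

print(np_grid[1, 0])      # NumPy accepts row and column together
# d

try:
    py_grid[1, 0]         # a list accepts only one integer (or slice) at a time
    list_result = 'worked'
except TypeError:
    list_result = 'TypeError'

print(list_result)
# TypeError
print(py_grid[1][0])      # the list equivalent uses two separate lookups
# d
```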
import numpy as np
#----------- Simple Numpy arrays -----------#
np_list1 = np.array([0,1,2,3,4,5])
np_list2 = np.array([0,10,100,1000,10000,100000])
np_list3 = np.array([['a','b','c'],['d','e','f'],['g','h','i']])
Usage Example 1
print(np_list1[0]) # Indexing
# 0
print(np_list1[2::4]) # Slicing
# [2]
Usage Example 2
print(np_list3[1]) # Indexing 2D lists
# ['d' 'e' 'f']
print(np_list3[1][0]) # Indexing 2D lists
# d
print(np_list3[1,0]) # Easier syntax for Indexing 2D lists.
# d
Usage Example 3
np_list2[2] = 10000 # Modifying an array
print(np_list2)
# [ 0 10 10000 1000 10000 100000]
Usage Example 4
print(np_list1 + np_list2) # Addition
# [ 0 11 10002 1003 10004 100005]
Usage Example 5
print(2*np_list2) # Multiplication
# [ 0 20 20000 2000 20000 200000]
Usage Example 6
print(np.vstack((np_list1,np_list2))) # Combining 'vertically'
# [[ 0 1 2 3 4 5]
# [ 0 10 10000 1000 10000 100000]]
Usage Example 7
print(np.hstack((np_list1,np_list2))) # Combining 'horizontally'
# [ 0 1 2 3 4 5 0 10 10000 1000
# 10000 100000]
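One way to see the difference between the two ways of combining arrays is to look at the shape of the result. This is a small sketch with made-up arrays `a` and `b`, assuming NumPy is installed.

```python
import numpy as np

# Hypothetical arrays, made up for this illustration.
a = np.array([0, 1, 2])
b = np.array([10, 20, 30])

stacked_v = np.vstack((a, b))   # each array becomes a row
stacked_h = np.hstack((a, b))   # the arrays are joined end to end

print(stacked_v.shape)
# (2, 3)
print(stacked_h.shape)
# (6,)
```

Stacking vertically gives a 2D array with one row per input, while stacking horizontally gives a longer 1D array.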
1.5.5 Python Lists vs. Numpy arrays
py_list = [1,2,3]
np_arr = np.array([1,2,3])
py_list_2d = [[1,2,3],[10,20,30],[100,200,300]]
np_arr_2d = np.array([[1,2,3],[10,20,30],[100,200,300]])
Action | Python list | Result | Numpy array | Result | Comment |
---|---|---|---|---|---|
Mutability | | Mutable | | Mutable | |
Addition | py_list + [4,5] | [1,2,3,4,5] | np_arr + [4,5] | Error | |
Addition | | | np_arr + [4,5,6] | [5,7,9] | |
Multiplication | 2*py_list | [1,2,3,1,2,3] | 2*np_arr | [2,4,6] | |
Powers | py_list**2 | Error | np_arr**2 | [1,4,9] | |
Functions | pow(py_list,2) | Error | pow(np_arr,2) | [1,4,9] | |
Boolean | py_list >= 2 | Error | np_arr >= 2 | [False, True, True] | |
Masking | py_list[[True,False,True]] | Error | np_arr[[True,False,True]] | [1,3] | |
Indexing | py_list_2d[:, 1] | Error | np_arr_2d[:, 1] | [2,20,200] | |
Indexing | py_list_2d[1,:] | Error | np_arr_2d[1,:] | [10,20,30] | |
Appending | py_list += [100] | [1,2,3,100] | np.append(np_arr,100) | [1,2,3,100] | Python lists are faster. |