NumPy & File Handling

Python’s beginner libraries to start Machine Learning.

Deep Patel
4 min read · Feb 15, 2021
Credit: Author

In this blog, we will look at two Python topics (NumPy and file handling) that are necessary for Machine Learning. Please try running the code yourself for a better understanding.

Prerequisite: Basic knowledge of Python is necessary to understand this blog.

NumPy: NumPy is a Python library used for working with arrays. It also has functions for working in the domains of linear algebra, matrices, and the Fourier transform (an important image processing tool that decomposes an image into its sine and cosine components).
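To make the Fourier transform idea a little more concrete, here is a minimal sketch on a 1-D signal rather than an image (an assumption made only to keep the example short). The sample count and the number of cycles are arbitrary values chosen for illustration. The signal is a sine wave, and np.fft.rfft reports which frequency is the strongest one in it.

```python
import numpy as np

n_samples = 64   # illustrative choice, not from the blog
cycles = 3       # the sine wave completes 3 full cycles over the samples

t = np.arange(n_samples) / n_samples
signal = np.sin(2 * np.pi * cycles * t)

# rfft returns the frequency components of a real-valued signal
spectrum = np.abs(np.fft.rfft(signal))

# Index of the strongest frequency component
dominant = int(np.argmax(spectrum))
print(dominant)
# 3
```

The strongest component sits at index 3, which matches the 3 cycles we put into the signal.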

NumPy — a library of arrays

NumPy aims to provide an array object that is up to 50x faster than traditional Python lists. The array object in NumPy is called ndarray, and it comes with many supporting functions that make working with it very easy. We will also go through a few statistics using the NumPy library.
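As a rough sketch of why ndarray is convenient, compare doubling every value in a plain list with doubling an array. The sample values are made up for illustration, and the snippet shows only the difference in style, not a speed measurement.

```python
import numpy as np

prices = [10.0, 20.0, 30.0]   # hypothetical sample values

# With a plain list, each element is handled one by one in Python code
doubled_list = [p * 2 for p in prices]

# With an ndarray, the multiplication is applied to every element at once
doubled_arr = np.array(prices) * 2

print(doubled_list)
print(doubled_arr)
```

Note that writing prices * 2 on the plain list would repeat the list instead of doubling its values, which is one more reason to use an array for numeric work.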

#Note: The output of each statement is shown right below it.
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)
#[1 2 3 4 5]
#Creating zeros matrix
np.zeros((2,3))
#array([[0., 0., 0.],
#       [0., 0., 0.]])
#Creating an array of ones (1-D here, with an integer dtype)
np.ones(5,dtype=np.int32)
#array([1, 1, 1, 1, 1])
#Creating matrix with random data
np.random.rand(2, 3)
#array([[0.95580785, 0.98378873, 0.65133872],
#       [0.38330437, 0.16033608, 0.13826526]])
#Flattening or changing shape of matrix
a = np.ones((2,2))
b = a.flatten() #flatten() returns a 1-D copy of the array
print('Original shape :', a.shape)
print('Array :','\n', a)
print('Shape after flatten :',b.shape)
print('Array :','\n', b)
#Original shape : (2, 2)
#Array :
#[[1. 1.]
# [1. 1.]]
#Shape after flatten : (4,)
#Array :
#[1. 1. 1. 1.]
#Stacking arrays both horizontally and vertically
#np.arange creates a 1-D array from a given range (the stop value is excluded)

a = np.arange(0,5)
b = np.arange(5,10)
print('Array 1 :','\n',a)
print('Array 2 :','\n',b)
print('Vertical stacking :','\n',np.vstack((a,b)))
print('Horizontal stacking :','\n',np.hstack((a,b)))
#Array 1 :
#[0 1 2 3 4]
#Array 2 :
#[5 6 7 8 9]
#Vertical stacking :
#[[0 1 2 3 4]
# [5 6 7 8 9]]
#Horizontal stacking :
#[0 1 2 3 4 5 6 7 8 9]
#Type of data structure
b = np.array([3.1, 11.02, 6.2, 213.2, 5.2])
type(b)
#numpy.ndarray
#data-type of array
b.dtype
#dtype('float64')
# Slicing the numpy array
d = b[1:4]
print(d)
#[ 11.02 6.2 213.2 ]
# Get the number of dimensions of numpy array
b.ndim
#1
# Get the shape/size of numpy array
b.shape
#(5,)
# Get the mean of numpy array
mean = b.mean()
#47.74399999999999
# Get the standard deviation of numpy array
standard_deviation=b.std()
#82.76874134599365
# Get the biggest value in the numpy array
max_b = b.max()
#213.2
# Get the smallest value in the numpy array
min_b = b.min()
#3.1
# MATRIX Multiplication
mat1 = ([1, 6, 5],[3 ,4, 8],[2, 12, 3])
mat2 = ([3, 4, 6],[5, 6, 7],[6,56, 7])
np.dot(mat1, mat2)
#array([[ 63, 320,  83],
#       [ 77, 484, 102],
#       [ 84, 248, 117]])
# np.pi is a mathematical constant, not a function
np.pi
#3.141592653589793
# Calculate the sine of each element
y = np.sin(b)
#array([ 0.04158066, -0.99970171, -0.0830894 , -0.41532536, -0.88345466])
# Create a numpy array of 5 evenly spaced values in the range [-2, 2]
np.linspace(-2, 2, num=5)
#array([-2., -1., 0., 1., 2.])
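One point about the matrix multiplication step is easy to miss: np.dot computes the matrix product, while the * operator multiplies element by element. Here is a minimal sketch with two small 2x2 arrays (made-up values chosen so the arithmetic is easy to check by hand). The @ operator is shown as an equivalent way to write the matrix product for 2-D arrays.

```python
import numpy as np

m1 = np.array([[1, 2],
               [3, 4]])
m2 = np.array([[5, 6],
               [7, 8]])

# Matrix product: rows of m1 combined with columns of m2
matrix_product = np.dot(m1, m2)
same_product = m1 @ m2          # same result for 2-D arrays

# Element-wise product: each entry multiplied with the entry at the same position
elementwise = m1 * m2

print(matrix_product)
# [[19 22]
#  [43 50]]
print(elementwise)
# [[ 5 12]
#  [21 32]]
```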

NOTE: In Python, we read images as NumPy arrays, which you will see in the upcoming deep learning blogs.
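To get a feel for what "an image as an array" means, here is a minimal sketch that builds a tiny grayscale image by hand instead of loading a real file. The 4x4 size and the pixel values are assumptions for illustration only; real images are simply much larger arrays of the same kind.

```python
import numpy as np

# A made-up 4x4 grayscale "image": 0 is black, 255 is white
image = np.zeros((4, 4), dtype=np.uint8)
image[1:3, 1:3] = 255   # paint a white square in the middle

print(image.shape)   # (height, width)
print(image)

# Slicing crops the image, exactly as with any other array
top_left = image[:2, :2]
print(top_left)
```

Everything covered above (shape, slicing, mean, max) works on such an image array without any changes.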

File Handling in Python: File handling is an important part of any web application. Python has several functions for creating, reading, updating, and deleting files.

Handling FILES via Python
# OPENING FILE
#1st Way

fileref = open("olympics.txt", "r") #fileref is reference (opening a file in python3)
#opening file via absolute path (the line above uses a relative path)
#open('/Users/joebob01/myFiles/allProjects/myData/data2.txt', 'r')
#2nd way to open file
with open("olympics.txt","r") as fileref :
    #lines of code to work on file ....
    pass #placeholder so that the block is valid Python
#the file is closed automatically when the with block ends
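#READING and processing a file :
with open('fname', 'r') as fileref: # step 1
    lines = fileref.readlines() # step 2 - get list of all lines of text in file
    for lin in lines: # step 3 - for loop to iterate lines
        print(lin.strip()) # step 4 - process each line (here, as an example, print it without the trailing newline)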
#WRITING in a file
with open("filename","w") as f1 :
    for i in range(5) :
        sqr = i*i
        f1.write(str(sqr) + "\n") #write() accepts only strings, so convert int or float data with str() first
#NOTE : Opening a file in write mode 'w' overwrites its content (i.e. deletes the past content and saves only the new content), whereas in append mode 'a' data is added to the end of the past content.
#CLOSING a file
fileref.close()
#closing a file releases it back to the operating system and makes sure any buffered writes are saved
#DELETING file
import os
if os.path.exists("demofile.txt"):
    os.remove("demofile.txt")
else:
    print("The file does not exist")
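The difference between write mode 'w' and append mode 'a' described in the note above is easiest to see by running it. Here is a minimal sketch that works inside a temporary directory so that none of your own files are touched. The file name squares.txt and the three words written are made up for illustration.

```python
import os
import tempfile

# Work in a temporary directory that is removed automatically afterwards
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "squares.txt")   # hypothetical file name

    with open(path, "w") as f:      # 'w' creates the file
        f.write("first\n")

    with open(path, "w") as f:      # 'w' again: "first" is discarded
        f.write("second\n")

    with open(path, "a") as f:      # 'a': added after "second"
        f.write("third\n")

    with open(path, "r") as f:
        contents = f.read().splitlines()

print(contents)
# ['second', 'third']
```

Only "second" and "third" survive, because the second open in 'w' mode wiped out "first" before anything new was written.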

In Data Science we mostly deal with files in CSV (Comma Separated Values) format. We usually use pandas to open a CSV file and create a data frame from it. We will discuss pandas in more detail while learning core Data Science in upcoming blogs.
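Until we get to pandas, it helps to see what a CSV file actually contains. Here is a minimal sketch that uses Python's built-in csv module instead of pandas (a deliberate substitution, so the example needs no extra install). The column names and rows are invented for illustration, and the text is read from a string rather than from a real file.

```python
import csv
import io

# Made-up CSV content: a header row followed by two data rows
csv_text = "name,score\nAsha,91\nRavi,78\n"

# DictReader uses the header row as the keys for every following row
reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)

# Every value is read as a string, so convert numbers explicitly
scores = [int(row["score"]) for row in rows]

print(rows[0]["name"])
print(scores)
```

To read a real file, the io.StringIO(csv_text) part would be replaced with a file object opened as shown in the file handling section.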

For further reference, refer to the doc.

