Deal with Sequence of Data in Python
Simply Python built-in data structure that can be used as a container to hold a dynamically changing sequence of different data types (e.g., int, float, object)
Suited for dealing with a small amount of data
'Lists are mutable, so they are naturally suitable for dealing with a dynamic sequence of data. Oftentimes, when I need to ‘remember’ some values while iterating through a for loop, I will create a list and just append the value to the list. In addition, a list allows a mixture of data types, which is useful when I have no clue about the upcoming data types.'
Works well with Tabular data (like Excel Spreadsheets) and time series data
Consume more memory
Has better performance when number of rows is 500K or more
Indexing of pandas series is very slow
'When it comes to tabular data with row index and column index, my go-to choice is pandas.DataFrame, as it allows flexible access to values using integer position or index.'
# Importing pandas library import pandas as pd # Creating and initializing a nested list age = [['Aman', 95.5, "Male"], ['Sunny', 65.7, "Female"], ['Monty', 85.1, "Male"], ['toni', 75.4, "Male"]] # Creating a pandas dataframe df = pd.DataFrame(age, columns=['Name', 'Marks', 'Gender'])
A numpy array is a grid of values (of the same type) that are indexed by a tuple of positive integers
Works well with numerical data and signals; suited for fast scientific computing
Better performance when number of rows is 50K or less
Indexing of numpy Arrays is very fast
# Importing Numpy package import numpy as np # Creating a 3-D numpy array using np.array() org_array = np.array([[23, 46, 85], [43, 56, 99], [11, 34, 55]])
(Source: Jiahui Wang. Python List, NumPy, and Pandas. 2019.)