Python Pandas MCQ : Set 6

1). What is the output of the following code?

				
import pandas as pd
import numpy as np
d = {'Sr.no.': [1, 2, 3, 4], 'Name': ['Alex', 'John', 'Peter', 'Klaus'], 'Age': [30, np.nan, 29, np.nan]}
df = pd.DataFrame(d)
print("Nan values in the dataframe: ", df.isnull().values.sum())
				
			

a) 0
b) 1
c) 2
d) 3

Correct answer is: c) 2
Explanation: The code creates a DataFrame `df` with four rows and three columns (‘Sr.no.’, ‘Name’, and ‘Age’). The ‘Age’ column contains two NaN (Not a Number) values. `df.isnull()` returns a boolean DataFrame in which True marks a NaN value and False marks a non-NaN value, and `.values` exposes that result as a NumPy array. Calling `sum()` on the array counts the True entries, since True is treated as 1. The number printed after the message is therefore `2`, indicating that there are two NaN values in the DataFrame.
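
As a small illustrative sketch (not part of the original question), the same count can also be broken down per column. The variable names `per_column` and `total` are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Sr.no.': [1, 2, 3, 4],
    'Name': ['Alex', 'John', 'Peter', 'Klaus'],
    'Age': [30, np.nan, 29, np.nan],
})

# Missing values per column (a Series indexed by column name)
per_column = df.isna().sum()

# Total number of missing values in the whole DataFrame
total = int(per_column.sum())

print(per_column)
print("Total NaN values:", total)
```

`isna()` is an alias of `isnull()`, so either spelling gives the same result.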

2). What is the purpose of the following code?

				
import pandas as pd
d = {'Sr.no.': [1, 2, 3, 4], 'Name': ['Alex', 'John', 'Peter', 'Klaus'], 'Age': [30, 27, 29, 33], 'Salary': [50000, 65000, 58000, 66000]}
df = pd.DataFrame(d)
df = df.sample(frac=1)
print("\nNew DataFrame:")
print(df)

				
			

a) To create a DataFrame with given data and shuffle its rows randomly.
b) To sort the DataFrame rows in ascending order based on a specified column.
c) To remove duplicate rows from the DataFrame.
d) To convert the DataFrame into a NumPy array.

Correct answer is: a) To create a DataFrame with given data and shuffle its rows randomly.
Explanation: The given code performs the following actions:
1. Import the Pandas library as `pd`.
2. Create a Python dictionary `d` containing four columns: ‘Sr.no.’, ‘Name’, ‘Age’, and ‘Salary’.
3. Use the dictionary to create a DataFrame `df`.
4. `df.sample(frac=1)` is used to shuffle the rows of the DataFrame randomly. The parameter `frac=1` specifies that the entire DataFrame should be sampled, effectively shuffling the entire DataFrame.
5. Finally, the shuffled DataFrame is printed with the message “New DataFrame:”.
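
The shuffle described in the steps above is different on every run. A minimal sketch of a reproducible variant is shown here, assuming you want a fixed order for testing; the seed value 42 and the name `shuffled` are arbitrary choices for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alex', 'John', 'Peter', 'Klaus'],
    'Age': [30, 27, 29, 33],
})

# random_state fixes the seed so the same shuffle is produced on every run;
# reset_index(drop=True) discards the old row labels and renumbers from 0
shuffled = df.sample(frac=1, random_state=42).reset_index(drop=True)

print(shuffled)
```

Without `reset_index(drop=True)`, the shuffled rows keep their original index labels, which is what the quiz code prints.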

3). What is the output of the following code?

				
import pandas as pd
d = {'Rank':[1,2,3,4],'Name':['Alex','John','Peter','Klaus'],'Age':[30,27,29,33],'Salary':[50000,65000,58000,66000]}
df = pd.DataFrame(d)
df = df.rename(columns = {'Rank':'Sr.no.'})
print("New: \n",df)
				
			

a)
New:
Sr.no. Name Age Salary
0 1 Alex 30 50000
1 2 John 27 65000
2 3 Peter 29 58000
3 4 Klaus 33 66000

b)
New:
Rank Name Age Salary
0 1 Alex 30 50000
1 2 John 27 65000
2 3 Peter 29 58000
3 4 Klaus 33 66000

c)
New:
Sr.no. Name Age Salary
0 1 Alex 30 50000
1 2 John 27 65000
2 3 Peter 29 58000
3 4 Klaus 33 66000

d)
New:
Sr.no. Name Age Salary
1 1 Alex 30 50000
2 2 John 27 65000
3 3 Peter 29 58000
4 4 Klaus 33 66000

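Correct answer is: a)
New:
Sr.no. Name Age Salary
0 1 Alex 30 50000
1 2 John 27 65000
2 3 Peter 29 58000
3 4 Klaus 33 66000
Explanation: The code creates a DataFrame `df` using the provided dictionary `d`, which contains columns ‘Rank’, ‘Name’, ‘Age’, and ‘Salary’. The `rename()` method returns a new DataFrame in which the ‘Rank’ column is renamed to ‘Sr.no.’, and that result is assigned back to `df`. The printed DataFrame therefore shows ‘Sr.no.’ as the first column, with the default index 0 to 3 and the data unchanged. Option b still shows the old name ‘Rank’, and option d shows an index starting at 1. Option c, as listed, repeats option a.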
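
A point worth checking when working with `rename()` is that it returns a new DataFrame instead of changing the original, which is why the quiz code assigns the result back to `df`. The sketch below illustrates this on a shortened two-row version of the data; `columns_before` and `columns_after` are names introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Rank': [1, 2], 'Name': ['Alex', 'John']})

# Without assignment the renamed copy is discarded and df keeps its old column name
df.rename(columns={'Rank': 'Sr.no.'})
columns_before = list(df.columns)

# Assigning the result back makes df refer to the renamed DataFrame
df = df.rename(columns={'Rank': 'Sr.no.'})
columns_after = list(df.columns)

print(columns_before)
print(columns_after)
```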

4). What is the output of the following code?

				
import pandas as pd
d = {'Sr.no.': [1, 2, 3, 4], 'Name': ['Alex', 'John', 'Peter', 'Klaus'], 'Age': [30, 27, 29, 33], 'Salary': [50000, 65000, 58000, 66000]}
df = pd.DataFrame(d)
name_list = df['Name'].tolist()
print("List of names: ", name_list)
				
			

a) List of names: [‘Alex’, ‘John’, ‘Peter’, ‘Klaus’]
b) List of names: ‘Alex’, ‘John’, ‘Peter’, ‘Klaus’
c) List of names: ‘Alex’ ‘John’ ‘Peter’ ‘Klaus’
d) List of names: [‘Alex’, ‘John’, ‘Peter’]

Correct answer is: a) List of names: [‘Alex’, ‘John’, ‘Peter’, ‘Klaus’]
Explanation: The code creates a DataFrame named `df` with columns ‘Sr.no.’, ‘Name’, ‘Age’, and ‘Salary’. It then selects the ‘Name’ column with `df['Name']`, converts it to a Python list using the `tolist()` method, and assigns it to the variable `name_list`. Finally, it prints the list after the message, so the output is “List of names: [‘Alex’, ‘John’, ‘Peter’, ‘Klaus’]”, which is option a. Options b and c are incorrect because they omit the square brackets that Python prints around a list (option c also omits the commas). Option d is incorrect because it leaves out ‘Klaus’, although the ‘Name’ column has four entries.

5). What is the output of the following code?

				
import pandas as pd
d = {'Sr.no.':[1,2,3,4],'Name':['Alex','John','Peter','Klaus'],'Age':[30,27,29,33],'Salary':[50000,65000,58000,66000]}
df = pd.DataFrame(d)
print("Row where Salary has maximum value:")
print(df['Salary'].argmax())
				
			

a) 0
b) 1
c) 2
d) 3

Correct answer is: d) 3
Explanation: The code creates a DataFrame `df` with four columns: ‘Sr.no.’, ‘Name’, ‘Age’, and ‘Salary’. It prints a message and then the result of calling `argmax()` on the ‘Salary’ column. For a Series, `argmax()` returns the integer position of the first occurrence of the maximum value (`idxmax()` is the method that returns the index label). Here positions and labels coincide because `df` has the default index 0 to 3. The maximum salary, 66000, is at position 3 (the fourth row, counting from zero), so the number printed is 3.
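
The difference between a position and an index label only becomes visible with a non-default index. The following sketch, which uses the names as a hypothetical index purely for illustration, contrasts `argmax()` with `idxmax()`.

```python
import pandas as pd

# Same salaries as in the question, but indexed by name instead of 0..3
salary = pd.Series([50000, 65000, 58000, 66000],
                   index=['Alex', 'John', 'Peter', 'Klaus'])

position = salary.argmax()  # integer position of the maximum value
label = salary.idxmax()     # index label of the maximum value

print(position, label)
```

With the default integer index used in the question, both calls would give 3.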

6). What is the output of the following code?

				
import pandas as pd
d = {'Sr.no.':[1,2,3,4],'Name':['Alex','John','Peter','Klaus'],'Age':[30,27,29,33],'Salary':[50000,65000,58000,66000]}
df = pd.DataFrame(d)
print("Row where Salary has minimum value:")
print(df['Salary'].argmin())
				
			

a) 0
b) 1
c) 2
d) 3

Correct answer is: a) 0
Explanation: The code creates a DataFrame `df` with columns ‘Sr.no.’, ‘Name’, ‘Age’, and ‘Salary’. It prints a message and then the result of calling `argmin()` on the ‘Salary’ column. The `argmin()` method returns the integer position of the first occurrence of the minimum value in the Series. The ‘Salary’ column holds the values [50000, 65000, 58000, 66000], and the minimum, 50000, is in the first row (position 0), which belongs to ‘Alex’. The number printed is therefore 0.

7). What is the output of the following code?

				
import pandas as pd
import numpy as np
d = {'Sr.no.': [1, 2, 3, 4], 'Name': ['Alex', 'John', 'Peter', 'Klaus'], 'Age': [30, 27, 29, 33], 'Salary': [50000, 65000, 58000, 66000]}
df = pd.DataFrame(d)

for i in ["Company", 'Name']:
    if i in df.columns:
        print(f"{i} is present in DataFrame.")
    else:
        print(f"{i} is not present in DataFrame.")
				
			

a) Company is present in DataFrame.
Name is present in DataFrame.
b) Company is present in DataFrame.
Name is not present in DataFrame.
c) Company is not present in DataFrame.
Name is present in DataFrame.
d) Company is not present in DataFrame.
Name is not present in DataFrame.

Correct answer is: c) Company is not present in DataFrame.
Name is present in DataFrame.
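Explanation: In the given code, a DataFrame `df` is created from the provided dictionary `d`. The DataFrame has columns: ‘Sr.no.’, ‘Name’, ‘Age’, and ‘Salary’. The code then iterates through the list `["Company", 'Name']` and checks if each item is present in the DataFrame’s columns using the `in` operator. ‘Company’ is not one of the columns, so the `else` branch prints “Company is not present in DataFrame.”; ‘Name’ is a column, so the `if` branch prints “Name is present in DataFrame.”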

8). What is the output of the following code?

				
import pandas as pd
d = {'Sr.no.':[1,2,3,4],'Name':['Alex','John','Peter','Klaus'],'Age':[30,27,29,33],'Salary':[50000,65000,58000,66000]}
df = pd.DataFrame(d)
print(df.dtypes)
				
			

a) Sr.no. int64
Name object
Age int64
Salary int64
dtype: object

b) Sr.no. int32
Name object
Age int32
Salary int32
dtype: object

c) Sr.no. int32
Name object
Age int64
Salary int32
dtype: object

d) Sr.no. int64
Name object
Age int32
Salary int64
dtype: object

Correct answer is: a) Sr.no. int64
Name object
Age int64
Salary int64
dtype: object
Explanation: The given code creates a DataFrame named `df` with four columns: ‘Sr.no.’, ‘Name’, ‘Age’, and ‘Salary’. The `print(df.dtypes)` statement will display the data types of each column in the DataFrame. In the DataFrame, the ‘Sr.no.’, ‘Age’, and ‘Salary’ columns contain integer values, so their data types are represented as `int64`, which is the standard 64-bit integer data type. The ‘Name’ column contains strings, so its data type is represented as `object`.

9). What is the output of the following code?

				
import pandas as pd
l = [['Virat','cricket'],['Messi','football']]
df = pd.DataFrame(l)
print(df)
				
			

a)
0 1
0 Virat cricket
1 Messi football

b)
0 1
0 Virat
1 Messi
2 cricket
3 football

c)
0
0 Virat
1 Messi
2 cricket
3 football

d)
0 1
0 1 2
1 1 2

Correct answer is: a)
0 1
0 Virat cricket
1 Messi football
Explanation: The given code snippet imports the Pandas library as `pd` and creates a DataFrame `df` from a list of lists `l`. The DataFrame `df` contains two rows and two columns with data as follows:
0 1
0 Virat cricket
1 Messi football

10). What is the output of the following code?

				
import pandas as pd
d = {'Sr.no.':[1,2,3,4],'Name':['Alex','John','Peter','Klaus'],'Age':[30,27,29,33],'Salary':[50000,65000,58000,66000]}
df = pd.DataFrame(d)
index = df.columns.get_loc('Age')
print("Index no of age column: ", index)
				
			

a) Index no of age column: 2
b) Index no of age column: 3
c) Index no of age column: 1
d) Index no of age column: 0

Correct answer is: a) Index no of age column: 2
Explanation: The code first imports the pandas library as pd. Then, it creates a DataFrame `df` with columns ‘Sr.no.’, ‘Name’, ‘Age’, and ‘Salary’. Next, it retrieves the index of the column ‘Age’ using the `get_loc()` method, and the index is assigned to the variable `index`. Finally, it prints the result as “Index no of age column: ” followed by the value of `index`. Since ‘Age’ is the third column in the DataFrame (index 2, as Python uses 0-based indexing), the correct output will be “Index no of age column: 2”. Thus, option a is the correct answer.
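
One common use of the position returned by `get_loc()` is position-based selection with `iloc`. The sketch below is an illustration under that assumption; the names `pos` and `age_by_position` are introduced here and do not appear in the question.

```python
import pandas as pd

df = pd.DataFrame({
    'Sr.no.': [1, 2, 3, 4],
    'Name': ['Alex', 'John', 'Peter', 'Klaus'],
    'Age': [30, 27, 29, 33],
    'Salary': [50000, 65000, 58000, 66000],
})

# Zero-based position of the 'Age' column
pos = df.columns.get_loc('Age')

# Select the same column by position instead of by name
age_by_position = df.iloc[:, pos]

print(pos)
print(age_by_position.tolist())
```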

11). What is the purpose of the following code?

				
import pandas as pd
d = {'Sr.no.': [1, 2, 3, 4], 'Name': ['Alex', 'John', 'Peter', 'Klaus'], 'Age': [30, 27, 29, 33], 'Salary': [50000, 65000, 58000, 66000]}
df = pd.DataFrame(d)
print("After removing first 2 rows of the DataFrame:")
df1 = df.iloc[2:]
print(df1)
				
			

a) It calculates the average age of all employees in the DataFrame.
b) It prints the original DataFrame without any modifications.
c) It creates a new DataFrame `df1` by removing the first 2 rows from the original DataFrame `df`.
d) It sorts the DataFrame in ascending order based on the ‘Age’ column.

Correct answer is: c) It creates a new DataFrame `df1` by removing the first 2 rows from the original DataFrame `df`.
Explanation: The code starts by importing the Pandas library and creating a DataFrame `df` with columns: ‘Sr.no.’, ‘Name’, ‘Age’, and ‘Salary’. The DataFrame contains information about four individuals, including their serial numbers, names, ages, and salaries. The next line of code prints a message indicating that the first 2 rows will be removed. The subsequent line creates a new DataFrame `df1` by using the `iloc[2:]` indexer, which slices the original DataFrame `df` from the third row (position 2) to the end. This leaves out the first two rows, and the result is assigned to `df1`; the original `df` is not changed. Finally, the code prints `df1`, which contains the data for the last two individuals in `df`.

12). What is the purpose of the following code?

				
import pandas as pd
d = {'Sr.no.':[1,2,3,4],'Name':['Alex','John','Peter','Klaus'],'Age':[30,27,29,33],'Salary':[50000,65000,58000,66000]}
df = pd.DataFrame(d)
print("Reverse row order:")
print(df.loc[::-1])
				
			

a) It creates a DataFrame `df` with columns ‘Sr.no.’, ‘Name’, ‘Age’, and ‘Salary’ and displays it in reverse order.
b) It reverses the order of the rows in the existing DataFrame `df` and displays the result.
c) It creates a new DataFrame by extracting rows from `df` in reverse order and displays it.
d) It sorts the DataFrame `df` in descending order based on the index labels and displays the result.

Correct answer is: b) It reverses the order of the rows in the existing DataFrame `df` and displays the result.
Explanation: The code uses the Pandas library to create a DataFrame `df` from a dictionary `d` with the columns ‘Sr.no.’, ‘Name’, ‘Age’, and ‘Salary’. It then prints the rows in reverse order using the `loc` indexer with the slice `[::-1]`. Option b is the intended answer: `df.loc[::-1]` takes the rows of the existing DataFrame `df` with a step of -1, so the last row comes first, the second-last row comes second, and so on. The expression returns a reversed result for printing; `df` itself keeps its original order because nothing is assigned back to it. Option d is incorrect because no sorting takes place.
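
The reversed result keeps the original row labels, so its index runs 3, 2, 1, 0. If fresh labels are wanted, one possible follow-up is `reset_index(drop=True)`, sketched below on a two-column version of the data; `reversed_rows` and `renumbered` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alex', 'John', 'Peter', 'Klaus'],
    'Age': [30, 27, 29, 33],
})

# Reverse the rows; the original labels travel with the rows
reversed_rows = df.loc[::-1]

# Optionally renumber the reversed rows from 0
renumbered = reversed_rows.reset_index(drop=True)

print(reversed_rows)
print(renumbered)
```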

13). What is the purpose of the following code?

				
import pandas as pd
d = {'Sr.no.': [1, 2, 3, 4], 'Name': ['Alex', 'John', 'Peter', 'Klaus'], 'Age': [30, 27, 29, 33], 'Salary': [50000, 65000, 58000, 66000]}
df = pd.DataFrame(d)
print("Reverse column order:")
print(df.loc[:, ::-1])
				
			

a) It drops the ‘Sr.no.’ column from the DataFrame.
b) It sorts the DataFrame in reverse alphabetical order based on the column names.
c) It reverses the order of rows in the DataFrame.
d) It reverses the order of columns in the DataFrame.

Correct answer is: d) It reverses the order of columns in the DataFrame.
Explanation: The given code snippet imports the Pandas library as ‘pd’, creates a DataFrame ‘df’ with four columns (Sr.no., Name, Age, and Salary), and then prints the DataFrame in reverse column order using the `loc[:, ::-1]` slicing. `df.loc[:, ::-1]` is a pandas DataFrame indexing operation that selects all rows (`:`) and reverses the order of columns (`::-1`). As a result, the DataFrame ‘df’ will be printed with the columns in reverse order.

14). What is the purpose of the following code?

				
import pandas as pd
d = {'Sr.no.':[1,2,3,4],'Name':['Alex','John','Peter','Klaus'],'Age':[30,27,29,33],'Salary':[50000,65000,58000,66000]}
df = pd.DataFrame(d)
print("Select string columns")
print(df.select_dtypes(include = "object"))
				
			

a) To create a DataFrame with columns for Sr.no., Name, Age, and Salary
b) To filter and display only the columns with string data type
c) To calculate the mean salary of the employees
d) To calculate the median age of the employees

Correct answer is: b) To filter and display only the columns with string data type
Explanation: The provided code performs the following tasks:
1. Imports the Pandas library and creates a DataFrame `df` with the columns “Sr.no.”, “Name”, “Age”, and “Salary”.
2. Prints the message “Select string columns”.
3. Uses the `select_dtypes()` function to filter and display only the columns with the data type “object” (i.e., columns with string values) from the DataFrame `df`.
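
The steps above select the string column. As a complementary sketch, `select_dtypes()` also accepts the generic selector `'number'`, which can be used to pick or exclude the numeric columns; the names `numeric_cols` and `non_numeric_cols` are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Sr.no.': [1, 2, 3, 4],
    'Name': ['Alex', 'John', 'Peter', 'Klaus'],
    'Age': [30, 27, 29, 33],
    'Salary': [50000, 65000, 58000, 66000],
})

# Keep only numeric columns
numeric_cols = df.select_dtypes(include='number')

# Keep everything that is not numeric (here, only 'Name')
non_numeric_cols = df.select_dtypes(exclude='number')

print(list(numeric_cols.columns))
print(list(non_numeric_cols.columns))
```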

15). What is the output of the following code?

				
import pandas as pd
d = {'Name': ['Kate', 'Jason', 'ROBERT', 'MARK', 'Dwyane']}
df = pd.DataFrame(d)
df['Uppercase'] = list(map(lambda x: x.isupper(), df['Name']))
print(df)
				
			

a)
Name Uppercase
0 Kate False
1 Jason False
2 ROBERT True
3 MARK True
4 Dwyane False

b)
Name Uppercase
0 Kate False
1 Jason False
2 ROBERT False
3 MARK False
4 Dwyane False

c)
Name Uppercase
0 Kate True
1 Jason True
2 ROBERT True
3 MARK True
4 Dwyane True

d)
Name Uppercase
0 Kate False
1 Jason True
2 ROBERT False
3 MARK True
4 Dwyane False

Correct answer is: a)
Name Uppercase
0 Kate False
1 Jason False
2 ROBERT True
3 MARK True
4 Dwyane False
Explanation: The given code creates a DataFrame `df` with a ‘Name’ column containing five names. It then adds a new column ‘Uppercase’ to the DataFrame, which is populated with True or False based on whether each name in the ‘Name’ column contains all uppercase letters.
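
The `map()` and lambda combination in the question can also be written with the vectorized string accessor. The sketch below shows that alternative, which is not the form used in the question; it should give the same True/False values for this data.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Kate', 'Jason', 'ROBERT', 'MARK', 'Dwyane']})

# .str.isupper() applies str.isupper() to every element of the column
df['Uppercase'] = df['Name'].str.isupper()

print(df)
```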

16). What is the purpose of the following code?

				
import pandas as pd
d = {'Name':['kate','jason','ROBERT','MARK','dwyane']}
df = pd.DataFrame(d)
df['Lowercase'] = list(map(lambda x: x.islower(), df['Name']))
print(df)
				
			

Correct answer is: c) To add a new column ‘Lowercase’ indicating if each name is lowercase or not
Explanation: The purpose of this code is to add a new column named ‘Lowercase’ to the DataFrame ‘df’. The ‘Lowercase’ column is created using the `map()` function and a lambda function that checks if each name in the ‘Name’ column is lowercase or not. The resulting Boolean values (True or False) are added to the ‘Lowercase’ column. The code then prints the updated DataFrame, which includes the new ‘Lowercase’ column indicating whether each name is lowercase or not.

17). What is the purpose of the following code?

				
import pandas as pd
d = {'Marks':['Pass','88','First Class','90','Distinction']}
df = pd.DataFrame(d)
df['Numeric'] = list(map(lambda x: x.isdigit(), df['Marks']))
print(df)
				
			

a) To convert the ‘Marks’ column into numeric values wherever possible.
b) To check whether each element in the ‘Marks’ column is a digit or not.
c) To remove rows from the DataFrame where the ‘Marks’ column contains non-numeric values.
d) To filter rows in the DataFrame where the ‘Marks’ column is numeric.

Correct answer is: b) To check whether each element in the ‘Marks’ column is a digit or not.
Explanation: The code creates a DataFrame `df` with a ‘Marks’ column whose values are all strings; some are words (‘Pass’) and some consist only of digits (‘88’). A ‘Numeric’ column is then added using the `map()` function and a lambda function. `map()` applies `lambda x: x.isdigit()` to each element of the ‘Marks’ column, and `isdigit()` returns True if every character in the string is a digit and False otherwise. These results are stored in the ‘Numeric’ column. The original strings are not converted or removed, which rules out options a, c, and d.
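
Checking for digits and converting to numbers are separate operations. The following sketch contrasts the check with an actual conversion using `pd.to_numeric()`; the conversion is an addition for illustration and is not performed by the question's code.

```python
import pandas as pd

marks = pd.Series(['Pass', '88', 'First Class', '90', 'Distinction'])

# Check only: True where the string consists entirely of digits
is_digit = marks.str.isdigit()

# Conversion: strings that cannot be parsed become NaN with errors='coerce'
as_number = pd.to_numeric(marks, errors='coerce')

print(is_digit.tolist())
print(as_number.tolist())
```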

18). What is the purpose of the following code?

				
import pandas as pd
df = pd.DataFrame({'Sales': [55000, 75000, 330000, 10000]})
print("Original DataFrame:")
print(df)
print("Length of sale_amount:")
df['Length'] = df['Sales'].map(str).apply(len)
print(df)
				
			

a) To calculate the total sales amount for each entry in the DataFrame.
b) To calculate the sum of all sales amounts in the DataFrame.
c) To convert the ‘Sales’ column values to strings and find the length of each value.
d) To sort the DataFrame in ascending order based on the ‘Sales’ column.

Correct answer is: c) To convert the ‘Sales’ column values to strings and find the length of each value.
Explanation: In the given code, a DataFrame `df` is created with a single column ‘Sales’ containing four entries representing sales amounts. The code then adds a new column ‘Length’ to the DataFrame by converting the values in the ‘Sales’ column to strings using `map(str)` and then applying the `len` function to calculate the length of each string value.

19). What is the purpose of the following code?

				
import pandas as pd
import re

d = {'Company_mail': ['TCS tcs@yahoo.com', 'Apple apple@icloud.com', 'Google google@gmail.com']}
df = pd.DataFrame(d)
def find_email(text):
    email = re.findall(r'[\w\.-]+@[\w\.-]+', str(text))
    return ",".join(email)

df['email'] = df['Company_mail'].apply(lambda x: find_email(x))
print("Extracting email from dataframe columns:")
print(df)
				
			

a) To filter rows containing email addresses in the ‘Company_mail’ column.
b) To replace email addresses in the ‘Company_mail’ column with the word ‘email’.
c) To split the ‘Company_mail’ column into two columns, one containing the company names and the other containing the email addresses.
d) To extract and store email addresses from the ‘Company_mail’ column into a new ‘email’ column.

Correct answer is: d) To extract and store email addresses from the ‘Company_mail’ column into a new ‘email’ column.
Explanation: The purpose of the given code is to extract and store email addresses from the ‘Company_mail’ column of the DataFrame into a new column called ‘email’. The code uses a regular expression pattern (`r'[\w\.-]+@[\w\.-]+'`) to find email addresses in the text, and the `find_email()` function is applied to each row of the ‘Company_mail’ column using the `apply()` method. The result is a DataFrame with an additional ‘email’ column containing the extracted email addresses. The `print(df)` statement at the end displays the updated DataFrame with the extracted email addresses.
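
The same extraction can be sketched with the pandas string accessor instead of a helper function and `apply()`. This is an alternative formulation, not the question's code; `pattern` is a name introduced here, and the regular expression is the one from the question with the redundant backslashes dropped.

```python
import pandas as pd

df = pd.DataFrame({'Company_mail': ['TCS tcs@yahoo.com',
                                    'Apple apple@icloud.com',
                                    'Google google@gmail.com']})

pattern = r'[\w.-]+@[\w.-]+'

# findall() returns a list of matches per row; join() turns each list into
# one comma-separated string, mirroring ",".join(email) in the question
df['email'] = df['Company_mail'].str.findall(pattern).str.join(',')

print(df)
```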

20). What is the purpose of the following code?

				
import pandas as pd
df1 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Yash', 'Gaurav', 'Sanket'], 'Age': [30, 27, 28]})
df2 = pd.DataFrame({'ID': [4, 3], 'Name': ['Tanmay', 'Athrva'], 'Age': [26, 22]})
result = pd.concat([df1, df2])
print("New dataframe")
print(result)
				
			

a) To merge two DataFrames df1 and df2 based on their common columns.
b) To concatenate two DataFrames df1 and df2 vertically, stacking one below the other.
c) To concatenate two DataFrames df1 and df2 horizontally, merging them side by side.
d) To create a new DataFrame by transposing the columns and rows of df1 and df2.

Correct answer is: b) To concatenate two DataFrames df1 and df2 vertically, stacking one below the other.
Explanation: The given code uses the `pd.concat()` function from the Pandas library to concatenate two DataFrames, df1 and df2, vertically. This results in a new DataFrame called ‘result’ where the rows of df2 are stacked below the rows of df1. The `pd.concat()` function is used to combine DataFrames, and the default behavior is to concatenate them vertically. This is the reason the new DataFrame ‘result’ contains all the rows from df1 followed by all the rows from df2.
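
Vertical concatenation keeps each input's own row labels, so the stacked result has repeated index values. A short sketch of this, and of the `ignore_index=True` option that renumbers the rows, follows; `stacked` and `renumbered` are illustrative names.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Yash', 'Gaurav', 'Sanket'], 'Age': [30, 27, 28]})
df2 = pd.DataFrame({'ID': [4, 3], 'Name': ['Tanmay', 'Athrva'], 'Age': [26, 22]})

# Default behaviour: original labels are kept, giving 0, 1, 2, 0, 1
stacked = pd.concat([df1, df2])

# ignore_index=True replaces the labels with a fresh 0..n-1 range
renumbered = pd.concat([df1, df2], ignore_index=True)

print(list(stacked.index))
print(list(renumbered.index))
```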

21). What is the purpose of the following code?

				
import pandas as pd
df1 = pd.DataFrame({'ID':[1,2,3],'Name':['Yash','Gaurav','Sanket'],
                   'Age':[30,27,28]})
df2 = pd.DataFrame({'ID':[4,3],'Name':['Tanmay','Athrva'],'Age':[26,22]})
result = pd.concat([df1,df2],axis=1)
print("New dataframe")
print(result)
				
			

a) Merging two DataFrames vertically based on the common column ‘ID’.
b) Merging two DataFrames horizontally based on the common column ‘Name’.
c) Concatenating two DataFrames horizontally, creating a new DataFrame.
d) Concatenating two DataFrames vertically, creating a new DataFrame.

Correct answer is: c) Concatenating two DataFrames horizontally, creating a new DataFrame.
Explanation: The purpose of the given code is to concatenate two DataFrames horizontally using the `pd.concat()` function with `axis=1`. Horizontal concatenation, or concatenation along columns, places the two DataFrames side by side and aligns their rows by index label; it does not match rows on column values such as ‘ID’ or ‘Name’. The resulting DataFrame `result` has all columns from both `df1` and `df2`, so ‘ID’, ‘Name’, and ‘Age’ each appear twice. Because `df2` has only two rows, the `df2` columns contain NaN in the row with index 2.
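
The shape of the side-by-side result can be inspected directly. The sketch below reuses the data from the question; `df2_part_of_last_row` is a name introduced here for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Yash', 'Gaurav', 'Sanket'], 'Age': [30, 27, 28]})
df2 = pd.DataFrame({'ID': [4, 3], 'Name': ['Tanmay', 'Athrva'], 'Age': [26, 22]})

result = pd.concat([df1, df2], axis=1)

# Rows are aligned on index labels 0, 1, 2; df2 has no row labelled 2,
# so its three columns are missing in the last row of the result
df2_part_of_last_row = result.iloc[2, 3:]

print(result.shape)
print(list(result.columns))
print(df2_part_of_last_row.isna().all())
```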

22). What is the purpose of the following code?

				
import pandas as pd
df1 = pd.DataFrame({'Id':['S1','S2','S3'],
                   'Name':['Ketan','Yash','Abhishek'],
                   'Marks':[90,87,77]})
df2 = pd.DataFrame({'Id':['S2','S4'],
                    'Name':['Yash','Gaurav'],
                   'Marks':[70,65]})
print('Dataframe 1: \n',df1)
print('Dataframe 2: \n',df2)
new = pd.merge(df1, df2, on='Id', how='inner')
print("Merged data:")
print(new)
				
			

a) To transpose the rows and columns of the merged DataFrame.
b) To calculate the mean of the ‘Marks’ column in each DataFrame.
c) To merge two DataFrames based on a common column ‘Id’ using an inner join.
d) To merge two DataFrames based on a common column ‘Name’ using an inner join.

Correct answer is: c) To merge two DataFrames based on a common column ‘Id’ using an inner join.
Explanation: The purpose of this code is to merge two DataFrames, `df1` and `df2`, on the common column ‘Id’ using an inner join. The code uses the `pd.merge()` function, which combines DataFrames based on shared key columns. The ‘inner’ join keeps only the rows whose ‘Id’ value appears in both DataFrames, which here is the single row with ‘Id’ equal to ‘S2’. Because ‘Name’ and ‘Marks’ exist in both inputs but are not join keys, they appear in `new` with the default suffixes `_x` (from `df1`) and `_y` (from `df2`). The printed output displays the contents of `df1`, `df2`, and the merged DataFrame `new`, allowing us to observe the merging operation.
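
When both inputs share non-key columns, the `suffixes` parameter of `pd.merge()` controls how the overlapping names are told apart. The sketch below uses the hypothetical suffixes `_df1` and `_df2`, chosen here only to make the origin of each column obvious.

```python
import pandas as pd

df1 = pd.DataFrame({'Id': ['S1', 'S2', 'S3'],
                    'Name': ['Ketan', 'Yash', 'Abhishek'],
                    'Marks': [90, 87, 77]})
df2 = pd.DataFrame({'Id': ['S2', 'S4'],
                    'Name': ['Yash', 'Gaurav'],
                    'Marks': [70, 65]})

# Inner join on 'Id'; overlapping non-key columns get the given suffixes
new = pd.merge(df1, df2, on='Id', how='inner', suffixes=('_df1', '_df2'))

print(new)
```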

23). What is the output of the following code?

				
import pandas as pd
import numpy as np

df = pd.DataFrame({'Sr.no.':[1,2,3,4],
                   'Name':['Alex','John','Peter','Klaus'],
                   'Age':[30,np.nan,29,np.nan]})

print("Original Dataframe: \n",df)
print(df.isna())
				
			

a) Original DataFrame:
Sr.no. Name Age
0 1 Alex 30.0
1 2 John NaN
2 3 Peter 29.0
3 4 Klaus NaN
dtype: float64

Output of `df.isna()`:
Sr.no. Name Age
0 False False False
1 False False True
2 False False False
3 False False True

b) Original DataFrame:
Sr.no. Name Age
0 1 Alex 30.0
1 2 John NaN
2 3 Peter 29.0
3 4 Klaus NaN
dtype: object

Output of `df.isna()`:
Sr.no. Name Age
0 False False False
1 False False True
2 False False False
3 True False True

c) Original DataFrame:
Sr.no. Name Age
0 1 Alex 30.0
1 2 John NaN
2 3 Peter 29.0
3 4 Klaus NaN
dtype: float64

Output of `df.isna()`:
Sr.no. Name Age
0 True False False
1 False False True
2 True False False
3 False False True

d) Original DataFrame:
Sr.no. Name Age
0 1 Alex 30.0
1 2 John NaN
2 3 Peter 29.0
3 4 Klaus NaN
dtype: object

Output of `df.isna()`:
Sr.no. Name Age
0 True False False
1 False False True
2 True False False
3 False False True

Correct answer is: a) Original DataFrame:
Sr.no. Name Age
0 1 Alex 30.0
1 2 John NaN
2 3 Peter 29.0
3 4 Klaus NaN
dtype: float64

Output of `df.isna()`:
Sr.no. Name Age
0 False False False
1 False False True
2 False False False
3 False False True
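Explanation: The original DataFrame `df` contains columns ‘Sr.no.’, ‘Name’, and ‘Age’. The first `print()` displays it; the ‘Age’ values appear as 30.0 and 29.0 because a column containing `NaN` is stored as float64. The `print(df.isna())` statement then prints a DataFrame of the same shape indicating whether each element is null (`NaN`) or not. It shows `True` only for the two missing ‘Age’ values (rows 1 and 3) and `False` everywhere else, which matches option a. Printing a DataFrame does not actually emit a trailing `dtype:` line (that line appears only when a Series is printed), so the `dtype:` lines in the options should be read as a description of the ‘Age’ column and not as literal output.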

24). What is the purpose of the following code?

				
import pandas as pd
import numpy as np
df = pd.DataFrame({'Sr.no.':[1,2,3,4,5],
                   'Name':['Alex',np.nan,'Peter','Klaus','Stefan'],
                   'Age':[30,np.nan,29,22,22]})
print("Original Dataframe: \n",df)
result = df.fillna(df.mode().iloc[0])
print(result)
				
			

a) To remove duplicate rows from the DataFrame.
b) To calculate the median of each column in the DataFrame.
c) To fill missing values in the ‘Name’ and ‘Age’ columns with the most frequent value.
d) To calculate the cumulative sum of each column in the DataFrame.

Correct answer is: c) To fill missing values in the ‘Name’ and ‘Age’ columns with the most frequent value.
Explanation: The given code uses the Pandas library to fill gaps in the DataFrame `df`, which contains three columns: ‘Sr.no.’, ‘Name’, and ‘Age’. `df.mode()` returns a DataFrame holding the most frequent value(s) of each column, and `iloc[0]` selects its first row, giving a Series with one mode per column. Passing that Series to `fillna()` replaces each missing value (NaN) with the mode of its own column. Here ‘Age’ has the mode 22, while every name occurs only once, so the first of the tied values in sorted order (‘Alex’) is used for ‘Name’. The filled DataFrame is stored in the variable `result`, and both the original and filled DataFrames are printed using the `print()` function.
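
To make the intermediate values visible, the one-line `fillna(df.mode().iloc[0])` call can be split into steps. The sketch below does this with the question's data; `modes` and `first_modes` are names introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Sr.no.': [1, 2, 3, 4, 5],
                   'Name': ['Alex', np.nan, 'Peter', 'Klaus', 'Stefan'],
                   'Age': [30, np.nan, 29, 22, 22]})

# One column of modes per original column; ties produce several rows
modes = df.mode()

# First row: a Series with one fill value per column
first_modes = modes.iloc[0]

# Each NaN is replaced by the fill value of its own column
result = df.fillna(first_modes)

print(first_modes)
print(result)
```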

25). What is the purpose of the following code?

				
import pandas as pd
df = pd.read_excel("file name")
print(df)
				
			

a) To import the Pandas library and read an Excel file named “file name” into a DataFrame, then print the DataFrame.
b) To export an Excel file named “file name” into a Pandas DataFrame, then display the DataFrame.
c) To read a CSV file named “file name” into a Pandas DataFrame, then display the DataFrame.
d) To save a Pandas DataFrame into an Excel file named “file name”, then print the DataFrame.

Correct answer is: a) To import the Pandas library and read an Excel file named “file name” into a DataFrame, then print the DataFrame.
Explanation: The provided code performs the following actions:
1. `import pandas as pd`: This line imports the Pandas library and assigns it the alias `pd` to make it easier to refer to Pandas functions.
2. `df = pd.read_excel("file name")`: This line reads an Excel file named “file name” into a Pandas DataFrame `df`. The `read_excel()` function is used to read Excel files and create a DataFrame.
3. `print(df)`: This line prints the DataFrame `df`, which contains the data read from the Excel file.
