Python Pandas MCQ : Set 5


1). What is the output of the following code?

				
import pandas as pd
df = pd.Series(['2 Feb 2020','5/11/2021','7-8-2022'])
print("Converting series of date strings to a timeseries:")
print(pd.to_datetime(df))

a) 0 2020-02-02
1 2021-05-11
dtype: datetime64[ns]

b) 0 Feb 02, 2020
1 May 11, 2021
2 Jul 08, 2022
dtype: datetime64[ns]

c) 0 2020-02-02
1 2021-11-05
2 2022-08-07
dtype: datetime64[ns]

d) ValueError: day is out of range for month

Correct answer is: c) 0 2020-02-02
1 2021-11-05
2 2022-08-07
dtype: datetime64[ns]
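Explanation: The given code converts a pandas Series containing date strings into a time series using the `pd.to_datetime()` function, which returns values of type datetime64[ns] displayed in ‘YYYY-MM-DD’ form. The date strings have different formats: ‘2 Feb 2020’, ‘5/11/2021’, and ‘7-8-2022’. The first is unambiguous, but the other two are not: the output in option (c) reads them day-first (5 November 2021 and 7 August 2022), which is what `pd.to_datetime()` produces with `dayfirst=True`. With default settings pandas reads ambiguous numeric dates month-first, so ‘5/11/2021’ becomes 2021-05-11 and ‘7-8-2022’ becomes 2022-07-08. In pandas 2.0 and later, a Series that mixes formats like this one also needs `format='mixed'`; without it, `pd.to_datetime()` infers the format from the first element and raises a ValueError on the others.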
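How an ambiguous string such as ‘5/11/2021’ is read depends on the `dayfirst` argument. The following is a minimal sketch that parses the strings one at a time, so it does not rely on how a particular pandas version handles mixed formats inside a single Series. The variable names are illustrative and not part of the question.

```python
import pandas as pd

dates = ['2 Feb 2020', '5/11/2021', '7-8-2022']

# Default behaviour: ambiguous numeric dates are read month-first.
month_first = [pd.to_datetime(s) for s in dates]

# dayfirst=True: the first number is treated as the day.
day_first = [pd.to_datetime(s, dayfirst=True) for s in dates]

for s, m, d in zip(dates, month_first, day_first):
    print(s, '->', m.date(), '|', d.date())
```

The first string contains a month name, so both settings agree on it; only the purely numeric strings change.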

2). What is the output of the following code?

				
import pandas as pd
df = pd.Series([54, 25, 38, 87, 67])
print("Index of the first smallest and largest value of the series:")
print(df.idxmin())
print(df.idxmax())

a) Index of the first smallest and largest value of the series:
1
3

b) Index of the first smallest and largest value of the series:
1
4

c) Index of the first smallest and largest value of the series:
2
3

d) Index of the first smallest and largest value of the series:
0
3

Correct answer is: a) Index of the first smallest and largest value of the series:
1
3
Explanation: The code defines a Pandas Series `df` containing five elements: [54, 25, 38, 87, 67]. The `idxmin()` function is then applied to find the index of the first smallest value in the Series, which is 1 (corresponding to the element 25). Similarly, the `idxmax()` function is used to find the index of the first largest value in the Series, which is 3 (corresponding to the element 87).
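As a small illustration of what "index of the first smallest value" means, the sketch below repeats the question's Series and adds two hypothetical Series (`labelled` and `ties`, not part of the question) to show that `idxmin()` and `idxmax()` return index labels and pick the first occurrence when a value repeats.

```python
import pandas as pd

s = pd.Series([54, 25, 38, 87, 67])
print(s.idxmin(), s.min())   # label of the smallest value, then the value
print(s.idxmax(), s.max())   # label of the largest value, then the value

# With a custom index, the methods return labels rather than positions.
labelled = pd.Series([54, 25, 38, 87, 67], index=list('abcde'))
print(labelled.idxmin(), labelled.idxmax())

# When the minimum is repeated, the label of the first occurrence is returned.
ties = pd.Series([5, 1, 1, 9])
print(ties.idxmin())
```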

3). What is the output of the following code?

				
import pandas as pd
dictionary = {'marks1':[34,20,32,30],'marks2':[36,22,10,44]}
df = pd.DataFrame(dictionary)
print(df)

a)
marks1 marks2
0 34 36
1 20 22
2 32 10
3 30 44

b)
marks1 marks2
0 34 36
1 20 22
2 32 10
3 30 44
4 22 33

c)
marks1 marks2
0 22 36
1 33 22
2 10 10
3 44 44

d)
marks1 marks2
0 34 22
1 20 30
2 32 32
3 30 10

Correct answer is: a)
marks1 marks2
0 34 36
1 20 22
2 32 10
3 30 44
Explanation: The given code imports the Pandas library as ‘pd’ and creates a DataFrame ‘df’ using a dictionary. The dictionary contains two keys ‘marks1’ and ‘marks2’, each representing a list of four integer values. The DataFrame ‘df’ is printed using the `print()` function. The resulting output displays the DataFrame ‘df’ with the ‘marks1’ and ‘marks2’ columns, and their respective integer values in each row. The values are displayed in the same order as in the dictionary.

4). What is the output of the following code?

				
import pandas as pd
dictionary = {'marks1': [34, 20, 32, 30], 'marks2': [36, 22, 10, 44]}
df = pd.DataFrame(dictionary)
print("First n rows: \n", df.head(2))

a)
First n rows:
marks1 marks2
0 34 36
1 20 22

b)
First n rows:
marks1 marks2
0 34 36
1 20 22
2 32 10

c)
First n rows:
marks1 marks2
0 34 36

d)
First n rows:
marks1 marks2
1 20 22
2 32 10

Correct answer is: a)
First n rows:
marks1 marks2
0 34 36
1 20 22
Explanation: The `df.head(2)` function is used to access the first two rows of the DataFrame `df`. The DataFrame `df` is created from the provided dictionary, which has two columns ‘marks1’ and ‘marks2’ with four rows of data each. Therefore, the output will display the first two rows of the DataFrame, as shown in option (a). The head() function displays the specified number of rows from the top of the DataFrame, and since we passed 2 as an argument, it will display the first two rows of the DataFrame.

5). What is the output of the following code?

				
import pandas as pd
d = {'Sr.no.': [1, 2, 3], 'Name': ['Alex', 'John', 'Peter'], 'Age': [30, 27, 29]}
df = pd.DataFrame(d)
print(df[['Name', 'Age']])

a)
Name Age
0 Alex 30
1 John 27
2 Peter 29

b)
Name Age
0 [‘Alex’, ‘John’, ‘Peter’] [30, 27, 29]

c)
Name Age
0 Alex 30
2 Peter 29

d)
Sr.no. Name Age
0 1 Alex 30
1 2 John 27

Correct answer is: a)
Name Age
0 Alex 30
1 John 27
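2 Peter 29
Explanation: The provided code creates a DataFrame `df` with three columns: ‘Sr.no.’, ‘Name’, and ‘Age’. The `print(df[['Name', 'Age']])` statement selects and prints only the ‘Name’ and ‘Age’ columns from the DataFrame. Passing a list of column names inside the indexing brackets returns a new DataFrame containing just those columns, in the listed order, with all three rows kept.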

6). What is the output of the following code?

				
import pandas as pd
d = {'Sr.no.':[1,2,3],'Name':['Alex','John','Peter'],'Age':[30,27,29]}
df = pd.DataFrame(d)
print("Second row: \n", df.iloc[1,:])

a) Sr.no. 2
Name John
Age 27
Name: 1, dtype: object

b) Sr.no. 2
Name Alex
Age 26
Name: 1, dtype: object

c) Sr.no. 2
Name John
Age 29
Name: 1, dtype: object

d) Sr.no. 1
Name John
Age 28
Name: 2, dtype: object

Correct answer is: a) Sr.no. 2
Name John
Age 27
Name: 1, dtype: object
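Explanation: The provided code creates a DataFrame `df` with three columns: ‘Sr.no.’, ‘Name’, and ‘Age’, containing the given data. The `iloc` indexer is used to access rows and columns by their integer position, counting from 0. In the `print()` statement, `df.iloc[1, :]` selects the row at position 1 (the second row) with all columns. The result is a Series whose name is the row label 1, and its dtype is object because the row mixes integers and a string.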
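The position-based selection described above can be tried out with a short sketch. It reuses the question's data; the variable names (`second_row`, `cell`, `first_two`, `by_label`) are illustrative only.

```python
import pandas as pd

d = {'Sr.no.': [1, 2, 3], 'Name': ['Alex', 'John', 'Peter'], 'Age': [30, 27, 29]}
df = pd.DataFrame(d)

second_row = df.iloc[1, :]    # row at position 1, all columns -> a Series
print(second_row['Name'])

cell = df.iloc[1, 2]          # row position 1, column position 2 ('Age')
first_two = df.iloc[:2]       # positions 0 and 1; the slice end is exclusive
by_label = df.loc[1, 'Age']   # loc selects by label instead of position

print(cell, len(first_two), by_label)
```

With the default integer index, position 1 and label 1 happen to refer to the same row, which is why `iloc` and `loc` agree here.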

7). What is the output of the following code?

				
import pandas as pd
d = {'Sr.no.':[1,2,3,4],'Name':['Alex','John','Peter','Klaus'],'Age':[30,27,29,33]}
df = pd.DataFrame(d)
print("Rows where age is greater than 29")
print(df[df['Age'] > 29])

a)
Sr.no. Name Age
0 1 Alex 30
3 4 Klaus 33

b)
Sr.no. Name Age
1 2 John 27
2 3 Peter 29

c)
Sr.no. Name Age
3 4 Klaus 33

d)
Sr.no. Name Age
0 1 Alex 30

Correct answer is: a)
Sr.no. Name Age
0 1 Alex 30
3 4 Klaus 33
Explanation: The given code creates a DataFrame `df` with columns ‘Sr.no.’, ‘Name’, and ‘Age’. The expression `df['Age'] > 29` produces a boolean Series, and `df[df['Age'] > 29]` keeps only the rows where it is True. In the original DataFrame there are two rows with ages greater than 29: Alex (30) and Klaus (33). Peter (29) is excluded because the comparison is strict, and the selected rows keep their original index labels 0 and 3.
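To make the filtering step more concrete, here is a minimal sketch that builds the boolean mask separately before applying it. The second filter, which combines two conditions, is an added illustration and not part of the question.

```python
import pandas as pd

d = {'Sr.no.': [1, 2, 3, 4], 'Name': ['Alex', 'John', 'Peter', 'Klaus'], 'Age': [30, 27, 29, 33]}
df = pd.DataFrame(d)

mask = df['Age'] > 29          # one True/False value per row
print(mask.tolist())

older = df[mask]               # rows where the mask is True
print(older)

# Conditions are combined with & (and) or | (or); each needs its own parentheses.
both = df[(df['Age'] > 29) & (df['Name'] != 'Alex')]
print(both)
```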

8). Which of the following is the output of the given code?

				
import pandas as pd
d = {'Sr.no.': [1, 2, 3, 4], 'Name': ['Alex', 'John', 'Peter', 'Klaus'], 'Age': [30, 27, 29, 33]}
df = pd.DataFrame(d)
print("No. of rows: ", df.shape[0])
print("No. of columns: ", df.shape[1])

a) No. of rows: 3
No. of columns: 4

b) No. of rows: 4
No. of columns: 3

c) No. of rows: 4
No. of columns: 2

d) No. of rows: 2
No. of columns: 4

Correct answer is: b) No. of rows: 4
No. of columns: 3
Explanation: The code creates a DataFrame `df` from the dictionary `d` with 4 rows and 3 columns. The `shape` attribute of the DataFrame is a tuple that contains the number of rows and columns. The statement `df.shape[0]` prints the number of rows, which is 4, and `df.shape[1]` prints the number of columns, which is 3. Therefore, the correct output is “No. of rows: 4” and “No. of columns: 3”.

9). What is the output of the following code?

				
import numpy as np
import pandas as pd
d = {'Sr.no.':[1,2,3,4],'Name':['Alex','John','Peter','Klaus'],'Age':[30,np.nan,29,np.nan]}
df = pd.DataFrame(d)
print("Rows where age is missing:")
print(df[df['Age'].isnull()])

a)
Sr.no. Name Age
1 2 John NaN
3 4 Klaus NaN

b)
Sr.no. Name Age
0 1 Alex 30.0
2 3 Peter 29.0

c)
Sr.no. Name Age
1 2 John NaN
2 3 Peter NaN

d)
Sr.no. Name Age
1 2 John NaN
3 4 Klaus 29.0

Correct answer is: a)
Sr.no. Name Age
1 2 John NaN
3 4 Klaus NaN
Explanation: The given code creates a DataFrame `df` with three columns: ‘Sr.no.’, ‘Name’, and ‘Age’. Two of the ‘Age’ values are assigned as `np.nan`, which represents missing or not-a-number values in pandas. The code then prints the rows where the ‘Age’ column has missing values using `df[df['Age'].isnull()]`. Because NaN is a floating-point value, the ‘Age’ column is stored as float64, which is why the non-missing ages would display as 30.0 and 29.0.
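A short sketch of the missing-value check, using the question's data with NumPy imported for `np.nan`. The variable name `missing` is illustrative.

```python
import numpy as np
import pandas as pd

d = {'Sr.no.': [1, 2, 3, 4], 'Name': ['Alex', 'John', 'Peter', 'Klaus'], 'Age': [30, np.nan, 29, np.nan]}
df = pd.DataFrame(d)

missing = df['Age'].isnull()                      # True where Age is NaN
print(missing.sum())                              # number of missing ages
print(df[missing]['Name'].tolist())               # names with a missing age
print(df[df['Age'].notnull()]['Name'].tolist())   # names with a known age
print(df['Age'].dtype)                            # NaN forces a float column
```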

10). What is the output of the following code?

				
import pandas as pd
d = {'Sr.no.': [1, 2, 3, 4], 'Name': ['Alex', 'John', 'Peter', 'Klaus'], 'Age': [30, 27, 29, 33]}
df = pd.DataFrame(d)
print("Rows where age is between 25 and 30 (inclusive):")
print(df[df['Age'].between(25, 30)])

a)
Sr.no. Name Age
1 2 John 27
2 3 Peter 29

b)
Sr.no. Name Age
0 1 Alex 30
1 2 John 27
2 3 Peter 29

c)
Sr.no. Name Age
2 3 Peter 29

d)
Sr.no. Name Age
0 1 Alex 30
1 2 John 27

Correct answer is: b)
Sr.no. Name Age
0 1 Alex 30
1 2 John 27
2 3 Peter 29
Explanation: The given code creates a DataFrame `df` with three columns: ‘Sr.no.’, ‘Name’, and ‘Age’. It then prints the rows where the ‘Age’ column falls between 25 and 30 (inclusive). The `between()` method checks whether each element in the ‘Age’ column lies in that range, with both endpoints included by default. `df['Age'].between(25, 30)` returns a boolean Series with `True` for rows where the ‘Age’ is between 25 and 30 and `False` otherwise. The final output displays the rows where the condition is `True`, which are the rows with ‘Age’ values of 30, 27, and 29; Klaus (33) is excluded.
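One way to see that both endpoints are included is to compare `between()` with the equivalent pair of comparisons. This is a minimal sketch; `between_mask` and `manual_mask` are illustrative names.

```python
import pandas as pd

d = {'Sr.no.': [1, 2, 3, 4], 'Name': ['Alex', 'John', 'Peter', 'Klaus'], 'Age': [30, 27, 29, 33]}
df = pd.DataFrame(d)

between_mask = df['Age'].between(25, 30)
manual_mask = (df['Age'] >= 25) & (df['Age'] <= 30)   # same test written out by hand

print(between_mask.tolist())
print(between_mask.equals(manual_mask))
print(df[between_mask]['Name'].tolist())
```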

11). What is the output of the following code?

				
import pandas as pd
d = {'Sr.no.':[1,2,3,4],'Name':['Alex','John','Peter','Klaus'],'Age':[30,27,29,33]}
df = pd.DataFrame(d)
print(df)
print("Change the age of John to 24:")
df['Age'] = df['Age'].replace(27,24)
print(df)

a)
Sr.no. Name Age
0 1 Alex 30
1 2 John 24
2 3 Peter 29
3 4 Klaus 33

b)
Sr.no. Name Age
0 1 Alex 30
1 2 John 27
2 3 Peter 29
3 4 Klaus 33
Change the age of John to 24:
Sr.no. Name Age
0 1 Alex 30
1 2 John 24
2 3 Peter 29
3 4 Klaus 33

c)
Sr.no. Name Age
0 1 Alex 30
1 2 John 24
2 3 Peter 29
3 4 Klaus 33
Change the age of John to 27:
Sr.no. Name Age
0 1 Alex 30
1 2 John 27
2 3 Peter 29
3 4 Klaus 33

d)
Sr.no. Name Age
0 1 Alex 30
1 2 John 27
2 3 Peter 29
3 4 Klaus 33
Change the age of John to 24:
Sr.no. Name Age
0 1 Alex 30
1 2 John 24
2 3 Peter 29

Correct answer is: b)
Sr.no. Name Age
0 1 Alex 30
1 2 John 27
2 3 Peter 29
3 4 Klaus 33
Change the age of John to 24:
Sr.no. Name Age
0 1 Alex 30
1 2 John 24
2 3 Peter 29
3 4 Klaus 33
Explanation: The code creates a DataFrame named `df` with columns ‘Sr.no.’, ‘Name’, and ‘Age’. The initial DataFrame is as follows:
Sr.no. Name Age
0 1 Alex 30
1 2 John 27
2 3 Peter 29
3 4 Klaus 33
Then, it modifies the ‘Age’ of John by replacing the value 27 with 24.
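Note that `replace(27, 24)` acts on the value, not on the person: every 27 in the column would change. The sketch below uses a hypothetical extra row (‘Maria’, also aged 27, not in the question) to show the difference between replacing by value and updating only John's row with `loc`.

```python
import pandas as pd

# 'Maria' is a hypothetical row added only to show a repeated age.
d = {'Name': ['Alex', 'John', 'Peter', 'Maria'], 'Age': [30, 27, 29, 27]}
df = pd.DataFrame(d)

# replace() changes every matching value in the column.
replaced = df['Age'].replace(27, 24)
print(replaced.tolist())

# Selecting the row by name changes John's age only.
targeted = df.copy()
targeted.loc[targeted['Name'] == 'John', 'Age'] = 24
print(targeted['Age'].tolist())
```

In the question's data only John is 27, so both approaches give the same result there.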

12). What is the output of the following code?

				
import pandas as pd
d = {'Sr.no.':[1,2,3,4],'Name':['Alex','John','Peter','Klaus'],'Age':[30,27,29,33]}
df = pd.DataFrame(d)
print("Sum of age columns: ",df['Age'].sum())

a) Sum of age columns: 119
b) Sum of age columns: 119.0
c) Sum of age columns: 119.00
d) Sum of age columns: 120

Correct answer is: a) Sum of age columns: 119
Explanation: The given code creates a DataFrame `df` using the dictionary `d`, which contains three columns: ‘Sr.no.’, ‘Name’, and ‘Age’. Then, it calculates the sum of the ‘Age’ column using the `sum()` function and prints the result. The ‘Age’ column in the DataFrame contains the values [30, 27, 29, 33]. When you calculate the sum of these values, you get 30 + 27 + 29 + 33 = 119.

13). What is the output of the following code?

				
import pandas as pd
d = {'Sr.no.':[1,2,3,4],'Name':['Alex','John','Peter','Klaus'],'Age':[30,27,29,33]}
df = pd.DataFrame(d)
df1 = {'Sr.no.':5,'Name':'Jason','Age':28}
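# DataFrame.append() was removed in pandas 2.0; pd.concat() adds the row instead
df = pd.concat([df, pd.DataFrame([df1])], ignore_index=True)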
print("New Series: ")
print(df)

a)
Sr.no. Name Age
0 1 Alex 30
1 2 John 27
2 3 Peter 29
3 4 Klaus 33
4 5 Jason 28

b)
Sr.no. Name Age
0 1 Alex 30
1 2 John 27
2 3 Peter 29
3 4 Klaus 33
4 Sr.no. Name Age
5 5 Jason 28

c)
Sr.no. Name Age
0 1 Alex 30
1 2 John 27
2 3 Peter 29
3 4 Klaus 33

d)
Sr.no. Name Age
0 1 Alex 30
1 2 John 27
2 3 Peter 29
3 4 Klaus 33
0 5 Jason 28

Correct answer is: a)
Sr.no. Name Age
0 1 Alex 30
1 2 John 27
2 3 Peter 29
3 4 Klaus 33
4 5 Jason 28
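Explanation: The code creates a DataFrame `df` with columns ‘Sr.no.’, ‘Name’, and ‘Age’ using a dictionary `d`. It then adds a new row, represented by the dictionary `df1`, to the end of the DataFrame with `ignore_index=True`, which renumbers the index so that the new row gets index 4. In pandas versions before 2.0 this can be done with `df.append(df1, ignore_index=True)`; `DataFrame.append()` was removed in pandas 2.0, where `pd.concat([df, pd.DataFrame([df1])], ignore_index=True)` gives the same result.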
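For readers on a recent pandas release, the row can be added with `pd.concat()`. This is a minimal sketch of that approach; the name `new_row` is illustrative (the question calls the dictionary `df1`).

```python
import pandas as pd

d = {'Sr.no.': [1, 2, 3, 4], 'Name': ['Alex', 'John', 'Peter', 'Klaus'], 'Age': [30, 27, 29, 33]}
df = pd.DataFrame(d)

new_row = {'Sr.no.': 5, 'Name': 'Jason', 'Age': 28}

# Wrap the dict in a one-row DataFrame, then stack it under the original.
# ignore_index=True renumbers the rows 0..4 instead of repeating label 0.
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)
print(df)
```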

14). What is the output of the following code?

				
import pandas as pd
d = {'Sr.no.': [1, 2, 3, 4], 'Name': ['Alex', 'John', 'Peter', 'Klaus'], 'Age': [30, 27, 29, 33]}
df = pd.DataFrame(d)
new = df.sort_values(by=['Name'], ascending=[True])
print("After sorting:")
print(new)

a)
After sorting:
Sr.no. Name Age
0 1 Alex 30
1 2 John 27
3 4 Klaus 33
2 3 Peter 29

b)
After sorting:
Sr.no. Name Age
2 3 Peter 29
3 4 Klaus 33
0 1 Alex 30
1 2 John 27

c)
After sorting:
Sr.no. Name Age
3 4 Klaus 33
2 3 Peter 29
1 2 John 27
0 1 Alex 30

d)
After sorting:
Sr.no. Name Age
1 2 John 27
0 1 Alex 30
2 3 Peter 29
3 4 Klaus 33

Correct answer is: a)
After sorting:
Sr.no. Name Age
0 1 Alex 30
1 2 John 27
3 4 Klaus 33
2 3 Peter 29
Explanation: The given code snippet creates a DataFrame `df` with columns ‘Sr.no.’, ‘Name’, and ‘Age’. It then sorts the DataFrame `df` based on the ‘Name’ column in ascending order using the `sort_values()` method with `ascending=True`. The sorted DataFrame is stored in a new variable `new`.
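The sorted result keeps the original index labels, which is why the rows appear as 0, 1, 3, 2. The sketch below shows this along with two added variations (descending order and a renumbered index) that are not part of the question.

```python
import pandas as pd

d = {'Sr.no.': [1, 2, 3, 4], 'Name': ['Alex', 'John', 'Peter', 'Klaus'], 'Age': [30, 27, 29, 33]}
df = pd.DataFrame(d)

by_name = df.sort_values(by='Name')            # ascending is the default
print(by_name.index.tolist())                  # original labels travel with the rows

by_age_desc = df.sort_values(by='Age', ascending=False)
print(by_age_desc['Name'].tolist())

renumbered = by_name.reset_index(drop=True)    # fresh 0..n-1 index after sorting
print(renumbered.index.tolist())
```

`sort_values()` returns a new DataFrame, so `df` itself stays in its original order.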

15). What is the output of the following code?

				
import pandas as pd
d = {'Sr.no.': [1, 2, 3, 4], 'Name': ['Alex', 'John', 'Peter', 'Klaus'], 'Age': [30, 27, 29, 33]}
df = pd.DataFrame(d)
print("Mean of age column: ", df['Age'].mean())

a) 30.25
b) 29.75
c) 29.0
d) 32.25

Correct answer is: b) 29.75
Explanation: The given code creates a DataFrame `df` with columns ‘Sr.no.’, ‘Name’, and ‘Age’. It then calculates the mean (average) of the ‘Age’ column using the `mean()` function and prints the result. The ‘Age’ column contains values [30, 27, 29, 33]. To find the mean, you add all the values and divide the sum by the total number of values: Mean = (30 + 27 + 29 + 33) / 4 = 29.75
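The arithmetic in the explanation can be checked directly by computing the mean by hand and comparing it with `mean()`. A minimal sketch, with illustrative variable names:

```python
import pandas as pd

d = {'Sr.no.': [1, 2, 3, 4], 'Name': ['Alex', 'John', 'Peter', 'Klaus'], 'Age': [30, 27, 29, 33]}
df = pd.DataFrame(d)

ages = df['Age']
total = ages.sum()            # 30 + 27 + 29 + 33
count = len(ages)             # number of values
manual_mean = total / count

print(total, count, manual_mean, ages.mean())
```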

16). What is the output of the following code?

				
import pandas as pd
d = {'Sr.no.': [1, 2, 3, 4], 'Name': ['Alex', 'John', 'Peter', 'Klaus'], 'Age': [30, 27, 29, 33]}
df = pd.DataFrame(d)
print("Change the name of John to Jim:")
df['Name'] = df['Name'].replace('John', 'Jim')
print(df)

a)
Sr.no. Name Age
0 1 Alex 30
2 3 Peter 29
3 4 Klaus 33

b)
Sr.no. Name Age
0 1 Alex 30
1 2 John 27
2 3 Peter 29
3 4 Klaus 33

c)
Sr.no. Name Age
0 1 Alex 30
1 2 Jim 27
2 3 Peter 29
3 4 Klaus 33

d)
Sr.no. Name Age
0 1 Alex 30
1 2 John 27
2 3 Peter 29
3 4 Klaus 33

Correct answer is: c)
Sr.no. Name Age
0 1 Alex 30
1 2 Jim 27
2 3 Peter 29
3 4 Klaus 33
Explanation: The given code creates a DataFrame `df` with columns ‘Sr.no.’, ‘Name’, and ‘Age’. Then, it replaces the value ‘John’ in the ‘Name’ column with ‘Jim’.

17). What is the output of the following code?

				
import pandas as pd
d = {'Sr.no.':[1,2,3,4],'Name':['Alex','John','Peter','Klaus'],'Age':[30,27,29,33]}
df = pd.DataFrame(d)
df = df[df.Name != 'John']
print("New Series")
print(df)

a)
New Series
Sr.no. Name Age
0 1 Alex 30
2 3 Peter 29
3 4 Klaus 33

b)
New Series
Sr.no. Name Age
0 1 Alex 30
1 2 John 27
2 3 Peter 29
3 4 Klaus 33

c)
New Series
Sr.no. Name Age
1 2 John 27
2 3 Peter 29
3 4 Klaus 33

d)
New Series
Sr.no. Name Age
0 1 Alex 30
2 3 Peter 29
3 4 Klaus 33
1 2 John 27

Correct answer is: a)
New Series
Sr.no. Name Age
0 1 Alex 30
2 3 Peter 29
3 4 Klaus 33
Explanation: The code creates a DataFrame `df` with three columns: ‘Sr.no.’, ‘Name’, and ‘Age’. Then, it filters the DataFrame using the condition `df.Name != 'John'`, which keeps all rows where the ‘Name’ column is not equal to ‘John’. The remaining rows keep their original index labels (0, 2, and 3), and the resulting DataFrame is printed under the heading “New Series”.

18). What is the output of the following code?

				
import pandas as pd
d = {'Sr.no.': [1, 2, 3, 4], 'Name': ['Alex', 'John', 'Peter', 'Klaus'], 'Age': [30, 27, 29, 33]}
df = pd.DataFrame(d)
Salary = [50000, 65000, 58000, 66000]
df['Salary'] = Salary
print("New Series: \n", df)

a)
Sr.no. Name Age Salary
0 1 Alex 30 50000
1 2 John 27 65000
2 3 Peter 29 58000
3 4 Klaus 33 66000

b)
Sr.no. Name Age
0 1 Alex 30
1 2 John 27
2 3 Peter 29
3 4 Klaus 33

c)
Sr.no. Name Age Salary
0 1 Alex 30 58000
1 2 John 27 65000
2 3 Peter 29 66000
3 4 Klaus 33 50000

d)
Sr.no. Name Age Salary
0 1 Alex 30 66000
1 2 John 27 65000
2 3 Peter 29 50000
3 4 Klaus 33 58000

Correct answer is: a)
Sr.no. Name Age Salary
0 1 Alex 30 50000
1 2 John 27 65000
2 3 Peter 29 58000
3 4 Klaus 33 66000
Explanation: The given code first creates a dictionary `d` containing data for the ‘Sr.no.’, ‘Name’, and ‘Age’ columns. It then creates a DataFrame `df` using this dictionary. Next, it defines a list `Salary` containing salary values and adds it to the DataFrame as a new column called ‘Salary’. The output of the `print()` function will display the updated DataFrame with the ‘Sr.no.’, ‘Name’, ‘Age’, and ‘Salary’ columns.

19). What is the output of the following code?

				
import pandas as pd
d = [{'name':'Yash','percentage':78},{'name':'Rakesh','percentage':80},{'name':'Suresh','percentage':60}]
df = pd.DataFrame(d)
for index, row in df.iterrows():
    print(row['name'], row['percentage'])

a) Yash 78 Rakesh 80 Suresh 60
b) Yash Rakesh Suresh
c) 0 Yash 1 Rakesh 2 Suresh
d) Yash percentage Rakesh percentage Suresh percentage

Correct answer is: a) Yash 78 Rakesh 80 Suresh 60
Explanation: The code creates a DataFrame `df` using a list of dictionaries `d`. The DataFrame contains three rows, with ‘name’ and ‘percentage’ as columns. The `iterrows()` method is used to iterate over the rows of the DataFrame as (index, row) pairs, and for each row, the ‘name’ and ‘percentage’ values are printed. Each `print()` call writes its own line, so the actual output spans three lines, one per row.
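The row-by-row loop can be sketched as below. The `itertuples()` variant is an added alternative, not part of the question; it yields one named tuple per row, and it works with attribute access here because the column names are valid Python identifiers.

```python
import pandas as pd

d = [{'name': 'Yash', 'percentage': 78},
     {'name': 'Rakesh', 'percentage': 80},
     {'name': 'Suresh', 'percentage': 60}]
df = pd.DataFrame(d)

# iterrows() yields (index label, row as a Series) pairs.
lines = []
for index, row in df.iterrows():
    lines.append(f"{row['name']} {row['percentage']}")
print('\n'.join(lines))

# itertuples() yields one named tuple per row.
tuple_lines = [f"{t.name} {t.percentage}" for t in df.itertuples(index=False)]
print(tuple_lines)
```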

20). What is the output of the following code?

				
import pandas as pd
d = {'name': ['Virat', 'Messi', 'Kobe'], 'sport': ['cricket', 'football', 'basketball']}
df = pd.DataFrame(d)
print("Names of columns: ")
print(list(df.columns.values))

a) `['name', 'sport']`
b) `['Virat', 'Messi', 'Kobe', 'cricket', 'football', 'basketball']`
c) `['name', 'Virat', 'Messi', 'Kobe', 'sport', 'cricket', 'football', 'basketball']`
d) `['cricket', 'football', 'basketball', 'name', 'sport']`

Correct answer is: a) `['name', 'sport']`
Explanation: The given code first imports the pandas library and creates a DataFrame `df` from the dictionary `d`. The dictionary `d` consists of two key-value pairs: `'name'` with a list of names and `'sport'` with a list of corresponding sports. The dictionary keys become the column labels of the DataFrame. `df.columns.values` returns those labels as a NumPy array, and `list()` converts the array into the Python list `['name', 'sport']`, which is printed after the “Names of columns: ” heading.

21). What is the output of the following code?

				
import pandas as pd
d = {'C1': [1, 3, 8], 'C2': [6, 8, 0], 'C3': [8, 2, 6]}
df = pd.DataFrame(d)
df = df.rename(columns={'C1': 'A', 'C2': 'B', 'C3': 'C'})
print("New DataFrame after renaming columns:")
print(df)

a)
New DataFrame after renaming columns:
A B C
0 1 6 8
1 3 8 2
2 8 0 6

b)
New DataFrame after renaming columns:
C1 C2 C3
0 1 6 8
1 3 8 2
2 8 0 6

c)
New DataFrame after renaming columns:
C1 C2 C3
0 1 3 8
1 6 8 0
2 8 2 6

d)
New DataFrame after renaming columns:
A B C
0 1 3 8
1 6 8 0
2 8 2 6

Correct answer is: a)
New DataFrame after renaming columns:
A B C
0 1 6 8
1 3 8 2
2 8 0 6
Explanation: The code creates a DataFrame `df` using the dictionary `d`, and then it renames the columns ‘C1’, ‘C2’, and ‘C3’ to ‘A’, ‘B’, and ‘C’ respectively. The `print(df)` statement displays the new DataFrame after renaming the columns. The output shows that the DataFrame `df` has been updated with the new column names ‘A’, ‘B’, and ‘C’, while maintaining the original data.

22). What is the output of the following code?

				
import pandas as pd
d = {'Sr.no.':[1,2,3,4],'Name':['Alex','John','Peter','Klaus'],'Age':[30,27,29,33],'Salary':[50000,65000,58000,66000]}
df = pd.DataFrame(d)
df = df[['Sr.no.','Name','Salary','Age']]
print('After re-ordering columns: \n',df)

a) After re-ordering columns:
Sr.no. Name Salary Age
0 1 Alex 50000 30
1 2 John 65000 27
2 3 Peter 58000 29
3 4 Klaus 66000 33

b) After re-ordering columns:
Sr.no. Name Age Salary
0 1 Alex 30 50000
1 2 John 27 65000
2 3 Peter 29 58000
3 4 Klaus 33 66000

c) After re-ordering columns:
Sr.no. Salary Age Name
0 1 50000 30 Alex
1 2 65000 27 John
2 3 58000 29 Peter
3 4 66000 33 Klaus

d) The code will raise an error.

Correct answer is: a) After re-ordering columns:
Sr.no. Name Salary Age
0 1 Alex 50000 30
1 2 John 65000 27
2 3 Peter 58000 29
3 4 Klaus 66000 33
Explanation: The code first creates a DataFrame `df` using a dictionary `d`, which contains ‘Sr.no.’, ‘Name’, ‘Age’, and ‘Salary’ as keys, and their respective values. The DataFrame is then re-ordered to have ‘Sr.no.’, ‘Name’, ‘Salary’, and ‘Age’ as columns using the line `df = df[['Sr.no.', 'Name', 'Salary', 'Age']]`. Finally, the output is printed using `print('After re-ordering columns: \n', df)`.

23). What is the purpose of the following code?

				
import pandas as pd
d = {'Sr.no.': [1, 2, 3, 4], 'Name': ['Alex', 'John', 'Peter', 'Klaus'], 'Age': [30, 27, 29, 33], 'Salary': [50000, 65000, 58000, 66000]}
df = pd.DataFrame(d)
df.to_csv('new_file.csv', sep='\t', index=False)
new = pd.read_csv('new_file.csv', sep='\t')  # read with the same tab separator used for writing
print(new)

a) To read a CSV file into a DataFrame and display its contents.
b) To create a new CSV file and write the DataFrame contents into it.
c) To convert the DataFrame into a NumPy array and print its values.
d) To sort the DataFrame based on the ‘Age’ column and print the result.

Correct answer is: b) To create a new CSV file and write the DataFrame contents into it.
Explanation: The given code performs the following operations:
1. A dictionary `d` is defined containing data related to ‘Sr.no.’, ‘Name’, ‘Age’, and ‘Salary’.
2. A DataFrame `df` is created using the data from the dictionary `d`.
3. The `to_csv()` method is used to export the DataFrame `df` to a CSV file named ‘new_file.csv’, using tab (`\t`) as the separator and excluding the index column.
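4. The `pd.read_csv()` function is used to read the contents of the ‘new_file.csv’ file back into a new DataFrame `new`. Because the file is tab-separated, `read_csv()` needs `sep='\t'` to split it into the original four columns; with the default comma separator, each line is read as a single column.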
5. Finally, the contents of the DataFrame `new` are printed, displaying the data from the ‘new_file.csv’ file.
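The write-then-read round trip in the steps above can be sketched without touching the disk by using an in-memory text buffer. The use of `io.StringIO` in place of ‘new_file.csv’ is an assumption made so the example is self-contained; the variable names are illustrative.

```python
import io
import pandas as pd

d = {'Sr.no.': [1, 2, 3, 4], 'Name': ['Alex', 'John', 'Peter', 'Klaus'],
     'Age': [30, 27, 29, 33], 'Salary': [50000, 65000, 58000, 66000]}
df = pd.DataFrame(d)

# Write tab-separated text into an in-memory buffer instead of a file.
buffer = io.StringIO()
df.to_csv(buffer, sep='\t', index=False)
text = buffer.getvalue()

# Reading with the matching separator restores the four columns.
same_sep = pd.read_csv(io.StringIO(text), sep='\t')
print(same_sep.shape)

# Reading with the default comma separator leaves each line as one column.
default_sep = pd.read_csv(io.StringIO(text))
print(default_sep.shape)
```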

24). What is the purpose of the following code?

				
import pandas as pd
df = pd.read_csv("population_by_country_2020.csv")
df1 = df.drop(['Yearly Change','Net Change','Density (P/Km²)','Land Area (Km²)',
              'Migrants (net)','Fert. Rate','Med. Age','Urban Pop %','World Share'],axis=1)
print(df1.head(10))

a) To read a CSV file named “population_by_country_2020.csv” and display the first 10 rows of the DataFrame.
b) To drop specific columns from the DataFrame ‘df’ and display the first 10 rows of the modified DataFrame.
c) To filter the DataFrame ‘df’ and display the first 10 rows that meet a certain condition.
d) To rename specific columns in the DataFrame ‘df’ and display the first 10 rows of the modified DataFrame.

Correct answer is: b) To drop specific columns from the DataFrame ‘df’ and display the first 10 rows of the modified DataFrame.
Explanation: The given code imports the Pandas library as ‘pd’ and then reads a CSV file named “population_by_country_2020.csv” into a DataFrame ‘df’ using the `pd.read_csv()` function. After that, the code creates a new DataFrame ‘df1’ by dropping specific columns from ‘df’ using the `drop()` method with `axis=1`, which tells pandas to drop columns rather than rows. The columns [‘Yearly Change’, ‘Net Change’, ‘Density (P/Km²)’, ‘Land Area (Km²)’, ‘Migrants (net)’, ‘Fert. Rate’, ‘Med. Age’, ‘Urban Pop %’, ‘World Share’] are dropped; ‘df’ itself is left unchanged. Finally, the code prints the first 10 rows of the modified DataFrame ‘df1’ using the `head(10)` method.

25). What is the purpose of the following code?

				
import numpy as np
import pandas as pd
d = {'Sr.no.':[1,2,3,4],'Name':['Alex','John','Peter','Klaus'],'Age':[30,np.nan,29,np.nan]}
df = pd.DataFrame(d)
print(df)
df.fillna(value = 25,inplace = True)
print("After filling nan values: \n",df)

a) To calculate the mean of the ‘Age’ column and fill the missing values with the mean.
b) To replace all the missing values in the ‘Age’ column with the value 25.
c) To drop the rows with missing values in the ‘Age’ column from the DataFrame.
d) To calculate the median of the ‘Age’ column and fill the missing values with the median.

Correct answer is: b) To replace all the missing values in the ‘Age’ column with the value 25.
Explanation: The given code is using the Pandas library to create a DataFrame `df` with columns ‘Sr.no.’, ‘Name’, and ‘Age’. The ‘Age’ column contains NaN (Not a Number) values. The `fillna()` method is then used to fill the missing (NaN) values in the ‘Age’ column with the value 25. The `inplace=True` argument is used to modify the DataFrame `df` in place, meaning the changes are applied directly to the original DataFrame.
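The sketch below contrasts the constant fill from the question with two added variations that are not part of the original code: restricting the fill to one column, and filling with the column mean (the approach described in option a). It returns new objects instead of using `inplace=True`, so the original `df` can be compared afterwards.

```python
import numpy as np
import pandas as pd

d = {'Sr.no.': [1, 2, 3, 4], 'Name': ['Alex', 'John', 'Peter', 'Klaus'], 'Age': [30, np.nan, 29, np.nan]}
df = pd.DataFrame(d)

# Fill every NaN in the DataFrame with 25; df itself is not modified.
filled_const = df.fillna(value=25)
print(filled_const['Age'].tolist())

# A dict limits the fill to the named column.
filled_age_only = df.fillna({'Age': 25})

# mean() skips NaN, so the fill value is (30 + 29) / 2.
filled_mean = df['Age'].fillna(df['Age'].mean())
print(filled_mean.tolist())
```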
