Python Pandas Problems Archives - Page 5 Of 10

Python Pandas program to shuffle rows in a DataFrame

In this python pandas program, we will shuffle rows in a DataFrame using the pandas library.

Steps to solve the program

Import pandas library as pd.
Create a dataframe using pd.DataFrame().
Shuffle rows in a Dataframe using df.sample(frac=1).
Print the output.

				
import pandas as pd
d = {'Sr.no.':[1,2,3,4],'Name':['Alex','John','Peter','Klaus'],'Age':[30,27,29,33],'Salary':[50000,65000,58000,66000]}
df = pd.DataFrame(d)
print(df)
# frac=1 samples 100% of the rows without replacement, i.e. returns all rows in random order
df = df.sample(frac=1)
print("\nNew DataFrame:")
print(df)

Output (the shuffled order will differ from run to run):

				
   Sr.no.   Name  Age  Salary
0       1   Alex   30   50000
1       2   John   27   65000
2       3  Peter   29   58000
3       4  Klaus   33   66000

New DataFrame:
   Sr.no.   Name  Age  Salary
3       4  Klaus   33   66000
2       3  Peter   29   58000
1       2   John   27   65000
0       1   Alex   30   50000
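Because df.sample() is random, each run prints a different order, and the original index labels travel with the rows. A minimal sketch of a reproducible shuffle follows; the seed value 42 is an arbitrary choice and the variable name shuffled is only for illustration.

```python
import pandas as pd

d = {'Sr.no.': [1, 2, 3, 4], 'Name': ['Alex', 'John', 'Peter', 'Klaus']}
df = pd.DataFrame(d)

# random_state fixes the seed, so the same order comes back on every run.
# The value 42 is arbitrary (an assumption for this sketch).
shuffled = df.sample(frac=1, random_state=42)

# reset_index(drop=True) discards the old row labels and renumbers from 0.
shuffled = shuffled.reset_index(drop=True)
print(shuffled)
```

Leave out random_state when a different order on every run is what you want.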

Python Pandas program to count the NaN values in a Dataframe

In this python pandas program, we will count the NaN values in a Dataframe using the pandas library.

Steps to solve the program

Import pandas library as pd.
Import NumPy library as np.
Create a dataframe using pd.DataFrame().
Count the NaN values in a Dataframe using df.isnull().values.sum().
Print the output.

				
import pandas as pd
import numpy as np
d = {'Sr.no.':[1,2,3,4],'Name':['Alex','John','Peter','Klaus'],'Age':[30,np.nan,29,np.nan]}
df = pd.DataFrame(d)
print(df)
# isnull() marks every NaN cell as True; summing the boolean array counts them
print("NaN values in the dataframe: ",df.isnull().values.sum())

Output :

   Sr.no.   Name   Age
0       1   Alex  30.0
1       2   John   NaN
2       3  Peter  29.0
3       4  Klaus   NaN
NaN values in the dataframe:  2
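The total alone does not say where the missing values are. As a small sketch using the same data, the count can also be broken down per column; the names per_column and total are chosen here for illustration.

```python
import numpy as np
import pandas as pd

d = {'Sr.no.': [1, 2, 3, 4],
     'Name': ['Alex', 'John', 'Peter', 'Klaus'],
     'Age': [30, np.nan, 29, np.nan]}
df = pd.DataFrame(d)

# isna() is an alias of isnull(); sum() on the boolean frame counts per column.
per_column = df.isna().sum()

# Adding up the per-column counts gives the total for the whole DataFrame.
total = int(per_column.sum())

print(per_column)
print("Total NaN values:", total)
```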

Python Pandas program to replace all the NaN values with a scalar in a column of a Dataframe

In this python pandas program, we will replace all the NaN values with a scalar in a column using the pandas library.

Steps to solve the program

Import pandas library as pd.
Import NumPy library as np.
Create a dataframe using pd.DataFrame().
Replace all the NaN values with a scalar using df.fillna(value = 25,inplace = True). This call fills NaN values in every column; here only the Age column contains any, so only that column changes.
Print the output.

				
import pandas as pd
import numpy as np
d = {'Sr.no.':[1,2,3,4],'Name':['Alex','John','Peter','Klaus'],'Age':[30,np.nan,29,np.nan]}
df = pd.DataFrame(d)
print(df)
# inplace=True modifies df directly instead of returning a new DataFrame
df.fillna(value = 25,inplace = True)
print("After filling nan values: \n",df)

Output :

				
   Sr.no.   Name   Age
0       1   Alex  30.0
1       2   John   NaN
2       3  Peter  29.0
3       4  Klaus   NaN
After filling nan values: 
    Sr.no.   Name   Age
0       1   Alex  30.0
1       2   John  25.0
2       3  Peter  29.0
3       4  Klaus  25.0
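When more than one column contains NaN values, a DataFrame-wide fillna() fills all of them. The sketch below restricts the fill to the Age column by passing a dictionary; the Salary column with a missing entry is added here purely for illustration and is not part of the original example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alex', 'John', 'Peter', 'Klaus'],
    'Age': [30, np.nan, 29, np.nan],
    # Illustrative extra column with one missing value (an assumption).
    'Salary': [50000, np.nan, 58000, 66000],
})

# A dict maps column name -> fill value; columns not listed are left alone.
filled = df.fillna({'Age': 25})
print(filled)
```

Here fillna() returns a new DataFrame, so df itself keeps its NaN values.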

Python Pandas program to count country-wise population from a given dataset

In this python pandas program, we will list the population of each country from a given dataset using the pandas library.

Link for Dataset- https://www.kaggle.com/datasets/tanuprabhu/population-by-country-2020?resource=download

Steps to solve the program

Import pandas library as pd.
First, read the dataset using pd.read_csv().
Now drop the unnecessary columns from the dataset using df.drop().
Print the first 10 countries using df.head(10).

				
import pandas as pd
df = pd.read_csv("population_by_country_2020.csv")
# Drop every column except the country name and its 2020 population
df1 = df.drop(['Yearly Change','Net Change','Density (P/Km²)','Land Area (Km²)',
              'Migrants (net)','Fert. Rate','Med. Age','Urban Pop %','World Share'],axis=1)
print(df1.head(10))

Output :

				
   Country (or dependency)  Population (2020)
0                   China         1440297825
1                   India         1382345085
2           United States          331341050
3               Indonesia          274021604
4                Pakistan          221612785
5                  Brazil          212821986
6                 Nigeria          206984347
7              Bangladesh          164972348
8                  Russia          145945524
9                  Mexico          129166028
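Instead of dropping nine columns, it can be shorter to name the two columns to keep. The sketch below assumes the column names shown in the output above and uses a tiny in-memory CSV so it runs without the downloaded file; the Placeholder column and its values stand in for the unwanted columns and are not real data.

```python
import io

import pandas as pd

# Three rows copied from the output above, plus a made-up Placeholder column
# that represents the columns we do not need (an assumption for this sketch).
csv_text = (
    "Country (or dependency),Population (2020),Placeholder\n"
    "China,1440297825,x\n"
    "India,1382345085,x\n"
    "United States,331341050,x\n"
)

wanted = ['Country (or dependency)', 'Population (2020)']

# usecols tells read_csv to load only the listed columns.
df1 = pd.read_csv(io.StringIO(csv_text), usecols=wanted)
print(df1.head(10))
```

With the real dataset, the io.StringIO(...) argument would be replaced by the path to the CSV file.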

Python Pandas program to write a DataFrame to a CSV file using a tab separator

In this python pandas program, we will write a DataFrame to a CSV file using a tab separator with the pandas library.

Steps to solve the program

Import pandas library as pd.
Create a dataframe using pd.DataFrame().
Write the DataFrame to a CSV file using a tab separator using df.to_csv('new_file.csv', sep='\t', index=False).
Read the file back with pd.read_csv() and print the output.

				
import pandas as pd
d = {'Sr.no.':[1,2,3,4],'Name':['Alex','John','Peter','Klaus'],'Age':[30,27,29,33],'Salary':[50000,65000,58000,66000]}
df = pd.DataFrame(d)
df.to_csv('new_file.csv', sep='\t', index=False)
# Read the file back with the same separator, otherwise each line is parsed as a single column
new = pd.read_csv('new_file.csv', sep='\t')
print(new)

Output :

   Sr.no.   Name  Age  Salary
0       1   Alex   30   50000
1       2   John   27   65000
2       3  Peter   29   58000
3       4  Klaus   33   66000
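A tab-separated file has to be read back with the same separator it was written with. The following sketch shows the difference without touching the disk: calling to_csv() with no path returns the text as a string, and io.StringIO lets read_csv() parse it. The names wrong and right are only illustrative.

```python
import io

import pandas as pd

d = {'Sr.no.': [1, 2, 3, 4],
     'Name': ['Alex', 'John', 'Peter', 'Klaus'],
     'Age': [30, 27, 29, 33],
     'Salary': [50000, 65000, 58000, 66000]}
df = pd.DataFrame(d)

# With no path argument, to_csv returns the file contents as a string.
text = df.to_csv(sep='\t', index=False)

# Default separator is a comma, so every line ends up in one column.
wrong = pd.read_csv(io.StringIO(text))

# Passing sep='\t' splits the fields into the original four columns.
right = pd.read_csv(io.StringIO(text), sep='\t')

print(wrong.shape)
print(right.shape)
```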

Python Pandas program to change the order of columns in a DataFrame

In this python pandas program, we will change the order of columns in a DataFrame using the pandas library.

Steps to solve the program

Import pandas library as pd.
Create a dataframe using pd.DataFrame().
Change the order of columns in a DataFrame using df[['Sr.no.','Name','Salary','Age']].
Print the output.

				
import pandas as pd
d = {'Sr.no.':[1,2,3,4],'Name':['Alex','John','Peter','Klaus'],'Age':[30,27,29,33],'Salary':[50000,65000,58000,66000]}
df = pd.DataFrame(d)
print("Original Dataframe: \n",df)
# Selecting with a list of column names returns the columns in the order listed
df = df[['Sr.no.','Name','Salary','Age']]
print('After re-ordering columns: \n',df)

Output :

				
Original Dataframe: 
    Sr.no.   Name  Age  Salary
0       1   Alex   30   50000
1       2   John   27   65000
2       3  Peter   29   58000
3       4  Klaus   33   66000
After re-ordering columns: 
    Sr.no.   Name  Salary  Age
0       1   Alex   50000   30
1       2   John   65000   27
2       3  Peter   58000   29
3       4  Klaus   66000   33
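Typing out every column name gets tedious on wide DataFrames. One possible alternative, sketched here, builds the new order programmatically by moving a single column to the front; the choice of Salary and the variable names are assumptions for illustration.

```python
import pandas as pd

d = {'Sr.no.': [1, 2, 3, 4],
     'Name': ['Alex', 'John', 'Peter', 'Klaus'],
     'Age': [30, 27, 29, 33],
     'Salary': [50000, 65000, 58000, 66000]}
df = pd.DataFrame(d)

# Put 'Salary' first, then every other column in its existing order.
new_order = ['Salary'] + [col for col in df.columns if col != 'Salary']

# Indexing with the list returns a DataFrame with the columns in that order.
reordered = df[new_order]
print(reordered)
```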

Python Pandas program to rename columns of a given DataFrame

In this python pandas program, we will rename columns of a given DataFrame using the pandas library.

Steps to solve the program

Import pandas library as pd.
Create a dataframe using pd.DataFrame().
Rename columns of a given DataFrame to A,B,C using df.rename(columns= {'C1':'A','C2':'B','C3':'C'}).
Print the output.

				
import pandas as pd
d = {'C1':[1,3,8],'C2':[6,8,0],'C3':[8,2,6]}
df = pd.DataFrame(d)
print("Old Dataframe: \n",df)
# rename() returns a new DataFrame; the dict maps old column names to new ones
df = df.rename(columns= {'C1':'A','C2':'B','C3':'C'})
print("New DataFrame after renaming columns:")
print(df)

Output :

				
Old Dataframe: 
    C1  C2  C3
0   1   6   8
1   3   8   2
2   8   0   6
New DataFrame after renaming columns:
   A  B  C
0  1  6  8
1  3  8  2
2  8  0  6
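When every column gets a new name, the mapping can be built from two lists instead of being typed by hand. This is a small sketch of that idea; new_names and mapping are illustrative names.

```python
import pandas as pd

d = {'C1': [1, 3, 8], 'C2': [6, 8, 0], 'C3': [8, 2, 6]}
df = pd.DataFrame(d)

new_names = ['A', 'B', 'C']

# Pair each existing column with its new name: {'C1': 'A', 'C2': 'B', 'C3': 'C'}
mapping = dict(zip(df.columns, new_names))

# rename() leaves df untouched and returns a renamed copy.
renamed = df.rename(columns=mapping)
print(renamed)
```

Keys that do not match an existing column are ignored by rename() by default, so a typo in an old name leaves that column unchanged without raising an error.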

Python Pandas program to get a list of column headers from the DataFrame

In this python pandas program, we will get a list of column headers from the DataFrame using the pandas library.

Steps to solve the program

Import pandas library as pd.
Create a dataframe using pd.DataFrame().
Get a list of column headers from the DataFrame using list(df.columns.values).
Print the output.

				
import pandas as pd
d = {'name':['Virat','Messi','Kobe'],'sport':['cricket','football','basketball']}
df = pd.DataFrame(d)
print("Dataframe: \n",df)
print("Names of columns: ")
# df.columns is an Index object; .values exposes it as an array, list() converts it to a list
print(list(df.columns.values))

Output :

				
Dataframe: 
     name       sport
0  Virat     cricket
1  Messi    football
2   Kobe  basketball
Names of columns: 
['name', 'sport']
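There are shorter spellings that give the same list. The sketch below shows two of them on the same data; which one to prefer is a matter of taste.

```python
import pandas as pd

d = {'name': ['Virat', 'Messi', 'Kobe'],
     'sport': ['cricket', 'football', 'basketball']}
df = pd.DataFrame(d)

# Index.tolist() converts the column labels straight to a Python list.
via_tolist = df.columns.tolist()

# Iterating over a DataFrame yields its column labels, so list(df) also works.
via_list = list(df)

print(via_tolist)
print(via_list)
```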

Python Pandas program to iterate over rows in a DataFrame

In this python pandas program, we will iterate over rows in a DataFrame using pandas library.

Steps to solve the program

Import pandas library as pd.
Create a dataframe using pd.DataFrame().
Iterate over rows in a DataFrame using for loop and df.iterrows().
Print the records for each column using row[column1], row[column2], ..., row[column_n].

				
import pandas as pd
d = [{'name':'Yash','percentage':78},{'name':'Rakesh','percentage':80},{'name':'Suresh','percentage':60}]
df = pd.DataFrame(d)
# iterrows() yields (index, row) pairs, where row is a Series keyed by column name
for index, row in df.iterrows():
    print(row['name'], row['percentage'])

Output :

				
Yash 78
Rakesh 80
Suresh 60
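iterrows() builds a Series for every row, which can be slow on large DataFrames. A commonly used alternative is itertuples(), sketched below on the same data; the records list is only there to make the result easy to inspect.

```python
import pandas as pd

d = [{'name': 'Yash', 'percentage': 78},
     {'name': 'Rakesh', 'percentage': 80},
     {'name': 'Suresh', 'percentage': 60}]
df = pd.DataFrame(d)

records = []
# itertuples() yields one namedtuple per row; index=False leaves out the index.
for row in df.itertuples(index=False):
    # Columns are available as attributes named after the column labels.
    print(row.name, row.percentage)
    records.append((row.name, row.percentage))
```

Attribute access only works for column names that are valid Python identifiers; other names are replaced with positional names by pandas.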

Python Pandas program to add a new column in a DataFrame

In this python pandas program, we will add a new column in a DataFrame using pandas library.

Steps to solve the program

Import pandas library as pd.
Create a dataframe using pd.DataFrame().
Create a list of records for the new Salary column.
Add a Salary column in the given dataframe using df['Salary'] = Salary.
Print the output.

				
import pandas as pd
d = {'Sr.no.':[1,2,3,4],'Name':['Alex','John','Peter','Klaus'],'Age':[30,27,29,33]}
df = pd.DataFrame(d)
print("Old DataFrame: \n",df)
# The list must have exactly one value per row of the DataFrame
Salary = [50000,65000,58000,66000]
df['Salary'] = Salary
print("New DataFrame: \n",df)

Output :

Old DataFrame: 
    Sr.no.   Name  Age
0       1   Alex   30
1       2   John   27
2       3  Peter   29
3       4  Klaus   33
New DataFrame: 
    Sr.no.   Name  Age  Salary
0       1   Alex   30   50000
1       2   John   27   65000
2       3  Peter   29   58000
3       4  Klaus   33   66000
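Assigning with df['Salary'] = ... always appends the column at the end and changes df itself. Two other options are sketched below: assign() returns a new DataFrame, and insert() places the column at a chosen position. The position 2 and the variable names are assumptions for illustration.

```python
import pandas as pd

d = {'Sr.no.': [1, 2, 3, 4],
     'Name': ['Alex', 'John', 'Peter', 'Klaus'],
     'Age': [30, 27, 29, 33]}
df = pd.DataFrame(d)
salary = [50000, 65000, 58000, 66000]

# assign() returns a copy with the new column appended; df is not modified.
with_salary = df.assign(Salary=salary)

# insert() modifies the DataFrame in place, so work on a copy here.
positioned = df.copy()
# Position 2 puts Salary between Name and Age (positions start at 0).
positioned.insert(2, 'Salary', salary)

print(with_salary)
print(positioned)
```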
