In this Python pandas program, we will extract only the words (dropping the numbers) from a column of a DataFrame.
Steps to solve the program
- Import pandas library as pd.
- Import re library.
- Create a dataframe using pd.DataFrame().
- Create a function to extract only words from the record.
- Extract only words from the record using re.findall(r'\b[^\d\W]+\b', text). The pattern matches runs of word characters that contain no digits.
- re.findall() returns the matching words as a list; the function joins them with spaces and returns the resulting string.
- Now apply this function to the Address column of the dataframe using df['Address'].apply(lambda x: search_words(x)).
- The lambda function calls search_words on each row's address, and the results are stored in a new column, words.
- Print the output.
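To see what the pattern from the steps above matches, here is a minimal sketch on single strings, before any DataFrame is involved. The name WORD_PATTERN and the second sample string are assumptions added for illustration; search_words is the function name used in this exercise.

```python
import re

# Pattern from the steps above: a run of word characters that are not digits,
# with a word boundary on each side. WORD_PATTERN is an illustrative name.
WORD_PATTERN = re.compile(r'\b[^\d\W]+\b')

def search_words(text):
    """Return the digit-free words in text, joined by single spaces."""
    return " ".join(WORD_PATTERN.findall(text))

print(search_words('297 shukrawar peth'))  # shukrawar peth
print(search_words('flat12b main road'))   # main road
```

Note that a token mixing letters and digits, such as flat12b, is dropped entirely, because no word boundary falls between a letter and a digit.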
import pandas as pd
import re
d = {'Name': ['Ramesh', 'Suresh', 'Sanket'],
     'Address': ['297 shukrawar peth', '200 ravivar peth', '090 shanivar peth']}
df = pd.DataFrame(d)
print(df)
def search_words(text):
    # Keep only tokens made of non-digit word characters
    result = re.findall(r'\b[^\d\W]+\b', text)
    return " ".join(result)
df['words'] = df['Address'].apply(lambda x: search_words(x))
print("Only words:")
print(df)
Output:
     Name             Address
0  Ramesh  297 shukrawar peth
1  Suresh    200 ravivar peth
2  Sanket   090 shanivar peth
Only words:
     Name             Address           words
0  Ramesh  297 shukrawar peth  shukrawar peth
1  Suresh    200 ravivar peth    ravivar peth
2  Sanket   090 shanivar peth   shanivar peth
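As an alternative to apply() with a custom function, the same result can likely be obtained with pandas' vectorized string methods. This is a sketch of a different technique, not the approach used in the program above; it assumes every value in the Address column is a string.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Ramesh', 'Suresh', 'Sanket'],
    'Address': ['297 shukrawar peth', '200 ravivar peth', '090 shanivar peth'],
})

# str.findall returns a list of matches for each row;
# str.join then glues each list back into one space-separated string.
df['words'] = df['Address'].str.findall(r'\b[^\d\W]+\b').str.join(' ')

print(df['words'].tolist())
# ['shukrawar peth', 'ravivar peth', 'shanivar peth']
```

This avoids importing re and writing a helper function, at the cost of the pattern being less reusable elsewhere in the program.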