Modifying a DataFrame
Adding a Column
One way that we can add a new column is by giving a list of the same length as the existing DataFrame.
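As a minimal sketch of this approach (the DataFrame contents and column names here are hypothetical, chosen only for illustration), a list with one entry per row can be assigned to a new column name:

```python
import pandas as pd

# Hypothetical example data
df = pd.DataFrame({
    "product": ["hammer", "nails", "saw"],
    "price": [12.50, 3.25, 18.00],
})

# The list must have exactly one entry per existing row
df["quantity"] = [4, 100, 2]
print(df)
```

If the list is shorter or longer than the DataFrame, pandas raises a ValueError instead of adding the column.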
We can also add a new column that is the same for all rows in the DataFrame.
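A short sketch of a constant column, again with hypothetical data: assigning a single value fills every row with that value.

```python
import pandas as pd

# Hypothetical example data
df = pd.DataFrame({
    "product": ["hammer", "nails", "saw"],
    "price": [12.50, 3.25, 18.00],
})

# A single (scalar) value is repeated for every row
df["in_stock"] = True
print(df)
```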
Finally, you can add a new column by performing a function on the existing columns.
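One way to derive a column from existing ones is ordinary arithmetic between columns, which pandas applies row by row. The data and the column name total below are hypothetical:

```python
import pandas as pd

# Hypothetical example data
df = pd.DataFrame({
    "price": [12.50, 3.25, 18.00],
    "quantity": [4, 100, 2],
})

# Element-wise multiplication of two existing columns
df["total"] = df["price"] * df["quantity"]
print(df)
```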
Note: to create a new column, we must use bracket notation, df['new_column_name']. Assigning to df.new_column_name does not create a column.
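The difference can be checked directly. In this sketch (column names are hypothetical), attribute assignment leaves the columns untouched, while bracket assignment adds the column, which can afterwards be read with dot notation:

```python
import warnings

import pandas as pd

df = pd.DataFrame({"price": [12.50, 3.25, 18.00]})

# Attribute assignment only sets a Python attribute on the object.
# pandas emits a warning about this pattern, silenced here for clarity.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df.attr_col = [1, 2, 3]

print("attr_col" in df.columns)  # False

# Bracket assignment creates a real column
df["bracket_col"] = [1, 2, 3]
print("bracket_col" in df.columns)  # True

# Once the column exists, dot notation can read it
print(df.bracket_col.tolist())
```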
Adding a Column Using the apply() Function
The pandas apply() function applies a function to every value in a column, or to every row of a DataFrame, and returns the resulting values. The function passed to apply() is written to handle a single element, and apply() calls it once for each value in the column. To operate on rows instead, call apply() on the DataFrame with the argument axis=1; the function then receives one row at a time.
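The sketch below shows both uses with hypothetical data. The helper row_total is an illustrative name, not part of pandas:

```python
import pandas as pd

# Hypothetical example data
df = pd.DataFrame({
    "name": ["ada lovelace", "alan turing"],
    "price": [10.0, 20.0],
    "quantity": [3, 2],
})

# Column-wise: the function receives one value at a time
df["name_upper"] = df["name"].apply(str.upper)

# Row-wise: with axis=1 the function receives each row,
# and columns are looked up by name inside the function
def row_total(row):
    return row["price"] * row["quantity"]

df["total"] = df.apply(row_total, axis=1)
print(df)
```

For simple arithmetic like this, operating on whole columns (df["price"] * df["quantity"]) gives the same result and is usually faster; apply() is most useful when the logic does not reduce to column operations.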
Renaming Columns
Renaming All Columns
We can change all of the column names at once by setting the .columns property to a new list of names. The list must contain one name for each existing column, in the same order as the columns.
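A minimal sketch with hypothetical data:

```python
import pandas as pd

# Hypothetical example data
df = pd.DataFrame({"name": ["Ada", "Alan"], "age": [36, 41]})

# Replace every column name at once; order must match the columns
df.columns = ["First Name", "Age"]
print(df)
```

If the list length does not match the number of columns, pandas raises a ValueError.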
Renaming Individual Columns
You can also rename individual columns by using the .rename method. Pass the columns keyword argument a dictionary that maps each old column name to its new name. For example, df.rename(columns={'name': 'First Name', 'age': 'Age'}, inplace=True) renames name to First Name and age to Age.
Calling rename with only the columns keyword creates a new DataFrame and leaves your original DataFrame unchanged. That’s why we also passed in the keyword argument inplace=True, which edits the original DataFrame instead.
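The two behaviors can be compared side by side. This sketch uses hypothetical data, with the city column added only to show that unlisted columns keep their names:

```python
import pandas as pd

# Hypothetical example data
df = pd.DataFrame({
    "name": ["Ada", "Alan"],
    "age": [36, 41],
    "city": ["London", "Wilmslow"],
})

mapping = {"name": "First Name", "age": "Age"}

# Without inplace, rename returns a new DataFrame
renamed = df.rename(columns=mapping)
print(list(renamed.columns))  # ['First Name', 'Age', 'city']
print(list(df.columns))       # ['name', 'age', 'city']

# With inplace=True, the original is modified and None is returned
result = df.rename(columns=mapping, inplace=True)
print(list(df.columns))       # ['First Name', 'Age', 'city']
```

Reassigning the result (df = df.rename(columns=mapping)) is an equivalent alternative to inplace=True.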
There are several reasons why .rename is preferable to .columns:
You can rename just one column.
You can be specific about which column names are getting changed (with .columns, you can accidentally switch column names if you’re not careful).
Note: If you misspell one of the original column names, this command won’t fail. It just won’t change anything.
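The sketch below illustrates the silent no-op with a deliberately misspelled key and hypothetical data. It also uses the errors='raise' option of rename, which is not mentioned above, to turn the typo into a KeyError:

```python
import pandas as pd

# Hypothetical example data
df = pd.DataFrame({"name": ["Ada"], "age": [36]})

# "nmae" is a deliberate typo: nothing matches, nothing changes
unchanged = df.rename(columns={"nmae": "First Name"})
print(list(unchanged.columns))  # ['name', 'age']

# errors="raise" makes rename fail on names that do not exist
try:
    df.rename(columns={"nmae": "First Name"}, errors="raise")
    raised = False
except KeyError:
    raised = True
print(raised)  # True
```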