# Modify a DataFrame

## Adding a Column

One way that we can add a new column is by giving a list of the same length as the existing DataFrame.

```python
import pandas as pd

df = pd.DataFrame([
  [1, '3 inch screw', 0.5, 0.75],
  [2, '2 inch nail', 0.10, 0.25],
  [3, 'hammer', 3.00, 5.50],
  [4, 'screwdriver', 2.50, 3.00]
],
  columns=['Product ID', 'Description', 
           'Cost to Manufacture', 'Price']
)

# Add one column
df['Sold in Bulk?'] = ['Yes', 'Yes', 'No', 'No']
```
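The list must contain exactly one value per row. As a minimal sketch (the shortened three-row DataFrame, the `'In Stock?'` column, and the `error_message` variable are illustrative and not part of the example above), a list of the wrong length is rejected with a `ValueError` rather than being padded or truncated:

```python
import pandas as pd

df = pd.DataFrame({
    'Product ID': [1, 2, 3],
    'Description': ['3 inch screw', '2 inch nail', 'hammer']
})

# A list with one value per row works
df['Sold in Bulk?'] = ['Yes', 'Yes', 'No']

# A list that is too short is rejected (hypothetical 'In Stock?' column)
error_message = None
try:
    df['In Stock?'] = ['Yes', 'No']
except ValueError as err:
    error_message = str(err)

print(error_message)
```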

We can also add a new column that is the same for all rows in the DataFrame.

```python
df = pd.DataFrame([
  [1, '3 inch screw', 0.5, 0.75],
  [2, '2 inch nail', 0.10, 0.25],
  [3, 'hammer', 3.00, 5.50],
  [4, 'screwdriver', 2.50, 3.00]
],
  columns=['Product ID', 'Description', 
           'Cost to Manufacture', 'Price']
)

# Add one column with the same value in every row
df['Sold in Bulk?'] = 'Yes'
```

Finally, we can add a new column by computing it from the existing columns.

```python
df = pd.DataFrame([
  [1, '3 inch screw', 0.5, 0.75],
  [2, '2 inch nail', 0.10, 0.25],
  [3, 'hammer', 3.00, 5.50],
  [4, 'screwdriver', 2.50, 3.00]
],
  columns=['Product ID', 'Description', 
           'Cost to Manufacture', 'Price']
)

# Add a column computed from two existing columns
df['Margin'] = df['Price'] - df['Cost to Manufacture']
```

Note: when adding a new column, we must use the bracket notation `df['new_column_name']`. Assigning to `df.new_column_name` will not create a column.
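The difference between the two notations can be sketched as follows. The two-row DataFrame and the `Profit` name are assumptions made for illustration. pandas emits a warning on the attribute assignment, which is silenced here only to keep the output clean:

```python
import warnings
import pandas as pd

df = pd.DataFrame({
    'Cost to Manufacture': [0.5, 3.00],
    'Price': [0.75, 5.50]
})

# Bracket notation creates a real column
df['Margin'] = df['Price'] - df['Cost to Manufacture']

# Dot notation only sets a Python attribute on the object, not a column
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    df.Profit = df['Price'] - df['Cost to Manufacture']

print(list(df.columns))
```

Once a column exists, it can still be read with dot notation (for example `df.Margin`), as long as its name is a valid Python identifier.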

### Adding a Column Using the `apply()` Function

The pandas `apply()` method applies a function to every value in a column or row of a DataFrame and replaces that column or row with the resulting values. The function passed to `apply()` is written to handle a single element; `apply()` takes care of calling it on every value in the column. To apply a function to each row instead, call `apply()` on the entire DataFrame and pass the argument `axis=1`.

```python
import pandas as pd

# Small example DataFrame so the calls below can run
df = pd.DataFrame({'column1': [1, 2, 3], 'column2': [10, 20, 30]})

# This function doubles the input value
def double(x):
  return 2*x

# Apply this function to double every value in a specified column
df.column1 = df.column1.apply(double)

# Lambda functions can also be supplied to `apply()`
df.column2 = df.column2.apply(lambda x: 3*x)

# Applying a function to each row requires calling `apply()` on the entire DataFrame
df['newColumn'] = df.apply(
  lambda row: row['column1'] * 1.5 + row['column2'],
  axis=1
)
```
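As a further sketch, the same two patterns can be used on a product table like the one from the earlier examples. The `margin_label` helper, its threshold of 1, and the new column names are hypothetical and chosen only for illustration:

```python
import pandas as pd

df = pd.DataFrame([
    [1, '3 inch screw', 0.5, 0.75],
    [3, 'hammer', 3.00, 5.50]
], columns=['Product ID', 'Description', 'Cost to Manufacture', 'Price'])

# Column-wise: call str.upper on each value in one column
df['Description (Upper)'] = df['Description'].apply(str.upper)

# Row-wise: hypothetical helper that labels a product by its margin
def margin_label(row):
    margin = row['Price'] - row['Cost to Manufacture']
    return 'high' if margin >= 1 else 'low'

# axis=1 passes each row to the function as a Series
df['Margin Label'] = df.apply(margin_label, axis=1)

print(df[['Description (Upper)', 'Margin Label']])
```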

## Renaming Columns

#### Renaming All Columns

We can change all of the column names at once by setting the `.columns` property to a new list. The list must contain one name for each existing column, in the same order as the columns.

```python
df = pd.DataFrame({
    'name': ['John', 'Jane', 'Sue', 'Fred'],
    'age': [23, 29, 21, 18]
})
df.columns = ['First Name', 'Age']
```
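Because `.columns` assigns names by position, the new list has to match the number of columns. A minimal sketch, using a shortened two-row DataFrame and an illustrative `mismatch_error` variable, shows that a list of the wrong length raises a `ValueError`:

```python
import pandas as pd

df = pd.DataFrame({'name': ['John', 'Jane'], 'age': [23, 29]})

# Names are assigned by position, so the order of the list matters
df.columns = ['First Name', 'Age']

# Supplying too few names raises a ValueError
mismatch_error = None
try:
    df.columns = ['First Name']
except ValueError as err:
    mismatch_error = err

print(mismatch_error)
```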

#### Renaming Individual Columns

You can also rename individual columns by using the `.rename` method. Pass a dictionary like the one below to the `columns` keyword argument:

```python
{'old_column_name1': 'new_column_name1', 
'old_column_name2': 'new_column_name2'}
```

Here’s an example:

```python
df = pd.DataFrame({
    'name': ['John', 'Jane', 'Sue', 'Fred'],
    'age': [23, 29, 21, 18]
})
df.rename(columns={
    'name': 'First Name',
    'age': 'Age'},
    inplace=True)
```

The code above will rename `name` to `First Name` and `age` to `Age`.

Using `rename` with only the `columns` keyword will create a **new** DataFrame, leaving your original DataFrame unchanged. That’s why we also passed in the keyword argument **`inplace=True`**. Using `inplace=True` lets us edit the **original** DataFrame.
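To illustrate the difference, here is a small sketch that calls `rename` without `inplace=True`. The `renamed` variable is an illustrative name for the DataFrame that is returned:

```python
import pandas as pd

df = pd.DataFrame({'name': ['John', 'Jane'], 'age': [23, 29]})

# Without inplace=True, rename returns a new DataFrame
renamed = df.rename(columns={'name': 'First Name'})

print(list(df.columns))       # the original keeps its column names
print(list(renamed.columns))  # the returned DataFrame has the new name
```

An alternative to `inplace=True` is to assign the result back to the same variable, as in `df = df.rename(columns={...})`.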

There are two reasons why `.rename` is preferable to `.columns`:

* You can rename just one column
* You can be specific about which column names are getting changed (with `.columns` you can accidentally switch column names if you’re not careful)

*Note:* If you misspell one of the original column names, `.rename` won’t fail. It just won’t change anything.
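The silent behavior can be seen in the sketch below, where `'nmae'` is a deliberate misspelling and `typo_error` is an illustrative variable. Assuming a reasonably recent pandas version, `rename` also accepts `errors='raise'`, which turns an unknown column name into a `KeyError`:

```python
import pandas as pd

df = pd.DataFrame({'name': ['John', 'Jane'], 'age': [23, 29]})

# 'nmae' does not exist: nothing is renamed and no error is raised
df.rename(columns={'nmae': 'First Name'}, inplace=True)
print(list(df.columns))

# With errors='raise', the unknown name raises a KeyError instead
typo_error = None
try:
    df.rename(columns={'nmae': 'First Name'}, errors='raise')
except KeyError as err:
    typo_error = err

print(typo_error)
```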

