# Create a DataFrame

A DataFrame is an object that stores data as rows and columns. You can think of a DataFrame as a spreadsheet or as a SQL table. You can manually create a DataFrame or fill it with data from a CSV, an Excel spreadsheet, or a SQL query.

Each column has a name, which is usually a string. Each row has an index label, which by default is an integer starting at 0. A DataFrame can hold many different data types: strings, ints, floats, tuples, etc.
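As a quick illustration of column names, row index labels, and mixed data types, here is a minimal sketch. The sample values and the `height_m` column are hypothetical and exist only for this example.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    'name': ['John Smith', 'Jane Doe'],
    'age': [34, 28],
    'height_m': [1.8, 1.65],
})

column_names = list(df.columns)  # the column names, as strings
row_labels = list(df.index)      # the default integer index: [0, 1]

print(column_names)
print(row_labels)
print(df.dtypes)                 # one data type per column
```

Printing `df.dtypes` shows that each column keeps its own data type, so text and numbers can live side by side in one DataFrame.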

## Convert from list

We can create a DataFrame from lists. For example, we can pass in a list of lists, where each inner list represents a row of data. Use the keyword argument `columns` to pass a list of column names.

```python
import pandas as pd

# Each inner list is one row; `columns` names the columns in order.
df2 = pd.DataFrame(
    [
        ['John Smith', '123 Main St.', 34],
        ['Jane Doe', '456 Maple Ave.', 28],
        ['Joe Schmo', '789 Broadway', 51],
    ],
    columns=['name', 'address', 'age'],
)
```

This command produces a DataFrame `df2` that looks like this:

| name       | address        | age |
| ---------- | -------------- | --- |
| John Smith | 123 Main St.   | 34  |
| Jane Doe   | 456 Maple Ave. | 28  |
| Joe Schmo  | 789 Broadway   | 51  |

In this example, the columns appear in the same order as the names passed to `columns`, and the rows appear in the order the inner lists were given.

## Convert from dictionary

We can also pass a dictionary to `pd.DataFrame()`. Each key is a column name and each value is a list of column values. The lists must all be the same length, or pandas raises a `ValueError`. Here’s an example:

```python
import pandas as pd

df1 = pd.DataFrame({
    'name': ['John Smith', 'Jane Doe', 'Joe Schmo'],
    'address': ['123 Main St.', '456 Maple Ave.', '789 Broadway'],
    'age': [34, 28, 51],
})
```

This command creates a DataFrame called `df1` that looks like this:

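| name       | address        | age |
| ---------- | -------------- | --- |
| John Smith | 123 Main St.   | 34  |
| Jane Doe   | 456 Maple Ave. | 28  |
| Joe Schmo  | 789 Broadway   | 51  |

Note that in current versions of pandas, the columns appear in the order the keys were added to the dictionary, because Python dictionaries now preserve insertion order. Older versions of pandas sorted the columns alphabetically, since dictionaries had no guaranteed order at the time.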
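The equal-length requirement for the dictionary's lists can be seen in a small sketch. The deliberately short `age` list below is a hypothetical mistake, added only to trigger the error.

```python
import pandas as pd

try:
    pd.DataFrame({
        'name': ['John Smith', 'Jane Doe', 'Joe Schmo'],
        'age': [34, 28],  # one value short, on purpose
    })
    error_message = None
except ValueError as err:
    # pandas refuses to build a DataFrame from lists of unequal length.
    error_message = str(err)

print(error_message)
```

Catching the `ValueError` this way is only for demonstration. In practice, the usual fix is to correct the data so that every list has one value per row.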

## Read in CSV file

When you have data in a CSV file, you can load it into a pandas DataFrame with the `pd.read_csv()` function:

```python
import pandas as pd
df = pd.read_csv('my-csv-file.csv')
```
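To try `pd.read_csv()` without a file on disk, the sketch below reads CSV text from an in-memory buffer instead. The `csv_text` contents are hypothetical sample data, and `io.StringIO` stands in for a real file path.

```python
import io

import pandas as pd

# Hypothetical CSV contents; the first line supplies the column names.
csv_text = """name,address,age
John Smith,123 Main St.,34
Jane Doe,456 Maple Ave.,28
"""

# read_csv accepts a file path or a file-like object such as StringIO.
df = pd.read_csv(io.StringIO(csv_text))

print(df)
```

By default, the first line of the CSV is treated as the header row, and each later line becomes one row of the DataFrame.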

We can also save a DataFrame to a CSV file using the `.to_csv()` method:

```python
df.to_csv('new-csv-file.csv')
```

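In the example above, the `.to_csv()` method is called on `df` (which represents a DataFrame object). The name of the CSV file (`new-csv-file.csv`) is passed in as an argument. Because the file name is a relative path, the file is saved in your current working directory. By default, `.to_csv()` also writes the row index as the first column of the file; pass `index=False` to leave it out.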
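Saving and loading can be combined into a round trip. The following is a minimal sketch that writes to a temporary directory so it leaves no files behind. The sample data and the use of `tempfile` are assumptions made for this example.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    'name': ['John Smith', 'Jane Doe'],
    'age': [34, 28],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'new-csv-file.csv')
    # index=False keeps the row index out of the saved file.
    df.to_csv(path, index=False)
    # Read the file back while the temporary directory still exists.
    df_loaded = pd.read_csv(path)

print(df_loaded)
```

Without `index=False`, the saved file would contain an extra unnamed first column holding the row index, and that column would show up again when the file is read back.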
