Create Pandas DataFrame in Python (4 Examples)

In this tutorial, you will learn how to create a Pandas DataFrame in Python through four examples.

Example 1: Create Pandas DataFrame from a CSV in Python

The following example exports an SFrame to a CSV file and then creates the DataFrame from that CSV file:

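import pandas
import sframe

sf = sframe.SFrame('yourFolder/big_data.gl/')      # Load the saved SFrame
sf.save('yourFolder/big_data.csv', format='csv')   # Export it as a CSV file
df = pandas.read_csv('yourFolder/big_data.csv')    # Read the CSV into a DataFrame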
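
If SFrame is not installed, the CSV-to-DataFrame step can be tried on its own. The following is a minimal sketch, assuming a small hypothetical file (cars.csv, with made-up columns and values) written to a temporary folder. Only the read_csv call corresponds to the example above.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, written to a temporary CSV file for illustration
csv_text = "car,year\nFord,2015\nToyota,2018\n"
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "cars.csv")
with open(csv_path, "w", encoding="utf-8") as f:
    f.write(csv_text)

# read_csv uses the first row as column names and infers the column types
df = pd.read_csv(csv_path)
print(df)
```

Because the first line of the file is treated as the header, the resulting DataFrame has the columns car and year and one row per remaining line.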

Example 2: Create DataFrame from HTML Data

The example below shows how to create a Pandas DataFrame from HTML data by collecting the text of every h3 element:

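import pandas as pd
from bs4 import BeautifulSoup

# Minimal placeholder markup: the original HTML is not shown here,
# so only the <h3> elements that the code reads are included
html_doc = """
<h3>Ford</h3>
<h3>Toyota</h3>
<h3>Škoda</h3>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# Collect every <h3> element, each of which holds one car name
car_names = soup.find_all("h3")

# Build one dictionary per row
data = []
for name in car_names:
    data.append({"Car Name": name.text})

df = pd.DataFrame(data)
print(df)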


Output:

  Car Name
0     Ford
1   Toyota
2    Škoda

Example 3: Creating a DataFrame by Merging 2 Dictionaries in Python

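The example below merges two dictionaries in each loop iteration, collects the merged dictionaries in a list, and creates a DataFrame from that list. If both dictionaries contain the same key, the value from the second dictionary is kept: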

import pandas as pd

adFullList = []  # List for appending merged dictionaries

# In each iteration, merge the two dicts and append the result to the list
for div in soup.find_all('a', {'class': 'result'}):
    # ...some code...
    adBasicInfo = {                            # 1st dictionary
        'adThumbImg': ...some code...,
        'adCounty': ...some code...
    }
    adOtherInfo = getFullAdInfo(adLink)        # 2nd dictionary (complex dict)
    adFull = {**adBasicInfo, **adOtherInfo}    # Merge dicts (Python 3.5+)
    adFullList.append(adFull)                  # Append merged dict to list

# Save the final list as a Pandas DataFrame
adFullDF = pd.DataFrame(data=adFullList)
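
Since the snippet above leaves out the scraping details, here is a minimal runnable sketch of the same merge-and-append pattern. The sample values and the helper get_full_ad_info are hypothetical stand-ins for the scraped data and for getFullAdInfo. They are not part of the original code.

```python
import pandas as pd

# Hypothetical stand-ins for the values scraped in the example above
basic_infos = [
    {"adLink": "ad-1", "adCounty": "Dublin"},
    {"adLink": "ad-2", "adCounty": "Cork"},
]


def get_full_ad_info(ad_link):
    """Hypothetical helper: return extra details for one ad."""
    details = {
        "ad-1": {"price": 1200, "beds": 2},
        "ad-2": {"price": 900, "beds": 1},
    }
    return details[ad_link]


merged_rows = []
for basic in basic_infos:
    other = get_full_ad_info(basic["adLink"])
    # Keys from the right-hand dict win if both dicts share a key
    merged_rows.append({**basic, **other})

# Each dict becomes one row; the dict keys become the columns
df = pd.DataFrame(merged_rows)
print(df)
```

Passing a list of dictionaries to pd.DataFrame gives one row per dictionary. A key that is missing from some dictionaries would show up as NaN in those rows.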

Example 4: Create Pandas DataFrame from Excel File

The following Python code creates a Pandas DataFrame from an Excel file and keeps only the listed columns:

import pandas as pd

# read_excel already returns a DataFrame
data = pd.read_excel(r'YourExcelFilePath\Filename.xlsx')
# Keep only the listed columns
df = pd.DataFrame(data, columns=['Column1', 'Column2', 'Column3', ...])

print(df)
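
As an alternative, the column selection can happen while the file is being read. The following is a minimal sketch that uses the usecols parameter of read_excel. The function name read_selected_columns is an assumption made for illustration, and reading .xlsx files typically requires the openpyxl package to be installed.

```python
import pandas as pd


def read_selected_columns(path, columns):
    """Read only the named columns from the first sheet of an Excel file."""
    # usecols accepts a list of column labels; all other columns are skipped
    return pd.read_excel(path, usecols=columns)
```

A call such as read_selected_columns(r'YourExcelFilePath\Filename.xlsx', ['Column1', 'Column2']) would then return a DataFrame with just those two columns, without building the full table first.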
