Example - Data Engineering

Let us understand how CRUD operations are performed for Data Engineering using Python. Don’t worry if you do not understand the example completely. By the end of the course, you are supposed to gain fair amount of expertise related to Data Engineering.

Here are the highlevel steps to connect to database and perform CRUD operations for Data Engineering applications:

  • Make sure to install database driver such as psycopg2 using pip.

  • Import psycopg2 and connect to database

  • We can insert one record at a time or list of records at a time. After inserting the records, make sure to commit.

  • Get data from the table into resultset.

  • Iterate through the resultset and print.

  • Once all the database operations are performed make sure to close the database connection.

%load_ext sql
%env DATABASE_URL=postgresql://itversity_sms_user:itversity!23@localhost:5433/itversity_sms_db
%sql TRUNCATE TABLE users
!pip install psycopg2
import psycopg2
psycopg2.connect?
user = ('Gordan', 'Bradock', 'gbradock0@barnesandnoble.com', 'A', '2020-01-10')
query = '''
INSERT INTO users 
    (user_first_name, user_last_name, user_email_id, user_role, created_dt) 
VALUES 
    (%s, %s, %s, %s, %s)
'''
connection = psycopg2.connect(
    host='localhost',
    port='5433',
    database='itversity_sms_db',
    user='itversity_sms_user',
    password='itversity!23'
)
cursor = connection.cursor()
cursor.execute(query, user) # Inserts one record
%sql SELECT * FROM users
connection.commit() # Commit have to be explicit
%sql SELECT * FROM users
users = [
    ('Tobe', 'Lyness', 'tlyness1@paginegialle.it', '2020-02-10'),
    ('Addie', 'Mesias', 'amesias2@twitpic.com', '2020-03-05'),
    ('Corene', 'Kohrsen', 'ckohrsen3@buzzfeed.com', '2020-04-15'),
    ('Darill', 'Halsall', 'dhalsall4@intel.com', '2020-10-10') 
]
query = '''
INSERT INTO users 
    (user_first_name, user_last_name, user_email_id, created_dt) 
VALUES 
    (%s, %s, %s, %s)
'''
cursor.executemany(query, users)
connection.commit()
%sql SELECT * FROM users
users_query = 'SELECT * FROM users'
cursor.execute(users_query) # Now we can iterate through cursor to read the data
for user in cursor:
    print(user)
cursor.close()
connection.close()