Scraping Website Content
We can use the third-party Python library requests to get the content of HTML pages. Let us understand how to pass the HTML content fetched by requests to BeautifulSoup.
requests provides a get function to which we can pass a web URL, for example python_page = requests.get(python_url).
We can access the content using python_page.content.
We can then pass the content to BeautifulSoup and parse the HTML tags and data for further processing.
import requests
python_url = 'https://python.itversity.com/mastering-python.html'
python_page = requests.get(python_url)
type(python_page)
requests.models.Response
# python_page.content
b'\n<!DOCTYPE html>\n\n<html>\n <head>\n <meta charset="utf-8" />\n <meta name="viewport" content="width=device-width, initial-scale=1.0" />\n <title>Mastering Python — Mastering Python</title>\n \n <link rel="stylesheet" href="_static/css/index.73d71520a4ca3b99cfee5594769eaaae.css">\n\n \n <link rel="stylesheet"\n href="_static/vendor/fontawesome/5.13.0/css/all.min.css">\n <link rel="preload" as="font" type="font/woff2" crossorigin\n href="_static/vendor/fontawesome/5.13.0/webfonts/fa-solid-900.woff2">\n <link rel="preload" as="font" type="font/woff2" crossorigin\n href="_static/vendor/fontawesome/5.13.0/webfonts/fa-brands-400.woff2">\n\n \n \n <link rel="stylesheet"\n href="_static/vendor/open-sans_all/1.44.1/index.css">\n <link rel="stylesheet"\n href="_static/vendor/lato_latin-ext/1.44.1/index.css">\n\n \n <link rel="stylesheet" href="_static/pygments.css" type="text/css" />\n <link rel="stylesheet" href="_static/sphinx-book-theme.2d2078699c18a0efb88233928e1cf6ed.css" type="text/css" />\n <link rel="stylesheet" type="text/css" href="_static/togglebutton.css" />\n <link rel="stylesheet" type="text/css" href="_static/copybutton.css" />\n <link rel="stylesheet" type="text/css" href="_static/mystnb.css" />\n <link rel="stylesheet" type="text/css" href="_static/sphinx-thebe.css" />\n <link rel="stylesheet" type="text/css" href="_static/panels-main.c949a650a448cc0ae9fd3441c0e17fb0.css" />\n <link rel="stylesheet" type="text/css" href="_static/panels-variables.06eb56fa6e07937060861dad626602ad.css" />\n \n <link rel="preload" as="script" href="_static/js/index.3da636dd464baa7582d2.js">\n\n <script id="documentation_options" data-url_root="./" src="_static/documentation_options.js"></script>\n <script src="_static/jquery.js"></script>\n <script src="_static/underscore.js"></script>\n <script src="_static/doctools.js"></script>\n <script src="_static/language_data.js"></script>\n <script src="_static/togglebutton.js"></script>\n <script 
src="_static/clipboard.min.js"></script>\n <script src="_static/copybutton.js"></script>\n <script >var togglebuttonSelector = \'.toggle, .admonition.dropdown, .tag_hide_input div.cell_input, .tag_hide-input div.cell_input, .tag_hide_output div.cell_output, .tag_hide-output div.cell_output, .tag_hide_cell.cell, .tag_remove-cell.cell\';</script>\n <script src="_static/sphinx-book-theme.be0a4a0c39cd630af62a2fcf693f3f06.js"></script>\n <script async="async" src="https://unpkg.com/thebelab@latest/lib/index.js"></script>\n <script >\n const thebe_selector = ".thebe"\n const thebe_selector_input = "pre"\n const thebe_selector_output = ".output"\n </script>\n <script async="async" src="_static/sphinx-thebe.js"></script>\n <link rel="index" title="Index" href="genindex.html" />\n <link rel="search" title="Search" href="search.html" />\n <link rel="next" title="Overview of Windows Operating System" href="01_overview_of_windows_os/01_overview_of_windows_os.html" />\n\n <meta name="viewport" content="width=device-width, initial-scale=1" />\n <meta name="docsearch:language" content="en" />\n\n\n\n </head>\n <body data-spy="scroll" data-target="#bd-toc-nav" data-offset="80">\n \n\n <div class="container-xl">\n <div class="row">\n \n<div class="col-12 col-md-3 bd-sidebar site-navigation show" id="site-navigation">\n \n <div class="navbar-brand-box">\n<a class="navbar-brand text-wrap" href="index.html">\n \n \n <h1 class="site-logo" id="site-title">Mastering Python</h1>\n \n</a>\n</div>\n\n<form class="bd-search d-flex align-items-center" action="search.html" method="get">\n <i class="icon fas fa-search"></i>\n <input type="search" class="form-control" name="q" id="search-input" placeholder="Search this book..." aria-label="Search this book..." 
autocomplete="off" >\n</form>\n\n<nav class="bd-links" id="bd-docs-nav" aria-label="Main navigation">\n <ul class="nav sidenav_l1">\n <li class="toctree-l1">\n <a class="reference internal" href="#">\n Mastering Python\n </a>\n </li>\n</ul>\n<ul class="nav sidenav_l1">\n <li class="toctree-l1">\n <a class="reference internal" href="01_overview_of_windows_os/01_overview_of_windows_os.html">\n Overview of Windows Operating System\n </a>\n </li>\n <li class="toctree-l1">\n <a class="reference internal" href="04_postgres_database_operations/01_postgres_database_operations.html">\n Perform Database Operations\n </a>\n </li>\n <li class="toctree-l1">\n <a class="reference internal" href="05_getting_started_with_python/01_getting_started_with_python.html">\n Getting Started with Python\n </a>\n </li>\n <li class="toctree-l1">\n <a class="reference internal" href="06_basic_programming_constructs/01_basic_programming_constructs.html">\n Basic Programming Constructs\n </a>\n </li>\n <li class="toctree-l1">\n <a class="reference internal" href="07_pre_defined_functions/01_pre_defined_functions.html">\n Pre-defined Functions\n </a>\n </li>\n <li class="toctree-l1">\n <a class="reference internal" href="08_user_defined_functions/01_user_defined_functions.html">\n User Defined Functions\n </a>\n </li>\n <li class="toctree-l1">\n <a class="reference internal" href="09_overview_of_collections_list_and_set/01_overview_of_collections_list_and_set.html">\n Overview of Collections - list and set\n </a>\n </li>\n <li class="toctree-l1">\n <a class="reference internal" href="10_overview_of_collections_dict_and_tuple/01_overview_of_collections_dict_and_tuple.html">\n Overview of Collections - dict and tuple\n </a>\n </li>\n <li class="toctree-l1">\n <a class="reference internal" href="11_manipulating_collections_using_loops/01_manipulating_collections_using_loops.html">\n Manipulating Collections using Loops\n </a>\n </li>\n <li class="toctree-l1">\n <a class="reference internal" 
href="12_development_of_map_reduce_apis/01_development_of_map_reduce_apis.html">\n Development of Map Reduce APIs\n </a>\n </li>\n <li class="toctree-l1">\n <a class="reference internal" href="13_understanding_map_reduce_libraries/01_understanding_map_reduce_libraries.html">\n Understanding Python Map Reduce Libraries\n </a>\n </li>\n <li class="toctree-l1">\n <a class="reference internal" href="14_overview_of_object_oriented_programming/01_overview_of_object_oriented_programming.html">\n Overview of Object Oriented Programming\n </a>\n </li>\n <li class="toctree-l1">\n <a class="reference internal" href="15_overview_of_pandas_libraries/01_overview_of_pandas_libraries.html">\n Overview of Pandas Libraries\n </a>\n </li>\n <li class="toctree-l1">\n <a class="reference internal" href="16_web_scraping_using_beautifulsoup/01_web_scraping_using_beautifulsoup.html">\n Web Scraping using Beautiful Soup\n </a>\n </li>\n <li class="toctree-l1">\n <a class="reference internal" href="17_database_programming_crud_operations/01_database_programming_crud_operations.html">\n Database Programming \xe2\x80\x93 CRUD Operations\n </a>\n </li>\n <li class="toctree-l1">\n <a class="reference internal" href="18_database_programming_batch_operations/01_database_programming_batch_operations.html">\n Database Programming \xe2\x80\x93 Batch Operations\n </a>\n </li>\n <li class="toctree-l1">\n <a class="reference internal" href="19_project_web_scraping_into_database/01_project_web_scraping_into_database.html">\n Project \xe2\x80\x93 Web Scraping and loading into Database\n </a>\n </li>\n</ul>\n\n</nav>\n\n <!-- To handle the deprecated key -->\n\n<div class="navbar_extra_footer">\n Subscribe to our <a href="http://notifyme.itversity.com">Newsletter</a>\n</div>\n\n</div>\n\n\n \n\n\n \n<main class="col py-md-3 pl-md-4 bd-content overflow-auto" role="main">\n \n <div class="row topbar fixed-top container-xl">\n <div class="col-12 col-md-3 bd-topbar-whitespace site-navigation show">\n </div>\n 
<div class="col pl-2 topbar-main">\n \n <button id="navbar-toggler" class="navbar-toggler ml-0" type="button" data-toggle="collapse"\n data-toggle="tooltip" data-placement="bottom" data-target=".site-navigation" aria-controls="navbar-menu"\n aria-expanded="true" aria-label="Toggle navigation" aria-controls="site-navigation"\n title="Toggle navigation" data-toggle="tooltip" data-placement="left">\n <i class="fas fa-bars"></i>\n <i class="fas fa-arrow-left"></i>\n <i class="fas fa-arrow-up"></i>\n </button>\n \n <div class="dropdown-buttons-trigger">\n <button id="dropdown-buttons-trigger" class="btn btn-secondary topbarbtn" aria-label="Download this page"><i\n class="fas fa-download"></i></button>\n\n \n <div class="dropdown-buttons">\n <!-- ipynb file if we had a myst markdown file -->\n \n <!-- Download raw file -->\n <a class="dropdown-buttons" href="_sources/mastering-python.ipynb"><button type="button"\n class="btn btn-secondary topbarbtn" title="Download source file" data-toggle="tooltip"\n data-placement="left">.ipynb</button></a>\n <!-- Download PDF via print -->\n <button type="button" id="download-print" class="btn btn-secondary topbarbtn" title="Print to PDF"\n onClick="window.print()" data-toggle="tooltip" data-placement="left">.pdf</button>\n </div>\n \n</div>\n <!-- Source interaction buttons -->\n\n<div class="dropdown-buttons-trigger">\n <button id="dropdown-buttons-trigger" class="btn btn-secondary topbarbtn"\n aria-label="Connect with source repository"><i class="fab fa-github"></i></button>\n <div class="dropdown-buttons sourcebuttons">\n <a class="repository-button"\n href="https://github.com/itversity/mastering-python"><button type="button" class="btn btn-secondary topbarbtn"\n data-toggle="tooltip" data-placement="left" title="Source repository"><i\n class="fab fa-github"></i>repository</button></a>\n <a class="issues-button"\n 
href="https://github.com/itversity/mastering-python/issues/new?title=Issue%20on%20page%20%2Fmastering-python.html&body=Your%20issue%20content%20here."><button\n type="button" class="btn btn-secondary topbarbtn" data-toggle="tooltip" data-placement="left"\n title="Open an issue"><i class="fas fa-lightbulb"></i>open issue</button></a>\n <a class="edit-button" href="https://github.com/itversity/mastering-python/edit/master/mastering-python.ipynb"><button\n type="button" class="btn btn-secondary topbarbtn" data-toggle="tooltip" data-placement="left"\n title="Edit this page"><i class="fas fa-pencil-alt"></i>suggest edit</button></a>\n </div>\n</div>\n\n\n <!-- Full screen (wrap in <a> to have style consistency -->\n <a class="full-screen-button"><button type="button" class="btn btn-secondary topbarbtn" data-toggle="tooltip"\n data-placement="bottom" onclick="toggleFullScreen()" aria-label="Fullscreen mode"\n title="Fullscreen mode"><i\n class="fas fa-expand"></i></button></a>\n\n <!-- Launch buttons -->\n\n<div class="dropdown-buttons-trigger">\n <button id="dropdown-buttons-trigger" class="btn btn-secondary topbarbtn"\n aria-label="Launch interactive content"><i class="fas fa-rocket"></i></button>\n <div class="dropdown-buttons">\n \n <a class="binder-button" href="https://mybinder.org/v2/gh/itversity/mastering-python/master?urlpath=tree/mastering-python.ipynb"><button type="button"\n class="btn btn-secondary topbarbtn" title="Launch Binder" data-toggle="tooltip"\n data-placement="left"><img class="binder-button-logo"\n src="_static/images/logo_binder.svg"\n alt="Interact on binder">Binder</button></a>\n \n \n \n \n </div>\n</div>\n\n </div>\n\n <!-- Table of contents -->\n <div class="d-none d-md-block col-md-2 bd-toc show">\n \n <div class="tocsection onthispage pt-5 pb-3">\n <i class="fas fa-list"></i> Contents\n </div>\n <nav id="bd-toc-nav">\n <ul class="nav section-nav flex-column">\n <li class="toc-h2 nav-item toc-entry">\n <a class="reference internal nav-link" 
href="#about-python">\n About Python\n </a>\n </li>\n <li class="toc-h2 nav-item toc-entry">\n <a class="reference internal nav-link" href="#course-details">\n Course Details\n </a>\n </li>\n <li class="toc-h2 nav-item toc-entry">\n <a class="reference internal nav-link" href="#desired-audience">\n Desired Audience\n </a>\n </li>\n <li class="toc-h2 nav-item toc-entry">\n <a class="reference internal nav-link" href="#prerequisites">\n Prerequisites\n </a>\n </li>\n <li class="toc-h2 nav-item toc-entry">\n <a class="reference internal nav-link" href="#key-objectives">\n Key Objectives\n </a>\n </li>\n <li class="toc-h2 nav-item toc-entry">\n <a class="reference internal nav-link" href="#training-approach">\n Training Approach\n </a>\n </li>\n <li class="toc-h2 nav-item toc-entry">\n <a class="reference internal nav-link" href="#self-evaluation">\n Self Evaluation\n </a>\n </li>\n</ul>\n\n </nav>\n \n </div>\n</div>\n <div id="main-content" class="row">\n <div class="col-12 col-md-9 pl-md-3 pr-md-0">\n \n <div>\n \n <div class="section" id="mastering-python">\n<h1>Mastering Python<a class="headerlink" href="#mastering-python" title="Permalink to this headline">\xc2\xb6</a></h1>\n<p>This course is primarily designed to learn Python as Programming language to build web or mobile applications, data engineering applications, automation etc.</p>\n<div class="section" id="about-python">\n<h2>About Python<a class="headerlink" href="#about-python" title="Permalink to this headline">\xc2\xb6</a></h2>\n<p>Python is one of the leading programming language. 
It is an open source database and used for different types of applications.</p>\n<ul class="simple">\n<li><p>Web Applications</p></li>\n<li><p>Mobile Applications</p></li>\n<li><p>Data Engineering Applications</p></li>\n<li><p>Server Automation</p></li>\n<li><p>Troubleshooting and Debugging</p></li>\n<li><p>Quality Assurance</p></li>\n<li><p>Data Science based Applications</p></li>\n<li><p>and many many more</p></li>\n</ul>\n<p>It is one of the top 3 programming languages for almost a decade now along with Java and Java Script.</p>\n</div>\n<div class="section" id="course-details">\n<h2>Course Details<a class="headerlink" href="#course-details" title="Permalink to this headline">\xc2\xb6</a></h2>\n<p>This course is primarily designed to go through core programming using Python. It will serve as foundation for role specific courses for different types of IT Professionals or Roles. As part of this course you will be learning the following topics under core programming using Python.</p>\n<ul class="simple">\n<li><p>Overview of GCP and Setup Ubuntu VM</p></li>\n<li><p>Setup Postgres DB using Docker</p></li>\n<li><p>Postgres Database Operations - Basic DDL and DML</p></li>\n<li><p>Getting Started with Python</p></li>\n<li><p>Basic Programming Constructs</p></li>\n<li><p>Pre Defined Functions</p></li>\n<li><p>User Defined Functions</p></li>\n<li><p>Overview of Collections - list and set</p></li>\n<li><p>Overview of Collections - dict and tuple</p></li>\n<li><p>Manipulating Collections using loops</p></li>\n<li><p>Overview of Map Reduce Libraries</p></li>\n<li><p>Overview of Pandas Libraries</p></li>\n<li><p>Database Programming - CRUD Operations</p></li>\n<li><p>Database Programming - Batch Operations</p></li>\n</ul>\n</div>\n<div class="section" id="desired-audience">\n<h2>Desired Audience<a class="headerlink" href="#desired-audience" title="Permalink to this headline">\xc2\xb6</a></h2>\n<p>Here are the desired audience for this course.</p>\n<ul 
class="simple">\n<li><p>College students and entry level professionals to get hands on expertise with respect to programming using Python to be prepared for the interviews.</p></li>\n<li><p>Experienced application developers to understand key aspects of Python Programming to build Python based Web or Mobile Applications.</p></li>\n<li><p>Data Engineers and Data Warehouse Developers to understand key aspects of Python Programming to build batch or streaming pipelines.</p></li>\n<li><p>Testers to improve their scripting abilities to validate data in the files tables etc.</p></li>\n<li><p>DevOps Engineers, System Administrators, Database Administrators etc to understand Python as scripting language for the automation of day to day tasks.</p></li>\n<li><p>Data Scientists to be proficient in Python Programming to build the models.</p></li>\n</ul>\n<div class="admonition note">\n<p class="admonition-title">Note</p>\n<p>This course only covers Fundamentals of Python which is useful for almost all the hands on technical roles. 
It does not include the role specific libraries.</p>\n</div>\n</div>\n<div class="section" id="prerequisites">\n<h2>Prerequisites<a class="headerlink" href="#prerequisites" title="Permalink to this headline">\xc2\xb6</a></h2>\n<p>Here are the prerequisites before signing up for the course.</p>\n<div class="sphinx-bs container pb-4 docutils">\n<div class="row docutils">\n<div class="d-flex col-lg-6 col-md-6 col-sm-6 col-xs-12 p-2 docutils">\n<div class="card w-100 shadow docutils">\n<div class="card-body docutils">\n<p class="card-text"><strong>Logistics</strong></p>\n<ul class="simple">\n<li><p class="card-text">Computer with decent configuration</p>\n<ul>\n<li><p class="card-text">At least 4 GB RAM</p></li>\n<li><p class="card-text">8 GB RAM is highly desired</p></li>\n</ul>\n</li>\n<li><p class="card-text">Chrome Browser</p></li>\n<li><p class="card-text">High Speed Internet</p></li>\n</ul>\n</div>\n</div>\n</div>\n<div class="d-flex col-lg-6 col-md-6 col-sm-6 col-xs-12 p-2 docutils">\n<div class="card w-100 shadow docutils">\n<div class="card-body docutils">\n<p class="card-text"><strong>Desired Skills</strong></p>\n<ul class="simple">\n<li><p class="card-text">Engineering or Science Degree</p></li>\n<li><p class="card-text">Ability to use computer</p></li>\n<li><p class="card-text">Knowledge or working experience with databases is highly desired</p></li>\n</ul>\n</div>\n</div>\n</div>\n</div>\n</div>\n</div>\n<div class="section" id="key-objectives">\n<h2>Key Objectives<a class="headerlink" href="#key-objectives" title="Permalink to this headline">\xc2\xb6</a></h2>\n<p>The course is designed for the professionals to achieve these key objectives related to programming using Python as Programming Language.</p>\n<div class="admonition attention">\n<p class="admonition-title">Attention</p>\n<p>This course is primarily designed to gain key database skills for application developers, data engineers, testers, business analysts etc.</p>\n</div>\n</div>\n<div 
class="section" id="training-approach">\n<h2>Training Approach<a class="headerlink" href="#training-approach" title="Permalink to this headline">\xc2\xb6</a></h2>\n<p>Here are the details related to the training approach.</p>\n<ul class="simple">\n<li><p>It is self paced with reference material, code snippets and videos.</p></li>\n<li><p>One can either use environment provided by us or setup their own environment using Docker.</p></li>\n<li><p>Modules will be published as and when they are ready. We would recommend to complete <strong>2 modules every week</strong> by spending <strong>4 to 5 hours per week</strong>.</p></li>\n<li><p>It is highly recommended to take care of the exercises at the end to ensure that you are able to meet all the key objectives for each module.</p></li>\n<li><p>Support will be provided either through chat or email.</p></li>\n<li><p>For those who signed up, we will have weekly monitoring and review sessions to keep track of the progress.</p></li>\n</ul>\n<div class="admonition attention">\n<p class="admonition-title">Attention</p>\n<p>Spend 4 to 5 hours per week up to 8 weeks and complete all the exercises to get best out of this course.</p>\n</div>\n</div>\n<div class="section" id="self-evaluation">\n<h2>Self Evaluation<a class="headerlink" href="#self-evaluation" title="Permalink to this headline">\xc2\xb6</a></h2>\n<p>The course is designed in such a way that one can self evaluate through the course and confirm whether the skills are acquired.</p>\n<ul class="simple">\n<li><p>Here is the approach we recommend you to take this course.</p>\n<ul>\n<li><p>Go through the consolidated exercises and see if you are able to solve the problems or not.</p></li>\n<li><p>Make sure to follow the order we have defined as part of the course.</p></li>\n<li><p>After each and every section or module, make sure to solve the exercises. 
We have provided enough information to validate the output of your queries.</p></li>\n<li><p>After the completion of the course try to solve the exercises using consolidated list.</p></li>\n<li><p>Keep in mind that you will be reviewing the same exercises before the course, during the course as well as at the end of the course.</p></li>\n</ul>\n</li>\n<li><p>By the end of the course, if you are able to solve the problems, then you can come to a conclusion that you are able to master the key skill called as SQL.</p></li>\n</ul>\n<div class="toctree-wrapper compound">\n</div>\n</div>\n</div>\n\n <script type="text/x-thebe-config">\n {\n requestKernel: true,\n binderOptions: {\n repo: "binder-examples/jupyter-stacks-datascience",\n ref: "master",\n },\n codeMirrorConfig: {\n theme: "abcdef",\n mode: "python"\n },\n kernelOptions: {\n kernelName: "python3",\n path: "./."\n },\n predefinedOutput: true\n }\n </script>\n <script>kernelName = \'python3\'</script>\n\n </div>\n \n </div>\n </div>\n \n \n <div class=\'prev-next-bottom\'>\n \n <a class=\'right-next\' id="next-link" href="01_overview_of_windows_os/01_overview_of_windows_os.html" title="next page">Overview of Windows Operating System</a>\n\n </div>\n <footer class="footer mt-5 mt-md-0">\n <div class="container">\n <p>\n \n By Durga Gadiraju<br/>\n \n © Copyright ITVersity, Inc.<br/>\n </p>\n </div>\n </footer>\n</main>\n\n\n </div>\n </div>\n\n \n <script src="_static/js/index.3da636dd464baa7582d2.js"></script>\n\n\n \n <!-- Google Analytics -->\n <script>\n window.ga=window.ga||function(){(ga.q=ga.q||[]).push(arguments)};ga.l=+new Date;\n ga(\'create\', \'UA-80990145-12\', \'auto\');\n ga(\'set\', \'anonymizeIp\', true);\n ga(\'send\', \'pageview\');\n </script>\n <script async src=\'https://www.google-analytics.com/analytics.js\'></script>\n <!-- End Google Analytics -->\n \n </body>\n</html>'
type(python_page.content)
bytes
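The \xe2\x80\x93 sequences in the output above are a reminder that python_page.content is raw bytes, not text. As a minimal sketch, assuming the page is UTF-8 encoded (its meta charset tag says so), the bytes can be decoded into a str. The excerpt below is copied from the navigation links in the output, and the variable names raw and text are only illustrative.

```python
# Excerpt of the bytes returned by python_page.content (copied from the output above).
raw = b'Database Programming \xe2\x80\x93 CRUD Operations'

# Decode the UTF-8 bytes into a str; the three bytes \xe2\x80\x93 become one en dash character.
text = raw.decode('utf-8')

print(type(raw).__name__)   # bytes
print(type(text).__name__)  # str
print(text)
```

requests also exposes python_page.text, which returns an already decoded str, while BeautifulSoup accepts either form.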
Processing HTML Content
We can pass the content to BeautifulSoup and extract HTML tags as well as data.
We pass the content together with the parser name 'html.parser' to build the BeautifulSoup object. Let us prettify and print the content.
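The name html.parser refers to the HTML parser in Python's standard library, which BeautifulSoup uses when that name is passed. As a rough illustration of what parsing involves, the sketch below uses the standard-library html.parser module directly, not BeautifulSoup, on a tiny inline document. The TitleCollector class and the sample HTML are hypothetical and exist only for this example.

```python
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Hypothetical helper that collects the text found inside <title>."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title_parts = []

    def handle_starttag(self, tag, attrs):
        # Start collecting once the opening <title> tag is seen.
        if tag == 'title':
            self.in_title = True

    def handle_endtag(self, tag):
        # Stop collecting at the closing </title> tag.
        if tag == 'title':
            self.in_title = False

    def handle_data(self, data):
        # Text nodes outside <title> are ignored.
        if self.in_title:
            self.title_parts.append(data)


# Sample bytes standing in for python_page.content (an assumption, not the real page).
html_bytes = (
    b'<html><head><title>Mastering Python</title></head>'
    b'<body><h1>Mastering Python</h1></body></html>'
)

collector = TitleCollector()
collector.feed(html_bytes.decode('utf-8'))  # HTMLParser.feed expects str, not bytes
title = ''.join(collector.title_parts).strip()
print(title)
```

BeautifulSoup builds a navigable tree on top of this kind of event stream, which is why the code that follows can work with the whole document through a single soup object.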
from bs4 import BeautifulSoup
soup = BeautifulSoup(python_page.content, 'html.parser')
# print(soup.prettify())
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8"/>
<meta content="width=device-width, initial-scale=1.0" name="viewport"/>
<title>
Mastering Python — Mastering Python
</title>
<link href="_static/css/index.73d71520a4ca3b99cfee5594769eaaae.css" rel="stylesheet"/>
<link href="_static/vendor/fontawesome/5.13.0/css/all.min.css" rel="stylesheet"/>
<link as="font" crossorigin="" href="_static/vendor/fontawesome/5.13.0/webfonts/fa-solid-900.woff2" rel="preload" type="font/woff2"/>
<link as="font" crossorigin="" href="_static/vendor/fontawesome/5.13.0/webfonts/fa-brands-400.woff2" rel="preload" type="font/woff2"/>
<link href="_static/vendor/open-sans_all/1.44.1/index.css" rel="stylesheet"/>
<link href="_static/vendor/lato_latin-ext/1.44.1/index.css" rel="stylesheet"/>
<link href="_static/pygments.css" rel="stylesheet" type="text/css">
<link href="_static/sphinx-book-theme.2d2078699c18a0efb88233928e1cf6ed.css" rel="stylesheet" type="text/css">
<link href="_static/togglebutton.css" rel="stylesheet" type="text/css">
<link href="_static/copybutton.css" rel="stylesheet" type="text/css">
<link href="_static/mystnb.css" rel="stylesheet" type="text/css">
<link href="_static/sphinx-thebe.css" rel="stylesheet" type="text/css">
<link href="_static/panels-main.c949a650a448cc0ae9fd3441c0e17fb0.css" rel="stylesheet" type="text/css"/>
<link href="_static/panels-variables.06eb56fa6e07937060861dad626602ad.css" rel="stylesheet" type="text/css"/>
<link as="script" href="_static/js/index.3da636dd464baa7582d2.js" rel="preload"/>
<script data-url_root="./" id="documentation_options" src="_static/documentation_options.js">
</script>
<script src="_static/jquery.js">
</script>
<script src="_static/underscore.js">
</script>
<script src="_static/doctools.js">
</script>
<script src="_static/language_data.js">
</script>
<script src="_static/togglebutton.js">
</script>
<script src="_static/clipboard.min.js">
</script>
<script src="_static/copybutton.js">
</script>
<script>
var togglebuttonSelector = '.toggle, .admonition.dropdown, .tag_hide_input div.cell_input, .tag_hide-input div.cell_input, .tag_hide_output div.cell_output, .tag_hide-output div.cell_output, .tag_hide_cell.cell, .tag_remove-cell.cell';
</script>
<script src="_static/sphinx-book-theme.be0a4a0c39cd630af62a2fcf693f3f06.js">
</script>
<script async="async" src="https://unpkg.com/thebelab@latest/lib/index.js">
</script>
<script>
const thebe_selector = ".thebe"
const thebe_selector_input = "pre"
const thebe_selector_output = ".output"
</script>
<script async="async" src="_static/sphinx-thebe.js">
</script>
<link href="genindex.html" rel="index" title="Index">
<link href="search.html" rel="search" title="Search"/>
<link href="01_overview_of_windows_os/01_overview_of_windows_os.html" rel="next" title="Overview of Windows Operating System"/>
<meta content="width=device-width, initial-scale=1" name="viewport"/>
<meta content="en" name="docsearch:language"/>
</link>
</link>
</link>
</link>
</link>
</link>
</link>
</head>
<body data-offset="80" data-spy="scroll" data-target="#bd-toc-nav">
<div class="container-xl">
<div class="row">
<div class="col-12 col-md-3 bd-sidebar site-navigation show" id="site-navigation">
<div class="navbar-brand-box">
<a class="navbar-brand text-wrap" href="index.html">
<h1 class="site-logo" id="site-title">
Mastering Python
</h1>
</a>
</div>
<form action="search.html" class="bd-search d-flex align-items-center" method="get">
<i class="icon fas fa-search">
</i>
<input aria-label="Search this book..." autocomplete="off" class="form-control" id="search-input" name="q" placeholder="Search this book..." type="search"/>
</form>
<nav aria-label="Main navigation" class="bd-links" id="bd-docs-nav">
<ul class="nav sidenav_l1">
<li class="toctree-l1">
<a class="reference internal" href="#">
Mastering Python
</a>
</li>
</ul>
<ul class="nav sidenav_l1">
<li class="toctree-l1">
<a class="reference internal" href="01_overview_of_windows_os/01_overview_of_windows_os.html">
Overview of Windows Operating System
</a>
</li>
<li class="toctree-l1">
<a class="reference internal" href="04_postgres_database_operations/01_postgres_database_operations.html">
Perform Database Operations
</a>
</li>
<li class="toctree-l1">
<a class="reference internal" href="05_getting_started_with_python/01_getting_started_with_python.html">
Getting Started with Python
</a>
</li>
<li class="toctree-l1">
<a class="reference internal" href="06_basic_programming_constructs/01_basic_programming_constructs.html">
Basic Programming Constructs
</a>
</li>
<li class="toctree-l1">
<a class="reference internal" href="07_pre_defined_functions/01_pre_defined_functions.html">
Pre-defined Functions
</a>
</li>
<li class="toctree-l1">
<a class="reference internal" href="08_user_defined_functions/01_user_defined_functions.html">
User Defined Functions
</a>
</li>
<li class="toctree-l1">
<a class="reference internal" href="09_overview_of_collections_list_and_set/01_overview_of_collections_list_and_set.html">
Overview of Collections - list and set
</a>
</li>
<li class="toctree-l1">
<a class="reference internal" href="10_overview_of_collections_dict_and_tuple/01_overview_of_collections_dict_and_tuple.html">
Overview of Collections - dict and tuple
</a>
</li>
<li class="toctree-l1">
<a class="reference internal" href="11_manipulating_collections_using_loops/01_manipulating_collections_using_loops.html">
Manipulating Collections using Loops
</a>
</li>
<li class="toctree-l1">
<a class="reference internal" href="12_development_of_map_reduce_apis/01_development_of_map_reduce_apis.html">
Development of Map Reduce APIs
</a>
</li>
<li class="toctree-l1">
<a class="reference internal" href="13_understanding_map_reduce_libraries/01_understanding_map_reduce_libraries.html">
Understanding Python Map Reduce Libraries
</a>
</li>
<li class="toctree-l1">
<a class="reference internal" href="14_overview_of_object_oriented_programming/01_overview_of_object_oriented_programming.html">
Overview of Object Oriented Programming
</a>
</li>
<li class="toctree-l1">
<a class="reference internal" href="15_overview_of_pandas_libraries/01_overview_of_pandas_libraries.html">
Overview of Pandas Libraries
</a>
</li>
<li class="toctree-l1">
<a class="reference internal" href="16_web_scraping_using_beautifulsoup/01_web_scraping_using_beautifulsoup.html">
Web Scraping using Beautiful Soup
</a>
</li>
<li class="toctree-l1">
<a class="reference internal" href="17_database_programming_crud_operations/01_database_programming_crud_operations.html">
Database Programming – CRUD Operations
</a>
</li>
<li class="toctree-l1">
<a class="reference internal" href="18_database_programming_batch_operations/01_database_programming_batch_operations.html">
Database Programming – Batch Operations
</a>
</li>
<li class="toctree-l1">
<a class="reference internal" href="19_project_web_scraping_into_database/01_project_web_scraping_into_database.html">
Project – Web Scraping and loading into Database
</a>
</li>
</ul>
</nav>
<!-- To handle the deprecated key -->
<div class="navbar_extra_footer">
Subscribe to our
<a href="http://notifyme.itversity.com">
Newsletter
</a>
</div>
</div>
<main class="col py-md-3 pl-md-4 bd-content overflow-auto" role="main">
<div class="row topbar fixed-top container-xl">
<div class="col-12 col-md-3 bd-topbar-whitespace site-navigation show">
</div>
<div class="col pl-2 topbar-main">
<button aria-controls="site-navigation" aria-expanded="true" aria-label="Toggle navigation" class="navbar-toggler ml-0" data-placement="left" data-target=".site-navigation" data-toggle="tooltip" id="navbar-toggler" title="Toggle navigation" type="button">
<i class="fas fa-bars">
</i>
<i class="fas fa-arrow-left">
</i>
<i class="fas fa-arrow-up">
</i>
</button>
<div class="dropdown-buttons-trigger">
<button aria-label="Download this page" class="btn btn-secondary topbarbtn" id="dropdown-buttons-trigger">
<i class="fas fa-download">
</i>
</button>
<div class="dropdown-buttons">
<!-- ipynb file if we had a myst markdown file -->
<!-- Download raw file -->
<a class="dropdown-buttons" href="_sources/mastering-python.ipynb">
<button class="btn btn-secondary topbarbtn" data-placement="left" data-toggle="tooltip" title="Download source file" type="button">
.ipynb
</button>
</a>
<!-- Download PDF via print -->
<button class="btn btn-secondary topbarbtn" data-placement="left" data-toggle="tooltip" id="download-print" onclick="window.print()" title="Print to PDF" type="button">
.pdf
</button>
</div>
</div>
<!-- Source interaction buttons -->
<div class="dropdown-buttons-trigger">
<button aria-label="Connect with source repository" class="btn btn-secondary topbarbtn" id="dropdown-buttons-trigger">
<i class="fab fa-github">
</i>
</button>
<div class="dropdown-buttons sourcebuttons">
<a class="repository-button" href="https://github.com/itversity/mastering-python">
<button class="btn btn-secondary topbarbtn" data-placement="left" data-toggle="tooltip" title="Source repository" type="button">
<i class="fab fa-github">
</i>
repository
</button>
</a>
<a class="issues-button" href="https://github.com/itversity/mastering-python/issues/new?title=Issue%20on%20page%20%2Fmastering-python.html&body=Your%20issue%20content%20here.">
<button class="btn btn-secondary topbarbtn" data-placement="left" data-toggle="tooltip" title="Open an issue" type="button">
<i class="fas fa-lightbulb">
</i>
open issue
</button>
</a>
<a class="edit-button" href="https://github.com/itversity/mastering-python/edit/master/mastering-python.ipynb">
<button class="btn btn-secondary topbarbtn" data-placement="left" data-toggle="tooltip" title="Edit this page" type="button">
<i class="fas fa-pencil-alt">
</i>
suggest edit
</button>
</a>
</div>
</div>
<!-- Full screen (wrap in <a> to have style consistency -->
<a class="full-screen-button">
<button aria-label="Fullscreen mode" class="btn btn-secondary topbarbtn" data-placement="bottom" data-toggle="tooltip" onclick="toggleFullScreen()" title="Fullscreen mode" type="button">
<i class="fas fa-expand">
</i>
</button>
</a>
<!-- Launch buttons -->
<div class="dropdown-buttons-trigger">
<button aria-label="Launch interactive content" class="btn btn-secondary topbarbtn" id="dropdown-buttons-trigger">
<i class="fas fa-rocket">
</i>
</button>
<div class="dropdown-buttons">
<a class="binder-button" href="https://mybinder.org/v2/gh/itversity/mastering-python/master?urlpath=tree/mastering-python.ipynb">
<button class="btn btn-secondary topbarbtn" data-placement="left" data-toggle="tooltip" title="Launch Binder" type="button">
<img alt="Interact on binder" class="binder-button-logo" src="_static/images/logo_binder.svg"/>
Binder
</button>
</a>
</div>
</div>
</div>
<!-- Table of contents -->
<div class="d-none d-md-block col-md-2 bd-toc show">
<div class="tocsection onthispage pt-5 pb-3">
<i class="fas fa-list">
</i>
Contents
</div>
<nav id="bd-toc-nav">
<ul class="nav section-nav flex-column">
<li class="toc-h2 nav-item toc-entry">
<a class="reference internal nav-link" href="#about-python">
About Python
</a>
</li>
<li class="toc-h2 nav-item toc-entry">
<a class="reference internal nav-link" href="#course-details">
Course Details
</a>
</li>
<li class="toc-h2 nav-item toc-entry">
<a class="reference internal nav-link" href="#desired-audience">
Desired Audience
</a>
</li>
<li class="toc-h2 nav-item toc-entry">
<a class="reference internal nav-link" href="#prerequisites">
Prerequisites
</a>
</li>
<li class="toc-h2 nav-item toc-entry">
<a class="reference internal nav-link" href="#key-objectives">
Key Objectives
</a>
</li>
<li class="toc-h2 nav-item toc-entry">
<a class="reference internal nav-link" href="#training-approach">
Training Approach
</a>
</li>
<li class="toc-h2 nav-item toc-entry">
<a class="reference internal nav-link" href="#self-evaluation">
Self Evaluation
</a>
</li>
</ul>
</nav>
</div>
</div>
<div class="row" id="main-content">
<div class="col-12 col-md-9 pl-md-3 pr-md-0">
<div>
<div class="section" id="mastering-python">
<h1>
Mastering Python
<a class="headerlink" href="#mastering-python" title="Permalink to this headline">
¶
</a>
</h1>
<p>
This course is primarily designed to teach Python as a programming language for building web or mobile applications, data engineering applications, automation, etc.
</p>
<div class="section" id="about-python">
<h2>
About Python
<a class="headerlink" href="#about-python" title="Permalink to this headline">
¶
</a>
</h2>
<p>
Python is one of the leading programming languages. It is open source and is used for different types of applications.
</p>
<ul class="simple">
<li>
<p>
Web Applications
</p>
</li>
<li>
<p>
Mobile Applications
</p>
</li>
<li>
<p>
Data Engineering Applications
</p>
</li>
<li>
<p>
Server Automation
</p>
</li>
<li>
<p>
Troubleshooting and Debugging
</p>
</li>
<li>
<p>
Quality Assurance
</p>
</li>
<li>
<p>
Data Science based Applications
</p>
</li>
<li>
<p>
and many more
</p>
</li>
</ul>
<p>
It has been one of the top 3 programming languages for almost a decade now, along with Java and JavaScript.
</p>
</div>
<div class="section" id="course-details">
<h2>
Course Details
<a class="headerlink" href="#course-details" title="Permalink to this headline">
¶
</a>
</h2>
<p>
This course is primarily designed to go through core programming using Python. It will serve as a foundation for role-specific courses for different types of IT professionals or roles. As part of this course, you will learn the following topics under core programming using Python.
</p>
<ul class="simple">
<li>
<p>
Overview of GCP and Setup Ubuntu VM
</p>
</li>
<li>
<p>
Setup Postgres DB using Docker
</p>
</li>
<li>
<p>
Postgres Database Operations - Basic DDL and DML
</p>
</li>
<li>
<p>
Getting Started with Python
</p>
</li>
<li>
<p>
Basic Programming Constructs
</p>
</li>
<li>
<p>
Pre Defined Functions
</p>
</li>
<li>
<p>
User Defined Functions
</p>
</li>
<li>
<p>
Overview of Collections - list and set
</p>
</li>
<li>
<p>
Overview of Collections - dict and tuple
</p>
</li>
<li>
<p>
Manipulating Collections using loops
</p>
</li>
<li>
<p>
Overview of Map Reduce Libraries
</p>
</li>
<li>
<p>
Overview of Pandas Libraries
</p>
</li>
<li>
<p>
Database Programming - CRUD Operations
</p>
</li>
<li>
<p>
Database Programming - Batch Operations
</p>
</li>
</ul>
</div>
<div class="section" id="desired-audience">
<h2>
Desired Audience
<a class="headerlink" href="#desired-audience" title="Permalink to this headline">
¶
</a>
</h2>
<p>
Here is the desired audience for this course.
</p>
<ul class="simple">
<li>
<p>
College students and entry-level professionals who want hands-on expertise in programming using Python to prepare for interviews.
</p>
</li>
<li>
<p>
Experienced application developers to understand key aspects of Python Programming to build Python based Web or Mobile Applications.
</p>
</li>
<li>
<p>
Data Engineers and Data Warehouse Developers to understand key aspects of Python Programming to build batch or streaming pipelines.
</p>
</li>
<li>
<p>
Testers to improve their scripting abilities to validate data in files, tables, etc.
</p>
</li>
<li>
<p>
DevOps Engineers, System Administrators, Database Administrators, etc. to understand Python as a scripting language for the automation of day-to-day tasks.
</p>
</li>
<li>
<p>
Data Scientists to be proficient in Python Programming to build the models.
</p>
</li>
</ul>
<div class="admonition note">
<p class="admonition-title">
Note
</p>
<p>
This course only covers the fundamentals of Python, which are useful for almost all hands-on technical roles. It does not include role-specific libraries.
</p>
</div>
</div>
<div class="section" id="prerequisites">
<h2>
Prerequisites
<a class="headerlink" href="#prerequisites" title="Permalink to this headline">
¶
</a>
</h2>
<p>
Here are the prerequisites before signing up for the course.
</p>
<div class="sphinx-bs container pb-4 docutils">
<div class="row docutils">
<div class="d-flex col-lg-6 col-md-6 col-sm-6 col-xs-12 p-2 docutils">
<div class="card w-100 shadow docutils">
<div class="card-body docutils">
<p class="card-text">
<strong>
Logistics
</strong>
</p>
<ul class="simple">
<li>
<p class="card-text">
Computer with decent configuration
</p>
<ul>
<li>
<p class="card-text">
At least 4 GB RAM
</p>
</li>
<li>
<p class="card-text">
8 GB RAM is highly desired
</p>
</li>
</ul>
</li>
<li>
<p class="card-text">
Chrome Browser
</p>
</li>
<li>
<p class="card-text">
High Speed Internet
</p>
</li>
</ul>
</div>
</div>
</div>
<div class="d-flex col-lg-6 col-md-6 col-sm-6 col-xs-12 p-2 docutils">
<div class="card w-100 shadow docutils">
<div class="card-body docutils">
<p class="card-text">
<strong>
Desired Skills
</strong>
</p>
<ul class="simple">
<li>
<p class="card-text">
Engineering or Science Degree
</p>
</li>
<li>
<p class="card-text">
Ability to use computer
</p>
</li>
<li>
<p class="card-text">
Knowledge or working experience with databases is highly desired
</p>
</li>
</ul>
</div>
</div>
</div>
</div>
</div>
</div>
<div class="section" id="key-objectives">
<h2>
Key Objectives
<a class="headerlink" href="#key-objectives" title="Permalink to this headline">
¶
</a>
</h2>
<p>
The course is designed for professionals to achieve these key objectives related to programming using Python.
</p>
<div class="admonition attention">
<p class="admonition-title">
Attention
</p>
<p>
This course is primarily designed to gain key database skills for application developers, data engineers, testers, business analysts etc.
</p>
</div>
</div>
<div class="section" id="training-approach">
<h2>
Training Approach
<a class="headerlink" href="#training-approach" title="Permalink to this headline">
¶
</a>
</h2>
<p>
Here are the details related to the training approach.
</p>
<ul class="simple">
<li>
<p>
It is self paced with reference material, code snippets and videos.
</p>
</li>
<li>
<p>
One can either use the environment provided by us or set up their own environment using Docker.
</p>
</li>
<li>
<p>
Modules will be published as and when they are ready. We recommend completing
<strong>
2 modules every week
</strong>
by spending
<strong>
4 to 5 hours per week
</strong>
.
</p>
</li>
<li>
<p>
It is highly recommended to complete the exercises at the end to ensure that you meet all the key objectives for each module.
</p>
</li>
<li>
<p>
Support will be provided either through chat or email.
</p>
</li>
<li>
<p>
For those who signed up, we will have weekly monitoring and review sessions to keep track of the progress.
</p>
</li>
</ul>
<div class="admonition attention">
<p class="admonition-title">
Attention
</p>
<p>
Spend 4 to 5 hours per week for up to 8 weeks and complete all the exercises to get the best out of this course.
</p>
</div>
</div>
<div class="section" id="self-evaluation">
<h2>
Self Evaluation
<a class="headerlink" href="#self-evaluation" title="Permalink to this headline">
¶
</a>
</h2>
<p>
The course is designed in such a way that one can self evaluate through the course and confirm whether the skills are acquired.
</p>
<ul class="simple">
<li>
<p>
Here is the approach we recommend for taking this course.
</p>
<ul>
<li>
<p>
Go through the consolidated exercises and see if you are able to solve the problems or not.
</p>
</li>
<li>
<p>
Make sure to follow the order we have defined as part of the course.
</p>
</li>
<li>
<p>
After each and every section or module, make sure to solve the exercises. We have provided enough information to validate the output of your queries.
</p>
</li>
<li>
<p>
After completing the course, try to solve the exercises using the consolidated list.
</p>
</li>
<li>
<p>
Keep in mind that you will be reviewing the same exercises before the course, during the course as well as at the end of the course.
</p>
</li>
</ul>
</li>
<li>
<p>
By the end of the course, if you are able to solve the problems, then you can conclude that you have mastered the key skill, Python.
</p>
</li>
</ul>
<div class="toctree-wrapper compound">
</div>
</div>
</div>
<script type="text/x-thebe-config">
{
requestKernel: true,
binderOptions: {
repo: "binder-examples/jupyter-stacks-datascience",
ref: "master",
},
codeMirrorConfig: {
theme: "abcdef",
mode: "python"
},
kernelOptions: {
kernelName: "python3",
path: "./."
},
predefinedOutput: true
}
</script>
<script>
kernelName = 'python3'
</script>
</div>
</div>
</div>
<div class="prev-next-bottom">
<a class="right-next" href="01_overview_of_windows_os/01_overview_of_windows_os.html" id="next-link" title="next page">
Overview of Windows Operating System
</a>
</div>
<footer class="footer mt-5 mt-md-0">
<div class="container">
<p>
By Durga Gadiraju
<br/>
© Copyright ITVersity, Inc.
<br/>
</p>
</div>
</footer>
</main>
</div>
</div>
<script src="_static/js/index.3da636dd464baa7582d2.js">
</script>
<!-- Google Analytics -->
<script>
window.ga=window.ga||function(){(ga.q=ga.q||[]).push(arguments)};ga.l=+new Date;
ga('create', 'UA-80990145-12', 'auto');
ga('set', 'anonymizeIp', true);
ga('send', 'pageview');
</script>
<script async="" src="https://www.google-analytics.com/analytics.js">
</script>
<!-- End Google Analytics -->
</body>
</html>
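Once we have HTML like the output above, we can pass it to BeautifulSoup and pull out specific tags. Here is a minimal sketch that extracts the table of contents entries. It assumes the `bs4` package is installed and uses a small hard-coded excerpt of the markup (the `sample_html` variable is only for illustration) instead of calling the website, so that it runs without network access.

```python
from bs4 import BeautifulSoup

# Hard-coded excerpt of the table of contents markup from the page.
# With a live page we would pass python_page.content instead (assumption).
sample_html = """
<nav id="bd-toc-nav">
 <ul class="nav section-nav flex-column">
  <li class="toc-h2 nav-item toc-entry">
   <a class="reference internal nav-link" href="#about-python">
    About Python
   </a>
  </li>
  <li class="toc-h2 nav-item toc-entry">
   <a class="reference internal nav-link" href="#course-details">
    Course Details
   </a>
  </li>
 </ul>
</nav>
"""

# Parse the HTML using Python's built-in parser.
soup = BeautifulSoup(sample_html, "html.parser")

# Locate the navigation block by its id, then collect every link inside it.
toc_nav = soup.find("nav", id="bd-toc-nav")
toc_entries = [
    (link.get_text(strip=True), link["href"])
    for link in toc_nav.find_all("a", class_="nav-link")
]

for title, href in toc_entries:
    print(title, href)
```

The same `find` and `find_all` calls should work on the full page content, as long as the page still uses the `bd-toc-nav` id and the `nav-link` class shown in the output above.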