Review of Python Courses (Part 22)
Posted by Mark on January 29, 2021 at 07:31 | Last modified: February 9, 2021 13:29

In Part 21, I summarized my Datacamp courses 62-64. Today I will continue with the next three.
As a reminder, I introduced you to my recent work learning Python here.
My course #65 was Reshaping Data with pandas. This course covers:
- Wide and long formats
- Reshaping using the pivot method (see the sketch after this list)
- Pivot tables
- Reshaping with melt
- Wide to long function
- Working with string columns
- Stacking dataframes
- Unstacking dataframes
- Working with multiple levels
- Handling missing data
- Reshaping and combining data
- Transforming a list-like column
- Reading nested data into a dataframe (from pandas import json_normalize)
- Dealing with nested data columns
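
Below is a minimal sketch of a few of these reshaping tools on a made-up DataFrame; the column names (patient, weight_2019, weight_2020) and the nested records are my own illustrations, not the course's actual data.

```python
import pandas as pd

# Wide format: one row per patient, one weight column per year (toy data).
wide = pd.DataFrame({
    "patient": ["A", "B"],
    "weight_2019": [70, 82],
    "weight_2020": [68, 85],
})

# melt: gather the year columns into rows (wide -> long).
long = wide.melt(id_vars="patient", var_name="measure", value_name="weight")

# wide_to_long: same idea, but parses the year suffix out of the column names.
tidy = pd.wide_to_long(wide, stubnames="weight", i="patient", j="year", sep="_")

# pivot: spread the long table back out to wide format.
back_to_wide = long.pivot(index="patient", columns="measure", values="weight")

# json_normalize: flatten nested records into ordinary columns.
nested = [{"id": 1, "stats": {"min": 1, "max": 9}}]
flat = pd.json_normalize(nested)  # columns: id, stats.min, stats.max

print(long, tidy, back_to_wide, flat, sep="\n\n")
```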
My course #66 was Building Data Engineering Pipelines in Python. For some reason, these data engineering courses did not sit well with me, and much of this sailed over my head. This course covers:
- Components of a data platform
- Introduction to data ingestion with Singer
- Running an ingestion pipeline with Singer
- Basic introduction to PySpark (from pyspark.sql import SparkSession; see the first sketch after this list)
- Cleaning data
- Transforming data with Spark
- Packaging your application
- On the importance of tests
- Writing unit tests for PySpark
- Continuous testing
- Modern day workflow management
- Building a data pipeline with Airflow (from airflow.operators.bash_operator import BashOperator; see the second sketch after this list)
- Deploying Airflow (from airflow.models import DagBag)
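
For the PySpark portion, here is a minimal cleaning-and-transforming sketch; the toy ratings data, column names, and aggregation are my own illustration, not the course's pipeline.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("pipeline_sketch").getOrCreate()

# Toy ratings data standing in for the course's ingested files.
ratings = spark.createDataFrame(
    [("course_a", 4), ("course_b", None), ("course_a", 5)],
    ["course_id", "rating"],
)

# Clean: drop rows with missing ratings; transform: average rating per course.
summary = (
    ratings.dropna(subset=["rating"])
    .groupBy("course_id")
    .agg(F.avg("rating").alias("avg_rating"))
)
summary.show()
```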
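And a tiny Airflow sketch using the same 1.x-style import the course mentions; the dag_id, schedule, and bash commands are placeholders of mine.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

dag = DAG(
    dag_id="example_pipeline",        # placeholder name
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
)

# Two stand-in tasks chained so "clean" runs only after "ingest".
ingest = BashOperator(task_id="ingest", bash_command="echo ingest", dag=dag)
clean = BashOperator(task_id="clean", bash_command="echo clean", dag=dag)
ingest >> clean
```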
My course #67 was Importing and Managing Financial Data in Python. This course covers:
- Reading, inspecting, and cleaning data from CSV (parse_dates explained; see the sketch after this list)
- Read data from Excel worksheets
- Combine data from multiple worksheets (importing market data from multiple Excel files)
- The DataReader: access financial data online (from pandas_datareader.data import DataReader)
- Economic data from the Federal Reserve
- Select stocks and get data from Google Finance
- Get several stocks and manage a MultiIndex
- Summarize your data with descriptive stats
- Describe the distribution of your data with quantiles (feeding np.arange() into .describe() for constant-step percentiles)
- Visualize the distribution of your data [ax = sns.distplot(df)]
- Summarize categorical variables
- Aggregate your data by category
- Summary statistics by category with seaborn [sns.countplot()]
- Distributions by category with seaborn [sns.boxplot(), sns.swarmplot()]
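
A short sketch of a few of these techniques follows; the file name, column names, and toy prices are placeholders rather than the course's datasets, and the FRED call is left commented out because it needs a network connection.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# parse_dates converts the listed columns to datetime64 on import
# (file name is a placeholder):
# prices = pd.read_csv("stock_data.csv", parse_dates=["date"], index_col="date")

# DataReader pulls series from online sources such as FRED, e.g.:
# from pandas_datareader.data import DataReader
# gdp = DataReader("GDP", "fred", start="2010-01-01")

# Toy data so the rest of the sketch runs offline.
df = pd.DataFrame({
    "sector": ["Tech", "Tech", "Energy", "Energy", "Tech", "Energy"],
    "price": [120.5, 98.0, 45.2, 51.7, 134.9, 39.8],
})

# Constant-step percentiles: feed np.arange() deciles into .describe().
print(df["price"].describe(percentiles=np.arange(0.1, 1.0, 0.1)))

# Counts per category, then distributions by category.
sns.countplot(x="sector", data=df)
plt.show()
sns.boxplot(x="sector", y="price", data=df)
plt.show()
```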
I will review more courses next time.