
Review of Python Courses (Part 11)

Posted by Mark on December 21, 2020 at 07:41 | Last modified: February 1, 2021 11:34

In Part 10, I summarized my DataCamp courses 28-30. Today I will continue with the next four.

As a reminder, I introduced my recent work learning Python in an earlier post.

My course #31 was Customer Analytics and A/B Testing in Python. This course covers:

What is A/B testing?
Identifying and understanding KPIs
Exploratory analysis of KPIs
Calculating KPIs—a practical example
Working with time series data in pandas
Creating time series graphs with matplotlib
Understanding and visualizing trends in customer data
Events and releases
Introduction to A/B testing
Initial A/B test design
Preparing to run an A/B test
Calculating sample size
Analyzing the A/B test results
Understanding statistical significance (get_pvalue, get_ci)
Interpreting your test results
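
To make the significance topics above a bit more concrete, here is a minimal sketch of a two-proportion z-test and a confidence interval for the difference in conversion rates. This is not the course's code: get_pvalue and get_ci are helpers named in the course, and I am not reproducing them here. The function names, the pooled-variance z-test, and the default 1.96 critical value (roughly 95% confidence) are my own assumptions for illustration, using only the standard library.

```python
import math


def normal_cdf(z):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_proportion_pvalue(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a difference in conversion rates.

    Hypothetical helper: uses a pooled two-proportion z-test.
    conv_* are conversion counts, n_* are group sizes.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_b - p_a) / se
    return 2.0 * (1.0 - normal_cdf(abs(z)))


def difference_ci(conv_a, n_a, conv_b, n_b, z_crit=1.96):
    """Approximate confidence interval for (rate_b - rate_a).

    Hypothetical helper: uses the unpooled standard error; z_crit=1.96
    corresponds to roughly 95% confidence under a normal approximation.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    se = math.sqrt(p_a * (1.0 - p_a) / n_a + p_b * (1.0 - p_b) / n_b)
    diff = p_b - p_a
    return diff - z_crit * se, diff + z_crit * se


# Made-up example: control converts 100 of 1000, variant converts 150 of 1000.
p_value = two_proportion_pvalue(100, 1000, 150, 1000)
low, high = difference_ci(100, 1000, 150, 1000)
print(p_value, low, high)
```

With these made-up counts the p-value falls well below 0.05 and the interval excludes zero, which is the kind of result the "Interpreting your test results" topic is about.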

My course #32 was Machine Learning with Tree-Based Models in Python. This course covers:

Decision-tree for classification (from sklearn.tree import DecisionTreeClassifier)
Classification-tree learning
Decision-tree for regression
Generalization error (bias-variance tradeoff)
Diagnosing bias and variance problems
Ensemble learning
Bagging (from sklearn.ensemble import BaggingClassifier)
Out of bag evaluation
Random forests
AdaBoost (from sklearn.ensemble import AdaBoostClassifier)
Gradient boosting (from sklearn.ensemble import GradientBoostingRegressor)
Stochastic gradient boosting
Tuning a CART’s hyperparameters
Tuning an RF’s hyperparameters
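
As a rough illustration of what classification-tree learning does at a single node, here is a toy sketch that picks the threshold on one numeric feature minimizing the weighted Gini impurity. It is not scikit-learn's implementation (DecisionTreeClassifier handles many features, recursion, and stopping rules); the function names and the tiny dataset are assumptions of mine, and only the standard library is used.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_threshold(values, labels):
    """Return (threshold, weighted_impurity) for the best split x <= threshold.

    Hypothetical helper: tries every distinct value except the largest
    and keeps the split with the lowest size-weighted Gini impurity.
    Returns (None, parent_impurity) if no split improves on the parent.
    """
    best = (None, gini(labels))
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best[1]:
            best = (t, weighted)
    return best


# Made-up data: the classes separate cleanly between 2 and 3.
print(best_threshold([1, 2, 3, 4], ["a", "a", "b", "b"]))
```

A full tree repeats this search recursively on each side of the chosen split, and the ensemble methods in the list (bagging, random forests, boosting) combine many such trees.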

My course #33 was Introduction to PySpark. This is a data engineering course, a field I found myself not very enthusiastic about. This course covers:

What is Spark, anyway?
Using Spark in Python
Using dataframes
Joining
Machine learning pipelines
Data types
Strings and factors

My course #34 was Cleaning Data with PySpark. This course covers:

Intro to data cleaning with Apache Spark
Immutability and lazy processing
Understanding Parquet
Dataframe column operations
Conditional dataframe column operations
User defined functions
Partitioning and lazy processing
Caching
Improve import performance
Cluster sizing tips
Performance improvements
Introduction to data pipelines
Data handling techniques
Data validation
Final analysis and delivery

I will review more classes next time.

