Review of Python Courses (Part 6)
Posted by Mark on December 4, 2020 at 06:53 | Last modified: January 21, 2021 13:14

In Part 5, I summarized my DataCamp courses 13-15. Today I will continue with the next three.
As a reminder, I introduced you to my recent work learning Python here.
My course #16 was Introduction to Data Science in Python. This course covers:
- Creating variables
- What is a function?
- What is pandas?
- Selecting columns
- Selecting rows with logic (see the sketch after this list)
- Creating line plots
- Adding labels and legends
- Adding some style (line color, width, style, markers, template)
- Making a scatter plot (marker transparency)
- Making a bar chart (horizontal, error bars, stacked)
- Making a histogram (bins, range, normalizing)
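The pandas and matplotlib basics above fit together in just a few lines. Below is a minimal sketch, assuming a hypothetical sales.csv file with "month", "revenue", and "region" columns; none of these names come from the course itself.

```python
# Minimal sketch of the pandas/matplotlib basics listed above.
# The file and column names (sales.csv, "month", "revenue", "region")
# are hypothetical placeholders, not the course's data.
import pandas as pd
import matplotlib.pyplot as plt

sales = pd.read_csv("sales.csv")             # hypothetical file

# Selecting columns and selecting rows with logic
revenue = sales["revenue"]
east = sales[sales["region"] == "East"]
west = sales[sales["region"] == "West"]

# Line plots with labels, a legend, and some style
plt.plot(east["month"], east["revenue"], color="blue", linewidth=2,
         linestyle="--", marker="o", label="East")
plt.plot(west["month"], west["revenue"], color="green", label="West")
plt.xlabel("Month")
plt.ylabel("Revenue")
plt.legend()
plt.show()

# Scatter plot with marker transparency
plt.scatter(sales["month"], sales["revenue"], alpha=0.3)
plt.show()
```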
My course #17 was Joining Data with Pandas. This course covers:
- Inner join (changing df values with the .loc accessor; see the sketch after this list)
- One-to-many relationships
- Merging multiple DataFrames
- Left join (counting the number of rows with missing data in a column)
- Right and outer joins
- Merging a table to itself (i.e. self join)
- Merging on indexes
- Filtering joins (semi-joins, anti-joins)
- Concatenating DataFrames vertically [.append()]
- Using verify_integrity=True to flag accidental duplicates, while the validate argument helps identify the relationship type
- Using merge_ordered() (for ordered/time-series data and to fill in missing values)
- Using merge_asof() (matches on the nearest value rather than requiring equal values)
- Selecting data with .query()
- Reshaping data with .melt()
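To tie several of these together, here is a minimal sketch of inner and left joins, filtering joins, merge_ordered()/merge_asof(), .query(), and .melt(). The DataFrames, column names, and values are my own made-up examples, not the course's datasets.

```python
# Minimal sketch of a few merge operations listed above, on made-up data.
import pandas as pd

employees = pd.DataFrame({"emp_id": [1, 2, 3],
                          "name": ["Ann", "Bob", "Cam"],
                          "dept_id": [10, 20, 99]})
depts = pd.DataFrame({"dept_id": [10, 20],
                      "dept": ["Sales", "IT"]})

# Inner join keeps only matching rows; a left join keeps every employee
inner = employees.merge(depts, on="dept_id")
left = employees.merge(depts, on="dept_id", how="left")
print(left["dept"].isna().sum())     # count rows left with missing data

# Filtering joins: semi-join keeps matches, anti-join keeps non-matches
semi = employees[employees["dept_id"].isin(depts["dept_id"])]
anti = employees[~employees["dept_id"].isin(depts["dept_id"])]

# merge_ordered() and merge_asof() are top-level pandas functions for
# ordered/time-series data (merge_asof needs sorted key columns)
prices_a = pd.DataFrame({"date": pd.to_datetime(["2020-01-01", "2020-01-03"]),
                         "a": [100, 102]})
prices_b = pd.DataFrame({"date": pd.to_datetime(["2020-01-02", "2020-01-03"]),
                         "b": [50, 51]})
ordered = pd.merge_ordered(prices_a, prices_b, on="date", fill_method="ffill")
nearest = pd.merge_asof(prices_a, prices_b, on="date")

# Selecting data with .query() and reshaping with .melt()
it_staff = inner.query('dept == "IT"')
long_form = employees.melt(id_vars="emp_id", var_name="field",
                           value_name="value")
```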
Introduction to Linear Modeling in Python was my eighteenth course. This covers:
- Introductory concepts about models (interpolation, extrapolation)
- Visualizing linear relationships [object-oriented (OOP) approach to matplotlib]
- Quantifying linear relationships (covariance, correlation, normalization)
- What makes a model linear (Taylor series, overfitting, defining a function to plot a graph)
- Interpreting slope and intercept
- Model optimization (RSS: sum of squared residuals)
- Least-squares optimization (with NumPy, SciPy, and statsmodels; see the sketch after this list)
- Modeling real data
- The limits of prediction
- Goodness of fit (deviations, residuals, and R-squared in code)
- Standard error (RMSE measures the spread of the residuals, whereas SE measures the uncertainty in the model parameters)
- Inferential statistics concepts
- Model estimation and likelihood
- Model uncertainty and sample distributions (bootstrap in code)
- Model errors and randomness
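As a rough illustration of the least-squares, goodness-of-fit, and bootstrap ideas above, here is a minimal NumPy sketch on synthetic data. The numbers are invented for illustration only and are not from the course.

```python
# Minimal sketch: least-squares fit, RSS, RMSE, R-squared, and a bootstrap
# estimate of slope uncertainty, all on synthetic (made-up) data.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.5 * x + 1.0 + rng.normal(0, 1.5, size=x.size)   # "true" slope 2.5, intercept 1.0

# Least-squares fit: slope and intercept
slope, intercept = np.polyfit(x, y, deg=1)
y_model = slope * x + intercept

# Residuals, RSS, RMSE, and R-squared
residuals = y - y_model
rss = np.sum(residuals ** 2)                 # sum of squared residuals
rmse = np.sqrt(np.mean(residuals ** 2))      # spread of the residuals
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - rss / ss_tot
print(f"slope={slope:.2f}, intercept={intercept:.2f}, "
      f"RMSE={rmse:.2f}, R^2={r_squared:.3f}")

# Bootstrap resampling to gauge uncertainty in the fitted slope
boot_slopes = []
for _ in range(1000):
    idx = rng.integers(0, x.size, size=x.size)
    b_slope, _ = np.polyfit(x[idx], y[idx], deg=1)
    boot_slopes.append(b_slope)
print("bootstrap standard error of slope:", np.std(boot_slopes))
```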