
Posted by Mark on January 19, 2021 at 07:12 | Last modified: February 6, 2021 04:54

In Part 18, I summarized my DataCamp courses 53-55. Today I will continue with the next three.

As a reminder, I introduced you to my recent work learning Python here.

My course #56 was Writing Efficient Code with pandas. This course covers:

The need for efficient coding (timing with time.time(); list comprehensions are faster than for loops)
Locate rows: .iloc[] (generally faster for rows) and .loc[] (generally faster for columns)
Select random rows (built-in sample() function faster than numpy random integer generator)
Replace scalar values using .replace() (much faster than using .loc[] to find values and reassigning them)
Replace values using lists (.replace() faster than using .loc[])
Replace values using dictionaries (faster than using lists)
Looping through the .iterrows() function [for loop using range() is faster than the smarter/cleaner/optimized .iterrows()]
Looping through the .apply() function (.apply() is faster when iterating along rows, while native pandas .sum() is faster along columns)
Vectorization over pandas series [vectorization method .apply() works faster than .iterrows()]
Vectorization using NumPy arrays using .values (summing arrays is faster than summing series)
Data transformation using .groupby().transform (.transform() cleaner and much faster than native Python code)
Missing value imputation using .transform() (.transform() much faster than native Python code)
Data filtration using the .filter() function (.groupby().filter() faster than list comprehension + for loop)
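
To make a few of these topics concrete, here is a minimal sketch of dictionary-based .replace(), group-mean imputation with .groupby().transform(), .groupby().filter(), and a row-wise sum done three ways. The DataFrame and its column names are invented for illustration and are not from the course; the sketch shows the calls only and makes no timing claims.

```python
import numpy as np
import pandas as pd

# Toy data invented for illustration; not from the course.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "c"],
    "label": ["yes", "no", "yes", "no", "yes"],
    "x": [1.0, np.nan, 3.0, 5.0, 7.0],
    "y": [10.0, 20.0, 30.0, 40.0, 50.0],
})

# Replace values with a dictionary instead of locating them with .loc[].
df["label"] = df["label"].replace({"yes": "Y", "no": "N"})

# Impute missing values with the group mean via .groupby().transform().
df["x"] = df["x"].fillna(df.groupby("group")["x"].transform("mean"))

# Keep only groups with more than one row via .groupby().filter().
multi = df.groupby("group").filter(lambda g: len(g) > 1)

# Row-wise sum three ways: .apply(), pandas Series, and NumPy arrays.
sum_apply = df[["x", "y"]].apply(lambda row: row["x"] + row["y"], axis=1)
sum_series = df["x"] + df["y"]
sum_numpy = df["x"].values + df["y"].values
```

Wrapping any of these lines between two time.time() calls on a larger DataFrame is one way to compare the approaches on your own data.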

My course #57 was Credit Risk Modeling in Python. This course covers:

Understanding credit risk
Outliers in credit data
Risk with missing data in loan data (finding, counting, and replacing missing data)
Logistic regression for probability of default
Predicting the probability of default
Credit model performance
Model discrimination and impact
Gradient boosted trees with XGBoost
Column selection for credit risk
Cross validation for credit models
Class imbalance in loan data
Model evaluation and implementation (from sklearn.calibration import calibration_curve)
Credit acceptance rates
Credit strategy and maximum expected loss
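
As a rough illustration of how several of these pieces fit together, the sketch below fits a logistic regression for probability of default, checks calibration with calibration_curve, and applies an acceptance rate as a quantile cutoff on predicted probabilities. The data are synthetic and the 85% acceptance rate is an arbitrary assumption of mine, not a figure from the course.

```python
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic "loan" data, invented for illustration: two features, one default flag.
n = 2000
X = rng.normal(size=(n, 2))
logit = -1.5 + 1.2 * X[:, 0] - 0.8 * X[:, 1]
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0, stratify=y
)

# Logistic regression for probability of default (column 1 is the default class).
clf = LogisticRegression().fit(X_train, y_train)
prob_default = clf.predict_proba(X_test)[:, 1]

# Calibration: observed default fraction vs. mean predicted probability per bin.
frac_pos, mean_pred = calibration_curve(y_test, prob_default, n_bins=10)

# Acceptance rate (assumed 85%): accept the loans with the lowest predicted PD.
acceptance_rate = 0.85
threshold = np.quantile(prob_default, acceptance_rate)
accepted = prob_default <= threshold

# Bad rate: share of accepted loans that actually defaulted.
bad_rate = y_test[accepted].mean()
```

Raising or lowering acceptance_rate moves the threshold and changes bad_rate, which is the trade-off a credit strategy has to weigh.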

My course #58 was Analyzing IoT Data in Python. This course covers:

I will review more classes next time.
