Review of Python Courses (Part 32)
Posted by Mark on March 4, 2021 at 07:48 | Last modified: February 19, 2021 10:46
In Part 31, I summarized my DataCamp courses 92-94. Today I will continue with the next three.
As a reminder, I introduced you to my recent work learning Python here.
My course #95 was Dimensionality Reduction in Python. This course covers:
- Introduction
- Feature selection vs. feature extraction
- t-SNE visualization of high-dimensional data (from sklearn.manifold import TSNE)
- The curse of dimensionality (from sklearn.model_selection import train_test_split; from sklearn.svm import SVC)
- Features with missing values or little variance (from sklearn.feature_selection import VarianceThreshold)
- Pairwise correlation (masking the redundant half of the correlation matrix)
- Removing highly correlated features
- Selecting features for model performance (from sklearn.feature_selection import RFE)
- Tree-based feature selection (from sklearn.ensemble import RandomForestClassifier)
- Regularized linear regression (from sklearn.linear_model import Lasso)
- Combining feature selectors (from sklearn.linear_model import LassoCV)
- Feature extraction
- Principal component analysis (from sklearn.decomposition import PCA)
- PCA applications (from sklearn.pipeline import Pipeline) [see the sketch after this list]
- Principal component selection
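To make the feature-extraction portion concrete, here is a minimal sketch of scaling followed by PCA inside a scikit-learn Pipeline. The array X is a small synthetic dataset I made up for illustration, not anything from the course:

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# Hypothetical high-dimensional data: 100 samples, 10 correlated features
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 3))
X = np.hstack([base, base @ rng.normal(size=(3, 7)) + 0.1 * rng.normal(size=(100, 7))])

# Scale first so PCA is not dominated by large-variance features,
# then keep enough components to explain 90% of the variance
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("reducer", PCA(n_components=0.9)),
])

X_reduced = pipe.fit_transform(X)
print(X_reduced.shape)  # far fewer columns than X
print(pipe.named_steps["reducer"].explained_variance_ratio_)
```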
My course #96 was Writing Efficient Python Code. Topics covered in this course include:
- Defining efficient
- Building with built-ins
- The power of NumPy arrays
- Examining runtime (%timeit)
- Code profiling for runtime [pip install line_profiler; %load_ext line_profiler; %lprun -f foo foo(args)]
- Code profiling for memory usage (import sys, pip install memory_profiler)
- Efficiently combining, counting, and iterating (from collections import Counter; from itertools import combinations)
- Set theory
- Eliminating loops
- Writing better loops
- Intro to pandas dataframe iteration
- Another iterator method: .itertuples() [faster than .iterrows()]
- Pandas alternative to looping (use .apply() on an entire dataframe)
- Optimal pandas iterating (use .values to get array rather than series) [see the sketch after this list]
- Final tips
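To illustrate the pandas iteration points above, here is a rough sketch comparing .itertuples(), .apply(), and a fully vectorized .values calculation. The DataFrame and its column names are hypothetical, not the course's data:

```python
import pandas as pd

# Hypothetical win/loss counts
df = pd.DataFrame({"wins": [81, 92, 103], "losses": [81, 70, 59]})

# Row-by-row with .itertuples() (faster than .iterrows(), but still a Python loop)
ratios_loop = [row.wins / (row.wins + row.losses) for row in df.itertuples()]

# Row-wise with .apply()
ratios_apply = df.apply(lambda row: row["wins"] / (row["wins"] + row["losses"]), axis=1)

# Vectorized on the underlying NumPy arrays -- usually the fastest option
ratios_vec = df["wins"].values / (df["wins"].values + df["losses"].values)

print(ratios_loop, list(ratios_apply), list(ratios_vec), sep="\n")
```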
My course #97 was Machine Learning for Finance in Python. This course covers:
- Predict the future (e.g. stock price changes) with machine learning
- Data transforms, features, and targets (import talib)
- Linear modeling with financial data
- Engineering features (from sklearn.model_selection import ParameterGrid)
- Decision trees (from sklearn.tree import DecisionTreeRegressor)
- Random forests (from sklearn.ensemble import RandomForestRegressor) [see the sketch after this list]
- Feature importances and gradient boosting [np.argsort(); from sklearn.ensemble import GradientBoostingRegressor]
- Scaling data and KNN regression (from sklearn.preprocessing import scale)
- Neural networks (from keras.models import Sequential; from keras.layers import Dense)
- Custom loss functions (import tensorflow as tf; import keras.losses)
- Overfitting and ensembling (from keras.layers import Dropout; from sklearn.metrics import r2_score)
- Modern Portfolio Theory (MPT) and efficient frontiers (review this complex code involving covariance)
- Sharpe Ratios, features, and targets
- Machine learning for MPT
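As a hedged sketch of the features-and-targets workflow (not the course's exact code), the example below builds lagged-return features from a made-up price series and fits a random forest to predict the next day's return:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Hypothetical daily closing prices (random walk), just to have something to fit
rng = np.random.default_rng(1)
prices = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 500))))
returns = prices.pct_change()

# Features: the previous 5 daily returns; target: the next day's return
features = pd.concat({f"lag_{i}": returns.shift(i) for i in range(1, 6)}, axis=1)
target = returns.shift(-1)
data = pd.concat([features, target.rename("target")], axis=1).dropna()

# Chronological train/test split -- no shuffling with time-series data
split = int(len(data) * 0.8)
train, test = data.iloc[:split], data.iloc[split:]

model = RandomForestRegressor(n_estimators=200, max_depth=3, random_state=0)
model.fit(train.drop(columns="target"), train["target"])
print("test R^2:", model.score(test.drop(columns="target"), test["target"]))
```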
I will review more courses next time.