Review of Python Courses (Part 15)
Posted by Mark on January 4, 2021 at 07:23 | Last modified: February 3, 2021 13:57
In Part 14, I summarized my DataCamp courses 41-43. Today I will continue with the next three.
As a reminder, I introduced you to my recent work learning Python here.
My course #44 was Python for Spreadsheet Users. This course covers:
- Welcome to Python
- Dataframes and their methods
- Filtering rows and creating columns
- Grouping and summing: the beginner’s pivot table
- Grouping by multiple columns
- More ways to condense information (using .groupby().head() to get top rows of each group)
- Working with multiple sheets [pd.ExcelFile(), .sheet_names attribute, .parse()]
- Preparing to put tables together
- Merging: the VLOOKUP of Python
- How visualization works in Python
- Building up the barplot
- The power of hue
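To make a few of these topics concrete, here is a minimal pandas sketch of filtering, grouping and summing, taking the top row of each group, and merging. The tables, column names, and values are invented for illustration and are not taken from the course.

```python
import pandas as pd

# Hypothetical data, made up for this example (not from the course).
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West", "West"],
    "product_id": [1, 2, 1, 2, 3],
    "units": [10, 5, 7, 3, 8],
})
products = pd.DataFrame({
    "product_id": [1, 2, 3],
    "product_name": ["Widget", "Gadget", "Gizmo"],
})

# Filtering rows and creating a column.
big_sales = sales[sales["units"] >= 5].copy()
big_sales["double_units"] = big_sales["units"] * 2

# Grouping and summing: roughly what a simple pivot table does.
units_by_region = sales.groupby("region")["units"].sum()

# Top row of each group: sort first, then keep the first row per group.
top_per_region = (
    sales.sort_values("units", ascending=False)
    .groupby("region")
    .head(1)
)

# Merging: look up each product_id in the products table, like VLOOKUP.
merged = sales.merge(products, on="product_id", how="left")

print(units_by_region)
print(top_per_region)
print(merged)
```

A left merge keeps every row of the sales table even when a product has no match, which is closest to how VLOOKUP behaves in a spreadsheet.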
My course #45 was Preprocessing for Machine Learning in Python. This course covers:
- Preprocessing data for machine learning (count number of missing values in column)
- Working with data types (converting column types)
- Training and test sets (stratified sampling with train_test_split)
- Standardizing data and log normalization
- Scaling data (from sklearn.preprocessing import StandardScaler)
- Standardized data and modeling (from sklearn.neighbors import KNeighborsClassifier)
- Feature engineering
- Encoding categorical variables [lambda function, from sklearn.preprocessing import LabelEncoder, pd.get_dummies()]
- Engineering numerical features
- Engineering features from text (from sklearn.feature_extraction.text import TfidfVectorizer)
- Feature selection
- Removing redundant features
- Selecting features using text vectors
- Dimensionality reduction (from sklearn.decomposition import PCA)
- UFOs and preprocessing
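Several of these steps chain together naturally. The sketch below runs through counting missing values, log normalization, one-hot encoding, a stratified train/test split, standardization, and a k-nearest-neighbors fit. The dataset is synthetic and the column names are hypothetical, so treat it as an illustration under those assumptions rather than the course's own exercise.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic data, invented for this example (not the course dataset).
rng = np.random.default_rng(0)
n = 100
df = pd.DataFrame({
    "small_feature": rng.normal(0.0, 1.0, n),
    "large_feature": rng.lognormal(5.0, 1.0, n),
    "color": rng.choice(["red", "green", "blue"], n),
    "label": np.repeat([0, 1], n // 2),
})
# Blank out every tenth value so there is something missing to count.
df.loc[df.index % 10 == 0, "small_feature"] = np.nan

# Count missing values per column, then drop the incomplete rows.
missing_counts = df.isnull().sum()
clean = df.dropna().copy()

# Log normalization for the high-variance column.
clean["log_large"] = np.log(clean["large_feature"])

# One-hot encode the categorical column.
dummies = pd.get_dummies(clean["color"], dtype=float)
X = pd.concat([clean[["small_feature", "log_large"]], dummies], axis=1)
y = clean["label"]

# Stratified split keeps the class proportions similar in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Fit the scaler on the training set only, then apply it to the test set.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train_scaled, y_train)
accuracy = knn.score(X_test_scaled, y_test)
print(missing_counts)
print(accuracy)
```

Fitting the scaler on the training set alone avoids leaking information from the test set into the model. Because the labels here are unrelated to the features, the accuracy itself is not meaningful.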
My course #46 was Cluster Analysis in Python. This course covers:
- Unsupervised learning: basics
- Basics of cluster analysis (from scipy.cluster.hierarchy import linkage, fcluster; from scipy.cluster.vq import kmeans, vq)
- Data preparation for cluster analysis (from scipy.cluster.vq import whiten)
- Basics of hierarchical clustering
- Visualize clusters
- How many clusters? (from scipy.cluster.hierarchy import dendrogram)
- Limitations of hierarchical clustering
- Basics of k-means clustering
- How many clusters?
- Limitations of k-means clustering
- Dominant colors in images (import matplotlib.image as img)
- Document clustering
- Clustering with multiple features
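The SciPy functions named above fit together roughly as follows. This is a minimal sketch on two synthetic, well-separated blobs of points (invented for illustration), showing whitening followed by hierarchical and k-means clustering.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.cluster.vq import kmeans, vq, whiten

# Two synthetic, well-separated blobs (made up for this example).
rng = np.random.default_rng(42)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(20, 2))
blob_b = rng.normal(loc=[10.0, 10.0], scale=0.5, size=(20, 2))
points = np.vstack([blob_a, blob_b])

# Data preparation: rescale each feature to unit standard deviation.
scaled = whiten(points)

# Hierarchical clustering: build the linkage matrix, then cut it
# into at most two flat clusters.
distance_matrix = linkage(scaled, method="ward")
hier_labels = fcluster(distance_matrix, t=2, criterion="maxclust")

# k-means clustering: find two centroids, then assign each point
# to its nearest centroid.
centroids, distortion = kmeans(scaled, 2)
kmeans_labels, _ = vq(scaled, centroids)

print(hier_labels)
print(kmeans_labels)
```

The label numbers themselves are arbitrary, so the two methods may number the clusters differently even when they agree on which points belong together.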
I will review more classes next time.