Review of Python Courses (Part 23)
Posted by Mark on February 1, 2021 at 07:34 | Last modified: February 10, 2021 10:35

In Part 22, I summarized my Datacamp courses 65-67. Today I will continue with the next three.
As a reminder, I introduced you to my recent work learning Python here.
My course #68 was Linear Classifiers in Python. This course covers:
- Introduction (import sklearn.datasets)
- Applying logistic regression and SVM (general process, from sklearn.svm import LinearSVC)
- Linear decision boundaries
- Linear classifiers: prediction equations
- What is a loss function? (from scipy.optimize import minimize)
- Loss function diagrams
- Logistic regression and regularization
- Logistic regression and probabilities
- Multi-class logistic regression
- Support vectors
- Kernel SVMs
- Comparing logistic regression and SVM (from sklearn.linear_model import SGDClassifier)
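To give a flavor of what the course covers, here is a minimal sketch of fitting and comparing the two linear classifiers with scikit-learn. The dataset, train/test split, and scaling pipeline are my own illustration, not taken from the course:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Example dataset standing in for the course's data
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Both linear classifiers, with feature scaling so the solvers converge cleanly
log_reg = make_pipeline(StandardScaler(), LogisticRegression()).fit(X_train, y_train)
svm = make_pipeline(StandardScaler(), LinearSVC()).fit(X_train, y_train)

print("logistic regression accuracy:", log_reg.score(X_test, y_test))
print("linear SVM accuracy:", svm.score(X_test, y_test))
```

The two models share the same prediction equation (a linear decision boundary) and differ mainly in their loss functions, which is the comparison the course builds toward.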
My course #69 was Analyzing Social Media Data in Python. While I found this somewhat interesting, it seemed to incorporate as much JSON as it did Python. I have a hard enough time studying one new language—adding a second on top of that made things even more confusing for me:
- Analyzing Twitter data
- Collecting data through the Twitter API (from tweepy import Stream, OAuthHandler, API)
- Understanding Twitter JSON
- Processing Twitter text
- Counting words
- Time series
- Sentiment analysis
- Twitter networks
- Importing and visualizing Twitter networks (import networkx as nx)
- Node-level metrics
- Maps and Twitter data
- Geographical data in Twitter JSON
- Creating Twitter maps (from mpl_toolkits.basemap import Basemap)
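The node-level metrics portion can be sketched without Twitter credentials. The retweet edges below are hypothetical stand-ins for data you would normally collect via tweepy:

```python
import networkx as nx

# Hypothetical retweet edges: (retweeter, original_author)
edges = [("alice", "bob"), ("carol", "bob"), ("dave", "alice"), ("erin", "bob")]
G = nx.DiGraph(edges)

# In-degree centrality highlights the most-retweeted users
centrality = nx.in_degree_centrality(G)
most_retweeted = max(centrality, key=centrality.get)
print(most_retweeted)  # bob
```

A directed graph is the natural choice here because a retweet relationship is asymmetric: being retweeted (high in-degree) means something different from retweeting others.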
My course #70 was Fraud Detection in Python. This course covers:
- Introduction to fraud detection
- Increasing successful detections using data resampling (from imblearn.over_sampling import RandomOverSampler)
- Fraud detection algorithms in action (from imblearn.pipeline import Pipeline)
- Review of classification methods
- Performance evaluation (from sklearn.metrics import precision_recall_curve, average_precision_score)
- More performance evaluation (from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score)
- Adjusting your algorithm weights
- Performance evaluation (from sklearn.model_selection import GridSearchCV)
- Ensemble methods (from sklearn.ensemble import VotingClassifier)
- Normal versus abnormal behavior
- Clustering methods (from sklearn.preprocessing import MinMaxScaler; from sklearn.cluster import MiniBatchKMeans)
- Assigning fraud versus non-fraud
- Other clustering fraud detection methods (from sklearn.cluster import DBSCAN)
- Using text data (from nltk import word_tokenize; import string)
- Text mining to detect fraud (from nltk.corpus import stopwords; from nltk.stem.wordnet import WordNetLemmatizer)
- Topic modeling on fraud (from gensim import corpora)
- Flagged fraud based on topics (import pyLDAvis.gensim for use with Jupyter Notebooks only)
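The "normal versus abnormal behavior" idea can be sketched with scikit-learn alone: cluster the historical data, then flag points that sit far from every cluster center. The synthetic transaction data and the 99th-percentile threshold below are my own assumptions for illustration:

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
# Hypothetical transaction features: 500 normal rows plus 5 extreme outliers
normal = rng.normal(50, 10, size=(500, 2))
fraud = rng.normal(200, 5, size=(5, 2))
X = np.vstack([normal, fraud])

# Scale, then learn clusters from the (mostly normal) historical rows
X_scaled = MinMaxScaler().fit_transform(X)
km = MiniBatchKMeans(n_clusters=3, n_init=10, random_state=0).fit(X_scaled[:500])

# Distance to the nearest cluster center; large distances suggest abnormal behavior
dist = km.transform(X_scaled).min(axis=1)
threshold = np.percentile(dist[:500], 99)
flagged = np.where(dist > threshold)[0]
print("flagged rows:", flagged)
```

This is the unsupervised side of fraud detection; the course pairs it with supervised methods (resampling plus classifiers) when labeled fraud cases are available.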
I will review more courses next time.