Review of Python Courses (Part 13)
Posted by Mark on December 29, 2020 at 07:48 | Last modified: February 2, 2021 11:44
In Part 12, I summarized my Datacamp courses 35-37. Today I will continue with the next three.
As a reminder, I introduced you to my recent work learning Python here.
My course #38 was Web Scraping in Python. This gets complicated with some object-oriented stuff that still throws me for a loop (no pun intended). I don’t think I will be using this anytime soon, so I only skimmed it for this review:
- Web scraping with Python
- HyperText Markup Language (HTML)
- HTML tags and attributes
- Crash course X
- Off the beaten XPath
- Introduction to the scrapy Selector (from scrapy import Selector)
- “Inspecting the HTML”
- CSS locators
- Attribute and text selection
- Getting ready to crawl
- Scraping for reals
- A classy spider (from scrapy.crawler import CrawlerProcess)
- A request for service
- Move your bloomin’ parse
- Capstone
My course #39 was Working with the Class System in Python. Like #38, this gets thick. The course covers:
- Intro to Object Oriented Programming (OOP) in Python
- Introduction to NumPy internals
- Introduction to objects and classes
- Deep dive on classes
- __Init__ializing a class
- Methods in classes
- Working with a dataset to create dataframes
- Renaming columns and the five-figure summary
- OOP best practices
- Inheritance: is-a versus has-a
- Inheritance with DataShells
- Composition
- Wrapping up OOP
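
Since is-a versus has-a is the part I find easiest to forget, here is a small sketch of the difference. The course works with a class called DataShell, but the version below (and the LabeledDataShell and Report classes) is my own hypothetical stand-in, not the course's code:

```python
class DataShell:
    """Hypothetical base class that just holds a list of numbers."""

    def __init__(self, values):
        # __init__ runs automatically when an instance is created
        self.values = list(values)

    def summary(self):
        # A method: a function that belongs to the class
        return {"min": min(self.values), "max": max(self.values)}


class LabeledDataShell(DataShell):
    """Inheritance: a LabeledDataShell is-a DataShell."""

    def __init__(self, values, label):
        super().__init__(values)  # reuse the parent's setup
        self.label = label

    def summary(self):
        # Extend the parent's method rather than rewrite it
        result = super().summary()
        result["label"] = self.label
        return result


class Report:
    """Composition: a Report has-a DataShell but is not one."""

    def __init__(self, shell):
        self.shell = shell

    def render(self):
        s = self.shell.summary()
        return f"min={s['min']}, max={s['max']}"


shell = LabeledDataShell([3, 1, 4, 1, 5], label="demo")
report = Report(shell)

print(isinstance(shell, DataShell))   # the subclass passes the is-a check
print(isinstance(report, DataShell))  # the composed class does not
print(report.render())
```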
My course #40 was Sentiment Analysis in Python. This course covers:
- What is sentiment analysis?
- Sentiment analysis types and approaches (from textblob import TextBlob)
- Let’s build a word cloud (from wordcloud import WordCloud)!
- Bag-of-words (from sklearn.feature_extraction.text import CountVectorizer)
- Getting granular with n-grams
- Build new features from text (from nltk import word_tokenize)
- Can you guess the language (from langdetect import detect_langs)?
- Stop words (from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS)
- Capturing a token pattern [.isalpha(), .isdigit(), .isalnum()]
- Stemming and lemmatization (from nltk.stem import PorterStemmer, WordNetLemmatizer)
- TfIdf: more ways to transform text (from sklearn.feature_extraction.text import TfidfVectorizer)
- Let’s predict the sentiment (from sklearn.linear_model import LogisticRegression)!
- Did we really predict the sentiment well (from sklearn.metrics import accuracy_score, confusion_matrix)?
- Logistic regression: revisited
- Bringing it all together
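
As a rough sketch of how several of these pieces fit together, the snippet below builds a bag-of-words with unigrams and bigrams, fits a logistic regression, and checks the predictions. The reviews and labels are invented for illustration (1 = positive, 0 = negative), and the parameter choices are my own assumptions rather than the course's:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix

# Made-up toy data, not from the course
reviews = [
    "great product, works perfectly",
    "terrible quality, broke quickly",
    "absolutely love it, great value",
    "awful experience, terrible support",
    "works great and arrived quickly",
    "broke after one day, awful",
    "love the quality, perfect gift",
    "poor value and poor support",
]
labels = [1, 0, 1, 0, 1, 0, 1, 0]

# Bag-of-words with n-grams (1 and 2 words), dropping English stop words
vect = CountVectorizer(ngram_range=(1, 2), stop_words="english")
X = vect.fit_transform(reviews)

# Fit the classifier on the word-count features
model = LogisticRegression()
model.fit(X, labels)

# Scoring on the training data itself gives an optimistic number;
# real work would hold out a test set (e.g. with train_test_split)
predicted = model.predict(X)
acc = accuracy_score(labels, predicted)
cm = confusion_matrix(labels, predicted)

print("Features:", X.shape[1])
print("Accuracy on training data:", acc)
print(cm)
```

Swapping CountVectorizer for TfidfVectorizer should slot into the same steps, since both expose fit_transform.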
I will review more courses next time.