Review of Python Courses (Part 5)
Posted by Mark on November 20, 2020 at 07:23 | Last modified: January 20, 2021 06:41
In Part 4, I summarized my Datacamp courses 10-12. Today I will continue with the next three.
As a reminder, I introduced you to my recent work learning Python here.
My thirteenth course was Pandas Foundations. This course covers:
- Review of pandas DataFrames
- Building DataFrames from scratch
- Importing and exporting data [pd.read_csv() args: header, names, na_values, parse_dates]
- Plotting (arrays, series, DataFrames) with pandas
- Visual exploratory data analysis (line, scatter, box plots, histogram, and different plotting idioms)
- Statistical exploratory data analysis
- Separating populations
- Indexing time series (creating and using a Datetime index)
- Resampling time series data
- Manipulating time series data
- Time series visualization (pandas, not matplotlib)
- Reading and cleaning the data (cleaning and tidying datetime data)
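To make a few of the topics above concrete, here is a minimal sketch of the pd.read_csv() arguments listed (header, names, na_values, parse_dates), followed by a Datetime index and a resample. The data, the column names, and the use of -1 as a missing-value marker are all hypothetical and are not taken from the course materials.

```python
import io

import pandas as pd

# Hypothetical data: no header row, and -1 stands in for a missing reading.
raw = io.StringIO(
    "2020-01-01 00:00,10.0\n"
    "2020-01-01 12:00,-1\n"
    "2020-01-02 00:00,14.0\n"
    "2020-01-02 12:00,16.0\n"
)

df = pd.read_csv(
    raw,
    header=None,                  # the file has no header row
    names=["timestamp", "temp"],  # so supply the column names ourselves
    na_values={"temp": ["-1"]},   # treat -1 in the temp column as NaN
    parse_dates=["timestamp"],    # parse this column as datetimes
    index_col="timestamp",        # and use it as the (Datetime) index
)

# Downsample the 12-hourly readings to daily means; NaN values are skipped.
daily_mean = df["temp"].resample("D").mean()
print(daily_mean)
```

With a Datetime index in place, partial-string selection such as df.loc["2020-01-02"] should also work for pulling out a single day.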
Class #14 for me was Statistical Thinking in Python (Part 1). This course covers:
- Introduction to exploratory data analysis
- Plotting a histogram
- Plot all of your data: Bee swarm plots [sns.swarmplot()]
- Plot all of your data: Empirical Cumulative Distribution Functions (ECDF)
- Introduction to summary statistics: the sample mean and median
- Percentiles, outliers, and box plots
- Variance and standard deviation
- Covariance and the Pearson correlation coefficient
- Probabilistic logic and statistical inference
- Random number generators and hacker statistics
- Probability distributions and stories: the Binomial distribution (binomial PMF and CDF)
- Poisson processes and the Poisson distribution
- Probability density functions
- Introduction to the Normal distribution
- The Normal distribution: properties and warnings
- The Exponential distribution
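As an illustration of the ECDF topic above, the following is a rough sketch of how an empirical cumulative distribution function can be computed with NumPy. The function name ecdf and the simulated Normal samples are assumptions made for this example, not the course's own code or data.

```python
import numpy as np


def ecdf(data):
    """Return x (sorted data) and y (fraction of points <= each x)."""
    x = np.sort(data)
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y


# Hypothetical samples drawn from a Normal distribution with a seeded
# random number generator, so the run is repeatable.
rng = np.random.default_rng(42)
samples = rng.normal(loc=10, scale=2, size=1000)

x, y = ecdf(samples)

# Summary statistics that the ECDF makes easy to read off visually.
print("mean:", np.mean(samples))
print("median:", np.median(samples))
print("25th/75th percentiles:", np.percentile(samples, [25, 75]))
```

Plotting y against x as unconnected points gives the ECDF, which shows every data point without the binning choices a histogram requires.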
My fifteenth course was Introduction to Data Visualization with Matplotlib. This course covers:
- Adding data to axes
- Customizing your plots (adding markers, setting linestyle, color, axis labels)
- Small multiples with plt.subplots
- Plotting time-series data (using fig/ax, zooming in on datetime range)
- Plotting time-series with different variables (using twin axes, coloring vars and ticks, all-encompassing function)
- Annotating time-series data
- Quantitative comparisons: bar charts (stacking, adding legend, color)
- Quantitative comparisons: histograms
- Statistical plotting
- Quantitative comparisons: scatter plots (encoding time by color)
- Preparing your figures to share with others (choosing plot style)
- Sharing your visualizations with others [fig.savefig()]
- Automating figures from data
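Several of the items above (the fig/ax interface, customizing markers and linestyle, twin axes with colored labels and ticks, and fig.savefig()) fit together in one short sketch. The values and labels below are made up purely for illustration, and the Agg backend is an assumption so the script runs without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # headless backend; no window is opened
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical monthly time series (invented values, not real measurements).
dates = np.arange("2020-01", "2021-01", dtype="datetime64[M]")
co2 = np.linspace(410, 415, len(dates))
temp = np.linspace(0.9, 1.1, len(dates))

fig, ax = plt.subplots()

# First variable on the left y-axis, with marker, linestyle, and color set.
ax.plot(dates, co2, color="blue", marker="o", linestyle="--")
ax.set_xlabel("Time")
ax.set_ylabel("CO2 (ppm)", color="blue")
ax.tick_params(axis="y", colors="blue")

# Second variable on a twin axis that shares the same x-axis.
ax2 = ax.twinx()
ax2.plot(dates, temp, color="red")
ax2.set_ylabel("Relative temperature (C)", color="red")
ax2.tick_params(axis="y", colors="red")

# Save to an in-memory buffer; a file name such as "figure.png" also works.
buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=300)
```

Wrapping the plot, label, and tick calls in a small helper function is one way to get the "all-encompassing function" idea, since the same three lines repeat for each axis.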