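Some readers find machine learning lofty and intimidating, but much of it is just the statistical methods we already use every day. For example, when analyzing an expression matrix, we usually draw a PCA plot to check whether the differences between groups are clear enough.
If you have experience processing single-cell transcriptome data, the dimensionality reduction, clustering, and cell grouping steps in that workflow are all machine learning. If you do tumor data mining, you will often use lasso, random forests, and support vector machines, all of which are easy to run in R. We have also recommended the book 《精通机器学习:基于R(第2版)》 many times, available from the ituring community: https://www.ituring.com.cn/book/1989 (book giveaway)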
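If you would rather not read Chinese books
Interestingly, some readers dislike Chinese translations and prefer the English originals, so we have a recommendation for them too:
Online book: https://f0nzie.github.io/machine_learning_compilation/index.html
Contents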
- 1 Preface
The Basics of Machine Learning
- 2 Introduction to PCA
- 3 Comparison of two PCA packages
- 4 Detailed study of Principal Component Analysis
- 5 Detection of diabetes using Logistic Regression
- 6 Sensitivity analysis for a neural network
- 7 Data Visualization for ML models
Feature Engineering
- 8 Ten methods to assess Variable Importance
- 9 Employee Attrition using Feature Importance
Classification
- 10 A gentle introduction to Support Vector Machines
- 11 Broad view of SVM
- 12 Feature Selection to enhance cancer detection
- 13 Dealing with unbalanced data
- 14 Imputing missing values with Random Forest
- 15 Tuning of Support Vector Machine prediction
Classification
- 16 Introduction to algorithms for Classification
- 17 Comparing Classification algorithms
- 18 Who buys Social Network ads
- 19 Predicting Ozone levels
- 20 Building a Naive Bayes Classifier
- 21 Linear and Non-Linear Algorithms for Classification
- 22 Detect mines vs rocks with Random Forest
- 23 Predicting the type of glass
- 24 Naive Bayes for SMS spam
- 25 Vehicles classification with Decision Trees
- 26 Applying Naive-Bayes on the Titanic case
- 27 Classification on bad loans
- 28 Predicting Flu outcome comparing eight classification algorithms
- 29 A detailed study of bike sharing demand
- 30 Prediction of arrhythmia with deep neural nets
Linear Regression
- 31 Linear Regression with ISLR
- 32 Evaluation of three linear regression models
- 33 Comparison of six Linear Regression algorithms
- 34 Comparing regression models
- 35 Finding the factors of happiness
- 36 Regression with a neural network
- 37 Comparing Multiple Regression vs a Neural Network
- 38 Temperature modeling using nested dataframes
Neural Networks
- 39 Credit Scoring with neuralnet
- 40 Wine classification with neuralnet
- 41 Predicting the rating of cereals
- 42 Fitting a linear model with neural networks
- 43 Visualization of neural networks
- 44 Build a fully connected R neural network from scratch
- 45 Tuning Hyperparameters in a Neural Network
- 46 Deep Learning tips for Classification and Regression
Appendix
- A What is dot hat in a regression output
- B Q-Q normal to compare data to distributions
- C QQ and PP Plots
- D Visualizing residuals
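Books may not be as easy to follow as animated videos
The StatQuest video series is an excellent biostatistics tutorial. Its author is Josh Starmer (personal blog: https://statquest.org/). 生信菜鸟团 recommended these learning resources long ago, and also organized a study group to write companion notes for the videos: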