## pycaret
pycaret is a free, open source machine learning library for the Python programming language. It is built around several popular Python machine learning libraries. Its primary objective is to reduce the cycle time from hypothesis to insights by providing an easy-to-use, high-level unified API. pycaret's vision is to become the de facto standard for teaching machine learning and data science. Its strength is an easy-to-use unified interface for both supervised and unsupervised machine learning problems. It saves the time and effort that citizen data scientists, students and researchers spend on coding, or on learning to code against different interfaces, so that they can focus on the business problem and value creation.

## Key Features
* Ease of Use
* Focus on Business Problem
* 10x more efficient
* Collaboration
* Business Ready
* Cloud Ready

## Current Release
The current release is beta 0.0.5 (as of 23/12/2019). A full public release is targeted for 31/12/2020.

## Installation

#### Dependencies
Please read `requirements.txt` for the list of requirements. They are installed automatically when pycaret is installed using pip.

#### User Installation
The easiest way to install pycaret is using pip.

```shell
pip install pycaret
```
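To confirm that the installation succeeded before moving on, something like the following can be run. This is a minimal sketch that only uses the Python standard library; the helper name `is_installed` is made up for this example and is not part of pycaret.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package with this name can be found."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("pycaret"):
    print("pycaret is available")
else:
    print("pycaret not found; run `pip install pycaret` first")
```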

## Quick Start
As of beta 0.0.5, the classification, regression and NLP modules are available. Future releases will include Anomaly Detection, Association Rules, Clustering, Recommender System and Time Series.

### Classification

Getting data from the pycaret repository:

```python
from pycaret.datasets import get_data
juice = get_data('juice')
```

1. Initializing the pycaret environment setup

```python
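from pycaret.classification import *  # brings setup, create_model, etc. into scope
exp1 = setup(juice, 'Purchase')  # 'Purchase' is the target column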
```

2. Creating a simple logistic regression model (includes fitting, cross-validation and metric evaluation)
```python
lr = create_model('lr')
```

List of available estimators:

Logistic Regression (lr) <br/>
K Nearest Neighbour (knn) <br/>
Naive Bayes (nb) <br/>
Decision Tree (dt) <br/>
Support Vector Machine - Linear (svm) <br/>
SVM Radial Function (rbfsvm) <br/>
Gaussian Process Classifier (gpc) <br/>
Multi-Layer Perceptron (mlp) <br/>
Ridge Classifier (ridge) <br/>
Random Forest (rf) <br/>
Quadratic Discriminant Analysis (qda) <br/>
Adaboost (ada) <br/>
Gradient Boosting Classifier (gbc) <br/>
Linear Discriminant Analysis (lda) <br/>
Extra Trees Classifier (et) <br/>
Extreme Gradient Boosting - xgboost (xgboost) <br/>
Light Gradient Boosting - Microsoft LightGBM (lightgbm) <br/>
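
Step 2 notes that `create_model` includes cross-validation and metric evaluation. The README does not say how many folds are used or how the data is split, so the following is only a generic sketch of k-fold cross-validation in plain Python. It assumes unshuffled, contiguous folds, and `k_fold_indices`, `cross_val_accuracy` and `fit_majority` are made-up names for this example, not pycaret functions.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, test_idx) pairs for n_folds contiguous folds."""
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop


def cross_val_accuracy(fit, X, y, n_folds=5):
    """Fit on each training split and return the accuracy on each held-out fold."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), n_folds):
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return scores


def fit_majority(X_train, y_train):
    """Toy 'model' that always predicts the most common training label."""
    majority = max(set(y_train), key=y_train.count)
    return lambda x: majority


# Made-up data: ten samples, two of them in the positive class.
X = list(range(10))
y = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]

scores = cross_val_accuracy(fit_majority, X, y, n_folds=5)
print("fold accuracies:", scores)
print("mean accuracy:", sum(scores) / len(scores))
```

Each sample is used for testing exactly once, which is why the mean of the fold scores is a less optimistic estimate than the accuracy measured on the training data.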

3. Tuning a model using built-in grids
```python
tuned_xgb = tune_model('xgboost')
```

4. Ensembling Model
```python
dt = create_model('dt')
dt_bagging = ensemble_model(dt, method='Bagging')
dt_boosting = ensemble_model(dt, method='Boosting')
```
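For readers new to the term, the idea behind `method='Bagging'` can be sketched in plain Python. The README does not describe how `ensemble_model` is implemented, so this shows only the textbook technique: train each copy of a base learner on a bootstrap sample and combine the copies by majority vote. All names and data below are made up for illustration.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]


def fit_threshold(rows):
    """Toy base learner: split at the midpoint between the two class means."""
    zeros = [x for x, label in rows if label == 0]
    ones = [x for x, label in rows if label == 1]
    if not zeros or not ones:
        # The bootstrap sample contained a single class; predict it always.
        only_label = 1 if ones else 0
        return lambda x: only_label
    threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: int(x > threshold)


def bagging_fit(fit, rows, n_estimators=11, seed=0):
    """Train n_estimators models, each on its own bootstrap sample."""
    rng = random.Random(seed)
    return [fit(bootstrap_sample(rows, rng)) for _ in range(n_estimators)]


def bagging_predict(models, x):
    """Combine the individual predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Made-up one-dimensional data: small values are class 0, large values class 1.
rows = [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1), (9.0, 1)]
models = bagging_fit(fit_threshold, rows, n_estimators=11, seed=0)
print(bagging_predict(models, 0.5), bagging_predict(models, 9.5))
```

Boosting differs in that the models are trained one after another, each concentrating on the samples the previous ones got wrong; it is not sketched here.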

5. Creating a voting classifier
```python
voting_all = blend_models()  # creates a voting classifier from the entire model library

# create a voting classifier from specific models
lr = create_model('lr')
svm = create_model('svm')
mlp = create_model('mlp')
xgboost = create_model('xgboost')

voting_clf2 = blend_models( [ lr, svm, mlp, xgboost ] )
```
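What a voting classifier does with the predictions of its members can be illustrated as follows. The README does not say whether `blend_models` uses hard voting (majority of predicted labels) or soft voting (averaged probabilities); this sketch assumes hard voting, and the per-model predictions are made-up values.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the majority label per sample.

    predictions: one list of labels per model, all of the same length.
    """
    return [Counter(column).most_common(1)[0][0] for column in zip(*predictions)]


# Made-up predictions from three models on four samples.
lr_pred = [0, 1, 0, 1]
svm_pred = [0, 0, 0, 1]
mlp_pred = [1, 1, 0, 0]

blended = hard_vote([lr_pred, svm_pred, mlp_pred])
print(blended)
```

Using an odd number of models for a two-class problem avoids ties.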

6. Stacking Models in Single Layer
```python
#create individual classifiers
lr = create_model('lr')
svm = create_model('svm')
mlp = create_model('mlp')
xgboost = create_model('xgboost')

stacker = stack_models( [lr,svm,mlp], meta_model = xgboost )
```
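Stacking differs from voting in that a meta model learns how to combine the base predictions instead of counting them. The README does not describe how `stack_models` builds its meta-features, so the following is a generic single-layer sketch in plain Python with made-up rules and data. The base "models" are fixed rules here; with trained base models, the meta model would normally be fitted on out-of-fold predictions to avoid leakage.

```python
from collections import Counter, defaultdict


def fit_meta(meta_features, labels):
    """Toy meta model: remember the most common label per base-prediction pattern."""
    by_pattern = defaultdict(Counter)
    for pattern, label in zip(meta_features, labels):
        by_pattern[pattern][label] += 1
    fallback = Counter(labels).most_common(1)[0][0]  # used for unseen patterns
    table = {p: counts.most_common(1)[0][0] for p, counts in by_pattern.items()}
    return lambda pattern: table.get(pattern, fallback)


def stack_predict(base_models, meta_model, x):
    """Feed the base predictions for x into the meta model."""
    pattern = tuple(model(x) for model in base_models)
    return meta_model(pattern)


# Hypothetical base models: simple rules on a two-feature input.
base_models = [
    lambda x: int(x[0] > 0.5),
    lambda x: int(x[1] > 0.5),
    lambda x: int(x[0] > x[1]),
]

# Made-up training data: the label is 1 when exactly one feature exceeds 0.5.
X = [(0.1, 0.2), (0.9, 0.1), (0.2, 0.8), (0.9, 0.9), (0.7, 0.6), (0.3, 0.1)]
y = [0, 1, 1, 0, 0, 0]

meta_features = [tuple(model(x) for model in base_models) for x in X]
meta_model = fit_meta(meta_features, y)

print(stack_predict(base_models, meta_model, (0.8, 0.2)))
print(stack_predict(base_models, meta_model, (0.8, 0.7)))
```

A plain majority vote over the first two rules cannot represent this target, while the meta model picks it up from the pattern of base predictions.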

7. Stacking Models in Multiple Layers
```python
#create individual classifiers
lr = create_model('lr')
svm = create_model('svm')
mlp = create_model('mlp')
gbc = create_model('gbc')
nb = create_model('nb')
lightgbm = create_model('lightgbm')
knn = create_model('knn')
xgboost = create_model('xgboost')

stacknet = create_stacknet( [ [lr,svm,mlp], [gbc, nb], [lightgbm, knn] ], meta_model = xgboost )
# if meta_model is omitted, Logistic Regression is used by default
```

8. Plot Models
```python
lr = create_model('lr')
plot_model(lr, plot='auc')
```
List of available plots:

Area Under the Curve (auc) <br/>
Discrimination Threshold (threshold) <br/>
Precision Recall Curve (pr) <br/>
Confusion Matrix (confusion_matrix) <br/>
Class Prediction Error (error) <br/>
Classification Report (class_report) <br/>
Decision Boundary (boundary) <br/>
Recursive Feature Selection (rfe) <br/>
Learning Curve (learning) <br/>
Manifold Learning (manifold) <br/>
Calibration Curve (calibration) <br/>
Validation Curve (vc) <br/>
Dimension Learning (dimension) <br/>
Feature Importance (feature) <br/>
Model Hyperparameter (parameter) <br/>

9. Evaluate Model
```python
lr = create_model('lr')
evaluate_model(lr)  # displays a user interface for interactive plotting
```

10. Interpret Tree Based Models
```python
xgboost = create_model('xgboost')
interpret_model(xgboost)
```

11. Saving Model for Deployment
```python
lr = create_model('lr')
save_model(lr, 'lr_23122019')
```

12. Saving Entire Experiment Pipeline
```python
save_experiment('expname1')
```

13. Loading Model / Experiment
```python
m = load_model('lr_23122019')
e = load_experiment('expname1')
```
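The save and load steps follow the usual serialize-then-restore pattern. The README does not state which on-disk format `save_model` uses, so this sketch uses the standard library's `pickle` purely as a stand-in; the helper names and the toy model dictionary are made up for illustration.

```python
import os
import pickle
import tempfile


def save_object(obj, path):
    """Serialize obj to path."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_object(path):
    """Restore an object previously written by save_object."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for a fitted model: just its learned parameters.
model = {"kind": "threshold", "threshold": 0.5}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_model.pkl")
    save_object(model, path)
    restored = load_object(path)

print(restored)
```

Pickle files can execute arbitrary code when loaded, so only load files from sources you trust.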

14. Compare all Models
```python
compare_models()
```

15. AutoML
```python
aml1 = automl()
```

## Documentation
Documentation is a work in progress. It will be published on our website, http://www.pycaret.org, as soon as it is available (target availability: 21/01/2020).

## Contributions
Contributions are most welcome. To contribute, please reach out to moez.ali@queensu.ca.