There's a difference (multiclass vs. multilabel):
Scikit-learn lumps both under the same API - OneVsRestClassifier; the main difference is the format of the training targets (a 1-D array of class labels for multiclass, a binary indicator matrix for multilabel):
https://scikit-learn.org/stable/modules/multiclass.html#one-vs-the-rest
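A minimal sketch (toy data, my own example, not taken from the linked docs) showing that the same estimator handles both cases and only the shape of y changes:

```python
from sklearn.multiclass import OneVsRestClassifier
from sklearn.linear_model import LogisticRegression

X = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]

# Multiclass: y is a 1-D array with exactly one label per sample
y_multiclass = [0, 1, 2, 1]
clf = OneVsRestClassifier(LogisticRegression()).fit(X, y_multiclass)
print(clf.predict([[0.5, 0.5]]))

# Multilabel: y is a binary indicator matrix, one column per label
y_multilabel = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
clf = OneVsRestClassifier(LogisticRegression()).fit(X, y_multilabel)
print(clf.predict([[0.5, 0.5]]))
```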
When reviewing the results of correspondence analysis, remember:
For example:
Associated, but not a strong association:
Associated, with a strong association:
PS:
Classifiers in this note refer to:
The predict_proba function is sometimes used as a measure of confidence. PS: in anomaly detection, if the confidence output is sufficiently low, the data point can be considered an anomaly.
Both types of classifiers, however, sometimes produce skewed (poorly calibrated) confidence scores (1)(2)
To resolve this, probability calibration is sometimes necessary. Basically, each classifier's confidence output is passed through a regressor that has been trained on predicted vs. actual confidence. In other words (a scikit-learn sketch follows the list below):
1. The classifier is trained as usual
2. For each output class, create a regressor (isotonic, or sigmoid/logistic)
3. For every training sample, pass it through the model, and for each output class record the predicted probability and the actual probability (which is 0 or 1, straight from the label)
4. Train the regressors created in step 2 on the data collected in step 3
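scikit-learn packages roughly this recipe as CalibratedClassifierCV; a minimal sketch (toy data, my own example):

```python
from sklearn.datasets import make_classification
from sklearn.calibration import CalibratedClassifierCV
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=500, n_classes=3, n_informative=4, random_state=0)

# Uncalibrated baseline
raw = GaussianNB().fit(X, y)

# method="isotonic" fits an isotonic regressor per class;
# method="sigmoid" fits a logistic (Platt-scaling) regressor instead
calibrated = CalibratedClassifierCV(GaussianNB(), method="isotonic", cv=5).fit(X, y)

print(raw.predict_proba(X[:1]))
print(calibrated.predict_proba(X[:1]))
```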
Similar to skip-gram in your typical word2vec. However, in this case the sentence is passed through an RNN encoder, whose output is fed simultaneously into:
General flow:
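A minimal sketch (assumed PyTorch and GRU layers, my own example) of this kind of architecture: one encoder whose final state conditions two decoders, assumed here to reconstruct the previous and the next sentence:

```python
import torch
import torch.nn as nn

class SkipThoughtSketch(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        # Two decoders; both start from the encoder's "thought vector"
        self.prev_decoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.next_decoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, sentence, prev_sentence, next_sentence):
        # Encode the middle sentence into a single thought vector
        _, thought = self.encoder(self.embed(sentence))
        # Feed the thought vector simultaneously into both decoders
        prev_out, _ = self.prev_decoder(self.embed(prev_sentence), thought)
        next_out, _ = self.next_decoder(self.embed(next_sentence), thought)
        return self.out(prev_out), self.out(next_out)

model = SkipThoughtSketch(vocab_size=1000)
sent = torch.randint(0, 1000, (4, 12))   # batch of 4 sentences, 12 tokens each
prev_logits, next_logits = model(sent, sent, sent)
print(prev_logits.shape, next_logits.shape)
```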
The "datatable" package is significantly faster than pandas when dealing with large datasets as well as datasets which do not fit in memory
Bagging creates subsets of the original data (with or without replacement), and trains models on these subsets (the models are subsequently used in an ensemble).
Boosting trains a series of models; each model is fed data that has been weighted according to the results of the previous model. Data points that were labelled wrongly by the previous model are given a higher weight for the subsequent model, in the hope that it will classify them correctly. The series of models is then combined into an ensemble.
| Technique | Advantage |
| --- | --- |
| Bagging | Better vs over-fitting |
| Boosting | Better vs bias |
| Both | Better model stability |
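A minimal sketch (toy data, my own example) of both techniques via scikit-learn's ensemble module:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=0)

# Bagging: each tree sees a bootstrap subset of the data
bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50, random_state=0).fit(X, y)

# Boosting: each tree is trained on data re-weighted by the previous tree's mistakes
boosting = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)

print(bagging.score(X, y), boosting.score(X, y))
```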
Online algorithms are variants of the more common batch algorithms we typically use, capable of learning incrementally when presented with new training data points. This means they typically perform slightly worse than their batch counterparts, but use far less RAM and can easily be re-trained even while in production:
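In scikit-learn the online variants expose partial_fit; a minimal sketch (synthetic mini-batches, my own example):

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier()
classes = np.array([0, 1])        # every class must be declared up front

for _ in range(10):               # pretend each iteration is a newly arrived mini-batch
    X_batch = np.random.randn(32, 5)
    y_batch = (X_batch[:, 0] > 0).astype(int)
    clf.partial_fit(X_batch, y_batch, classes=classes)

print(clf.predict(np.random.randn(3, 5)))
```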
(scikit-learn) The "RandomForestClassifier" and the "RandomForestRegressor" both have a useful attribute named "feature_importances_". This attribute returns the "weight" or "usefulness" of each feature, which allows for feature reduction. Full example:
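A minimal sketch (toy data, my own example) of ranking features by importance:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, n_informative=3, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# feature_importances_ sums to 1; higher means the feature contributed more to the splits
for idx, importance in sorted(enumerate(forest.feature_importances_),
                              key=lambda pair: pair[1], reverse=True):
    print(f"feature {idx}: {importance:.3f}")
```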
PySyft uses SMPC to enable private machine learning by:
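A concept-only sketch (plain Python, not PySyft's API; values and modulus are arbitrary) of the additive secret sharing that SMPC builds on: each value is split into random shares held by different parties, parties compute on their shares locally, and only the combination of all shares reveals a result:

```python
import random

Q = 2**31 - 1  # a large prime to work modulo (arbitrary choice for this sketch)

def share(value, n_parties=3):
    """Split value into n random shares that sum to value mod Q."""
    shares = [random.randrange(Q) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % Q)
    return shares

def reconstruct(shares):
    return sum(shares) % Q

x_shares = share(25)
y_shares = share(17)

# Each party adds its own shares locally; no single party ever sees 25 or 17
z_shares = [(a + b) % Q for a, b in zip(x_shares, y_shares)]

print(reconstruct(z_shares))  # -> 42
```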