## Package

**weka.classifiers.functions**

## Synopsis

Implements stochastic gradient descent for learning a linear binary-class SVM or a binary-class logistic regression model on text data. Operates directly on String attributes. Available from Weka 3.7.5.
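
To make the idea concrete, the following is a minimal, dependency-free sketch of stochastic gradient descent with hinge loss over a bag-of-words representation. It is not Weka's implementation and does not use the Weka API: the class `TextSgdSketch`, its methods, the simple tokenizer and the fixed-step update rule are all assumptions made for illustration. The parameters `lambda`, `learningRate` and `epochs` only mirror the option names listed on this page.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Illustrative sketch only; NOT Weka's SGDText implementation. */
final class TextSgdSketch {
    private final Map<String, Double> weights = new HashMap<>();
    private double bias = 0.0;
    private final double lambda;        // L2 regularization constant
    private final double learningRate;  // fixed step size

    TextSgdSketch(double lambda, double learningRate) {
        this.lambda = lambda;
        this.learningRate = learningRate;
    }

    /** Lowercases the text, splits on non-letters and counts each token. */
    static Map<String, Double> tokenize(String text) {
        Map<String, Double> counts = new HashMap<>();
        for (String tok : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!tok.isEmpty()) {
                counts.merge(tok, 1.0, Double::sum);
            }
        }
        return counts;
    }

    /** Linear score: bias plus the weighted sum of the token counts. */
    double score(Map<String, Double> doc) {
        double s = bias;
        for (Map.Entry<String, Double> e : doc.entrySet()) {
            s += weights.getOrDefault(e.getKey(), 0.0) * e.getValue();
        }
        return s;
    }

    /** One SGD step on the hinge loss; label must be +1 or -1. */
    void update(String text, int label) {
        Map<String, Double> doc = tokenize(text);
        double margin = label * score(doc);
        // Regularization: shrink every weight slightly on each step.
        double shrink = 1.0 - learningRate * lambda;
        weights.replaceAll((word, w) -> w * shrink);
        // Hinge loss has a non-zero gradient only when the margin is below 1.
        if (margin < 1.0) {
            for (Map.Entry<String, Double> e : doc.entrySet()) {
                weights.merge(e.getKey(), learningRate * label * e.getValue(), Double::sum);
            }
            bias += learningRate * label;
        }
    }

    int predict(String text) {
        return score(tokenize(text)) >= 0.0 ? 1 : -1;
    }

    /** Runs epochs * docs.length updates in a fixed order (no shuffling). */
    static TextSgdSketch train(String[] docs, int[] labels, int epochs,
                               double lambda, double learningRate) {
        TextSgdSketch model = new TextSgdSketch(lambda, learningRate);
        for (int epoch = 0; epoch < epochs; epoch++) {
            for (int i = 0; i < docs.length; i++) {
                model.update(docs[i], labels[i]);
            }
        }
        return model;
    }

    public static void main(String[] args) {
        String[] docs = {"good great film", "great fun", "bad awful film", "awful boring"};
        int[] labels = {1, 1, -1, -1};
        TextSgdSketch model = train(docs, labels, 20, 0.0001, 0.1);
        System.out.println(model.predict("great fun"));
        System.out.println(model.predict("awful boring"));
    }
}
```

Swapping the hinge-loss condition for the gradient of the log loss would turn the same loop into a logistic regression learner, which is the other model named in the synopsis.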

## Options

The table below describes the options available for SGDText.

| Option | Description |
|---|---|
| LNorm | The L-norm to use for document length normalization (for example, 2 for the Euclidean norm). |
| debug | If set to true, the classifier may output additional information to the console. |
| epochs | The number of epochs to perform (batch learning). The total number of iterations is epochs * number of instances. |
| lambda | The regularization constant (default = 0.0001). |
| learningRate | The learning rate. |
| lossFunction | The loss function to use: hinge loss (SVM) or log loss (logistic regression). |
| lowercaseTokens | Whether to convert all tokens to lowercase. |
| minWordFrequency | Ignore any word that does not occur at least this many times in the training data. If periodic pruning is turned on, the dictionary is pruned according to this value. |
| norm | The norm of the instances after normalization. |
| periodicPruning | How often (in number of instances) to prune low-frequency terms from the dictionary. 0 means no pruning; a positive integer n means prune after every n instances. |
| seed | The random number seed to use. |
| stemmer | The stemming algorithm to apply to the words. |
| stopwords | The file containing the stopwords (if this is a directory, the default stopwords are used). |
| tokenizer | The tokenizing algorithm to apply to the strings. |
| useStopList | If true, ignores all words that are on the stoplist. |
| useWordFrequencies | Use word frequencies rather than a binary bag-of-words representation. |
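
As a rough illustration of how the normalization and pruning options above could interact with a term dictionary, here is a small stand-alone sketch. It is not taken from Weka's source: the class `DictionarySketch` and its methods are hypothetical. It assumes that LNorm is the p of an Lp norm, that norm is the target length after scaling, and that pruning removes every word whose count is below minWordFrequency. Weka's exact arithmetic may differ.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Illustrative sketch only; NOT taken from Weka's SGDText source. */
final class DictionarySketch {

    /** Scales values so that their Lp norm equals targetNorm (p plays the role of LNorm). */
    static double[] normalize(double[] values, double p, double targetNorm) {
        double sum = 0.0;
        for (double v : values) {
            sum += Math.pow(Math.abs(v), p);
        }
        double length = Math.pow(sum, 1.0 / p);
        double[] out = new double[values.length];
        if (length == 0.0) {
            return out; // empty document: nothing to scale
        }
        for (int i = 0; i < values.length; i++) {
            out[i] = values[i] / length * targetNorm;
        }
        return out;
    }

    /** Removes every word seen fewer than minWordFrequency times so far. */
    static void prune(Map<String, Integer> dictionary, int minWordFrequency) {
        dictionary.values().removeIf(count -> count < minWordFrequency);
    }

    /**
     * Counts words over a stream of documents, pruning after every
     * periodicPruning documents (0 means never prune).
     */
    static Map<String, Integer> buildDictionary(String[] docs, int minWordFrequency,
                                                int periodicPruning) {
        Map<String, Integer> dictionary = new HashMap<>();
        int seen = 0;
        for (String doc : docs) {
            for (String tok : doc.toLowerCase(Locale.ROOT).split("\\s+")) {
                if (!tok.isEmpty()) {
                    dictionary.merge(tok, 1, Integer::sum);
                }
            }
            seen++;
            if (periodicPruning > 0 && seen % periodicPruning == 0) {
                prune(dictionary, minWordFrequency);
            }
        }
        return dictionary;
    }

    public static void main(String[] args) {
        String[] docs = {"a a b", "a c", "a b"};
        System.out.println(buildDictionary(docs, 2, 0)); // no pruning
        System.out.println(buildDictionary(docs, 2, 3)); // prune once, after 3 documents
        System.out.println(buildDictionary(docs, 2, 1)); // prune after every document
    }
}
```

In this sketch, pruning very often can discard a word before it has had the chance to reach the minimum frequency, so a small pruning interval trades dictionary size against the risk of losing moderately frequent terms.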

## Capabilities

The table below describes the capabilities of SGDText.

| Capability | Supported |
|---|---|
| Class | Binary class, Missing class values |
| Attributes | String attributes, Missing values |
| Min # of instances | 0 |