## Package

**weka.classifiers.functions**

## Synopsis

Implements stochastic gradient descent for learning a linear binary-class SVM or binary-class logistic regression on text data. Operates directly on String attributes. Available since Weka 3.7.5.
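To make the synopsis concrete, here is a minimal plain-Java sketch of stochastic gradient descent with hinge loss over a bag of words. It is not Weka's implementation: the class `SgdTextSketch`, its methods, the whitespace tokenizer, and the toy data are all hypothetical, chosen only to illustrate the idea of updating a sparse linear model one document at a time.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a hypothetical class, not Weka's SGDText implementation.
class SgdTextSketch {
    private final Map<String, Double> weights = new HashMap<>(); // one weight per token
    private double bias = 0.0;
    private final double learningRate;
    private final double lambda; // L2 regularization constant

    SgdTextSketch(double learningRate, double lambda) {
        this.learningRate = learningRate;
        this.lambda = lambda;
    }

    // Assumed tokenizer: lowercase, split on whitespace, drop empty tokens.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Linear score using word frequencies: each occurrence adds the token's weight.
    double score(String text) {
        double s = bias;
        for (String t : tokenize(text)) {
            s += weights.getOrDefault(t, 0.0);
        }
        return s;
    }

    // One SGD step on a single document; label must be +1 or -1.
    void update(String text, int label) {
        double margin = label * score(text);
        // L2 regularization: shrink every weight slightly.
        weights.replaceAll((token, w) -> w * (1.0 - learningRate * lambda));
        // Hinge loss has a non-zero gradient only when the margin is below 1.
        if (margin < 1.0) {
            for (String t : tokenize(text)) {
                weights.merge(t, learningRate * label, Double::sum);
            }
            bias += learningRate * label;
        }
    }

    // Performs epochs * docs.length updates in total.
    void train(String[] docs, int[] labels, int epochs) {
        for (int e = 0; e < epochs; e++) {
            for (int i = 0; i < docs.length; i++) {
                update(docs[i], labels[i]);
            }
        }
    }

    public static void main(String[] args) {
        String[] docs = {"good great film", "great fun", "bad boring film", "boring awful"};
        int[] labels = {1, 1, -1, -1};
        SgdTextSketch model = new SgdTextSketch(0.1, 0.0001);
        model.train(docs, labels, 20);
        System.out.println("score(great good)   = " + model.score("great good"));
        System.out.println("score(awful boring) = " + model.score("awful boring"));
    }
}
```

A positive score is read as one class and a negative score as the other. Swapping the hinge-loss condition for a log-loss gradient would give the logistic regression variant under the same update loop.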

## Options

The table below describes the options available for SGDText.

Option | Description |
---|---|
LNorm | The LNorm to use for document length normalization. |
debug | If set to true, classifier may output additional info to the console. |
epochs | The number of epochs to perform (batch learning). The total number of iterations is epochs * num instances. |
lambda | The regularization constant. (default = 0.0001) |
learningRate | The learning rate. |
lossFunction | The loss function to use. Hinge loss (SVM), log loss (logistic regression) or squared loss (regression). |
lowercaseTokens | Whether to convert all tokens to lowercase. |
minWordFrequency | Ignore any words that don't occur at least min frequency times in the training data. If periodic pruning is turned on, then the dictionary is pruned according to this value. |
norm | The norm of the instances after normalization. |
periodicPruning | How often (number of instances) to prune the dictionary of low frequency terms. 0 means don't prune. Setting a positive integer n means prune after every n instances. |
seed | The random number seed to be used. |
stemmer | The stemming algorithm to use on the words. |
stopwords | The file containing the stopwords (if this is a directory then the default ones are used). |
tokenizer | The tokenizing algorithm to use on the strings. |
useStopList | If true, ignores all words that are on the stoplist. |
useWordFrequencies | Use word frequencies rather than binary bag of words representation. |
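
The text-handling options can be easier to follow with a small example. The plain-Java sketch below shows one plausible reading of `lowercaseTokens`, `useWordFrequencies`, `LNorm`/`norm`, and `minWordFrequency`. It does not call Weka: the class `TextOptionsSketch`, its method names, and the whitespace tokenizer are hypothetical, and the exact behavior inside SGDText may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: hypothetical helpers, not part of Weka.
class TextOptionsSketch {

    // Builds a sparse document vector from whitespace-separated tokens.
    // With useWordFrequencies the value is the token count; otherwise it is 1.0 (binary).
    static Map<String, Double> bagOfWords(String text, boolean lowercaseTokens,
                                          boolean useWordFrequencies) {
        Map<String, Double> doc = new HashMap<>();
        String source = lowercaseTokens ? text.toLowerCase() : text;
        for (String t : source.split("\\s+")) {
            if (t.isEmpty()) {
                continue;
            }
            if (useWordFrequencies) {
                doc.merge(t, 1.0, Double::sum);
            } else {
                doc.put(t, 1.0);
            }
        }
        return doc;
    }

    // Rescales the vector in place so that its L_p length (p = lNorm) equals norm.
    static void normalize(Map<String, Double> doc, double lNorm, double norm) {
        double sum = 0.0;
        for (double v : doc.values()) {
            sum += Math.pow(Math.abs(v), lNorm);
        }
        double length = Math.pow(sum, 1.0 / lNorm);
        if (length == 0.0) {
            return; // empty document: nothing to scale
        }
        final double scale = norm / length;
        doc.replaceAll((token, v) -> v * scale);
    }

    // Drops dictionary entries seen fewer than minWordFrequency times.
    static void prune(Map<String, Integer> dictionary, int minWordFrequency) {
        dictionary.values().removeIf(count -> count < minWordFrequency);
    }

    public static void main(String[] args) {
        Map<String, Double> doc = bagOfWords("Spam spam eggs", true, true);
        normalize(doc, 2.0, 1.0);
        System.out.println(doc);

        Map<String, Integer> dictionary = new HashMap<>();
        dictionary.put("spam", 2);
        dictionary.put("eggs", 1);
        prune(dictionary, 2);
        System.out.println(dictionary.keySet());
    }
}
```

In this sketch, `periodicPruning` would correspond to calling `prune` after every n training documents rather than once at the end.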

## Capabilities

The table below describes the capabilities of SGDText.

Capability | Supported |
---|---|
Class | Binary class, Missing class values |
Attributes | String attributes, Missing values |
Min # of instances | 0 |