Hitachi Vantara Pentaho Community Wiki




Implements stochastic gradient descent for learning a linear binary-class SVM or binary-class logistic regression model on text data. The classifier operates directly on String attributes. Available from Weka 3.7.5.
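The idea can be illustrated with a minimal, self-contained Java sketch. It is not Weka's SGDText source: the class name SgdTextSketch, the regex tokenizer, the binary bag-of-words representation, and the learning rate of 0.1 used in main are all assumptions made for the illustration. Each document becomes a sparse token vector, and each training instance triggers one weight update under either hinge loss or log loss, with L2 shrinkage controlled by lambda.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative SGD learner on bag-of-words text; a sketch, not Weka's SGDText source. */
final class SgdTextSketch {
    static final int HINGE = 0; // linear SVM
    static final int LOG = 1;   // logistic regression

    private final Map<String, Double> weights = new HashMap<>();
    private double bias = 0.0;
    private final double lambda;
    private final double learningRate;
    private final int loss;

    SgdTextSketch(double lambda, double learningRate, int loss) {
        this.lambda = lambda;
        this.learningRate = learningRate;
        this.loss = loss;
    }

    /** Binary bag of words over lowercased tokens (assumed regex tokenizer). */
    static Map<String, Double> bagOfWords(String text) {
        Map<String, Double> doc = new HashMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                doc.put(token, 1.0);
            }
        }
        return doc;
    }

    /** Linear score w.x + b; its sign is the predicted class. */
    double score(Map<String, Double> doc) {
        double s = bias;
        for (Map.Entry<String, Double> e : doc.entrySet()) {
            s += weights.getOrDefault(e.getKey(), 0.0) * e.getValue();
        }
        return s;
    }

    /** One SGD step on a single document; y must be +1 or -1. */
    void update(Map<String, Double> doc, int y) {
        double z = y * score(doc);
        // Negative derivative of the loss with respect to the margin z.
        double g = (loss == HINGE)
                ? (z < 1.0 ? 1.0 : 0.0)
                : 1.0 / (1.0 + Math.exp(z));
        // L2 regularization shrinks every weight by (1 - learningRate * lambda).
        final double shrink = 1.0 - learningRate * lambda;
        weights.replaceAll((token, w) -> w * shrink);
        double step = learningRate * g * y;
        if (step != 0.0) {
            for (Map.Entry<String, Double> e : doc.entrySet()) {
                weights.merge(e.getKey(), step * e.getValue(), Double::sum);
            }
            bias += step;
        }
    }

    /** Makes `epochs` passes over the data; returns the number of iterations (epochs * instances). */
    int train(String[] texts, int[] labels, int epochs) {
        int iterations = 0;
        for (int epoch = 0; epoch < epochs; epoch++) {
            for (int i = 0; i < texts.length; i++) {
                update(bagOfWords(texts[i]), labels[i]);
                iterations++;
            }
        }
        return iterations;
    }

    public static void main(String[] args) {
        String[] texts = {"great film loved it", "terrible film hated it"};
        int[] labels = {1, -1};
        SgdTextSketch model = new SgdTextSketch(0.0001, 0.1, HINGE);
        int iterations = model.train(texts, labels, 10);
        System.out.println("iterations: " + iterations);
        System.out.println("score(loved it): " + model.score(bagOfWords("loved it")));
    }
}
```

A positive score maps to one class and a negative score to the other. Switching the third constructor argument from HINGE to LOG changes only the gradient term, which is why a single learner can cover both model types.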


The table below describes the options available for SGDText.

Option              Description
LNorm               The LNorm to use for document length normalization.
debug               If set to true, the classifier may output additional information to the console.
epochs              The number of epochs to perform (batch learning). The total number of iterations is epochs * number of instances.
lambda              The regularization constant (default = 0.0001).
learningRate        The learning rate.
lossFunction        The loss function to use. Hinge loss (SVM), log loss (logistic regression) or squared loss (regression).
lowercaseTokens     Whether to convert all tokens to lowercase.
minWordFrequency    Ignore any words that do not occur at least minWordFrequency times in the training data. If periodic pruning is turned on, the dictionary is pruned according to this value.
norm                The norm of the instances after normalization.
periodicPruning     How often (in number of instances) to prune low-frequency terms from the dictionary. 0 means do not prune; a positive integer n means prune after every n instances.
seed                The random number seed to be used.
stemmer             The stemming algorithm to use on the words.
stopwords           The file containing the stopwords (if this is a directory, the default stopwords are used).
tokenizer           The tokenizing algorithm to use on the strings.
useStopList         If true, ignores all words that are on the stoplist.
useWordFrequencies  Use word frequencies rather than a binary bag-of-words representation.
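How several of the text-handling options might interact can be sketched in plain Java. This is an illustration under stated assumptions, not Weka's implementation: the class TextOptionsSketch and its method names are hypothetical, tokens are split on whitespace, and normalization is assumed to rescale a document so that its L-p length (p = LNorm) equals norm.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helpers mirroring some SGDText options; not Weka's API. */
final class TextOptionsSketch {

    /** Builds a sparse document vector of token counts or 0/1 indicators. */
    static Map<String, Double> vectorize(String text, boolean lowercaseTokens,
                                         Set<String> stopList, boolean useWordFrequencies) {
        Map<String, Double> doc = new HashMap<>();
        for (String raw : text.split("\\s+")) {
            String token = lowercaseTokens ? raw.toLowerCase() : raw;
            if (token.isEmpty() || stopList.contains(token)) {
                continue; // skip empty tokens and stoplist words
            }
            if (useWordFrequencies) {
                doc.merge(token, 1.0, Double::sum); // count occurrences
            } else {
                doc.put(token, 1.0);                // presence only
            }
        }
        return doc;
    }

    /** Rescales the vector so its L-p length (p = lNorm) equals norm (assumed semantics). */
    static void normalize(Map<String, Double> doc, double lNorm, double norm) {
        double sum = 0.0;
        for (double v : doc.values()) {
            sum += Math.pow(Math.abs(v), lNorm);
        }
        if (sum == 0.0) {
            return; // empty document: nothing to scale
        }
        final double scale = norm / Math.pow(sum, 1.0 / lNorm);
        doc.replaceAll((token, v) -> v * scale);
    }

    /** Drops dictionary entries seen fewer than minWordFrequency times. */
    static void prune(Map<String, Integer> dictionary, int minWordFrequency) {
        dictionary.values().removeIf(count -> count < minWordFrequency);
    }

    /** Prunes only when periodicPruning > 0 and instancesSeen is a multiple of it. */
    static boolean maybePrune(Map<String, Integer> dictionary, int instancesSeen,
                              int periodicPruning, int minWordFrequency) {
        if (periodicPruning > 0 && instancesSeen % periodicPruning == 0) {
            prune(dictionary, minWordFrequency);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> stopList = new HashSet<>(Arrays.asList("the"));
        Map<String, Double> doc = vectorize("The cat saw the Cat", true, stopList, true);
        normalize(doc, 2.0, 1.0);
        System.out.println(doc);
    }
}
```

Pruning keeps the dictionary bounded when the stream of documents is long, at the cost of discarding rare terms that might later have become frequent.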


The table below describes the capabilities of SGDText.

Capability          Supported
Class               Binary class, Missing class values
Attributes          String attributes, Missing values
Min # of instances  0
