Hitachi Vantara Pentaho Community Wiki

Description

The Get File Names step allows you to get information associated with file names on the file system. The retrieved file names are added as rows onto the stream.

The output fields for this step are:

  • filename - the complete filename, including the path (/tmp/kettle/somefile.txt)
  • short_filename - only the filename, without the path (somefile.txt)
  • path - only the path (/tmp/kettle/)
  • type
  • exists
  • ishidden
  • isreadable
  • iswriteable
  • lastmodifiedtime
  • size
  • extension
  • uri
  • rooturi
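
As a rough illustration of how the first three fields relate to each other, the following Python sketch splits a complete filename the way the examples above describe. It is not the step's implementation; the helper name describe is hypothetical, and reporting extension without the leading dot is an assumption.

```python
import posixpath


def describe(filename: str) -> dict:
    """Split a complete filename into filename, short_filename, path and extension."""
    directory, short_filename = posixpath.split(filename)
    # The documented example shows the path with a trailing separator.
    if directory == "" or directory.endswith("/"):
        path = directory
    else:
        path = directory + "/"
    # Assumption: the extension is reported without the leading dot.
    extension = posixpath.splitext(short_filename)[1].lstrip(".")
    return {
        "filename": filename,
        "short_filename": short_filename,
        "path": path,
        "extension": extension,
    }


row = describe("/tmp/kettle/somefile.txt")
print(row["short_filename"], row["path"], row["extension"])
```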

Note: If no files are found, the step (and the transformation) does not abort. To abort the transformation in that case, you could use a Detect empty stream step with some logic (see the attached example GetFileNamesAbortExample.ktr). It is also possible to check for missing files and abort within a job by using the Checks if files exist job entry.

File tab

This tab defines the location of the files whose names you want to retrieve. For more information about specifying file locations, see the section "Selecting file using Regular Expressions" in the Text File Input step documentation.

Example: Suppose you have a static directory, c:\temp, where you expect files with a .dat extension to be placed. Under file/directory you would specify c:\temp, and under Wildcard you would enter a regular expression such as .*\.dat$
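
To see which names a wildcard like this selects, here is a minimal Python sketch. It assumes the expression must match the whole file name (without the path) and is case-sensitive; the step itself evaluates the wildcard with its own regular expression engine, so treat this only as a way to reason about the pattern. The sample file names are made up.

```python
import re

WILDCARD = r".*\.dat$"  # the pattern from the example above


def matches(short_filename: str, wildcard: str = WILDCARD) -> bool:
    """Return True when the whole file name matches the wildcard expression."""
    return re.fullmatch(wildcard, short_filename) is not None


# Hypothetical directory listing used only for illustration.
names = ["orders.dat", "orders.dat.bak", "orders.data", "ordersdat", "ORDERS.DAT"]
selected = [name for name in names if matches(name)]
print(selected)
```

The escaped dot (\.) is what keeps a name such as ordersdat from being picked up; an unescaped dot would match any character.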

Filters

The Filters tab lets you restrict the retrieved entries to one of the following:

  • All files and folders
  • Files only
  • Folders only

It also gives you:

  • The ability to include a row number in the output
  • The ability to limit the number of rows returned
  • The ability to add the filename(s) to the result list
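
The filter type, row number and row limit options can be pictured with the Python sketch below. It is only a model of the behaviour described above, not the step's code: the function name get_file_names, the values "all", "files" and "folders", the rownum field name, row numbers starting at 1, and a limit of 0 meaning "no limit" are all assumptions made for the illustration.

```python
import os


def get_file_names(directory, file_type="all", limit=0, include_rownum=False):
    """List entries of a directory as rows, filtered by type and optionally limited."""
    with os.scandir(directory) as it:
        entries = sorted(it, key=lambda e: e.name)  # sorted for a stable order

    rows = []
    for entry in entries:
        if file_type == "files" and not entry.is_file():
            continue  # "Files only": skip folders
        if file_type == "folders" and not entry.is_dir():
            continue  # "Folders only": skip files
        rows.append({
            "filename": entry.path,
            "type": "folder" if entry.is_dir() else "file",  # assumed values
        })

    if limit > 0:  # assumption: 0 means no limit
        rows = rows[:limit]
    if include_rownum:
        for number, row in enumerate(rows, start=1):
            row["rownum"] = number
    return rows
```

For example, get_file_names("/tmp/kettle", file_type="files", limit=10, include_rownum=True) would return at most ten rows, one per file, each carrying a rownum value.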