Added by Matt Casters, last edited by Jem Matzan on Feb 05, 2009

Description

The Get File Names step retrieves file names, and information about the corresponding files, from the file system. Each retrieved file name is added as a row to the stream.

The output fields for this step are:

  • filename - the complete filename, including the path (/tmp/kettle/somefile.txt)
  • short_filename - only the filename, without the path (somefile.txt)
  • path - only the path (/tmp/kettle/)
  • type
  • exists
  • ishidden
  • isreadable
  • iswriteable
  • lastmodifiedtime
  • size
  • extension
  • uri
  • rooturi
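
To make the first three fields concrete, here is a minimal Java sketch that splits the example path into filename, short_filename, path and extension. It uses plain string handling and is not how the step itself is implemented. The class name FileNameFieldsSketch, the method describe, and the rule that the extension is whatever follows the last dot are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: plain string handling, not the step's own implementation.
class FileNameFieldsSketch {

    // Splits a forward-slash path into some of the documented output fields.
    static Map<String, String> describe(String filename) {
        int slash = filename.lastIndexOf('/');
        String shortFilename = filename.substring(slash + 1);
        String path = filename.substring(0, slash + 1); // keeps the trailing slash

        int dot = shortFilename.lastIndexOf('.');
        // Assumption: the extension is the text after the last dot, or empty if there is none.
        String extension = dot >= 0 ? shortFilename.substring(dot + 1) : "";

        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("filename", filename);
        fields.put("short_filename", shortFilename);
        fields.put("path", path);
        fields.put("extension", extension);
        return fields;
    }

    public static void main(String[] args) {
        // Prints one "field = value" line per entry, e.g. "short_filename = somefile.txt".
        describe("/tmp/kettle/somefile.txt")
            .forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```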

File tab

This tab defines the location of the files you want to retrieve filenames for. For more information about specifying file locations, see Selecting Files to read data from.

The "Selecting Files to read data from" page referred to above doesn't appear to exist on this site (at least, I was unable to find it). In its absence, note that the "Wildcard" field does not take the kind of wildcard you would normally use for directory listings in Unix or Windows (e.g. a * to represent all files). What it expects is a regular expression, as understood by java.util.regex. For example, to get the names of all files in a directory, you could use .+ in the Wildcard field. For full details of regular expression syntax, see http://java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Pattern.html
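
The difference between a shell wildcard and a regular expression can be checked directly against java.util.regex. The sketch below is only an illustration: the class WildcardIsRegexSketch and its two helper methods are hypothetical, and it assumes the pattern has to match the whole file name, which this page does not state.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class WildcardIsRegexSketch {

    // Returns true if the text compiles as a java.util.regex pattern.
    static boolean isValidRegex(String wildcard) {
        try {
            Pattern.compile(wildcard);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Assumption of this sketch: the whole file name must match the pattern.
    static boolean matchesName(String wildcard, String shortFilename) {
        return Pattern.matches(wildcard, shortFilename);
    }

    public static void main(String[] args) {
        System.out.println(isValidRegex("*"));                  // false: a bare * has nothing to repeat
        System.out.println(isValidRegex("*.txt"));              // false for the same reason
        System.out.println(isValidRegex(".+"));                 // true
        System.out.println(matchesName(".+", "somefile.txt"));  // true: one or more of any character
    }
}
```

In other words, the shell pattern * roughly corresponds to the regular expression .+ (or .* if empty names should match too).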

Filters

The Filters tab lets you restrict the retrieved file names to one of the following:

  • All files and folders
  • Files only
  • Folders only

It also gives you:

  • The ability to include a row number in the output
  • The ability to limit the number of rows returned
  • The ability to add the filename(s) to the result list
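
As a rough model of how the type filter, the row limit and the row number interact, consider the following Java sketch. It works on an in-memory list rather than a real directory, and everything in it is hypothetical: the names FilterOptionsSketch, TypeFilter and Entry, the choice to treat a limit of 0 as "no limit", and the order in which the options are applied are assumptions, not documented behaviour of the step.

```java
import java.util.ArrayList;
import java.util.List;

class FilterOptionsSketch {

    enum TypeFilter { ALL_FILES_AND_FOLDERS, FILES_ONLY, FOLDERS_ONLY }

    // Hypothetical stand-in for one directory entry.
    static final class Entry {
        final String name;
        final boolean folder;

        Entry(String name, boolean folder) {
            this.name = name;
            this.folder = folder;
        }
    }

    // Keeps the entries that pass the type filter, stops once `limit` rows
    // have been produced (0 means no limit in this sketch), and prefixes
    // each kept name with a 1-based row number.
    static List<String> rows(List<Entry> entries, TypeFilter filter, int limit) {
        List<String> rows = new ArrayList<>();
        for (Entry e : entries) {
            boolean keep = filter == TypeFilter.ALL_FILES_AND_FOLDERS
                    || (filter == TypeFilter.FILES_ONLY && !e.folder)
                    || (filter == TypeFilter.FOLDERS_ONLY && e.folder);
            if (!keep) {
                continue;
            }
            if (limit > 0 && rows.size() >= limit) {
                break;
            }
            rows.add((rows.size() + 1) + " " + e.name);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Entry> entries = new ArrayList<>();
        entries.add(new Entry("a.dat", false));
        entries.add(new Entry("archive", true));
        entries.add(new Entry("b.dat", false));

        System.out.println(rows(entries, TypeFilter.FILES_ONLY, 0));            // [1 a.dat, 2 b.dat]
        System.out.println(rows(entries, TypeFilter.FOLDERS_ONLY, 0));          // [1 archive]
        System.out.println(rows(entries, TypeFilter.ALL_FILES_AND_FOLDERS, 2)); // [1 a.dat, 2 archive]
    }
}
```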

As another jump start for beginners, let's say you have a static directory of c:\temp where you expect files with an extension of .dat to be placed. Under file/directory you would specify c:\temp, and under Wildcard you would enter a regular expression such as .*\.dat$

Comment: Posted by Darrin Blocker at Aug 13, 2010 12:53
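
To see what the .*\.dat$ pattern from the c:\temp example selects, and why the backslash before the dot matters, here is a small Java sketch. The class DatWildcardSketch, the method select and the sample file names are made up for illustration, and the sketch assumes the pattern is matched against the whole file name without its path. Note that inside a Java string literal the backslash is doubled, whereas in the Wildcard field you type it once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class DatWildcardSketch {

    // Returns the names that the regular expression matches in full.
    static List<String> select(String wildcard, List<String> names) {
        Pattern pattern = Pattern.compile(wildcard);
        List<String> selected = new ArrayList<>();
        for (String name : names) {
            if (pattern.matcher(name).matches()) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Hypothetical directory contents.
        List<String> names = Arrays.asList(
                "orders.dat", "orders.dat.bak", "ORDERS.DAT", "ordersdat");

        // Escaped dot: only names that really end in ".dat".
        System.out.println(select(".*\\.dat$", names));      // [orders.dat]

        // Unescaped dot matches any character, so "ordersdat" slips through.
        System.out.println(select(".*.dat$", names));        // [orders.dat, ordersdat]

        // The (?i) flag makes the match case-insensitive.
        System.out.println(select("(?i).*\\.dat$", names));  // [orders.dat, ORDERS.DAT]
    }
}
```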