The source data for this guide resides in a Hive table called weblogs. If you have already completed the Loading Data into MapR Hive guide, you can skip to Create a Database Connection to Hive. Otherwise, you will need the sample data file below and must complete the Create a Hive Table instructions before proceeding.
The sample data file needed for the Create a Hive Table instructions is:
NOTE: This task may be skipped if you have completed the Loading Data into MapR Hive guide.
- Open the Hive Shell: Enter 'hive' at the command line to open the Hive shell, where you will create the table manually.
- Create the Table in Hive: You need a Hive table to load the data into, so enter the following in the Hive shell.
create table weblogs ( client_ip string, full_request_date string, day string, month string, month_num int, year string, hour string, minute string, second string, timezone string, http_verb string, uri string, http_status_code string, bytes_returned string, referrer string, user_agent string) row format delimited fields terminated by '\t';
- Close the Hive Shell: You are done with the Hive Shell for now, so close it by entering 'quit;' in the Hive Shell.
- Load the Table: Load the Hive table by running the following command:
hadoop fs -cp /weblogs/parse/part-00000 /user/hive/warehouse/weblogs/
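Because the weblogs table declares 16 tab-delimited columns, it can be worth checking the data file for rows with a different field count before querying; Hive generally does not reject such rows at load time, and missing fields tend to show up as NULL values later. The following is a minimal sketch, not part of the original guide: the helper name count_bad_rows is hypothetical, and it assumes a POSIX shell with awk available.

```shell
# Hypothetical helper (not part of the guide): count rows whose number of
# tab-separated fields differs from the 16 columns declared for weblogs.
# Reads the files given as arguments, or standard input if none are given.
count_bad_rows() {
  awk -F'\t' 'NF != 16 { bad++ } END { print bad + 0 }' "$@"
}

# Possible usage, assuming the file is at the path used in the load step:
#   hadoop fs -cat /weblogs/parse/part-00000 | count_bad_rows
```

A result of 0 suggests every row matches the column count of the table definition; any other number is the count of rows worth inspecting.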