R AE API Reference IBM Netezza Analytics
IBM® Netezza® Analytics
Release 3.0.2.0

R AE API Reference

Note: Before using this information and the product that it supports, read the information in "Notices and Trademarks" on page 40.

© Copyright IBM Corporation 2011, 2014.
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

Part Number D20713-04 Rev. 1

Contents

Preface
  Audience for This Guide
  Purpose of This Guide
  Conventions
  If You Need Help
  Comments on the Documentation
1 Module Documentation
  Initialization APIs
    Modules
    Detailed Description
    Local Initialization
    Remote Connection Point
    Remote Initialization
    High Level initialization
  Data Connection APIs
    Functions
    Modules
    Detailed Description
    Function Documentation
    Function
    Aggregate API
    Shaper and Sizer API
  Data Type Support
    Functions
    Enumerations
    Detailed Description
    Function Documentation
    Enumeration Type Documentation
  Support APIs
    Modules
    Detailed Description
    Date and Time Functions
    Runtime and Environment Information
    Utilities
  Working Modes
    Functions
    Detailed Description
    Function Documentation
Notices and Trademarks
  Notices
  Trademarks
  Regulatory and Compliance
    Regulatory Notices
    Homologation Statement
    FCC - Industry Canada Statement
    CE Statement (Europe)
    VCCI Statement
Index

Preface

This guide provides an API reference for R AE programmers.

Audience for This Guide

The R AE API Reference is written for programmers who intend to create Analytic Executables for IBM Netezza Analytics using the R language. This guide does not provide a tutorial on AE concepts. More information about AEs can be found in the User-Defined Analytic Process Developer's Guide.

Purpose of This Guide

This guide describes the R AE API, which is a language adapter provided as part of IBM Netezza Analytics. The R AE API provides programmatic access to the AE interface for R programmers. This interface package is named nzrserver, and provides the server side functionality of the connectivity tools for running R on the NPS.

Downloading, installing, and working with Open Source R and all other required packages is subject to the terms and conditions that are mentioned in the appropriate license files of those packages.

Conventions

The following conventions apply:
► In the technical literature, both the guides and reference guides, the term "Analytic Executable" or "AE" is used. In marketing materials, the term "User-Defined Analytic Process" or "UDAP" is used. The terms User-Defined Analytic Process and UDAP are synonymous with the terms Analytic Executable and AE.
► Italics for emphasis on terms and user-defined values such as user input.
► Upper case for SQL commands, for example, INSERT or DELETE.
► Bold for command line input, for example, nzsystem stop.
► Bold to denote parameter names, argument names, or other named references.
► Angle brackets ( < > ) to indicate a placeholder (variable) that should be replaced with actual text, for example, nzmat <- nz.matrix("<matrix_name>")
► In code samples, a single backslash ("\") at the end of a line denotes a line continuation and should be omitted when using the code at the command line, a SQL command or in a file.
► When referencing a sequence of menu and submenu selections, the ">" character denotes the different menu options in the form: "Menu Name > Submenu Name > Selection". Note that not all commands use submenus, while some selections may utilize a number of nested submenus.

If You Need Help

If you are having trouble using the IBM Netezza appliance, IBM Netezza Analytics or any of its components:
1. Retry the action, carefully following the instructions in the documentation.
2. Go to the IBM Support Portal at http://www.ibm.com/support. Log in using your IBM ID and password. You can search the Support Portal for solutions. To submit a support request, click the 'Service Requests & PMRs' tab.
3. If you have an active service contract maintenance agreement with IBM, you can contact customer support teams via telephone. For individual countries, please visit the Technical Support section of the IBM Directory of worldwide contacts http://www14.software.ibm.com/webapp/set2/sas/f/handbook/contacts.html#phone.

Comments on the Documentation

We welcome any questions, comments, or suggestions that you have for the IBM Netezza documentation.
Please send us an e-mail message at [email protected] and include the following information:
► The name and version of the manual that you are using
► Any comments that you have about the manual
► Your name, address, and phone number
We appreciate your comments.

CHAPTER 1
Module Documentation

Initialization APIs

This API family is used to get an open data connection.

Modules
► Local Initialization
  Initialization functions related to Local AEs. Local AEs are initialized using the function createLocalConnection. If an AE is local, the function isLocal returns TRUE. If isLocal returns FALSE, the AE is remote.
► Remote Connection Point
  A Remote Connection Point is used to address a Remote AE from the NPS software.
► Remote Initialization
  Initialization functions related to Remote AEs. 1) Create a connection point. 2) Listen using that connection point. 3) Accept a Data Connection API handle.
► High Level initialization
  Used to run both local and remote initialization.

Detailed Description

This API family is used to get an open data connection. All data structures created and used by these functions are completely internal and cannot be accessed from R.

The dispatcher function is the top-level abstraction for the whole R AE. By default, it is called just after loading the nzrserver package. It calls handleConnection directly and handles possible errors.

handleConnection implements the local and remote connection creation and the remote connection loop. It then calls runWrapper.

runWrapper decides, based on the API type, what user-provided data is required, and finally calls the user function.

Local Initialization

Initialization functions related to Local AEs. Local AEs are initialized using the function createLocalConnection. If an AE is local, the function isLocal returns TRUE. If isLocal returns FALSE, the AE is remote.

Functions
► closeLocalConnection
  Closes the local connection.
► createLocalConnection
  Creates a local connection.
► isLocal
  Returns a TRUE value if the AE is local.

Detailed Description

Initialization functions related to Local AEs. Local AEs are initialized using the function createLocalConnection. If an AE is local, the function isLocal returns TRUE. If isLocal returns FALSE, the AE is remote.

Function Documentation
► closeLocalConnection()
  Closes the local connection.
  ▲ Returns
    A logical value indicating whether the invocation finished correctly. If any error occurs, the Rf_error function is called, which results in premature exit and an appropriate message returned to the user.
► createLocalConnection()
  Creates a local connection.
  ▲ Returns
    A logical value indicating whether the invocation finished correctly. If any error occurs, the Rf_error function is called, which results in premature exit and an appropriate message returned to the user.
► isLocal()
  Returns a TRUE value if the AE is local.
  ▲ Returns
    A logical value. TRUE indicates local mode.
  An AE can be started as Local or Remote. This function can be used to determine the mode at runtime. The life cycle of a local process is controlled by the NPS software.

Remote Connection Point

A Remote Connection Point is used to address a Remote AE from the NPS software.

Functions
► createConnectionPoint
  Creates a connection point.

Detailed Description

A Remote Connection Point is used to address a Remote AE from the NPS software.

Function Documentation
► createConnectionPoint(name=NULL, dataslice=TRUE, transaction=TRUE, session=TRUE)
  Creates a connection point.
  ▲ Parameters
    ► name
      The connection point name. If NULL, the value of the NZAE_REMOTE_NAME environment variable is used. If the parameter value is NULL and the environment variable is not set, an error is reported.
    ► dataslice
      Specifies whether the dataslice ID should be used to define the connection point name.
    ► transaction
      Specifies whether the transaction ID should be used to define the connection point name.
    ► session
      Specifies whether the session ID should be used to define the connection point name.
  ▲ Returns
    A logical value indicating whether the invocation finished correctly. If any error occurs, the Rf_error function is called, which results in premature exit and an appropriate message returned to the user.

Remote Initialization

Initialization functions related to Remote AEs. 1) Create a connection point. 2) Listen using that connection point. 3) Accept a Data Connection API handle.

Functions
► closeRemoteConnection
  Closes a remote connection.
► createRemoteConnection
  Creates a remote connection.

Detailed Description

Initialization functions related to Remote AEs. 1) Create a connection point. 2) Listen using that connection point. 3) Accept a Data Connection API handle.

Function Documentation
► closeRemoteConnection()
  Closes a remote connection.
  ▲ Returns
    A logical value indicating whether the invocation finished correctly. If any error occurs, the Rf_error function is called, which results in premature exit and an appropriate message returned to the user.
► createRemoteConnection()
  Creates a remote connection.
  ▲ Returns
    A logical value indicating whether the invocation finished correctly. If any error occurs, the Rf_error function is called, which results in premature exit and an appropriate message returned to the user.

High Level initialization

Used to run both local and remote initialization.

Functions
► dispatcher
  Top Level abstraction.
► handleConnection
  Handle connection.
► runWrapper
  Determines the API type and runs the user function.

Detailed Description

Used to run both local and remote initialization.

Function Documentation
► dispatcher()
  Top Level abstraction.
  ▲ Returns
    A logical value indicating whether the invocation finished correctly. If any error occurs, the Rf_error function is called, which results in premature exit and an appropriate message returned to the user.
  The dispatcher function is the top-level abstraction for the entire R AE. By default, it is called just after loading the nzrserver package. It calls handleConnection directly and handles possible errors.
► handleConnection()
  Handle connection.
  ▲ Returns
    A logical value indicating whether the invocation finished correctly. If any error occurs, the Rf_error function is called, which results in premature exit and an appropriate message returned to the user.
  The handleConnection function implements the local and remote connection creation and the remote connection loop. It then calls runWrapper.
► runWrapper()
  Determines the API type and runs the user function.
  ▲ Returns
    A logical value indicating whether the invocation finished correctly. If any error occurs, the Rf_error function is called, which results in premature exit and an appropriate message returned to the user.
  The runWrapper function determines, based on the API type, what user-provided data is required. It then calls the user function.

Data Connection APIs

This API family is used to process data after a data connection has been opened. This involves running one of the three types of API functions.

Functions
► getApiType
  Gets the API type.

Modules
► Function
  Function AEs are called from Scalar or Table SQL Functions.
► Aggregate API
  Aggregate AEs are called from Aggregate SQL Functions. Apart from the functions described in this section, aggregates also use some of the functions from the Function API section.
► Shaper and Sizer API
  Shapers can be optionally called for Table Function AEs. Sizers can be optionally called for Scalar Function AEs.

Detailed Description

This API family is used to process data after a data connection has been opened. This involves running one of the three types of API functions.
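As a rough illustration of how user code might branch on the three API types, the following R sketch calls getApiType and compares the result with the NZ.API.FUN, NZ.API.AGG, and NZ.API.SHP identifiers documented for this API family. It is a sketch under stated assumptions, not part of the nzrserver package: the stub definitions exist only so that the code runs outside the NPS host, their numeric values are placeholders, and the describeApiType helper is hypothetical.

```r
# Stand-ins so the sketch runs outside the NPS host. Inside a real AE the
# nzrserver package supplies getApiType and the NZ.API.* identifiers; the
# numeric values below are placeholders, not the package's actual values.
if (!exists("getApiType")) {
  NZ.API.FUN <- 1L
  NZ.API.AGG <- 2L
  NZ.API.SHP <- 3L
  getApiType <- function() NZ.API.FUN
}

# Hypothetical helper: map the API type to a label describing the kind of
# processing the AE is expected to perform.
describeApiType <- function(apiType = getApiType()) {
  if (apiType == NZ.API.FUN) {
    "function"
  } else if (apiType == NZ.API.AGG) {
    "aggregate"
  } else if (apiType == NZ.API.SHP) {
    "shaper/sizer"
  } else {
    stop("unexpected API type: ", apiType)
  }
}

print(describeApiType())
```

In a real AE, each branch would call the functions of the matching module (Function, Aggregate API, or Shaper and Sizer API) instead of returning a label.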
Function Documentation
► getApiType()
  Gets the API type.
  ▲ Returns
    An integer equal to one of NZ.API.FUN, NZ.API.AGG, or NZ.API.SHP.
  The AE can be started in one of three modes: as a function/table function (NZ.API.FUN), as an aggregate (NZ.API.AGG), or as a shaper/sizer (NZ.API.SHP). This function can be used to determine the API type at runtime.

Function

Function AEs are called from Scalar or Table SQL Functions.

Functions
► getInputColumn
  Gets the input column.
► getNext
  Gets the next row of data.
► inputColumnCount
  Gets the input column count.
► outputResult
  Sends an output row to the NPS software.
► setOutput
  Sets the output column value.
► setOutputBool
  Sets the output column value to a value of type Boolean.
► setOutputDate
  Sets the output column value to a value of type Date.
► setOutputDouble
  Sets the output column value to a value of type Double.
► setOutputFloat
  Sets the output column value to a value of type Float.
► setOutputInt16
  Sets the output column value to a value of type Int16.
► setOutputInt32
  Sets the output column value to a value of type Int32.
► setOutputInt64
  Sets the output column value to a value of type Int64.
► setOutputInt8
  Sets the output column value to a value of type Int8.
► setOutputInterval
  Sets the output column value to a value of type Interval.
► setOutputNull
  Sets the output column value to NULL.
► setOutputString
  Sets the output column value to a value of type String.
► setOutputTime
  Sets the output column value to a value of type Time.
► setOutputTimeFromR
  Sets the output column value to a value of type Time.
► setOutputTimeStamp
  Sets the output column value to a value of type Timestamp.
► setOutputTimeTz
  Sets the output column value to a value of type TimeTz.

Detailed Description

Function AEs are called from Scalar or Table SQL Functions.

Function Documentation
► getInputColumn(index)
  Gets the input column.
  ▲ Parameters
    ► index
      The input column index, as an integer.
  ▲ Returns
    One of: numeric, character, logical, or integer.
  Returns the value of the column specified by index. Data is cast to the R data type that is closest to the actual Netezza data type.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
  This function may also be used for Aggregates.
► getNext()
  Gets the next row of data.
  ▲ Returns
    Logical TRUE if no error occurred.
  Informs the NPS software that it should send the next row of data. This function must be called before processing the first row of data. It must also be called before outputting any data.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
► inputColumnCount()
  Gets the input column count.
  ▲ Returns
    The column count as an integer.
  Returns the number of input columns.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
  This function may also be used for Aggregates.
► outputResult()
  Sends an output row to the NPS software.
  ▲ Returns
    Logical TRUE if no error occurred.
  After the output columns' values are set using the various setOutput functions, the last step is to call outputResult. This function sends the values to the NPS software. It can be called many times if the R AE is executed as a Table Function.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
► setOutput(index, value)
  Sets the output column value.
  ▲ Parameters
    ► index
      The input or output column index, as an integer.
    ► value
      A type value; basic type checking and casting is performed before passing data to the NPS software. If casting is impossible, an error is reported. The actual value constraints, for example, the valid number of seconds, depend on the specific function.
  ▲ Returns
    Logical TRUE if no error occurred.
  Sets the value of the specified output column; the value is cast to the Netezza data type closest to the type of the value parameter.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
  This function may also be used for Aggregates.
► setOutputBool(index, value)
  Sets the output column value to a value of type Boolean.
  ▲ Parameters
    ► index
      The input or output column index, as an integer.
    ► value
      A type value; basic type checking and casting is performed before passing data to the NPS software. If casting is impossible, an error is reported.
  ▲ Returns
    Logical TRUE if no error occurred.
  Sets the value of the specified output column; the value is cast to the Netezza data type closest to the type of the value parameter.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
  This function may also be used for Aggregates.
► setOutputDate(index, value)
  Sets the output column value to a value of type Date.
  ▲ Parameters
    ► index
      The input or output column index, as an integer.
    ► value
      A type value; basic type checking and casting is performed before passing data to the NPS software. If casting is impossible, an error is reported.
  ▲ Returns
    Logical TRUE if no error occurred.
  Sets the value of the specified output column; the value is cast to the Netezza data type closest to the type of the value parameter.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
  This function may also be used for Aggregates.
  ▲ See Also
    ► Date and Time Functions
► setOutputDouble(index, value)
  Sets the output column value to a value of type Double.
  ▲ Parameters
    ► index
      The input or output column index, as an integer.
    ► value
      A type value; basic type checking and casting is performed before passing data to the NPS software. If casting is impossible, an error is reported.
  ▲ Returns
    Logical TRUE if no error occurred.
  Sets the value of the specified output column; the value is cast to the Netezza data type closest to the type of the value parameter.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
  This function may also be used for Aggregates.
► setOutputFloat(index, value)
  Sets the output column value to a value of type Float.
  ▲ Parameters
    ► index
      The input or output column index, as an integer.
    ► value
      A type value; basic type checking and casting is performed before passing data to the NPS software. If casting is impossible, an error is reported.
  ▲ Returns
    Logical TRUE if no error occurred.
  Sets the value of the specified output column; the value is cast to the Netezza data type closest to the type of the value parameter.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
  This function may also be used for Aggregates.
► setOutputInt16(index, value)
  Sets the output column value to a value of type Int16.
  ▲ Parameters
    ► index
      The input or output column index, as an integer.
    ► value
      A type value; basic type checking and casting is performed before passing data to the NPS software. If casting is impossible, an error is reported.
  ▲ Returns
    Logical TRUE if no error occurred.
  Sets the value of the specified output column; the value is cast to the Netezza data type closest to the type of the value parameter.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
  This function may also be used for Aggregates.
► setOutputInt32(index, value)
  Sets the output column value to a value of type Int32.
  ▲ Parameters
    ► index
      The input or output column index, as an integer.
    ► value
      A type value; basic type checking and casting is performed before passing data to the NPS software. If casting is impossible, an error is reported.
  ▲ Returns
    Logical TRUE if no error occurred.
  Sets the value of the specified output column; the value is cast to the Netezza data type closest to the type of the value parameter.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
  This function may also be used for Aggregates.
► setOutputInt64(index, value)
  Sets the output column value to a value of type Int64.
  ▲ Parameters
    ► index
      The input or output column index, as an integer.
    ► value
      A type value; basic type checking and casting is performed before passing data to the NPS software. If casting is impossible, an error is reported.
  ▲ Returns
    Logical TRUE if no error occurred.
  Sets the value of the specified output column; the value is cast to the Netezza data type closest to the type of the value parameter.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
  This function may also be used for Aggregates.
► setOutputInt8(index, value)
  Sets the output column value to a value of type Int8.
  ▲ Parameters
    ► index
      The input or output column index, as an integer.
    ► value
      A type value; basic type checking and casting is performed before passing data to the NPS software. If casting is impossible, an error is reported.
  ▲ Returns
    Logical TRUE if no error occurred.
  Sets the value of the specified output column; the value is cast to the Netezza data type closest to the type of the value parameter.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
  This function may also be used for Aggregates.
► setOutputInterval(index, tm, mon)
  Sets the output column value to a value of type Interval.
  ▲ Parameters
    ► index
      The input or output column index, as an integer.
    ► tm
      The time value; exact constraints depend on the specific function.
    ► mon
      The number of months.
  ▲ Returns
    Logical TRUE if no error occurred.
  Sets the value of the specified output column; the value is cast to the Netezza data type closest to the type of the value parameter.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
  This function may also be used for Aggregates.
  ▲ See Also
    ► Date and Time Functions
► setOutputNull(index)
  Sets the output column value to NULL.
  ▲ Parameters
    ► index
      The input or output column index, as an integer.
  ▲ Returns
    Logical TRUE if no error occurred.
  Sets the value of the specified output column to NULL.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
  This function may also be used for Aggregates.
► setOutputString(index, value)
  Sets the output column value to a value of type String.
  ▲ Parameters
    ► index
      The input or output column index, as an integer.
    ► value
      A type value; basic type checking and casting is performed before passing data to the NPS software. If casting is impossible, an error is reported.
  ▲ Returns
    Logical TRUE if no error occurred.
  Sets the value of the specified output column; the value is cast to the Netezza data type closest to the type of the value parameter.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
  This function may also be used for Aggregates.
► setOutputTime(index, tm)
  Sets the output column value to a value of type Time.
  ▲ Parameters
    ► index
      The input or output column index, as an integer.
    ► tm
      The time value; exact constraints depend on the specific function.
  ▲ Returns
    Logical TRUE if no error occurred.
  Sets the value of the specified output column; the value is cast to the Netezza data type closest to the type of the value parameter.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
  This function may also be used for Aggregates.
  ▲ See Also
    ► Date and Time Functions
► setOutputTimeFromR(index, seconds=NULL, milliseconds=NULL)
  Sets the output column value to a value of type Time.
  ▲ Parameters
    ► index
      The input or output column index, as an integer.
    ► seconds
      The number of seconds.
    ► milliseconds
      The number of milliseconds.
  ▲ Returns
    Logical TRUE if no error occurred.
  Sets the value of the specified output column; the value is cast to the Netezza data type closest to the type of the value parameter.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
  This function may also be used for Aggregates.
  ▲ See Also
    ► Date and Time Functions
► setOutputTimeStamp(index, value)
  Sets the output column value to a value of type Timestamp.
  ▲ Parameters
    ► index
      The input or output column index, as an integer.
    ► value
      A type value; basic type checking and casting is performed before passing data to the NPS software. If casting is impossible, an error is reported.
  ▲ Returns
    Logical TRUE if no error occurred.
  Sets the value of the specified output column; the value is cast to the Netezza data type closest to the type of the value parameter.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
  This function may also be used for Aggregates.
  ▲ See Also
    ► Date and Time Functions
► setOutputTimeTz(index, tm, off)
  Sets the output column value to a value of type TimeTz.
  ▲ Parameters
    ► index
      The input or output column index, as an integer.
    ► tm
      The time value; exact constraints depend on the specific function.
    ► off
      The offset.
  ▲ Returns
    Logical TRUE if no error occurred.
  Sets the value of the specified output column; the value is cast to the Netezza data type closest to the type of the value parameter.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
  This function may also be used for Aggregates.
  ▲ See Also
    ► Date and Time Functions

Aggregate API

Aggregate AEs are called from Aggregate SQL Functions. Apart from the functions described in this section, aggregates also use some of the functions from the Function API section.

Functions
► getOutputColumn
  Gets the output column.
► getState
  Returns the state type.
► outputColumnCount
  Gets the output column count.

Detailed Description

Aggregate AEs are called from Aggregate SQL Functions. Apart from the functions described in this section, aggregates also use some of the functions from the Function API section.

Function Documentation
► getOutputColumn(index)
  Gets the output column.
  ▲ Parameters
    ► index
      The input column index, as an integer.
  ▲ Returns
    One of: numeric, character, logical, integer.
  Returns the value of the column specified by index. Data is cast to the R data type that is closest to the actual Netezza data type.
  The value that getOutputColumn returns depends on the state. In the INITIALIZE, ACCUMULATE, and MERGE states, it returns the value of the state variable. In the FINAL_RESULT state, it returns the value of the result column.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
► getState()
  Returns the state type.
  ▲ Returns
    An integer identifying the state type.
  Returns the state identifier, one of: NZ.INIT, NZ.ACCUM, NZ.MERGE, NZ.FINAL. The respective variables are set when the nzrserver package is loaded.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
► outputColumnCount()
  Gets the output column count.
  ▲ Returns
    The column count as an integer.
  Returns the number of output columns.
  If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.

Shaper and Sizer API

Shapers can be optionally called for Table Function AEs.
Sizers can be optionally called for Scalar Function AEs.

Functions
► addOutputColumn
    Adds a non-character or numeric column to the output.
► addOutputColumnString
    Adds a Character column to the output.
► getInputColumnInfo
    Returns the details for input columns.
► getOutputColumnInfo
    Returns details for the output column.
► getOutputColumnName
    Returns the output column name.
► getUdfReturnType
    Returns the output data type.
► oneOutputRowRestriction
    Specifies whether the output row is restricted to one output row per input row.
► systemCatalogIsUpper
► updateInfo
    Updates the NPS software with the output signature.

Detailed Description
Shapers can be optionally called for Table Function AEs. Sizers can be optionally called for Scalar Function AEs.
NOTE: Numeric output is not supported.

Function Documentation
► addOutputColumn(tp, nm)
Adds a non-character or numeric column to the output.
▲ Parameters
► tp
    The output column type.
► nm
    The output column name.
▲ Returns
TRUE when completed successfully or FALSE if an error occurs. If an error occurs, it is also reported via the R error handling mechanism.
Data type identifiers are set when the nzrserver package is loaded and are: NZ.FIXED, NZ.VARIABLE, NZ.NATIONAL_FIXED, NZ.NATIONAL_VARIABLE, NZ.BOOL, NZ.DATE, NZ.TIME, NZ.TIMETZ, NZ.NUMERIC32, NZ.NUMERIC64, NZ.NUMERIC128, NZ.FLOAT, NZ.DOUBLE, NZ.INTERVAL, NZ.INT8, NZ.INT16, NZ.INT32, NZ.INT64, NZ.TIMESTAMP, NZ.GEOMETRY, NZ.VARBINARY. They can also be accessed with the getNpsDataTypes function.

► addOutputColumnString(tp, nm, sz)
Adds a Character column to the output.
▲ Parameters
► tp
    The output column type.
► nm
    The output column name.
► sz
    The output column size.
▲ Returns
TRUE when completed successfully or FALSE if an error occurs. If an error occurs, it is also reported via the R error handling mechanism.
Data type identifiers are set when the nzrserver package is loaded and are: NZ.FIXED, NZ.VARIABLE, NZ.NATIONAL_FIXED, NZ.NATIONAL_VARIABLE, NZ.BOOL, NZ.DATE, NZ.TIME, NZ.TIMETZ, NZ.NUMERIC32, NZ.NUMERIC64, NZ.NUMERIC128, NZ.FLOAT, NZ.DOUBLE, NZ.INTERVAL, NZ.INT8, NZ.INT16, NZ.INT32, NZ.INT64, NZ.TIMESTAMP, NZ.GEOMETRY, NZ.VARBINARY. They can also be accessed with the getNpsDataTypes function.

► getInputColumnInfo(index)
Returns the details for input columns.
▲ Parameters
► index
    The column index.
▲ Returns
An integer vector with elements: input type, isConstant (0 or 1), size, scale. Scale is for numeric columns and size is for numeric and character columns.
Returns details for the input column specified by index.
Data type identifiers are set when the nzrserver package is loaded and are: NZ.FIXED, NZ.VARIABLE, NZ.NATIONAL_FIXED, NZ.NATIONAL_VARIABLE, NZ.BOOL, NZ.DATE, NZ.TIME, NZ.TIMETZ, NZ.NUMERIC32, NZ.NUMERIC64, NZ.NUMERIC128, NZ.FLOAT, NZ.DOUBLE, NZ.INTERVAL, NZ.INT8, NZ.INT16, NZ.INT32, NZ.INT64, NZ.TIMESTAMP, NZ.GEOMETRY, NZ.VARBINARY. They can also be accessed with the getNpsDataTypes function.

► getOutputColumnInfo(index)
Returns details for the output column.
▲ Parameters
► index
    The column index.
▲ Returns
An integer vector with elements: input type, size, scale. Scale is for numeric columns and size is for numeric and character columns.
Returns details for the output column specified by index.
Data type identifiers are set when the nzrserver package is loaded and are: NZ.FIXED, NZ.VARIABLE, NZ.NATIONAL_FIXED, NZ.NATIONAL_VARIABLE, NZ.BOOL, NZ.DATE, NZ.TIME, NZ.TIMETZ, NZ.NUMERIC32, NZ.NUMERIC64, NZ.NUMERIC128, NZ.FLOAT, NZ.DOUBLE, NZ.INTERVAL, NZ.INT8, NZ.INT16, NZ.INT32, NZ.INT64, NZ.TIMESTAMP, NZ.GEOMETRY, NZ.VARBINARY. They can also be accessed with the getNpsDataTypes function.

► getOutputColumnName(index)
Returns the output column name.
▲ Parameters
► index
    The column index.
▲ Returns
The output column name.

► getUdfReturnType()
Returns the output data type.
▲ Returns
The output data type.
Can only be used in 'function' mode; cannot be used in table function mode.
Data type identifiers are set when the nzrserver package is loaded and are: NZ.FIXED, NZ.VARIABLE, NZ.NATIONAL_FIXED, NZ.NATIONAL_VARIABLE, NZ.BOOL, NZ.DATE, NZ.TIME, NZ.TIMETZ, NZ.NUMERIC32, NZ.NUMERIC64, NZ.NUMERIC128, NZ.FLOAT, NZ.DOUBLE, NZ.INTERVAL, NZ.INT8, NZ.INT16, NZ.INT32, NZ.INT64, NZ.TIMESTAMP, NZ.GEOMETRY, NZ.VARBINARY. They can also be accessed with the getNpsDataTypes function.

► oneOutputRowRestriction()
Specifies whether the output row is restricted to one output row per input row.
▲ Returns
A logical value indicating whether the output is restricted to exactly one output row per input row.

► systemCatalogIsUpper()
▲ Returns
A logical value indicating whether catalog names are in upper case.

► updateInfo()
Updates the NPS software with the output signature.
▲ Returns
TRUE when completed successfully or FALSE if an error occurs. If an error occurs, it is also reported via the R error handling mechanism.
Must be called when the output signature is sent to the NPS software. It should be the last command in a shaper or sizer.

Data Type Support
The data APIs work with these data types.

Functions
► getNpsDataTypes
    Gets the Netezza data type names and identifiers.

Enumerations
► enum NZ { FIXED, VARIABLE, NATIONAL_FIXED, NATIONAL_VARIABLE, BOOL, DATE, TIME, TIMETZ, NUMERIC32, NUMERIC64, NUMERIC128, FLOAT, DOUBLE, INTERVAL, INT8, INT16, INT32, INT64, TIMESTAMP, GEOMETRY, VARBINARY, DEBUG, TRACE, INIT, ACCUM, MERGE, FINAL, FUNC, AGG, SHP }
    Constants used in user code.

Detailed Description
The data APIs work with these data types.

Function Documentation
► getNpsDataTypes()
Gets the Netezza data type names and identifiers.
▲ Returns
An integer vector containing the numeric data type identifiers. Each value has a name corresponding to the data type. The types are: NZ.FIXED, NZ.VARIABLE, NZ.NATIONAL_FIXED, NZ.NATIONAL_VARIABLE, NZ.BOOL, NZ.DATE, NZ.TIME, NZ.TIMETZ, NZ.NUMERIC32, NZ.NUMERIC64, NZ.NUMERIC128, NZ.FLOAT, NZ.DOUBLE, NZ.INTERVAL, NZ.INT8, NZ.INT16, NZ.INT32, NZ.INT64, NZ.TIMESTAMP, NZ.GEOMETRY, NZ.VARBINARY.

Enumeration Type Documentation
► enum NZ
Constants used in user code.
    FIXED               Fixed string
    VARIABLE            Variable string
    NATIONAL_FIXED      Fixed national string
    NATIONAL_VARIABLE   Variable national string
    BOOL                Boolean
    DATE                Date
    TIME                Time
    TIMETZ              Time with time zone
    NUMERIC32           Numeric 32
    NUMERIC64           Numeric 64
    NUMERIC128          Numeric 128
    FLOAT               Float
    DOUBLE              Double
    INTERVAL            Interval
    INT8                1 byte integer
    INT16               2 byte integer
    INT32               4 byte integer
    INT64               8 byte integer
    TIMESTAMP           Time stamp
    GEOMETRY            Geometry
    VARBINARY           Varbinary
    DEBUG               Log level debug
    TRACE               Log level trace
    INIT                Aggregate initialize identifier
    ACCUM               Aggregate accumulate identifier
    MERGE               Aggregate merge state identifier
    FINAL               Aggregate final result identifier
    FUNC (API.FUN)      API function identifier
    AGG (API.AGG)       API aggregate identifier
    SHP (API.SHP)       API shaper identifier

Support APIs
This API family provides support functions for date and time conversions, and for getting runtime environment information.

Modules
► Date and Time Functions
    Date and Time helper functions used to convert to and from Netezza date and time formats.
► Runtime and Environment Information
    Runtime, Environment, and Shared Library Information.
► Utilities
    Utilities.

Detailed Description
This API family provides support functions for date and time conversions, and for getting runtime environment information.

Date and Time Functions
Date and Time helper functions used to convert to and from Netezza date and time formats.

Functions
► millisecondsToNzTime
    Converts the number of milliseconds since the start of the day to Netezza time.
► posixTimeSecondsToNzDate
    Converts Posix seconds since the Epoch to a Netezza date.
► secondsToNzTime
    Converts the number of seconds since the start of the day to a Netezza time.

Detailed Description
Date and Time helper functions used to convert to and from Netezza date and time formats.
The NDN Developer Guide defines the following data types:
► date (referred to as 'Netezza Date') is a 4-byte integer representing the number of days before (-) or after (+) 1/1/2000; min: -730,119 (1/1/0001); max: 2,921,939 (12/31/9999)
► time (referred to as 'Netezza Time') is an 8-byte integer representing the number of microseconds between midnight and one microsecond before midnight; min: 0 (00:00:00.000000); max: 86,399,999,999 (23:59:59.999999)
► time with timezone (referred to as 'Netezza TimeTZ') consists of a 'Netezza Time' field and a timezone field, which is a 4-byte integer representing the offset in seconds, sign reversed (for example, the offset of "+1 hour" is stored as -3600); the offset must be a whole number of minutes, that is, offset mod 60 = 0; offset min: -46800 (+13:00:00); offset max: 46740 (-12:59:00)
► timestamp (referred to as 'Netezza Timestamp') is an 8-byte integer representing the number of microseconds before (-) or after (+) 00:00:00.0, 1/1/2000; min: -63,082,281,600,000,000 (00:00:00, 1/1/0001); max: 252,455,615,999,999,999 (23:59:59.999999, 12/31/9999)
► interval (referred to as 'Netezza Interval') consists of a 4-byte integer (the number of months, signed) and an 8-byte integer (the number of microseconds, signed); a combination of a negative (-) months value and a positive (+) microseconds value, or a positive (+) months value and a negative (-) microseconds value, is possible and supported by the Netezza appliance; months min: -3,000,000 (-250000 years); months max: 3,000,000 (250000 years); microseconds min: NONE (min signed int64); microseconds max: NONE (max signed int64).
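The TimeTZ offset rule and the Time range above can be checked with a little arithmetic. The following is a minimal sketch in base R, not adapter code: encode_timetz_offset and time_to_usec are hypothetical helper names introduced only for this illustration, and the range limits are the ones quoted in the list above.

```r
# Hypothetical helper (not part of nzrserver): turn a UTC offset given as a
# sign, hours and minutes into the sign-reversed number of seconds that the
# text describes for the TimeTZ timezone field.
encode_timetz_offset <- function(sign, hours, minutes = 0) {
  stopifnot(sign %in% c(1, -1), hours >= 0, minutes >= 0, minutes < 60)
  offset_sec <- sign * (hours * 3600 + minutes * 60)
  stored <- -offset_sec                  # the sign is reversed in storage
  # Stored range quoted in the text: -46800 to 46740
  if (stored < -46800 || stored > 46740) {
    stop("offset outside the documented range")
  }
  as.integer(stored)
}

# Hypothetical helper: microseconds since midnight, as in 'Netezza Time'.
time_to_usec <- function(h, m, s, usec = 0) {
  ((h * 60 + m) * 60 + s) * 1e6 + usec
}

plus_one_hour <- encode_timetz_offset(+1, 1)       # an offset of +1 hour
west_limit    <- encode_timetz_offset(-1, 12, 59)  # an offset of -12:59
last_usec     <- time_to_usec(23, 59, 59, 999999)  # 23:59:59.999999
```

Because whole minutes are passed in, the "whole number of minutes" constraint holds by construction; a helper that accepted raw seconds would need an explicit check that the value is a multiple of 60.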
Please note that:
(a) The microsecond value can be as large as the int64 data type allows and overflows into negatives without error.
(b) A month is always considered to contain 30 days.
(c) The months and microseconds values are stored separately and there is no information exchange between them.

To support the above data types as closely as possible, 8-byte integers are represented as numeric values in the R Adapter. Note that this representation introduces approximation, and not all values can be sent to the NPS software precisely.

The setOutputDate function accepts a number of input formats:
► POSIXlt and POSIXct are cast as integers and are translated to 'Netezza Date' using posixTimeSecondsToNzDate.
► A Date object is cast as an integer and translated to be relative to 1/1/2000, that is, 10957 days are subtracted.
The setOutputTime function accepts a numeric value representing a time value as defined in 'Netezza Time'.
The setOutputTimeFromR function accepts either seconds or milliseconds, both expressed as 4-byte integer values; the time value is then translated using secondsToNzTime or millisecondsToNzTime, respectively.
The setOutputTimeStamp function accepts a timestamp which, being an 8-byte integer, cannot be represented precisely in R; thus a numeric value is expected. Note that rounding occurs when the value is too large to be represented exactly by the IEEE double data type.
The setOutputTimeTz function accepts time and offset values. See 'Netezza TimeTZ' for details.
The setOutputInterval function accepts a number of microseconds usec and a number of months mon. See 'Netezza Interval' for details.

Function Documentation
► millisecondsToNzTime(msec)
Converts the number of milliseconds since the start of the day to Netezza time.
▲ Parameters
► msec
    The number of milliseconds relative to 00:00:00, less than or equal to 86,399,999 (23:59:59.999).
▲ Returns
A Netezza time.
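The conversions described in this section reduce to simple arithmetic on the stated epochs and units. The sketch below reproduces that arithmetic in base R so it can be tried outside the adapter. The function names are hypothetical stand-ins, not the nzrserver implementations, and it is an assumption (based on the 'Netezza Time' definition) that a Netezza time is a microsecond count.

```r
# Days between the POSIX epoch (1970-01-01) and the Netezza epoch (2000-01-01);
# this is the 10957 mentioned for setOutputDate.
NZ_EPOCH_DAYS <- 10957

# Hypothetical stand-ins that follow the documented definitions only.
date_to_nz_date <- function(d) as.integer(as.Date(d)) - NZ_EPOCH_DAYS
posix_sec_to_nz_date <- function(sec) floor(sec / 86400) - NZ_EPOCH_DAYS

sec_to_nz_time <- function(sec) {
  stopifnot(sec >= 0, sec <= 86399)        # limit quoted for secondsToNzTime
  sec * 1e6                                # seconds -> microseconds
}

msec_to_nz_time <- function(msec) {
  stopifnot(msec >= 0, msec <= 86399999)   # limit quoted for millisecondsToNzTime
  msec * 1e3                               # milliseconds -> microseconds
}

epoch_day  <- date_to_nz_date("2000-01-01")    # the Netezza epoch itself
same_day   <- posix_sec_to_nz_date(946684800)  # 2000-01-01 00:00:00 UTC
end_of_day <- msec_to_nz_time(86399999)        # 23:59:59.999

# R numerics are IEEE doubles, so integers above 2^53 are no longer all
# distinct; microsecond timestamps beyond that magnitude (roughly 285 years
# either side of 1/1/2000) can be rounded.
collides <- (2^53 + 1) == 2^53
```

The last line illustrates why the text warns that 8-byte values cannot always be sent precisely: two different integer microsecond counts can map to the same double.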
► posixTimeSecondsToNzDate(posixSec)
Converts Posix seconds since the Epoch to a Netezza date.
▲ Parameters
► posixSec
    POSIX time, that is, the number of seconds relative to midnight Coordinated Universal Time (UTC) of January 1, 1970.
▲ Returns
A Netezza date.

► secondsToNzTime(sec)
Converts the number of seconds since the start of the day to a Netezza time.
▲ Parameters
► sec
    The number of seconds relative to 00:00:00, less than or equal to 86399 (23:59:59).
▲ Returns
A Netezza time.

Runtime and Environment Information
Runtime, Environment, and Shared Library Information.

Functions
► getEnv
    Returns the value of the specified environment variable.
► getFirstEnvironmentEntry
    Returns the first environment entry.
► getLibraryFullPath
    Returns the full path for the specified library.
► getLibraryInfo
    Returns shared libraries information for this request.
► getLibraryProcessInfo
    Returns shared libraries information for the process.
► getNextEnvironmentEntry
    Returns the next environment entry.
► getRuntime
    Gets runtime information about the R AE.
► getSystemLog
    Gets the path of the system log file.
► logMessage
    Sends a log message to the NPS software.
► userError
    Sends an error message to the Netezza appliance.

Detailed Description
Runtime, Environment, and Shared Library Information.

Function Documentation
► getEnv(name)
Returns the value of the specified environment variable.
▲ Parameters
► name
    The environment variable name.
▲ Returns
A character string containing the value of the specified environment variable or NULL if not found.

► getFirstEnvironmentEntry()
Returns the first environment entry.
▲ Returns
A two-element character vector or NULL on completion. The vector, if returned, contains the name of the variable as well as its value.
Returns the first environment entry. This function call should be followed by repeated calls to getNextEnvironmentEntry.
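The first/next protocol above is a plain cursor loop: fetch the first entry, then keep fetching until NULL. The sketch below shows one way to collect the entries into a named character vector. Since the adapter functions are only available inside an AE, make_env_cursor is a hypothetical stand-in backed by a fixed list, and collect_env is an illustrative helper rather than part of the API.

```r
# Hypothetical stand-in for getFirstEnvironmentEntry()/getNextEnvironmentEntry():
# a cursor over a fixed list of c(name, value) pairs that returns NULL when done.
make_env_cursor <- function(entries) {
  pos <- 0L
  step <- function() {
    pos <<- pos + 1L
    if (pos > length(entries)) NULL else entries[[pos]]
  }
  list(
    first = function() { pos <<- 0L; step() },  # restart, return first entry
    nxt   = step                                # return next entry or NULL
  )
}

# Illustrative helper: walk the cursor and keep the last value seen per name.
collect_env <- function(first, nxt) {
  env <- character(0)
  entry <- first()
  while (!is.null(entry)) {
    env[entry[1]] <- entry[2]   # a repeated name overwrites the earlier value
    entry <- nxt()
  }
  env
}

cursor <- make_env_cursor(list(c("A", "1"), c("B", "2"), c("A", "3")))
env <- collect_env(cursor$first, cursor$nxt)
```

Inside an AE, and assuming the two functions behave as documented, the same loop could be driven with collect_env(getFirstEnvironmentEntry, getNextEnvironmentEntry). Overwriting on repeated names matches the documented rule that the current version of a repeated key is the last one returned.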
► getLibraryFullPath(name, case)
Returns the full path for the specified library.
▲ Parameters
► name
    The name of the library.
► case
    Specifies if matching should be case-sensitive (TRUE) or case-insensitive (FALSE).
▲ Returns
A character string containing the specified library path.
This R Adapter API function allows the Netezza system to be queried about shared libraries required by the UDX responsible for running the current Analytic Executable.

► getLibraryInfo()
Returns shared libraries information for this request.
▲ Returns
A three-column (name, path, autoLoad) data.frame.
This R Adapter API function allows the Netezza system to be queried about shared libraries required by the UDX responsible for running the current Analytic Executable.

► getLibraryProcessInfo()
Returns shared libraries information for the process.
▲ Returns
A three-column (name, path, autoLoad) data.frame.
Returns shared libraries information for the process. Returns NULL if the AE is not Remote.
This R Adapter API function allows the Netezza system to be queried about shared libraries required by the UDX responsible for running the current Analytic Executable.

► getNextEnvironmentEntry()
Returns the next environment entry.
▲ Returns
A two-element character vector or NULL on completion. The vector, if returned, contains the name of the variable as well as its value.
Returns the next environment entry. The first call to getNextEnvironmentEntry must follow a call to getFirstEnvironmentEntry. Returns NULL on completion. Key names may repeat; when they do, the last entry returned for a key name is its current version.

► getRuntime()
Gets runtime information about the R AE.
▲ Returns
A list with the following elements: data.slice.id, transaction.id, hardware.id, number.data.slices, number.spus, suggested.memory.limit, locus, adapter.type, user.query, session.id.
During AE execution, various runtime details regarding the database, execution locus, and so on, are available.
These can be accessed using getRuntime.

► getSystemLog()
Gets the path of the system log file.
▲ Returns
A character string containing the path to the system log file.

► logMessage(level, msg)
Sends a log message to the NPS software.
▲ Parameters
► level
    The message importance level; can be set to NZ.DEBUG or NZ.TRACE.
► msg
    The log message itself.
The logMessage function sends a message to the NPS software, which is then printed to the console where the NPS software was started or stored in one of the log files.

► userError(..., exit=TRUE)
Sends an error message to the Netezza appliance.
▲ Parameters
► ...
    Arguments to be passed to paste(...,col='') to produce the final error message.
► exit
    Exit the process after sending the error message.
The AE process must then clear its runtime data and exit. The exit parameter allows R to exit normally; otherwise, userError calls the exit library function directly. Ideally, the standard R stop function should be called, which terminates R execution and throws an exception that is eventually intercepted and passed to userError in a safe way.

Utilities
Utilities.

Functions
► decodebase64
    Decodes a base64-encoded character string.
► fastdataframe
    Implements the protocol for sending large data frames from a client R instance to the Netezza software.
► fetchGroup
    Gets the specified number of input rows.
► fetchRows
    Gets the specified number of input rows.
► getFilePath
    Returns the location of a user-provided code file.
► getGroupValue
    Gets the group value.
► ping
    Notifies the NPS software that the AE is still active.
► placefile
    Saves user-provided code in the workspace directory.
► prepareUserCode
    Saves user-provided code in the workspace directory.
► workspacePath
    Returns the location of the workspace folder.

Detailed Description
Utilities.

Function Documentation
► decodebase64()
Decodes a base64-encoded character string.
▲ Returns
A decoded character string.
If the R AE is called via the Netezza R Library (nzr), the user-provided function is sent as a serialized, base64-encoded character string. This function is used to decode that string.

► fastdataframe()
Implements the protocol for sending large data frames from a client R instance to the Netezza software.
This function is the first function called after loading the nzrserver package when the nzr..fastdataframe UDTF is run from the Netezza R Library (nzr). It implements a protocol for sending large data.frames from a client R instance to the NPS software.

► fetchGroup(cfrom=NULL, cto=NULL)
Gets the specified number of input rows.
▲ Parameters
► cfrom
    The index of the first input column to be fetched; by default 0.
► cto
    The index of the last input column to be fetched; by default inputColumnCount - 1.
▲ Returns
data.frame.
Because of its optimized input implementation, this function allows a number of input rows to be fetched with one callback invocation, increasing the performance by 10-50 times.
The fetchGroup utility assumes that the last three columns are the group ID, the row number, and the total row count in the current group; it reads the number of rows from the last column. It also saves the group ID value, which can be later accessed by getGroupValue.
This utility can only be used with Table functions.

► fetchRows(n=1, cfrom=NULL, cto=NULL, out=NULL)
Gets the specified number of input rows.
▲ Parameters
► n
    The number of input rows to be fetched.
► cfrom
    The index of the first input column to be fetched; by default 0.
► cto
    The index of the last input column to be fetched; by default inputColumnCount - 1.
► out
    Optional parameter. If the name of an allocated data frame is passed, the R Adapter overwrites the current data frame and does not allocate new memory.
▲ Returns
data.frame.
Because of its optimized input implementation, this function allows a number of input rows to be fetched with one callback invocation, increasing the performance by 10-50 times.
The fetchRows utility reads in at most the specified number of input rows, presuming they are available, and returns them as a data frame, with columns named Xn.
If the out parameter is specified, data is passed outside the function through the out object, which is uncommon in R itself. The number of rows in the object specified by out must be equal to n; if they differ, an error is reported and Adapter execution ceases. This parameter should be used to tune the data input performance, if necessary. The fetchRows utility can be used without passing a value for out with virtually no performance impact.
This utility can only be used with Table SQL functions.

► getFilePath()
Returns the location of a user-provided code file.
▲ Returns
A character string pointing to the code file.
If the R AE is called via the Netezza R Library (nzr), the user-provided function is stored in the workspace folder. This function returns the absolute file path.
When searching for the user code, the following locations are checked:
► The location specified by the WORKSPACE_PATH environment variable, which should contain the file name to be concatenated to the result of workspace() invocation.
► If the WORKSPACE_PATH variable cannot be found, the location specified by the ABSOLUTE_PATH environment variable, which should contain the full path to an existing file, is checked.

► getGroupValue()
Gets the group value.
▲ Returns
Atomic vector containing one element, whose type depends on the group input column. This element is the group ID value.
Gets the saved group ID value from the last call to fetchGroup.
Because of its optimized input implementation, this function allows a number of input rows to be fetched with one callback invocation, increasing the performance by 10-50 times.
This function can only be used with Table SQL functions.

► ping()
Notifies the NPS software that the AE is still active.
Ping can be used to indicate that the AE is still active and not hanging, which may be useful for time-intensive computations.

► placefile()
Saves user-provided code in the workspace directory.
This function is the first function called after loading the nzrserver package when the nzr..placefile UDTF is run from the Netezza R Library (nzr). It saves the user-provided code in the workspace directory.
The placefile function should be registered as a separate UDTF and called only via the Netezza R Library (nzr) functions.
The placefile function expects to find at least two columns. The first column is the mode column, which has three possible values: placefile, createfile, or appendfile.
If the value in the mode column is placefile, the second column is the base64-encoded character string containing the data to be written to a randomly-named file. The file is created in the workspace directory and the file name is returned as the UDTF result.
If the value in the mode column is either createfile or appendfile, the second column must contain the file name, and a third column must contain the data string. The file is then created in the workspace directory with the provided name. If the value of mode was createfile, the file is overwritten; if the value was appendfile, the data is appended to the file. The file name is returned as the UDTF result.

► prepareUserCode()
Saves user-provided code in the workspace directory.
The dispatcher function assumes that there is a file or an environment variable that contains all required user data. The data is assumed to be serialized if in a file, or serialized and base64-encoded if stored in an environment variable.
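To make "serialized" concrete, the sketch below round-trips a small payload through R's standard serialize() and unserialize() functions, using the character-to-raw step that this reference mentions for prepareUserCode. It is an illustration under assumptions, not the dispatcher's code: the payload contents are invented, ASCII serialization is assumed so that the data survives as a character string, and the base64 step is left out because base R has no base64 decoder.

```r
# Invented payload using element names listed in this reference (mode, fun, args).
payload <- list(
  mode = "apply",
  fun  = function(x) x * 2,
  args = list()
)

# Sender side (assumption): ASCII serialization, so the bytes form valid text.
as_text <- rawToChar(serialize(payload, connection = NULL, ascii = TRUE))

# Receiver side: character string -> raw vector -> R object.
restored <- unserialize(charToRaw(as_text))

mode_seen <- restored$mode     # would select the high-level wrapper
result    <- restored$fun(21)  # the user function survives the round trip
```

A binary serialization would not pass through rawToChar() safely because it can contain embedded zero bytes, which is one reason an encoding such as base64 is useful when the data has to travel in an environment variable.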
When searching for the user code, the following locations are checked:
► The location specified by the WORKSPACE_PATH environment variable, which should contain the file name to be concatenated to the result of workspace() invocation.
► If the WORKSPACE_PATH variable cannot be found, the location specified by the ABSOLUTE_PATH environment variable, which should contain the full path to an existing file, is checked.
► If the ABSOLUTE_PATH variable cannot be found, the CODE_SERIALIZED environment variable, which should contain a base64-encoded string, is checked.
The contents of the file or the base64-decoded string are then passed to charToRaw(), and its output to unserialize(), which are both standard R functions. The result of the deserialization should be a list containing a subset of the following elements:
    mode - a character string with the 'mode' identifier
    fun - a function object
    args - a list with additional function arguments
    cols - a vector with input column names
    file - the name of the package file to be installed
    shaper - a shaper function
    shaper.args - additional arguments for the shaper function
The mode element is used to determine which high-level wrapper should be invoked. It can take one of the following values: apply, tapply, run, or install. In the case of aggregation mode (UDA) and 'Shaper & Sizers', no mode value is required. For the exact implementation, see the dispatcher, handleConnection, and runWrapper functions.
IMPORTANT: When TABLE(ANY) is given as the table function result, such as when Shapers & Sizers are being used, the dynamic environment implementation requires that the last argument be cast as non-national VARCHAR.

► workspacePath()
Returns the location of the workspace folder.
▲ Returns
A character string containing the path to the workspace folder.
If the R AE is called via the Netezza R Library (nzr), the user-provided function is stored in the workspace folder.
This function returns the absolute folder path.

Working Modes
Working mode wrappers.

Functions
► apply
    Handles the client-side invocation for nzApply.
► groupedApply
    Handles the client-side invocation for nzGroupedApply.
► install
    Handles the client-side invocation for nzInstallPackages.
► rawtapply
    Handles the client-side invocation for special grouped apply.
► run
    Handles the client-side invocation for nzRun.
► tapply
    Handles the client-side invocation for nzTApply.

Detailed Description
Working mode wrappers.
An R Analytic Executable can operate on a number of levels of abstraction, utilizing the most basic API or taking advantage of some predefined wrappers. The wrappers are described here.

Function Documentation
► nzrsrv::apply(fun, args)
Handles the client-side invocation for nzApply.
▲ Parameters
► fun
    The function object to be invoked for the data stream. The fun object must be a function that accepts data row x and any additional user-provided arguments args.
► args
    Additional arguments for fun.
Called when the dispatcher function is called, which is the default way of working with R AEs.

► nzrsrv::groupedApply()
Handles the client-side invocation for nzGroupedApply.
Called when the dispatcher function is called, which is the default way of working with R AEs.
The groupedApply wrapper is responsible for handling aggregates and reads its data from a file identified by the current session identifier. See implementation and client-side documentation for details.

► nzrsrv::install(file)
Handles the client-side invocation for nzInstallPackages.
▲ Parameters
► file
    A package file name. The package is assumed to be present in the workspace directory.
The file must match the name of an existing R package located in the workspace directory.
Called when the dispatcher function is called, which is the default way of working with R AEs.

► nzrsrv::rawtapply()
Handles the client-side invocation for special grouped apply.
▲ Parameters
► fun
    The function object to be invoked for the data stream.
► args
    Additional arguments for fun.
This function implements a special grouped apply to be used directly as a SQL query. This function is called by dispatcher when the nz.mode variable in the compiled file is set to 'groupedapply'.
Called when the dispatcher function is called, which is the default way of working with R AEs.

► nzrsrv::run(fun, args)
Handles the client-side invocation for nzRun.
▲ Parameters
► fun
    The function object to be invoked for the data stream. The fun object must be a function that accepts user-provided arguments args.
► args
    Additional arguments for fun.
Called when the dispatcher function is called, which is the default way of working with R AEs.

► nzrsrv::tapply(fun, args, cols)
Handles the client-side invocation for nzTApply.
▲ Parameters
► fun
    The function object to be invoked for the data stream. The fun object must be a function that accepts data frame x and additional user-provided arguments args.
► args
    Additional arguments for fun.
► cols
    The input column names.
Called when the dispatcher function is called, which is the default way of working with R AEs.
The tapply wrapper requires a character vector that includes the names of the input columns, which is used to assure that fun can access input data using the original names.

Notices and Trademarks

Notices
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used.
Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service. IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing IBM Corporation North Castle Drive Armonk, NY 10504-1785 U.S.A. For license inquiries regarding double-byte character set (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to: Intellectual Property Licensing Legal and Intellectual Property Law IBM Japan Ltd. 1623-14, Shimotsuruma, Yamato-shi Kanagawa 242-8502 Japan The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you. This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice. Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. 
The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk. IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you. Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information which has been exchanged, should contact: IBM Corporation 26 Forest Street Marlborough, MA 01752 U.S.A. Such information may be available, subject to appropriate terms and conditions, including in some cases, payment of a fee. The licensed program described in this document and all licensed material available for it are provided by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement or any equivalent agreement between us. Any performance data contained herein was determined in a controlled environment. Therefore, the results obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment. Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
All statements regarding IBM's future direction or intent are subject to change or withdrawal without notice, and represent goals and objectives only. This information is for planning purposes only. The information herein is subject to change before the products described become available. This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental. COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. The sample programs are provided "AS IS", without warranty of any kind. IBM shall not be liable for any damages arising out of your use of the sample programs. Each copy or any portion of these sample programs or any derivative work, must include a copyright notice as follows: © (your company name) (year). Portions of this code are derived from IBM Corp. Sample Programs. © Copyright IBM Corp. (enter the year or years). All rights reserved. Trademarks IBM, the IBM logo, ibm.com and Netezza are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. 
If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at ibm.com/legal/copytrade.shtml. The following terms are trademarks or registered trademarks of other companies: Adobe is a registered trademark of Adobe Systems Incorporated in the United States, and/or other countries. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. NEC is a registered trademark of NEC Corporation. UNIX is a registered trademark of The Open Group in the United States and other countries. Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. Red Hat is a trademark or registered trademark of Red Hat, Inc. in the United States and/or other countries. D-CC, D-C++, Diab+, FastJ, pSOS+, SingleStep, Tornado, VxWorks, Wind River, and the Wind River logo are trademarks, registered trademarks, or service marks of Wind River Systems, Inc. Tornado patent pending. APC and the APC logo are trademarks or registered trademarks of American Power Conversion Corporation. Other company, product or service names may be trademarks or service marks of others. Regulatory and Compliance Regulatory Notices Install the NPS system in a restricted-access location. Ensure that only those trained to operate or service the equipment have physical access to it. Install each AC power outlet near the NPS rack that plugs into it, and keep it freely accessible.
Provide approved 30A circuit breakers on all power sources. Product may be powered by redundant power sources. Disconnect ALL power sources before servicing. High leakage current. Earth connection essential before connecting supply. Courant de fuite élevé. Raccordement à la terre indispensable avant le raccordement au réseau. Homologation Statement This product may not be certified in your country for connection by any means whatsoever to interfaces of public telecommunications networks. Further certification may be required by law prior to making any such connection. Contact an IBM representative or reseller for any questions. FCC - Industry Canada Statement This equipment has been tested and found to comply with the limits for a Class A digital device, pursuant to part 15 of the FCC rules. These limits are designed to provide reasonable protection against harmful interference when the equipment is operated in a commercial environment. This equipment generates, uses, and can radiate radio-frequency energy and, if not installed and used in accordance with the instruction manual, may cause harmful interference to radio communications. Operation of this equipment in a residential area is likely to cause harmful interference, in which case users will be required to correct the interference at their own expense. This Class A digital apparatus meets all requirements of the Canadian Interference-Causing Equipment Regulations. Cet appareil numérique de la classe A respecte toutes les exigences du Règlement sur le matériel brouilleur du Canada. CE Statement (Europe) This product complies with the European Low Voltage Directive 73/23/EEC and EMC Directive 89/336/EEC as amended by European Directive 93/68/EEC. Warning: This is a class A product. In a domestic environment this product may cause radio interference in which case the user may be required to take adequate measures. 
VCCI Statement

Index

A
addOutputColumn
  Shaper and Sizer API, 23
addOutputColumnString
  Shaper and Sizer API, 23
Aggregate API, 21
  getOutputColumn, 21
  getState, 22
  outputColumnCount, 22
apply
  Working Modes, 37

C
closeLocalConnection
  Local Initialization, 8
closeRemoteConnection
  Remote Initialization, 10
createConnectionPoint
  Remote Connection Point, 9
createLocalConnection
  Local Initialization, 8
createRemoteConnection
  Remote Initialization, 10

D
Data Connection APIs, 11
  getApiType, 12
Data Type Support, 25
  getNpsDataTypes, 26
  NZ, 26
Date and Time Functions, 27
  millisecondsToNzTime, 29
  posixTimeSecondsToNzDate, 29
  secondsToNzTime, 29
decodebase64
  Utilities, 33
dispatcher
  High Level initialization, 10

F
fastdataframe
  Utilities, 33
fetchGroup
  Utilities, 34
fetchRows
  Utilities, 34
Function, 12
  getInputColumn, 13
  getNext, 13
  inputColumnCount, 14
  outputResult, 14
  setOutput, 14
  setOutputBool, 15
  setOutputDate, 15
  setOutputDouble, 15
  setOutputFloat, 16
  setOutputInt16, 16
  setOutputInt32, 17
  setOutputInt64, 17
  setOutputInt8, 17
  setOutputInterval, 18
  setOutputNull, 18
  setOutputString, 19
  setOutputTime, 19
  setOutputTimeFromR, 19
  setOutputTimeStamp, 20
  setOutputTimeTz, 20

G
getApiType
  Data Connection APIs, 12
getEnv
  Runtime and Environment Information, 30
getFilePath
  Utilities, 35
getFirstEnvironmentEntry
  Runtime and Environment Information, 30
getGroupValue
  Utilities, 35
getInputColumn
  Function, 13
getInputColumnInfo
  Shaper and Sizer API, 24
getLibraryFullPath
  Runtime and Environment Information, 31
getLibraryInfo
  Runtime and Environment Information, 31
getLibraryProcessInfo
  Runtime and Environment Information, 31
getNext
  Function, 13
getNextEnvironmentEntry
  Runtime and Environment Information, 31
getNpsDataTypes
  Data Type Support, 26
getOutputColumn
  Aggregate API, 21
getOutputColumnInfo
  Shaper and Sizer API, 24
getOutputColumnName
  Shaper and Sizer API, 24
getRuntime
  Runtime and Environment Information, 32
getState
  Aggregate API, 22
getSystemLog
  Runtime and Environment Information, 32
getUdfReturnType
  Shaper and Sizer API, 24
groupedApply
  Working Modes, 37

H
handleConnection
  High Level initialization, 11
High Level initialization, 10
  dispatcher, 10
  handleConnection, 11
  runWrapper, 11

I
Initialization APIs, 7
inputColumnCount
  Function, 14
install
  Working Modes, 38
isLocal
  Local Initialization, 8

L
Local Initialization, 8
  closeLocalConnection, 8
  createLocalConnection, 8
  isLocal, 8
logMessage
  Runtime and Environment Information, 32

M
millisecondsToNzTime
  Date and Time Functions, 29

N
NZ
  Data Type Support, 26

O
oneOutputRowRestriction
  Shaper and Sizer API, 25
outputColumnCount
  Aggregate API, 22
outputResult
  Function, 14

P
ping
  Utilities, 35
placefile
  Utilities, 35
posixTimeSecondsToNzDate
  Date and Time Functions, 29
prepareUserCode
  Utilities, 36

R
rawtapply
  Working Modes, 38
Remote Connection Point, 9
  createConnectionPoint, 9
Remote Initialization, 9
  closeRemoteConnection, 10
  createRemoteConnection, 10
run
  Working Modes, 38
Runtime and Environment Information, 29
  getEnv, 30
  getFirstEnvironmentEntry, 30
  getLibraryFullPath, 31
  getLibraryInfo, 31
  getLibraryProcessInfo, 31
  getNextEnvironmentEntry, 31
  getRuntime, 32
  getSystemLog, 32
  logMessage, 32
  userError, 32
runWrapper
  High Level initialization, 11

S
secondsToNzTime
  Date and Time Functions, 29
setOutput
  Function, 14
setOutputBool
  Function, 15
setOutputDate
  Function, 15
setOutputDouble
  Function, 15
setOutputFloat
  Function, 16
setOutputInt16
  Function, 16
setOutputInt32
  Function, 17
setOutputInt64
  Function, 17
setOutputInt8
  Function, 17
setOutputInterval
  Function, 18
setOutputNull
  Function, 18
setOutputString
  Function, 19
setOutputTime
  Function, 19
setOutputTimeFromR
  Function, 19
setOutputTimeStamp
  Function, 20
setOutputTimeTz
  Function, 20
Shaper and Sizer API, 22
  addOutputColumn, 23
  addOutputColumnString, 23
  getInputColumnInfo, 24
  getOutputColumnInfo, 24
  getOutputColumnName, 24
  getUdfReturnType, 24
  oneOutputRowRestriction, 25
  systemCatalogIsUpper, 25
  updateInfo, 25
Support APIs, 27
systemCatalogIsUpper
  Shaper and Sizer API, 25

T
tapply
  Working Modes, 39

U
updateInfo
  Shaper and Sizer API, 25
userError
  Runtime and Environment Information, 32
Utilities, 32
  decodebase64, 33
  fastdataframe, 33
  fetchGroup, 34
  fetchRows, 34
  getFilePath, 35
  getGroupValue, 35
  ping, 35
  placefile, 35
  prepareUserCode, 36
  workspacePath, 36

W
Working Modes, 37
  apply, 37
  groupedApply, 37
  install, 38
  rawtapply, 38
  run, 38
  tapply, 39
workspacePath
  Utilities, 36