IBM® Netezza® Analytics
Release 3.0.2.0
R AE API Reference
Note: Before using this information and the product that it
supports, read the information in "Notices and
Trademarks" on page 40.
© Copyright IBM Corporation 2011, 2014.
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Part Number D20713-04 Rev. 1
Contents
Preface
  Audience for This Guide
  Purpose of This Guide
  Conventions
  If You Need Help
  Comments on the Documentation
1 Module Documentation
  Initialization APIs
    Modules
    Detailed Description
    Local Initialization
    Remote Connection Point
    Remote Initialization
    High Level initialization
  Data Connection APIs
    Functions
    Modules
    Detailed Description
    Function Documentation
    Function
    Aggregate API
    Shaper and Sizer API
  Data Type Support
    Functions
    Enumerations
    Detailed Description
    Function Documentation
    Enumeration Type Documentation
  Support APIs
    Modules
    Detailed Description
    Date and Time Functions
    Runtime and Environment Information
    Utilities
  Working Modes
    Functions
    Detailed Description
    Function Documentation
Notices and Trademarks
  Notices
  Trademarks
  Regulatory and Compliance
    Regulatory Notices
    Homologation Statement
    FCC - Industry Canada Statement
    CE Statement (Europe)
    VCCI Statement
Index
Preface
This guide provides an API reference for R AE programmers.
Audience for This Guide
The R AE API Reference is written for programmers who intend to create Analytic Executables for
IBM Netezza Analytics using the R language. This guide does not provide a tutorial on AE concepts.
More information about AEs can be found in the User-Defined Analytic Process Developer's Guide.
Purpose of This Guide
This guide describes the R AE API, which is a language adapter provided as part of IBM Netezza Analytics. The R AE API provides programmatic access to the AE interface for R programmers. This interface package is named nzrserver, and provides the server side functionality of the connectivity
tools for running R on the NPS. Downloading, installing, and working with Open Source R and all
other required packages is subject to the terms and conditions that are mentioned in the appropriate license files of those packages.
Conventions
The following conventions apply:
► In the technical literature, both the guides and reference guides, the term "Analytic Executable" or "AE" is used. In marketing materials, the term "User-Defined Analytic Process" or "UDAP" is used. The terms User-Defined Analytic Process and UDAP are synonymous with the terms Analytic Executable and AE.
► Italics for emphasis on terms and user-defined values such as user input.
► Upper case for SQL commands, for example, INSERT or DELETE.
► Bold for command line input, for example, nzsystem stop.
► Bold to denote parameter names, argument names, or other named references.
► Angle brackets ( < > ) to indicate a placeholder (variable) that should be replaced with actual text, for example, nzmat <- nz.matrix("<matrix_name>")
► In code samples, a single backslash ("\") at the end of a line denotes a line continuation and should be omitted when using the code at the command line, a SQL command or in a file.
► When referencing a sequence of menu and submenu selections, the ">" character denotes the different menu options in the form: "Menu Name > Submenu Name > Selection". Note that not all commands use submenus, while some selections may utilize a number of nested submenus.
If You Need Help
If you are having trouble using the IBM Netezza appliance, IBM Netezza Analytics or any of its components:
1. Retry the action, carefully following the instructions in the documentation.
2. Go to the IBM Support Portal at http://www.ibm.com/support. Log in using your IBM ID
and password. You can search the Support Portal for solutions. To submit a support request, click the 'Service Requests & PMRs' tab.
3. If you have an active service contract maintenance agreement with IBM, you can contact customer support teams via telephone. For individual countries, please visit the
Technical Support section of the IBM Directory of worldwide contacts
http://www14.software.ibm.com/webapp/set2/sas/f/handbook/contacts.html#phone.
Comments on the Documentation
We welcome any questions, comments, or suggestions that you have for the IBM Netezza documentation. Please send us an e-mail message at [email protected] and include
the following information:
► The name and version of the manual that you are using
► Any comments that you have about the manual
► Your name, address, and phone number
We appreciate your comments.
CHAPTER 1
Module Documentation
Initialization APIs
This API family is used to get an open data connection.
Modules
► Local Initialization
Initialization functions related to Local AEs. Local AEs are initialized using the function createLocalConnection. If an AE is local, the function isLocal returns TRUE. If isLocal returns FALSE, the AE is remote.
► Remote Connection Point
A Remote Connection Point is used to address a Remote AE from the NPS software.
► Remote Initialization
Initialization functions related to Remote AEs. 1) Create a connection point. 2) Listen using that connection point. 3) Accept a Data Connection API handle.
► High Level initialization
Used to run both local and remote initialization.
Detailed Description
This API family is used to get an open data connection.
All data structures created and used by these functions are completely internal and cannot be accessed from R.
The dispatcher function is the top-level abstraction for the entire R AE. By default, it is called just after loading the nzrserver package. It calls handleConnection directly and handles possible errors.
handleConnection implements the local and remote connection creation and the remote connection loop. It then calls runWrapper.
runWrapper determines, based on the API type, what user-provided data is required, and then calls the user function.
Local Initialization
Initialization functions related to Local AEs. Local AEs are initialized using the function createLocalConnection. If an AE is local, the function isLocal returns TRUE. If isLocal returns FALSE, the AE
is remote.
Functions
► closeLocalConnection
Closes the local connection.
► createLocalConnection
Creates a local connection.
► isLocal
Returns a TRUE value if the AE is local.
Detailed Description
Initialization functions related to Local AEs. Local AEs are initialized using the function createLocalConnection. If an AE is local, the function isLocal returns TRUE. If isLocal returns FALSE, the AE
is remote.
Function Documentation
► closeLocalConnection()
Closes the local connection.
▲ Returns
A logical value indicating whether the invocation finished correctly. If any error occurs, the Rf_error function is called, which results in premature exit and an appropriate message returned to the user.
► createLocalConnection()
Creates a local connection.
▲ Returns
A logical value indicating whether the invocation finished correctly. If any error occurs, the Rf_error function is called, which results in premature exit and an appropriate message returned to the user.
► isLocal()
Returns a TRUE value if the AE is local.
▲ Returns
A logical value. TRUE indicates local mode.
An AE can be started as Local or Remote. This function can be used to determine the mode at runtime.
The life cycle of a local process is controlled by the NPS software.
Remote Connection Point
A Remote Connection Point is used to address a Remote AE from the NPS software.
Functions
► createConnectionPoint
Creates a connection point.
Detailed Description
A Remote Connection Point is used to address a Remote AE from the NPS software.
Function Documentation
► createConnectionPoint(name=NULL, dataslice=TRUE, transaction=TRUE, session=TRUE)
Creates a connection point.
▲ Parameters
► name
The connection point name. If NULL, the value of the NZAE_REMOTE_NAME environment variable is used. If the parameter value is NULL and the environment variable is not set, an error is reported.
► dataslice
Specifies whether the dataslice ID should be used to define the connection point name.
► transaction
Specifies whether the transaction ID should be used to define the connection point name.
► session
Specifies whether the session ID should be used to define the connection point name.
▲ Returns
A logical value indicating whether the invocation finished correctly. If any error occurs, the Rf_error function is called, which results in premature exit and an appropriate message returned to the
user.
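The fallback rule for the name parameter can be tried in plain R. The helper below is hypothetical and is not part of nzrserver. It only mirrors the documented order (explicit name, then NZAE_REMOTE_NAME, then an error) so the behavior can be explored outside the appliance.

```r
# Hypothetical helper, not part of nzrserver: mirrors the documented
# rule for choosing the connection point name.
resolveConnectionPointName <- function(name = NULL) {
  if (!is.null(name)) {
    return(name)                      # an explicit name always wins
  }
  envName <- Sys.getenv("NZAE_REMOTE_NAME", unset = NA)
  if (is.na(envName)) {
    stop("name is NULL and NZAE_REMOTE_NAME is not set")
  }
  envName
}

Sys.setenv(NZAE_REMOTE_NAME = "my_remote_ae")
fromEnv  <- resolveConnectionPointName()            # taken from the environment
explicit <- resolveConnectionPointName("other_ae")  # taken from the argument
```

On the appliance, the equivalent calls would be createConnectionPoint() with NZAE_REMOTE_NAME set, or createConnectionPoint(name = "other_ae").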
Remote Initialization
Initialization functions related to Remote AEs. 1) Create a connection point. 2) Listen using that connection point. 3) Accept a Data Connection API handle.
Functions
► closeRemoteConnection
Closes a remote connection.
► createRemoteConnection
Creates a remote connection.
Detailed Description
Initialization functions related to Remote AEs. 1) Create a connection point. 2) Listen using that
connection point. 3) Accept a Data Connection API handle.
Function Documentation
► closeRemoteConnection()
Closes a remote connection.
▲ Returns
A logical value indicating whether the invocation finished correctly. If any error occurs, the Rf_error function is called, which results in premature exit and an appropriate message returned to the user.
► createRemoteConnection()
Creates a remote connection.
▲ Returns
A logical value indicating whether the invocation finished correctly. If any error occurs, the Rf_error function is called, which results in premature exit and an appropriate message returned to the user.
High Level initialization
Used to run both local and remote initialization.
Functions
► dispatcher
Top Level abstraction.
► handleConnection
Handle connection.
► runWrapper
Determines the API type and runs the user function.
Detailed Description
Used to run both local and remote initialization.
Function Documentation
► dispatcher()
Top Level abstraction.
▲ Returns
A logical value indicating whether the invocation finished correctly. If any error occurs, the Rf_error function is called, which results in premature exit and an appropriate message returned to the user.
The dispatcher function is the top-level abstraction for the entire R AE. By default, it is called just after loading the nzrserver package. It calls handleConnection directly and handles possible errors.
► handleConnection()
Handle connection.
▲ Returns
A logical value indicating whether the invocation finished correctly. If any error occurs, the Rf_error function is called, which results in premature exit and an appropriate message returned to the user.
The handleConnection function implements the local and remote connection creation and the remote connection loop. It then calls runWrapper.
► runWrapper()
Determines the API type and runs the user function.
▲ Returns
A logical value indicating whether the invocation finished correctly. If any error occurs, the Rf_error function is called, which results in premature exit and an appropriate message returned to the user.
The runWrapper function determines, based on the API type, what user-provided data is required. It then calls the user function.
Data Connection APIs
This API family is used to process data after a data connection has been opened. This involves running one
of the three types of API functions.
Functions
► getApiType
Gets the API type.
Modules
► Function
Function AEs are called from Scalar or Table SQL Functions.
► Aggregate API
Aggregate AEs are called from Aggregate SQL Functions. Apart from the functions described in this section, aggregates also use some of the functions from the Function API section.
► Shaper and Sizer API
Shapers can be optionally called for Table Function AEs. Sizers can be optionally called for Scalar Function AEs.
Detailed Description
This API family is used to process data after a data connection has been opened. This involves
running one of the three types of API functions.
Function Documentation
► getApiType()
Gets the API type.
▲ Returns
An integer equal to one of NZ.API.FUN, NZ.API.AGG, or NZ.API.SHP.
The AE can be started in one of three modes: as a function/table function (NZ.API.FUN), as
an aggregate (NZ.API.AGG), or as a shaper/sizer (NZ.API.SHP). This function can be used to
determine the API type at runtime.
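A minimal sketch of branching on the API type at runtime follows. The constants and getApiType are replaced by stand-ins so the code runs outside the appliance. The numeric values are invented, and describeApiType is a hypothetical helper rather than part of nzrserver.

```r
# Stand-ins so the sketch runs outside the appliance. In a real AE these
# come from nzrserver; the numeric values here are invented.
NZ.API.FUN <- 1L
NZ.API.AGG <- 2L
NZ.API.SHP <- 3L
getApiType <- function() NZ.API.AGG   # mock: pretend we run as an aggregate

# Hypothetical helper that names the mode the AE was started in.
describeApiType <- function() {
  apiType <- getApiType()
  if (apiType == NZ.API.FUN) {
    "function"
  } else if (apiType == NZ.API.AGG) {
    "aggregate"
  } else if (apiType == NZ.API.SHP) {
    "shaper/sizer"
  } else {
    stop("unexpected API type: ", apiType)
  }
}

mode <- describeApiType()
```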
Function
Function AEs are called from Scalar or Table SQL Functions.
Functions
► getInputColumn
Gets the input column.
► getNext
Gets the next row of data.
► inputColumnCount
Gets the input column count.
► outputResult
Sends an output row to the NPS software.
► setOutput
Sets the output column value.
► setOutputBool
Sets the output column value to a value of type Boolean.
► setOutputDate
Sets the output column value to a value of type Date.
► setOutputDouble
Sets the output column value to a value of type Double.
► setOutputFloat
Sets the output column value to a value of type Float.
► setOutputInt16
Sets the output column value to a value of type Int16.
► setOutputInt32
Sets the output column value to a value of type Int32.
► setOutputInt64
Sets the output column value to a value of type Int64.
► setOutputInt8
Sets the output column value to a value of type Int8.
► setOutputInterval
Sets the output column value to a value of type Interval.
► setOutputNull
Sets the output column value to NULL.
► setOutputString
Sets the output column value to a value of type String.
► setOutputTime
Sets the output column value to a value of type Time.
► setOutputTimeFromR
Sets the output column value to a value of type Time.
► setOutputTimeStamp
Sets the output column value to a value of type Timestamp.
► setOutputTimeTz
Sets the output column value to a value of type TimeTz.
Detailed Description
Function AEs are called from Scalar or Table SQL Functions.
Function Documentation
► getInputColumn(index)
Gets the input column.
▲ Parameters
► index
The input column index, as an integer.
▲ Returns
One of: numeric, character, logical, or integer.
Returns the value of the column specified by index. Data is cast to the R data type that is closest to
the actual Netezza data type.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
This function may also be used for Aggregates.
► getNext()
Gets the next row of data.
▲ Returns
Logical TRUE if no error occurred.
Informs the NPS software that it should send the next row of data. This function must be called before processing the first row of data. It must also be called before outputting any data.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
► inputColumnCount()
Gets the input column count.
▲ Returns
The column count as an integer.
Returns the number of input columns.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
This function may also be used for Aggregates.
► outputResult()
Sends an output row to the NPS software.
▲ Returns
Logical TRUE if no error occurred.
After the output columns' values are set using the various setOutput functions, the last step
is to call outputResult. This function sends the values to the NPS software. It can be called
many times if the R AE is executed as a Table Function.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is
called.
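The call order described above (getNext, read the input columns, set the output columns, outputResult) can be sketched as a row loop. The nzrserver functions are replaced here by mock stand-ins backed by an in-memory table so the code runs outside the appliance. Two points are assumptions rather than statements from this reference: column indexes are treated as zero-based, and getNext is assumed to return FALSE once the input is exhausted. rowSumAe is a hypothetical user function.

```r
# Mock stand-ins for nzrserver functions, backed by an in-memory table so
# the sketch runs outside the appliance. Zero-based column indexes and a
# FALSE return from getNext() at end of input are assumptions.
mock <- new.env()
mock$input   <- list(c(1, 2), c(3, 4), c(5, 6))   # three rows, two columns
mock$row     <- 0L
mock$pending <- list()
mock$output  <- list()

getNext <- function() {
  mock$row <- mock$row + 1L
  mock$row <= length(mock$input)
}
inputColumnCount <- function() length(mock$input[[mock$row]])
getInputColumn   <- function(index) mock$input[[mock$row]][index + 1L]
setOutput <- function(index, value) {
  mock$pending[[index + 1L]] <- value
  TRUE
}
outputResult <- function() {
  mock$output[[length(mock$output) + 1L]] <- unlist(mock$pending)
  TRUE
}

# Hypothetical user function: emits one output row per input row,
# holding the sum of that row's input columns.
rowSumAe <- function() {
  while (getNext()) {
    total <- 0
    for (i in seq_len(inputColumnCount())) {
      total <- total + getInputColumn(i - 1L)
    }
    setOutput(0L, total)
    outputResult()
  }
}

rowSumAe()
results <- unlist(mock$output)
```

Calling outputResult once per input row corresponds to scalar-style use. A table function could call it several times, or not at all, for a given input row.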
► setOutput(index, value)
Sets the output column value.
▲ Parameters
► index
The input or output column index, as an integer.
► value
A type value; basic type checking and casting is performed before passing data to the NPS software. If casting is impossible, an error is reported. The actual value constraints, for example, valid number of seconds, depend on the specific function.
▲ Returns
Logical TRUE if no error occurred.
Sets the value of the specified output column; the value is cast to the Netezza data type closest to the type of the value parameter.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
This function may also be used for Aggregates.
► setOutputBool(index, value)
Sets the output column value to a value of type Boolean.
▲ Parameters
► index
The input or output column index, as an integer.
► value
A type value; basic type checking and casting is performed before passing data to the NPS software. If casting is impossible, an error is reported.
▲ Returns
Logical TRUE if no error occurred.
Sets the value of the specified output column; the value is cast to the Netezza data type closest to the
type of the value parameter.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
This function may also be used for Aggregates.
► setOutputDate(index, value)
Sets the output column value to a value of type Date.
▲ Parameters
► index
The input or output column index, as an integer.
► value
A type value; basic type checking and casting is performed before passing data to the NPS software. If casting is impossible, an error is reported.
▲ Returns
Logical TRUE if no error occurred.
Sets the value of the specified output column; the value is cast to the Netezza data type closest to the
type of the value parameter.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
This function may also be used for Aggregates.
▲ See Also
► Date and Time Functions
► setOutputDouble(index, value)
Sets the output column value to a value of type Double.
▲ Parameters
► index
The input or output column index, as an integer.
► value
A type value; basic type checking and casting is performed before passing data to the NPS software. If casting is impossible, an error is reported.
▲ Returns
Logical TRUE if no error occurred.
Sets the value of the specified output column; the value is cast to the Netezza data type closest to the type of the value parameter.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is
called.
This function may also be used for Aggregates.
► setOutputFloat(index, value)
Sets the output column value to a value of type Float.
▲ Parameters
► index
The input or output column index, as an integer.
► value
A type value; basic type checking and casting is performed before passing data to the NPS software. If casting is impossible, an error is reported.
▲ Returns
Logical TRUE if no error occurred.
Sets the value of the specified output column; the value is cast to the Netezza data type closest to the type of the value parameter.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is
called.
This function may also be used for Aggregates.
► setOutputInt16(index, value)
Sets the output column value to a value of type Int16.
▲ Parameters
► index
The input or output column index, as an integer.
► value
A type value; basic type checking and casting is performed before passing data to the NPS software. If casting is impossible, an error is reported.
▲ Returns
Logical TRUE if no error occurred.
Sets the value of the specified output column; the value is cast to the Netezza data type closest to the
type of the value parameter.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
This function may also be used for Aggregates.
► setOutputInt32(index, value)
Sets the output column value to a value of type Int32.
▲ Parameters
► index
The input or output column index, as an integer.
► value
A type value; basic type checking and casting is performed before passing data to the NPS software. If casting is impossible, an error is reported.
▲ Returns
Logical TRUE if no error occurred.
Sets the value of the specified output column; the value is cast to the Netezza data type closest to the
type of the value parameter.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
This function may also be used for Aggregates.
► setOutputInt64(index, value)
Sets the output column value to a value of type Int64.
▲ Parameters
► index
The input or output column index, as an integer.
► value
A type value; basic type checking and casting is performed before passing data to the NPS software. If casting is impossible, an error is reported.
▲ Returns
Logical TRUE if no error occurred.
Sets the value of the specified output column; the value is cast to the Netezza data type closest to the
type of the value parameter.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
This function may also be used for Aggregates.
► setOutputInt8(index, value)
Sets the output column value to a value of type Int8.
▲ Parameters
► index
The input or output column index, as an integer.
► value
A type value; basic type checking and casting is performed before passing data to the NPS software. If casting is impossible, an error is reported.
▲ Returns
Logical TRUE if no error occurred.
Sets the value of the specified output column; the value is cast to the Netezza data type closest to the type of the value parameter.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is
called.
This function may also be used for Aggregates.
► setOutputInterval(index, tm, mon)
Sets the output column value to a value of type Interval.
▲ Parameters
► index
The input or output column index, as an integer.
► tm
The time value; exact constraints depend on the specific function.
► mon
The number of months.
▲ Returns
Logical TRUE if no error occurred.
Sets the value of the specified output column; the value is cast to the Netezza data type closest to the type of the value parameter.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is
called.
This function may also be used for Aggregates.
▲ See Also
► Date and Time Functions
► setOutputNull(index)
Sets the output column value to NULL.
▲ Parameters
► index
The input or output column index, as an integer.
▲ Returns
Logical TRUE if no error occurred.
Sets the value of the specified output column to NULL.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
This function may also be used for Aggregates.
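One possible use of setOutputNull is to emit SQL NULL when an R value is missing. The sketch below uses mock stand-ins for setOutput and setOutputNull so that it runs outside the appliance. writeValue is a hypothetical helper, and treating NA or a zero-length value as NULL is an assumption, not something this reference states.

```r
# Mock stand-ins for nzrserver functions; they record each call so the
# sketch runs outside the appliance.
calls <- character(0)
setOutput <- function(index, value) {
  calls <<- c(calls, paste0("set:", index))
  TRUE
}
setOutputNull <- function(index) {
  calls <<- c(calls, paste0("null:", index))
  TRUE
}

# Hypothetical helper: writes NULL for a missing scalar R value, otherwise
# lets setOutput pick the closest Netezza type.
writeValue <- function(index, value) {
  if (length(value) == 0L || is.na(value)) {
    setOutputNull(index)
  } else {
    setOutput(index, value)
  }
}

writeValue(0L, 42)   # regular value
writeValue(1L, NA)   # missing value becomes NULL
```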
► setOutputString(index, value)
Sets the output column value to a value of type String.
▲ Parameters
► index
The input or output column index, as an integer.
► value
A type value; basic type checking and casting is performed before passing data to the NPS software. If casting is impossible, an error is reported.
▲ Returns
Logical TRUE if no error occurred.
Sets the value of the specified output column; the value is cast to the Netezza data type closest to the
type of the value parameter.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
This function may also be used for Aggregates.
► setOutputTime(index, tm)
Sets the output column value to a value of type Time.
▲ Parameters
► index
The input or output column index, as an integer.
► tm
The time value; exact constraints depend on the specific function.
▲ Returns
Logical TRUE if no error occurred.
Sets the value of the specified output column; the value is cast to the Netezza data type closest to the
type of the value parameter.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
This function may also be used for Aggregates.
▲ See Also
► Date and Time Functions
► setOutputTimeFromR(index, seconds=NULL, milliseconds=NULL)
Sets the output column value to a value of type Time.
▲ Parameters
► index
The input or output column index, as an integer.
► seconds
The number of seconds.
► milliseconds
The number of milliseconds.
▲ Returns
Logical TRUE if no error occurred.
Sets the value of the specified output column; the value is cast to the Netezza data type closest to the type of the value parameter.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is
called.
This function may also be used for Aggregates.
▲ See Also
► Date and Time Functions
► setOutputTimeStamp(index, value)
Sets the output column value to a value of type Timestamp.
▲ Parameters
► index
The input or output column index, as an integer.
► value
A type value; basic type checking and casting is performed before passing data to the NPS software. If casting is impossible, an error is reported.
▲ Returns
Logical TRUE if no error occurred.
Sets the value of the specified output column; the value is cast to the Netezza data type closest to the type of the value parameter.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is
called.
This function may also be used for Aggregates.
▲ See Also
► Date and Time Functions
► setOutputTimeTz(index, tm, off)
Sets the output column value to a value of type TimeTz.
▲ Parameters
► index
The input or output column index, as an integer.
► tm
The time value; exact constraints depend on the specific function.
► off
The offset.
▲ Returns
Logical TRUE if no error occurred.
Sets the value of the specified output column; the value is cast to the Netezza data type closest to the
type of the value parameter.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
This function may also be used for Aggregates.
▲ See Also
► Date and Time Functions
Aggregate API
Aggregate AEs are called from Aggregate SQL Functions. Apart from the functions described in this section, aggregates also use some of the functions from the Function API section.
Functions
► getOutputColumn
Gets the output column.
► getState
Returns the state type.
► outputColumnCount
Gets the output column count.
Detailed Description
Aggregate AEs are called from Aggregate SQL Functions. Apart from the functions described in this section, aggregates also use some of the functions from the Function API section.
Function Documentation
►
getOutputColumn(index)
Gets the output column.
▲
Parameters
► index
The input column index, as an integer.
▲
Returns
One of: numeric, character, logical, integer.
Returns the value of the column specified by index. Data is cast to the R data type that is closest to
the actual Netezza data type.
The getOutputColumn function returns a value that depends on the state. In the INITIALIZE, ACCUMULATE,
or MERGE states it returns the value of the state variable; in the FINAL_RESULT state it returns the
value of the result column.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is called.
►
getState()
Returns the state type.
▲
Returns
An Integer identifying the state type.
Returns the state identifier, one of: NZ.INIT, NZ.ACCUM, NZ.MERGE, NZ.FINAL. The respective variables are set when the nzrserver package is loaded.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is
called.
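The state-dependent behavior of getOutputColumn described above can be sketched as a small lookup. This is an illustration only: the integer values assigned to NZ.INIT, NZ.ACCUM, NZ.MERGE, and NZ.FINAL below are placeholders (the real values are set when the nzrserver package is loaded), and describeOutputColumn is a hypothetical helper, not part of the API.

```r
# Placeholder state identifiers; the real values come from nzrserver.
states <- c(NZ.INIT = 0L, NZ.ACCUM = 1L, NZ.MERGE = 2L, NZ.FINAL = 3L)

# Hypothetical helper: given the value returned by getState(), report
# what getOutputColumn(index) refers to in that state.
describeOutputColumn <- function(state, ids = states) {
  name <- names(ids)[match(state, ids)]
  if (is.na(name)) stop("unknown aggregate state: ", state)
  if (name == "NZ.FINAL") "result column" else "state variable"
}

describeOutputColumn(states[["NZ.ACCUM"]])  # "state variable"
describeOutputColumn(states[["NZ.FINAL"]])  # "result column"
```

In a real aggregate AE the first argument would be the result of getState() rather than a value taken from the placeholder vector.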
►
outputColumnCount()
Gets the output column count.
▲
Returns
The column count as an integer.
Returns the number of output columns.
If an error occurs, the stop() function, or its internal equivalent, the Rf_error function, is
called.
Shaper and Sizer API
Shapers can be optionally called for Table Function AEs. Sizers can be optionally called for Scalar
Function AEs.
Functions
►
addOutputColumn
Adds a non-character or numeric column to the output.
►
addOutputColumnString
Adds a Character column to the output.
►
getInputColumnInfo
Returns the details for input columns.
►
getOutputColumnInfo
Returns details for the output column.
►
getOutputColumnName
Returns the output column name.
►
getUdfReturnType
Returns the output data type.
►
oneOutputRowRestriction
Specifies whether the output row is restricted to one output row per input row.
►
systemCatalogIsUpper
Indicates whether catalog names are in upper case.
►
updateInfo
Updates the NPS software with the output signature.
Detailed Description
Shapers can be optionally called for Table Function AEs. Sizers can be optionally called for Scalar Function
AEs.
NOTE: Numeric output is not supported.
Function Documentation
►
addOutputColumn(tp, nm)
Adds a non-character or numeric column to the output.
▲
Parameters
► tp
The output column type.
►
nm
The output column name.
Returns
TRUE when completed successfully or FALSE if an error occurs. If an error occurs, it is also reported via the R error handling mechanism.
Data type identifiers are set when the nzrserver package is loaded and are: NZ.FIXED, NZ.VARIABLE,
NZ.NATIONAL_FIXED, NZ.NATIONAL_VARIABLE, NZ.BOOL, NZ.DATE, NZ.TIME, NZ.TIMETZ, NZ.NUMERIC32, NZ.NUMERIC64, NZ.NUMERIC128, NZ.FLOAT, NZ.DOUBLE, NZ.INTERVAL, NZ.INT8, NZ.INT16,
NZ.INT32, NZ.INT64, NZ.TIMESTAMP, NZ.GEOMETRY, NZ.VARBINARY. They can also be accessed with
the getNpsDataTypes function.
►
addOutputColumnString(tp, nm, sz)
Adds a Character column to the output.
▲
▲
Parameters
► tp
The output column type.
►
nm
The output column name.
►
sz
The output column size.
Returns
TRUE when completed successfully or FALSE if an error occurs. If an error occurs, it is also reported via the R error handling mechanism.
Data type identifiers are set when the nzrserver package is loaded and are: NZ.FIXED, NZ.VARIABLE,
NZ.NATIONAL_FIXED, NZ.NATIONAL_VARIABLE, NZ.BOOL, NZ.DATE, NZ.TIME, NZ.TIMETZ, NZ.NUMERIC32, NZ.NUMERIC64, NZ.NUMERIC128, NZ.FLOAT, NZ.DOUBLE, NZ.INTERVAL, NZ.INT8, NZ.INT16,
NZ.INT32, NZ.INT64, NZ.TIMESTAMP, NZ.GEOMETRY, NZ.VARBINARY. They can also be accessed with
the getNpsDataTypes function.
►
getInputColumnInfo(index)
Returns the details for input columns.
▲
Parameters
► index
The column index.
▲
Returns
An integer vector with elements: input type, isConstant (0 or 1), size, scale. Scale is for
numeric columns and size is for numeric and character columns.
Returns details for the input column specified by index.
Data type identifiers are set when the nzrserver package is loaded and are: NZ.FIXED, NZ.VARIABLE, NZ.NATIONAL_FIXED, NZ.NATIONAL_VARIABLE, NZ.BOOL, NZ.DATE, NZ.TIME, NZ.TIMETZ, NZ.NUMERIC32, NZ.NUMERIC64, NZ.NUMERIC128, NZ.FLOAT, NZ.DOUBLE, NZ.INTERVAL, NZ.INT8, NZ.INT16, NZ.INT32, NZ.INT64, NZ.TIMESTAMP, NZ.GEOMETRY, NZ.VARBINARY. They can also be accessed with the getNpsDataTypes function.
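The four-element vector described above is easier to work with once its elements are named. The following is a minimal sketch: labelInputColumnInfo is a hypothetical helper, and the type identifier values are placeholders standing in for what getNpsDataTypes returns inside an AE.

```r
# Placeholder type identifiers; in an AE, use getNpsDataTypes() instead.
npsTypes <- c(NZ.INT32 = 1L, NZ.DOUBLE = 2L, NZ.VARIABLE = 3L)

# Hypothetical helper: attach names to the four-element vector that
# getInputColumnInfo(index) is documented to return.
labelInputColumnInfo <- function(info, types = npsTypes) {
  stopifnot(length(info) == 4L)
  list(
    type       = names(types)[match(info[1], types)],
    isConstant = info[2] == 1L,
    size       = info[3],
    scale      = info[4]
  )
}

# Example vector standing in for the result of getInputColumnInfo(0).
col <- labelInputColumnInfo(c(3L, 0L, 64L, 0L))
col$type        # "NZ.VARIABLE"
col$isConstant  # FALSE
```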
►
getOutputColumnInfo(index)
Returns details for the output column.
▲
Parameters
► index
The column index.
▲
Returns
An integer vector with elements: input type, size, scale. Scale is for numeric columns and
size is for numeric and character columns.
Returns details for the output column specified by index.
Data type identifiers are set when the nzrserver package is loaded and are: NZ.FIXED, NZ.VARIABLE, NZ.NATIONAL_FIXED, NZ.NATIONAL_VARIABLE, NZ.BOOL, NZ.DATE, NZ.TIME, NZ.TIMETZ, NZ.NUMERIC32, NZ.NUMERIC64, NZ.NUMERIC128, NZ.FLOAT, NZ.DOUBLE, NZ.INTERVAL, NZ.INT8, NZ.INT16, NZ.INT32, NZ.INT64, NZ.TIMESTAMP, NZ.GEOMETRY, NZ.VARBINARY. They can also be accessed with the getNpsDataTypes function.
►
getOutputColumnName(index)
Returns the output column name.
▲
Parameters
► index
The column index.
▲
Returns
The output column name.
►
getUdfReturnType()
Returns the output data type.
▲
Returns
The output data type.
Can only be used in 'function' mode; cannot be used in table function mode.
Data type identifiers are set when the nzrserver package is loaded and are: NZ.FIXED, NZ.VARIABLE,
NZ.NATIONAL_FIXED, NZ.NATIONAL_VARIABLE, NZ.BOOL, NZ.DATE, NZ.TIME, NZ.TIMETZ, NZ.NUMERIC32, NZ.NUMERIC64, NZ.NUMERIC128, NZ.FLOAT, NZ.DOUBLE, NZ.INTERVAL, NZ.INT8, NZ.INT16,
NZ.INT32, NZ.INT64, NZ.TIMESTAMP, NZ.GEOMETRY, NZ.VARBINARY. They can also be accessed with
the getNpsDataTypes function.
►
oneOutputRowRestriction()
Specifies whether the output row is restricted to one output row per input row.
▲
Returns
A logical value indicating whether the output is restricted to exactly one output row per input
row.
►
systemCatalogIsUpper()
Indicates whether catalog names are in upper case.
▲
Returns
A logical value indicating whether catalog names are in upper case.
►
updateInfo()
Updates the NPS software with the output signature.
▲
Returns
TRUE when completed successfully or FALSE if an error occurs. If an error occurs, it is also reported via the R error handling mechanism.
Must be called when the output signature is sent to the NPS software. It should be the last command
in a shaper or sizer.
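The calling order described above (declare the output columns, then call updateInfo last) can be sketched as follows. This is not the product's implementation: shaperSketch is a hypothetical shaper, the api list holds recording stand-ins for addOutputColumn, addOutputColumnString, and updateInfo so that the sketch runs outside an AE, and the type identifier values are placeholders.

```r
# Placeholder type identifiers; the real ones are set by nzrserver.
types <- c(NZ.INT32 = 1L, NZ.VARIABLE = 2L)

# Recording stand-ins for the Shaper API so the sketch runs anywhere.
calls <- character(0)
api <- list(
  addOutputColumn = function(tp, nm) {
    calls <<- c(calls, paste("column", nm)); TRUE
  },
  addOutputColumnString = function(tp, nm, sz) {
    calls <<- c(calls, paste("string", nm, sz)); TRUE
  },
  updateInfo = function() { calls <<- c(calls, "updateInfo"); TRUE }
)

# Hypothetical shaper: declare two output columns, then send the signature.
shaperSketch <- function(api, types) {
  ok <- api$addOutputColumn(types[["NZ.INT32"]], "ID") &&
    api$addOutputColumnString(types[["NZ.VARIABLE"]], "LABEL", 64L)
  # updateInfo() should be the last command in a shaper or sizer.
  ok && api$updateInfo()
}

shaperOk <- shaperSketch(api, types)
```

Inside an AE the stand-ins would be replaced by direct calls to the documented functions. The sketch avoids numeric output types because numeric output is noted as unsupported.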
Data Type Support
The data APIs work with these data types.
Functions
►
getNpsDataTypes
Gets the Netezza data type names and identifiers.
Enumerations
►
enum NZ {
FIXED, VARIABLE, NATIONAL_FIXED, NATIONAL_VARIABLE, BOOL, DATE, TIME, TIMETZ, NUMERIC32,
NUMERIC64, NUMERIC128, FLOAT, DOUBLE, INTERVAL, INT8, INT16, INT32, INT64, TIMESTAMP, GEOMETRY, VARBINARY, DEBUG, TRACE, INIT, ACCUM, MERGE, FINAL, FUNC, AGG, SHP }
Constants used in user code.
Detailed Description
The data APIs work with these data types.
Function Documentation
►
getNpsDataTypes()
Gets the Netezza data type names and identifiers.
▲
Returns
An integer vector containing the numeric data type identifiers. Each value is named
after the corresponding data type.
The types are: NZ.FIXED, NZ.VARIABLE, NZ.NATIONAL_FIXED, NZ.NATIONAL_VARIABLE, NZ.BOOL, NZ.DATE, NZ.TIME, NZ.TIMETZ, NZ.NUMERIC32, NZ.NUMERIC64, NZ.NUMERIC128,
NZ.FLOAT, NZ.DOUBLE, NZ.INTERVAL, NZ.INT8, NZ.INT16, NZ.INT32, NZ.INT64, NZ.TIMESTAMP, NZ.GEOMETRY, NZ.VARBINARY.
Enumeration Type Documentation
►
enum NZ
Constants used in user code.
FIXED Fixed string
VARIABLE Variable string
NATIONAL_FIXED Fixed national string
NATIONAL_VARIABLE Variable national string
BOOL Boolean
DATE Date
TIME Time
TIMETZ Time with time zone
NUMERIC32 Numeric 32
NUMERIC64 Numeric 64
NUMERIC128 Numeric 128
FLOAT Float
DOUBLE Double
INTERVAL Interval
INT8 1 byte integer
INT16 2 byte integer
INT32 4 byte integer
INT64 8 byte integer
TIMESTAMP Time stamp
GEOMETRY Geometry
VARBINARY Varbinary
DEBUG Log level debug
TRACE Log level trace
INIT Aggregate initialize identifier
ACCUM Aggregate accumulate identifier
MERGE Aggregate merge state identifier
FINAL Aggregate final result identifier
FUNC (API.FUN) API function identifier
AGG (API.AGG) API aggregate identifier
SHP (API.SHP) API shaper identifier
Support APIs
This API family provides support functions for date and time conversions, and for getting runtime environment information.
Modules
►
Date and Time Functions
Date and Time helper functions used to convert to and from Netezza date and time formats.
►
Runtime and Environment Information
Runtime, Environment, and Shared Library Information.
►
Utilities
Utilities.
Detailed Description
This API family provides support functions for date and time conversions, and for getting runtime environment information.
Date and Time Functions
Date and Time helper functions used to convert to and from Netezza date and time formats.
Functions
►
millisecondsToNzTime
Converts the number of milliseconds since the start of the day to Netezza time.
►
posixTimeSecondsToNzDate
Converts Posix seconds since the Epoch to a Netezza date.
►
secondsToNzTime
Converts the number of seconds since the start of the day to a Netezza time.
Detailed Description
Date and Time helper functions used to convert to and from Netezza date and time formats.
The NDN Developer Guide defines the following data types:
► date (referred to as 'Netezza Date') is a 4-byte integer representing the number of days before (-) or after (+) 1/1/2000; min: -730,119 (1/1/0001); max: 2,921,939 (12/31/9999)
► time (referred to as 'Netezza Time') is an 8-byte integer representing the number of microseconds since midnight, up to one microsecond before the next midnight; min: 0 (00:00:00.000000); max: 86,399,999,999 (23:59:59.999999)
► time with timezone (referred to as 'Netezza TimeTZ') consists of a 'Netezza Time' field and a timezone field, which is a 4-byte integer representing the offset in seconds, sign reversed (for example, the offset of "+1 hour" is stored as -3600); the offset must be a whole number of minutes, that is, offset % 60 = 0; offset min: -46800 (+13:00:00); offset max: 46740 (-12:59:00)
► timestamp (referred to as 'Netezza Timestamp') is an 8-byte integer representing the number of microseconds before (-) or after (+) 00:00:00.0, 1/1/2000; min: -63,082,281,600,000,000 (00:00:00, 1/1/0001); max: 252,455,615,999,999,999 (23:59:59.999999, 12/31/9999)
► interval (referred to as 'Netezza Interval') consists of a 4-byte integer (the number of months, signed) and an 8-byte integer (the number of microseconds, signed); a combination of a negative (-) months value and a positive (+) microseconds value, or a positive (+) months value and a negative (-) microseconds value, is possible and supported by the Netezza appliance; months min: -3,000,000 (-250000 years); months max: 3,000,000 (250000 years); microseconds min: NONE (min signed int64); microseconds max: NONE (max signed int64).
Please note that:
(a) The microsecond value can be as large as the int64 data type allows and overflows into negatives without error.
(b) A month is always considered to contain 30 days.
(c) The months and microseconds values are stored separately and there is no information exchange between them.
To support the above data types as closely as possible, in the R Adapter 8-byte integers are represented as numeric values. Note that this representation introduces approximation and not all
values can be sent to the NPS software in a precise manner.
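The approximation mentioned above follows from the IEEE double format that R uses for numeric values: integers are exact only up to 2^53. The sketch below uses plain R arithmetic and the limits quoted in the data type list; the variable names are illustrative only.

```r
# Largest magnitude up to which every integer is exact in a double.
exactLimit <- 2^53

# Documented maximum 'Netezza Timestamp' value (microseconds since 1/1/2000).
maxTimestamp <- 252455615999999999

maxTimestamp > exactLimit          # TRUE: beyond the exactly representable range
(exactLimit + 1) == exactLimit     # TRUE: the odd value 2^53 + 1 is rounded

# A 'Netezza Time' value always fits: at most 86,399,999,999 microseconds.
maxTime <- 86399999999
(maxTime + 1) - maxTime == 1       # TRUE: still exact at this magnitude
```

Time values are therefore transferred exactly, while timestamps far from 1/1/2000 may be rounded.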
The setOutputDate function accepts a number of input formats: POSIXlt and POSIXct values are cast as
integers and translated to 'Netezza Date' using posixTimeSecondsToNzDate; a Date object
is cast as an integer and translated to be relative to 1/1/2000, that is, 10957 days are subtracted.
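The 10957-day shift is the distance between the R epoch (1/1/1970) and the Netezza epoch (1/1/2000). The sketch below reproduces the arithmetic in plain R; dateToNzDays and posixSecondsToNzDays are hypothetical helpers, and flooring POSIX seconds to whole UTC days is an assumption about how the real posixTimeSecondsToNzDate behaves.

```r
# Days between the R epoch (1970-01-01) and the Netezza epoch (2000-01-01).
epochShift <- as.integer(as.Date("2000-01-01"))   # 10957

# Hypothetical helper: convert an R Date to a day count relative to 1/1/2000.
dateToNzDays <- function(d) as.integer(d) - epochShift

# Hypothetical helper: convert POSIX seconds (UTC) to the same day count,
# assuming partial days are dropped.
posixSecondsToNzDays <- function(posixSec) {
  as.integer(floor(posixSec / 86400)) - epochShift
}

dateToNzDays(as.Date("2000-01-01"))   # 0
dateToNzDays(as.Date("9999-12-31"))   # 2921939, the documented maximum
posixSecondsToNzDays(946684800)       # 0 (2000-01-01 00:00:00 UTC)
```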
The setOutputTime function accepts a numeric value representing time value as defined in
'Netezza Time'.
The setOutputTimeFromR function accepts either seconds or milliseconds, both expressed as 4-byte integer values; the time value is then translated using secondsToNzTime or millisecondsToNzTime,
respectively.
The setOutputTimeStamp function accepts a timestamp which, being an 8-byte integer, cannot be represented precisely in R; thus a numeric value is expected. Note that rounding occurs when the value is too
large to be represented exactly in the IEEE double data type.
The setOutputTimeTz function accepts time and offset values. See 'Netezza TimeTZ' for details.
The setOutputInterval function accepts a number of microseconds usec and a number of months mon.
See 'Netezza Interval' for details.
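The sign-reversed TimeTZ offset is easy to get wrong, so a small conversion helper may be useful. The following is a sketch under the constraints quoted for 'Netezza TimeTZ'; utcOffsetMinutesToNzOffset is a hypothetical helper and not part of the API.

```r
# Hypothetical helper: turn a UTC offset given in minutes (positive means
# east of UTC) into the sign-reversed number of seconds used by TimeTZ.
utcOffsetMinutesToNzOffset <- function(offsetMinutes) {
  stopifnot(offsetMinutes == round(offsetMinutes))   # whole minutes only
  off <- -60 * offsetMinutes                          # seconds, sign reversed
  stopifnot(off >= -46800, off <= 46740)              # documented range
  as.integer(off)
}

utcOffsetMinutesToNzOffset(60)               # -3600, the "+1 hour" example
utcOffsetMinutesToNzOffset(13 * 60)          # -46800, the documented minimum
utcOffsetMinutesToNzOffset(-(12 * 60 + 59))  # 46740, the documented maximum
```

The result would be passed as the off argument of setOutputTimeTz.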
Function Documentation
►
millisecondsToNzTime(msec)
Converts the number of milliseconds since the start of the day to Netezza time.
▲
Parameters
► msec
The number of milliseconds relative to 00:00:00, less than or equal to 86,399,999
(23:59:59.999).
▲
Returns
A Netezza time.
►
posixTimeSecondsToNzDate(posixSec)
Converts Posix seconds since the Epoch to a Netezza date.
▲
Parameters
► posixSec
POSIX time, that is, the number of seconds relative to midnight Coordinated Universal Time
(UTC) of January 1, 1970.
▲
Returns
A Netezza date.
►
secondsToNzTime(sec)
Converts the number of seconds since the start of the day to a Netezza time.
▲
Parameters
► sec
The number of seconds relative to 00:00:00, less than or equal to 86399 (23:59:59).
▲
Returns
A Netezza time.
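Because a 'Netezza Time' is a count of microseconds since midnight, both time conversions amount to a unit change. The functions below are hypothetical stand-ins written in plain R to show the arithmetic; they assume the result is simply the microsecond count and are not the adapter's own implementations.

```r
# Hypothetical stand-in for secondsToNzTime: seconds to microseconds.
sketchSecondsToNzTime <- function(sec) {
  stopifnot(sec >= 0, sec <= 86399)
  sec * 1e6
}

# Hypothetical stand-in for millisecondsToNzTime: milliseconds to microseconds.
sketchMillisecondsToNzTime <- function(msec) {
  stopifnot(msec >= 0, msec <= 86399999)
  msec * 1e3
}

sketchSecondsToNzTime(86399)           # 86399000000 (23:59:59)
sketchMillisecondsToNzTime(86399999)   # 86399999000 (23:59:59.999)
```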
Runtime and Environment Information
Runtime, Environment, and Shared Library Information.
Functions
►
getEnv
Returns the value of the specified environment variable.
►
getFirstEnvironmentEntry
Returns the first environment entry.
►
getLibraryFullPath
Returns the full path for the specified library.
►
getLibraryInfo
Returns shared libraries information for this request.
►
getLibraryProcessInfo
Returns shared libraries information for the process.
►
getNextEnvironmentEntry
Returns the next environment entry.
►
getRuntime
Gets runtime information about the R AE.
►
getSystemLog
Gets the path of the system log file.
►
logMessage
Sends a log message to the NPS software.
►
userError
Sends an error message to the Netezza appliance.
Detailed Description
Runtime, Environment, and Shared Library Information.
Function Documentation
►
getEnv(name)
Returns the value of the specified environment variable.
▲
Parameters
► name
The environment variable name.
▲
Returns
A character string containing the value of the specified environment variable or NULL if
not found.
►
getFirstEnvironmentEntry()
Returns the first environment entry.
▲
Returns
A two-element character vector or NULL on completion. The vector, if returned, contains the
name of the variable as well as its value.
Returns the first environment entry. This function call should be followed by repeated calls to getNextEnvironmentEntry.
►
getLibraryFullPath(name, case)
Returns the full path for the specified library.
▲
Parameters
► name
The name of the library.
►
case
Specifies if matching should be case-sensitive (TRUE) or case-insensitive (FALSE).
▲
Returns
A character string containing the specified library path.
This R Adapter API function allows the Netezza system to be queried about shared libraries required
by the UDX responsible for running the current Analytic Executable.
►
getLibraryInfo()
Returns shared libraries information for this request.
▲
Returns
A three-column (name, path, autoLoad) data.frame.
This R Adapter API function allows the Netezza system to be queried about shared libraries required
by the UDX responsible for running the current Analytic Executable.
►
getLibraryProcessInfo()
Returns shared libraries information for the process.
▲
Returns
A three-column (name, path, autoLoad) data.frame.
Returns shared libraries information for the process. Returns NULL if the AE is not Remote.
This R Adapter API function allows the Netezza system to be queried about shared libraries required
by the UDX responsible for running the current Analytic Executable.
►
getNextEnvironmentEntry()
Returns the next environment entry.
▲
Returns
A two-element character vector or NULL on completion. The vector, if returned, contains the
name of the variable as well as its value.
Returns the next environment entry. The first call to getNextEnvironmentEntry must follow a call to
getFirstEnvironmentEntry. Returns NULL on completion. Key names may repeat; when they do, the last entry returned for a key name holds its current value.
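The first/next protocol can be wrapped in a loop that collects all entries and lets later duplicates win. The sketch below is hypothetical: collectEnvironment takes the two accessors as arguments so that it can be run with stand-ins, whereas in an AE they would be getFirstEnvironmentEntry and getNextEnvironmentEntry.

```r
# Hypothetical helper: walk the environment entries and keep, for each
# key, the last value seen.
collectEnvironment <- function(first, nextEntry) {
  env <- list()
  entry <- first()
  while (!is.null(entry)) {
    env[[entry[1]]] <- entry[2]   # later duplicates overwrite earlier ones
    entry <- nextEntry()
  }
  env
}

# Stand-in iterator over a fixed set of entries with one repeated key.
entries <- list(c("A", "1"), c("B", "2"), c("A", "3"))
pos <- 0
stepEntry <- function() {
  pos <<- pos + 1
  if (pos <= length(entries)) entries[[pos]] else NULL
}

env <- collectEnvironment(stepEntry, stepEntry)
env$A   # "3"
```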
►
getRuntime()
Gets runtime information about the R AE.
▲
Returns
A list with the following elements: data.slice.id, transaction.id, hardware.id, number.data.slices, number.spus, suggested.memory.limit, locus, adapter.type, user.query, session.id.
During AE execution, various runtime details regarding the database, execution locus, and so
on, are available. These can be accessed using getRuntime.
►
getSystemLog()
Gets the path of the system log file.
▲
Returns
A character string containing the path to the system log file.
►
logMessage(level, msg)
Sends a log message to the NPS software.
▲
Parameters
► level
The message importance level; can be set to NZ.DEBUG or NZ.TRACE.
►
msg
The log message itself.
The logMessage function sends a message to the NPS software, which then either prints it to the console where the NPS software was started or stores it in one of the log files.
►
userError(..., exit=TRUE)
Sends an error message to the Netezza appliance.
▲
Parameters
► ...
Arguments to be passed to paste(...,col='') to produce the final error message.
►
exit
Exit the process after sending the error message.
The AE process must then clear its runtime data and exit. The exit parameter allows R to exit
normally, otherwise userError calls the exit library function directly. Ideally, the standard R
stop function should be called; it terminates R execution and throws an exception that
is eventually intercepted and passed to userError in a safe way.
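The recommended pattern, raising errors with stop and letting a top-level handler report them, can be illustrated in plain R. In this sketch runGuarded is a hypothetical stand-in for the adapter's interception logic; it only captures the message, where the real adapter would hand it to userError.

```r
# Hypothetical stand-in for the adapter's top-level handler: run the user
# code and capture the message of any error raised with stop().
runGuarded <- function(userCode) {
  tryCatch(
    list(ok = TRUE, value = userCode()),
    error = function(e) list(ok = FALSE, message = conditionMessage(e))
  )
}

res <- runGuarded(function() stop("column ", 2, " must be numeric"))
res$ok        # FALSE
res$message   # "column 2 must be numeric"
```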
Utilities
Utilities.
Functions
►
decodebase64
Decodes a base64-encoded character string.
►
fastdataframe
Implements the protocol for sending large data frames from client R to the Netezza software.
►
fetchGroup
Gets the specified number of input rows.
►
fetchRows
Gets the specified number of input rows.
►
getFilePath
Returns the location of a user-provided code file.
►
getGroupValue
Gets the group value.
►
ping
Notifies the NPS software that the AE is still active.
►
placefile
Saves user-provided code in the workspace directory.
►
prepareUserCode
Saves user-provided code in the workspace directory.
►
workspacePath
Returns the location of the workspace folder.
Detailed Description
Utilities.
Function Documentation
►
decodebase64()
Decodes a base64-encoded character string.
▲
Returns
A decoded character string
If the R AE is called via the Netezza R Library (nzr), the user-provided function is sent as a serialized,
base64-encoded character string. This function is used to decode that string.
►
fastdataframe()
Implements the protocol for sending large data frames from client R to the Netezza software.
This function is the first function called after loading the nzrserver package when the nzr..fastdataframe UDTF is run from the Netezza R Library (nzr). It implements a protocol for sending large data.frames from a client R instance to the NPS software.
►
fetchGroup(cfrom=NULL, cto=NULL)
Gets the specified number of input rows.
▲
Parameters
► cfrom
The index of the first input column to be fetched; by default 0.
►
cto
The index of the last input column to be fetched; by default inputColumnCount - 1.
▲
Returns
data.frame.
Because of its optimized input implementation, this function allows a number of input rows
to be fetched with one callback invocation, increasing the performance by 10-50 times.
The fetchGroup utility assumes that the last three columns are the group ID, the row number, and the total row count in the current group; it reads the number of rows from the last
column. It also saves the group ID value, which can be later accessed by getGroupValue.
This utility can only be used with Table functions.
►
fetchRows(n=1, cfrom=NULL, cto=NULL, out=NULL)
Gets the specified number of input rows.
▲
▲
Parameters
► n
The number of input rows to be fetched.
►
cfrom
The index of the first input column to be fetched; by default 0.
►
cto
The index of the last input column to be fetched; by default inputColumnCount -1.
►
out
Optional parameter. If the name of an allocated data frame is passed, the R Adapter
overwrites the current data frame and does not allocate new memory.
Returns
data.frame.
Because of its optimized input implementation, this function allows a number of input rows
to be fetched with one callback invocation, increasing the performance by 10-50 times.
The fetchRows utility reads in at most the specified number of input rows, presuming they
are available, and returns them as a data frame, with columns named Xn.
If the out parameter is specified, data is passed outside the function through the out object,
which is uncommon in R itself. The number of rows in the object specified by out must be
equal to n; if they differ, an error is reported and Adapter execution ceases. This parameter
should be used to tune the data input performance, if necessary. The fetchRows utility can
be used without passing a value with virtually no performance impact.
This utility can only be used with Table SQL functions.
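A typical use of fetchRows is a loop that processes the input in fixed-size chunks. The sketch below is hypothetical and makes two assumptions that are not confirmed by the text above: that end of input shows up as NULL or as a data frame with fewer than n rows, and that the columns are named X0, X1, and so on. The fetch argument stands in for fetchRows so that the loop can be run outside an AE.

```r
# Hypothetical driver: `fetch` stands in for fetchRows. End of input is
# assumed to appear as NULL or as fewer than n rows.
processInChunks <- function(fetch, n, handle) {
  total <- 0
  repeat {
    chunk <- fetch(n)
    if (is.null(chunk) || nrow(chunk) == 0) break
    handle(chunk)
    total <- total + nrow(chunk)
    if (nrow(chunk) < n) break
  }
  total
}

# Stand-in data source with 5 rows; column names are an assumption.
src <- data.frame(X0 = 1:5, X1 = letters[1:5])
cursor <- 0
fakeFetch <- function(n) {
  if (cursor >= nrow(src)) return(NULL)
  rows <- seq(cursor + 1, min(cursor + n, nrow(src)))
  cursor <<- max(rows)
  src[rows, , drop = FALSE]
}

sums <- 0
total <- processInChunks(fakeFetch, 2, function(df) sums <<- sums + sum(df$X0))
total   # 5 rows processed in chunks of at most 2
```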
►
getFilePath()
Returns the location of a user-provided code file.
▲
Returns
A character string pointing to the code file.
If the R AE is called via the Netezza R Library (nzr), the user-provided function is stored in the
workspace folder. This function returns the absolute file path.
When searching for the user code, the following locations are checked:
The location specified by the WORKSPACE_PATH environment variable, which should contain the
file name to be concatenated to the result of workspace() invocation.
▲ If the WORKSPACE_PATH variable cannot be found then the location specified by the
ABSOLUTE_PATH environment variable, which should contain the full path to an existing file, is
checked.
▲
►
getGroupValue()
Gets the group value.
▲
Returns
Atomic vector containing one element, whose type depends on the group input column. This element is the group ID value.
Gets the saved group ID value from the last call to fetchGroup.
Because of its optimized input implementation, this function allows a number of input rows to be
fetched with one callback invocation, increasing the performance by 10-50 times.
This function can only be used with Table SQL functions.
►
ping()
Notifies the NPS software that the AE is still active.
The ping function can be used to indicate that the AE is still active and not hanging, which may be useful for time-intensive computations.
►
placefile()
Saves user-provided code in the workspace directory.
This function is the first function called after loading the nzrserver package when the nzr..placefile
UDTF is run from the Netezza R Library (nzr). It saves the user-provided code in the workspace directory. The placefile function should be registered as a separate UDTF and called only via the Netezza R
Library (nzr) functions.
The placefile function expects to find at least two columns. The first column is the mode column,
which has three possible values: placefile, createfile, or appendfile.
If the value in the mode column is placefile, the second column is the base64-encoded character
string containing the data to be written to a randomly-named file. The file is created in the workspace
directory and the file name is returned as the UDTF result.
If the value in the mode column is either createfile or appendfile, the second column must
contain the file name, and a third column must contain the data string. The file is then created in the workspace directory with the provided name. If the value of mode was createfile,
the file is overwritten; if the value was appendfile, the file is appended. The file name is returned as the UDTF result.
►
prepareUserCode()
Saves user-provided code in the workspace directory.
The dispatcher function assumes that there is a file or an environment variable that contains
all required user data. The data is assumed to be serialized if in a file, or serialized and
base64-encoded if stored in an environment variable.
When searching for the user code, the following locations are checked:
The location specified by the WORKSPACE_PATH environment variable, which should
contain the file name to be concatenated to the result of workspace() invocation.
▲ If the WORKSPACE_PATH variable cannot be found then the location specified by the ABSOLUTE_PATH environment variable, which should contain the full path to an existing
file, is checked.
▲ If the ABSOLUTE_PATH variable cannot be found, then the CODE_SERIALIZED environment
variable, which should contain a base64-encoded string, is checked.
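The lookup order above can be sketched as a chain of fallbacks. In this illustration locateUserCode is a hypothetical helper, getVar stands in for an environment lookup that returns NULL when a variable is not set, and joining the workspace path and file name with file.path is an assumption about how the concatenation is done.

```r
# Hypothetical helper: decide where the user code comes from, following
# the documented order of the three environment variables.
locateUserCode <- function(getVar, workspace) {
  rel <- getVar("WORKSPACE_PATH")
  if (!is.null(rel)) {
    return(list(source = "file", path = file.path(workspace, rel)))
  }
  abs <- getVar("ABSOLUTE_PATH")
  if (!is.null(abs)) return(list(source = "file", path = abs))
  enc <- getVar("CODE_SERIALIZED")
  if (!is.null(enc)) return(list(source = "base64", data = enc))
  stop("no user code found")
}

# Stand-in variables: WORKSPACE_PATH is not set, so ABSOLUTE_PATH wins.
vars <- list(ABSOLUTE_PATH = "/tmp/code.bin", CODE_SERIALIZED = "QUJD")
found <- locateUserCode(function(name) vars[[name]], "/workspace")
found$path   # "/tmp/code.bin"
```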
The contents of the file or the base64-decoded string are then passed to charToRaw(), and
its output to unserialize(), which are both standard R functions. The result of the deserialization should be a list containing a subset of the following elements:
▲
mode - a character string with 'mode' identifier
fun - a function object
args - a list with additional function arguments
cols - a vector with input columns names
file - the name of the package file to be installed
shaper - a shaper function
shaper.args - additional arguments for the shaper function
The mode element is used to determine which high-level wrapper should be invoked. It can
take one of the following values: apply, tapply, run, or install. In the case of aggregation
mode (UDA) and 'Shaper & Sizers' no mode value is required.
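The payload format can be tried out with standard R functions alone. The sketch below builds a list with a subset of the elements named above, serializes it, and decodes it with charToRaw and unserialize as described. Using ASCII serialization and passing args to fun with do.call are assumptions made for illustration; the Netezza R Library may encode and invoke the payload differently.

```r
# Build a payload with a subset of the documented elements.
payload <- list(
  mode = "apply",
  fun  = function(x, k) x * k,
  args = list(k = 2)
)

# Serialize to ASCII text so that charToRaw() can rebuild the raw bytes.
text <- rawToChar(serialize(payload, connection = NULL, ascii = TRUE))

# Decode as described above: charToRaw(), then unserialize().
decoded <- unserialize(charToRaw(text))

decoded$mode                                       # "apply"
result <- do.call(decoded$fun, c(list(21), decoded$args))
result                                             # 42
```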
For exact implementation see dispatcher, handleConnection, and runWrapper functions.
IMPORTANT: When TABLE(ANY) is given as the table function result, such as when Shapers &
Sizers are being used, the dynamic environment implementation requires the last argument
be cast as non-national VARCHAR.
►
workspacePath()
Returns the location of the workspace folder.
▲
Returns
A character string containing the path to the workspace folder.
If the R AE is called via the Netezza R Library (nzr), the user-provided function is stored in the
workspace folder. This function returns the absolute folder path.
Working Modes
Working mode wrappers.
Functions
►
apply
Handles the client-side invocation for nzApply.
►
groupedApply
Handles the client-side invocation for nzGroupedApply.
►
install
Handles the client-side invocation for nzInstallPackages.
►
rawtapply
Handles the client-side invocation for special grouped apply.
►
run
Handles the client-side invocation for nzRun.
►
tapply
Handles the client-side invocation for nzTApply.
Detailed Description
Working mode wrappers.
An R Analytic Executable can operate on a number of levels of abstraction, either using the most basic API or
taking advantage of predefined wrappers. The wrappers are described here.
Function Documentation
►
nzrsrv::apply(fun, args)
Handles the client-side invocation for nzApply.
▲
Parameters
► fun
The function object to be invoked for the data stream. The fun object must be a function that
accepts data row x and any additional user-provided arguments args.
►
args
Additional arguments for fun.
Called when the dispatcher function is called, which is the default way of working with R AEs.
►
nzrsrv::groupedApply()
Handles the client-side invocation for nzGroupedApply.
Called when the dispatcher function is called, which is the default way of working with R
AEs.
The groupedApply wrapper is responsible for handling aggregates and reads its data from a
file identified by the current session identifier. See implementation and client-side documentation for details.
►
nzrsrv::install(file)
Handles the client-side invocation for nzInstallPackages.
▲
Parameters
► file
A package file name. The package is assumed to be present in the workspace directory. The file must match the name of an existing R package located in the
workspace directory.
Called when the dispatcher function is called, which is the default way of working with R
AEs.
►
nzrsrv::rawtapply()
Handles the client-side invocation for special grouped apply.
▲
Parameters
► fun
The function object to be invoked for the data stream.
►
args
Additional arguments for fun.
This function implements a special grouped apply to be used directly as a SQL query.
This function is called by dispatcher when the nz.mode variable in the compiled file is set to
'groupedapply'.
Called when the dispatcher function is called, which is the default way of working with R
AEs.
►
nzrsrv::run(fun, args)
Handles the client-side invocation for nzRun.
▲
Parameters
► fun
The function object to be invoked for the data stream. The fun object must be a
function that accepts user-provided arguments args.
►
args
Additional arguments for fun.
Called when the dispatcher function is called, which is the default way of working with R AEs.
►
nzrsrv::tapply(fun, args, cols)
Handles the client-side invocation for nzTApply.
▲
Parameters
► fun
The function object to be invoked for the data stream. The fun object must be a function that
accepts data frame x and additional user-provided arguments args.
►
args
Additional arguments for fun.
►
cols
The input column names.
Called when the dispatcher function is called, which is the default way of working with R AEs.
The tapply wrapper requires a character vector that includes the names of the input columns, which
is used to assure that fun can access input data using the original names.
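The role of cols can be illustrated with a small stand-in for the wrapper. This sketch is hypothetical: callWithOriginalNames only shows the renaming step, and passing args to fun as a single list is an assumption, since the text above does not say how the wrapper forwards it.

```r
# Hypothetical stand-in for what the wrapper does with `cols`: rename the
# fetched columns to their original names before calling fun.
callWithOriginalNames <- function(x, fun, args, cols) {
  stopifnot(length(cols) == ncol(x))
  names(x) <- cols
  fun(x, args)
}

# A user function in the documented shape: data frame x plus args.
meanOfColumn <- function(x, args) mean(x[[args$column]])

group <- data.frame(X0 = c(1, 2, 3), X1 = c(10, 20, 30))
avg <- callWithOriginalNames(group, meanOfColumn, list(column = "PRICE"),
                             cols = c("QTY", "PRICE"))
avg   # 20
```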
Notices and Trademarks
Notices
This information was developed for products and services offered in the U.S.A. IBM may not offer
the products, services, or features discussed in this document in other countries. Consult your local
IBM representative for information on the products and services currently available in your area.
Any reference to an IBM product, program, or service is not intended to state or imply that only that
IBM product, program, or service may be used. Any functionally equivalent product, program, or
service that does not infringe any IBM intellectual property right may be used instead. However, it is
the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or
service.
IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can
send license inquiries, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785 U.S.A.
For license inquiries regarding double-byte character set (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to:
Intellectual Property Licensing
Legal and Intellectual Property Law
IBM Japan Ltd.
1623-14, Shimotsuruma, Yamato-shi
Kanagawa 242-8502 Japan
The following paragraph does not apply to the United Kingdom or any other country where such
provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION
PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.
Any references in this information to non-IBM Web sites are provided for convenience only and do
not in any manner serve as an endorsement of those Web sites. The materials at those Web sites
are not part of the materials for this IBM product and use of those Web sites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.
Licensees of this program who wish to have information about it for the purpose of enabling: (i) the
exchange of information between independently created programs and other programs (including
this one) and (ii) the mutual use of the information which has been exchanged, should contact:
IBM Corporation
26 Forest Street
Marlborough, MA 01752 U.S.A.
Such information may be available, subject to appropriate terms and conditions, including in some
cases, payment of a fee.
The licensed program described in this document and all licensed material available for it are provided
by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement or any equivalent agreement between us.
Any performance data contained herein was determined in a controlled environment. Therefore,
the results obtained in other operating environments may vary significantly. Some measurements
may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurements may have
been estimated through extrapolation. Actual results may vary. Users of this document should verify
the applicable data for their specific environment.
Information concerning non-IBM products was obtained from the suppliers of those products, their
published announcements or other publicly available sources. IBM has not tested those products
and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
All statements regarding IBM's future direction or intent are subject to change or withdrawal without notice, and represent goals and objectives only. This information is for planning purposes only.
The information herein is subject to change before the products described become available.
This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies,
brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample
programs in any form without payment to IBM, for the purposes of developing, using, marketing or
distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or
function of these programs. The sample programs are provided "AS IS", without warranty of any
kind. IBM shall not be liable for any damages arising out of your use of the sample programs.
Each copy or any portion of these sample programs or any derivative work, must include a copyright
notice as follows:
© (your company name) (year). Portions of this code are derived from IBM Corp. Sample Programs.
© Copyright IBM Corp. (enter the year or years). All rights reserved.
Trademarks
IBM, the IBM logo, ibm.com and Netezza are trademarks or registered trademarks of International
Business Machines Corporation in the United States, other countries, or both. If these and other
IBM trademarked terms are marked on their first occurrence in this information with a trademark
symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at
the time this information was published. Such trademarks may also be registered or common law
trademarks in other countries. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at ibm.com/legal/copytrade.shtml.
The following terms are trademarks or registered trademarks of other companies:
Adobe is a registered trademark of Adobe Systems Incorporated in the United States, and/or other
countries.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation
in the United States, other countries, or both.
NEC is a registered trademark of NEC Corporation.
UNIX is a registered trademark of The Open Group in the United States and other
countries.
Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in
the United States, other countries, or both.
Red Hat is a trademark or registered trademark of Red Hat, Inc. in the United
States and/or other countries.
D-CC, D-C++, Diab+, FastJ, pSOS+, SingleStep, Tornado, VxWorks, Wind River, and
the Wind River logo are trademarks, registered trademarks, or service marks of
Wind River Systems, Inc. Tornado patent pending.
APC and the APC logo are trademarks or registered trademarks of American Power Conversion Corporation.
Other company, product or service names may be trademarks or service marks of others.
Regulatory and Compliance
Regulatory Notices
Install the NPS system in a restricted-access location. Ensure that only those trained to operate or
service the equipment have physical access to it. Install each AC power outlet near the NPS rack that
plugs into it, and keep it freely accessible. Provide approved 30A circuit breakers on all power
sources.
Product may be powered by redundant power sources. Disconnect ALL power sources before servicing. High leakage current. Earth connection essential before connecting supply. Courant de fuite
élevé. Raccordement à la terre indispensable avant le raccordement au réseau.
Homologation Statement
This product may not be certified in your country for connection by any means whatsoever to interfaces of public telecommunications networks. Further certification may be required by law prior to
making any such connection. Contact an IBM representative or reseller for any questions.
FCC - Industry Canada Statement
This equipment has been tested and found to comply with the limits for a Class A digital device, pursuant to part 15 of the FCC rules. These limits are designed to provide reasonable protection against
harmful interference when the equipment is operated in a commercial environment. This equipment generates, uses, and can radiate radio-frequency energy and, if not installed and used in accordance with the instruction manual, may cause harmful interference to radio communications.
Operation of this equipment in a residential area is likely to cause harmful interference, in which
case users will be required to correct the interference at their own expense.
This Class A digital apparatus meets all requirements of the Canadian Interference-Causing Equipment Regulations.
Cet appareil numérique de la classe A respecte toutes les exigences du Règlement sur le matériel
brouilleur du Canada.
CE Statement (Europe)
This product complies with the European Low Voltage Directive 73/23/EEC and EMC Directive
89/336/EEC as amended by European Directive 93/68/EEC.
Warning: This is a class A product. In a domestic environment this product may cause radio interference in which case the user may be required to take adequate measures.
VCCI Statement
Index
A
addOutputColumn
Shaper and Sizer API,23
addOutputColumnString
Shaper and Sizer API,23
Aggregate API,21
getOutputColumn,21
getState,22
outputColumnCount,22
apply
Working Modes,37
C
closeLocalConnection
Local Initialization,8
closeRemoteConnection
Remote Initialization,10
createConnectionPoint
Remote Connection Point,9
createLocalConnection
Local Initialization,8
createRemoteConnection
Remote Initialization,10
D
Data Connection APIs,11
getApiType,12
Data Type Support,25
getNpsDataTypes,26
NZ,26
Date and Time Functions,27
millisecondsToNzTime,29
posixTimeSecondsToNzDate,29
secondsToNzTime,29
decodebase64
Utilities,33
dispatcher
High Level initialization,10
F
fastdataframe
Utilities,33
fetchGroup
Utilities,34
fetchRows
Utilities,34
Function,12
getInputColumn,13
getNext,13
inputColumnCount,14
outputResult,14
setOutput,14
setOutputBool,15
setOutputDate,15
setOutputDouble,15
setOutputFloat,16
setOutputInt16,16
setOutputInt32,17
setOutputInt64,17
setOutputInt8,17
setOutputInterval,18
setOutputNull,18
setOutputString,19
setOutputTime,19
setOutputTimeFromR,19
setOutputTimeStamp,20
setOutputTimeTz,20
G
getApiType
Data Connection APIs,12
getEnv
Runtime and Environment Information,30
getFilePath
Utilities,35
getFirstEnvironmentEntry
Runtime and Environment Information,30
getGroupValue
Utilities,35
getInputColumn
Function,13
getInputColumnInfo
Shaper and Sizer API,24
getLibraryFullPath
Runtime and Environment Information,31
getLibraryInfo
Runtime and Environment Information,31
getLibraryProcessInfo
Runtime and Environment Information,31
getNext
Function,13
getNextEnvironmentEntry
Runtime and Environment Information,31
getNpsDataTypes
Data Type Support,26
getOutputColumn
Aggregate API,21
getOutputColumnInfo
Shaper and Sizer API,24
getOutputColumnName
Shaper and Sizer API,24
getRuntime
Runtime and Environment Information,32
getState
Aggregate API,22
getSystemLog
Runtime and Environment Information,32
getUdfReturnType
Shaper and Sizer API,24
groupedApply
Working Modes,37
H
handleConnection
High Level initialization,11
High Level initialization,10
dispatcher,10
handleConnection,11
runWrapper,11
I
Initialization APIs,7
inputColumnCount
Function,14
install
Working Modes,38
isLocal
Local Initialization,8
L
Local Initialization,8
closeLocalConnection,8
createLocalConnection,8
isLocal,8
logMessage
Runtime and Environment Information,32
M
millisecondsToNzTime
Date and Time Functions,29
N
NZ
Data Type Support,26
O
oneOutputRowRestriction
Shaper and Sizer API,25
outputColumnCount
Aggregate API,22
outputResult
Function,14
P
ping
Utilities,35
placefile
Utilities,35
posixTimeSecondsToNzDate
Date and Time Functions,29
prepareUserCode
Utilities,36
R
rawtapply
Working Modes,38
Remote Connection Point,9
createConnectionPoint,9
Remote Initialization,9
closeRemoteConnection,10
createRemoteConnection,10
run
Working Modes,38
Runtime and Environment Information,29
getEnv,30
getFirstEnvironmentEntry,30
getLibraryFullPath,31
getLibraryInfo,31
getLibraryProcessInfo,31
getNextEnvironmentEntry,31
getRuntime,32
getSystemLog,32
logMessage,32
userError,32
runWrapper
High Level initialization,11
S
secondsToNzTime
Date and Time Functions,29
setOutput
Function,14
setOutputBool
Function,15
setOutputDate
Function,15
setOutputDouble
Function,15
setOutputFloat
Function,16
setOutputInt16
Function,16
setOutputInt32
Function,17
setOutputInt64
Function,17
setOutputInt8
Function,17
setOutputInterval
Function,18
setOutputNull
Function,18
setOutputString
Function,19
setOutputTime
Function,19
setOutputTimeFromR
Function,19
setOutputTimeStamp
Function,20
setOutputTimeTz
Function,20
Shaper and Sizer API,22
addOutputColumn,23
addOutputColumnString,23
getInputColumnInfo,24
getOutputColumnInfo,24
getOutputColumnName,24
getUdfReturnType,24
oneOutputRowRestriction,25
systemCatalogIsUpper,25
updateInfo,25
Support APIs,27
systemCatalogIsUpper
Shaper and Sizer API,25
T
tapply
Working Modes,39
U
updateInfo
Shaper and Sizer API,25
userError
Runtime and Environment Information,32
Utilities,32
decodebase64,33
fastdataframe,33
fetchGroup,34
fetchRows,34
getFilePath,35
getGroupValue,35
ping,35
placefile,35
prepareUserCode,36
workspacePath,36
W
Working Modes,37
apply,37
groupedApply,37
install,38
rawtapply,38
run,38
tapply,39
workspacePath
Utilities,36