Function list

You can create rules using functions. This can be a very useful method

This section describes each function in terms of the following.

  • Category

  • Description

  • Function interface

  • Arguments

  • Return type

  • Example

  • Remarks

The following functions are currently supported by data preperation

Functions can be supplemented on an ongoing basis.

length

Category

  • String Function

Description

  • Returns the length of the input string

Function interface

  • length(string_value)

Arguments

  • string_value: the string whose length you want to find.

Return type

  • Integer

Example

  • length(first_name)

if

Category

  • Logical Function

Description

  • Examine the conditional statement and return a value corresponding to TRUE or FALSE.

Function interface

  • if(condition)

  • if(condition, true_value, false_value)

Arguments

  • condition: The condition to check for true / false

  • true_value: The value returned if the conditional statement is true.

  • false_value: The value returned if the conditional statement is false.

Return type

  • Any

Example

  • if(gender==‘male’) : TRUE

  • if(age<18, ‘kid’, ‘adult’) : ‘adult’

Remarks

  • If true_value/false_value does not exist, it returns TURE or FALSE as a result of Boolean type.

  • ture_value와 false_value의 데이터 타입은 동일해야 합니다.

isnull

Category

  • Logical Function

Description

  • Determines whether the value of the input column is null. Returns TRUE if null, or FALSE.

Function interface

  • isnull(condition)

Arguments

  • condition: The column to determine if null.

Return type

  • Boolean

Example

  • isnull(telephone) : FALSE

isnan

Category

  • Logical Function

Description

  • Determines if input value is NaN (Not-a-Number). Returns TRUE if NaN, FALSE otherwise.

Function interface

  • isnan(condition)

Arguments

  • condition: The column or formula for which to determine NaN.

Return type

  • Boolean

Example

  • isnan(1000/ratio)

Remarks

  • The result of the condition must be a Double Value.

upper

Category

  • String Function

Description

  • Returns all uppercase letters of the alphabet entered.

Function interface

  • upper(string_value)

Arguments

  • string_value: The string to replace with an uppercase letter.

Return type

  • String

Example

  • upper(last_name)

  • upper(‘Hello world’) : ’HELLO WORLD’

lower

Category

  • String Function

Description

  • Returns all lowercase letters of the entered string.

Function interface

  • lower(string_value)

Arguments

  • string_value: the string you want to replace with lowercase.

Return type

  • String

Example

  • lower(last_name)

  • lower(‘Hello WORLD’) : ’hello world’

trim

Category

  • String Function

Description

  • Returns the spaces before and after the input string.

Function interface

  • trim(string_value)

Arguments

  • string_value: The string to remove whitespace from.

Return type

  • String

Example

  • trim(comment)

  • trim(‘  .   Hi!   ‘) : ‘.   Hi!’

ltrim

Category

  • String Function

Description

  • Remove and return the space before the input string.

Function interface

  • ltrim(string_value)

Arguments

  • string_value: The string to remove whitespace from.

Return type

  • String

Example

  • ltrim(comment)

  • ltrim(‘  .   Hi!   ‘) : ’.   Hi!   ‘

rtrim

Category

  • String Function

Description

  • Returns the space after the input string.

Function interface

  • rtrim(string_value)

Arguments

  • string_value: The string to remove whitespace from.

Return type

  • String

Example

  • rtrim(comment)

  • rtrim(‘  .   Hi!   ‘) : ‘  .   Hi!’

substring

Category

  • String Function

Description

  • Returns part of the input string.

Function interface

  • substring(string_value, begin_index, offset)

  • substring(string_value, begin_index)

Arguments

  • string_value: The string to edit.

  • begin_index: Start index of the part to extract from the target string. The beginning of the string is 0. If you enter a negative number, it goes back to the last character of the string.

  • offset: The length of the string to extract from the target string. If not entered, extracts from begine_index to the end of the string.

Return type

  • String

Example

  • substring(user_id, 0, 5)

  • substring(‘hello world’, 1, 7) : ‘ello w’

  • substring(‘metatron’, -2) : ‘on’

concat

Category

  • String Function

Description

  • 입력된 복수의 문자열을 연결하여 반환합니다.

Function interface

  • Concatenate and return multiple input strings.

Arguments

  • string_value (X): String to concatenate. You can enter multiple n items.

Return type

  • String

Example

  • concat(first_name, ‘-‘, last_name) : ‘Jane-Doe’

  • concat(‘1980’, ’02’) : ‘198002’

concat_ws

Category

  • String Function

Description

  • Concatenates multiple input strings and returns a Separator between them.

Function interface

  • concat(separator, stirng_value1, string_value2)

Arguments

  • separator: Separator to insert between strings to be concatenated.

  • string_value (X): String to concatenate. You can enter multiple n items.

Return type

  • String

Example

  • concat_ws(‘, ’, first_name, last_name) : ‘Jane, Doe’

  • concat_ws(‘-‘, ‘010’, ‘1234’, ‘5678’) : ’010-1234-5678’

year

Category

  • Timestamp Function

Description

  • Returns a value corresponding to the year from the entered Timestamp value.

Function interface

  • year(timestamp_value)

Arguments

  • timestamp_value: 연도를 추출하고자 하는 timestamp

Return type

  • Integer

Example

  • year(birthday)

month

Category

  • Timestamp Function

Description

  • Returns the value corresponding to the month in the entered Timestamp value.

Function interface

  • month(timestamp_value)

Arguments

  • timestamp_value: the timestamp from which you want to extract the month

Return type

  • Integer

Example

  • month(birthday)

day

Category

  • Timestamp Function

Description

  • Returns a value corresponding to day from an entered Timestamp value.

Function interface

  • day(timestamp_value)

Arguments

  • timestamp_value: the timestamp from which you want to extract the day

Return type

  • Integer

Example

  • day(birthday)

hour

Category

  • Timestamp Function

Description

  • Returns a value corresponding to a time from an entered Timestamp value.

Function interface

  • hour(timestamp_value)

Arguments

  • timestamp_value: timestamp from which you want to extract time

Return type

  • Integer

Example

  • hour(last_login)

minute

Category

  • Timestamp Function

Description

  • Returns a value corresponding to minutes from the entered Timestamp value.

Function interface

  • minute(timestamp_value)

Arguments

  • timestamp_value: the timestamp from which you want to extract minutes

Return type

  • Integer

Example

  • minute(last_login)

second

Category

  • Timestamp Function

Description

  • Returns the value corresponding to seconds from the entered Timestamp value.

Function interface

  • second(timestamp_value)

Arguments

  • timestamp_value: the timestamp from which you want to extract seconds

Return type

  • Integer

Example

  • second(last_login)

millisecond

Category

  • Timestamp Function

Description

  • Returns the value corresponding to milliseconds (1/1000 second) from the entered Timestamp value.

Function interface

  • millisecond(timestamp_value)

Arguments

  • timestamp_value: the timestamp from which you want to extract milliseconds

Return type

  • Integer

Example

  • millisecond(last_login)

now

Category

  • Timestamp Function

Description

  • Returns the current time based on the entered Timezone.

Function interface

  • now()

  • now(timezone)

Arguments

  • timzone: 현재시간을 구하고자 하는 Timezone의 fulll-name.

Return type

  • Integer

Example

  • now()

  • now(‘Asia/Seoul’)

Remarks

  • If no Timezone value is entered, returns the time in UTC.

add_time

Category

  • Timestamp Function

Description

  • Returns the value added or subtracted from the input Timestamp value.

Function interface

  • add_time(timestamp, delta, time_unit)

Arguments

  • timestamp: the original timestamp value being targeted

  • delta: the date / time value to add or subtract

  • time_unit: The unit of date / time to add or subtract (in string). year, month, day, hour, minute, second, millisecond.

Return type

  • Integer

Example

  • add_time(end_date, 10, ‘day’)

  • add_time(end_date, -1, ‘month’)

sum

Category

  • Aggregation Function

Description

  • Returns the sum of the target values.

Function interface

  • sum(target_col)

Arguments

  • target_col: Target column to sum

Return type

  • Double

Example

  • sum(profit)

Remarks

  • Only available for aggregation and window rules.

avg

Category

  • Aggregation Function

Description

  • Returns the average of the target values.

Function interface

  • avg(target_col)

Arguments

  • target_col: Target column to average

Return type

  • Double

Example

  • avg(profit)

Remarks

  • Only available for aggregation and window rules.

max

Category

  • Aggregation Function

Description

  • Returns the largest of the target values.

Function interface

  • max(target_col)

Arguments

  • target_col: Target column to get the maximum value

Return type

  • Double

Example

  • max(profit)

Remarks

  • Only available for aggregation and window rules.

min

Category

  • Aggregation Function

Description

  • Returns the smallest of the target values.

Function interface

  • min(target_col)

Arguments

  • target_col: Target column to get the minimum value

Return type

  • Double

Example

  • min(profit)

Remarks

  • Only available for aggregation and window rules.

count

Category

  • Aggregation Function

Description

  • Returns the number of rows in the target.

Function interface

  • count()

Return type

  • Double

Example

  • count()

Remarks

  • Only available for aggregation and window rules.

math.abs

Category

  • Math Function

Description

  • Returns the absolute value of the entered value.

Function interface

  • math.abs(value)

Arguments

  • value: A number whose absolute value you want to find.

Return type

  • Double

Example

  • math.abs(-10) : 10

math.acos

Category

  • Math Function

Description

  • Returns the arc cosine of the entered value.

Function interface

  • math.acos(value)

Arguments

  • value: The cosine of which you want to find the arc cosine.

Return type

  • Double

Example

  • math.acos(-1) : 3.141592653589793

math.asin

Category

  • Math Function

Description

  • Returns the arc sine of the entered value.

Function interface

  • math.asin(value)

Arguments

  • value: The sine of which you want to find the arc sine, in the range -1 to 1.

Return type

  • Double

Example

  • math.asin(-1) : -1.5707963267948966

math.atan

Category

  • Math Function

Description

  • Returns the arc sine of the entered value.

Function interface

  • math.atan(value)

Arguments

  • value: The sine of which you want to find the arc sine, in the range -1 to 1.

Return type

  • Double

Example

  • math.asin(-1) : -1.5707963267948966

math.cbrt

Category

  • Math Function

Description

  • Returns the cube root of the entered value.

Function interface

  • math.cbrt(value)

Arguments

  • value: The number whose cube root you want to find.

Return type

  • Double

Example

  • math.cbrt(5) : 1.709975946676697

math.ceil

Category

  • Math Function

Description

  • Returns the value rounded up to be a multiple of day.

Function interface

  • math.ceil(value)

Arguments

  • value: The number you want to round to one’s place.

Return type

  • Double

Example

  • math.ceil(15.142) : 16

math.cos

Category

  • Math Function

Description

  • Returns the cosine of the entered value.

Function interface

  • math.cos(value)

Arguments

  • value: the radian angle to get the cosine of

Return type

  • Double

Example

  • math.cos(45) : 0.5253219888177297

math.cosh

Category

  • Math Function

Description

  • Returns the hyperbolic cosine of the entered value.

Function interface

  • math.cosh(value)

Arguments

  • value: The number whose hyperbolic cosine is to be obtained.

Return type

  • Double

Example

  • math.cosh(9) : COSH(9) => 4051.5420254925943

math.exp

Category

  • Math Function

Description

  • Returns the natural logarithm of e raised to the power of the input value.

Function interface

  • math.exp(value)

Arguments

  • value: The number of times to want to log the natural logarithm e.

Return type

  • Double

Example

  • math.exp(4) : 54.598150033144236

math.expm1

Category

  • Math Function

Description

  • Returns the natural logarithm e, multiplied by the value entered, minus one.

Function interface

  • math.expm1(value)

Arguments

  • value: The number of times to want to log the natural logarithm e.

Return type

  • Double

Example

  • math.expm1(4) : 53.598150033144236

math.getExponent

Category

  • Math Function

Description

  • Returns the largest of exp values ​​that satisfy 2exp <= N for the entered value N.

Function interface

  • math.getExponent(value)

Arguments

  • value: The number corresponding to N when looking for an exp value that satisfies 2exp <= N.

Return type

  • Double

Example

  • math.getExponent(9) : 3

math.round

Category

  • Math Function

Description

  • Returns the value rounded to the ones place.

Function interface

  • math.round(value)

Arguments

  • value: the number to be rounded to

Return type

  • Double

Example

  • math.round(14.2) : 14

math.signum

Category

  • Math Function

Description

  • Returns the sign of the entered value.

Function interface

  • math.signum(value)

Arguments

  • value: the number to extract the sign of

Return type

  • Double

Example

  • math.signum(-24) : -1

Remarks

  • If the number entered is 1, it is 1, 0 is 0, and -1 if it is negative.

math.sin

Category

  • Math Function

Description

  • Returns the sine of the entered value.

Function interface

  • math.sin(value)

Arguments

  • value: the radian angle for which you want to find the sine

Return type

  • Double

Example

  • math.sin(90) : 0.8939966636005579

math.sinh

Category

  • Math Function

Description

  • Returns the hyperbolic sine of the entered value.

Function interface

  • math.sinh(value)

Arguments

  • value: the number whose hyperbolic sine is to be obtained

Return type

  • Double

Example

  • math.sinh(1) : 1.1752011936438014

math.sqrt

Category

  • Math Function

Description

  • Returns the square root of the entered value.

Function interface

  • math.sqrt(value)

Arguments

  • value: the number whose square root you want to find

Return type

  • Double

Example

  • math.sqrt(4) : 2

math.tan

Category

  • Math Function

Description

  • Returns the tangent of the entered value.

Function interface

  • math.tan(value)

Arguments

  • value: the radian angle for the tangent value

Return type

  • Double

Example

  • math.tan(10) : 0.6483608274590866

math.tanh

Category

  • Math Function

Description

  • Returns the hyperbolic tangent of the entered value.

Function interface

  • math.tanh(value)

Arguments

  • value: The angle to get the hyperbolic tangent of.

Return type

  • Double

Example

  • math.tanh(4) : 0.999329299739067

time_diff

Category

  • Timestamp Function

Description

  • Calculates and returns the difference between two input Timestamp values ​​in milliseconds.

Function interface

  • time_diff(timestamp1, timestamp2)

Arguments

  • timestamp1:C = B - A 에서 A에 해당하는 시간 값.

  • timestamp1: C = B - A, the timestamp of B

Return type

  • Double

Example

  • time_diff(order_date, shipped_date)

Remarks

  • result value = timestamp2 - timestamp1

timestamp

Category

  • Timestamp Function

Description

  • Create a new Timestamp value.

Function interface

  • timestamp(value, format)

Arguments

  • value: Date/Time value to create as timestamp value.

  • format: The time format of the value value.

Return type

  • Timestamp

Example

  • timestamp(‘2011-01-01’, ‘yyyy-MM-dd’) : 2011-01-01T00:00:00.000Z

row_number

Category

  • Window Function

Description

  • Generates serial numbers of rows arranged in order in the partition.

Function interface

  • row_number()

Return type

  • Long

Example

  • row_number()

Remarks

  • Only available with Window Rule.

rolling_sum

Category

  • Window Function

Description

  • Returns the sum of the values ​​of the specified number of rows before and after within the partition.

Function interface

  • rolling_sum(target_col, before, after)

Arguments

  • target_col: Target column name to sum.

  • before: Number of preceding rows to sum.

  • after: The number of trailing rows to sum.

Return type

  • Long/Double

Example

  • rolling_sum (profit, 3, 3): Combines profits for a total of seven rows, including three rows before and after the same partition.

Remarks

  • Only available with Window Rule.

rolling_avg

Category

  • Window Function

Description

  • Returns the average of the values ​​of the specified number of rows before and after in the partition.

Function interface

  • rolling_avg(target_col, before, after)

Arguments

  • target_col: The target column name for which you want to average.

  • before: The number of preceding rows to average.

  • after: number of trailing rows to average.

Return type

  • Long/Double

Example

  • rolling_avg (profit, 3, 3): average of 7 rows’ profits including 3 rows before and after the same partition

Remarks

  • Only available with Window Rule.

lag

Category

  • Window Function

Description

  • Returns the value of the row that is earlier than the specified number in the partition.

Function interface

  • lag(target_col, before)

Arguments

  • target_col: Target column name.

  • before: A number that specifies how far back to return the current row.

Return type

  • Long/Double

Example

  • lag (profit, 2): Returns the profit value of the row above 2 lines in the same partition. If there is no value above line 2, it returns null.

Remarks

  • Only available with Window Rule.

lead

Category

  • Window Function

Description

  • Returns the value of Row after the specified number within the partition.

Function interface

  • lead(target_col, after)

Arguments

  • target_col: Target column name.

  • after: A number that specifies how far behind the current row to return.

Return type

  • Long/Double

Example

  • lead (profit, 2): returns the profit value of a row below 2 lines in the same partition. If there is no value under line 2, it returns null.

Remarks

  • Only available with Window Rule.

ismismatched

Category

  • Logical Function

Description

  • Returns whether the Value of the specified column matches a specific Column Type.

Function interface

  • ismismatched(target_col, column_type)

Arguments

  • target_col: Column name to check type.

  • column_type: Type to check for match. (Type as string) String, Boolean, Timestamp, Long, Double

Return type

  • Boolean

Example

  • ismismatched (birth_date, timestamp): false if the value of the row is timestamp, true otherwise.

contains

Category

  • String Function

Description

  • Returns whether the Value of the specified column contains a specific string.

Function interface

  • contains(target_col, search_word)

Arguments

  • target_col: The column name to search for a string.

  • search_word: The string to search for in the column.

Return type

  • Boolean

Example

  • contains (name, ‘son’): True if name contains son. ‘Micheal Jackson’, ‘Son Heung Min’, etc.

startswith

Category

  • String Function

Description

  • Returns whether the Value of the specified column starts with a specific string.

Function interface

  • startswith(target_col, search_word)

Arguments

  • target_col: The column name to search for a string.

  • search_word: The string to search for in the column.

Return type

  • Boolean

Example

  • startswith (name, ‘kim’): True if name starts with ‘kim’. Kim Chul-soo, Kim Soo-ji, etc.

endswith

Category

  • String Function

Description

  • Returns whether the Value of the specified column ends a specific string.

Function interface

  • endswith(target_col, search_word)

Arguments

  • target_col: The column name to search for a string.

  • search_word: The string to search for in the column.

Return type

  • Boolean

Example

  • endswith (customer_code, ‘M’): True if customer_code ends with M ‘1340M’, ‘0020M’, etc.